2025-10-01

The advent of LLM reasoning models has opened the door to systems that mimic human knowledge work: coding, science, market research, and more.

Smart contract auditing is undoubtedly one of them.

There are many tools (static analysis, fuzzing, etc.) that assist human auditors.

But they only augment the human-guided process of auditing. These tools catch specific classes of bugs but fail to reason holistically about protocol-level invariants, which is exactly where reasoning models excel.

We believe LLM reasoning models have the potential to surpass human auditors in reliability.

A lot of R&D is required, but at some point we will trust AI auditors more than human auditors.

Just as we trust self-driving cars more than cars driven by humans.

How to build an AI auditor?

One of the most important components of building highly capable AI systems is robust evals. We must build evals from real-world audit reports (Code4rena, etc.) containing real-world vulnerabilities. For the AI system to be trustworthy, it must uncover nearly 100% of the vulnerabilities in those evals.
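As a rough illustration, here is a minimal sketch of what such an eval harness could look like. Every name here (KnownVulnerability, EvalCase, score_case) is hypothetical; the real report format and scoring logic would be far richer.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class KnownVulnerability:
    identifier: str    # e.g. "H-01" in the published audit report
    description: str   # ground-truth description of the bug
    severity: str      # "high", "medium", ...


@dataclass
class EvalCase:
    project_name: str
    codebase_path: str  # snapshot of the code as it was audited
    known_vulnerabilities: list[KnownVulnerability]


def score_case(
    case: EvalCase,
    reported_findings: list[str],
    matches: Callable[[KnownVulnerability, str], bool],
) -> float:
    """Recall: fraction of known vulnerabilities the AI auditor rediscovered.

    `matches(known, finding)` decides whether a reported finding covers a
    known vulnerability; in practice that judgment is itself non-trivial and
    might be delegated to another model or a human reviewer.
    """
    found = sum(
        1
        for known in case.known_vulnerabilities
        if any(matches(known, finding) for finding in reported_findings)
    )
    return found / len(case.known_vulnerabilities)
```

The interesting metric is recall over real, historically exploited or reported bugs, not precision on synthetic ones.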

Instead of “pattern matching” against known bugs, we aim to build an AI auditor that can discover bugs from first principles. Pattern matching works for uncovering known bug classes, but it fails to identify novel, project-specific errors. Finding those requires a holistic understanding of the project: its goals, its invariants, its value flows, and all the interactions it has with external systems and parties.
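To make “holistic understanding” concrete, here is one possible shape of the context an AI auditor would build up and reason over before hunting for bugs. This is a hypothetical sketch; the real representation would be richer and largely constructed by the model itself from code and documentation.

```python
from dataclasses import dataclass, field


@dataclass
class ProtocolContext:
    # What the protocol is supposed to achieve, in plain language.
    goals: list[str] = field(default_factory=list)
    # Properties that must always hold, e.g. "total shares equal the sum of user shares".
    invariants: list[str] = field(default_factory=list)
    # How value moves between contracts, users, and third parties.
    value_flows: list[str] = field(default_factory=list)
    # Oracles, bridges, tokens, governance, and other external dependencies.
    external_interactions: list[str] = field(default_factory=list)
```

A finding is then framed as “this code path can violate that invariant,” rather than “this snippet resembles a known bug.”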

Building such a system will require a lot of prompt engineering, context engineering, and possibly reinforcement learning as well.