Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

AI Agent Smart Contract Exploit Generation

Created by
  • Haebom

Authors

Arthur Gervais, Liyi Zhou

Outline

A1 is an agentic execution system that transforms any LLM into an end-to-end exploit generator. A1 equips the agent with six domain-specific tools that enable autonomous vulnerability discovery without handcrafted heuristics. The agent uses these tools flexibly to understand smart contract behavior, generate exploit strategies, test them against blockchain state, and refine its approach based on execution feedback. All outputs are concretely validated to eliminate false positives.

In an evaluation on 36 real-world vulnerable contracts from Ethereum and Binance Smart Chain, A1 achieved a 62.96% success rate (17 of 27) on the VERITE benchmark. Beyond the VERITE dataset, A1 identified nine additional vulnerable contracts, five of which emerged after the training cutoff of the strongest model. Across the 26 successful cases, A1 extracted up to $8.59 million per case, $9.33 million in total. Analyzing iteration-by-iteration performance across 432 experiments over six LLMs, we show diminishing returns, with average marginal gains of +9.7%, +3.7%, +5.1%, and +2.8% at iterations 2 through 5, at costs ranging from $0.01 to $3.59 per experiment. Monte Carlo analysis of 19 historical attacks shows success probabilities of 85.9% to 88.8% with no detection delay.

We investigate whether deploying A1 as a continuous on-chain scanning system benefits attackers or defenders. OpenAI's o3-pro model remains profitable at a vulnerability encounter rate of 0.100% with scanning delays of up to 30 days, while faster models require encounter rates of 1.000% or higher to break even. These results reveal a troubling asymmetry: at a 0.1% vulnerability incidence, attackers achieve profitable on-chain scanning at an exploit value of $6,000 while defenders require $60,000, raising a fundamental question of whether AI agents inevitably favor exploitation over defense.
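The break-even argument can be illustrated with a simple expected-value sketch. This is not the paper's actual cost model; the formula, the `value_fraction` parameter (meant to capture that a defender may recover only part of an exploit's value, e.g. via a bounty), and the plugged-in numbers (the $3.59 per-experiment cost, 0.1% encounter rate, and 62.96% success rate quoted above) are illustrative assumptions:

```python
def break_even_exploit_value(cost_per_scan: float,
                             encounter_rate: float,
                             success_prob: float,
                             value_fraction: float = 1.0) -> float:
    """Exploit value at which expected recovered value equals scanning cost.

    Expected profit per scan: encounter_rate * success_prob * value_fraction * V - cost_per_scan.
    Setting this to zero and solving for V gives the break-even value.
    """
    return cost_per_scan / (encounter_rate * success_prob * value_fraction)

# Hypothetical inputs: $3.59/experiment, 0.1% encounter rate, 62.96% success.
attacker = break_even_exploit_value(3.59, 0.001, 0.6296)
# Same scan cost, but assume the defender recovers only 10% of exploit value.
defender = break_even_exploit_value(3.59, 0.001, 0.6296, value_fraction=0.1)

print(f"attacker break-even: ${attacker:,.0f}")   # on the order of $6,000
print(f"defender break-even: ${defender:,.0f}")   # 10x higher under this assumption
```

Under these assumed numbers the attacker threshold lands in the same ballpark as the $6,000 figure, and any gap between what attackers capture and what defenders recover multiplies the defender's threshold, which is one way to read the reported $6,000 vs. $60,000 asymmetry.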

Takeaways, Limitations

Takeaways:
  • Empirically demonstrates the effectiveness of an LLM-based system for automated smart contract vulnerability discovery and exploitation.
  • Achieves a high success rate and significant financial extraction on real-world smart contracts.
  • Highlights the importance and difficulty of AI security by demonstrating the asymmetric advantage AI agents give attackers.
  • Provides insights for attacker and defender strategy through an economic analysis of on-chain scanning.
Limitations:
  • Evaluation is largely limited to the VERITE benchmark, with little testing on other datasets.
  • Further research is needed on A1's generalizability and its applicability to other blockchain environments.
  • Detection-evasion strategies are not considered.
  • Continued model evolution and more sophisticated defense mechanisms are needed.
  • Deeper discussion is needed of the ethical issues and the potential for misuse of AI agents.