GPT-5.3-Codex achieves a 72.2% success rate attacking vulnerable DeFi smart contracts, yet manages only 36% in detection mode, according to Binance Research's EVMbench benchmark published in April 2026. The same reasoning capabilities that make AI lethal at finding exploits are roughly half as effective when turned toward defense. At $1.22 per contract attempt, the economics of AI-powered attacks have become impossible to ignore.
TL;DR: Binance Research's EVMbench benchmark shows GPT-5.3-Codex hits 72.2% success in attack mode against DeFi smart contracts but only 36% in detect mode. Chainalysis data shows AI-powered scams are 4.5x more profitable per case than traditional ones.
Key Data
- GPT-5.3-Codex, attack mode (EVMbench): 72.2%
- GPT-5.3-Codex, detect mode (EVMbench): 36%
- AI-powered vs traditional scams (per case): 4.5x more profitable
- Average AI attack cost per contract: $1.22
- Specialized defensive AI (Cecuro): 92% detection rate
- DeFi protocols with on-chain firewall: <1%
Source: Binance Research EVMbench · Chainalysis Crime Report 2026 · Cecuro/CoinDesk · April-May 2026
The benchmark is called EVMbench. Binance Research published it in its April 2026 report, testing AI models on vulnerable Ethereum contracts across two modes: attack mode (finding and exploiting vulnerabilities) and detect mode (identifying them without exploitation). The gap is stark, and it isn't a measurement error. GPT-5.3-Codex in attack mode: 72.2%. In detect mode: 36%. The model isn't simply “bad at defense.” The same code-reasoning capabilities that let it analyze contract logic also let it pinpoint exactly where that logic breaks, at $1.22 per attempt.
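The two EVMbench figures can be put side by side with a little arithmetic. The sketch below is illustrative only: it assumes the $1.22 cost applies to every attempt and that the 72.2% attack-mode rate can be read as an independent per-attempt success probability, neither of which is confirmed by the figures quoted here. The function and constant names are hypothetical.

```python
# Figures quoted from EVMbench (Binance Research, April 2026).
ATTACK_SUCCESS = 0.722        # attack-mode success rate
DETECT_SUCCESS = 0.36         # detect-mode success rate
COST_PER_ATTEMPT_USD = 1.22   # quoted cost per contract attempt


def attack_to_detect_ratio(attack: float, detect: float) -> float:
    """How many times more often the model succeeds at attack than at detection."""
    return attack / detect


def expected_cost_per_success(cost_per_attempt: float, success_rate: float) -> float:
    """Expected spend per successful exploit.

    ASSUMPTION: attempts are independent with a fixed success probability,
    so the expected number of attempts per success is 1 / success_rate.
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_attempt / success_rate


ratio = attack_to_detect_ratio(ATTACK_SUCCESS, DETECT_SUCCESS)
cost = expected_cost_per_success(COST_PER_ATTEMPT_USD, ATTACK_SUCCESS)
print(f"attack/detect ratio: {ratio:.2f}x")
print(f"expected cost per successful exploit: ${cost:.2f}")
```

Under those assumptions the attack-mode rate is about twice the detect-mode rate, and the expected spend per successful exploit on benchmark contracts comes to roughly $1.69.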
Chainalysis fills in the broader picture in its Crypto Crime Report 2025-2026: AI-powered scams proved 4.5 times more profitable per case than traditional ones. Not because the attackers are more skilled. Because AI scales attack volume in ways no human team could replicate. A single operator with access to an AI model can launch thousands of parallel exploit attempts, at near-zero marginal cost. The attacker who adopts offensive AI before the defender has a structural advantage measurable in tens of millions of dollars.
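The scaling argument is easiest to see as a cost curve. This is a minimal sketch, assuming the $1.22 per-attempt figure from EVMbench stays flat at any volume. That is a simplification: rate limits, pricing tiers, and on-chain gas are ignored, and `campaign_cost` is a hypothetical helper, not anything taken from the Binance Research or Chainalysis reports.

```python
COST_PER_ATTEMPT_USD = 1.22  # per-attempt cost quoted from EVMbench


def campaign_cost(attempts: int, cost_per_attempt: float = COST_PER_ATTEMPT_USD) -> float:
    """Total model cost of running `attempts` exploit attempts.

    ASSUMPTION: cost scales linearly with volume (no discounts, no overhead).
    """
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    return attempts * cost_per_attempt


# Cost of a single attempt versus a campaign of thousands.
for n in (1, 100, 1_000, 10_000):
    print(f"{n:>6} attempts -> ${campaign_cost(n):,.2f}")
```

On this linear model, a thousand parallel attempts cost about $1,220, which is the sense in which marginal cost is close to zero compared with a human team's time.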
How Does an AI Attack on a DeFi Smart Contract Work?
Think of it as a security audit run in reverse. An AI agent entering attack mode on a contract does exactly what a security auditor does: reads the code, maps the data flows, hunts for logical anomalies in how functions interact. The difference is the end goal. Instead of producing a report, it builds an exploit. Instead of flagging the issue, it executes.
“~40% of daily code written at Coinbase is AI-generated. I want to get it to >50% by October. Obviously it needs to be reviewed and understood, and not all areas of the business can use AI-generated code. But we should be using it responsibly as much as we possibly can.”
Brian Armstrong (@brian_armstrong), September 3, 2025
The Cecuro benchmark, cited by CoinDesk in February 2026, had already identified the same asymmetry across 90 real contracts exploited between October 2024 and early 2026, generating verified losses of $228 million. A security-specialized AI agent detected 92% of vulnerabilities. A generic GPT-5.1 model found only 34%. Cecuro also measured the pace: offensive AI capability doubles roughly every 1.3 months. Adoption of defensive AI tools across DeFi sits below 10%. The gap is widening.
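To get a feel for what a 1.3-month doubling time implies, the figure can be extrapolated as a plain exponential. This is a rough sketch, not a forecast: it assumes the doubling rate Cecuro measured stays constant, which the source does not claim, and the helper names are hypothetical. The same block also compares the two detection rates quoted above.

```python
DOUBLING_PERIOD_MONTHS = 1.3   # doubling time reported by Cecuro
SPECIALIZED_DETECTION = 0.92   # security-specialized agent
GENERIC_DETECTION = 0.34       # generic GPT-5.1 model


def capability_multiplier(months: float,
                          doubling_period: float = DOUBLING_PERIOD_MONTHS) -> float:
    """Growth factor after `months`.

    ASSUMPTION: constant exponential growth, i.e. 2 ** (months / doubling_period).
    """
    if doubling_period <= 0:
        raise ValueError("doubling_period must be positive")
    return 2 ** (months / doubling_period)


def detection_gap(specialized: float, generic: float) -> float:
    """Ratio of specialized to generic detection rates."""
    return specialized / generic


for months in (1.3, 6, 12):
    print(f"after {months:>4} months: x{capability_multiplier(months):,.1f}")
print(f"specialized vs generic detection: {detection_gap(SPECIALIZED_DETECTION, GENERIC_DETECTION):.2f}x")
```

If the rate held for a full year, the multiplier would be on the order of several hundred, while the specialized agent's detection rate is a bit under three times the generic model's. Both numbers only restate the figures Cecuro reported under a constant-growth assumption.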
This week the asymmetry had a concrete face. THORChain, Verus Bridge, and Echo Protocol were hit within five days for over $23 million combined. None of the three attacks used AI directly as the attack vector, but all three exploited vulnerability windows that an offensive AI system could have identified in minutes. For a full technical breakdown, the SpazioCrypto Hack section has complete details on this week's exploits. April had already seen Kelp DAO at $292 million and Drift Protocol at $285 million: both built on exploits prepared over weeks, with a precision that closely resembles the systematic reasoning of an AI agent.
“Very soon there are going to be more AI agents than humans making transactions. They can't open a bank account, but they can own a crypto wallet. Think about it.”
Brian Armstrong (@brian_armstrong), March 9, 2026
On May 11, 2026, Google GTIG confirmed the first zero-day developed entirely by an AI agent: a two-factor authentication bypass on an open-source tool, already primed for mass exploitation before the team intercepted it. For DeFi operators, the question is no longer “will AI be used to attack?” but “who is already using it, and for how long?” On that front, our article on LLM routers and wallet security documents how offensive AI distribution channels have been active for months. The GPT-5.5 launch targeting banking use cases and Coinbase's AI pivot signal that the industry knows where the fight will be waged. On-chain security needs to reach the same conclusion before the next EVMbench drops with 80% in attack mode.
The gap isn't static. Binance Research has signaled that the next EVMbench cycle is expected in July 2026: it will be the sharpest measure of whether the DeFi sector has begun closing the gap between offensive and defensive AI capability, or whether 72.2% was already the floor. Meanwhile, fewer than 1% of DeFi protocols use on-chain firewalls, and 90% still carry critical exploitable vulnerabilities, according to Cecuro data. Bitcoin miners selling BTC to buy AI GPUs have understood that AI is the terrain that matters; on-chain security has yet to act on it. For all DeFi security updates, the SpazioCrypto Hack section is updated in real time.
