EVMbench evaluates AI agents on detecting, exploiting, and patching smart contract vulnerabilities.
OpenAI today unveiled EVMbench, a benchmarking system designed to evaluate how effectively AI agents can detect and address security flaws in crypto tokens and smart contracts.
The system, developed in collaboration with Paradigm, a crypto-focused venture capital firm, establishes standardized testing protocols for vulnerabilities in code running on Ethereum Virtual Machine-compatible blockchains.
EVMbench measures AI performance across three categories: identifying weaknesses in smart contracts, demonstrating how those flaws could be exploited, and applying fixes to remediate the issues.
The release also includes ecosystem safeguards. OpenAI expanded its private beta of Aardvark, a security research agent, and committed $10 million in API credits through its Cybersecurity Grant Program to support defensive research, particularly for open-source and critical infrastructure projects.
The release comes days after OpenAI announced its acquisition of OpenClaw last Sunday, underscoring a broader push into autonomous AI agents.
