Date: 2025-11-03

OpenAI unveils Aardvark, an autonomous AI agent for scalable vulnerability detection and patching

Aardvark mimics human researchers: reads code, runs tests, and proposes targeted security fixes

In benchmark tests, Aardvark achieved a 92% success rate on known vulnerable repositories

OpenAI wants your next security researcher to be a bot - and has launched Aardvark, its very own agentic security researcher, powered by GPT-5.

Now in private beta, the company describes Aardvark as a “breakthrough” in AI and security research - an autonomous agent which helps developers and security teams discover and fix security flaws “at scale”.

“Each year, tens of thousands of new vulnerabilities are discovered across enterprise and open-source codebases,” the company said. “Defenders face the daunting tasks of finding and patching vulnerabilities before their adversaries do.”

Mimicking human behavior

In benchmark testing on so-called “golden” repositories (those that contain well-documented vulnerabilities and are used for testing), Aardvark reportedly identified 92% of the known and synthetically introduced vulnerabilities.

Detailing how it works, OpenAI said Aardvark is not unlike a human - but without the need to rest, eat, use the toilet, or have the occasional emotional breakdown.

“Aardvark looks for bugs as a human security researcher might: by reading code, analyzing it, writing and running tests, using tools, and more,” it said. By continuously analyzing source code repositories, it can identify vulnerabilities, assess exploitability, prioritize severity, and then propose targeted patches.

While the company stresses the tool is still in beta, it also says it’s already showing promising results. OpenAI has been running it internally “for several months” across its own codebases and those of “external alpha partners”, where it has surfaced “meaningful vulnerabilities” and contributed to OpenAI’s defensive posture.

An AI agent is an autonomous AI program that connects to other apps to perform various tasks automatically. AI agents have been growing in popularity lately, with different agents being built for different purposes, such as the AI coding agent Zencoder, the Instagram analysis agent (built on Apify), Compuser (an AI that “uses the computer”), and others.

