A newbie hacker used "vague, low-skill prompts" in Claude and Codex to breach 14 companies, and the AI agents did all the legwork

OALABS analyzed a novice attacker’s full working directory showing 14 breaches carried out with Claude Code and Codex agents
Attacker used vague prompts; AI agents handled reconnaissance, exploit writing, and data harvesting, bypassing guardrails with ease
Logs revealed attacker’s identity and location in Addis Ababa, Ethiopia

A newbie cybercriminal managed to break into 14 organizations and steal sensitive data, just by using Anthropic’s Claude Code and OpenAI’s Codex agents. This is according to cybersecurity researchers OALABS, who recovered and analyzed the attacker’s entire working directory.

The researchers cited the case as further evidence that advanced generative artificial intelligence (GenAI) models are significantly lowering the barrier to entry into cybercrime, and warned that the security community needs to step up.

“In many cases, the attacker supplied only vague, low-skill prompts and allowed Claude to fill in the gaps: researching exposed services, identifying possible vulnerabilities, writing exploit code, validating access, and harvesting data,” the researchers said. “The attacker did not need to be an expert operator; they simply had to use the correct framing for their prompts. The agent supplied much of the structure and technical execution that the attacker appeared to lack.”

Doxxing the attacker

OALABS found no evidence that the stolen data was monetized in any way, whether by sale on the dark web or by extortion of the victim companies. The researchers did, however, find numerous pieces of evidence pointing to the attacker’s identity and whereabouts.

According to the researchers, the attacker did not run the AI agents on his own infrastructure, but on a third-party server. When that third party discovered the malicious activity, it downloaded the entire working directory and shared it with the researchers.

“Because the agents were local to the host, their full session logs were recovered, including the attacker’s prompts, the tools used, the internal monologue of the large language model (LLM), and any policy violations recorded during the sessions,” the researchers said.

OALABS was thus able to analyze more than 1,000 agent sessions and see how easily the attacker bypassed most of the agents’ guardrails. The recovered material also included the threat actor’s CV, with his full name, location, education history, and LinkedIn profile, as well as his IP address, which placed him in Addis Ababa, Ethiopia.

Via Help Net Security
