A newbie hacker used "vague, low-skill prompts" in Claude and Codex to breach 14 companies, and the AI agents did all the legwork


  • OALABS analyzed a novice attacker’s full working directory showing 14 breaches carried out with Claude Code and Codex agents
  • Attacker used vague prompts; AI agents handled reconnaissance, exploit writing, and data harvesting, bypassing guardrails with ease
  • Logs revealed attacker’s identity and location in Addis Ababa, Ethiopia

A novice cybercriminal managed to break into 14 organizations and steal sensitive data using only Anthropic’s Claude Code and OpenAI’s Codex agents. That is according to cybersecurity researchers at OALABS, who recovered and analyzed the attacker’s entire working directory.

The researchers present the case as further evidence that advanced generative artificial intelligence (GenAI) models are significantly lowering the barrier to entry for cybercrime, and as a warning that the security community needs to step up.

“In many cases, the attacker supplied only vague, low-skill prompts and allowed Claude to fill in the gaps: researching exposed services, identifying possible vulnerabilities, writing exploit code, validating access, and harvesting data,” the researchers said. “The attacker did not need to be an expert operator; they simply had to use the correct framing for their prompts. The agent supplied much of the structure and technical execution that the attacker appeared to lack.”


Doxxing the attacker

OALABS found no evidence that the stolen data was monetized, either through sale on the dark web or through extortion of the victim companies. They did, however, find numerous clues to the attacker’s identity and whereabouts.

According to the researchers, the attacker ran the AI agents on a third-party server rather than on his own infrastructure. When that third party discovered the malicious activity, it downloaded the entire working directory and shared it with the researchers.

“Because the agents were local to the host, their full session logs were recovered, including the attacker’s prompts, the tools used, the internal monologue of the large language model (LLM), and any policy violations recorded during the sessions,” the researchers said.

OALABS was thus able to analyze more than 1,000 agent sessions and see how easily the attacker bypassed most of the agents’ guardrails. The sessions also contained the threat actor’s CV, with his full name, location, education history, and LinkedIn profile, as well as his IP address, which showed that he was located in Addis Ababa, Ethiopia.

Via Help Net Security





Sead is a seasoned freelance journalist based in Sarajevo, Bosnia and Herzegovina. He writes about IT (cloud, IoT, 5G, VPN) and cybersecurity (ransomware, data breaches, laws and regulations). In a career spanning more than a decade, he has written for numerous media outlets, including Al Jazeera Balkans. He has also taught several modules on content writing for Represent Communications.
