'Not just development tools': Security experts discover critical flaw in OpenAI's Codex which could compromise entire enterprise organizations


  • BeyondTrust Phantom Labs finds critical command injection flaw in OpenAI’s ChatGPT Codex
  • Vulnerability let attackers steal GitHub OAuth tokens via malicious branch names
  • OpenAI patched with stronger input validation, shell escaping, and token controls

Experts have claimed OpenAI’s ChatGPT Codex carried a critical command injection vulnerability which allowed threat actors to steal sensitive GitHub authentication tokens.

This is according to BeyondTrust’s research department, Phantom Labs, whose work helped OpenAI identify and patch the flaw.

ChatGPT Codex is a coding feature within the famed chatbot that helps users write and edit software using plain-language instructions. Users can turn plain-language requests into working code, and the tool can suggest fixes and improvements in the same way.


When a developer makes changes to a GitHub project, they typically work in a separate branch of the repository. According to BeyondTrust Phantom Labs, the problem stems from the way Codex processes branch names during task creation.

Apparently, the tool allowed a malicious actor to manipulate the branch parameter and inject arbitrary shell commands while Codex set up its environment.
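The general class of bug can be sketched in a few lines. This is an illustrative example only, not Codex's actual internals: it assumes a setup step that interpolates a branch name directly into a shell command string, which is the pattern the researchers describe.

```python
import subprocess

def setup_env_unsafe(branch: str) -> str:
    # VULNERABLE (illustrative): the branch name is pasted straight into a
    # shell string, so shell metacharacters in it (; && $(...)) are executed
    # as additional commands rather than treated as part of the name.
    result = subprocess.run(
        f"echo checking out {branch}",
        shell=True, capture_output=True, text=True,
    )
    return result.stdout

# A crafted "branch name" that smuggles in a second command:
payload = "main; echo INJECTED"
print(setup_env_unsafe(payload))
```

Here the injected `echo` stands in for an attacker's arbitrary command; in the real attack, the injected command could read credentials available inside the container.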

These commands ran with full access inside the container, so an attacker could execute arbitrary, malicious code. Phantom Labs said it was able to pull GitHub OAuth tokens this way, gain access to a theoretical third-party project, and use the tokens to move laterally within GitHub.

Unfortunately, it gets worse. Codex's command-line interface, SDK, and development environment integrations were all flawed in the same way, and the researchers said that by embedding malicious payloads in GitHub branch names they would be able to compromise numerous developers working on the same project.

After responsibly disclosing the findings to OpenAI, the company fixed the problem with improved input validation, stronger shell escaping protections, and better controls over token exposures inside containers. Token scope and lifetime during task creation were also limited, it was said.

AI coding agents are “live execution environments with access to sensitive credentials and organizational resources,” the researchers concluded.

“Because these agents act autonomously, security teams must understand how to govern AI agent identities to prevent command injection, token theft, and automated exploitation at scale. As AI agents become more deeply integrated into developer workflows, the security of the containers they run in—and the input they consume—must be treated with the same rigor as any other application security boundary.”



Sead is a seasoned freelance journalist based in Sarajevo, Bosnia and Herzegovina. He writes about IT (cloud, IoT, 5G, VPN) and cybersecurity (ransomware, data breaches, laws and regulations). In his career, spanning more than a decade, he’s written for numerous media outlets, including Al Jazeera Balkans. He’s also held several modules on content writing for Represent Communications.
