OpenAI launches GPT-5-Codex with a 74.5% success rate on real-world coding

  • GPT-5-Codex promises higher performance and success rates
  • It’s included for Plus, Pro, Business, Edu and Enterprise users
  • The model can use 93.7% fewer tokens on lightweight tasks

OpenAI has shared more details about GPT-5-Codex, a purpose-built version of GPT-5 specifically optimized for agentic coding and real-world software engineering, and we’re in for a treat when it comes to reliability and performance.

The ChatGPT maker claimed a SWE-bench Verified benchmark success rate of 74.5%, with refactoring performance improving to 51.3% (up from 33.9% in GPT-5).

Like GPT-5, GPT-5-Codex dynamically adjusts its reasoning time, responding faster on small tasks and reasoning more thoroughly on complex ones. In testing, it has already worked independently for over seven hours on large refactors.

GPT-5-Codex is a big upgrade

OpenAI says GPT-5-Codex is strong in code reviews, catching critical bugs before release, but it can also handle frontend work with visual inspection, screenshots and mobile web design improvements.

The news comes a few months after OpenAI launched Codex CLI (in April) and Codex web (in May), which it combined into one “unified… experience connected by… ChatGPT” in early September.

It’s included with ChatGPT Plus, Pro, Business, Edu and Enterprise plans, and works across terminal, IDEs, on the web, in GitHub and on the iOS app.

The company also detailed how GPT-5-Codex uses 93.7% fewer tokens than GPT-5 on lightweight interactions, but it will also spend twice as long reasoning, editing, testing and iterating if it needs to.
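To put the quoted percentages in concrete terms, here is a minimal Python sketch of the arithmetic. The 1,000-token baseline is a made-up figure for illustration only, not a number from OpenAI, and the helper names are hypothetical.

```python
def tokens_after_reduction(baseline_tokens: float, percent_fewer: float) -> float:
    """Tokens left once a 'percent fewer' reduction is applied to a baseline."""
    return baseline_tokens * (1.0 - percent_fewer / 100.0)


def point_gain(old_percent: float, new_percent: float) -> float:
    """Improvement between two scores, in percentage points."""
    return new_percent - old_percent


# Assumption: a lightweight GPT-5 interaction costing 1,000 tokens (illustrative).
baseline = 1_000
reduced = tokens_after_reduction(baseline, 93.7)
print(f"{baseline} tokens with GPT-5 -> roughly {reduced:.0f} with GPT-5-Codex")

# Refactoring scores quoted in the article: 33.9% (GPT-5) vs 51.3% (GPT-5-Codex).
gain = point_gain(33.9, 51.3)
print(f"Refactoring gain: {gain:.1f} percentage points")
```

In other words, "93.7% fewer tokens" means a lightweight request would use about 6.3% of what GPT-5 would spend on it, while the refactoring figure is a gain of 17.4 percentage points.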

Equally important for developers, the tool will provide logs, citations and test results for transparency. Developers using Codex CLI via API key will also get API access to GPT-5-Codex “soon.”

“Codex is becoming the coding partner we’ve always envisioned – one that’s faster, more reliable, and deeply integrated into the tools you already use,” OpenAI wrote.

Plus, Edu and Business plans include enough usage to cover “a few focused coding sessions each week” – users who need more should upgrade to Pro for “a full workweek across multiple projects.” Enterprise accounts pay for what they use via a shared credit pool.

With several years’ experience freelancing in tech and automotive circles, Craig’s specific interests lie in technology that is designed to better our lives, including AI and ML, productivity aids, and smart fitness. He is also passionate about cars and the decarbonisation of personal transportation. As an avid bargain-hunter, you can be sure that any deal Craig finds is top value!
