OpenAI o3 checkmates Grok in a chess showdown, and it wasn't even close

(Image credit: Take Take Take)

OpenAI’s o3 defeated Elon Musk’s Grok 4 at chess
Magnus Carlsen delivered biting commentary on the quality of Grok's logic
Grok 4 made repeated blunders, while o3 played steady

The AI chess tournament between OpenAI’s o3 model and xAI's Grok 4 invited plenty of speculation as a kind of proxy battle between the two companies and their respective CEOs. Any comparison to the days of Deep Blue and Garry Kasparov soon faded, though, as o3 repeatedly wiped out Grok 4, winning four games in a row to the derisive commentary of former world chess champion Magnus Carlsen and grandmaster David Howell.

The showdown happened on Kaggle’s Game Arena, a digital coliseum where AI models battle in chess and other games. The tournament featured eight of the most prominent LLMs in the business: OpenAI’s o3 and o4-mini, Google’s Gemini 2.5 Pro and Flash, Anthropic’s Claude Opus, DeepSeek, Moonshot’s Kimi, and xAI’s Grok 4. The final came down to Grok and o3, but Grok's play in the final round hardly looked like a champion's.

Carlsen and Howell veered between serious commentary and a roast as Grok’s performance came off as somewhat erratic. In the first game, it quickly sacrificed its bishop, then began trading pieces like it was in a hurry to go home. Things didn't improve in the next game for Grok.

“[Grok] is like that one guy in a club tournament who has learnt theory and literally knows nothing else,” Carlsen said during the second game. “Makes the worst blunders after that.”

Grok’s performance was so off-the-rails that Carlsen rated it around 800 Elo, or slightly above a beginner. He gave o3 a modest but respectable 1200, about average for a hobby player. Though o3 didn’t play brilliantly, it didn’t have to. It played solid chess. It didn’t blunder pieces. It converted its advantages and stuck to sound, conventional moves.
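For a sense of what that 400-point gap implies, the standard Elo expected-score formula can be sketched in a few lines of Python. The function name is made up for illustration, and the inputs are Carlsen's offhand estimates rather than measured ratings, so treat the result as a rough guide only.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score (win = 1, draw = 0.5) for A against B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Assumption: Carlsen's informal estimates stand in for real ratings.
o3_estimate = 1200
grok_estimate = 800

# Prints 0.909: a player rated 400 points higher is expected to
# score about 91% of the available points.
print(round(expected_score(o3_estimate, grok_estimate), 3))
```

On those numbers, o3 would be expected to take roughly nine points out of every ten against Grok, which fits the one-sided final.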

“o3 is fairly ruthless in conversions; it looks like a chess player. Grok looks like it learnt a few opening moves and knows the rules, but not much more,” Carlsen said. “Grok’s moves are chess-related moves. They just came at the wrong time and in weird sequences.”

Chess AI

The chess wasn't the main point of the tournament, despite its prominence. It was about how general-purpose AI models handle tasks with strict rules, such as a game of chess. Turns out, they're not great, but o3 is the best of the limited sample. As AI becomes embedded in everything, the ability to follow rules and spot patterns becomes essential. Chess is a uniquely transparent way to observe that. You either made the right move or you didn’t. When a model plays well, you can see the logic; otherwise, queens fall like dominoes, and the game becomes as confused as that metaphor.

Chess is a window into how well an AI can plan, evaluate options, avoid catastrophic mistakes, and stay logically consistent. If Grok throws away a queen because it doesn’t grasp long-term consequences, what might it do in a legal document, or when booking travel?

That the final was between OpenAI and xAI did add some drama, with Sam Altman and Elon Musk publicly at loggerheads. The chess final didn’t resolve the battle between them, but it did give OpenAI a PR win and a limited but very real compliment from Magnus Carlsen.

Eric Hal Schwartz is a freelance writer for TechRadar with more than 15 years of experience covering the intersection of the world and technology. For the last five years, he served as head writer for Voicebot.ai and was on the leading edge of reporting on generative AI and large language models. He's since become an expert on the products of generative AI models, such as OpenAI’s ChatGPT, Anthropic’s Claude, Google Gemini, and every other synthetic media tool. His experience runs the gamut of media, including print, digital, broadcast, and live events. Now, he's continuing to tell the stories people want and need to hear about the rapidly evolving AI space and its impact on their lives. Eric is based in New York City.
