A Chinese AI just solved a decade-old math problem in 80 hours with zero human help and proved it

Complex mathematics solved and verified — (Image credit: AI Generated)

The dual agent AI system autonomously solved Anderson's conjecture from 2014
Rethlas explores problem-solving strategies like a human mathematician would
Archon transforms potential proofs into projects for the Lean 4 verifier

A research team led by Peking University developed a dual-agent AI system capable of solving advanced mathematical problems while also verifying its own results.

The system resolved a conjecture proposed in 2014 by Dan Anderson, completing the process within 80 hours of runtime.

"Using this framework, we successfully solved an open problem in commutative algebra and automatically formalized the proof with essentially no human intervention," the researchers wrote in a preprint paper published on arXiv.

How the dual-agent framework actually works

The AI tool applies a reasoning system called Rethlas, which draws from a math theorem search engine named Matlas to explore problem-solving strategies.

When Rethlas produces a potential proof, a second system called Archon uses another search engine called LeanSearch to transform that proof into a project for an interactive theorem prover.

The theorem prover, Lean 4, is also a programming language with a community-maintained library containing hundreds of thousands of theorems and definitions.

The researchers noted that no mathematical judgment was required from the human operator during the problem-solving process.

The AI system performed mathematical tasks faster than any human, including independently doing work that would normally require collaboration between experts in different fields.

However, the team also found that a mathematician could speed up the process by guiding Archon when needed.

"This work provides a concrete example of how mathematical research can be substantially automated using AI," the researchers stated.

Mathematical proofs demand complete rigor, yet even expert-written proofs may contain subtle flaws.

Similarly, proofs produced by large language models are prone to hallucination and are far less reliable than formal verification methods.

The Chinese team's framework bridges the gap between natural language reasoning and formal machine verification, allowing the AI system to both solve problems and verify its own findings.

"Our work illustrates a promising paradigm for mathematical research in which informal and formal reasoning systems operate in tandem to produce verifiable results," the researchers noted.

The paper has not yet been peer-reviewed by experts, so independent verification is still pending.

Anderson's conjecture was a relatively obscure problem in commutative algebra, which makes the AI's achievement noteworthy.

However, this feat is not comparable to solving a millennium prize-level challenge like the Riemann Hypothesis or the P vs NP problem.

Whether this approach scales to more difficult mathematical problems remains to be seen.

That said, for a field that has resisted automation for centuries, this represents a notable milestone.

Via The Independent

Follow TechRadar on Google News and add us as a preferred source to get our expert news, reviews, and opinion in your feeds. Make sure to click the Follow button!

And of course you can also follow TechRadar on TikTok for news, reviews, unboxings in video form, and get regular updates from us on WhatsApp too.

'Essentially no human intervention': Chinese AI solves 12-year-old math problem in just 80 hours — and even proves it

How the dual-agent framework actually works

Useful links