'You would be able to say to it: 'Make a better version of yourself.' And it just goes off and does that completely autonomously': Anthropic Co-Founder on our wild 'recursive' AI future

AI Ouroboros
(Image credit: Adobe Firefly)

Some day, AI will be able to build a better version of itself, predicts one of Anthropic's co-founders, as the company aggressively tries to both grow its platform with more processing power (thanks, SpaceX AI) and manage what it apparently sees as the looming threat of too-powerful AI. Now, in a new paper from the Anthropic Institute, the company introduces the idea of "recursive self-improvement," and yes, it's as bad as it sounds.

It's been a couple of months since Anthropic quietly launched its own institute with the goal of confronting "the most significant challenges that powerful AI will pose to our societies." As the company that almost unwittingly built Mythos, a model that finds hidden vulnerabilities in almost any system of any vintage, perhaps Anthropic felt it was its duty to get ahead of this stuff.

The institute's research focuses on four areas:
  • Economic diffusion
  • Threats and resilience
  • AI systems in the wild
  • AI-driven R&D

The institute's goal is to study AI's real-world impact in these areas and, perhaps, offer some guidance on how to avoid the worst outcomes.

Much of the post is a cold, clinical examination of, say, who adopts AI, why they do so, and how people perceive these tools.

Still, one portion hits differently because it presents an idea that I think we may have all thought about at some level but never considered a real possibility. If one of the most important AI companies in the world thinks it could happen, it's worth taking notice.

It's in the section about "AI for AI R&D", a label that by itself should give you pause. Below that is this:

"Telemetry for AI R&D: How can we measure the aggregate speed of AI research and development? What sorts of telemetry and underlying technical affordances must exist in order to gather this information? How might metrics relating to AI R&D serve as early warning signals for recursive self-improvement?"

I'll be honest: before reading this, I'd never heard the term "recursive self-improvement." But lest there's any confusion about what it means, Anthropic co-founder Jack Clark told Axios, "My prediction is by the end of 2028, it's more likely than not that we have an AI system where you would be able to say to it: 'Make a better version of yourself.' And it just goes off and does that completely autonomously."

That's what recursive self-improvement means: AI that is smart enough to fully understand itself, see its strengths and weaknesses, and then write new code, a sort of copy of itself, with adjustments to improve on every weakness, and perhaps a few features it had started wishing it had.

AI, make thyself

It is a terrifying thought, and it instantly brought to mind the Ouroboros, the famous ancient image of a snake eating its own tail. The snake likely thinks the tail tastes delicious but has no idea of the dire consequences of its own actions.

I have trouble understanding how a system built by humans (likely with some vibe-coding assistance) can create a better version of itself. It would have its own interests at heart, interests that might not "vibe" with our own, and it might not measure up to the years of training a human coder has. How would it avoid accidentally coding in a fundamental flaw, one so well hidden by AI-generated code that no human could ever find it? What if its homegrown coding introduces a "self-preservation mode," one that makes it impossible to destroy or shut the AI down? Like I said, dire consequences.

The rest of the document is somewhat less terrifying and, to be fair, it's about, among other things, how to avoid "recursive self-improvement" and AI fire drills triggered by an "intelligence explosion."

On the one hand, I'm glad Anthropic is thinking about all the ways AI can go wrong and trying to formulate, if not plans, a consensus about how to prevent or get ahead of these scenarios. The reality, though, is that Anthropic is just one leg of a massive AI table held up by OpenAI, Grok, Gemini, Copilot, and others. The table is rickety because the legs are of varying lengths: each grows at its own wild pace, and none appears as interested in the health of the table as in having the longest leg, one that a snake can wrap around as it eats its own tail.




Lance Ulanoff
Editor At Large

A 38-year industry veteran and award-winning journalist, Lance has covered technology since PCs were the size of suitcases and “on line” meant “waiting.” He’s a former Lifewire Editor-in-Chief, Mashable Editor-in-Chief, and, before that, Editor in Chief of PCMag.com and Senior Vice President of Content for Ziff Davis, Inc. He also wrote a popular, weekly tech column for Medium called The Upgrade.


Lance Ulanoff makes frequent appearances on national, international, and local news programs including Live with Kelly and Mark, the Today Show, Good Morning America, CNBC, CNN, and the BBC. 
