AI Will Soon Be Able To Improve Without Human Involvement, Anthropic Says
Anthropic said in a new blog post that “recursive self-improvement,” where AI can improve itself without human involvement, could come sooner than expected.
The company noted that new data shows its frontier models have become faster at coding, debugging and research. That trend, it added, could form a feedback loop in which the tools create better models on their own.
“The big story here is what we see are indications that, contrary to some popular opinion, AI progress is going to speed up in coming years rather than stay the same, or diminish,” Anthropic’s Jack Clark told Axios.
“As organizations, and eventually probably as societies, we need to figure out the tools to validate and verify that the stuff being done by these AI systems is correct and is aligned with human intentions aligned with a thriving society,” he added.
The company noted that recursive self-improvement has not been achieved yet and is not “inevitable,” but could come soon. That scenario could bring both benefits and risks. The benefits could reach fields like science and healthcare, while the risks include that it could “increase the risks of humans losing control over AI systems.”
“If systems are capable of fully building their own successors, the ways we secure them, monitor them, and shape their behavior all grow much more important,” the post noted.
Elsewhere in the post, Anthropic highlighted that “the human role is narrowing at each step in the AI development process.”
“Once human- and AI-authored code quality reach parity, humans will stop writing code entirely, and shift to only reviewing it. But if they can’t review code as quickly as Claude can generate it, human review will become the bottleneck to AI development.”
At the moment, humans are better at “research taste and judgment, including choosing which problems matter, which results to trust, and when an approach is a dead end,” Anthropic said.
And even if Claude never matches humans in those areas, “a conservative reading of our evidence still implies compounding acceleration.”
“If humans spend most of their time on the single-digit fraction of work that is direction-setting, while Claude handles the rest, that means each engineer or researcher is steering far more work than before,” the post said.
But that could change, and AI could in fact develop good research taste. In that case, one of the future scenarios could be recursive self-improvement. “In this world, the pace of progress in AI development becomes determined entirely by the availability of compute (or the speed of discovering various efficiencies in algorithmic training or inference) for AI systems,” Anthropic said.
In this scenario, humans would play a “substantially diminished role in their development, likely moving most of our effort towards oversight, validation, and verification of an expanding ‘virtual lab’ run by AI systems.”
The company noted that it cannot predict what the world would look like in that case. “It is difficult to predict what the economy looks like if human labor stops being competitive.”
As a result, Anthropic said that if it “were possible to effectively slow the development of this technology to give ourselves more time to deal with its immense implications, we think that would likely be a good thing.”
However, it said that is not possible because “if a slowdown simply lets the least cautious actors catch up technologically, it could leave everyone less safe.”
The post concludes by saying the company anticipates it will “organize conversations where policymakers, researchers, civil society, and other AI companies can help answer some of the questions this piece raises, especially around full recursive self-improvement and how to create better options for coordination and deliberation.”