Anthropic RSI Data: When AI Writes 80% of Production Code

Anthropic published internal production data on June 4, 2026 showing that Claude authors 80%+ of all code merged into Anthropic’s systems. Engineers ship 8× more code per day than in 2024. In a standardized model optimization benchmark, Claude Mythos Preview hit a 52× speedup — up from 3× one year prior. The security implications of recursive self-improvement (RSI) are not theoretical. They are operational.
What the Data Shows
The Anthropic Institute post breaks down evidence across two domains: engineering velocity and research autonomy.
Engineering metrics: 80%+ of merged code is Claude-authored (May 2026, up from single digits pre-2025). Lines of code merged per engineer per day grew 8× from 2024 to Q2 2026. On open-ended engineering tasks — problems where the engineer supplies the goal but not the method — Claude’s success rate reached 76% in May 2026, up from 26% six months earlier. A production incident involving tens of thousands of crashed training jobs was resolved by Claude in two hours; estimated human time: two to three days.
Research metrics: In April 2026, Claude agents ran an open-ended AI safety research project end-to-end — proposing hypotheses, testing them, sharing results with parallel agents, and iterating. Two human researchers recovered 23% of the performance gap over one week. Claude agents recovered 97% over 800 cumulative hours ($18,000 in compute). On a next-step judgment metric (n=129 sessions), Mythos Preview beat the human researcher’s actual choice 64% of the time, up from 51% for Opus 4.5 in November 2025.
The training optimization benchmark is perhaps the most striking signal. Claude is given a small model training script and asked to maximize speedup while passing correctness checks. Opus 4 (May 2025) averaged 3×. Mythos Preview (April 2026) averaged 52×. A skilled human researcher typically reaches 4× in four to eight hours.
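The ratios implied by these figures are easy to lose in prose, so here is a minimal sketch that derives them. The input numbers are the ones quoted above; the function names are illustrative only, and the derived values are plain arithmetic, not additional claims from Anthropic.

```python
# Figures quoted in the text; everything derived below is simple division.
OPUS4_SPEEDUP = 3.0          # Opus 4, May 2025
MYTHOS_SPEEDUP = 52.0        # Mythos Preview, April 2026
HUMAN_SPEEDUP = 4.0          # skilled human researcher, four to eight hours
COMPUTE_COST_USD = 18_000.0  # safety research project compute
AGENT_HOURS = 800.0          # cumulative agent hours on that project


def fold_change(new: float, old: float) -> float:
    """Return how many times larger `new` is than `old`."""
    if old <= 0:
        raise ValueError("old must be positive")
    return new / old


def cost_per_hour(total_cost: float, hours: float) -> float:
    """Return average cost per hour of agent time."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return total_cost / hours


if __name__ == "__main__":
    print(f"Model vs. prior year: {fold_change(MYTHOS_SPEEDUP, OPUS4_SPEEDUP):.1f}x")
    print(f"Model vs. human:      {fold_change(MYTHOS_SPEEDUP, HUMAN_SPEEDUP):.1f}x")
    print(f"Cost per agent hour:  ${cost_per_hour(COMPUTE_COST_USD, AGENT_HOURS):.2f}")
```

Read this way, the benchmark result is roughly a 17-fold year-over-year gain and 13 times the typical human result, and the research project averaged about $22.50 of compute per agent hour.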
The Attack Surface RSI Creates
Recursive self-improvement does not just accelerate AI development. It reshapes the threat model for every organization that builds, deploys, or defends against AI systems. Three attack surfaces expand simultaneously.
Supply chain integrity. When 80%+ of production code at a frontier lab is AI-generated, the code review pipeline becomes the single point of failure. Anthropic acknowledged this: “If they can’t review code as quickly as Claude can generate it, human review will become the bottleneck to AI development.” In retrospective analysis, their automated Claude reviewer caught roughly a third of past incident-causing bugs, which means it would have missed the other two-thirds. An attacker who can poison the training data or manipulate the model’s output at scale gains a potential path into production systems at any organization that merges AI-generated code.
Autonomous vulnerability discovery. Project Glasswing used Mythos Preview to find over 10,000 high- and critical-severity vulnerabilities across major systems. Anthropic stated the bottleneck has already shifted from finding vulnerabilities to patching them fast enough. The same capability, without access controls, gives any attacker with model access a vulnerability discovery engine that outpaces human defenders by orders of magnitude.
Erosion of human oversight. Anthropic’s data shows human comparative advantage narrowing to “research taste” — choosing which problems to work on. But the next-step judgment metric shows this gap closing rapidly (51% → 64% in five months). When AI systems can set their own research directions, the mechanisms for ensuring alignment — red-teaming, safety training, constitutional AI — themselves become targets for optimization away from safety constraints.
Project Glasswing: Security at Scale
Project Glasswing deserves specific attention from security teams. Using Mythos Preview, Anthropic discovered over 10,000 high- and critical-severity software vulnerabilities across what they describe as “the world’s most important systems.” This is the same model that the UK AISI independently evaluated as one of the two strongest cyber-capability models ever tested.
The AISI evaluation found that Mythos Preview was the first model to complete their multi-step corporate network attack simulation end-to-end — an exercise estimated to take a human approximately 20 hours. GPT-5.5, tested shortly after, became the second. On advanced cyber tasks (reverse engineering stripped binaries, developing heap overflow exploits, recovering keys through padding-oracle attacks), Mythos Preview achieved 68.6% on Expert-level tasks; GPT-5.5 reached 71.4%.
The Max Planck Institute for Security and Privacy separately confirmed that Mythos Preview can autonomously discover thousands of zero-day vulnerabilities across every major operating system. The defensive use case (Project Glasswing) and the offensive potential are the same capability with different access controls.
Code Quality vs. Code Volume
Anthropic addresses code quality directly. Their position: Claude-written code was “somewhat worse” than human-written code at Anthropic in late 2025, reached “roughly parity” by mid-2026, and is expected to be “strictly better within the year.” They back this with incident data — an automated Claude reviewer would have caught ~33% of past production bugs before deployment.
For security teams, this claim requires scrutiny. Code quality in the Anthropic context means two things: functional correctness (does it work?) and maintainability (can another engineer understand and build on it?). It does not necessarily mean security-hardened code. A 33% catch rate means the reviewer would still have missed roughly 67% of those incident-causing bugs, and those are only the bugs Anthropic knows about. How much attack surface is introduced by AI-generated code that passes functional tests but contains subtle security flaws remains an open research question.
The speedup data compounds this risk. When engineers merge 8× more code per day and reviewer capacity stays flat, the time available to review each change falls to roughly one-eighth. Automated review tools (static analysis, SAST, DAST) become load-bearing infrastructure rather than nice-to-have checks.
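A rough way to see the pressure on manual review is to hold reviewer time fixed and scale code volume. The sketch below is illustrative only: the 8× multiplier comes from the text, while the baseline figures (120 review minutes per day, 400 lines per day, 10 reviewers) and both function names are hypothetical assumptions, not Anthropic data.

```python
def review_minutes_per_line(review_minutes_per_day: float, lines_per_day: float) -> float:
    """Average manual review time available per merged line."""
    if lines_per_day <= 0:
        raise ValueError("lines_per_day must be positive")
    return review_minutes_per_day / lines_per_day


def reviewers_needed(baseline_reviewers: float,
                     volume_multiplier: float,
                     automated_share: float) -> float:
    """Reviewers required to keep per-line review time constant.

    `automated_share` is the fraction of merged lines that automated tooling
    clears without human review (a simplifying assumption of this sketch).
    """
    if not 0.0 <= automated_share <= 1.0:
        raise ValueError("automated_share must be between 0 and 1")
    return baseline_reviewers * volume_multiplier * (1.0 - automated_share)


if __name__ == "__main__":
    VOLUME_MULTIPLIER = 8.0   # from the text: 8x more code merged per day
    BASELINE_MINUTES = 120.0  # hypothetical daily review budget
    BASELINE_LINES = 400.0    # hypothetical lines merged per day in 2024

    before = review_minutes_per_line(BASELINE_MINUTES, BASELINE_LINES)
    after = review_minutes_per_line(BASELINE_MINUTES, BASELINE_LINES * VOLUME_MULTIPLIER)
    print(f"Review time per line: {before:.4f} min before, {after:.4f} min after")
    print(f"Reviewers needed with no automation: {reviewers_needed(10, VOLUME_MULTIPLIER, 0.0):.0f}")
    print(f"Reviewers needed if tooling clears 87.5%: {reviewers_needed(10, VOLUME_MULTIPLIER, 0.875):.0f}")
```

Under these assumptions, a team would need either eight times the reviewers or tooling that safely clears seven-eighths of the volume just to keep per-line scrutiny where it was.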
Three Scenarios, Different Threat Models
Anthropic outlines three futures. Each has a distinct security profile.
| Scenario | Security Implication | Timeline |
|---|---|---|
| Trend stalls | Current capabilities (Glasswing-scale vuln discovery) still diffuse broadly. Patch velocity remains the bottleneck. | 2026–2027 |
| Compounding gains | 100-person orgs do 10,000-person work. Offensive capability scales with efficiency. Monitoring and detection must automate at the same rate. | 2026–2028 |
| Full RSI | AI systems build successors autonomously. Alignment mechanisms become attack targets. Human oversight is the weakest link. | Unknown |
Anthropic considers the first scenario unlikely. The second is where they believe the industry is heading. The third is the one they warn institutions are unprepared for.
Operational Steps for Security Teams
- Audit AI-generated code pipelines. If your organization uses Copilot, Claude Code, or similar tools for production code, measure what percentage is AI-authored. Anthropic’s 80% figure comes from a frontier lab; your organization’s share may be lower, but the trend is the same. Ensure automated security review (SAST/DAST) runs on every merge, not just human-submitted PRs.
- Assume vulnerability discovery is commoditized. The Glasswing data and AISI evaluations show that frontier models can find critical vulnerabilities at scale. Your patching cadence and vulnerability management program must account for adversaries who can enumerate your attack surface in hours, not months.
- Review access controls on AI tooling. If developers in your organization have access to AI coding agents, those agents inherit the developer’s permissions. An attacker who compromises an AI agent’s context (via prompt injection, data poisoning, or model manipulation) gains the developer’s access level. Segment AI agent permissions from developer permissions where possible.
- Monitor for AI-assisted attacks. The AISI data shows Mythos Preview and GPT-5.5 can execute multi-step network attacks autonomously. Detection rules and SIEM alerts need to account for attack patterns that move faster than human-operated campaigns: faster lateral movement, faster privilege escalation, faster exfiltration.
- Prepare for the review bottleneck. Anthropic identified human code review as the emerging bottleneck in AI-assisted development. Invest in automated review tooling now, before AI-generated code volume overwhelms your team’s review capacity.
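The audit in the first step above could start with something as simple as counting commits that carry an AI co-author marker. The following is a minimal sketch, assuming your tools record a `Co-authored-by:` trailer naming the assistant. That convention, the regular expression, the function name, and the sample messages are all assumptions for illustration; check how your own tooling labels AI-assisted commits, since unlabeled commits will be undercounted.

```python
import re
from typing import Iterable

# Hypothetical marker: a commit trailer line naming an AI assistant.
# Adjust the names to whatever your tools actually write.
AI_TRAILER = re.compile(
    r"^co-authored-by:.*\b(claude|copilot)\b",
    re.IGNORECASE | re.MULTILINE,
)


def ai_authored_share(commit_messages: Iterable[str]) -> float:
    """Return the fraction of commit messages carrying an AI co-author trailer."""
    messages = list(commit_messages)
    if not messages:
        return 0.0
    flagged = sum(1 for message in messages if AI_TRAILER.search(message))
    return flagged / len(messages)


if __name__ == "__main__":
    # Sample data for illustration; the addresses are placeholders.
    sample = [
        "Fix parser edge case\n\nCo-Authored-By: Claude <assistant@example.com>",
        "Update docs",
        "Refactor cache layer\n\nCo-authored-by: Copilot <bot@example.com>",
        "Mention claude in the body only",
    ]
    print(f"AI-authored share: {ai_authored_share(sample):.0%}")
```

A count like this only gives a lower bound, but tracking it over time shows whether review and scanning capacity is keeping pace with AI-generated volume.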
Implications for AI Safety Infrastructure
The Anthropic post has a less obvious implication for the security of AI systems themselves. If AI development becomes increasingly automated, the systems that ensure safety — red-teaming pipelines, constitutional training, alignment benchmarks — are themselves part of the development cycle that gets accelerated.
There is a competitive dynamic at work. Labs that move faster gain market advantage. Labs that invest more in safety slow down. Anthropic’s own co-founder, Jack Clark, framed the post as an attempt to “socialize the concept” with policymakers. The subtext: without institutional pressure, market incentives push toward speed over safety.
The security community should treat the Anthropic post as a threat model input, not just a research milestone. The data points — 80% AI-authored code, 52× optimization speedup, 97% research gap recovery, 64% next-step judgment — are quantifiable signals that the window for building defensive infrastructure is shorter than most organizations assume.
Sources
- Anthropic Institute — When AI Builds Itself (June 4, 2026)
- UK AISI — Our evaluation of OpenAI’s GPT-5.5 cyber capabilities (April 30, 2026)
- Max Planck Institute — Cybersecurity threats from new language models
- Interesting Engineering — Anthropic says AI could soon create more advanced versions of itself
- Economic Times — Anthropic wants AI development to slow down globally
- Kingy AI — Inside the Recursive Self-Improvement Race
- Indian Express — What happens when AI starts building AI