When Anthropic’s AI model Claude Mythos scanned Firefox’s source code earlier this year, it found 271 confirmed security vulnerabilities in a single pass — bugs that had survived human review, some of them for many years. Three days ago, a quote about the model and the National Security Agency’s classified systems went viral, sparking headlines that “NSA confirms breach.” The journalist who published the quote has since walked it back.
The two stories describe the same technology. The gap in how well-evidenced they are shows how AI capability claims travel — and why that matters right now, with Anthropic’s most powerful models still offline under a US export control order and Congress demanding answers by June 26.
The Firefox discovery: confirmed bugs in one pass
Mozilla’s engineers had experimented with earlier AI models for static code analysis. GPT-4 and Claude Sonnet 3.5 flagged too many false positives to scale. What changed was not a more capable model alone, but a new kind of system: an agentic harness that actively confirms whether a bug can be triggered.
The pipeline works in four steps. Mythos sits inside an isolated container with Firefox source code. It forms hypotheses about where vulnerabilities might live. The model generates proof-of-concept test cases and runs them against a live browser. A bug is reported only if the model confirms it through execution. That dynamic verification step — rather than a static suggestion — filters out false positives before any human sees the report.
Mozilla built this atop its existing fuzzing infrastructure, parallelized across multiple virtual machines, each hunting bugs in a specific file. The team integrated deduplication, triage, patch tracking, and release management. More than 100 contributors reviewed, tested, and shipped the resulting patches.
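The confirm-by-execution gate and the deduplication step described above, sketched as an added example; it is not Mozilla's or Anthropic's code. The target is a constructed stand-in script that prints an AddressSanitizer-style report, and the names reproduce, triage, FAKE_TARGET and the candidate records are assumptions made for illustration.

```python
import hashlib
import re
import subprocess
import sys

# Constructed stand-in for a sanitizer-instrumented build; it is not a browser.
# It emits an ASan-style report when the input contains a known marker.
FAKE_TARGET = r'''
import sys
data = sys.stdin.read()
stacks = {"trigger-A": ["Layout::Reflow", "Frame::Paint"],
          "trigger-B": ["Ipc::Recv", "Channel::Dispatch"]}
for marker, frames in stacks.items():
    if marker in data:
        sys.stderr.write("==1==ERROR: AddressSanitizer: heap-use-after-free\n")
        for i, f in enumerate(frames):
            sys.stderr.write(f"    #{i} 0x{i:04x} in {f} file.cpp:{i + 1}\n")
        sys.exit(1)
'''
FRAME = re.compile(r"^\s*#\d+ 0x[0-9a-f]+ in (\S+)", re.M)

def reproduce(test_case, timeout=10):
    """Run one test case; return a crash signature, or None if it did not crash."""
    try:
        proc = subprocess.run([sys.executable, "-c", FAKE_TARGET], input=test_case,
                              capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return None  # a hang is not counted as a confirmed finding here
    if proc.returncode == 0 or "AddressSanitizer" not in proc.stderr:
        return None  # hypothesis not confirmed by execution: never reported
    frames = FRAME.findall(proc.stderr)[:3]  # top stack frames identify the bug
    return hashlib.sha256("|".join(frames).encode()).hexdigest()[:12]

def triage(candidates):
    """Keep only candidates confirmed by execution, one report per signature."""
    reports = {}
    for cand in candidates:
        sig = reproduce(cand["test_case"])
        if sig is not None and sig not in reports:
            reports[sig] = cand["hypothesis"]
    return reports

# Constructed candidates: two hit the same bug, one is a false positive.
CANDIDATES = [
    {"hypothesis": "stale frame pointer after reflow", "test_case": "x trigger-A"},
    {"hypothesis": "reflow during paint", "test_case": "trigger-A y"},
    {"hypothesis": "unchecked length in parser", "test_case": "benign input"},
    {"hypothesis": "race on IPC message", "test_case": "trigger-B"},
]
if __name__ == "__main__": print(triage(CANDIDATES))
```

Four hypotheses go in and two reports come out: the unconfirmed one is dropped before anyone reviews it, and the two that crash in the same stack collapse into a single entry.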
Firefox 150 was released April 21, 2026. It included fixes for the vulnerabilities Mythos found in a single evaluation pass. Mozilla rated 180 of them sec-high — exploitable by visiting a malicious webpage — and 80 as sec-moderate. The total patched that month was 423 vulnerabilities, roughly 20 times Mozilla’s monthly average of about 21 in 2025.
Among the publicly disclosed bugs: a 15-year-old flaw in the HTML legend element triggered by edge cases across distant parts of the browser, a 20-year-old vulnerability in Firefox’s XSLT engine, and a race condition over an inter-process communication boundary that could enable sandbox escape and full browser compromise.
Firefox CTO Bobby Holley said the findings gave the security team “vertigo.” His conclusion: software like Firefox is modular enough that its defects are finite, and the team was “entering a world where we can finally find them all.”
Results from other organizations back this up. Cloudflare found 2,000 vulnerabilities across its critical-path systems — 400 rated high or critical — with a false-positive rate better than human testers, according to Anthropic’s May 2026 Glasswing update. Across roughly 50 partner organizations in the program’s first month, more than 10,000 high- or critical-severity vulnerabilities were identified. Six independent security firms validated a sample of 1,752 findings and confirmed a 90% true-positive rate.
The UK AI Security Institute benchmarked the model independently. It was the first AI model to complete a 32-step corporate network attack simulation from start to finish — a workflow estimated to take a human expert roughly 20 hours. Mythos did it on three of ten attempts. AISI noted its test range lacked active defenders, so the result showed the model can attack weakly-defended simulated networks, not hardened enterprise infrastructure.
The NSA claim that unraveled
The viral statement originated at a Senate Intelligence Committee hearing on June 11. Sen. Mark Warner told colleagues that Gen. Joshua Rudd — who leads the NSA and US Cyber Command — had relayed that Mythos broke into “almost all of our classified systems, not in weeks, but in hours.”
The publication ran that quote in a June 14 piece on export controls. It went largely unnoticed for a week. On June 21, a social media post amplified the sentence — stripped of context — under headlines like “NSA confirms AI breach.”
Shashank Joshi, the publication’s defense editor who wrote the piece, responded within 24 hours. He said reading the quote literally would be a mistake, and the capability described depended on Mythos working alongside other tools in specific conditions. The circulating picture, he indicated, was not what the article intended.
Security researchers quickly flagged technical problems with the literal interpretation. NSA systems operate across multiple classification levels, including physically isolated machines with no network connection. Lateral movement across such systems without a human physically transporting a payload is not possible through remote access alone. The more credible reading — consistent with multiple analysts and at least one other report — is that the agency used the model to probe its own systems for vulnerabilities, exactly as Mozilla did with Firefox. That is a red-team exercise, not a hostile breach.
That context matters. When Mozilla red-teamed Firefox, the result was patches. When the NSA red-teamed its own infrastructure, the result would be a classified inventory of vulnerabilities and a remediation program. Neither constitutes a breach suffered.
As of June 24, 2026, Anthropic’s two most powerful models — Fable 5 and Mythos 5 — remain offline globally. On June 12, the US Commerce Department ordered the company to suspend access for all foreign nationals, including its own non-citizen employees. Unable to filter by nationality in real time, Anthropic disabled both models for all users within hours. This is the first time US export control authorities have been applied to a commercially deployed AI model.
Four bipartisan members of Congress sent a formal letter to Commerce Secretary Howard Lutnick on June 18 demanding a written explanation. The deadline for that response has been set.
The Mozilla case is a model for documented AI capability: bug IDs, CVE numbers, a detailed technical post from its engineers, and independent verification from multiple organizations. The NSA claim, in its current form, has one sentence relayed through two speakers in a political context — and the journalist who reported it walked back the literal reading. That is not an argument about whether Mythos can execute offensive operations. The AISI benchmark and thousands of documented vulnerabilities show it can. What is not documented is a hostile, unauthorized breach of NSA classified systems. Those are different claims, and the evidence for each is not interchangeable.
