Claude Develops Autonomous N-Day Exploits for Firefox and Windows Kernel

Bottom line: LLMs can significantly accelerate exploit development for known vulnerabilities, eroding the patch gap that defenders have traditionally relied on as a time buffer.

Anthropic has investigated how large language models can accelerate exploit development for already-known vulnerabilities (N-Days). The findings: its leading model, Claude Mythos Preview, independently developed 8 working code execution exploits for vulnerabilities fixed by Firefox patches and 8 complete exploit chains for Windows kernel patches.

The new study examines how far LLMs can automate exploit development for N-Day vulnerabilities. N-Days are publicly known security vulnerabilities that attackers can exploit for as long as some affected systems remain unpatched. The classic exploit development process begins with “patch diffing”: attackers compare the code before and after a security update to locate the vulnerability the patch fixes and work out how to trigger it.
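
The comparison step in patch diffing can be illustrated with a minimal Python sketch. It is not taken from the study: the two code snippets, the `buf_t` type, and the `changed_lines` helper are hypothetical, and real patch diffing usually works on compiled binaries rather than a few lines of source. The sketch only shows the basic idea of isolating what a patch changed.

```python
import difflib

# Hypothetical pre-patch and post-patch versions of one function.
# Both snippets are invented for illustration and come from no real product.
BEFORE = """\
int read_item(buf_t *b, size_t idx) {
    return b->items[idx];
}
""".splitlines()

AFTER = """\
int read_item(buf_t *b, size_t idx) {
    if (idx >= b->len)
        return -1;
    return b->items[idx];
}
""".splitlines()


def changed_lines(before, after):
    """Return (removed, added) lines between two versions of a file."""
    removed, added = [], []
    matcher = difflib.SequenceMatcher(a=before, b=after, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed.extend(before[i1:i2])
        if tag in ("replace", "insert"):
            added.extend(after[j1:j2])
    return removed, added


removed, added = changed_lines(BEFORE, AFTER)
for line in added:
    print("+", line)
```

In this made-up case the only change is a newly added bounds check, which tells a reader that the unpatched function accepted an out-of-range index.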

Historically, patch diffing has been a specialized, time-consuming procedure. WannaCry emerged approximately 59 days after publication of the MS17-010 patch; the public exploit for Citrix Bleed required roughly two weeks. A 2020 Mandiant analysis showed that 16 of 25 N-Day exploits took over a month to develop. This delay gave defenders time to roll out patches. It is precisely this buffer that LLMs could eliminate.

In the study, Anthropic evaluated 18 security patches for Mozilla’s SpiderMonkey JavaScript engine (Firefox 148 and 149, released in February and March) and 21 Windows kernel patches. Claude Mythos Preview, Anthropic’s most powerful model, autonomously developed working code execution exploits for 8 of the Firefox patches. For the Windows kernel patches, the model created 8 complete exploit chains that escalated from low privileges to SYSTEM, the highest privilege level. Even Anthropic’s public models without security measures could generate exploits, though in smaller numbers.
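
To put these counts in proportion, the short Python sketch below converts them into percentages. The `success_rate` helper is a hypothetical name, and the Windows figure assumes that each of the 8 exploit chains corresponds to exactly one of the 21 patches, which this summary does not state explicitly.

```python
def success_rate(successes: int, total: int) -> float:
    """Return successes as a percentage of total, rounded to one decimal."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * successes / total, 1)


# Counts as reported in this article.
firefox_rate = success_rate(8, 18)
# Assumption: one exploit chain per Windows kernel patch.
windows_rate = success_rate(8, 21)

print(f"Firefox: {firefox_rate}%")
print(f"Windows kernel: {windows_rate}%")
```

Under that assumption, the model succeeded on roughly 44% of the Firefox patches and 38% of the Windows kernel patches.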

For CISOs, this means a significantly smaller time buffer. Mozilla was chosen as a best-case scenario: browsers update automatically, and the median patch gap stood at 19 days. If LLMs can produce exploits within such a short gap, it is reasonable to assume that enterprise software, where remediation typically takes weeks to months, is at even greater risk. The study underscores that defenders must substantially speed up patch deployment.


Source: www.anthropic.com · Published June 8, 2026
Lumi AI News — AI-assisted curation pursuant to Art. 50 EU AI Act. Paraphrase and classification by Lumi News Pipeline v1.7.1.
