In brief: Anthropic abandons covert throttling of Claude during frontier LLM research and will make safeguards more transparent going forward.

Anthropic has withdrawn a policy in the System Card for Claude Fable/Mythos that was intended to silently identify and limit requests targeting frontier LLM development. The company acknowledges that it struck the wrong balance.

Anthropic announced that it intends to adjust the safeguards that apply to Fable 5 in frontier LLM development. Previously, an undisclosed provision had Claude automatically identify “requests targeting frontier LLM development” and restrict their effectiveness without notifying the user.

This particularly affects AI researchers who use Claude as a tool in their work. The silent throttling was criticized because it undermines developers' ability to trace and control model behavior, both core requirements for regulated and trustworthy AI systems.

Anthropic is now responding to the public criticism with a reversal: safeguards will be made visible going forward. “We got the tradeoff wrong and apologize for not getting the balance right,” the company stated in a response to Wired. The episode underscores a fundamental tension between safety measures and transparency toward users.

Source: simonwillison.net · Published June 11, 2026
Lumi AI News — AI-assisted curation in accordance with Art. 50 EU AI Act. Paraphrasing and classification by Lumi News Pipeline v1.6.5.

Anthropic Revises Safeguard Policy for Claude in Frontier LLM Research
