The Point: Claude Opus 4.8 and Sonnet 5 frequently invent non-existent parameters when using editing tools, causing third-party development environments like Pi to fail.
Newer Anthropic models such as Claude Opus 4.8 and Sonnet 5 increasingly pass invalid parameters to code editing tools, whereas older models do not exhibit this behavior. The suspected cause is that the newer models were trained specifically on Claude Code’s own editing tool.
A developer named Armin observed that Claude Opus 4.8 regularly inserts invented fields into the nested edits[] array when handling Pi’s editing tool. Although the actual edits are usually correct, the additional parameters not defined in the schema cause Pi to reject the tool call, forcing the model to try again.
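To illustrate why a single extra field is enough to reject an otherwise correct call, here is a minimal sketch of strict validation over a nested edits[] array. The field names (`old_text`, `new_text`, `made_up_flag`) and the function `validate_edit_call` are assumptions made for illustration; they are not taken from Pi's actual tool definition.

```python
# Hypothetical schema: these field names are illustrative, not Pi's real ones.
ALLOWED_EDIT_FIELDS = {"old_text", "new_text"}


def validate_edit_call(args: dict) -> list[str]:
    """Return validation errors; an empty list means the call is accepted."""
    edits = args.get("edits")
    if not isinstance(edits, list):
        return ["edits must be a list"]

    errors = []
    for i, edit in enumerate(edits):
        if not isinstance(edit, dict):
            errors.append(f"edits[{i}] must be an object")
            continue
        present = set(edit)
        missing = ALLOWED_EDIT_FIELDS - present
        extra = present - ALLOWED_EDIT_FIELDS
        if missing:
            errors.append(f"edits[{i}] is missing fields: {sorted(missing)}")
        if extra:
            # Strict mode: unknown fields reject the call even if the edit is fine.
            errors.append(f"edits[{i}] has unknown fields: {sorted(extra)}")
    return errors


# The edit itself is valid, but the invented field makes the call fail.
call = {
    "path": "app.py",
    "edits": [{"old_text": "x = 1", "new_text": "x = 2", "made_up_flag": True}],
}
print(validate_edit_call(call))
```

A harness could instead drop unknown fields and apply the edit, which would avoid the retry at the cost of silently ignoring whatever the model meant by the extra parameter.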
An isolated erroneous tool call would be no surprise, since models occasionally generate malformed calls. What stands out is the trend: Opus 4.8 and Sonnet 5, Anthropic’s latest state-of-the-art models, exhibit this behavior, while older models do not. As overall model quality increases, the ability to adhere to a specific third-party tool schema seems to decline.
Armin suspects that Anthropic used reinforcement learning to train the newer models to make better use of Claude Code’s built-in editing tool, which is based on search and replace. A side effect would be that alternative coding harnesses with differently structured editing tools fail more often, much as OpenAI’s models, trained on the apply_patch tool, perform worse with other patching systems.
This raises a question for third-party providers like Pi: should they provide multiple variants of editing tools in order to select the tool with the lowest error rate depending on which Claude model is chosen?
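One way such a selection could work is sketched below: the harness counts rejected tool calls per model and tool variant, then offers the variant with the lowest observed error rate. The class `ToolVariantSelector`, the variant names, and the model identifiers are all hypothetical, and the counts are whatever the harness records at runtime rather than measured results.

```python
from collections import defaultdict


class ToolVariantSelector:
    """Track tool-call outcomes per (model, variant) and pick the best variant."""

    def __init__(self, variants):
        self.variants = list(variants)
        self.calls = defaultdict(int)
        self.failures = defaultdict(int)

    def record(self, model: str, variant: str, ok: bool) -> None:
        """Record one tool call and whether the harness accepted it."""
        self.calls[(model, variant)] += 1
        if not ok:
            self.failures[(model, variant)] += 1

    def error_rate(self, model: str, variant: str) -> float:
        n = self.calls.get((model, variant), 0)
        # Untried variants count as 0.0, so they are tried before known-bad ones.
        return self.failures.get((model, variant), 0) / n if n else 0.0

    def pick(self, model: str) -> str:
        # min() keeps the first variant on ties, so list order is the default.
        return min(self.variants, key=lambda v: self.error_rate(model, v))


selector = ToolVariantSelector(["multi_edit", "search_replace"])
selector.record("model-a", "multi_edit", ok=False)
selector.record("model-a", "multi_edit", ok=True)
selector.record("model-a", "search_replace", ok=True)
print(selector.pick("model-a"))
```

With only a handful of recorded calls the rates are noisy, so a real harness would likely want a minimum sample size before switching variants.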
Source: simonwillison.net · Published 5 July 2026
Lumi AI News — AI-assisted curation pursuant to Article 50 EU AI Act. Paraphrase and classification by Lumi News Pipeline v1.7.3.