Harness-1: Search Agent with Externalized State Management Trained via RL

2. June 2026
4. July 2026

AI Models

A 20B search agent achieves an average curated recall of 0.730 across eight benchmarks by applying RL training to explicit, externalized state rather than integrating state management into the policy.