35B Agent Model Achieves Trillion-Parameter System Performance Through Horizon Scaling

In brief: A 35B agent model trained with horizon scaling and multi-teacher distillation performs comparably to trillion-parameter models on long-horizon benchmarks.

Researchers have developed Agents-A1, a 35-billion-parameter mixture-of-experts model that competes with trillion-parameter systems on long-horizon tasks. Instead of scaling parameter count, the approach scales the length and complexity of agent actions.

Agents-A1 scales agent actions along two dimensions: long trajectories (averaging 45,000 tokens per sequence) and heterogeneous capabilities across six domains. Its infrastructure links external knowledge sources, actions, observations, and verifier outputs into coherent agent sequences.
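To make the idea of an agent sequence concrete, here is a minimal Python sketch of how knowledge lookups, actions, observations, and verifier outputs could be chained into one trajectory. The class names, the bracketed text format, and the whitespace-based token estimate are illustrative assumptions, not details reported for Agents-A1.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical step kinds, mirroring the four ingredients named in the article.
STEP_KINDS = ("knowledge", "action", "observation", "verifier")


@dataclass
class Step:
    kind: str
    text: str


@dataclass
class Trajectory:
    domain: str  # one of the model's task domains (the names are not given in the source)
    steps: List[Step] = field(default_factory=list)

    def add(self, kind: str, text: str) -> "Trajectory":
        """Append one step and return self so calls can be chained."""
        if kind not in STEP_KINDS:
            raise ValueError(f"unknown step kind: {kind!r}")
        self.steps.append(Step(kind, text))
        return self

    def render(self) -> str:
        """Flatten the steps into one text sequence, one tagged line per step."""
        return "\n".join(f"[{s.kind}] {s.text}" for s in self.steps)

    def approx_tokens(self) -> int:
        """Crude length estimate: whitespace-separated words, not real tokens."""
        return sum(len(s.text.split()) for s in self.steps)
```

Chaining `Trajectory(domain="chemistry").add("action", "...").add("observation", "...")` and calling `render()` yields a single sequence of the kind a model could be trained on. A real pipeline would measure length with the model's tokenizer rather than a word count.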

Training follows a three-stage recipe. First, full-domain supervised fine-tuning aligns the model with broad agent behaviors. Second, domain-specific teacher models supply specialized expertise. Third, multi-teacher, domain-routed on-policy distillation with salient-vocabulary alignment improves knowledge transfer across domains.
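The third stage can be illustrated with a small, dependency-free Python sketch of domain routing: each student-generated sequence is scored by the teacher for its domain, and the student is penalized for diverging from that teacher at every position. The function names, the teacher interface, and the use of reverse KL divergence are assumptions made for illustration. The source does not specify the loss, and the salient-vocabulary alignment step is not modeled here.

```python
import math
from typing import Callable, Dict, List, Sequence

Logits = Sequence[float]
# Hypothetical teacher interface: token ids in, one logit vector per position out.
Teacher = Callable[[List[int]], List[Logits]]


def softmax(logits: Logits) -> List[float]:
    """Numerically stable softmax over one logit vector."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reverse_kl(student_logits: Logits, teacher_logits: Logits) -> float:
    """KL(student || teacher) for a single position (assumed loss choice)."""
    p_s = softmax(student_logits)
    p_t = softmax(teacher_logits)
    eps = 1e-12  # guards against log(0) when a probability underflows
    return sum(
        s * (math.log(max(s, eps)) - math.log(max(t, eps)))
        for s, t in zip(p_s, p_t)
    )


def routed_distillation_loss(
    domain: str,
    sampled_tokens: List[int],
    student_logits: List[Logits],
    teachers: Dict[str, Teacher],
) -> float:
    """Mean per-position divergence against the teacher routed by domain.

    `sampled_tokens` are tokens the student itself generated (on-policy);
    the routed teacher only scores them.
    """
    if domain not in teachers:
        raise KeyError(f"no teacher registered for domain {domain!r}")
    teacher_logits = teachers[domain](sampled_tokens)
    per_position = [
        reverse_kl(s, t) for s, t in zip(student_logits, teacher_logits)
    ]
    return sum(per_position) / len(per_position)
```

With a dictionary such as `{"math": math_teacher, "code": code_teacher}`, the loss is zero when the student's distributions match the routed teacher's and grows as they diverge. In practice this would be computed on tensors inside a training framework rather than on Python lists.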

On established long-horizon benchmarks, Agents-A1 matches or exceeds systems such as Kimi-K2.6 and DeepSeek-V4-pro on SEAL-0 (56.4), IFBench (80.6), HiPhO (46.4), FrontierScience-Olympiad (79.0), and MolBench-Bind (56.8). On SciCode (44.3), HLE (47.6), and BrowseComp (75.5), the model remains highly competitive.

For CTOs, this approach offers a practical alternative to trillion-parameter models: smaller, specialized agents with extended-horizon capabilities can lower the cost of inference, deployment, and fine-tuning while delivering comparable results on complex multi-step tasks.


Source: arxiv.org · Published 28 June 2026
Lumi AI News — AI-assisted curation in accordance with Article 50 EU AI Act. Paraphrase and classification by Lumi News Pipeline v1.7.2.
