GauntletBench: New Benchmark Reveals Limitations of AI Agents

26 June 2026 | 4 July 2026 | AI Models

Current AI agents fail at complex visual tasks in professional applications far more frequently than previous benchmarks suggest.