Gemma 4 and what makes an open model succeed

Having written a lot of model release blog posts, I find that reviewing open models when they drop is much harder than reviewing closed models, especially in 2026. Until recently, open models were quite scarce, so when Llama 3 came out, most people were still actively researching Llama 2 and were thrilled to finally get an update. When Qwen 3 dropped, the Llama 4 disaster had just unfolded and an entire research community was forming around RL on Qwen 2.5, so switching was an obvious choice.

Today, any new open model is immediately up against Qwen 3.5, Kimi K2.5, GLM 5.3, MiniMax M2.5, GPT-OSS, Arcee Large, Nemotron 3, OLMo 3, and many more. The space is populated, but still feels full of hidden opportunity. The promise of open models feels like dark matter: we sense its immense potential, yet there are few clear recipes or examples showing how to truly unlock it. Agentic AI, OpenClaw, and everything brewing in that space is going to spur mass experimentation in open models to complement the likes of Claude and Codex, not replace them.

Especially with open models, the benchmarks at release tell an extremely incomplete story. In some respects, this is exciting, because new open models exhibit far greater variance and unpredictability. At the same time, it highlights underlying structural challenges that make it harder to build viable businesses and compelling AI experiences around open models than around their closed-source counterparts. Whenever a new Claude Opus or GPT model is released, spending a few hours testing it inside my agentic workflows is a solid vibe check.

  Interconnects AI