Enterprise Article, Published on April 29, 2026. Yousaf Shah (yousafshah).

An in-depth technical exploration of the data engineering, pre-training, supervised fine-tuning, and reinforcement learning processes that power the Granite 4.1 large language models.

Authors: Granite Team, IBM

**TL;DR**: Granite 4.1 is a family of dense decoder-only LLMs (in 3B, 8B, and 30B sizes) trained on approximately 15 trillion tokens via a multi-stage pre-training pipeline that includes long-context extension up to 512K tokens. The models are then refined through supervised fine-tuning on approximately 4.1 million curated, high-quality samples, followed by reinforcement learning using on-policy GRPO with the DAPO loss (Yu et al., 2025). Notably, the 8B instruct model matches or exceeds the performance of the previous Granite 4.0-H-Small (32B-A9B MoE) while using a simpler dense architecture with fewer parameters.
Hugging Face – Blog