MiniMax Sparse Attention: Efficient Long-Context Processing for Billion-Parameter Models

12 June 2026
AI Models, Claude Code

MSA reduces attention computation for million-token contexts by a factor of 28.4 through blockwise sparse selection and achieves practical speedups via co-design of algorithm and GPU kernel.