Lookahead Sparse Attention: DeepSeek-V4 Reduces KV-Cache to 13.5 Percent

10 June 2026 | 4 July 2026 | AI Models

LSA predicts relevant context sections in advance and retains only these in GPU memory, compressing the KV-cache by over 86 percent without sacrificing accuracy.