Tangram: Static KV-Cache Compression for Faster Multi-Turn LLM Serving

16. June 2026 | 4. July 2026 | AI Models

Tangram achieves statically predictable memory budgets per attention head to eliminate fragmentation and latency drag caused by dynamic KV-cache compression.