P-EAGLE: Parallel Speculation for Faster LLM Inference on AWS SageMaker

16 June 2026
AI Models, Claude Code

AWS has developed P-EAGLE, a parallelized variant of speculative decoding that generates all draft tokens in a single forward pass instead of one at a time, achieving inference throughput improvements of up to 1.69x on SageMaker AI.
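
To make the difference between sequential and parallel drafting concrete, here is a minimal toy sketch in Python. It is not AWS's implementation: `ToyDrafter`, `next_token`, `next_k_tokens`, and `accept_longest_prefix` are hypothetical names invented for illustration, the "tokens" are arbitrary integers, and the verification step assumes simple greedy prefix matching. The sketch only counts how many drafter forward passes each strategy needs for k draft tokens; it does not model or reproduce the reported 1.69x figure.

```python
from dataclasses import dataclass


@dataclass
class ToyDrafter:
    """Hypothetical stand-in for a draft model. It only counts forward passes."""
    forward_passes: int = 0

    def next_token(self, context):
        # Autoregressive drafting: one forward pass produces one token.
        self.forward_passes += 1
        return (sum(context) + 1) % 100

    def next_k_tokens(self, context, k):
        # Parallel drafting: one forward pass produces k tokens at once.
        self.forward_passes += 1
        base = sum(context)
        return [(base + i + 1) % 100 for i in range(k)]


def draft_sequential(drafter, context, k):
    """Draft k tokens one at a time, feeding each token back as context."""
    draft = []
    for _ in range(k):
        draft.append(drafter.next_token(context + draft))
    return draft


def draft_parallel(drafter, context, k):
    """Draft k tokens in a single drafter call."""
    return drafter.next_k_tokens(context, k)


def accept_longest_prefix(draft, target_tokens):
    """Assumed greedy verification: keep draft tokens up to the first mismatch."""
    accepted = []
    for drafted, expected in zip(draft, target_tokens):
        if drafted != expected:
            break
        accepted.append(drafted)
    return accepted


if __name__ == "__main__":
    k = 4
    context = [3, 7, 11]

    sequential = ToyDrafter()
    draft_sequential(sequential, context, k)

    parallel = ToyDrafter()
    draft_parallel(parallel, context, k)

    print("sequential drafter passes:", sequential.forward_passes)
    print("parallel drafter passes:", parallel.forward_passes)
```

In this toy model the sequential drafter needs k forward passes per speculation round while the parallel drafter needs one. In both cases the target model still verifies the draft, so the real-world gain depends on how many drafted tokens are accepted.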