Poisoned documents can turn reasoning-based AI guardrails into DoS weapons by leveraging security systems themselves as resource sinks—a new attack vector with concentration risks in shared governance infrastructure.
Attackers can exploit reasoning guardrails of AI agents through deliberately manipulated inputs to cause resource exhaustion without bypassing the security mechanisms themselves.
HarnessX automates the assembly and adaptation of agent harnesses from execution traces, achieving an average +14.5% performance improvement without model scaling.
A new benchmark enables identification of the exact point where medical AI models produce hallucinations and enables targeted countermeasures through trace-supervised fine-tuning.
MSA reduces attention computation for million-token contexts by a factor of 28.4 through blockwise sparse selection and achieves practical speedups via co-design of algorithm and GPU kernel.
Agent-EvalKit automates the evaluation of AI agents through structured test-case generation, observability instrumentation, and combined code and LLM-based metrics directly in the development environment.
Grammar-Constrained Decoding (GCD), a technique for ensuring syntactically correct code, opens a new jailbreak method for attackers with a success rate over 30 percentage points higher than previous approaches.