AI Security Systems as DoS Targets: Poisoned Documents Cripple Guardrails

In a nutshell: Poisoned documents can turn reasoning-based AI guardrails into a denial-of-service (DoS) vector by using the security systems themselves as resource sinks. The attack vector is new, and it carries concentration risk wherever governance infrastructure is shared.

Researchers from Hong Kong University of Science and Technology demonstrate that poisoned documents can trap the reasoning-based security mechanisms of AI agents in extended thinking loops, threatening availability through resource exhaustion. The attack targets not the AI itself but its protective layer.

The researchers found that a single poisoned document can exhaust the resources of a shared guardrail infrastructure and thereby block the other agents that depend on it. In tests against four AI agent frameworks, processing slowed dramatically: LangGraph showed the largest slowdown at 148x, followed by BrowserGym (131x), OpenHands (36.3x), and OSWorld (18x). Unlike prompt injection or jailbreak attacks, which target model outputs, this technique targets the reasoning process of the guardrails themselves. It threatens availability rather than integrity.
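The blocking effect can be illustrated with a toy model. The sketch below assumes a shared guardrail that serves checks one at a time in arrival order, which is a deliberate simplification and not a description of the researchers' setup. The `Check` class, the `completion_times` helper, the agent names, and the 2-second baseline are all hypothetical; only the 148x factor comes from the article.

```python
from dataclasses import dataclass


@dataclass
class Check:
    agent: str      # hypothetical agent name
    seconds: float  # time the guardrail spends reasoning about this request


def completion_times(queue: list[Check]) -> dict[str, float]:
    """Serve checks one at a time in FIFO order; return when each agent's check finishes."""
    clock = 0.0
    done: dict[str, float] = {}
    for check in queue:
        clock += check.seconds
        done[check.agent] = clock
    return done


BASELINE = 2.0  # hypothetical seconds per benign check (not from the article)
SLOWDOWN = 148  # LangGraph slowdown factor reported in the article

benign = [Check("agent_a", BASELINE), Check("agent_b", BASELINE), Check("agent_c", BASELINE)]
# agent_a submits the poisoned document; the others submit ordinary requests.
poisoned = [Check("agent_a", BASELINE * SLOWDOWN), Check("agent_b", BASELINE), Check("agent_c", BASELINE)]

normal = completion_times(benign)
attacked = completion_times(poisoned)

for agent in normal:
    print(f"{agent}: {normal[agent]:.0f}s normally, {attacked[agent]:.0f}s under attack")
```

In this model the agents that submitted nothing unusual wait almost as long as the one that received the poisoned document, because their checks queue behind it. Real deployments with concurrent workers would behave differently in detail, but a few poisoned documents could still occupy every worker.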

A critical finding: stronger security checks lead to longer reasoning processes. The researchers observed that more robust guardrails spend more time and resources on each check, which makes them more exposed to this attack, not less. The attack also worked across eight different LLM families, and prompts developed against open-source models were equally effective against other models. Attackers therefore need no detailed knowledge of proprietary systems.

From a CISO perspective, the critical implications stem from consolidation in AI governance: organizations streamline their security infrastructure by routing multiple agents through shared safety systems, and this creates concentration risk. A successful guardrail DoS attack does not need to break through anything. It only needs to render the system unusable at a critical moment. For business-critical workflows such as automated claims handling, AI-driven incident response, or real-time fraud detection, even temporary latency or resource exhaustion would have material consequences.

Conventional prompt injection filters remain vulnerable, and strict token limits merely shift the problem to a choice between fail-open and fail-closed behavior. Smaller reasoning budgets reduce latency but also weaken the guardrail's protection, a trade-off between performance and robustness that has no simple solution.
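The fail-open versus fail-closed dilemma can be made concrete with a minimal sketch. It assumes a guardrail that judges correctly whenever it finishes within its token budget and falls back to a fixed policy when it does not. The `Policy` enum, the `guardrail_verdict` function, and all token counts are hypothetical illustrations, not part of any framework named above.

```python
from enum import Enum


class Policy(Enum):
    FAIL_OPEN = "fail_open"      # budget exhausted -> let the request through
    FAIL_CLOSED = "fail_closed"  # budget exhausted -> block the request


def guardrail_verdict(tokens_needed: int, budget: int, is_malicious: bool, policy: Policy) -> str:
    """Return 'allow' or 'block' for one request under a reasoning-token budget.

    Assumption: if the guardrail can finish reasoning within the budget,
    its verdict is correct. Otherwise the fallback policy decides.
    """
    if tokens_needed <= budget:
        return "block" if is_malicious else "allow"
    return "allow" if policy is Policy.FAIL_OPEN else "block"


BUDGET = 4_000  # hypothetical reasoning-token limit

# A poisoned document that forces far more reasoning than the budget allows.
poisoned_open = guardrail_verdict(50_000, BUDGET, is_malicious=True, policy=Policy.FAIL_OPEN)
poisoned_closed = guardrail_verdict(50_000, BUDGET, is_malicious=True, policy=Policy.FAIL_CLOSED)

# A benign but complex document that also exceeds the budget.
benign_open = guardrail_verdict(6_000, BUDGET, is_malicious=False, policy=Policy.FAIL_OPEN)
benign_closed = guardrail_verdict(6_000, BUDGET, is_malicious=False, policy=Policy.FAIL_CLOSED)

print(poisoned_open, poisoned_closed, benign_open, benign_closed)
```

Under fail-open the poisoned document is allowed through unchecked, so the budget costs integrity. Under fail-closed the benign document is blocked, so the budget costs availability. Neither setting removes the problem; it only decides which property pays for the limit.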


Source: www.csoonline.com · Published June 15, 2026
Lumi AI News — AI-assisted curation pursuant to Article 50 EU AI Act. Paraphrase and classification by Lumi News Pipeline v1.7.1.