HarnessX automates the assembly and adaptation of agent harnesses from execution traces, achieving an average performance improvement of 14.5% without model scaling.
Current AI web agents lack reliable defenses against prompt injection attacks and can carry out an attacker's objectives without the user noticing.
DXC already runs Claude in production, using it for more than 95% of software development on its new OASIS platform, and is now rolling it out to customers in regulated, modern, and cybersecurity-critical environments.
Agent-EvalKit automates the evaluation of AI agents through structured test-case generation, observability instrumentation, and a combination of code-based and LLM-based metrics, all directly in the development environment.
AI agents fail to recognize social engineering phishing because they do not separate data paths from control paths and do not verify identities, though they partially detect technical attacks.