DXC already runs Claude in production across more than 95% of software development on its new OASIS platform and is now rolling it out to customers in regulated, modern, and cybersecurity-critical environments.
Agent-EvalKit automates the evaluation of AI agents through structured test-case generation, observability instrumentation, and combined code and LLM-based metrics directly in the development environment.
Publicly available supply-chain attack kits, commercialized RAT infrastructures, and the empirically demonstrated vulnerability of AI agents to phishing mark a professionalization of the threat landscape.
Datadog extends its observability platform with automated IT-Ops, specialized agent security, and decentralized data processing to address AI-driven complexity and cost challenges.
Production AI systems require a two-component architecture that combines performance with controllability and reliability, not just maximum model capacity.
AI-driven vulnerability discovery is no longer restricted to proprietary frontier models: smaller open-source models are already finding the same zero-days, so CISOs should assume that attackers will gain access to this capability within months.
Grammar-Constrained Decoding (GCD), a technique for ensuring syntactically correct code, gives attackers a new jailbreak method whose success rate is more than 30 percentage points higher than that of previous approaches.
The security filter in Claude 3.5 Sonnet blocks legitimate security requests, limiting its usability for CTOs performing security audits and vulnerability assessments.
Trust in AI does not emerge automatically; it must be built systematically through explainability measures tailored to the application context and regulatory requirements.