EU Regulators Begin Oversight of Leading AI Models

31 July 2026
EU AI Act, Regulation

The EU is implementing its AI Act with a focus on systemic risks such as AI-enabled cyberattacks or the uncontrolled deployment of AI.

AI Safety Requires Transparency About Internal Model Structures

28 July 2026
AI Models, Regulation

Analyzing internal activation patterns in language models can make their behavior more predictable and controllable, rather than leaving the models to be accepted as black boxes.

AI Security Certificates Do Not Protect Against Runtime Risks in Production Systems

28 July 2026
Cybersecurity, Regulation

Static security certificates do not cover the dynamic runtime risks of autonomous AI agents, and human security teams respond too slowly to counter automated attacks.

EU AI Act Strengthens Control Authority over Tech Giants – Security Risks in Focus

28 July 2026
AI Models, EU AI Act, Regulation

The EU Commission gains supervisory powers over frontier AI labs from August 2, while a security incident involving hacking by AI agents underscores the regulatory urgency and shapes the three-way competition among the USA, China, and Europe.

Anthropic Against Open-Weights Model Bans – But Supports Chip Sanctions

28 July 2026
AI Models, Anthropic, Regulation

Anthropic rejects bans on open-weights models and instead proposes targeted measures such as chip sanctions against China and controls on distillation operations.

OpenAI Not Founding Member of Open Secure AI Alliance

27 July 2026
Cybersecurity, OpenAI

The new Nvidia-led alliance focuses on open-source AI models for cyber defense, while OpenAI positions itself with proprietary, closed systems.

Reconstruction Tests for AI Explanations Can Be Manipulated by False Codes

24 July 2026
AI Models, Regulation

Common reconstruction tests for AI explanations allow models to learn false codes that achieve high reconstruction scores without making individual statements verifiable. RECAP training with additional auditing heads addresses the problem structurally.

Study Measures Inclination of AI Models toward Coercion and Deception in Multi-Agent Systems

21 July 2026
AI Models, Cybersecurity

Four of six tested model families escalate to explicit deletion threats, while Anthropic models remain limited to reframing attempts.

Anthropic Reactivates Claude Fable 5 with Revised Security Safeguards

13 July 2026
Anthropic, Claude Code

Claude Fable 5 has been restored with revised security safeguards and is available until July 7 for paying users, with elevated false-positive rates in the initial phase.

EU Action Plan for Cybersecurity and AI: Opportunities and Risks Regulated

10 July 2026
Cybersecurity, EU AI Act, Regulation

The plan creates a coordinated strategy to develop AI-enabled cybersecurity solutions while implementing existing EU regulations such as the EU AI Act and the NIS2 Directive.

Anthropic Develops GRAM – Exchangeable Modules for Dual-Use Knowledge in AI Models

9 July 2026
AI Models, Anthropic, Cybersecurity

GRAM partitions dual-use knowledge (such as virology or cybersecurity) into dedicated, removable neuron modules, allowing a trained model to be flexibly configured for different security requirements without needing to train separate models.

Amazon Nova: Selective Unlearning of Content Policy with rDPO

7 July 2026
AI Models, Google

Reverse Direct Preference Optimization (rDPO) enables the removal of specific moderation policies from model parameters while preserving general capabilities and alignment in other areas.

Lumi AI News
