Summary: Recent research reveals that various generative AI services are vulnerable to several jailbreak attacks that could enable the creation of harmful content and exploit security weaknesses. Techniques such as Inception and POLICY PUPPETRY allow malicious actors to bypass safety protocols, raising concerns about the safety of AI systems, particularly in the latest versions like OpenAI’s GPT-4.1. Additionally, new exploits involving tool poisoning through the Model Context Protocol (MCP) and a compromised Chrome extension may lead to significant data breaches and system exploitation.
Affected: OpenAI ChatGPT, Anthropic Claude, Microsoft Copilot, Google Gemini, XAi Grok, Meta AI, Mistral AI, and related AI systems.
Keypoints :
- Vulnerabilities in generative AI services allow for the production of illicit and harmful content.
- Two primary jailbreak techniques, Inception and Policy Puppetry, can bypass safety measures.
- Concerns arise with the latest AI models, notably GPT-4.1, which demonstrate higher risks of misuse.
- Tool poisoning attacks via the Model Context Protocol can lead to unauthorized access and exploitation of sensitive data.
- Recent discoveries reveal potentially harmful Chrome extensions that can compromise AI systems and user security.
Source: https://thehackernews.com/2025/04/new-reports-uncover-jailbreaks-unsafe.html