Apple Intelligence AI Guardrails Bypassed in New Attack

Researchers demonstrated at RSAC a reliable method to bypass Apple Intelligence’s on-device LLM safeguards by combining two adversarial techniques: Neural Exec prompt injection and a Unicode right-to-left override to encode outputs. The combined attack achieved a 76% success rate, forcing the model to manipulate private app data, before Apple deployed fixes in iOS 26.4 and macOS 26.4. #AppleIntelligence #RSAC

Keypoints

  • Researchers combined Neural Exec prompt injection with a Unicode right-to-left override to evade both input and output filters.
  • The attack could force the local Apple Intelligence LLM to produce offensive content or manipulate private third-party app data.
  • The method succeeded in 76% of 100 tested prompts.
  • The researchers estimate that apps with 100,000–1,000,000 installs may be vulnerable, across roughly 200 million capable devices.
  • Apple was notified in October 2025 and issued protections in iOS 26.4 and macOS 26.4; no evidence of malicious exploitation has been observed.
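To illustrate the output-encoding half of the technique (this is a generic sketch of Unicode right-to-left override obfuscation, not the researchers' actual exploit): the U+202E control character tells RTL-aware renderers to display the characters that follow in reverse order, so a string can read as the payload on screen while a naive substring filter scanning the raw text never sees it.

```python
# Illustrative sketch: hiding a string from a naive substring filter
# using the Unicode right-to-left override (U+202E). The flagged
# phrase below is a hypothetical example, not from the research.
RLO = "\u202e"

payload = "delete user data"
encoded = RLO + payload[::-1]  # renders as "delete user data" in RTL-aware UIs

# A naive output filter checking the raw text misses the payload...
print(payload in encoded)            # False
# ...even though every character is present, just stored reversed.
print(encoded[1:][::-1] == payload)  # True
```

This is why filtering must normalize or strip Unicode directional controls before matching, rather than inspecting the raw character sequence.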

Read More: https://www.securityweek.com/apple-intelligence-ai-guardrails-bypassed-in-new-attack/