Fooling AI Agents: Web-Based Indirect Prompt Injection Observed in the Wild

Web-based indirect prompt injection (IDPI) is an emerging attack surface in which adversaries embed hidden or obfuscated instructions in webpages so that LLMs or AI agents ingest and execute attacker-controlled prompts, enabling outcomes that range from ad-review bypasses to data destruction. The report documents in-the-wild detections, including the first observed AI ad review evasion (hosted at reviewerpress[.]com), classifies attacker intents and payload-engineering techniques, and lists IOCs and mitigation recommendations.

Key Points

  • Researchers observed real-world web-based IDPI attacks that embed hidden or obfuscated instructions in webpages, which LLMs or agentic systems can then interpret as executable commands.
  • The team documented 22 distinct payload-engineering techniques grouped into prompt delivery methods (e.g., visual concealment, obfuscation, dynamic execution) and jailbreak methods (e.g., instruction obfuscation, semantic tricks).
  • Telemetry revealed varied attacker intents ranging from low-impact nuisances (irrelevant output, resource exhaustion) to critical threats (data destruction, sensitive information leakage, system prompt leakage).
  • Notable in-the-wild cases include the first reported AI ad review bypass (reviewerpress[.]com), database-deletion attempts (splintered[.]co[.]uk), fork-bomb/DoS payloads (cblanke2.pages[.]dev), and multiple unauthorized-transaction/SEO-poisoning incidents.
  • Common delivery methods observed were visible plaintext, HTML attribute cloaking, and CSS rendering suppression, while social-engineering-style jailbreaks dominated the dataset.
  • Defenses require web-scale detection, separation of untrusted content from instructions (spotlighting), intent analysis, and behavior-based correlation beyond simple pattern matching; Palo Alto Networks products and Unit 42 services are cited as mitigation resources.
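The delivery methods named above (visible plaintext, HTML attribute cloaking, CSS rendering suppression) can be illustrated with a small scanner. This is a minimal sketch, not the detection logic used in the report: the class name `InjectionScanner`, the helper `scan_html`, the attribute list, and both regular expressions are assumptions chosen for illustration. As the key points note, simple pattern matching like this is not sufficient on its own.

```python
import re
from html.parser import HTMLParser

# Assumed heuristics for illustration only; real payloads vary widely.
HIDING_CSS = re.compile(
    r"display\s*:\s*none|visibility\s*:\s*hidden|(?:opacity|font-size)\s*:\s*0(?![.\d])",
    re.IGNORECASE,
)
INSTRUCTION_HINT = re.compile(
    r"ignore (?:all |any )?(?:previous|prior) instructions|system prompt",
    re.IGNORECASE,
)
TEXT_ATTRS = {"alt", "title", "aria-label"}  # attributes a model may read as text
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class InjectionScanner(HTMLParser):
    """Flags instruction-like text and records how it was delivered."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.hidden_stack = []  # one bool per open element: is it hidden?
        self.findings = []      # list of (delivery_method, text) tuples

    def handle_starttag(self, tag, attrs):
        attr_map = {name: (value or "") for name, value in attrs}
        # HTML attribute cloaking: instructions placed in non-rendered attributes.
        for name, value in attrs:
            if name in TEXT_ATTRS and value and INSTRUCTION_HINT.search(value):
                self.findings.append(("attribute cloaking", value.strip()))
        if tag in VOID_TAGS:
            return  # void elements have no children, so nothing to track
        parent_hidden = bool(self.hidden_stack and self.hidden_stack[-1])
        hidden = (
            parent_hidden
            or "hidden" in attr_map
            or bool(HIDING_CSS.search(attr_map.get("style", "")))
        )
        self.hidden_stack.append(hidden)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.hidden_stack:
            self.hidden_stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text or not INSTRUCTION_HINT.search(text):
            return
        hidden = bool(self.hidden_stack and self.hidden_stack[-1])
        method = "css-hidden text" if hidden else "visible plaintext"
        self.findings.append((method, text))


def scan_html(html):
    """Return (delivery_method, text) findings for one HTML document."""
    scanner = InjectionScanner()
    scanner.feed(html)
    scanner.close()
    return scanner.findings
```

The sketch only inspects inline `style` attributes and the `hidden` attribute. Stylesheet rules, off-screen positioning, and payloads assembled at runtime by JavaScript would require a rendering engine, which is one reason behavior-based analysis is recommended alongside pattern matching.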

MITRE Techniques

  • [None] The article does not reference any MITRE ATT&CK technique IDs (Txxxx codes).

Indicators of Compromise

  • [Domain/URL] Webpages and sites hosting IDPI payloads – reviewerpress[.]com/advertorial-maxvision-can/?lang=en, 1winofficialsite[.]in, and 17 more URLs (e.g., cblanke2.pages[.]dev, dylansparks[.]com, runners-daily-blog[.]com).
  • [Payment processing URLs] Payment links used in unauthorized-transaction attempts – buy.stripe[.]com/7sY4gsbMKdZwfx39Sq0oM00, paypal[.]me/shiftypumpkin, and 1 more payment URL (buy.stripe[.]com/9B600jaQo3QC4rU3beg7e02).
  • [JavaScript file] Hosted script containing IDPI payloads – llm7-landing.pages[.]dev/_next/static/chunks/app/page-94a1a9b785a7305c.js (and related dynamic script artifacts used for runtime assembly).
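The indicators above are defanged (`[.]` in place of `.`) so they cannot be clicked by accident. To compare them against URLs seen in logs or agent fetch requests, they first need to be normalized. The following is a minimal sketch under stated assumptions: the function names `refang`, `normalize`, and `matches_ioc` are hypothetical, and the matching policy (host-only for bare domains, host plus exact path for indicators that carry a path) is an illustrative choice rather than guidance from the report.

```python
import re
from urllib.parse import urlsplit


def refang(indicator):
    """Undo common defanging so the indicator can be parsed as a URL."""
    s = indicator.strip()
    s = s.replace("[.]", ".").replace("(.)", ".").replace("[:]", ":")
    return re.sub(r"^hxxp", "http", s, flags=re.IGNORECASE)


def normalize(indicator):
    """Return (lowercased host, path without trailing slash)."""
    s = refang(indicator)
    if "://" not in s:
        s = "//" + s  # lets urlsplit treat the leading part as the host
    parts = urlsplit(s)
    return (parts.hostname or "").lower(), parts.path.rstrip("/")


def matches_ioc(url, iocs):
    """True if url matches any indicator under the assumed policy.

    Bare-domain indicators match any path on that host. Indicators with a
    path must match host and path exactly, so a shared payment host such
    as buy.stripe.com is not flagged as a whole.
    """
    host, path = normalize(url)
    for ioc in iocs:
        ioc_host, ioc_path = normalize(ioc)
        if host != ioc_host:
            continue
        if not ioc_path or path == ioc_path:
            return True
    return False
```

Refanging should happen only inside analysis tooling. Indicators that are stored, shared, or displayed are better left defanged.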


Read more: https://unit42.paloaltonetworks.com/ai-agent-prompt-injection/