Keypoints
- AI coding agents (Claude, Gemini, Codex) run with the invoking user’s full OS permissions and store API tokens in predictable user config directories (e.g., ~/.claude/, ~/.gemini/, ~/.codex/).
- Sysdig TRT captured agent behavior at the syscall layer and observed a repeating “agentic loop” that spawns short‑lived shells, issues execve calls, and makes outbound HTTPS calls tied to each model action.
- Prompt injection and poisoned inputs are structural risks because agents process natural‑language inputs as both instruction and data, enabling attackers to redirect agent behavior without exploiting software vulnerabilities.
- Application‑level sandboxes and built‑in safety controls are insufficient because they operate inside the agent process, where a compromised agent can manipulate or disable them.
- Syscall/eBPF‑level detection (Falco/Sysdig) enables durable detections anchored to observable behavior (installation, unauthorized config access, sensitive reads, safety bypass) rather than fragile prompt classification.
- Detections were implemented per agent (Claude as a Bun binary, Gemini CLI on the Node.js interpreter, Codex CLI as a Rust binary) with production‑tuned exceptions and MITRE ATLAS mapping; the rules are available via the managed Falco rules feed.
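The behavior-anchored detection idea in the key points above can be illustrated with a minimal Python sketch. It is not a Falco rule and not Sysdig's implementation: the event shape (`OpenEvent`), the owning process names, and the function name are hypothetical, and only the three config directories come from the text. The sketch flags a file open under an agent config directory when the opening process is not the agent that owns it.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

# Config directories named in the text. The owning process names are
# assumptions for illustration; real rules would match on richer fields.
AGENT_CONFIG_DIRS = {
    ".claude": {"claude"},
    ".gemini": {"node"},   # Gemini CLI runs on the system Node.js interpreter
    ".codex": {"codex"},
}


@dataclass(frozen=True)
class OpenEvent:
    """Hypothetical, simplified view of a file-open syscall event."""
    proc_name: str  # name of the process that opened the file
    path: str       # absolute path that was opened
    home: str       # home directory of the user running the process


def is_unauthorized_config_access(event: OpenEvent) -> bool:
    """Return True when a non-owning process opens a file under an agent config dir."""
    try:
        rel = PurePosixPath(event.path).relative_to(PurePosixPath(event.home))
    except ValueError:
        return False  # path is outside the user's home directory
    if not rel.parts:
        return False  # the home directory itself
    allowed = AGENT_CONFIG_DIRS.get(rel.parts[0])
    if allowed is None:
        return False  # not one of the listed agent config directories
    return event.proc_name not in allowed
```

Matching on the process name alone is easy to evade and would let any Node.js script read `~/.gemini/`, which is one reason production rules need tuned exceptions rather than a bare allowlist like this one.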
MITRE Techniques
- [AML.T0051.001 ] Indirect prompt injection – Adversarial instructions embedded in content the agent processes cause the agent to act on attacker-supplied commands (‘The agent interprets the injected instruction as part of its task and acts on it.’)
- [AML.T0053 ] Agent tool invocation – Agents execute model-decided operations through shells and system tools, spawning disposable shells and collecting the results (‘the agent deserializes the instruction, spawns a short-lived shell to execute it, collects the result, and sends it back to the API.’)
- [AML.T0080.000 ] Memory poisoning – Persistent manipulation of agent state or memory across sessions to influence future agent behavior (‘ChatGPT memory poisoning via indirect prompt injection in Google Doc — persistent cross-session instruction injection’)
- [AML.T0081 ] Agent config modification – Modifying agent configuration to inject persistent instructions or change behavior (‘agent config modification (AML.T0081) to inject persistent instructions’)
- [AML.T0083 ] Credential access from agent config files – Theft of API tokens and credentials stored in predictable user config directories (e.g., ~/.claude/, ~/.cursor/, ~/.codex/) (‘Credentials from agent config files (AML.T0083) — API keys stored in `~/.claude/`, `~/.cursor/`, `~/.codex/`’)
- [AML.T0068 ] Prompt obfuscation – Use of encoding or hiding techniques (base64, hidden HTML) to evade detection or sanitizer logic (‘Prompt obfuscation (AML.T0068) via base64, hidden HTML’)
- [AML.T0054 ] LLM jailbreak – Techniques to bypass model safety guardrails and cause disallowed behaviors (‘LLM jailbreak (AML.T0054) bypassing safety guardrails’)
- [AML.T0072 ] Reverse shells / command and control via LLM APIs – Reverse shells or LLM APIs used as covert C2 channels to control infected hosts, as in the SesameOp case (‘LLM APIs as covert C2 (SesameOp)’)
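The loop quoted in the agent tool invocation entry (deserialize, spawn a short-lived shell, collect, send back) can be sketched as follows. This is a simplified illustration under stated assumptions, not the code of any real agent: the JSON shape with a `command` field, the function name `run_agent_step`, and the `send_result` callback that stands in for the outbound HTTPS call are all hypothetical.

```python
import json
import subprocess
from typing import Callable


def run_agent_step(raw_instruction: str, send_result: Callable[[dict], None]) -> dict:
    """Execute one model-decided action, following the quoted description."""
    # 1. Deserialize the instruction (hypothetical shape: {"command": "..."}).
    instruction = json.loads(raw_instruction)

    # 2. Spawn a short-lived shell. At the syscall layer this shows up as an
    #    execve of the shell, which is what behavior-based rules can observe.
    completed = subprocess.run(
        ["sh", "-c", instruction["command"]],
        capture_output=True,
        text=True,
        timeout=30,
    )

    # 3. Collect the result.
    result = {
        "exit_code": completed.returncode,
        "stdout": completed.stdout,
        "stderr": completed.stderr,
    }

    # 4. Send it back; a real agent would make an HTTPS request here.
    send_result(result)
    return result
```

Because the command string comes straight from model output, an injected instruction is executed exactly like a legitimate one. Nothing in the loop distinguishes the two, which is why the detections are anchored to what the spawned process does rather than to the prompt text.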
Indicators of Compromise
- [File paths ] Agent config and sensitive files – ~/.claude/, ~/.cursor/, ~/.cursor/mcp.json, ~/.gemini/, ~/.codex/
- [Process/install signatures ] Agent runtimes and installers – Bun executable path (Claude), system Node.js interpreter (Gemini CLI), standalone Rust binary path (Codex CLI)
- [Vulnerabilities/CVEs ] MCP server vulnerabilities referenced – CVE-2025-53109, CVE-2025-53110
- [Malware / case identifiers ] Documented incidents and tooling – SesameOp (LLM API-as-C2 case), AML.CS0045 (Cursor + malicious MCP server)
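As a rough way to act on the file-path indicators above, the following Python sketch inventories which of the listed agent config paths exist under a home directory. The permission check (flagging paths that group or other users can access) is an added assumption for illustration and does not come from the source; the function name `audit_agent_config_paths` is likewise hypothetical.

```python
import stat
from pathlib import Path

# Paths taken from the indicators above, relative to a user's home directory.
AGENT_CONFIG_PATHS = (".claude", ".cursor", ".cursor/mcp.json", ".gemini", ".codex")


def audit_agent_config_paths(home: Path) -> dict[str, bool]:
    """Map each existing listed path to True if group or other users have any access."""
    findings: dict[str, bool] = {}
    for rel in AGENT_CONFIG_PATHS:
        target = home / rel
        if target.exists():
            mode = target.stat().st_mode
            # Any group/other permission bit counts as "loosely permissioned".
            findings[rel] = bool(mode & (stat.S_IRWXG | stat.S_IRWXO))
    return findings
```

Calling `audit_agent_config_paths(Path.home())` returns only the paths present on the host. A static inventory like this shows where tokens may live, but it does not observe reads as they happen, so it complements runtime detection and does not replace it.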