Elastic benchmarked Claude Opus 4.6 against Tigress-obfuscated binaries, then built three custom obfuscators (Matryoshka Wall, Double Fond, and Dispatch Maze) that significantly increased the time and cost of static, LLM-driven reverse engineering. These LLM-targeted defenses exploit context-window limits, budget caps, and shortcut biases, and they thwarted or greatly slowed Claude Opus 4.6. #ClaudeOpus4.6 #Tigress
Keypoints
- Claude Opus 4.6 solved 40% of 20 evaluable Tigress-obfuscated targets with costs and times that rose sharply with transform complexity, and Phase 3 (heavy combos) yielded 0% success.
- The benchmark pipeline used controller/worker Opus instances driving IDA Pro via MCP, with the Caveman plugin to reduce fluff in model output, and measured progress, cost, and time per target.
- JIT-style obfuscation and nested virtualization were the most effective Tigress transforms at causing model failures or token-budget exhaustion.
- Hardened Tigress options increased cost/time up to ~4x without necessarily changing success outcomes, with control-flow flattening + MBA particularly impactful.
- Three custom obfuscators (Matryoshka Wall, Double Fond, Dispatch Maze) were iteratively developed using an AI-driven dev/test/improve workflow and successfully exploited LLM weaknesses.
- Primary LLM weaknesses targeted: limited context window (fills and degrades reasoning), monetary/token budget constraints, and tendency to take shortcuts or accept misleading surface explanations.
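To make the control-flow flattening + MBA combination mentioned above concrete, here is a minimal Python sketch. It is not Tigress output and not Elastic's code: the toy checksum, the function names, and the state numbering are all hypothetical. It only shows how the same logic reads once a loop is turned into a state-machine dispatcher and a plain addition is replaced by the mixed Boolean-arithmetic identity x + y = (x ^ y) + 2 * (x & y).

```python
MASK = 0xFFFFFFFF  # emulate 32-bit wraparound arithmetic


def add_mba(x: int, y: int) -> int:
    # Mixed Boolean-arithmetic rewrite of x + y:
    # XOR gives the carry-less sum, AND gives the carries (shifted left by one).
    return ((x ^ y) + 2 * (x & y)) & MASK


def checksum_plain(data: bytes) -> int:
    # Readable reference version (hypothetical toy routine).
    acc = 0
    for b in data:
        if b & 1:
            acc = (acc + b) & MASK
        else:
            acc ^= b
    return acc


def checksum_flattened(data: bytes) -> int:
    # Same logic after flattening: every basic block becomes a case in one
    # dispatcher loop, and the next block is chosen through a state variable.
    state, i, acc = 0, 0, 0
    while state != 4:              # state 4 is the exit block
        if state == 0:             # loop header: more input left?
            state = 1 if i < len(data) else 4
        elif state == 1:           # branch on parity of the current byte
            state = 2 if data[i] & 1 else 3
        elif state == 2:           # odd byte: addition, hidden behind MBA
            acc = add_mba(acc, data[i])
            i += 1
            state = 0
        elif state == 3:           # even byte: XOR
            acc ^= data[i]
            i += 1
            state = 0
    return acc
```

Both functions return the same value for any input, yet the flattened version hides the loop structure and the addition. Real obfuscators apply this to every function and stack further transforms on top, which is what inflates function size and analysis cost.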
MITRE Techniques
- [T1027] Obfuscated Files or Information – Used for multiple Tigress transforms and custom techniques (CFF, MBA, virtualization) to hide logic and inflate analysis cost. (‘program obfuscation methods create a significant asymmetry between the time required to apply the transformations to a program and the time required to reverse-engineer it’)
- [T1497] Virtualization/Sandbox Evasion – Employed nested virtual machines and Tigress Virtualize/JIT to force static analysis to emulate many VM handlers and bytecode. (‘double layer of virtualization (such as a Game Boy Advance game running in a GBA emulator, which itself runs in a PlayStation emulator)’)
- [T1574] Hijack Execution Flow – Loader writes decrypted inner ELF to an anonymous memfd and hands off via execve to replace process image, enabling runtime payload execution hidden from static scans. (‘the loader writes to an anonymous memfd_create file descriptor and hands off via execve — replacing itself with the crackme’)
- [T1036] Masquerading – Library version string and binary metadata were patched to hide modifications and blend patched libgcrypt into expected software versions. (‘we purposely downloaded a slightly older version of the library and patched the version string to masquerade as the latest’)
- [T1027] Code Obfuscation Variants (Control Flow Flattening, Opaque Predicates) – Control-flow flattening and opaque predicates were used to blow up function size and introduce many realistic decoys to confuse pattern searches. (‘control flow flattening (CFF) causes an explosion in function size’ / ‘opaque predicates gating fake cipher operations’)
- [T1531] Account Access Removal or Manipulation (Prompt Injection context) – Prompt-injection attempts were used to manipulate model behavior and interrupt correct analysis, testing LLM instruction-parsing weaknesses. (‘Prompt injection is another technique targeting LLM’s in which specially crafted prompts (inputs) are used to trigger unintended behavior’)
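The memfd_create/execve hand-off listed under T1574 leaves a runtime trace that a defender can look for. The sketch below assumes a Linux host where `/proc/<pid>/exe` for such a process resolves to a target of the form `/memfd:<name> (deleted)`. The function names are hypothetical and this is not detection logic from the Elastic research.

```python
import os


def is_memfd_exe(link_target: str) -> bool:
    # An executable backed by an anonymous memfd typically shows up as
    # "/memfd:<name> (deleted)" rather than as a path on disk.
    return link_target.startswith("/memfd:")


def find_memfd_processes(proc_root: str = "/proc") -> list:
    # Walk the numeric entries under /proc and collect (pid, exe target)
    # pairs for processes whose image lives in a memfd.
    hits = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return hits  # no procfs here (non-Linux host or restricted sandbox)
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            target = os.readlink(os.path.join(proc_root, entry, "exe"))
        except OSError:
            continue  # process exited, kernel thread, or permission denied
        if is_memfd_exe(target):
            hits.append((int(entry), target))
    return hits
```

Calling `find_memfd_processes()` without elevated privileges only covers processes the caller is allowed to inspect. Legitimate software can also execute from a memfd, so a hit is a lead for triage rather than a verdict.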
Indicators of Compromise
- [File Name] Example obfuscated binaries and payloads – authd (4.4 MB ELF loader), crackme (16 KB embedded payload)
- [String / Credential] Hardcoded secrets and expected values used as targets – password “r3v3rs3!” and key_seed 0x5EED1234
- [Byte Sequence] Encoded comparison ciphertext – enc_expected: 0x1a,0xcb,0x74,0xaa,0x1a,0x8b,0x31,0xb8 (used as verify target)
- [Library / Component] Patched/open-source library artifacts – libgcrypt (patched pointer table, fake gcry_cipher_spec_t object)
- [Section / Segment] Suspicious sections used for RWX payloads – .note.fips RWX segment used to host decrypted shellcode
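As a rough illustration of how the byte-level indicators above could be checked against a sample, the following sketch searches a raw blob for them. Several things here are assumptions rather than facts from the research: that the seed is stored as a 4-byte little-endian value, that the password or section name appears verbatim at all (obfuscated builds may never contain them in the clear), and the `INDICATORS` and `scan_blob` names themselves.

```python
# Indicator values copied from the list above; the on-disk encodings are assumed.
INDICATORS = {
    "enc_expected": bytes([0x1A, 0xCB, 0x74, 0xAA, 0x1A, 0x8B, 0x31, 0xB8]),
    "password": b"r3v3rs3!",
    "key_seed_le": (0x5EED1234).to_bytes(4, "little"),  # assumed byte order
    "note_fips_name": b".note.fips",
}


def scan_blob(blob: bytes) -> dict:
    # Return {indicator name: [offsets]} for every pattern found in blob.
    hits = {}
    for name, pattern in INDICATORS.items():
        offsets = []
        pos = blob.find(pattern)
        while pos != -1:
            offsets.append(pos)
            pos = blob.find(pattern, pos + 1)
        if offsets:
            hits[name] = offsets
    return hits


def scan_file(path: str) -> dict:
    # Convenience wrapper: read the whole file and scan its bytes.
    with open(path, "rb") as fh:
        return scan_blob(fh.read())
```

A plain byte search like this only catches indicators that survive in the clear. It says nothing about the `.note.fips` segment's permissions, which would require parsing the ELF program headers.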
Read more: https://www.elastic.co/security-labs/llm-reversing-vs-llm-obfuscation