Delving into Dalvik: A Look Into DEX Files

Mandiant examined a Nexus banking trojan Android sample and found widespread string obfuscation implemented via XOR and buried in methods full of dead code, which complicated static analysis. The authors describe DEX internals and release dexmod, a Python tool that adds decoded strings, patches method bytecode and updates DEX metadata so obfuscated methods return plaintext strings for easier reverse engineering. #Nexus #dexmod

Keypoints

Analysis of a Nexus banking trojan sample (MD5 d87e04db4f4a36df263ecbfe8a8605bd) revealed repeated string obfuscation implemented by XORing byte arrays and excessive dead code in decoding methods.
The obfuscated methods decode strings at runtime and are often invoked indirectly, complicating manual deobfuscation and static analysis with tools like jadx.
DEX file structure (header, string table, type_ids, class_defs, class_data, code_items) and bytecode encoding (registers_size, insns_size in 16-bit units) determine how to safely patch methods.
Patching requires adding decoded strings to the DEX string table, updating indices/offsets, replacing obfuscated bytecode with a const-string + return-object sequence, and adjusting method prefaces.
After bytecode edits, the Adler-32 checksum and SHA-1 signature in the DEX header must be recalculated to avoid integrity failures when re-inspecting with decompilers.
The authors provide dexmod (built on the dexterity library) with scripts for searching methods, adding strings, and applying the case-study patch logic to automate safe DEX modifications.

MITRE Techniques

[T1027.002] Software Packing – The main activity declared in the manifest is unpacked later rather than being present in the application from the start (‘main activity in AndroidManifest.xml is not present in the application initially as it is later unpacked’).
[T1027] Obfuscated Files or Information – The sample uses a repeated “string obfuscation mechanism” and inserts dead code to confuse analysis (‘the repeated use of a string obfuscation mechanism throughout the application code’).
[T1140] Deobfuscate/Decode Files or Information – The decoding routine XORs an encoded byte array with a key to produce strings, which must be decoded and added back into the DEX for patching (‘This is done by XORing a byte array (the encoded string) with another byte array (the XOR key)’).

Indicators of Compromise

[File hash] Sample MD5 – d87e04db4f4a36df263ecbfe8a8605bd
[Class name] Malicious Application subclass – com.toss.soda.RWzFxGbGeHaKi
[Method names] Obfuscated/deobfuscation methods – bleakperfect, melodynight, justclinic
[Source URL] Original analysis post – https://cloud.google.com/blog/topics/threat-intelligence/dalvik-look-into-dex-files/

To patch the Nexus sample, first identify the obfuscated methods that perform XOR-based decoding and contain dead-code sequences. Decode each string by reproducing the XOR logic, then add the resulting plaintexts to the DEX string table so that each has a corresponding string_data_item and string_id_item entry. This step necessarily changes section sizes, indices, and offsets, so every reference across string_ids, type_ids, and the class/method structures must be kept consistent.
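The decode-and-encode step can be sketched in Python as follows. This is an illustrative sketch, not dexmod's code: the function names are hypothetical, repeating the key when it is shorter than the data is an assumption about the sample, and the encoder handles only ASCII without NUL characters, where MUTF-8 and UTF-8 produce the same bytes.

```python
def xor_decode(encoded: bytes, key: bytes) -> str:
    """XOR each encoded byte with the key (assumption: the key repeats if shorter)."""
    plain = bytes(b ^ key[i % len(key)] for i, b in enumerate(encoded))
    return plain.decode("utf-8")


def uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128, the form DEX uses for lengths."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def string_data_item(text: str) -> bytes:
    """Build a string_data_item: ULEB128 length in UTF-16 code units, data, NUL terminator."""
    if not text.isascii() or "\x00" in text:
        # MUTF-8 differs from UTF-8 for NUL and supplementary characters;
        # this sketch deliberately refuses those cases.
        raise ValueError("sketch supports ASCII without NUL only")
    return uleb128(len(text)) + text.encode("ascii") + b"\x00"
```

The encoded item still has to be placed in the data section and referenced from a new string_id_item; that bookkeeping is the part that shifts indices and offsets and is not shown here.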

Next, overwrite the start of each obfuscated method’s bytecode with a compact Dalvik sequence that returns the new string: const-string v0, [string_id] followed by return-object v0 (two instructions occupying three 16-bit code units, 6 bytes). Update the method’s code_item preface: set registers_size to 1 (only v0 is used) and insns_size to 3 (the new length in 16-bit code units), leaving the other preface fields unchanged if they are unaffected. Optionally, overwrite the remaining original bytes with 0x00 (NOP) for cleanliness.
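A minimal Python sketch of this patch is shown below. The function names and the code_off parameter (the file offset of the method's code_item) are hypothetical and this is not dexmod's implementation. The opcodes (0x1a for const-string, 0x11 for return-object) and the 16-byte code_item preface layout follow the published DEX format. As conservative choices of this sketch only, it refuses methods that have try blocks or more than one incoming argument register, and string indices that do not fit in 16 bits (which would need const-string/jumbo).

```python
import struct

CONST_STRING = 0x1A   # format 21c: op, vAA, string@BBBB (2 code units)
RETURN_OBJECT = 0x11  # format 11x: op, vAA (1 code unit)
PREFACE_SIZE = 16     # registers, ins, outs, tries (u16 each), debug_info_off, insns_size (u32 each)


def return_string_insns(string_idx: int, reg: int = 0) -> bytes:
    """Assemble `const-string vREG, string@idx; return-object vREG` (6 bytes)."""
    if not 0 <= string_idx <= 0xFFFF:
        raise ValueError("index needs const-string/jumbo, not handled in this sketch")
    if not 0 <= reg <= 0xFF:
        raise ValueError("register must fit in 8 bits")
    return struct.pack("<BBH", CONST_STRING, reg, string_idx) + struct.pack(
        "<BB", RETURN_OBJECT, reg
    )


def patch_code_item(dex: bytearray, code_off: int, string_idx: int) -> None:
    """Overwrite a method body in place so that it just returns the given string."""
    _regs, ins_size, _outs, tries_size, _debug_off, insns_size = struct.unpack_from(
        "<HHHHII", dex, code_off
    )
    insns = return_string_insns(string_idx)
    new_units = len(insns) // 2
    if insns_size < new_units:
        raise ValueError("original method is too short to hold the patch")
    if tries_size or ins_size > 1:
        raise ValueError("try blocks or extra argument registers: not handled here")

    start = code_off + PREFACE_SIZE
    end = start + insns_size * 2
    dex[start:start + len(insns)] = insns
    dex[start + len(insns):end] = bytes(end - start - len(insns))  # NOP padding
    struct.pack_into("<H", dex, code_off, 1)               # registers_size = 1
    struct.pack_into("<I", dex, code_off + 12, new_units)  # insns_size = 3
```

For string index 0x1234 the assembled bytes are 1a 00 34 12 11 00, which makes the three-code-unit length easy to verify in a hex editor.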

Finally, recalculate and update the DEX header checksum and SHA-1 signature to reflect the modified content. The authors automated these tasks in dexmod (a Python tool using the dexterity library), which has modules for method discovery, string insertion, and custom bytecode edits. After applying the patches and header fixes, re-inspect the DEX with jadx to confirm that the obfuscated methods now return decoded strings, which simplifies further analysis.
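The header fix-up can be sketched with the Python standard library alone. The function name is hypothetical and this is not dexmod's code. Per the DEX format, the SHA-1 signature (offset 12, 20 bytes) covers everything after it, and the Adler-32 checksum (offset 8, 4 bytes) covers everything after itself, including the signature, so the signature has to be written first. The sketch assumes file_size and all other header fields have already been updated.

```python
import hashlib
import struct
import zlib

HEADER_SIZE = 0x70     # fixed size of the DEX header
CHECKSUM_OFF = 8       # u32 Adler-32, covers bytes [12, end)
SIGNATURE_OFF = 12     # 20-byte SHA-1, covers bytes [32, end)
SIGNATURE_END = 32


def fix_dex_header(dex: bytearray) -> None:
    """Recompute the SHA-1 signature, then the Adler-32 checksum, in place."""
    if len(dex) < HEADER_SIZE:
        raise ValueError("buffer is smaller than a DEX header")
    # Signature first: the checksum below is computed over it.
    dex[SIGNATURE_OFF:SIGNATURE_END] = hashlib.sha1(dex[SIGNATURE_END:]).digest()
    checksum = zlib.adler32(dex[SIGNATURE_OFF:]) & 0xFFFFFFFF
    struct.pack_into("<I", dex, CHECKSUM_OFF, checksum)
```

A typical use would be to read the patched file into a bytearray, call fix_dex_header on it, and write the result back before opening it in a decompiler.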