Scaling Up Malware Analysis with Gemini 1.5 Flash

Gemini 1.5 Flash, Google’s lightweight large language model, is applied at production scale to analyze large batches of binaries after automated unpacking and decompilation. In a production-style pipeline, it processed 1,000 Windows executables and DLLs at an average of 12.72 seconds per file (excluding unpacking and decompilation time). The model supports up to 1,000 requests per minute and 4 million tokens per minute. #Gemini1_5Flash #VirusTotal
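
The figures above can be turned into a rough capacity estimate. The following is a back-of-the-envelope sketch in Python, not part of the original study: the constants come from this summary, while the helper names and the 20,000-tokens-per-file value in the usage note are assumptions chosen for illustration.

```python
# Figures quoted in this summary.
FILES = 1_000
AVG_SECONDS_PER_FILE = 12.72
REQUESTS_PER_MINUTE = 1_000
TOKENS_PER_MINUTE = 4_000_000


def sequential_hours(files: int, avg_seconds: float) -> float:
    """Wall-clock hours if files are analyzed strictly one after another."""
    return files * avg_seconds / 3600


def token_budget_per_request(tpm: int, rpm: int) -> int:
    """Average tokens each request may use when the request limit is saturated."""
    return tpm // rpm


def max_files_per_minute(avg_tokens_per_file: int,
                         tpm: int = TOKENS_PER_MINUTE,
                         rpm: int = REQUESTS_PER_MINUTE) -> int:
    """Upper bound on files per minute, assuming one request per file.

    Whichever quota (requests or tokens) runs out first sets the bound.
    """
    return min(rpm, tpm // avg_tokens_per_file)
```

Under these assumptions, analyzing the 1,000 files one at a time would take about 3.5 hours of model time. If decompiled samples averaged a hypothetical 20,000 tokens each, max_files_per_minute(20_000) shows that the token quota, rather than the request quota, would cap throughput.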

Keypoints

  • Gemini 1.5 Flash is optimized for rapid inference and cost-effective deployment, capable of handling up to 1,000 requests per minute and 4 million tokens per minute.
  • A production-style pipeline combines automated unpacking (Mandiant Backscatter), decompilation (Hex-Rays Decompiler), and Gemini 1.5 Flash analysis to scale malware dissection.
  • The study analyzed 1,000 Windows executables and DLLs from VirusTotal to assess how the model handles false positives, obfuscated code, and samples with zero detections.
  • Average processing time per file was 12.72 seconds, with extremes from 1.51 seconds to 59.60 seconds depending on size and obfuscation.
  • Gemini provides detailed, human-readable summaries and can differentiate legitimate software from malware (e.g., game launchers vs. true threats) based on code analysis alone.
  • Examples demonstrate detection of backdoors, C2 exfiltration behavior via TLS, cryptominer behavior, and zero-hour threats like a keylogger, all with extracted IOCs and actionable insights.
  • Challenges remain in unpacking and decompilation quality, obfuscation techniques, and integrating richer context (e.g., data-flow graphs) to continually improve analysis accuracy.
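
The three-stage flow described in the keypoints (unpack, decompile, analyze) can be sketched as a small batch runner. This is a minimal illustration and not the pipeline used in the study: the unpack, decompile, and summarize callables are hypothetical placeholders for Mandiant Backscatter, the Hex-Rays Decompiler, and a Gemini 1.5 Flash request, none of whose real interfaces are shown here. As in the summary, only the model-analysis step is timed.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class Report:
    name: str
    summary: str
    seconds: float  # model-analysis time only (unpack/decompile excluded)


def analyze_sample(name: str, data: bytes,
                   unpack: Callable[[bytes], bytes],
                   decompile: Callable[[bytes], str],
                   summarize: Callable[[str], str]) -> Report:
    """Run one sample through the hypothetical stages, timing the model step."""
    code = decompile(unpack(data))
    start = time.perf_counter()
    summary = summarize(code)
    return Report(name, summary, time.perf_counter() - start)


def run_batch(samples: Iterable[Tuple[str, bytes]],
              unpack, decompile, summarize, workers: int = 8) -> List[Report]:
    """Process samples concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(analyze_sample, n, d, unpack, decompile, summarize)
                   for n, d in samples]
        return [f.result() for f in futures]


def timing_stats(reports: List[Report]) -> Tuple[float, float, float]:
    """Return (min, mean, max) of per-file model-analysis time."""
    times = [r.seconds for r in reports]
    return min(times), sum(times) / len(times), max(times)
```

A thread pool is a reasonable fit here because the model call is network-bound; a real deployment would also need to throttle submissions to stay within the per-minute request and token limits.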

MITRE Techniques

  • [T1071.001] Application Layer Protocol: Web Protocols – The model identifies C2 communication over TLS from the code’s use of OpenSSL to open a secure connection to a C2 IP address. ‘The analysis highlights the code’s use of OpenSSL to establish a secure TLS connection to the IP address on port 443.’
  • [T1027] Obfuscated Files or Information – The pipeline notes obfuscation techniques (e.g., XOR encryption) used by samples, which limit analysis depth. ‘obfuscation techniques like XOR encryption.’
  • [T1041] Exfiltration Over C2 Channel – Malware is described as exfiltrating data and connecting to C2 servers, including references to Russian-domain infrastructure. ‘designed to exfiltrate data and connect to command-and-control (C2) servers located on Russian domains.’
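
To make the XOR obfuscation mentioned under T1027 concrete, here is a generic Python sketch of repeating-key XOR and a single-byte key search. The source does not say how the analyzed samples implement XOR, so the key, the sample string, and both function names are illustrative assumptions only.

```python
from itertools import cycle
from typing import Optional


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key; applying it twice restores the input."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def find_single_byte_key(blob: bytes, marker: bytes) -> Optional[int]:
    """Try all 256 single-byte keys; return the first whose output contains marker."""
    for key in range(256):
        if marker in xor_bytes(blob, bytes([key])):
            return key
    return None


# Hypothetical example: a string hidden with the single-byte key 0x5A.
hidden = xor_bytes(b"connect to https://c2.example.invalid:443", b"\x5a")
recovered_key = find_single_byte_key(hidden, b"https://")
```

Because XOR is its own inverse, the same routine both hides and recovers a string. That makes simple XOR cheap for malware authors to apply and, when a plausible marker such as a URL scheme is known, cheap for analysts to brute-force.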

Indicators of Compromise

  • [Filename] – goopdate.dll, BootstrapPackagedGame-Win64-Shipping.exe (and 4 more samples)
  • [SHA-256] – 0d2115d3de900bcd5aeca87b9af0afac90f99c5a009db7c162101a200fbfeb2c, 07db922be22e4feedbacea7f92983f51404578bd0c495abaae3d4d6bf87ae6d0 (and 4 more hashes)
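
Hash indicators like those above are usually consumed programmatically. The following Python sketch, using only the standard library, pulls SHA-256 digests out of free text and checks a file's contents against them. The function names are illustrative assumptions, not part of any tool mentioned in the source.

```python
import hashlib
import re
from typing import Iterable, List

# A SHA-256 digest is exactly 64 hexadecimal characters.
SHA256_RE = re.compile(r"\b[0-9a-fA-F]{64}\b")


def extract_sha256(text: str) -> List[str]:
    """Return unique, lowercased SHA-256 digests in order of appearance."""
    seen: List[str] = []
    for match in SHA256_RE.findall(text):
        digest = match.lower()
        if digest not in seen:
            seen.append(digest)
    return seen


def sha256_of(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of raw bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_ioc(data: bytes, iocs: Iterable[str]) -> bool:
    """True if the data's digest appears in the indicator list."""
    return sha256_of(data) in {i.lower() for i in iocs}
```

For example, passing the [SHA-256] line above to extract_sha256 yields the two listed digests, which can then be supplied to matches_ioc together with the bytes of a file under review.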

Read more: https://cloud.google.com/blog/topics/threat-intelligence/scaling-up-malware-analysis-with-gemini/