Analyse, hunt and classify malware using .NET metadata

The post demonstrates reliably extracting .NET assembly metadata (MVID, Typelib GUID, assembly name) and using Yara’s dotnet/console modules plus a new Python tool to hunt, cluster and classify .NET malware at scale. The author applied these methods across families (RedLine, Agent Tesla, Quasar, Pure*) and discovered a crypter identified as Cronos-Crypter. #PureCrypter #PureLogStealer #DotNet #RedLine #AgentTesla #Quasar #Cronos-Crypter

Keypoints

  • Extracting .NET metadata (MVID, Typelib GUID, assembly name) is a stable signal for hunting and clustering .NET-based malware.
  • Yara’s dotnet module can directly read .NET GUIDs and fields (e.g., dotnet.guids, dotnet.typelib, dotnet.assembly.name), which is more reliable for detection than matching on plain strings or regex.
  • The author published a Python tool (DotNet-MetaData) that enumerates Typelib, MVID and assembly name across single files or directories, suitable for large-scale analysis.
  • The Python tool requires Python 3, pythonnet, and a compiled dnlib.dll. It skips non-.NET binaries and produces statistics that support clustering and rule generation.
  • Using extracted metadata, the author generated Yara rules and classifiers (including console logging) to label likely families (Quasar, Pure family, AsyncRAT) and found instances of Cronos-Crypter in public datasets.
  • Visualizations (assembly name, Typelib and MVID frequency) helped identify candidate identifiers to build higher-confidence Yara detections for specific families.
  • Caveats: GUIDs can be spoofed, removed, or altered by build or obfuscation tooling; do not key rules on empty (all-zero) GUIDs, and validate rules against unpacked/deobfuscated samples.

MITRE Techniques

  • [T1027] Obfuscated Files or Information – Use of crypters/obfuscators to hide payloads: (‘PureCrypter, a loader and obfuscator for all different kinds of malware such as Agent Tesla and RedLine’)
  • [T1059.001] Command and Scripting Interpreter: PowerShell – PowerShell used in the infection chain for related samples: (‘the post discusses the use of PowerShell scripts in the infection chain of PureLogStealer’)
  • [T1082] System Information Discovery – Extracting assembly metadata and GUIDs from .NET binaries for analysis and clustering: (‘extract metadata from .NET assemblies, including assembly names, which can be used for malware clustering and identification’)
  • [T1005] Data from Local System – Collecting .NET metadata from local sample repositories for bulk analysis and statistics: (‘The Python script is capable of extracting the desired data from a large set of .NET assemblies’)

Indicators of Compromise

  • [File hash] sample referenced on VirusTotal – c201449a0845d659c32cc48f998b8cc95c20153bb1974e3a1ba80c53a90f1b27 (used to illustrate MVID detection)
  • [GUID (MVID/Typelib)] .NET identifiers useful for hunting – MVID: 9066ee39-87f9-4468-9d70-b57c25f29a67, Typelib: 856e9a70-148f-4705-9549-d69a57e669b0 (examples), and many more GUIDs across the dataset
  • [Assembly names] default/suspicious assembly names – “Client”, “Product Design 1”, “Sample Design 1”, “AsyncClient” (used to classify Quasar, Pure family, AsyncRAT)
  • [Yara string] crypter identifier used in rules – “Cronos-Crypter” (used to find crypter instances in Unpac.me dataset)
  • [Platforms/Repositories] sample sources and tooling references – MalwareBazaar (bazaar.abuse.ch), Unpac.me (unpac.me), GitHub repo (https://github.com/bartblaze/DotNet-MetaData)

The technical procedure centers on extracting and leveraging .NET metadata (MVID, Typelib GUID, assembly name) to create precise hunting and classification rules. Simply running strings or regex over binaries produces many false positives and false negatives: the MVID is stored in binary form, so a text search cannot find it, and the Typelib may be the only GUID present as text. Instead, use Yara’s dotnet module (introduced 2017) to query dotnet.guids, dotnet.typelib and dotnet.assembly.name directly, for example: dotnet.guids[0] == "9066ee39-87f9-4468-9d70-b57c25f29a67" or dotnet.typelib == "856e9a70-148f-4705-9549-d69a57e669b0".
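To illustrate why a plain text search misses the MVID, the following minimal sketch converts a GUID string into the 16-byte form in which .NET normally serialises GUIDs (first three fields little-endian) and searches a buffer for both representations. The helper names and the synthetic buffer are assumptions made for illustration; this is not part of the author's tooling, and a real hunt would still rely on a proper metadata parser such as Yara's dotnet module.

```python
import uuid


def guid_to_dotnet_bytes(guid_text: str) -> bytes:
    """Return the 16-byte mixed-endian layout .NET normally uses for a GUID."""
    return uuid.UUID(guid_text).bytes_le


def find_guid(data: bytes, guid_text: str) -> dict:
    """Report the offset of the GUID as ASCII text and as raw bytes (-1 if absent)."""
    return {
        "as_text": data.find(guid_text.encode("ascii")),
        "as_binary": data.find(guid_to_dotnet_bytes(guid_text)),
    }


mvid = "9066ee39-87f9-4468-9d70-b57c25f29a67"

# Synthetic buffer standing in for a file: the GUID is present only in binary form.
blob = b"\x00" * 8 + guid_to_dotnet_bytes(mvid) + b"\x00" * 8

hits = find_guid(blob, mvid)
print(hits)
```

In this synthetic case the text search comes back empty while the binary search locates the GUID, which is the gap the dotnet module closes by parsing the metadata rather than scanning raw bytes.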

For scale, the author created a Python tool (DotNet-MetaData) that iterates a single file or a folder, detects .NET assemblies, and extracts Typelib, MVID and assembly name while skipping non-.NET binaries. The script requires Python 3, pythonnet and a compiled dnlib.dll; its output can be fed into pandas/matplotlib to compute frequencies and produce visuals (assembly name, Typelib and MVID frequency) which reveal dominant identifiers useful for generating Yara classifiers and clustering samples by family or campaign.
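The frequency step can be sketched with the standard library alone. The example below swaps pandas for collections.Counter and assumes the extractor output has already been loaded into a list of dictionaries; the field names (assembly_name, typelib, mvid), the file names and the records themselves are hypothetical placeholders, not the tool's actual output format or real results.

```python
from collections import Counter

EMPTY_GUID = "00000000-0000-0000-0000-000000000000"


def top_values(records, field, n=3):
    """Count the most frequent values of a field, ignoring missing and all-zero GUIDs."""
    counts = Counter(
        r[field]
        for r in records
        if r.get(field) and r[field] != EMPTY_GUID
    )
    return counts.most_common(n)


# Placeholder records for illustration only (hypothetical field names and files).
records = [
    {"file": "a.bin", "assembly_name": "Client",
     "typelib": "856e9a70-148f-4705-9549-d69a57e669b0",
     "mvid": "9066ee39-87f9-4468-9d70-b57c25f29a67"},
    {"file": "b.bin", "assembly_name": "Client",
     "typelib": "856e9a70-148f-4705-9549-d69a57e669b0",
     "mvid": EMPTY_GUID},
    {"file": "c.bin", "assembly_name": "AsyncClient",
     "typelib": "",
     "mvid": "9066ee39-87f9-4468-9d70-b57c25f29a67"},
    {"file": "d.bin", "assembly_name": "Client",
     "typelib": EMPTY_GUID,
     "mvid": EMPTY_GUID},
]

for field in ("assembly_name", "typelib", "mvid"):
    print(field, top_values(records, field))
```

Filtering out empty and all-zero values before counting keeps a missing GUID from appearing as the most common "identifier" in the statistics.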

Practical workflow: run the Python extractor across your repository to gather MVID, Typelib and assembly name statistics; pick identifiers that are high-frequency or otherwise unique; then author Yara rules using the dotnet and console modules to detect and label likely families (examples shown for Quasar, Pure family, AsyncRAT) and to log matches. Keep in mind that GUIDs can be spoofed, removed, or changed by obfuscation or build tooling. Do not use the all-zero GUID as an identifier, and validate rules against unpacked/deobfuscated samples for best results.
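The last step of this workflow, turning chosen identifiers into a rule, can be sketched as a small generator. The function name, rule name and family label below are hypothetical, and the emitted rule is an illustrative template rather than one of the author's published rules. It assumes a Yara release recent enough to ship the console module and dotnet.is_dotnet, and it assumes the inputs are trusted strings that need no escaping.

```python
EMPTY_GUID = "00000000-0000-0000-0000-000000000000"


def build_rule(rule_name, family, assembly_names=(), typelibs=(), mvids=()):
    """Build Yara rule text from .NET metadata identifiers, skipping all-zero GUIDs."""
    checks = [f'dotnet.assembly.name == "{name}"' for name in assembly_names]
    # Typelib is compared exactly as supplied.
    checks += [
        f'dotnet.typelib == "{guid}"'
        for guid in typelibs if guid != EMPTY_GUID
    ]
    # MVIDs are lowercased on the assumption that dotnet.guids uses lowercase hex.
    checks += [
        "for any i in (0..dotnet.number_of_guids - 1) : "
        f'(dotnet.guids[i] == "{guid.lower()}")'
        for guid in mvids if guid != EMPTY_GUID
    ]
    if not checks:
        raise ValueError("no usable identifiers supplied")

    joined = " or\n            ".join(checks)
    lines = [
        'import "dotnet"',
        'import "console"',
        "",
        f"rule {rule_name}",
        "{",
        "    condition:",
        "        dotnet.is_dotnet and",
        "        (",
        f"            {joined}",
        "        ) and",
        f'        console.log("Likely family: {family}")',
        "}",
    ]
    return "\n".join(lines)


rule_text = build_rule(
    "DotNet_Metadata_Example",          # hypothetical rule name
    "example cluster",                  # hypothetical label
    typelibs=["856e9a70-148f-4705-9549-d69a57e669b0"],
    mvids=["9066ee39-87f9-4468-9d70-b57c25f29a67", EMPTY_GUID],
)
print(rule_text)
```

Generating rules from the statistics rather than writing them by hand keeps the all-zero GUID check in one place; any generated rule should still be tested against known-good and known-bad samples before use.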

Read more: https://www.hendryadrian.com/analyse-hunt-and-classify-malware-using-net-metadata/