ReversingLabs researchers uncovered a new malicious campaign leveraging ML models serialized in the Pickle format to distribute infostealer malware via PyPI packages masquerading as Alibaba AI Labs SDKs. This campaign highlights the emerging threat of malware embedded in ML file formats, emphasizing the need for advanced detection tools tailored to AI/ML software supply chains. #nullifAI #Pickle #PyPI #ReversingLabs
Key points
- Threat actors exploited the Python Pickle file format to embed malicious infostealer payloads inside PyTorch ML models distributed through PyPI packages.
- Three malicious packages—aliyun-ai-labs-snippets-sdk, ai-labs-snippets-sdk, and aliyun-ai-labs-sdk—were uploaded to PyPI posing as Alibaba AI Labs SDKs but served no legitimate function.
- The infostealer payload exfiltrated user information, network details, organizational data, and the contents of the victim’s .gitconfig file; the targets were likely developers in China.
- In two of the packages, the malicious payload was loaded from the __init__.py script immediately upon installation.
- Base64 obfuscation was used in some package versions to make detection more difficult for security solutions.
- ReversingLabs’ Spectra Assure platform improvements, including Threat Hunting Policies (THPs), enabled detection of these malicious ML models and flagged suspicious serialized Pickle-based code execution.
- This campaign demonstrates an evolving attack surface where AI/ML models are increasingly targeted within software supply chains, necessitating zero-trust policies and enhanced detection capabilities.
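The points above turn on the fact that unpickling can run code, so a model file is best inspected without loading it. Below is a minimal sketch of one way to do that, using the standard library's `pickletools.genops`, which parses opcodes but does not execute them. The deny-list `SUSPICIOUS_MODULES` and the helper names `find_suspicious_imports` and `scan_model_file` are hypothetical and chosen for illustration; this is not how Spectra Assure works, and a real scanner would need far broader rules.

```python
import pickletools
import zipfile

# Hypothetical deny-list for illustration only; not taken from the source report.
SUSPICIOUS_MODULES = {"os", "posix", "nt", "subprocess", "builtins", "sys", "socket", "base64"}


def find_suspicious_imports(data: bytes) -> list[str]:
    """Return 'module.name' references in a pickle stream that hit the deny-list.

    The stream is only parsed opcode by opcode; nothing is unpickled or executed.
    """
    hits = []
    strings = []  # string arguments seen so far, used to resolve STACK_GLOBAL
    try:
        for opcode, arg, _pos in pickletools.genops(data):
            if opcode.name == "GLOBAL":
                # pickletools reports the argument as "module name" (space separated)
                module, _, name = arg.partition(" ")
            elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
                # Heuristic: assume the two most recent strings are module and name
                module, name = strings[-2], strings[-1]
            else:
                if isinstance(arg, str):
                    strings.append(arg)
                continue
            if module.split(".")[0] in SUSPICIOUS_MODULES:
                hits.append(f"{module}.{name}")
    except Exception:
        # Malformed or truncated stream: keep whatever was parsed before the error
        pass
    return hits


def scan_model_file(path: str) -> dict[str, list[str]]:
    """Scan a PyTorch zip archive (or a bare pickle file) without loading it."""
    results = {}
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            for member in archive.namelist():
                if member.endswith(".pkl"):
                    results[member] = find_suspicious_imports(archive.read(member))
    else:
        with open(path, "rb") as handle:
            results[path] = find_suspicious_imports(handle.read())
    return results
```

An empty result does not prove a file is safe: the `STACK_GLOBAL` handling is a heuristic, and imports can be reached indirectly. The sketch only shows that the imported callables in a Pickle stream are visible to static inspection.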
MITRE Techniques
- [T1059] Command and Scripting Interpreter – Malicious Python code inside Pickle serialized ML models is executed immediately upon package installation through the __init__.py script (‘malicious PyTorch models are loaded from the __init__.py script immediately upon installation’).
- [T1074] Data Staged – The infostealer collects and stages data such as user, network, and organizational information as well as contents of the .gitconfig file before exfiltration (‘The malicious payload exfiltrates basic information about the infected machine and the content of the .gitconfig file’).
- [T1140] Deobfuscate/Decode Files or Information – Some malicious payloads include an additional Base64 encoding layer for obfuscation to evade detection (‘The malicious payload from the PyTorch model is obfuscated by an additional layer of a Base64 encoding’).
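The Base64 layer noted under T1140 hides payload text from simple string matching, but it can be peeled off statically. The sketch below is one possible approach, not the method used by ReversingLabs: it walks the Pickle opcodes with `pickletools.genops` (no execution), collects string and bytes arguments that look like Base64, decodes them, and keeps those whose decoded form contains code-like markers. `B64_RE`, `CODE_MARKERS`, and `decode_b64_layers` are hypothetical names and thresholds picked for illustration.

```python
import base64
import binascii
import pickletools
import re

# Hypothetical heuristics: at least 24 Base64 characters plus optional padding
B64_RE = re.compile(r"^[A-Za-z0-9+/]{24,}={0,2}$")
# Hypothetical markers suggesting that the decoded bytes are Python source
CODE_MARKERS = (b"import ", b"exec(", b"eval(", b"subprocess", b"socket")


def decode_b64_layers(data: bytes) -> list[bytes]:
    """Return decoded Base64 blobs from a pickle stream that look like code."""
    found = []
    for _opcode, arg, _pos in pickletools.genops(data):
        if isinstance(arg, bytes):
            try:
                text = arg.decode("ascii")
            except UnicodeDecodeError:
                continue
        elif isinstance(arg, str):
            text = arg
        else:
            continue
        text = text.strip()
        if not B64_RE.match(text):
            continue
        try:
            decoded = base64.b64decode(text, validate=True)
        except (binascii.Error, ValueError):
            continue  # looked like Base64 but did not decode cleanly
        if any(marker in decoded for marker in CODE_MARKERS):
            found.append(decoded)
    return found
```

Payloads that split the encoded string into pieces or add further encoding layers would evade this check, so it is a starting point rather than a complete detector.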
Indicators of Compromise
- [Domain] Attacker-controlled exfiltration servers – The source does not name specific domains; they are implied only as the destination for the stolen data.
- [File Names] Malicious PyPI packages – aliyun-ai-labs-snippets-sdk, ai-labs-snippets-sdk, aliyun-ai-labs-sdk (packages delivering the malicious PyTorch models).
- [File Format] Malicious PyTorch ML models – PyTorch zipped Pickle files containing infostealer code embedded inside the PyPI packages.
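The package names listed above can be checked against a local Python environment. The following is a minimal sketch that uses the standard library's `importlib.metadata`; the helper names `normalize`, `match_iocs`, and `installed_ioc_packages` are hypothetical. Names are normalized in the PEP 503 style so that variants such as `Aliyun_AI_Labs_SDK` still match.

```python
import re
from importlib import metadata

# Package names taken from the indicators listed above
IOC_PACKAGES = {
    "aliyun-ai-labs-snippets-sdk",
    "ai-labs-snippets-sdk",
    "aliyun-ai-labs-sdk",
}


def normalize(name: str) -> str:
    """Normalize a distribution name: lowercase, runs of -, _ and . become '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def match_iocs(names) -> list[str]:
    """Return the sorted IoC package names present in an iterable of names."""
    return sorted({normalize(n) for n in names if n} & IOC_PACKAGES)


def installed_ioc_packages() -> list[str]:
    """Check the distributions installed in the current environment."""
    names = (dist.metadata["Name"] for dist in metadata.distributions())
    return match_iocs(names)
```

Calling `installed_ioc_packages()` inside each virtual environment of interest returns an empty list when none of the three packages is installed there. It inspects only the environment of the running interpreter, so it says nothing about other environments or about packages that were installed and later removed.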
Read more: https://www.reversinglabs.com/blog/malicious-attack-method-on-hosted-ml-models-now-targets-pypi