How to Analyze Malicious PDF Files

PDFs are a popular malware delivery vector, most often arriving as phishing email attachments. Attackers abuse PDF features such as JavaScript, links, and embedded streams, and exploit vulnerabilities in PDF readers, to drop payloads. The article walks through two real-world samples, the open-source tools used to analyze them, and how automated platforms can help triage high volumes of suspicious PDFs. #RaccoonStealer #Azorult #CVE-2017-11882 #Amazon

Keypoints

  • PDFs are widely used and were the most commonly attached malicious file type in phishing emails in 2023.
  • Attackers can hide malicious code inside PDFs using streams, compression/filters, and embedded objects, making analysis harder.
  • PDFs can trigger actions through links, JavaScript, and embedded content to download payloads or steal data.
  • Various PDF readers have vulnerabilities that threat actors can exploit to execute code and compromise endpoints.
  • Example 1 demonstrates a malicious PDF (MD5: a2852936a7e33787c0ab11f346631d89) used in a German-language phishing campaign leading to credential theft and Raccoon Stealer.
  • Example 2 shows a second malicious PDF (MD5: 1ba5c7ecab62609e4f1d44192cef850e) containing an embedded RTF document that exploits CVE-2017-11882, also linked to Raccoon Stealer.
  • The article highlights open-source tools (peepdf, pdf-parser, rtfobj) and Intezer for automated analysis and alert triage to handle high volumes of PDFs.
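
The points about streams, filters, and automatic actions can be illustrated with a minimal Python sketch. It is not how peepdf or pdf-parser work internally; it assumes a simple, uncorrupted file, only handles streams compressed with a single FlateDecode (zlib) filter, and does not decode hex-escaped names such as /J#53. The names `SUSPICIOUS_NAMES`, `count_names`, and `inflate_streams` are hypothetical, chosen for this example.

```python
import re
import zlib

# PDF name tokens often checked during triage (illustrative, not exhaustive).
SUSPICIOUS_NAMES = [
    b"/JavaScript", b"/JS", b"/OpenAction", b"/AA",
    b"/Launch", b"/EmbeddedFile", b"/URI",
]

# Raw bytes between the "stream" and "endstream" keywords.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.DOTALL)


def count_names(data: bytes) -> dict:
    """Count occurrences of each suspicious name token in the raw file."""
    counts = {}
    for name in SUSPICIOUS_NAMES:
        # The lookahead stops a short token matching the start of a longer name.
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        counts[name.decode()] = len(re.findall(pattern, data))
    return counts


def inflate_streams(data: bytes) -> list:
    """Return every stream body, zlib-inflated when that succeeds."""
    bodies = []
    for match in STREAM_RE.finditer(data):
        raw = match.group(1)
        try:
            bodies.append(zlib.decompress(raw))
        except zlib.error:
            # Not FlateDecode, or several filters are chained: keep raw bytes.
            bodies.append(raw)
    return bodies
```

A typical use would be to read a sample with `open(path, "rb").read()` inside an isolated analysis environment, review the counts first, and then inspect the inflated streams for scripts or embedded files.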

MITRE Techniques

  • [T1566.001] Spearphishing Attachment – Used PDFs attached to phishing emails to lure victims into opening the file. “Many phishing attacks will contain links, which may appear as clickable images of buttons, coupons, fake CAPTCHA, fake play buttons, or QR codes.”
  • [T1027] Obfuscated Files or Information – Streams in PDFs can be compressed and encoded to hide embedded code. “Streams can contain any type of data (including scripts and binary files) and they can be compressed and encoded which makes it harder to detect embedded code inside files.”
  • [T1059.007] JavaScript – Attackers embed JavaScript in a PDF so that a script runs as soon as the document is opened. “PDF files natively support JavaScript, so attackers can create files that will execute scripts once a file has been opened…”
  • [T1203] Exploitation for Client Execution – Exploiting vulnerabilities in PDF readers to execute code and gain access. “Adobe PDF Reader alone has 91 reported vulnerabilities. Therefore threat actors can make PDF files that will exploit vulnerabilities, which will allow them to execute code and gain access to the victim’s endpoint.”
  • [T1056.003] Web Portal Capture – Credentials are collected through a fake login form reached from the PDF. “the data will actually be submitted to a malicious site http://sellercentral[.]amazon.de.56U8GTHDGT4U7YWEWE84GTYS.abecklink.com for credential stealing.”
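
The credential-stealing URL quoted above places a trusted domain in the leading labels of a hostname that actually belongs to another domain. The following minimal sketch flags that pattern. It is a heuristic for illustration only, not a method from the article; `refang` and `embeds_trusted_domain` are hypothetical helper names, and the list of trusted domains to test against is assumed to come from the analyst.

```python
from urllib.parse import urlsplit


def refang(url: str) -> str:
    """Undo common defanging so the URL can be parsed (never fetched)."""
    return url.replace("[.]", ".").replace("hxxp", "http")


def embeds_trusted_domain(url: str, trusted: str) -> bool:
    """True when `trusted` appears inside the hostname but does not end it."""
    host = (urlsplit(refang(url)).hostname or "").lower()
    trusted = trusted.lower()
    # A genuine host is the trusted domain itself or one of its subdomains.
    is_genuine = host == trusted or host.endswith("." + trusted)
    # Compare whole labels only, so "notamazon.de" does not count as a hit.
    appears_inside = ("." + trusted + ".") in ("." + host + ".")
    return appears_inside and not is_genuine


phish = "http://sellercentral[.]amazon.de.56U8GTHDGT4U7YWEWE84GTYS.abecklink.com"
print(embeds_trusted_domain(phish, "amazon.de"))  # True
print(embeds_trusted_domain("https://sellercentral.amazon.de/", "amazon.de"))  # False
```

The check only parses strings, so it is safe to run on indicators taken directly from a report.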

Indicators of Compromise

  • [File Hash] Example PDFs – a2852936a7e33787c0ab11f346631d89, 1ba5c7ecab62609e4f1d44192cef850e
  • [File Name] Example PDFs referenced – example1.pdf, example2.pdf
  • [URL] Credential-stealing domain – http://sellercentral[.]amazon.de.56U8GTHDGT4U7YWEWE84GTYS.abecklink.com
  • [CVE] Vulnerability exploited – CVE-2017-11882
  • [Process] Word process used to open embedded content – winword.exe
  • [Malware] Related malware families – Raccoon Stealer, Azorult
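
The file hashes above can be matched against a folder of collected attachments. The sketch below is a minimal example, assuming the samples sit in a local directory and carry a .pdf extension; `KNOWN_BAD_MD5`, `md5_of`, and `find_matches` are hypothetical names introduced here. MD5 is used only because the indicators are published as MD5 values.

```python
import hashlib
from pathlib import Path

# MD5 values of the two example PDFs listed in the indicators above.
KNOWN_BAD_MD5 = {
    "a2852936a7e33787c0ab11f346631d89",
    "1ba5c7ecab62609e4f1d44192cef850e",
}


def md5_of(path, chunk_size=65536) -> str:
    """Hash a file in chunks so large samples are not loaded into memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_matches(directory, known=KNOWN_BAD_MD5) -> list:
    """Return sorted paths of .pdf files whose MD5 is in the known set."""
    return sorted(
        str(path)
        for path in Path(directory).rglob("*.pdf")
        if path.is_file() and md5_of(path) in known
    )
```

Calling `find_matches` on a quarantine directory returns the paths of any files matching the two example hashes. Attachments that were renamed to another extension would need a broader glob pattern.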

Read more: https://intezer.com/blog/incident-response/analyze-malicious-pdf-files/