By Floser Bacurio Jr., Bernadette Canubas, Michaelo Oliveros · April 02, 2024
Introduction
Cyber attackers are always finding new ways to outsmart security systems and distribute malware effectively. We discovered an interesting detection evasion technique: delivering archive files that contain overly large malware payloads.
This technique aims to overwhelm scanning engines with very large files in the hope of avoiding file analysis and immediate detection. Scanning larger files takes more time and resources, which can slow overall system performance during the scan. To limit their memory footprint, some antivirus engines set size limits for scanning, so oversized files may be skipped. The oversized payload must also compress efficiently inside an archive so that it stays small enough to distribute through common channels such as email.
In this blog, we will explore telemetry insights, delivery methods, compression efficiency, and file types related to oversized malware payloads. We will also discuss why effective compression matters for delivering oversized malware and how these payloads increase their file size without corrupting the file or disrupting its execution flow.
Upstream Insights
Adversaries keep exploring diverse ways to lure victims and deliver malicious files to them. Large volumes of email bearing malicious URLs and attachments were delivered to victims using different detection evasion techniques. Based on our 2023 telemetry, we observed a spike in emails with compressed attachments containing overly large files.
Monthly Distribution of Inflated Malware File Types
We broke down the inflated payloads by file type for each month. Portable Executable files (DLL and EXE) appear consistently across all months. In addition, an unusual set of inflated malicious documents was observed in March, which is related to the EMOTET campaign.
Distribution Archive and its Inflated File Type Content
We also discovered an interesting relationship between archive formats and the inflated file types they carry. The data reveals that “ZIP” is the most used archive format for compressing oversized malware files. Its prevalence can be attributed to it being one of the most widely supported archive formats across operating systems, and to users being familiar with how to extract and manage files from it.
This also indicates that document files and Portable Executable files (such as EXE and DLL) are the top file types that utilize this file inflation technique.
Receiving/Target Industries
Analyzing the data on oversized malware reveals a widespread trend across different industries. The technique is not tied to any particular sector, which highlights its adaptability and scalability as a threat.
Malware Delivery
Inflated malware is often delivered as an email attachment or downloaded via a URL, usually accompanied by persuasive language to entice recipients into opening it. These attachments typically appear as relatively small files but, upon extraction, expand into much larger files, sometimes exceeding 100 MB.
In this example, the ZIP archive attachment is small (1.77 MB) and can be easily distributed due to its compact size.
83B38XM4C_ETRANSFER_RECEIPT.zip 1.77 MB
└───83B38XM4C_ETRANSFER_RECEIPT.iso 300 MB
    └───83B38XM4C_ETRANSFER_RECEIPT.exe 300 MB
Upon decompression, it reveals an ISO file which is another container layer. At this point, the inflated malware remains concealed. It is only when the user mounts the ISO file that the inflated malware is exposed and becomes ready for execution. The inflated malware has a substantial file size of 300 MB. This significant size can potentially surpass the scanning limitations of conventional security scanners.
This sample also uses an ISO image as an additional container layer to conceal the inflated malware. This is another evasion technique that hides the malware from scanners that do not recursively extract archives and containers before scanning.
Here is another example: an email containing a download link that points to a compressed archive holding the inflated malware file.
document_atqxJ9.zip 77.6 KB
├───password.txt 4 bytes
└───New Document001.rar 77.3 KB
    └───New Document001.pdf.exe 664 MB
In this example, the malware is hidden inside a second layer, a compressed “.rar” archive. The inflated malware file expands to 664 MB after extraction. It also uses a masquerading technique: the malware file pretends to be a PDF file by changing its icon and adding a “.pdf” extension before the real “.exe” extension.
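The size gap between an archive member's compressed and extracted form, as seen in both examples above, can be checked before anything is extracted. The following is a minimal sketch, assuming Python's standard zipfile module; the function name flag_inflated_members and both thresholds are hypothetical values chosen for illustration, not figures from our telemetry. It relies on the sizes declared in the ZIP directory and does not recurse into nested containers such as ISO or RAR.

```python
import io
import zipfile


def flag_inflated_members(zip_bytes, max_size=100 * 1024 * 1024, min_ratio=50.0):
    """Return (name, uncompressed_size, ratio) for suspicious ZIP members.

    max_size and min_ratio are illustrative thresholds (assumptions).
    """
    flagged = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # Ratio of declared extracted size to stored (compressed) size.
            ratio = info.file_size / max(info.compress_size, 1)
            if info.file_size > max_size or ratio >= min_ratio:
                flagged.append((info.filename, info.file_size, round(ratio, 1)))
    return flagged
```

A member that is both very large when extracted and compresses by several orders of magnitude is a reasonable candidate for closer inspection rather than being skipped for exceeding a size limit.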
File Compression Effectiveness
Distributing a very large malicious file could reduce its effectiveness because of factors such as network limitations and conspicuous file downloads. This is where file compression comes into play. File compression encodes data so that it occupies less space than the original file. One of the basic ideas behind compression algorithms is to reduce data redundancy: the more redundant the data, the more the file shrinks when compressed.
A quick demonstration shows how effective compression can be. We created two simple EXE binaries as dummy files and inserted a dummy section of around 10 MB into each: one filled with random bytes and the other filled with a single repeated byte value, as shown in the figure below.
After ZIP compression, both files are smaller, but the difference between them is enormous. The compressed first file (Hello1.exe), whose dummy section contains random data, shrinks only slightly, while the compressed second file (Hello2.exe), whose dummy section contains repeated bytes, shrinks to a small fraction of its original size.
Most of the samples we have identified share the trait demonstrated above: their filler data is highly repetitive and therefore compresses extremely well. Perpetrators have exploited this to efficiently distribute malicious files with inflated sizes. File compression plays a crucial role in the prevalence of oversized malware, allowing archives to stay compact and conserving storage space and bandwidth during file transfers.
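The same effect can be reproduced with a few lines of code. This is a minimal sketch, assuming Python's zlib module (DEFLATE, the algorithm most ZIP tools use by default); it models only the two 10 MB dummy sections, not the complete Hello1.exe and Hello2.exe files, and the variable names are illustrative.

```python
import os
import zlib

SIZE = 10 * 1024 * 1024  # roughly the 10 MB dummy section described above

random_section = os.urandom(SIZE)    # stands in for the random-data section
repeated_section = b"\x00" * SIZE    # stands in for the repeated-byte section

random_packed = len(zlib.compress(random_section, 9))
repeated_packed = len(zlib.compress(repeated_section, 9))

print(f"random bytes:   {SIZE} -> {random_packed} bytes")
print(f"repeated bytes: {SIZE} -> {repeated_packed} bytes")
```

Random data has almost no redundancy, so its compressed size stays close to the original, while the repeated-byte buffer collapses to a tiny fraction of its size.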
In the upcoming section, we will explore various techniques that demonstrate the inflation of samples.
Oversized Payload: What’s Inside the Archive?
Shortcut Files
The malicious use of shortcut files (.lnk) equipped with scripts has become a prevalent initial attack vector. However, in this evasion tactic, the .lnk files are now inflated to avoid detection.
In our recent investigation, we discovered malicious shortcut files (.lnk) of unusually enormous size, compressed within an archive. As depicted below, the ZIP file, 1,170,465 bytes in size (~1.116 MB), contains a shortcut that expands to 1,141,991,714 bytes (over 1 GB) upon extraction. This significant size increase poses a challenge for file analysis because it surpasses typical file scanning limits, reducing the likelihood of detection.
In this case, the shortcut file has been inflated by taking advantage of the Extra Data Block structure, with a substantial volume of extraneous data inserted across multiple extra data block sections.
According to Microsoft [link], the extra data section is a data structure appended to the fundamental Shell Link Binary File Format data, serving to provide additional information about the link target. This supplementary data is crucial for enhancing the functionality or visual presentation of the shortcut.
Illustrated in the figure below, the Extra Data Block section has been intentionally filled with trash data. The breakdown is as follows:
- sExtraDataBlock[0] has a size of CFD0h (~52 KB) and accommodates a trash file (in this example, a block with a 0xD0CF header, which corresponds to an OLE file).
- Sections sExtraDataBlock[1] through [30856] each have a size of 9090h (~36 KB).
- The cumulative size, when summed up (36 KB x 30856 sections), results in a substantial 1.06 GB, effectively inflating the file.
- sExtraDataBlock[30857] serves as the Terminal Block, marking the end of the ExtraDataBlock Structure.
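The breakdown above can be checked with a short calculation, followed by a sketch of how the blocks can be enumerated. This is a minimal illustration in Python, assuming the caller has already located the start of the ExtraData section (parsing the shell link header and the optional structures that precede it is not shown). The function name walk_extra_data is hypothetical; the rule that a block size below 4 marks the terminal block comes from the Shell Link Binary File Format specification.

```python
import struct

# Sizes taken from the breakdown above.
FIRST_BLOCK = 0xCFD0      # sExtraDataBlock[0]
FILLER_BLOCK = 0x9090     # sExtraDataBlock[1] .. [30856]
FILLER_COUNT = 30856

total_extra = FIRST_BLOCK + FILLER_BLOCK * FILLER_COUNT
print(f"{total_extra} bytes (~{total_extra / 2**30:.2f} GB) of extra data")


def walk_extra_data(buf, offset=0):
    """Yield (block_size, signature) per ExtraData block until the terminal block."""
    while offset + 4 <= len(buf):
        (block_size,) = struct.unpack_from("<I", buf, offset)
        if block_size < 4:                      # terminal block ends the section
            return
        if offset + 8 > len(buf) or offset + block_size > len(buf):
            raise ValueError("truncated ExtraData block")
        (signature,) = struct.unpack_from("<I", buf, offset + 4)
        yield block_size, signature
        offset += block_size
```

Counting the blocks and summing their sizes gives a quick indication of whether a shortcut carries far more extra data than a legitimate link would need.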
In summary, the deliberate insertion of extensive trash data, be it in the form of repeated bytes or multiple files within the optional Extra Data Block section, serves the dual purpose of ensuring the uninterrupted execution of malware and evading detection mechanisms. All of this is achieved while preserving the integrity of the underlying Shortcut File structure.
Portable Executable Files (EXE, DLL)
Portable Executable files such as EXE and DLL are the most common file types used to deliver inflated malware, because of their broad execution compatibility within the Windows ecosystem. Threat actors employ this technique by adding massive amounts of trash data to various sections of the file without affecting the execution flow of the malware. This trash data can be a series of repeating bytes or irrelevant files that are inserted or appended.
In this section, we will delve into various techniques for inflating portable executables that we have uncovered being employed by malicious actors in the wild.
Portable Executable (PE) Section Inflation (59a87df740546eb35968dbfb39bcfc85)
One of the file inflation tricks we examined adds trash data to a resource section. As shown in the figure below, the resource section is more than 185 MB (0x0B195000) in physical size. A closer look at the resource section shows that the BITMAP directory contains several bitmap files with random names.
Extracting the contents of the BITMAP resource directory reveals numerous bitmap files, each of them very large.
Looking at one of the bitmap files (Z3.bmp, 40 MB), its pixel storage contains a very large number of scan lines. The pixel storage holds the bitmap's pixels packed in rows, as RGB (Red, Green, Blue) values that define the colors shown on screen. As the figure below shows, every RGB value is white (0xFF, or 255). A bitmap that displays nothing but white serves no visual purpose. One likely explanation is that a single repeated byte value was chosen so that the file shrinks more effectively once compressed.
Using ResourceHacker, a tool that can remove resource data from a compiled executable without recompilation, file readjustment, or realignment, we removed the bitmap resource directory from the file and then executed it. The sample continued to function, indicating that this resource does not contribute to its logic. The inflated bitmaps in the resource directory contribute significantly to the inflated file size. The following illustration shows the difference in file size between the sample with the trash data in the bitmap resource directory and its counterpart without it.
Portable Executable (PE) Overlay Inflation (02da968bfb06dd10524f0f00d005b52a)
Another technique observed in inflated malware is appending a huge amount of unused overlay data. Overlay data is not defined as part of the image in the PE header. It is therefore not part of the virtual image once the PE is loaded and does not consume much memory.
The data appended in the overlay is arbitrary. It is hard to say what a given overlay is used for without understanding the format of the data placed there. Certificates are among the most common data appended as an overlay, and encrypted or packed data is another example. In a nutshell, any data can be placed in the overlay.
Perpetrators took advantage of this property. Because the overlay is not part of the virtual image, they can append a huge overlay without consuming much memory at execution time. As shown in the figure below, the inflated file has been appended with more than 1 GB of overlay data. It is also worth noting that the appended data consists of a single repeated byte value, which makes it shrink far more effectively once compressed.
Trimming down this binary without the overlay, the original size would only be around 100 KB as shown in the figure below.
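The overlay size can be estimated from the PE headers alone: everything past the end of the last section's raw data is overlay. The following is a minimal sketch in Python using only the struct module; the function name pe_overlay_size is hypothetical, and the header offsets follow the standard PE/COFF layout. It assumes the supplied header bytes cover the whole section table, and it counts an appended certificate table as overlay as well.

```python
import struct


def pe_overlay_size(header, file_size):
    """Return the number of bytes that follow the last section's raw data.

    header: the first bytes of the file (enough to hold the section table).
    file_size: total size of the file on disk.
    """
    if header[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (pe_offset,) = struct.unpack_from("<I", header, 0x3C)        # e_lfanew
    if header[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    (num_sections,) = struct.unpack_from("<H", header, pe_offset + 6)
    (opt_header_size,) = struct.unpack_from("<H", header, pe_offset + 20)
    section_table = pe_offset + 24 + opt_header_size
    end_of_image = section_table + 40 * num_sections              # end of headers
    for index in range(num_sections):
        entry = section_table + 40 * index
        # SizeOfRawData and PointerToRawData sit at offsets 16 and 20.
        raw_size, raw_ptr = struct.unpack_from("<II", header, entry + 16)
        if raw_size:
            end_of_image = max(end_of_image, raw_ptr + raw_size)
    return max(file_size - end_of_image, 0)
```

Because only the headers are read, the check stays cheap even for a file of more than 1 GB: read the first few kilobytes and pass the on-disk size (for example from os.path.getsize) as file_size.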
Object Linking and Embedding Files (Document Files)
In March 2023, numerous documents exhibiting inflated sizes were observed in circulation. These OLE (Object Linking and Embedding) format documents underwent inflation by appending trash data beyond the calculated maximum file size mapped by the FAT (File Allocation Table). Our observations indicate that this trash data is consistently added after the FAT’s maximum file size boundary, potentially facilitating evasion from detection mechanisms.
To determine the boundaries of a file within the Compound File Binary (CFB) format, the OLE header provides crucial information, including the file’s major version and the number of File Allocation Table (FAT) sectors. This data enables us to estimate the maximum file size that can be mapped by the FAT, under the assumption that all sectors are utilized, and none are marked as free.
The file information below is an example of an overly large OLE file, named “Malware.doc”, with an actual size of 538,142,720 bytes.
The calculation of the maximum file size covered by the FAT is based on the following details:
- The file’s major version (wVerDLL) is identified as 3. According to the MS-CFB specification, this corresponds to a sector size of 512 bytes.
- The OLE header indicates there are 4 FAT sectors (dwNumFatSects).
- Each FAT sector comprises an array capable of holding 128 entries. Each entry is a 32-bit sector number, pointing to another sector within the file.
- Multiplying 128 (the number of entries per FAT sector) by 4 (the total number of FAT sectors) results in 512, which represents the total number of sectors that can be mapped by the FAT.
- The maximum file size that the FAT can map is determined using the formula: (N + 1) x Sector Size, where ‘N’ is the total number of mappable sectors. The ‘+1’ accounts for the 512-byte header, which occupies the space of one sector ahead of sector 0.
- Applying this formula, (512 + 1) x 512 bytes, we calculate the estimated maximum FAT file size to be 262,656 bytes.
The calculated file size (262,656 bytes) marks the boundary of valid data within the file. Data appended beyond this limit does not influence the execution of OLE (Object Linking and Embedding) processes and could be used to artificially increase the file size without affecting its operational functionality.
To compute the excess data beyond the maximum FAT size:
- Excess data = actual file size (538,142,720 bytes) – estimated maximum file size (262,656 bytes)
- 538,142,720 bytes – 262,656 bytes = 537,880,064 bytes (~513 MB)
The substantial amount of excess data, amounting to about 513MB, can surpass the scanning limits of some detection systems, posing a risk of evading detection.
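The calculation above can be reproduced directly from the first 512 bytes of a file. This is a minimal sketch in Python, assuming the header field offsets defined in the MS-CFB specification (sector shift at 0x1E, number of FAT sectors at 0x2C); the function name cfb_excess_bytes is hypothetical. It derives the sector size from the sector shift field, which is 9 (512 bytes) for major version 3, and it is independent of the OLEMAP tool.

```python
import struct

CFB_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")


def cfb_excess_bytes(header, file_size):
    """Return (max_fat_mapped_size, excess_bytes) for a compound file."""
    if header[:8] != CFB_SIGNATURE:
        raise ValueError("not a Compound File Binary header")
    (sector_shift,) = struct.unpack_from("<H", header, 0x1E)
    (num_fat_sectors,) = struct.unpack_from("<I", header, 0x2C)
    sector_size = 1 << sector_shift
    entries_per_fat_sector = sector_size // 4        # each entry is 32 bits
    mappable_sectors = entries_per_fat_sector * num_fat_sectors
    # The extra sector accounts for the header that precedes sector 0.
    max_mapped_size = (mappable_sectors + 1) * sector_size
    return max_mapped_size, max(file_size - max_mapped_size, 0)
```

With the values from the example (sector size 512, 4 FAT sectors, file size 538,142,720 bytes), the function returns 262,656 and 537,880,064, matching the figures above.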
The OLEMAP tool is useful in facilitating the analysis of OLE documents, offering a direct approach to uncovering OLE properties, and identifying potential excess data.
The illustration provided below, produced using the OLEMAP tool, corroborates our analysis. It highlights the document’s major version, enumerates the FAT (File Allocation Table) sections, outlines the FAT’s maximum mappable size, and indicates the magnitude of data appended in the file.
Main payload
Our analysis of the samples collected indicates that file inflation is employed as an element of a broader attack strategy, primarily serving as a mechanism for downloading. Its primary function is to retrieve and initiate the execution of additional malicious payloads. Notably, the primary payloads associated with these inflated files are identified as malicious Remote Access Tools, such as QUASAR RAT, AGENTTESLA, REMCOS, and ASYNCRAT. Additionally, our findings suggest that the inflated document files identified in March 2023 are linked to the EMOTET campaign.
Moreover, the application of this file inflation technique extends to other malware categories, including banking trojans and ransomware, highlighting its versatility and widespread use in cyber threats.
Conclusion
Threat actors are continuously innovating to bypass detection mechanisms, developing sophisticated strategies to deploy and execute attacks on targeted networks or systems. As demonstrated by the techniques outlined above, they skillfully manipulate and exploit specific file types so that the files reach victims undetected by security solutions and are then opened. At Trellix, we are committed to offering robust protection against such attacks and to safeguarding against these evolving threats.
IOC (indicator of compromise) / Trellix Detection
Malware archive:
EF03E6CCB9B97C898B779381B01402FA
9E1C78CFFBA238E340E4F8C1F6B2A20B
02DA968BFB06DD10524F0F00D005B52A
18C9003A0A5EDD71FFCA13A815D612CB
7E5CC47880BF2CCD244CF925093D2D16
Inflated payload:
5D863654DD020D744AFF25AE91B251BF
473AB5076D99785ECBE1F933F0747C1B
849DC8705536F331DA1C4978ABDBA92C
B132C1FF68E000A70B3C085CFDD72FEB
Detection List of oversized files:
Named Detection:
FEC_Downloader_Macro_Generic_57
FEC_Trojan_LNK_Generic_10
FEC_Trojan_LNK_Generic_9
FE_Loader_MSIL_Generic_100
FE_InfoStealer_MSIL_Generic_6
FE_Loader_MSIL_Generic_117
FE_Loader_MSIL_Generic_127
FE_Loader_Win64_Generic_140
FE_Loader_Win64_Generic_148
FE_Loader_Win_Generic_15
FE_Trojan_LNK_Generic_36
FE_Trojan_MSIL_Generic_203
FE_Trojan_Win_Generic_76
FE_Trojan_ZIP_Generic_14
Backdoor.MSIL.ASYNCRAT.MVX
Backdoor.Win.REMCOS.MVX
Backdoor.Win.Remcos.MVX
Downloader.Macro.Generic.MVX
InfoStealer.Generic.MVX
InfoStealer.MSIL.AGENTTESLA.MVX
Keylogger.Win.Generic.MVX
Loader.Win.Generic.MVX
Trojan.Win.Generic.MVX
Trojan.Win.Ursnif.MVX
Trojan.Emotet
Downloader.Generic
Trojan.Win.Generic