Introduction
Since the creation of Elastic Security Labs, we have focused on developing malware analysis tools, not only to aid our own research and analysis but also to release to the public. We want to give back to the community as much as we get from it. To make these tools more robust and reduce code duplication, we created the Python library nightMARE. This library brings together various useful features for reverse engineering and malware analysis. We primarily use it to create our configuration extractors for different widespread malware families, but nightMARE can be applied to many other use cases.
With the release of version 0.16, we want to officially introduce the library and provide details in this article on some interesting features offered by this module, as well as a short tutorial explaining how to use it to implement your own configuration extractor compatible with the latest version of LUMMA (as of the post date).
nightMARE features tour
Powered by Rizin
To reproduce the capabilities of popular disassemblers, nightMARE initially used a set of Python modules to perform the various tasks necessary for static analysis. For example, we used LIEF for executable parsing (PE, ELF), Capstone to disassemble binaries, and SMDA to obtain cross-reference (xref) analysis.
These numerous dependencies made maintaining the library more complex than necessary. That's why, in order to reduce the use of third-party modules as much as possible, we decided to use the most comprehensive reverse engineering framework available. Our choice naturally gravitated towards Rizin.
Rizin is an open-source reverse engineering software, forked from the Radare2 project. Its speed, modular design, and almost infinite set of features based on its Vim-like commands make it an excellent backend choice. We integrated it into the project using the rz-pipe module, which makes it very easy to create and instrument a Rizin instance from Python.
Project structure
The project is structured along three axes:
- The "analysis" module contains sub-modules useful for static analysis.
- The "core" module contains commonly useful sub-modules: bitwise operations, integer casting, and recurring regexes for configuration extraction.
- The "malware" module contains all algorithm implementations (crypto, unpacking, configuration extraction, etc.), grouped by malware family and, when applicable, by version.
Analysis modules
For static binary analysis, this module offers two complementary working techniques: disassembly and instruction analysis with Rizin via the reversing module, and instruction emulation via the emulation module.
For example, when constants are manually moved onto the stack, instead of trying to analyze the instructions one by one to retrieve the immediates, it is possible to emulate the entire piece of code and read the data on the stack once the processing is done.
Another example, which we will see later in this article, concerns cryptographic functions: when a function is complex, it is often simpler to call it directly in the binary through emulation than to reimplement it manually.
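To make the stack-constant case more concrete, here is a minimal pure-Python sketch. It is not nightMARE code and it is not real emulation: it only replays one x86 encoding, `mov dword ptr [esp+disp8], imm32` (bytes `C7 44 24 <disp8> <imm32>`), onto a fake stack frame. The function name and the snippet bytes are hypothetical and chosen for illustration. The point is the same as with a real emulator: let the writes happen, then read the result from the stack.

```python
# Encoding of `mov dword ptr [esp+disp8], imm32` on x86: C7 44 24 <disp8> <imm32>
MOV_ESP_DISP8_IMM32 = b"\xc7\x44\x24"


def replay_stack_writes(code: bytes, frame_size: int = 0x40) -> bytes:
    """Replay consecutive stack writes on a fake frame (toy model, positive disp8 only)."""
    frame = bytearray(frame_size)
    offset = 0
    while code[offset : offset + 3] == MOV_ESP_DISP8_IMM32:
        disp = code[offset + 3]
        # Copy the 4-byte immediate to [esp + disp]
        frame[disp : disp + 4] = code[offset + 4 : offset + 8]
        offset += 8
    return bytes(frame)


# Hypothetical snippet that writes a string to the stack, out of order
snippet = (
    b"\xc7\x44\x24\x04" + b"tMAR"
    + b"\xc7\x44\x24\x00" + b"nigh"
    + b"\xc7\x44\x24\x08" + b"E\x00\x00\x00"
)
stack_string = replay_stack_writes(snippet).split(b"\x00")[0]
print(stack_string)
```

Reading the frame once at the end avoids having to track the order of the individual writes, which is the benefit described above.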
Reversing module
This module contains the Rizin class, an abstraction over Rizin's functionality that sends commands directly to Rizin through rz-pipe and gives the user a great deal of analysis power for free. Because it is an abstraction, the functions the class exposes can easily be used in a script without prior knowledge of the framework.
Although this class exposes many features, we are not trying to be exhaustive. The goal is to reduce duplicated code for functionality that recurs across all our tools. However, if a user finds that a function is missing, they can interact directly with the rz-pipe object to send commands to Rizin and achieve their goals.
Here is a short list of the functions we use the most:
# Disassembling
def disassemble(self, offset: int, size: int) -> list[dict[str, typing.Any]]
def disassemble_previous_instruction(self, offset: int) -> dict[str, typing.Any]
def disassemble_next_instruction(self, offset: int) -> dict[str, typing.Any]
# Pattern matching
def find_pattern(
    self,
    pattern: str,
    pattern_type: Rizin.PatternType) -> list[dict[str, typing.Any]]
def find_first_pattern(
    self,
    patterns: list[str],
    pattern_type: Rizin.PatternType) -> int
# Reading bytes
def get_data(self, offset: int, size: int | None = None) -> bytes
def get_string(self, offset: int) -> bytes
# Reading words
def get_u8(self, offset: int) -> int
...
def get_u64(self, offset: int) -> int
# All strings, functions
def get_strings(self) -> list[dict[str, typing.Any]]
def get_functions(self) -> list[dict[str, typing.Any]]
# Xrefs
def get_xrefs_from(self, offset: int) -> list
def get_xrefs_to(self, offset: int) -> list[int]
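To illustrate what a hex pattern with wildcards expresses, here is a small standalone sketch. It assumes that `?` stands for one unknown nibble, as the patterns later in this article suggest. It does not reflect how Rizin performs its search internally, and `find_hex_pattern` is a hypothetical helper, not a nightMARE function.

```python
import re


def hex_pattern_to_regex(pattern: str) -> re.Pattern[str]:
    """Turn a hex pattern with `?` nibble wildcards into a regex over a hex string."""
    if len(pattern) % 2:
        raise ValueError("pattern must contain an even number of nibbles")
    return re.compile(pattern.lower().replace("?", "[0-9a-f]"))


def find_hex_pattern(data: bytes, pattern: str) -> list[int]:
    """Return the byte offsets in `data` where `pattern` matches."""
    regex = hex_pattern_to_regex(pattern)
    hex_data = data.hex()
    offsets = []
    # Only test even positions so matches stay aligned on byte boundaries
    for position in range(0, len(hex_data) - len(pattern) + 1, 2):
        if regex.fullmatch(hex_data, position, position + len(pattern)):
            offsets.append(position // 2)
    return offsets


# Made-up bytes: a NOP followed by `mov eax, imm32` and the start of another mov
sample = bytes.fromhex("90b838124400b911223300")
print(find_hex_pattern(sample, "b838?24400b9"))
```

Wildcards let a single pattern survive small differences between builds, such as a changed address byte, while still anchoring on the stable opcode bytes.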
Emulation module
In version 0.16, we reworked the emulation module to take full advantage of Rizin's capabilities for its various data-related tasks. Under the hood, it uses the Unicorn engine to perform emulation.
For now, this module only offers a "light" PE emulation with the class WindowsEmulator, light in the sense that only the strict minimum is done to load a PE. No relocations, no DLLs, no OS emulation. The goal is not to completely emulate a Windows executable like Qiling or Sogen, but to offer a simple way to execute code snippets or short sequences of functions while knowing its limitations.
The WindowsEmulator class offers several useful abstractions.
# Load PE and its stack
def load_pe(self, pe: bytes, stack_size: int) -> None
# Manipulate stack
def push(self, x: int) -> None
def pop(self) -> int
# Simple memory management mechanisms
def allocate_memory(self, size: int) -> int
def free_memory(self, address: int, size: int) -> None
# Direct ip and sp manipulation
@property
def ip(self) -> int
@property
def sp(self) -> int
# Emulate call and ret
def do_call(self, address: int, return_address: int) -> None
def do_return(self, cleaning_size: int = 0) -> None
# Direct unicorn access
@property
def unicorn(self) -> unicorn.Uc
The class allows the registration of two types of hooks: normal unicorn hooks and IAT hooks.
# Set unicorn hooks; the WindowsEmulator instance is passed to the callback instead of unicorn
def set_hook(self, hook_type: int, hook: typing.Callable) -> int:
# Set hook on import call
def enable_iat_hooking(self) -> None:
def set_iat_hook(
    self,
    function_name: bytes,
    hook: typing.Callable[[WindowsEmulator, tuple, dict[str, typing.Any]], None],
) -> None:
As a usage example, we use the Windows binary DismHost.exe.
The binary uses the Sleep import at address 0x140006404.
We will therefore create a script that registers an IAT hook for the Sleep import, starts emulation at address 0x140006404, and stops at address 0x140006412.
# coding: utf-8
import pathlib

from nightMARE.analysis import emulation


def sleep_hook(emu: emulation.WindowsEmulator, *args) -> None:
    print(
        "Sleep({} ms)".format(
            emu.unicorn.reg_read(emulation.unicorn.x86_const.UC_X86_REG_RCX)
        ),
    )
    emu.do_return()


def main() -> None:
    path = pathlib.Path(r"C:\Windows\System32\Dism\DismHost.exe")
    emu = emulation.WindowsEmulator(False)
    emu.load_pe(path.read_bytes(), 0x10000)
    emu.enable_iat_hooking()
    emu.set_iat_hook("KERNEL32.dll!Sleep", sleep_hook)
    emu.unicorn.emu_start(0x140006404, 0x140006412)


if __name__ == "__main__":
    main()
It is important to note that the hook function must return through the do_return function so that execution resumes at the address located after the call.
When the emulator starts, our hook is correctly executed.
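The reason a hook has to end with a return can be shown with a toy model of a call. This sketch is not the WindowsEmulator implementation. The class and the addresses are hypothetical, and it only tracks an instruction pointer and a stack of return addresses.

```python
class ToyStackMachine:
    """Tiny model of an instruction pointer and a stack (not nightMARE code)."""

    def __init__(self) -> None:
        self.stack: list[int] = []
        self.ip = 0

    def do_call(self, address: int, return_address: int) -> None:
        # A call pushes the return address, then jumps to the target
        self.stack.append(return_address)
        self.ip = address

    def do_return(self) -> None:
        # A return pops the return address back into the instruction pointer
        self.ip = self.stack.pop()


machine = ToyStackMachine()
# Hypothetical addresses: the hooked import, and the instruction following the call
machine.do_call(0x7FF000001000, 0x14000640A)
# ... the hook body would run here in place of the real function ...
machine.do_return()
print(hex(machine.ip))
```

Without the final return, the instruction pointer would stay on the hooked import and the return address would remain on the stack, so the emulated code after the call would never run.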
Malware module
The malware module contains all the algorithm implementations for each malware family we cover. These algorithms can cover configuration extraction, cryptographic functions, or sample unpacking, depending on the type of malware. All these algorithms use the functionalities of the analysis module to do their job and provide good examples of how to use the library.
With the release of v0.16, here are the different malware families that we cover.
blister
deprecated
ghostpulse
latrodectus
lobshot
lumma
netwire
redlinestealer
remcos
smokeloader
stealc
strelastealer
xorddos
The complete implementation of the LUMMA algorithms covered in the tutorial in the next section can be found under the LUMMA sub-module.
Please note that the rapidly evolving nature of malware makes maintaining these modules difficult. We welcome any help with the project, whether through direct contributions or by opening issues.
Example: LUMMA configuration-extraction
LUMMA STEALER, also known as LUMMAC2, is an information-stealing malware still widely used in infection campaigns despite a recent takedown operation in May 2025. This malware incorporates control flow obfuscation and data encryption, making it more challenging to analyze both statically and dynamically.
In this section, we will use the following unencrypted sample as reference: 26803ff0e079e43c413e10d9a62d344504a134d20ad37af9fd3eaf5c54848122
We do a short analysis of how it decrypts its domain names step by step, and then demonstrate along the way how we build the configuration extractor using nightMARE.
Step 1: Initializing the ChaCha20 context
In this version, LUMMA initializes its cryptographic context, with the decryption key and nonce, after loading WinHTTP.dll. This context is then reused for each call to the ChaCha20 decryption function without being reinitialized. The nuance here is that an internal counter within the context is updated with each use, so later we'll need to take into account the value of this counter before the first domain decryption and then decrypt the domains in the correct order.
To reproduce this step in our script, we need to collect the key and nonce. The problem is that we don't know their location in advance, but we know where they are used. We pattern match this part of the code, then extract the addresses g_key_0 (key) and g_key_1 (nonce) from the instructions.
from nightMARE.analysis import reversing

# Sizes implied by the context layout in build_crypto_context below
CHACHA20_KEY_SIZE = 0x20
CHACHA20_NONCE_SIZE = 0x8
CHACHA20_CTX_SIZE = 0x40

CRYPTO_SETUP_PATTERN = "b838?24400b???????00b???0???0096f3a5"


def get_decryption_key_and_nonce(binary: bytes) -> tuple[bytes, bytes]:
    # Load the binary in Rizin
    rz = reversing.Rizin.load(binary)
    # Find the virtual address of the pattern
    if not (
        x := rz.find_pattern(
            CRYPTO_SETUP_PATTERN, reversing.Rizin.PatternType.HEX_PATTERN
        )
    ):
        raise RuntimeError("Failed to find crypto setup pattern virtual address")
    # Extract the key and nonce address from the instruction's second operand
    crypto_setup_va = x[0]["address"]
    key_and_nonce_address = rz.disassemble(crypto_setup_va, 1)[0]["opex"]["operands"][
        1
    ]["value"]
    # Return the key and nonce data (the nonce is stored right after the key)
    return rz.get_data(key_and_nonce_address, CHACHA20_KEY_SIZE), rz.get_data(
        key_and_nonce_address + CHACHA20_KEY_SIZE, CHACHA20_NONCE_SIZE
    )


def build_crypto_context(key: bytes, nonce: bytes, initial_counter: int) -> bytes:
    # 0x40-byte context: key at 0x10, counter at 0x30, nonce at 0x38
    crypto_context = bytearray(0x40)
    crypto_context[0x10:0x30] = key
    crypto_context[0x30] = initial_counter
    crypto_context[0x38:0x40] = nonce
    return bytes(crypto_context)
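The context layout and the counter behavior described above can be checked with a short standalone sketch. It repeats build_crypto_context so that it runs on its own. The parser, the block-count helper, and the placeholder key and nonce are hypothetical. It makes two assumptions that the article does not confirm: that the counter is a little-endian 64-bit field at offset 0x30, and that the customized cipher keeps the 64-byte block size of standard ChaCha20.

```python
import struct

CHACHA20_BLOCK_SIZE = 64  # keystream block size of standard ChaCha20


def build_crypto_context(key: bytes, nonce: bytes, initial_counter: int) -> bytes:
    # Same layout as the function shown above
    crypto_context = bytearray(0x40)
    crypto_context[0x10:0x30] = key
    crypto_context[0x30] = initial_counter
    crypto_context[0x38:0x40] = nonce
    return bytes(crypto_context)


def parse_crypto_context(context: bytes) -> dict[str, object]:
    # Assumption: the counter is a little-endian 64-bit field at offset 0x30
    constants, key, counter, nonce = struct.unpack("<16s32sQ8s", context)
    return {"constants": constants, "key": key, "counter": counter, "nonce": nonce}


def blocks_consumed(size: int) -> int:
    # Number of keystream blocks needed to process `size` bytes
    return -(-size // CHACHA20_BLOCK_SIZE)


# Placeholder key and nonce, not values from the sample
key = bytes(range(32))
nonce = bytes.fromhex("0102030405060708")
parsed = parse_crypto_context(build_crypto_context(key, nonce, 2))
print(parsed["counter"], blocks_consumed(0x80))
```

Under these assumptions, every call on a 0x80-byte buffer advances the counter by two blocks, which is why the order of the decryption calls matters when reproducing them.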
Step 2: Locate the decryption function
In this version, LUMMA's decryption function is easily located across samples as it is utilized immediately after loading WinHTTP imports.
We derive the hex pattern from the first bytes of the function to locate it in our script:
DECRYPTION_FUNCTION_PATTERN = "5553575681ec1?0100008b??243?01000085??0f84??080000"
def get_decryption_function_address(binary: bytes) -> int:
    # A cache system exists, so the binary is only loaded once and we get the same Rizin instance :)
    if x := reversing.Rizin.load(binary).find_pattern(
        DECRYPTION_FUNCTION_PATTERN, reversing.Rizin.PatternType.HEX_PATTERN
    ):
        return x[0]["address"]
    raise RuntimeError("Failed to find decryption function address")
Step 3: Locate the encrypted domain's base address
By using xrefs from the decryption function, which is not called with obfuscated indirection like other LUMMA functions, we can easily find where it is called to decrypt the domains.
As with the first step, we will use the instructions to discover the base address of the encrypted domains in the binary:
C2_LIST_MAX_LENGTH = 0xFF
C2_SIZE = 0x80
C2_DECRYPTION_BRANCH_PATTERN = "8d8?e0?244008d7424??ff3?565?68????4500e8????ffff"


def get_encrypted_c2_list(binary: bytes) -> list[bytes]:
    rz = reversing.Rizin.load(binary)
    address = get_encrypted_c2_list_address(binary)
    encrypted_c2 = []
    for ea in range(address, address + (C2_LIST_MAX_LENGTH * C2_SIZE), C2_SIZE):
        encrypted_c2.append(rz.get_data(ea, C2_SIZE))
    return encrypted_c2


def get_encrypted_c2_list_address(binary: bytes) -> int:
    rz = reversing.Rizin.load(binary)
    if not len(
        x := rz.find_pattern(
            C2_DECRYPTION_BRANCH_PATTERN, reversing.Rizin.PatternType.HEX_PATTERN
        )
    ):
        raise RuntimeError("Failed to find c2 decryption pattern")
    c2_decryption_va = x[0]["address"]
    return rz.disassemble(c2_decryption_va, 1)[0]["opex"]["operands"][1]["disp"]
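To show what the "disp" field of the second operand corresponds to at the byte level, here is a hand-rolled decoder for one specific x86 encoding: `lea r32, [r32 + disp32]` (opcode 8D, ModRM with mod=10 and no SIB byte). It is only a sketch. The instruction bytes are a made-up instance that fits the start of the pattern above, and in practice the disassembler output remains the more robust source.

```python
import struct


def lea_disp32(code: bytes) -> int:
    """Decode the disp32 of `lea r32, [r32 + disp32]` (opcode 8D, mod=10, no SIB)."""
    opcode, modrm = code[0], code[1]
    mod, rm = modrm >> 6, modrm & 0b111
    if opcode != 0x8D or mod != 0b10 or rm == 0b100:
        raise ValueError("unsupported encoding for this sketch")
    # The 32-bit displacement directly follows the ModRM byte, little-endian
    return struct.unpack_from("<I", code, 2)[0]


# Hypothetical bytes for `lea ecx, [esi + 0x4412e0]`
instruction = bytes.fromhex("8d8ee0124400")
print(hex(lea_disp32(instruction)))
```

The displacement is the base address of the table being indexed, which is why reading it from this instruction yields the start of the encrypted domain list.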
Step 4: Decrypt domains using emulation
A quick analysis of the decryption function shows that this version of LUMMA uses a slightly customized version of ChaCha20. We recognize the same small and diverse decryption functions scattered throughout the binaries. Here, they are used to decrypt parts of the ChaCha20 "expand 32-byte k" constant, which are then XOR-ROL derived before being stored in the context structure.
While we could implement the decryption function in our script, we have all the necessary addresses to demonstrate how we can directly call the function already present in the binary to decrypt our domains, using nightMARE's emulation module.
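As a side note on the constant mentioned above, the sketch below splits the standard ChaCha20 "expand 32-byte k" string into its four little-endian 32-bit words and shows a generic XOR then rotate-left transform together with its inverse. The mask and rotation count are arbitrary values chosen for illustration. They are not the values used by the sample, and the helper names are hypothetical.

```python
import struct

SIGMA = b"expand 32-byte k"  # standard ChaCha20 constant for 256-bit keys


def rol32(value: int, count: int) -> int:
    """Rotate a 32-bit value left by `count` bits."""
    count %= 32
    return ((value << count) | (value >> (32 - count))) & 0xFFFFFFFF


def ror32(value: int, count: int) -> int:
    """Rotate a 32-bit value right by `count` bits."""
    return rol32(value, 32 - (count % 32))


# The four state words that a standard ChaCha20 implementation starts with
sigma_words = struct.unpack("<4I", SIGMA)
print([hex(word) for word in sigma_words])

# Arbitrary, illustrative XOR mask and rotation (not taken from the sample)
MASK, ROTATION = 0xDEADBEEF, 7
derived = [rol32(word ^ MASK, ROTATION) for word in sigma_words]
recovered = [ror32(word, ROTATION) ^ MASK for word in derived]
```

Recognizing these four words, or a transformed form of them, is a common way to spot ChaCha20 in a binary, and obfuscating them is one way a sample can hide the algorithm from simple constant scans.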
# We need the right initial value, before decrypting the domain
# the function is already called once so 0 -> 2
CHACHA20_INITIAL_COUNTER = 2


def decrypt_c2_list(
    binary: bytes, encrypted_c2_list: list[bytes], key: bytes, nonce: bytes
) -> list[bytes]:
    # Get the decryption function address (step 2)
    decryption_function_address = get_decryption_function_address(binary)
    # Load the emulator, True = 32bits
    emu = emulation.WindowsEmulator(True)
    # Load the PE in the emulator with a stack of 0x10000 bytes
    emu.load_pe(binary, 0x10000)
    # Allocate the chacha context
    chacha_ctx_address = emu.allocate_memory(CHACHA20_CTX_SIZE)
    # Write at the chacha context address the crypto context
    emu.unicorn.mem_write(
        chacha_ctx_address,
        build_crypto_context(
            key,
            nonce,
            CHACHA20_INITIAL_COUNTER,
        ),
    )
    decrypted_c2_list = []
    for encrypted_c2 in encrypted_c2_list:
        # Allocate buffers
        encrypted_buffer_address = emu.allocate_memory(C2_SIZE)
        decrypted_buffer_address = emu.allocate_memory(C2_SIZE)
        # Write encrypted c2 to buffer
        emu.unicorn.mem_write(encrypted_buffer_address, encrypted_c2)
        # Push arguments
        emu.push(C2_SIZE)
        emu.push(decrypted_buffer_address)
        emu.push(encrypted_buffer_address)
        emu.push(chacha_ctx_address)
        # Emulate a call
        emu.do_call(decryption_function_address, emu.image_base)
        # Fire!
        emu.unicorn.emu_start(decryption_function_address, emu.image_base)
        # Read result from decrypted buffer
        decrypted_c2 = bytes(
            emu.unicorn.mem_read(decrypted_buffer_address, C2_SIZE)
        ).split(b"\x00")[0]
        # If result isn't printable we stop, no more domain
        if not bytes_re.PRINTABLE_STRING_REGEX.match(decrypted_c2):
            break
        # Add result to the list
        decrypted_c2_list.append(b"https://" + decrypted_c2)
        # Clean up the args
        emu.pop()
        emu.pop()
        emu.pop()
        emu.pop()
        # Free buffers
        emu.free_memory(encrypted_buffer_address, C2_SIZE)
        emu.free_memory(decrypted_buffer_address, C2_SIZE)
        # Repeat for the next one ...
    return decrypted_c2_list
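The post-processing applied to each decrypted buffer in the loop above can be tried in isolation. In this sketch, the regex is a hypothetical stand-in for the library's printable-string regex, whose exact definition is not shown in this article, and the buffers are made-up data.

```python
import re
from typing import Optional

# Hypothetical stand-in for the library's printable-string regex
PRINTABLE_STRING = re.compile(rb"^[\x20-\x7e]+$")


def buffer_to_url(decrypted: bytes) -> Optional[bytes]:
    """Cut the buffer at the first null byte and keep it only if it is printable."""
    candidate = decrypted.split(b"\x00")[0]
    if not PRINTABLE_STRING.match(candidate):
        # Non-printable or empty output is treated as the end of the domain list
        return None
    return b"https://" + candidate


# Made-up buffers: one padded domain, one blob of undecryptable bytes
valid_buffer = b"example.com/path".ljust(0x80, b"\x00")
garbage_buffer = bytes([0x8F, 0x02, 0xE1]) * 4
print(buffer_to_url(valid_buffer), buffer_to_url(garbage_buffer))
```

Using the printable check as a stop condition means the extractor does not need to know in advance how many domains a sample embeds.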
Result
Finally, we can run our module with pytest and view the LUMMA C2 list (decrypted_c2_list):
https://mocadia[.]com/iuew
https://mastwin[.]in/qsaz
https://ordinarniyvrach[.]ru/xiur
https://yamakrug[.]ru/lzka
https://vishneviyjazz[.]ru/neco
https://yrokistorii[.]ru/uqya
https://stolevnica[.]ru/xjuf
https://visokiykaf[.]ru/mntn
https://kletkamozga[.]ru/iwqq
This example highlights how the nightMARE library can be used for binary analysis, specifically, for extracting the configuration from the LUMMA stealer.
Download nightMARE
The complete implementation of the code presented in this article is available here.
Conclusion
nightMARE is a versatile Python module, based on the best tools the open source community has to offer. With the release of version 0.16 and this short article, we hope to have demonstrated its capabilities and potential.
Internally, the project is at the heart of various even more ambitious projects, and we will continue to maintain nightMARE to the best of our abilities.