Commit b0d1846

Update RE findings across 5 MD docs and sync all slide decks
MD updates: PE emulation 5M instruction budget with batch model, VEH/SEH/x64 exception handling, TEB/PEB fake env, CryptAPI/BCrypt; NScript per-language transform counts (1,358 total), dynamic engines with 30s timeouts, AES crypto; MAPS 5 cloud block levels, cert pinning, 72 STREAM_ATTRIBUTEs, 7 report types; SigTree weighted scoring, priority arbitration, PE boolean attributes, 7 tree types; Overview container count 25+ to 70+. Add SigTree standalone work log. Slide updates: synced all 5 HTML slide decks with corresponding MD changes including new slides for exception handling (PE emu) and scoring/arbitration (SigTree ML).
1 parent 0140f21 commit b0d1846

11 files changed

Lines changed: 1012 additions & 226 deletions

md/00_overview.md

Lines changed: 7 additions & 6 deletions
@@ -145,7 +145,7 @@ File/Buffer Input
 │ x86/x64 CPU emulator with 198 WinAPI handlers. │
 │ Maps PE sections, resolves imports via 973 VDLLs. │
 │ Records FOP (opcode trace) and APICLOG (API behavior). │
-│ 500K instruction limit per execution. │
+│ 5M instruction budget per execution (configurable). │
 │ RTTI: ".?AVx86_IL_emulator@@" @ 0x10C748CC │
 │ String: "reemulate" @ 0x10981878 │
 └────────┬─────────────────────────────────────────────────────────┘
@@ -250,25 +250,26 @@ File/Buffer Input
 | Metric | Value | Source |
 |--------|-------|--------|
 | Total exports | 90 | export table |
-| FPU emulation functions | 66 | `FPU_*` exports |
+| FPU emulation functions | 67 | `FPU_*` exports |
 | Binary size | 14.3 MB | PE header |
 | Total threats defined | 358,756 | VDM TLV index |
 | PEHSTR rules | 117,563 | VDM index |
 | KCRCE entries | 691,145 | VDM index |
 | MD5 static hashes | 2,433,812 | VDM index |
 | Lua detection scripts | 59,415 | VDM LUASTANDALONE |
-| Virtual DLLs | 973 | VDM VDLL entries |
+| Virtual DLLs | 973 | VDM VDLL entries (750 x86, 195 x64, 18 ARM, 10 MSIL) |
 | Virtual files | 144 | VDM VFILE entries |
 | DBVARs (config entries) | 547 | VDM DBVAR entries |
 | FOP behavioral rules | 4,601 | VDM FOP entries |
-| SIG_TREE ML trees | 33,428 | VDM SIG_TREE + EXT + BM |
+| SIG_TREE ML trees | 33,428 | VDM SIG_TREE + EXT + BM (408,708 nodes) |
 | Threat name prefixes | 504 | VDM prefix table |
 | TLV entries | 9.3M | Across 4 VDM files |
 | Signature types | 158+ | TLV type constants |
 | WinAPI handlers | 198 | Emulator dispatch |
-| Container formats | 25+ | Extraction framework |
+| Emulation instruction budget | 5M | Default, configurable via DBVAR |
+| Container formats | 70+ | Extraction framework (25+ primary) |
 | Script languages | 4 | PS, VBS, JS, Batch |
-| Deobfuscation transforms | 1,358 | Across 4 languages |
+| Deobfuscation transforms | 1,358 | PS:394, VBS:340, JS:359, Batch:265 |
 
 ---
 
md/05_pe_emulation.md

Lines changed: 183 additions & 48 deletions
@@ -7,9 +7,9 @@
 
 ## Overview
 
-Stage 5 is the PE emulation engine -- a full CPU emulator embedded within mpengine.dll that executes PE files in a sandboxed virtual environment. The emulator interprets x86, x64, and ARM instructions, provides 198 emulated Windows API handlers, loads 973 virtual DLLs (VDLLs) into a synthetic address space, and records behavioral telemetry (FOP opcode traces and API call logs) for signature matching.
+Stage 5 is the PE emulation engine -- a full CPU emulator embedded within mpengine.dll that executes PE files in a sandboxed virtual environment. The emulator interprets x86, x64, and ARM instructions, provides 198 emulated Windows API handlers (including CryptAPI and BCrypt), loads 973 virtual DLLs (VDLLs) into a synthetic address space, and records behavioral telemetry (FOP opcode traces and API call logs) for signature matching. The emulator runs with a default budget of **5 million instructions** (configurable), processing in batches of ~1,000.
 
-The emulator's primary purpose is **dynamic unpacking**: many malware samples encrypt or compress their payloads and only reveal the real code at runtime. By emulating execution, Defender can observe the decrypted payload and scan it through the full pipeline recursively (Stage 6).
+The emulator's primary purpose is **dynamic unpacking**: many malware samples encrypt or compress their payloads and only reveal the real code at runtime. By emulating execution, Defender can observe the decrypted payload and scan it through the full pipeline recursively (Stage 6). Full exception handling support (VEH, SEH chain walking, x64 table-driven RUNTIME_FUNCTION dispatch) ensures packed malware that uses SEH-based control flow transfer is handled correctly.
 
 ### Key RTTI Classes from the Binary
 
@@ -234,68 +234,197 @@ Emulated code continues at return address
 
 ### Instruction Processing Loop
 
-The core emulation loop fetches, decodes, and executes one instruction at a time:
+The core emulation loop processes instructions in batches of approximately 1,000, checking
+control conditions between each batch:
 
 ```
 Pseudocode:
 ─────────────────────────────────────────────────────────────────────────
 
-fn emulate_main_loop(ctx: &mut EmuContext) -> ScanResult {
-    let mut insn_count: u32 = 0;
-    let max_instructions: u32 = 500_000; // Hard limit
+emulate_main_loop(ctx):
+    insn_count = 0
+    max_instructions = 5,000,000 // Default budget (configurable via DBVAR)
+    batch_size = 1,000
 
-    loop {
-        // Fetch instruction at current EIP
-        let eip = ctx.regs.eip;
+    loop:
+        // Execute a batch of instructions
+        execute_batch(ctx, batch_size)
+        insn_count += batch_size
 
         // Check stop sentinel
-        if eip == 0xDEADBEEF {
-            break; // Normal termination
-        }
+        if EIP == 0xDEADBEEF:
+            break // Normal termination (return address sentinel)
 
-        // Decode instruction
-        let insn = decode_instruction(ctx.memory, eip);
-
-        // Check instruction limit
-        insn_count += 1;
-        if insn_count >= max_instructions {
+        // Check instruction budget
+        if insn_count >= max_instructions:
             // "abort: execution limit met (%u instructions)"
             // @ 0x109334D8
-            break;
-        }
-
-        // Execute instruction
-        match insn.opcode_type {
-            DASM_OPTYPE_FPU_RM => {
-                // Route to FPU_* export function
-                // String: "DASM_OPTYPE_FPU_RM" @ 0x109815DC
-                execute_fpu_instruction(ctx, &insn);
-            }
-            _ => execute_general_instruction(ctx, &insn),
-        }
-
-        // Check for API trampoline hit
-        if eip >= 0x7FFE0000 && eip < 0x7FFF0000 {
-            let api_index = (eip - 0x7FFE0000) / TRAMPOLINE_STRIDE;
-            handle_api_call(ctx, api_index);
-        }
-
-        // Update EIP
-        ctx.regs.eip = insn.next_eip;
-    }
-
-    return ctx.scan_result;
-}
+            break
+
+        // Check for API trampoline hit (0F FF F0 opcode at current IP)
+        if [EIP] == 0x0F 0xFF 0xF0:
+            api_id = EAX
+            dispatch_api_handler(ctx, api_id)
+
+        // Check for direct syscall (0F 05 = SYSCALL, 0F 34 = SYSENTER)
+        if [EIP] == 0x0F 0x05 or [EIP] == 0x0F 0x34:
+            dispatch_syscall(ctx, EAX)
+
+        // Self-modifying code: flush translation cache if code regions were written
+        if code_region_written:
+            flush_translation_cache()
+
+        // FPU instruction: route to exported FPU_* handler
+        if opcode_type == DASM_OPTYPE_FPU_RM:
+            // "DASM_OPTYPE_FPU_RM" @ 0x109815DC
+            execute_fpu_instruction(ctx, insn)
 ```
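The batch-and-budget control flow in the new pseudocode can be tried out with a small Python sketch. `ToyCpu` and `run` are hypothetical names invented for illustration, not symbols from mpengine.dll; only the sentinel, the 5M default budget, and the ~1,000 batch size come from the notes above. Unlike the pseudocode, this sketch counts the instructions a batch actually ran rather than adding `batch_size` unconditionally.

```python
from dataclasses import dataclass

STOP_SENTINEL = 0xDEADBEEF   # return-address sentinel named in the pseudocode
DEFAULT_BUDGET = 5_000_000   # default instruction budget from the notes
BATCH_SIZE = 1_000           # approximate batch size from the notes


@dataclass
class ToyCpu:
    """Hypothetical stand-in for the emulator context; it only counts steps."""
    steps_until_return: int
    eip: int = 0x00401000
    executed: int = 0

    def execute_batch(self, n: int) -> int:
        """Run up to n instructions and return how many actually ran."""
        ran = min(n, self.steps_until_return)
        self.steps_until_return -= ran
        self.executed += ran
        if self.steps_until_return == 0:
            self.eip = STOP_SENTINEL  # emulated code returned to the sentinel
        return ran


def run(cpu: ToyCpu, budget: int = DEFAULT_BUDGET, batch: int = BATCH_SIZE) -> str:
    """Execute batches until the sentinel is reached or the budget is spent."""
    count = 0
    while True:
        count += cpu.execute_batch(batch)
        if cpu.eip == STOP_SENTINEL:
            return "returned"
        if count >= budget:
            return "budget"
```

Because the checks run only between batches, a run can overshoot the budget by up to one batch; that is the usual trade-off for not testing the limit on every instruction.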
 
 ### Execution Limits
 
 | Limit | Value | String/Source |
 |-------|-------|---------------|
-| Max instructions per run | 500,000 | `"abort: execution limit met (%u instructions)"` @ `0x109334D8` |
-| Infinite loop detection | configurable | `"Infinite loop detected (more that %d instructions executed)"` @ `0x10983320` |
+| Max instructions per run | 5,000,000 | `"abort: execution limit met (%u instructions)"` @ `0x109334D8` |
+| Instruction batch size | ~1,000 | Between-batch control checks |
+| Fopclog max entries | 8,192 | First-opcode-byte recording cap |
+| Max SEH dispatches | 64 | Prevents infinite exception loops |
+| TLS callback budget | 50,000 per callback | Budget before main entry point |
+| DllMain budget | 10,000 per VDLL | Budget for VDLL initialization |
+| Tight loop detection | 50,000 insns without API call | Anti-analysis delay loop detection |
+| Consecutive error limit | 3 | Unhandled exception termination |
+
+*(from RE of mpengine.dll -- execution limit strings and emulator control flow)*
+
+---
+
+## Exception Handling
+
+The emulator supports three exception handling mechanisms, checked in priority order:
+
+### VEH (Vectored Exception Handlers)
+
+VEH handlers registered via `AddVectoredExceptionHandler` are checked **before** the SEH chain
+on x86. Dispatch builds `EXCEPTION_POINTERS { ExceptionRecord*, ContextRecord* }` on the emulated
+stack and calls the handler. Return value `0xFFFFFFFF` (`EXCEPTION_CONTINUE_EXECUTION`) resumes
+execution; `0` (`EXCEPTION_CONTINUE_SEARCH`) tries the next handler.
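The VEH return-value protocol described above can be sketched in a few lines of Python. `dispatch_veh` and the callable-based handler list are hypothetical; the two constants are the documented Windows values. Masking to 32 bits lets a handler return either `-1` or `0xFFFFFFFF`.

```python
EXCEPTION_CONTINUE_EXECUTION = 0xFFFFFFFF  # -1 as an unsigned 32-bit value
EXCEPTION_CONTINUE_SEARCH = 0


def dispatch_veh(handlers, exception_pointers) -> bool:
    """Call vectored handlers in registration order (hypothetical helper).

    Returns True if some handler asked to resume execution, or False if
    every handler returned EXCEPTION_CONTINUE_SEARCH, in which case the
    caller would fall through to the SEH chain.
    """
    for handler in handlers:
        result = handler(exception_pointers) & 0xFFFFFFFF
        if result == EXCEPTION_CONTINUE_EXECUTION:
            return True
    return False
```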
+
+### SEH (x86 Structured Exception Handling)
+
+The SEH chain is walked from `TEB[0x00]` (FS:[0]). Up to 32 frames are walked. For each handler:
+1. Builds `EXCEPTION_RECORD` (80 bytes) and `CONTEXT` (716 bytes) on the emulated stack
+2. Calls handler with arguments: `(ExceptionRecord*, EstablisherFrame*, ContextRecord*, DispatcherContext*)`
+3. Sets return address to SEH return sentinel (`0xDEADC0DE`)
+4. Handler return value `0` = continue execution; `1` = continue search
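A bounded walk of the x86 SEH chain might look like the sketch below. It assumes the standard `EXCEPTION_REGISTRATION_RECORD` layout (`Next`, then `Handler`, both 32-bit little-endian) and the usual `0xFFFFFFFF` end-of-chain marker; the dict-based `memory` and the function name are hypothetical stand-ins for emulated memory access.

```python
import struct

MAX_FRAMES = 32            # frame cap stated in the notes
END_OF_CHAIN = 0xFFFFFFFF  # conventional terminator for the x86 SEH chain


def walk_seh_chain(memory: dict, head: int, max_frames: int = MAX_FRAMES):
    """Return (frame_address, handler_address) pairs, innermost frame first.

    `memory` maps a frame address to its 8-byte registration record
    (a hypothetical stand-in for reading emulated memory).
    """
    frames = []
    addr = head
    while addr != END_OF_CHAIN and len(frames) < max_frames:
        record = memory.get(addr)
        if record is None:  # unmapped frame: stop rather than fault
            break
        next_addr, handler = struct.unpack("<II", record)
        frames.append((addr, handler))
        addr = next_addr
    return frames
```

The frame cap matters because emulated code controls the chain: a record whose `Next` field points back at itself would otherwise loop forever.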
+
+### x64 Table-Driven Exception Handling
+
+x64 uses `RUNTIME_FUNCTION` entries parsed from the PE's exception directory (data directory 3):
+1. Binary-searches the sorted `RUNTIME_FUNCTION` table for the faulting RIP
+2. Reads `UNWIND_INFO` at the entry's `UnwindInfoAddress`
+3. Checks for `UNW_FLAG_EHANDLER` (1) or `UNW_FLAG_UHANDLER` (2) flags
+4. Reads handler RVA from after the unwind codes array
+5. Sets up x64 fastcall call: RCX=ExceptionRecord*, RDX=EstablisherFrame, R8=ContextRecord*
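Steps 1 through 4 can be illustrated with the following Python sketch. It assumes the standard x64 layouts: 12-byte `RUNTIME_FUNCTION` entries (`BeginAddress`, `EndAddress`, `UnwindInfoAddress`), and an `UNWIND_INFO` header whose first byte packs a 3-bit version and 5-bit flags, with the unwind-code array padded to an even count. The function names are hypothetical, and chained unwind info is not handled.

```python
import struct
from bisect import bisect_right

UNW_FLAG_EHANDLER = 0x1
UNW_FLAG_UHANDLER = 0x2


def parse_runtime_functions(pdata: bytes):
    """Split raw exception-directory bytes into (begin, end, unwind_rva) tuples."""
    return [struct.unpack_from("<III", pdata, off)
            for off in range(0, len(pdata) - 11, 12)]


def find_runtime_function(table, rva: int):
    """Binary-search a table sorted by BeginAddress for the entry covering rva."""
    begins = [entry[0] for entry in table]
    i = bisect_right(begins, rva) - 1
    if i >= 0 and table[i][0] <= rva < table[i][1]:
        return table[i]
    return None


def handler_rva(unwind_info: bytes):
    """Return the handler RVA stored after the unwind codes, or None."""
    flags = unwind_info[0] >> 3          # upper 5 bits of the first byte
    if not flags & (UNW_FLAG_EHANDLER | UNW_FLAG_UHANDLER):
        return None
    count = unwind_info[2]               # CountOfCodes
    offset = 4 + 2 * ((count + 1) & ~1)  # header + codes padded to even count
    return struct.unpack_from("<I", unwind_info, offset)[0]
```

`EndAddress` is treated as exclusive here, so an RVA equal to a function's end does not match that function.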
+
+---
+
+## TEB/PEB Environment Setup
+
+The emulator constructs a realistic Windows process environment that defeats common sandbox
+detection techniques used by malware.
+
+### Segment Configuration
+
+- **x86**: FS segment base → TEB at `0x00020000`
+- **x64**: GS segment base → TEB at `0x00020000`
+
+### Process Parameters (Fake Environment)
+
+```
+Key TEB/PEB fields:
+  FS:[0x18] / GS:[0x30] Self-pointer (TEB address)
+  FS:[0x30] / GS:[0x60] PEB pointer
+  PEB.BeingDebugged = 0 (anti-debug)
+  PEB.NtGlobalFlag = 0 (anti-debug)
+  PEB.ImageBaseAddress = loaded PE base
+  PEB.Ldr = PEB_LDR_DATA (module list)
+  PEB.ProcessParameters = RTL_USER_PROCESS_PARAMETERS
+
+Process Parameters:
+  ComputerName: HAL9TH (not "DESKTOP-...", matches mpengine default)
+  UserName: JohnDoe (not "admin" or "malware")
+  ImagePath: C:\Users\JohnDoe\Desktop\target.exe
+  CurrentDir: C:\Windows\System32\
+  SystemRoot: C:\Windows
+  TEMP: C:\Windows\Temp
+```
+
+The PEB_LDR_DATA maintains three doubly-linked module lists (`InLoadOrderModuleList`,
+`InMemoryOrderModuleList`, `InInitializationOrderModuleList`) populated with the target PE
+and loaded VDLLs. Malware that walks these lists for DLL enumeration sees a realistic module chain.
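To show what "walking these lists" means in practice, here is a toy Python model of the x86 TEB → PEB → Ldr → `InLoadOrderModuleList` traversal. The field offsets are the standard 32-bit ones (`PEB.Ldr` at `0x0C`, list head at `PEB_LDR_DATA + 0x0C`, `DllBase` at entry `+0x18`). The TEB address is the one given above; the PEB, Ldr, and entry addresses, the `Memory` class, and both function names are hypothetical.

```python
TEB_BASE = 0x00020000    # TEB address from the notes
PEB_BASE = 0x00021000    # hypothetical placement
LDR_BASE = 0x00022000    # hypothetical placement
ENTRY_BASE = 0x00023000  # hypothetical placement of LDR_DATA_TABLE_ENTRY nodes


class Memory:
    """Sparse dword-addressed memory, a stand-in for the emulated address space."""
    def __init__(self):
        self.data = {}

    def write_u32(self, addr, value):
        self.data[addr] = value & 0xFFFFFFFF

    def read_u32(self, addr):
        return self.data.get(addr, 0)


def build_environment(mem, module_bases):
    """Populate TEB/PEB pointers and a circular InLoadOrderModuleList."""
    mem.write_u32(TEB_BASE + 0x18, TEB_BASE)  # TEB self-pointer (FS:[0x18])
    mem.write_u32(TEB_BASE + 0x30, PEB_BASE)  # PEB pointer (FS:[0x30])
    mem.write_u32(PEB_BASE + 0x0C, LDR_BASE)  # PEB.Ldr
    head = LDR_BASE + 0x0C                    # InLoadOrderModuleList head
    entries = [ENTRY_BASE + i * 0x100 for i in range(len(module_bases))]
    ring = [head] + entries
    for i, node in enumerate(ring):
        mem.write_u32(node, ring[(i + 1) % len(ring)])  # Flink
        mem.write_u32(node + 4, ring[i - 1])            # Blink
    for entry, base in zip(entries, module_bases):
        mem.write_u32(entry + 0x18, base)               # DllBase


def enumerate_modules(mem, limit=1024):
    """Follow Flink pointers from the list head, the way list-walking code does."""
    peb = mem.read_u32(TEB_BASE + 0x30)
    head = mem.read_u32(peb + 0x0C) + 0x0C
    bases = []
    node = mem.read_u32(head)
    while node != head and len(bases) < limit:
        bases.append(mem.read_u32(node + 0x18))
        node = mem.read_u32(node)
    return bases
```

The list is circular: the last entry's `Flink` points back at the head inside `PEB_LDR_DATA`, which is how a walker knows it has seen every module.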
+
+---
+
+## Cryptographic API Emulation
+
+### CryptAPI (ADVAPI32.DLL)
+
+The emulator tracks cryptographic state (hash objects, key objects) for operations including:
+- `CryptAcquireContext` / `CryptReleaseContext` -- provider management
+- `CryptCreateHash` / `CryptHashData` / `CryptGetHashParam` -- MD5, SHA-1, SHA-256 hashing
+- `CryptDeriveKey` / `CryptGenKey` / `CryptImportKey` -- key management
+- `CryptDecrypt` / `CryptEncrypt` -- RC4 stream cipher, AES-CBC/ECB block cipher
+- `CryptSetKeyParam` -- IV and cipher mode configuration
+
+### BCrypt (BCRYPT.DLL)
+
+Modern CNG API support:
+- `BCryptOpenAlgorithmProvider` -- AES, RC4, SHA-256, etc.
+- `BCryptGenerateSymmetricKey` -- key import/generation
+- `BCryptDecrypt` / `BCryptEncrypt` -- block/stream cipher operations
+
+This enables the emulator to observe malware that decrypts its payload using Windows crypto APIs
+before executing it.
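The idea of tracking hash-object state across `CryptCreateHash` / `CryptHashData` / `CryptGetHashParam` can be sketched as a handle table backed by Python's `hashlib`. The `ALG_ID` constants are the documented wincrypt.h values; the `HashObjectTable` class and its method names are hypothetical and say nothing about how mpengine.dll stores this state.

```python
import hashlib

# ALG_ID values from wincrypt.h
CALG_MD5 = 0x8003
CALG_SHA1 = 0x8004
CALG_SHA_256 = 0x800C

_ALGORITHMS = {CALG_MD5: "md5", CALG_SHA1: "sha1", CALG_SHA_256: "sha256"}


class HashObjectTable:
    """Hypothetical per-emulation table mapping fake HCRYPTHASH handles to state."""

    def __init__(self):
        self._objects = {}
        self._next_handle = 1

    def create_hash(self, alg_id: int) -> int:
        """Mimic CryptCreateHash: return a new handle, or 0 for an unknown ALG_ID."""
        name = _ALGORITHMS.get(alg_id)
        if name is None:
            return 0
        handle = self._next_handle
        self._next_handle += 1
        self._objects[handle] = hashlib.new(name)
        return handle

    def hash_data(self, handle: int, data: bytes) -> bool:
        """Mimic CryptHashData: feed bytes into the tracked hash object."""
        obj = self._objects.get(handle)
        if obj is None:
            return False
        obj.update(data)
        return True

    def get_hash_value(self, handle: int):
        """Mimic CryptGetHashParam(HP_HASHVAL): return the digest, or None."""
        obj = self._objects.get(handle)
        return obj.digest() if obj is not None else None
```

Keeping real digest state matters because packers often derive a decryption key from a hash of some constant; a stub that returned zeros would produce the wrong key and the payload would never appear in memory.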
+
+---
+
+## Memory Tracking and Content Extraction
+
+### Dirty Page Tracking
+
+A memory write hook records every written page address (page-aligned) during emulation. This
+identifies which memory regions were modified by the emulated code.
+
+### Self-Modifying Code Detection
+
+PE section address ranges are registered as "code regions." When a write targets any of these
+ranges, the translation block cache is invalidated at the next batch boundary, ensuring
+self-modified code executes correctly.
+
+### Unpacked Content Extraction
+
+After emulation completes, modified memory is collected:
+1. **PE sections**: All sections are read back; sections with >16 non-zero bytes are included
+2. **Dirty pages outside PE**: Pages not in PE sections, stack, TEB, or trampoline regions
+are coalesced into contiguous regions (capped at 1MB per region)
+3. **Embedded PE scan**: Extracted data is scanned for `MZ` + `PE\0\0` signatures to find
+unpacked PE payloads
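Page alignment, dirty-page coalescing with the 1MB cap, and the `MZ` + `PE\0\0` scan can be sketched together in Python. The 4 KiB page size and all function names are assumptions. The notes say only that both signatures are checked; linking them through the `e_lfanew` field at offset `0x3C` is the standard PE layout, used here as an assumption about how the check is done.

```python
import struct

PAGE_SIZE = 0x1000     # assumed 4 KiB pages
MAX_REGION = 0x100000  # 1MB cap per coalesced region, from the notes


def page_of(addr: int) -> int:
    """Page-align a written address, as a write hook would record it."""
    return addr & ~(PAGE_SIZE - 1)


def coalesce(pages, max_region: int = MAX_REGION):
    """Merge adjacent dirty pages into (start, size) regions, capped in size."""
    regions = []
    for page in sorted(set(pages)):
        last = regions[-1] if regions else None
        if last and last[0] + last[1] == page and last[1] + PAGE_SIZE <= max_region:
            last[1] += PAGE_SIZE
        else:
            regions.append([page, PAGE_SIZE])
    return [tuple(region) for region in regions]


def find_embedded_pe(data: bytes):
    """Return offsets of 'MZ' headers whose e_lfanew points at 'PE\\0\\0'."""
    hits = []
    pos = data.find(b"MZ")
    while pos != -1:
        if pos + 0x40 <= len(data):
            (e_lfanew,) = struct.unpack_from("<I", data, pos + 0x3C)
            pe = pos + e_lfanew
            if data[pe:pe + 4] == b"PE\0\0":
                hits.append(pos)
        pos = data.find(b"MZ", pos + 1)
    return hits
```

Requiring both signatures filters out most accidental `MZ` byte pairs, which are common in arbitrary data.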
+
+### Dropped File Collection
+
+Files created during emulation are collected from two sources:
+1. **VFS write tracking**: Files added via `CreateFileW` / `WriteFile` during emulation
+2. **Object manager**: Writable file handles with non-empty data
+
+All extracted content is fed back through the full scan pipeline at Stage 6 (Unpacked Content).
+
+---
+
+## APC Draining
 
-*(from RE of mpengine.dll -- execution limit strings)*
+When `NtQueueApcThread` is called during emulation, APC routines are queued. When the main
+emulation loop reaches the stop sentinel or instruction budget, any pending APCs are drained
+(each queued routine is called with its arguments) before termination. This handles malware
+that uses APC injection to execute unpacking code.
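The queue-then-drain behavior can be modeled with a small FIFO in Python. `ApcQueue` and its `limit` parameter are hypothetical; the notes do not say whether the real drain is bounded, so the cap is only a defensive assumption against routines that keep re-queuing themselves.

```python
from collections import deque


class ApcQueue:
    """Hypothetical model of the pending-APC list kept during emulation."""

    def __init__(self):
        self._pending = deque()

    def queue(self, routine, *args):
        """Record a routine and its arguments, as an NtQueueApcThread handler might."""
        self._pending.append((routine, args))

    def drain(self, limit: int = 64) -> int:
        """Call queued routines in FIFO order; return how many were run."""
        ran = 0
        while self._pending and ran < limit:
            routine, args = self._pending.popleft()
            routine(*args)
            ran += 1
        return ran
```

Because `drain` re-checks the queue on every iteration, an APC that queues another APC while running is also executed in the same pass.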
 
 ---
 
@@ -461,13 +590,19 @@ VFS-dropped files are extracted after emulation and fed back through the scan pi
 | FPU export functions | 67 |
 | SSE export functions | 1 (SSE_convert) |
 | Emulated WinAPI handlers | 198 |
-| Virtual DLLs (VDLLs) | 973 |
-| Max instructions per run | 500,000 |
+| Virtual DLLs (VDLLs) | 973 (750 x86 + 195 x64 + 18 ARM + 10 MSIL) |
+| Max instructions per run | 5,000,000 (configurable via DBVAR) |
+| Instruction batch size | ~1,000 |
+| Fopclog max entries | 8,192 |
+| Max SEH dispatches | 64 |
+| TLS callback budget | 50,000 per callback |
+| DllMain budget | 10,000 per VDLL |
 | FOP behavioral rules | 4,601 |
 | TUNNEL signature variants | 4 (x86, x64, ARM, ARM64) |
 | THREAD signature variants | 4 (x86, x64, ARM, ARM64) |
 | PE analysis attributes | 302 (`pea_*`) |
 | Emulator RTTI classes | 3 (x86, base, ARM) |
+| Crypto support | CryptAPI (MD5/SHA/AES/RC4) + BCrypt |
 
 ---
 