Commit ed440d2

VMWhere
1 parent 1fff54e commit ed440d2

13 files changed

Lines changed: 820 additions & 75 deletions

CLAUDE.md

Lines changed: 22 additions & 8 deletions
@@ -4,19 +4,23 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co

 ## Project

-Python port of Pluto LLVM obfuscation passes using llvm-nanobind bindings. Six passes transform LLVM IR to obfuscate code: Substitution, MBA Obfuscation, Bogus Control Flow, Flattening, Global Encryption, and Indirect Call.
+Python port of Pluto, Polaris, and VMwhere LLVM obfuscation passes using llvm-nanobind bindings. Passes transform LLVM IR to obfuscate code. See README.md for the full pass list.

 ## Commands

+**Always use `uv` to run tests** — never use the system Python directly. The system Python may have a different llvm-nanobind build with incompatible API (e.g. `is_terminator` vs `is_terminator_inst`). Only `uv run` uses the correct project venv.
+
+llvm-nanobind requires LLVM dev headers to build. Set `CMAKE_PREFIX_PATH` to your LLVM installation if `uv sync` fails to find LLVM (CI does this automatically).
+
 ```bash
-# Run all tests
-python -m uv run pytest tests/ -v
+# Run all tests (always use this form)
+CMAKE_PREFIX_PATH="C:\llvm\clang+llvm-21.1.0-x86_64-pc-windows-msvc" python -m uv run pytest tests/ -v

 # Run a single test file
-python -m uv run pytest tests/test_substitution.py -v
+CMAKE_PREFIX_PATH="C:\llvm\clang+llvm-21.1.0-x86_64-pc-windows-msvc" python -m uv run pytest tests/test_substitution.py -v

 # Run a single test by name
-python -m uv run pytest tests/test_substitution.py -k "test_add_substitution" -v
+CMAKE_PREFIX_PATH="C:\llvm\clang+llvm-21.1.0-x86_64-pc-windows-msvc" python -m uv run pytest tests/test_substitution.py -k "test_add_substitution" -v

 # Run UI (requires llvm-nanobind built)
 python -m uv run python -m shifting_codes.ui.app
@@ -42,14 +46,14 @@ pipeline.add(SubstitutionPass(rng=CryptoRandom(seed=42)))
 pipeline.run(mod, ctx)
 ```

-**FunctionPasses:** Substitution, MBAObfuscation, BogusControlFlow, Flattening
-**ModulePasses:** GlobalEncryption, IndirectCall
+**FunctionPasses:** Substitution, MBAObfuscation, BogusControlFlow, Flattening, IndirectBranch, AliasAccess, AntiDisassembly
+**ModulePasses:** GlobalEncryption, IndirectCall, CustomCC, MergeFunction, StringEncryption

 ### Utilities (`src/shifting_codes/utils/`)

 - **`crypto.py`** — `CryptoRandom`: wraps `secrets` (production) or `random.Random(seed)` (testing). All passes accept an `rng` parameter for determinism.
 - **`mba.py`** — Z3-based MBA coefficient generation with result caching. Generates linear (15 truth tables) and univariate polynomial expressions.
-- **`ir_helpers.py`** — PHI/register demotion to stack (`demote_phi_to_stack`, `demote_regs_to_stack`), used by Flattening pass.
+- **`ir_helpers.py`** — PHI/register demotion to stack (`demote_phi_to_stack`, `demote_regs_to_stack`), shared encryption utilities (`build_decrypt_function`, `encrypt_bytes`).

 ### XTEA (`src/shifting_codes/xtea/`)

@@ -61,6 +65,16 @@ Reference XTEA cipher implementation (pure Python) plus an LLVM IR builder that
 - `rng`: Seeded `CryptoRandom(seed=42)` for deterministic tests
 - Helper functions: `make_add_function()`, `make_arith_function()`, `make_branch_function()`, `make_loop_function()`

+## Maintenance Rules
+
+- **Keep README.md up to date.** When adding new passes, changing pass behavior, or making other significant changes, update the README pass tables, usage examples, and any other affected sections. The README is the public-facing documentation and must accurately reflect the current state of the project.
+
+## Testing Policy
+
+- **All tests pass on CI. There are no pre-existing test failures.** If tests fail after your changes, your changes broke them — investigate and fix. Never assume failures are pre-existing.
+- **Always run tests via `uv run`**, not the system Python. The system Python has a different llvm-nanobind with incompatible API names.
+- Test helper imports use `from conftest import ...` (not `from tests.conftest import ...`).
+
 ## llvm-nanobind API Pitfalls

 - `ctx.types.ptr`, `ctx.types.i32`, `ctx.types.void` are **properties** (not methods)
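
Review note on the Utilities hunk: it describes `CryptoRandom` as wrapping `secrets` in production and `random.Random(seed)` in tests. A minimal sketch of that idea follows. The class name `SeedableRandom` is hypothetical, and the body is an assumption about how such a wrapper could look, not the project's code; only the `seed` keyword and the `get_range(n)` call shape come from the diff.

```python
from __future__ import annotations

import random
import secrets


class SeedableRandom:
    """Hypothetical stand-in for a CryptoRandom-style wrapper (assumption)."""

    def __init__(self, seed: int | None = None):
        # With a seed: reproducible PRNG for tests.
        # Without a seed: fall back to the OS-backed CSPRNG in `secrets`.
        self._prng = random.Random(seed) if seed is not None else None

    def get_range(self, upper: int) -> int:
        """Return an integer in [0, upper) (assumed semantics)."""
        if self._prng is not None:
            return self._prng.randrange(upper)
        return secrets.randbelow(upper)


# Two generators built from the same seed yield the same sequence,
# which is what makes seeded pass runs repeatable in tests.
a = SeedableRandom(seed=42)
b = SeedableRandom(seed=42)
seq_a = [a.get_range(256) for _ in range(8)]
seq_b = [b.get_range(256) for _ in range(8)]
same = seq_a == seq_b
```

Sharing one seeded instance across several passes, as the README examples do, makes the output depend on pass order, since each pass advances the same generator.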

README.md

Lines changed: 25 additions & 1 deletion
@@ -1,6 +1,6 @@
 # Shifting Codes

-Python port of [Pluto](https://github.com/bluesadi/Pluto) and [Polaris](https://github.com/za233/Polaris-Obfuscator/) LLVM obfuscation passes using [llvm-nanobind](https://github.com/LLVMParty/llvm-nanobind) bindings, with a PyQt6 visualization UI.
+Python port of [Pluto](https://github.com/bluesadi/Pluto), [Polaris](https://github.com/za233/Polaris-Obfuscator/), and [VMwhere](https://github.com/MrRoy09/VMwhere) LLVM obfuscation passes using [llvm-nanobind](https://github.com/LLVMParty/llvm-nanobind) bindings, with a PyQt6 visualization UI.

 ![](assets/UI-showcase.gif)

@@ -32,6 +32,13 @@ Upgraded versions of four Pluto passes plus four new passes:
 | **Custom CC** | Module | Randomly assigns non-standard calling conventions to internal functions |
 | **Merge Function** | Module | Merges multiple functions into a single switch-based dispatcher |

+### [VMwhere](https://github.com/MrRoy09/VMwhere) (2 passes)
+
+| Pass | Type | Description |
+|------|------|-------------|
+| **String Encryption** | Module | XOR-encrypts string constant globals (`[N x i8]`) with per-function stack-local decryption at runtime |
+| **Anti-Disassembly** | Function | Injects crafted x86 inline assembly that desynchronizes linear-sweep disassemblers (IDA, Ghidra, objdump) |
+
 ## Prerequisites

 - **Python 3.12+**
@@ -119,6 +126,23 @@ pipeline.add(MergeFunctionPass(rng=rng))
 pipeline.run(mod, ctx)
 ```

+VMwhere passes:
+
+```python
+from shifting_codes.passes import PassPipeline
+from shifting_codes.passes.string_encryption import StringEncryptionPass
+from shifting_codes.passes.anti_disassembly import AntiDisassemblyPass
+from shifting_codes.utils.crypto import CryptoRandom
+
+rng = CryptoRandom(seed=42)
+
+pipeline = PassPipeline()
+pipeline.add(StringEncryptionPass(rng=rng))
+pipeline.add(AntiDisassemblyPass(rng=rng, density=0.3))  # density: 0.0-1.0
+
+pipeline.run(mod, ctx)
+```
+
 Passes are registered via `@PassRegistry.register` and can be looked up by name:

 ```python

src/shifting_codes/passes/anti_disassembly.py

Lines changed: 105 additions & 0 deletions
@@ -0,0 +1,105 @@
+"""Anti-Disassembly Pass — injects junk bytes via inline assembly.
+
+Crafted x86 byte sequences exploit linear-sweep disassembler weaknesses.
+They look like multi-byte instructions but include a hidden short jump
+that skips junk bytes. The CPU executes correctly but disassemblers
+(IDA, Ghidra, objdump) get desynchronized.
+
+Only active when the module target triple contains "x86", "i386", or "i686".
+"""
+
+from __future__ import annotations
+
+import llvm
+
+from shifting_codes.passes import PassRegistry
+from shifting_codes.passes.base import FunctionPass, PassInfo
+from shifting_codes.utils.crypto import CryptoRandom
+
+
+@PassRegistry.register
+class AntiDisassemblyPass(FunctionPass):
+
+    def __init__(self, rng: CryptoRandom | None = None, density: float = 0.3):
+        self.rng = rng or CryptoRandom()
+        self.density = max(0.0, min(1.0, density))
+
+    @classmethod
+    def info(cls) -> PassInfo:
+        return PassInfo(
+            name="anti_disassembly",
+            description="[VMwhere] Inject anti-disassembly junk bytes (x86 only)",
+        )
+
+    def _make_asm_string(self) -> str:
+        """Build an anti-disassembly byte pattern.
+
+        Pattern: 0x48 0xB8 <r1> <r2> <r3> 0xEB 0x08 0xFF 0xFF 0x48 0x31 0xC0 0xEB 0xF7 0xE8
+
+        0x48 0xB8 starts a real 10-byte movabs rax, imm64; after xor eax, eax,
+        0xEB 0xF7 jumps back into its immediate, where the hidden 0xEB 0x08
+        jumps +8 past the trailing 0xE8 junk byte. Random bytes add variation.
+        """
+        r1 = self.rng.get_range(256)
+        r2 = self.rng.get_range(256)
+        r3 = self.rng.get_range(256)
+        return (
+            f".byte 0x48, 0xB8, {r1:#04x}, {r2:#04x}, {r3:#04x}, "
+            f"0xEB, 0x08, 0xFF, 0xFF, 0x48, 0x31, 0xC0, 0xEB, 0xF7, 0xE8"
+        )
+
+    def run_on_function(self, func: llvm.Function, ctx: llvm.Context) -> bool:
+        # Check target triple — only inject on x86
+        mod = func.module
+        triple = mod.target_triple.lower() if mod.target_triple else ""
+        if "x86" not in triple and "i386" not in triple and "i686" not in triple:
+            return False
+
+        # void() function type for the inline asm calls
+        void_fn_ty = ctx.types.function(ctx.types.void, [])
+        changed = False
+
+        for bb in func.basic_blocks:
+            instructions = list(bb.instructions)
+            if not instructions:
+                continue
+
+            # Find first non-PHI instruction as insertion point
+            first_non_phi = None
+            non_phi_insts = []
+            for inst in instructions:
+                if inst.opcode == llvm.Opcode.PHI:
+                    continue
+                if first_non_phi is None:
+                    first_non_phi = inst
+                non_phi_insts.append(inst)
+
+            if first_non_phi is None:
+                continue
+
+            # Always inject at block start (before first non-PHI)
+            asm_str = self._make_asm_string()
+            asm_val = llvm.get_inline_asm(
+                void_fn_ty, asm_str, "~{eax}", True, False,
+                llvm.InlineAsmDialect.ATT, False)
+            with bb.create_builder() as b:
+                b.position_before(first_non_phi)
+                b.call(void_fn_ty, asm_val, [], "")
+            changed = True
+
+            # Randomly inject before other non-PHI, non-terminator instructions
+            for inst in non_phi_insts:
+                if inst == first_non_phi:
+                    continue
+                if inst.is_terminator:
+                    continue
+                if self.rng.get_range(1000) < int(self.density * 1000):
+                    asm_str = self._make_asm_string()
+                    asm_val = llvm.get_inline_asm(
+                        void_fn_ty, asm_str, "~{eax}", True, False,
+                        llvm.InlineAsmDialect.ATT, False)
+                    with bb.create_builder() as b:
+                        b.position_before(inst)
+                        b.call(void_fn_ty, asm_val, [], "")
+
+        return changed
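
Review note on the byte pattern: the control flow through the 15 bytes emitted by `_make_asm_string` is easier to check when traced mechanically. The sketch below is a toy tracer, not a disassembler. It assumes 64-bit decoding and recognizes only the three opcodes the pattern uses; `make_pattern` and `trace_execution` are hypothetical helper names introduced for illustration.

```python
from __future__ import annotations


def make_pattern(r1: int, r2: int, r3: int) -> bytes:
    """Assemble the 15-byte sequence listed in the `_make_asm_string` docstring."""
    return bytes([0x48, 0xB8, r1, r2, r3, 0xEB, 0x08, 0xFF, 0xFF, 0x48,
                  0x31, 0xC0, 0xEB, 0xF7, 0xE8])


def trace_execution(code: bytes) -> tuple[list[int], int]:
    """Follow the CPU's path through the pattern (64-bit mode assumed).

    Returns the offsets at which an instruction executes and the offset
    at which control leaves the pattern.
    """
    executed = []
    pc = 0
    while 0 <= pc < len(code):
        executed.append(pc)
        if code[pc:pc + 2] == b"\x48\xb8":    # movabs rax, imm64: 2 + 8 bytes
            pc += 10
        elif code[pc:pc + 2] == b"\x31\xc0":  # xor eax, eax: 2 bytes
            pc += 2
        elif code[pc] == 0xEB:                # jmp rel8: target = next + rel
            rel = int.from_bytes(code[pc + 1:pc + 2], "little", signed=True)
            pc += 2 + rel
        else:
            raise ValueError(f"unexpected opcode {code[pc]:#04x} at offset {pc}")
    return executed, pc


pattern = make_pattern(0x11, 0x22, 0x33)
executed, exit_offset = trace_execution(pattern)

# A linear sweep decodes the same bytes front to back: movabs (10 bytes),
# xor (2), jmp (2), then reads the final 0xE8 as a 5-byte call rel32 and
# so consumes bytes that belong to the real code after the pattern.
sweep_starts = [0, 10, 12, 14]
swallowed = (sweep_starts[-1] + 5) - len(pattern)
```

Under these assumptions the CPU executes at offsets 0, 10, 12 and 5, then leaves at offset 15, immediately after the 0xE8 byte, while a linear sweep swallows 4 bytes of the following code. The trace says nothing about `i386`/`i686` targets: in 32-bit mode 0x48 is `dec eax` rather than a REX.W prefix, so the bytes decode differently there.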

src/shifting_codes/passes/global_encryption.py

Lines changed: 6 additions & 65 deletions
@@ -14,8 +14,9 @@
 from shifting_codes.passes import PassRegistry
 from shifting_codes.passes.base import ModulePass, PassInfo
 from shifting_codes.utils.crypto import CryptoRandom
-
-KEY_LEN = 4
+from shifting_codes.utils.ir_helpers import (
+    KEY_LEN, build_decrypt_function, encrypt_bytes,
+)


 def _type_byte_size(ty: llvm.Type) -> int | None:
@@ -29,66 +30,6 @@ def _type_byte_size(ty: llvm.Type) -> int | None:
     return None


-def _encrypt_bytes(orig_val: int, byte_size: int, key: int,
-                   byte_offset: int = 0) -> int:
-    """XOR integer value byte-by-byte with 4-byte key (cyclic)."""
-    key_bytes = (key & 0xFFFFFFFF).to_bytes(4, 'little')
-    val_bytes = bytearray(orig_val.to_bytes(byte_size, 'little', signed=False))
-    for i in range(byte_size):
-        val_bytes[i] ^= key_bytes[(byte_offset + i) % KEY_LEN]
-    return int.from_bytes(val_bytes, 'little')
-
-
-def _build_decrypt_function(mod: llvm.Module, ctx: llvm.Context) -> llvm.Function:
-    """Build: void @__obfu_globalenc_dec(ptr %data, ptr %key, i64 %len, i64 %keyLen)
-
-    Loop body: data[i] ^= key[i % keyLen]
-    """
-    i8 = ctx.types.i8
-    i64 = ctx.types.i64
-    ptr = ctx.types.ptr
-    fn_ty = ctx.types.function(ctx.types.void, [ptr, ptr, i64, i64])
-    func = mod.add_function("__obfu_globalenc_dec", fn_ty)
-    func.linkage = llvm.Linkage.Private
-
-    entry_bb = func.append_basic_block("entry")
-    cmp_bb = func.append_basic_block("cmp")
-    body_bb = func.append_basic_block("body")
-    end_bb = func.append_basic_block("end")
-
-    data = func.get_param(0)
-    key = func.get_param(1)
-    length = func.get_param(2)
-    key_len = func.get_param(3)
-
-    with entry_bb.create_builder() as b:
-        i_ptr = b.alloca(i64, name="i")
-        b.store(i64.constant(0), i_ptr)
-        b.br(cmp_bb)
-
-    with cmp_bb.create_builder() as b:
-        iv = b.load(i64, i_ptr, "iv")
-        cond = b.icmp(llvm.IntPredicate.SLT, iv, length, "cmp")
-        b.cond_br(cond, body_bb, end_bb)
-
-    with body_bb.create_builder() as b:
-        iv = b.load(i64, i_ptr, "iv")
-        key_idx = b.srem(iv, key_len, "kidx")
-        key_ptr = b.gep(i8, key, [key_idx], "kptr")
-        key_byte = b.load(i8, key_ptr, "kbyte")
-        data_ptr = b.gep(i8, data, [iv], "dptr")
-        data_byte = b.load(i8, data_ptr, "dbyte")
-        dec = b.xor(key_byte, data_byte, "dec")
-        b.store(dec, data_ptr)
-        b.store(b.add(iv, i64.constant(1), "inc"), i_ptr)
-        b.br(cmp_bb)
-
-    with end_bb.create_builder() as b:
-        b.ret_void()
-
-    return func
-
-
 def _is_encryptable_global(gv) -> bool:
     """Check if a value is a global variable suitable for encryption."""
     if not hasattr(gv, 'global_value_type'):
@@ -154,7 +95,7 @@ def run_on_module(self, mod: llvm.Module, ctx: llvm.Context) -> bool:
             return False

         # Phase 2: Build shared decrypt helper
-        decrypt_func = _build_decrypt_function(mod, ctx)
+        decrypt_func = build_decrypt_function(mod, ctx)

         i32 = ctx.types.i32
         i64 = ctx.types.i64
@@ -188,7 +129,7 @@ def run_on_module(self, mod: llvm.Module, ctx: llvm.Context) -> bool:
             try:
                 if vtype.kind == llvm.TypeKind.Integer:
                     orig_val = init.const_zext_value
-                    enc_val = _encrypt_bytes(orig_val, byte_size, key)
+                    enc_val = encrypt_bytes(orig_val, byte_size, key)
                     gv.initializer = vtype.constant(enc_val)
                 elif vtype.kind == llvm.TypeKind.Array:
                     elem_type = vtype.element_type
@@ -199,7 +140,7 @@ def run_on_module(self, mod: llvm.Module, ctx: llvm.Context) -> bool:
                     for j in range(elem_count):
                         elem = init.get_aggregate_element(j)
                         orig_val = elem.const_zext_value
-                        enc_val = _encrypt_bytes(orig_val, elem_size, key,
+                        enc_val = encrypt_bytes(orig_val, elem_size, key,
                                                  byte_offset)
                         enc_elements.append(enc_val)
                         byte_offset += elem_size
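
Review note on the relocated helper: the `byte_offset` argument in the array branch matters because the runtime loop shown in the removed code decrypts with `data[i] ^= key[i % keyLen]` over the whole global, so each element has to continue the key cycle where the previous one stopped. The sketch below restates the removed `_encrypt_bytes` as a standalone function to demonstrate this. The body of the new `encrypt_bytes` in `ir_helpers.py` is not visible in this view, so treating it as identical is an assumption.

```python
KEY_LEN = 4


def encrypt_bytes(orig_val: int, byte_size: int, key: int,
                  byte_offset: int = 0) -> int:
    """XOR an integer byte-by-byte with a cyclic 4-byte little-endian key."""
    key_bytes = (key & 0xFFFFFFFF).to_bytes(KEY_LEN, "little")
    val_bytes = bytearray(orig_val.to_bytes(byte_size, "little", signed=False))
    for i in range(byte_size):
        val_bytes[i] ^= key_bytes[(byte_offset + i) % KEY_LEN]
    return int.from_bytes(val_bytes, "little")


key = 0xAABBCCDD
plain = 0x11223344

# XOR is its own inverse, so applying the same key twice restores the value.
cipher = encrypt_bytes(plain, 4, key)
roundtrip = encrypt_bytes(cipher, 4, key)

# Array case: two 16-bit elements holding the same four bytes as `plain`,
# encrypted one element at a time with a running byte offset.
elements = [0x3344, 0x1122]
offset = 0
enc_elements = []
for elem in elements:
    enc_elements.append(encrypt_bytes(elem, 2, key, offset))
    offset += 2

# Reassembling the encrypted elements gives the same bytes as encrypting
# the four bytes in one go, which is what a byte-wise decrypt loop expects.
packed = enc_elements[0] | (enc_elements[1] << 16)
matches_whole = packed == cipher
```

Resetting the offset to 0 for every element would still round-trip element by element, but would no longer match a decrypt loop that indexes the key by absolute byte position.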
