Commit 46afeca

committed
init
0 parents  commit 46afeca

36 files changed

Lines changed: 3713 additions & 0 deletions

.gitignore

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
__pycache__/
*.py[cod]
*.egg-info/
dist/
build/

.venv/
.pytest_cache/
*.o
*.dll
*.so
*.pyd

NANOBIND_ISSUES.md

Lines changed: 136 additions & 0 deletions
@@ -0,0 +1,136 @@
# llvm-nanobind Issues & Gotchas

Issues encountered while porting Pluto obfuscation passes to Python using llvm-nanobind.

## API Surprises

### `ctx.types.ptr` is a property, not a method
```python
# WRONG (TypeError: 'llvm.Type' object is not callable)
ptr_ty = ctx.types.ptr()

# CORRECT
ptr_ty = ctx.types.ptr
```
Other type accessors, such as `ctx.types.i32` and `ctx.types.void`, are also properties.
Only `ctx.types.function(ret, args)` and `ctx.types.array(elem, count)` are methods.

### `ctx.create_module()` returns `ModuleManager`, not `Module`
Use it as a context manager to get the actual `Module`:
```python
# WRONG (returns ModuleManager, which has no add_function/add_global/etc.)
mod = ctx.create_module("test")

# CORRECT
with ctx.create_module("test") as mod:
    mod.add_function(...)
```
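
The behaviour above is what any context manager does when `__enter__` returns a different object than the manager itself. The following is a toy sketch of that pattern in plain Python. The `Module` and `ModuleManager` classes here are hypothetical stand-ins written for illustration; they are not llvm-nanobind's implementation.

```python
class Module:
    """Toy stand-in for the real module type (hypothetical)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.functions: list[str] = []

    def add_function(self, name: str) -> None:
        self.functions.append(name)


class ModuleManager:
    """Toy wrapper that only hands out a Module inside a `with` block."""

    def __init__(self, name: str) -> None:
        self._name = name
        self.disposed = False

    def __enter__(self) -> Module:
        # The object bound by `as` is whatever __enter__ returns.
        return Module(self._name)

    def __exit__(self, exc_type, exc, tb) -> None:
        self.disposed = True  # stand-in for releasing the underlying module


manager = ModuleManager("test")
# The wrapper itself is not a Module, so it has no add_function.
has_add_function = hasattr(manager, "add_function")

with manager as mod:
    mod.add_function("main")  # `mod` is the Module returned by __enter__
```

Assigning the result of the call directly gives you the wrapper, which is why the "WRONG" form has none of the module methods.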

### `mod.target_triple`, not `mod.triple`
```python
# WRONG
mod.triple = "x86_64-pc-windows-msvc"

# CORRECT
mod.target_triple = "x86_64-pc-windows-msvc"
```

### `llvm.create_target_machine()` is a module-level function
```python
# WRONG (Target has no create_target_machine method)
tm = target.create_target_machine(triple, cpu, features)

# CORRECT
tm = llvm.create_target_machine(target, triple, cpu, features)
```

### `inst.block` for parent block, not `inst.parent`
```python
bb = inst.block  # correct
```

### `gv.global_value_type` for content type
`gv.type` returns the pointer type. Use `gv.global_value_type` for the actual stored type.

## Call Instruction Limitations

### No setter for call callee
There is no `set_called_operand()` or `set_callee()` on call instructions.
`called_value` is read-only. To change a call's target, rebuild the call:

```python
# Build new indirect call and replace the old one
with bb.create_builder() as builder:
    builder.position_before(call_inst)
    loaded = builder.load(ptr_ty, gv, "fn.ptr")
    new_call = builder.call(func_ty, loaded, args, "result")
call_inst.replace_all_uses_with(new_call)
call_inst.erase_from_parent()
```

### Two overloads for `builder.call()`
```python
# Direct call (infers function type from Function object)
builder.call(func, args, name)

# Indirect call (explicit function type, callee can be any value/pointer)
builder.call(func_ty, loaded_ptr, args, name)
```
Passing a loaded pointer to the direct form (the one without an explicit function type) causes `LLVMAssertionError`.

## Segfaults & Crashes

### ConstantDataArray element access crashes
Accessing elements of array initializers via `init.get_operand(i)` on arrays created
with `const_array` causes a segfault. No workaround was found, so we removed array encryption
from the GlobalEncryption pass entirely.

### `func.dll_storage_class` required for Windows DLL exports
Functions emitted to object files for Windows DLLs must have:
```python
func.dll_storage_class = llvm.DLLExport
```
Otherwise the symbol won't be exported and `ctypes.CDLL` can't find it.

## Missing APIs

### No `splitBasicBlock()`
Cannot split a basic block at an arbitrary instruction. The Bogus Control Flow pass
had to be redesigned to work without block splitting: it inserts opaque predicates
before existing terminators instead of cloning block contents.
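
The predicate the pass emits is `(y < 10) || (x * (x + 1) % 2 == 0)`. The second half holds because the product of two consecutive integers is even, and `i32` wraparound only discards high bits, so parity is unchanged. The sketch below checks this in plain Python. It only models 32-bit arithmetic with a mask; the function name and the sample values are illustrative assumptions, not code from the pass.

```python
import random

MASK32 = 0xFFFFFFFF  # i32 arithmetic wraps modulo 2**32


def opaque_predicate(x: int, y: int) -> bool:
    """Evaluate (y < 10) || (x * (x + 1) % 2 == 0) with 32-bit wraparound."""
    xp1 = (x + 1) & MASK32
    xmul = (x * xp1) & MASK32
    return y < 10 or xmul % 2 == 0


rng = random.Random(0)
samples = [(rng.getrandbits(32), rng.randrange(-2**31, 2**31)) for _ in range(10_000)]
# Edge cases: zero, the largest unsigned value (x + 1 wraps to 0), and y at the boundary.
samples += [(0, 0), (MASK32, 2**31 - 1), (2**31 - 1, 10)]
always_true = all(opaque_predicate(x, y) for x, y in samples)
```

Random sampling is not a proof, but it is a cheap sanity check before relying on a predicate to keep the bogus block unreachable.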

## Initialization

### Must initialize ASM printers for `emit_to_file()`
```python
llvm.initialize_all_targets()
llvm.initialize_all_target_mcs()
llvm.initialize_all_target_infos()
llvm.initialize_all_asm_printers()  # without this: "can't emit a file of this type"
```

## Integer Constant Overflow

### Constants must fit the type's bit width
`vtype.constant(value)` raises `TypeError` if `value` exceeds the type's range.
When generating random keys for XOR encryption, mask to the bit width:
```python
bit_width = vtype.int_width
mask = (1 << bit_width) - 1
key = rng.get_uint64() & mask
```
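
A self-contained sketch of the same masking, runnable without LLVM. It uses `secrets.randbits` in place of the project's `rng.get_uint64()`, and the helper names `random_key` and `xor_value` are hypothetical.

```python
import secrets


def random_key(bit_width: int) -> int:
    """Return a random key that fits in `bit_width` bits."""
    mask = (1 << bit_width) - 1
    return secrets.randbits(64) & mask


def xor_value(value: int, key: int, bit_width: int) -> int:
    """XOR `value` with `key`, keeping the result inside `bit_width` bits."""
    mask = (1 << bit_width) - 1
    return (value ^ key) & mask


key8 = random_key(8)
encrypted = xor_value(0x5A, key8, 8)
decrypted = xor_value(encrypted, key8, 8)  # XOR with the same key restores the value
```

Because the key is masked before use, it always fits an 8-bit constant here, and applying the XOR twice round-trips the original value.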

## PHI Node Maintenance

### New predecessors require PHI incoming entries
When adding a new block that branches to an existing block containing PHI nodes,
you must add incoming values for the new predecessor:
```python
for inst in target_bb.instructions:
    if inst.opcode == llvm.Opcode.PHI:
        inst.add_incoming(inst.type.undef(), new_bb)
    else:
        break
```
Without this, the module verifier fails with:
`PHINode should have one entry for each predecessor of its parent basic block!`
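
To make the rule concrete, here is a toy model of the check in plain Python. The `Phi` and `Block` classes and the `phi_errors` function are hypothetical and much simpler than the real verifier (for example, they ignore duplicate edges from one predecessor). They only show why adding a predecessor without touching the PHIs is rejected.

```python
from dataclasses import dataclass, field


@dataclass
class Phi:
    """Toy PHI node: maps predecessor block name to incoming value."""
    incoming: dict[str, object] = field(default_factory=dict)


@dataclass
class Block:
    name: str
    successors: list[str] = field(default_factory=list)
    phis: list[Phi] = field(default_factory=list)


def phi_errors(blocks: dict[str, Block]) -> list[str]:
    """Return names of blocks whose PHIs do not cover exactly their predecessors."""
    preds: dict[str, set[str]] = {name: set() for name in blocks}
    for block in blocks.values():
        for succ in block.successors:
            preds[succ].add(block.name)
    return [
        name
        for name, block in blocks.items()
        if any(set(phi.incoming) != preds[name] for phi in block.phis)
    ]


UNDEF = object()  # placeholder for an undef incoming value

cfg = {
    "entry": Block("entry", successors=["merge"]),
    "merge": Block("merge", phis=[Phi({"entry": 1})]),
}
# Add a new predecessor without updating the PHI: the check now fails.
cfg["bogus"] = Block("bogus", successors=["merge"])
before_fix = phi_errors(cfg)

# Mirror the fix: give every PHI an entry for the new predecessor.
for phi in cfg["merge"].phis:
    phi.incoming["bogus"] = UNDEF
after_fix = phi_errors(cfg)
```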

pyproject.toml

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
[project]
name = "shifting-codes"
version = "0.1.0"
description = "LLVM obfuscation passes ported from Pluto to Python, with PyQt6 visualization"
requires-python = ">=3.12"
dependencies = [
    "z3-solver>=4.12",
    "PyQt6>=6.6",
]

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[tool.hatch.build.targets.wheel]
packages = ["src/shifting_codes"]

[tool.uv.sources]
llvm-nanobind = { path = "../llvm-nanobind", editable = true }

[dependency-groups]
dev = [
    "pytest>=8.0",
    "pytest-cov",
]

[tool.pytest.ini_options]
testpaths = ["tests"]
pythonpath = ["src", "../llvm-nanobind/build"]

src/shifting_codes/__init__.py

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
"""Shifting Codes - LLVM obfuscation passes ported from Pluto to Python."""

__version__ = "0.1.0"
Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
"""Pass registry and pipeline for obfuscation passes."""

from __future__ import annotations

import llvm

from shifting_codes.passes.base import FunctionPass, ModulePass, PassInfo


class PassRegistry:
    """Registry of available obfuscation passes."""

    _passes: dict[str, type[FunctionPass | ModulePass]] = {}

    @classmethod
    def register(cls, pass_cls: type[FunctionPass | ModulePass]) -> type:
        info = pass_cls.info()
        cls._passes[info.name] = pass_cls
        return pass_cls

    @classmethod
    def get(cls, name: str) -> type[FunctionPass | ModulePass] | None:
        return cls._passes.get(name)

    @classmethod
    def all_passes(cls) -> dict[str, type[FunctionPass | ModulePass]]:
        return dict(cls._passes)


class PassPipeline:
    """Ordered pipeline of obfuscation passes."""

    def __init__(self, passes: list[FunctionPass | ModulePass] | None = None):
        self.passes: list[FunctionPass | ModulePass] = passes or []

    def add(self, p: FunctionPass | ModulePass) -> None:
        self.passes.append(p)

    def run(self, mod: llvm.Module, ctx: llvm.Context) -> bool:
        changed = False
        for p in self.passes:
            if isinstance(p, ModulePass):
                changed |= p.run_on_module(mod, ctx)
            elif isinstance(p, FunctionPass):
                for func in mod.functions:
                    if not func.is_declaration:
                        changed |= p.run_on_function(func, ctx)
        return changed

src/shifting_codes/passes/base.py

Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
"""Base classes for obfuscation passes."""

from abc import ABC, abstractmethod
from dataclasses import dataclass

import llvm


@dataclass(frozen=True)
class PassInfo:
    name: str
    description: str
    is_module_pass: bool = False


class FunctionPass(ABC):
    """Abstract base class for function-level obfuscation passes."""

    @abstractmethod
    def run_on_function(self, func: llvm.Function, ctx: llvm.Context) -> bool:
        """Run the pass on a single function. Returns True if the function was modified."""
        ...

    @classmethod
    @abstractmethod
    def info(cls) -> PassInfo:
        """Return metadata about this pass."""
        ...


class ModulePass(ABC):
    """Abstract base class for module-level obfuscation passes."""

    @abstractmethod
    def run_on_module(self, mod: llvm.Module, ctx: llvm.Context) -> bool:
        """Run the pass on the entire module. Returns True if the module was modified."""
        ...

    @classmethod
    @abstractmethod
    def info(cls) -> PassInfo:
        """Return metadata about this pass."""
        ...
Lines changed: 106 additions & 0 deletions
@@ -0,0 +1,106 @@
"""Bogus Control Flow Pass: port of Pluto BogusControlFlowPass.cpp.

Inserts opaque predicates (always-true conditions) before unconditional
branches, adding dead code paths to confuse static analysis.
"""

from __future__ import annotations

import llvm

from shifting_codes.passes import PassRegistry
from shifting_codes.passes.base import FunctionPass, PassInfo
from shifting_codes.utils.crypto import CryptoRandom


@PassRegistry.register
class BogusControlFlowPass(FunctionPass):

    def __init__(self, rng: CryptoRandom | None = None):
        self.rng = rng or CryptoRandom()
        self._bcf_counter = 0

    @classmethod
    def info(cls) -> PassInfo:
        return PassInfo(
            name="bogus_control_flow",
            description="Insert opaque predicates and bogus branches",
        )

    def run_on_function(self, func: llvm.Function, ctx: llvm.Context) -> bool:
        mod = func.module
        i32 = ctx.types.i32
        changed = False

        # Collect blocks with unconditional branches to transform
        blocks_to_transform = []
        for bb in func.basic_blocks:
            term = bb.terminator
            if term is None:
                continue
            if term.opcode == llvm.Opcode.Br:
                succs = list(term.successors)
                if len(succs) == 1:
                    blocks_to_transform.append(bb)

        for bb in blocks_to_transform:
            term = bb.terminator
            succs = list(term.successors)
            if len(succs) != 1:
                continue
            real_target = succs[0]

            self._bcf_counter += 1
            tag = self._bcf_counter

            # Create global variables for the opaque predicate
            x_gv = mod.add_global(i32, f"__bcf_x_{tag}")
            x_gv.initializer = i32.constant(0)
            x_gv.linkage = llvm.Linkage.Private

            y_gv = mod.add_global(i32, f"__bcf_y_{tag}")
            y_gv.initializer = i32.constant(0)
            y_gv.linkage = llvm.Linkage.Private

            # Create bogus block that branches back to the real target
            bogus_bb = func.append_basic_block(f"bcf.bogus.{tag}")
            with bogus_bb.create_builder() as builder:
                builder.br(real_target)

            # Fix PHI nodes in real_target: add incoming from bogus_bb
            # The bogus block is unreachable, so we use undef values
            for inst in real_target.instructions:
                if inst.opcode == llvm.Opcode.PHI:
                    inst.add_incoming(inst.type.undef(), bogus_bb)
                else:
                    break  # PHIs are always at the start of a block

            # Insert opaque predicate before the terminator:
            # (y < 10) || (x * (x + 1) % 2 == 0), which is always true
            with bb.create_builder() as builder:
                builder.position_before(term)

                x_val = builder.load(i32, x_gv, "bcf.x")
                y_val = builder.load(i32, y_gv, "bcf.y")

                # cond1: y < 10
                cond1 = builder.icmp(
                    llvm.IntPredicate.SLT, y_val, i32.constant(10), "bcf.cmp1"
                )

                # cond2: x * (x + 1) % 2 == 0 (always true: product of consecutive ints is even)
                xp1 = builder.add(x_val, i32.constant(1), "bcf.xp1")
                xmul = builder.mul(x_val, xp1, "bcf.xmul")
                xmod = builder.urem(xmul, i32.constant(2), "bcf.xmod")
                cond2 = builder.icmp(
                    llvm.IntPredicate.EQ, xmod, i32.constant(0), "bcf.cmp2"
                )

                bogus_cond = builder.or_(cond1, cond2, "bcf.cond")
                builder.cond_br(bogus_cond, real_target, bogus_bb)

            # Remove original unconditional branch
            term.erase_from_parent()
            changed = True

        return changed
