| 1 | +# CLAUDE.md |
| 2 | + |
| 3 | +This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. |
| 4 | + |
| 5 | +## Project |
| 6 | + |
| 7 | +Python port of Pluto LLVM obfuscation passes using llvm-nanobind bindings. Six passes transform LLVM IR to obfuscate code: Substitution, MBA Obfuscation, Bogus Control Flow, Flattening, Global Encryption, and Indirect Call. |
| 8 | + |
| 9 | +## Commands |
| 10 | + |
| 11 | +```bash |
| 12 | +# Run all tests |
| 13 | +python -m uv run pytest tests/ -v |
| 14 | + |
| 15 | +# Run a single test file |
| 16 | +python -m uv run pytest tests/test_substitution.py -v |
| 17 | + |
| 18 | +# Run a single test by name |
| 19 | +python -m uv run pytest tests/test_substitution.py -k "test_add_substitution" -v |
| 20 | + |
| 21 | +# Run UI (requires llvm-nanobind built) |
| 22 | +python -m uv run python -m shifting_codes.ui.app |
| 23 | +``` |
| 24 | + |
| 25 | +## Dependencies |
| 26 | + |
| 27 | +- **llvm-nanobind**: Local editable dependency at `../llvm-nanobind/build/` (must be built separately before tests/UI work) |
| 28 | +- **z3-solver**: Constraint solving for MBA coefficient generation |
| 29 | +- **PyQt6**: GUI framework (UI not yet tested) |
| 30 | +- Python 3.12+ required, managed with UV + hatchling build backend |
| 31 | + |
| 32 | +## Architecture |
| 33 | + |
| 34 | +### Pass System |
| 35 | + |
| 36 | +All passes inherit from `FunctionPass` or `ModulePass` (in `src/shifting_codes/passes/base.py`) and are auto-registered via the `@PassRegistry.register` decorator. Each pass implements `run_on_function(func, ctx)` or `run_on_module(mod, ctx)`, which returns a bool indicating whether the IR was modified. |
| 37 | + |
| 38 | +Passes are composed via `PassPipeline` (in `src/shifting_codes/passes/__init__.py`): |
| 39 | +```python |
| 40 | +pipeline = PassPipeline() |
| 41 | +pipeline.add(SubstitutionPass(rng=CryptoRandom(seed=42))) |
| 42 | +pipeline.run(mod, ctx) |
| 43 | +``` |
| 44 | + |
| 45 | +**FunctionPasses:** Substitution, MBAObfuscation, BogusControlFlow, Flattening |
| 46 | +**ModulePasses:** GlobalEncryption, IndirectCall |
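The registration and pipeline pattern described in this section can be sketched roughly as follows. This is a hypothetical stand-in, not the code in `base.py`: the registry's internal dict, the `names()` helper, the two toy passes, and the use of plain dicts in place of LLVM functions are all assumptions made so the example runs without llvm-nanobind.

```python
class PassRegistry:
    """Hypothetical registry: maps a pass class name to the class itself."""
    _passes: dict[str, type] = {}

    @classmethod
    def register(cls, pass_cls: type) -> type:
        # Used as a decorator: record the class, then return it unchanged.
        cls._passes[pass_cls.__name__] = pass_cls
        return pass_cls

    @classmethod
    def names(cls) -> list[str]:
        return sorted(cls._passes)


class FunctionPass:
    def run_on_function(self, func, ctx) -> bool:
        raise NotImplementedError


@PassRegistry.register
class NoOpPass(FunctionPass):
    def run_on_function(self, func, ctx) -> bool:
        return False  # nothing was modified


@PassRegistry.register
class RenamePass(FunctionPass):
    def run_on_function(self, func, ctx) -> bool:
        func["name"] += "_obf"  # toy "transformation" on a dict stand-in
        return True  # report that the function changed


class PassPipeline:
    """Hypothetical pipeline: runs each pass over every function in order."""

    def __init__(self) -> None:
        self._passes: list[FunctionPass] = []

    def add(self, p: FunctionPass) -> None:
        self._passes.append(p)

    def run(self, functions: list[dict], ctx) -> bool:
        changed = False
        for p in self._passes:
            for func in functions:
                # Call the pass first so it always runs, then fold in the flag.
                changed = p.run_on_function(func, ctx) or changed
        return changed


funcs = [{"name": "f"}]
pipeline = PassPipeline()
pipeline.add(NoOpPass())
pipeline.add(RenamePass())
modified = pipeline.run(funcs, ctx=None)
```

Returning the class from `register` is what lets the decorator sit on top of a normal class definition without altering it.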
| 47 | + |
| 48 | +### Utilities (`src/shifting_codes/utils/`) |
| 49 | + |
| 50 | +- **`crypto.py`** — `CryptoRandom`: wraps `secrets` (production) or `random.Random(seed)` (testing). All passes accept an `rng` parameter for determinism. |
| 51 | +- **`mba.py`** — Z3-based MBA coefficient generation with result caching. Generates linear (15 truth tables) and univariate polynomial expressions. |
| 52 | +- **`ir_helpers.py`** — PHI/register demotion to stack (`demote_phi_to_stack`, `demote_regs_to_stack`), used by Flattening pass. |
| 53 | + |
| 54 | +### XTEA (`src/shifting_codes/xtea/`) |
| 55 | + |
| 56 | +Reference XTEA cipher implementation (pure Python) plus an LLVM IR builder that constructs the same cipher using the nanobind Builder API. Used for end-to-end testing: build IR → apply all passes → compile → execute via ctypes → verify against reference. |
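For orientation, a pure-Python XTEA block cipher along the lines of the reference side of that comparison might look like the sketch below. It follows the standard public XTEA algorithm; the function names and the default of 32 rounds are assumptions and may differ from the module in `src/shifting_codes/xtea/`.

```python
MASK32 = 0xFFFFFFFF
DELTA = 0x9E3779B9


def _mix(v: int) -> int:
    # ((v << 4) ^ (v >> 5)) + v, truncated to 32 bits like C uint32_t math.
    return ((((v << 4) & MASK32) ^ (v >> 5)) + v) & MASK32


def xtea_encrypt_block(v0: int, v1: int, key: list[int], rounds: int = 32):
    """Encrypt one 64-bit block (two 32-bit words) with a 4-word key."""
    total = 0
    for _ in range(rounds):
        v0 = (v0 + (_mix(v1) ^ ((total + key[total & 3]) & MASK32))) & MASK32
        total = (total + DELTA) & MASK32
        v1 = (v1 + (_mix(v0) ^ ((total + key[(total >> 11) & 3]) & MASK32))) & MASK32
    return v0, v1


def xtea_decrypt_block(v0: int, v1: int, key: list[int], rounds: int = 32):
    """Invert xtea_encrypt_block by undoing each round in reverse order."""
    total = (DELTA * rounds) & MASK32
    for _ in range(rounds):
        v1 = (v1 - (_mix(v0) ^ ((total + key[(total >> 11) & 3]) & MASK32))) & MASK32
        total = (total - DELTA) & MASK32
        v0 = (v0 - (_mix(v1) ^ ((total + key[total & 3]) & MASK32))) & MASK32
    return v0, v1


key = [0x01234567, 0x89ABCDEF, 0xFEDCBA98, 0x76543210]
plain = (0xDEADBEEF, 0x0BADF00D)
cipher = xtea_encrypt_block(*plain, key)
recovered = xtea_decrypt_block(*cipher, key)
```

The explicit `& MASK32` after every add and subtract matters because Python integers are unbounded, whereas the compiled IR operates on fixed-width `i32` values.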
| 57 | + |
| 58 | +### Test Fixtures (`tests/conftest.py`) |
| 59 | + |
| 60 | +- `ctx`: Fresh LLVM context per test |
| 61 | +- `rng`: Seeded `CryptoRandom(seed=42)` for deterministic tests |
| 62 | +- Helper functions: `make_add_function()`, `make_arith_function()`, `make_branch_function()`, `make_loop_function()` |
| 63 | + |
| 64 | +## llvm-nanobind API Pitfalls |
| 65 | + |
| 66 | +- `ctx.types.ptr`, `ctx.types.i32`, `ctx.types.void` are **properties** (not methods) |
| 67 | +- `ctx.create_module("name")` returns a context manager: `with ctx.create_module("name") as mod:` |
| 68 | +- `inst.block` for parent block (not `.parent`) |
| 69 | +- `gv.global_value_type` for content type (not `gv.type` which returns pointer type) |
| 70 | +- `call_inst.called_value` is read-only — to change call target, rebuild the call instruction |
| 71 | +- `builder.call(func, args, name)` for direct calls; `builder.call(func_ty, ptr, args, name)` for indirect calls |
| 72 | +- `mod.target_triple = "..."` (not `mod.triple`) |
| 73 | +- `func.dll_storage_class = llvm.DLLExport` required for Windows DLL exports |
| 74 | +- Integer constants must be masked to bit width: `key & ((1 << vtype.int_width) - 1)` |
| 75 | +- ConstantDataArray element access via `get_operand()` crashes — avoid array encryption |
| 76 | +- PHI nodes need `inst.add_incoming(value, pred_bb)` when new predecessors are added |
| 77 | +- To work around Z3 non-determinism, bound the coefficients (`-10 <= X[i] <= 10`) and set `smt.random_seed` |
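The bit-width masking rule from the list above can be illustrated in plain Python, with no LLVM involved. The helper name `mask_to_width` is hypothetical; in a pass, the width would come from something like `vtype.int_width`.

```python
def mask_to_width(value: int, bits: int) -> int:
    """Reduce an arbitrary Python int to an unsigned value of `bits` bits."""
    return value & ((1 << bits) - 1)


# Python ints are unbounded and may be negative, so mask before use.
all_ones_i8 = mask_to_width(-1, 8)
truncated_i32 = mask_to_width(0x1_2345_6789, 32)

# XOR-encrypting an 8-bit value with an oversized key stays within 8 bits.
encrypted_i8 = 0x0F ^ mask_to_width(0x1FF, 8)
```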