| 1 | +# feature snapshot fixtures |
| 2 | + |
| 3 | +This directory holds [capa freeze](../../../../capa/features/freeze/__init__.py) files that |
| 4 | +serve as snapshot fixtures for feature extraction. They're consumed by |
| 5 | +[`tests/test_feature_snapshots.py`](../../../test_feature_snapshots.py), which regenerates a |
| 6 | +freeze from each sample and asserts it matches the committed `.frz` byte-for-byte. Any change |
| 7 | +that perturbs what capa extracts (a backend fix, a new feature, a refactor that drops a |
| 8 | +feature) shows up as a test failure with a feature-count delta and a truncated unified diff. |
| 9 | + |
| 10 | +Each fixture is produced with `python -m capa.features.freeze --reproducible SAMPLE OUTPUT`. |
| 11 | +The `--reproducible` flag zeros out dynamic header metadata (notably the capa version that |
| 12 | +is otherwise embedded in the freeze) so fixtures stay stable across capa version bumps — |
| 13 | +only changes to extracted features cause test failures. |
| 14 | + |
| 15 | +This directory is `tests/fixtures/snapshots/features/`; the enclosing |
| 16 | +`tests/fixtures/snapshots/` namespace is where future snapshot kinds (e.g. rendered |
| 17 | +capabilities, verbose output) will live alongside this one. |
| 18 | + |
| 19 | +## layout |
| 20 | + |
| 21 | +- `manifest.json` — JSON list of snapshots (validated by `tests/feature_snapshot_util.py`). |
| 22 | + Each entry has: |
| 23 | + - `name`, `sample` (path under `tests/data/`), `freeze` (filename in this directory); |
| 24 | + - a human-written `explanation` describing why the sample was picked; |
| 25 | + - `generated_at_commit` — informational-only: the capa HEAD at which this fixture was last |
| 26 | + regenerated. Surfaced in test failure output so a reviewer can run |
| 27 | + `git log <commit>..HEAD -- capa/` to see what's changed since. Not validated at test |
| 28 | + time; humans keep it accurate when regenerating. |
| 29 | + - optional `format`/`backend`/`os` overrides that get passed through to the freeze CLI. |
| 30 | +- `*.frz` — a `capa.features.freeze` byte stream (magic `capa0000` + zlib(utf-8(json(...)))). |
| 31 | + |
| 32 | +## how the sample set was picked |
| 33 | + |
| 34 | +The goal is to exercise every major (format, backend) pair capa supports with the smallest |
| 35 | +reasonable sample, so running the snapshot suite stays under ~1 minute on a laptop while |
| 36 | +still catching regressions in many extraction code paths. Each fixture's `explanation` field |
| 37 | +in `manifest.json` spells out why that specific file is in the set and flags any candidate |
| 38 | +for removal. |
| 39 | + |
| 40 | +Backends/formats currently covered: |
| 41 | + |
| 42 | +| fixture            | backend      | format        | |
| 43 | +|--------------------|--------------|---------------| |
| 44 | +| `pma01-01-dll`     | viv          | PE 32-bit DLL | |
| 45 | +| `mimikatz-exe`     | viv          | PE 32-bit EXE | |
| 46 | +| `pma21-01-exe`     | viv          | PE 64-bit EXE | |
| 47 | +| `7351f-elf`        | viv          | ELF           | |
| 48 | +| `1c444-dotnet`     | dotnet       | .NET          | |
| 49 | +| `mimikatz-exe-ida` | ida (idalib) | PE 32-bit EXE | |
| 50 | + |
| 51 | +Backends deliberately not covered here today: |
| 52 | +- BinExport2 |
| 53 | +- Binary Ninja |
| 54 | +- dynamic sandbox formats (CAPE, DRAKVUF, VMRay) |
| 55 | + |
| 56 | +## regenerating a fixture after an intentional change |
| 57 | + |
| 58 | +``` |
| 59 | +python -m capa.features.freeze --reproducible \ |
| 60 | + tests/data/<sample> tests/fixtures/snapshots/features/<name>.frz |
| 61 | +``` |
| 62 | + |
| 63 | +The freeze CLI logs a ready-to-paste manifest entry to its INFO output — including a |
| 64 | +`generated_at_commit` taken from the current git HEAD — so updating `manifest.json` is |
| 65 | +copy/paste. |
| 66 | + |
| 67 | +## adding a new fixture |
| 68 | + |
| 69 | +1. Add an entry to the `snapshots` list in `manifest.json`. At minimum specify `name`, |
| 70 | + `sample`, `freeze`, and `explanation`. Use `format`/`backend`/`os` only if the defaults |
| 71 | + don't pick the right extractor. |
| 72 | +2. Generate the `.frz` file using the command above. |
| 73 | +3. Copy the `generated_at_commit` the CLI suggested into the manifest entry. |
| 74 | +4. Commit the updated manifest and the new `.frz` file together. |
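
The `.frz` envelope described in the layout section (magic `capa0000` followed by zlib-compressed UTF-8 JSON) can be unpacked with the standard library alone. The following is a minimal sketch that assumes exactly that layout and nothing more. `encode_frz` and `decode_frz` are hypothetical helper names, not part of capa's API, and the payload in the usage lines is a placeholder, not a real freeze document.

```python
import json
import zlib

MAGIC = b"capa0000"  # magic prefix, as stated in the layout section


def encode_frz(doc: dict) -> bytes:
    """Hypothetical helper: wrap a JSON-serializable document in the envelope."""
    return MAGIC + zlib.compress(json.dumps(doc).encode("utf-8"))


def decode_frz(buf: bytes) -> dict:
    """Hypothetical helper: strip the magic, decompress, and parse the JSON."""
    if not buf.startswith(MAGIC):
        raise ValueError("missing capa0000 magic")
    return json.loads(zlib.decompress(buf[len(MAGIC):]).decode("utf-8"))


# Round-trip a placeholder payload (not a real capa freeze document).
placeholder = {"example": [1, 2, 3]}
assert decode_frz(encode_frz(placeholder)) == placeholder
```

For real fixtures, capa's own loader in `capa.features.freeze` is the thing to use; a sketch like this is only useful for eyeballing a fixture's JSON during review.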
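
The README says a mismatch surfaces as a truncated unified diff. One way such a comparison could be built is sketched below, assuming the envelope layout given in the layout section. It is not the code in `tests/test_feature_snapshots.py`; `pretty_lines`, `snapshot_diff`, and the `max_lines` cutoff are hypothetical names chosen for illustration.

```python
import difflib
import json
import zlib

MAGIC = b"capa0000"  # magic prefix, as stated in the layout section


def pretty_lines(frz: bytes) -> list[str]:
    """Decode the envelope and pretty-print the JSON so it diffs line by line."""
    payload = json.loads(zlib.decompress(frz[len(MAGIC):]).decode("utf-8"))
    return json.dumps(payload, indent=2, sort_keys=True).splitlines()


def snapshot_diff(committed: bytes, regenerated: bytes, max_lines: int = 40) -> list[str]:
    """Return [] when the bytes match, else a truncated unified diff."""
    if committed == regenerated:
        return []
    diff = list(
        difflib.unified_diff(
            pretty_lines(committed),
            pretty_lines(regenerated),
            fromfile="committed",
            tofile="regenerated",
            lineterm="",
        )
    )
    if not diff:
        # Same JSON, different bytes (for example, different compression output).
        return ["bytes differ but decoded JSON is identical"]
    if len(diff) > max_lines:
        diff = diff[:max_lines] + [f"... ({len(diff) - max_lines} more lines)"]
    return diff
```

Comparing raw bytes first keeps the passing case cheap; decoding happens only when there is something to report.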