
Commit 8997a35

Merge pull request #306 from mandiant/add-feature-snapshot-frz
add feature snapshot freeze fixtures
2 parents 1b6638d + 78249ad commit 8997a35

8 files changed

Lines changed: 93 additions & 1 deletion

.github/check_sample_filenames.py

Lines changed: 1 addition & 1 deletion
@@ -43,7 +43,7 @@
     ".zip",
     ".bndb",
 )
-IGNORED_DIRS = (".git", ".github", "sigs")
+IGNORED_DIRS = (".git", ".github", "sigs", "fixtures")


 def main(argv=None):
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:62c569bf59d9df537de557c9eb57eed82021a9b436d3ba0d0cef50f0970c806e
size 71197
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9abbb08374b74b71d18aebbf9c725c1b5a731799523640a8668bba7f5058bc17
size 151835
Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
# feature snapshot fixtures

This directory holds [capa freeze](../../../../capa/features/freeze/__init__.py) files that
serve as snapshot fixtures for feature extraction. They're consumed by
[`tests/test_feature_snapshots.py`](../../../test_feature_snapshots.py), which regenerates a
freeze from each sample and asserts it matches the committed `.frz` byte-for-byte. Any change
that perturbs what capa extracts (a backend fix, a new feature, a refactor that drops a
feature) shows up as a test failure with a feature-count delta and a truncated unified diff.

Each fixture is produced with `python -m capa.features.freeze --reproducible SAMPLE OUTPUT`.
The `--reproducible` flag zeros out dynamic header metadata (notably the capa version that
is otherwise embedded in the freeze), so fixtures stay stable across capa version bumps and
only changes to extracted features cause test failures.

This directory lives under `tests/fixtures/snapshots/features/`; the enclosing
`tests/fixtures/snapshots/` namespace is where future snapshot kinds (e.g. rendered
capabilities, verbose output) will live alongside this one.
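The byte-for-byte check and its failure report, as described above, can be pictured roughly as follows. This is a minimal sketch, not the code in `tests/test_feature_snapshots.py`: the helper name `describe_mismatch`, the simplification that each line of a text rendering counts as one feature, and the `max_lines` cutoff are all assumptions made for illustration.

```python
import difflib


def describe_mismatch(expected: str, actual: str, max_lines: int = 20) -> str:
    """Return "" if the two renderings match, else a short failure report.

    Hypothetical helper: treats each line of the rendering as one feature.
    """
    if expected == actual:
        return ""

    expected_lines = expected.splitlines()
    actual_lines = actual.splitlines()
    delta = len(actual_lines) - len(expected_lines)

    diff = list(
        difflib.unified_diff(
            expected_lines, actual_lines, "committed", "regenerated", lineterm=""
        )
    )
    # Keep only the head of the diff so a large regression stays readable.
    shown = diff[:max_lines]
    if len(diff) > max_lines:
        shown.append(f"... ({len(diff) - max_lines} more diff lines truncated)")

    return "\n".join([f"feature count delta: {delta:+d}"] + shown)
```

A test built on this idea would fail whenever the returned report is non-empty and print the report as the failure message.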
## layout

- `manifest.json`: JSON file containing the `snapshots` list (validated by `tests/feature_snapshot_util.py`).
  Each entry has:
  - `name`, `sample` (path under `tests/data/`), `freeze` (filename in this directory);
  - a human-written `explanation` describing why the sample was picked;
  - `generated_at_commit`, which is informational only: the capa HEAD at which this fixture was last
    regenerated. It is surfaced in test failure output so a reviewer can run
    `git log <commit>..HEAD -- capa/` to see what has changed since. It is not validated at test
    time; humans keep it accurate when regenerating.
  - optional `format`/`backend`/`os` overrides that get passed through to the freeze CLI.
- `*.frz`: a `capa.features.freeze` byte stream (magic `capa0000` + zlib(utf-8(json(...)))).
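The `.frz` container described above (magic, then zlib-compressed UTF-8 JSON) can be packed and unpacked with the standard library alone. The sketch below only illustrates that byte layout and is not capa's loader: `pack_frz` and `unpack_frz` are hypothetical names, and any payload passed to them here is a stand-in rather than capa's real freeze schema.

```python
import json
import zlib

MAGIC = b"capa0000"  # magic prefix named in the layout above


def pack_frz(document: dict) -> bytes:
    """Serialize a JSON-compatible dict as magic + zlib(utf-8(json))."""
    payload = json.dumps(document, sort_keys=True).encode("utf-8")
    return MAGIC + zlib.compress(payload)


def unpack_frz(buf: bytes) -> dict:
    """Reverse pack_frz, rejecting buffers that lack the magic prefix."""
    if not buf.startswith(MAGIC):
        raise ValueError("missing capa0000 magic")
    return json.loads(zlib.decompress(buf[len(MAGIC):]).decode("utf-8"))
```

For real fixtures, the `capa.features.freeze` module remains the right entry point; a helper like this is mainly useful for peeking at the JSON inside a `.frz` while debugging a snapshot failure.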
## how the sample set was picked

The goal is to exercise every major (format, backend) pair capa supports with the smallest
reasonable sample, so running the snapshot suite stays under ~1 minute on a laptop while
still catching regressions in many extraction code paths. Each fixture's `explanation` field
in `manifest.json` spells out why that specific file is in the set and flags any candidate
for removal.

Backends/formats currently covered:

| fixture            | backend      | format        |
|--------------------|--------------|---------------|
| `pma01-01-dll`     | viv          | PE 32-bit DLL |
| `mimikatz-exe`     | viv          | PE 32-bit EXE |
| `pma21-01-exe`     | viv          | PE 64-bit EXE |
| `7351f-elf`        | viv          | ELF           |
| `1c444-dotnet`     | dotnet       | .NET          |
| `mimikatz-exe-ida` | ida (idalib) | PE 32-bit EXE |

Backends deliberately not covered here today:
- BinExport2,
- Binary Ninja,
- dynamic sandbox formats (CAPE, DRAKVUF, VMRay).
## regenerating a fixture after an intentional change

```
python -m capa.features.freeze --reproducible \
    tests/data/<sample> tests/fixtures/snapshots/features/<name>.frz
```

The freeze CLI logs a ready-to-paste manifest entry to its INFO output, including a
`generated_at_commit` taken from the current git HEAD, so updating `manifest.json` is
a matter of copy/paste.
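To picture what such a manifest entry contains, here is a minimal sketch that assembles one by hand. It is not the freeze CLI's code: `current_head` and `build_entry` are hypothetical helpers, the field names are taken from the layout section, and deriving the freeze filename from the fixture name is an assumption based on the `<name>.frz` convention in the command above.

```python
import subprocess


def current_head(repo_dir: str = ".") -> str:
    """Return the commit hash of HEAD via `git rev-parse` (requires git)."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def build_entry(name: str, sample: str, explanation: str, commit: str) -> dict:
    """Assemble a manifest entry; the freeze filename is derived from the name."""
    return {
        "name": name,
        "sample": sample,
        "freeze": f"{name}.frz",
        "explanation": explanation,
        "generated_at_commit": commit,
    }
```

Passing `current_head()` as the `commit` argument records the checkout the fixture was regenerated from; when the CLI's logged entry is available, that output should be preferred over a hand-built one.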
## adding a new fixture

1. Add an entry to the `snapshots` list in `manifest.json`. At minimum specify `name`,
   `sample`, `freeze`, and `explanation`. Use `format`/`backend`/`os` only if the defaults
   don't pick the right extractor.
2. Generate the `.frz` file using the command above.
3. Copy the `generated_at_commit` the CLI suggested into the manifest entry.
4. Commit the updated manifest and the new `.frz` file together.
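A quick sanity check of the manifest before committing might look like the following. This is a sketch only and not the validation in `tests/feature_snapshot_util.py`: the function name `check_manifest`, the assumed top-level `{"snapshots": [...]}` shape, and the duplicate-name and `.frz` suffix rules are assumptions chosen for illustration.

```python
REQUIRED = ("name", "sample", "freeze", "explanation")
OPTIONAL = ("generated_at_commit", "format", "backend", "os")


def check_manifest(manifest: dict) -> list[str]:
    """Return human-readable problems; an empty list means nothing was flagged."""
    problems = []
    seen = set()
    for i, entry in enumerate(manifest.get("snapshots", [])):
        label = entry.get("name", f"entry #{i}")
        # Every required field must be present and non-empty.
        for key in REQUIRED:
            if not entry.get(key):
                problems.append(f"{label}: missing required field '{key}'")
        # Flag typos such as "backed" instead of "backend".
        for key in entry:
            if key not in REQUIRED + OPTIONAL:
                problems.append(f"{label}: unknown field '{key}'")
        if label in seen:
            problems.append(f"{label}: duplicate name")
        seen.add(label)
        if entry.get("freeze") and not entry["freeze"].endswith(".frz"):
            problems.append(f"{label}: freeze should end with .frz")
    return problems
```

Loading `manifest.json` with `json.load` and printing the returned list gives a fast pre-commit check before running the full snapshot suite.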
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:37bf79a62bcfbe94962326f23d3499af2d7cedfa4c826236c5acf2ac06fcdef6
size 3451103
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ff91b15e1218dac2a664da55fc2ab7df8ab415dcb2941a3a013ee4b83e34792a
size 4719869
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c5b8cd87859e37814a349182468c0c055de38bff83f5266ccbdadf05d55c097a
size 7701
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d0ab2ec1d3edde8cdfae5f8c3d900adde176fa03c10f06766fada2a7c9539439
size 159159
