Commit 3b3188b

Create README.md
1 parent ac6db67 commit 3b3188b

1 file changed

Lines changed: 118 additions & 0 deletions

CAPE/README.md

@@ -0,0 +1,118 @@
# CAPE Host Scanner

A host-side detection utility that hunts a live or imaged Windows system for the
durable artifacts of [CAPE Sandbox](https://github.com/CAPESandbox/community)
behavioral signatures. CAPE signatures normally classify a malware sample *after*
detonation by inspecting its API-call trace; this utility extracts the
statically-reachable indicators from those signatures and checks for them on a
real host: registry autostart/tamper keys, on-disk file paths, services,
process command lines, network IOCs, and ransom-note-style file content.

Read-only. Makes no changes to the system it scans.

## Files

| File | Role | Runs on |
|------|------|---------|
| `extract_cape_indicators.py` | Mines the CAPE signature corpus, classifies each signature, builds the indicator pack | Any Python 3.8+ host |
| `indicator_pack.json` | The extracted indicators the scanner consumes (regenerate to update) | data |
| `Invoke-CapeHostScan.ps1` | The scanner: matches the pack against host state | Windows (PS 5.1 / 7+) |
| `ref_matcher.py` | Reference matching engine; validates scanner logic, never touches a host | Any Python (optional) |
| `coverage_report.md` | Human-readable reachability report for the signature set | reference |

## Coverage at a glance

Across **614** CAPE Windows signatures: **171 (28%)** are statically host-scannable,
2 are runtime-only (mutexes), and **441 (72%)** are behavioral-only, meaning pure
API-sequence logic with no on-disk residue. This utility covers the 28%. The
behavioral majority needs live telemetry (Sysmon/Sigma); see *Roadmap*.
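
The percentages above follow directly from the stated counts. The short check below recomputes them; the bucket names are illustrative labels for this sketch, not identifiers taken from `coverage_report.md`.

```python
# Counts as stated in the coverage summary; bucket names are illustrative.
TOTAL = 614
buckets = {
    "host-scannable": 171,
    "runtime-only": 2,
    "behavioral-only": 441,
}

# The three buckets should partition the whole corpus.
assert sum(buckets.values()) == TOTAL


def share(count, total=TOTAL):
    """Return the share of the corpus, rounded to the nearest whole percent."""
    return round(100 * count / total)


shares = {name: share(count) for name, count in buckets.items()}

for name, count in buckets.items():
    print(f"{name:16s} {count:4d}  {shares[name]:3d}%")
```

The two runtime-only signatures round to 0%, which is why the headline figures (28% and 72%) already sum to 100.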

## Requirements

- **Windows PowerShell 5.1 or PowerShell 7+**
- Run **elevated** for full registry / service / process coverage
- Python 3.8+ only if you want to (re)generate the indicator pack

## Quick start

```powershell
# one-time setup in the scanner's folder
Unblock-File .\Invoke-CapeHostScan.ps1        # strip mark-of-the-web
Set-ExecutionPolicy -Scope Process Bypass     # this session only

# 1. verify the matching engine on your box (no system access)
.\Invoke-CapeHostScan.ps1 -Pack .\indicator_pack.json -SelfTest
# expect: "4/4 expected signatures fired" + generic-FP suppressed

# 2. real scan
.\Invoke-CapeHostScan.ps1 -Pack .\indicator_pack.json -OutDir .\out -IncludeStringScan
```

Regenerating the pack (on any Python host):

```bash
python3 extract_cape_indicators.py --clone ./_cape
# -> indicator_pack.json + coverage_report.md
```

## Parameters

| Parameter | Purpose |
|-----------|---------|
| `-Pack <path>` | Path to `indicator_pack.json` (required) |
| `-OutDir <path>` | Output directory for JSON + CSV (default: current dir) |
| `-SelfTest` | Run engine against a synthetic host; no real access |
| `-Categories <list>` | Restrict to CAPE categories, e.g. `ransomware,persistence,stealth` |
| `-IncludeStringScan` | Enable file-content scanning (ransom notes etc.); slower |
| `-IncludeGeneric` | Include low-specificity patterns (high false-positive rate; off by default) |
| `-Root <path>` | Offline/imaged analysis root, e.g. `E:\mount\C` (filesystem + content only) |
| `-MaxFileSizeKB <n>` | Skip files larger than this (default 4096) |
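
To illustrate what a size cap like `-MaxFileSizeKB` implies for a content scan, here is a minimal Python sketch of a size-limited file walk. It is not the scanner's PowerShell code: the function name `files_within_limit` is hypothetical, and treating 1 KB as 1024 bytes is an assumption.

```python
import os
import tempfile


def files_within_limit(root, max_kb=4096):
    """Yield paths under root whose size does not exceed max_kb (1 KB = 1024 bytes, assumed)."""
    limit = max_kb * 1024
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.getsize(path) <= limit:
                    yield path
            except OSError:
                # Unreadable or vanished file: skip it rather than abort the walk.
                continue


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    for name, size in (("note.txt", 100), ("big.bin", 3000)):
        with open(os.path.join(tmp, name), "wb") as fh:
            fh.write(b"x" * size)
    kept = sorted(os.path.basename(p) for p in files_within_limit(tmp, max_kb=1))

print(kept)
```

Checking the size before opening a file keeps the walk cheap, which matters because content scanning is the slow path.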

## Output

Console prints a severity-sorted findings table. `OutDir` receives:

- `cape_hostscan_<timestamp>.json`: full evidence, for reports/pipelines
- `cape_hostscan_<timestamp>.csv`: flat triage view

Each finding includes: signature name, CAPE category, severity, MITRE ATT&CK
TTP(s), the pattern that matched, and the exact evidence (registry path, file,
command line, or host).

> **Findings are leads, not verdicts.** These are behavioral indicators
> repurposed as static checks. Legitimate software touches autostart keys and
> common paths, so verify each hit before acting.
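
Because findings are leads to verify, a pipeline consuming the JSON report will usually group and rank them first. The sketch below shows one way to do that in Python. The key names (`signature`, `severity`, `evidence`) and the severity labels are assumptions for illustration; check them against a real `cape_hostscan_<timestamp>.json` before relying on them.

```python
import json
from collections import defaultdict

# Hypothetical severity labels and ordering; the scanner's own may differ.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}


def rank(finding):
    """Map a finding's severity label to a sortable rank (unknown labels sort last)."""
    return SEVERITY_RANK.get(str(finding.get("severity", "")).lower(), 99)


def triage(findings):
    """Group findings by signature, most severe group first, then by name."""
    groups = defaultdict(list)
    for finding in findings:
        groups[finding["signature"]].append(finding)
    return sorted(groups.items(), key=lambda kv: (min(map(rank, kv[1])), kv[0]))


# Stand-in for json.load() on a real report; the keys are assumed, not confirmed.
sample = json.loads("""
[
  {"signature": "sig_b", "severity": "Low",    "evidence": "example evidence 1"},
  {"signature": "sig_a", "severity": "High",   "evidence": "example evidence 2"},
  {"signature": "sig_b", "severity": "Medium", "evidence": "example evidence 3"}
]
""")

ordered = triage(sample)
for name, items in ordered:
    print(name, len(items), "finding(s)")
```

Grouping by signature puts all evidence for one hit in a single place, so an analyst can confirm or dismiss it in one pass.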

## What it does and doesn't reach

- **Covers:** registry autostart/tamper keys, file-path indicators, services,
  process command lines, network IOCs, opt-in ransom-note content.
- **Skips by default:** 128 low-specificity patterns (e.g. "any `.dll`") that
  would flood results; re-enable with `-IncludeGeneric`.
- **Out of scope (Engine 2 territory):** the 441 behavioral-only signatures
  (injection, anti-debug, evasion chains). `coverage_report.md` lists each by
  name with its TTPs.
- **Offline limit:** with `-Root`, only filesystem and content are checked.
  Registry/service/process/network are live-host only.
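
The offline limit can be pictured as two steps: re-anchor each indicator path under the mounted image, and drop artifact types that need a live host. The Python sketch below illustrates both. It is not the scanner's implementation; `remap_to_root`, `artifacts_in_scope`, and the artifact type names are hypothetical.

```python
from pathlib import PureWindowsPath

# Hypothetical artifact type names; only these are reachable on a mounted image.
OFFLINE_ARTIFACTS = {"file", "content"}
ALL_ARTIFACTS = ["registry", "file", "service", "process", "network", "content"]


def remap_to_root(indicator_path, root):
    """Re-anchor an absolute Windows path under an offline analysis root."""
    path = PureWindowsPath(indicator_path)
    # Drop the drive/root component (e.g. 'C:\\') so the rest can be re-parented.
    tail = path.parts[1:] if path.anchor else path.parts
    return PureWindowsPath(root).joinpath(*tail)


def artifacts_in_scope(offline):
    """Return the artifact types checked in offline (image) or live mode."""
    if not offline:
        return list(ALL_ARTIFACTS)
    return [kind for kind in ALL_ARTIFACTS if kind in OFFLINE_ARTIFACTS]


remapped = remap_to_root(r"C:\Users\Public\x.exe", r"E:\mount\C")
print(remapped)
print(artifacts_in_scope(offline=True))
```

`PureWindowsPath` handles Windows path semantics on any OS, so the same logic works on an analysis workstation that is not running Windows.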

## Validation

The matching engine (regex-vs-literal detection, generic-pattern suppression,
case handling, per-artifact dispatch) is validated at parity against the full
614-signature corpus via `ref_matcher.py`, and confirmed on Windows PowerShell
via `-SelfTest` (4/4 control signatures fire, generic false positive
suppressed). The Windows collector layer (registry/CIM/network/file walking) is
exercised by real runs.
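
As a rough picture of what those engine responsibilities involve, here is a minimal Python sketch of regex-vs-literal detection, generic-pattern suppression, and case-insensitive matching. It is not the logic in `ref_matcher.py`: the heuristics, the `min_literal` threshold, and all function names are assumptions made for illustration.

```python
import re

# Assumed heuristic: these tokens suggest the pattern was written as a regex.
REGEX_HINTS = (".*", ".+", "^", "$", "[", "(", "|", "\\\\")


def looks_like_regex(pattern):
    """Guess whether a pattern is a regex rather than a literal string."""
    return any(hint in pattern for hint in REGEX_HINTS)


def is_generic(pattern, min_literal=8):
    """Flag patterns whose literal remainder is too short to be specific."""
    literal = re.sub(r"\.\*|\.\+|[\^$\\()\[\]|?]", "", pattern)
    return len(literal) < min_literal


def matches(pattern, value):
    """Case-insensitive match: regex search or literal substring test."""
    if looks_like_regex(pattern):
        return re.search(pattern, value, re.IGNORECASE) is not None
    return pattern.lower() in value.lower()


def scan(patterns, values, include_generic=False):
    """Return (pattern, value) hits, skipping generic patterns unless asked."""
    hits = []
    for pattern in patterns:
        if is_generic(pattern) and not include_generic:
            continue
        hits.extend((pattern, value) for value in values if matches(pattern, value))
    return hits


patterns = [
    r".*\.dll$",                            # generic: matches any DLL
    r".*\\CurrentVersion\\Run\\updater$",   # regex with escaped backslashes
    r"C:\ProgramData\evil.exe",             # literal path
]
values = [
    r"HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Run\Updater",
    r"c:\programdata\EVIL.EXE",
    r"C:\Windows\System32\kernel32.dll",
]

default_hits = scan(patterns, values)
all_hits = scan(patterns, values, include_generic=True)
print(len(default_hits), len(all_hits))
```

A self-test in the spirit of `-SelfTest` would run a fixed set of control patterns like these against synthetic values and confirm that the specific ones fire while the generic one stays suppressed.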

## Roadmap

- **Engine 2**: translate the behavioral-only TTPs into Sysmon + Sigma rules to
  cover the remaining 72%.
- Offline registry-hive parsing for `-Root` (imaged-disk analysis).
- Scheduled-task XML parsing for persistence signatures.
- Per-signature finding de-duplication.

---

*Pacific Northwest Computers: IT support & cybersecurity, Vancouver WA.
Built on CAPE Sandbox community signatures (GPLv3).*
