|
| 1 | +# CAPE Host Scanner |
| 2 | + |
| 3 | +A host-side detection utility that hunts a live or imaged Windows system for the |
| 4 | +durable artifacts of [CAPE Sandbox](https://github.com/CAPESandbox/community) |
| 5 | +behavioral signatures. CAPE signatures normally classify a malware sample *after* |
| 6 | +detonation by inspecting its API-call trace; this utility extracts the |
| 7 | +statically-reachable indicators from those signatures and checks for them on a |
| 8 | +real host — registry autostart/tamper keys, on-disk file paths, services, |
| 9 | +process command lines, network IOCs, and ransom-note-style file content. |
| 10 | + |
| 11 | +The scanner is read-only: it makes no changes to the system it scans. |
| 12 | + |
| 13 | +## Files |
| 14 | + |
| 15 | +| File | Role | Runs on | |
| 16 | +|------|------|---------| |
| 17 | +| `extract_cape_indicators.py` | Mines the CAPE signature corpus, classifies each signature, builds the indicator pack | Any Python 3.8+ host | |
| 18 | +| `indicator_pack.json` | The extracted indicators the scanner consumes (regenerate to update) | data | |
| 19 | +| `Invoke-CapeHostScan.ps1` | The scanner — matches the pack against host state | Windows (PS 5.1 / 7+) | |
| 20 | +| `ref_matcher.py` | Reference matching engine; validates scanner logic, never touches a host | Any Python (optional) | |
| 21 | +| `coverage_report.md` | Human-readable reachability report for the signature set | reference | |
| 22 | + |
| 23 | +## Coverage at a glance |
| 24 | + |
| 25 | +Across **614** CAPE Windows signatures: **171 (28%)** are statically host-scannable, |
| 26 | +2 are runtime-only (mutexes), and **441 (72%)** are behavioral-only — pure |
| 27 | +API-sequence logic with no on-disk residue. This utility covers the 28%. The |
| 28 | +behavioral majority needs live telemetry (Sysmon/Sigma); see *Roadmap*. |
| 29 | + |
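The percentages above follow directly from the three counts. A minimal Python check, using only the numbers quoted in this section (the variable names are illustrative):

```python
# Counts quoted in the "Coverage at a glance" section.
total = 614
host_scannable = 171
runtime_only = 2
behavioral_only = 441

# The three buckets partition the signature corpus.
assert host_scannable + runtime_only + behavioral_only == total


def pct(n, whole):
    """Share of `whole` as a rounded integer percentage."""
    return round(100 * n / whole)


shares = {
    "host-scannable": pct(host_scannable, total),
    "runtime-only": pct(runtime_only, total),
    "behavioral-only": pct(behavioral_only, total),
}
print(shares)  # {'host-scannable': 28, 'runtime-only': 0, 'behavioral-only': 72}
```

The two runtime-only signatures round to 0%, which is why the headline figures are quoted as 28% and 72%.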
| 30 | +## Requirements |
| 31 | + |
| 32 | +- **Windows PowerShell 5.1 or PowerShell 7+** |
| 33 | +- Run **elevated** for full registry / service / process coverage |
| 34 | +- Python 3.8+ only if you want to (re)generate the indicator pack |
| 35 | + |
| 36 | +## Quick start |
| 37 | + |
| 38 | +```powershell |
| 39 | +# one-time setup in the scanner's folder |
| 40 | +Unblock-File .\Invoke-CapeHostScan.ps1 # strip mark-of-the-web |
| 41 | +Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass # this session only |
| 42 | + |
| 43 | +# 1. verify the matching engine on your box (no system access) |
| 44 | +.\Invoke-CapeHostScan.ps1 -Pack .\indicator_pack.json -SelfTest |
| 45 | +# expect: "4/4 expected signatures fired" + generic-FP suppressed |
| 46 | + |
| 47 | +# 2. real scan |
| 48 | +.\Invoke-CapeHostScan.ps1 -Pack .\indicator_pack.json -OutDir .\out -IncludeStringScan |
| 49 | +``` |
| 50 | + |
| 51 | +Regenerating the pack (on any Python host): |
| 52 | + |
| 53 | +```bash |
| 54 | +python3 extract_cape_indicators.py --clone ./_cape |
| 55 | +# -> indicator_pack.json + coverage_report.md |
| 56 | +``` |
| 57 | + |
| 58 | +## Parameters |
| 59 | + |
| 60 | +| Parameter | Purpose | |
| 61 | +|-----------|---------| |
| 62 | +| `-Pack <path>` | Path to `indicator_pack.json` (required) | |
| 63 | +| `-OutDir <path>` | Output directory for JSON + CSV (default: current dir) | |
| 64 | +| `-SelfTest` | Run engine against a synthetic host; no real access | |
| 65 | +| `-Categories <list>` | Restrict to CAPE categories, e.g. `ransomware,persistence,stealth` | |
| 66 | +| `-IncludeStringScan` | Enable file-content scanning (ransom notes etc.); slower | |
| 67 | +| `-IncludeGeneric` | Include low-specificity patterns (high false-positive rate; off by default) | |
| 68 | +| `-Root <path>` | Offline/imaged analysis root, e.g. `E:\mount\C` (filesystem + content only) | |
| 69 | +| `-MaxFileSizeKB <n>` | Skip files larger than this (default 4096) | |
| 70 | + |
| 71 | +## Output |
| 72 | + |
| 73 | +The console prints a severity-sorted findings table. The `-OutDir` directory receives: |
| 74 | + |
| 75 | +- `cape_hostscan_<timestamp>.json` — full evidence, for reports/pipelines |
| 76 | +- `cape_hostscan_<timestamp>.csv` — flat triage view |
| 77 | + |
| 78 | +Each finding includes: signature name, CAPE category, severity, MITRE ATT&CK |
| 79 | +TTP(s), the pattern that matched, and the exact evidence (registry path, file, |
| 80 | +command line, or host). |
| 81 | + |
| 82 | +> **Findings are leads, not verdicts.** These are behavioral indicators |
| 83 | +> repurposed as static checks. Legitimate software touches autostart keys and |
| 84 | +> common paths, so verify each hit before acting. |
| 85 | + |
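Because findings are leads, it can help to group the JSON report by signature before reviewing evidence. The sketch below is only an illustration: the report schema is not documented here, so the field names (`signature`, `severity`, `evidence`), the numeric severity scale, and the sample data are all assumptions that would need adjusting to the real `cape_hostscan_<timestamp>.json`.

```python
import json
from collections import defaultdict

# Hypothetical sample report; the real field names and values may differ.
SAMPLE_REPORT = json.dumps([
    {"signature": "example_autorun_sig", "severity": 3,
     "evidence": r"HKCU\Software\Microsoft\Windows\CurrentVersion\Run\Updater"},
    {"signature": "example_autorun_sig", "severity": 3,
     "evidence": r"HKLM\Software\Microsoft\Windows\CurrentVersion\Run\Agent"},
    {"signature": "example_low_noise_sig", "severity": 1,
     "evidence": r"C:\Users\Public\readme.txt"},
])


def group_by_signature(report_text, min_severity=2):
    """Group evidence strings by signature, dropping low-severity findings."""
    findings = json.loads(report_text)
    grouped = defaultdict(list)
    for finding in findings:
        # Assumed: severity is an integer where higher means more severe.
        if finding.get("severity", 0) >= min_severity:
            grouped[finding["signature"]].append(finding["evidence"])
    return dict(grouped)


for name, evidence in group_by_signature(SAMPLE_REPORT).items():
    print(name, len(evidence))
```

Reading a real report would replace `SAMPLE_REPORT` with the contents of the file written to the output directory.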
| 86 | +## What it does and doesn't reach |
| 87 | + |
| 88 | +- **Covers:** registry autostart/tamper keys, file-path indicators, services, |
| 89 | + process command lines, network IOCs, opt-in ransom-note content. |
| 90 | +- **Skips by default:** 128 low-specificity patterns (e.g. "any `.dll`") that |
| 91 | + would flood results; re-enable with `-IncludeGeneric`. |
| 92 | +- **Out of scope (planned Engine 2, see *Roadmap*):** the 441 behavioral-only |
| 93 | + signatures (injection, anti-debug, evasion chains). `coverage_report.md` lists |
| 94 | + each by name with its TTPs. |
| 95 | +- **Offline limit:** with `-Root`, only filesystem paths and file content are checked. |
| 96 | + Registry, service, process, and network checks require a live host. |
| 97 | + |
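For offline analysis, a drive-rooted indicator path has to be re-anchored under the mounted image before it can be checked. How the scanner does this internally is not described here; the following Python sketch shows one plausible approach, and the function name `remap_to_root` is hypothetical.

```python
from pathlib import PureWindowsPath


def remap_to_root(indicator_path: str, root: str) -> PureWindowsPath:
    """Re-anchor a Windows path under an offline root such as E:\\mount\\C.

    Assumption: indicator paths are drive-rooted (C:\\...), and the drive
    letter is simply replaced by the mount point of the imaged volume.
    """
    path = PureWindowsPath(indicator_path)
    # parts[0] is the anchor (e.g. "C:\\") when one is present; drop it.
    relative_parts = path.parts[1:] if path.anchor else path.parts
    return PureWindowsPath(root).joinpath(*relative_parts)


print(remap_to_root(r"C:\Users\Public\note.txt", r"E:\mount\C"))
# E:\mount\C\Users\Public\note.txt
```

`PureWindowsPath` only manipulates the path text, so the sketch runs on any operating system and never touches the filesystem.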
| 98 | +## Validation |
| 99 | + |
| 100 | +The matching engine (regex-vs-literal detection, generic-pattern suppression, |
| 101 | +case handling, per-artifact dispatch) is validated at parity against the full |
| 102 | +614-signature corpus via `ref_matcher.py`, and confirmed on Windows PowerShell |
| 103 | +via `-SelfTest` (4/4 control signatures fire, generic false positive |
| 104 | +suppressed). The Windows collector layer (registry/CIM/network/file walking) is |
| 105 | +not covered by `-SelfTest`; it is exercised only by real scans. |
| 106 | + |
| 107 | +## Roadmap |
| 108 | + |
| 109 | +- **Engine 2** — translate the behavioral-only TTPs into Sysmon + Sigma rules to |
| 110 | + cover the remaining 72%. |
| 111 | +- Offline registry-hive parsing for `-Root` (imaged-disk analysis). |
| 112 | +- Scheduled-task XML parsing for persistence signatures. |
| 113 | +- Per-signature finding de-duplication. |
| 114 | + |
| 115 | +--- |
| 116 | + |
| 117 | +*Pacific Northwest Computers — IT support & cybersecurity, Vancouver WA. |
| 118 | +Built on CAPE Sandbox community signatures (GPLv3).* |