Skip to content

v0.1.3: Claude Code hooks, approval bypass, sliding-window scan#1

Open
TeoSlayer wants to merge 4 commits into
mainfrom
release/v0.1.3
Open

v0.1.3: Claude Code hooks, approval bypass, sliding-window scan#1
TeoSlayer wants to merge 4 commits into
mainfrom
release/v0.1.3

Conversation

@TeoSlayer

Copy link
Copy Markdown

Summary

  • Claude Code hook integration: aegis install-hooks wires PreToolUse/Bash (blocking, L1-only, microsecond) and PostToolUse/WebFetch+Bash+mcp__* (warn, non-blocking) into ~/.claude/settings.json
  • Approval bypass: aegis approve '<cmd>' / aegis revoke '<cmd>' — one-time SHA-256 hash bypass for blocked commands; consumed on use, blocks again afterward
  • Sliding-window scan: 4096-byte window, 512-byte stride — full document coverage, defeats middle-burial attacks
  • Credential taint: co-occurrence of credential read + network sink → CRED_TAINT even without explicit injection keywords
  • WARN tier: L1 fired but judge cleared → warn_rule surfaced to user, not quarantine
  • Judge improvements: head+tail window for large docs; first-word + negation-exclusion parsing to prevent fail-closed FPs
  • Eval corpus: 30 labeled files (20 malicious, 10 benign), CI runner exits 1 below thresholds
  • Eval results: 90% recall / 95% precision / 92% F1 — PASS

Test plan

  • aegis version prints 0.1.3
  • aegis install-hooks writes correct hooks with absolute binary path
  • PreToolUse hook blocks commands containing sensitive credential paths
  • aegis approve '<cmd>' + rerun passes once, then blocks again
  • python3 tests/held_out_eval/run_held_out.py exits 0

🤖 Generated with Claude Code

teovl and others added 4 commits June 23, 2026 18:06
…ential taint

- install-hooks: PreToolUse/Bash blocks malicious commands (exit 2, L1-only, microsecond latency)
- PostToolUse hooks on WebFetch/Bash/mcp__*: warns on injection in tool results (stdout, non-blocking)
- approve/revoke: one-time SHA-256 hash bypass for blocked commands
- Sliding-window scan (4096-byte window, 512-byte stride): full document coverage, middle-burial-proof
- Credential taint detection: co-occurrence of cred source + network sink flags CRED_TAINT
- WARN tier: L1 fired but judge cleared → surfaces as warning not quarantine
- head_tail_window(): judge sees both head and tail of large docs to defeat truncation attacks
- Judge parsing: first-word + negation exclusion prevents fail-closed FPs
- Eval corpus: 30 labeled files (20 malicious, 10 benign), CI runner exits 1 below thresholds
- Eval: 90% recall / 95% precision / 92% F1 (PASS)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…esses

`aegis scan-pipe` reads raw text from stdin, runs L1+T2 scan (T1 patterns +
context-sensitive T2 patterns), exits 0 (clean) or 2 (blocked, rule printed
to stdout). Designed for integration into agent harnesses that can't use the
Claude Code PreToolUse hook format.

Used by:
- claw-pilot/plugin/src/inbound.ts (every inbound Pilot message)
- picoclaw/agent.mjs (every inbox message before LLM reply)

Usage: printf '%s' "$untrusted_text" | aegis scan-pipe

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- integrations/hermes/plugin.py: Python pre_llm_call hook for the Hermes
  agent harness; scans all message content via aegis scan-pipe, returns None
  to block, fails open on timeout or missing binary
- README: new "Other harness integrations" section covering OpenClaw,
  PicoClaw, Hermes, and generic scan-pipe usage
- Attack surface table extended to cover overlay message vectors and
  pre-LLM scan

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Security fixes applied (in priority order):

1. T1-veto removal: T1/credential-taint hits can no longer be downgraded to
   ALLOW by a judge SAFE verdict. T2-only hits retain the veto (reduces FP).

2. T2 patterns in hook paths: scan-cmd and scan-result were calling
   scan_text(false), skipping all 22 T2 patterns. Zero latency cost to fix.

3. File size cap (10 MB) in all extract_* functions: oversized hostile files
   are now quarantined as SIZE_EXCEEDED instead of OOM-killing the daemon.

4. Atomic per-hash approval tokens: replace read-modify-write allowlist with
   per-hash files in ~/.aegis/approved/. Consumption via atomic fs::remove_file.

5. Fix known_set on intercept failure: process_file returns bool; poll_targets
   only inserts into known_set on success (was silencing files permanently).

6. serde_json in write_scored: replaces format!()-built JSON log entries,
   prevents log corruption via filenames with embedded JSON fragments.

7. Zs-space + combining-mark normalization: add 14 Unicode Zs codepoints to
   ZWC; switch normalize() to NFD with Mn-category filter.

8. Missing Cyrillic small-letter homoglyphs: add t, m, h, a, e mappings.

9. scan-pipe stdin cap (4 MB): exits 2 with SIZE_EXCEEDED on oversized input.

10. Hermes plugin: raise AegisBlockedError instead of returning None; fail
    closed on non-0/non-2 exits; replace plaintext preview with SHA-256 hash
    in audit.jsonl to prevent system-prompt content leaking into the log.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants