UX + observability patch release (0.3.1) #21
Open
Ar9av wants to merge 8 commits into
Tighten destructive-command regex so it only matches genuinely destructive rm targets. False positives like `rm -rf ./node_modules` and `rm -rf /tmp/build` no longer block legitimate cleanup. New patterns cover `rm -rf /`, system dirs, `$HOME`/`~`, `*`, `..`, and `sudo rm -rf`.

Expand policy coverage:
- persistence rules for /etc/cron.d/, /etc/crontab, /etc/systemd/system/, /etc/init.d/, /etc/rc.local, /etc/profile.d/
- tls-verification-disabled rule (git sslVerify false, GIT_SSL_NO_VERIFY, StrictHostKeyChecking=no, curl -k, NODE_TLS_REJECT_UNAUTHORIZED, etc.)
- npm/yarn/pnpm --registry and git+ supply-chain patterns
- shell-obfuscation rule for base64/xxd/printf hex decode-and-execute

Workspace resolution: argparse subparsers declaring --workspace were clobbering the top-level value with None. CLI now scans argv directly and falls back to the PRISMOR_WARDEN_WORKSPACE env var. This also fixes egress allowlist enforcement in `warden check` and `policy init --workspace` writing to the wrong directory, both of which depended on workspace flowing through correctly.

Scanner now reads project-level .claude/settings.json, .mcp.json, and .claude/settings.local.json from the workspace; previously workspace MCP configs were ignored entirely.

should_block() loads default categories from the policy when block_categories is None so the public API works for direct callers (3 previously-failing unit tests now pass).

uninstall-hooks removes cloaking hooks too (userprompt-guard.sh, decloak.sh, recloak-mcp.sh) so the claude settings.json ends up clean instead of partially stripped.

sweep --clean / --restore catch RuntimeError when stdin has no TTY and exit 1 with a clean message pointing at PRISMOR_SWEEP_PASS, instead of raising an unhandled Python traceback.

upgrade_feed.py is now idempotent (no-op if no changes) so the signature isn't invalidated on every run. When it does write, it auto re-signs via pipeline/sign_feed.sh if PRISMOR_SIGNING_PRIVATE_KEY is set; otherwise it prints a loud warning.

SARIF output now populates rules[] with all 53 policy rules including titles, severities, categories, and helpUri, giving GitHub Code Scanning and other consumers full rule metadata.

Minor UX: analyze and session accept positional args in addition to --input / --session-id, and `check` no longer prints the resolved symlink path as a trailing line that breaks parsers.

All 227 tests pass.
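The workspace-resolution fix described in this commit (scan argv directly, then fall back to the environment) could look roughly like the sketch below. The function name `resolve_workspace` and the support for the `--workspace=PATH` form are assumptions made for illustration; only the `--workspace` flag and the `PRISMOR_WARDEN_WORKSPACE` variable come from the commit message.

```python
import os
from typing import Mapping, Optional, Sequence


def resolve_workspace(
    argv: Sequence[str], environ: Mapping[str, str] = os.environ
) -> Optional[str]:
    """Hypothetical helper: find --workspace anywhere in argv, else use the env var."""
    value = None
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg == "--workspace" and i + 1 < len(argv):
            # Space-separated form: --workspace PATH
            value = argv[i + 1]
            i += 2
            continue
        if arg.startswith("--workspace="):
            # Equals form (an assumption; the commit does not mention it).
            value = arg.split("=", 1)[1]
        i += 1
    if value:
        return value
    # Fall back to the environment variable named in the commit message.
    return environ.get("PRISMOR_WARDEN_WORKSPACE") or None
```

Because the value is read from argv itself, it no longer matters whether a subparser's own `--workspace` default overwrites the top-level one.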
Three edge-case fixes surfaced by remote variation testing:

1. rm -rf ../build was incorrectly blocked because the parent-dir pattern matched .. followed by any /. Tightened to .. or ../.. or ../../.. chains (dir-escape), not relative build paths.
2. rm -r -f /etc (and rm -f -r /etc) weren't caught because the lookahead consumed the inter-flag space. Switched to non-consuming (?=\s|$) lookahead so [^\n]* can backtrack onto the -f token.
3. rm --recursive --force / and --force --recursive are now matched via long-form alternation in the same lookahead; bare --recursive without --force is left lenient (consistent with bare -r).
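A minimal sketch of how lookaheads with a non-consuming `(?=\s|$)` boundary can handle all three cases. This is not the pattern shipped in the policy: the `SYSTEM_DIRS` list, the names `FLAG`, `DESTRUCTIVE_RM` and `is_destructive`, and the choice to restrict the lookaheads to the leading flag run are all assumptions made for illustration.

```python
import re

# Hypothetical list of system-critical directories; the real rule's list may differ.
SYSTEM_DIRS = r"(?:etc|usr|bin|sbin|lib|boot|var)"

# One flag token such as -r, -rf, --force (leading whitespace included).
FLAG = r"\s+-{1,2}[A-Za-z-]+"

DESTRUCTIVE_RM = re.compile(
    r"\brm"
    # Lookahead 1: some token in the leading flag run requests recursion.
    rf"(?=(?:{FLAG})*?\s+(?:-[A-Za-z]*[rR][A-Za-z]*|--recursive)(?=\s|$))"
    # Lookahead 2: some token in the leading flag run requests force.
    rf"(?=(?:{FLAG})*?\s+(?:-[A-Za-z]*f[A-Za-z]*|--force)(?=\s|$))"
    # Consume the flags, then require a dangerous target as a whole token:
    # root, a system dir, or a pure chain of ".." segments (dir-escape).
    rf"(?:{FLAG})+\s+"
    rf"(?:/|/{SYSTEM_DIRS}/?|\.\.(?:/\.\.)*/?)"
    r"(?=\s|$)"
)


def is_destructive(command: str) -> bool:
    """Return True when the command looks like a forced recursive rm of a critical target."""
    return DESTRUCTIVE_RM.search(command) is not None
```

Because the boundary check after each flag is itself a lookahead, it does not swallow the space before the next flag, so `-r -f` and `-f -r` are both seen. Requiring the target to end at whitespace or end of line is what lets `../build` through while `..` and `../..` are still caught.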
Old pattern 'nc\s+.*-[a-z]*l[a-z]*\s*.*-p' required -l and -p as separate flags, so nc -lvp 4444 -e /bin/bash (very common reverse shell one-liner) slipped past. Three patterns now cover: nc listener with explicit -e/--exec, nc with combined -lvp/-lp flags, and nc with separate -l and -p flags.
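To make the gap concrete, the sketch below runs the old pattern quoted above next to three illustrative replacements. The replacement regexes and rule names are assumptions written for this example, not the patterns in the policy file. The first one is deliberately simpler than the description above: it fires on any `nc` with `-e`/`--exec`, listener or not.

```python
import re

# Old pattern, copied from the commit message above.
OLD_PATTERN = re.compile(r"nc\s+.*-[a-z]*l[a-z]*\s*.*-p")

# Hypothetical replacements, one per case named in the commit message.
NC_PATTERNS = {
    # nc with an explicit -e / --exec (simplified: listener flag not required).
    "nc-exec": re.compile(r"\bnc\b[^\n]*\s(?:-e|--exec)(?=\s|=)"),
    # -l and -p combined in one token, e.g. -lvp or -lp.
    "nc-combined-listen-port": re.compile(
        r"\bnc\b[^\n]*\s-[a-z]*l[a-z]*p[a-z]*(?=\s|$)"
    ),
    # A token containing l and a token containing p, in any order.
    # This also fires on combined tokens; the overlap is harmless here.
    "nc-listen-and-port": re.compile(
        r"\bnc\b"
        r"(?=[^\n]*\s-[a-z]*l[a-z]*(?=\s|$))"
        r"(?=[^\n]*\s-[a-z]*p[a-z]*(?=\s|$))"
    ),
}


def matched_rules(command: str) -> list:
    """Return the names of every illustrative pattern that fires on the command."""
    return [name for name, pattern in NC_PATTERNS.items() if pattern.search(command)]
```

With these definitions, `OLD_PATTERN.search("nc -lvp 4444 -e /bin/bash")` returns `None` because the string never contains a literal `-p`, while `matched_rules` reports a hit for the same command.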
- rm -rf "/" and rm -rf '/' now blocked (quoted target)
- /dev/tcp/<hostname>/port now caught (was digit-only)
- git -c http.sslVerify=false clone caught
- curl -sk, -ksL, -Lk (combined flag) caught
- npm/yarn/pnpm --registry flag caught regardless of position (npm i short form, registry flag before or after install/i/add)
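Two of the items above lend themselves to a short illustration: matching `-k` inside a combined short-flag token, and accepting a hostname in `/dev/tcp/`. The regexes and names below are assumptions for this sketch, not the shipped rules; `--insecure` is curl's long form of `-k` and is included here as an extra.

```python
import re

# Hypothetical pattern: any short-flag token containing k (-k, -sk, -ksL, -Lk),
# or the long form --insecure. Heuristic only: a glued option value such as
# "-ok.txt" would be a false positive.
CURL_INSECURE = re.compile(
    r"\bcurl\b[^\n]*\s(?:-[A-Za-z]*k[A-Za-z]*|--insecure)(?=\s|$)"
)

# Hypothetical pattern: host part may be a hostname or an IP address.
DEV_TCP = re.compile(r"/dev/tcp/[A-Za-z0-9.-]+/\d+")


def disables_tls(command: str) -> bool:
    """Return True when a curl invocation appears to skip certificate checks."""
    return CURL_INSECURE.search(command) is not None


def opens_dev_tcp(command: str) -> bool:
    """Return True when the command references a bash /dev/tcp/<host>/<port> socket."""
    return DEV_TCP.search(command) is not None
```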
Previously only the root / case accepted quoted targets, so rm -rf "/etc" slipped past while rm -rf "/" was blocked. Extended optional quote match to the system-critical-path, top-level-dir, and $HOME/~ branches for consistency.
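One way to accept an optional quote consistently across branches is a captured opening quote with a backreference for the closing one. The sketch below assumes a fixed `-rf` flag and a tiny target list to keep the example short; the real rule's flags and targets are broader, and the name `QUOTED_TARGET` is hypothetical.

```python
import re

# Group 1 captures an optional opening quote; \1 requires the same closing quote.
# Targets here (/, /etc, ~, $HOME) are a small illustrative subset.
QUOTED_TARGET = re.compile(
    r"""\brm\s+-rf\s+(["']?)(?:/etc|/|~|\$HOME)\1(?=\s|$)"""
)


def hits_quoted_target(command: str) -> bool:
    """Return True for quoted or unquoted forms of the illustrative targets."""
    return QUOTED_TARGET.search(command) is not None
```

The backreference also rejects mismatched quotes such as `"/etc'`, which a plain `["']?` on both sides would accept.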
Implements every Tier-1 item from IMPROVEMENT_PLAN.md:
New subsystems
- warden canary plant|list|remove|status — honey-token credentials
with webhook beacons. First AI-agent-specific canary impl.
- warden/sinks.py — webhook, syslog (UDP/TCP), and file (JSON/CEF)
SIEM sinks configured via settings.outputs in policy.yaml.
- warden/policy_test.py — declarative test runner; bundled OWASP LLM
Top 10 + Agentic Top 10 + MITRE ATLAS starter pack (28 cases).
Detection coverage
- agent-instruction-tampering rule: CLAUDE.md, AGENTS.md,
.cursorrules, .windsurfrules, .github/copilot-instructions.md.
- MCP schema auditor: overbroad allowlists, any-typed params on
exec-capable tools, missing input schemas, fs+net+exec
capability combination, risky description language.
- Unicode homoglyph detection: Cyrillic / Greek / Latin-extended
confusables, fullwidth letters, zero-width joiners.
- Lockfile integrity: non-registry sources, missing integrity hashes,
lockfile-injection (deps in lock but not in package.json).
check ergonomics
- --explain prints matched rule + full pattern.
- --from-log PATH replays a JSONL session log through current policy.
- --suggest-allowlist emits a ready-to-paste allowlist entry.
Bump to 0.3.0; CHANGELOG added. 227/227 unit tests pass; 28/28 OWASP
starter cases pass.
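As an illustration of the Unicode homoglyph item in this commit, a character-level scan can flag zero-width characters, fullwidth letters, and Cyrillic or Greek letters. This is a simplified sketch under stated assumptions: the function names and the zero-width set are chosen for the example, Latin-extended confusables are not covered, and a production detector would likely flag only mixed-script tokens rather than every non-Latin letter.

```python
import unicodedata
from typing import List, Optional, Tuple

# Common zero-width code points (an illustrative set, not exhaustive).
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}


def classify(ch: str) -> Optional[str]:
    """Return a category label for a suspicious character, or None if it looks benign."""
    if ch in ZERO_WIDTH:
        return "zero-width"
    # Fullwidth Latin capital and small letters.
    if "\uff21" <= ch <= "\uff3a" or "\uff41" <= ch <= "\uff5a":
        return "fullwidth"
    name = unicodedata.name(ch, "")
    if name.startswith("CYRILLIC"):
        return "cyrillic"
    if name.startswith("GREEK"):
        return "greek"
    return None


def find_suspicious(text: str) -> List[Tuple[int, str, str]]:
    """Return (index, code point, category) for each flagged character."""
    findings = []
    for i, ch in enumerate(text):
        kind = classify(ch)
        if kind is not None:
            findings.append((i, f"U+{ord(ch):04X}", kind))
    return findings
```

For example, `find_suspicious("p\u0430ypal")` reports the Cyrillic letter at index 1, which renders like a Latin `a`.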
Surfaces six issues that the end-to-end bench (120-attack / 539-benign / 1,000-call latency) flagged. All changes are behaviour-compatible for existing policy authors; SDK embedders gain two new entry points.

Fixed:
- install-hooks idempotent: re-run with different --mode now replaces the existing Warden hook instead of appending a duplicate entry.
- Session-log events carry ISO 8601 UTC timestamps (was null, broke SIEM correlation).
- hook-dispatch surfaces WARN-level findings to stderr as [warden WARN] [SEV] rule-id: title (was silent; exit 0 with no feedback meant HIGH-severity warns could fire invisibly).

Added:
- PolicyEngine.from_defaults() classmethod for embedders.
- PolicyEngine.evaluate(event) default args (index=0, session_id="").
- Event-type aliases: command→shell, read→file_read, write→file_write.
- warden policy show --json for scripting.
- Subcommand-scoped argparse errors (shows the subcommand's usage, not the top-level wall).
- install-hooks prints scope+mode and hints at --scope user for global coverage.

Tests: 236/236 pass (227 existing + 9 new regression tests for idempotency, timestamp, embedder API).
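The timestamp fix and the event-type aliases are both small enough to sketch. The helper names `make_ts` and `normalize_type` and the `timestamp` payload key are assumptions for this example; the alias table mirrors the mapping listed in the commit message.

```python
from datetime import datetime, timezone

# Alias table taken from the commit message: CLI-style name -> engine name.
TYPE_ALIASES = {"command": "shell", "read": "file_read", "write": "file_write"}


def make_ts(payload: dict) -> str:
    """Return the payload's own timestamp if present, else an ISO 8601 UTC stamp for now."""
    ts = payload.get("timestamp")
    if ts:
        return str(ts)
    now = datetime.now(timezone.utc)
    # isoformat() yields "...+00:00"; swap in the compact "Z" suffix.
    return now.isoformat(timespec="microseconds").replace("+00:00", "Z")


def normalize_type(event_type: str) -> str:
    """Map a CLI-style alias to its engine event type; unknown names pass through."""
    return TYPE_ALIASES.get(event_type, event_type)
```

Stamping on receipt means the recorded time is when the hook saw the event, not when the agent issued it, which should still be sufficient for ordering events within a session log.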
Summary
UX and observability patch release based on the end-to-end benchmark (120-attack / 539-benign / 1,000-call latency) run against `tier1-ux-coverage`. All nine changes are behaviour-compatible for existing policy authors; SDK embedders gain two new entry points.

Fixes
- `install-hooks` appended a duplicate when re-run with a different `--mode`, leaving both `observe` and `enforce` commands in `settings.json`. Fix: existing Warden entries (matched by the `warden/cli.py` marker) are removed before merging, matching uninstall semantics. Idempotent by construction.
- Session-log events carried `"ts": null` for every agent (Claude/Cursor/Windsurf/Hermes/Openclaw/Copilot) because payloads don't include `timestamp`. SIEM correlation broke. Fix: a `_make_ts(payload)` helper returns an ISO 8601 UTC stamp on receipt when the payload lacks one. Applied to all six `_normalize_*` functions.
- `hook-dispatch` was silent on `action: warn` rules: exit 0, empty stderr, HIGH-severity warnings invisible to the user and the agent. Fix: it now prints `[warden WARN] [SEV] rule-id: title` to stderr regardless of mode. `action: block` behaviour is unchanged.

Additions (new entry points)
- `PolicyEngine.evaluate` accepts CLI-style type aliases (`command`→`shell`, `read`→`file_read`, `write`→`file_write`) so an embedder can pass whichever vocabulary is natural for its layer. File: `warden/policy_engine.py`
- `PolicyEngine.evaluate(event)`: `index` and `session_id` now default, so the common SDK call site works without boilerplate. File: `warden/policy_engine.py`
- Argparse errors are scoped to the subcommand: they print that subcommand's usage and a `warden <sub>: error: ...` line instead of dumping the top-level help wall. File: `warden/cli.py::build_parser`
- `PolicyEngine.from_defaults(workspace=...)` classmethod as the explicit SDK factory. File: `warden/policy_engine.py`
- `warden install-hooks` prints scope + mode and, when at project scope, a one-line hint about `--scope user`. File: `warden/cli.py`
- `warden policy show --json` emits the rule list as structured JSON (workspace, project-policy path, per-rule id/severity/category/action/event-types/fields, allowlists) for scripts. File: `warden/cli.py`

Test plan
- `pytest -x -q`: 236/236 pass (227 existing + 9 new regression tests covering idempotency, timestamp, and the embedder API).
- Remote check (`ubuntu@44.214.4.208`):
  - `install-hooks --mode observe` then `--mode enforce` leaves exactly one Warden command per hook event (3 total in Claude's three-event install); only `--mode enforce` persists.
  - `session.jsonl` last entry has `"ts": "2026-04-24T15:42:04.005098Z"` instead of `null`.
  - `cat /etc/shadow` now prints `[warden WARN] [HIGH] path-traversal: ...` to stderr, exit 0.
  - `PolicyEngine.from_defaults().evaluate({"type":"command","command":"rm -rf /"})` returns a `destructive-command` finding.
  - `warden check --command foo` now prints the `check` subparser usage and `warden check: error: unrecognized arguments: --command`.
  - `warden install-hooks --agent claude --mode enforce` prints the scope hint.
  - `warden policy show --json` emits parseable JSON with `activeRuleCount: 54`.

Bench artefacts
The pre-0.3.1 run of the 120-attack / 539-benign / 1,000-call benchmark is in `/Users/ar9av/Downloads/warden-bench/`. The write-up is in `/Users/ar9av/Downloads/warden-bench-report.md` and `/Users/ar9av/Downloads/warden-paper/warden_benchmark.pdf`. All nine fixes map to issues disclosed in §7 of the bench report ("What we found that we didn't know before").

Not in this PR
- Session tailing (`warden sessions tail --limit N` across all session files). New CLI feature, not a fix.
- Detection-coverage gaps (`claude-credential-access` missing top-level `~/.claude.json`, no rule for `scp -r ~/.ssh`, `--registry=URL` equals form, etc.), tracked as a separate detection-coverage patch, distinct from UX.