Commit dd9c36d

feat: sync all v3.0.5 changes — security skills, PR review flow, schemas, images
Recovers 21 new files and 19 updated files that were published to npm but never committed to GitHub. Includes:
- Security skills pack (commit-scan, security-review, threat-model, vuln-validation)
- PR review & strategic fix flow (pr-scope.cjs, fix-pipeline updates)
- New schemas (fix-plan, fix-strategy)
- Generated hero/feature images for docs
- OpenAI agent config (agents/openai.yaml)
- Updated CHANGELOG, README, SKILL.md with v3.0.5 content
- All 60 tests passing
1 parent 5860831 commit dd9c36d

41 files changed

Lines changed: 3018 additions & 183 deletions

CHANGELOG.md

Lines changed: 35 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -5,10 +5,43 @@ All notable changes to this project will be documented in this file.
55
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
66
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
77

8+
## [3.0.5] — 2026-03-11
9+
10+
### Added
11+
- `agents/openai.yaml` UI metadata for skill lists and quick-invoke prompts
12+
13+
### Changed
14+
- `SKILL.md` frontmatter now validates cleanly against the `skill-creator` validator
15+
- `evals/evals.json` now matches the current `.bug-hunter/*` JSON-first pipeline, default loop/fix behavior, and modern flags like `--deps`, `--threat-model`, `--dry-run`, and `--autonomous`
16+
- npm package files now include the `agents/` directory so `openai.yaml` ships with the published skill
17+
818
## [Unreleased]
919

20+
### Highlights
21+
- PR review is now a first-class workflow with `--pr`, `--pr current`, `--pr recent`, `--pr 123`, `--last-pr`, and `--pr-security`.
22+
- Bug Hunter now emits both `fix-strategy.json` and `fix-plan.json` before fix execution so remediation stays reviewable and confidence-gated.
23+
- The enterprise security pack now ships inside the repository under `skills/`, making PR security review and full security audits portable.
24+
- Fix execution is now safer through schema-validated planning, atomic lock handling, safer worktree cleanup, stash preservation, and shell-safe templating.
25+
1026
### Added
1127
- GitHub Actions npm publish workflow on release publish or manual dispatch, with version/tag verification before `npm publish`
28+
- bundled local security skills under `skills/`: `commit-security-scan`, `security-review`, `threat-model-generation`, and `vulnerability-validation`
29+
- enterprise security entrypoints: `--pr-security`, `--security-review`, and `--validate-security`
30+
- regression tests and eval coverage for integrated local security-skill routing
31+
- `schemas/fix-plan.schema.json` plus validation coverage for canonical fix-plan artifacts
32+
- focused regressions for lock-token ownership, atomic lock acquisition, stale artifact clearing, shell-safe worker paths, failed-chunk fix-plan suppression, managed worktree cleanup, and stash-ref preservation
33+
34+
### Changed
35+
- portable security capabilities now live inside the repository under `skills/` instead of depending on external machine-specific skill paths
36+
- package metadata now ships the `skills/` directory for self-contained distribution
37+
- main Bug Hunter orchestration now routes into the bundled local security skills for PR security review, threat-model generation, enterprise security review, and vulnerability validation
38+
- fix-lock now uses owner tokens for renew/release, atomic acquisition under contention, and safe recovery from corrupted lock files
39+
- run-bug-hunter now shell-quotes templated command arguments, clears stale artifacts before retries, validates fix-plan artifacts, and skips fix-plan emission when chunks fail
40+
- worktree cleanup/status now preserve unrelated directories, preserve stash metadata from defensive harvests, and avoid reporting manifest-only worktrees as dirty
41+
- current-PR git fallback now diffs against the discovered `origin/<default-branch>` ref when the base branch comes from `origin/HEAD`
42+
- README now opens with a short “New in This Update” and PR-first quick-start section
43+
- `llms.txt` and `llms-full.txt` now describe the PR review flow, bundled local security pack, current fix artifacts, and the current regression-test coverage
44+
- `skills/README.md` now explains how the bundled security skills map into Bug Hunter workflows
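The "shell-quotes templated command arguments" entry above can be illustrated with a minimal sketch. This is not the code in `run-bug-hunter.cjs`; the helper names `shellQuote` and `fillTemplate` and the `{name}` placeholder syntax are assumptions made for illustration only. It uses standard POSIX single-quote escaping.

```javascript
// Hypothetical sketch: POSIX-style quoting for values substituted into a
// command template. Names and placeholder syntax are illustrative assumptions.
function shellQuote(value) {
  const s = String(value);
  if (s === '') return "''";
  // Leave plainly safe tokens unquoted so commands stay readable.
  if (/^[A-Za-z0-9_\/:=.,@%+-]+$/.test(s)) return s;
  // Wrap in single quotes; close, escape, and reopen around each embedded quote.
  return "'" + s.replace(/'/g, "'\\''") + "'";
}

// Replace {key} placeholders with quoted values; unknown keys are left as-is.
function fillTemplate(template, values) {
  return template.replace(/\{(\w+)\}/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(values, key)
      ? shellQuote(values[key])
      : match
  );
}
```

Quoting at substitution time means a path containing spaces or `;` reaches the worker as a single argument instead of being re-parsed by the shell.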
1245

1346
## [3.0.4] — 2026-03-11
1447

@@ -167,7 +200,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
167200
- Coverage enforcement — partial audits produce explicit warnings
168201
- Large codebase strategy with domain-first tiered scanning
169202

170-
[Unreleased]: https://github.com/codexstar69/bug-hunter/compare/v3.0.4...HEAD
203+
[Unreleased]: https://github.com/codexstar69/bug-hunter/compare/v3.0.5...HEAD
204+
[3.0.5]: https://github.com/codexstar69/bug-hunter/compare/v3.0.4...v3.0.5
171205
[3.0.4]: https://github.com/codexstar69/bug-hunter/compare/v3.0.3...v3.0.4
172206
[3.0.3]: https://github.com/codexstar69/bug-hunter/compare/v3.0.2...v3.0.3
173207
[3.0.2]: https://github.com/codexstar69/bug-hunter/compare/v3.0.1...v3.0.2
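The fix-lock entry in the changelog above (owner tokens for release, atomic acquisition under contention) can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of `fix-lock.cjs`: the function names, the lock-file JSON shape, and the choice to refuse release on a corrupted file are all hypothetical.

```javascript
const fs = require('node:fs');
const crypto = require('node:crypto');

// Hypothetical sketch: the 'wx' flag makes creation fail if the file exists,
// so only one contender can create the lock file.
function acquireLock(lockPath) {
  const token = crypto.randomUUID();
  try {
    fs.writeFileSync(lockPath, JSON.stringify({ token, pid: process.pid }), { flag: 'wx' });
    return token; // caller keeps this token to prove ownership later
  } catch (err) {
    if (err.code === 'EEXIST') return null; // someone else holds the lock
    throw err;
  }
}

// Release only when the caller's token matches the one stored in the file.
function releaseLock(lockPath, token) {
  let current;
  try {
    current = JSON.parse(fs.readFileSync(lockPath, 'utf8'));
  } catch {
    return false; // missing or corrupted lock file: do not delete blindly
  }
  if (!current || current.token !== token) return false;
  fs.unlinkSync(lockPath);
  return true;
}
```

The sketch leaves out renewal, stale-lock expiry, and corrupted-file recovery, all of which the changelog mentions. The read-then-unlink in `releaseLock` is also not atomic, so a production version would need more care.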

README.md

Lines changed: 140 additions & 9 deletions
Large diffs are not rendered by default.

SKILL.md

Lines changed: 68 additions & 12 deletions
@@ -1,8 +1,6 @@
11
---
22
name: bug-hunter
33
description: "Adversarial bug hunting with a sequential-first pipeline (Recon, Hunter, Skeptic, Referee) that can optionally use safe read-only parallel triage. Finds, verifies, and auto-fixes real bugs by default (with --scan-only opt-out) using checkpointed verification and resume state for large codebases. Use this skill whenever the user wants bug finding, security audits, regression checks, or code review focused on runtime behavior."
4-
argument-hint: "[path | -b <branch> [--base <base-branch>] | --staged | --scan-only | --fix | --autonomous | --no-loop | --approve | --deps | --threat-model | --dry-run]"
5-
disable-model-invocation: true
64
---
75

86
# Bug Hunt - Adversarial Bug Finding
@@ -44,9 +42,21 @@ For large scans: process chunks sequentially with persistent state to avoid comp
4442
/bug-hunter lib/auth.ts # Scan specific file
4543
/bug-hunter -b feature-xyz # Scan files changed in feature-xyz vs main
4644
/bug-hunter -b feature-xyz --base dev # Scan files changed in feature-xyz vs dev
45+
/bug-hunter --pr # Easy alias for --pr current
46+
/bug-hunter --pr current # Review the current PR end to end
47+
/bug-hunter --pr recent --scan-only # Review the most recent PR without editing code
48+
/bug-hunter --pr 123 # Review a specific PR number
49+
/bug-hunter --pr-security # PR security review: PR scope + threat model + dependency scan
50+
/bug-hunter --last-pr --review # Easy mnemonic for “review the last PR”
51+
/bug-hunter --review-pr # Alias for --pr current
4752
/bug-hunter --staged # Scan staged files (pre-commit check)
4853
/bug-hunter --scan-only src/ # Scan only, no code changes
54+
/bug-hunter --review src/ # Easy alias for --scan-only
4955
/bug-hunter --fix src/ # Find bugs AND auto-fix them
56+
/bug-hunter --plan-only src/ # Build fix strategy + plan, but do not edit files
57+
/bug-hunter --plan src/ # Easy alias for --plan-only
58+
/bug-hunter --safe src/ # Easy alias for --fix --approve
59+
/bug-hunter --preview src/ # Easy alias for --fix --dry-run
5060
/bug-hunter --autonomous src/ # Alias for no-intervention auto-fix run
5161
/bug-hunter --fix -b feature-xyz # Find + fix on branch diff
5262
/bug-hunter --fix --approve src/ # Find + fix, but ask before each fix
@@ -55,6 +65,8 @@ For large scans: process chunks sequentially with persistent state to avoid comp
5565
/bug-hunter --no-loop --scan-only src/ # Single-pass scan, no fixes, no loop
5666
/bug-hunter --deps src/ # Include dependency CVE scan
5767
/bug-hunter --threat-model src/ # Generate/use STRIDE threat model
68+
/bug-hunter --security-review src/ # Enterprise security workflow: threat model + CVEs + validation
69+
/bug-hunter --validate-security src/ # Force vulnerability-validation for security findings
5870
/bug-hunter --deps --threat-model src/ # Full security audit
5971
/bug-hunter --fix --dry-run src/ # Preview fixes without editing files
6072
```
@@ -75,24 +87,46 @@ The raw arguments are: $ARGUMENTS
7587
0g. If arguments contain `--deps`: strip it and set `DEP_SCAN=true`. Dependency scanning runs package manager audit tools and checks if vulnerable APIs are actually called in the codebase.
7688
0h. If arguments contain `--threat-model`: strip it and set `THREAT_MODEL_MODE=true`. Generates a STRIDE threat model at `.bug-hunter/threat-model.md` if one doesn't exist, then feeds it to Recon + Hunter for targeted security analysis.
7789
0i. If arguments contain `--dry-run`: strip it and set `DRY_RUN_MODE=true`. Forces `FIX_MODE=true`. In dry-run mode, Phase 2 builds the fix plan and the Fixer reads code and outputs planned changes as unified diff previews, but no file edits, git commits, or lock acquisition occur. Produces `fix-report.json` with `"dry_run": true`.
90+
0j. If arguments contain `--preview`: strip it, set `DRY_RUN_MODE=true`, and force `FIX_MODE=true`. Treat it as a memorable alias for `--fix --dry-run`.
91+
0k. If arguments contain `--plan-only`: strip it and set `PLAN_ONLY_MODE=true`. The pipeline still scans, verifies, and builds `fix-strategy.json` + `fix-plan.json`, but it stops before the Fixer edits code.
92+
0l. If arguments contain `--plan`: strip it and set `PLAN_ONLY_MODE=true`. Treat it as a memorable alias for `--plan-only`.
93+
0m. If arguments contain `--review-pr`: strip it and treat it as `--pr current`.
94+
0n. If arguments contain `--pr` with no selector after it, treat it as `--pr current`.
95+
0o. If arguments contain `--last-pr`: strip it and treat it as `--pr recent`.
96+
0p. If arguments contain `--review`: strip it and set `FIX_MODE=false`. Treat it as a memorable alias for `--scan-only`.
97+
0q. If arguments contain `--safe`: strip it, set `FIX_MODE=true`, and set `APPROVE_MODE=true`. Treat it as a memorable alias for `--fix --approve`.
98+
0r. If arguments contain `--pr-security`: strip it, set `PR_SECURITY_MODE=true`, force `DEP_SCAN=true`, force `THREAT_MODEL_MODE=true`, force `FIX_MODE=false`, and if no explicit `--pr` selector was provided treat it as `--pr current`.
99+
0s. If arguments contain `--security-review`: strip it, set `SECURITY_REVIEW_MODE=true`, force `DEP_SCAN=true`, force `THREAT_MODEL_MODE=true`, and force `FIX_MODE=false`.
100+
0t. If arguments contain `--validate-security`: strip it and set `VALIDATE_SECURITY_MODE=true`.
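Rules 0j through 0t above amount to a small flag-normalization pass. A minimal sketch of that pass, assuming the arguments arrive as an array of tokens, might look like the following. The function name `normalizeFlags` and the shape of the returned state object are hypothetical; in the skill these rules are instructions to the agent, not a shipped parser.

```javascript
// Hypothetical sketch of rules 0j-0t. fixMode stays null when no alias
// touches it, so the caller's default remains in effect.
function normalizeFlags(argv) {
  const state = {
    fixMode: null, approveMode: false, dryRunMode: false, planOnlyMode: false,
    depScan: false, threatModelMode: false, prSecurityMode: false,
    securityReviewMode: false, validateSecurityMode: false,
    prSelector: null, rest: [],
  };
  const isSelector = (s) => s === 'current' || s === 'recent' || /^\d+$/.test(s ?? '');

  for (let i = 0; i < argv.length; i++) {
    switch (argv[i]) {
      case '--preview': state.dryRunMode = true; state.fixMode = true; break;
      case '--plan-only':
      case '--plan': state.planOnlyMode = true; break;
      case '--review-pr': state.prSelector = 'current'; break;
      case '--last-pr': state.prSelector = 'recent'; break;
      case '--pr':
        // A bare --pr (no valid selector after it) means --pr current.
        state.prSelector = isSelector(argv[i + 1]) ? argv[++i] : 'current';
        break;
      case '--review': state.fixMode = false; break;
      case '--safe': state.fixMode = true; state.approveMode = true; break;
      case '--pr-security': state.prSecurityMode = true; break;
      case '--security-review': state.securityReviewMode = true; break;
      case '--validate-security': state.validateSecurityMode = true; break;
      default: state.rest.push(argv[i]); // other flags and paths pass through
    }
  }

  // "Force" rules are applied last so they win regardless of flag order.
  if (state.prSecurityMode || state.securityReviewMode) {
    state.depScan = true;
    state.threatModelMode = true;
    state.fixMode = false;
  }
  if (state.prSecurityMode && state.prSelector === null) state.prSelector = 'current';
  return state;
}
```

Applying the forced settings after the loop is a design choice of this sketch: it keeps `--pr-security --fix` and `--fix --pr-security` from behaving differently.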
101+
102+
1. If arguments contain `--pr <selector>`: this is **PR review mode**.
103+
- Valid selectors: `current`, `recent`, or a PR number like `123`.
104+
- If `--base <base-branch>` is present, pass it through for current-branch git fallback.
105+
- Run:
106+
```bash
107+
node "$SKILL_DIR/scripts/pr-scope.cjs" resolve "<selector>" --repo-root "$PWD" [--base <base-branch>]
108+
```
109+
- If it fails, report the error to the user and stop.
110+
- Save the JSON result to `.bug-hunter/pr-scope.json` for later reporting.
111+
- Use `changedFiles` from the JSON output as the scan target (scan full file contents, not just the diff).
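The PR review steps above can be sketched as a small Node wrapper. The command line and the `changedFiles` field come from the text; everything else is an assumption made for illustration: the helper names, the idea that `pr-scope.cjs` prints its JSON result on stdout, and the error handling.

```javascript
const fs = require('node:fs');
const path = require('node:path');
const { execFileSync } = require('node:child_process');

// Build the argument list for: node pr-scope.cjs resolve <selector> --repo-root <root> [--base <base>]
function buildPrScopeArgs(skillDir, selector, repoRoot, base) {
  const args = [
    path.join(skillDir, 'scripts', 'pr-scope.cjs'),
    'resolve', selector,
    '--repo-root', repoRoot,
  ];
  if (base) args.push('--base', base); // only for current-branch git fallback
  return args;
}

// Extract the scan target from the resolver's JSON output.
function readChangedFiles(jsonText) {
  const scope = JSON.parse(jsonText);
  if (!scope || !Array.isArray(scope.changedFiles)) {
    throw new Error('pr-scope output has no changedFiles array');
  }
  return scope.changedFiles;
}

// Assumes the resolver writes JSON to stdout; execFileSync throws on a
// non-zero exit, which matches "report the error to the user and stop".
function resolvePrScope(skillDir, selector, repoRoot, base) {
  const out = execFileSync('node', buildPrScopeArgs(skillDir, selector, repoRoot, base), {
    encoding: 'utf8',
  });
  const changedFiles = readChangedFiles(out);
  const outDir = path.join(repoRoot, '.bug-hunter');
  fs.mkdirSync(outDir, { recursive: true });
  fs.writeFileSync(path.join(outDir, 'pr-scope.json'), out); // kept for later reporting
  return changedFiles;
}
```

Using `execFileSync` with an argument array avoids shell interpolation of the selector and paths.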
78112

79-
1. If arguments contain `--staged`: this is **staged file mode**.
113+
2. If arguments contain `--staged`: this is **staged file mode**.
80114
- Run `git diff --cached --name-only` using the Bash tool to get the list of staged files.
81115
- If the command fails, report the error to the user and stop.
82116
- If no files are staged, tell the user there are no staged changes to scan and stop.
83117
- The scan target is the list of staged files (scan their full contents, not just the diff).
84118

85-
2. If arguments contain `-b <branch>`: this is **branch diff mode**.
119+
3. If arguments contain `-b <branch>`: this is **branch diff mode**.
86120
- Extract the branch name after `-b`.
87121
- If `--base <base-branch>` is also present, use that as the base branch. Otherwise default to `main`.
88122
- Run `git diff --name-only <base>...<branch>` using the Bash tool to get the list of changed files.
89123
- If the command fails (e.g. branch not found), report the error to the user and stop.
90124
- If no files changed, tell the user there are no changes to scan and stop.
91125
- The scan target is the list of changed files (scan their full contents, not just the diff).
92126

93-
3. If arguments do NOT contain `-b` or `--staged`: treat the entire argument string as a **path target** (file or directory). If empty, scan the current working directory.
127+
4. If arguments do NOT contain `--pr`, `-b`, or `--staged`: treat the entire argument string as a **path target** (file or directory). If empty, scan the current working directory.
94128

95-
**After resolving the file list (for modes 1 and 2), filter out non-source files:**
129+
**After resolving the file list (for modes 1, 2, and 3), filter out non-source files:**
96130

97131
Remove any files matching these patterns — they are not scannable source code:
98132
- Docs/text: `*.md`, `*.txt`, `*.rst`, `*.adoc`
@@ -169,7 +203,7 @@ Before doing anything else, verify the environment:
169203
170204
5. **Verify helper scripts exist**:
171205
```
172-
ls "$SKILL_DIR/scripts/run-bug-hunter.cjs" "$SKILL_DIR/scripts/bug-hunter-state.cjs" "$SKILL_DIR/scripts/delta-mode.cjs" "$SKILL_DIR/scripts/payload-guard.cjs" "$SKILL_DIR/scripts/fix-lock.cjs" "$SKILL_DIR/scripts/triage.cjs" "$SKILL_DIR/scripts/doc-lookup.cjs"
206+
ls "$SKILL_DIR/scripts/run-bug-hunter.cjs" "$SKILL_DIR/scripts/bug-hunter-state.cjs" "$SKILL_DIR/scripts/delta-mode.cjs" "$SKILL_DIR/scripts/payload-guard.cjs" "$SKILL_DIR/scripts/fix-lock.cjs" "$SKILL_DIR/scripts/triage.cjs" "$SKILL_DIR/scripts/doc-lookup.cjs" "$SKILL_DIR/scripts/pr-scope.cjs"
173207
```
174208
If any are missing, stop and tell the user to update/reinstall the skill.
175209
Note: `code-index.cjs` is optional — enables cross-domain dependency analysis for boundary audits in large-codebase mode, but the pipeline works fully without it.
@@ -249,10 +283,10 @@ Before doing anything else, verify the environment:
249283
250284
### Step 1: Parse arguments, resolve target, and run triage
251285
252-
Follow the rules in the **Target** section above. If in branch diff or staged mode, run the appropriate git command now, collect the file list, and apply the filter.
286+
Follow the rules in the **Target** section above. If in PR review, branch diff, or staged mode, run the appropriate resolver command now, collect the file list, and apply the filter.
253287
254288
Report to the user:
255-
- Mode (full project / directory / file / branch diff / staged)
289+
- Mode (full project / directory / file / PR review / branch diff / staged)
256290
- Number of source files to scan (after filtering)
257291
- Number of files filtered out
258292
@@ -304,7 +338,10 @@ Proceeding with partial scan — highest-priority queued files only.
304338
### Step 1b: Generate threat model (if --threat-model)
305339

306340
If `THREAT_MODEL_MODE=true`:
307-
1. Check if `.bug-hunter/threat-model.md` already exists.
341+
1. Read the bundled local skill `SKILL_DIR/skills/threat-model-generation/SKILL.md` before generating the threat model. This keeps the enterprise security pack end-to-end connected to the main Bug Hunter flow.
342+
2. Use the bundled skill's Bug Hunter-native artifact conventions (`.bug-hunter/threat-model.md`, `.bug-hunter/security-config.json`).
343+
344+
3. Check if `.bug-hunter/threat-model.md` already exists.
308345
- If it exists and was modified within the last 90 days: use it as-is. Set `THREAT_MODEL_AVAILABLE=true`.
309346
- If it exists but is >90 days old: warn user ("Threat model is N days old — regenerating"), regenerate.
310347
- If it doesn't exist: generate it.
@@ -321,7 +358,10 @@ If `THREAT_MODEL_MODE=false` but `.bug-hunter/threat-model.md` exists:
321358

322359
### Step 1c: Dependency scan (if --deps)
323360

324-
If `DEP_SCAN=true`:
361+
If `DEP_SCAN=true` or `SECURITY_REVIEW_MODE=true` or `PR_SECURITY_MODE=true`:
362+
- Read the bundled local skill `SKILL_DIR/skills/security-review/SKILL.md` when running the broader enterprise security workflow.
363+
364+
If `DEP_SCAN=true`:
325365
```bash
326366
node "$SKILL_DIR/scripts/dep-scan.cjs" --target "<TARGET_PATH>" --output .bug-hunter/dep-findings.json
327367
```
@@ -335,15 +375,23 @@ If `.bug-hunter/dep-findings.json` exists with REACHABLE findings, include them
335375

336376
### Step 2: Read prompt files on demand (context efficiency)
337377

378+
**Security-pack routing:**
379+
- If `PR_SECURITY_MODE=true`, read `SKILL_DIR/skills/commit-security-scan/SKILL.md` before the normal PR-review scan.
380+
- If `SECURITY_REVIEW_MODE=true`, read `SKILL_DIR/skills/security-review/SKILL.md` before the broader security audit flow.
381+
- If `VALIDATE_SECURITY_MODE=true`, read `SKILL_DIR/skills/vulnerability-validation/SKILL.md` before finalizing confirmed security findings.
382+
338383
**MANDATORY**: You MUST read prompt files using the Read tool before passing them to subagents or executing them yourself. Do NOT skip this or act from memory. Use the absolute SKILL_DIR path resolved in Step 0.
339384

340385
**Load only what you need for each phase — do NOT read all files upfront:**
341386

342387
| Phase | Read These Files |
343388
|-------|-----------------|
344-
| Threat Model (Step 1b) | `prompts/threat-model.md` (only if THREAT_MODEL_MODE=true) |
389+
| PR security review | `skills/commit-security-scan/SKILL.md` (if `PR_SECURITY_MODE=true` or the user asks for PR-focused security review) |
390+
| Security review | `skills/security-review/SKILL.md` (if `SECURITY_REVIEW_MODE=true` or the user asks for an enterprise/full security audit) |
391+
| Threat Model (Step 1b) | `skills/threat-model-generation/SKILL.md` + `prompts/threat-model.md` (only if THREAT_MODEL_MODE=true) |
345392
| Recon (Step 4) | `prompts/recon.md` (skip for single-file mode) |
346393
| Hunters (Step 5) | `prompts/hunter.md` + `prompts/doc-lookup.md` + `prompts/examples/hunter-examples.md` |
394+
| Security validation | `skills/vulnerability-validation/SKILL.md` (if `VALIDATE_SECURITY_MODE=true` or confirmed security findings need exploitability validation) |
347395
| Skeptics (Step 6) | `prompts/skeptic.md` + `prompts/doc-lookup.md` + `prompts/examples/skeptic-examples.md` |
348396
| Referee (Step 7) | `prompts/referee.md` |
349397
| Fixers (Phase 2) | `prompts/fixer.md` + `prompts/doc-lookup.md` (only if FIX_MODE=true) |
@@ -525,7 +573,15 @@ If the coverage assessment shows ANY queued scannable source files were not scan
525573
If zero bugs were confirmed, say so clearly — a clean report is a good result.
526574
527575
**Routing after report:**
576+
- If there are confirmed security findings AND (`VALIDATE_SECURITY_MODE=true` OR `PR_SECURITY_MODE=true` OR `SECURITY_REVIEW_MODE=true`):
577+
- Read `SKILL_DIR/skills/vulnerability-validation/SKILL.md`.
578+
- Re-check reachability, exploitability, PoC quality, and CVSS details for the confirmed security findings before finalizing the security summary.
579+
- If confirmed bugs > 0 AND `PLAN_ONLY_MODE=true`:
580+
- Build `fix-strategy.json` and `fix-plan.json`.
581+
- Present the strategy clusters (safe autofix vs manual review vs larger refactor vs architectural remediation).
582+
- Stop before the Fixer edits code.
528583
- If confirmed bugs > 0 AND `FIX_MODE=true`:
584+
- Build and present `fix-strategy.json` first.
529585
- Auto-fix only `ELIGIBLE` bugs.
530586
- Apply canary-first rollout: fix top critical eligible subset first, verify, then continue remaining eligible fixes.
531587
- Keep `MANUAL_REVIEW` bugs in report only (do not auto-edit).

agents/openai.yaml

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
1+
interface:
2+
display_name: "Bug Hunter"
3+
short_description: "Find, verify, and auto-fix real code bugs"
4+
default_prompt: "Use $bug-hunter to scan this codebase for confirmed runtime, logic, and security bugs."

[Five binary image files, previews not rendered: 4.21 MB, 3.94 MB, 4.66 MB, 4.34 MB, 4.46 MB]
