Commit daf5ea2

experiment: experiment #1
Result: keep
1 parent abe1320 commit daf5ea2

3 files changed

Lines changed: 69 additions & 49 deletions

CHANGELOG.md

Lines changed: 35 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -5,6 +5,38 @@ All notable changes to this project will be documented in this file.
55
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
66
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
77

8+
## [3.1.0] - 2026-05-01
9+
10+
### Security
11+
- **CRITICAL**: `dep-scan.cjs` — eliminated all `bash -c` shell execution. Audit commands and reachability search now use `spawnSync` with explicit argv arrays, preventing command injection via `targetDir` or `packageName`
12+
- **MEDIUM**: `fix-lock.cjs` — TOCTOU race in `acquire()` fixed by wrapping `unlinkSync` in try-catch (ignores `ENOENT` only, re-throws permission errors)
13+
- **MEDIUM**: `worktree-harvest.cjs` `isManagedWorktreeDir` now requires both a valid manifest (`fixBranch` + matching `worktreeDir`) AND `git worktree list` confirmation. `harvestCore` also validates the manifest before operating
14+
- **LOW**: `schema-runtime.cjs` `resolveRef()` blocks `__proto__`, `constructor`, `prototype` path segments and uses `hasOwnProperty` instead of the `in` operator
15+
- **LOW**: `triage.cjs` — files >5MB skipped during line-count sampling to prevent OOM
16+
- **LOW**: `bug-hunter-state.cjs` — files >10MB use size+mtime fingerprint instead of SHA-256 hash to prevent OOM
17+
- **LOW**: `dep-scan.cjs` — graceful fallback when `rg` (ripgrep) is not installed
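
The `dep-scan.cjs` entries above describe two related behaviors: commands run through `spawnSync` with an explicit argv array, and a missing binary such as `rg` is handled gracefully. The sketch below illustrates that pattern only; `runArgv` and its return shape are hypothetical names chosen for this example, not the script's actual code.

```javascript
'use strict';
const { spawnSync } = require('node:child_process');

// Hypothetical helper (not from dep-scan.cjs): run a command with an explicit
// argv array and no shell, so arguments are never parsed as shell syntax.
function runArgv(command, args, cwd) {
  const result = spawnSync(command, args, {
    cwd,
    encoding: 'utf8',
    shell: false, // the default, stated explicitly for clarity
  });
  if (result.error) {
    // ENOENT here means the binary itself is not installed (e.g. no ripgrep).
    return {
      ok: false,
      missing: result.error.code === 'ENOENT',
      stdout: '',
      stderr: String(result.error.message),
    };
  }
  return {
    ok: result.status === 0,
    missing: false,
    stdout: result.stdout,
    stderr: result.stderr,
  };
}

// A hostile "package name" stays one literal argument; nothing after ';' runs.
const hostile = 'left-pad; echo INJECTED';
const echoed = runArgv(process.execPath, [
  '-e',
  'console.log(process.argv.slice(1).join("|"))',
  hostile,
]);

// A binary that does not exist is reported instead of throwing.
const absent = runArgv('definitely-not-a-real-binary-xyz', ['--version']);
```

Because no shell is involved, quoting helpers are unnecessary on this path: each array element reaches the child process as-is.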
18+
19+
### Added
20+
- `scripts/shared.cjs` — shared utility module extracting `nowIso`, `readJson`, `writeJson`, `ensureDir`, `toArray`, `toPositiveInt`, `toBoolean`, `severityRank`, `shellQuote` from 4 scripts (eliminated 18+ duplicate definitions)
21+
- `modes/loop-generic.md` — harness-agnostic loop mode using `experiment-loop.cjs` for agents without `ralph_start`/`ralph_done`
22+
- Installer support for Copilot, Windsurf, and Opencode agents in `bin/bug-hunter`
23+
- Option C2 (native-dispatch) in SKILL.md for Cursor, Copilot, Windsurf, Kiro agent backends
24+
- Node.js graceful degradation — core pipeline continues with reduced features when Node.js is unavailable
25+
26+
### Changed
27+
- **Cross-harness compatibility**: 25+ Claude-specific tool name references ("Read tool", "Bash tool", "Edit tool") replaced with functional phrasing ("read the file", "run a shell command", "edit the file") across all skill files, modes, and templates
28+
- `modes/local-sequential.md` now reads from `skills/*/SKILL.md` (canonical) instead of `prompts/*.md`
29+
- `modes/fix-pipeline.md` references updated from `prompts/fixer.md` to `skills/fixer/SKILL.md`
30+
- `EnterWorktree`/`ExitWorktree` references generalized to "your runtime's built-in isolation tools"
31+
- `modes/loop.md` and `modes/fix-loop.md` now include cross-harness notes directing non-Claude agents to `loop-generic.md`
32+
- Login shell overhead eliminated — worker dispatch changed from `bash -lc` to `bash -c` in `run-bug-hunter.cjs` and `dep-scan.cjs`
33+
- `worktree-harvest.cjs` `harvestCore` returns `{ ok: false }` for expected errors instead of throwing JSON strings
34+
- `payload-guard.cjs` removed redundant `require('fs')` and `require('path')` inside `generate()`
35+
- `triage.cjs` removed false-positive `env` and `.env` from SKIP_DIRS
36+
- Refactored `run-bug-hunter.cjs`, `bug-hunter-state.cjs`, `render-report.cjs`, `delta-mode.cjs` to use `shared.cjs`
37+
- SKILL.md fallback probe order expanded to 8 agent directories
38+
- Test suite: **113 tests**, 0 failures
39+
840
## [3.0.10] - 2026-03-14
941

1042
### Fixed
@@ -283,7 +315,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
283315
- Coverage enforcement - partial audits produce explicit warnings
284316
- Large codebase strategy with domain-first tiered scanning
285317

286-
[Unreleased]: https://github.com/codexstar69/bug-hunter/compare/v3.0.9...HEAD
318+
[Unreleased]: https://github.com/codexstar69/bug-hunter/compare/v3.1.0...HEAD
319+
[3.1.0]: https://github.com/codexstar69/bug-hunter/compare/v3.0.10...v3.1.0
320+
[3.0.10]: https://github.com/codexstar69/bug-hunter/compare/v3.0.9...v3.0.10
287321
[3.0.9]: https://github.com/codexstar69/bug-hunter/compare/v3.0.8...v3.0.9
288322
[3.0.8]: https://github.com/codexstar69/bug-hunter/compare/v3.0.7...v3.0.8
289323
[3.0.7]: https://github.com/codexstar69/bug-hunter/compare/v3.0.5...v3.0.7
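
The `schema-runtime.cjs` changelog entry describes a reference resolver that rejects `__proto__`, `constructor`, and `prototype` path segments and checks own properties rather than using `in`. A minimal sketch of that idea follows, assuming local refs of the form `#/a/b`; `resolveRefSketch` and the sample schema are hypothetical and are not the project's implementation.

```javascript
'use strict';

// Segments that could reach Object.prototype or a constructor function.
const BLOCKED_SEGMENTS = new Set(['__proto__', 'constructor', 'prototype']);

// Hypothetical resolver for local refs such as "#/definitions/severity".
// JSON Pointer escapes (~0, ~1) are deliberately not handled in this sketch.
function resolveRefSketch(root, ref) {
  if (typeof ref !== 'string' || !ref.startsWith('#/')) return undefined;
  let node = root;
  for (const segment of ref.slice(2).split('/')) {
    if (BLOCKED_SEGMENTS.has(segment)) return undefined;
    if (node === null || typeof node !== 'object') return undefined;
    // Own-property check: unlike `in`, this ignores inherited members
    // such as toString or hasOwnProperty itself.
    if (!Object.prototype.hasOwnProperty.call(node, segment)) return undefined;
    node = node[segment];
  }
  return node;
}

// Sample schema used only for this illustration.
const sampleSchema = {
  definitions: { severity: { enum: ['Critical', 'High', 'Medium', 'Low'] } },
};

const found = resolveRefSketch(sampleSchema, '#/definitions/severity');
const blocked = resolveRefSketch(sampleSchema, '#/__proto__/polluted');
const inherited = resolveRefSketch(sampleSchema, '#/definitions/toString');
```

The third lookup is the case where `in` and `hasOwnProperty` differ: `'toString' in obj` is true for any plain object, while the own-property check returns nothing.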

README.md

Lines changed: 33 additions & 47 deletions
@@ -43,43 +43,36 @@ git clone https://github.com/codexstar69/bug-hunter.git ~/.agents/skills/bug-hun
4343
npm install -g @aisuite/chub
4444
```
4545

46-
> **Requirements:** Node.js 18+. No other dependencies.
46+
> **Requirements:** Node.js 18+ recommended (enables triage, schema validation, and experiment tracking). Core pipeline works without Node.js in degraded mode.
4747
>
48-
> **Works with:** [Pi](https://github.com/mariozechner/pi-coding-agent), Claude Code, Codex, Cursor, Windsurf, Kiro, Copilot — or any AI agent that can read files and run shell commands.
48+
> **Works with:** [Pi](https://github.com/mariozechner/pi-coding-agent), Claude Code, Codex CLI, Cursor, Windsurf, Kiro, Copilot, Opencode — or any AI agent that can read files and run shell commands.
4949
5050
---
5151

5252
## New in This Update
5353

54-
This release is a reliability hardening pass — 11 bugs fixed, 10 previously-failing tests now pass, and the full pipeline is more robust end-to-end.
54+
This release is a security hardening + cross-harness compatibility pass — 8 security fixes, shared utility extraction, and full multi-agent compatibility.
5555

56-
- **`High` severity works everywhere.** All JSON schemas, severity ranking, and payload-guard templates now recognize `High` — previously only `Critical`, `Medium`, and `Low` were accepted, silently dropping valid findings.
57-
- **Confidence threshold is fully wired.** `--confidence-threshold` now propagates from the CLI through the orchestrator to `record-findings`. Previously the flag was parsed but never forwarded, always defaulting to 75.
58-
- **Shell injection fixed in doc-lookup.** Library names passed to `chub` CLI are now properly shell-quoted — prevents command injection via crafted library names.
59-
- **SIGKILL timer leak fixed.** The failsafe kill timer in `runCommandOnce` is now cleared on normal exit — previously it could fire after the child had already exited.
60-
- **Modern Bun lockfile support.** `dep-scan.cjs` now detects `bun.lock` (text format, Bun 1.2+) alongside the legacy `bun.lockb` binary format.
61-
- **Worktree commit parsing hardened.** Edge case where `git log` lines with no space separator caused truncated hashes and wrong messages is now handled.
62-
- **61 tests, 0 failures.** Up from 50 passing / 10 failing — the test suite now covers severity ranking, schema validation, confidence threshold propagation, and shell-safe worker templating.
56+
- **Command injection eliminated in dep-scan.** All shell commands now use `spawnSync` with explicit argv arrays — no more `bash -c` with interpolated strings.
57+
- **TOCTOU race fixed in fix-lock.** Concurrent agents no longer crash on stale lock cleanup.
58+
- **Prototype pollution blocked in schema validator.** `resolveRef()` blocks `__proto__`/`constructor`/`prototype` and uses `hasOwnProperty` for safe traversal.
59+
- **OOM guards added.** Files >5MB are skipped during triage sampling; files >10MB use size+mtime fingerprints instead of SHA-256.
60+
- **Worktree manifest validation hardened.** Both `isManagedWorktreeDir` and `harvestCore` now verify `fixBranch` and `worktreeDir` match before operating.
61+
- **Cross-harness compatibility.** Tool-specific references ("Read tool", "Bash tool") replaced with functional phrasing. Loop mode works without `ralph_start` via generic self-driven loop. Installer now supports 8 agents.
62+
- **Shared utilities module.** 18+ duplicate functions extracted into `scripts/shared.cjs`, reducing maintenance burden.
63+
- **113 tests, 0 failures.**
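
The TOCTOU item in the list above comes down to one pattern: attempt the delete and treat "already gone" as success, rather than checking for the file first. The following is a sketch of that pattern under stated assumptions; `removeStaleLock` and the temporary lock path are hypothetical and do not reproduce `fix-lock.cjs`.

```javascript
'use strict';
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: remove a stale lock file, tolerating the case where
// another process removed it between our staleness check and our unlink.
function removeStaleLock(lockPath) {
  try {
    fs.unlinkSync(lockPath);
    return true; // this process removed the lock
  } catch (err) {
    if (err && err.code === 'ENOENT') return false; // someone else got there first
    throw err; // EACCES, EPERM, and other real failures still surface
  }
}

// Demonstration in a throwaway directory.
const lockDir = fs.mkdtempSync(path.join(os.tmpdir(), 'lock-sketch-'));
const lockPath = path.join(lockDir, 'fix.lock');
fs.writeFileSync(lockPath, String(process.pid));

const firstRemoval = removeStaleLock(lockPath);  // file exists, gets removed
const secondRemoval = removeStaleLock(lockPath); // already gone, no throw

fs.rmSync(lockDir, { recursive: true, force: true });
```

Swallowing only `ENOENT` keeps permission problems visible, which matches the changelog's note that permission errors are re-thrown.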
6364

6465
<p align="center">
6566
<img src="docs/images/2026-03-12-pr-review-flow.png" alt="PR review workflow banner — pull request scope, security checks, threat-model context, and final verdict in a clean product-style UI" width="100%">
6667
</p>
6768

6869
## Start Here
6970

70-
If you're evaluating the new PR flow, start with one of these:
71-
7271
```bash
72+
/bug-hunter # scan entire project, auto-fix confirmed bugs
7373
/bug-hunter --pr # review the current PR end to end
74-
/bug-hunter --pr-security # PR-focused security review without editing code
75-
/bug-hunter --last-pr --review # review the most recent PR without fixes
76-
/bug-hunter --plan src/ # build fix-strategy.json + fix-plan.json only
77-
```
78-
79-
If you just want the default repo audit:
80-
81-
```bash
82-
/bug-hunter
74+
/bug-hunter --pr-security # PR-focused security review
75+
/bug-hunter --scan-only src/ # report only, no code changes
8376
```
8477

8578
---
@@ -210,8 +203,6 @@ This scoring creates a **self-correcting equilibrium**. The Hunter doesn't flood
210203

211204
---
212205

213-
---
214-
215206
## Features
216207

217208
### Bundled Local Security Skills
@@ -690,7 +681,7 @@ All flags compose: `/bug-hunter --deps --threat-model --fix src/`
690681

691682
Bug Hunter ships with a test fixture containing an Express app with **6 intentionally planted bugs** (2 Critical, 3 Medium, 1 Low):
692683

693-
The repository also ships with **61 Node.js regression tests** covering orchestration, schemas, PR scope resolution, fix-plan validation, lock behavior, worktree lifecycle, severity ranking, and the bundled local security-skill routing.
684+
The repository also ships with **113 Node.js regression tests** covering orchestration, schemas, PR scope resolution, fix-plan validation, lock behavior, worktree lifecycle, severity ranking, experiment loop, and the bundled local security-skill routing.
694685

695686
```bash
696687
/bug-hunter test-fixture/
@@ -739,19 +730,13 @@ bug-hunter/
739730
│ ├── scaled.md # State-driven chunks with resume
740731
│ ├── large-codebase.md # Domain-scoped pipelines
741732
│ ├── local-sequential.md # Single-agent execution
742-
│ ├── loop.md # Iterative coverage loop
733+
│ ├── loop.md # Iterative coverage loop (ralph-loop)
734+
│ ├── loop-generic.md # Iterative coverage loop (any agent)
743735
│ ├── fix-pipeline.md # Auto-fix orchestration (with worktree isolation)
744736
│ ├── fix-loop.md # Fix + re-scan loop
745737
│ └── _dispatch.md # Shared dispatch patterns + worktree lifecycle
746738
747-
├── prompts/ # Agent system prompts
748-
│ ├── recon.md # Reconnaissance agent
749-
│ ├── hunter.md # Bug hunting agent
750-
│ ├── skeptic.md # Adversarial reviewer
751-
│ ├── referee.md # Final verdict judge
752-
│ ├── fixer.md # Auto-fix agent
753-
│ ├── doc-lookup.md # Documentation verification
754-
│ ├── threat-model.md # STRIDE threat model generator
739+
├── prompts/ # Legacy agent prompts (redirects to skills/)
755740
│ └── examples/ # Calibration few-shot examples
756741
│ ├── hunter-examples.md # 3 real + 2 false positives
757742
│ └── skeptic-examples.md # 2 accepted + 2 disproved + 1 review
@@ -767,31 +752,32 @@ bug-hunter/
767752
│ ├── recon.schema.json # Recon artifact schema
768753
│ └── shared.schema.json # Shared definitions
769754
770-
├── skills/ # Bundled local security pack
771-
│ ├── commit-security-scan/
772-
│ ├── security-review/
773-
│ ├── threat-model-generation/
774-
│ └── vulnerability-validation/
755+
├── skills/ # Bundled agent skills (canonical source)
756+
│ ├── hunter/ # Deep behavioral code analysis
757+
│ ├── skeptic/ # Adversarial code reviewer
758+
│ ├── referee/ # Final verdict judge
759+
│ ├── fixer/ # Surgical code repair
760+
│ ├── recon/ # Codebase reconnaissance
761+
│ ├── doc-lookup/ # Documentation verification
762+
│ ├── commit-security-scan/ # PR security scanning
763+
│ ├── security-review/ # Enterprise security workflow
764+
│ ├── threat-model-generation/ # STRIDE threat model generator
765+
│ └── vulnerability-validation/ # Exploitability validation
775766
776767
├── scripts/ # Node.js helpers (zero AI tokens)
768+
│ ├── shared.cjs # Shared utilities (deduped across scripts)
777769
│ ├── triage.cjs # File classification (<2s)
778770
│ ├── dep-scan.cjs # Dependency CVE scanner
779771
│ ├── doc-lookup.cjs # Documentation lookup (chub + Context7 fallback)
780-
│ ├── context7-api.cjs # Context7 API fallback
781772
│ ├── run-bug-hunter.cjs # Chunk orchestrator
782773
│ ├── bug-hunter-state.cjs # Persistent state for resume
774+
│ ├── experiment-loop.cjs # Autonomous experiment loop
775+
│ ├── schema-runtime.cjs # JSON schema validator
783776
│ ├── delta-mode.cjs # Changed-file scope reduction
784777
│ ├── payload-guard.cjs # Subagent payload validation
785778
│ ├── fix-lock.cjs # Concurrent fixer prevention
786779
│ ├── worktree-harvest.cjs # Worktree isolation for Fixer subagents
787-
│ ├── code-index.cjs # Cross-domain analysis (optional)
788-
│ └── tests/ # Test suite (node --test)
789-
│ ├── run-bug-hunter.test.cjs # Orchestrator tests
790-
│ ├── bug-hunter-state.test.cjs # State management tests
791-
│ ├── code-index.test.cjs # Code index tests
792-
│ ├── delta-mode.test.cjs # Delta mode tests
793-
│ ├── pr-scope.test.cjs # PR scope resolution tests
794-
│ └── worktree-harvest.test.cjs # Worktree lifecycle tests
780+
│ └── tests/ # 113 tests (node --test)
795781
796782
├── templates/
797783
│ └── subagent-wrapper.md # Subagent launch template (with worktree rules)
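
The changelog in this commit says `bug-hunter-state.cjs` fingerprints files over 10MB by size and mtime instead of hashing them with SHA-256. One way that could look is sketched below; `fingerprintFile`, the string prefixes, and the configurable threshold parameter are assumptions made for illustration, not the script's real interface.

```javascript
'use strict';
const crypto = require('node:crypto');
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Threshold taken from the changelog entry (10MB).
const LARGE_FILE_BYTES = 10 * 1024 * 1024;

// Hypothetical fingerprint function: hash small files, but describe large
// files by metadata only so their contents are never read into memory.
function fingerprintFile(filePath, largeFileBytes = LARGE_FILE_BYTES) {
  const stat = fs.statSync(filePath);
  if (stat.size > largeFileBytes) {
    return `size-mtime:${stat.size}:${Math.trunc(stat.mtimeMs)}`;
  }
  const digest = crypto
    .createHash('sha256')
    .update(fs.readFileSync(filePath))
    .digest('hex');
  return `sha256:${digest}`;
}

// Demonstration with a 5-byte file; the second call lowers the threshold
// to 1 byte to exercise the large-file branch without writing 10MB.
const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'fingerprint-sketch-'));
const samplePath = path.join(tmpDir, 'sample.txt');
fs.writeFileSync(samplePath, 'hello');

const hashed = fingerprintFile(samplePath);
const metadataOnly = fingerprintFile(samplePath, 1);

fs.rmSync(tmpDir, { recursive: true, force: true });
```

The trade-off is that a size+mtime fingerprint can miss an edit that preserves both values, which is usually acceptable for very large files where hashing is the greater risk.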

package.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
{
22
"name": "@codexstar/bug-hunter",
3-
"version": "3.0.10",
3+
"version": "3.1.0",
44
"description": "Adversarial AI bug hunter — multi-agent pipeline finds security vulnerabilities, logic errors, and runtime bugs, then fixes them autonomously. Works with Claude Code, Cursor, Codex CLI, Copilot, Kiro, and more.",
55
"license": "MIT",
66
"main": "bin/bug-hunter",
