Applies reviewer findings from PR #5 external code review. All tests
still pass (45/45 verify-plugin + expanded CI matrices).
🔴 Critical — command injection in detect-surface.sh (Gemini)
Before: python3 -c "json.loads('''$JSON''')". Backticks or $(…) in the
idea text were executed by the shell before Python parsed anything.
After: payload piped via stdin (printf + sys.stdin.read). No shell
interpolation of user-controlled data anywhere in the script.
Second python emission (scores JSON) also moved to env vars for
defense-in-depth consistency.
CI: new security-regression test injects `touch /tmp/pf-injection-canary`
and asserts the file is NOT created.
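
The stdin approach described above can be sketched in Python as follows. This is a minimal illustration, not the code in detect-surface.sh: the names `CHILD_SOURCE` and `extract_text` are hypothetical, and the child program is reduced to echoing the parsed `text` field. The point it shows is that the interpreter source stays a fixed string while the untrusted payload travels only on stdin.

```python
import json
import subprocess
import sys

# The child program is a constant. Nothing user-controlled is spliced into it.
CHILD_SOURCE = "import json, sys; print(json.load(sys.stdin)['text'])"


def extract_text(payload: str) -> str:
    """Send the untrusted JSON payload on stdin and return its 'text' field."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],  # argv list, so no shell is involved
        input=payload,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n")


# Hostile idea text: backticks, $(...) and a triple quote that would have
# terminated the old '''...''' literal.
hostile = json.dumps({"text": "idea `touch /tmp/pf-injection-canary` $(id) '''"})
print(extract_text(hostile))
```

The payload comes back byte for byte as data, because neither a shell nor the Python parser ever treats it as source.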
🟡 Important — cost-regression wrote to wrong table (Codex P1)
Before: CREATE TABLE events(ts, kind, severity, payload) — not
readable by /pf:status or /pf:budget which query the canonical
`blackboard` table.
After: writes to `blackboard` with schema matching CLAUDE.md §6:
(ts, agent_id, key, value, tier, dept). key = "status.cost_{warn|alert}",
value = JSON payload, agent_id = "cost-regression", tier = 1 (Meta),
dept = "meta". idx_bb_key index created for the polling pattern.
CI: new schema assertion confirms `blackboard` table exists after breach.
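
A rough sketch of the write described above, assuming the store is SQLite. The column names, key format, agent_id, tier, dept and index name come from this message; the column types, the `record_breach` helper and the payload fields are assumptions made for illustration only.

```python
import json
import sqlite3
import time


def record_breach(conn: sqlite3.Connection, level: str, payload: dict) -> None:
    """Write a cost breach to the blackboard table (level is 'warn' or 'alert')."""
    # Column types are assumed; only the column names are given in the message.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS blackboard ("
        "ts REAL, agent_id TEXT, key TEXT, value TEXT, tier INTEGER, dept TEXT)"
    )
    # Index on key to support readers that poll by key.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_bb_key ON blackboard(key)")
    conn.execute(
        "INSERT INTO blackboard (ts, agent_id, key, value, tier, dept) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (time.time(), "cost-regression", f"status.cost_{level}",
         json.dumps(payload), 1, "meta"),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
# Payload fields here are hypothetical.
record_breach(conn, "warn", {"observed": 1.4, "ceiling": 1.0})
rows = conn.execute(
    "SELECT agent_id, key, tier, dept FROM blackboard WHERE key = ?",
    ("status.cost_warn",),
).fetchall()
print(rows)
```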
🟡 Medium — Korean single-char tokens dropped (Gemini)
Before: filter len(t) > 1 dropped 앱 (app), 웹 (web), 봇 (bot), 툴 (tool).
After: len(t) > 1 OR not t.isascii() — ASCII single-chars still
dropped as noise (a, i, o), non-ASCII single chars kept as signal.
STOPWORDS already handles Korean particles (가, 는, 을, …).
CI: new KR on-idea fixture exercises the path.
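
The new filter rule can be sketched like this. The `STOPWORDS` set below is a three-item sample rather than the real list, whitespace tokenization is an assumption, and `keep_token` and `filter_tokens` are hypothetical names.

```python
# Sample only: the real STOPWORDS list is longer.
STOPWORDS = {"가", "는", "을"}


def keep_token(t: str) -> bool:
    """Keep multi-character tokens and any non-ASCII single character."""
    if t in STOPWORDS:
        return False
    return len(t) > 1 or not t.isascii()


def filter_tokens(text: str) -> list:
    # Whitespace tokenization is an assumption made for this sketch.
    return [t for t in text.lower().split() if keep_token(t)]


print(filter_tokens("a 앱 i 웹 bot 는"))  # ['앱', '웹', 'bot']
```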
🟡 Medium — grep -oc counted lines not occurrences (Gemini)
Before: grep -oc returns the number of matching lines, because -c
counts lines even when combined with -o (max 1 for single-line text),
so "api api api" scored 1 instead of 3, breaking the `rest > 2*ui` rule.
After: grep -o | wc -l — true occurrence count.
CI: regression guard asserts `"rest": 3` for "api api api".
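
The difference between the two counts is easy to reproduce outside grep. The following is a Python analogue, not the shell pipeline itself: `count_matching_lines` mirrors what a line count reports and `count_occurrences` mirrors counting every match. Both helper names are hypothetical.

```python
import re


def count_matching_lines(pattern: str, text: str) -> int:
    """Number of lines containing at least one match (what a line count gives)."""
    return sum(1 for line in text.splitlines() if re.search(pattern, line))


def count_occurrences(pattern: str, text: str) -> int:
    """Total number of non-overlapping matches across all lines."""
    return sum(len(re.findall(pattern, line)) for line in text.splitlines())


text = "api api api"
print(count_matching_lines(r"api", text))  # 1
print(count_occurrences(r"api", text))     # 3
```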
🟡 Medium — ls in for loop fragile (Gemini)
Before: for d in $(ls -d runs/*/ 2>/dev/null), which word-splits paths
containing spaces and can hit the ARG_MAX limit when there are many runs.
After: for d in runs/*/ ; do [ -d "$d" ] || continue — glob-safe.
🟡 Medium — profile["cost_ceiling"] KeyError risk (Gemini)
Before: direct subscript assumed schema compliance at runtime.
After: profile.get("cost_ceiling") + field presence check → early
return 0. Schema validation at CI still prevents broken profiles
from merging, but runtime is now defensive.
CI: malformed profile fixture asserts exit 0, no crash.
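
A minimal sketch of the defensive lookup, under stated assumptions: `check_cost` is a hypothetical name, and the return value of 1 for a breach is chosen for illustration, since only the early return of 0 for a missing field is described above.

```python
def check_cost(profile: dict, observed_cost: float) -> int:
    """Return 0 when there is no usable ceiling, else compare against it."""
    ceiling = profile.get("cost_ceiling")  # no KeyError if the field is absent
    # Treat a missing or non-numeric ceiling as "nothing to check".
    if isinstance(ceiling, bool) or not isinstance(ceiling, (int, float)):
        return 0
    # The breach value 1 is an assumption made for this sketch.
    return 1 if observed_cost > ceiling else 0


print(check_cost({}, 5.0))                     # 0
print(check_cost({"cost_ceiling": 1.0}, 5.0))  # 1
```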
🟢 Nice-to-have — cache key ignored --previews override (Codex P2)
Before: cmd_key(idea, profile) derived advocate count from profile
default only. Same idea + pro profile + --previews=9 vs --previews=18
collided.
After: cmd_key(idea, profile, previews_override?) — 3rd optional arg.
When set, overrides the profile's default count in the key input.
Backwards compatible (2-arg callers unchanged).
CI: new assertion K(idea,pro) != K(idea,pro,9).
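
One way the three-argument key could work is sketched below. The hashing scheme (SHA-256 over a JSON document) and the `DEFAULT_PREVIEWS` table with its value for "pro" are assumptions for illustration; the message does not state how preview-cache.sh builds its key or what the profile defaults are.

```python
import hashlib
import json
from typing import Optional

# Hypothetical default; the real per-profile counts are not given here.
DEFAULT_PREVIEWS = {"pro": 12}


def cmd_key(idea: str, profile: str, previews_override: Optional[int] = None) -> str:
    """Derive a cache key; an explicit override replaces the profile default."""
    if previews_override is not None:
        count = previews_override
    else:
        count = DEFAULT_PREVIEWS.get(profile, 0)
    material = json.dumps(
        {"idea": idea, "profile": profile, "previews": count},
        sort_keys=True,
        ensure_ascii=False,
    )
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


# Two-argument callers keep working; different overrides no longer collide.
print(cmd_key("todo app", "pro") == cmd_key("todo app", "pro", None))
print(cmd_key("todo app", "pro", 9) != cmd_key("todo app", "pro", 18))
```

In this sketch an override equal to the profile default yields the same key as the two-argument call. Whether the real script behaves that way is not stated.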
Also hardened all other python3 -c shell-interpolation points in
preview-cache.sh (cmd_get + cmd_prune) to pass paths via argv — no
longer interpolated into python source. Defense-in-depth against future
path injection even though cache dir is under user's HOME.
Test matrix growth:
- detect-surface: 3 → 5 cases (+ occurrence regression + injection canary)
- cost-regression: 6 → 8 cases (+ blackboard schema + defensive unknown profile)
- preview-cache: 4 → 5 cases (+ --previews override produces distinct key)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
R3=$(bash scripts/detect-surface.sh <<<'{"text":"Admin panel with dashboard UI and REST API for programmatic access. Self-service customer portal with settings page."}')