security: fail-closed gate when ML classifiers are inactive by garagon · Pull Request #1155 · garrytan/gstack

garagon · 2026-04-22T23:39:33Z

Summary

The sidebar agent's ML prompt injection defense (TestSavantAI + Haiku transcript classifier) degrades silently to zero coverage when both classifiers fail to load. preSpawnSecurityCheck returns false (safe) and the agent spawns without any ML protection — the user sees "inactive" on the shield icon but gets no blocking gate.

An attacker who can cause model download failure (DNS poisoning on huggingface.co, disk full, proxy blocking the 112MB ONNX download) permanently removes all ML protection while the agent keeps running as if everything is fine.

What this PR does

Adds a fail-closed gate at the top of preSpawnSecurityCheck:

- `isClassifierInactive(status)`: pure function that returns true only when ALL classifiers (testsavant, transcript, and deberta if enabled) report 'off'
- When inactive: blocks agent spawn, emits `security_event` (verdict=block, reason=classifiers_inactive) + `agent_error` to the sidepanel so the user sees exactly why the session was refused
- Does NOT fire when `GSTACK_SECURITY_OFF=1`; that's an explicit operator kill switch, they already know
- Does NOT fire on partial degradation: if one classifier is 'degraded' or 'ok', the remaining coverage is enough to proceed
- Override: `GSTACK_SECURITY_ALLOW_INACTIVE=1` for operators who knowingly run without ML (CI, air-gapped environments)
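The rules listed above can be sketched roughly as follows. This is an illustrative approximation, not the code from the diff: the `ClassifierStatus` shape, the `LayerState` values, the `GateDecision` type, and the `gateDecision` helper are all assumptions made for the example, and the real `preSpawnSecurityCheck` also emits events to the sidepanel, which is omitted here.

```typescript
// Hypothetical status shape; the real type in security-classifier.ts may differ.
type LayerState = "ok" | "degraded" | "off";

interface ClassifierStatus {
  testsavant: LayerState;
  transcript: LayerState;
  deberta?: LayerState; // assumption: undefined when DeBERTa is not enabled
}

// Pure check: true only when every enabled classifier reports "off".
function isClassifierInactive(status: ClassifierStatus): boolean {
  const layers: LayerState[] = [status.testsavant, status.transcript];
  if (status.deberta !== undefined) layers.push(status.deberta);
  return layers.every((state) => state === "off");
}

// Hypothetical result type for the gate; the PR emits security_event and
// agent_error instead of returning a value like this.
type GateDecision =
  | { allow: true }
  | { allow: false; verdict: "block"; reason: "classifiers_inactive" };

// Sketch of the gate ordering described above. `env` is passed in explicitly
// so the function stays pure and easy to test.
function gateDecision(
  status: ClassifierStatus,
  env: Record<string, string | undefined>,
): GateDecision {
  // Explicit kill switch: the operator already knows ML is off.
  if (env.GSTACK_SECURITY_OFF === "1") return { allow: true };
  // Explicit opt-out for CI or air-gapped environments.
  if (env.GSTACK_SECURITY_ALLOW_INACTIVE === "1") return { allow: true };
  // Fail closed only when there is zero ML coverage.
  if (isClassifierInactive(status)) {
    return { allow: false, verdict: "block", reason: "classifiers_inactive" };
  }
  // Partial degradation still leaves some coverage, so proceed.
  return { allow: true };
}

// Usage: both classifiers failed to load and no override is set, so the
// decision carries allow: false with reason "classifiers_inactive".
const decision = gateDecision({ testsavant: "off", transcript: "off" }, {});
console.log(decision);
```

Keeping the status check pure and passing the environment in as a parameter is one way to make every status combination testable without loading any models.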

Files changed

| File | Change |
|---|---|
| `browse/src/security-classifier.ts` | Extract `isClassifierInactive()` pure function |
| `browse/src/sidebar-agent.ts` | Fail-closed gate in `preSpawnSecurityCheck` |
| `browse/test/security-classifier.test.ts` | 7 test cases: all status combinations |

Test plan

- `bun test browse/test/security-classifier.test.ts`: 16/16 pass (7 new + 9 existing)
- Manual: start gstack without internet, verify agent spawn is blocked with clear error message
- Manual: set `GSTACK_SECURITY_ALLOW_INACTIVE=1`, verify agent spawns despite inactive classifiers
- Manual: start normally (classifiers load), verify no change to existing behavior

When both TestSavantAI and Haiku transcript classifiers fail to load, preSpawnSecurityCheck silently returns safe and the agent spawns with zero ML prompt injection defense. This adds a fail-closed gate that blocks agent spawn when all classifiers are inactive, with an explicit opt-out via GSTACK_SECURITY_ALLOW_INACTIVE=1.

garagon wants to merge 1 commit into garrytan:main from garagon:security/classifier-fail-closed-gate
