
security: fail-closed gate when ML classifiers are inactive #1155

Open: garagon wants to merge 1 commit into garrytan:main from garagon:security/classifier-fail-closed-gate

garagon commented Apr 22, 2026

Summary

The sidebar agent's ML prompt injection defense (TestSavantAI + Haiku transcript classifier) silently degrades to zero coverage when both classifiers fail to load. preSpawnSecurityCheck returns false (safe) and the agent spawns without any ML protection. The user sees "inactive" on the shield icon, but nothing blocks the session.

An attacker who can cause model download failure (DNS poisoning on huggingface.co, disk full, proxy blocking the 112MB ONNX download) permanently removes all ML protection while the agent keeps running as if everything is fine.

What this PR does

Adds a fail-closed gate at the top of preSpawnSecurityCheck:

  • isClassifierInactive(status): a pure function that returns true only when ALL classifiers (testsavant, transcript, and deberta if enabled) report 'off'
  • When inactive: blocks agent spawn and emits security_event (verdict=block, reason=classifiers_inactive) plus agent_error to the sidepanel, so the user sees exactly why the session was refused
  • Does NOT fire when GSTACK_SECURITY_OFF=1: that is an explicit operator kill switch, so the operator already knows
  • Does NOT fire on partial degradation: if at least one classifier is 'degraded' or 'ok', the remaining coverage is enough to proceed
  • Override: GSTACK_SECURITY_ALLOW_INACTIVE=1 for operators who knowingly run without ML (CI, air-gapped environments)
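For reviewers who want the rules above in one place, here is a minimal sketch of how the check and the gate decision could fit together. It is not the code in this PR: the `LayerState` and `ClassifierStatus` shapes, the `GateDecision` type, the `classifierGate` helper, and the convention that `deberta` is absent when that layer is not enabled are all assumptions made for illustration. Only `isClassifierInactive`, the two environment variables, and the `classifiers_inactive` reason come from the description.

```typescript
// Hypothetical shapes; the real types in browse/src/security-classifier.ts may differ.
type LayerState = "ok" | "degraded" | "off";

interface ClassifierStatus {
  testsavant: LayerState;
  transcript: LayerState;
  deberta?: LayerState; // assumption: left undefined when the DeBERTa layer is not enabled
}

// Pure function: true only when every enabled classifier reports "off".
function isClassifierInactive(status: ClassifierStatus): boolean {
  const layers: LayerState[] = [status.testsavant, status.transcript];
  if (status.deberta !== undefined) layers.push(status.deberta);
  return layers.every((state) => state === "off");
}

// Hypothetical decision type for the fail-closed gate.
type GateDecision =
  | { block: false }
  | { block: true; reason: "classifiers_inactive" };

// Hypothetical helper: decides whether agent spawn should be refused.
// env is passed in explicitly so the function stays pure and easy to test.
function classifierGate(
  status: ClassifierStatus,
  env: Record<string, string | undefined>,
): GateDecision {
  // Explicit operator kill switch: the gate does not fire.
  if (env.GSTACK_SECURITY_OFF === "1") return { block: false };
  // Explicit opt-out for CI and air-gapped environments.
  if (env.GSTACK_SECURITY_ALLOW_INACTIVE === "1") return { block: false };
  // Fail closed only when all enabled classifiers are off.
  if (isClassifierInactive(status)) {
    return { block: true, reason: "classifiers_inactive" };
  }
  // Partial degradation ("degraded" or "ok" on any layer) proceeds.
  return { block: false };
}
```

In this sketch a caller such as preSpawnSecurityCheck would evaluate `classifierGate(status, process.env)` first and, on `block: true`, emit the security_event and agent_error before refusing to spawn. Passing the environment as a parameter is a design choice of the sketch, not something the PR states.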

Files changed

| File | Change |
| --- | --- |
| browse/src/security-classifier.ts | Extract isClassifierInactive() pure function |
| browse/src/sidebar-agent.ts | Fail-closed gate in preSpawnSecurityCheck |
| browse/test/security-classifier.test.ts | 7 test cases: all status combinations |

Test plan

  • bun test browse/test/security-classifier.test.ts — 16/16 pass (7 new + 9 existing)
  • Manual: start gstack without internet, verify agent spawn is blocked with clear error message
  • Manual: set GSTACK_SECURITY_ALLOW_INACTIVE=1, verify agent spawns despite inactive classifiers
  • Manual: start normally (classifiers load), verify no change to existing behavior
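As a rough illustration of what "all status combinations" means for the two always-on classifiers, the sketch below enumerates every pairing of states and marks which ones count as inactive. It is not the test file from this PR: the `allOff` stand-in, the `enumerateCombos` helper, and the `LayerState` values as a TypeScript union are assumptions for illustration, and the optional deberta layer is left out to keep the table small.

```typescript
// Hypothetical state union, based on the 'ok' / 'degraded' / 'off' values in the description.
type LayerState = "ok" | "degraded" | "off";
const STATES: LayerState[] = ["ok", "degraded", "off"];

// Stand-in for the PR's isClassifierInactive: inactive only if every layer is "off".
const allOff = (layers: LayerState[]): boolean =>
  layers.every((state) => state === "off");

interface Combo {
  testsavant: LayerState;
  transcript: LayerState;
  inactive: boolean;
}

// Build the full 3 x 3 table of (testsavant, transcript) combinations.
function enumerateCombos(): Combo[] {
  const rows: Combo[] = [];
  for (const testsavant of STATES) {
    for (const transcript of STATES) {
      rows.push({
        testsavant,
        transcript,
        inactive: allOff([testsavant, transcript]),
      });
    }
  }
  return rows;
}

// Only the off/off row should trip the fail-closed gate.
const inactiveCombos = enumerateCombos().filter((row) => row.inactive);
```

Under these assumptions, exactly one of the nine rows is inactive, which matches the rule that partial degradation never triggers the gate.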

When both TestSavantAI and Haiku transcript classifiers fail to load,
preSpawnSecurityCheck silently returns safe and the agent spawns with
zero ML prompt injection defense. This adds a fail-closed gate that
blocks agent spawn when all classifiers are inactive, with an explicit
opt-out via GSTACK_SECURITY_ALLOW_INACTIVE=1.