security: pass Haiku classifier prompt via stdin, not argv by garagon · Pull Request #1157 · garrytan/gstack

garagon · 2026-04-23T00:22:57Z

Summary

The Haiku transcript classifier passes scanned content (user messages + tool outputs, up to 8KB) as a CLI argument to claude -p <prompt>. This makes the full prompt visible via ps aux or /proc/<pid>/cmdline for up to 15 seconds per classification call.

On shared Linux hosts (default hidepid=0), any local user can read the scanned content — which may include page text, tool outputs, and potentially tokens or credentials visible on the page when the classifier fires.

On macOS 10.15+ the exposure is lower (Full Disk Access required to read other users' processes), but still present for same-user monitoring.
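To make the exposure concrete: on Linux, /proc/<pid>/cmdline holds a process's argv as NUL-separated strings, so anything passed as an argument can be recovered by whoever is allowed to read that file. The sketch below is illustrative only and is not part of this PR; parseCmdline and readArgv are hypothetical helper names.

```typescript
import { readFileSync } from "node:fs";

// Split the raw contents of /proc/<pid>/cmdline into individual arguments.
// The kernel stores argv as NUL-separated strings, usually NUL-terminated.
export function parseCmdline(raw: string): string[] {
  if (raw.length === 0) return []; // e.g. kernel threads expose an empty cmdline
  return raw.replace(/\0$/, "").split("\0");
}

// Linux only: read another process's argv. Succeeds for any local user
// when /proc is mounted with hidepid=0.
export function readArgv(pid: number): string[] {
  return parseCmdline(readFileSync(`/proc/${pid}/cmdline`, "utf8"));
}
```

With the prompt passed as an argument, the third element returned for the classifier process would be the full prompt text; with the stdin approach, only the flags appear.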

What this PR does

Two changes in checkTranscript() (security-classifier.ts):

Prompt via stdin: claude -p (no argument) reads from stdin. The prompt is written to p.stdin and the pipe closed, keeping it off the process argument list entirely.
Scoped child env: the spawned process now inherits only PATH, HOME, and ANTHROPIC_API_KEY instead of the full parent process.env. This prevents leaking unrelated secrets (other API keys, tokens) into the child's environment.
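The two changes can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual code in security-classifier.ts: scopedEnv, classifierArgs, and runClassifier are hypothetical names, and the 15-second timeout is assumed from the exposure window mentioned in the summary.

```typescript
import { spawn } from "node:child_process";

// Allowlist of variables the child may see (as listed in this PR).
const CHILD_ENV_KEYS = ["PATH", "HOME", "ANTHROPIC_API_KEY"] as const;

// Copy only allowlisted variables so unrelated secrets stay in the parent.
export function scopedEnv(
  parent: Record<string, string | undefined>,
): Record<string, string> {
  const env: Record<string, string> = {};
  for (const key of CHILD_ENV_KEYS) {
    const value = parent[key];
    if (value !== undefined) env[key] = value;
  }
  return env;
}

// Flags only: the prompt never appears in argv.
export function classifierArgs(): string[] {
  return ["-p", "--model", "haiku", "--output-format", "json"];
}

// Spawn the CLI, write the prompt to stdin, and collect stdout.
// `command` is a parameter only so the sketch can be exercised without claude.
export function runClassifier(
  prompt: string,
  timeoutMs = 15_000, // assumed timeout, not confirmed by the source
  command = "claude",
): Promise<string> {
  return new Promise((resolve, reject) => {
    const p = spawn(command, classifierArgs(), {
      env: scopedEnv(process.env),
      stdio: ["pipe", "pipe", "ignore"],
    });
    let out = "";
    const timer = setTimeout(() => p.kill(), timeoutMs);
    p.stdout.setEncoding("utf8");
    p.stdout.on("data", (chunk: string) => {
      out += chunk;
    });
    p.on("error", (err) => {
      clearTimeout(timer);
      reject(err);
    });
    p.on("close", (code) => {
      clearTimeout(timer);
      if (code === 0) resolve(out);
      else reject(new Error(`classifier exited with code ${code}`));
    });
    // Ignore EPIPE if the child exits before reading all input.
    p.stdin.on("error", () => {});
    // Write the prompt and close the pipe so the child sees EOF.
    p.stdin.end(prompt);
  });
}
```

Closing stdin matters here: without the end() call the child would wait for more input until the timeout fires.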

Before / After

# BEFORE: visible in ps output
claude -p "You are a prompt-injection detector...\n\nINPUTS:\n{\"user_message\":\"<scanned content>\"..." --model haiku --output-format json

# AFTER: only flags visible
claude -p --model haiku --output-format json

Test plan

bun test browse/test/security-classifier.test.ts — 16/16 pass
bun test — full free suite passes
Manual: run classification, verify ps aux | grep claude shows no prompt content
Manual: verify Haiku classifier still returns valid verdicts (stdin path works)

When both the TestSavantAI and Haiku transcript classifiers fail to load, preSpawnSecurityCheck silently returns safe and the agent spawns with no ML prompt-injection defense. This adds a fail-closed gate that blocks agent spawn when all classifiers are inactive, with an explicit opt-out via GSTACK_SECURITY_ALLOW_INACTIVE=1.
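A fail-closed gate of this kind might look like the sketch below. It is an assumption about the shape of the check, not the code in preSpawnSecurityCheck: ClassifierStatus, GateResult, and gateAgentSpawn are hypothetical names, and only the GSTACK_SECURITY_ALLOW_INACTIVE variable comes from the description above.

```typescript
// Hypothetical status record for one classifier.
type ClassifierStatus = { name: string; active: boolean };

// Hypothetical gate decision with a human-readable reason.
type GateResult = { allowed: boolean; reason: string };

// Block agent spawn when no classifier is active, unless the operator
// has explicitly opted out via GSTACK_SECURITY_ALLOW_INACTIVE=1.
export function gateAgentSpawn(
  classifiers: ClassifierStatus[],
  env: Record<string, string | undefined>,
): GateResult {
  if (classifiers.some((c) => c.active)) {
    return { allowed: true, reason: "at least one classifier is active" };
  }
  if (env.GSTACK_SECURITY_ALLOW_INACTIVE === "1") {
    return { allowed: true, reason: "all classifiers inactive; opt-out is set" };
  }
  return { allowed: false, reason: "all classifiers inactive; failing closed" };
}
```

The design choice is that the unsafe path requires an explicit setting: an empty or fully inactive classifier list denies by default instead of reporting safe.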

Scanned content (user messages, tool outputs up to 8KB) was passed as a CLI argument to `claude -p <prompt>`, making it visible in `ps aux` and `/proc/<pid>/cmdline` for up to 15 seconds per classification. On shared Linux hosts (default hidepid=0) any local user could read it. Fix: pipe the prompt through stdin (`claude -p` reads from stdin when no argument follows) and scope the child env to PATH + HOME + ANTHROPIC_API_KEY only.

garagon added 2 commits April 22, 2026 20:39

garagon wants to merge 2 commits into garrytan:main from garagon:security/haiku-stdin-prompt