Commit fc64033

feat(prim): AI-deferred disambiguator + programmatic-Claude lockdown
Adds an AI-deferred receiver-type classifier to the `prim` audit/codemod for the duck-typed prototype methods (`.test`, `.then`, `.exec`, `.catch`, `.finally`) that the static heuristic can't classify (semver Range.test → not RegExp; PromiseLike .then → not Promise). Opt-in via `--ai-disambiguate`; cached on disk so re-runs are free.

Reference impl for the new fleet-wide "Programmatic Claude Invocation Lockdown" pattern:

- src/disambiguate.mts: SDK-form `query()` callsite with all four lockdown flags wired (tools, allowedTools, disallowedTools, permissionMode: 'dontAsk') per the official permission flow at https://code.claude.com/docs/en/agent-sdk/permissions.
- src/ambiguous-methods.mts: the hard-cases table.
- test/disambiguate.test.mts: 10 tests including source-text guards that fail if BASE_TOOLS widens, if `tools: BASE_TOOLS` is unwired, or if permissionMode drifts from 'dontAsk'.

CLAUDE.md gains one bullet pointing at the new .claude/skills/programmatic-claude-lockdown/SKILL.md (cascaded from socket-repo-template), which holds the four-flag table, both recipes (read-only and Bash-needing), and the never-do list.

Tools/prim package additions: peer dep on @anthropic-ai/claude-agent-sdk (optional via peerDependenciesMeta), test script, README section.

.gitignore: .prim-cache/.

Drive-by: fix releases-socket-btm.test.mts mock to resolve undefined (not null), fallout from the prior null→undefined migration.
1 parent ec2d3d4 commit fc64033

14 files changed

Lines changed: 1986 additions & 49 deletions

Lines changed: 84 additions & 0 deletions
@@ -0,0 +1,84 @@
---
name: programmatic-claude-lockdown
description: Reference for locking down programmatic Claude invocations (the `claude` CLI in workflows/scripts, the `@anthropic-ai/claude-agent-sdk` `query()` in code). Loads on demand when writing or reviewing any callsite that runs Claude programmatically. Source: https://code.claude.com/docs/en/agent-sdk/permissions.
user-invocable: false
allowed-tools: Read, Grep, Glob
---

# Programmatic Claude lockdown

**Rule:** every programmatic Claude callsite sets four flags. Skip any one and a future edit can silently widen the surface.

## The four flags

| Layer | SDK option | CLI flag | What it does |
|---|---|---|---|
| Definition | `tools` | `--tools` | Base set the model is told about. Tools not listed are invisible: no `tool_use` block is possible. |
| Auto-approve | `allowedTools` | `--allowedTools` | Step 4. Listed tools run without invoking `canUseTool`. |
| Deny | `disallowedTools` | `--disallowedTools` | Step 2. Wins even against `bypassPermissions`. Defense-in-depth. |
| Mode | `permissionMode: 'dontAsk'` | `--permission-mode dontAsk` | Step 3. Unmatched tools are denied without falling through to a missing `canUseTool`. |

The official permission flow is: (1) hooks → (2) deny rules → (3) permission mode → (4) allow rules → (5) `canUseTool`. In `dontAsk` mode, step 5 is skipped and the call is denied. The doc states verbatim: *"`allowedTools` and `disallowedTools` ... control whether a tool call is approved, not whether the tool is available."* Availability is controlled by `tools`.
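
The evaluation order above can be sketched as a small pure function. This is only a simplified model for reasoning about the four flags, not the SDK's implementation: the `Policy` type, the `Outcome` labels, and `evaluate` are hypothetical names, hooks (step 1) are left out, rules are matched by exact tool name only, and the `bypassPermissions` branch is an assumption inferred from the table's "wins even against `bypassPermissions`" wording.

```typescript
// Simplified model of the permission flow; NOT the SDK's own code.
// Assumptions: hooks (step 1) omitted, exact-name matching only.
type Mode = 'default' | 'dontAsk' | 'bypassPermissions'
type Outcome = 'unavailable' | 'allow' | 'deny' | 'canUseTool'

interface Policy {
  tools: readonly string[]
  allowedTools: readonly string[]
  disallowedTools: readonly string[]
  permissionMode: Mode
}

function evaluate(tool: string, policy: Policy): Outcome {
  // Availability: a tool missing from `tools` is never offered to the model.
  if (!policy.tools.includes(tool)) return 'unavailable'
  // Step 2: deny rules are checked before the mode, so they always win.
  if (policy.disallowedTools.includes(tool)) return 'deny'
  // Step 3 (assumed): bypassPermissions approves whatever was not denied.
  if (policy.permissionMode === 'bypassPermissions') return 'allow'
  // Step 4: allow rules approve without consulting canUseTool.
  if (policy.allowedTools.includes(tool)) return 'allow'
  // Step 5: dontAsk denies here; other modes defer to canUseTool.
  return policy.permissionMode === 'dontAsk' ? 'deny' : 'canUseTool'
}

const readOnly: Policy = {
  tools: ['Read', 'Grep', 'Glob'],
  allowedTools: ['Read', 'Grep', 'Glob'],
  disallowedTools: ['Agent', 'Bash', 'Edit', 'NotebookEdit', 'Task', 'WebFetch', 'WebSearch', 'Write'],
  permissionMode: 'dontAsk',
}

// A future edit that widens `tools` is still caught by the other layers.
// 'SomeNewTool' is a made-up name used only for illustration.
const widened: Policy = { ...readOnly, tools: [...readOnly.tools, 'Bash', 'SomeNewTool'] }

console.log(evaluate('Read', readOnly)) // allow
console.log(evaluate('Bash', readOnly)) // unavailable
console.log(evaluate('Bash', widened)) // deny
console.log(evaluate('SomeNewTool', widened)) // deny
```

In this model, dropping any single flag removes one of the branches that turned the widened policy back into a denial, which is the reasoning behind setting all four.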

## Recipe: read-only agent (audit, classify, summarize)

```ts
import { query } from '@anthropic-ai/claude-agent-sdk'

const stream = query({
  prompt: '...',
  options: {
    // Definition: the only tools the model is told about.
    tools: ['Read', 'Grep', 'Glob'],
    // Auto-approve: listed tools run without invoking canUseTool.
    allowedTools: ['Read', 'Grep', 'Glob'],
    // Deny: defense-in-depth if `tools` is ever widened.
    disallowedTools: ['Agent', 'Bash', 'Edit', 'NotebookEdit', 'Task', 'WebFetch', 'WebSearch', 'Write'],
    // Mode: unmatched tools are denied instead of reaching canUseTool.
    permissionMode: 'dontAsk',
  },
})

// query() returns an async iterable of messages; iterate it to consume the run.
for await (const message of stream) {
  if (message.type === 'result') console.log(message.subtype)
}
```

CLI form for workflow YAML / shell scripts:

```bash
claude --print \
  --tools "Read" "Grep" "Glob" \
  --allowedTools "Read" "Grep" "Glob" \
  --disallowedTools "Agent" "Bash" "Edit" "NotebookEdit" "Task" "WebFetch" "WebSearch" "Write" \
  --permission-mode dontAsk \
  --model "$MODEL" \
  --max-turns 25 \
  "<prompt>"
```
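
When the CLI is launched from a Node script rather than a shell, the same invocation can be assembled as an argument array. The sketch below is illustrative only: `CliLockdown` and `buildClaudeArgs` are hypothetical names, the model string is a placeholder, and the only flags used are the ones shown in the CLI recipe above.

```typescript
// Hypothetical helper: build the argv for the read-only CLI recipe.
interface CliLockdown {
  tools: readonly string[]
  allowedTools: readonly string[]
  disallowedTools: readonly string[]
  model: string
  maxTurns: number
}

function buildClaudeArgs(policy: CliLockdown, prompt: string): string[] {
  return [
    '--print',
    '--tools', ...policy.tools,
    '--allowedTools', ...policy.allowedTools,
    '--disallowedTools', ...policy.disallowedTools,
    // The mode is hard-coded so a caller cannot loosen it by accident.
    '--permission-mode', 'dontAsk',
    '--model', policy.model,
    '--max-turns', String(policy.maxTurns),
    prompt,
  ]
}

const args = buildClaudeArgs(
  {
    tools: ['Read', 'Grep', 'Glob'],
    allowedTools: ['Read', 'Grep', 'Glob'],
    disallowedTools: ['Agent', 'Bash', 'Edit', 'NotebookEdit', 'Task', 'WebFetch', 'WebSearch', 'Write'],
    model: 'placeholder-model', // placeholder, not a real model id
    maxTurns: 25,
  },
  '<prompt>',
)

console.log(args.join(' '))
```

Passing `args` to `spawn('claude', args)` from `node:child_process` would hand each entry to the process as its own argument, so no shell quoting is involved.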

## Recipe: agent that needs Bash (e.g. `/updating`: pnpm + git + jq)

Narrow `Bash(...)` patterns surgically and block dangerous Bash patterns explicitly. Fleet rules: no `npx`/`pnpm dlx`/`yarn dlx`; no `curl`/`wget` exfil; no destructive `rm -rf`; no `sudo`. Build the deny list as shell variables so the npx/dlx denials can carry the `# zizmor:` exemption marker (the pre-commit `scanNpxDlx` hook treats those literal strings as the prohibited tools, not as exemptions, unless the line is tagged). Bash arrays keep entries that contain spaces, such as `Bash(rm -rf*)`, intact as single arguments; an unquoted scalar expansion would split them:

```bash
# Arrays, not space-joined strings: each pattern stays one argument.
DISALLOW_BASE=(Agent Task NotebookEdit WebFetch WebSearch 'Bash(curl:*)' 'Bash(wget:*)' 'Bash(rm -rf*)' 'Bash(sudo:*)')
DISALLOW_PKG_EXEC=('Bash(npx:*)' 'Bash(pnpm dlx:*)' 'Bash(yarn dlx:*)') # zizmor: documentation-prohibition
claude --print \
  --tools "Bash" "Read" "Write" "Edit" "Glob" "Grep" \
  --allowedTools "Bash(pnpm:*)" "Bash(git:*)" "Bash(jq:*)" "Read" "Write" "Edit" "Glob" "Grep" \
  --disallowedTools "${DISALLOW_BASE[@]}" "${DISALLOW_PKG_EXEC[@]}" \
  --permission-mode dontAsk \
  --model "$MODEL" --max-turns 25 \
  "<prompt>"
```
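
For an SDK callsite, the same recipe might be written as an options literal. This sketch assumes the SDK accepts the same rule strings as the CLI flags above; it is a plain object so it runs without the SDK installed. `baseTool` and `unreachableAllows` are hypothetical helpers that check one property of the recipe: every allow rule refers to a tool that `tools` actually exposes.

```typescript
// SDK-form sketch of the Bash recipe (assumes CLI rule strings carry over).
const bashOptions = {
  tools: ['Bash', 'Read', 'Write', 'Edit', 'Glob', 'Grep'],
  allowedTools: ['Bash(pnpm:*)', 'Bash(git:*)', 'Bash(jq:*)', 'Read', 'Write', 'Edit', 'Glob', 'Grep'],
  disallowedTools: [
    'Agent', 'Task', 'NotebookEdit', 'WebFetch', 'WebSearch',
    'Bash(curl:*)', 'Bash(wget:*)', 'Bash(rm -rf*)', 'Bash(sudo:*)',
    'Bash(npx:*)', 'Bash(pnpm dlx:*)', 'Bash(yarn dlx:*)',
  ],
  permissionMode: 'dontAsk',
} as const

// Hypothetical helper: strip a "(...)" pattern suffix to get the tool name.
function baseTool(rule: string): string {
  const paren = rule.indexOf('(')
  return paren === -1 ? rule : rule.slice(0, paren)
}

// Allow rules whose base tool is missing from `tools` can never fire.
function unreachableAllows(options: {
  tools: readonly string[]
  allowedTools: readonly string[]
}): string[] {
  return options.allowedTools.filter(rule => !options.tools.includes(baseTool(rule)))
}

console.log(unreachableAllows(bashOptions)) // []
```

A non-empty result would point at an allow rule that silently does nothing, which usually means `tools` and `allowedTools` have drifted apart.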

## Never

- ❌ `permissionMode: 'default'` in headless contexts: falls through to a missing `canUseTool`, so behavior is undefined.
- ❌ `permissionMode: 'bypassPermissions'` / `allowDangerouslySkipPermissions: true`.
- ❌ Omitting `tools`: the SDK default is the full claude_code preset.
- ❌ `Agent` / `Task` permitted: sub-agents inherit modes and can escape per-subagent restrictions when the parent is `bypassPermissions`/`acceptEdits`/`auto`.

## Reference implementation

`socket-lib/tools/prim/src/disambiguate.mts`: canonical SDK-form callsite. The file header documents each flag against the eval-flow step it enforces.

`socket-lib/tools/prim/test/disambiguate.test.mts`: source-text guards that fail the build if `BASE_TOOLS` widens, if `tools: BASE_TOOLS` is unwired, if `permissionMode` drifts from `'dontAsk'`, or if `bypassPermissions` / `allowDangerouslySkipPermissions: true` ever appears. Mirror this pattern in any new callsite.
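
A source-text guard of the kind described above might look roughly like this. It is a sketch, not the contents of `disambiguate.test.mts`: `lockdownViolations`, the regular expressions, and the assumption that `BASE_TOOLS` is declared as a single-quoted array literal are all illustrative.

```typescript
// Hypothetical guard: scan a callsite's source text for lockdown drift.
const EXPECTED_BASE_TOOLS = ['Glob', 'Grep', 'Read'] // kept sorted

function lockdownViolations(source: string): string[] {
  const problems: string[] = []

  // Assumes a declaration shaped like: const BASE_TOOLS = ['Read', ...]
  const match = /const BASE_TOOLS\s*=\s*\[([^\]]*)\]/.exec(source)
  const declared = (match ? match[1] : '')
    .split(',')
    .map(item => item.trim().replace(/^'|'$/g, ''))
    .filter(Boolean)
    .sort()
  if (declared.join(',') !== EXPECTED_BASE_TOOLS.join(',')) {
    problems.push('BASE_TOOLS differs from the read-only set')
  }
  if (!/\btools:\s*BASE_TOOLS\b/.test(source)) {
    problems.push('`tools: BASE_TOOLS` is not wired')
  }
  if (!/permissionMode:\s*'dontAsk'/.test(source)) {
    problems.push("permissionMode is not 'dontAsk'")
  }
  if (/bypassPermissions/.test(source)) {
    problems.push('bypassPermissions appears')
  }
  if (/allowDangerouslySkipPermissions:\s*true/.test(source)) {
    problems.push('allowDangerouslySkipPermissions: true appears')
  }
  return problems
}

const good = [
  "const BASE_TOOLS = ['Read', 'Grep', 'Glob']",
  "query({ prompt, options: { tools: BASE_TOOLS, permissionMode: 'dontAsk' } })",
].join('\n')

const bad = [
  "const BASE_TOOLS = ['Read', 'Grep', 'Glob', 'Bash']",
  "query({ prompt, options: { tools: BASE_TOOLS, permissionMode: 'default' } })",
].join('\n')

console.log(lockdownViolations(good)) // []
console.log(lockdownViolations(bad).length) // 2
```

In a test file, the `source` argument would come from reading the callsite with `readFileSync`, and the test would assert that the returned list is empty.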

## Existing fleet callsites

- `socket-registry/.github/workflows/weekly-update.yml`: two `claude --print` invocations (run `/updating` skill, fix test failures). Bash recipe above.
- `socket-lib/tools/prim/src/disambiguate.mts`: read-only recipe above (`query()` SDK form).

.gitignore

Lines changed: 6 additions & 0 deletions
@@ -23,6 +23,12 @@ node_modules/
# tools occasionally drop scratch dirs into a project-local .cache/.
# node_modules/.cache/ is the canonical home for tools we control.
**/.cache/

# `prim` audit/codemod's AI-disambiguation cache. Keyed on
# (method, receiver, snippet-hash) so verdicts persist across runs
# and re-runs are free; not portable across machines (verdicts
# depend on an account's API key budget) so it's gitignored.
.prim-cache/
npm-debug.log*
pnpm-debug.log*
yarn-error.log*
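
The `(method, receiver, snippet-hash)` key described in the comment could be derived along these lines. This is a guess at the shape only: `cacheKey`, the SHA-256 choice, the 16-character truncation, and the `|` separator are assumptions, not details taken from the `prim` source.

```typescript
import { createHash } from 'node:crypto'

// Hypothetical key derivation for a verdict cache entry.
function cacheKey(method: string, receiver: string, snippet: string): string {
  // Hash the snippet so the key stays short and filesystem-safe.
  const snippetHash = createHash('sha256').update(snippet).digest('hex').slice(0, 16)
  return [method, receiver, snippetHash].join('|')
}

const first = cacheKey('test', 'range', 'range.test(version)')
const again = cacheKey('test', 'range', 'range.test(version)')
const edited = cacheKey('test', 'range', 'range.test(otherVersion)')

console.log(first === again) // true
console.log(first === edited) // false
```

Because the key depends only on its three inputs, an unchanged callsite maps to the same entry on every run, which is what makes re-runs free.
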

CLAUDE.md

Lines changed: 1 addition & 0 deletions
@@ -146,6 +146,7 @@ See `docs/references/error-messages.md` for worked examples and anti-patterns.
- **minimumReleaseAge**: NEVER add packages to `minimumReleaseAgeExclude` in CI. Locally, ASK before adding — the age threshold is a security control.
- 🚨 **NEVER mention private repos or internal project names** in commits, PR titles/descriptions/comments, issues, release notes, or any public-surface text. Internal codenames, unreleased product names, internal tooling repo names not on the public org page, customer names, partner names — none belong in public surfaces. **Omit the reference entirely.** Don't substitute a placeholder ("an internal tool", "a downstream consumer", etc.) — the placeholder itself is a tell that something is being elided. Rewrite the sentence to not need the reference at all. The `.claude/hooks/private-name-guard` hook re-prints this rule on every public-surface `git`/`gh` command as a priming nudge; the rule applies even when the hook isn't installed.
- 🚨 **NEVER trigger Publish / Release / Provenance / Build-Release workflows** — no `gh workflow run`, `gh workflow dispatch`, or `gh api .../dispatches`. Workflow dispatches are irrevocable: Publish workflows push npm versions (unpublishable after 24h), Build/Release workflows pin GitHub releases by SHA, container workflows push immutable tags. Even build workflows with a `dry_run` input still treat the dispatch itself as the prod trigger. The user runs workflow_dispatch jobs manually after CI passes on the release commit + tag — Claude **never** dispatches them. If the user asks for a publish, tell them to run the command in their own terminal (or the GitHub Actions UI). The `.claude/hooks/release-workflow-guard` hook BLOCKS these commands; the rule applies even when the hook isn't installed.
- 🚨 **Programmatic Claude calls** (workflows, skills, scripts that invoke `claude` CLI or `@anthropic-ai/claude-agent-sdk`) MUST set all four lockdown flags: `--tools`/`tools`, `--allowedTools`/`allowedTools`, `--disallowedTools`/`disallowedTools`, and `--permission-mode dontAsk`/`permissionMode: 'dontAsk'`. NEVER `default` mode in headless contexts (falls through to a missing `canUseTool` → undefined behavior). NEVER `bypassPermissions`. See `.claude/skills/programmatic-claude-lockdown/SKILL.md` for the recipe + reference impl (`tools/prim/src/disambiguate.mts`).

## DOCUMENTATION POLICY