Add tools/captureToolSchemas.mjs + main-README listing for tool-schemas/ (follow-up to #24) by YiRaaaan · Pull Request #25 · Piebald-AI/claude-code-system-prompts

YiRaaaan · 2026-06-09T02:47:30Z

Follow-up to #24, as requested in this comment.

Two pieces; you can think of them as "refresh" and "surface":

1. `tools/captureToolSchemas.mjs` — one-command schema regeneration

node tools/captureToolSchemas.mjs

On any machine with the claude CLI installed, this regenerates the full tool-schemas/ directory in ~8s. No Anthropic account, no API key, no upstream call. The script runs a small intercept server on 127.0.0.1:4099, spawns claude -p "ok" four times, each with the environment that surfaces one gated tool set (default, --agent-teams, local-agent entrypoint, brief/KAIROS), captures tools[] from each request body, and replies with a stub 403 so that claude exits immediately without ever contacting Anthropic.
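The capture step described above can be sketched as a pure request handler. This is a minimal sketch, not the script's actual code: the function name `handleCapturedRequest` and the shape of the stub error body are assumptions made for illustration. Only the /v1/messages path, the tools[] field, and the 403 status come from the description.

```javascript
// Hypothetical helper: decides what the intercept server does with one request.
// It never forwards anything upstream; it only inspects the body it was sent.
function handleCapturedRequest(method, url, rawBody) {
  let tools = [];
  if (method === 'POST' && url.startsWith('/v1/messages')) {
    try {
      const parsed = JSON.parse(rawBody);
      if (parsed && Array.isArray(parsed.tools)) tools = parsed.tools;
    } catch {
      // Malformed JSON: capture nothing, but still answer with the stub below.
    }
  }
  // Assumed stub payload; the description only says "a stub 403".
  const body = JSON.stringify({
    type: 'error',
    error: { type: 'permission_error', message: 'stubbed by capture server' },
  });
  return { status: 403, body, tools };
}

// Example with a made-up request body that carries one tool definition.
const sample = JSON.stringify({
  tools: [{ name: 'Bash', input_schema: { type: 'object' } }],
});
const result = handleCapturedRequest('POST', '/v1/messages', sample);
console.log(result.status, result.tools.map((t) => t.name));
```

In a real server this function would sit inside an `http.createServer` callback bound to 127.0.0.1, writing `status` and `body` back to the client once the request body has been fully read.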

Output is byte-stable across runs (all object keys are lexically sorted) and across machines (no env-dependent code paths). StructuredOutput is intentionally skipped — its input_schema is supplied per-call by the workflow, see tool-schemas/README.md.
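Byte-stable output of this kind typically comes from sorting object keys recursively before serializing. A minimal sketch, assuming a helper named `sortKeysDeep` (hypothetical; the script's own function name and serialization settings are not shown in this PR description):

```javascript
// Hypothetical helper: returns a copy of `value` with every object's keys in sorted order.
// Arrays keep their element order, because order is meaningful there (e.g. `required`, `enum`).
function sortKeysDeep(value) {
  if (Array.isArray(value)) return value.map(sortKeysDeep);
  if (value !== null && typeof value === 'object') {
    const sorted = {};
    // The default sort() compares UTF-16 code units, so it does not depend on locale.
    for (const key of Object.keys(value).sort()) {
      sorted[key] = sortKeysDeep(value[key]);
    }
    return sorted;
  }
  return value;
}

// Two schemas that differ only in key order serialize to the same bytes.
const first = { type: 'object', properties: { b: { type: 'string' }, a: { type: 'number' } } };
const second = { properties: { a: { type: 'number' }, b: { type: 'string' } }, type: 'object' };
console.log(JSON.stringify(sortKeysDeep(first)) === JSON.stringify(sortKeysDeep(second))); // prints true
```

One caveat of this approach: JavaScript always enumerates integer-like keys first, so the result is only strictly lexical for non-numeric keys, which is the usual case for JSON Schema property names.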

Verified end-to-end on Claude Code v2.1.172 against the schemas committed to #24: 33/35 files reproduce byte-identical; the two deltas are real upstream changes (agent.json gained "fable" in its model enum; send-message.json gained a 200-char maxLength plus matching regex on summary). Independent fresh-clone smoke test passed: 35/35 valid JSON, ~8.3s wall clock.

2. `tools/updatePrompts.js` — main README listing

A small extension so the next time updatePrompts.js runs, the main README.md will include a ### Tool Schemas section listing every file in tool-schemas/, one bullet per schema, linking to the file.

The new code path is fully guarded by existsSync(TOOL_SCHEMAS_DIR): if tool-schemas/ isn't in the working tree (e.g. on a branch where #24 hasn't merged yet), the script behaves identically to before. The two PRs are therefore safe to land in either order.

What the new section looks like

After the existing ### Builtin Tool Descriptions section, the script will append:

### Tool Schemas

JSON `input_schema` for each builtin tool, captured verbatim from the Anthropic API payload. See [`tool-schemas/README.md`](./tool-schemas/README.md) for grouping by surface condition.

- [Tool Schema: Agent](./tool-schemas/agent.json)
- [Tool Schema: AskUserQuestion](./tool-schemas/ask-user-question.json)
- [Tool Schema: Bash](./tool-schemas/bash.json)
…
- [Tool Schema: Workflow](./tool-schemas/workflow.json)
- [Tool Schema: Write](./tool-schemas/write.json)
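For illustration, the section above could be produced by a small renderer like the following. This is a sketch under assumptions: `renderToolSchemasSection` is a hypothetical name, and the real code lives inside `updateReadme()` in whatever form the author chose. The heading, intro sentence, and bullet format are copied from the sample output.

```javascript
// Hypothetical renderer for the README section shown above.
// `schemas` is a list of { filename, displayName } objects.
function renderToolSchemasSection(schemas) {
  if (schemas.length === 0) return ''; // nothing to list, so emit no section at all
  const intro =
    'JSON `input_schema` for each builtin tool, captured verbatim from the Anthropic API payload. ' +
    'See [`tool-schemas/README.md`](./tool-schemas/README.md) for grouping by surface condition.';
  const bullets = schemas.map(
    (s) => `- [Tool Schema: ${s.displayName}](./tool-schemas/${s.filename})`
  );
  return ['### Tool Schemas', '', intro, '', ...bullets, ''].join('\n');
}

console.log(
  renderToolSchemasSection([
    { filename: 'agent.json', displayName: 'Agent' },
    { filename: 'bash.json', displayName: 'Bash' },
  ])
);
```

Returning an empty string for an empty list mirrors the stated guarantee that the README is unchanged when no schemas are present.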

Why a directory scan, not a JSON-file argument

The existing prompt flow is: tweakcc/promptExtractor.js → prompts-X.X.X.json → updatePrompts.js reads that JSON → writes system-prompts/*.md.

Schemas don't have a tweakcc-side extractor — per #24 they're verbatim wire captures committed directly, so the JSON files in tool-schemas/ are themselves the source of truth. The capture script writes there directly, and updatePrompts.js just lists what's present in the README. No intermediate JSON, no second source of truth.

Implementation notes

captureToolSchemas.mjs

- Single file, Node stdlib only (http, https, child_process, fs, path, url). No dependencies.
- toKebab() validates its output against /^[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$/ and throws on any name that wouldn't produce a safe slug (defense in depth against unexpected tool names; see the path-traversal thread on this PR).
- Configurable port via CAPTURE_PORT env (default 4099).
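The slug validation can be sketched as follows. The regular expression is the one quoted above; the PascalCase-to-kebab rule and the contents of the override table are assumptions for illustration, since the actual implementation is not reproduced in this description.

```javascript
// Assumed override table; the real NAME_TO_KEBAB_OVERRIDES contents are not shown in this PR text.
const NAME_TO_KEBAB_OVERRIDES = { LSP: 'lsp' };

// Slug pattern quoted in the implementation notes.
const SAFE_SLUG = /^[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$/;

// Converts a tool name such as "AskUserQuestion" into "ask-user-question",
// then refuses anything that is not a plain lowercase slug.
function toKebab(name) {
  const slug = Object.hasOwn(NAME_TO_KEBAB_OVERRIDES, name)
    ? NAME_TO_KEBAB_OVERRIDES[name]
    : String(name)
        .replace(/([a-z0-9])([A-Z])/g, '$1-$2') // hyphen at each lower-to-upper boundary
        .toLowerCase();
  if (!SAFE_SLUG.test(slug)) {
    throw new Error(`Unsafe tool name for a filename: ${JSON.stringify(name)}`);
  }
  return slug;
}

console.log(toKebab('AskUserQuestion')); // ask-user-question
console.log(toKebab('WebFetch')); // web-fetch

try {
  toKebab('../etc/passwd');
} catch (err) {
  console.log('rejected:', err.message);
}
```

Validating the derived slug rather than the raw name keeps the check in one place: whatever the conversion produces, only strings matching the whitelist pattern can reach the code that builds the output path.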

updatePrompts.js

- getToolSchemas() reads tool-schemas/ and returns [] if the directory is absent, otherwise a sorted list of {filename, displayName}.
- schemaFileToDisplayName() does kebab → PascalCase with one explicit override (lsp → LSP). New overrides go in SCHEMA_DISPLAY_NAME_OVERRIDES; the matching capture-side override is NAME_TO_KEBAB_OVERRIDES in captureToolSchemas.mjs.
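A minimal sketch of the display-name mapping, assuming the behavior described above. `listToolSchemas` is a hypothetical stand-in for getToolSchemas() that takes an array of filenames instead of reading the directory, so that the example runs without touching the filesystem.

```javascript
// Override table as described: kebab "lsp" should display as "LSP", not "Lsp".
const SCHEMA_DISPLAY_NAME_OVERRIDES = { lsp: 'LSP' };

// "ask-user-question.json" -> "AskUserQuestion"
function schemaFileToDisplayName(filename) {
  const base = filename.replace(/\.json$/, '');
  if (Object.hasOwn(SCHEMA_DISPLAY_NAME_OVERRIDES, base)) {
    return SCHEMA_DISPLAY_NAME_OVERRIDES[base];
  }
  return base
    .split('-')
    .map((part) => part.charAt(0).toUpperCase() + part.slice(1))
    .join('');
}

// Hypothetical, filesystem-free variant of getToolSchemas():
// keeps only *.json entries, sorts them, and attaches display names.
function listToolSchemas(filenames) {
  return filenames
    .filter((f) => f.endsWith('.json'))
    .sort()
    .map((filename) => ({ filename, displayName: schemaFileToDisplayName(filename) }));
}

console.log(listToolSchemas(['lsp.json', 'README.md', 'ask-user-question.json']));
```

Passing an empty array plays the role of the missing-directory case: the result is an empty list and the caller has nothing to render.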

What this does NOT do

- Does not version-stamp the schemas (no token-counting analog; schemas aren't text content).
- Does not maintain a separate CHANGELOG entry for schema changes: they're captured in the verbatim files themselves and surface naturally in git diffs.
- Does not modify tweakcc/promptExtractor.js. Per the design discussion in #24, the schema-capture path stays independent of the prompt-extraction pipeline.

Summary by CodeRabbit

New Features
- README now includes an auto-generated "Tool Schemas" section that lists available tool schemas when present, with human-friendly display names.
Chores
- Added a local capture utility that collects tool definitions across runs, consolidates them deterministically, and emits sorted per-tool schema files for documentation and discovery.

coderabbitai · 2026-06-09T02:47:41Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.


📝 Walkthrough

Adds a CLI (tools/captureToolSchemas.mjs) that runs a local stub server and captures per-tool input_schema from /v1/messages requests into tool-schemas/*.json, and updates tools/updatePrompts.js to discover those schemas and conditionally append a "Tool Schemas" section with links into README.md.

Changes

Tool Schemas

| Layer / File(s) | Summary |
| --- | --- |
| Capture CLI: parse, merge, write schemas `tools/captureToolSchemas.mjs` | Runs a local stub server, spawns the `claude` CLI across four capture passes, captures `tools[]` from the first `/v1/messages` POST, merges by first-seen name, removes `StructuredOutput`, maps names to kebab-case (with overrides), sorts JSON keys, and writes one deterministic `tool-schemas/<name>.json` per tool. |
| Schema discovery infrastructure `tools/updatePrompts.js` | Adds `TOOL_SCHEMAS_DIR` and a filename→display-name mapper (override `lsp`→`LSP`) plus `getToolSchemas()` to enumerate and sort `tool-schemas/*.json`, returning an empty list if the directory is absent. |
| README generation for schemas `tools/updatePrompts.js` | Extends `updateReadme()` to conditionally append a `### Tool Schemas` section with a brief description and a bullet list of links to each discovered schema file when schemas exist. |

Sequence Diagram(s)

sequenceDiagram
  participant ClaudeCLI as claude CLI
  participant StubServer as Stub HTTP Server
  participant CaptureScript as captureToolSchemas.mjs
  participant FileSystem as File System

  ClaudeCLI->>StubServer: POST /v1/messages (request body with tools[])
  StubServer->>CaptureScript: deliver parsed JSON (tools[])
  CaptureScript->>CaptureScript: merge tools across runs, kebab-case names, remove StructuredOutput
  CaptureScript->>FileSystem: write tool-schemas/<kebab-name>.json (sorted keys)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

#22 (Would JSON input_schema for builtin tools be in scope?): Adds tooling to generate and consume tool-schemas/*.json, which implements the requested JSON input_schema workflow.

"A rabbit found schemas in a row, 🐇
nibbling names that softly glow,
LSP and friends in tidy files,
README hums across the miles,
hopping schemas to and fro."

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 40.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely identifies the main changes: adding captureToolSchemas.mjs and updating the README to list tool schemas, with context that it follows PR `#24`. |


coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tools/captureToolSchemas.mjs`:
- Around line 58-64: The toKebab function (and its use with
NAME_TO_KEBAB_OVERRIDES when writing files into OUT_DIR) must sanitize tool
names to prevent path traversal; update toKebab to first strip any directory
components (e.g., via path.basename or by removing path separators), reject or
replace suspicious characters (.., /, \, null bytes), and constrain the output
to a safe whitelist pattern and length (e.g., only a-z0-9 and hyphens) before
returning; ensure callers that pass the kebab name into path.join(OUT_DIR, ...)
only receive this sanitized value so generated filenames cannot escape OUT_DIR.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6d098088-aa88-40b9-9289-d50b68b98086

📥 Commits

Reviewing files that changed from the base of the PR and between 281cf76 and e54a658.

📒 Files selected for processing (1)

tools/captureToolSchemas.mjs

Complements Piebald-AI#24, which adds the `tool-schemas/` directory. Per the PR discussion on Piebald-AI#24, the README listing should surface these alongside the existing prompt categories.

Behavior:
- `getToolSchemas()` scans `tool-schemas/*.json` alphabetically; returns empty when the directory is absent, so the patch is a no-op on repos that haven't merged the schemas yet (safe to land in either order).
- `schemaFileToDisplayName()` does kebab→PascalCase with one explicit override (`lsp` → `LSP`). Add more overrides as needed.
- A new `### Tool Schemas` section is appended after the existing Builtin Tool Descriptions section, with a one-line intro and one bullet per file linking to `./tool-schemas/<file>.json`.

No changes to existing categories, prompt extraction, token counting, or any other behavior. Only one new code path, guarded by directory existence.

Bundles a tiny HTTP intercept server with the four-run capture loop so that, on any machine with `claude -p` installed, running node tools/captureToolSchemas.mjs regenerates the full `tool-schemas/` directory in ~15s, with no external proxy, no API key, no Anthropic account, no upstream call at all.

How it works:
1. Server listens on 127.0.0.1:4099 (configurable via CAPTURE_PORT).
2. Spawns `claude -p "ok"` four times, each with the env that surfaces a different tool set (default, --agent-teams, local-agent entrypoint, brief / KAIROS).
3. On each spawn, claude POSTs to /v1/messages through ANTHROPIC_BASE_URL pointed at us. The server pulls `tools[]` out of the request body and replies with a stub 403; claude exits immediately without retrying, and we move on to the next env.
4. After all four runs, the unioned `tools[]` (first-seen wins per name) is sorted by tool name, each `input_schema` is recursively lexically key-sorted for byte-stable diffs, StructuredOutput is dropped (its schema is caller-supplied), and one file per tool is written under tool-schemas/.

Why a stub 403 instead of forwarding to api.anthropic.com: Anthropic's OAuth bearer tokens are bound to the TLS / connection profile of the proxy that established them. A fresh Node proxy forwarding identical headers gets `403 Request not allowed` consistently. So we don't forward: we capture the request body (which is what we actually want) and stub a fast-exit error response. Claude still emits the full tools[] before we reply.

Verified end-to-end on Claude Code v2.1.168: 33/35 output files are byte-identical to PR Piebald-AI#24's committed schemas; the 2 deltas (agent.json, …) reflect real-world wire changes since Piebald-AI#24 was captured. Two back-to-back runs produce byte-identical output.
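The merge in step 4 can be sketched as below. This is an illustration under assumptions rather than the committed code: `mergeToolRuns` is a hypothetical name, and each run is modeled simply as the `tools[]` array captured from that spawn.

```javascript
// Hypothetical merge: union tools across runs, keeping the first definition seen per name,
// dropping StructuredOutput, and returning the result sorted by tool name.
function mergeToolRuns(runs) {
  const byName = new Map();
  for (const tools of runs) {
    for (const tool of tools) {
      if (!byName.has(tool.name)) byName.set(tool.name, tool); // first-seen wins
    }
  }
  byName.delete('StructuredOutput'); // its schema is supplied by the caller, so it is skipped
  return [...byName.values()].sort((a, b) =>
    a.name < b.name ? -1 : a.name > b.name ? 1 : 0
  );
}

// Made-up runs: "Bash" appears twice, and only the first copy is kept.
const merged = mergeToolRuns([
  [{ name: 'Bash', run: 1 }, { name: 'StructuredOutput', run: 1 }],
  [{ name: 'Agent', run: 2 }, { name: 'Bash', run: 2 }],
]);
console.log(merged.map((t) => `${t.name} (run ${t.run})`));
```

Comparing names with `<` and `>` rather than `localeCompare` keeps the ordering independent of the machine's locale, which matters when the goal is identical output across machines.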

Two schemas changed since the initial capture against v2.1.168:
- agent.json: model enum gained "fable" (Fable 5 family).
- send-message.json: added length constraints, namely a 200-char maxLength on the top-level `summary` field and a matching ^[^\n\r]{1,200}$ regex on the same field inside the nested shutdown_request / status_update message variants.

Regenerated with tools/captureToolSchemas.mjs (PR Piebald-AI#25). No other schema bytes changed across the four runs.
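To make the new `summary` constraint concrete, the quoted pattern can be exercised directly. The helper name `isValidSummary` is hypothetical; only the regular expression comes from the commit message.

```javascript
// Pattern quoted in the commit message for the nested `summary` fields.
const SUMMARY_PATTERN = /^[^\n\r]{1,200}$/;

// Hypothetical helper: true when `summary` is 1 to 200 characters with no line breaks.
function isValidSummary(summary) {
  return typeof summary === 'string' && SUMMARY_PATTERN.test(summary);
}

console.log(isValidSummary('Build finished')); // true
console.log(isValidSummary('line one\nline two')); // false
console.log(isValidSummary('x'.repeat(201))); // false
```

Note that the pattern also rejects the empty string, which a bare maxLength of 200 would accept.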

Per @coderabbitai on Piebald-AI#25: even though tool names come from Claude Code's own bundle (and would be a supply-chain compromise on Anthropic's side before they're attacker-controlled), checking the derived slug against a strict /^[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$/ pattern is cheap and prevents the script from silently writing outside tool-schemas/ if Claude Code ever emits an unexpected character.

Verified: the six current canonical names (Bash, AskUserQuestion, LSP, WebFetch, CronCreate, SendUserMessage) all pass; ../etc/passwd, foo/bar, '..', and 'foo bar' all throw and abort the run.

coderabbitai Bot approved these changes Jun 9, 2026

bl-ue mentioned this pull request Jun 10, 2026

Add tool-schemas/ — JSON input_schema for 35 builtin tools (refs #22) #24

Open

coderabbitai Bot requested changes Jun 11, 2026

Comment thread tools/captureToolSchemas.mjs

YiRaaaan force-pushed the updatePrompts-tool-schemas branch from e54a658 to 20d5f47 Compare June 11, 2026 03:27

YiRaaaan added 2 commits June 11, 2026 11:31

YiRaaaan force-pushed the updatePrompts-tool-schemas branch from 20d5f47 to 41f0045 Compare June 11, 2026 03:31

coderabbitai Bot approved these changes Jun 11, 2026

YiRaaaan changed the title ~~Surface tool-schemas/ in the main README via updatePrompts.js (follow-up to #24)~~ Add tools/captureToolSchemas.mjs + main-README listing for tool-schemas/ (follow-up to #24) Jun 11, 2026
