Skip to content

proposal: MINIMAL_ROUTING env flag + per-agent tools_available filter to reduce SessionStart routing-block context cost #641

@alex-hairexplained

Description

@alex-hairexplained

Summary

The current routing block emitted by hooks/routing-block.mjs (lines 18–79) injects ~700 tokens of XML as SessionStart additionalContext on every Claude Code session — including every background sub-agent spawn. For harnesses that orchestrate many agents per day, this becomes a meaningful tax on the context window, and a large fraction of that tax is paid by agents that have no ctx_* MCP tools available and therefore cannot act on the instructions.

This issue proposes two complementary mitigations that preserve current behaviour by default.

Quantified cost

Measured against a real multi-agent harness (one parent thread + 6–10 background spawns per session, several sessions per day):

  • Block size: ~700 tokens (75 lines of XML after frontmatter strip).
  • Per session arc (parent + ~6 spawns): ~7 × 700 = ~4.9K tokens.
  • Daily rate at typical harness throughput (10–15 spawns/day): ~7–10K tokens/day.
  • Cumulative across a 30-day sprint: ~210–300K tokens — roughly one full session's worth of context burned on identical boilerplate.

~70% of agent types receive the block as dead text

A survey of the agent types this harness routinely spawns shows the majority are loaded with Read / Edit / Write / Grep / Glob / Bash only — no MCP tools, and therefore no ctx_* surface to route to:

Agent type Has ctx_* MCP Block relevance
general-purpose, parent Claude Code thread, researcher Yes Relevant
caveman:cavecrew-builder / investigator / reviewer No Dead text
pr-review-toolkit:code-reviewer, silent-failure-hunter No Dead text
feature-dev:code-architect / explorer / reviewer No Dead text
Explore, Plan No Dead text

For these agents the block is 100% inert — they cannot call the tools it tells them to prefer, and the instructions are paid for in tokens but never consulted.

Proposed options

Either option is backward-compatible. Both gated behind an opt-in flag would also be acceptable.

Option A — MINIMAL_ROUTING=1 env flag returning a 12-line caveman version

A semantically-equivalent compressed form preserves every routing rule (ctx_batch_execute primary; ctx_search follow-ups; ctx_execute / ctx_execute_file for data not file writes; Bash forbidden >20-line output; WebFetch → ctx_fetch_and_index; Read/Edit/Write native; ctx_stats / doctor / upgrade / purge handlers; post-/clear KB-preserved hint; ≤500-word response cap with file-path artifact policy) in ~110 tokens — ~6.4× compression on this surface.

Opt-in via env: users who already trust their agents to follow the routing prefer the compact form; new users keep the verbose form for clarity.

Option B — per-agent tools_available filter

If the SessionStart hook payload exposes the active tool list (Claude Code does pass tool metadata to hooks in some channels), skip the block entirely when no ctx_* tools are loaded for that agent. If the hook surface does not expose tool availability, fall back to a hard-coded skip-list of agent types known not to load MCP tools (caveman, pr-review-toolkit, feature-dev, Explore, Plan), configurable via env.

Combining both (compression + skip-when-irrelevant): the same harness above would see token cost drop from ~7–10K/day to ~1–2K/day — ~6–9× reduction on this single surface, with zero loss of behaviour for agents that actually use the MCP tools.

Why this is filed instead of patched locally

Plugin code lives in the user's cache directory and is overwritten on every plugin upgrade; local edits are not durable. Hence this proposal.

Happy to draft a PR if either option is acceptable in principle — please indicate which (or both) you'd like, and any constraint on default behaviour.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions