feat(mcp): emit server-level instructions in initialize response

andreinknv · claude · andreinknv · commit 3e2436148675 · 2026-04-28T12:43:37.000-04:00
The MCP `initialize` response can include an `instructions` field that clients (Claude Code, Cursor, opencode, LangChain, OpenAI Agent SDK, etc.) surface in the agent's system prompt automatically. Today codegraph emits an empty initialize response — agents only see individual tool descriptions, no overall guidance on how to compose them. This adds the missing playbook: - **Tool selection by intent** — quick map from "what is X" / "how does X work" / "what would changing X break" to the right tool. - **Common chains** — onboarding (context first), PR review (review_context), refactor planning (search → callers → impact), debugging a regression. - **Tier discipline** — start at the cheap deterministic tier (search, context, callers, callees, impact, node, explore, files, status), escalate to conditional tools only when their data exists, and only reach for LLM-mediated tools when the cheap path doesn't suffice. - **Agent-bridge tier** — explicit recipe for projects without a local LLM where the agent itself summarizes via codegraph_pending_summaries + codegraph_save_summaries. - **Anti-patterns** — don't grep when search exists, don't chain search+node when context covers it, don't query the index immediately after a write. Lives in src/mcp/server-instructions.ts so it's easy to update without touching the JSON-RPC dispatch in src/mcp/index.ts. Single-file, no schema changes, no migrations, no test changes needed. References tools that exist on `main` today; doesn't presume any of the in-flight feature PRs (colbymchenry#110, colbymchenry#112-115, colbymchenry#111) have landed. After those merge, the relevant sections of this guidance start applying without needing a follow-up edit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
diff --git a/src/mcp/index.ts b/src/mcp/index.ts
@@ -20,6 +20,7 @@ import CodeGraph, { findNearestCodeGraphRoot } from '../index';
 import { StdioTransport, JsonRpcRequest, JsonRpcNotification, ErrorCodes } from './transport';
 import { ToolHandler } from './tools';
 import { getToolModule } from './tools/registry';
+import { SERVER_INSTRUCTIONS } from './server-instructions';
 
 /**
  * Convert a file:// URI to a filesystem path.
@@ -35,8 +36,10 @@ function fileUriToPath(uri: string): string {
     }
     return path.resolve(filePath);
   } catch {
-    // Fallback for non-standard URIs
-    return uri.replace(/^file:\/\/\/?/, '');
+    // Fallback for non-standard URIs — still resolve through path.resolve
+    // so a malformed `file:///../etc/passwd` is normalized rather than
+    // returned raw to downstream filesystem code.
+    return path.resolve(uri.replace(/^file:\/\/\/?/, ''));
   }
 }
 
@@ -269,13 +272,18 @@ export class MCPServer {
     // Try to initialize the default project (non-fatal if it fails)
     await this.tryInitializeDefault(projectPath);
 
-    // We accept the client's protocol version but respond with our supported version
+    // We accept the client's protocol version but respond with our supported version.
+    // `instructions` is a protocol-level field that MCP clients surface in the
+    // agent's system prompt, giving the agent a high-level playbook for the
+    // toolset before it sees individual tool descriptions. See
+    // ./server-instructions.ts.
     this.transport.sendResult(request.id, {
       protocolVersion: PROTOCOL_VERSION,
       capabilities: {
         tools: {},
       },
       serverInfo: SERVER_INFO,
+      instructions: SERVER_INSTRUCTIONS,
     });
   }
 
diff --git a/src/mcp/server-instructions.ts b/src/mcp/server-instructions.ts
@@ -0,0 +1,72 @@
+/**
+ * Server-level instructions emitted in the MCP `initialize` response.
+ *
+ * MCP clients (Claude Code, Cursor, opencode, LangChain, OpenAI Agent
+ * SDK, …) surface this text in the agent's system prompt automatically,
+ * giving the agent a high-level playbook for the codegraph toolset
+ * before it sees individual tool descriptions.
+ *
+ * Goals when editing this:
+ *   - Tool selection by intent (which tool for which question)
+ *   - Common chains (PR review = X then Y; refactor planning = A then B)
+ *   - Anti-patterns (don't grep when codegraph_search is faster)
+ *   - Tier discipline (cheap deterministic → conditional → LLM-mediated)
+ *
+ * Keep it tight. The agent reads this every session — long instructions
+ * burn tokens. Aim for under ~80 lines of useful guidance.
+ */
+export const SERVER_INSTRUCTIONS = `# Codegraph — code intelligence over an indexed knowledge graph
+
+Codegraph builds a SQLite knowledge graph of every symbol, edge, and
+file in the workspace. It is a structural reference manual the agent
+consults BEFORE writing or editing code, not a live linter that runs
+during generation. Reads are sub-millisecond; the index lags writes by
+about a second through the file watcher.
+
+## When to use which tool
+
+- **"What is the symbol named X?"** → \`codegraph_search\` (fast lookup)
+- **"What's the deal with this task / feature / bug?"** → \`codegraph_context\` (PRIMARY tool — composes 5+ smaller queries into one answer)
+- **"What calls this function?"** → \`codegraph_callers\`
+- **"What does this function call?"** → \`codegraph_callees\`
+- **"What would changing this break?"** → \`codegraph_impact\`
+- **"Show me this symbol's source / signature / docstring."** → \`codegraph_node\`
+- **"Survey an unfamiliar topic / pattern / module."** → \`codegraph_explore\` (heavier; best when budget allows)
+- **"What's in directory X?"** → \`codegraph_files\`
+- **"Is the index ready / what's its size?"** → \`codegraph_status\`
+
+## Common chains (run tools in sequence)
+
+- **Onboarding to a topic**: \`codegraph_context\` first. If still unclear, \`codegraph_explore\` for breadth, then \`codegraph_node\` on specific symbols you want code for.
+- **PR review**: if \`codegraph_review_context\` is available (PR #110), pass the unified diff to it — returns affected symbols + their callers + impact + co-change warnings in one call.
+- **Refactor planning**: \`codegraph_search\` to find the symbol, \`codegraph_callers\` to see what depends on it, \`codegraph_impact\` to see the blast radius.
+- **Debugging a regression**: \`codegraph_callers\` of the suspected symbol. If recent changes are in scope, look for hotspot tools (\`codegraph_hotspots\` if available) to identify churn × centrality risk.
+
+## Tool tiers (start cheap, escalate when needed)
+
+1. **Always available, deterministic, sub-millisecond**: search / context / callers / callees / impact / node / explore / files / status. Most tasks can be answered entirely at this tier.
+2. **Conditional on data availability**: \`codegraph_review_context\` needs a diff. \`codegraph_hotspots\`, \`codegraph_config\`, \`codegraph_sql\` need their respective indexed signals (git history, env-var read sites, SQL string-literals). All return clearly when data isn't present.
+3. **LLM-mediated, opt-in**: \`codegraph_ask\` (RAG Q&A), \`codegraph_similar\` (semantic search), \`codegraph_dead_code\` (graph + LLM judge), \`codegraph_role\` / \`codegraph_module\` (LLM classifications). These require a configured local LLM endpoint or the agent-bridge tier.
+
+## Agent-bridge tier (when no local LLM is configured)
+
+When LLM-mediated tools aren't available but the user wants summaries:
+1. Call \`codegraph_pending_summaries\` to pull a batch of symbols needing summaries (returns each symbol's body + content_hash).
+2. The agent (you) generate one-line summaries for each — action-verb leading, no "This function..." preamble, ≤200 chars.
+3. Call \`codegraph_save_summaries\` echoing each item's contentHash unchanged. Codegraph re-validates against current disk before persisting.
+
+This lets agents do LLM work themselves when no separate LLM endpoint exists.
+
+## Anti-patterns
+
+- **Don't grep first** when looking up a symbol by name — \`codegraph_search\` is faster and returns kind + location + signature.
+- **Don't call \`codegraph_search\` then \`codegraph_node\`** when you just want context — \`codegraph_context\` is one round-trip.
+- **Don't use \`codegraph_explore\` for narrow questions** — it's a multi-call deep dive, expensive in tokens. Save it for genuine "I'm new here" surveys.
+- **Don't query the index immediately after editing a file** — the watcher needs ~500ms to debounce + sync. Wait for the next turn.
+
+## Limitations
+
+- Index lags file writes by ~1 second (watcher debounce + sync).
+- Cross-file resolution is a best-effort name match; ambiguous calls return multiple candidates.
+- No live correctness validation — that's still the TypeScript compiler / test suite / linter's job. Codegraph supplements those with structural context they don't have.
+`;