# KnowCode Operationalization Priorities

**Date:** 2026-05-12
**Status:** Plan (not yet executed)
**Owner:** Solo
**Scope:** Make KnowCode trustworthy for daily use as the repository-context provider for AI coding agents.

## Context

KnowCode is intended to be invoked by the following agent CLIs/IDEs as the source of repository-level context:

- Claude Code
- Codex
- Hermes Agent CLI
- Gemini CLI
- Antigravity IDE

Target repository: whichever repo KnowCode is run from (solo, one repo at a time). Team features (RBAC, sharing) are out of scope.

The questions this plan is built to answer:

1. Is KnowCode ready for production use?
2. Is the MCP interface optimized to minimize token consumption (schema load in the consumer's context window)?
3. What gaps am I not yet aware of?

## Evidence baseline (already done — do not re-do)

- **Phase 4.5 architectural hardening AD-1 through AD-7** is complete (see `KnowCode.md` Phase 4.5 checklist; commits `983e6b4`, `8f9a8cd`).
- Of the 5 MCP token-reduction strategies in `docs/MCP_TOKEN_OVERHEAD_REDUCTION.md`:
  - Strategy 3 (lower defaults): partially done — `get_entity_context` default is now `max_tokens=2000`.
  - Strategy 4 (compact JSON): done — `json.dumps(separators=(',', ':'))` at `src/knowcode/mcp/server.py:344`.
  - Strategy 2 (stripped response): partially done — `verbosity="minimal"` default on `retrieve_context_for_query`.
  - Strategy 1 (consolidate to one tool): **NOT done** — 4 tools still injected per turn.
  - Strategy 5 (summaries over full source): **NOT done**.
- Item **#22 "Layer Contract Tests" is the only unchecked box** in Phase 4.5 of the roadmap.
- Per-agent rule files exist in the repo (`CLAUDE.md`, `AGENTS.md`, `GEMINI.md`, `.agent/rules/`, previously `.kilocode/rules/`) but are not maintained from one canonical source and are not installed by a single command.
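
For reference, Strategy 4's saving comes entirely from dropping the whitespace `json.dumps` emits by default after each separator. A minimal sketch of the before/after (the payload fields are placeholders, not KnowCode's real response schema):

```python
import json

# Illustrative payload; field names are placeholders.
payload = {"entity": "auth.login", "score": 0.91, "callers": ["api.routes", "cli.main"]}

default_form = json.dumps(payload)                         # ', ' and ': ' separators
compact_form = json.dumps(payload, separators=(',', ':'))  # no whitespace at all

# One byte saved per separator, on every response, every agent turn.
saved = len(default_form) - len(compact_form)
```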

## Priorities

### P1 — Production-readiness verification

**Problem it addresses:** "I don't know if KnowCode is ready for production."

**Deliverables:**

1. Implement the last open Phase 4.5 item (#22 "Layer Contract Tests"):
   - Parser → `ParseResult` contract tests.
   - Knowledge store save/load roundtrip with `schema_version` assertions.
   - Retrieval golden-query tests (fixed query → fixed top-K entity IDs, regression-guarded).
   - CLI smoke tests via `click.testing.CliRunner`.
   - API endpoint contract tests, conditional on the `server` extra.
2. Add a `knowcode doctor` CLI subcommand. Target wall-clock: under 10s. Checks:
   - Knowledge store present and `schema_version` recognized.
   - Index present, dimension matches configured embedding model.
   - Required API keys present for the providers enabled in `aimodels.yaml`.
   - MCP handshake works: spawn the MCP server, `list_tools`, call one tool, parse the response.
   - Configuration loads cleanly under `strict_config=True`.
   - Disk footprint under a configurable cap (warn, do not fail).
3. Treat `doctor` green as a precondition for landing P2–P5.
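
One possible shape for the `doctor` runner: each check is a named callable, and a crashing check is reported as a failure rather than aborting the run. A sketch under stated assumptions; the type names, the check shown, and the store path are illustrative, not KnowCode's actual API:

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Tuple

@dataclass
class CheckResult:
    name: str
    ok: bool
    detail: str = ""

def run_doctor(checks: list) -> list:
    """Run every check; a crashing check is a failed check, never an aborted run."""
    results = []
    for name, fn in checks:
        try:
            ok, detail = fn()
        except Exception as exc:
            ok, detail = False, f"{type(exc).__name__}: {exc}"
        results.append(CheckResult(name, ok, detail))
    return results

def store_present() -> Tuple[bool, str]:
    # Hypothetical store location; the real path depends on KnowCode's layout.
    p = Path(".knowcode/store.json")
    return p.exists(), str(p)
```

Keeping every check in-process and short is what makes the under-10s wall-clock target realistic.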

**Why first:** Without it, every subsequent change is built on uncertainty. Directly addresses the production-readiness question.

### P2 — Finish the MCP token diet

**Problem it addresses:** "I don't know whether the MCP interface is optimized for token consumption."

**Deliverables:**

1. **Strategy 1 — consolidate to one tool.** Merge `search_codebase`, `get_entity_context`, `trace_calls`, and `retrieve_context_for_query` into a single `knowcode` tool with an `action` enum (`search` | `context` | `trace` | `query`). Keep the four-tool surface available behind a deprecation flag for one release. Expected saving: roughly 400 tokens per agent turn, across every consumer CLI.
2. **Strategy 5 — summary-first responses.** Default `context_text` to signature + docstring + caller/callee IDs. Return full source only when `task_type in {debug, review}` or when the caller explicitly sets `include_source=true`. Expected reduction: `context_text` from ~3000 → ~500 tokens for exploratory queries.
3. Add a regression-guard fixture asserting that the byte size of each `action`'s response stays under fixed caps. Fail the test on creep.
4. Update `docs/MCP_TOKEN_OVERHEAD_REDUCTION.md` to mark Strategies 1 and 5 as shipped with measured before/after numbers.
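
A sketch of what the consolidated tool's declaration could look like. Only the tool name and the `action` enum come from this plan; every other field name and default is an assumption:

```python
# Illustrative MCP tool declaration for Strategy 1 (single `knowcode` tool).
# `action` values are from the plan; `target`, `include_source`, and
# `max_tokens` are assumed parameter names.
KNOWCODE_TOOL = {
    "name": "knowcode",
    "description": "Repository context: search, entity context, call traces, or full query.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "action": {"type": "string", "enum": ["search", "context", "trace", "query"]},
            "target": {"type": "string", "description": "Query text or entity ID."},
            "include_source": {"type": "boolean", "default": False},
            "max_tokens": {"type": "integer", "default": 2000},
        },
        "required": ["action", "target"],
    },
}
```

One schema injected per turn instead of four is where the roughly 400-token saving comes from.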

**Why second:** These are the last two unshipped items in the token-overhead doc I already authored. Together they close the gap to the ≈88% reduction target.
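
Deliverable 3's regression guard can be as small as one assertion loop over per-action byte caps; a sketch in which the cap values and the `call_knowcode` helper are placeholders:

```python
# Placeholder caps; tune to the measured post-Strategy-5 response sizes.
RESPONSE_BYTE_CAPS = {"search": 2000, "context": 4000, "trace": 2000, "query": 4000}

def check_response_sizes(call_knowcode) -> None:
    """`call_knowcode(action=..., target=...)` stands in for invoking the MCP tool."""
    for action, cap in RESPONSE_BYTE_CAPS.items():
        body = call_knowcode(action=action, target="auth.login")
        assert len(body.encode("utf-8")) <= cap, f"{action} response grew past {cap} bytes"
```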

### P3 — Unified five-consumer onboarding

**Problem it addresses:** Five agent CLIs, five different config locations, no single setup path, no end-to-end verification.

**Deliverables:**

1. `knowcode install-agent <claude-code|codex|hermes|gemini-cli|antigravity>` writes the correct MCP-server config (or equivalent) into the right location per agent, idempotently. Detect existing config and either skip or merge.
2. One canonical rules snippet at `.knowcode/agent-rules.md` describing when to call KnowCode versus `grep`/`Read`, recommended `action` per intent, and token-budget guidance. `install-agent` appends or `@include`s this snippet from each agent's native rules file (`CLAUDE.md`, `AGENTS.md`, `GEMINI.md`, etc.).
3. `knowcode doctor --agent <name>` verifies the named agent can actually invoke a tool end-to-end, not just that the config file exists on disk.
4. Document the exact config syntax and config-file path for each of the five consumers; flag any consumer whose syntax we have not yet verified (currently: Hermes Agent CLI).

**Why third:** Without it, "production" requires five manual setups per repo with divergent quirks. With it, one command per consumer.

### P4 — Freshness and lifecycle automation

**Problem it addresses:** Five agents querying a knowledge store that silently goes stale during normal dev.

**Deliverables:**

1. `.knowcode/freshness.json` tracks `store_commit_sha`, `index_commit_sha`, and last successful `analyze`/`index` timestamps.
2. MCP tool responses include a `freshness` field with `commits_behind_head` and `last_indexed_at`, so the consuming agent can flag potentially stale answers in its response to the user.
3. `knowcode install-hooks` installs a git `post-commit` hook that enqueues an incremental `analyze` + `index` run, backgrounded.
4. `knowcode server --watch --daemon` plus launchd (macOS) and systemd (Linux) unit-file templates for solo always-on mode.

**Why fourth:** Once five agents start hitting the store, stale data poisons answers without any visible failure mode. Freshness must be in the response payload, not a thing the developer remembers to rebuild.

### P5 — Usage observability loop

**Problem it addresses:** "What else don't I know?" Closing the loop on whether KnowCode is actually being used and is actually earning its keep.

**Deliverables:**

1. Append every MCP tool call to `.knowcode/telemetry.jsonl`, with: action, latency, response token estimate, `sufficiency_score`, `commits_behind_head` at call time, and the agent identifier when discoverable from the request context.
2. `knowcode stats --usage [--since 7d]` summarizes: calls per day, mean sufficiency score, local-answer rate (calls with score >= threshold), stale-response count, per-agent breakdown.
3. Weekly self-report surface (CLI, not a dashboard) so usage is visible at a glance.

**Why last:** Nothing to observe until P1–P4 ship. Once they have, this is what turns "in production" into "trusted in production."

## Sequencing

- **P1 and P2 can run in parallel** — different subsystems, no shared files.
- **P3 depends on P2.** The rules snippet, install commands, and docs in P3 should refer to the consolidated single-tool surface, not the legacy four-tool interface, to avoid rework.
- **P4 and P5 can run in parallel** after P3.

## Open verification items

Before starting execution, I have not yet verified the following. Each could affect scope:

1. Actual current test coverage percentage and which subsystems are weakest.
2. Whether any partial implementation of `knowcode doctor` already exists.
3. The exact MCP configuration syntax and file path for Hermes Agent CLI.
4. Whether the `apps/agent-gateway/` OpenAPI-to-tool path should be promoted as an alternative for any of the five consumers, or kept as a separate power-user option.

## Non-goals (explicitly out of scope here)

- Team features: RBAC, shared remote stores, audit logging (Phase 6).
- Phase 5 deep analysis: data flow, intent extraction, confidence scoring.
- Phase 4 multi-level documentation synthesis.
- SQLite-backed storage for large monorepos (AD-8).