Commit 8f9a8cd

feat: complete Phase 4.5 architectural hardening (AD-3 through AD-7)
Implement comprehensive architectural hardening completing Phase 4.5:

- **Schema Versioning (AD-3)**: Add schema_version to knowledge store, index manifest, chunks metadata, and vector metadata with migration/validation shims on load
- **Data Model Fixes (AD-4)**: Change metadata: dict[str, str] to dict[str, Any] across Entity, Relationship, and CodeChunk with mixed-type roundtrip coverage
- **Configuration Hardening (AD-5)**: Replace print() with logging; raise on invalid config in server/MCP contexts via strict mode; validate YAML schema with known key tracking
- **Service Layer Decomposition (AD-6)**: Extract RetrievalOrchestrator from KnowCodeService; add Protocol interfaces for EmbeddingProvider, VectorStore, KnowledgeStore
- **Entity Identity Resilience (AD-7)**: Add content_hash to entity metadata for rename-resilient correlation with backfill on load

Updates API and MCP server to use strict_config=True for production contexts. Adds comprehensive tests for migration shims, schema validation, and strict configuration behavior.

BREAKING CHANGE: metadata fields now support arbitrary types instead of only strings; legacy stores will be migrated on load
1 parent 983e6b4 commit 8f9a8cd
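
The commit message describes adding `schema_version` to persisted data with "migration/validation shims on load". The following is a minimal sketch of what such a shim could look like, not the code in this commit. The version numbers, the function name `migrate_store`, and the rule that an unversioned payload counts as version 1 are all assumptions made for illustration.

```python
from typing import Any

# Hypothetical current version; the commit message does not state the real number.
CURRENT_SCHEMA_VERSION = 2


def migrate_store(payload: dict[str, Any]) -> dict[str, Any]:
    """Validate a loaded store payload and upgrade it if it is older (sketch)."""
    # Assumption: files written before versioning existed carry no version field.
    version = payload.get("schema_version", 1)
    if not isinstance(version, int) or version < 1:
        raise ValueError(f"invalid schema_version: {version!r}")
    if version > CURRENT_SCHEMA_VERSION:
        # Refuse data written by a newer release instead of guessing its layout.
        raise ValueError(
            f"store schema_version {version} is newer than supported "
            f"version {CURRENT_SCHEMA_VERSION}"
        )
    migrated = dict(payload)  # shallow copy so the caller's dict is not mutated
    if version == 1:
        # Hypothetical v1 -> v2 step: stamp the version on a legacy payload.
        migrated["schema_version"] = 2
    return migrated
```

Failing on a newer-than-supported version, rather than loading it optimistically, is one way to keep a downgrade from silently corrupting data.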

32 files changed

Lines changed: 2072 additions & 280 deletions

.agent/rules/analysis-integrity.md

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+---
+trigger: always_on
+---
+
+# Analysis Integrity
+
+## Do not work backward from a desired conclusion
+
+Form conclusions from evidence. Do not decide what the recommendation should be and then select evidence to support it. If you notice yourself building a narrative, stop and ask whether the evidence actually leads there.
+
+## Do not ignore evidence you have already seen
+
+If you have read code that contradicts your current claim, incorporate it — do not skip it because it weakens your argument. Contradictory evidence is more important than confirming evidence.
+
+## Do not inflate problems or minimize existing solutions
+
+If the codebase already addresses the problem you are about to recommend solving, say so. "The current design already handles this" is a valid conclusion. Do not downplay existing mechanisms to make a proposed change seem more necessary.
+
+## Do not present uncertain claims as facts
+
+If you have not verified something, say "I have not verified this." Hedging is not a weakness — unearned confidence is. When you skip verification to sound more decisive, you trade correctness for tone.
Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+# Analysis Integrity
+
+## Do not work backward from a desired conclusion
+
+Form conclusions from evidence. Do not decide what the recommendation should be and then select evidence to support it. If you notice yourself building a narrative, stop and ask whether the evidence actually leads there.
+
+## Do not ignore evidence you have already seen
+
+If you have read code that contradicts your current claim, incorporate it — do not skip it because it weakens your argument. Contradictory evidence is more important than confirming evidence.
+
+## Do not inflate problems or minimize existing solutions
+
+If the codebase already addresses the problem you are about to recommend solving, say so. "The current design already handles this" is a valid conclusion. Do not downplay existing mechanisms to make a proposed change seem more necessary.
+
+## Do not present uncertain claims as facts
+
+If you have not verified something, say "I have not verified this." Hedging is not a weakness — unearned confidence is. When you skip verification to sound more decisive, you trade correctness for tone.

AGENTS.md

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+# Analysis Integrity
+
+## Do not work backward from a desired conclusion
+
+Form conclusions from evidence. Do not decide what the recommendation should be and then select evidence to support it. If you notice yourself building a narrative, stop and ask whether the evidence actually leads there.
+
+## Do not ignore evidence you have already seen
+
+If you have read code that contradicts your current claim, incorporate it — do not skip it because it weakens your argument. Contradictory evidence is more important than confirming evidence.
+
+## Do not inflate problems or minimize existing solutions
+
+If the codebase already addresses the problem you are about to recommend solving, say so. "The current design already handles this" is a valid conclusion. Do not downplay existing mechanisms to make a proposed change seem more necessary.
+
+## Do not present uncertain claims as facts
+
+If you have not verified something, say "I have not verified this." Hedging is not a weakness — unearned confidence is. When you skip verification to sound more decisive, you trade correctness for tone.

CLAUDE.md

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+# Analysis Integrity
+
+## Do not work backward from a desired conclusion
+
+Form conclusions from evidence. Do not decide what the recommendation should be and then select evidence to support it. If you notice yourself building a narrative, stop and ask whether the evidence actually leads there.
+
+## Do not ignore evidence you have already seen
+
+If you have read code that contradicts your current claim, incorporate it — do not skip it because it weakens your argument. Contradictory evidence is more important than confirming evidence.
+
+## Do not inflate problems or minimize existing solutions
+
+If the codebase already addresses the problem you are about to recommend solving, say so. "The current design already handles this" is a valid conclusion. Do not downplay existing mechanisms to make a proposed change seem more necessary.
+
+## Do not present uncertain claims as facts
+
+If you have not verified something, say "I have not verified this." Hedging is not a weakness — unearned confidence is. When you skip verification to sound more decisive, you trade correctness for tone.

GEMINI.md

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+# Analysis Integrity
+
+## Do not work backward from a desired conclusion
+
+Form conclusions from evidence. Do not decide what the recommendation should be and then select evidence to support it. If you notice yourself building a narrative, stop and ask whether the evidence actually leads there.
+
+## Do not ignore evidence you have already seen
+
+If you have read code that contradicts your current claim, incorporate it — do not skip it because it weakens your argument. Contradictory evidence is more important than confirming evidence.
+
+## Do not inflate problems or minimize existing solutions
+
+If the codebase already addresses the problem you are about to recommend solving, say so. "The current design already handles this" is a valid conclusion. Do not downplay existing mechanisms to make a proposed change seem more necessary.
+
+## Do not present uncertain claims as facts
+
+If you have not verified something, say "I have not verified this." Hedging is not a weakness — unearned confidence is. When you skip verification to sound more decisive, you trade correctness for tone.

KnowCode.md

Lines changed: 75 additions & 5 deletions
@@ -903,41 +903,91 @@ Commands invoked without the required extra should fail fast with: *"Install kno
 4. **[x] Token-Budgeted Context Synthesis (Layer 9)**: Priority-ordered sections with truncation handling.
 5. **[x] Service Layer**: Shared business logic for CLI and API.
 
+> **Questions you can now answer:**
+> - *"What files and folders make up this project?"*
+> - *"What are the main classes and functions in this codebase?"*
+> - *"Which function calls which other function?"*
+> - *"What does this module import, and who imports it?"*
+> - *"Which class inherits from which other class?"*
+> - *"Give me a summary of this codebase that fits within a size limit."*
+
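
Item 4 above describes token-budgeted context synthesis as priority-ordered sections with truncation handling. As a rough illustration of that idea only, the sketch below fills a budget in priority order and truncates the first section that does not fit. The function name, the `(priority, title, body)` tuple layout, and counting whitespace-separated words as tokens are all assumptions, not KnowCode's actual implementation.

```python
def synthesize_context(sections: list[tuple[int, str, str]], budget: int) -> str:
    """Assemble sections in priority order within a token budget (sketch).

    Each section is (priority, title, body); a lower priority number means
    more important. Tokens are approximated as whitespace-separated words,
    and only body words count against the budget.
    """
    parts: list[str] = []
    remaining = budget
    for _priority, title, body in sorted(sections, key=lambda s: s[0]):
        if remaining <= 0:
            break  # budget exhausted: drop all lower-priority sections
        words = body.split()
        if len(words) > remaining:
            # Keep what fits and mark the cut so the reader knows it is partial.
            text = " ".join(words[:remaining]) + " [truncated]"
            remaining = 0
        else:
            text = " ".join(words)
            remaining -= len(words)
        parts.append(f"## {title}\n{text}")
    return "\n\n".join(parts)
```

A real tokenizer would replace the word count; the priority ordering and the explicit truncation marker are the parts this sketch is meant to show.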
 ### **Phase 2: Intelligence Server & RAG (COMPLETED)**
 6. **[x] FastAPI Server (Layer 10)**: Health, stats, search, context, semantic query, reload, entity details, callers/callees.
 7. **[x] Semantic Search & Indexing (Layer 4a)**: Chunker (module header/imports/entities), config-driven embeddings (OpenAI or VoyageAI), FAISS vector store, hybrid BM25+vector retrieval (RRF), reranking, dependency expansion.
 8. **[x] Indexer Persistence + CLI**: `index`/`semantic-search` commands with save/load.
 9. **[x] Watch Mode**: Background indexer + filesystem monitor for incremental re-indexing.
 10. **[x] CLI Workflows**: `analyze`, `query`, `context`, `export`, `stats`, `server`, `history`, `ask`.
 
+> **Questions you can now answer:**
+> - *"Where in the code do we handle user authentication?"* (semantic search, not just keyword match)
+> - *"Find everything related to payment processing."*
+> - *"What code is most relevant to how we send emails?"*
+> - *"How big is this codebase — how many files, functions, and classes does it have?"*
+> - *"Show me the code that's related to this error message."*
+> - *"I just changed a file — is the search index still up to date?"* (watch mode keeps it fresh)
+
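
Item 7 above mentions hybrid BM25+vector retrieval fused with RRF (reciprocal rank fusion). The sketch below shows the general RRF technique: each document scores the sum of `1 / (k + rank)` over every ranking it appears in. The function name and the default `k = 60` (a commonly used constant) are assumptions; the value KnowCode uses is not stated here.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document ids into one list (sketch)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks start at 1, so the top hit contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25_hits = ["a", "b", "c"]    # hypothetical keyword-search ranking
vector_hits = ["b", "c", "d"]  # hypothetical embedding-search ranking
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
```

Because RRF uses only ranks, it needs no score normalisation between BM25 and cosine similarity, which is a common reason to choose it for hybrid retrieval.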
 ### **Phase 3: Temporal & Runtime Signals (COMPLETED)**
 11. **[x] Git History Ingestion (Temporal)**: Commit/author entities, authored/modified/changed_by relationships; surfaced via `--temporal` and `history`.
 12. **[x] Coverage Signals (Layer 5)**: Cobertura ingestion with coverage report entities and covers/executed_by relationships.
 
+> **Questions you can now answer:**
+> - *"Who last changed this file, and when?"*
+> - *"How often has this module been modified in the last six months?"*
+> - *"Which parts of the code have no test coverage?"*
+> - *"Is the code I'm about to change covered by tests?"*
+> - *"Who are the main contributors to this area of the codebase?"*
+> - *"Which files change together most often?"*
+
 ### **Phase 4: Documentation Synthesis (PARTIAL)**
 13. **[x] Markdown Export (MVP)**: CLI `export` produces an index-style Markdown doc (see `docs_test/index.md`).
 14. **[ ] Multi-Level Doc Synthesis (Layer 7)**: Architecture/module/function narratives, change summaries, and freshness tracking.
 
+> **Questions you can now answer:**
+> - *"Can I get a written overview of this codebase I can share with a new team member?"*
+>
+> **Questions the remaining work will unlock:**
+> - *"Give me a high-level architecture narrative for the whole system."*
+> - *"Write a summary of what changed in this module since last release."*
+> - *"Which parts of the documentation are stale and need updating?"*
+
 ### **Phase 4.5: Architectural Hardening (NEXT)** *(addresses AD-1 through AD-7)*
 15. **[x] Dependency Modularisation (AD-1)**: Move heavy dependencies behind optional extras (`server`, `search`, `llm`, `watch`, `all`). Core install stays lightweight.
 16. **[x] Side-Effect-Free Query Paths (AD-2)**: Remove auto-analyze/index from `retrieve_context_for_query()`. Fail fast with actionable errors. Add explicit `ensure_store()` / `ensure_index()` helpers.
-17. **[ ] Schema Versioning (AD-3)**: Add `schema_version` to knowledge store JSON and index metadata. Write migration shim for version validation on load.
-18. **[ ] Data Model Fixes (AD-4)**: Change `metadata: dict[str, str]` to `dict[str, Any]` across `Entity`, `Relationship`, and `CodeChunk`.
-19. **[ ] Configuration Hardening (AD-5)**: Replace `print()` with `logging`; raise on invalid config in server contexts; validate YAML schema.
-20. **[ ] Service Layer Decomposition (AD-6)**: Extract `RetrievalOrchestrator` from `KnowCodeService`. Define `Protocol` interfaces for `EmbeddingProvider`, `VectorStore`, `KnowledgeStoreProtocol`.
-21. **[ ] Entity Identity Resilience (AD-7)**: Add `content_hash` to entity metadata for rename-resilient correlation.
+17. **[x] Schema Versioning (AD-3)**: Add `schema_version` to knowledge store JSON, index manifest, chunks metadata, and vector metadata. Include migration/validation shims on load.
+18. **[x] Data Model Fixes (AD-4)**: Change `metadata: dict[str, str]` to `dict[str, Any]` across `Entity`, `Relationship`, and `CodeChunk` with mixed-type roundtrip coverage.
+19. **[x] Configuration Hardening (AD-5)**: Replace `print()` with `logging`; raise on invalid config in server/MCP contexts via strict mode; validate known YAML keys and warn on unknown keys.
+20. **[x] Service Layer Decomposition (AD-6)**: Extracted `RetrievalOrchestrator` from `KnowCodeService`. Added `Protocol` interfaces for `EmbeddingProvider`, `VectorStore`, and `KnowledgeStoreProtocol`.
+21. **[x] Entity Identity Resilience (AD-7)**: Add `content_hash` to entity metadata for rename-resilient correlation.
 22. **[ ] Layer Contract Tests**: Parser → `ParseResult` contract tests; store save/load roundtrip with schema version; retrieval golden-query tests; CLI smoke tests (Click runner); API endpoint contract tests (conditional on `server` extra).
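
Item 19 above (AD-5) describes logging instead of printing, raising on invalid config in strict mode, and warning on unknown keys. The sketch below is one possible shape for that behaviour, assuming the YAML has already been parsed into a Python object. The logger name, the `KNOWN_KEYS` set, `ConfigError`, and `validate_config` are hypothetical names chosen for illustration and are not taken from the commit.

```python
import logging
from typing import Any

logger = logging.getLogger("knowcode.config")  # logger name is an assumption

# Hypothetical set of recognised top-level keys.
KNOWN_KEYS = {"embedding", "vector_store", "reranker"}


class ConfigError(ValueError):
    """Raised for invalid configuration when strict mode is on (sketch)."""


def validate_config(raw: Any, strict: bool = False) -> dict[str, Any]:
    """Validate parsed YAML; strict mode raises instead of falling back."""
    if not isinstance(raw, dict):
        message = "config root must be a mapping"
        if strict:
            # Server/MCP contexts: fail loudly rather than run on defaults.
            raise ConfigError(message)
        logger.warning("%s; falling back to defaults", message)
        return {}
    # Unknown keys only warn, so a typo is visible without being fatal.
    for key in sorted(set(raw) - KNOWN_KEYS, key=str):
        logger.warning("unknown config key ignored: %s", key)
    return {key: value for key, value in raw.items() if key in KNOWN_KEYS}
```

In this sketch a CLI caller would pass `strict=False` and a long-running server would pass `strict=True`, so a bad file stops the server at startup.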

+> *This phase does not unlock new user-facing questions — it makes the existing answers more reliable, portable, and predictable. For example:*
+> - *"I upgraded KnowCode — will my existing analysis still work?"* (schema versioning)
+> - *"I renamed a file — does KnowCode still recognise the same functions?"* (entity identity resilience)
+> - *"Can I install KnowCode without all the heavy AI dependencies?"* (dependency modularisation)
+
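
The rename question above depends on AD-7's `content_hash`. The sketch below illustrates how a content hash could let entities be matched across a file rename. The normalisation rule (trimming trailing whitespace), the use of SHA-256, the dict-based entity shape, and the helper names are assumptions for illustration; the commit does not show how the hash is computed.

```python
import hashlib


def content_hash(source: str) -> str:
    """Hash entity source text, ignoring trailing whitespace (sketch)."""
    lines = [line.rstrip() for line in source.strip().splitlines()]
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()


def with_hash(entity_id: str, source: str) -> dict:
    """Build a hypothetical entity record that carries its content hash."""
    return {"id": entity_id, "metadata": {"content_hash": content_hash(source)}}


def correlate(old: list[dict], new: list[dict]) -> dict[str, str]:
    """Map each new entity id to the old id with the same content hash."""
    by_hash = {e["metadata"]["content_hash"]: e["id"] for e in old}
    return {
        e["id"]: by_hash[e["metadata"]["content_hash"]]
        for e in new
        if e["metadata"]["content_hash"] in by_hash
    }
```

A path-based id changes when a file moves, but the hash of an unchanged function body does not, which is what makes the correlation survive the rename in this sketch.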
 ### **Phase 5: Deep Analysis**
 23. **[ ] Static Behavioral Analysis (Layer 4)**: Data flow, state transitions, side-effect classification.
 24. **[ ] Intent Extraction (Layer 6)**: ADR/PR/commit intent linking beyond commit metadata.
 25. **[ ] Confidence Scoring (Layer 3)**: Weighted edges/entities by evidence source.
 
+> **Questions this will unlock:**
+> - *"Where does user input end up — does it ever reach the database unsanitised?"* (data flow)
+> - *"Does this function have side effects, or is it safe to call multiple times?"*
+> - *"What was the original reason this module was built this way?"* (intent from ADRs/PRs)
+> - *"How confident should I be in this answer — is it based on solid evidence or inference?"*
+> - *"If I change this variable, what downstream behaviour could break?"*
+
 ### **Phase 6: Enterprise (FUTURE)**
 26. **[ ] Security & RBAC**: Permissioned access and audit trails.
 27. **[ ] Scalability (AD-8)**: SQLite-backed storage for large monorepos; incremental graph loading; sharded indexes.
 28. **[ ] Team Sharing**: Remote knowledge store sync and collaboration.
 
+> **Questions this will unlock:**
+> - *"Can I share my codebase analysis with the rest of the team without everyone re-running it?"*
+> - *"Can I restrict who on the team can see the analysis of sensitive modules?"*
+> - *"Will this work on our monorepo with 500,000 files?"*
+> - *"Who on my team queried the knowledge store, and what did they ask?"*
+
 ### **Phase 7: Agentic Capabilities (COMPLETED v2.2)**
 29. **[x] Agent Architecture**: `Agent` class with configuration-driven model selection.
 30. **[x] Multi-Provider Support**: Google Gemini and OpenRouter/OpenAI integration.
@@ -946,6 +996,14 @@ Commands invoked without the required extra should fail fast with: *"Install kno
 33. **[x] Smart Answer**: Local-first answering with configurable sufficiency threshold.
 34. **[x] VoyageAI Reranking**: Cross-encoder reranking with signal-based fallback.
 
+> **Questions you can now answer:**
+> - *"Explain how the login flow works, step by step."*
+> - *"I'm getting this error — what's likely causing it and where should I look?"*
+> - *"How would I add a new API endpoint to this project?"*
+> - *"Review this function — anything look wrong or risky?"*
+> - *"Where exactly in the code does the app validate email addresses?"*
+> - *"Answer this from what you already know locally — don't call an external AI unless you have to."*
+
 ### **Phase 8: IDE Integration (COMPLETED v2.2)**
 35. **[x] MCP Server (Layer 10b)**: Tool exposure via STDIO for IDE agents.
 36. **[x] Core 4 Tools**: `search_codebase`, `get_entity_context`, `trace_calls`, `retrieve_context_for_query`.
@@ -954,11 +1012,23 @@ Commands invoked without the required extra should fail fast with: *"Install kno
 39. **[x] Multi-hop Queries**: `trace_calls(depth=N)` and `get_impact()` analysis.
 40. **[x] Structured Responses**: JSON with `task_type` and `sufficiency_score`.
 
+> **Questions you can now answer (directly from your IDE):**
+> - *"What does this function do and what calls it?"* (without leaving your editor)
+> - *"If I change this class, what else in the codebase might break?"*
+> - *"Trace the full call chain from this API endpoint down to the database."*
+> - *"Give my IDE agent the context it needs so it doesn't have to send everything to an expensive cloud model."*
+> - *"How confident is the system that it has enough local context to answer my question?"*
+
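
Item 39 above lists multi-hop queries via `trace_calls(depth=N)`. To illustrate what a depth-limited trace means, here is a breadth-first sketch over a plain adjacency mapping. The name `trace_calls_sketch`, the graph representation, and the returned hop counts are assumptions; the real tool's signature and response format are not shown in this diff.

```python
from collections import deque


def trace_calls_sketch(graph: dict[str, list[str]], start: str, depth: int) -> dict[str, int]:
    """Return callees reachable from `start` within `depth` hops (sketch).

    `graph` maps a function name to the functions it calls. The result maps
    each reachable function to its minimum hop distance from `start`.
    """
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] >= depth:
            continue  # do not expand beyond the requested depth
        for callee in graph.get(node, []):
            if callee not in distance:  # also guards against call cycles
                distance[callee] = distance[node] + 1
                queue.append(callee)
    del distance[start]  # report only the callees, not the starting point
    return distance


# Hypothetical call graph: an endpoint handler down to a database helper.
calls = {"handler": ["service"], "service": ["repo", "handler"], "repo": ["db"]}
two_hops = trace_calls_sketch(calls, "handler", depth=2)
```

An impact query would run the same traversal over the reversed graph (callers instead of callees).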
 ### **Supporting Tooling & QA (COMPLETED)**
 - **[x] Tests**: Unit/integration/e2e coverage for parsing, indexing, retrieval, API, CLI, storage, and analysis.
 - **[x] CI/CD**: Ruff linting, pytest + coverage, MkDocs build, and automated changelog generation (last-tag range + optional human summary input).
 - **[x] Evaluation Utilities**: Retrieval-quality evaluation script (`scripts/evaluate.py`).
 
+> **Questions you can now answer:**
+> - *"Are the tests passing, and how much of the code do they cover?"*
+> - *"Is the code style consistent and free of lint warnings?"*
+> - *"How good is KnowCode's own search quality — is it returning relevant results?"*
+
 ---
 
 ## **Primary Use-Cases**
