feat: complete Phase 4.5 architectural hardening (AD-3 through AD-7)
Implement comprehensive architectural hardening completing Phase 4.5:
- **Schema Versioning (AD-3)**: Add schema_version to knowledge store, index manifest, chunks metadata, and vector metadata with migration/validation shims on load
- **Data Model Fixes (AD-4)**: Change metadata: dict[str, str] to dict[str, Any] across Entity, Relationship, and CodeChunk with mixed-type roundtrip coverage
- **Configuration Hardening (AD-5)**: Replace print() with logging; raise on invalid config in server/MCP contexts via strict mode; validate YAML schema with known key tracking
- **Service Layer Decomposition (AD-6)**: Extract RetrievalOrchestrator from KnowCodeService; add Protocol interfaces for EmbeddingProvider, VectorStore, KnowledgeStore
- **Entity Identity Resilience (AD-7)**: Add content_hash to entity metadata for rename-resilient correlation with backfill on load
Updates API and MCP server to use strict_config=True for production contexts. Adds comprehensive tests for migration shims, schema validation, and strict configuration behavior.
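As an illustration of the strict-mode behaviour described above, here is a minimal sketch. The names `load_config`, `KNOWN_KEYS`, `ConfigError`, and the logger name are hypothetical and do not come from this commit; the function takes an already-parsed mapping rather than reading YAML so that it stays dependency-free.

```python
import logging
from typing import Any

logger = logging.getLogger("knowcode.config")  # logger name is an assumption

# Hypothetical set of recognised top-level keys; the real schema is not shown in the commit.
KNOWN_KEYS = {"models", "index", "retrieval"}


class ConfigError(ValueError):
    """Raised in strict mode when the configuration is invalid."""


def load_config(raw: Any, strict_config: bool = False) -> dict[str, Any]:
    """Validate an already-parsed config mapping.

    Unknown keys are logged as warnings in both modes. An invalid root
    raises in strict mode and falls back to an empty config otherwise.
    """
    if not isinstance(raw, dict):
        message = f"config root must be a mapping, got {type(raw).__name__}"
        if strict_config:
            raise ConfigError(message)
        logger.warning("%s; falling back to defaults", message)
        return {}

    # Warn about (and drop) keys that are not part of the known schema.
    for key in sorted(set(raw) - KNOWN_KEYS, key=str):
        logger.warning("unknown config key: %s", key)
    return {k: v for k, v in raw.items() if k in KNOWN_KEYS}
```

Under these assumptions, a server or MCP entry point would call `load_config(raw, strict_config=True)` so that a malformed file stops startup, while a CLI call could keep the lenient default.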
BREAKING CHANGE: metadata fields now support arbitrary types instead of only strings; legacy stores will be migrated on load
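The migrate-on-load behaviour mentioned above could look roughly like the following sketch. `migrate_store`, `CURRENT_SCHEMA_VERSION`, the version numbers, and the treatment of unversioned files as version 1 are all assumptions made for illustration; the commit does not show the actual shim.

```python
from typing import Any

CURRENT_SCHEMA_VERSION = 2  # hypothetical; the real version number is not given


def migrate_store(payload: dict[str, Any]) -> dict[str, Any]:
    """Return a shallow copy of a loaded store payload at the current schema version."""
    version = payload.get("schema_version", 1)  # assumption: unversioned files are v1
    if not isinstance(version, int) or version > CURRENT_SCHEMA_VERSION:
        # Refuse stores written by a newer (or corrupted) version instead of guessing.
        raise ValueError(f"unsupported schema_version: {version!r}")

    migrated = dict(payload)
    if version < 2:
        # v1 -> v2: string-only metadata is already a valid dict[str, Any],
        # so only the version stamp needs to change.
        migrated["schema_version"] = 2
    return migrated
```

Because widening `dict[str, str]` to `dict[str, Any]` accepts every legacy value unchanged, the sketch only has to stamp the version; a real shim may need to do more.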
Form conclusions from evidence. Do not decide what the recommendation should be and then select evidence to support it. If you notice yourself building a narrative, stop and ask whether the evidence actually leads there.
+
+## Do not ignore evidence you have already seen
+
+If you have read code that contradicts your current claim, incorporate it; do not skip it because it weakens your argument. Contradictory evidence is more important than confirming evidence.
+
+## Do not inflate problems or minimize existing solutions
+
+If the codebase already addresses the problem you are about to recommend solving, say so. "The current design already handles this" is a valid conclusion. Do not downplay existing mechanisms to make a proposed change seem more necessary.
+
+## Do not present uncertain claims as facts
+
+If you have not verified something, say "I have not verified this." Hedging is not a weakness; unearned confidence is. When you skip verification to sound more decisive, you trade correctness for tone.
11. **[x] Git History Ingestion (Temporal)**: Commit/author entities, authored/modified/changed_by relationships; surfaced via `--temporal` and `history`.
12. **[x] Coverage Signals (Layer 5)**: Cobertura ingestion with coverage report entities and covers/executed_by relationships.

+> **Questions you can now answer:**
+> - *"Who last changed this file, and when?"*
+> - *"How often has this module been modified in the last six months?"*
+> - *"Which parts of the code have no test coverage?"*
+> - *"Is the code I'm about to change covered by tests?"*
+> - *"Who are the main contributors to this area of the codebase?"*
16. **[x] Side-Effect-Free Query Paths (AD-2)**: Remove auto-analyze/index from `retrieve_context_for_query()`. Fail fast with actionable errors. Add explicit `ensure_store()` / `ensure_index()` helpers.
-17. **[ ] Schema Versioning (AD-3)**: Add `schema_version` to knowledge store JSON and index metadata. Write migration shim for version validation on load.
-18. **[ ] Data Model Fixes (AD-4)**: Change `metadata: dict[str, str]` to `dict[str, Any]` across `Entity`, `Relationship`, and `CodeChunk`.
-19. **[ ] Configuration Hardening (AD-5)**: Replace `print()` with `logging`; raise on invalid config in server contexts; validate YAML schema.
-20. **[ ] Service Layer Decomposition (AD-6)**: Extract `RetrievalOrchestrator` from `KnowCodeService`. Define `Protocol` interfaces for `EmbeddingProvider`, `VectorStore`, `KnowledgeStoreProtocol`.
-21. **[ ] Entity Identity Resilience (AD-7)**: Add `content_hash` to entity metadata for rename-resilient correlation.
+17. **[x] Schema Versioning (AD-3)**: Add `schema_version` to knowledge store JSON, index manifest, chunks metadata, and vector metadata. Include migration/validation shims on load.
+18. **[x] Data Model Fixes (AD-4)**: Change `metadata: dict[str, str]` to `dict[str, Any]` across `Entity`, `Relationship`, and `CodeChunk` with mixed-type roundtrip coverage.
+19. **[x] Configuration Hardening (AD-5)**: Replace `print()` with `logging`; raise on invalid config in server/MCP contexts via strict mode; validate known YAML keys and warn on unknown keys.
+20. **[x] Service Layer Decomposition (AD-6)**: Extracted `RetrievalOrchestrator` from `KnowCodeService`. Added `Protocol` interfaces for `EmbeddingProvider`, `VectorStore`, and `KnowledgeStoreProtocol`.
+21. **[x] Entity Identity Resilience (AD-7)**: Add `content_hash` to entity metadata for rename-resilient correlation.
22. **[ ] Layer Contract Tests**: Parser → `ParseResult` contract tests; store save/load roundtrip with schema version; retrieval golden-query tests; CLI smoke tests (Click runner); API endpoint contract tests (conditional on `server` extra).
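To make the service-layer decomposition in item 20 concrete, here is a minimal sketch of an orchestrator that depends only on `Protocol` interfaces. The class names `EmbeddingProvider`, `VectorStore`, and `RetrievalOrchestrator` come from the roadmap; every method name and signature (`embed`, `search`, `retrieve`) and both fake classes are assumptions for illustration, not the project's actual API.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class EmbeddingProvider(Protocol):
    # Hypothetical method; the real interface is not shown in the diff.
    def embed(self, texts: list[str]) -> list[list[float]]: ...


@runtime_checkable
class VectorStore(Protocol):
    # Hypothetical method; the real interface is not shown in the diff.
    def search(self, vector: list[float], top_k: int) -> list[str]: ...


class RetrievalOrchestrator:
    """Coordinates embedding and vector search; depends only on the protocols."""

    def __init__(self, embedder: EmbeddingProvider, vectors: VectorStore) -> None:
        self._embedder = embedder
        self._vectors = vectors

    def retrieve(self, query: str, top_k: int = 5) -> list[str]:
        [vector] = self._embedder.embed([query])  # one query in, one vector out
        return self._vectors.search(vector, top_k)


# Test doubles satisfy the protocols structurally, with no inheritance.
class FakeEmbedder:
    def embed(self, texts: list[str]) -> list[list[float]]:
        return [[float(len(t))] for t in texts]


class FakeVectors:
    def search(self, vector: list[float], top_k: int) -> list[str]:
        return [f"chunk-{i}" for i in range(top_k)]
```

One likely benefit of this shape is that the orchestrator can be tested with lightweight fakes such as `RetrievalOrchestrator(FakeEmbedder(), FakeVectors())`, without loading a real embedding model or index.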

+> *This phase does not unlock new user-facing questions; it makes the existing answers more reliable, portable, and predictable. For example:*
+> - *"I upgraded KnowCode — will my existing analysis still work?"* (schema versioning)
+> - *"I renamed a file — does KnowCode still recognise the same functions?"* (entity identity resilience)
+> - *"Can I install KnowCode without all the heavy AI dependencies?"* (dependency modularisation)
+
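For the rename question above (entity identity resilience), the following sketch shows one way a `content_hash` could correlate entities across a file rename. The hashing scheme (SHA-256 over whitespace-normalised source), the `correlate` helper, and the `path::name` ID format are assumptions for illustration; the diff does not show how KnowCode actually computes or uses the hash.

```python
import hashlib


def content_hash(source: str) -> str:
    """Hash an entity's source text, ignoring trailing whitespace and line endings."""
    normalised = "\n".join(line.rstrip() for line in source.splitlines())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def correlate(old: dict[str, str], new: dict[str, str]) -> dict[str, str]:
    """Map old entity IDs to new ones whose content hash matches.

    Both arguments map entity ID -> content hash. Hashes that occur more
    than once on the new side are skipped as ambiguous.
    """
    by_hash: dict[str, list[str]] = {}
    for entity_id, digest in new.items():
        by_hash.setdefault(digest, []).append(entity_id)
    return {
        old_id: by_hash[digest][0]
        for old_id, digest in old.items()
        if len(by_hash.get(digest, [])) == 1
    }
```

A hash of the body alone cannot tell apart two identical functions, which is why the sketch drops ambiguous matches instead of picking one.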
### **Phase 5: Deep Analysis**
23. **[ ] Static Behavioral Analysis (Layer 4)**: Data flow, state transitions, side-effect classification.