Skip to content
266 changes: 266 additions & 0 deletions docs/tasks/PLAN_centralize_config.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,266 @@
# Plan: Centralize Hardcoded Configuration

> **Goal:** Eliminate magic numbers scattered across the codebase by routing all tunable parameters through the existing `.codegraphrc.json` config system (`DEFAULTS` in `src/infrastructure/config.js`).

## Problem

The config system already exists and handles deep-merge + env overrides, but ~50+ behavioral constants are hardcoded in individual modules and never read from config. Users cannot tune thresholds, depths, weights, or limits without editing source code.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Inconsistent magic-number count between plan and PR description

The problem statement (line 7) says "~50+ behavioral constants", but the PR description header claims "78+ hardcoded magic numbers". The two numbers differ significantly and neither is sourced/derived in the document.

For a plan whose explicit goal is completeness of inventory, documenting how the total is arrived at (e.g. counting individual values in B1 and B2 separately, counting each MCP_DEFAULTS entry as distinct) would help reviewers verify completeness and give future implementers confidence that the inventory is exhaustive.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — updated the problem statement to say "~70 individual behavioral constants (34 inventory entries expanding to ~70 discrete values when counting sub-keys in B1, B2, and E1)" with a derivation inline. PR description also updated to match.


---

## Inventory of Hardcoded Values

### Category A — Analysis Parameters (high user value)

| # | Value | File | Line | Controls |
|---|-------|------|------|----------|
| A1 | `maxDepth = 5` | `domain/analysis/impact.js` | 111 | `fn-impact` transitive caller depth |
| A2 | `maxDepth = 3` | `domain/analysis/impact.js` | 31, 144 | BFS default depth for impact/diff-impact |
| A3 | `maxDepth = 3` | `features/audit.js` | 102 | Audit blast-radius depth |
| A4 | `maxDepth = 3` | `features/check.js` | 220 | CI check blast-radius depth |
| A5 | `maxDepth = 10` | `features/sequence.js` | 91 | Sequence diagram traversal depth |
| A6 | `FALSE_POSITIVE_CALLER_THRESHOLD = 20` | `domain/analysis/module-map.js` | 37 | Generic function false-positive filter |
| A7 | `resolution = 1.0` | `graph/algorithms/louvain.js` | 17 | Louvain community detection granularity |
| A8 | `driftThreshold = 0.3` | `features/structure.js` | 581 | Structure cohesion drift warning |
| A9 | `maxCallers >= 10` | `domain/analysis/brief.js` | 40 | `brief` high-risk tier threshold |
| A10 | `maxCallers >= 3` | `domain/analysis/brief.js` | 41 | `brief` medium-risk tier threshold |
| A11 | `maxDepth = 5` | `domain/analysis/brief.js` | 50 | `brief` transitive caller BFS depth |
| A12 | `maxDepth = 5` | `domain/analysis/brief.js` | 76 | `brief` transitive importer BFS depth |

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Stale line numbers in brief.js inventory entries

All four brief.js entries point to incorrect lines. The actual locations (verified against current src/domain/analysis/brief.js) are:

Entry Plan says Actual line
A9 maxCallers >= 10 40 38
A10 maxCallers >= 3 41 39
A11 maxDepth = 5 (callers) 50 47
A12 maxDepth = 5 (importers) 76 73

Since this plan is primarily an implementation guide, each implementing PR will start by locating these constants. Off-by-3 line references add friction, especially for A11/A12 where both entries share the same value and file.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — corrected all four line numbers to match current brief.js: A9→38, A10→39, A11→47, A12→73.


### Category B — Risk & Scoring Weights (medium-high user value)

| # | Value | File | Line | Controls |
|---|-------|------|------|----------|
| B1 | `fanIn: 0.25, complexity: 0.3, churn: 0.2, role: 0.15, mi: 0.1` | `graph/classifiers/risk.js` | 10-14 | Risk score weighting |
| B2 | `core: 1.0, utility: 0.9, entry: 0.8, adapter: 0.5, leaf: 0.2, dead: 0.1` | `graph/classifiers/risk.js` | 21-27 | Role importance weights |
| B3 | `DEFAULT_ROLE_WEIGHT = 0.5` | `graph/classifiers/risk.js` | 30 | Fallback role weight |

### Category C — Search & Embedding (already partially in config)

| # | Value | File | Line | Controls |
|---|-------|------|------|----------|
| C1 | `limit = 15` | `domain/search/search/hybrid.js` | 12 | Hybrid search default limit |
| C2 | `rrfK = 60` | `domain/search/search/hybrid.js` | 13 | RRF fusion constant |
| C3 | `limit = 15` | `domain/search/search/semantic.js` | 12 | Semantic search default limit |
| C4 | `minScore = 0.2` | `domain/search/search/semantic.js` | 13, 52 | Minimum similarity threshold |
| C5 | `SIMILARITY_WARN_THRESHOLD = 0.85` | `domain/search/search/semantic.js` | 71 | Duplicate query warning |
Comment on lines +45 to +46

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 C5 SIMILARITY_WARN_THRESHOLD has no config key and no wiring phase

SIMILARITY_WARN_THRESHOLD = 0.85 at src/domain/search/search/semantic.js:71 is included in the Category C inventory (C5), but it is absent from the proposed DEFAULTS schema additions and from the Phase 4 file list. Phase 4 only says "wire existing config keys" — there is no existing config.search.similarityWarnThreshold key to wire, so C5 would remain hardcoded after all six phases complete, contrary to the plan's stated goal.

A new key should be added to the DEFAULTS design block and wired in Phase 4:

search: {
  defaultMinScore: 0.2,
  rrfK: 60,
  topK: 15,
  similarityWarnThreshold: 0.85,  // C5: duplicate-query warning in multiSearchData
},

Phase 4 should be updated to mention src/domain/search/search/semantic.js → replace SIMILARITY_WARN_THRESHOLD with config.search.similarityWarnThreshold.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — added search.similarityWarnThreshold: 0.85 to the proposed DEFAULTS schema, and updated Phase 4 to explicitly wire C5 by replacing the hardcoded SIMILARITY_WARN_THRESHOLD in semantic.js with config.search.similarityWarnThreshold.

| C6 | Batch sizes per model | `domain/search/models.js` | 66-75 | Embedding batch sizes |

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 C6 has no config key and no definitive phase assignment

C6 | Batch sizes per model | domain/search/models.js | 66-75 is listed in the Category C inventory as a hardcoded value, but:

  • It is absent from the proposed DEFAULTS schema additions (no search.batchSizes or similar key is designed).
  • Phase 4 mentions it only with hedging language: "batch sizes could be config-overridable per model" — no config key is specified, no file→key mapping is given.

This means C6 will remain hardcoded after all six phases complete, contrary to the plan's goal of eliminating all magic numbers from the inventory. Every other entry in Category C has a concrete key assigned.

There are two clean resolutions:

  1. Add a config key — e.g. search.batchSizeMap — and wire models.js in Phase 4 (more complex due to the per-model map shape).
  2. Move C6 to Category F (keep hardcoded) with an explicit rationale (e.g. batch sizes are model-specific implementation details rarely tuned by end-users, analogous to F3 watcher debounce).

Either way, the ambiguity should be resolved so Phase 4 implementers don't discover an undesigned item mid-PR.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — moved C6 (embedding batch sizes) to Category F with explicit rationale: model-specific implementation details rarely tuned by end-users, analogous to watcher debounce (F3). Removed ambiguous "could be" wording from Phase 4 and updated models.js entry to state batch sizes stay hardcoded.


### Category D — Display & Truncation (low-medium user value)

| # | Value | File | Line | Controls |
|---|-------|------|------|----------|
| D1 | `MAX_COL_WIDTH = 40` | `presentation/result-formatter.js` | 82 | Table column width |
| D2 | `50 lines` | `shared/file-utils.js` | 23 | Source context excerpt length |
| D3 | `100 chars` | `shared/file-utils.js` | 48, 63 | Summary/docstring truncation |
| D4 | `10 / 20 lines` | `shared/file-utils.js` | 36, 54 | JSDoc scan depth |

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 D4 conflates two distinct constants with different values into one key

The plan records D4 as 10 / 20 lines from file-utils.js lines 36 and 54 and maps both to display.jsdocScanLines: 10. These are two separate constants with different semantics:

  • Line 36Math.max(0, idx - 10): scans 10 lines upward from a definition to detect the end of a comment block (*/).
  • Line 54Math.max(0, jsdocEnd - 20): scans 20 lines upward from the block end to locate the /** opening.

If an implementer in Phase 5 wires config.display.jsdocScanLines to both locations, line 54's range silently shrinks from 20 → 10. For JSDoc blocks longer than 10 lines, extractSummary would fail to find the /** opener and return null where it previously returned a description.

Consider splitting into two keys:

display: {
  jsdocEndScanLines: 10,      // D4a: lines to scan for block-end marker
  jsdocOpenScanLines: 20,     // D4b: lines to scan for /** opening
  ...
}

Or, at minimum, note in Phase 5 that line 54 uses 20 (not 10) and should remain 20 or be wired to a separate key.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — split D4 into two distinct entries: D4a (jsdocEndScanLines: 10, line 36 — scans upward for */) and D4b (jsdocOpenScanLines: 20, line 54 — scans upward for /**). Both the inventory table and DEFAULTS schema now reflect the correct separate values. Phase 5 wiring explicitly lists both keys with their different defaults.

| D5 | `5 lines` | `shared/file-utils.js` | 76 | Multi-line signature gather |

### Category E — MCP Pagination (medium user value)

| # | Value | File | Line | Controls |
|---|-------|------|------|----------|
| E1 | `MCP_DEFAULTS` (22 entries) | `shared/paginate.js` | 9-34 | Per-tool default page sizes |
| E2 | `MCP_MAX_LIMIT = 1000` | `shared/paginate.js` | 37 | Hard abuse-prevention cap |

### Category F — Infrastructure (low user value, keep hardcoded)

| # | Value | File | Line | Controls |
|---|-------|------|------|----------|
| F1 | `CACHE_TTL_MS = 86400000` | `infrastructure/update-check.js` | 10 | Version check cache (24h) |
| F2 | `FETCH_TIMEOUT_MS = 3000` | `infrastructure/update-check.js` | 11 | Version check HTTP timeout |
| F3 | `debounce = 300` | `domain/graph/watcher.js` | 80 | File watcher debounce (ms) |
| F4 | `maxBuffer = 10MB` | `features/check.js` | 260 | Git diff buffer |
| F5 | `volume / 3000` | `features/complexity.js` | 85 | Halstead bugs formula (standard) |
| F6 | `timeout = 10_000` | `infrastructure/config.js` | 110 | apiKeyCommand timeout |

---

## Design

### Proposed `DEFAULTS` additions in `src/infrastructure/config.js`

```js
export const DEFAULTS = {
// ... existing fields ...

analysis: {
defaultDepth: 3, // A2: BFS depth for impact/diff-impact

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 analysis.defaultDepth name implies a global fallback it doesn't provide

The key analysis.defaultDepth: 3 is mapped exclusively to A2 ("BFS depth for impact/diff-impact" — impact.js lines 31, 144). However, every other per-feature depth already has its own specific key in the same section (auditDepth, sequenceDepth, fnImpactDepth, briefCallerDepth, briefImporterDepth) and check.depth lives under check. The name defaultDepth implies it's a catch-all fallback that governs all analysis commands, but setting it to 5 would only affect impact/diff-impact — not audit, check, or sequence.

A user who wants to raise blast-radius depth across the board would be surprised to find only impact affected.

Consider renaming to impactDepth for consistency with the rest of the section:

Suggested change
defaultDepth: 3, // A2: BFS depth for impact/diff-impact
impactDepth: 3, // A2: BFS depth for impact/diff-impact

The Phase 2 wiring instructions would need the same update (config.analysis.impactDepth instead of config.analysis.defaultDepth).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — renamed analysis.defaultDepth to analysis.impactDepth throughout the plan (DEFAULTS schema, Phase 2 wiring instructions, and check.js note). This is consistent with the other per-feature depth keys (auditDepth, sequenceDepth, fnImpactDepth, briefCallerDepth, etc.).

fnImpactDepth: 5, // A1: fn-impact transitive depth
auditDepth: 3, // A3: audit blast-radius depth
sequenceDepth: 10, // A5: sequence diagram depth
falsePositiveCallers: 20, // A6: generic function filter threshold
briefBfsDepth: 5, // A11/A12: brief command BFS depth (callers + importers)
briefHighRiskCallers: 10, // A9: brief high-risk tier threshold
briefMediumRiskCallers: 3, // A10: brief medium-risk tier threshold
Comment on lines +98 to +103

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 build.driftThreshold belongs in build, not community

The existing build.driftThreshold controls incremental-build health in src/domain/graph/builder/stages/finalize.js (line 52):

const driftThreshold = config.build?.driftThreshold ?? 0.2;
if (nodeDrift > driftThreshold || edgeDrift > driftThreshold) { ... }

This threshold asks "has the graph structure changed significantly between builds?" — it's a build-pipeline concern, not a community-detection concern. Moving it to community.driftThreshold creates a misleading namespace: users searching for how to tune incremental build warnings will not look under community.

structureDriftThreshold (A8) is also a poor fit for community: it controls cohesion in module boundary detection (moduleBoundariesData), not the Louvain resolution parameter.

Consider:

  • Keep build.driftThreshold exactly where it is (already in DEFAULTS, already wired — it needs no migration)
  • Use structure.cohesionThreshold (or add it under build) for A8
  • Reserve the community section purely for Louvain parameters (resolution)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — kept build.driftThreshold in the build namespace where it already lives (wired in finalize.js line 52). Moved A8 (structureDriftThreshold) to a new structure.cohesionThreshold key instead. The community section now contains only the Louvain resolution parameter. Removed all references to the deprecated alias since no migration is needed.

},

community: {
resolution: 1.0, // A7: Louvain resolution

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 briefBfsDepth conflates two independently meaningful BFS depths

A11 (countTransitiveCallers) and A12 (countTransitiveImporters) both default to maxDepth = 5 today, but they traverse the graph in opposite directions and have different performance profiles (callers go up the call graph; importers go up the import graph). Collapsing them into a single briefBfsDepth key prevents users from independently tuning one without the other.

Consider two separate keys:

briefCallerDepth: 5,    // A11: transitive caller BFS (countTransitiveCallers)
briefImporterDepth: 5,  // A12: transitive importer BFS (countTransitiveImporters)

If keeping a single key is preferred for simplicity, at minimum document this limitation in Phase 6.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — split briefBfsDepth into two independent keys: briefCallerDepth (A11, countTransitiveCallers) and briefImporterDepth (A12, countTransitiveImporters). Updated the DEFAULTS schema, Phase 2 wiring references, and brief.js config key references accordingly.

driftThreshold: 0.2, // existing (build.driftThreshold → move here)
structureDriftThreshold: 0.3, // A8: structure cohesion drift
},

risk: {
weights: { // B1
fanIn: 0.25,
complexity: 0.3,
churn: 0.2,
role: 0.15,
mi: 0.1,
},
roleWeights: { // B2
core: 1.0,
utility: 0.9,
entry: 0.8,
adapter: 0.5,
leaf: 0.2,
dead: 0.1,
},
defaultRoleWeight: 0.5, // B3
},
Comment on lines +117 to +134

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 mergeConfig is shallow — risk.weights partial overrides will silently drop un-specified keys

The existing mergeConfig in src/infrastructure/config.js (lines 123–138) only merges one level deep. For a top-level key like risk, it does { ...defaults['risk'], ...user['risk'] }. That's fine for flat properties like defaultRoleWeight, but when the user supplies a partial risk.weights override, the entire nested object is replaced, not merged:

// user config:
{ "risk": { "weights": { "complexity": 0.4, "churn": 0.1 } } }

// result after mergeConfig:
{ risk: { weights: { complexity: 0.4, churn: 0.1 },   // ← fanIn, role, mi are GONE
           roleWeights: { ... }, defaultRoleWeight: 0.5 } }

The dropped weights default to undefined, causing scoreRisk to compute 0.25 * normFanIn + undefined * ... — which silently produces NaN risk scores.

The Example .codegraphrc.json in the plan (lines 238–244) shows exactly this partial override pattern, which would be broken out of the box.

Phase 1 must include making mergeConfig recursive (at least 2 levels deep) before any of the nested sub-keys (risk.weights, risk.roleWeights) are wired. Without this fix, partial user overrides will silently corrupt risk scoring. Consider noting this as a hard blocker, not just a "may need to verify."

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — upgraded the mergeConfig note from "may need recursive merge" to a hard Phase 1 prerequisite. The plan now explicitly states that the current 1-level shallow merge will silently drop sibling keys on partial nested overrides, producing NaN risk scores, and that this must be fixed before any nested config keys are wired.


display: {
maxColWidth: 40, // D1
excerptLines: 50, // D2
summaryMaxChars: 100, // D3
jsdocScanLines: 10, // D4
signatureGatherLines: 5, // D5
},

mcp: {
defaults: { /* E1: current MCP_DEFAULTS object */ },
maxLimit: 1000, // E2
},
Comment on lines +151 to +154

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Making MCP_MAX_LIMIT user-configurable defeats its security purpose

MCP_MAX_LIMIT is explicitly documented in src/shared/paginate.js as a "Hard cap to prevent abuse via MCP." Routing it through .codegraphrc.json means a user — or any process with write access to the project directory — can raise it arbitrarily (e.g. "maxLimit": 999999). This would allow MCP clients to request unbounded result sets, defeating the abuse-prevention intent.

Unlike the other constants in the plan (thresholds, depths, weights), this cap is a server-side safety boundary, not a tuning knob. Consider keeping MCP_MAX_LIMIT hardcoded (move it to Category F), or at minimum require an operator-level mechanism (e.g. a CLI flag or env var) rather than the project config file.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — moved MCP_MAX_LIMIT to Category F (keep hardcoded). Removed it from the proposed DEFAULTS schema, updated the Phase 5 file list to note it stays hardcoded, and added it to the "What stays hardcoded" section with explanation: it's a server-side safety boundary, not a tuning knob.

};
```

### What stays hardcoded (Category F)

- **Halstead `volume / 3000`** — industry-standard formula, not a tuning knob
- **Git `maxBuffer`** — platform concern, not analysis behavior
- **`apiKeyCommand` timeout** — security boundary, not user-facing
- **Update check TTL/timeout** — implementation detail
- **Watcher debounce** — could be configurable later but low priority

---

## Implementation Plan

### Phase 1 — Extend DEFAULTS schema (1 PR)

**Files:** `src/infrastructure/config.js`, `tests/unit/config.test.js`

1. Add `analysis`, `community`, `risk`, `display`, `mcp` sections to `DEFAULTS`
2. Move `build.driftThreshold` → `community.driftThreshold` (keep `build.driftThreshold` as deprecated alias)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Phase 2's check.js wiring is ambiguous about which key takes precedence

The plan says check.js should read config.check.depth (already exists) and config.analysis.defaultDepth, but it doesn't specify the precedence relationship between them. Currently, check.js line 220 is:

const maxDepth = opts.depth || 3;

config.check.depth is already in DEFAULTS (check: { depth: 3 }) but the code ignores it in favour of the hardcoded 3. There are three ways to resolve this in Phase 2:

  1. Replace hardcoded 3 with config.check.depth (keeping check.depth as the sole authoritative key — cleanest)
  2. Replace with config.check.depth ?? config.analysis.defaultDepth (chain with new key as a global fallback)
  3. Use config.analysis.defaultDepth directly, making config.check.depth an override

Option 1 seems cleanest since check.depth already exists. Options 2 and 3 leave users unclear about which config key to set when they want to tune check's blast-radius. The plan should spell out the resolution order.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — clarified that config.check.depth is the sole authoritative key for check.js depth (already in DEFAULTS). The plan now explicitly says: replace hardcoded 3 with config.check.depth, do not chain with config.analysis.defaultDepth. This avoids ambiguous dual-key precedence.

3. Update `mergeConfig` to handle the new nested sections (already works for 1-level deep objects; verify 2-level `risk.weights` merges correctly — may need recursive merge)
4. Add tests: loading config with overrides for each new section

### Phase 2 — Wire analysis parameters (1 PR)

**Files to change:**
- `src/domain/analysis/impact.js` → read `config.analysis.defaultDepth` / `config.analysis.fnImpactDepth`
- `src/features/audit.js` → read `config.analysis.auditDepth`
- `src/features/check.js` → read `config.check.depth` (already exists) and `config.analysis.defaultDepth`
- `src/features/sequence.js` → read `config.analysis.sequenceDepth`
- `src/domain/analysis/module-map.js` → read `config.analysis.falsePositiveCallers`
- `src/domain/analysis/brief.js` → read `config.analysis.briefBfsDepth`, `config.analysis.briefHighRiskCallers`, `config.analysis.briefMediumRiskCallers` (PR #480)

**Pattern:** Each module calls `loadConfig()` (or receives config as a parameter). Replace the hardcoded value with `config.analysis.X ?? FALLBACK`. The fallback ensures backward compatibility if config is missing.

**Tests:** Update integration tests to verify custom config values flow through.

### Phase 3 — Wire risk & community parameters (1 PR)

**Files to change:**
- `src/graph/classifiers/risk.js` → read `config.risk.weights`, `config.risk.roleWeights`, `config.risk.defaultRoleWeight`
- `src/graph/algorithms/louvain.js` → accept `resolution` parameter, default from config
- `src/features/structure.js` → read `config.community.structureDriftThreshold`

**Pattern:** These modules don't currently receive config. Options:
1. **Preferred:** Accept an `options` parameter that callers populate from config
2. **Alternative:** Import `loadConfig` directly (adds coupling but simpler)

**Tests:** Unit tests for risk scoring with custom weights. Integration test for Louvain with custom resolution.

Comment on lines +207 to +208

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 C3 (semantic.js limit) missing from Phase 4 wiring

C3 (limit = 15 in semantic.js line 12) is listed in the inventory and the Phase 4 note acknowledges that config.search.topK already exists in DEFAULTS. However, the explicit wiring instruction for semantic.js only covers config.search.defaultMinScore and config.search.similarityWarnThreshold (C5) — config.search.topK is never mentioned for semantic.js.

After all six phases complete, C3 remains hardcoded, contrary to the plan's stated goal of eliminating every inventory item. An implementer following Phase 4 line-by-line would wire topK for hybrid.js (C1) but silently miss it for semantic.js (C3).

Suggested change
**Tests:** Unit tests for risk scoring with custom weights. Integration test for Louvain with custom resolution.
- `src/domain/search/search/hybrid.js` → read `config.search.rrfK`, `config.search.topK`
- `src/domain/search/search/semantic.js` → read `config.search.defaultMinScore`, `config.search.topK` (C3), and `config.search.similarityWarnThreshold` (C5, replaces hardcoded `SIMILARITY_WARN_THRESHOLD`)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — added config.search.topK (C3) to the semantic.js wiring instruction in Phase 4. The line now reads: read config.search.defaultMinScore, config.search.topK (C3), and config.search.similarityWarnThreshold (C5).

### Phase 4 — Wire search parameters (1 PR)

**Files to change:**
- `src/domain/search/search/hybrid.js` → read `config.search.rrfK`, `config.search.topK`
- `src/domain/search/search/semantic.js` → read `config.search.defaultMinScore`
- `src/domain/search/models.js` → batch sizes could be config-overridable per model

**Note:** `config.search` already exists with `defaultMinScore`, `rrfK`, `topK`. The modules just don't read from it — they duplicate the values. This phase wires the existing config keys.

### Phase 5 — Wire display & MCP parameters (1 PR)

**Files to change:**
- `src/presentation/result-formatter.js` → read `config.display.maxColWidth`
- `src/shared/file-utils.js` → read `config.display.excerptLines`, etc.
- `src/shared/paginate.js` → read `config.mcp.defaults`, `config.mcp.maxLimit`

**Consideration:** `file-utils.js` and `paginate.js` are low-level shared utilities. They shouldn't call `loadConfig()` directly. Instead, pass display/mcp settings down from callers, or use a module-level config cache set at startup.

### Phase 6 — Documentation & migration (1 PR)

1. Update `README.md` configuration section with the full schema
2. Add a `docs/configuration.md` reference with all keys, types, defaults, and descriptions

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 structure section missing from Phase 6 CLAUDE.md list

Phase 6, item 5 instructs adding a CLAUDE.md Configuration section documenting "the full list of configurable sections" — but the enumerated list (analysis, community, risk, display, mcp, search, check, coChange, manifesto) omits structure, despite structure.cohesionThreshold being a new section explicitly added in Phase 1 and wired in Phase 3. A user reading the CLAUDE.md reference would never discover that structure.cohesionThreshold is configurable.

Suggested change
2. Add a `docs/configuration.md` reference with all keys, types, defaults, and descriptions
- The full list of configurable sections (`analysis`, `community`, `structure`, `risk`, `display`, `mcp`, `search`, `check`, `coChange`, `manifesto`)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — added structure to the Phase 6 configurable sections list. It now reads: analysis, community, structure, risk, display, mcp, search, check, coChange, manifesto.

3. Document the deprecated `build.driftThreshold` alias
4. Add a JSON Schema file (`.codegraphrc.schema.json`) for IDE autocomplete

### Phase 7 — Update CLAUDE.md with configuration guidance (same PR as Phase 6)

Add a **Configuration** section to `CLAUDE.md` that documents:
1. The `.codegraphrc.json` config file and its location
2. The full list of configurable sections (`analysis`, `community`, `risk`, `display`, `mcp`, `search`, `check`, `coChange`, `manifesto`)
3. Key tunable parameters and their defaults (depth limits, risk weights, thresholds)
4. How `mergeConfig` works (partial overrides deep-merge with defaults)
5. Env var overrides (`CODEGRAPH_LLM_*`)
6. Guidance: when adding new behavioral constants, always add them to `DEFAULTS` in `config.js` and wire them through — never introduce new hardcoded magic numbers

---

## Migration & Backward Compatibility

- All new config keys have defaults matching current hardcoded values → **zero breaking changes**
- Existing `.codegraphrc.json` files continue to work unchanged
- `mergeConfig` deep-merges, so users only need to specify the keys they want to override
- The `build.driftThreshold` → `community.driftThreshold` move uses a deprecated alias

## Example `.codegraphrc.json` after this work

```json
{
"analysis": {
"fnImpactDepth": 8,
"falsePositiveCallers": 30
},
"risk": {
"weights": {
"complexity": 0.4,
"churn": 0.1
}
},
"community": {
"resolution": 1.5
},
"display": {
"maxColWidth": 60
}
}
```

---

## Estimated Scope

| Phase | Files changed | New tests | Risk |
|-------|--------------|-----------|------|
| 1 — Schema | 2 | 3-4 | Low |
| 2 — Analysis wiring | 6 | 4-5 | Low |
| 3 — Risk/community | 3 | 2-3 | Medium (parameter threading) |
| 4 — Search wiring | 3 | 2 | Low (config keys already exist) |
| 5 — Display/MCP | 3 | 2 | Medium (shared utility coupling) |
| 6 — Docs + CLAUDE.md | 4 | 0 | None |

**Total: ~20 files changed, 6 PRs, one concern per PR.**

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Phase count mismatch — plan body has 7 phases, scope table shows 6

The PR description calls this a "6-phase plan" and the scope table lists Phase 1–6, but the Implementation Plan section defines 7 phases (Phase 1–6, then a separate Phase 7 for CLAUDE.md updates).

Phase 7 is also not in the scope table, so there's no effort/file estimate for it. Please either:

  • Merge Phase 7 into Phase 6 (they share the same PR according to the plan text), or
  • Add Phase 7 to the scope table and update the PR description and plan title to say "7-phase plan"
Suggested change
**Total: ~20 files changed, 6 PRs, one concern per PR.**
**Total: ~20 files changed, 7 PRs, one concern per PR.**

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — merged Phase 7 into Phase 6 (they were already the same PR). The scope table now shows 6 phases consistently, matching the PR description and total line. Phase 6 description updated to include the CLAUDE.md configuration guidance as item 5.

Loading