|
| 1 | +# Plan: Centralize Hardcoded Configuration |
| 2 | + |
| 3 | +> **Goal:** Eliminate magic numbers scattered across the codebase by routing all tunable parameters through the existing `.codegraphrc.json` config system (`DEFAULTS` in `src/infrastructure/config.js`). |
| 4 | +
|
| 5 | +## Problem |
| 6 | + |
| 7 | +The config system already exists and handles deep-merge + env overrides, but ~50+ behavioral constants are hardcoded in individual modules and never read from config. Users cannot tune thresholds, depths, weights, or limits without editing source code. |
| 8 | + |
| 9 | +--- |
| 10 | + |
| 11 | +## Inventory of Hardcoded Values |
| 12 | + |
| 13 | +### Category A — Analysis Parameters (high user value) |
| 14 | + |
| 15 | +| # | Value | File | Line | Controls | |
| 16 | +|---|-------|------|------|----------| |
| 17 | +| A1 | `maxDepth = 5` | `domain/analysis/impact.js` | 111 | `fn-impact` transitive caller depth | |
| 18 | +| A2 | `maxDepth = 3` | `domain/analysis/impact.js` | 31, 144 | BFS default depth for impact/diff-impact | |
| 19 | +| A3 | `maxDepth = 3` | `features/audit.js` | 102 | Audit blast-radius depth | |
| 20 | +| A4 | `maxDepth = 3` | `features/check.js` | 220 | CI check blast-radius depth | |
| 21 | +| A5 | `maxDepth = 10` | `features/sequence.js` | 91 | Sequence diagram traversal depth | |
| 22 | +| A6 | `FALSE_POSITIVE_CALLER_THRESHOLD = 20` | `domain/analysis/module-map.js` | 37 | Generic function false-positive filter | |
| 23 | +| A7 | `resolution = 1.0` | `graph/algorithms/louvain.js` | 17 | Louvain community detection granularity | |
| 24 | +| A8 | `driftThreshold = 0.3` | `features/structure.js` | 581 | Structure cohesion drift warning | |
| 25 | +| A9 | `maxCallers >= 10` | `domain/analysis/brief.js` | 40 | `brief` high-risk tier threshold | |
| 26 | +| A10 | `maxCallers >= 3` | `domain/analysis/brief.js` | 41 | `brief` medium-risk tier threshold | |
| 27 | +| A11 | `maxDepth = 5` | `domain/analysis/brief.js` | 50 | `brief` transitive caller BFS depth | |
| 28 | +| A12 | `maxDepth = 5` | `domain/analysis/brief.js` | 76 | `brief` transitive importer BFS depth | |
| 29 | + |
| 30 | +### Category B — Risk & Scoring Weights (medium-high user value) |
| 31 | + |
| 32 | +| # | Value | File | Line | Controls | |
| 33 | +|---|-------|------|------|----------| |
| 34 | +| B1 | `fanIn: 0.25, complexity: 0.3, churn: 0.2, role: 0.15, mi: 0.1` | `graph/classifiers/risk.js` | 10-14 | Risk score weighting | |
| 35 | +| B2 | `core: 1.0, utility: 0.9, entry: 0.8, adapter: 0.5, leaf: 0.2, dead: 0.1` | `graph/classifiers/risk.js` | 21-27 | Role importance weights | |
| 36 | +| B3 | `DEFAULT_ROLE_WEIGHT = 0.5` | `graph/classifiers/risk.js` | 30 | Fallback role weight | |
| 37 | + |
| 38 | +### Category C — Search & Embedding (already partially in config) |
| 39 | + |
| 40 | +| # | Value | File | Line | Controls | |
| 41 | +|---|-------|------|------|----------| |
| 42 | +| C1 | `limit = 15` | `domain/search/search/hybrid.js` | 12 | Hybrid search default limit | |
| 43 | +| C2 | `rrfK = 60` | `domain/search/search/hybrid.js` | 13 | RRF fusion constant | |
| 44 | +| C3 | `limit = 15` | `domain/search/search/semantic.js` | 12 | Semantic search default limit | |
| 45 | +| C4 | `minScore = 0.2` | `domain/search/search/semantic.js` | 13, 52 | Minimum similarity threshold | |
| 46 | +| C5 | `SIMILARITY_WARN_THRESHOLD = 0.85` | `domain/search/search/semantic.js` | 71 | Duplicate query warning | |
| 47 | +| C6 | Batch sizes per model | `domain/search/models.js` | 66-75 | Embedding batch sizes | |
| 48 | + |
| 49 | +### Category D — Display & Truncation (low-medium user value) |
| 50 | + |
| 51 | +| # | Value | File | Line | Controls | |
| 52 | +|---|-------|------|------|----------| |
| 53 | +| D1 | `MAX_COL_WIDTH = 40` | `presentation/result-formatter.js` | 82 | Table column width | |
| 54 | +| D2 | `50 lines` | `shared/file-utils.js` | 23 | Source context excerpt length | |
| 55 | +| D3 | `100 chars` | `shared/file-utils.js` | 48, 63 | Summary/docstring truncation | |
| 56 | +| D4 | `10 / 20 lines` | `shared/file-utils.js` | 36, 54 | JSDoc scan depth | |
| 57 | +| D5 | `5 lines` | `shared/file-utils.js` | 76 | Multi-line signature gather | |
| 58 | + |
| 59 | +### Category E — MCP Pagination (medium user value) |
| 60 | + |
| 61 | +| # | Value | File | Line | Controls | |
| 62 | +|---|-------|------|------|----------| |
| 63 | +| E1 | `MCP_DEFAULTS` (22 entries) | `shared/paginate.js` | 9-34 | Per-tool default page sizes | |
| 64 | +| E2 | `MCP_MAX_LIMIT = 1000` | `shared/paginate.js` | 37 | Hard abuse-prevention cap | |
| 65 | + |
| 66 | +### Category F — Infrastructure (low user value, keep hardcoded) |
| 67 | + |
| 68 | +| # | Value | File | Line | Controls | |
| 69 | +|---|-------|------|------|----------| |
| 70 | +| F1 | `CACHE_TTL_MS = 86400000` | `infrastructure/update-check.js` | 10 | Version check cache (24h) | |
| 71 | +| F2 | `FETCH_TIMEOUT_MS = 3000` | `infrastructure/update-check.js` | 11 | Version check HTTP timeout | |
| 72 | +| F3 | `debounce = 300` | `domain/graph/watcher.js` | 80 | File watcher debounce (ms) | |
| 73 | +| F4 | `maxBuffer = 10MB` | `features/check.js` | 260 | Git diff buffer | |
| 74 | +| F5 | `volume / 3000` | `features/complexity.js` | 85 | Halstead bugs formula (standard) | |
| 75 | +| F6 | `timeout = 10_000` | `infrastructure/config.js` | 110 | apiKeyCommand timeout | |
| 76 | + |
| 77 | +--- |
| 78 | + |
| 79 | +## Design |
| 80 | + |
| 81 | +### Proposed `DEFAULTS` additions in `src/infrastructure/config.js` |
| 82 | + |
| 83 | +```js |
| 84 | +export const DEFAULTS = { |
| 85 | + // ... existing fields ... |
| 86 | + |
| 87 | + analysis: { |
| 88 | + defaultDepth: 3, // A2: BFS depth for impact/diff-impact |
| 89 | + fnImpactDepth: 5, // A1: fn-impact transitive depth |
| 90 | + auditDepth: 3, // A3: audit blast-radius depth |
| 91 | + sequenceDepth: 10, // A5: sequence diagram depth |
| 92 | + falsePositiveCallers: 20, // A6: generic function filter threshold |
| 93 | + briefBfsDepth: 5, // A11/A12: brief command BFS depth (callers + importers) |
| 94 | + briefHighRiskCallers: 10, // A9: brief high-risk tier threshold |
| 95 | + briefMediumRiskCallers: 3, // A10: brief medium-risk tier threshold |
| 96 | + }, |
| 97 | + |
| 98 | + community: { |
| 99 | + resolution: 1.0, // A7: Louvain resolution |
| 100 | + driftThreshold: 0.2, // existing (build.driftThreshold → move here) |
| 101 | + structureDriftThreshold: 0.3, // A8: structure cohesion drift |
| 102 | + }, |
| 103 | + |
| 104 | + risk: { |
| 105 | + weights: { // B1 |
| 106 | + fanIn: 0.25, |
| 107 | + complexity: 0.3, |
| 108 | + churn: 0.2, |
| 109 | + role: 0.15, |
| 110 | + mi: 0.1, |
| 111 | + }, |
| 112 | + roleWeights: { // B2 |
| 113 | + core: 1.0, |
| 114 | + utility: 0.9, |
| 115 | + entry: 0.8, |
| 116 | + adapter: 0.5, |
| 117 | + leaf: 0.2, |
| 118 | + dead: 0.1, |
| 119 | + }, |
| 120 | + defaultRoleWeight: 0.5, // B3 |
| 121 | + }, |
| 122 | + |
| 123 | + display: { |
| 124 | + maxColWidth: 40, // D1 |
| 125 | + excerptLines: 50, // D2 |
| 126 | + summaryMaxChars: 100, // D3 |
| 127 | + jsdocScanLines: 10, // D4 |
| 128 | + signatureGatherLines: 5, // D5 |
| 129 | + }, |
| 130 | + |
| 131 | + mcp: { |
| 132 | + defaults: { /* E1: current MCP_DEFAULTS object */ }, |
| 133 | + maxLimit: 1000, // E2 |
| 134 | + }, |
| 135 | +}; |
| 136 | +``` |
| 137 | + |
| 138 | +### What stays hardcoded (Category F) |
| 139 | + |
| 140 | +- **Halstead `volume / 3000`** — industry-standard formula, not a tuning knob |
| 141 | +- **Git `maxBuffer`** — platform concern, not analysis behavior |
| 142 | +- **`apiKeyCommand` timeout** — security boundary, not user-facing |
| 143 | +- **Update check TTL/timeout** — implementation detail |
| 144 | +- **Watcher debounce** — could be configurable later but low priority |
| 145 | + |
| 146 | +--- |
| 147 | + |
| 148 | +## Implementation Plan |
| 149 | + |
| 150 | +### Phase 1 — Extend DEFAULTS schema (1 PR) |
| 151 | + |
| 152 | +**Files:** `src/infrastructure/config.js`, `tests/unit/config.test.js` |
| 153 | + |
| 154 | +1. Add `analysis`, `community`, `risk`, `display`, `mcp` sections to `DEFAULTS` |
| 155 | +2. Move `build.driftThreshold` → `community.driftThreshold` (keep `build.driftThreshold` as deprecated alias) |
| 156 | +3. Update `mergeConfig` to handle the new nested sections (already works for 1-level deep objects; verify 2-level `risk.weights` merges correctly — may need recursive merge) |
| 157 | +4. Add tests: loading config with overrides for each new section |
| 158 | + |
| 159 | +### Phase 2 — Wire analysis parameters (1 PR) |
| 160 | + |
| 161 | +**Files to change:** |
| 162 | +- `src/domain/analysis/impact.js` → read `config.analysis.defaultDepth` / `config.analysis.fnImpactDepth` |
| 163 | +- `src/features/audit.js` → read `config.analysis.auditDepth` |
| 164 | +- `src/features/check.js` → read `config.check.depth` (already exists) and `config.analysis.defaultDepth` |
| 165 | +- `src/features/sequence.js` → read `config.analysis.sequenceDepth` |
| 166 | +- `src/domain/analysis/module-map.js` → read `config.analysis.falsePositiveCallers` |
| 167 | +- `src/domain/analysis/brief.js` → read `config.analysis.briefBfsDepth`, `config.analysis.briefHighRiskCallers`, `config.analysis.briefMediumRiskCallers` (PR #480) |
| 168 | + |
| 169 | +**Pattern:** Each module calls `loadConfig()` (or receives config as a parameter). Replace the hardcoded value with `config.analysis.X ?? FALLBACK`. The fallback ensures backward compatibility if config is missing. |
| 170 | + |
| 171 | +**Tests:** Update integration tests to verify custom config values flow through. |
| 172 | + |
| 173 | +### Phase 3 — Wire risk & community parameters (1 PR) |
| 174 | + |
| 175 | +**Files to change:** |
| 176 | +- `src/graph/classifiers/risk.js` → read `config.risk.weights`, `config.risk.roleWeights`, `config.risk.defaultRoleWeight` |
| 177 | +- `src/graph/algorithms/louvain.js` → accept `resolution` parameter, default from config |
| 178 | +- `src/features/structure.js` → read `config.community.structureDriftThreshold` |
| 179 | + |
| 180 | +**Pattern:** These modules don't currently receive config. Options: |
| 181 | +1. **Preferred:** Accept an `options` parameter that callers populate from config |
| 182 | +2. **Alternative:** Import `loadConfig` directly (adds coupling but simpler) |
| 183 | + |
| 184 | +**Tests:** Unit tests for risk scoring with custom weights. Integration test for Louvain with custom resolution. |
| 185 | + |
| 186 | +### Phase 4 — Wire search parameters (1 PR) |
| 187 | + |
| 188 | +**Files to change:** |
| 189 | +- `src/domain/search/search/hybrid.js` → read `config.search.rrfK`, `config.search.topK` |
| 190 | +- `src/domain/search/search/semantic.js` → read `config.search.defaultMinScore` |
| 191 | +- `src/domain/search/models.js` → batch sizes could be config-overridable per model |
| 192 | + |
| 193 | +**Note:** `config.search` already exists with `defaultMinScore`, `rrfK`, `topK`. The modules just don't read from it — they duplicate the values. This phase wires the existing config keys. |
| 194 | + |
| 195 | +### Phase 5 — Wire display & MCP parameters (1 PR) |
| 196 | + |
| 197 | +**Files to change:** |
| 198 | +- `src/presentation/result-formatter.js` → read `config.display.maxColWidth` |
| 199 | +- `src/shared/file-utils.js` → read `config.display.excerptLines`, etc. |
| 200 | +- `src/shared/paginate.js` → read `config.mcp.defaults`, `config.mcp.maxLimit` |
| 201 | + |
| 202 | +**Consideration:** `file-utils.js` and `paginate.js` are low-level shared utilities. They shouldn't call `loadConfig()` directly. Instead, pass display/mcp settings down from callers, or use a module-level config cache set at startup. |
| 203 | + |
| 204 | +### Phase 6 — Documentation & migration (1 PR) |
| 205 | + |
| 206 | +1. Update `README.md` configuration section with the full schema |
| 207 | +2. Add a `docs/configuration.md` reference with all keys, types, defaults, and descriptions |
| 208 | +3. Document the deprecated `build.driftThreshold` alias |
| 209 | +4. Add a JSON Schema file (`.codegraphrc.schema.json`) for IDE autocomplete |
| 210 | + |
| 211 | +### Phase 7 — Update CLAUDE.md with configuration guidance (same PR as Phase 6) |
| 212 | + |
| 213 | +Add a **Configuration** section to `CLAUDE.md` that documents: |
| 214 | +1. The `.codegraphrc.json` config file and its location |
| 215 | +2. The full list of configurable sections (`analysis`, `community`, `risk`, `display`, `mcp`, `search`, `check`, `coChange`, `manifesto`) |
| 216 | +3. Key tunable parameters and their defaults (depth limits, risk weights, thresholds) |
| 217 | +4. How `mergeConfig` works (partial overrides deep-merge with defaults) |
| 218 | +5. Env var overrides (`CODEGRAPH_LLM_*`) |
| 219 | +6. Guidance: when adding new behavioral constants, always add them to `DEFAULTS` in `config.js` and wire them through — never introduce new hardcoded magic numbers |
| 220 | + |
| 221 | +--- |
| 222 | + |
| 223 | +## Migration & Backward Compatibility |
| 224 | + |
| 225 | +- All new config keys have defaults matching current hardcoded values → **zero breaking changes** |
| 226 | +- Existing `.codegraphrc.json` files continue to work unchanged |
| 227 | +- `mergeConfig` deep-merges, so users only need to specify the keys they want to override |
| 228 | +- The `build.driftThreshold` → `community.driftThreshold` move uses a deprecated alias |
| 229 | + |
| 230 | +## Example `.codegraphrc.json` after this work |
| 231 | + |
| 232 | +```json |
| 233 | +{ |
| 234 | + "analysis": { |
| 235 | + "fnImpactDepth": 8, |
| 236 | + "falsePositiveCallers": 30 |
| 237 | + }, |
| 238 | + "risk": { |
| 239 | + "weights": { |
| 240 | + "complexity": 0.4, |
| 241 | + "churn": 0.1 |
| 242 | + } |
| 243 | + }, |
| 244 | + "community": { |
| 245 | + "resolution": 1.5 |
| 246 | + }, |
| 247 | + "display": { |
| 248 | + "maxColWidth": 60 |
| 249 | + } |
| 250 | +} |
| 251 | +``` |
| 252 | + |
| 253 | +--- |
| 254 | + |
| 255 | +## Estimated Scope |
| 256 | + |
| 257 | +| Phase | Files changed | New tests | Risk | |
| 258 | +|-------|--------------|-----------|------| |
| 259 | +| 1 — Schema | 2 | 3-4 | Low | |
| 260 | +| 2 — Analysis wiring | 6 | 4-5 | Low | |
| 261 | +| 3 — Risk/community | 3 | 2-3 | Medium (parameter threading) | |
| 262 | +| 4 — Search wiring | 3 | 2 | Low (config keys already exist) | |
| 263 | +| 5 — Display/MCP | 3 | 2 | Medium (shared utility coupling) | |
| 264 | +| 6 — Docs + CLAUDE.md | 4 | 0 | None | |
| 265 | + |
| 266 | +**Total: ~20 files changed, 6 PRs, one concern per PR.** |
0 commit comments