feat(code-graph): detect execution flows (processes) for impact analysis #39

@lsmonki

Summary

Add execution flow (process) detection to @specd/code-graph so that computeRiskLevel can use real processCount instead of hardcoded 0, and impact analysis can report which end-to-end flows are affected by a change.

Motivation

Today, impact analysis only counts direct/indirect callers. This misses the functional dimension: a symbol with 2 callers but participating in 5 end-to-end flows (login, register, checkout, etc.) is more critical than one with 10 callers that's only used in a single flow.

Execution flows answer "what breaks" instead of just "what depends on this":

  • broken_at_step reveals whether the breakage is early in a flow (severe) or late (contained)
  • processCount enables the upper risk tiers in scoring (processCount >= 3 → HIGH, >= 5 → CRITICAL)
  • Process-grouped search returns conceptual paths instead of flat symbol lists

Design

Detection algorithm (post-indexing phase)

  1. Score entry points — functions that call many others but are called by few. Boost for: exported/public, name patterns (handle*, on*, *Controller, register*), framework conventions
  2. BFS forward from each entry point along CALLS edges, max depth ~10
  3. Collect traces — each path with ≥ 3 steps becomes a process
  4. Deduplicate — remove subset traces, keep longest per entry→terminal pair
  5. Limit — dynamic cap based on codebase size (max(20, min(300, symbolCount / 10)))
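
Steps 2 to 5 above could be sketched roughly as follows. This is a minimal illustration, not the planned implementation: the in-memory `CallGraph` map, the function names, the reading of "subset trace" as a contiguous run of steps inside a longer trace, and the `Math.floor` in the cap are all assumptions.

```typescript
// Hypothetical in-memory form of the CALLS graph: caller id -> callee ids.
type CallGraph = Map<string, string[]>;

const MAX_DEPTH = 10; // at most 10 symbols per trace ("max depth ~10")
const MIN_STEPS = 3; // shorter paths are not reported as processes

// Breadth-first walk over partial paths starting at one entry point.
// A path is finished when it reaches a leaf, hits the depth limit,
// or every callee is already on the path (cycle guard).
function collectTraces(graph: CallGraph, entry: string): string[][] {
  const traces: string[][] = [];
  const queue: string[][] = [[entry]];
  while (queue.length > 0) {
    const path = queue.shift() as string[];
    const last = path[path.length - 1] as string;
    const next = (graph.get(last) ?? []).filter((id) => path.indexOf(id) === -1);
    if (next.length === 0 || path.length >= MAX_DEPTH) {
      if (path.length >= MIN_STEPS) traces.push(path);
      continue;
    }
    for (const id of next) queue.push([...path, id]);
  }
  return traces;
}

// Keeps the longest trace per entry->terminal pair, then drops any trace
// whose steps appear as a contiguous run inside a longer kept trace.
function dedupe(traces: string[][]): string[][] {
  const best = new Map<string, string[]>();
  for (const t of traces) {
    const key = `${t[0]}->${t[t.length - 1]}`;
    const prev = best.get(key);
    if (prev === undefined || t.length > prev.length) best.set(key, t);
  }
  const kept = Array.from(best.values());
  const SEP = "\u0000"; // separator that cannot occur inside a symbol id
  return kept.filter((t) => {
    const needle = SEP + t.join(SEP) + SEP;
    return !kept.some(
      (o) => o.length > t.length && (SEP + o.join(SEP) + SEP).indexOf(needle) !== -1
    );
  });
}

// Dynamic cap from step 5: max(20, min(300, symbolCount / 10)).
function processCap(symbolCount: number): number {
  return Math.max(20, Math.min(300, Math.floor(symbolCount / 10)));
}
```

Enumerating every path can grow quickly on dense call graphs, so a real collector would probably also bound the number of partial paths per entry point before the final cap is applied.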

Data model additions

New node: Process

id: string           // "proc_0_handleLogin"
label: string        // heuristic: "HandleLogin → UpdateSession"
stepCount: number
entryPointId: string
terminalId: string

New relation: STEP_IN_PROCESS

source: Symbol → target: Process
step: number  // 1-indexed position in trace

New relation type in RelationType:

StepInProcess = 'STEP_IN_PROCESS'
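
As a rough sketch of how a collected trace might map onto this model: the field names come from the listing above, while `buildProcess`, the `names` lookup (symbol id to display name), and the PascalCase label heuristic are assumptions for illustration.

```typescript
// Field names follow the data model above; everything else is hypothetical.
interface ProcessNode {
  id: string;
  label: string;
  stepCount: number;
  entryPointId: string;
  terminalId: string;
}

interface StepInProcess {
  type: "STEP_IN_PROCESS";
  source: string; // Symbol id
  target: string; // Process id
  step: number; // 1-indexed position in the trace
}

function pascal(name: string): string {
  return name.charAt(0).toUpperCase() + name.slice(1);
}

// Turns one trace (ordered symbol ids) into a Process node plus its step edges.
function buildProcess(
  index: number,
  trace: string[],
  names: Map<string, string>
): { node: ProcessNode; steps: StepInProcess[] } {
  const entryPointId = trace[0];
  const terminalId = trace[trace.length - 1];
  if (entryPointId === undefined || terminalId === undefined) {
    throw new Error("cannot build a process from an empty trace");
  }
  const nameOf = (id: string): string => names.get(id) ?? id;
  const id = `proc_${index}_${nameOf(entryPointId)}`;
  const node: ProcessNode = {
    id,
    label: `${pascal(nameOf(entryPointId))} → ${pascal(nameOf(terminalId))}`,
    stepCount: trace.length,
    entryPointId,
    terminalId,
  };
  const steps: StepInProcess[] = trace.map((symbolId, i) => ({
    type: "STEP_IN_PROCESS" as const,
    source: symbolId,
    target: id,
    step: i + 1,
  }));
  return { node, steps };
}
```

Including the index in the id keeps two processes that share an entry point distinct, which matches the "proc_0_handleLogin" example.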

Populating COVERS (Spec → File)

The schema already defines COVERS(FROM Spec TO File) but it's not populated. This is the missing link between specs and the code graph. Once populated, the traversal chain becomes:

Spec --COVERS--> File --DEFINES--> Symbol --STEP_IN_PROCESS--> Process

Detection strategy:

  • During indexing, for each workspace, match spec paths against source file paths by convention:
    • specs/core/change/ covers files matching core/**/change*, core/**/Change*
    • Use the spec's dependsOn to transitively cover files from dependent specs
  • Additionally, scan source files for import paths that reference types/functions whose names match spec keywords
  • This is heuristic — exact COVERS can be refined later with explicit annotations in metadata
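
The path convention in the first bullet might look something like the sketch below. It assumes spec paths have the shape `specs/<root>/.../<topic>/`, that file paths are relative and begin with `<root>/`, and that only the final path segment is tested against the topic name. `CoverRule`, `ruleFromSpecPath`, and `covers` are hypothetical names; the `dependsOn` and import-scanning parts are not shown.

```typescript
// A rule derived from one spec path, e.g. specs/core/change/ ->
// root "core", prefixes ["change", "Change"].
interface CoverRule {
  root: string;
  prefixes: string[];
}

function ruleFromSpecPath(specPath: string): CoverRule | undefined {
  const parts = specPath.split("/").filter((p) => p.length > 0);
  if (parts.length < 3 || parts[0] !== "specs") return undefined;
  const root = parts[1];
  const topic = parts[parts.length - 1];
  if (root === undefined || topic === undefined) return undefined;
  const lower = topic.charAt(0).toLowerCase() + topic.slice(1);
  const upper = topic.charAt(0).toUpperCase() + topic.slice(1);
  return { root, prefixes: [lower, upper] };
}

// Approximates the globs <root>/**/<topic>* and <root>/**/<Topic>*:
// the file must live under the root and its file name must start with the topic.
function covers(rule: CoverRule, filePath: string): boolean {
  const segments = filePath.split("/");
  const base = segments[segments.length - 1] ?? "";
  return (
    segments.length > 1 &&
    segments[0] === rule.root &&
    rule.prefixes.some((p) => base.startsWith(p))
  );
}
```

A prefix test like this will over-match (for example `changelog.ts` for the `change` spec), which is one reason the explicit annotations mentioned above would be a useful refinement.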

Queries this enables:

  • graph impact --file auth.ts → shows affected specs via File ← COVERS ← Spec
  • graph search "login" --group-by flows → specs grouped alongside their flows
  • graph flows --spec core:core/auth → flows that pass through files covered by this spec
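
To make the traversal chain behind these queries concrete, here is a small sketch over plain in-memory maps. The `GraphEdges` shape and both function names are assumptions for illustration; the real queries would run against the graph store rather than maps.

```typescript
// Hypothetical edge tables, one per relation in the chain
// Spec --COVERS--> File --DEFINES--> Symbol --STEP_IN_PROCESS--> Process.
interface GraphEdges {
  covers: Map<string, string[]>; // Spec id -> File paths
  defines: Map<string, string[]>; // File path -> Symbol ids
  stepInProcess: Map<string, string[]>; // Symbol id -> Process ids
}

// Forward direction, as needed by `graph flows --spec <specId>`.
function flowsForSpec(edges: GraphEdges, specId: string): string[] {
  const found = new Set<string>();
  for (const file of edges.covers.get(specId) ?? []) {
    for (const symbol of edges.defines.get(file) ?? []) {
      for (const proc of edges.stepInProcess.get(symbol) ?? []) found.add(proc);
    }
  }
  return Array.from(found).sort();
}

// Reverse direction, as needed by `graph impact --file <path>`.
function specsForFile(edges: GraphEdges, filePath: string): string[] {
  const specs: string[] = [];
  edges.covers.forEach((files, specId) => {
    if (files.indexOf(filePath) !== -1) specs.push(specId);
  });
  return specs.sort();
}
```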

Integration points

  • computeRiskLevel(direct, total, processCount) — wire the real count (currently 0)
  • analyzeImpact — query STEP_IN_PROCESS to populate affectedProcesses
  • ImpactResult.affectedProcesses — already exists as string[], populate with process labels
  • New getProcesses(symbolId) query on GraphStore
  • Schema DDL: add Process node table + STEP_IN_PROCESS rel table
  • Populate existing COVERS rel table during spec indexing phase
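
One way the real processCount could feed into risk scoring is sketched below. Only the two thresholds (>= 3 → HIGH, >= 5 → CRITICAL) come from this issue; the `RiskLevel` tiers LOW and MEDIUM, the function names, and the idea of taking the higher of the caller-based and flow-based levels are assumptions, since the existing computeRiskLevel body is not shown here.

```typescript
// LOW and MEDIUM are assumed tiers; HIGH and CRITICAL are named in this issue.
type RiskLevel = "LOW" | "MEDIUM" | "HIGH" | "CRITICAL";

const ORDER: RiskLevel[] = ["LOW", "MEDIUM", "HIGH", "CRITICAL"];

// Flow-based level using the thresholds from the Motivation section.
function riskFromProcessCount(processCount: number): RiskLevel {
  if (processCount >= 5) return "CRITICAL";
  if (processCount >= 3) return "HIGH";
  return "LOW";
}

// Combines a caller-based level (computed elsewhere) with the flow-based one
// by keeping whichever is more severe.
function escalateByProcessCount(callerBased: RiskLevel, processCount: number): RiskLevel {
  const flowBased = riskFromProcessCount(processCount);
  return ORDER.indexOf(flowBased) > ORDER.indexOf(callerBased) ? flowBased : callerBased;
}
```

With processCount hardcoded to 0, the flow-based level is always the lowest tier, so the combined result is whatever the caller counts produce, which matches today's behavior.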

CLI surface

graph impact — automatically includes affectedProcesses and affectedSpecs in output (no flag needed)

graph search --group-by <mode> — opt-in flag to group search results. Default remains ungrouped (flat list). Modes:

| Mode      | Groups by      | Use case                                          |
|-----------|----------------|---------------------------------------------------|
| flows     | Execution flow | "These symbols participate in the login flow"     |
| workspace | Workspace name | Monorepo overview: results per core, cli, etc.    |
| file      | File path      | "These 5 matches are in auth.ts"                  |
| kind      | Symbol kind    | "All matching classes", "all matching functions"  |

JSON output shape with --group-by:

{
  "groups": [
    { "key": "Login → UpdateSession", "symbols": [...], "specs": [...] },
    { "key": "Register → SendEmail", "symbols": [...], "specs": [...] }
  ]
}

Without --group-by, output stays flat (symbols[] + specs[]) as today.
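
The grouping itself could be a thin presentation step over the flat result. The sketch below assumes a hypothetical `SymbolHit` shape that already carries the labels of the flows each symbol is a step in; it leaves out the `specs` array from the JSON shape for brevity.

```typescript
type GroupMode = "flows" | "workspace" | "file" | "kind";

// Hypothetical search hit; the real result type may differ.
interface SymbolHit {
  name: string;
  kind: string;
  file: string;
  workspace: string;
  flows: string[]; // labels of processes this symbol is a step in
}

interface Group {
  key: string;
  symbols: SymbolHit[];
}

// A hit can land in several groups under "flows", exactly one otherwise.
function keysFor(hit: SymbolHit, mode: GroupMode): string[] {
  switch (mode) {
    case "flows":
      return hit.flows;
    case "workspace":
      return [hit.workspace];
    case "file":
      return [hit.file];
    case "kind":
      return [hit.kind];
  }
}

function groupHits(hits: SymbolHit[], mode: GroupMode): { groups: Group[] } {
  const byKey = new Map<string, SymbolHit[]>();
  for (const hit of hits) {
    for (const key of keysFor(hit, mode)) {
      const bucket = byKey.get(key);
      if (bucket === undefined) byKey.set(key, [hit]);
      else bucket.push(hit);
    }
  }
  const groups: Group[] = [];
  byKey.forEach((symbols, key) => groups.push({ key, symbols }));
  return { groups };
}
```

Note that in this sketch a symbol that belongs to no flow disappears from `--group-by flows` output; whether such symbols should go into a catch-all group is an open design choice.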

graph flows — dedicated command for exploring flows:

  • specd graph flows — list all detected processes
  • specd graph flows --symbol <name> — flows a symbol participates in
  • specd graph flows --file <path> — flows passing through a file
  • specd graph flows --spec <specId> — flows related to a spec (via COVERS)

What already exists

  • CALLS graph ✅
  • BFS traversal (getUpstream/getDownstream) ✅
  • affectedProcesses: [] field in ImpactResult ✅ (just needs populating)
  • processCount parameter in computeRiskLevel ✅ (just needs wiring)
  • Schema DDL with COVERS rel table ✅ (just needs populating)
  • Schema DDL pattern for new node/rel types ✅

What needs building

  1. Entry point scoring heuristics (~100 lines)
  2. Forward BFS trace collector (~80 lines)
  3. Deduplication + limiting (~50 lines)
  4. Process node + STEP_IN_PROCESS in schema DDL
  5. GraphStore methods: addProcesses(), getProcesses(), getSymbolProcesses()
  6. COVERS population: spec path → file path matching during spec indexing (~100 lines)
  7. Wire into IndexCodeGraph.execute() as post-indexing phase
  8. Wire processCount into analyzeImpact and computeRiskLevel
  9. CLI: graph flows command
  10. CLI: --group-by flag in graph search (modes: flows, workspace, file, kind)
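
For item 1, a first cut at entry point scoring might be as small as the sketch below. The fan-out/fan-in ratio and both 1.5 multipliers are placeholder weights, `SymbolStats` is a hypothetical input shape, `on*` is narrowed to `on` followed by an upper-case letter to avoid matching words like "once", and framework conventions are not covered.

```typescript
// Hypothetical per-symbol statistics gathered from the CALLS graph.
interface SymbolStats {
  name: string;
  exported: boolean;
  callees: number; // how many symbols this one calls
  callers: number; // how many symbols call this one
}

// Name patterns from the design: handle*, on*, *Controller, register*.
const NAME_PATTERNS: RegExp[] = [/^handle/, /^on[A-Z]/, /Controller$/, /^register/];

// Higher score = more likely to be the start of an execution flow.
function entryPointScore(s: SymbolStats): number {
  if (s.callees === 0) return 0; // a leaf cannot start a multi-step flow
  let score = s.callees / (s.callers + 1); // calls many, called by few
  if (s.exported) score *= 1.5;
  if (NAME_PATTERNS.some((p) => p.test(s.name))) score *= 1.5;
  return score;
}

// Returns candidates ordered from most to least likely entry point.
function rankEntryPoints(symbols: SymbolStats[]): SymbolStats[] {
  return symbols
    .filter((s) => entryPointScore(s) > 0)
    .sort((a, b) => entryPointScore(b) - entryPointScore(a));
}
```

The weights would need tuning against real codebases; the point of the sketch is only that the score combines call-graph shape with the naming and visibility boosts.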

Estimated scope

Large — ~900 lines of new code. The graph infrastructure and risk model are already in place. The main work is entry point scoring, trace collection, COVERS population, and the --group-by presentation layer.

References

  • Current TODO: packages/code-graph/src/domain/services/analyze-impact.ts:105
  • Existing unpopulated relation: COVERS(FROM Spec TO File) in schema.ts

Labels

enhancement: New feature or request
