feat(graph): vault-wide statistics when sourcePath omitted (closes #132) #213

Open
aaronsb wants to merge 1 commit into main from feat/132-graph-statistics-vault-wide

@aaronsb (Owner) commented May 25, 2026

Summary

Closes #132. `graph.statistics` previously threw `Source path is required` when called without `sourcePath`, blocking the baseline use case (vault-health snapshots, dashboards). Users had to pick an arbitrary seed and reason about its neighborhood, which isn't representative of the vault.

When `sourcePath` is omitted, the operation now returns a new `vaultStatistics` shape (per the issue's proposal):

| Field | Meaning |
| --- | --- |
| `totalNotes` | Count of `.md` files in the vault |
| `totalLinks` | Sum of resolved link occurrences (Obsidian semantics: A→B referenced 3× counts as 3) |
| `orphanCount` | Singletons (no resolved links in either direction) |
| `averageDegree` | `2 * totalLinks / totalNotes` |
| `largestComponentSize` | Biggest connected subgraph (treated as undirected) |
| `isolatedClusters` | Total connected-component count, inclusive of singletons (non-trivial = `isolatedClusters - orphanCount`) |
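
For readers who want the shape in type form, here is a minimal sketch. The field names come from the table above; the interface name `VaultStatistics` and the two helper functions are assumptions made for illustration and may not match the PR's actual code. The example values are derived by hand from the definitions for a six-note vault (A→B, A→C, D→E, F unlinked).

```typescript
// Hypothetical type name; only the field names are taken from the table above.
interface VaultStatistics {
  totalNotes: number;
  totalLinks: number;
  orphanCount: number;
  averageDegree: number;
  largestComponentSize: number;
  isolatedClusters: number;
}

// averageDegree = 2 * totalLinks / totalNotes, guarded so an empty vault yields 0.
function averageDegree(totalLinks: number, totalNotes: number): number {
  return totalNotes === 0 ? 0 : (2 * totalLinks) / totalNotes;
}

// Components with at least two notes: every orphan is its own singleton component.
function nonTrivialClusters(stats: VaultStatistics): number {
  return stats.isolatedClusters - stats.orphanCount;
}

// Hand-derived values for a six-note vault: A→B, A→C, D→E, and F unlinked.
const example: VaultStatistics = {
  totalNotes: 6,
  totalLinks: 3,
  orphanCount: 1, // F
  averageDegree: averageDegree(3, 6), // 2 * 3 / 6
  largestComponentSize: 3, // {A, B, C}
  isolatedClusters: 3, // {A, B, C}, {D, E}, {F}
};
```

Because each link adds one to the degree of both endpoints, `averageDegree` uses `2 * totalLinks` in the numerator.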

Per-node statistics behavior is unchanged when `sourcePath` is provided.

Implementation

  • `GraphTraversal.getVaultStatistics()` — one O(V+E) pass over the `resolvedLinks` adjacency, plus BFS for components. Treats the graph as undirected for component analysis (a link from A to B means A and B are in the same component) while keeping link counts directed (matches how `metadataCache.resolvedLinks` exposes them).
  • `GraphSearchTool.getStatistics()` — branches on `!sourcePath` and returns the vault-wide shape with a different message + workflow hints. Throw-on-missing path removed.
  • `GraphSearchResult.vaultStatistics?` — added alongside existing `statistics?` rather than overloading, since the shapes are genuinely different and a discriminated union would just push the question to consumers.
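
The approach described above can be sketched roughly as follows. This is not the PR's code: `computeVaultStatistics`, its parameters, and the `VaultStatistics` type name are hypothetical. It assumes the `Record<string, Record<string, number>>` shape of Obsidian's `metadataCache.resolvedLinks`, and it makes two choices the description does not specify: links to non-`.md` targets are ignored, and self-links count toward `totalLinks` without creating an edge.

```typescript
type ResolvedLinks = Record<string, Record<string, number>>;

interface VaultStatistics {
  totalNotes: number;
  totalLinks: number;
  orphanCount: number;
  averageDegree: number;
  largestComponentSize: number;
  isolatedClusters: number;
}

// Hypothetical helper: one pass over the link map, then BFS per unvisited note.
function computeVaultStatistics(
  filePaths: string[],
  resolvedLinks: ResolvedLinks,
): VaultStatistics {
  // Only .md files count as notes; each starts with an empty undirected neighbor set.
  const neighbors = new Map<string, Set<string>>();
  for (const path of filePaths) {
    if (path.endsWith(".md")) neighbors.set(path, new Set<string>());
  }

  // Directed occurrence counts feed totalLinks; adjacency is stored in both directions.
  let totalLinks = 0;
  for (const [source, targets] of Object.entries(resolvedLinks)) {
    const from = neighbors.get(source);
    if (!from) continue; // source is not a note
    for (const [target, count] of Object.entries(targets)) {
      const to = neighbors.get(target);
      if (!to) continue; // assumption: links to non-note targets are ignored
      totalLinks += count;
      if (source !== target) {
        from.add(target);
        to.add(source);
      }
    }
  }

  // BFS over the undirected adjacency to count and size the components.
  const seen = new Set<string>();
  let orphanCount = 0;
  let largestComponentSize = 0;
  let isolatedClusters = 0;
  for (const start of neighbors.keys()) {
    if (seen.has(start)) continue;
    isolatedClusters++;
    seen.add(start);
    const queue: string[] = [start];
    let size = 0;
    for (let head = 0; head < queue.length; head++) {
      const current = queue[head] as string;
      size++;
      for (const next of neighbors.get(current) ?? new Set<string>()) {
        if (!seen.has(next)) {
          seen.add(next);
          queue.push(next);
        }
      }
    }
    if (size === 1) orphanCount++;
    largestComponentSize = Math.max(largestComponentSize, size);
  }

  const totalNotes = neighbors.size;
  const averageDegree = totalNotes === 0 ? 0 : (2 * totalLinks) / totalNotes;
  return {
    totalNotes,
    totalLinks,
    orphanCount,
    averageDegree,
    largestComponentSize,
    isolatedClusters,
  };
}
```

Each note is enqueued once and each adjacency entry is scanned once, which is where the O(V+E) bound comes from.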

Tests

`tests/graph-statistics-vault-wide.test.ts` (7 cases):

  • vault-wide path with a known 3-component topology (A→B,C / D→E / F orphan) → asserts exact `vaultStatistics` values
  • per-node path with `sourcePath` still returns the old `statistics` shape
  • repeated occurrences count toward `totalLinks` (Obsidian semantics)
  • empty vault: no divide-by-zero, all zeros returned
  • non-`.md` files excluded from `totalNotes`
  • bidirectional links collapse to one undirected component
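
To make the repeated-occurrence and bidirectional cases concrete, here is a small illustrative fixture. It is not taken from the test file; the helper names `sumLinkOccurrences` and `undirectedEdges` are hypothetical, and the fixture assumes the `resolvedLinks`-style map of source path to target path to occurrence count.

```typescript
type ResolvedLinks = Record<string, Record<string, number>>;

// A→B referenced 3 times plus B→A once:
// four directed occurrences, but a single undirected edge.
const fixture: ResolvedLinks = {
  "A.md": { "B.md": 3 },
  "B.md": { "A.md": 1 },
};

// Directed count: every occurrence contributes to the total.
function sumLinkOccurrences(links: ResolvedLinks): number {
  let total = 0;
  for (const targets of Object.values(links)) {
    for (const count of Object.values(targets)) total += count;
  }
  return total;
}

// Undirected view: A→B and B→A collapse to the same edge key.
function undirectedEdges(links: ResolvedLinks): Set<string> {
  const edges = new Set<string>();
  for (const [source, targets] of Object.entries(links)) {
    for (const target of Object.keys(targets)) {
      if (source === target) continue; // a self-link joins nothing new
      const pair = source < target ? [source, target] : [target, source];
      edges.add(JSON.stringify(pair));
    }
  }
  return edges;
}

const occurrences = sumLinkOccurrences(fixture);
const edgeCount = undirectedEdges(fixture).size;
```

Under these assumptions `occurrences` is 4 while `edgeCount` is 1, so A and B form one component even though four link occurrences are counted.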

Test plan

  • `npm run build` — type-checks clean
  • `npm test` — 319/319 across 27 suites (was 312 before, +7 new)
  • `npm run lint` — no new errors (5 pre-existing warnings unchanged)
  • Live MCP smoke test against the running vault (pending: this release must ship first so the local plugin can be updated)

Before: graph.statistics threw 'Source path is required' when called
without sourcePath, blocking the baseline use case (vault health
snapshots, dashboards). Users had to pick an arbitrary seed and reason
about its neighborhood — not representative.

After: when sourcePath is omitted, return a new vaultStatistics shape:

  totalNotes            — count of .md files in the vault
  totalLinks            — sum of resolved link occurrences (Obsidian
                          semantics: A→B referenced 3× counts as 3)
  orphanCount           — singletons (no links in either direction)
  averageDegree         — 2 * totalLinks / totalNotes
  largestComponentSize  — biggest connected subgraph (undirected)
  isolatedClusters      — total connected-component count,
                          inclusive of singletons (so non-trivial =
                          isolatedClusters - orphanCount)

Per-node statistics behaviour is unchanged when sourcePath is provided.

Implementation: GraphTraversal.getVaultStatistics does one O(V+E)
pass over the resolvedLinks adjacency, plus BFS for components.
Treats the graph as undirected for component analysis (a link from A
to B means A and B are in the same component) while keeping link
counts directed (matches how metadataCache exposes them).

Tests: tests/graph-statistics-vault-wide.test.ts covers the
vault-wide path (the topology described above), per-node fallback,
empty vault, repeated occurrences, non-md filtering, and the
directed-edges-undirected-components invariant.

@aaronsb added the enhancement and area:graph labels May 25, 2026
@github-actions

✅ Build succeeded! Artifacts are available in the Actions tab.

Labels

area:graph (Graph operations: graph.*, link traversal, statistics)
enhancement (New feature or request)

Development

Successfully merging this pull request may close these issues.

graph.statistics should support vault-wide queries (sourcePath optional)
