A two-step pipeline (scan, then render) with a static viewer. No server and no runtime web dependencies. Renderer CSS is authored directly in the canonical skill asset.
This document describes the current pre-v1 implementation. The active stable
release roadmap and target installed-package topology are defined in
../V1_RELEASE_PLAN.md.
V1 is an internal renovation, not a rewrite.
The stable architecture choices are:
- Node.js and ESM for the scanner and renderer.
- Node standard-library-only installed runtime.
- Graph JSON as the scanner/renderer boundary.
- One static self-contained HTML output.
- Inline SVG for deterministic graph visualization.
- Vanilla browser JavaScript inside the generated map.
The implementation should be modularized behind those contracts. Scanner filesystem walking, classification, Markdown references, graph assembly, renderer validation, layout, HTML generation, and browser behavior should become independently testable responsibilities.
The runtime uses authored plain CSS without a CSS build dependency. Canvas,
WebGL, and browser frameworks remain deferred unless measured evidence shows
the current artifact model cannot meet representative needs. See
v1-stack-decision.md for the accepted decision and
reconsideration triggers.

```
target repo --scan--> docs/ai/visualize/codebase-graph.json
                              |
                              v
                      docs/ai/visualize/codebase-map.html
```

The docs/ai/ directory inside a target repo is shared across tools.
Each tool is intended to own a sub-namespace and should not write outside it
during normal skill operation.

| Path | Owner | Notes |
|---|---|---|
| `docs/ai/visualize/codebase-graph.json` | codebase-visualize | generated, refreshable |
| `docs/ai/visualize/codebase-map.html` | codebase-visualize | generated, refreshable |
| `docs/ai/CODEBASE_MAP.md` | codebase-orient | read-only for visualizer |
| `docs/ai/CHANGE_SURFACES.md` | codebase-orient | read-only for visualizer |
| `docs/ai/OPEN_QUESTIONS.md` | codebase-orient | read-only for visualizer |

Intended namespace rules:
- Default output writes inside `<target>/docs/ai/visualize/`.
- `docs/ai/visualize/` is listed in `IGNORED_PATHS` (repo-relative path set). The filesystem walk skips that directory entirely: no nodes, no edges, no doc-reference parsing. Self-generated artifacts never appear in the graph.
- Root `docs/ai/*.md` orientation docs are NOT in `IGNORED_PATHS`. They appear as normal graph nodes (kind: docs, risk hint: orient-docs) and are scanned for doc-reference edges like any other markdown file.
- Reading orient doc content to enrich routing hints is deferred to a future version. The current scanner treats them as ordinary markdown nodes only.
- If orientation docs are absent, the scanner works from filesystem scans alone.
- Local agent/tool directories `.agents/`, `.claude/`, and `.codex/` are excluded so project-local skill installations do not become target nodes.
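
The exclusion rules above can be sketched as a prefix check on repo-relative POSIX paths. This is only an illustration: the document confirms that `docs/ai/visualize/` is in `IGNORED_PATHS` and that the three agent directories are excluded, but the exact set contents and the `isIgnored` helper are assumptions.

```javascript
// Hypothetical contents. The real IGNORED_PATHS set may hold more entries,
// and the agent directories may be excluded by a different mechanism.
const IGNORED_PATHS = new Set([
  "docs/ai/visualize",
  ".agents",
  ".claude",
  ".codex",
]);

// Returns true when a repo-relative POSIX path is an ignored directory
// or lives anywhere beneath one. Only whole path segments match, so
// "docs/ai/visualizer.md" is not ignored.
function isIgnored(relPath) {
  const clean = relPath.replace(/\/+$/, "");
  for (const ignored of IGNORED_PATHS) {
    if (clean === ignored || clean.startsWith(ignored + "/")) return true;
  }
  return false;
}

console.log(isIgnored("docs/ai/visualize/codebase-map.html"));
console.log(isIgnored("docs/ai/CODEBASE_MAP.md"));
```

Under these assumptions the first call reports an ignored path and the second does not, which mirrors the rule that root orientation docs stay in the graph while generated artifacts never do.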
Public commands accept only --target, derive both output paths from that
target, validate the namespace before writing, and replace outputs atomically.
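
As a rough illustration of deriving both outputs from `--target` and validating the namespace before writing, the sketch below works on POSIX-style path strings. The function names `deriveOutputs` and `assertInNamespace` are hypothetical, and the atomic replacement step (commonly a temp-file write followed by a rename) is not shown.

```javascript
const NAMESPACE = "docs/ai/visualize";

// Derive both output paths from the target root (POSIX-style strings).
function deriveOutputs(target) {
  const root = target.replace(/\/+$/, "");
  return {
    graph: `${root}/${NAMESPACE}/codebase-graph.json`,
    map: `${root}/${NAMESPACE}/codebase-map.html`,
  };
}

// Reject any output path that is not a plain file name directly inside
// the namespace, including paths that try to escape it with "..".
function assertInNamespace(target, outputPath) {
  const root = target.replace(/\/+$/, "");
  const prefix = `${root}/${NAMESPACE}/`;
  const rest = outputPath.startsWith(prefix)
    ? outputPath.slice(prefix.length)
    : null;
  if (rest === null || rest === "" || rest === ".." || rest.includes("/")) {
    throw new Error(`refusing to write outside ${NAMESPACE}/: ${outputPath}`);
  }
  return outputPath;
}

const outputs = deriveOutputs("/tmp/repo/");
assertInNamespace("/tmp/repo", outputs.graph);
assertInNamespace("/tmp/repo", outputs.map);
```

Because the caller supplies only the target, there is no user-controlled output path to sanitize; the check guards against programming errors rather than hostile input.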

| field | type | claim | notes |
|---|---|---|---|
| `id` | string | fact | repo-relative POSIX path; "." for root |
| `type` | "file" or "folder" | fact | |
| `name` | string | fact | basename |
| `path` | string | fact | same as id |
| `parent` | string or null | fact | parent node id |
| `depth` | number | fact | distance from root |
| `group` | string | fact | top-level folder, or "(root)" |
| `kind` | enum | inference | docs/tests/config/scripts/source/styles/assets/unknown |
| `ext` | string | fact | files only |
| `size` | number | fact | bytes, files only |
| `risk_hints` | string[] | inference | files only |
| `claims` | object | - | per-field claim labels |

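
To make the path-derived fields concrete, here is a minimal sketch that builds them from a repo-relative POSIX path. The helper name `makeNode`, the root node's `name`, and the choice to give a top-level folder its own name as `group` are assumptions; `kind`, `ext`, `size`, `risk_hints`, and `claims` are left out.

```javascript
// Build the path-derived ("fact") fields of a node.
function makeNode(relPath, type) {
  if (relPath === ".") {
    // Root node: no parent, depth 0. The name "." is an assumption.
    return {
      id: ".", type: "folder", name: ".", path: ".",
      parent: null, depth: 0, group: "(root)",
    };
  }
  const parts = relPath.split("/");
  const isTopLevelFile = parts.length === 1 && type === "file";
  return {
    id: relPath,
    type,
    name: parts[parts.length - 1],
    path: relPath, // same as id
    parent: parts.length === 1 ? "." : parts.slice(0, -1).join("/"),
    depth: parts.length, // distance from root
    group: isTopLevelFile ? "(root)" : parts[0],
  };
}

console.log(makeNode("src/routes/users.js", "file"));
console.log(makeNode("README.md", "file"));
```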

| field | type | claim | notes |
|---|---|---|---|
| `from` | string | fact | source node id |
| `to` | string | fact | target node id |
| `type` | string | - | see edge types below |
| `claim` | string | - | "fact" or "inference" depending on type |
| `evidence` | object[] | - | optional; present on references edges |


| type | claim | description |
|---|---|---|
| `"contains"` | fact | folder parent/child relationship from the filesystem walk |
| `"references"` | inference | a markdown file explicitly links to another file via an inline link `[label](path)` or a path-like token inside a fenced code block |

Each references edge carries an evidence array with one entry per source
location (line + matched text). An edge has multiple entries when the same
target is referenced from more than one place in the same doc. Evidence text is
bounded to 240 characters.

```json
{
  "from": "docs/architecture.md",
  "to": "src/routes/users.js",
  "type": "references",
  "claim": "inference",
  "evidence": [
    { "line": 12, "text": "[route handlers](../src/routes/users.js)" },
    { "line": 20, "text": "src/routes/users.js" }
  ]
}
```

- Inline links (`[label](path)`): resolved relative to the doc's directory. Images (`![alt](path)`) are skipped.
- Code-fence tokens: a whitespace-delimited token that contains `/` and ends with a recognized extension. Resolved doc-relative first; falls back to repo-root-relative if the doc-relative path is not in the graph.
- External URLs (`http:`, `https:`, `ftp:`, `//`) and bare anchors (`#`) are always skipped.
- Only paths that resolve to an existing file node in the graph are emitted.
- Files under `docs/ai/visualize/` are excluded from the graph entirely (`IGNORED_PATHS`). Root `docs/ai/` orientation docs are not excluded; they are parsed as regular markdown and may produce references edges.
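
The resolution rules above can be approximated in a few small functions. This is a sketch under stated assumptions, not the scanner's actual code: the helper names, the shape of a "hit" (`line`, `text`, `target`, `fromCodeFence`), and the choice to truncate evidence with `slice` are all hypothetical. Link extraction and extension matching are assumed to happen earlier.

```javascript
// Join a relative target onto a base directory and normalize "." and ".."
// segments. Returns null if the path would escape the repo root.
function resolveFrom(baseDir, target) {
  const out = baseDir === "." || baseDir === "" ? [] : baseDir.split("/");
  for (const seg of target.split("/")) {
    if (seg === "" || seg === ".") continue;
    if (seg === "..") {
      if (out.length === 0) return null;
      out.pop();
    } else {
      out.push(seg);
    }
  }
  return out.join("/");
}

// External URLs and bare anchors are always skipped.
const EXTERNAL = /^(https?:|ftp:|\/\/|#)/i;

// fileIds: Set of existing file node ids. Returns a node id or null.
function resolveReference(docPath, rawTarget, fileIds, fromCodeFence = false) {
  if (EXTERNAL.test(rawTarget)) return null;
  const slash = docPath.lastIndexOf("/");
  const docDir = slash === -1 ? "." : docPath.slice(0, slash);
  const docRelative = resolveFrom(docDir, rawTarget);
  if (docRelative !== null && fileIds.has(docRelative)) return docRelative;
  if (fromCodeFence) {
    // Code-fence tokens fall back to a repo-root-relative lookup.
    const rootRelative = resolveFrom(".", rawTarget);
    if (rootRelative !== null && fileIds.has(rootRelative)) return rootRelative;
  }
  return null;
}

// Group hits by target so each edge carries one evidence entry per location.
function buildReferenceEdges(docPath, hits, fileIds) {
  const byTarget = new Map();
  for (const hit of hits) {
    const to = resolveReference(docPath, hit.target, fileIds, hit.fromCodeFence);
    if (to === null) continue;
    if (!byTarget.has(to)) {
      byTarget.set(to, {
        from: docPath, to, type: "references", claim: "inference", evidence: [],
      });
    }
    // Evidence text is bounded to 240 characters.
    byTarget.get(to).evidence.push({ line: hit.line, text: hit.text.slice(0, 240) });
  }
  return [...byTarget.values()];
}
```

Fed the two hits from the sample edge (an inline link on line 12 and a code-fence token on line 20), this sketch yields a single edge to `src/routes/users.js` with two evidence entries.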
- `lib/schema.mjs` is the validation authority for schema major version `1`.
- Node ids are unique and equal their paths; every non-root node has an existing parent; containment edges must agree with that parent.
- Reference edges target existing nodes and carry non-empty inferred evidence.
- Metadata counts must exactly match the graph.
- Symlinks, including broken symlinks, are not traversed and produce a warning.
- Unreadable directories are skipped with a warning. Files that cannot be stated are skipped with a warning. Unreadable Markdown remains a file node but its references are skipped with a warning.
- Warnings use stable repo-relative paths and are sorted deterministically.
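
A few of these invariants can be expressed as a plain checking function. The sketch below is not the contents of `lib/schema.mjs`; the function name and problem messages are hypothetical, `by_kind` is not checked, and counting the root among `folders` is an assumption.

```javascript
// Returns a list of human-readable problems; an empty list means the
// graph satisfies the invariants checked here.
function validateGraph(graph) {
  const problems = [];
  const byId = new Map();
  for (const node of graph.nodes) {
    if (byId.has(node.id)) problems.push(`duplicate node id: ${node.id}`);
    if (node.id !== node.path) problems.push(`id/path mismatch: ${node.id}`);
    byId.set(node.id, node);
  }
  for (const node of graph.nodes) {
    if (node.id !== "." && !byId.has(node.parent)) {
      problems.push(`missing parent for ${node.id}`);
    }
  }
  let referenceEdges = 0;
  for (const edge of graph.edges) {
    const target = byId.get(edge.to);
    if (!byId.has(edge.from) || !target) {
      problems.push(`edge endpoint not in graph: ${edge.from} -> ${edge.to}`);
      continue;
    }
    if (edge.type === "contains" && target.parent !== edge.from) {
      problems.push(`contains edge disagrees with parent: ${edge.to}`);
    }
    if (edge.type === "references") {
      referenceEdges += 1;
      const ok = edge.claim === "inference"
        && Array.isArray(edge.evidence) && edge.evidence.length > 0;
      if (!ok) problems.push(`references edge lacks inferred evidence: ${edge.from} -> ${edge.to}`);
    }
  }
  // Metadata counts must exactly match the graph.
  const counts = graph.metadata.counts;
  const files = graph.nodes.filter((n) => n.type === "file").length;
  const folders = graph.nodes.filter((n) => n.type === "folder").length;
  if (counts.files !== files) problems.push("counts.files does not match the graph");
  if (counts.folders !== folders) problems.push("counts.folders does not match the graph");
  if (counts.references_edges !== referenceEdges) {
    problems.push("counts.references_edges does not match the graph");
  }
  return problems;
}
```

Returning a list rather than throwing on the first failure lets a caller report every violation in one pass, which tends to be more useful when a graph was produced by an older scanner.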
lib/renderer.mjs consumes a graph that conforms to the schema above and emits a
single self-contained HTML file. It must:
- embed the graph as inline JSON (no fetch / no external files),
- inline all CSS and JS,
- inline renderer CSS from `skills/codebase-visualize/assets/renderer.css`,
- surface a compact overview from existing graph metadata/counts,
- derive readability-only sections such as important nodes and lightweight file filters from existing graph data without adding new scanner intelligence,
- degrade gracefully if optional fields are missing,
- never claim more certainty than the graph's claim labels allow (it renders the banner + per-field claim badges).
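
One detail of embedding the graph as inline JSON is worth illustrating: file names from the scanned target end up inside a `<script>` element, so the serialized text should not be able to close that element. The sketch below shows a common escaping approach; it is not taken from `lib/renderer.mjs`, and the `embedGraphJson` name and `graph-data` element id are hypothetical.

```javascript
// Serialize the graph for an inline <script type="application/json"> tag.
// Escaping "<" keeps a path such as "</script>.md" from ending the tag
// early; U+2028/U+2029 are escaped for older JavaScript parsers.
// JSON.parse turns the escapes back into the original characters.
function embedGraphJson(graph) {
  const json = JSON.stringify(graph)
    .replace(/</g, "\\u003c")
    .replace(/\u2028/g, "\\u2028")
    .replace(/\u2029/g, "\\u2029");
  return `<script id="graph-data" type="application/json">${json}</script>`;
}

const tag = embedGraphJson({ nodes: [{ id: "docs/</script>.md" }], edges: [] });
console.log(tag);
```

In the browser, vanilla JavaScript could then read the element's `textContent` and pass it to `JSON.parse`, with no fetch and no external file.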
Renderer style source lives in
skills/codebase-visualize/assets/renderer.css and is embedded directly.
Generated maps do not load any external stylesheet at runtime.
The canonical self-contained runtime now lives inside the skill package:

```
skills/codebase-visualize/SKILL.md
skills/codebase-visualize/lib/version.mjs
skills/codebase-visualize/lib/scanner.mjs
skills/codebase-visualize/lib/renderer.mjs
skills/codebase-visualize/lib/runtime.mjs
skills/codebase-visualize/lib/commands.mjs
skills/codebase-visualize/scripts/scan.mjs
skills/codebase-visualize/scripts/render.mjs
skills/codebase-visualize/scripts/visualize.mjs
skills/codebase-visualize/assets/renderer.css
```

Root package.json commands and repository-only scripts invoke this canonical
runtime. Root scanner, renderer, and renderer-CSS runtime copies are forbidden
by scripts/check-package-integrity.mjs.
The package-integrity check copies only skills/codebase-visualize/ to a
disposable location, constructs a disposable target repository, and proves that
scan and render work without the source repository or its node_modules.
evals/cases.json declares useful and honest orientation outcomes for five
representative fixture shapes. scripts/check-evals.mjs copies the canonical
skill package and each fixture into OS temporary storage, invokes only the
copied public visualize.mjs entrypoint, grades structured graph evidence, and
removes all raw artifacts.
The eval layer is repository-only and does not ship inside the installed package. It proves discoverability, claim honesty, reference evidence, generated-output exclusion, and sibling-doc preservation. It intentionally does not grade browser usability, large-repository behavior, platform support, or human/agent interpretation quality.
Repository-only scripts/install.ps1 and scripts/install.sh resolve the
Claude Code or Codex destination from validated tool and scope arguments. They
copy only the canonical skills/codebase-visualize/ package.
Installers refuse existing targets by default. Explicit force installation
stages and verifies a complete package beside the destination, then performs a
clean exact-sync replacement. Installation never invokes the runtime or writes
visualization artifacts. scripts/check-installers.mjs verifies the full
PowerShell/bash, Claude/Codex, user/project matrix in disposable locations.
The exact accepted target topology and source-of-truth rules are recorded in
v1-contract-decisions.md.
- fact - observed from the filesystem.
- inference - heuristic guess (kind, risk hints).
- unknown - could not classify.
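
As an illustration of how these labels might attach to a node, the sketch below builds a per-field `claims` object from the claim column of the node field table earlier in this document. The `claimsFor` helper and the rule that a `kind` of `"unknown"` is labeled `unknown` rather than `inference` are assumptions.

```javascript
// Claim label per node field, following the node field table.
const NODE_FIELD_CLAIMS = {
  id: "fact", type: "fact", name: "fact", path: "fact", parent: "fact",
  depth: "fact", group: "fact", ext: "fact", size: "fact",
  kind: "inference", risk_hints: "inference",
};

// Build the `claims` object for the fields a node actually has.
function claimsFor(node) {
  const claims = {};
  for (const [field, claim] of Object.entries(NODE_FIELD_CLAIMS)) {
    if (!(field in node)) continue;
    // Assumption: an unclassified kind carries the "unknown" label.
    claims[field] = field === "kind" && node.kind === "unknown" ? "unknown" : claim;
  }
  return claims;
}

console.log(claimsFor({ id: "src/a.js", path: "src/a.js", kind: "source", size: 120 }));
```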
All tracked text is ASCII-only (see npm run check:ascii). Generated output
under docs/ai/visualize/ is exempt because it may embed non-ASCII content
from the scanned target. Root docs/ai/ orientation docs are tracked text
and are subject to the ASCII check if present in this repo.
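
What `npm run check:ascii` actually runs is not shown here. As a rough idea of the kind of check involved, the hypothetical helper below reports the first non-ASCII character in a piece of text.

```javascript
// Report the position of the first non-ASCII character, or null if the
// text is ASCII-only. Lines and columns are 1-based.
function findNonAscii(text) {
  const lines = text.split("\n");
  for (let i = 0; i < lines.length; i += 1) {
    const match = /[^\x00-\x7F]/.exec(lines[i]);
    if (match) {
      return {
        line: i + 1,
        column: match.index + 1,
        codePoint: lines[i].codePointAt(match.index),
      };
    }
  }
  return null;
}

console.log(findNonAscii("plain text\nmore text"));
console.log(findNonAscii("ok\ncaf\u00e9"));
```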
Out of scope: AST/import edges, MCP, Excalidraw export, incremental re-scan,
and configuration. See README.md for the full out-of-scope list.

```json
{
  "schema_version": "1.0.0",
  "metadata": {
    "generated_at": "ISO-8601",
    "target_root": "string",
    "producer_version": "SemVer string",
    "scanner_version": "0.2.0",
    "ignored_patterns": ["string", "..."],
    "claim_policy": { "fact": "...", "inference": "...", "unknown": "..." },
    "authority": "string (these outputs are generated/refreshable, not truth)",
    "counts": {
      "files": 0,
      "folders": 0,
      "by_kind": { "source": 0 },
      "references_edges": 0
    },
    "scan_warnings": []
  },
  "nodes": [ "Node, ..." ],
  "edges": [ "Edge, ..." ]
}
```