v0 architecture

A two-step pipeline with a static viewer. No server and no runtime web deps. Renderer CSS is authored directly in the canonical skill asset.

This document describes the current pre-v1 implementation. The active stable release roadmap and target installed-package topology are defined in ../V1_RELEASE_PLAN.md.

V1 architecture direction

V1 is an internal renovation, not a rewrite.

The stable architecture choices are:

  • Node.js and ESM for the scanner and renderer.
  • Node standard-library-only installed runtime.
  • Graph JSON as the scanner/renderer boundary.
  • One static self-contained HTML output.
  • Inline SVG for deterministic graph visualization.
  • Vanilla browser JavaScript inside the generated map.

The implementation should be modularized behind those contracts. Scanner filesystem walking, classification, Markdown references, graph assembly, renderer validation, layout, HTML generation, and browser behavior should become independently testable responsibilities.

The runtime uses authored plain CSS without a CSS build dependency. Canvas, WebGL, and browser frameworks remain deferred unless measured evidence shows the current artifact model cannot meet representative needs. See v1-stack-decision.md for the accepted decision and reconsideration triggers.

target repo --scan--> docs/ai/visualize/codebase-graph.json
                             |
                             v
                      docs/ai/visualize/codebase-map.html

Namespace and ownership model

The docs/ai/ directory inside a target repo is shared across tools. Each tool is intended to own a sub-namespace and should not write outside it during normal skill operation.

| Path | Owner | Notes |
| --- | --- | --- |
| docs/ai/visualize/codebase-graph.json | codebase-visualize | generated, refreshable |
| docs/ai/visualize/codebase-map.html | codebase-visualize | generated, refreshable |
| docs/ai/CODEBASE_MAP.md | codebase-orient | read-only for visualizer |
| docs/ai/CHANGE_SURFACES.md | codebase-orient | read-only for visualizer |
| docs/ai/OPEN_QUESTIONS.md | codebase-orient | read-only for visualizer |

Intended namespace rules:

  • Default output writes inside <target>/docs/ai/visualize/.
  • docs/ai/visualize/ is listed in IGNORED_PATHS (repo-relative path set). The filesystem walk skips that directory entirely: no nodes, no edges, no doc-reference parsing. Self-generated artifacts never appear in the graph.
  • Root docs/ai/*.md orientation docs are NOT in IGNORED_PATHS. They appear as normal graph nodes (kind: docs, risk hint: orient-docs) and are scanned for doc-reference edges like any other markdown file.
  • Reading orient doc content to enrich routing hints is deferred to a future version. The current scanner treats them as ordinary markdown nodes only.
  • If orientation docs are absent, the scanner works from filesystem scans alone.
  • Local agent/tool directories .agents/, .claude/, and .codex/ are excluded so project-local skill installations do not become target nodes.
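
The skip rules above can be sketched as a predicate over repo-relative POSIX paths. This is a minimal illustration, not the scanner's code: the name IGNORED_PATHS comes from this document, but the set contents shown here, the choice to keep the agent directories in the same set, and the helper name isIgnored are assumptions.

```javascript
// Hypothetical contents; the real IGNORED_PATHS set may hold other entries.
const IGNORED_PATHS = new Set([
  "docs/ai/visualize",
  ".agents",
  ".claude",
  ".codex",
]);

// Returns true when relPath is an ignored directory or lies beneath one.
// Matching is on whole path segments, anchored at the repo root.
function isIgnored(relPath) {
  const parts = relPath.split("/");
  for (let i = 1; i <= parts.length; i++) {
    if (IGNORED_PATHS.has(parts.slice(0, i).join("/"))) return true;
  }
  return false;
}
```

Because the match is segment-wise, docs/ai/CODEBASE_MAP.md stays in the graph while everything under docs/ai/visualize/ is skipped before any node is created.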

Public commands accept only --target, derive both output paths from that target, validate the namespace before writing, and replace outputs atomically.
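
As a rough sketch of that sequence (derive, validate, replace atomically), assuming a temp-file-plus-rename strategy: the function names deriveOutputs, assertInNamespace, and writeAtomic are hypothetical, and the real commands may validate more than this (symlinked directories, for example).

```javascript
import fs from "node:fs";
import path from "node:path";

// Both output paths are derived from the single --target value.
function deriveOutputs(target) {
  const root = path.resolve(target);
  const dir = path.join(root, "docs", "ai", "visualize");
  return {
    dir,
    graph: path.join(dir, "codebase-graph.json"),
    map: path.join(dir, "codebase-map.html"),
  };
}

// Refuse any file that would land outside <target>/docs/ai/visualize/.
function assertInNamespace(outputs, file) {
  const rel = path.relative(outputs.dir, path.resolve(file));
  const escapes = rel === ".." || rel.startsWith(".." + path.sep);
  if (rel === "" || escapes || path.isAbsolute(rel)) {
    throw new Error(`refusing to write outside namespace: ${file}`);
  }
}

// Assumed strategy: write a temp file beside the destination, then rename
// over it, so readers never observe a half-written output.
function writeAtomic(outputs, file, contents) {
  assertInNamespace(outputs, file);
  fs.mkdirSync(outputs.dir, { recursive: true });
  const tmp = `${file}.${process.pid}.tmp`;
  fs.writeFileSync(tmp, contents);
  fs.renameSync(tmp, file);
}
```

Keeping the temp file in the same directory as the destination matters: a rename is only a single-step replacement when source and destination are on the same filesystem.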

Graph schema (current 1.0.0)

{
  "schema_version": "1.0.0",
  "metadata": {
    "generated_at": "ISO-8601",
    "target_root": "string",
    "producer_version": "SemVer string",
    "scanner_version": "0.2.0",
    "ignored_patterns": ["string", "..."],
    "claim_policy": { "fact": "...", "inference": "...", "unknown": "..." },
    "authority": "string (these outputs are generated/refreshable, not truth)",
    "counts": { "files": 0, "folders": 0, "by_kind": { "source": 0 }, "references_edges": 0 },
    "scan_warnings": []
  },
  "nodes": [ "Node, ..." ],
  "edges": [ "Edge, ..." ]
}

Node

| field | type | claim | notes |
| --- | --- | --- | --- |
| id | string | fact | repo-relative POSIX path; "." for root |
| type | "file" or "folder" | fact | |
| name | string | fact | basename |
| path | string | fact | same as id |
| parent | string or null | fact | parent node id |
| depth | number | fact | distance from root |
| group | string | fact | top-level folder, or "(root)" |
| kind | enum | inference | docs/tests/config/scripts/source/styles/assets/unknown |
| ext | string | fact | files only |
| size | number | fact | bytes, files only |
| risk_hints | string[] | inference | files only |
| claims | object | - | per-field claim labels |
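
Several of the fact fields above follow mechanically from the id. The sketch below shows one way to derive them; the helper name pathFacts is hypothetical, and the root's name ("."), the root's depth (0), and the use of "." as the parent of top-level entries are assumptions. The group field is left out because its rule for top-level folders is not spelled out here.

```javascript
// Derive the path-based fields of a node from its repo-relative POSIX id.
function pathFacts(id) {
  if (id === ".") {
    // Assumed root values: only parent being null is implied by the schema.
    return { id, path: id, name: ".", parent: null, depth: 0 };
  }
  const parts = id.split("/");
  return {
    id,
    path: id, // same as id
    name: parts[parts.length - 1], // basename
    parent: parts.length === 1 ? "." : parts.slice(0, -1).join("/"),
    depth: parts.length, // distance from root, counting path segments
  };
}
```

For example, pathFacts("src/routes/users.js") yields parent "src/routes" and depth 3.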

Edge

| field | type | claim | notes |
| --- | --- | --- | --- |
| from | string | fact | source node id |
| to | string | fact | target node id |
| type | string | - | see edge types below |
| claim | string | - | "fact" or "inference" depending on type |
| evidence | object[] | - | optional; present on references edges |

Edge types

| type | claim | description |
| --- | --- | --- |
| "contains" | fact | folder parent/child relationship from the filesystem walk |
| "references" | inference | a markdown file explicitly links to another file via an inline link [label](path) or a path-like token inside a fenced code block |

Evidence (references edges only)

Each references edge carries an evidence array with one entry per source location (line plus matched text). An edge has multiple entries when the same target is referenced from more than one place in the same doc. Evidence text is bounded to 240 characters.

{
  "from": "docs/architecture.md",
  "to": "src/routes/users.js",
  "type": "references",
  "claim": "inference",
  "evidence": [
    { "line": 12, "text": "[route handlers](../src/routes/users.js)" },
    { "line": 20, "text": "src/routes/users.js" }
  ]
}
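
One way to build such an edge is to keep a map keyed by the (from, to) pair and append an evidence entry per match. This is a sketch only: addReference and MAX_EVIDENCE_CHARS are hypothetical names, and plain truncation is an assumed way of enforcing the 240-character bound.

```javascript
const MAX_EVIDENCE_CHARS = 240; // bound stated in this document

// Keep one references edge per (from, to) pair and append evidence to it.
function addReference(edges, from, to, line, text) {
  const key = `${from}\n${to}`;
  let edge = edges.get(key);
  if (!edge) {
    edge = { from, to, type: "references", claim: "inference", evidence: [] };
    edges.set(key, edge);
  }
  // Assumption: over-long matches are cut, not dropped.
  edge.evidence.push({ line, text: text.slice(0, MAX_EVIDENCE_CHARS) });
  return edge;
}

// Usage: two mentions of the same target collapse into one edge.
const edges = new Map();
addReference(edges, "docs/architecture.md", "src/routes/users.js", 12,
  "[route handlers](../src/routes/users.js)");
addReference(edges, "docs/architecture.md", "src/routes/users.js", 20,
  "src/routes/users.js");
```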

References edge resolution rules

  1. Inline links ([label](path)): resolved relative to the doc's directory. Images (![...](...)) are skipped.
  2. Code-fence tokens: a whitespace-delimited token that contains / and ends with a recognized extension. Resolved doc-relative first; falls back to repo-root-relative if the doc-relative path is not in the graph.
  3. External URLs (http:, https:, ftp:, //) and bare anchors (#) are always skipped.
  4. Only paths that resolve to an existing file node in the graph are emitted.
  5. Files under docs/ai/visualize/ are excluded from the graph entirely (IGNORED_PATHS). Root docs/ai/ orientation docs are not excluded; they are parsed as regular markdown and may produce references edges.
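
Rules 1, 3, and 4 can be sketched for inline links as follows. The regular expression, the helper names, and the stripping of a trailing #fragment are assumptions made for illustration; code-fence tokens and their repo-root fallback (rule 2) are not covered.

```javascript
// Resolve a link target against the doc's directory using POSIX-style ids.
// Returns null when the target climbs above the repo root.
function resolveDocRelative(docPath, target) {
  const stack = docPath.split("/").slice(0, -1);
  for (const part of target.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") {
      if (stack.length === 0) return null;
      stack.pop();
    } else {
      stack.push(part);
    }
  }
  return stack.join("/");
}

// External URLs and bare anchors are always skipped (rule 3).
const SKIP = /^(https?:|ftp:|\/\/|#)/i;

// Extract inline link targets from one markdown line; images are skipped.
function inlineLinkTargets(line) {
  const targets = [];
  for (const m of line.matchAll(/(!?)\[[^\]]*\]\(([^)\s]+)\)/g)) {
    if (m[1] === "!") continue; // image, not a link
    if (SKIP.test(m[2])) continue;
    const target = m[2].split("#")[0]; // assumed: drop a trailing fragment
    if (target !== "") targets.push(target);
  }
  return targets;
}

// Emit only targets that resolve to an existing file node id (rule 4).
function resolveLine(docPath, line, fileIds) {
  return inlineLinkTargets(line)
    .map((t) => resolveDocRelative(docPath, t))
    .filter((id) => id !== null && fileIds.has(id));
}
```

With fileIds containing "src/routes/users.js", the line "[route handlers](../src/routes/users.js)" in docs/architecture.md resolves to that id, matching the evidence example earlier in this document.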

Required-field and filesystem rules

  • lib/schema.mjs is the validation authority for schema major version 1.
  • Node ids are unique and equal their paths; every non-root node has an existing parent; containment edges must agree with that parent.
  • Reference edges target existing nodes and carry non-empty inferred evidence.
  • Metadata counts must exactly match the graph.
  • Symlinks, including broken symlinks, are not traversed and produce a warning.
  • Unreadable directories are skipped with a warning. Files whose stat call fails are skipped with a warning. Unreadable Markdown remains a file node, but its references are skipped with a warning.
  • Warnings use stable repo-relative paths and are sorted deterministically.
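
A few of these rules can be expressed as a small structural check. This is an illustrative sketch, not the contents of lib/schema.mjs: the function name checkGraph is hypothetical, and it assumes that contains edges point from parent to child and that counts.files counts file nodes.

```javascript
// Collect violations of some structural rules; an empty array means they pass.
function checkGraph(graph) {
  const errors = [];
  const byId = new Map();
  for (const node of graph.nodes) {
    if (byId.has(node.id)) errors.push(`duplicate id: ${node.id}`);
    if (node.id !== node.path) errors.push(`id/path mismatch: ${node.id}`);
    byId.set(node.id, node);
  }
  for (const node of graph.nodes) {
    if (node.id !== "." && !byId.has(node.parent)) {
      errors.push(`missing parent for: ${node.id}`);
    }
  }
  let references = 0;
  for (const edge of graph.edges) {
    if (edge.type === "contains") {
      // Assumed direction: from = parent folder, to = child.
      const child = byId.get(edge.to);
      if (!child || child.parent !== edge.from) {
        errors.push(`contains edge disagrees with parent: ${edge.from} -> ${edge.to}`);
      }
    } else if (edge.type === "references") {
      references += 1;
      if (!byId.has(edge.to)) errors.push(`dangling reference: ${edge.to}`);
      if (!Array.isArray(edge.evidence) || edge.evidence.length === 0) {
        errors.push(`reference without evidence: ${edge.from} -> ${edge.to}`);
      }
    }
  }
  const files = graph.nodes.filter((n) => n.type === "file").length;
  const counts = graph.metadata.counts;
  if (counts.files !== files) errors.push("counts.files does not match graph");
  if (counts.references_edges !== references) {
    errors.push("counts.references_edges does not match graph");
  }
  return errors;
}
```

Returning a list of messages instead of throwing on the first problem makes it easier to report every violation in one pass.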

Renderer contract

lib/renderer.mjs consumes a graph that conforms to the schema above and emits a single self-contained HTML file. It must:

  • embed the graph as inline JSON (no fetch / no external files),
  • inline all CSS and JS,
  • inline renderer CSS from skills/codebase-visualize/assets/renderer.css,
  • surface a compact overview from existing graph metadata/counts,
  • derive readability-only sections such as important nodes and lightweight file filters from existing graph data without adding new scanner intelligence,
  • degrade gracefully if optional fields are missing,
  • never claim more certainty than the graph's claim labels allow (it renders the banner + per-field claim badges).

Renderer style source lives in skills/codebase-visualize/assets/renderer.css and is embedded directly. Generated maps do not load any external stylesheet at runtime.
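
Embedding the graph as inline JSON needs some care, because scanned paths and evidence text could contain sequences such as a closing script tag. The sketch below shows one common approach, assuming a JSON script block; inlineGraphJson, renderHtml, and the element id graph-data are hypothetical names, not the renderer's actual API.

```javascript
// Serialize the graph for a <script type="application/json"> block.
// Escaping "<" keeps "</script>" or "<!--" in scanned content from
// terminating the block early; the result is still valid JSON.
function inlineGraphJson(graph) {
  return JSON.stringify(graph)
    .replace(/</g, "\\u003c")
    .replace(/\u2028/g, "\\u2028")
    .replace(/\u2029/g, "\\u2029");
}

// Assemble a single self-contained page: CSS and data are both inlined.
function renderHtml(graph, css) {
  return [
    "<!doctype html>",
    '<html lang="en"><head><meta charset="utf-8">',
    `<style>${css}</style>`,
    "</head><body>",
    `<script type="application/json" id="graph-data">${inlineGraphJson(graph)}</script>`,
    "</body></html>",
  ].join("\n");
}
```

Browser code in the generated page could then read the data with JSON.parse(document.getElementById("graph-data").textContent), with no fetch involved.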

Current distribution topology

The canonical self-contained runtime now lives inside the skill package:

skills/codebase-visualize/SKILL.md
skills/codebase-visualize/lib/version.mjs
skills/codebase-visualize/lib/scanner.mjs
skills/codebase-visualize/lib/renderer.mjs
skills/codebase-visualize/lib/runtime.mjs
skills/codebase-visualize/lib/commands.mjs
skills/codebase-visualize/scripts/scan.mjs
skills/codebase-visualize/scripts/render.mjs
skills/codebase-visualize/scripts/visualize.mjs
skills/codebase-visualize/assets/renderer.css

Root package.json commands and repository-only scripts invoke this canonical runtime. Root scanner, renderer, and renderer-CSS runtime copies are forbidden by scripts/check-package-integrity.mjs.

The package-integrity check copies only skills/codebase-visualize/ to a disposable location, constructs a disposable target repository, and proves that scan and render work without the source repository or its node_modules.

Behavioral evaluation architecture

evals/cases.json declares useful and honest orientation outcomes for five representative fixture shapes. scripts/check-evals.mjs copies the canonical skill package and each fixture into OS temporary storage, invokes only the copied public visualize.mjs entrypoint, grades structured graph evidence, and removes all raw artifacts.

The eval layer is repository-only and does not ship inside the installed package. It proves discoverability, claim honesty, reference evidence, generated-output exclusion, and sibling-doc preservation. It intentionally does not grade browser usability, large-repository behavior, platform support, or human/agent interpretation quality.

Installer architecture

Repository-only scripts/install.ps1 and scripts/install.sh resolve the Claude Code or Codex destination from validated tool and scope arguments. They copy only the canonical skills/codebase-visualize/ package.

Installers refuse existing targets by default. Explicit force installation stages and verifies a complete package beside the destination, then performs a clean exact-sync replacement. Installation never invokes the runtime or writes visualization artifacts. scripts/check-installers.mjs verifies the full PowerShell/bash, Claude/Codex, user/project matrix in disposable locations.

The exact accepted target topology and source-of-truth rules are recorded in v1-contract-decisions.md.

Claim labels

  • fact - observed from the filesystem.
  • inference - heuristic guess (kind, risk hints).
  • unknown - could not classify.

Conventions

All tracked text is ASCII-only (see npm run check:ascii). Generated output under docs/ai/visualize/ is exempt because it may embed non-ASCII content from the scanned target. Root docs/ai/ orientation docs are tracked text and are subject to the ASCII check if present in this repo.
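
For illustration, an ASCII check can be as small as the function below. It is a sketch under assumptions, not the script behind npm run check:ascii; the name findNonAscii and the shape of the return value are hypothetical.

```javascript
// Return the position of the first non-ASCII character, or null if none.
function findNonAscii(text) {
  const lines = text.split("\n");
  for (let i = 0; i < lines.length; i++) {
    const col = lines[i].search(/[^\x00-\x7F]/);
    if (col !== -1) {
      return { line: i + 1, column: col + 1, codePoint: lines[i].codePointAt(col) };
    }
  }
  return null;
}
```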

Deliberately deferred

The following are deliberately deferred: AST/import edges, MCP, Excalidraw export, incremental re-scan, and configuration. See README.md for the full out-of-scope list.