7 changes: 7 additions & 0 deletions .agents/rules/codemap.mdc
@@ -16,6 +16,9 @@ A local database (default **`.codemap.db`**) indexes structure: symbols, imports
| ------- | ----------------- | ----- |
| **Default** — from this clone | `bun src/index.ts` | `bun src/index.ts query "<SQL>"` |
| Same entry | `bun run dev` | (same as first row) |
| JSON output | — | `bun src/index.ts query --json "<SQL>"` |
| Recipe | — | `bun src/index.ts query --recipe fan-out` (see **`bun src/index.ts query --help`**) |
| Recipe catalog / SQL | — | `bun src/index.ts query --recipes-json` · `bun src/index.ts query --print-sql fan-out` |

After **`bun run build`**, **`node dist/index.mjs`** matches the published **`codemap`** binary (same flags). **`bun link`** / global **`codemap`** also work when testing the packaged CLI.

@@ -66,6 +69,10 @@ If the question looks like any of these → use the index:
bun src/index.ts query "<SQL>"
```

**Row count:** The CLI does **not** impose a maximum number of rows. Add **`LIMIT`** (and **`ORDER BY`**) in SQL when you need a bounded list. For automation and multi-row answers, prefer **`bun src/index.ts query --json`** — stdout is a JSON array; on failure, stdout is **`{"error":"<message>"}`** and the process exits **1**.

**Verbatim answers:** When the user asks for lists, counts, or enumerated structural data from the index, **paste the query output, or summarize it without inventing or omitting rows**; do not substitute a prose “summary” that drops rows the user asked to see. Prefer **`--json`** so the full result set is unambiguous.

## Quick reference queries

| I need to... | Query |
40 changes: 40 additions & 0 deletions .agents/skills/codemap/SKILL.md
@@ -16,6 +16,44 @@ bun src/index.ts query "<SQL>"

After **`bun run build`**, **`node dist/index.mjs query …`** or a linked **`codemap`** binary matches the published CLI. Use **`--root`** / **`CODEMAP_ROOT`** to index another tree.

## Query output and agents

- **`bun src/index.ts query --json`** prints a **JSON array** of row objects to stdout. On SQL error, stdout is **`{"error":"<message>"}`** and the process exits **1**.
- The CLI **does not cap** how many rows SQLite returns — add **`LIMIT`** and **`ORDER BY`** in SQL when you need a bounded list.
- When answering with structural facts from the index (lists of paths, symbols, dependency edges), **ground the answer in the query rows** — do not invent or silently drop rows. Prefer **`--json`** for large or multi-column results.
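
The two stdout shapes described above (a JSON array of rows on success, `{"error":"<message>"}` on failure) can be handled by a small consumer. The following is a minimal sketch, assuming the caller has already captured stdout as a string; `QueryResult` and `parseQueryStdout` are hypothetical names for illustration and are not part of Codemap.

```typescript
// Sketch only: these names are hypothetical, not Codemap APIs.
type Row = Record<string, unknown>;

type QueryResult =
  | { ok: true; rows: Row[] }
  | { ok: false; error: string };

// Classify the stdout of `query --json` as either rows or an error message.
function parseQueryStdout(stdout: string): QueryResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(stdout);
  } catch {
    return { ok: false, error: "stdout was not valid JSON" };
  }
  // Success: a JSON array of row objects.
  if (Array.isArray(parsed)) {
    return { ok: true, rows: parsed as Row[] };
  }
  // Failure: an object with a string `error` field.
  if (
    typeof parsed === "object" &&
    parsed !== null &&
    typeof (parsed as { error?: unknown }).error === "string"
  ) {
    return { ok: false, error: (parsed as { error: string }).error };
  }
  return { ok: false, error: "unexpected JSON shape" };
}
```

Checking the shape of stdout as well as the exit code keeps the consumer from treating an error object as an empty result set.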

## Agent-friendly SQL recipes

Replace placeholders (`'...'`) with your module path, file glob, or symbol name.

**CLI shortcuts:** **`bun src/index.ts query --recipe <id>`** runs bundled SQL (optional **`--json`**). **`bun src/index.ts query --recipes-json`** prints every bundled recipe (**`id`**, **`description`**, **`sql`**) as JSON (no index / DB required). **`bun src/index.ts query --print-sql <id>`** prints one recipe’s SQL only. Ids include **`fan-out`**, **`fan-out-sample`** (**`GROUP_CONCAT`** samples), **`fan-out-sample-json`** (same, but **`json_group_array`** — needs SQLite JSON1), **`fan-in`**, **`index-summary`**, **`files-largest`**, **`components-by-hooks`**, **`markers-by-kind`** — see **`bun src/index.ts query --help`**. The fan-out rows match the SQL below; others align with “Conditional aggregation”, “Codebase statistics”, and component sections later in this skill.

**Top files by dependency fan-out:**

```sql
SELECT from_path, COUNT(*) AS deps
FROM dependencies
GROUP BY from_path
ORDER BY deps DESC
LIMIT 10;
```

**Same ranking, plus up to five sample targets per file** (uses a correlated subquery; adjust **`LIMIT 5`** as needed):

```sql
SELECT d.from_path,
COUNT(*) AS deps,
(SELECT GROUP_CONCAT(to_path, ' | ')
FROM (SELECT to_path FROM dependencies d2 WHERE d2.from_path = d.from_path LIMIT 5))
AS sample_targets
FROM dependencies d
GROUP BY d.from_path
ORDER BY deps DESC
LIMIT 10;
```

**JSON array samples (JSON1):** replace **`GROUP_CONCAT`** with **`json_group_array(to_path)`** in that subquery if your SQLite build has JSON1 — or use **`bun src/index.ts query --recipe fan-out-sample-json`**.

## Schema

### `files` — Every indexed file
@@ -196,6 +234,8 @@ SELECT name, file_path, hooks_used
FROM components WHERE hooks_used LIKE '%useTheme%';

-- Components with most hooks (complexity indicator)
-- `json_array_length` requires SQLite JSON1. For a portable ranking, use
-- `bun src/index.ts query --recipe components-by-hooks` (comma-based count on the stored JSON array).
SELECT name, file_path,
json_array_length(hooks_used) as hook_count
FROM components ORDER BY hook_count DESC LIMIT 15;
34 changes: 34 additions & 0 deletions .changeset/codemap-query-and-golden.md
@@ -0,0 +1,34 @@
---
"@stainless-code/codemap": patch
---

**Query CLI**

- **`codemap query --json`**: print a JSON array of result rows to stdout (and **`{"error":"…"}`** on SQL errors) for agents and automation. Document that the query subcommand does **not** cap rows — use SQL **`LIMIT`** for bounded results. Update bundled agent rule and skill with **`--json`** preference, verbatim structural answers, and generic SQL recipes (fan-out + sample targets).

- **`codemap query --recipe <id>`** for bundled read-only SQL so agents can run common structural queries without embedding SQL on the command line. **`--json`** works with recipes the same way as ad-hoc SQL. Bundled ids include dependency **`fan-out`** / **`fan-out-sample`** / **`fan-out-sample-json`** (JSON1 **`json_group_array`**) / **`fan-in`**, index **`index-summary`**, **`files-largest`**, React **`components-by-hooks`** (comma-based hook count, no JSON1), and **`markers-by-kind`**. The benchmark suite uses the **`fan-out`** recipe SQL for an indexed-path scenario; docs clarify that recipes add no extra query cost vs pasting the same SQL.

- **Recipe discovery (no index / DB):** **`codemap query --recipes-json`** prints all bundled recipes (**`id`**, **`description`**, **`sql`**) as JSON. **`codemap query --print-sql <id>`** prints one recipe’s SQL. **`listQueryRecipeCatalog()`** in **`src/cli/query-recipes.ts`** is the single derived view of **`QUERY_RECIPES`** for the JSON output.

**Golden tests**

- **`bun run test:golden`**: index **`fixtures/minimal`**, run scenarios from **`fixtures/golden/scenarios.json`**, and compare query JSON to **`fixtures/golden/minimal/`**. Use **`bun scripts/query-golden.ts --update`** after intentional fixture or schema changes. Documented in **benchmark.md** and **CONTRIBUTING**.

**Query robustness**

- With **`--json`**, **`{"error":"…"}`** is printed for invalid SQL, database open failures, and **`codemap query`** bootstrap failures (config / resolver setup), not only bad SQL. The CLI sets **`process.exitCode`** instead of **`process.exit`** so piped stdout is not cut off mid-stream.
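
The `process.exitCode` point can be illustrated with a short sketch. Only the idea (set the exit code and let the process end naturally, so buffered stdout can flush) comes from the note above; `reportJsonError` and both interfaces are hypothetical, and the trailing newline is an assumption.

```typescript
// Sketch only: hypothetical helper, not Codemap's actual implementation.
interface Writer {
  write(chunk: string): unknown;
}

interface ExitCodeHolder {
  exitCode?: unknown;
}

// Print the JSON error payload, then record the failure without exiting.
// Calling process.exit(1) here could end the process before a piped
// stdout has been fully flushed; setting exitCode avoids that.
function reportJsonError(message: string, out: Writer, proc: ExitCodeHolder): void {
  out.write(JSON.stringify({ error: message }) + "\n");
  proc.exitCode = 1;
}

// In a Node or Bun CLI this would be called as:
//   reportJsonError("no such table: nope", process.stdout, process);
```

Injecting the writer and the exit-code holder keeps the helper testable without touching the real process.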

**Benchmark & `CODEMAP_BENCHMARK_CONFIG`**

- Each **`indexedSql`** in custom scenario JSON is validated as a single read-only **`SELECT`** (or **`WITH` … `SELECT`**) — DDL/DML and **`RETURNING`** are rejected before execution.
- Config file paths are resolved from **`process.cwd()`** (see **benchmark.md**). **`traditional.regex`** strings are developer-controlled (local JSON); **`files`** mode compiles the regex once per scenario.
- Overlapping **globs** in the traditional path are **deduplicated** so **Files read** / **Bytes read** count each path once.
- The default **components in `shop/`** scenario uses a **`LIKE`** filter aligned with the traditional globs under **`components/shop/`** (**\*.tsx** and **\*.jsx**, matching **`components`** rows from the parser) and avoids unrelated paths such as **`workshop`**.
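
As an illustration of the read-only check on **`indexedSql`**, here is a deliberately conservative, text-based sketch. It is not the validator in this repository: `isSingleReadOnlySelect` is a hypothetical name, the keyword list is an assumption, and a production check would tokenize the SQL instead of pattern-matching it.

```typescript
// Sketch only: hypothetical, text-based approximation of a read-only check.
// It can reject valid queries (for example, a forbidden keyword or a
// semicolon inside a string literal), which is the safe direction to fail.
function isSingleReadOnlySelect(sql: string): boolean {
  // Drop surrounding whitespace and one trailing semicolon.
  const body = sql.trim().replace(/;\s*$/, "");
  if (body.length === 0) return false;

  // Any remaining semicolon suggests more than one statement.
  if (body.includes(";")) return false;

  // The statement must begin with SELECT or WITH.
  if (!/^(select|with)\b/i.test(body)) return false;

  // Reject DDL/DML keywords and RETURNING anywhere in the text.
  const forbidden =
    /\b(insert|update|delete|create|drop|alter|attach|detach|pragma|vacuum|reindex|returning)\b|\breplace\s+into\b/i;
  return !forbidden.test(body);
}
```

The `replace\s+into` pattern is kept separate so that the scalar SQL function `replace(...)` remains usable in read-only queries.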

**Recipes (determinism)**

- Bundled recipe SQL adds stable secondary **`ORDER BY`** columns (and orders inner **`LIMIT`** samples) so **`--recipe`** / **`--json`** output does not vary on aggregate ties.
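
Why a secondary sort key matters can be shown outside SQL. The sketch below mirrors an ordering such as `ORDER BY deps DESC, from_path ASC` in TypeScript; the choice of `from_path` as the tie-breaker is an assumption for illustration (the actual secondary columns are in the bundled recipe SQL), and `sortFanOut` is a hypothetical name.

```typescript
// Sketch only: hypothetical row shape and helper.
interface FanOutRow {
  from_path: string;
  deps: number;
}

// Sort by deps descending, then by from_path ascending, so rows that tie
// on deps always come out in the same order regardless of input order.
function sortFanOut(rows: FanOutRow[]): FanOutRow[] {
  return [...rows].sort((a, b) => {
    if (a.deps !== b.deps) return b.deps - a.deps;
    if (a.from_path < b.from_path) return -1;
    if (a.from_path > b.from_path) return 1;
    return 0;
  });
}
```

Without the second comparison, two files with the same `deps` could swap places between runs, which would make golden snapshots and agent output unstable.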

**External QA**

- **`bun run qa:external`**: **`--max-files`** and **`--max-symbols`** must be positive integers (invalid values throw before indexing).
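
A "positive integer or throw" check like the one described for **`--max-files`** and **`--max-symbols`** might look like the sketch below. `parsePositiveInt` and its error message are hypothetical and are not taken from the QA script.

```typescript
// Sketch only: hypothetical helper, not the QA script's actual code.
// Accepts strings made solely of decimal digits that denote a value > 0.
function parsePositiveInt(flag: string, raw: string): number {
  if (!/^[0-9]+$/.test(raw)) {
    throw new Error(`${flag} must be a positive integer, got "${raw}"`);
  }
  const value = Number(raw);
  if (!Number.isSafeInteger(value) || value <= 0) {
    throw new Error(`${flag} must be a positive integer, got "${raw}"`);
  }
  return value;
}
```

Validating the raw string before converting it rejects inputs such as `1.5`, `-5`, and `10abc` that a bare `parseInt` would partly accept.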
8 changes: 7 additions & 1 deletion .github/CONTRIBUTING.md
@@ -11,7 +11,9 @@ Codemap is in **bootstrap / extraction** phase. Before large PRs, please open an
bun install # runs `prepare` → Husky git hooks
bun run dev # same as `bun src/index.ts` — CLI from source
bun test
bun run check # format + lint + tests + typecheck + build
bun run test:golden # golden SQL vs fixtures/minimal (also runs at end of `bun run check`)
bun run test:golden:external # Tier B: local tree via CODEMAP_ROOT / --root (not in CI)
bun run check # format + lint + tests + typecheck + build + test:golden
bun run clean # remove untracked/ignored build artifacts (keeps `.env`, `.codemap.db*`)
bun run check-updates # interactive dependency updates (`bun update -i --latest`)
```
@@ -36,6 +38,8 @@ Then open a PR on GitHub into **`main`**.
- **Public API** — Anything exported from the package entry (`src/index.ts` → `src/api.ts`, `config.ts`, shared types) should have **JSDoc** that reads well in hovers and in published typings.
- **Layers** — Keep boundaries clear: [architecture.md](../docs/architecture.md) (`cli` → `application` → infrastructure). Don’t let CLI concerns leak into parsers or the DB layer.
- **Before you open / update a PR** — `bun run check` (or at least `bun run test` + `bun run typecheck` while iterating).
- **Golden queries (Tier A)** — If you change `fixtures/minimal/` or schema/query behavior expected by [fixtures/golden/](../fixtures/golden/), run `bun scripts/query-golden.ts --update`, review diffs, and commit updated JSON under `fixtures/golden/minimal/`. Prefer **fixing the indexer** when output changes for the wrong reason; only refresh goldens when the new rows are correct. See [docs/golden-queries.md](../docs/golden-queries.md).
- **Golden queries (Tier B)** — Against a **local** clone, use `bun run test:golden:external` with `CODEMAP_ROOT` / `--root`. Copy [fixtures/golden/scenarios.external.example.json](../fixtures/golden/scenarios.external.example.json) to `scenarios.external.json` if you need custom scenarios; goldens under `fixtures/golden/external/` are gitignored — do not commit snapshots from proprietary trees.
- **Style** — Match Oxfmt/Oxlint; prefer **straight-line code** and extracted helpers over long nested blocks.

**Editor (VS Code):** [`.vscode/extensions.json`](../.vscode/extensions.json) lists recommended extensions (Bun, Oxc, TypeScript native preview, etc.). [`.vscode/settings.json`](../.vscode/settings.json) enables Oxc format on save and `tsgo`. Formatting and lint rules live in [`.oxfmtrc.json`](../.oxfmtrc.json) and [`.oxlintrc.json`](../.oxlintrc.json) (no framework-specific options beyond defaults).
@@ -46,6 +50,8 @@ Then open a PR on GitHub into **`main`**.

Do **not** add Codemap as a dependency to the bench repo. In **this** repo, copy `.env.example` to `.env` and set **`CODEMAP_TEST_BENCH`** to an **absolute path** to the other clone, then run `bun src/index.ts` as usual. See [docs/benchmark.md § Indexing another project](../docs/benchmark.md#indexing-another-project).

**One-shot QA (index + disk checks + benchmark):** `CODEMAP_ROOT=/absolute/path/to/app bun run qa:external` (or set **`CODEMAP_TEST_BENCH`** in `.env`; optional `--root` overrides). Optional **`--max-files`** / **`--max-symbols`** take positive integers; the defaults cap sampling. The script validates that indexed paths exist, spot-checks symbol lines against files, prints sample SQL rows, then runs `src/benchmark.ts`. Do **not** add external app source to this repository.

Releases: **[@changesets/cli](https://github.com/changesets/changesets)** — run **`bun run changeset`** when your PR should bump the version; see [docs/packaging.md § Releases](../docs/packaging.md#releases).

**Issues:** use [GitHub issue templates](https://github.com/stainless-code/codemap/issues/new/choose) — **Core bug** vs **Adapter proposal** (see `.github/ISSUE_TEMPLATE/`).
3 changes: 3 additions & 0 deletions .github/workflows/ci.yml
@@ -92,6 +92,9 @@ jobs:
- name: Run unit tests with coverage
run: bun run test:coverage

- name: Golden query regression (fixtures/minimal)
run: bun run test:golden

build:
name: 🧰 Build
needs: skip-ci
8 changes: 8 additions & 0 deletions .gitignore
@@ -8,3 +8,11 @@ dist/
.env.*
!.env.example
tsconfig.lint-staged.json

# Tier B golden query outputs (local / private trees — see docs/golden-queries.md)
fixtures/golden/external/
fixtures/golden/scenarios.external.json

# QA chat prompts tied to a private/local index (paths + product names)
fixtures/qa/*.local.md
fixtures/benchmark/*.local.json
33 changes: 24 additions & 9 deletions README.md
@@ -5,7 +5,7 @@
- **Not** full-text search or grep on arbitrary strings — use those when you need raw file-body search.
- **Is** a fast, token-efficient way to navigate **structure**: definitions, imports, dependency direction, components, and other extracted facts.

**Documentation:** [docs/README.md](docs/README.md) is the hub (topic index + single-source rules). Topics: [architecture](docs/architecture.md), [agents](docs/agents.md) (`codemap agents init`), [benchmark](docs/benchmark.md), [packaging](docs/packaging.md), [roadmap](docs/roadmap.md), [why Codemap](docs/why-codemap.md). **Bundled rules/skills:** [`.agents/rules/`](.agents/rules/), [`.agents/skills/codemap/SKILL.md`](.agents/skills/codemap/SKILL.md). **Consumers:** [.github/CONTRIBUTING.md](.github/CONTRIBUTING.md).
**Documentation:** [docs/README.md](docs/README.md) is the hub (topic index + single-source rules). Topics: [architecture](docs/architecture.md), [agents](docs/agents.md) (`codemap agents init`), [benchmark](docs/benchmark.md), [golden queries](docs/golden-queries.md), [packaging](docs/packaging.md), [roadmap](docs/roadmap.md), [why Codemap](docs/why-codemap.md). **Bundled rules/skills:** [`.agents/rules/`](.agents/rules/), [`.agents/skills/codemap/SKILL.md`](.agents/skills/codemap/SKILL.md). **Consumers:** [.github/CONTRIBUTING.md](.github/CONTRIBUTING.md).

---

@@ -37,6 +37,16 @@ codemap --full

# SQL against the index (after at least one index run)
codemap query "SELECT name, file_path FROM symbols LIMIT 10"
# With --json: JSON array on stdout (agents / scripts); {"error":"..."} on bad SQL, DB open failure, or query bootstrap failure (config/resolver)
codemap query --json "SELECT name, file_path FROM symbols LIMIT 10"
# Query is not row-capped — add LIMIT in SQL for large selects
# Bundled SQL (same as skill examples): fan-out rankings
codemap query --recipe fan-out
codemap query --json --recipe fan-out-sample
# List bundled recipes as JSON, or print one recipe's SQL (no DB required)
codemap query --recipes-json
codemap query --print-sql fan-out
# `components-by-hooks` ranks by hook count without SQLite JSON1 (comma-based count on the stored JSON array).

# Another project
codemap --root /path/to/repo --full
@@ -81,12 +91,15 @@ const rows = cm.query("SELECT name FROM symbols LIMIT 5");

Tooling: **Oxfmt**, **Oxlint**, **tsgo** (`@typescript/native-preview`).

| Command | Purpose |
| ------------------------------------ | ---------------------------------------------------------------- |
| `bun run dev` | Run the CLI from source (same as `bun src/index.ts`) |
| `bun run check` | Build, format check, lint, tests, typecheck — run before pushing |
| `bun run fix` | Apply lint fixes, then format |
| `bun run test` / `bun run typecheck` | Focused checks |
| Command | Purpose |
| ------------------------------------ | -------------------------------------------------------------------------- |
| `bun run dev` | Run the CLI from source (same as `bun src/index.ts`) |
| `bun run check` | Build, format check, lint, tests, typecheck — run before pushing |
| `bun run fix` | Apply lint fixes, then format |
| `bun run test` / `bun run typecheck` | Focused checks |
| `bun run test:golden` | SQL snapshot regression on `fixtures/minimal` (included in `check`) |
| `bun run test:golden:external` | Tier B: local tree via `CODEMAP_*` / `--root` (not in default `check`) |
| `bun run qa:external` | Index + sanity checks + benchmark on `CODEMAP_ROOT` / `CODEMAP_TEST_BENCH` |

```bash
bun install
Expand All @@ -100,11 +113,13 @@ bun run fix # oxlint --fix, then oxfmt

## Benchmark

Use a **real** project path (the repo must exist on disk). See [docs/benchmark.md § Indexing another project](docs/benchmark.md#indexing-another-project).

```bash
CODEMAP_ROOT=/path/to/indexed-repo bun src/benchmark.ts
CODEMAP_ROOT=/absolute/path/to/indexed-repo bun src/benchmark.ts
```

Details: [docs/benchmark.md](docs/benchmark.md).
Optional **`CODEMAP_BENCHMARK_CONFIG`** for repo-specific scenarios: [docs/benchmark.md § Custom scenarios](docs/benchmark.md#custom-scenarios-codemap_benchmark_config).

---
