Commit 85755bb

docs: add AGENTS.md and reference it from CLAUDE.md (#86)
1 parent cfc2962 commit 85755bb

2 files changed

Lines changed: 165 additions & 164 deletions

AGENTS.md

Lines changed: 164 additions & 0 deletions
@@ -0,0 +1,164 @@
1+
# AGENTS.md
2+
3+
For detailed subsystem docs, see [docs/index.md](./docs/index.md).
4+
5+
## Project Overview
6+
7+
InferenceX App — Next.js 16 dashboard for ML inference benchmark data. DB-backed with Neon PostgreSQL, React Query for data fetching, D3.js for charts.
8+
9+
- **Framework**: Next.js 16 (App Router, Turbopack)
10+
- **Language**: TypeScript (strict mode)
11+
- **Styling**: Tailwind CSS 4 + shadcn/ui (Radix UI primitives)
12+
- **Charts**: D3.js — shared library at `src/lib/d3-chart/`, scatter/GPU/bar charts
13+
- **Data**: Neon DB → API routes (`/api/v1/*`) → React Query hooks → Context providers
14+
- **Deployment**: Vercel with daily cron-triggered rebuilds
15+
- **Analytics**: PostHog (`posthog-js`) via `@/lib/analytics` — recommended on all interactive elements (autocapture provides baseline coverage)
16+
17+
## Quick Start
18+
19+
```bash
pnpm install      # Install dependencies
pnpm dev          # Dev server with Turbopack (http://localhost:3000)
pnpm build        # Production build
pnpm typecheck    # TypeScript type checking (all packages)
pnpm lint         # Lint with oxlint
pnpm lint:fix     # Auto-fix lint issues
pnpm fmt          # Format check with oxfmt
pnpm fmt:fix      # Auto-fix formatting
pnpm test:unit    # Vitest unit tests
pnpm test:e2e     # Cypress E2E tests
```
31+
32+
## Monorepo Structure
33+
34+
```
packages/
├── app/                  # Next.js frontend (@semianalysisai/inferencex-app)
│   └── src/
│       ├── app/          # Pages, layouts, API routes (/api/v1/*)
│       ├── components/   # Tab sections: inference/, evaluation/, historical-trends/,
│       │                 # throughput-calculator/, reliability/, gpu-specs/, ui/
│       ├── hooks/api/    # React Query hooks (use-benchmarks, use-availability, etc.)
│       └── lib/          # Utilities, constants, d3-chart/, chart-utils, data-mappings
├── constants/            # Shared constants (GPU keys, model mappings)
└── db/                   # DB layer, ETL, migrations, queries, ingest scripts
```
46+
47+
**Path alias**: `@/*` → `packages/app/src/`
48+
49+
## Data Architecture
50+
51+
```
Frontend → React Query hooks (src/hooks/api/) → /api/v1/* routes → Neon DB
```
54+
55+
API routes (`packages/app/src/app/api/v1/`):
56+
57+
- `benchmarks?model=X&date=YYYY-MM-DD` — latest benchmark per (config, concurrency)
58+
- `benchmarks/history?model=X&gpu=Y` — historical benchmark data for trend charts
59+
- `workflow-info?date=YYYY-MM-DD` — runs, changelogs, configs for a date
60+
- `availability` → `Record<model, dates[]>`
61+
- `reliability` — raw `ReliabilityRow[]`
62+
- `evaluations` — raw `EvalRow[]`
63+
- `server-log` — retrieve benchmark runtime logs
64+
- `github-stars` — star count for the repo
65+
- `invalidate` — invalidate API cache (admin)
66+
67+
**API routes return raw DB data** — no presentation logic. Frontend handles all transformations.
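
As a rough illustration of how a caller might address the `benchmarks` route above, here is a minimal sketch of a URL builder. The helper name `buildBenchmarksUrl`, the `BenchmarkQuery` type, and the date check are assumptions made for this example; they are not taken from the repository, and the real React Query hooks may build requests differently.

```typescript
// Hypothetical query shape for the `benchmarks?model=X&date=YYYY-MM-DD` route.
interface BenchmarkQuery {
  model: string;
  date: string; // expected as YYYY-MM-DD
}

// Shape check only: four digits, two digits, two digits. Does not validate the calendar date.
const DATE_PATTERN = /^\d{4}-\d{2}-\d{2}$/;

// Hypothetical helper: returns a relative URL under /api/v1/.
function buildBenchmarksUrl(query: BenchmarkQuery): string {
  if (!DATE_PATTERN.test(query.date)) {
    throw new Error(`date must be YYYY-MM-DD, got "${query.date}"`);
  }
  // encodeURIComponent keeps model names with spaces or slashes safe in the query string.
  const model = encodeURIComponent(query.model);
  return `/api/v1/benchmarks?model=${model}&date=${query.date}`;
}
```

Because the routes return raw DB rows, a helper like this would only build the request; any reshaping of the response would still happen on the frontend.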
68+
69+
## Code Style & Tooling
70+
71+
- **Linter**: oxlint — `pnpm lint` / `pnpm lint:fix`
72+
- **Formatter**: oxfmt — `pnpm fmt` / `pnpm fmt:fix`
73+
- **Type checking**: `pnpm typecheck` (tsc --noEmit, strict mode)
74+
- **Node**: 24.x
75+
76+
## Environment Variables
77+
78+
See `.env.example`. Key vars: `GITHUB_TOKEN`, `DATABASE_READONLY_URL`, `DATABASE_WRITE_URL` (admin only).
79+
80+
## Testing
81+
82+
See [Testing](./docs/testing.md) for full requirements, quality standards, and pre-commit checklist. Tests are **mandatory** — missing/low-quality tests are 🔴 BLOCKING on PR review.
83+
84+
## Analytics Requirement
85+
86+
All interactive elements should have `track()` from `@/lib/analytics` (autocapture provides baseline coverage).
87+
88+
**Convention**: `[section]_[action]` — e.g., `latency_zoom_reset`, `calculator_bar_selected`, `tab_changed`
89+
90+
**Prefixes**: `latency_`, `interactivity_`, `gpu_timeseries_`, `inference_`, `calculator_`, `evaluation_`, `reliability_`, `tab_`, `selector_`
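
The naming convention above could be checked mechanically. The sketch below is one possible validator, assuming event names are lowercase snake_case after the prefix; `isValidEventName` and `EVENT_PREFIXES` are hypothetical names for this example and are not part of `@/lib/analytics`.

```typescript
// Prefixes copied from the list above.
const EVENT_PREFIXES = [
  "latency_",
  "interactivity_",
  "gpu_timeseries_",
  "inference_",
  "calculator_",
  "evaluation_",
  "reliability_",
  "tab_",
  "selector_",
] as const;

// Hypothetical check: a known prefix followed by a non-empty lowercase snake_case action.
function isValidEventName(name: string): boolean {
  const prefix = EVENT_PREFIXES.find((p) => name.startsWith(p));
  if (prefix === undefined) {
    return false;
  }
  const action = name.slice(prefix.length);
  return /^[a-z0-9]+(_[a-z0-9]+)*$/.test(action);
}
```

A check like this could run in a unit test over the event names used in a section, so a typo in a prefix is caught before it reaches PostHog.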
91+
92+
## Tab Structure
93+
94+
Order: `inference` → `evaluation` → `historical` → `calculator` → `reliability` → `gpu-specs` (defined in `page-content.tsx` `VALID_TABS`). Tab value = URL hash.
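
Since the tab value doubles as the URL hash, resolving a hash to a tab is a small lookup. The following is a minimal sketch, not the code in `page-content.tsx`: the `tabFromHash` helper and the choice of `inference` as the fallback are assumptions for illustration.

```typescript
// Tab values in the order listed above; the real constant lives in page-content.tsx.
const VALID_TABS = [
  "inference",
  "evaluation",
  "historical",
  "calculator",
  "reliability",
  "gpu-specs",
] as const;

type Tab = (typeof VALID_TABS)[number];

// Hypothetical resolver: strips a leading "#" and falls back when the hash is not a known tab.
function tabFromHash(hash: string, fallback: Tab = "inference"): Tab {
  const value = hash.startsWith("#") ? hash.slice(1) : hash;
  const match = VALID_TABS.find((tab) => tab === value);
  return match ?? fallback;
}
```

In a browser, a caller would pass `window.location.hash`. Deriving the `Tab` type from the array keeps the list of values and the type in sync.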
95+
96+
## Common Development Tasks
97+
98+
### Modify chart appearance/behavior
99+
100+
- D3 scatter plot: `src/components/inference/ui/ScatterGraph.tsx`
101+
- D3 GPU graph: `src/components/inference/ui/GPUGraph.tsx`
102+
- Chart layout/errors: `src/components/inference/ui/ChartDisplay.tsx`
103+
- Shared D3 library: `src/lib/d3-chart/` (setup, axes, grid, watermark, layers)
104+
105+
### Change chart filters/state
106+
107+
- State: `src/components/inference/InferenceContext.tsx`
108+
- Controls: `src/components/inference/ui/ChartControls.tsx`
109+
- Filter logic: `src/components/inference/hooks/useChartData.ts`
110+
111+
### Add/modify a metric
112+
113+
1. Register in `src/lib/chart-utils.ts`: `Y_AXIS_METRICS`, `calculateRoofline`, `computeAllRooflines`, `markRooflinePoints`
114+
2. Add TS types: optional field in `InferenceData`, add to `YAxisMetricKey`, add `ChartDefinition` fields
115+
3. Add chart config: `src/components/inference/inference-chart-config.json`
116+
4. Add Y-axis dropdown: `ChartControls.tsx`
117+
5. Add subtitle/disclaimer in `ChartDisplay.tsx` if metric depends on assumed constants
118+
6. Add disagg caveat banner in `ChartDisplay.tsx` for per-GPU or per-MW metrics (animated amber `border-l-2` banner pattern)
119+
7. Expose in UI state: `InferenceContext.tsx`
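
Steps 1 and 2 above can be pictured with a small type sketch. Everything below is hypothetical: the field names (`tokensPerSecond`, `joulesPerToken`), the `MetricDefinition` shape, and the helper functions are placeholders, and the real definitions in `src/lib/chart-utils.ts` may look quite different. The point is only to show why the new field is optional and how a key union ties the registry to the data type.

```typescript
// Hypothetical row type: the newly added metric is optional because older rows lack it.
interface InferenceData {
  concurrency: number;
  tokensPerSecond?: number;
  joulesPerToken?: number; // placeholder for a newly added metric
}

// Adding the new key here makes the compiler require a registry entry below.
type YAxisMetricKey = "tokensPerSecond" | "joulesPerToken";

// Hypothetical registry entry shape.
interface MetricDefinition {
  label: string;
  higherIsBetter: boolean;
}

const Y_AXIS_METRICS: Record<YAxisMetricKey, MetricDefinition> = {
  tokensPerSecond: { label: "Tokens per second", higherIsBetter: true },
  joulesPerToken: { label: "Joules per token", higherIsBetter: false },
};

// Looks up the display label for a metric key.
function metricLabel(key: YAxisMetricKey): string {
  return Y_AXIS_METRICS[key].label;
}

// Collects the values of one metric, skipping rows where it is missing.
function metricValues(rows: InferenceData[], key: YAxisMetricKey): number[] {
  return rows
    .map((row) => row[key])
    .filter((value): value is number => value !== undefined);
}
```

With a `Record` keyed by the union, forgetting to register a new metric becomes a compile error rather than a missing dropdown entry at runtime.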
120+
121+
### Add a new model or GPU
122+
123+
**First ask for the PR / GitHub Actions run URL** — see [Adding Entities](./docs/adding-entities.md) for the full workflow. Never ask other questions before getting the URL.
124+
125+
### Adding a new tab
126+
127+
1. `page-content.tsx`: Add to `VALID_TABS`, add `TabsTrigger` (desktop), `SelectItem` (mobile), `TabsContent`
128+
2. Create a per-section context provider (see `InferenceContext.tsx`, `EvaluationContext.tsx` for patterns)
129+
3. Use `ChartLegend` with `variant="sidebar"`, sorted by `MODEL_ORDER`, default expanded
130+
4. Analytics: all interactive elements use `track()` with `{tabname}_` prefix
131+
132+
## Subsystem Docs
133+
134+
Detailed design rationale (the "why" and "how", not the "what") lives in [docs/](./docs/index.md):
135+
136+
- **[Index](./docs/index.md)** — index of all docs **MUST ALWAYS READ IN CASE OF RELEVANT INFORMATION**
137+
- **[Architecture](./docs/architecture.md)** — Client-first design, hash routing, caching, color system
138+
- **[D3 Charts](./docs/d3-charts.md)** — 4-effect architecture, zoom refs, tooltip lifecycle
139+
- **[Data Pipeline](./docs/data-pipeline.md)** — DB schema reasoning, ETL design, spline interpolation
140+
- **[Pitfalls](./docs/pitfalls.md)** — Token type bugs, schema evolution, stale closures, zoom loss
141+
- **[GPU Specs](./docs/gpu-specs.md)** — Topology invariants, unit conventions, hardware gotchas
142+
- **[TCO Calculator](./docs/tco-calculator.md)** — Interpolation, composite keys, cost matrix
143+
- **[Adding Entities](./docs/adding-entities.md)** — Checklists for adding models, GPUs, precisions, sequences, frameworks
144+
- **[Testing](./docs/testing.md)** — Requirements, quality standards, pre-commit checklist
145+
- **[Data Transforms](./docs/data-transforms.md)** — BenchmarkRow → AggDataEntry → InferenceData pipeline, hardware key construction, derived metrics
146+
- **[State Ownership](./docs/state-ownership.md)** — Context provider state map, availability filtering cascade, comparison dates, URL params
147+
148+
## Claude AI Agents
149+
150+
### `@frontend-claude` (`.github/workflows/claude.yml`)
151+
152+
Triggered by mentioning in issues/comments. Full code implementation + Playwright browser testing. Creates `claude/issue-{N}-*` branches. Must verify charts render real data (no "No data available").
153+
154+
### `@chrome-claude` (`.github/workflows/claude-chrome.yml`)
155+
156+
Same as `@frontend-claude` but uses Chrome DevTools MCP instead of Playwright for browser automation. Preferred when you need deeper debugging (network requests, console messages, JS evaluation).
157+
158+
### `@pr-claude` (`.github/workflows/pr-claude.yml`)
159+
160+
Auto-runs on PR open/sync. Code review only. Flags: bugs, security, breaking changes, missing tests (🔴 BLOCKING), low-quality tests (🔴 BLOCKING). Ignores: style, naming, docs.
161+
162+
### `claude-cron-media` (`.github/workflows/claude-cron-media.yml`)
163+
164+
Weekly cron (Mondays 10 AM UTC). Searches for new media mentions of InferenceMAX/InferenceX and opens PRs to add them to `packages/app/src/components/media/media-data.ts`.

CLAUDE.md

Lines changed: 1 addition & 164 deletions
@@ -1,164 +1 @@
1-
# CLAUDE.md
2-
3-
For detailed subsystem docs, see [docs/index.md](./docs/index.md).
4-
5-
## Project Overview
6-
7-
InferenceX App — Next.js 16 dashboard for ML inference benchmark data. DB-backed with Neon PostgreSQL, React Query for data fetching, D3.js for charts.
8-
9-
- **Framework**: Next.js 16 (App Router, Turbopack)
10-
- **Language**: TypeScript (strict mode)
11-
- **Styling**: Tailwind CSS 4 + shadcn/ui (Radix UI primitives)
12-
- **Charts**: D3.js — shared library at `src/lib/d3-chart/`, scatter/GPU/bar charts
13-
- **Data**: Neon DB → API routes (`/api/v1/*`) → React Query hooks → Context providers
14-
- **Deployment**: Vercel with daily cron-triggered rebuilds
15-
- **Analytics**: PostHog (`posthog-js`) via `@/lib/analytics` — recommended on all interactive elements (autocapture provides baseline coverage)
16-
17-
## Quick Start
18-
19-
```bash
pnpm install      # Install dependencies
pnpm dev          # Dev server with Turbopack (http://localhost:3000)
pnpm build        # Production build
pnpm typecheck    # TypeScript type checking (all packages)
pnpm lint         # Lint with oxlint
pnpm lint:fix     # Auto-fix lint issues
pnpm fmt          # Format check with oxfmt
pnpm fmt:fix      # Auto-fix formatting
pnpm test:unit    # Vitest unit tests
pnpm test:e2e     # Cypress E2E tests
```
31-
32-
## Monorepo Structure
33-
34-
```
packages/
├── app/                  # Next.js frontend (@semianalysisai/inferencex-app)
│   └── src/
│       ├── app/          # Pages, layouts, API routes (/api/v1/*)
│       ├── components/   # Tab sections: inference/, evaluation/, historical-trends/,
│       │                 # throughput-calculator/, reliability/, gpu-specs/, ui/
│       ├── hooks/api/    # React Query hooks (use-benchmarks, use-availability, etc.)
│       └── lib/          # Utilities, constants, d3-chart/, chart-utils, data-mappings
├── constants/            # Shared constants (GPU keys, model mappings)
└── db/                   # DB layer, ETL, migrations, queries, ingest scripts
```
46-
47-
**Path alias**: `@/*` → `packages/app/src/`
48-
49-
## Data Architecture
50-
51-
```
Frontend → React Query hooks (src/hooks/api/) → /api/v1/* routes → Neon DB
```
54-
55-
API routes (`packages/app/src/app/api/v1/`):
56-
57-
- `benchmarks?model=X&date=YYYY-MM-DD` — latest benchmark per (config, concurrency)
58-
- `benchmarks/history?model=X&gpu=Y` — historical benchmark data for trend charts
59-
- `workflow-info?date=YYYY-MM-DD` — runs, changelogs, configs for a date
60-
- `availability` → `Record<model, dates[]>`
61-
- `reliability` — raw `ReliabilityRow[]`
62-
- `evaluations` — raw `EvalRow[]`
63-
- `server-log` — retrieve benchmark runtime logs
64-
- `github-stars` — star count for the repo
65-
- `invalidate` — invalidate API cache (admin)
66-
67-
**API routes return raw DB data** — no presentation logic. Frontend handles all transformations.
68-
69-
## Code Style & Tooling
70-
71-
- **Linter**: oxlint — `pnpm lint` / `pnpm lint:fix`
72-
- **Formatter**: oxfmt — `pnpm fmt` / `pnpm fmt:fix`
73-
- **Type checking**: `pnpm typecheck` (tsc --noEmit, strict mode)
74-
- **Node**: 24.x
75-
76-
## Environment Variables
77-
78-
See `.env.example`. Key vars: `GITHUB_TOKEN`, `DATABASE_READONLY_URL`, `DATABASE_WRITE_URL` (admin only).
79-
80-
## Testing
81-
82-
See [Testing](./docs/testing.md) for full requirements, quality standards, and pre-commit checklist. Tests are **mandatory** — missing/low-quality tests are 🔴 BLOCKING on PR review.
83-
84-
## Analytics Requirement
85-
86-
All interactive elements should have `track()` from `@/lib/analytics` (autocapture provides baseline coverage).
87-
88-
**Convention**: `[section]_[action]` — e.g., `latency_zoom_reset`, `calculator_bar_selected`, `tab_changed`
89-
90-
**Prefixes**: `latency_`, `interactivity_`, `gpu_timeseries_`, `inference_`, `calculator_`, `evaluation_`, `reliability_`, `tab_`, `selector_`
91-
92-
## Tab Structure
93-
94-
Order: `inference` → `evaluation` → `historical` → `calculator` → `reliability` → `gpu-specs` (defined in `page-content.tsx` `VALID_TABS`). Tab value = URL hash.
95-
96-
## Common Development Tasks
97-
98-
### Modify chart appearance/behavior
99-
100-
- D3 scatter plot: `src/components/inference/ui/ScatterGraph.tsx`
101-
- D3 GPU graph: `src/components/inference/ui/GPUGraph.tsx`
102-
- Chart layout/errors: `src/components/inference/ui/ChartDisplay.tsx`
103-
- Shared D3 library: `src/lib/d3-chart/` (setup, axes, grid, watermark, layers)
104-
105-
### Change chart filters/state
106-
107-
- State: `src/components/inference/InferenceContext.tsx`
108-
- Controls: `src/components/inference/ui/ChartControls.tsx`
109-
- Filter logic: `src/components/inference/hooks/useChartData.ts`
110-
111-
### Add/modify a metric
112-
113-
1. Register in `src/lib/chart-utils.ts`: `Y_AXIS_METRICS`, `calculateRoofline`, `computeAllRooflines`, `markRooflinePoints`
114-
2. Add TS types: optional field in `InferenceData`, add to `YAxisMetricKey`, add `ChartDefinition` fields
115-
3. Add chart config: `src/components/inference/inference-chart-config.json`
116-
4. Add Y-axis dropdown: `ChartControls.tsx`
117-
5. Add subtitle/disclaimer in `ChartDisplay.tsx` if metric depends on assumed constants
118-
6. Add disagg caveat banner in `ChartDisplay.tsx` for per-GPU or per-MW metrics (animated amber `border-l-2` banner pattern)
119-
7. Expose in UI state: `InferenceContext.tsx`
120-
121-
### Add a new model or GPU
122-
123-
**First ask for the PR / GitHub Actions run URL** — see [Adding Entities](./docs/adding-entities.md) for the full workflow. Never ask other questions before getting the URL.
124-
125-
### Adding a new tab
126-
127-
1. `page-content.tsx`: Add to `VALID_TABS`, add `TabsTrigger` (desktop), `SelectItem` (mobile), `TabsContent`
128-
2. Create a per-section context provider (see `InferenceContext.tsx`, `EvaluationContext.tsx` for patterns)
129-
3. Use `ChartLegend` with `variant="sidebar"`, sorted by `MODEL_ORDER`, default expanded
130-
4. Analytics: all interactive elements use `track()` with `{tabname}_` prefix
131-
132-
## Subsystem Docs
133-
134-
Detailed design rationale (the "why" and "how", not the "what") lives in [docs/](./docs/index.md):
135-
136-
- **[Index](./docs/index.md)** — index of all docs **MUST ALWAYS READ IN CASE OF RELEVANT INFORMATION**
137-
- **[Architecture](./docs/architecture.md)** — Client-first design, hash routing, caching, color system
138-
- **[D3 Charts](./docs/d3-charts.md)** — 4-effect architecture, zoom refs, tooltip lifecycle
139-
- **[Data Pipeline](./docs/data-pipeline.md)** — DB schema reasoning, ETL design, spline interpolation
140-
- **[Pitfalls](./docs/pitfalls.md)** — Token type bugs, schema evolution, stale closures, zoom loss
141-
- **[GPU Specs](./docs/gpu-specs.md)** — Topology invariants, unit conventions, hardware gotchas
142-
- **[TCO Calculator](./docs/tco-calculator.md)** — Interpolation, composite keys, cost matrix
143-
- **[Adding Entities](./docs/adding-entities.md)** — Checklists for adding models, GPUs, precisions, sequences, frameworks
144-
- **[Testing](./docs/testing.md)** — Requirements, quality standards, pre-commit checklist
145-
- **[Data Transforms](./docs/data-transforms.md)** — BenchmarkRow → AggDataEntry → InferenceData pipeline, hardware key construction, derived metrics
146-
- **[State Ownership](./docs/state-ownership.md)** — Context provider state map, availability filtering cascade, comparison dates, URL params
147-
148-
## Claude AI Agents
149-
150-
### `@frontend-claude` (`.github/workflows/claude.yml`)
151-
152-
Triggered by mentioning in issues/comments. Full code implementation + Playwright browser testing. Creates `claude/issue-{N}-*` branches. Must verify charts render real data (no "No data available").
153-
154-
### `@chrome-claude` (`.github/workflows/claude-chrome.yml`)
155-
156-
Same as `@frontend-claude` but uses Chrome DevTools MCP instead of Playwright for browser automation. Preferred when you need deeper debugging (network requests, console messages, JS evaluation).
157-
158-
### `@pr-claude` (`.github/workflows/pr-claude.yml`)
159-
160-
Auto-runs on PR open/sync. Code review only. Flags: bugs, security, breaking changes, missing tests (🔴 BLOCKING), low-quality tests (🔴 BLOCKING). Ignores: style, naming, docs.
161-
162-
### `claude-cron-media` (`.github/workflows/claude-cron-media.yml`)
163-
164-
Weekly cron (Mondays 10 AM UTC). Searches for new media mentions of InferenceMAX/InferenceX and opens PRs to add them to `packages/app/src/components/media/media-data.ts`.
1+
@AGENTS.md
