| 1 | +# AGENTS.md |
| 2 | + |
| 3 | +For detailed subsystem docs, see [docs/index.md](./docs/index.md). |
| 4 | + |
| 5 | +## Project Overview |
| 6 | + |
| 7 | +InferenceX App — Next.js 16 dashboard for ML inference benchmark data. DB-backed with Neon PostgreSQL, React Query for data fetching, D3.js for charts. |
| 8 | + |
| 9 | +- **Framework**: Next.js 16 (App Router, Turbopack) |
| 10 | +- **Language**: TypeScript (strict mode) |
| 11 | +- **Styling**: Tailwind CSS 4 + shadcn/ui (Radix UI primitives) |
| 12 | +- **Charts**: D3.js — shared library at `src/lib/d3-chart/`, scatter/GPU/bar charts |
| 13 | +- **Data**: Neon DB → API routes (`/api/v1/*`) → React Query hooks → Context providers |
| 14 | +- **Deployment**: Vercel with daily cron-triggered rebuilds |
| 15 | +- **Analytics**: PostHog (`posthog-js`) via `@/lib/analytics` — recommended on all interactive elements (autocapture provides baseline coverage) |
| 16 | + |
| 17 | +## Quick Start |
| 18 | + |
| 19 | +```bash |
| 20 | +pnpm install # Install dependencies |
| 21 | +pnpm dev # Dev server with Turbopack (http://localhost:3000) |
| 22 | +pnpm build # Production build |
| 23 | +pnpm typecheck # TypeScript type checking (all packages) |
| 24 | +pnpm lint # Lint with oxlint |
| 25 | +pnpm lint:fix # Auto-fix lint issues |
| 26 | +pnpm fmt # Format check with oxfmt |
| 27 | +pnpm fmt:fix # Auto-fix formatting |
| 28 | +pnpm test:unit # Vitest unit tests |
| 29 | +pnpm test:e2e # Cypress E2E tests |
| 30 | +``` |
| 31 | + |
| 32 | +## Monorepo Structure |
| 33 | + |
| 34 | +``` |
| 35 | +packages/ |
| 36 | +├── app/ # Next.js frontend (@semianalysisai/inferencex-app) |
| 37 | +│ └── src/ |
| 38 | +│ ├── app/ # Pages, layouts, API routes (/api/v1/*) |
| 39 | +│ ├── components/ # Tab sections: inference/, evaluation/, historical-trends/, |
| 40 | +│ │ # throughput-calculator/, reliability/, gpu-specs/, ui/ |
| 41 | +│ ├── hooks/api/ # React Query hooks (use-benchmarks, use-availability, etc.) |
| 42 | +│ └── lib/ # Utilities, constants, d3-chart/, chart-utils, data-mappings |
| 43 | +├── constants/ # Shared constants (GPU keys, model mappings) |
| 44 | +└── db/ # DB layer, ETL, migrations, queries, ingest scripts |
| 45 | +``` |
| 46 | + |
| 47 | +**Path alias**: `@/*` → `packages/app/src/` |
| 48 | + |
| 49 | +## Data Architecture |
| 50 | + |
| 51 | +``` |
| 52 | +Frontend → React Query hooks (src/hooks/api/) → /api/v1/* routes → Neon DB |
| 53 | +``` |
| 54 | + |
| 55 | +API routes (`packages/app/src/app/api/v1/`): |
| 56 | + |
| 57 | +- `benchmarks?model=X&date=YYYY-MM-DD` — latest benchmark per (config, concurrency) |
| 58 | +- `benchmarks/history?model=X&gpu=Y` — historical benchmark data for trend charts |
| 59 | +- `workflow-info?date=YYYY-MM-DD` — runs, changelogs, configs for a date |
| 60 | +- `availability` — `Record<model, dates[]>` |
| 61 | +- `reliability` — raw `ReliabilityRow[]` |
| 62 | +- `evaluations` — raw `EvalRow[]` |
| 63 | +- `server-log` — retrieve benchmark runtime logs |
| 64 | +- `github-stars` — star count for the repo |
| 65 | +- `invalidate` — invalidate API cache (admin) |
| 66 | + |
| 67 | +**API routes return raw DB data** — no presentation logic. Frontend handles all transformations. |
| 68 | + |
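To make the query shapes above concrete, here is a minimal sketch that builds a `benchmarks` URL and picks the most recent date from an `availability` response. The helper names `buildBenchmarksUrl` and `latestAvailableDate` are hypothetical and are not taken from the codebase. The sketch assumes dates are `YYYY-MM-DD` strings, which sort chronologically as plain strings.

```typescript
// Hypothetical helpers for illustration; the real hooks live in src/hooks/api/.

// Shape described for the `availability` route: model -> list of dates.
type Availability = Record<string, string[]>;

// Build the URL for `benchmarks?model=X&date=YYYY-MM-DD`.
function buildBenchmarksUrl(model: string, date: string): string {
  return `/api/v1/benchmarks?model=${encodeURIComponent(model)}&date=${encodeURIComponent(date)}`;
}

// Return the most recent date for a model, or undefined if none is listed.
function latestAvailableDate(availability: Availability, model: string): string | undefined {
  const dates = availability[model] ?? [];
  // Copy before sorting so the caller's array is not mutated.
  const sorted = [...dates].sort();
  return sorted[sorted.length - 1];
}
```

Both helpers are pure and hold no presentation logic, consistent with the rule that transformations happen on the frontend.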
| 69 | +## Code Style & Tooling |
| 70 | + |
| 71 | +- **Linter**: oxlint — `pnpm lint` / `pnpm lint:fix` |
| 72 | +- **Formatter**: oxfmt — `pnpm fmt` / `pnpm fmt:fix` |
| 73 | +- **Type checking**: `pnpm typecheck` (tsc --noEmit, strict mode) |
| 74 | +- **Node**: 24.x |
| 75 | + |
| 76 | +## Environment Variables |
| 77 | + |
| 78 | +See `.env.example`. Key vars: `GITHUB_TOKEN`, `DATABASE_READONLY_URL`, `DATABASE_WRITE_URL` (admin only). |
| 79 | + |
| 80 | +## Testing |
| 81 | + |
| 82 | +See [Testing](./docs/testing.md) for full requirements, quality standards, and pre-commit checklist. Tests are **mandatory** — missing/low-quality tests are 🔴 BLOCKING on PR review. |
| 83 | + |
| 84 | +## Analytics Requirement |
| 85 | + |
| 86 | +All interactive elements should call `track()` from `@/lib/analytics` (autocapture provides baseline coverage). |
| 87 | + |
| 88 | +**Convention**: `[section]_[action]` — e.g., `latency_zoom_reset`, `calculator_bar_selected`, `tab_changed` |
| 89 | + |
| 90 | +**Prefixes**: `latency_`, `interactivity_`, `gpu_timeseries_`, `inference_`, `calculator_`, `evaluation_`, `reliability_`, `tab_`, `selector_` |
| 91 | + |
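The naming convention can be checked mechanically. The sketch below is one possible validator, not part of `@/lib/analytics`: the name `isValidEventName` is hypothetical, and the lowercase snake_case rule is an assumption inferred from the examples above.

```typescript
// Prefixes copied from the list above.
const EVENT_PREFIXES = [
  "latency_",
  "interactivity_",
  "gpu_timeseries_",
  "inference_",
  "calculator_",
  "evaluation_",
  "reliability_",
  "tab_",
  "selector_",
] as const;

// Hypothetical check: lowercase snake_case, starting with a known prefix
// and followed by at least one action word.
function isValidEventName(name: string): boolean {
  if (!/^[a-z0-9]+(_[a-z0-9]+)+$/.test(name)) return false;
  return EVENT_PREFIXES.some(
    (prefix) => name.startsWith(prefix) && name.length > prefix.length,
  );
}
```

A check like this could run in a unit test over the event names used in a section, so a mistyped prefix is caught before review.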
| 92 | +## Tab Structure |
| 93 | + |
| 94 | +Order: `inference` → `evaluation` → `historical` → `calculator` → `reliability` → `gpu-specs` (defined in `page-content.tsx` `VALID_TABS`). Tab value = URL hash. |
| 95 | + |
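Because the tab value doubles as the URL hash, resolving the active tab amounts to a lookup against the valid list. The sketch below illustrates the idea. `tabFromHash` is a hypothetical name, and falling back to the first tab for an empty or unknown hash is an assumption; the actual behavior is defined in `page-content.tsx`.

```typescript
// Tab values in the order given above; the real list is VALID_TABS in page-content.tsx.
const VALID_TABS = [
  "inference",
  "evaluation",
  "historical",
  "calculator",
  "reliability",
  "gpu-specs",
] as const;

type TabValue = (typeof VALID_TABS)[number];

// Hypothetical resolver: strip the leading "#" and fall back to the first tab
// when the hash is empty or unknown (the fallback is an assumption).
function tabFromHash(hash: string): TabValue {
  const value = hash.startsWith("#") ? hash.slice(1) : hash;
  const match = VALID_TABS.find((tab) => tab === value);
  return match ?? VALID_TABS[0];
}
```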
| 96 | +## Common Development Tasks |
| 97 | + |
| 98 | +### Modify chart appearance/behavior |
| 99 | + |
| 100 | +- D3 scatter plot: `src/components/inference/ui/ScatterGraph.tsx` |
| 101 | +- D3 GPU graph: `src/components/inference/ui/GPUGraph.tsx` |
| 102 | +- Chart layout/errors: `src/components/inference/ui/ChartDisplay.tsx` |
| 103 | +- Shared D3 library: `src/lib/d3-chart/` (setup, axes, grid, watermark, layers) |
| 104 | + |
| 105 | +### Change chart filters/state |
| 106 | + |
| 107 | +- State: `src/components/inference/InferenceContext.tsx` |
| 108 | +- Controls: `src/components/inference/ui/ChartControls.tsx` |
| 109 | +- Filter logic: `src/components/inference/hooks/useChartData.ts` |
| 110 | + |
| 111 | +### Add/modify a metric |
| 112 | + |
| 113 | +1. Register in `src/lib/chart-utils.ts`: `Y_AXIS_METRICS`, `calculateRoofline`, `computeAllRooflines`, `markRooflinePoints` |
| 114 | +2. Add TS types: optional field in `InferenceData`, add to `YAxisMetricKey`, add `ChartDefinition` fields |
| 115 | +3. Add chart config: `src/components/inference/inference-chart-config.json` |
| 116 | +4. Add Y-axis dropdown: `ChartControls.tsx` |
| 117 | +5. Add subtitle/disclaimer in `ChartDisplay.tsx` if metric depends on assumed constants |
| 118 | +6. Add disagg caveat banner in `ChartDisplay.tsx` for per-GPU or per-MW metrics (animated amber `border-l-2` banner pattern) |
| 119 | +7. Expose in UI state: `InferenceContext.tsx` |
| 120 | + |
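Steps 1 and 2 can be pictured with the following sketch. The shapes here are invented for illustration: the real `InferenceData`, `YAxisMetricKey`, and `Y_AXIS_METRICS` definitions may look quite different, and the field names `throughputPerGpu` and `energyPerToken` as well as the helper `pointsWithMetric` are hypothetical.

```typescript
// Illustrative shapes only; the real definitions in the codebase may differ.
interface InferenceData {
  throughputPerGpu: number; // stand-in for an existing metric
  energyPerToken?: number; // hypothetical new metric, optional so older rows still type-check
}

type YAxisMetricKey = "throughputPerGpu" | "energyPerToken";

// Registering the key here forces a label for every metric at compile time.
const Y_AXIS_METRICS: Record<YAxisMetricKey, { label: string }> = {
  throughputPerGpu: { label: "Throughput per GPU" },
  energyPerToken: { label: "Energy per token" },
};

// Keep only points that carry the selected metric, so undefined values are never plotted.
function pointsWithMetric(
  data: InferenceData[],
  key: YAxisMetricKey,
): { y: number; label: string }[] {
  const out: { y: number; label: string }[] = [];
  for (const point of data) {
    const y = point[key];
    if (y !== undefined) out.push({ y, label: Y_AXIS_METRICS[key].label });
  }
  return out;
}
```

Making the new field optional is what lets rows ingested before the metric existed pass through unchanged; the chart code then has to skip them explicitly, as the filter above does.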
| 121 | +### Add a new model or GPU |
| 122 | + |
| 123 | +**First ask for the PR / GitHub Actions run URL** — see [Adding Entities](./docs/adding-entities.md) for the full workflow. Never ask other questions before getting the URL. |
| 124 | + |
| 125 | +### Add a new tab |
| 126 | + |
| 127 | +1. `page-content.tsx`: Add to `VALID_TABS`, add `TabsTrigger` (desktop), `SelectItem` (mobile), `TabsContent` |
| 128 | +2. Create a per-section context provider (see `InferenceContext.tsx`, `EvaluationContext.tsx` for patterns) |
| 129 | +3. Use `ChartLegend` with `variant="sidebar"`, sorted by `MODEL_ORDER`, default expanded |
| 130 | +4. Analytics: all interactive elements use `track()` with `{tabname}_` prefix |
| 131 | + |
| 132 | +## Subsystem Docs |
| 133 | + |
| 134 | +Detailed design rationale (the "why" and "how", not the "what") lives in [docs/](./docs/index.md): |
| 135 | + |
| 136 | +- **[Index](./docs/index.md)** — index of all docs. **ALWAYS READ IT FIRST: it may contain information relevant to your task** |
| 137 | +- **[Architecture](./docs/architecture.md)** — Client-first design, hash routing, caching, color system |
| 138 | +- **[D3 Charts](./docs/d3-charts.md)** — 4-effect architecture, zoom refs, tooltip lifecycle |
| 139 | +- **[Data Pipeline](./docs/data-pipeline.md)** — DB schema reasoning, ETL design, spline interpolation |
| 140 | +- **[Pitfalls](./docs/pitfalls.md)** — Token type bugs, schema evolution, stale closures, zoom loss |
| 141 | +- **[GPU Specs](./docs/gpu-specs.md)** — Topology invariants, unit conventions, hardware gotchas |
| 142 | +- **[TCO Calculator](./docs/tco-calculator.md)** — Interpolation, composite keys, cost matrix |
| 143 | +- **[Adding Entities](./docs/adding-entities.md)** — Checklists for adding models, GPUs, precisions, sequences, frameworks |
| 144 | +- **[Testing](./docs/testing.md)** — Requirements, quality standards, pre-commit checklist |
| 145 | +- **[Data Transforms](./docs/data-transforms.md)** — BenchmarkRow → AggDataEntry → InferenceData pipeline, hardware key construction, derived metrics |
| 146 | +- **[State Ownership](./docs/state-ownership.md)** — Context provider state map, availability filtering cascade, comparison dates, URL params |
| 147 | + |
| 148 | +## Claude AI Agents |
| 149 | + |
| 150 | +### `@frontend-claude` (`.github/workflows/claude.yml`) |
| 151 | + |
| 152 | +Triggered by mentioning `@frontend-claude` in issues/comments. Full code implementation + Playwright browser testing. Creates `claude/issue-{N}-*` branches. Must verify charts render real data (no "No data available"). |
| 153 | + |
| 154 | +### `@chrome-claude` (`.github/workflows/claude-chrome.yml`) |
| 155 | + |
| 156 | +Same as `@frontend-claude` but uses Chrome DevTools MCP instead of Playwright for browser automation. Preferred when you need deeper debugging (network requests, console messages, JS evaluation). |
| 157 | + |
| 158 | +### `@pr-claude` (`.github/workflows/pr-claude.yml`) |
| 159 | + |
| 160 | +Auto-runs on PR open/sync. Code review only. Flags: bugs, security, breaking changes, missing tests (🔴 BLOCKING), low-quality tests (🔴 BLOCKING). Ignores: style, naming, docs. |
| 161 | + |
| 162 | +### `claude-cron-media` (`.github/workflows/claude-cron-media.yml`) |
| 163 | + |
| 164 | +Weekly cron (Mondays 10 AM UTC). Searches for new media mentions of InferenceMAX/InferenceX and opens PRs to add them to `packages/app/src/components/media/media-data.ts`. |