This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
If you notice the user's request is based on a misconception, say so.
Never claim 'all tests pass' when output shows failures.
Keep text between tool calls to <=25 words.
Spawn an adversarial sub-agent (the Red Team / Raze, defined in _bmad/agents/adversarial-reviewer.md) to review non-trivial changes before reporting completion.
This project uses spec-anchored development (BMAD + OpenSpec). Every code change follows:
- Spec First: Update `openspec/capabilities/*/spec.md` with new REQ-* and SCENARIO-*. Create/update story in `epics/stories/`.
- Write Tests: Tests reference REQ-* and SCENARIO-* in comments.
- Implement: Code to satisfy spec requirements.
- Verify: Run unit tests, type checks, builds per commands below.
- E2E Verify (MANDATORY): Run end-to-end tests per `ops/e2e-test-plan.md`. All changes derived from user instruction MUST be verified E2E before reporting done. See E2E Testing below.
- Reconcile Specs: Update Implementation Status in spec.md. Update story status. Update `_bmad/traceability.md` impl status column. If implementation diverged from spec, update spec to match reality with rationale.
- Update Ops: Update `ops/status.md` (what's working/next) and `ops/changelog.md` (what you did).
- Update `_bmad`: Update any part of `_bmad` that is relevant to the changes you made.

Never leave specs and code disagreeing silently.
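The rule that tests reference REQ-* and SCENARIO-* IDs can be spot-checked mechanically. The following is a minimal sketch, not part of the repository: it assumes IDs look like `REQ-` or `SCENARIO-` followed by letters, digits, or hyphens (the exact ID format is not specified here), and the names `extract_ids` and `untested_ids` are hypothetical.

```python
import re

# Assumed ID shape: REQ-AUTH-001, SCENARIO-12, etc. Adjust to the project's real format.
ID_PATTERN = re.compile(r"\b(?:REQ|SCENARIO)-[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*")


def extract_ids(text: str) -> set:
    """Return every REQ-*/SCENARIO-* identifier found in the text."""
    return set(ID_PATTERN.findall(text))


def untested_ids(spec_text: str, test_text: str) -> list:
    """Return spec IDs that no test mentions, sorted for stable output."""
    return sorted(extract_ids(spec_text) - extract_ids(test_text))


if __name__ == "__main__":
    # Illustrative inputs only; real use would read spec.md and the test files.
    spec = "REQ-AUTH-001: users can log in.\nSCENARIO-AUTH-001: valid password."
    tests = "// REQ-AUTH-001\nit('logs in', () => {})"
    print(untested_ids(spec, tests))
```

A non-empty result would suggest a spec ID with no test coverage, which is worth flagging before the Reconcile Specs step.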
If _bmad/architecture.md "Last Reconciled" date is >30 days old, flag to user before starting new capability work.
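The 30-day check can be automated with a few lines. This is a sketch under one loud assumption: that `_bmad/architecture.md` carries a line of the form `Last Reconciled: YYYY-MM-DD`. The actual format is not shown in this file, and the function names are hypothetical.

```python
import re
from datetime import date
from typing import Optional

MAX_AGE_DAYS = 30  # threshold from the rule above

# Assumed line format: "Last Reconciled: YYYY-MM-DD" (punctuation between is tolerated).
RECONCILED_RE = re.compile(r"Last Reconciled\W*(\d{4}-\d{2}-\d{2})")


def reconciliation_age_days(doc_text: str, today: date) -> Optional[int]:
    """Days since the 'Last Reconciled' date, or None if no date is found."""
    match = RECONCILED_RE.search(doc_text)
    if match is None:
        return None
    return (today - date.fromisoformat(match.group(1))).days


def is_stale(doc_text: str, today: date) -> bool:
    """True when the date is missing or more than MAX_AGE_DAYS old."""
    age = reconciliation_age_days(doc_text, today)
    return age is None or age > MAX_AGE_DAYS
```

Treating a missing date as stale is a design choice in this sketch: it errs toward flagging the user rather than silently proceeding.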
Every change derived from user instruction must be verified end-to-end. This means:
- Web applications: Browser automation (Playwright or equivalent) against the deployed system
- Mobile applications: Mobile emulator testing (mobile-mcp or equivalent)
- Backend services / drivers: Integration tests against running system instances with real protocol exchanges
- DNS resolution: When testing against running systems, use proper DNS names when feasible (not just localhost)
E2E tests must exercise the full deployed stack, not just unit tests against mocked dependencies. The test plan lives at ops/e2e-test-plan.md and results are documented at ops/test-results.md.
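One way to enforce the DNS-name guidance is a pre-flight check on the E2E base URL. The sketch below is illustrative only: the helper name `uses_loopback_host` and the example hostnames are hypothetical, and how the base URL is configured in this project is not stated here.

```python
import ipaddress
from urllib.parse import urlparse


def uses_loopback_host(base_url: str) -> bool:
    """True if the URL targets localhost or a loopback IP instead of a DNS name."""
    host = urlparse(base_url).hostname
    if host is None:
        raise ValueError(f"no hostname in URL: {base_url!r}")
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal, so it is treated as a DNS name.
        return False
```

A test runner could warn, rather than fail, when this returns True, since the guidance applies only "when feasible".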
# Install dependencies
npm install
# Build for production
npm run build
# Run development server
npm run dev
# Run unit tests
npx vitest run
# Run E2E tests
npx playwright test
# Lint
npx eslint .
# Type check
npx tsc --noEmit
# Deploy (Docker)
docker-compose up -d

Track execution time and token consumption every turn:
- Turn start: Run `date -u +"%Y-%m-%dT%H:%M:%SZ"` at start of each response
- Turn end: Run `date -u +"%Y-%m-%dT%H:%M:%SZ"` right before responding to user
- Log both in `ops/metrics.md` turn log table
- Subagent metrics: Record tokens and duration from agent result metadata
- On context compaction or session end: Run `python3 scripts/session-metrics.py` to extract authoritative token counts and costs from the session JSONL, then update `ops/metrics.md` Session Summary
Every user instruction must be captured and traceable to outcomes:
- Log user instructions: At the start of each turn, record a 1-line summary of the user's request in the `ops/metrics.md` turn log (Description column)
- Cycle time: The turn log's Start/End columns capture wall-clock time between user input and agent completion; this IS the cycle time. Review it to identify slow turns.
- Instruction → outcome mapping: Each entry in `ops/changelog.md` should be traceable to the user instruction that triggered it. If a change was agent-initiated (refactoring, cleanup), note that explicitly.
- Session handoff: Before session ends, update `ops/status.md` with what's working and what's next. This is the handoff document for the next session, human or AI.
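Cycle time follows directly from the two timestamps logged each turn. The sketch below parses the format produced by the `date -u` command and formats one turn log row. The column order (turn, start, end, seconds, description) and the function names are assumptions, since the actual table layout in `ops/metrics.md` is not shown here.

```python
from datetime import datetime, timezone

TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # same format string as the date -u command


def parse_ts(stamp: str) -> datetime:
    """Parse a UTC timestamp such as 2025-01-01T10:00:00Z."""
    return datetime.strptime(stamp, TS_FORMAT).replace(tzinfo=timezone.utc)


def cycle_seconds(start: str, end: str) -> int:
    """Wall-clock seconds between the turn's Start and End stamps."""
    delta = parse_ts(end) - parse_ts(start)
    if delta.total_seconds() < 0:
        raise ValueError("end precedes start")
    return int(delta.total_seconds())


def turn_row(turn: int, start: str, end: str, description: str) -> str:
    """Format one markdown table row; the column order is an assumption."""
    # Escape pipes and flatten newlines so the description cannot break the table.
    safe = description.replace("|", "\\|").replace("\n", " ")
    return f"| {turn} | {start} | {end} | {cycle_seconds(start, end)} | {safe} |"
```

For example, a turn starting at 10:00:00Z and ending at 10:02:30Z yields a cycle time of 150 seconds.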
This project uses a context-reset architecture with discrete BMAD agent roles. No two roles share a context window.
| BMAD Role | Agent | Context | Config |
|---|---|---|---|
| Analyst (Mary) | Discovery | Fresh per analysis | .harness/prompts/discovery.md |
| PM (Pat) | Planner | Fresh per planning cycle | .harness/prompts/planner.md |
| Architect (Alex) | Architect | Fresh per design decision | .harness/prompts/architect.md |
| UX Designer (Sally) | Design | Fresh per design task | .harness/prompts/design.md |
| Developer (Dana) | Generator | Fresh per story | .harness/prompts/generator.md |
| QA (Quinn) | Evaluator | Fresh per evaluation | .harness/prompts/evaluator.md |
| Red Team (Raze) | Adversarial Reviewer | Fresh per review (Gate 4) | .harness/prompts/adversarial.md |
| Scrum Master (Sam) | Orchestrator | Stateless script | scripts/orchestrate.py |
- Harness config: `.harness/config.yaml` (agent models, tools, budgets, evaluation criteria)
- Handoff artifacts: `.harness/handoffs/` (YAML state passed between agents)
- Sprint contracts: `.harness/contracts/` (measurable criteria binding generator + evaluator)
- Evaluation reports: `.harness/evaluations/` (independent QA verdicts)
See .harness/prompts/*.md for each agent's full role definition and _bmad/workflow.md for the orchestration loop.
- WSL2 (Linux 6.6.87.2-microsoft-standard-WSL2)
- Node.js 20 LTS (via nvm)
- npm package manager
- Docker + docker-compose for deployment
| What | Where |
|---|---|
| BMAD strategic docs | _bmad/ |
| BMAD agent roles | _bmad/agents/ |
| BMAD workflow | _bmad/workflow.md |
| Capability specs | openspec/capabilities/*/spec.md |
| Capability designs | openspec/capabilities/*/design.md |
| OpenSpec agent guide | openspec/AGENTS.md |
| Project conventions | openspec/project.md |
| Change proposals | openspec/change-proposals/ |
| Epics & stories | epics/ |
| Harness config | .harness/config.yaml |
| Agent prompts | .harness/prompts/ |
| Handoff artifacts | .harness/handoffs/ |
| Sprint contracts | .harness/contracts/ |
| Evaluation reports | .harness/evaluations/ |
| Adversarial reviewer role | _bmad/agents/adversarial-reviewer.md |
| Operational status | ops/status.md |
| Server & credentials | ops/server.md |
| Work log | ops/changelog.md |
| Known issues | ops/known-issues.md |
| E2E test plan | ops/e2e-test-plan.md |
| Test results | ops/test-results.md |
| Session metrics | ops/metrics.md |
| Token cost script | scripts/session-metrics.py |
| Orchestration loop | scripts/orchestrate.py |
- Before starting a new capability: First review all documents in `_bmad/` and determine whether the capability is already implemented, whether any change proposals are relevant, and whether it implies an evolution of the architecture. Then read the relevant `openspec/capabilities/*/spec.md` and `design.md`
- Before deploying or debugging server issues: Read `ops/server.md` and `ops/known-issues.md`
- Before architectural decisions or adding new components: Read `_bmad/architecture.md`
- To understand project scope or requirements: Read `_bmad/prd.md`
- To check what's built vs. spec'd: Read `_bmad/traceability.md` (has implementation status per FR)
- Before reporting work as done: Read `ops/e2e-test-plan.md` and execute relevant E2E tests