File: .claude/rules/02-quality-gates.md (1 addition, 0 deletions)
@@ -211,6 +211,7 @@ String containment tests on structured output create false confidence. The test
3. **Existing workflows MUST be confirmed working.** Navigate the dashboard, projects, settings. Verify no regressions: pages load, data displays, navigation works, no new console errors.
4. **New feature/fix MUST be verified on staging.** The specific changes in the PR must work correctly on the live staging environment.
5. **Evidence MUST be reported.** Include screenshots, API responses, or Playwright observations in the PR.
+
6. **For browser-consumed streams (SSE / WebSocket), verification MUST use a real browser, not `curl`.** `curl` can confirm bytes arrive on the wire; only a browser confirms the client actually dispatches them to its handler. See `.claude/rules/13-staging-verification.md` for the full reasoning and the post-mortem that motivated this rule (`docs/notes/2026-04-19-trial-sse-named-events-postmortem.md`).
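The gap between "bytes on the wire" and "handler invoked" can be illustrated with a small model of SSE dispatch. This is a minimal sketch, not the project's code: `formatNamedFrame`, `formatUnnamedFrame`, `parseFrames`, and `deliverToOnMessage` are hypothetical names, and the parser handles only single-line `event:` and `data:` fields. It relies on one documented behavior of the browser `EventSource` API: `onmessage` receives only events whose type is the default `message`, so a frame carrying an `event:` line never reaches it.

```typescript
// Simplified model of how a browser EventSource routes SSE frames.
// Assumption: only `event:` and `data:` fields, no `id:` or `retry:` handling.

interface ParsedFrame {
  eventType: string; // "message" unless the frame carries an `event:` line
  data: string;
}

// Hypothetical pre-fix shape: a named frame.
function formatNamedFrame(payload: { type: string }): string {
  return `event: ${payload.type}\ndata: ${JSON.stringify(payload)}\n\n`;
}

// Hypothetical post-fix shape: an unnamed frame; the payload keeps its own `type`.
function formatUnnamedFrame(payload: { type: string }): string {
  return `data: ${JSON.stringify(payload)}\n\n`;
}

// Splits a stream into frames (separated by a blank line) and reads their fields.
function parseFrames(stream: string): ParsedFrame[] {
  const frames: ParsedFrame[] = [];
  for (const block of stream.split("\n\n")) {
    if (block.trim() === "") continue;
    let eventType = "message";
    const dataLines: string[] = [];
    for (const line of block.split("\n")) {
      if (line.startsWith("event:")) {
        eventType = line.slice(6).replace(/^ /, "");
      } else if (line.startsWith("data:")) {
        dataLines.push(line.slice(5).replace(/^ /, ""));
      }
    }
    frames.push({ eventType, data: dataLines.join("\n") });
  }
  return frames;
}

// Mimics `source.onmessage`: it only ever sees frames of type "message".
function deliverToOnMessage(stream: string): string[] {
  return parseFrames(stream)
    .filter((frame) => frame.eventType === "message")
    .map((frame) => frame.data);
}

const payload = { type: "trial.knowledge" };
// Both frames are valid SSE and both would show up in curl output.
const namedDelivered = deliverToOnMessage(formatNamedFrame(payload));
const unnamedDelivered = deliverToOnMessage(formatUnnamedFrame(payload));
```

In this model `namedDelivered` is empty while `unnamedDelivered` holds the JSON payload. A byte-level check cannot tell the two cases apart; only the dispatch step can.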
File: .claude/rules/10-e2e-verification.md (1 addition, 0 deletions)
@@ -23,6 +23,7 @@ Before marking a feature complete:
- [ ] The test asserts the **final outcome**, not just that intermediate steps succeeded
- [ ] If the test uses mocks at system boundaries, the mock asserts the **exact payload** the real system would receive
- [ ] Any untestable gaps are documented with manual verification steps
+
- [ ] **Port-of-pattern coverage**: when porting a multi-step pattern (VM boot, credential rotation, agent session lifecycle) from an existing consumer to a new one, the new consumer's tests MUST mock each cross-boundary target and assert **every step of the pattern fired** with the correct payload. A test that asserts "step 1 fired" but not "step 3 fired" does not prove the port is complete. See `docs/notes/2026-04-19-trial-orchestrator-agent-boot-postmortem.md` for the class of bug this prevents.
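One way to satisfy the port-of-pattern item is a recording fake that captures every cross-boundary call, compared against the full expected sequence. The sketch below is illustrative only: `NodeClient`, its three methods, `bootAgent`, and `recordBoot` are hypothetical stand-ins for a ported consumer, and the calls are synchronous for brevity where real ones would likely be async.

```typescript
// Hypothetical cross-boundary target; method names are illustrative only.
interface NodeClient {
  createSession(sessionId: string): void;
  mintToken(sessionId: string): string;
  startSession(sessionId: string, token: string): void;
}

interface RecordedCall {
  method: string;
  args: unknown[];
}

// Recording fake: captures every call together with its exact arguments.
function makeRecordingClient(calls: RecordedCall[]): NodeClient {
  return {
    createSession(sessionId) {
      calls.push({ method: "createSession", args: [sessionId] });
    },
    mintToken(sessionId) {
      calls.push({ method: "mintToken", args: [sessionId] });
      return `token-for-${sessionId}`;
    },
    startSession(sessionId, token) {
      calls.push({ method: "startSession", args: [sessionId, token] });
    },
  };
}

// Hypothetical ported consumer under test: a three-step pattern.
function bootAgent(client: NodeClient, sessionId: string): void {
  client.createSession(sessionId);
  const token = client.mintToken(sessionId);
  client.startSession(sessionId, token);
}

// Runs the consumer against the fake and returns the calls as comparable JSON.
function recordBoot(sessionId: string): string {
  const calls: RecordedCall[] = [];
  bootAgent(makeRecordingClient(calls), sessionId);
  return JSON.stringify(calls);
}

// The assertion target lists every step with its payload, not just the first.
const expectedCalls = JSON.stringify([
  { method: "createSession", args: ["s1"] },
  { method: "mintToken", args: ["s1"] },
  { method: "startSession", args: ["s1", "token-for-s1"] },
]);
const actualCalls = recordBoot("s1");
```

Comparing the whole sequence means that dropping the `startSession` call from `bootAgent` fails the test, which a "step 1 fired" assertion alone would miss.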
## Data Flow Tracing (Mandatory for Multi-Component Features)
File: CLAUDE.md (4 additions, 0 deletions)
@@ -221,6 +221,10 @@ Domains chain together: competitive research feeds marketing and business strate
- Cloudflare D1 (credentials table with AES-GCM encrypted tokens) (028-provider-infrastructure)
## Recent Changes
+
- ai-proxy-gateway: AI inference proxy routes LLM requests through Cloudflare AI Gateway — `POST /ai/v1/chat/completions` accepts OpenAI-format requests, transparently routes to Workers AI (@cf/* models) or Anthropic (claude-* models) with format translation (`ai-anthropic-translate.ts`); per-user RPM rate limiting + daily token budget via KV; admin model picker at `/admin/ai-proxy`; AI usage analytics dashboard at `/admin/analytics/ai-usage` aggregates AI Gateway logs by model, day, cost; configurable via AI_PROXY_ENABLED, AI_PROXY_DEFAULT_MODEL, AI_GATEWAY_ID, AI_PROXY_ALLOWED_MODELS, AI_PROXY_RATE_LIMIT_RPM, AI_PROXY_RATE_LIMIT_WINDOW_SECONDS, AI_PROXY_MAX_INPUT_TOKENS_PER_REQUEST, AI_USAGE_PAGE_SIZE, AI_USAGE_MAX_PAGES
+
- trial-agent-boot: TrialOrchestrator `discovery_agent_start` step now runs the full 5-step idempotent VM boot (registers agent session via `createAgentSessionOnNode`, mints MCP token with trialId as synthetic taskId, `startAgentSessionOnNode` with discovery prompt + MCP server URL, drives ACP session `pending → assigned → running`; idempotency flags `mcpToken`, `agentSessionCreatedOnVm`, `agentStartedOnVm`, `acpAssignedOnVm`, `acpRunningOnVm` on DO state let crash/retry resume without double-booking); new `fetchDefaultBranch()` probes GitHub `/repos/:owner/:repo` with AbortController-bounded fetch and threads the real default branch through `projects.defaultBranch` + workspace `git clone --branch` (master-default repos like `octocat/Hello-World` now work); configurable via TRIAL_GITHUB_TIMEOUT_MS (default: 5000); new capability test `apps/api/tests/unit/durable-objects/trial-orchestrator-agent-boot.test.ts` asserts every cross-boundary call fires with correct payload; rule 10 updated with port-of-pattern coverage requirement. See `docs/notes/2026-04-19-trial-orchestrator-agent-boot-postmortem.md`.
+
- trial-sse-events-fix: Fixed "zero trial.* events on staging" — `formatSse()` in `apps/api/src/routes/trial/events.ts` previously emitted named SSE frames (`event: trial.knowledge\ndata: {...}`), but the frontend subscribes via `source.onmessage` which only fires for the default (unnamed) event; frames arrived on the wire (curl saw them) but browser EventSource silently dropped them. Now emits unnamed `data: {JSON}\n\n` frames; the `TrialEvent` payload's own `type` discriminator preserves dispatch info. Also fixed `eventsUrl` in `apps/api/src/routes/trial/create.ts` response shape mismatch (`/api/trial/events?trialId=X` → `/api/trial/:trialId/events`). New capability test `apps/api/tests/workers/trial-event-bus-sse.test.ts` asserts no `event:` line + JSON round-trip across the TrialEventBus DO → SSE endpoint boundary; unit tests updated to assert new unnamed-frame contract and exact `eventsUrl` shape (no substring matches on URL contracts). Rule 13 updated to ban curl-only verification of browser-consumed SSE/WebSocket streams — curl confirms bytes, browsers confirm dispatch. See `docs/notes/2026-04-19-trial-sse-named-events-postmortem.md`.
+
- trial-orchestrator-wire-up: TrialOrchestrator Durable Object + GitHub-API knowledge fast-path — `POST /api/trial/create` now fire-and-forget dispatches two concurrent `c.executionCtx.waitUntil` tasks: (1) `env.TRIAL_ORCHESTRATOR.idFromName(trialId)` DO state machine (alarm-driven, steps: project_creation → node_provisioning → workspace_creation → workspace_ready → agent_session → completed; idempotent `start()`; terminal guard on completed/failed; overall-timeout emits `trial.error`); (2) `emitGithubKnowledgeEvents()` probe hits unauthenticated `/repos/:o/:n`, `/repos/:o/:n/languages`, `/repos/:o/:n/readme` in parallel with AbortController-bounded fetches, emits up to `TRIAL_KNOWLEDGE_MAX_EVENTS` `trial.knowledge` events (description, primary language, stars, topics, license, language breakdown by bytes, README first paragraph), swallows all errors; `apps/api/src/services/trial/bridge.ts` bridges ACP session transitions (`running` → `trial.ready`, `failed` → `trial.error`) and MCP tool calls (`add_knowledge` → `trial.knowledge`, `create_idea` → `trial.idea`) into the SSE stream via `readTrialByProject()` KV lookup (no-op on non-trial projects); new sentinel `TRIAL_ANONYMOUS_INSTALLATION_ID` row in `github_installations` so trial projects satisfy the FK; configurable via TRIAL_ORCHESTRATOR_OVERALL_TIMEOUT_MS (default: 300000), TRIAL_ORCHESTRATOR_STEP_MAX_RETRIES (default: 5), TRIAL_ORCHESTRATOR_RETRY_BASE_DELAY_MS (default: 1000), TRIAL_ORCHESTRATOR_RETRY_MAX_DELAY_MS (default: 60000), TRIAL_ORCHESTRATOR_NODE_READY_TIMEOUT_MS (default: 180000), TRIAL_ORCHESTRATOR_AGENT_READY_TIMEOUT_MS (default: 60000), TRIAL_ORCHESTRATOR_WORKSPACE_READY_TIMEOUT_MS (default: 180000), TRIAL_ORCHESTRATOR_WORKSPACE_READY_POLL_INTERVAL_MS (default: 5000), TRIAL_VM_SIZE (default: DEFAULT_VM_SIZE), TRIAL_VM_LOCATION (default: DEFAULT_VM_LOCATION), TRIAL_KNOWLEDGE_GITHUB_TIMEOUT_MS (default: 5000), TRIAL_KNOWLEDGE_MAX_EVENTS (default: 10)
- project-credential-overrides: Per-project agent credential overrides — `credentials.project_id` column (migration 0042, nullable FK to `projects.id ON DELETE CASCADE`) with two partial unique indexes (`WHERE project_id IS NULL` for user-scoped, `WHERE project_id IS NOT NULL` for project-scoped); `getDecryptedAgentKey(db, userId, agentType, key, projectId?)` resolves project → user → platform in order; workspace runtime callback forwards `workspace.projectId`; `CodexRefreshLock` DO preserves scope on OAuth token rotation; new `/api/projects/:id/credentials` routes (GET/PUT/DELETE) guarded by `requireOwnedProject` (404 on cross-user); `ProjectAgentsSection` on Project Settings combines credential override and model/permission override per agent using `AgentKeyCard` (scope='project') with inheritance hints; cross-user writes rejected at query layer AND ownership check; `autoActivate` only affects project-scoped rows (user-scoped untouched)
- project-knowledge-graph: Per-project knowledge graph for persistent agent memory — `knowledge_entities`, `knowledge_observations`, `knowledge_relations` tables + FTS5 virtual table in ProjectData DO SQLite (migration 016); entity-observation-relation model with confidence scoring and recency weighting; 11 MCP tools (`add_knowledge`, `update_knowledge`, `remove_knowledge`, `get_knowledge`, `search_knowledge`, `get_project_knowledge`, `get_relevant_knowledge`, `relate_knowledge`, `get_related`, `confirm_knowledge`, `flag_contradiction`) in `apps/api/src/routes/mcp/knowledge-tools.ts`; auto-retrieval of relevant knowledge in `get_instructions` MCP tool; REST API at `/api/projects/:projectId/knowledge/*` for UI CRUD; Knowledge Browser page at `/projects/:id/knowledge` with entity list, search, type filters, detail panel; configurable via KNOWLEDGE_AUTO_RETRIEVE_LIMIT (default: 20), KNOWLEDGE_MAX_ENTITIES_PER_PROJECT (default: 500), KNOWLEDGE_MAX_OBSERVATIONS_PER_ENTITY (default: 100), KNOWLEDGE_SEARCH_LIMIT (default: 20), KNOWLEDGE_SEARCH_MAX_LIMIT (default: 100), KNOWLEDGE_LIST_PAGE_SIZE (default: 50), KNOWLEDGE_LIST_MAX_PAGE_SIZE (default: 200), KNOWLEDGE_OBSERVATION_MAX_LENGTH (default: 1000)
- dispatch-task-config-parity: Full task execution config parity for `dispatch_task` MCP tool — extended schema accepts optional `agentProfileId`, `taskMode` (task/conversation), `agentType`, `workspaceProfile` (default/lightweight), `provider` (hetzner/scaleway/gcp), `vmLocation`; config precedence matches normal submit path: explicit field → agent profile → project default → platform default; `resolveAgentProfile()` from `agent-profiles.ts` resolves profiles by ID or name with built-in seeding; profile-derived values (`model`, `permissionMode`, `systemPromptAppend`) passed through to `startTaskRunnerDO()`; `agentProfileHint` and `taskMode` persisted in task INSERT for observability; location validated against resolved provider; `maxTurns`/`timeoutMinutes` excluded — not enforced by runtime (documented in task file)
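The config precedence named in the dispatch-task-config-parity entry (explicit field, then agent profile, then project default, then platform default) can be sketched as a first-defined lookup. This is a minimal sketch under assumptions: `Layer`, `firstDefined`, and `resolveConfig` are hypothetical names, only two of the listed fields are modeled, and the values are placeholders rather than real provider locations or agent types.

```typescript
// Hypothetical config layer; only two of the fields from the entry are modeled.
interface Layer {
  vmLocation?: string;
  agentType?: string;
}

// Returns the first defined value, scanning from highest to lowest precedence.
function firstDefined<T>(...candidates: Array<T | undefined>): T | undefined {
  return candidates.find((value) => value !== undefined);
}

// Precedence: explicit field, agent profile, project default, platform default.
// The platform layer is required, so every field always resolves to a value.
function resolveConfig(
  explicit: Layer,
  profile: Layer,
  projectDefault: Layer,
  platformDefault: Required<Layer>,
): Required<Layer> {
  return {
    vmLocation:
      firstDefined(explicit.vmLocation, profile.vmLocation, projectDefault.vmLocation) ??
      platformDefault.vmLocation,
    agentType:
      firstDefined(explicit.agentType, profile.agentType, projectDefault.agentType) ??
      platformDefault.agentType,
  };
}

// Placeholder values: the explicit layer sets only the location,
// the profile sets both fields, and the project sets nothing.
const resolved = resolveConfig(
  { vmLocation: "loc-explicit" },
  { vmLocation: "loc-profile", agentType: "agent-profile" },
  {},
  { vmLocation: "loc-platform", agentType: "agent-platform" },
);
```

Here `resolved.vmLocation` comes from the explicit layer and `resolved.agentType` falls through to the profile, since the explicit layer leaves it unset. Resolving each field independently is a design choice of this sketch; the source does not say whether the real resolver works per field.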
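The idempotency flags listed in the trial-agent-boot entry suggest a check-then-record shape for each boot step. The sketch below assumes that shape. The flag names come from the entry, while `runBoot`, the effect names, the placeholder token, and the simulated crash are hypothetical, and in-memory state stands in for Durable Object storage.

```typescript
// Flag names are taken from the changelog entry; everything else is hypothetical.
interface BootState {
  mcpToken?: string;
  agentSessionCreatedOnVm?: boolean;
  agentStartedOnVm?: boolean;
  acpAssignedOnVm?: boolean;
  acpRunningOnVm?: boolean;
}

type SideEffect = (name: string) => void;

// Each step checks its flag, performs the side effect, then records the flag.
function runBoot(state: BootState, effect: SideEffect): void {
  if (!state.agentSessionCreatedOnVm) {
    effect("createAgentSession");
    state.agentSessionCreatedOnVm = true;
  }
  if (state.mcpToken === undefined) {
    effect("mintMcpToken");
    state.mcpToken = "token-placeholder";
  }
  if (!state.agentStartedOnVm) {
    effect("startAgentSession");
    state.agentStartedOnVm = true;
  }
  if (!state.acpAssignedOnVm) {
    effect("assignAcpSession");
    state.acpAssignedOnVm = true;
  }
  if (!state.acpRunningOnVm) {
    effect("markAcpRunning");
    state.acpRunningOnVm = true;
  }
}

// Simulated crash: the third step throws on its first invocation only.
const log: string[] = [];
let failOnce = true;
const flakyEffect: SideEffect = (name) => {
  if (name === "startAgentSession" && failOnce) {
    failOnce = false;
    throw new Error("simulated crash");
  }
  log.push(name);
};

const state: BootState = {};
try {
  runBoot(state, flakyEffect); // first attempt stops at step 3
} catch {
  // a retry would re-enter with the same persisted state
}
runBoot(state, flakyEffect); // the retry skips steps 1 and 2
```

After the retry, `log` contains each of the five effect names once, in order. A crash between a side effect and its flag write would still repeat that effect, so in this shape the remote calls themselves would also need to tolerate a repeat.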