docs: add product roadmap for end-to-end vision completion

Test User · Test User · commit a366126e6e40 · 2026-03-26T14:50:00.000-07:00
Captures the remaining gaps between current state (Phase 2.5/3 complete)
and the full Think → Build → Prove → Ship vision. Organized into three
phases: closing the interaction gap (bidirectional chat, gate runs, glitch
capture), completing the SHIP phase (PR status tracking, post-merge loop),
and platform completeness (settings, costs, notifications, stress-test UI,
issue import). Includes explicit out-of-scope section grounded in the
vision doc.
diff --git a/docs/PRODUCT_ROADMAP.md b/docs/PRODUCT_ROADMAP.md
@@ -0,0 +1,219 @@
+# CodeFRAME Product Roadmap
+
+**Updated**: 2026-03-26
+**Vision**: *CodeFRAME is the project delivery system that turns ideas into verified, deployed code — AI agents write the code, CodeFRAME owns everything before and after.*
+
+This document focuses on gaps in the web product that block the end-to-end vision. It is not a comprehensive feature list. Items included here were selected because they are load-bearing for the Think → Build → Prove → Ship pipeline or because their absence creates a significant hole in the user experience. Items that are nice-to-have, purely polish, or already well-served by the CLI are excluded.
+
+The phase numbering continues from the existing V2 Strategic Roadmap (Phases 1–2.5 complete, Phase 3 web UI substantially complete).
+
+---
+
+## Current State
+
+Phase 3 web UI delivered the core screens for all four pipeline stages: PRD editor, task board, execution monitor, blocker resolution, diff reviewer, and PROOF9 requirements table. The golden path works end-to-end in the browser for a single developer on a single project.
+
+What is missing is not *breadth* — it is *depth* in the places the vision depends on most.
+
+---
+
+## Phase 3.5 — Close the Interaction Gap
+
+**The issue**: The web UI is read-heavy. Users watch agents run, view requirements, and inspect diffs, but they cannot steer an agent in real time, run quality gates from the browser, or capture a glitch and watch it become a permanent proof obligation. The pipeline runs, but the human is a spectator, not a participant.
+
+### Milestone A: Bidirectional Agent Chat (already tracked #500–509)
+
+Interactive sessions where the user sends messages to a coding agent and receives streaming token-by-token responses. The full scope, backend and frontend, is documented in GitHub issues #500–509. This is the highest-priority item in this phase.
+
+**Why it matters for the vision**: The BUILD phase owns "blocker escalation when the agent is genuinely stuck." Right now that is a pause-and-ask mechanism. Real-time chat is the upgrade path — the agent asks, the user answers, the agent continues. It also enables the session model that the PROVE and SHIP phases depend on for cost attribution and audit.
+
+---
+
+### Milestone B: Run Quality Gates from the Web UI
+
+**Current state**: The PROOF9 page lists requirements and lets users waive them. It does not let users trigger a gate run.
+
+**What to build**:
+
+- A **[Run Gates]** button on the PROOF9 page (and optionally on the task detail modal) that calls the existing `POST /api/v2/proof/run` endpoint
+- A **gate run progress view** showing each gate as it executes: pending → running → passed / failed
+- Per-gate **evidence display**: show the artifact (test output, coverage report, lighthouse score, etc.) that was produced as evidence
+- A **run history** panel showing the last 5 gate runs with their outcomes
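+
+As a minimal sketch of the progress view's state model, the gate lifecycle above (pending → running → passed / failed) can be written as a small transition table. The names `GateState`, `advance`, and `run_outcome` are hypothetical and are not taken from the CodeFRAME codebase; treating a run as failed as soon as any gate fails is also an assumption.
+
+```python
+from enum import Enum
+
+
+class GateState(Enum):
+    PENDING = "pending"
+    RUNNING = "running"
+    PASSED = "passed"
+    FAILED = "failed"
+
+
+# Allowed transitions: pending -> running -> passed / failed.
+ALLOWED = {
+    GateState.PENDING: {GateState.RUNNING},
+    GateState.RUNNING: {GateState.PASSED, GateState.FAILED},
+    GateState.PASSED: set(),
+    GateState.FAILED: set(),
+}
+
+
+def advance(current: GateState, new: GateState) -> GateState:
+    """Return the new state, or raise if the transition is not allowed."""
+    if new not in ALLOWED[current]:
+        raise ValueError(f"illegal transition: {current.value} -> {new.value}")
+    return new
+
+
+def run_outcome(states: list[GateState]) -> str:
+    """Summarize one gate run for the run history panel (assumed rule)."""
+    if any(s is GateState.FAILED for s in states):
+        return "failed"
+    if states and all(s is GateState.PASSED for s in states):
+        return "passed"
+    return "in progress"
+```
+
+Keeping the transitions in one table lets the UI reject out-of-order events from the event stream rather than rendering an impossible state.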
+
+**Why it matters for the vision**: PROOF9 is described as "nine categories of evidence that code must produce." Without the ability to produce that evidence from the UI, the PROVE phase is inspection-only. Gate runs are the core action of the PROVE phase.
+
+---
+
+### Milestone C: Glitch Capture UI
+
+**Current state**: The CLI has `cf proof capture` for converting a production glitch into a permanent PROOF9 requirement. There is no web equivalent.
+
+**What to build**:
+
+- A **"Capture Glitch"** entry point reachable from the PROOF9 page and the sidebar
+- A structured form collecting:
+  - Description of the failure (free text, supports markdown)
+  - Where it was found (production / QA / dogfooding / monitoring)
+  - Scope selector: which files, routes, or components are affected
+  - Which PROOF9 gates should be required as proof obligations (multi-select)
+  - Severity and optional expiry (for time-bounded obligations)
+- On submit: creates a new REQ in the requirements ledger, associates obligations, and shows the new requirement in the PROOF9 table immediately
+- A **REQ detail view** that shows the glitch description, its obligations, and the evidence history across all gate runs
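+
+One way to picture the form-to-REQ step is the sketch below. It is illustrative only: the field names, the `to_requirement` helper, the default severity, and the rule that at least one gate must be selected are all assumptions, not the actual requirements-ledger schema.
+
+```python
+from dataclasses import dataclass
+from datetime import date
+from typing import Optional
+
+# Sources listed in the form: production / QA / dogfooding / monitoring.
+FOUND_IN = {"production", "qa", "dogfooding", "monitoring"}
+
+
+@dataclass
+class GlitchCapture:
+    description: str                    # free text, markdown allowed
+    found_in: str                       # one of FOUND_IN
+    scope: list[str]                    # affected files, routes, or components
+    gates: list[str]                    # PROOF9 gates required as obligations
+    severity: str = "medium"            # hypothetical default
+    expires_on: Optional[date] = None   # optional expiry for time-bounded obligations
+
+
+def to_requirement(capture: GlitchCapture, req_id: str) -> dict:
+    """Build a new open REQ record from a submitted capture form."""
+    if capture.found_in not in FOUND_IN:
+        raise ValueError(f"unknown source: {capture.found_in}")
+    if not capture.gates:
+        # Assumption: a REQ with no obligations would never be enforced.
+        raise ValueError("at least one gate obligation is required")
+    return {
+        "id": req_id,
+        "status": "open",
+        "description": capture.description,
+        "found_in": capture.found_in,
+        "scope": list(capture.scope),
+        # Each obligation starts with an empty evidence history.
+        "obligations": [{"gate": g, "evidence": []} for g in capture.gates],
+        "severity": capture.severity,
+        "expires_on": capture.expires_on.isoformat() if capture.expires_on else None,
+    }
+```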
+
+**Why it matters for the vision**: The glitch capture closed loop — *Ship → Discover glitch → Capture → Enforce forever → Ship with higher confidence* — is described as "the defining feature of the system." Without a web UI for capture, this loop requires CLI access and will be skipped by most users. This is the most differentiated feature in CodeFRAME and it is currently invisible to web users.
+
+---
+
+## Phase 4 — Complete the SHIP Phase
+
+**The issue**: The Review page creates a PR. After that, the user has no feedback from CodeFRAME. They must go to GitHub to check CI, check reviews, and merge. The SHIP phase currently ends at PR creation.
+
+### Milestone A: PR Status Tracking
+
+After a PR is created from the Review page, show its live status in the web UI.
+
+**What to build**:
+
+- A **PR Status panel** on the Review page (and optionally on the task detail modal) that polls GitHub for:
+  - CI check status (pending / passing / failing), with per-check breakdown
+  - Review status (approved / changes requested / pending)
+  - Merge state (open / merged / closed)
+- Polling interval: 30 seconds when the Review page is active
+- Visual indicators matching the existing state badge patterns
+- A **[Merge]** button that becomes active only when:
+  1. All CI checks pass, and
+  2. PROOF9 has no open (non-waived) requirements for the changed scope
+- If PROOF9 has open requirements: show a gating message listing which requirements are blocking merge and linking to the PROOF9 page
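+
+The two merge conditions above reduce to a single predicate. The sketch below assumes hypothetical data shapes (a mapping of check name to state, and requirement records with `id`, `status`, and `scope` fields); the real CodeFRAME and GitHub payloads may differ.
+
+```python
+def merge_gate(
+    ci_checks: dict[str, str],
+    requirements: list[dict],
+    changed_files: set[str],
+) -> tuple[bool, list[str]]:
+    """Return (can_merge, blocking_requirement_ids).
+
+    ci_checks maps a check name to "pending", "passing", or "failing".
+    A requirement blocks merge when it is open (not waived, not satisfied)
+    and its scope overlaps the files changed by the PR.
+    """
+    # Note: with no checks reported, all() is True; a real implementation
+    # would need to decide whether "no CI" should allow merging.
+    ci_ok = all(state == "passing" for state in ci_checks.values())
+    blocking = [
+        r["id"]
+        for r in requirements
+        if r["status"] == "open" and changed_files & set(r["scope"])
+    ]
+    return (ci_ok and not blocking, blocking)
+```
+
+Returning the blocking IDs alongside the boolean gives the UI what it needs to render the gating message and link each requirement to the PROOF9 page.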
+
+**Why it matters for the vision**: "Merge is gated on PROOF9 pass." That sentence is in the vision doc. Without CI tracking and a merge gate in the UI, this is a CLI-only guarantee. The SHIP phase is only complete when the user can go from "PR opened" to "merged" without leaving CodeFRAME.
+
+---
+
+### Milestone B: Post-Merge Glitch Capture Loop
+
+When a merged PR leads to a production glitch, the system should make it easy to feed that back into PROOF9 as a permanent requirement.
+
+**What to build**:
+
+- A **PR history view** on the Review page (or a dedicated `/shipped` page) listing recently merged PRs with their proof reports at time of merge
+- A **"Report Glitch"** action on each merged PR that pre-populates the Glitch Capture form (Phase 3.5, Milestone C) with the PR's scope (files changed, routes affected)
+- A link from each glitch REQ back to the PR that produced the code it is guarding
+
+**Why it matters for the vision**: "Quality compounding interest. Over time, the system becomes harder to break in the ways you have already been burned." This feedback loop is described as central to the system. Without connecting post-merge glitches back to the PROVE layer, each deployment is a one-shot with no learning.
+
+---
+
+## Phase 5 — Platform Completeness
+
+These items are not part of a specific pipeline stage but are prerequisites for real-world adoption. They are ordered by the degree to which their absence blocks a new user from completing the pipeline.
+
+### 1. Settings Page
+
+**Current state**: API keys, model selection, quality gate thresholds, and agent preferences are configured via environment variables or CLI config. There is no web UI for any of this.
+
+**What to build**:
+
+- **Agent settings**: default model per agent type (Claude, Codex, OpenCode), max turns, max cost per task
+- **API keys**: input and verify `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, GitHub token — stored encrypted, never returned in plaintext
+- **PROOF9 defaults**: which gates are enabled by default for new projects, strictness level (fail on any open REQ vs. warn only)
+- **Workspace configuration**: workspace root path, default branch, auto-detection overrides
+
+Without a settings page, a new user who cannot find the env vars cannot use the product. This is an onboarding blocker.
+
+---
+
+### 2. Cost and Token Analytics
+
+**Current state**: Token usage is recorded in the DB (`token_usage` table) per task. It is not surfaced anywhere in the web UI.
+
+**What to build**:
+
+- A **Costs page** (or section within Settings) showing:
+  - Total spend for the current workspace, last 7 / 30 / 90 days
+  - Cost breakdown by task (top 10 most expensive)
+  - Cost breakdown by agent type (Claude Code vs Codex vs ReAct)
+  - Input vs output token split
+  - Average cost per task
+- Cost column on the task board cards (already supported in the data model, just not displayed)
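+
+The breakdowns listed above are simple aggregations over per-task usage rows. The following sketch assumes hypothetical column names (`task_id`, `agent_type`, `cost_usd`, `input_tokens`, `output_tokens`); the actual `token_usage` schema is not described here and may differ.
+
+```python
+from collections import defaultdict
+
+
+def summarize_costs(rows: list[dict], top_n: int = 10) -> dict:
+    """Aggregate usage rows into the figures shown on a Costs page."""
+    by_task: dict[str, float] = defaultdict(float)
+    by_agent: dict[str, float] = defaultdict(float)
+    input_tokens = 0
+    output_tokens = 0
+    for row in rows:
+        by_task[row["task_id"]] += row["cost_usd"]
+        by_agent[row["agent_type"]] += row["cost_usd"]
+        input_tokens += row["input_tokens"]
+        output_tokens += row["output_tokens"]
+
+    total = sum(by_task.values())
+    # Most expensive tasks first, truncated to the top N.
+    top_tasks = sorted(by_task.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
+    return {
+        "total_usd": total,
+        "top_tasks": top_tasks,
+        "by_agent": dict(by_agent),
+        "input_tokens": input_tokens,
+        "output_tokens": output_tokens,
+        "avg_cost_per_task": total / len(by_task) if by_task else 0.0,
+    }
+```
+
+Filtering `rows` to the last 7, 30, or 90 days before calling the function would produce the per-window totals.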
+
+**Why it matters for the vision**: CodeFRAME runs paid AI APIs. Users need to know what they are spending and which tasks are costing the most. This is also the data that informs prompt template improvements and agent selection decisions.
+
+---
+
+### 3. Async Notifications
+
+**Current state**: Batch executions can run for hours. The user has no notification when a batch completes, a blocker is created, or a gate run fails.
+
+**What to build**:
+
+- **Browser notifications** (Web Notifications API): opt-in, triggered on batch completion, blocker creation, and gate run failure — follow the existing WebSocket event stream for triggers
+- **In-app notification center**: a bell icon in the sidebar with a history of recent notifications, clearable
+- **Optional webhook**: a single URL the user can configure to receive JSON payloads on key events (batch done, blocker created, PR merged) — supports Slack, Discord, or any HTTP endpoint
+
+The webhook is optional and last priority. Browser notifications and the in-app center are sufficient for the core use case.
+
+---
+
+### 4. PRD Stress-Test Web UI
+
+**Current state**: The CLI has `cf prd stress-test` for recursive decomposition — it takes the PRD and surfaces ambiguities the agent cannot resolve without human input. This is described in the vision as a core part of the THINK phase. The web UI has no equivalent; users who work exclusively in the browser never see this step.
+
+**What to build**:
+
+- A **[Stress Test]** button on the PRD page that triggers the stress-test process
+- A **results view** showing the decomposition tree with ambiguities surfaced as questions, styled similarly to the existing Discovery transcript
+- Each ambiguity has an inline answer field — the user's answers are fed back to refine the PRD
+- On completion: the refined PRD is saved and the user can proceed to task generation
+
+**Why it matters for the vision**: "Gaps discovered at planning time, not execution time." The stress-test is the mechanism that makes requirements specific enough for agents to execute correctly. Without it in the web UI, the web-first user skips the most valuable part of the THINK phase.
+
+---
+
+### 5. External Issue Import (GitHub Issues → Tasks)
+
+**Current state**: The THINK phase starts from "I have an idea" (PRD generation). The vision acknowledges that some users start from an existing issue tracker: "If you already have issues in a tracker, CodeFRAME can potentially consume them (future integration)."
+
+**What to build**:
+
+- A **GitHub Issues import** flow on the Tasks page: connect a GitHub repo, browse open issues, select one or more, and import them as CodeFRAME tasks with their title, description, and labels mapped to task fields
+- Imported tasks link back to the original GitHub issue (external ID stored)
+- On task completion, optionally close the corresponding GitHub issue
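+
+A rough sketch of the field mapping follows. The input keys (`number`, `title`, `body`, `labels`, `html_url`, `pull_request`) follow the shape of issue objects returned by the GitHub REST API; the output task fields and both function names are hypothetical.
+
+```python
+def issue_to_task(issue: dict) -> dict:
+    """Map one GitHub issue object to a CodeFRAME-style task dict."""
+    return {
+        "title": issue["title"],
+        # The issue body can be null when the author left it empty.
+        "description": issue.get("body") or "",
+        "labels": [label["name"] for label in issue.get("labels", [])],
+        # Stored so the task can link back to, and later close, the issue.
+        "external_id": issue["number"],
+        "external_url": issue["html_url"],
+    }
+
+
+def import_issues(issues: list[dict]) -> list[dict]:
+    """Convert selected issues, skipping pull requests.
+
+    GitHub's issue listing also returns pull requests, which carry a
+    "pull_request" key; those are not importable as tasks here.
+    """
+    return [issue_to_task(i) for i in issues if "pull_request" not in i]
+```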
+
+Keep scope narrow: GitHub only, import-only (no two-way sync), no Linear or Jira in v1 of this feature.
+
+**Why it matters for the vision**: Many developers already have issue trackers. Requiring them to re-enter every issue into a PRD is a barrier to adoption. Import is the bridge between "I have a backlog" and "I want CodeFRAME to work through it."
+
+---
+
+## What Is Explicitly Out of Scope
+
+These are items that were considered and excluded because they do not serve the core vision at this stage.
+
+**Fleet management / multi-repo coordination**: The vision explicitly says "It is not a fleet manager." CodeFRAME is for a single developer or small team on one project. Scaling to 30 agents across 10 repos is Gastown's domain.
+
+**Multi-user workspaces and team permissions**: The vision says "solo developers and small teams." JWT auth supports multiple users, but role-based workspace access control is not load-bearing until there is evidence of team adoption. Shipping collaboration features before single-user quality is right would be premature.
+
+**Custom quality gate definitions via UI**: PROOF9's nine gates are well-defined, and their configuration is a power-user concern that belongs in a config file, not a UI form. This can be revisited once the gate run UI (Phase 3.5 Milestone B) is validated.
+
+**Deployment automation**: Post-merge deployment hooks are mentioned in the vision as part of SHIP, but CodeFRAME is not a CI/CD system. Deployment is what happens after CodeFRAME's artifacts are consumed by GitHub Actions or another pipeline. Focus on producing the right artifacts (verified PRs with proof reports), not on owning deployment.
+
+**Competitor / agent benchmarking**: Comparing Claude Code vs Codex results for the same task is interesting but not on the critical path for the vision. Instrument cost and quality data first; analysis tooling comes later when there is data to analyze.
+
+---
+
+## Summary
+
+| Phase | Focus | Key Outcome |
+|---|---|---|
+| 3.5A | Bidirectional agent chat (#500–509) | Users can steer agents in real time |
+| 3.5B | Run gates from the web UI | PROVE phase becomes active, not passive |
+| 3.5C | Glitch capture UI | The defining closed loop is accessible to web users |
+| 4A | PR status + PROOF9 merge gate | SHIP phase is complete in the browser |
+| 4B | Post-merge glitch capture loop | System learns from production failures |
+| 5.1 | Settings page | New users can onboard without env vars |
+| 5.2 | Cost analytics | Users understand what they are spending |
+| 5.3 | Async notifications | Batch workflows work without babysitting |
+| 5.4 | PRD stress-test web UI | THINK phase fully accessible to web users |
+| 5.5 | GitHub Issues import | Existing backlogs can enter the pipeline |
+
+The ordering within Phase 5 is by onboarding impact. Settings (5.1) and cost (5.2) block new users earliest. Notifications (5.3), stress-test (5.4), and issue import (5.5) complete the product for users who are already working in it.