Commit c1fb8e1

docs: clarify research and review workflow
Document Research as the benchmark-driven optimization surface rather than the default todo pre-execution research path. Clarify that needs-bot-review is the todo-specific research and planning handoff, and that the default execution flow is Todo -> needs-bot-review when planning or research is needed -> Task for direct execution or Job for staged execution.
1 parent 44511f0 commit c1fb8e1

6 files changed

Lines changed: 41 additions & 31 deletions

README.md

Lines changed: 16 additions & 14 deletions
@@ -30,7 +30,7 @@ It connects three tightly linked layers:
 2. Execution and scheduling through `Tasks` and `Jobs`
 3. Tool and control-plane integration through `Research`, `MCP`, and optional agent surfaces

-The workflow is explicit on purpose. A `Todo` is the planning artifact. A `Task` is one executable unit. A `Job` is an orchestrated or scheduled run built from steps. `Research` is an exploratory context-building artifact when the system still needs evidence before execution.
+The workflow is explicit on purpose. A `Todo` is the planning artifact. `needs-bot-review` is the default planning and research handoff when a todo still needs analysis, outside evidence, or a clearer execution plan. A `Task` is one executable unit. A `Job` is an orchestrated or scheduled run built from steps. `Research` is a separate bounded benchmark-and-optimization surface for repeated measured runs and self-improvement loops.

 That structure keeps the LLM as the native execution chat surface while Copilot Cockpit provides the approval, scheduling, and control layer around it. The goal is not less automation. The goal is accountable automation that can move from intake to execution without losing review, context, or ownership.

@@ -75,10 +75,11 @@ For the step-by-step walkthrough, open [docs/feature-tour.md](https://github.com
 The recommended default path is simple:

 1. Start with a `Todo` in `Todo Cockpit` for intake, planning, and triage.
-2. Use `Research` only when context is missing or the direction still needs evidence.
+2. Use `needs-bot-review` when the todo still needs research, planning, or a better handoff before execution.
 3. Promote approved work into a `Task` for one executable unit or a `Job` for an orchestrated run.
 4. Review the result before granting more autonomy or scheduling the next cycle.
-5. Add `MCP`, repo-local skills, or agent/control-plane features only when the core loop is already working.
+5. Use `Research` when the goal is benchmark-driven iteration or tool and agent improvement over repeated measured runs.
+6. Add `MCP`, repo-local skills, or agent/control-plane features only when the core loop is already working.

 This keeps the relationship collaborative: the workflow starts with planning, earns execution, and only then extends into higher-autonomy integrations.
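
The default path in the numbered list above can be read as a small routing decision. The following is a minimal Python sketch, not Copilot Cockpit's actual implementation: the `Todo` fields, the `Surface` enum, and `next_surface` are hypothetical names introduced only for illustration. `Research` is deliberately left out of the routing, because the updated README treats it as a separate benchmark surface rather than a step on the default path.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Surface(Enum):
    """Hypothetical labels for the surfaces named in the README."""
    NEEDS_BOT_REVIEW = "needs-bot-review"
    TASK = "Task"
    JOB = "Job"


@dataclass
class Todo:
    """Hypothetical stand-in for a Todo Cockpit card (fields are assumptions)."""
    title: str
    needs_planning: bool = False  # still needs analysis, evidence, or a clearer plan
    approved: bool = False        # has been approved and moved into `ready`
    staged: bool = False          # needs ordered stages or pause points


def next_surface(todo: Todo) -> Optional[Surface]:
    """Return the next surface for a todo, or None if it stays in Todo Cockpit."""
    if todo.needs_planning:
        # Planning and research happen first, with the todo context attached.
        return Surface.NEEDS_BOT_REVIEW
    if not todo.approved:
        # Unapproved work remains a planning artifact.
        return None
    # Approved work becomes one executable unit or an orchestrated run.
    return Surface.JOB if todo.staged else Surface.TASK
```

Under these assumptions, a todo that still needs evidence routes to `needs-bot-review` even if it was already approved, and only approved, fully planned work reaches a `Task` or `Job`.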

@@ -135,7 +136,7 @@ These are the default path and the main product surface.

 ### Todo Cockpit

-`Todo Cockpit` is the planning and triage layer. A `Todo` stays a planning artifact: capture work, add comments, apply labels and workflow flags, and decide what should happen next.
+`Todo Cockpit` is the planning and triage layer. A `Todo` stays a planning artifact: capture work, add comments, apply labels and workflow flags, and decide what should happen next. When a todo needs analysis, outside evidence, or a more explicit plan, move it through `needs-bot-review` so the planning prompt carries the todo context forward into the handoff.

 Optional GitHub inbox triage also lives here. The `Settings` tab can save repo-local GitHub repository settings plus a reusable automation prompt, then expose a cached GitHub inbox at the top of the board with `Issues`, `Pull Requests`, and `Security Alerts`. Refresh uses your existing VS Code GitHub sign-in, inbox rows can create a plain Todo or `Create Todo + Review`, and repeat imports reuse the existing GitHub-sourced card instead of creating duplicates. For setup, storage, and current limits, see [docs/github-integration.md](https://github.com/goodguy1963/Copilot-Cockpit/blob/main/docs/github-integration.md).

@@ -153,9 +154,9 @@ Think of `Jobs` as deeper agentic workflows inside VS Code: research, decision s

 ### Research

-`Research` is the exploratory context-building layer. Use it when the system is missing context, needs outside evidence, or should iterate against a benchmark before you decide on execution.
+`Research` is the benchmark-driven iteration layer. Use it when the goal is repeated measured improvement: benchmark a prompt, tool, harness, or agent, extract a score, and iterate within explicit limits.

-Research is especially useful when work should pull in fresher outside knowledge first, through web search, Perplexity, scrapers, or other tooling, and then return that material for user review before implementation begins.
+This is not the default place to research a todo before execution. Todo-specific discovery and planning belong in `Todo Cockpit` and the `needs-bot-review` flow; `Research` is for bounded optimization loops such as the included AutoAgent-style benchmark example.

 ### Experimental and advanced playground capabilities

@@ -173,17 +174,17 @@ That also creates a control layer for cost: GitHub Copilot or OpenRouter can use

 ### How To Use

-`How To Use` is the built-in onboarding tab. Start there if you want the recommended path explained in order: `Todo` first, `Research` when context is missing, `Task` or `Job` for execution, then optional control-plane integration after the core loop is working.
+`How To Use` is the built-in onboarding tab. Start there if you want the recommended path explained in order: `Todo` first, `needs-bot-review` when planning or research is needed, `Task` or `Job` for execution, then `Research` for benchmark-driven iteration and optional control-plane integration after the core loop is working.

 ## Common Workflows

 ### Approval-First Work

 Capture work in `Todo Cockpit`, discuss it, move it into `ready`, and only then prepare the execution unit.

-### Research-First Collaboration
+### Todo Research And Planning

-Use `Research`, web search, or tool-assisted discovery to gather current information first. Review that output with the user, discuss changes, and only then convert the result into scheduled implementation work.
+Use `needs-bot-review` when a todo needs analysis, outside evidence, or planning before execution. That flow already carries the todo context, can mention the configured search and research providers in its guidance, and should end in a simpler downstream `Task` or `Job` handoff.

 ### Scheduled Execution

@@ -213,9 +214,9 @@ Start with one recurring loop that produces useful work instead of toy output.
 - `Delivery Risk and Security Watch (Daily)` looks for shipping, trust, and operational blind spots.
 - `Knowledge and Shipping Packager (Daily)` stages reusable docs, memory candidates, and release material for later curation.
 - `Project Intelligence and Delivery Prep` runs those steps in sequence and stops at a review checkpoint before anything turns into real execution.
-- `Onboarding Example Coverage Research` starts with a Todo Cockpit intake item, uses Research to gather or benchmark onboarding evidence, and then promotes approved follow-up into Tasks or Jobs.
+- `Onboarding Example Coverage Research` starts with a Todo Cockpit intake item, uses `needs-bot-review` for the todo-specific planning handoff, and then uses Research only when onboarding quality should be benchmarked and improved over repeated measured runs before promoting approved follow-up into Tasks or Jobs.

-Use that onboarding example when you want one concrete loop to demonstrate the product: start in Todo Cockpit, gather context with Research, promote approved work into Tasks or Jobs, and stop at a review checkpoint before autonomy expands.
+Use that onboarding example when you want one concrete loop to demonstrate the product: start in Todo Cockpit, plan or research the todo through `needs-bot-review`, promote approved work into Tasks or Jobs, and use Research separately when benchmark-driven optimization is the real goal.

 This is a good fit for a solo product, an internal tool, a small SaaS, or an actively maintained extension like this repo.

@@ -234,9 +235,10 @@ The point is not to overclaim autonomy. The point is to show recurring, inspecta
 1. Open Copilot Cockpit from the activity bar or run `Copilot Cockpit: Create Scheduled Prompt (GUI)` from the command palette. Or use the todo-list icon in the top right.
 2. Start in `How To Use` if you are new to the extension, or click the top-bar `Intro Tutorial` button for the same guided walkthrough.
 3. Capture or refine work in `Todo Cockpit` until the planning artifact is clear.
-4. Use `Research` if the work still needs exploratory context or outside evidence.
+4. Use `needs-bot-review` if the todo still needs research, planning, or outside evidence before execution.
 5. Move approved work into `ready`, then promote it into a `Task` for one executable unit or a `Job` for an orchestrated run.
-6. Open `Settings` to configure repo-local defaults and optional integrations such as the GitHub inbox flow. Add `MCP`, Copilot skills, starter agents, or other control-plane features only when you want those optional extensions.
+6. Use `Research` when you want bounded benchmark-driven iteration, such as improving a prompt, tool, or agent over repeated measured runs.
+7. Open `Settings` to configure repo-local defaults and optional integrations such as the GitHub inbox flow. Add `MCP`, Copilot skills, starter agents, or other control-plane features only when you want those optional extensions.

 In the same `Settings` tab you can also choose the scheduled task execution provider:

@@ -248,7 +250,7 @@ If you select `Codex` or `OpenCode`, install and authenticate those tools separa

 If you want the optional integration layers, the practical order is:

-1. Get the core `Todo` -> `Research` -> `Task` or `Job` loop working first.
+1. Get the core `Todo` -> `needs-bot-review` -> `Task` or `Job` loop working first.
 2. Use `Set Up MCP` to create or repair `.vscode/mcp.json` and activate the repo-local scheduler MCP server for this workspace.
 3. Add any separate third-party MCP servers you want, such as Tavily, Perplexity, or [Prefab by Max Health Inc.](https://github.com/Max-Health-Inc/prefab), to that same workspace MCP config. Those servers are separate from Copilot Cockpit's scheduler server and may need their own API keys or provider-specific setup.
 4. Optionally, but recommended if you want the full repo-local Copilot guidance layer, use `Sync Bundled Skills` to write the bundled Copilot skills into `.github/skills`. If the Prefab by Max Health Inc. MCP server is configured, that bundled path also adds the `prefab-ui` skill so installed users can route Prefab by Max Health Inc. UI and wire-format work through the shipped contract instead of keeping it as a repo-only extra.

docs/feature-tour.md

Lines changed: 6 additions & 4 deletions
@@ -3,7 +3,7 @@
 Copilot Cockpit is easiest to understand as one operating loop:

 1. Plan the work.
-2. Research and refine the direction.
+2. Research or plan the todo in the right surface.
 3. Approve the handoff.
 4. Run the right execution unit.
 5. Review the result before granting more autonomy.
@@ -40,6 +40,7 @@ Caption: Plan and review work before it runs.

 - Use it to capture work before it becomes execution.
 - Keep comments, labels, flags, due dates, and approval state close to the item.
+- Use `needs-bot-review` when the todo needs research, evidence, or a more explicit execution plan before it should become a task or job.
 - Move work into `ready` when it should hand off into execution.
 - When optional GitHub integration is enabled, the top of the board also becomes a cached inbox for `Issues`, `Pull Requests`, and `Security Alerts`, with direct `Create Todo` and `Create Todo + Review` actions.

@@ -107,9 +108,9 @@ Caption: Improve against a benchmark, not by guesswork.
 - Let the system try repeated improvements against a metric.
 - Stop the loop with explicit limits instead of running indefinitely.

-Research can also act as a collaborative discovery phase before implementation: gather web knowledge, review the findings with the user, refine the direction, and only then turn the result into scheduled execution.
+Research is not the default place to investigate a todo before execution. Todo-specific discovery, outside evidence gathering, and planning handoff belong in `Todo Cockpit` through `needs-bot-review`, where the todo context is already attached.

-For onboarding, `Onboarding Example Coverage Research` shows the full loop in one pass: capture the gap in Todo Cockpit, use Research to benchmark the docs, promote approved fixes into Tasks or Jobs, and pause at a review checkpoint before broader autonomy.
+For onboarding, `Onboarding Example Coverage Research` shows the benchmarked version of the loop: capture the gap in Todo Cockpit, use `needs-bot-review` for the todo handoff, use Research to benchmark the docs when measured iteration is warranted, then promote approved fixes into Tasks or Jobs and pause at a review checkpoint before broader autonomy.

 ![Illustrative Research mockup](../images/feature-tour-research.svg)
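
The "repeated improvements against a metric" with "explicit limits" described in this hunk can be sketched as a bounded loop. This is a minimal illustration under stated assumptions, not the Research surface's real logic: `bounded_improvement_loop`, its parameters, and the toy `coverage` and `add_missing` helpers are hypothetical names chosen for the example.

```python
from typing import Callable, Tuple

REQUIRED = ("Todo", "Task", "Job")  # toy benchmark: terms a doc should mention


def coverage(text: str) -> float:
    """Toy metric: fraction of required terms present in the text."""
    return sum(word in text for word in REQUIRED) / len(REQUIRED)


def add_missing(text: str) -> str:
    """Toy improvement step: append the first required term that is missing."""
    for word in REQUIRED:
        if word not in text:
            return text + " " + word
    return text


def bounded_improvement_loop(
    candidate: str,
    score: Callable[[str], float],
    propose: Callable[[str], str],
    max_iterations: int = 5,
    target_score: float = 1.0,
) -> Tuple[str, float, int]:
    """Keep the best-scoring candidate; stop at the target or the run limit."""
    best, best_score = candidate, score(candidate)
    runs = 0
    while runs < max_iterations and best_score < target_score:
        attempt = propose(best)
        attempt_score = score(attempt)
        runs += 1
        if attempt_score > best_score:  # only measured improvements are kept
            best, best_score = attempt, attempt_score
    return best, best_score, runs


best, best_score, runs = bounded_improvement_loop("Todo", coverage, add_missing)
print(best, best_score, runs)  # prints: Todo Task Job 1.0 2
```

The two stop conditions are the point of the sketch: the loop ends when the score reaches the target or when `max_iterations` runs are spent, whichever comes first.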

@@ -190,14 +191,15 @@ Best for: first-time users who want the operating model before the controls.
 ## Choosing The Right Surface

 - Use `Todo Cockpit` when the work still needs planning or approval.
+- Use `needs-bot-review` when that todo needs research, outside evidence, or a better plan before execution.
 - Use `Tasks` when one prompt and one schedule are enough.
 - Use `Jobs` when the work needs ordered stages or pause points.
 - Use `Research` when the goal is measured improvement against a benchmark.

 ## Working Style This Enables

 - Keep the human in the loop while still using AI for the heavy lifting.
-- Let research happen before implementation instead of after mistakes are made.
+- Let todo planning and review happen before implementation, and use benchmark research when optimization needs measurement instead of guesswork.
 - Run non-conflicting work in parallel while keeping risky work sequenced and visible.
 - Archive completed, rejected, or reviewed work so the project gains memory over time.
 - Use specialized agents, prompts, and models as a team of different experts instead of forcing one general agent to do every job.

docs/getting-started.md

Lines changed: 8 additions & 6 deletions
@@ -6,17 +6,18 @@ Copilot Cockpit works best when you treat it as one workflow stack with three la
 2. Execution and scheduling through `Tasks` and `Jobs`
 3. Optional tool/control-plane integration through `Research`, `MCP`, and repo-local agent surfaces

-The recommended path is: start with a `Todo`, use `Research` when context is missing, then promote approved work into a `Task` or `Job`.
+The recommended path is: start with a `Todo`, use `needs-bot-review` when the todo needs research or planning, then promote approved work into a `Task` or `Job`.

 ## Quick Start

 1. Open Copilot Cockpit from the activity bar or with `Copilot Cockpit: Create Scheduled Prompt (GUI)`.
 2. Start in `How To Use` if you are new to the extension, or click the top-bar `Intro Tutorial` button for the same walkthrough.
 3. Capture or refine work in `Todo Cockpit`. A `Todo` is the planning artifact and intake surface.
-4. Use `Research` if the work still needs exploratory context, outside evidence, or benchmarked iteration.
+4. Use `needs-bot-review` if the todo still needs analysis, outside evidence, or a better execution plan.
 5. Move approved work into `ready`, then promote it into a `Task` for one executable unit or a `Job` for an orchestrated or scheduled run.
-6. Open `Settings` to configure repo-local defaults and integrations. This is also where you choose the scheduled task execution provider: `GitHub Copilot Chat`, `OpenAI Codex CLI`, or `OpenCode CLI`.
-7. Use the top-bar `Plan Integration` button only when you want optional control-plane extensions such as MCP, skills, starter agents, or the GitHub inbox flow.
+6. Use `Research` when you want bounded benchmarked iteration, such as improving a prompt, tool, or agent over repeated measured runs.
+7. Open `Settings` to configure repo-local defaults and integrations. This is also where you choose the scheduled task execution provider: `GitHub Copilot Chat`, `OpenAI Codex CLI`, or `OpenCode CLI`.
+8. Use the top-bar `Plan Integration` button only when you want optional control-plane extensions such as MCP, skills, starter agents, or the GitHub inbox flow.

 ## Optional: Enable GitHub Inbox Triage

@@ -35,7 +36,8 @@ The GitHub inbox is repo-local, uses cached manual refreshes, and resolves crede
 - Use `Todo Cockpit` when the work still needs planning, comments, approval, or triage.
 - Use `Tasks` when one prompt and one schedule are enough for one executable unit.
 - Use `Jobs` when the work needs ordered stages, orchestration, or pause checkpoints.
-- Use `Research` when the work needs exploratory context or measured improvement before execution.
+- Use `needs-bot-review` when a todo needs research or planning before it becomes execution.
+- Use `Research` when the work needs measured improvement against a benchmark.

 ## Optional Extensions

@@ -61,7 +63,7 @@ Skip toy prompts. Start with one recurring loop that would still be worth keepin
 - For a company team, use the same pattern for product signals, security and release readiness, support queues, or operations follow-up.
 - If you also want to show the Research surface, add one benchmarked profile that scores onboarding or prompt quality against a simple command before you promote anything into execution.

-`Onboarding Example Coverage Research` is the simplest version of that pattern: log the onboarding gap in Todo Cockpit, use Research to gather examples or benchmark the docs, then turn the approved next step into Tasks for a direct doc pass or Jobs for a staged follow-up. Use it when you want a real onboarding loop that still stops at a review checkpoint before autonomy expands.
+`Onboarding Example Coverage Research` is the simplest version of that pattern: log the onboarding gap in Todo Cockpit, use `needs-bot-review` for the todo-specific planning handoff, and use Research only when you want to benchmark and improve the docs over repeated measured runs. Then turn the approved next step into Tasks for a direct doc pass or Jobs for a staged follow-up. Use it when you want a real onboarding loop that still stops at a review checkpoint before autonomy expands.

 That keeps the demo honest: the proof is useful output plus explicit review, not a claim that the system should run unchecked.
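
Scoring "against a simple command", as the benchmarked profile above puts it, could look roughly like the following. This is a hedged sketch only: the diff does not say how Copilot Cockpit runs a benchmark command or parses its output, so `run_benchmark`, the `score: <number>` output format, and the timeout are all assumptions made for illustration.

```python
import re
import subprocess
import sys
from typing import List, Optional

# Assumed output convention: the command prints a line such as "score: 0.75".
SCORE_PATTERN = re.compile(r"score:\s*([0-9]*\.?[0-9]+)")


def run_benchmark(command: List[str], timeout_seconds: float = 30.0) -> Optional[float]:
    """Run a benchmark command and return the first score found in its stdout."""
    completed = subprocess.run(
        command,
        capture_output=True,
        text=True,
        timeout=timeout_seconds,  # explicit limit so a stuck command cannot hang the loop
        check=False,              # a failing command just yields no score here
    )
    match = SCORE_PATTERN.search(completed.stdout)
    return float(match.group(1)) if match else None


# Stand-in benchmark: a tiny Python one-liner that prints a fixed score.
demo_command = [sys.executable, "-c", "print('score: 0.75')"]
print(run_benchmark(demo_command))  # prints: 0.75
```

Returning `None` when no score is found keeps a broken benchmark visible instead of silently treating it as a zero.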

docs/index.md

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ Use this folder for the detailed reference that used to live in the top-level RE
 - If you want to verify which visuals are live footage, which are illustrative mockups, and where retired media now lives, go to [Media Reference](./media-reference.md).
 - If you want to connect GitHub inbox triage to Todo Cockpit, go to [GitHub Integration](./github-integration.md).
 - If you want to understand the optional starter-agent orchestration layer, go to [Agent Workflow](./agent-workflow.md).
-- If you want to understand Todo Cockpit, Tasks, Jobs, and Research, go to [Workflows](./workflows.md).
+- If you want to understand the default `Todo` -> `needs-bot-review` -> `Task` or `Job` path, plus where benchmark-driven Research fits, go to [Workflows](./workflows.md).
 - If you want MCP, skills, Copilot, OpenRouter, Codex, or Telegram details, go to [Integrations](./integrations.md).
 - If you want persistence and repo-local boundary details, go to [Storage and Boundaries](./storage-and-boundaries.md).
 - If you want the design intent and fork background, go to [Architecture and Principles](./architecture-and-principles.md).
