Document `Research` as the benchmark-driven optimization surface rather than the default pre-execution research path for todos.

Clarify that `needs-bot-review` is the todo-specific research and planning handoff, and that the default execution flow is `Todo` -> `needs-bot-review` (when planning or research is needed) -> `Task` for direct execution or `Job` for staged execution.
`README.md` — 16 additions, 14 deletions
```diff
@@ -30,7 +30,7 @@ It connects three tightly linked layers:
 2. Execution and scheduling through `Tasks` and `Jobs`
 3. Tool and control-plane integration through `Research`, `MCP`, and optional agent surfaces
 
-The workflow is explicit on purpose. A `Todo` is the planning artifact. A `Task` is one executable unit. A `Job` is an orchestrated or scheduled run built from steps. `Research` is an exploratory context-building artifact when the system still needs evidence before execution.
+The workflow is explicit on purpose. A `Todo` is the planning artifact. `needs-bot-review` is the default planning and research handoff when a todo still needs analysis, outside evidence, or a clearer execution plan. A `Task` is one executable unit. A `Job` is an orchestrated or scheduled run built from steps. `Research` is a separate, bounded benchmark-and-optimization surface for repeated measured runs and self-improvement loops.
 
 That structure keeps the LLM as the native execution chat surface while Copilot Cockpit provides the approval, scheduling, and control layer around it. The goal is not less automation. The goal is accountable automation that can move from intake to execution without losing review, context, or ownership.
```
```diff
@@ -75,10 +75,11 @@ For the step-by-step walkthrough, open [docs/feature-tour.md](https://github.com
 The recommended default path is simple:
 
 1. Start with a `Todo` in `Todo Cockpit` for intake, planning, and triage.
-2. Use `Research` only when context is missing or the direction still needs evidence.
+2. Use `needs-bot-review` when the todo still needs research, planning, or a better handoff before execution.
 3. Promote approved work into a `Task` for one executable unit or a `Job` for an orchestrated run.
 4. Review the result before granting more autonomy or scheduling the next cycle.
-5. Add `MCP`, repo-local skills, or agent/control-plane features only when the core loop is already working.
+5. Use `Research` when the goal is benchmark-driven iteration or tool and agent improvement over repeated measured runs.
+6. Add `MCP`, repo-local skills, or agent/control-plane features only when the core loop is already working.
 
 This keeps the relationship collaborative: the workflow starts with planning, earns execution, and only then extends into higher-autonomy integrations.
```
```diff
@@ -135,7 +136,7 @@ These are the default path and the main product surface.
 
 ### Todo Cockpit
 
-`Todo Cockpit` is the planning and triage layer. A `Todo` stays a planning artifact: capture work, add comments, apply labels and workflow flags, and decide what should happen next.
+`Todo Cockpit` is the planning and triage layer. A `Todo` stays a planning artifact: capture work, add comments, apply labels and workflow flags, and decide what should happen next. When a todo needs analysis, outside evidence, or a more explicit plan, move it through `needs-bot-review` so the planning prompt carries the todo context forward into the handoff.
 
 Optional GitHub inbox triage also lives here. The `Settings` tab can save repo-local GitHub repository settings plus a reusable automation prompt, then expose a cached GitHub inbox at the top of the board with `Issues`, `Pull Requests`, and `Security Alerts`. Refresh uses your existing VS Code GitHub sign-in, inbox rows can create a plain Todo or `Create Todo + Review`, and repeat imports reuse the existing GitHub-sourced card instead of creating duplicates. For setup, storage, and current limits, see [docs/github-integration.md](https://github.com/goodguy1963/Copilot-Cockpit/blob/main/docs/github-integration.md).
```
```diff
@@ -153,9 +154,9 @@ Think of `Jobs` as deeper agentic workflows inside VS Code: research, decision s
 
 ### Research
 
-`Research` is the exploratory context-building layer. Use it when the system is missing context, needs outside evidence, or should iterate against a benchmark before you decide on execution.
+`Research` is the benchmark-driven iteration layer. Use it when the goal is repeated measured improvement: benchmark a prompt, tool, harness, or agent, extract a score, and iterate within explicit limits.
 
-Research is especially useful when work should pull in fresher outside knowledge first, through web search, Perplexity, scrapers, or other tooling, and then return that material for user review before implementation begins.
+This is not the default place to research a todo before execution. Todo-specific discovery and planning belong in `Todo Cockpit` and the `needs-bot-review` flow; `Research` is for bounded optimization loops such as the included AutoAgent-style benchmark example.
 
 ### Experimental and advanced playground capabilities
```
```diff
@@ -173,17 +174,17 @@ That also creates a control layer for cost: GitHub Copilot or OpenRouter can use
 
 ### How To Use
 
-`How To Use` is the built-in onboarding tab. Start there if you want the recommended path explained in order: `Todo` first, `Research` when context is missing, `Task` or `Job` for execution, then optional control-plane integration after the core loop is working.
+`How To Use` is the built-in onboarding tab. Start there if you want the recommended path explained in order: `Todo` first, `needs-bot-review` when planning or research is needed, `Task` or `Job` for execution, then `Research` for benchmark-driven iteration and optional control-plane integration after the core loop is working.
 
 ## Common Workflows
 
 ### Approval-First Work
 
 Capture work in `Todo Cockpit`, discuss it, move it into `ready`, and only then prepare the execution unit.
 
-### Research-First Collaboration
+### Todo Research And Planning
 
-Use `Research`, web search, or tool-assisted discovery to gather current information first. Review that output with the user, discuss changes, and only then convert the result into scheduled implementation work.
+Use `needs-bot-review` when a todo needs analysis, outside evidence, or planning before execution. That flow already carries the todo context, can mention the configured search and research providers in its guidance, and should end in a simpler downstream `Task` or `Job` handoff.
 
 ### Scheduled Execution
```
```diff
@@ -213,9 +214,9 @@ Start with one recurring loop that produces useful work instead of toy output.
 - `Delivery Risk and Security Watch (Daily)` looks for shipping, trust, and operational blind spots.
 - `Knowledge and Shipping Packager (Daily)` stages reusable docs, memory candidates, and release material for later curation.
 - `Project Intelligence and Delivery Prep` runs those steps in sequence and stops at a review checkpoint before anything turns into real execution.
-- `Onboarding Example Coverage Research` starts with a Todo Cockpit intake item, uses Research to gather or benchmark onboarding evidence, and then promotes approved follow-up into Tasks or Jobs.
+- `Onboarding Example Coverage Research` starts with a Todo Cockpit intake item, uses `needs-bot-review` for the todo-specific planning handoff, and then uses Research only when onboarding quality should be benchmarked and improved over repeated measured runs before promoting approved follow-up into Tasks or Jobs.
 
-Use that onboarding example when you want one concrete loop to demonstrate the product: start in Todo Cockpit, gather context with Research, promote approved work into Tasks or Jobs, and stop at a review checkpoint before autonomy expands.
+Use that onboarding example when you want one concrete loop to demonstrate the product: start in Todo Cockpit, plan or research the todo through `needs-bot-review`, promote approved work into Tasks or Jobs, and use Research separately when benchmark-driven optimization is the real goal.
 
 This is a good fit for a solo product, an internal tool, a small SaaS, or an actively maintained extension like this repo.
```
```diff
@@ -234,9 +235,10 @@ The point is not to overclaim autonomy. The point is to show recurring, inspecta
 1. Open Copilot Cockpit from the activity bar or run `Copilot Cockpit: Create Scheduled Prompt (GUI)` from the command palette. Or use the todo-list icon in the top right.
 2. Start in `How To Use` if you are new to the extension, or click the top-bar `Intro Tutorial` button for the same guided walkthrough.
 3. Capture or refine work in `Todo Cockpit` until the planning artifact is clear.
-4. Use `Research` if the work still needs exploratory context or outside evidence.
+4. Use `needs-bot-review` if the todo still needs research, planning, or outside evidence before execution.
 5. Move approved work into `ready`, then promote it into a `Task` for one executable unit or a `Job` for an orchestrated run.
-6. Open `Settings` to configure repo-local defaults and optional integrations such as the GitHub inbox flow. Add `MCP`, Copilot skills, starter agents, or other control-plane features only when you want those optional extensions.
+6. Use `Research` when you want bounded benchmark-driven iteration, such as improving a prompt, tool, or agent over repeated measured runs.
+7. Open `Settings` to configure repo-local defaults and optional integrations such as the GitHub inbox flow. Add `MCP`, Copilot skills, starter agents, or other control-plane features only when you want those optional extensions.
 
 In the same `Settings` tab you can also choose the scheduled task execution provider:
```
```diff
@@ -248,7 +250,7 @@ If you select `Codex` or `OpenCode`, install and authenticate those tools separa
 
 If you want the optional integration layers, the practical order is:
 
-1. Get the core `Todo` -> `Research` -> `Task` or `Job` loop working first.
+1. Get the core `Todo` -> `needs-bot-review` -> `Task` or `Job` loop working first.
 2. Use `Set Up MCP` to create or repair `.vscode/mcp.json` and activate the repo-local scheduler MCP server for this workspace.
 3. Add any separate third-party MCP servers you want, such as Tavily, Perplexity, or [Prefab by Max Health Inc.](https://github.com/Max-Health-Inc/prefab), to that same workspace MCP config. Those servers are separate from Copilot Cockpit's scheduler server and may need their own API keys or provider-specific setup.
 4. Optionally, but recommended if you want the full repo-local Copilot guidance layer, use `Sync Bundled Skills` to write the bundled Copilot skills into `.github/skills`. If the Prefab by Max Health Inc. MCP server is configured, that bundled path also adds the `prefab-ui` skill so installed users can route Prefab by Max Health Inc. UI and wire-format work through the shipped contract instead of keeping it as a repo-only extra.
```
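The integration steps above both touch the workspace MCP config, but this changeset never shows what `.vscode/mcp.json` contains. As a rough orientation only, the sketch below is an illustrative VS Code workspace MCP file with a hypothetical entry for the Cockpit scheduler server plus a Tavily server of the kind the third-party step describes; the server names, commands, and key handling here are assumptions, not the extension's actual generated output:

```jsonc
// .vscode/mcp.json — illustrative sketch only; the real scheduler entry
// is created or repaired by the `Set Up MCP` command.
{
  "inputs": [
    {
      // Prompt once for the third-party API key instead of committing it.
      "id": "tavily-key",
      "type": "promptString",
      "description": "Tavily API key",
      "password": true
    }
  ],
  "servers": {
    // Hypothetical name for the repo-local Copilot Cockpit scheduler server.
    "copilot-cockpit-scheduler": {
      "command": "node",
      "args": ["./.vscode/cockpit-scheduler-mcp.js"]
    },
    // Hypothetical third-party search server added alongside it.
    "tavily": {
      "command": "npx",
      "args": ["-y", "tavily-mcp"],
      "env": { "TAVILY_API_KEY": "${input:tavily-key}" }
    }
  }
}
```

If this matches the real layout, hand-editing should be limited to the separate third-party entries, since `Set Up MCP` owns the scheduler block.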
`docs/feature-tour.md` — 6 additions, 4 deletions
```diff
@@ -3,7 +3,7 @@
 Copilot Cockpit is easiest to understand as one operating loop:
 
 1. Plan the work.
-2. Research and refine the direction.
+2. Research or plan the todo in the right surface.
 3. Approve the handoff.
 4. Run the right execution unit.
 5. Review the result before granting more autonomy.
```
```diff
@@ -40,6 +40,7 @@ Caption: Plan and review work before it runs.
 
 - Use it to capture work before it becomes execution.
 - Keep comments, labels, flags, due dates, and approval state close to the item.
+- Use `needs-bot-review` when the todo needs research, evidence, or a more explicit execution plan before it should become a task or job.
 - Move work into `ready` when it should hand off into execution.
 - When optional GitHub integration is enabled, the top of the board also becomes a cached inbox for `Issues`, `Pull Requests`, and `Security Alerts`, with direct `Create Todo` and `Create Todo + Review` actions.
```
```diff
@@ -107,9 +108,9 @@ Caption: Improve against a benchmark, not by guesswork.
 - Let the system try repeated improvements against a metric.
 - Stop the loop with explicit limits instead of running indefinitely.
 
-Research can also act as a collaborative discovery phase before implementation: gather web knowledge, review the findings with the user, refine the direction, and only then turn the result into scheduled execution.
+Research is not the default place to investigate a todo before execution. Todo-specific discovery, outside evidence gathering, and planning handoff belong in `Todo Cockpit` through `needs-bot-review`, where the todo context is already attached.
 
-For onboarding, `Onboarding Example Coverage Research` shows the full loop in one pass: capture the gap in Todo Cockpit, use Research to benchmark the docs, promote approved fixes into Tasks or Jobs, and pause at a review checkpoint before broader autonomy.
+For onboarding, `Onboarding Example Coverage Research` shows the benchmarked version of the loop: capture the gap in Todo Cockpit, use `needs-bot-review` for the todo handoff, use Research to benchmark the docs when measured iteration is warranted, then promote approved fixes into Tasks or Jobs and pause at a review checkpoint before broader autonomy.
 
 
```
```diff
@@ -190,14 +191,15 @@ Best for: first-time users who want the operating model before the controls.
 ## Choosing The Right Surface
 
 - Use `Todo Cockpit` when the work still needs planning or approval.
+- Use `needs-bot-review` when that todo needs research, outside evidence, or a better plan before execution.
 - Use `Tasks` when one prompt and one schedule are enough.
 - Use `Jobs` when the work needs ordered stages or pause points.
 - Use `Research` when the goal is measured improvement against a benchmark.
 
 ## Working Style This Enables
 
 - Keep the human in the loop while still using AI for the heavy lifting.
-- Let research happen before implementation instead of after mistakes are made.
+- Let todo planning and review happen before implementation, and use benchmark research when optimization needs measurement instead of guesswork.
 - Run non-conflicting work in parallel while keeping risky work sequenced and visible.
 - Archive completed, rejected, or reviewed work so the project gains memory over time.
 - Use specialized agents, prompts, and models as a team of different experts instead of forcing one general agent to do every job.
```
`docs/getting-started.md` — 8 additions, 6 deletions
```diff
@@ -6,17 +6,18 @@ Copilot Cockpit works best when you treat it as one workflow stack with three la
 2. Execution and scheduling through `Tasks` and `Jobs`
 3. Optional tool/control-plane integration through `Research`, `MCP`, and repo-local agent surfaces
 
-The recommended path is: start with a `Todo`, use `Research` when context is missing, then promote approved work into a `Task` or `Job`.
+The recommended path is: start with a `Todo`, use `needs-bot-review` when the todo needs research or planning, then promote approved work into a `Task` or `Job`.
 
 ## Quick Start
 
 1. Open Copilot Cockpit from the activity bar or with `Copilot Cockpit: Create Scheduled Prompt (GUI)`.
 2. Start in `How To Use` if you are new to the extension, or click the top-bar `Intro Tutorial` button for the same walkthrough.
 3. Capture or refine work in `Todo Cockpit`. A `Todo` is the planning artifact and intake surface.
-4. Use `Research` if the work still needs exploratory context, outside evidence, or benchmarked iteration.
+4. Use `needs-bot-review` if the todo still needs analysis, outside evidence, or a better execution plan.
 5. Move approved work into `ready`, then promote it into a `Task` for one executable unit or a `Job` for an orchestrated or scheduled run.
-6. Open `Settings` to configure repo-local defaults and integrations. This is also where you choose the scheduled task execution provider: `GitHub Copilot Chat`, `OpenAI Codex CLI`, or `OpenCode CLI`.
-7. Use the top-bar `Plan Integration` button only when you want optional control-plane extensions such as MCP, skills, starter agents, or the GitHub inbox flow.
+6. Use `Research` when you want bounded benchmarked iteration, such as improving a prompt, tool, or agent over repeated measured runs.
+7. Open `Settings` to configure repo-local defaults and integrations. This is also where you choose the scheduled task execution provider: `GitHub Copilot Chat`, `OpenAI Codex CLI`, or `OpenCode CLI`.
+8. Use the top-bar `Plan Integration` button only when you want optional control-plane extensions such as MCP, skills, starter agents, or the GitHub inbox flow.
 
 ## Optional: Enable GitHub Inbox Triage
```
```diff
@@ -35,7 +36,8 @@ The GitHub inbox is repo-local, uses cached manual refreshes, and resolves crede
 - Use `Todo Cockpit` when the work still needs planning, comments, approval, or triage.
 - Use `Tasks` when one prompt and one schedule are enough for one executable unit.
 - Use `Jobs` when the work needs ordered stages, orchestration, or pause checkpoints.
-- Use `Research` when the work needs exploratory context or measured improvement before execution.
+- Use `needs-bot-review` when a todo needs research or planning before it becomes execution.
+- Use `Research` when the work needs measured improvement against a benchmark.
 
 ## Optional Extensions
```
```diff
@@ -61,7 +63,7 @@ Skip toy prompts. Start with one recurring loop that would still be worth keepin
 - For a company team, use the same pattern for product signals, security and release readiness, support queues, or operations follow-up.
 - If you also want to show the Research surface, add one benchmarked profile that scores onboarding or prompt quality against a simple command before you promote anything into execution.
 
-`Onboarding Example Coverage Research` is the simplest version of that pattern: log the onboarding gap in Todo Cockpit, use Research to gather examples or benchmark the docs, then turn the approved next step into Tasks for a direct doc pass or Jobs for a staged follow-up. Use it when you want a real onboarding loop that still stops at a review checkpoint before autonomy expands.
+`Onboarding Example Coverage Research` is the simplest version of that pattern: log the onboarding gap in Todo Cockpit, use `needs-bot-review` for the todo-specific planning handoff, and use Research only when you want to benchmark and improve the docs over repeated measured runs. Then turn the approved next step into Tasks for a direct doc pass or Jobs for a staged follow-up. Use it when you want a real onboarding loop that still stops at a review checkpoint before autonomy expands.
 
 That keeps the demo honest: the proof is useful output plus explicit review, not a claim that the system should run unchecked.
```
`docs/index.md` — 1 addition, 1 deletion
```diff
@@ -21,7 +21,7 @@ Use this folder for the detailed reference that used to live in the top-level RE
 - If you want to verify which visuals are live footage, which are illustrative mockups, and where retired media now lives, go to [Media Reference](./media-reference.md).
 - If you want to connect GitHub inbox triage to Todo Cockpit, go to [GitHub Integration](./github-integration.md).
 - If you want to understand the optional starter-agent orchestration layer, go to [Agent Workflow](./agent-workflow.md).
-- If you want to understand Todo Cockpit, Tasks, Jobs, and Research, go to [Workflows](./workflows.md).
+- If you want to understand the default `Todo` -> `needs-bot-review` -> `Task` or `Job` path, plus where benchmark-driven Research fits, go to [Workflows](./workflows.md).
 - If you want MCP, skills, Copilot, OpenRouter, Codex, or Telegram details, go to [Integrations](./integrations.md).
 - If you want persistence and repo-local boundary details, go to [Storage and Boundaries](./storage-and-boundaries.md).
 - If you want the design intent and fork background, go to [Architecture and Principles](./architecture-and-principles.md).
```