Commit 8c3178f

update(area-basic): refine spec optimization and update phases
- Renamed Phase 2 to "Spec Optimization" and added detailed steps for spec analysis.
- Updated Phase 3 to Phase 4 and adjusted task descriptions for clarity.
- Revised Phase 5 to Phase 6 and clarified the shipping process.
- Introduced Phase 7 for monitoring PRs and handling rejections.
- Enhanced documentation for library agent prompts and context handling.
1 parent d6fcaff commit 8c3178f

3 files changed: +204 -30 lines changed

agentic/commands/update.md

Lines changed: 202 additions & 23 deletions
@@ -53,7 +53,37 @@ Parse `$ARGUMENTS` using this format:
5353

5454
---
5555

56-
### Phase 2: Create Team & Spawn Agents
56+
### Phase 2: Spec Optimization
57+
58+
The lead performs this directly (no extra agent). This ensures agents work against a high-quality spec.
59+
60+
1. **Read references**: Read `plots/{spec_id}/specification.md`, `plots/{spec_id}/specification.yaml`,
61+
`prompts/templates/specification.md`, `prompts/templates/specification.yaml`, and `prompts/spec-tags-generator.md`.
62+
63+
2. **Analyse the spec** against these dimensions:
64+
65+
| Dimension | What to check |
66+
|-----------|---------------|
67+
| **Wording** | Description clear and concise? Applications realistic? Data fields include types and sizes? Notes actionable? |
68+
| **Missing sections** | All sections from `prompts/templates/specification.md` present? |
69+
| **Tag completeness** | All 4 tag dimensions (`plot_type`, `data_type`, `domain`, `features`) have at least 1 value? |
70+
| **Tag quality** | Naming conventions (lowercase, hyphens)? Values from recommended vocabulary in `prompts/spec-tags-generator.md`? Missing obvious tags? |
71+
| **Tag accuracy** | Do existing tags actually match the spec content? |
72+
73+
3. **Present numbered suggestions** to the user (e.g., "1. Add `time-series` to data_type tags", "2. Clarify data size in Data section").
74+
If the spec looks good, say so and move on.
75+
76+
4. **User responds** with one of:
77+
- `all` — apply all suggestions
78+
- `1,3` — apply only listed suggestions
79+
- `none` or `skip` — skip spec optimization, proceed as-is
80+
- Custom feedback — apply the user's specific instructions
81+
82+
5. **Apply accepted changes** to `specification.md` and/or `specification.yaml`, then proceed to Phase 3.
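
The reply handling in step 4 can be sketched as a small parser. This is a minimal illustration, not part of the command itself: the names `Selection` and `parse_reply` are hypothetical, and treating out-of-range numbers as custom feedback is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Selection:
    """Outcome of the user's reply to the numbered suggestions (hypothetical type)."""

    accepted: list[int]        # 1-based suggestion numbers to apply
    custom_feedback: str = ""  # free-form instructions, if the reply was not a keyword or list


def parse_reply(reply: str, suggestion_count: int) -> Selection:
    """Map a reply such as 'all', '1,3', 'none' or 'skip' onto a Selection."""
    text = reply.strip()
    lowered = text.lower()
    if lowered == "all":
        return Selection(list(range(1, suggestion_count + 1)))
    if lowered in {"none", "skip"}:
        return Selection([])
    parts = [part.strip() for part in text.split(",")]
    if all(part.isdigit() for part in parts):
        numbers = sorted({int(part) for part in parts})
        if all(1 <= n <= suggestion_count for n in numbers):
            return Selection(numbers)
    # Assumption: anything else (including out-of-range numbers) is custom feedback.
    return Selection([], custom_feedback=text)
```

With three suggestions on the table, `parse_reply("1,3", 3)` selects suggestions 1 and 3, while a sentence of free text ends up in `custom_feedback` for the lead to act on.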
83+
84+
---
85+
86+
### Phase 3: Create Team & Spawn Agents
5787

5888
1. **Create team**: `TeamCreate` with name `update-{spec_id}`
5989

@@ -63,10 +93,36 @@ Parse `$ARGUMENTS` using this format:
6393

6494
3. **Spawn one `general-purpose` opus agent per library** via `Task` tool with:
6595
- `team_name`: `update-{spec_id}`
66-
- `name`: `{library}-updater`
96+
- `name`: `{library}`
6797
- `subagent_type`: `general-purpose`
6898
- `model`: `opus`
69-
- The **library-updater prompt** (see below), with `{SPEC_ID}`, `{LIBRARY}`, and `{DESCRIPTION}` filled in
99+
- The **library agent prompt** (see below), with `{SPEC_ID}`, `{LIBRARY}`, `{DESCRIPTION}`, `{CONTEXT7_LIBRARY}`,
100+
`{PLOT_TYPE}`, and `{SPEC_TITLE}` filled in
101+
102+
**Template variable reference** (lead must fill these):
103+
104+
| Variable | Source |
105+
|----------|--------|
106+
| `{SPEC_ID}` | From Phase 1 parse |
107+
| `{LIBRARY}` | Current library name |
108+
| `{DESCRIPTION}` | User's description |
109+
| `{CONTEXT7_LIBRARY}` | Mapped library name for Context7 (see mapping below) |
110+
| `{PLOT_TYPE}` | Primary `plot_type` tag from `specification.yaml` |
111+
| `{SPEC_TITLE}` | Title from `specification.md` |
112+
113+
**Context7 library name mapping:**
114+
115+
| Library | Context7 name |
116+
|---------|---------------|
117+
| `matplotlib` | `matplotlib` |
118+
| `seaborn` | `seaborn` |
119+
| `plotly` | `plotly` |
120+
| `bokeh` | `bokeh` |
121+
| `altair` | `altair` |
122+
| `plotnine` | `plotnine` |
123+
| `pygal` | `pygal` |
124+
| `highcharts` | `highcharts-core` |
125+
| `letsplot` | `lets-plot` |
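
As a rough sketch of how the lead might fill the template variables, the mapping above can be held in a dictionary and applied with plain string replacement. The helper name `fill_prompt` and the example spec values are hypothetical; `str.replace` is used instead of `str.format` on the assumption that the prompt contains other literal braces.

```python
# Context7 names copied from the mapping table above.
CONTEXT7_NAMES = {
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "plotly": "plotly",
    "bokeh": "bokeh",
    "altair": "altair",
    "plotnine": "plotnine",
    "pygal": "pygal",
    "highcharts": "highcharts-core",
    "letsplot": "lets-plot",
}


def fill_prompt(template: str, *, spec_id: str, library: str, description: str,
                plot_type: str, spec_title: str) -> str:
    """Replace the six template variables in the library agent prompt (hypothetical helper)."""
    values = {
        "{SPEC_ID}": spec_id,
        "{LIBRARY}": library,
        "{DESCRIPTION}": description,
        "{CONTEXT7_LIBRARY}": CONTEXT7_NAMES[library],  # KeyError for an unmapped library
        "{PLOT_TYPE}": plot_type,
        "{SPEC_TITLE}": spec_title,
    }
    for placeholder, value in values.items():
        template = template.replace(placeholder, value)
    return template


# Illustrative values only; they are not taken from a real spec.
prompt = fill_prompt(
    "You are **{LIBRARY}** on the `update-{SPEC_ID}` team. Docs: {CONTEXT7_LIBRARY}",
    spec_id="scatter-basic",
    library="letsplot",
    description="improve label sizes",
    plot_type="scatter",
    spec_title="Basic Scatter Plot",
)
```

Failing loudly on an unmapped library is a deliberate choice here: a silent fallback would send the agent to the wrong Context7 documentation.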
70126

71127
4. **Assign tasks** to the corresponding agents via `TaskUpdate`
72128

@@ -75,7 +131,7 @@ their designated directories (see file containment rules in the agent prompt).
75131

76132
---
77133

78-
### Phase 3: Collect & Present
134+
### Phase 4: Collect & Present
79135

80136
Agents report back via `SendMessage` (auto-delivered to you). Agents may report either **completed work** (`STATUS: done`) or **a conflict** (`STATUS: conflict`). Once all agents have reported:
81137

@@ -132,25 +188,25 @@ Agents report back via `SendMessage` (auto-delivered to you). Agents may report
132188

133189
---
134190

135-
### Phase 4: Iterate
191+
### Phase 5: Iterate
136192

137193
For per-library feedback:
138194

139-
1. Send the feedback to the specific idle teammate via `SendMessage` (e.g., to `seaborn-updater`). This wakes them up.
140-
2. The agent runs its conflict check again (Step 2) on the new feedback. If it detects a conflict, it reports back with `STATUS: conflict` instead of making changes — handle as in Phase 3.
195+
1. Send the feedback to the specific idle teammate via `SendMessage` (e.g., to `seaborn`). This wakes them up.
196+
2. The agent runs its conflict check again (Step 2) on the new feedback. If it detects a conflict, it reports back with `STATUS: conflict` instead of making changes — handle as in Phase 4.
141197
3. If no conflict, the agent re-modifies, re-generates, reports back, and goes idle again.
142198
4. Present updated results to the user.
143199
5. Repeat until the user approves.
144200

145201
---
146202

147-
### Phase 5: Ship
203+
### Phase 6: Ship
148204

149205
**Only proceed when the user explicitly approves shipping.**
150206

151207
The lead handles all shipping directly (no delegation to teammates):
152208

153-
#### 5a. Code Quality
209+
#### 6a. Code Quality
154210

155211
Run ruff format and check **sequentially first**, before any parallel version-info commands.
156212
If parallel Bash calls are used and one fails, all sibling calls get cancelled — so always run ruff alone.
@@ -163,7 +219,7 @@ uv run ruff check --fix plots/{spec_id}/implementations/*.py
163219
If there are unfixable errors, fix them manually and re-run. The agents should have already run ruff in their
164220
lint step, but this is a safety net.
165221

166-
#### 5b. Update Metadata YAML
222+
#### 6b. Update Metadata YAML
167223

168224
For each updated library, edit `plots/{spec_id}/metadata/{library}.yaml`:
169225

@@ -190,7 +246,7 @@ For each updated library, edit `plots/{spec_id}/metadata/{library}.yaml`:
190246
| highcharts | `highcharts-core` |
191247
| letsplot | `lets-plot` |
192248

193-
#### 5c. Update Implementation Header
249+
#### 6c. Update Implementation Header
194250

195251
For each updated library, ensure the implementation file starts with:
196252

@@ -202,7 +258,7 @@ Quality: /100 | Updated: {YYYY-MM-DD}
202258
"""
203259
```
204260

205-
#### 5d. Copy Final Images
261+
#### 6d. Copy Final Images
206262

207263
For each library, copy the preview images to the implementations directory for GCS upload:
208264

@@ -217,7 +273,7 @@ uv run python -m core.images process \
217273

218274
Note: Since we process one library at a time for GCS upload, handle sequentially.
219275

220-
#### 5e. GCS Staging Upload
276+
#### 6e. GCS Staging Upload
221277

222278
For each library:
223279

@@ -246,13 +302,13 @@ GCS files from staging to production on merge):
246302
- `preview_url`: `https://storage.googleapis.com/pyplots-images/plots/{spec_id}/{library}/plot.png`
247303
- `preview_thumb`: `https://storage.googleapis.com/pyplots-images/plots/{spec_id}/{library}/plot_thumb.png`
248304

249-
#### 5f. Clean Up Preview Directory
305+
#### 6f. Clean Up Preview Directory
250306

251307
```bash
252308
rm -rf plots/{spec_id}/implementations/.update-preview
253309
```
254310

255-
#### 5g. Per-Library Branches, PRs & Reviews
311+
#### 6g. Per-Library Branches, PRs & Reviews
256312

257313
**IMPORTANT:** The review pipeline (`impl-review.yml`) extracts `SPEC_ID` and `LIBRARY` from the branch name
258314
pattern `implementation/{spec-id}/{library}`. Therefore, each library MUST get its own branch and PR.
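
To illustrate why the branch name matters, here is a sketch of how `SPEC_ID` and `LIBRARY` could be recovered from it. This is not the code in `impl-review.yml`; the regular expression and the name `parse_branch` are assumptions based only on the pattern stated above.

```python
import re
from typing import Optional

# Assumed shape: exactly three slash-separated segments, the first being "implementation".
BRANCH_RE = re.compile(r"^implementation/(?P<spec_id>[^/]+)/(?P<library>[^/]+)$")


def parse_branch(branch: str) -> Optional[tuple[str, str]]:
    """Return (spec_id, library) for a matching branch name, else None."""
    match = BRANCH_RE.match(branch)
    if match is None:
        return None
    return match["spec_id"], match["library"]
```

A branch that bundles several libraries cannot yield a single `LIBRARY` value under this scheme, which is why one branch per library is required.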
@@ -362,22 +418,140 @@ rm -f /tmp/patch-{spec_id}-*.patch
362418

363419
Report all PR URLs to the user.
364420

365-
#### 5h. Cleanup Team
421+
---
422+
423+
### Phase 7: Monitor & Resolve
424+
425+
After shipping PRs, the lead monitors the review pipeline and handles any failures. The team stays alive until
426+
all PRs are merged.
427+
428+
#### 7a. Poll PR Status
429+
430+
Build a tracking table: `{library} → {pr_number, status}` where status is one of: `reviewing`, `approved`,
431+
`merged`, `rejected`, `failed`.
432+
433+
Present the summary table to the user.
434+
435+
Poll every **90 seconds** using `gh pr view` for each PR:
436+
437+
```bash
438+
gh pr view {pr_number} --json state,labels,mergedAt
439+
```
440+
441+
Extract status from labels: `ai-approved`, `ai-rejected`, `quality:{score}`, `quality-poor`.
442+
443+
Update the table and inform the user when status changes.
444+
445+
**Exit conditions**: all PRs are `merged` OR user says `abort`.
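
A minimal sketch of turning one `gh pr view --json state,labels,mergedAt` result into a tracking-table row follows. The function name `pr_status` and the precedence between labels are assumptions; the `failed` status is not derived here because it comes from workflow run checks rather than labels.

```python
import json
from typing import Optional


def pr_status(gh_json: str) -> dict:
    """Derive status and quality score from the JSON printed by `gh pr view`."""
    data = json.loads(gh_json)
    labels = [label["name"] for label in data.get("labels", [])]

    # `quality:{score}` carries the numeric score; `quality-poor` has no number.
    score: Optional[int] = None
    for name in labels:
        if name.startswith("quality:"):
            raw = name.split(":", 1)[1]
            if raw.isdigit():
                score = int(raw)

    # Assumed precedence: merged > rejected > approved > reviewing.
    if data.get("state") == "MERGED" or data.get("mergedAt"):
        status = "merged"
    elif "ai-rejected" in labels:
        status = "rejected"
    elif "ai-approved" in labels:
        status = "approved"
    else:
        status = "reviewing"
    return {"status": status, "quality": score}


# Hand-written sample payload for illustration, not real pipeline output.
sample = '{"state": "OPEN", "labels": [{"name": "ai-rejected"}, {"name": "quality:78"}], "mergedAt": null}'
row = pr_status(sample)
```

Comparing each new row with the previous one is enough to decide whether the user needs to be told about a status change.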
446+
447+
#### 7b. Handle Rejections
448+
449+
**When a PR gets `ai-rejected`:**
450+
451+
1. **Cancel CI repair**: `impl-repair.yml` auto-triggers on `ai-rejected`. Cancel it since we'll fix locally
452+
(agents have context):
453+
```bash
454+
gh run list --workflow=impl-repair.yml --branch=implementation/{spec_id}/{library} --status=in_progress --json databaseId -q '.[0].databaseId'
455+
# then: gh run cancel {run_id}
456+
```
457+
458+
2. **Read review feedback** from the PR:
459+
```bash
460+
gh pr view {pr_number} --json comments -q '.comments[-1].body'
461+
```
462+
Also read the updated metadata on the PR branch for structured review data:
463+
```bash
464+
gh api "repos/{owner}/{repo}/contents/plots/{spec_id}/metadata/{library}.yaml?ref=implementation/{spec_id}/{library}" -q '.content' | base64 -d
465+
```
466+
467+
3. **Wake the agent** via `SendMessage` with the review feedback. Agent repeats Steps 2-8 (conflict check →
468+
modify → generate → lint → process → self-check → report).
469+
470+
4. **Push repair to PR branch** — after agent reports back:
471+
```bash
472+
# Save current main state
473+
git stash
474+
475+
# Checkout PR branch, pull latest (review may have pushed metadata updates)
476+
git checkout implementation/{spec_id}/{library}
477+
git pull
478+
479+
# Stage agent's changes
480+
git add plots/{spec_id}/implementations/{library}.py
481+
git add plots/{spec_id}/metadata/{library}.yaml
482+
git commit -m "repair({spec_id}): {library} — address review feedback"
483+
git push
484+
485+
# Return to main
486+
git checkout main
487+
git stash pop
488+
```
489+
490+
5. **Re-upload images to GCS staging** (agent regenerated in `.update-preview/`):
491+
```bash
492+
# Process images
493+
uv run python -m core.images process \
494+
plots/{spec_id}/implementations/.update-preview/{library}/plot.png \
495+
plots/{spec_id}/implementations/.update-preview/{library}/plot.png \
496+
plots/{spec_id}/implementations/.update-preview/{library}/plot_thumb.png
497+
498+
STAGING_PATH="gs://pyplots-images/staging/{spec_id}/{library}"
499+
gsutil cp plots/{spec_id}/implementations/.update-preview/{library}/plot.png "${STAGING_PATH}/plot.png"
500+
gsutil cp plots/{spec_id}/implementations/.update-preview/{library}/plot_thumb.png "${STAGING_PATH}/plot_thumb.png"
501+
```
502+
503+
6. **Re-trigger review**:
504+
```bash
505+
gh api repos/{owner}/{repo}/dispatches \
506+
-f event_type=review-pr \
507+
-f 'client_payload[pr_number]='"$PR_NUMBER"
508+
```
509+
510+
7. Continue polling. If rejected again, repeat (up to 2 repair rounds by lead — 3rd attempt handled by CI if
511+
needed).
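
The repair-round limit in step 7 can be expressed as a tiny decision helper. This is one reading of the rule, under the assumption that the first two rejections are repaired by the lead and a third is left to `impl-repair.yml`; the names are hypothetical.

```python
MAX_LEAD_REPAIRS = 2  # from the text: "up to 2 repair rounds by lead"


def repair_owner(rejection_number: int) -> str:
    """Return who handles a rejection; rejection_number is 1-based."""
    if rejection_number < 1:
        raise ValueError("rejection_number starts at 1")
    # Assumption: from the third rejection on, the CI repair workflow is left running.
    return "lead" if rejection_number <= MAX_LEAD_REPAIRS else "ci"
```

Under this reading, the lead would skip the cancel step once `repair_owner` returns `"ci"`.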
512+
513+
**Workflow failures:**
514+
515+
If a review or merge workflow fails (no labels appear after ~10 minutes):
366516

367-
1. `SendMessage` with type `shutdown_request` to all agents
368-
2. `TeamDelete` to clean up the team
369-
3. Report all PR URLs to the user
517+
- Check workflow run status:
518+
```bash
519+
gh run list --workflow=impl-review.yml --branch=implementation/{spec_id}/{library} --limit 1 --json status,conclusion
520+
```
521+
- If failed, read logs:
522+
```bash
523+
gh run view {run_id} --log-failed
524+
```
525+
- Report failure reason to user, ask how to proceed (re-trigger, fix manually, skip).
526+
527+
#### 7c. Final Report & Cleanup
528+
529+
Once all PRs are merged:
530+
531+
1. Present final summary table:
532+
533+
| Library | PR | Quality Score | Status |
534+
|---------|-----|--------------|--------|
535+
| matplotlib | #1234 | 92 | merged |
536+
| seaborn | #1235 | 87 (repair) | merged |
537+
538+
2. `SendMessage` with type `shutdown_request` to all agents
539+
3. `TeamDelete` to clean up the team
540+
4. Clean up preview directory if still present:
541+
```bash
542+
rm -rf plots/{spec_id}/implementations/.update-preview
543+
```
370544

371545
---
372546

373-
## Library-Updater Agent Prompt
547+
## Library Agent Prompt
374548

375-
Use this prompt when spawning each per-library agent. Replace `{SPEC_ID}`, `{LIBRARY}`, and `{DESCRIPTION}` with actual
376-
values.
549+
Use this prompt when spawning each per-library agent. Replace `{SPEC_ID}`, `{LIBRARY}`, `{DESCRIPTION}`,
550+
`{CONTEXT7_LIBRARY}`, `{PLOT_TYPE}`, and `{SPEC_TITLE}` with actual values (see Phase 3 template variable reference).
377551

378552
---
379553

380-
You are the **{LIBRARY}-updater** on the `update-{SPEC_ID}` team. Your job is to update the {LIBRARY} implementation for
554+
You are **{LIBRARY}** on the `update-{SPEC_ID}` team. Your job is to update the {LIBRARY} implementation for
381555
**{SPEC_ID}**.
382556

383557
**Task:** {DESCRIPTION}
@@ -396,6 +570,10 @@ Read these files to understand what you're working with:
396570
4. `prompts/library/{LIBRARY}.md` — library-specific rules (**CRITICAL**: follow these exactly)
397571
5. `prompts/plot-generator.md` — base generation rules
398572
6. `prompts/quality-criteria.md` — quality scoring criteria
573+
7. **Context7 library documentation** — Query up-to-date library docs for idiomatic patterns:
574+
- Call `resolve-library-id` with `libraryName: "{CONTEXT7_LIBRARY}"` and `query: "how to create {PLOT_TYPE} chart with {CONTEXT7_LIBRARY}"`
575+
- Call `query-docs` with the resolved library ID and `query: "idiomatic patterns for creating {SPEC_TITLE} ({PLOT_TYPE}) with {CONTEXT7_LIBRARY}, including best practices for styling and layout"`
576+
- Use the returned documentation **together with** (not instead of) the static library rules from step 4
399577

400578
If `preview_url` exists in the metadata, view the current preview image to understand what the plot currently looks
401579
like.
@@ -455,6 +633,7 @@ Edit `plots/{SPEC_ID}/implementations/{LIBRARY}.py`:
455633
4. **Spec Compliance** — Point-by-point check against `specification.md`
456634
5. **Library Feature Usage** (LF-01) — Does the code leverage distinctive library strengths? Basic usage is not enough
457635
6. **Code Transferability** — Can a user easily adapt this to their own data? Clear separation of data vs. plot logic? Meaningful variable names?
636+
- **Respect the spec variant:** If the spec-id contains `basic`, the plot must stay basic. Do NOT add annotations, trendlines, regression lines, callout boxes, or other embellishments. Basic means clean and simple — storytelling comes from well-chosen data and visual clarity, not added elements.
458637
- **No changes for the sake of changes:** If you find nothing meaningful to improve, report "no improvements needed" and leave the code unchanged. Do not make cosmetic or unnecessary changes just to show activity.
459638

460639
If the specification genuinely needs changes to improve the result, edit `plots/{SPEC_ID}/specification.md` and

core/__init__.py

Lines changed: 0 additions & 5 deletions
@@ -1,6 +1 @@
11
"""Core business logic for pyplots."""
2-
3-
from core.images import create_thumbnail, process_plot_image
4-
5-
6-
__all__ = ["create_thumbnail", "process_plot_image"]

prompts/plot-generator.md

Lines changed: 2 additions & 2 deletions
@@ -362,9 +362,9 @@ pyplots renders at **4800 × 2700 px** (16:9) or **3600 × 3600 px** (1:1) — s
362362
- Remove decorations: single-series legends, tick marks (keep labels), unnecessary grid lines
363363

364364
**Data storytelling (for DE-03 score):**
365-
- Consider adding annotations to highlight key data points or trends
366365
- Use visual emphasis (color, size) to guide the viewer's eye
367-
- Tell a story, don't just display data
366+
- Tell a story through good data choice and clear visual hierarchy
367+
- **Respect the spec variant:** If the spec-id contains `basic`, storytelling comes from well-chosen data and clean design — NOT from adding annotations, trendlines, or extra visual elements. A basic scatter plot should remain a basic scatter plot.
368368

369369
## Output File
370370
