fix(workflows): update quality threshold to 90 and add CLI retry logic #1089
Documentation:
- Update quality threshold from 85 to 90 across all docs
- Files: CLAUDE.md, README.md, docs/workflow.md,
docs/concepts/claude-skill-plot-generation.md, prompts/quality-evaluator.md
CLI Retry Logic:
- Add automatic retry for Claude CLI steps on failure
- Helps handle intermittent "Executable not found in $PATH" errors
- Files: spec-create.yml, spec-update.yml, impl-generate.yml,
impl-repair.yml, util-claude.yml
Issue Lifecycle:
- spec-ready issues now stay open until all 9 libraries are done
- Changed "Closes #..." to "Related to #..." in spec PR body
- Added auto-close when all impl:{library}:done labels present
Fixes #967
Pull request overview
This PR addresses issue #967 by updating quality threshold documentation from 85 to 90 (matching the actual implementation), adding CLI retry logic to handle intermittent Claude failures, and changing the issue lifecycle so spec-ready issues remain open until all library implementations are complete.
Key Changes:
- Documentation updated across 5 files to reflect the correct quality threshold of 90
- Retry mechanism added to 5 workflow files to handle Claude CLI "Executable not found in $PATH" errors
- Issue lifecycle modified: spec PRs now use "Related to" instead of "Closes", and issues auto-close when all 9 libraries complete
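The retry behaviour described above can be pictured as "run once, and run once more only if the first attempt failed". In the workflows this is done with `continue-on-error` on the first step and a second step gated on the first step's `outcome == 'failure'`. The sketch below is only a shell analogue of that control flow; `run_with_retry` and `flaky` are hypothetical names used for illustration and do not appear in the PR.

```shell
#!/usr/bin/env bash
# Sketch only: a shell analogue of "first attempt + one retry on failure".
# run_with_retry is a hypothetical helper, not code from the PR.

# Run a command; if it fails, run it exactly once more and
# return the exit status of that second attempt.
run_with_retry() {
  if "$@"; then
    return 0
  fi
  echo "first attempt failed, retrying once: $*" >&2
  "$@"
}

# Demo: a command that fails on its first call and succeeds on its second,
# standing in for an intermittent "Executable not found in $PATH" failure.
attempts_file="$(mktemp)"
echo 0 > "$attempts_file"
flaky() {
  local n
  n=$(( $(cat "$attempts_file") + 1 ))
  echo "$n" > "$attempts_file"
  [ "$n" -ge 2 ]
}

run_with_retry flaky && echo "succeeded after $(cat "$attempts_file") attempts"
# prints: succeeded after 2 attempts
```

A single retry keeps the worst-case runtime bounded at twice the step timeout, which matters when each attempt may run for 30 to 60 minutes.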
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| prompts/quality-evaluator.md | Updates score thresholds: approve ≥90 (was ≥85), request_changes 80-89 (was 75-84), reject <80 (was <75) |
| docs/workflow.md | Updates all references to quality threshold from 85 to 90 in documentation and diagrams |
| docs/concepts/claude-skill-plot-generation.md | Updates pass_threshold parameter and loop condition from 85 to 90 |
| README.md | Updates two instances of quality score requirement from ≥85 to ≥90 |
| CLAUDE.md | Updates label descriptions and final note about quality threshold from 85 to 90 |
| .github/workflows/util-claude.yml | Adds continue-on-error and retry step to handle Claude CLI failures |
| .github/workflows/spec-update.yml | Adds continue-on-error to Claude step and full retry step with duplicate prompt |
| .github/workflows/spec-create.yml | Adds continue-on-error to Claude step, retry logic, and changes "Closes" to "Related to" in PR body |
| .github/workflows/impl-repair.yml | Adds continue-on-error to Claude step and full retry step with duplicate prompt |
| .github/workflows/impl-generate.yml | Adds continue-on-error to Claude step and full retry step with duplicate prompt |
| .github/workflows/impl-merge.yml | Adds auto-close logic that closes issues when all 9 library implementations have impl:{lib}:done labels |
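As a quick illustration of the updated score bands listed for prompts/quality-evaluator.md (approve ≥90, request_changes 80-89, reject <80), here is a minimal shell sketch. `classify_score` is a hypothetical helper, and it assumes integer scores; the real evaluation is performed by the prompt, not by a script like this.

```shell
#!/usr/bin/env bash
# Sketch only: maps an integer quality score to the verdict bands from the
# updated quality-evaluator thresholds. classify_score is a hypothetical name.

classify_score() {
  local score="$1"
  if [ "$score" -ge 90 ]; then
    echo "approve"
  elif [ "$score" -ge 80 ]; then
    echo "request_changes"
  else
    echo "reject"
  fi
}

# Boundary values on both sides of each band.
for s in 95 90 89 80 79; do
  printf '%s -> %s\n' "$s" "$(classify_score "$s")"
done
# prints:
# 95 -> approve
# 90 -> approve
# 89 -> request_changes
# 80 -> request_changes
# 79 -> reject
```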
Comments suppressed due to low confidence (4)
.github/workflows/impl-repair.yml:225
- The retry step duplicates the entire prompt from the first attempt (lines 119-170). This creates maintenance burden - if the prompt needs to be updated, it must be changed in two places. Consider using YAML anchors or extracting the prompt to a separate file/variable that can be reused in both steps.
- name: Retry Claude (on failure)
if: steps.claude.outcome == 'failure'
id: claude_retry
timeout-minutes: 45
uses: anthropics/claude-code-action@v1
with:
claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
claude_args: "--model opus"
prompt: |
## Task: Repair ${{ inputs.library }} Implementation for ${{ inputs.specification_id }}
This is **repair attempt ${{ inputs.attempt }}/3**. The previous implementation was rejected.
### Step 1: Read the AI review feedback
Read `/tmp/ai_feedback.md` to understand what needs to be fixed.
### Step 2: Read reference files
1. `prompts/library/${{ inputs.library }}.md` - Library-specific rules
2. `plots/${{ inputs.specification_id }}/specification.md` - The specification
3. `prompts/quality-criteria.md` - Quality requirements
### Step 3: Read current implementation
`plots/${{ inputs.specification_id }}/implementations/${{ inputs.library }}.py`
### Step 4: Fix the issues
Based on the AI feedback, fix:
- Visual quality issues
- Code quality issues
- Spec compliance issues
### Step 5: Test the fix
```bash
source .venv/bin/activate
cd plots/${{ inputs.specification_id }}/implementations
MPLBACKEND=Agg python ${{ inputs.library }}.py
```
### Step 6: Visual self-check
View `plot.png` and verify fixes are correct.
### Step 7: Format the code
```bash
source .venv/bin/activate
ruff format plots/${{ inputs.specification_id }}/implementations/${{ inputs.library }}.py
ruff check --fix plots/${{ inputs.specification_id }}/implementations/${{ inputs.library }}.py
```
### Step 8: Commit and push
```bash
git config user.name "github-actions[bot]"
git config user.email "github-actions[bot]@users.noreply.github.com"
git add plots/${{ inputs.specification_id }}/implementations/${{ inputs.library }}.py
git commit -m "fix(${{ inputs.library }}): address review feedback for ${{ inputs.specification_id }}
```
.github/workflows/spec-create.yml:256
- The retry step duplicates the entire prompt from the first attempt (lines 76-161). This creates maintenance burden - if the prompt needs to be updated, it must be changed in two places. Consider using YAML anchors or extracting the prompt to a separate file/variable that can be reused in both steps.
- name: Retry Claude (on failure)
if: steps.check.outputs.should_run == 'true' && steps.process.outcome == 'failure'
id: process_retry
timeout-minutes: 30
uses: anthropics/claude-code-action@v1
with:
claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
claude_args: "--model opus"
prompt: |
## Task: Create New Specification
You are creating a new plot specification.
### Issue Details
- **Title:** ${{ github.event.issue.title }}
- **Number:** #${{ github.event.issue.number }}
- **Author:** ${{ github.event.issue.user.login }}
- **Body:**
```
${{ github.event.issue.body }}
```
---
## Instructions
1. **Read the rules:** `prompts/spec-id-generator.md`
2. **Check for duplicates:**
- List all existing specs: `ls plots/`
- Read existing specification files if titles seem similar
- If duplicate found: Post comment explaining which spec matches, then STOP
3. **Generate specification-id:**
- Format: `{type}-{variant}` or `{type}-{variant}-{modifier}`
- Examples: `scatter-basic`, `bar-grouped-horizontal`, `heatmap-correlation`
- All lowercase, hyphens only
4. **Create specification branch:**
```bash
git checkout -b "specification/{specification-id}"
```
5. **Post analysis comment:**
Post a SHORT comment (max 3-4 sentences) to the issue using `gh issue comment`:
- Is this a valid/useful plot type?
- Does it already exist? (check `ls plots/`)
- Any concerns?
6. **Create specification files:**
- Read template: `prompts/templates/specification.md`
- Read metadata template: `prompts/templates/specification.yaml`
- Create directory: `plots/{specification-id}/`
- Create: `plots/{specification-id}/specification.md` (follow template structure)
- Create: `plots/{specification-id}/specification.yaml` with:
- `specification_id`: the generated id
- `title`: a proper title
- `created`: Use `$(date -u +"%Y-%m-%dT%H:%M:%SZ")` for current timestamp
- `issue`: ${{ github.event.issue.number }}
- `suggested`: ${{ github.event.issue.user.login }}
- `tags`: appropriate tags for this plot type
- Create empty folder: `plots/{specification-id}/implementations/`
- Create empty folder: `plots/{specification-id}/metadata/`
7. **Commit and push:**
```bash
git config user.name "github-actions[bot]"
git config user.email "github-actions[bot]@users.noreply.github.com"
git add plots/{specification-id}/
git commit -m "spec: add {specification-id} specification
Created from issue #${{ github.event.issue.number }}"
git push -u origin "specification/{specification-id}"
```
8. **Update issue title:**
```bash
gh issue edit ${{ github.event.issue.number }} --title "[{specification-id}] {original title}"
```
9. **Output for workflow:**
After completing, print these lines exactly:
```
SPECIFICATION_ID={specification-id}
BRANCH=specification/{specification-id}
```
---
## Important Rules
- Do NOT create a PR (the workflow does that)
- Do NOT add labels
- Do NOT close the issue
- STOP after pushing the branch
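The specification-id rules in the prompt above (`{type}-{variant}` or `{type}-{variant}-{modifier}`, all lowercase, hyphens only) could be checked mechanically before the branch is created. The sketch below is one possible reading of those rules: it assumes each part consists of lowercase letters only, which the prompt does not state explicitly, and `is_valid_spec_id` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch only: validates a specification id against the format in the prompt.
# Assumption: every part is made of lowercase ASCII letters (no digits).
# is_valid_spec_id is a hypothetical helper, not part of the workflows.

is_valid_spec_id() {
  # Two or three hyphen-separated parts: type-variant[-modifier].
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z]+(-[a-z]+){1,2}$'
}

for id in scatter-basic bar-grouped-horizontal Scatter_Basic scatter; do
  if is_valid_spec_id "$id"; then
    echo "$id: valid"
  else
    echo "$id: invalid"
  fi
done
# prints:
# scatter-basic: valid
# bar-grouped-horizontal: valid
# Scatter_Basic: invalid
# scatter: invalid
```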
.github/workflows/spec-update.yml:203
- The retry step duplicates the entire prompt from the first attempt (lines 79-136). This creates maintenance burden - if the prompt needs to be updated, it must be changed in two places. Consider using YAML anchors or extracting the prompt to a separate file/variable that can be reused in both steps.
- name: Retry Claude (on failure)
if: steps.process.outcome == 'failure'
id: process_retry
timeout-minutes: 30
uses: anthropics/claude-code-action@v1
with:
claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
claude_args: "--model opus"
prompt: |
## Task: Update Existing Specification
You are updating an existing plot specification.
### Issue Details
- **Title:** ${{ github.event.issue.title }}
- **Number:** #${{ github.event.issue.number }}
- **Specification ID:** ${{ steps.extract.outputs.specification_id }}
- **Body:**
```
${{ github.event.issue.body }}
```
---
## Instructions
1. **Read current specification:**
- `plots/${{ steps.extract.outputs.specification_id }}/specification.md`
- `plots/${{ steps.extract.outputs.specification_id }}/specification.yaml`
2. **Post analysis comment:**
Post a SHORT comment (max 3-4 sentences) to the issue using `gh issue comment`:
- Is this a valid/useful change?
- What will be modified?
- Any concerns?
3. **Create update branch:**
```bash
git checkout -b "specification/${{ steps.extract.outputs.specification_id }}-update"
```
4. **Apply updates:**
- Modify `plots/${{ steps.extract.outputs.specification_id }}/specification.md` as needed
- Update `plots/${{ steps.extract.outputs.specification_id }}/specification.yaml`:
- Add entry to `history` array with:
- `date`: current timestamp (ISO 8601 format)
- `issue`: ${{ github.event.issue.number }}
- `changes`: brief description of what changed
5. **Commit and push:**
```bash
git config user.name "github-actions[bot]"
git config user.email "github-actions[bot]@users.noreply.github.com"
git add plots/${{ steps.extract.outputs.specification_id }}/
git commit -m "spec: update ${{ steps.extract.outputs.specification_id }}
Updated from issue #${{ github.event.issue.number }}"
git push -u origin "specification/${{ steps.extract.outputs.specification_id }}-update"
```
---
## Important Rules
- Do NOT create a PR (the workflow does that)
- Do NOT add labels
- STOP after pushing the branch
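One way to act on the review suggestion of keeping each prompt in a single place is to store it as a template file and render it once, so the first attempt and the retry consume identical text. The sketch below is an assumption-laden illustration: the `{{SPEC_ID}}` and `{{ISSUE_NUMBER}}` placeholders, the temporary template file, and `render_prompt` are all hypothetical, and how the rendered prompt would be handed to the action is left open.

```shell
#!/usr/bin/env bash
# Sketch only: render a shared prompt template once and reuse it.
# Placeholder names and render_prompt are hypothetical; the real prompts use
# GitHub Actions expressions such as ${{ github.event.issue.number }}.

template="$(mktemp)"
cat > "$template" <<'EOF'
## Task: Update Existing Specification
- **Number:** #{{ISSUE_NUMBER}}
- **Specification ID:** {{SPEC_ID}}
EOF

render_prompt() {
  # $1 = template path, $2 = specification id, $3 = issue number.
  # Note: values containing '|' or '&' would need escaping for sed.
  sed -e "s|{{SPEC_ID}}|$2|g" -e "s|{{ISSUE_NUMBER}}|$3|g" "$1"
}

# Rendered once; both the first attempt and the retry would read this text.
prompt="$(render_prompt "$template" scatter-basic 42)"
printf '%s\n' "$prompt"
# prints:
# ## Task: Update Existing Specification
# - **Number:** #42
# - **Specification ID:** scatter-basic
```

With this layout, a prompt change touches only the template, which removes the two-places maintenance burden the review comments describe.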
.github/workflows/impl-generate.yml:320
- The retry step duplicates the entire prompt from the first attempt (lines 205-259). This creates maintenance burden - if the prompt needs to be updated, it must be changed in two places. Consider using YAML anchors or extracting the prompt to a separate file/variable that can be reused in both steps.
- name: Retry Claude (on failure)
if: steps.claude.outcome == 'failure'
id: claude_retry
timeout-minutes: 60
uses: anthropics/claude-code-action@v1
with:
claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
claude_args: "--model opus"
prompt: |
## Task: Generate ${{ steps.inputs.outputs.library }} Implementation
You are generating the **${{ steps.inputs.outputs.library }}** implementation for **${{ steps.inputs.outputs.specification_id }}**.
### Step 1: Read required files
1. `prompts/plot-generator.md` - Base generation rules
2. `prompts/default-style-guide.md` - Visual style requirements
3. `prompts/quality-criteria.md` - Quality requirements
4. `prompts/library/${{ steps.inputs.outputs.library }}.md` - Library-specific rules
5. `plots/${{ steps.inputs.outputs.specification_id }}/specification.md` - The specification
### Step 2: Generate implementation
Create: `plots/${{ steps.inputs.outputs.specification_id }}/implementations/${{ steps.inputs.outputs.library }}.py`
The script MUST:
- Save as `plot.png` in the current directory
- For interactive libraries (plotly, bokeh, altair, highcharts, pygal, letsplot): also save `plot.html`
### Step 3: Test and fix (up to 3 attempts)
Run the implementation:
```bash
source .venv/bin/activate
cd plots/${{ steps.inputs.outputs.specification_id }}/implementations
MPLBACKEND=Agg python ${{ steps.inputs.outputs.library }}.py
```
If it fails, fix and try again (max 3 attempts).
### Step 4: Visual self-check
Look at the generated `plot.png`:
- Does it match the specification?
- Are axes labeled correctly?
- Is the visualization clear?
### Step 5: Format the code
```bash
source .venv/bin/activate
ruff format plots/${{ steps.inputs.outputs.specification_id }}/implementations/${{ steps.inputs.outputs.library }}.py
ruff check --fix plots/${{ steps.inputs.outputs.specification_id }}/implementations/${{ steps.inputs.outputs.library }}.py
```
### Step 6: Commit
```bash
git config user.name "github-actions[bot]"
git config user.email "github-actions[bot]@users.noreply.github.com"
git add plots/${{ steps.inputs.outputs.specification_id }}/implementations/${{ steps.inputs.outputs.library }}.py
git commit -m "feat(${{ steps.inputs.outputs.library }}): implement ${{ steps.inputs.outputs.specification_id }}"
git push -u origin implementation/${{ steps.inputs.outputs.specification_id }}/${{ steps.inputs.outputs.library }}
```
### Report result
.github/workflows/impl-merge.yml
# Close issue if all 9 libraries are done
if [ "$DONE_COUNT" -eq 9 ]; then
  gh issue comment "$ISSUE" --body "## :tada: All Implementations Complete!

All 9 library implementations for \`${SPEC_ID}\` have been successfully merged.

| Library | Status |
|---------|--------|
| matplotlib | :white_check_mark: |
| seaborn | :white_check_mark: |
| plotly | :white_check_mark: |
| bokeh | :white_check_mark: |
| altair | :white_check_mark: |
| plotnine | :white_check_mark: |
| pygal | :white_check_mark: |
| highcharts | :white_check_mark: |
| letsplot | :white_check_mark: |

---
:robot: *[impl-merge](https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }})*"

  gh issue close "$ISSUE"
  echo "::notice::Closed issue #$ISSUE - all implementations complete"
fi
The auto-close logic only checks for impl:{lib}:done labels and requires all 9 libraries to have this label. However, if any library implementation fails 3 times, it gets marked with impl:{lib}:failed instead (see impl-review.yml line 349). This means issues will remain open indefinitely if even a single library cannot implement the spec. Consider also counting impl:{lib}:failed labels and closing the issue when all 9 libraries have either :done or :failed status.
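A minimal sketch of the suggested check follows: it counts a library as finished when it carries either an `impl:{lib}:done` or an `impl:{lib}:failed` label. The library list is taken from the completion table in the workflow snippet; `count_finished` is a hypothetical helper, and reading label names from stdin is an assumption made so the logic can be shown without calling the GitHub CLI.

```shell
#!/usr/bin/env bash
# Sketch only: count libraries that are either done or failed.
# count_finished is a hypothetical helper, not code from impl-merge.yml.

LIBRARIES="matplotlib seaborn plotly bokeh altair plotnine pygal highcharts letsplot"

# Reads label names on stdin (one per line) and prints how many libraries
# carry impl:{lib}:done or impl:{lib}:failed.
count_finished() {
  local labels lib n=0
  labels="$(cat)"
  for lib in $LIBRARIES; do
    if printf '%s\n' "$labels" | grep -qxE "impl:${lib}:(done|failed)"; then
      n=$((n + 1))
    fi
  done
  echo "$n"
}

# Demo input: eight libraries done, one failed, plus an unrelated label.
# In a workflow the labels might instead come from something like
#   gh issue view "$ISSUE" --json labels --jq '.labels[].name'
demo_labels="spec-ready
impl:matplotlib:done
impl:seaborn:done
impl:plotly:done
impl:bokeh:done
impl:altair:done
impl:plotnine:done
impl:pygal:done
impl:highcharts:done
impl:letsplot:failed"

finished="$(printf '%s\n' "$demo_labels" | count_finished)"
echo "finished: $finished of 9"
# prints: finished: 9 of 9
```

If this approach were adopted, the closing comment would presumably need to distinguish failed libraries from merged ones rather than showing a check mark for all nine.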
Fixes #967
Summary
Changes
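Documentation (Issue Finding #4)
Updates the quality threshold from 85 to 90 in the documentation to match the actual workflow configuration in impl-review.yml.
Files:
- CLAUDE.md
- README.md
- docs/workflow.md
- docs/concepts/claude-skill-plot-generation.md
- prompts/quality-evaluator.md
CLI Retry Logic (Issue Finding #3)
Adds a retry mechanism for Claude CLI steps to handle intermittent "Executable not found in $PATH" errors. Each Claude step now has continue-on-error: true and a retry step that runs if the first attempt fails.
Files:
- spec-create.yml
- spec-update.yml
- impl-generate.yml
- impl-repair.yml
- util-claude.yml
Issue Lifecycle (Bonus)
- spec-ready issues now stay open until all 9 library implementations are merged
- Changed "Closes #..." to "Related to #..." in spec PR body
- Added auto-close in impl-merge.yml when all impl:{library}:done labels are present
Files:
- spec-create.yml
- impl-merge.yml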
Test Plan
impl:{library}:done