Skip to content

fix(workflows): improve robustness of impl-generate and impl-merge#1308

Merged
MarkusNeusinger merged 2 commits intomainfrom
fix/workflow-robustness
Dec 17, 2025
Merged

fix(workflows): improve robustness of impl-generate and impl-merge#1308
MarkusNeusinger merged 2 commits intomainfrom
fix/workflow-robustness

Conversation

@MarkusNeusinger
Copy link
Copy Markdown
Owner

Summary

Fixes several workflow issues discovered during batch processing of spec-ready issues.

Fix #1: Branch-Not-Found Errors

Problem: fatal: 'origin/implementation/{spec}/{library}' is not a commit errors when workflow tries to checkout a non-existent remote branch.

Solution: Check if remote branch exists before checkout, fall back to creating fresh branch from main.

Fix #2: Duplicate Labels

Problem: Issues accumulate both impl:X:pending and impl:X:failed labels when generation fails.

Solution: Failure handler removes both generate:X and impl:X:pending when marking as failed.

Fix #3: Auto-Close with Failures

Problem: Issues with 8 done + 1 failed stay OPEN because auto-close only triggers on 9 done.

Solution: Close when done + failed = 9, with appropriate summary (shows which libraries failed).

Fix #4: Generation Failure Tracking

Problem: When impl-generate fails (no plot.png), no PR is created → no review → no repair → library stays pending forever.

Solution: Track generation failures and mark as impl:X:failed after 3 consecutive failures. Posts comment explaining the library may not support this plot type.

Files Changed

Testing

  • YAML syntax validated
  • Logic reviewed against observed failure patterns

- Fix #1: Handle missing remote branches in impl-generate.yml
  - Check if branch exists before checkout to avoid 'not a commit' errors
  - Fall back to creating fresh branch from main if remote doesn't exist

- Fix #2: Clean up duplicate labels on failure
  - Remove both 'generate:X' and 'impl:X:pending' when marking as failed
  - Prevents label accumulation (e.g., both pending and failed)

- Fix #3: Auto-close issues when done + failed = 9
  - Previously only closed when all 9 were 'done'
  - Now closes when total (done + failed) reaches 9
  - Shows which libraries failed in closing comment

- Fix #4: Track generation failures and auto-mark as failed
  - Count previous failed runs for same spec/library
  - After 3 failures, mark as 'impl:X:failed' automatically
  - Posts failure comment explaining the library may not support this plot type
Copilot AI review requested due to automatic review settings December 17, 2025 17:53
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves the robustness of GitHub workflow automation for library implementation generation and merging. It addresses four key issues discovered during batch processing: branch checkout failures, duplicate label accumulation, premature issue closure logic, and untracked generation failures that leave libraries in a pending state indefinitely.

Key changes:

  • Adds remote branch existence check before checkout to prevent fatal errors when branches don't exist
  • Implements failure tracking with automatic marking as failed after 3 consecutive generation attempts
  • Updates issue auto-close logic to consider both completed and failed implementations (done + failed = 9)

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File Description
.github/workflows/impl-generate.yml Adds branch existence check before git checkout; implements failure tracking with retry logic and label cleanup after 3 failures
.github/workflows/impl-merge.yml Updates auto-close logic to count both done and failed implementations; generates dynamic status table showing which libraries succeeded/failed

Comment thread .github/workflows/impl-generate.yml Outdated
git checkout -B "$BRANCH" "origin/$BRANCH"

# Check if remote branch exists before checkout (fixes branch-not-found error)
if git ls-remote --exit-code origin "$BRANCH" >/dev/null 2>&1; then
Copy link

Copilot AI Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The git ls-remote command checks for branch existence, but the pattern matching may fail if BRANCH contains the full path (e.g., "implementation/spec-id/library"). The git ls-remote output includes refs/heads/ prefix, so you should either add "refs/heads/" prefix to the pattern or use "--heads" flag with the bare branch name. Without this, the check may incorrectly report that the branch doesn't exist.

Suggested change
if git ls-remote --exit-code origin "$BRANCH" >/dev/null 2>&1; then
if git ls-remote --exit-code --heads origin "$BRANCH" >/dev/null 2>&1; then

Copilot uses AI. Check for mistakes.
Comment thread .github/workflows/impl-merge.yml Outdated
Comment on lines +272 to +274
if echo "$DONE_LIBS" | grep -q "$lib"; then
TABLE="$TABLE\n| $lib | :white_check_mark: |"
elif echo "$FAILED_LIBS" | grep -q "$lib"; then
Copy link

Copilot AI Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The grep pattern matching for library names will produce false positives. For example, when checking for "plotly", the grep will also match "plotly" within "$DONE_LIBS" or "$FAILED_LIBS" even if the actual library was "plotly-extra". Use word boundaries or exact matching to ensure you're matching complete library names only.

Suggested change
if echo "$DONE_LIBS" | grep -q "$lib"; then
TABLE="$TABLE\n| $lib | :white_check_mark: |"
elif echo "$FAILED_LIBS" | grep -q "$lib"; then
if echo "$DONE_LIBS" | grep -w -q "$lib"; then
TABLE="$TABLE\n| $lib | :white_check_mark: |"
elif echo "$FAILED_LIBS" | grep -w -q "$lib"; then

Copilot uses AI. Check for mistakes.
| pygal | :white_check_mark: |
| highcharts | :white_check_mark: |
| letsplot | :white_check_mark: |
$(echo -e "$TABLE")
Copy link

Copilot AI Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The echo command with -e flag is used to render the markdown table, but the output is directly embedded in the gh issue comment body. Shell interpolation can cause issues with special characters in the TABLE variable. Consider using proper quoting or a different approach to build the comment body to avoid potential formatting issues.

Copilot uses AI. Check for mistakes.
Comment thread .github/workflows/impl-generate.yml Outdated
--workflow=impl-generate.yml \
--limit 20 \
--json conclusion,displayTitle,createdAt \
-q "[.[] | select(.conclusion == \"failure\") | select(.displayTitle | contains(\"$LIBRARY\"))] | length" \
Copy link

Copilot AI Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The failure counting logic uses a substring match on displayTitle that will incorrectly count failures for different specs. For example, if you're checking "matplotlib" failures for "scatter-basic", this will also count failures for "scatter-basic-color" or any other spec that runs matplotlib. The query should filter by both the library AND the spec ID to get an accurate failure count.

Suggested change
-q "[.[] | select(.conclusion == \"failure\") | select(.displayTitle | contains(\"$LIBRARY\"))] | length" \
-q "[.[] | select(.conclusion == \"failure\") | select(.displayTitle | (contains(\"$LIBRARY\") and contains(\"$SPEC_ID\")))] | length" \

Copilot uses AI. Check for mistakes.
Comment on lines +257 to +260
DONE_LIBS="$DONE_LIBS $lib"
elif echo "$LABELS" | grep -q "^impl:${lib}:failed$"; then
FAILED_COUNT=$((FAILED_COUNT + 1))
FAILED_LIBS="$FAILED_LIBS $lib"
Copy link

Copilot AI Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The DONE_LIBS and FAILED_LIBS variables accumulate library names with leading spaces (e.g., " matplotlib seaborn"). This works with the current grep usage but makes the code fragile. Consider either trimming the leading space or using an array-based approach for cleaner string handling.

Suggested change
DONE_LIBS="$DONE_LIBS $lib"
elif echo "$LABELS" | grep -q "^impl:${lib}:failed$"; then
FAILED_COUNT=$((FAILED_COUNT + 1))
FAILED_LIBS="$FAILED_LIBS $lib"
if [ -z "$DONE_LIBS" ]; then
DONE_LIBS="$lib"
else
DONE_LIBS="$DONE_LIBS $lib"
fi
elif echo "$LABELS" | grep -q "^impl:${lib}:failed$"; then
FAILED_COUNT=$((FAILED_COUNT + 1))
if [ -z "$FAILED_LIBS" ]; then
FAILED_LIBS="$lib"
else
FAILED_LIBS="$FAILED_LIBS $lib"
fi

Copilot uses AI. Check for mistakes.
- Use --heads flag for git ls-remote (more precise branch check)
- Filter failure count by both LIBRARY and SPEC_ID (avoid false counts)
- Use grep -w for word-boundary matching (prevent false positives)
@MarkusNeusinger MarkusNeusinger merged commit 3ffa4b2 into main Dec 17, 2025
6 checks passed
@MarkusNeusinger MarkusNeusinger deleted the fix/workflow-robustness branch December 17, 2025 18:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants