All critical and recommended fixes from the conflict assessment have been implemented to make parallel test execution safe and reliable.
Problem: All PR tests for the same AI type used identical title prefixes, causing false positives in parallel execution.
Solution: Made title prefixes unique per test type:
test-copilot-create-pull-request.md→[copilot-test-single-pr]test-copilot-create-two-pull-requests.md→[copilot-test-two-prs]test-claude-create-pull-request.md→[claude-test-single-pr]test-claude-create-two-pull-requests.md→[claude-test-two-prs]test-codex-create-pull-request.md→[codex-test-single-pr]test-codex-create-two-pull-requests.md→[codex-test-two-prs]test-copilot-nosandbox-create-pull-request.md→[copilot-nosandbox-test-single-pr]test-copilot-nosandbox-create-two-pull-requests.md→[copilot-nosandbox-test-two-prs]test-copilot-siderepo-create-pull-request.md→[copilot-test-single-pr]test-copilot-siderepo-create-two-pull-requests.md→[copilot-test-two-prs]
Impact:
- Each test type now searches for its own unique prefix
create-pull-requestwon't find PRs fromcreate-two-pull-requestscreate-two-pull-requestswon't count PRs fromcreate-pull-request- No more false positives from cross-test contamination
New Function: get_title_prefix()
get_title_prefix() {
local workflow_name="$1"
local ai_type="$2"
if [[ "$workflow_name" == *"create-two-pull-requests"* ]]; then
echo "[${ai_type}-test-two-prs] "
elif [[ "$workflow_name" == *"create-pull-request"* ]]; then
echo "[${ai_type}-test-single-pr] "
else
echo "[${ai_type}-test] "
fi
}Impact:
- Validation logic now dynamically determines correct prefix based on workflow type
- Centralized prefix logic - easy to update if needed
- Works for all AI types and variants (nosandbox, siderepo)
Problem: MCP validation used overly broad patterns that could match unrelated issues.
Old Validation:
# Fallback matched ANY issue with "time" or "Time" anywhere
select(.title or .body | contains("time") or contains("Time") or contains("timestamp"))New Validation:
# Primary: Requires both "Hello from" title AND MCP-specific content
select((.title | contains("Hello from")) and
((.body | contains("MCP time tool")) or
(.body | contains("get_current_time")) or
(.body | contains("current time is"))))
# Fallback: Still requires BOTH title pattern AND time content
select((.title | contains("Hello from")) and
((.body | contains("time")) or (.body | contains("Time"))))Impact:
- Much more specific matching - requires both title and content patterns
- Won't match random issues that happen to mention time
- Reduces false positives from unrelated test artifacts
Changed in Both Runners:
- Parallel runner (
run_tests_parallel()) - Sequential runner (
run_tests_sequential())
Before:
local title_prefix="[${ai_type}-test]"After:
local title_prefix=$(get_title_prefix "$workflow" "$ai_type")Locations Updated:
- create-issue validation
- create-discussion validation
- create-pull-request validation (now uses unique prefix!)
- create-two-pull-requests validation (now uses unique prefix!)
- gh-steps validation
Removed:
warning "Note: Parallel mode works best with workflow_dispatch tests. Complex issue/command/PR-triggered tests may be skipped."Reason: With conflicts fixed, parallel mode is now safe for workflow_dispatch tests. The warning was causing unnecessary concern.
✓ test-copilot-create-pull-request.md compiles successfully
✓ test-copilot-create-two-pull-requests.md compiles successfully✓ bash -n e2e.sh passes (no syntax errors)✓ ./e2e.sh --dry-run test-copilot-create-pull-request test-copilot-create-two-pull-requests
- Both tests detected correctly
- No errors in execution pathtest-copilot-create-pull-request → searches for "[copilot-test] *"
test-copilot-create-two-pull-requests → searches for "[copilot-test] *"
RESULT: Tests find each other's PRs → FALSE POSITIVES
test-copilot-create-pull-request → searches for "[copilot-test-single-pr] *"
test-copilot-create-two-pull-requests → searches for "[copilot-test-two-prs] *"
RESULT: Tests only find their own PRs → CORRECT VALIDATION
Remaining Considerations:
- Issue tests still share same prefix across different issue test types (but this is acceptable since they don't conflict in validation logic)
- MCP tests now have much better specificity
- gh-steps already had excellent uniqueness with run IDs
For maximum robustness, consider adding run IDs to all test titles:
---
Create an issue with title "[copilot-test] Test ${{ github.run_id }}: Hello from Copilot"
---Benefits:
- Absolute uniqueness even across test reruns
- Better traceability in logs
- Follows pattern already proven in gh-steps test
Add post-validation checks to detect potential conflicts:
# Count recent artifacts with same prefix
recent_count=$(gh pr list --limit 20 --json createdAt,title \
--jq ".[] | select(.createdAt > \"$(date -u -d '5 minutes ago' --iso-8601=seconds)\") |
select(.title | startswith(\"$prefix\")) | .number" | wc -l)
if [[ $recent_count -gt 1 ]]; then
warning "Found $recent_count PRs with prefix '$prefix' from last 5 min"
fiAdd timestamp filtering to validation functions:
validate_pr_created() {
local title_prefix="$1"
local since_time="${2:-5 minutes ago}"
gh pr list --limit 10 --json number,title,createdAt \
--jq ".[] | select(.createdAt > \"$(date -u -d \"$since_time\" --iso-8601=seconds)\") |
select(.title | startswith(\"$title_prefix\")) | .number"
}✅ All critical fixes implemented ✅ Parallel execution is now safe for workflow_dispatch tests ✅ Validation criteria are distinct and specific ✅ No more false positives from test cross-contamination
The parallel batch execution feature can now be used with confidence. Tests will correctly validate only their own artifacts and won't be confused by concurrent test execution.
Workflow Files (10):
- All
*-create-pull-request.mdfiles - All
*-create-two-pull-requests.mdfiles
Script Files (1):
e2e.sh- Addedget_title_prefix(), fixed MCP validation, updated all call sites
Documentation (3):
WORKFLOW_STATE_MANAGEMENT.md- Previously createdPARALLEL_TEST_CONFLICT_ASSESSMENT.md- Previously createdPARALLEL_FIXES_IMPLEMENTED.md- This document