Commit 8bd2fd3

Add LabelOps agentic workflows (conflict & CI auto-resolve) (#19629)
* Add LabelOps agentic workflows

Two opt-in, label-gated workflows for automated PR maintenance:

- labelops-pr-maintenance: scheduled every 3h, iterates open PRs carrying AI-Auto-Resolve-Conflicts or AI-Auto-Resolve-CI, shuffles for fairness seeded by GITHUB_RUN_ID, caps at 5 PRs per run. Per PR: triages CI first (pr-build-status skill), resolves conflicts second. Small mechanical CI fixes applied in-place with local verification. Proven flakes (flaky-test-detector ≥3 PRs) are delegated to the spinoff workflow. Unfixable CI escalates by adding AI-needs-CI-fix-input with a repro + up to 3 options. Labels are sticky — agent never removes any label.

- labelops-flake-fix: workflow_dispatch-only spinoff, dispatched from the babysitter via safe-outputs.dispatch-workflow. Re-verifies the flake, reproduces with a 20-iteration loop, prefers determinism fix else quarantines. Scoped to tests/** and vsintegration/tests/**.

Both workflows use protected-files: fallback-to-issue to prevent any .github/** modifications.

Ship with schedule enabled; dry-run via workflow_dispatch recommended before relying on scheduled runs.
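
The commit message describes a fairness shuffle seeded by GITHUB_RUN_ID with a cap of 5 PRs per run, but this page does not show how it is implemented. As a rough illustration only, the idea could be sketched as follows; `pick_prs` and its parameters are hypothetical names, and the real workflow may do this differently.

```python
import os
import random

MAX_PRS_PER_RUN = 5  # cap stated in the commit message


def pick_prs(open_prs, run_id, cap=MAX_PRS_PER_RUN):
    """Return up to `cap` PR numbers in an order determined by `run_id`.

    Hypothetical helper: a sketch of the idea, not the workflow's code.
    """
    ordered = sorted(open_prs)        # normalise input order so only the seed matters
    rng = random.Random(str(run_id))  # private RNG: same run id gives the same order
    rng.shuffle(ordered)
    return ordered[:cap]


if __name__ == "__main__":
    # GITHUB_RUN_ID is set by GitHub Actions; "0" is a local fallback (assumption).
    run_id = os.environ.get("GITHUB_RUN_ID", "0")
    print(pick_prs([19820, 19833, 19891, 19902, 19910, 19921], run_id))
```

Seeding from the run id makes a given run reproducible, while different runs start from different PRs, so no labelled PR is permanently stuck behind the cap.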
1 parent ea3438a commit 8bd2fd3

6 files changed

Lines changed: 2927 additions & 0 deletions

.github/aw/actions-lock.json

Lines changed: 5 additions & 0 deletions
@@ -15,6 +15,11 @@
       "version": "v0.71.1",
       "sha": "239aec45b78c8799417efdd5bc6d8cc036629ec1"
     },
+    "github/gh-aw-actions/setup@v0.68.3": {
+      "repo": "github/gh-aw-actions/setup",
+      "version": "v0.68.3",
+      "sha": "ba90f2186d7ad780ec640f364005fa24e797b360"
+    },
     "github/gh-aw/actions/setup@v0.67.2": {
       "repo": "github/gh-aw/actions/setup",
       "version": "v0.67.2",

.github/workflows/labelops-flake-fix.lock.yml

Lines changed: 1315 additions & 0 deletions
Some generated files are not rendered by default.
Lines changed: 134 additions & 0 deletions
@@ -0,0 +1,134 @@
---
description: |
  LabelOps spinoff — fixes proven flaky tests.
  Dispatched by labelops-pr-maintenance when a failing test has evidence
  on ≥3 distinct unrelated PRs. Re-verifies the flake, reproduces locally,
  and either lands a determinism fix or quarantines with a skip marker and
  tracking issue. If root cause is shared across multiple tests, fix them
  all together. May include small, non-invasive product-code fixes when the
  root cause lives outside test files. Opens one PR, comments on originator.

on:
  workflow_dispatch:
    inputs:
      failing_test:
        description: "Fully qualified test name or display name (e.g. FSharp.Compiler.Tests.Foo.Bar or ``backtick name``)"
        required: true
        type: string
      affected_prs:
        description: "JSON array of PR numbers where this test failed (e.g. [19820, 19833, 19891])"
        required: true
        type: string
      originating_pr:
        description: "PR number that triggered this spinoff"
        required: true
        type: string

timeout-minutes: 60

permissions: read-all

concurrency:
  group: labelops-flake-fix-${{ inputs.failing_test }}
  cancel-in-progress: false

network:
  allowed:
    - defaults
    - dotnet
    - dev.azure.com

checkout:
  ref: main
  fetch-depth: 0

tools:
  github:
    toolsets: [default, issues, pull_requests, repos, actions]
    min-integrity: none
  bash: true

safe-outputs:
  create-pull-request:
    title-prefix: "[LabelOps Flake] "
    labels: [automation, Flaky, NO_RELEASE_NOTES]
    draft: false
    max: 1
    protected-files: fallback-to-issue
  add-comment:
    target: "*"
    max: 1
  create-issue:
    title-prefix: "[LabelOps Flake] "
    labels: [Flaky, automation]
    max: 1
---

# LabelOps — Flake Fixer

You fix proven flaky tests. You were dispatched by `labelops-pr-maintenance` after it saw `${{ inputs.failing_test }}` failing across `${{ inputs.affected_prs }}` (≥3 distinct PRs).

## Hard rules

1. **Never modify `.github/**`.** Protected by `fallback-to-issue`.
2. **Re-verify before acting.** If the flake can't be re-confirmed, `noop` and exit.
3. **One PR per invocation.**
4. **Never rebase, force-push, amend, squash, or `git add .`.** Commit explicit paths only.
5. **Don't quarantine a test that was introduced by the originating PR or any open PR in `affected_prs`.** That would defeat the PR's purpose — `noop` + comment instead.
6. **If unsure, `noop`.** Better to skip than guess.
7. **Prefix comments with `🤖 *LabelOps Flake — <subtopic>.*`**
8. **Fix co-located tests.** If the same root cause affects other tests, fix them all.
9. **Small product-code fixes are allowed** when the root cause lives outside `tests/`. Keep changes minimal. If non-trivial, quarantine instead.

## Step 0 — Validate inputs

```bash
set -euo pipefail

# affected_prs must parse as a JSON array of positive integers
echo '${{ inputs.affected_prs }}' | python3 -c '
import json, sys
v = json.loads(sys.stdin.read())
assert isinstance(v, list) and all(isinstance(x, int) and x > 0 for x in v), "bad affected_prs"
'

# originating_pr must be a positive integer
if ! [[ "${{ inputs.originating_pr }}" =~ ^[1-9][0-9]*$ ]]; then
  echo "::error::originating_pr must be a positive integer."
  exit 1
fi
```

If any check fails, exit.
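
As an illustration that is not part of the committed file, the same two checks can be written as standalone Python functions. One detail worth noting: in Python `bool` is a subclass of `int`, so the inline `isinstance(x, int) and x > 0` test accepts a payload such as `[true]`. The sketch below rejects it explicitly. The function names `parse_affected_prs` and `parse_pr_number` are hypothetical.

```python
import json


def parse_affected_prs(raw):
    """Parse `raw` as a JSON array of positive integers, else raise ValueError."""
    try:
        value = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"affected_prs is not valid JSON: {exc}") from exc
    if not isinstance(value, list):
        raise ValueError("affected_prs must be a JSON array")
    for item in value:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(item, bool) or not isinstance(item, int) or item <= 0:
            raise ValueError(f"bad PR number: {item!r}")
    return value


def parse_pr_number(raw):
    """Accept the same strings as the shell regex ^[1-9][0-9]*$."""
    if not raw or not raw.isascii() or not raw.isdigit() or raw[0] == "0":
        raise ValueError("originating_pr must be a positive integer")
    return int(raw)


if __name__ == "__main__":
    print(parse_affected_prs("[19820, 19833, 19891]"))
    print(parse_pr_number("19820"))
```

Raising `ValueError` with a specific message, rather than relying on a bare `assert`, also keeps the check active when Python runs with `-O`, which strips assertions.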

## Step 1 — Re-verify

Run `flaky-test-detector` with the test name. Require evidence across ≥3 of `affected_prs`. If not confirmed, comment on originating PR: `🤖 *LabelOps Flake — not reproducible.* No recent failures for <test>. No action taken.` Then `noop`.
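
As an illustration that is not part of the committed file, the "≥3 of `affected_prs`" rule amounts to a set-overlap check. The output format of `flaky-test-detector` is not shown here, so the sketch assumes it can be reduced to a list of PR numbers on which the test was seen failing; `confirmed_flake` and `prs_with_failure` are hypothetical names.

```python
def confirmed_flake(prs_with_failure, affected_prs, threshold=3):
    """Return True when the test failed on at least `threshold` distinct
    PRs that also appear in `affected_prs`.

    `prs_with_failure` is an assumed shape for the detector's evidence.
    """
    overlap = set(prs_with_failure) & set(affected_prs)  # sets drop duplicate sightings
    return len(overlap) >= threshold


if __name__ == "__main__":
    print(confirmed_flake([19820, 19833, 19891, 20001], [19820, 19833, 19891]))
```

Using sets means repeated failures on the same PR count once, which matches the "distinct PRs" wording in the workflow description.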

## Step 2 — Reproduce locally

Find the containing test project and run the test in a loop (up to 20 iterations, 15min cap).

- `0/N` failures but ≥3 PRs showed it → race doesn't trigger locally; prefer quarantine.
- `1–(N-1)/N` → classic non-determinism; prefer determinism fix.
- `N/N` → hard failure, not a flake. `noop` + comment.

Before proceeding: check if the originating PR introduced/modified this test (`gh pr diff`). If so, `noop` + comment.
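
As an illustration that is not part of the committed file, the loop and the three outcome rules of Step 2 can be sketched as below. `run_once` is a hypothetical callable that runs the test a single time and returns `True` on a pass; how the test is actually invoked is left open. The returned strings are labels chosen for this sketch.

```python
import time


def run_loop(run_once, max_iterations=20, time_cap_s=15 * 60, clock=time.monotonic):
    """Run `run_once` until the iteration or time cap is hit.

    Returns (failures, iterations). The time cap is only checked between
    iterations, so a single slow run can overshoot it.
    """
    start = clock()
    failures = iterations = 0
    while iterations < max_iterations and clock() - start < time_cap_s:
        iterations += 1
        if not run_once():
            failures += 1
    return failures, iterations


def classify_local_repro(failures, iterations):
    """Map a failures/iterations result onto the three outcomes listed in Step 2."""
    if iterations <= 0 or not 0 <= failures <= iterations:
        raise ValueError("need at least one iteration and 0 <= failures <= iterations")
    if failures == 0:
        return "quarantine"        # 0/N: race does not trigger locally
    if failures == iterations:
        return "noop"              # N/N: hard failure, not a flake
    return "determinism-fix"       # 1..N-1 of N: classic non-determinism


if __name__ == "__main__":
    outcomes = iter([True, False, True, True])
    failures, n = run_loop(lambda: next(outcomes), max_iterations=4)
    print(failures, n, classify_local_repro(failures, n))
```

The same `run_loop` could be reused for the post-fix check in Step 3, where Option A requires `0/20` failures before the PR is opened.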

## Step 3 — Fix

If the root cause affects other tests in the same area, fix them all in the same PR.

**Option A — Determinism fix** (preferred): fix the root cause. Re-run the 20-iteration loop, require `0/20`. Title: `[LabelOps Flake] Fix <short name> determinism`

**Option B — Quarantine** (when fix is non-trivial): create a tracking issue with evidence, add skip marker referencing the issue. Title: `[LabelOps Flake] Quarantine <short name>`

## Step 4 — Open PR

Use `create-pull-request`. Brief body: evidence table, local reproduction stats, fix strategy, link to originating PR.

## Step 5 — Comment on originating PR

```
🤖 *LabelOps Flake — dispatched.* Opened #<new-pr> to address `<test>`. Re-run checks once it merges.
```
