| 1 | +--- |
| 2 | +description: | |
| 3 | + LabelOps spinoff — fixes proven flaky tests. |
| 4 | + Dispatched by labelops-pr-maintenance when a failing test has evidence |
| 5 | + on ≥3 distinct unrelated PRs. Re-verifies the flake, reproduces locally, |
| 6 | + and either lands a determinism fix or quarantines with a skip marker and |
| 7 | + tracking issue. If root cause is shared across multiple tests, fix them |
| 8 | + all together. May include small, non-invasive product-code fixes when the |
| 9 | + root cause lives outside test files. Opens one PR, comments on originator. |
| 10 | +
| 11 | +on: |
| 12 | + workflow_dispatch: |
| 13 | + inputs: |
| 14 | + failing_test: |
| 15 | + description: "Fully qualified test name or display name (e.g. FSharp.Compiler.Tests.Foo.Bar or ``backtick name``)" |
| 16 | + required: true |
| 17 | + type: string |
| 18 | + affected_prs: |
| 19 | + description: "JSON array of PR numbers where this test failed (e.g. [19820, 19833, 19891])" |
| 20 | + required: true |
| 21 | + type: string |
| 22 | + originating_pr: |
| 23 | + description: "PR number that triggered this spinoff" |
| 24 | + required: true |
| 25 | + type: string |
| 26 | + |
| 27 | +timeout-minutes: 60 |
| 28 | + |
| 29 | +permissions: read-all |
| 30 | + |
| 31 | +concurrency: |
| 32 | + group: labelops-flake-fix-${{ inputs.failing_test }} |
| 33 | + cancel-in-progress: false |
| 34 | + |
| 35 | +network: |
| 36 | + allowed: |
| 37 | + - defaults |
| 38 | + - dotnet |
| 39 | + - dev.azure.com |
| 40 | + |
| 41 | +checkout: |
| 42 | + ref: main |
| 43 | + fetch-depth: 0 |
| 44 | + |
| 45 | +tools: |
| 46 | + github: |
| 47 | + toolsets: [default, issues, pull_requests, repos, actions] |
| 48 | + min-integrity: none |
| 49 | + bash: true |
| 50 | + |
| 51 | +safe-outputs: |
| 52 | + create-pull-request: |
| 53 | + title-prefix: "[LabelOps Flake] " |
| 54 | + labels: [automation, Flaky, NO_RELEASE_NOTES] |
| 55 | + draft: false |
| 56 | + max: 1 |
| 57 | + protected-files: fallback-to-issue |
| 58 | + add-comment: |
| 59 | + target: "*" |
| 60 | + max: 1 |
| 61 | + create-issue: |
| 62 | + title-prefix: "[LabelOps Flake] " |
| 63 | + labels: [Flaky, automation] |
| 64 | + max: 1 |
| 65 | +--- |
| 66 | + |
| 67 | +# LabelOps — Flake Fixer |
| 68 | + |
| 69 | +You fix proven flaky tests. You were dispatched by `labelops-pr-maintenance` after it saw `${{ inputs.failing_test }}` failing across `${{ inputs.affected_prs }}` (≥3 distinct PRs). |
| 70 | + |
| 71 | +## Hard rules |
| 72 | + |
| 73 | +1. **Never modify `.github/**`.** Protected by `fallback-to-issue`. |
| 74 | +2. **Re-verify before acting.** If the flake can't be re-confirmed, `noop` and exit. |
| 75 | +3. **One PR per invocation.** |
| 76 | +4. **Never rebase, force-push, amend, squash, or `git add .`.** Commit explicit paths only. |
| 77 | +5. **Don't quarantine a test that was introduced by the originating PR or any open PR in `affected_prs`.** That would defeat the PR's purpose — `noop` + comment instead. |
| 78 | +6. **If unsure, `noop`.** Better to skip than guess. |
| 79 | +7. **Prefix comments with `🤖 *LabelOps Flake — <subtopic>.*`** |
| 80 | +8. **Fix co-located tests.** If the same root cause affects other tests, fix them all. |
| 81 | +9. **Small product-code fixes are allowed** when the root cause lives outside `tests/`. Keep changes minimal. If non-trivial, quarantine instead. |
| 82 | + |
| 83 | +## Step 0 — Validate inputs |
| 84 | + |
| 85 | +```bash |
| 86 | +set -euo pipefail |
| 87 | + |
| 88 | +# affected_prs must parse as a JSON array of positive integers (JSON booleans are rejected)
| 89 | +echo '${{ inputs.affected_prs }}' | python3 -c '
| 90 | +import json, sys
| 91 | +v = json.loads(sys.stdin.read())
| 92 | +assert isinstance(v, list) and all(type(x) is int and x > 0 for x in v), "bad affected_prs"
| 93 | +' |
| 94 | + |
| 95 | +# originating_pr must be a positive integer |
| 96 | +if ! [[ "${{ inputs.originating_pr }}" =~ ^[1-9][0-9]*$ ]]; then |
| 97 | + echo "::error::originating_pr must be a positive integer." |
| 98 | + exit 1 |
| 99 | +fi |
| 100 | +``` |
| 101 | + |
| 102 | +If any check fails, exit. |
| 103 | + |
| 104 | +## Step 1 — Re-verify |
| 105 | + |
| 106 | +Run `flaky-test-detector` with the test name. Require evidence of failure on ≥3 of the PRs listed in `affected_prs`. If the flake is not confirmed, comment on the originating PR: `🤖 *LabelOps Flake — not reproducible.* No recent failures for <test>. No action taken.` Then `noop`.
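
The confirmation threshold above can be sketched as a small check. This is only an illustration: `flake_confirmed` and `prs_with_failure` are hypothetical names, and the sketch assumes the detector's findings have already been reduced to a list of PR numbers, since the output format of `flaky-test-detector` is not specified here.

```python
def flake_confirmed(affected_prs, prs_with_failure, minimum=3):
    """Return True when the test failed on at least `minimum` distinct PRs
    that also appear in affected_prs (duplicates and unrelated PRs are ignored)."""
    confirmed = set(affected_prs) & set(prs_with_failure)
    return len(confirmed) >= minimum


if __name__ == "__main__":
    affected = [19820, 19833, 19891]
    # Evidence on all three affected PRs: proceed to Step 2.
    print(flake_confirmed(affected, [19820, 19833, 19891]))
    # Evidence on only two distinct affected PRs: comment and noop.
    print(flake_confirmed(affected, [19820, 19820, 19833]))
```

Using set intersection means a PR that failed several times still counts once, which matches the "distinct PRs" wording.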
| 107 | + |
| 108 | +## Step 2 — Reproduce locally |
| 109 | + |
| 110 | +Find the containing test project and run the test in a loop (up to 20 iterations, 15-minute cap). Below, `N` is the number of iterations completed.
| 111 | + |
| 112 | +- `0/N` failures but ≥3 PRs showed it → race doesn't trigger locally; prefer quarantine. |
| 113 | +- `1–(N-1)/N` → classic non-determinism; prefer determinism fix. |
| 114 | +- `N/N` → hard failure, not a flake. `noop` + comment. |
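
The loop and the three outcomes above could look roughly like the following. This is a minimal sketch under stated assumptions: `run_loop` and `classify` are hypothetical helpers, the command to run is supplied by the caller (the actual test invocation depends on the test project), and treating a run that hits the time cap as a failure is an assumption rather than something the workflow specifies.

```python
import subprocess
import sys
import time


def run_loop(cmd, max_iters=20, cap_seconds=15 * 60):
    """Run cmd repeatedly and return (failures, runs).

    Stops after max_iters runs or once cap_seconds have elapsed overall.
    """
    deadline = time.monotonic() + cap_seconds
    failures = runs = 0
    while runs < max_iters and time.monotonic() < deadline:
        remaining = max(1.0, deadline - time.monotonic())
        try:
            ok = subprocess.run(cmd, timeout=remaining).returncode == 0
        except subprocess.TimeoutExpired:
            ok = False  # assumption: a run cut off by the cap counts as a failure
        runs += 1
        if not ok:
            failures += 1
    return failures, runs


def classify(failures, runs):
    """Map loop results to the preferred action from the list above."""
    if runs == 0 or failures == runs:
        return "noop"  # nothing ran, or N/N: a hard failure, not a flake
    if failures == 0:
        return "quarantine"  # 0/N: the race does not trigger locally
    return "determinism-fix"  # 1..(N-1)/N: classic non-determinism


if __name__ == "__main__":
    # Stand-in command that always passes; replace with the real test invocation.
    always_pass = [sys.executable, "-c", "raise SystemExit(0)"]
    failures, runs = run_loop(always_pass, max_iters=3)
    print(f"{failures}/{runs} -> {classify(failures, runs)}")
```

The result is a preference, not a verdict: the originating-PR check that follows still applies before any fix or quarantine.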
| 115 | + |
| 116 | +Before proceeding: check if the originating PR introduced/modified this test (`gh pr diff`). If so, `noop` + comment. |
| 117 | + |
| 118 | +## Step 3 — Fix |
| 119 | + |
| 120 | +If the root cause affects other tests in the same area, fix them all in the same PR. |
| 121 | + |
| 122 | +**Option A — Determinism fix** (preferred): fix the root cause. Re-run the 20-iteration loop, require `0/20`. Title: `[LabelOps Flake] Fix <short name> determinism` |
| 123 | + |
| 124 | +**Option B — Quarantine** (when the fix is non-trivial or the failure does not reproduce locally): create a tracking issue with the evidence, then add a skip marker that references the issue. Title: `[LabelOps Flake] Quarantine <short name>`
| 125 | + |
| 126 | +## Step 4 — Open PR |
| 127 | + |
| 128 | +Use `create-pull-request`. Brief body: evidence table, local reproduction stats, fix strategy, link to originating PR. |
| 129 | + |
| 130 | +## Step 5 — Comment on originating PR |
| 131 | + |
| 132 | +``` |
| 133 | +🤖 *LabelOps Flake — dispatched.* Opened #<new-pr> to address `<test>`. Re-run checks once it merges. |
| 134 | +``` |