Commit f1932c9

ci: Add automatic flaky test detector (#18684)
Manually checking for flakes and opening issues is tedious, so this adds a CI workflow step to automate it. The action only runs when merging to develop. It could also run on PRs, but that seems unnecessarily complicated. The reasoning is that a push to develop can only happen after all the tests have passed in the original PR, so if a test then fails on develop we know it's a flake.

I'm open to ideas, improvements, and cleanups. Let me know if I'm missing any cases that could lead to false positives.

Example issue created with this: #18693. It doesn't capture all the details, but the most important part is a link to the run so we can investigate further.

The logic for creating the issues is a bit ugly, but I'm not sure it can be made cleaner, given that I want to create one issue per failed test rather than dump everything into one issue.
1 parent 05ab207 commit f1932c9
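
The gating rule described in the commit message can be sketched as a small predicate. This is only an illustration: `shouldReportFlakes` is a hypothetical helper name, and the sketch assumes `needsResults` holds the result strings that `needs.*.result` would expand to in the workflow expression.

```javascript
// Hypothetical helper mirroring the step condition used in this commit:
//   github.ref == 'refs/heads/develop' && contains(needs.*.result, 'failure')
// Assumption: needsResults is an array such as ['success', 'failure'].
function shouldReportFlakes(ref, needsResults) {
  // Only pushes to develop count, because the PR run already passed there.
  const isDevelop = ref === 'refs/heads/develop';
  // At least one dependent job must have failed outright.
  const hasFailure = needsResults.includes('failure');
  return isDevelop && hasFailure;
}

console.log(shouldReportFlakes('refs/heads/develop', ['success', 'failure'])); // true
console.log(shouldReportFlakes('refs/pull/1/merge', ['failure'])); // false
console.log(shouldReportFlakes('refs/heads/develop', ['success', 'cancelled'])); // false
```

Cancelled jobs are deliberately not treated as flakes here, which matches the condition in the diff: only a `failure` result triggers issue creation.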

2 files changed: +102 -0 lines changed
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+---
+title: '[Flaky CI]: {{ env.JOB_NAME }}'
+labels: Tests
+---
+
+### Flakiness Type
+
+Other / Unknown
+
+### Name of Job
+
+{{ env.JOB_NAME }}
+
+### Name of Test
+
+_Not available - check the run link for details_
+
+### Link to Test Run
+
+{{ env.RUN_LINK }}
+
+---
+
+_This issue was automatically created._
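
To make the template's role concrete, here is a minimal sketch of how such a file could be split into frontmatter and body and have its `{{ env.NAME }}` placeholders filled. It is not the code from this commit: `renderTemplate` is a hypothetical helper, the inline template is an abbreviated copy of the file above, and the sketch uses a single regex with a replacer function (so characters such as `$` in a job name are inserted literally) instead of building one `RegExp` per variable.

```javascript
// Abbreviated inline copy of the template so the sketch runs on its own.
const sampleTemplate = [
  '---',
  "title: '[Flaky CI]: {{ env.JOB_NAME }}'",
  'labels: Tests',
  '---',
  '',
  '### Name of Job',
  '',
  '{{ env.JOB_NAME }}',
  '',
  '### Link to Test Run',
  '',
  '{{ env.RUN_LINK }}',
  '',
].join('\n');

// Hypothetical helper: returns the rendered issue title and body.
function renderTemplate(source, vars) {
  // Split "---\n<frontmatter>\n---\n<body>" into its two parts.
  const match = source.match(/^---\n([\s\S]*?)\n---\n([\s\S]*)$/);
  if (!match) throw new Error('Template is missing its frontmatter block');
  const [, frontmatter, bodyTemplate] = match;

  const titleMatch = frontmatter.match(/title:\s*'(.*)'/);
  if (!titleMatch) throw new Error('Frontmatter has no quoted title');

  // Replace every {{ env.KEY }}; unknown keys are left untouched.
  const fill = (text) =>
    text.replace(/\{\{\s*env\.(\w+)\s*\}\}/g, (placeholder, key) =>
      Object.prototype.hasOwnProperty.call(vars, key) ? vars[key] : placeholder);

  return { title: fill(titleMatch[1]), body: fill(bodyTemplate).trim() };
}

const rendered = renderTemplate(sampleTemplate, {
  JOB_NAME: 'Node 20 Tests',
  RUN_LINK: 'https://example.invalid/run/1',
});
console.log(rendered.title); // [Flaky CI]: Node 20 Tests
```

Failing loudly when the frontmatter or title is missing is a design choice of this sketch; it turns a malformed template into a readable error instead of a destructuring failure.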

.github/workflows/build.yml

Lines changed: 78 additions & 0 deletions
@@ -1192,7 +1192,85 @@ jobs:
     # Always run this, even if a dependent job failed
     if: always()
     runs-on: ubuntu-24.04
+    permissions:
+      issues: write
     steps:
+      - name: Check out current commit
+        if: github.ref == 'refs/heads/develop' && contains(needs.*.result, 'failure')
+        uses: actions/checkout@v6
+        with:
+          sparse-checkout: .github
+
+      - name: Create issues for failed jobs
+        if: github.ref == 'refs/heads/develop' && contains(needs.*.result, 'failure')
+        uses: actions/github-script@v7
+        with:
+          script: |
+            const fs = require('fs');
+
+            // Fetch actual job details from the API to get descriptive names
+            const jobs = await github.paginate(github.rest.actions.listJobsForWorkflowRun, {
+              owner: context.repo.owner,
+              repo: context.repo.repo,
+              run_id: context.runId,
+              per_page: 100
+            });
+
+            const failedJobs = jobs.filter(job => job.conclusion === 'failure');
+
+            if (failedJobs.length === 0) {
+              console.log('No failed jobs found');
+              return;
+            }
+
+            // Read and parse template
+            const template = fs.readFileSync('.github/FLAKY_CI_FAILURE_TEMPLATE.md', 'utf8');
+            const [, frontmatter, bodyTemplate] = template.match(/^---\n([\s\S]*?)\n---\n([\s\S]*)$/);
+
+            // Get existing open issues with Tests label
+            const existing = await github.paginate(github.rest.issues.listForRepo, {
+              owner: context.repo.owner,
+              repo: context.repo.repo,
+              state: 'open',
+              labels: 'Tests',
+              per_page: 100
+            });
+
+            for (const job of failedJobs) {
+              const jobName = job.name;
+              const jobUrl = job.html_url;
+
+              // Replace template variables
+              const vars = {
+                'JOB_NAME': jobName,
+                'RUN_LINK': jobUrl
+              };
+
+              let title = frontmatter.match(/title:\s*'(.*)'/)[1];
+              let issueBody = bodyTemplate;
+              for (const [key, value] of Object.entries(vars)) {
+                const pattern = new RegExp(`\\{\\{\\s*env\\.${key}\\s*\\}\\}`, 'g');
+                title = title.replace(pattern, value);
+                issueBody = issueBody.replace(pattern, value);
+              }
+
+              const existingIssue = existing.find(i => i.title === title);
+
+              if (existingIssue) {
+                console.log(`Issue already exists for ${jobName}: #${existingIssue.number}`);
+                continue;
+              }
+
+              const newIssue = await github.rest.issues.create({
+                owner: context.repo.owner,
+                repo: context.repo.repo,
+                title: title,
+                body: issueBody.trim(),
+                labels: ['Tests']
+              });
+              console.log(`Created issue #${newIssue.data.number} for ${jobName}`);
+            }
+
       - name: Check for failures
         if: cancelled() || contains(needs.*.result, 'failure') || contains(needs.*.result, 'cancelled')
         run: |
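
The one-issue-per-failed-job deduplication in the script above can be isolated as a pure function, which makes it easy to reason about without calling the GitHub API. The sketch below is an illustration under stated assumptions, not the committed code: `selectJobsNeedingIssues` is a hypothetical name, the title format is hard-coded from the template, and two behaviours are additions that the diff does not have. It skips entries carrying a `pull_request` key (the REST issues listing also returns pull requests), and it records titles as it goes so that two failed jobs with the same name yield a single issue.

```javascript
// Hypothetical pure helper: decide which failed jobs still need an issue.
// jobs:       objects shaped like the workflow-run jobs listing
//             ({ name, conclusion, html_url })
// openIssues: objects shaped like the repository issues listing
//             ({ title, pull_request? })
function selectJobsNeedingIssues(jobs, openIssues) {
  // Titles already taken by open issues; pull requests are ignored (assumption).
  const takenTitles = new Set(
    openIssues.filter((issue) => !issue.pull_request).map((issue) => issue.title)
  );

  const toCreate = [];
  for (const job of jobs) {
    if (job.conclusion !== 'failure') continue;
    const title = `[Flaky CI]: ${job.name}`;
    if (takenTitles.has(title)) continue; // an open issue already covers this job
    takenTitles.add(title); // guard against duplicate job names in one run
    toCreate.push({ title, runLink: job.html_url });
  }
  return toCreate;
}

const sampleJobs = [
  { name: 'Node 20 Tests', conclusion: 'failure', html_url: 'https://example.invalid/job/1' },
  { name: 'Lint', conclusion: 'success', html_url: 'https://example.invalid/job/2' },
  { name: 'Browser Tests', conclusion: 'failure', html_url: 'https://example.invalid/job/3' },
];
const sampleOpenIssues = [{ number: 7, title: '[Flaky CI]: Node 20 Tests' }];

console.log(selectJobsNeedingIssues(sampleJobs, sampleOpenIssues).map((entry) => entry.title));
// [ '[Flaky CI]: Browser Tests' ]
```

Matching on the exact title keeps the check simple, at the cost of opening a new issue whenever a job is renamed.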
