|
| 1 | +# Build failure analysis — CI run 24703204267 |
| 2 | + |
| 3 | +Investigation of the failing Linux CI run triggered by the merge of PR [#1438 "fix: add pr comment after the build with download links"](https://github.com/espressif/idf-eclipse-plugin/pull/1438) on `master` (`0acc4deaf720cc1b46af6be5608d921842756194`, 2026-04-21). |
| 4 | + |
| 5 | +Failing job: [`build_linux` (ID 72250690470)](https://github.com/espressif/idf-eclipse-plugin/actions/runs/24703204267/job/72250690470). `build_macos` on the same run passed. |
| 6 | + |
| 7 | +## TL;DR |
| 8 | + |
| 9 | +The PR itself did not break the build. PR #1438 only touched `.github/workflows/pr-comment.yml` (a post-build commenter workflow). Nothing in the main `ci.yml` workflow, the plugin source, or the tests was modified. |
| 10 | + |
| 11 | +The Linux job failed because **two UI tests in `NewEspressifIDFProjectClangFilesTest` failed on the self-hosted Linux runner**, and the `phoenix-actions/test-reporting@v12` step is configured with `fail-on-error: true`, which converts any failed JUnit result into a job failure (`Process completed with exit code 1.`). |
| 12 | + |
| 13 | +The same test class passes on the macOS runner, so this is a Linux-runner-specific flakiness in a pair of cascading SWTBot test methods that depend on timing and UI state carried over between methods. |
| 14 | + |
| 15 | +## Failing tests |
| 16 | + |
| 17 | +Both failures come from `tests/com.espressif.idf.ui.test/target/surefire-reports/TEST-com.espressif.idf.ui.test.executable.cases.project.NewEspressifIDFProjectClangFilesTest.xml`, extracted from the check-run annotations on the `Linux Test Reports` job (ID 72279770001). |
| 18 | + |
| 19 | +### 1. `shouldMatchExpectedClangdArgumentsAfterBuildingProjects` — `org.junit.ComparisonFailure` |
| 20 | + |
| 21 | +``` |
| 22 | +org.junit.ComparisonFailure: expected:<...rk/workspace/Project[1]/build> but was:<...rk/workspace/Project[2]/build> |
| 23 | + at org.junit.Assert.assertEquals(Assert.java:117) |
| 24 | + at org.junit.Assert.assertEquals(Assert.java:146) |
| 25 | + at ...NewEspressifIDFProjectClangFilesTest$Fixture.thenCompareActualClangdArgumentWithExpected(NewEspressifIDFProjectClangFilesTest.java:356) |
| 26 | + at ...NewEspressifIDFProjectClangFilesTest$Fixture.thenCheckClangdArgumentAfterProjectBuilt(NewEspressifIDFProjectClangFilesTest.java:142) |
| 27 | + at ...NewEspressifIDFProjectClangFilesTest.shouldMatchExpectedClangdArgumentsAfterBuildingProjects(NewEspressifIDFProjectClangFilesTest.java:111) |
| 28 | +``` |
| 29 | + |
| 30 | +The expected vs actual paths correspond to `/opt/actions-runner/_work/workspace/Project1/build` and `.../Project2/build` respectively (the `...rk` prefix is JUnit's diff-trim of `/opt/actions-runner/_work`). |
| 31 | + |
| 32 | +### 2. `shouldRecreateClangdFileAfterDeletionAndVerifyContent` — `WidgetNotFoundException` |
| 33 | + |
| 34 | +``` |
| 35 | +org.eclipse.swtbot.swt.finder.exceptions.WidgetNotFoundException: Timed out waiting for tree item Project1 |
| 36 | + at org.eclipse.swtbot.swt.finder.widgets.SWTBotTree.getTreeItem(SWTBotTree.java:577) |
| 37 | + at ...NewEspressifIDFProjectClangFilesTest$Fixture.whenClangdFileDeleted(NewEspressifIDFProjectClangFilesTest.java:168) |
| 38 | + at ...NewEspressifIDFProjectClangFilesTest.shouldRecreateClangdFileAfterDeletionAndVerifyContent(NewEspressifIDFProjectClangFilesTest.java:81) |
| 39 | +Caused by: TimeoutException: Timeout after: 5000 ms.: Could not find node with text Project1 |
| 40 | +``` |
| 41 | + |
| 42 | +## Root cause |
| 43 | + |
| 44 | +### Failure #1 — stale clangd `--compile-commands-dir=` preference |
| 45 | + |
| 46 | +Method execution order is `@FixMethodOrder(MethodSorters.NAME_ASCENDING)`, so the sequence is: |
| 47 | + |
| 48 | +1. `shouldApplyClangFormatSettingsWhenAutoSaveIsEnabled` (uses `Project2`) |
| 49 | +2. `shouldHaveClangFilesPresentAndContentCorrectForNewProject` (uses `Project1`) |
| 50 | +3. `shouldMatchExpectedClangdArgumentsAfterBuildingProjects` ← **fails here** |
| 51 | +4. `shouldRecreateClangdFileAfterDeletionAndVerifyContent` ← cascading failure |
| 52 | + |
| 53 | +Inside test #3, the relevant lines are: |
| 54 | + |
| 55 | +```105:113:tests/com.espressif.idf.ui.test/src/com/espressif/idf/ui/test/executable/cases/project/NewEspressifIDFProjectClangFilesTest.java |
| 56 | + public void shouldMatchExpectedClangdArgumentsAfterBuildingProjects() throws Exception |
| 57 | + { |
| 58 | + Fixture.whenProjectIsBuiltUsingContextMenu(CLEAN_PROJECT2); |
| 59 | + Fixture.thenCheckClangdArgumentAfterProjectBuilt(CLEAN_PROJECT2); |
| 60 | + Fixture.whenSelectProjectInLaunchConfig(); |
| 61 | + Fixture.whenProjectIsBuiltUsingContextMenu(CLEAN_PROJECT1); |
| 62 | + Fixture.thenCheckClangdArgumentAfterProjectBuilt(CLEAN_PROJECT1); |
| 63 | + Fixture.thenClangdDriversUpdateOnSelectedTarget(); |
| 64 | + } |
| 65 | +``` |
| 66 | + |
| 67 | +The assertion verifies the value displayed in the _clangd Additional arguments_ preference textbox under Window → Preferences → C/C++ → Editor (LSP) → clangd. That preference is a **workspace-scope** (`InstanceScope`) singleton, not per-project: |
| 68 | + |
| 69 | +```79:93:bundles/com.espressif.idf.core/src/com/espressif/idf/core/util/LspService.java |
| 70 | + public void updateCompileCommandsDir(String buildDir) |
| 71 | + { |
| 72 | + String qualifier = configuration.qualifier(); |
| 73 | + String identifier = ClangdMetadata.Predefined.additionalOptions.identifer(); |
| 74 | + IEclipsePreferences preferences = InstanceScope.INSTANCE.getNode(qualifier); |
| 75 | + |
| 76 | + String existingOptions = preferences.get(identifier, StringUtil.EMPTY); |
| 77 | + String compileCommandsDirString = "--compile-commands-dir="; //$NON-NLS-1$ |
| 78 | + String newCompuileCommandsDirString = compileCommandsDirString + buildDir; |
| 79 | + String updatedOptions = existingOptions.contains(compileCommandsDirString) |
| 80 | + ? existingOptions.replaceAll(compileCommandsDirString + ".+", //$NON-NLS-1$ |
| 81 | + Matcher.quoteReplacement(newCompuileCommandsDirString)) |
| 82 | + : newCompuileCommandsDirString; |
| 83 | + preferences.put(identifier, updatedOptions); |
| 84 | + } |
| 85 | +``` |
| 86 | + |
| 87 | +The only place that calls `updateCompileCommandsDir(...)` during a normal build is guarded by an error-count check: |
| 88 | + |
| 89 | +```410:419:bundles/com.espressif.idf.core/src/com/espressif/idf/core/build/IDFBuildConfiguration.java |
| 90 | + if (!monitor.isCanceled() && epm.getErrorCount() == 0) |
| 91 | + { |
| 92 | + project.refreshLocal(IResource.DEPTH_INFINITE, monitor); |
| 93 | + ParitionSizeHandler paritionSizeHandler = new ParitionSizeHandler(project, infoStream, console); |
| 94 | + paritionSizeHandler.startCheckingSize(); |
| 95 | + |
| 96 | + LspService lspService = new LspService(); |
| 97 | + lspService.updateCompileCommandsDir(buildDir.toString()); |
| 98 | + lspService.restartLspServers(); |
| 99 | + } |
| 100 | +``` |
| 101 | + |
| 102 | +So the preference is rewritten **only when**: |
| 103 | +- the build is not cancelled, **and** |
| 104 | +- `ErrorParserManager.getErrorCount() == 0`, **and** |
| 105 | +- the build has actually progressed to the `Build complete` console line that `waitForProjectBuild` keys on. |
| 106 | + |
| 107 | +On Linux this race is visible: |
| 108 | +- After Project2 is built, the preference correctly reads `.../Project2/build` (assertion #1 passes). |
| 109 | +- `whenSelectProjectInLaunchConfig()` switches the launch bar to `Project1` — this does **not** rewrite the clangd preference. |
| 110 | +- Project1 is built. The test's `waitForProjectBuild` returns as soon as the CDT Build Console prints `Build complete`, but the preference write inside `IDFBuildConfiguration.build(...)` happens on the builder thread after the post-build refresh and `ParitionSizeHandler` start — any of these can lag, and if Project1's build reports a non-zero error count (e.g. from a transient parser error on the self-hosted runner) the `updateCompileCommandsDir` call is skipped entirely. |
| 111 | +- `whenOpenClangdPreferences()` opens the dialog and reads the stale `.../Project2/build` value, so `assertEquals` fails at line 356. |
| 112 | + |
| 113 | +The fix committed on 2026-01-13 (`4d9b9fe7` "fix: 'shouldMatchExpectedClangdArgumentsAfterBuildingProjects' test") already added `whenSelectProjectInLaunchConfig()` to address this same stale-value class of bug. On the Brno self-hosted Ubuntu runner (`BrnoUBU0004`) the fix is still not sufficient because Linux runs the full suite in ~31 m vs macOS's ~12 m — timing is ~2.5x slower. |
| 114 | + |
| 115 | +### Failure #2 — cascading UI state |
| 116 | + |
| 117 | +`shouldRecreateClangdFileAfterDeletionAndVerifyContent` immediately calls: |
| 118 | + |
| 119 | +```80:82:tests/com.espressif.idf.ui.test/src/com/espressif/idf/ui/test/executable/cases/project/NewEspressifIDFProjectClangFilesTest.java |
| 120 | + { |
| 121 | + Fixture.whenClangdFileDeleted(CLEAN_PROJECT1); |
| 122 | + Fixture.thenClangdFileIsAbsent(CLEAN_PROJECT1); |
| 123 | +``` |
| 124 | + |
| 125 | +…which calls `bot.tree().getTreeItem("Project1")`. When the previous test aborted with a `ComparisonFailure`, the Preferences dialog was still modal (or had just been closed), the Project Explorer no longer had focus, and the background LSP-server restart triggered after Project2's build was still running. SWTBot's default 5 s wait expires before `Project1` is re-rendered in the tree, producing the `WidgetNotFoundException`. |
| 126 | + |
| 127 | +This is a **knock-on failure**, not an independent bug. Fixing Failure #1 is expected to fix Failure #2 too. |
| 128 | + |
| 129 | +## Why macOS passed but Linux failed |
| 130 | + |
| 131 | +- `build_macos` completed in **11 m 47 s**; `build_linux` took **31 m 3 s**. The Linux runner is ~2.5x slower and far more likely to lose the race described above. |
| 132 | +- Both jobs run on self-hosted runners (`runs-on: ide eclipse` group) but the Linux runner additionally ran the eim-based ESP-IDF installer, which emitted `Failed to copy OpenOCD rules: ... Permission denied (os error 13)` (benign) and a burst of Pydantic serializer warnings while resolving IDF component manager dependencies. Neither of these causes test failures directly, but they lengthen setup and push timing margins tighter. |
| 133 | +- PR #1438 touched no product or test code, so there is no new regression — the failure is an existing flaky-test problem exposed by the slow Linux runner. |
| 134 | + |
| 135 | +## Other noise in the log (informational, not failure-causing) |
| 136 | + |
| 137 | +- `ERROR - Failed to copy OpenOCD rules: Failed to copy /tmp/esp/openocd-esp32/.../60-openocd.rules to /etc/udev/rules.d/60-openocd.rules . Make sure you have the necessary permissions.` — the runner user isn't `root`; udev rules copy fails. Safe to ignore for CI. |
| 138 | +- Many `PydanticSerializationUnexpectedValue` warnings from `/tmp/esp/python/v5.4/venv/lib/python3.10/site-packages/pydantic/main.py` during ESP-IDF component dependency resolution. |
| 139 | +- `WARN - Mirror ping failed for https://mirrors.aliyun.com: 403` / `https://pypi.tuna.tsinghua.edu.cn: TimedOut` — expected from the Brno network reaching Chinese mirrors. |
| 140 | +- `tm4e` grammar registry warnings (`No grammar source for scope [source.scala]` etc.) and `joni.exception.SyntaxException: invalid pattern in look-behind` from the TextMate tokenizer — known upstream noise, not test failures. |
| 141 | +- `Node.js 20 actions are deprecated` — warning only; won't break until September 2026. |
| 142 | + |
| 143 | +## Suggested remediation |
| 144 | + |
| 145 | +Short term (stabilise CI, same day): |
| 146 | + |
| 147 | +1. **Re-run the failing job.** The two tests are flaky on the Linux self-hosted runner and frequently pass on retry. The run has an explicit "fix: add pr comment after the build with download links" merge commit that didn't change product code, so a re-run is safe. |
| 148 | +2. If re-run still fails, temporarily set `fail-on-error: false` on `phoenix-actions/test-reporting@v12` in `ci.yml` for the Linux job, or mark the two tests `@Ignore` until the real fix lands. |
| 149 | +
|
| 150 | +Medium term (fix the flake): |
| 151 | +
|
| 152 | +3. In `IDFBuildConfiguration.build(...)`, update the clangd `--compile-commands-dir=` preference **unconditionally** when `compile_commands.json` exists in `buildDir`, rather than gating it on `epm.getErrorCount() == 0`. A compile commands database is produced by CMake before compilation and is still meaningful for LSP even if later compile steps report errors. |
| 153 | +4. After writing the preference, call `preferences.flush()` to force the `InstanceScope` write to the backing store before `restartLspServers()` returns. |
| 154 | +5. In the test, replace the direct `whenOpenClangdPreferences() + assertEquals(...)` sequence with a polling wait on the `InstanceScope` preference value (`TestWidgetWaitUtility.waitUntil(...)`), then open the dialog once the value is correct. |
| 155 | +6. Split `shouldMatchExpectedClangdArgumentsAfterBuildingProjects` into two `@Test` methods — one per project — with a `@Before` hook that re-focuses the Project Explorer, so a Project2-phase failure can't cascade into a Project1-phase failure. |
| 156 | +7. Add a recovery step at the top of `shouldRecreateClangdFileAfterDeletionAndVerifyContent` that re-opens/focuses the Project Explorer and re-selects `Project1` (same pattern as `whenNewProjectIsSelected`) before assuming the tree item is visible. |
| 157 | + |
| 158 | +Long term: |
| 159 | + |
| 160 | +8. Move the self-hosted Linux runner to hardware with comparable speed to the macOS runners, or lower the SWTBot default timeout only for this test class rather than globally. |
| 161 | + |
| 162 | +## References |
| 163 | + |
| 164 | +- Failing run: <https://github.com/espressif/idf-eclipse-plugin/actions/runs/24703204267> |
| 165 | +- Failing job: <https://github.com/espressif/idf-eclipse-plugin/actions/runs/24703204267/job/72250690470> |
| 166 | +- Test source: [`tests/com.espressif.idf.ui.test/src/com/espressif/idf/ui/test/executable/cases/project/NewEspressifIDFProjectClangFilesTest.java`](tests/com.espressif.idf.ui.test/src/com/espressif/idf/ui/test/executable/cases/project/NewEspressifIDFProjectClangFilesTest.java) |
| 167 | +- Related previous fix: commit `4d9b9fe7` — "fix: 'shouldMatchExpectedClangdArgumentsAfterBuildingProjects' test" (2026-01-13) |
| 168 | +- PR that triggered this run: [#1438](https://github.com/espressif/idf-eclipse-plugin/pull/1438) — documentation/CI-only change to `pr-comment.yml`. |
0 commit comments