Commit 8de1414
fix(test-plans): mitigate scheduled e2e-autotest flakiness (#1622)
* fix(test-plans): mitigate scheduled e2e-autotest flakiness
Triage of the last 8 scheduled e2e-autotest runs identified three failure
categories: a real plan bug, LLM screenshot-based false downgrades, and
real timing flakes. This change addresses all three.
Category A — real plan bug
* java-pack-help-center-webview was missing vscjava.vscode-java-pack from
setup.extensions. On scheduled runs (no PR VSIX) java.welcome was
unregistered and the open-help-center step silently timed out. This was
the #1 failure across the last 8 nightly runs (7/8). Now installs the
pack from the marketplace on scheduled runs while still letting --vsix
override on PR runs.
Category B — LLM downgrade noise on ls-ready
* Add skipLlmVerify: true (introduced in @vscjava/vscode-autotest 0.7.5) to
every ls-ready step that has no structured verify* field. The
waitForLanguageServer action is itself the authoritative deterministic
check; the LLM was downgrading these whenever the status bar still showed
background indexing ("Java: Searching... 0%"), even though the LS was
fully functional. Affected: java-dependency-viewer, java-extension-pack,
java-fresh-import, java-maven-resolve-type, java-maven,
java-new-file-snippet, java-single-file, java-webview-migration.
Category C — real timing flakes
* java-test-runner: bump wait-test-discovery from 45s to 90s (the
vscode-java-test discovery scan can take longer than 45s on a cold cache)
and add retries: 1 to run-all-tests so a discovery-still-warming first
invocation can retry.
* java-maven-resolve-type: add retries: 1 to save-after-resolve so a slow
Maven re-import on a cold cache (where the LS hasn't yet republished
zero-errors at the time of save) can retry instead of failing the plan.
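For illustration only, the retry behaviour that retries: 1 relies on can be sketched as below. This is a minimal Python sketch, assuming that retries: N means up to N extra attempts after the first failure; run_step, StepFailed and make_cold_cache_action are hypothetical names, not part of @vscjava/vscode-autotest.

```python
from typing import Callable


class StepFailed(Exception):
    """Hypothetical error raised when a plan step does not pass."""


def run_step(action: Callable[[], None], retries: int = 0) -> int:
    """Run `action`, allowing up to `retries` extra attempts.

    Returns the number of attempts used; re-raises the last failure
    once the attempt budget is exhausted.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            action()
            return attempts
        except StepFailed:
            # Assumption: retries counts extra attempts, so the total
            # budget is retries + 1.
            if attempts > retries:
                raise


def make_cold_cache_action(failures_before_success: int) -> Callable[[], None]:
    """Fake step that fails a fixed number of times, then passes."""
    state = {"remaining": failures_before_success}

    def action() -> None:
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise StepFailed("discovery still warming up")

    return action
```

Under that assumption, a step that fails once on a cold cache passes on its second attempt with retries=1, while a step that fails twice still fails the plan.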
Plans whose flaky steps already carry a structured verify* field (e.g.
verify-completion with verifyCompletion: { notEmpty: true },
save-after-organize with verifyFile, verify-help-center-content with
verifyWebview) no longer need plan changes because the framework
auto-skip in @vscjava/vscode-autotest 0.7.5 already short-circuits the
LLM re-check whenever any structured verifier is present.
* fix(test-plans): restrict skipLlmVerify to G1/G4; add retries for cold-cache flakes
Reverts the over-broad framework auto-skip (any structured verify -> no LLM)
that landed in autotest v0.7.5/0.7.6. LLM screenshot verification is the
anti-silent-pass safety net and must stay enabled on steps where the screenshot
carries unique signal (popup visibility, decoration lag, panel content).
Final policy:
- skipLlmVerify=true on Group 1 (16 ls-ready steps): waitForLanguageServer
polls the same status bar text the LLM would read, so LLM adds zero signal.
- skipLlmVerify=true on Group 4 (3 disk-write steps: save-after-organize,
add-gson-dependency, create-formatter-profile): action mutates a file not
open in any editor; before/after screenshots are by-design identical and
LLM always downgrades. verifyFile from disk is the authoritative signal.
- retries: 1 on 8 verify-completion steps to mitigate cold-cache 'Loading...'
LLM downgrades while keeping the screenshot check enabled.
- retries: 1 on java-maven-resolve-type save-after-resolve (kept from prior
commit) for Maven indexer warm-up.
- Wait bump 45 -> 90s on java-test-runner wait-test-discovery (kept).
- java-pack-help-center-webview setup.extensions hard-requires java-pack
(kept) — fixes the real bug (5/8 failures).
LLM coverage preserved on verify-completion (popup visibility), verifyEditor
(guards against page-wide DOM stale-tab fallback), verifyProblems
(diagnostics red squiggle lag) and verifyWebview (visual rendering).
Requires autotest >= 0.7.7 to honor skipLlmVerify without the auto-skip side
effect.
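The final policy can be restated as a small predicate. This is a hedged Python sketch, not framework code: the commit sets skipLlmVerify explicitly on each step in the plans, and the Step class and wants_skip_llm_verify function below are hypothetical names that only restate which steps receive the flag.

```python
from dataclasses import dataclass, field

# Step ids named in the commit message as Group 4 (disk-write steps).
GROUP4_STEP_IDS = {
    "save-after-organize",
    "add-gson-dependency",
    "create-formatter-profile",
}


@dataclass
class Step:
    """Hypothetical in-memory view of a plan step (not the real schema)."""
    id: str
    action: str = ""
    fields: dict = field(default_factory=dict)


def wants_skip_llm_verify(step: Step) -> bool:
    """Return True where the policy above sets skipLlmVerify=true."""
    # Group 1: ls-ready steps; waitForLanguageServer is the deterministic check.
    if step.id == "ls-ready" and step.action == "waitForLanguageServer":
        return True
    # Group 4: disk-write steps whose result is verified from disk via verifyFile.
    if step.id in GROUP4_STEP_IDS and "verifyFile" in step.fields:
        return True
    # Everything else (verifyCompletion, verifyEditor, verifyProblems,
    # verifyWebview) keeps LLM screenshot verification.
    return False
```

For example, an ls-ready step and save-after-organize with verifyFile both return True, while verify-completion and verify-help-center-content return False and stay under LLM verification.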
* ci: retrigger E2E AutoTest against @vscjava/vscode-autotest@0.7.8
The 0.7.7 release did not actually honor skipLlmVerify because
planParser dropped the field on deserialize. 0.7.8 contains the
parser fix; this empty commit restarts CI so the matrix installs
the correct version.
1 parent c429a05 commit 8de1414
17 files changed
Lines changed: 76 additions & 1 deletion
File tree
- test-plans