Commit f5cc903

wenytang-ms and Copilot committed
fix(test-plans): mitigate scheduled e2e-autotest flakiness
Triage of the last 8 scheduled e2e-autotest runs identified three
failure categories: a real plan bug, LLM screenshot-based false
downgrades, and real timing flakes. This change addresses all three.

Category A: real plan bug

* java-pack-help-center-webview was missing vscjava.vscode-java-pack
  from setup.extensions. On scheduled runs (no PR VSIX) java.welcome
  was unregistered and the open-help-center step silently timed out.
  This was the #1 failure across the last 8 nightly runs (7/8). Now
  installs the pack from the marketplace on schedule runs while still
  letting --vsix override on PR runs.

Category B: LLM downgrade noise on ls-ready

* Add skipLlmVerify: true (introduced in @vscjava/vscode-autotest
  0.7.5) to every ls-ready step that has no structured verify* field.
  The waitForLanguageServer action is itself the authoritative
  deterministic check; the LLM was downgrading these whenever the
  status bar still showed background indexing
  ("Java: Searching... 0%"), even though the LS was fully functional.
  Affected: java-dependency-viewer, java-extension-pack,
  java-fresh-import, java-maven-resolve-type, java-maven,
  java-new-file-snippet, java-single-file, java-webview-migration.

Category C: real timing flakes

* java-test-runner: bump wait-test-discovery from 45s to 90s (the
  vscode-java-test discovery scan can take longer than 45s on a cold
  cache) and add retries: 1 to run-all-tests so a
  discovery-still-warming first invocation can retry.
* java-maven-resolve-type: add retries: 1 to save-after-resolve so a
  slow Maven re-import on a cold cache (where the LS hasn't yet
  republished zero-errors at the time of save) can retry instead of
  failing the plan.

Plans whose flaky steps already carry a structured verify* field (e.g.
verify-completion with verifyCompletion: { notEmpty: true },
save-after-organize with verifyFile, verify-help-center-content with
verifyWebview) no longer need plan changes because the framework
auto-skip in @vscjava/vscode-autotest 0.7.5 already short-circuits the
LLM re-check whenever any structured verifier is present.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
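The skip rule described in the message can be sketched as follows. This is a minimal illustration, not the actual @vscjava/vscode-autotest implementation: the function names are hypothetical, and it assumes a "structured verifier" is any step key that starts with `verify` other than the free-text `verify` field itself.

```python
from typing import Any, Mapping


def has_structured_verifier(step: Mapping[str, Any]) -> bool:
    """Return True if the step carries a verify* key such as verifyFile.

    Assumption: the plain "verify" key holds the free-text expectation
    for the LLM and does not count as a structured verifier.
    """
    return any(key.startswith("verify") and key != "verify" for key in step)


def should_skip_llm_verify(step: Mapping[str, Any]) -> bool:
    """Decide whether the LLM screenshot re-check is skipped (hypothetical)."""
    # An explicit opt-out in the plan always wins.
    if step.get("skipLlmVerify") is True:
        return True
    # Otherwise skip only when a deterministic verifier is present.
    return has_structured_verifier(step)


ls_ready = {
    "id": "ls-ready",
    "action": "waitForLanguageServer",
    "verify": "Java workspace has loaded",
    "skipLlmVerify": True,
}
verify_completion = {
    "id": "verify-completion",
    "verifyCompletion": {"notEmpty": True},
}
plain_step = {"id": "open-app", "verify": "App.java is open in the editor"}
```

Under these assumptions, `ls_ready` is skipped because of the explicit flag, `verify_completion` is skipped because it has a structured verifier, and `plain_step` would still go through the LLM check.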
1 parent 5ec3f2b commit f5cc903

10 files changed

Lines changed: 40 additions & 1 deletion

test-plans/java-dependency-viewer.yaml

Lines changed: 5 additions & 0 deletions
@@ -28,6 +28,11 @@ steps:
   - id: "ls-ready"
     action: "waitForLanguageServer"
     verify: "Java workspace has loaded; Explorer shows the project tree and Problems panel is settled"
+    # waitForLanguageServer is the authoritative deterministic check — the
+    # status bar can still flicker "Java: Searching... 0%" for background
+    # indexing right after the LS reports ready, which has historically
+    # caused LLM screenshot downgrades. Skip LLM here.
+    skipLlmVerify: true
     timeout: 120

   # ── Open dependency view ─────────────────────────────────

test-plans/java-extension-pack.yaml

Lines changed: 3 additions & 0 deletions
@@ -30,6 +30,9 @@ steps:
   - id: "ls-ready"
     action: "waitForLanguageServer"
     verify: "Java workspace has loaded; Explorer shows the project tree and Problems panel is settled"
+    # waitForLanguageServer is the authoritative deterministic check —
+    # status-bar background indexing can cause spurious LLM downgrades.
+    skipLlmVerify: true
     timeout: 120

   # ── Trigger Classpath configuration command ──────────────

test-plans/java-fresh-import.yaml

Lines changed: 3 additions & 0 deletions
@@ -45,6 +45,9 @@ steps:
   - id: "ls-ready"
     action: "waitForLanguageServer"
     verify: "spring-petclinic project has been imported; Java extension is activated and ready for editing"
+    # waitForLanguageServer is authoritative — skip LLM screenshot re-check
+    # (status bar background indexing causes false downgrades).
+    skipLlmVerify: true
     timeout: 300

   # ── Verify completion ────────────────────────────────────

test-plans/java-maven-resolve-type.yaml

Lines changed: 6 additions & 0 deletions
@@ -49,6 +49,8 @@ steps:
     action: "waitForLanguageServer"
     verify: "maven-resolve-type project has been imported; the Java extension is activated and pom.xml is visible in the Explorer"
     timeout: 180
+    # waitForLanguageServer is authoritative — skip LLM screenshot re-check.
+    skipLlmVerify: true

   # ── Open Java file ──────────────────────────────────────
   - id: "open-app"
@@ -161,6 +163,10 @@ steps:
       errors: 0
     waitBefore: 20
     timeout: 90
+    # Maven re-import on a cold cache can take significantly longer than the
+    # waitBefore window; a single retry (with the LS likely already settled
+    # by then) recovers without inflating the happy-path wait further.
+    retries: 1

   # After save, the language server publishes diagnostics (status bar updates
   # to 0 errors, verified deterministically above). However, on Linux runners

test-plans/java-maven.yaml

Lines changed: 2 additions & 0 deletions
@@ -30,6 +30,8 @@ steps:
   - id: "ls-ready"
     action: "waitForLanguageServer"
     verify: "Maven workspace has loaded; the Java extension is initialized and pom.xml is visible in the Explorer (the Problems panel may briefly show diagnostics that are still being recomputed after import)"
+    # waitForLanguageServer is authoritative — skip LLM screenshot re-check.
+    skipLlmVerify: true
     timeout: 120

   # ── Step 2: Open Java file and verify editing experience ─────────────────

test-plans/java-new-file-snippet.yaml

Lines changed: 2 additions & 0 deletions
@@ -28,6 +28,8 @@ steps:
   - id: "ls-ready"
     action: "waitForLanguageServer"
     verify: "Java workspace has loaded for the simple-app project; no error notifications visible"
+    # waitForLanguageServer is authoritative — skip LLM screenshot re-check.
+    skipLlmVerify: true
     timeout: 120

   # ── Step 9: Create new Java file ─────────────────────────────

test-plans/java-pack-help-center-webview.yaml

Lines changed: 7 additions & 0 deletions
@@ -13,6 +13,13 @@ description: |

 setup:
   extension: "redhat.java"
+  # The Help Center webview lives in vscode-java-pack itself. On scheduled
+  # runs there is no built VSIX, so the pack must be installed from the
+  # marketplace; otherwise the `java.welcome` command is unregistered and
+  # the open-help-center step times out silently. On PR runs the
+  # build-pack job's VSIX takes precedence (--vsix overrides marketplace).
+  extensions:
+    - "vscjava.vscode-java-pack"
   vscodeVersion: "stable"
   timeout: 60
   settings:

test-plans/java-single-file.yaml

Lines changed: 2 additions & 0 deletions
@@ -31,6 +31,8 @@ steps:
   - id: "ls-ready"
     action: "waitForLanguageServer"
     verify: "Java extension has activated for the single-file workspace; no error notifications are visible"
+    # waitForLanguageServer is authoritative — skip LLM screenshot re-check.
+    skipLlmVerify: true
     timeout: 120

   # ── Step 2: Open Java file ──────────────────────────────

test-plans/java-test-runner.yaml

Lines changed: 6 additions & 1 deletion
@@ -52,7 +52,7 @@ steps:
   # ready — discovery is asynchronous and Test Explorer is initially empty.
   # On cold-cache CI runners 20s is sometimes too short; bump to 45s.
   - id: "wait-test-discovery"
-    action: "wait 45 seconds"
+    action: "wait 90 seconds"

   # ── Step 2: Run tests via Java Test Runner palette command ───────
   # autotest 0.7.1 ships `openTestExplorer`/`runAllTests` actions wired to
@@ -72,6 +72,11 @@ steps:
     action: "run command Java: Run Tests"
     verify: "Java: Run Tests command has been invoked from the palette; the Java Test Runner extension has responded (this may show as a Testing view becoming active, a run indicator in the status bar, or an informational notification such as 'No tests found in this file' if discovery is still in progress — all of these indicate the command executed successfully)"
     waitBefore: 3
+    # Test discovery is asynchronous in vscode-java-test; on cold-cache
+    # CI runners the first Run Tests invocation can land before discovery
+    # completes ("No tests have been found"). Allow one retry — by the
+    # second attempt the discovery cache is usually warm.
+    retries: 1

   - id: "wait-test-complete"
     action: "wait 45 seconds"

test-plans/java-webview-migration.yaml

Lines changed: 4 additions & 0 deletions
@@ -55,6 +55,10 @@ steps:
   - id: "ls-ready"
     action: "waitForLanguageServer"
     verify: "Maven salut workspace has loaded; the Java extension and the pack webview commands are ready"
+    # waitForLanguageServer is authoritative — skip LLM screenshot re-check
+    # (status bar may still show "Java: Searching... 0%" for background
+    # indexing right after the LS reports ready).
+    skipLlmVerify: true
     timeout: 180

   # ══════════════════════════════════════════════════════════════════════
