Reduce flakiness in io.opentelemetry.instrumentation.jmx.rules.WildflyTest.testWildflyMetrics(String)[1] #18502

Closed
trask wants to merge 2 commits into open-telemetry:main from trask:otelbot/flaky-test-remediation-io-opentelemetry-instrumentation-jmx-rules-WildflyTest-testW-20260501232817

Conversation

@trask trask commented May 1, 2026

Member

Automated attempt at fixing flakiness in io.opentelemetry.instrumentation.jmx.rules.WildflyTest.testWildflyMetrics(String)[1].

Recent failed/flaky scans

  • y6yh2s5b3je5o (flaky, :instrumentation:jmx-metrics:library:test)

Flake history (per UTC day)

Day         flaky  failed  passed
2026-04-25      3       0     426
2026-04-26      5       0     529
2026-04-27      6       0     500
2026-04-28      8       0     689
2026-04-29     15       0     691
2026-04-30     11       0     718
2026-05-01      9       0     411

Sample failure (from Develocity)

org.opentest4j.AssertionFailedError: [no data point matched attribute set '[{wildfly.server=default-server}, {wildfly.listener=default}, {network.io.direction=receive}]' for metric 'wildfly.network.io'] 
expected: true
 but was: false

Copilot diagnosis

Root cause

WildflyTest started WildFly and immediately verified all captured OTLP metric exports without ever sending a real HTTP request to the deployed test application. The sample failure consistently showed wildfly.network.io missing the network.io.direction=receive datapoint for the default listener, which can happen when Undertow listener metrics are scraped before the listener has received request bytes. Because TargetSystemTest.verifyMetrics() verifies every export accumulated since test start, an early incomplete export stayed in the queue and could keep the Awaitility assertion failing even if later exports became complete.

Fix

  • Added a small WildFly warm-up request to the deployed servlet after the container starts, using the javax servlet path for the old WildFly image and the jakarta path for the newer image.
  • Polled the warm-up request with Awaitility until it returns HTTP 200, retrying transient IOExceptions while the deployment finishes.
  • Added a protected resetMetrics() helper and cleared pre-warm-up OTLP exports before running the existing metric verifier.
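
The warm-up polling described in the list above can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the code from this PR: the PR uses Awaitility, while this sketch swaps in a plain retry loop, and the names WarmUp, StatusProbe, awaitReady, and httpStatus are hypothetical.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.time.Duration;

// Hypothetical helper; the PR itself polls with Awaitility instead of a hand-written loop.
final class WarmUp {

  /** A readiness probe that returns an HTTP status code or throws while the server is not up. */
  interface StatusProbe {
    int status() throws IOException;
  }

  /** Issues one GET request and returns the HTTP status code. */
  static int httpStatus(String url) throws IOException {
    HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
    conn.setConnectTimeout(2_000);
    conn.setReadTimeout(2_000);
    try {
      return conn.getResponseCode();
    } finally {
      conn.disconnect();
    }
  }

  /**
   * Polls the probe until it reports HTTP 200 or the timeout elapses.
   * IOExceptions are treated as "deployment not finished yet" and retried.
   */
  static boolean awaitReady(StatusProbe probe, Duration timeout, Duration interval) {
    long deadline = System.nanoTime() + timeout.toNanos();
    do {
      try {
        if (probe.status() == 200) {
          return true;
        }
      } catch (IOException e) {
        // Transient failure while the application is still deploying; try again.
      }
      try {
        Thread.sleep(interval.toMillis());
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    } while (System.nanoTime() - deadline < 0);
    return false;
  }
}
```

A test could call something like `WarmUp.awaitReady(() -> WarmUp.httpStatus(url), Duration.ofSeconds(30), Duration.ofMillis(500))`, where `url` points at the deployed servlet; the actual context path depends on the WildFly image and is not shown here.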

Why this addresses the root cause

The real servlet request initializes the Undertow listener request and network counters, including both receive and transmit directions, before the assertions inspect exported metrics. Clearing captured exports after the warm-up prevents a pre-request scrape with incomplete network datapoints from being rechecked on every Awaitility poll.
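
To make the queue behavior concrete, here is a toy model of the failure mode and the reset. It is only a sketch under stated assumptions: ExportQueueSketch and its methods are hypothetical stand-ins, each captured export is reduced to the set of network.io.direction values it contained, and the real TargetSystemTest.verifyMetrics() is assumed to behave like verifyAll() only in the sense the diagnosis describes (every accumulated export is checked).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical model of "verify every export accumulated since test start".
final class ExportQueueSketch {
  private static final Set<String> REQUIRED = Set.of("receive", "transmit");

  // Each entry models one captured export as the directions it reported.
  private final List<Set<String>> exports = new ArrayList<>();

  void capture(Set<String> directions) {
    exports.add(directions);
  }

  /** Drops everything captured so far, e.g. right after the warm-up request. */
  void resetMetrics() {
    exports.clear();
  }

  /** Passes only if at least one export exists and every export has both directions. */
  boolean verifyAll() {
    return !exports.isEmpty() && exports.stream().allMatch(e -> e.containsAll(REQUIRED));
  }
}
```

In this model, one early export without the receive direction makes verifyAll() fail on every poll, no matter how many complete exports follow; calling resetMetrics() after the warm-up removes that stale entry so only post-warm-up exports are checked.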

Risks / follow-ups

  • If a WildFly image changes the deployed context path or servlet API support, the warm-up request could expose that as a deterministic HTTP readiness failure.
  • Maintainers may want to consider whether TargetSystemTest.verifyMetrics() should ignore older failed snapshots once newer exports satisfy assertions, but this change keeps the fix scoped to the flaky WildFly test.
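
The second follow-up could take many forms; one possible policy, sketched here purely as an illustration, is to ignore exports captured before the first passing one and then require every later export to pass. LenientVerifierSketch, dropStaleFailures, and verifyLenient are hypothetical names, and nothing here reflects how TargetSystemTest.verifyMetrics() is actually implemented.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical alternative verifier policy; not part of this PR.
final class LenientVerifierSketch {

  /** Returns the exports starting at the first one that passes the check; empty if none passes. */
  static <T> List<T> dropStaleFailures(List<T> exports, Predicate<T> check) {
    for (int i = 0; i < exports.size(); i++) {
      if (check.test(exports.get(i))) {
        return exports.subList(i, exports.size());
      }
    }
    return List.of();
  }

  /** Passes if some export satisfies the check and every export after it does too. */
  static <T> boolean verifyLenient(List<T> exports, Predicate<T> check) {
    List<T> relevant = dropStaleFailures(exports, check);
    return !relevant.isEmpty() && relevant.stream().allMatch(check);
  }
}
```

A policy like this tolerates incomplete exports during startup but still fails if a metric disappears after it first became complete. The trade-off is that it changes behavior for every target system sharing the verifier, which is presumably why this PR keeps the change scoped to the WildFly test.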

Review the diagnosis and the diff carefully before merging - automated fixes can mask flakiness instead of addressing the root cause.

trask added 2 commits May 1, 2026 16:38
…yTest.testWildflyMetrics(String)[1]

Automated fix attempt based on Develocity flaky-test analysis.

trask commented May 2, 2026

Member Author

Closing in favor of #18509

@trask trask closed this May 2, 2026