Reduce flakiness in io.opentelemetry.instrumentation.jmx.rules.CamelTest.testCollectedMetrics()#18491

Closed
trask wants to merge 2 commits into
open-telemetry:mainfrom
trask:otelbot/flaky-fix-io-opentelemetry-instrumentation-jmx-rules-CamelTest-testCol-20260501174938

@trask trask commented May 1, 2026

Member

Fixes flakiness in io.opentelemetry.instrumentation.jmx.rules.CamelTest.testCollectedMetrics().

Recent failed/flaky scans

  • ce3lsc2mxypdq (flaky, :instrumentation:jmx-metrics:library:test)
  • tn3om7jdhfsgg (flaky, :instrumentation:jmx-metrics:library:test)
  • 4rm3n3nsa4rv6 (flaky, :instrumentation:jmx-metrics:library:test)
  • gsyg7utdnwkjs (flaky, :instrumentation:jmx-metrics:library:test)
  • bckzhxa7vfhde (flaky, :instrumentation:jmx-metrics:library:test)

Flake history (per UTC day)

Day         flaky  failed  passed
2026-04-24     11       0     179
2026-04-25     27       0     402
2026-04-26     16       0     518
2026-04-27     23       0     483
2026-04-28     33       0     664
2026-04-29     44       0     663
2026-04-30     49       0     680
2026-05-01     14       0     267

Sample failure (from Develocity)

org.opentest4j.AssertionFailedError: [no data point matched attribute set '[{camel.context}, {camel.route}, {camel.processor}, {camel.destination}]' for metric 'camel.processor.exchange.redelivered.count']
expected: true
 but was: false

Diagnosis

The container wait strategy matched a custom "Camel test application started" log line printed by the test application. That line was emitted before main.run(args), so Testcontainers could treat the container as ready before Apache Camel had actually started its routes and finished registering the related JMX MBeans.

Fix

  • Wait for Camel's own startup completion log: Apache Camel ... started in ....
  • Remove the misleading custom pre-start log from the Camel test application.
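
A minimal sketch of the readiness check, using only the JDK. The regex is an assumption modeled on the "Apache Camel ... started in ..." line named above, not the exact pattern from the change, and the sample log lines (including the version and context name) are hypothetical. With Testcontainers, a regex of this shape is typically passed to Wait.forLogMessage(regex, 1).

```java
import java.util.regex.Pattern;

final class CamelReadyPattern {
    // Assumed regex: matches Camel's own startup completion line.
    // DOTALL lets the trailing ".*" also consume a line terminator.
    static final Pattern STARTED =
        Pattern.compile(".*Apache Camel .* started in .*", Pattern.DOTALL);

    // Returns true only for a line that reports Camel startup completion.
    static boolean isCamelStarted(String logLine) {
        return STARTED.matcher(logLine).matches();
    }

    public static void main(String[] args) {
        // Hypothetical log lines, for illustration only.
        String preStart = "Camel test application started";
        String camelReady = "INFO Apache Camel 4.0.0 (camel-1) started in 250ms";

        System.out.println(isCamelStarted(preStart));   // prints "false"
        System.out.println(isCamelStarted(camelReady)); // prints "true"
    }
}
```

The pre-start line does not match because it lacks both "Apache Camel" and "started in", so a wait strategy built on this pattern cannot be satisfied by the old custom log.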

Why this addresses the flake

The failing assertion expects destination-aware Camel processor data points to be present. Waiting for Camel's actual startup completion narrows the race in which metrics are verified after the application process exists but before Camel's routes and processors have finished starting.

Risks / follow-ups

If Camel's "started in" log line can still appear before all relevant JMX MBeans or destination-aware attributes are visible to the JMX scraper, this change may reduce but not fully eliminate the flake. If flakiness continues, the next likely step is to avoid verifying stale pre-ready OTLP exports, or to make the metrics verifier evaluate a single coherent later export instead of accumulating and rechecking earlier incomplete exports.
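
One way to picture the second follow-up, as a sketch only: evaluate the most recent export on its own rather than any earlier, possibly incomplete one. The data model here (an export represented as the set of attribute-key sets seen for one metric) and the names CoherentExportCheck and latestExportHasAll are hypothetical and do not reflect the project's actual metrics verifier.

```java
import java.util.List;
import java.util.Set;

final class CoherentExportCheck {
    // Hypothetical model: each export is the set of attribute-key sets
    // observed for a single metric in that export.
    // Returns true only if the latest export alone contains every expected set.
    static boolean latestExportHasAll(
            List<Set<Set<String>>> exports, Set<Set<String>> expected) {
        if (exports.isEmpty()) {
            return false;
        }
        Set<Set<String>> latest = exports.get(exports.size() - 1);
        return latest.containsAll(expected);
    }

    public static void main(String[] args) {
        Set<String> partial = Set.of("camel.context", "camel.route");
        Set<String> full = Set.of(
            "camel.context", "camel.route", "camel.processor", "camel.destination");

        // Early export taken before startup finished: destination-aware set missing.
        Set<Set<String>> early = Set.of(partial);
        // Later export taken after startup: both sets present.
        Set<Set<String>> later = Set.of(partial, full);

        Set<Set<String>> expected = Set.of(full);
        System.out.println(latestExportHasAll(List.of(early), expected));        // prints "false"
        System.out.println(latestExportHasAll(List.of(early, later), expected)); // prints "true"
    }
}
```

A caller would retry this check as new exports arrive, so an incomplete early export delays the verdict instead of deciding it.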

trask added 2 commits May 1, 2026 10:53
…est.testCollectedMetrics()

Automated fix attempt based on Develocity flaky-test analysis.
@trask trask marked this pull request as ready for review May 1, 2026 19:43
@trask trask requested a review from a team as a code owner May 1, 2026 19:43
trask commented May 2, 2026

Member Author

Closing in favor of #18509

@trask trask closed this May 2, 2026