try to fix flaky RotatingProviderWithChangingKeysSpec #2940

Merged: pjfanning merged 2 commits into apache:main from pjfanning:copilot/fix-flaky-sslcontext-test on May 7, 2026
Conversation

@pjfanning (Member) commented May 6, 2026

Retry the test and make `contact` more robust.

The test failed in last night's run on Java 21: https://github.com/apache/pekko/actions/runs/25410349422/job/74530492150

Motivation:
The `must rebuild the SSLContext using new keys` test is flaky because `contact` times out (6 s) while TLS connection establishment is still in progress. A single failed run discards all the work done in that attempt.

Modification:

  • `contact` now retries the `Identify`/`ActorIdentity` exchange up to 3 times (3 s each) rather than failing immediately on one timeout, giving TLS connection establishment additional time to complete.
  • `RotatingProviderWithChangingKeysSpec` overrides `withFixture` to retry the whole test once on failure. Before retrying, actor systems created during the failed attempt are terminated so they do not interfere with the second run. Port conflicts are not a concern because new systems obtain fresh dynamic ports, and the systems are also tracked in `ArteryMultiNodeSpec.remoteSystems` for final cleanup in `afterTermination`.
  • Added `import scala.concurrent.duration._`, required by `3.seconds`.
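
The bounded retry described for `contact` can be sketched without Pekko as a small helper. This is a minimal illustration under stated assumptions, not the code from this PR: `ContactRetrySketch`, `retry`, and `demo` are hypothetical names, and the `Try`-returning function stands in for one `Identify`/`ActorIdentity` round trip that waits at most the per-attempt timeout.

```scala
import java.util.concurrent.TimeoutException
import scala.annotation.tailrec
import scala.concurrent.duration._
import scala.util.{ Failure, Success, Try }

// Hypothetical sketch: none of these names come from the Pekko code base.
object ContactRetrySketch {

  /**
   * Runs `exchange` with the given per-attempt timeout and retries on failure
   * until it succeeds or `attemptsLeft` attempts have been used.
   */
  @tailrec
  def retry[T](attemptsLeft: Int, perAttempt: FiniteDuration)(
      exchange: FiniteDuration => Try[T]): Try[T] =
    exchange(perAttempt) match {
      // A failed attempt with budget remaining: try again.
      case Failure(_) if attemptsLeft > 1 => retry(attemptsLeft - 1, perAttempt)(exchange)
      // Either a success or the final failure: hand it back unchanged.
      case outcome => outcome
    }

  /** Simulates an exchange that times out twice and then succeeds. */
  def demo(): (Try[String], Int) = {
    var calls = 0
    val result = retry[String](3, 3.seconds) { _ =>
      calls += 1
      if (calls < 3) Failure(new TimeoutException("no ActorIdentity yet"))
      else Success("identified")
    }
    (result, calls)
  }
}
```

With 3 attempts of 3 s each, the worst-case wait grows from the previous single 6 s timeout to 9 s, while a reply on any earlier attempt ends the loop at that point.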

Result:
The test is more resilient to transient TLS handshake delays and is automatically retried once if it still fails.
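
The retry-once behaviour can be modelled in plain Scala as follows. This is a sketch under stated assumptions rather than the spec's actual `withFixture` override: `RetryOnceSketch`, `FakeSystem`, `newSystem`, and `runWithOneRetry` are hypothetical stand-ins, with `FakeSystem` playing the role of an actor system started during a test attempt.

```scala
import scala.collection.mutable.ListBuffer
import scala.util.{ Failure, Try }

// Hypothetical sketch: a model of "retry the whole test once, cleaning up first".
object RetryOnceSketch {

  // Stand-in for an actor system created by a test attempt.
  final class FakeSystem(val name: String) {
    @volatile var terminated: Boolean = false
    def terminate(): Unit = { terminated = true }
  }

  // Systems created during the current attempt (assumed bookkeeping).
  private val createdThisAttempt = ListBuffer.empty[FakeSystem]

  def newSystem(name: String): FakeSystem = {
    val system = new FakeSystem(name)
    createdThisAttempt += system
    system
  }

  /** Runs `body`; if it fails, terminates that attempt's systems and runs `body` once more. */
  def runWithOneRetry[T](body: => T): Try[T] =
    Try(body) match {
      case Failure(_) =>
        createdThisAttempt.foreach(_.terminate())
        createdThisAttempt.clear()
        Try(body) // second and final attempt
      case success => success
    }

  /** First attempt fails, second succeeds; reports whether the first system was cleaned up. */
  def demo(): (Try[String], Int, Boolean) = {
    var attempts = 0
    var firstSystem: Option[FakeSystem] = None
    val result = runWithOneRetry[String] {
      attempts += 1
      val system = newSystem(s"system-$attempts")
      if (attempts == 1) {
        firstSystem = Some(system)
        throw new RuntimeException("simulated contact timeout")
      }
      system.name
    }
    (result, attempts, firstSystem.exists(_.terminated))
  }
}
```

In a ScalaTest suite the same shape would presumably live in an override of `withFixture(test: NoArgTest): Outcome`, matching on a `Failed` outcome before invoking the test a second time.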

Tests:

  • `scalafmt` not installed; `sbt` not installed; recorded as skipped
  • `sbt "remote / Test / testOnly o.a.p.remote.artery.tcp.ssl.RotatingProviderWithChangingKeysSpec"` - skipped (sbt unavailable)

References:
None - reduces test flakiness

Agent-Logs-Url: https://github.com/pjfanning/incubator-pekko/sessions/080d49a3-0c3c-401b-ae0a-a5fe875189cc

Copilot AI and others added 2 commits May 6, 2026 15:18
…e contact robustness

Co-authored-by: pjfanning <11783444+pjfanning@users.noreply.github.com>
@pjfanning merged commit 3c78ad4 into apache:main on May 7, 2026
9 checks passed
@pjfanning deleted the copilot/fix-flaky-sslcontext-test branch on May 7, 2026 10:57