Retry interrupted at-most-once steps#399
Conversation
| var retryDecision = config.retryStrategy().makeRetryDecision(exception, attempt); | ||
|
|
||
| if (isRetryable && retryDecision.shouldRetry()) { | ||
| if (retryDecision.shouldRetry()) { |
There was a problem hiding this comment.
will this change leading to retry and break the at-most-once semantic?
There was a problem hiding this comment.
No, this keeps the at-most-once guarantee scoped to each retry attempt.
For a replayed STARTED at-most-once step, we still do not call executeStepLogic for that same attempt. The interrupted attempt is converted through the retry strategy into either RETRY/PENDING or FAIL. User code only runs again after the operation becomes READY, which is the next retry attempt.
That is what the two tests are intended to pin down: testAtMostOnceNoRetryFailsAfterCheckpointFailure proves the started attempt is not re-executed when retries are disabled, and testAtMostOnceRetriesAfterCheckpointFailure proves a configured retry creates exactly one next-attempt execution.
There was a problem hiding this comment.
This would be a breaking change to users who rely on the existing behavior.
|
Thanks, agreed. The existing integration tests make this clearer: for This PR changes that observable behavior under the normal retry path, so I am going to close it rather than keep pushing a breaking semantic change. If retrying interrupted at-most-once steps is still desirable later, it probably needs an explicit opt-in API or a design discussion rather than changing the current semantics in place. |
Summary
AT_MOST_ONCE_PER_RETRYsteps use the configured retry strategy instead of failing before retry handlingCloses #394.
Testing
docker run --rm -v ${PWD}:/workspace -w /workspace maven:3.9-eclipse-temurin-21 mvn -pl sdk-integration-tests -am -Dtest=StepSemanticsIntegrationTest -Dsurefire.failIfNoSpecifiedTests=false testgit diff --check