refactor(util): make retry policy explicit and deterministic#633

Open
wgtmac wants to merge 2 commits intoapache:mainfrom
wgtmac:refine_retry

Conversation

@wgtmac
Member

@wgtmac wgtmac commented Apr 29, 2026

No description provided.

.max_wait_ms = 10,
.total_timeout_ms = 5000})
.OnlyRetryOn(ErrorKind::kCommitFailed)
.StopRetryOn({ErrorKind::kCommitFailed})
Collaborator

I see StopRetryOn doesn't have a single-error-type overload like OnlyRetryOn. Is that by design?
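For illustration, here is a minimal sketch of what symmetric overloads could look like. `RetryPolicyBuilder`, its members, and the extra `ErrorKind` values are hypothetical stand-ins, not the PR's actual types; the single-kind overload simply forwards to the list overload.

```cpp
#include <initializer_list>
#include <set>

// Hypothetical stand-in for the project's error enum.
enum class ErrorKind { kCommitFailed, kIOError, kInvalidArgument };

// Hypothetical builder; only the overload shape is the point here.
class RetryPolicyBuilder {
 public:
  // Single-kind convenience overload: forwards to the list overload.
  RetryPolicyBuilder& OnlyRetryOn(ErrorKind kind) {
    return OnlyRetryOn(std::initializer_list<ErrorKind>{kind});
  }
  RetryPolicyBuilder& OnlyRetryOn(std::initializer_list<ErrorKind> kinds) {
    only_retry_on_.insert(kinds.begin(), kinds.end());
    return *this;
  }

  // The same pair for StopRetryOn keeps the two APIs symmetric.
  RetryPolicyBuilder& StopRetryOn(ErrorKind kind) {
    return StopRetryOn(std::initializer_list<ErrorKind>{kind});
  }
  RetryPolicyBuilder& StopRetryOn(std::initializer_list<ErrorKind> kinds) {
    stop_retry_on_.insert(kinds.begin(), kinds.end());
    return *this;
  }

  bool RetriesOnlyOn(ErrorKind kind) const { return only_retry_on_.count(kind) > 0; }
  bool StopsOn(ErrorKind kind) const { return stop_retry_on_.count(kind) > 0; }

 private:
  std::set<ErrorKind> only_retry_on_;
  std::set<ErrorKind> stop_retry_on_;
};
```

The explicit `std::initializer_list<ErrorKind>{kind}` in the forwarding call avoids relying on overload ranking between the two signatures.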

Collaborator

@zhjwpku zhjwpku left a comment

LGTM


@kamcheungting-db kamcheungting-db left a comment

Nice refactor.
I have some non-blocking comments to understand the underlying intention.

Comment on lines +165 to +169
enum class RetryPolicyMode {
kUnset,
kOnlyRetryOn,
kStopRetryOn,
};

Could you add one-line comments describing each RetryPolicyMode's behavior?
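As a sketch of the kind of comments being requested, the snippet below annotates each mode and pairs the enum with a small decision function. The semantics are inferred from the enumerator names only and are an assumption, in particular the choice that `kUnset` retries every error; `ShouldRetry` and the extra `ErrorKind` values are hypothetical.

```cpp
#include <set>

// Hypothetical stand-in for the project's error enum.
enum class ErrorKind { kCommitFailed, kIOError, kInvalidArgument };

enum class RetryPolicyMode {
  kUnset,        // No filter configured (assumed here: every error is retryable).
  kOnlyRetryOn,  // Allow-list: retry only the listed error kinds.
  kStopRetryOn,  // Deny-list: retry every error except the listed kinds.
};

// Hypothetical helper: decides whether `error` is retryable under `mode`.
inline bool ShouldRetry(RetryPolicyMode mode, const std::set<ErrorKind>& kinds,
                        ErrorKind error) {
  const bool listed = kinds.count(error) > 0;
  switch (mode) {
    case RetryPolicyMode::kOnlyRetryOn:
      return listed;
    case RetryPolicyMode::kStopRetryOn:
      return !listed;
    case RetryPolicyMode::kUnset:
      return true;
  }
  return true;  // Not reached for valid enumerators; keeps compilers quiet.
}
```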


if (!WaitForNextAttempt(attempt, deadline)) {
return result;
}

Should we record the result details for each failed run?
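One possible shape for this, as a rough sketch: the loop appends a message for every failed attempt so the caller can inspect the full history. `AttemptResult`, `RetryReport`, and `RunWithHistory` are hypothetical names, and backoff and deadline handling are deliberately left out.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical result of a single attempt.
struct AttemptResult {
  bool ok;
  std::string error;  // Only meaningful when ok == false.
};

// Hypothetical report: final outcome plus one entry per failed attempt.
struct RetryReport {
  bool succeeded = false;
  std::vector<std::string> failures;
};

// Runs `task` up to `max_attempts` times and records why each failed run failed.
inline RetryReport RunWithHistory(int max_attempts,
                                  const std::function<AttemptResult()>& task) {
  assert(max_attempts > 0);
  RetryReport report;
  for (int attempt = 1; attempt <= max_attempts; ++attempt) {
    AttemptResult result = task();
    if (result.ok) {
      report.succeeded = true;
      return report;
    }
    report.failures.push_back("attempt " + std::to_string(attempt) + ": " +
                              result.error);
  }
  return report;  // All attempts failed; `failures` holds the full history.
}
```

Keeping the history in a separate report leaves the success path cheap, at the cost of one string allocation per failed attempt.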

bool HasTimedOut(const std::optional<TimePoint>& deadline) const {
return deadline.has_value() && Clock::now() >= *deadline;
}
Status ValidateConfig() const;

Please add a brief comment for this method. Its implementation is not trivial.
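A sketch of what a documented validation routine might look like. The field names `num_retries`, `max_wait_ms`, and `total_timeout_ms` appear in the diff, but their types, the free-function form, the `std::optional<std::string>` return type, and every rule below are assumptions made for illustration, not the PR's actual checks.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical config; field types are assumed.
struct RetryConfig {
  std::int32_t num_retries = 0;
  std::int32_t max_wait_ms = 0;
  std::int32_t total_timeout_ms = 0;
};

/// Checks the config once, before any attempt runs.
/// Returns std::nullopt when the config is usable, otherwise a description
/// of the first violated rule. Rules assumed for this sketch:
///   - num_retries must be >= 0
///   - max_wait_ms must be >= 0
///   - total_timeout_ms must be >= 0, where 0 means "no overall deadline"
///   - with a deadline set, max_wait_ms must not exceed total_timeout_ms
inline std::optional<std::string> ValidateConfig(const RetryConfig& config) {
  if (config.num_retries < 0) {
    return "num_retries must be non-negative";
  }
  if (config.max_wait_ms < 0) {
    return "max_wait_ms must be non-negative";
  }
  if (config.total_timeout_ms < 0) {
    return "total_timeout_ms must be non-negative";
  }
  if (config.total_timeout_ms > 0 && config.max_wait_ms > config.total_timeout_ms) {
    return "max_wait_ms must not exceed total_timeout_ms";
  }
  return std::nullopt;
}
```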

config_.num_retries);
}
if (config_.num_retries == 0) {
return {};

Nitpick: could we explicitly call this out?

Suggested change
- return {};
+ return RetryExhausted(<".... list of failed run reason">);

struct RetryTestHooks {
using Clock = std::chrono::steady_clock;
using Duration = std::chrono::milliseconds;
using TimePoint = Clock::time_point;

These aliases duplicate a few lines of existing code.

Have we considered deduplicating them?
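One way to deduplicate, sketched under assumptions: move the three aliases into a small shared struct and let both users inherit them. `RetryTimeTypes`, `RetryRunner`, and the `slept` member are hypothetical; only the alias names come from the diff.

```cpp
#include <chrono>
#include <type_traits>

// Hypothetical single home for the time-related aliases.
struct RetryTimeTypes {
  using Clock = std::chrono::steady_clock;
  using Duration = std::chrono::milliseconds;
  using TimePoint = Clock::time_point;
};

// The test hooks pick up the aliases instead of redeclaring them.
struct RetryTestHooks : RetryTimeTypes {
  Duration slept{0};  // Hypothetical member: total time the hook pretended to sleep.
};

// Hypothetical runner sharing the same aliases.
class RetryRunner : public RetryTimeTypes {
 public:
  static bool HasTimedOut(TimePoint now, TimePoint deadline) {
    return now >= deadline;
  }
};

// Both classes now agree on the types by construction.
static_assert(std::is_same<RetryTestHooks::TimePoint, RetryRunner::TimePoint>::value,
              "aliases must stay in sync");
```

Namespace-scope aliases in a shared header would work as well and avoid adding a base class to `RetryTestHooks`, which matters if it needs to stay an aggregate before C++17.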


3 participants