Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
65 changes: 65 additions & 0 deletions src/lib/utils/retry.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,65 @@
/**
* Generic retry utility for transient failures (API throttling, network errors, etc.)
*/

export interface RetryOptions {
maxRetries: number;
baseDelayMs: number;
maxDelayMs: number;
retryableErrors?: string[];
}

const DEFAULT_OPTIONS: RetryOptions = {
maxRetries: 3,
baseDelayMs: 1000,
maxDelayMs: 30000,
};

/**
* Retry an async operation with exponential backoff and jitter.
*/
export async function withRetry<T>(
fn: () => Promise<T>,
options: Partial<RetryOptions> = {},
): Promise<T> {
const opts = { ...DEFAULT_OPTIONS, ...options };
let lastError: Error | undefined;

for (let attempt = 0; attempt < opts.maxRetries; attempt++) {
try {

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maxRetries semantics are misleading. With the current loop (for (let attempt = 0; attempt < opts.maxRetries; attempt++)), maxRetries: 3 means 3 total attempts — i.e., 1 initial call + 2 retries — not 1 initial call + 3 retries as the name implies. Callers reading the type will almost certainly pick the wrong value.

Pick one of:

  1. Rename the option to maxAttempts so the field matches the behavior.
  2. Keep the maxRetries name and change the loop to attempt <= opts.maxRetries (so maxRetries: 3 → 4 total attempts).

Either is fine, but the current name/behavior mismatch is a foot-gun worth resolving before the util gets adopted.

return await fn();
} catch (err) {
lastError = err as Error;

if (opts.retryableErrors && opts.retryableErrors.length > 0) {
const isRetryable = opts.retryableErrors.some((code) => lastError!.message.includes(code));
if (!isRetryable) {
throw lastError;
}
}

const delay = Math.min(opts.baseDelayMs * Math.pow(2, attempt), opts.maxDelayMs);
await sleep(delay);
}
}

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing jitter. The PR description says "exponential backoff and jitter," but the delay calculation is purely deterministic:

const delay = Math.min(opts.baseDelayMs * Math.pow(2, attempt), opts.maxDelayMs);

Without jitter, concurrent callers that fail at the same time (e.g., a burst of API calls hitting a throttling limit) will all retry in lockstep, defeating much of the point of backoff for throttling scenarios. Options:

  1. Add full jitter: const delay = Math.random() * Math.min(opts.baseDelayMs * Math.pow(2, attempt), opts.maxDelayMs);
  2. Add equal/decorrelated jitter (see the classic AWS Architecture Blog post on backoff/jitter).
  3. If jitter isn't actually wanted, update the doc comment and PR description to drop the claim.


Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary sleep after the final attempt. On the last iteration (attempt === maxRetries - 1), if fn() throws, the loop still runs await sleep(delay) before exiting and throwing lastError. That's wasted wall-clock time (up to maxDelayMs, i.e. 30s with defaults) with no retry following it.

Fix: skip the sleep on the last attempt, e.g.:

if (attempt < opts.maxRetries - 1) {
  const delay = /* ... */;
  await sleep(delay);
}

throw lastError;

Check failure on line 46 in src/lib/utils/retry.ts

View workflow job for this annotation

GitHub Actions / lint

Expected an error object to be thrown
}

/**
* Parse a retry-after header value to milliseconds.
*/
export function parseRetryAfter(headerValue: string): number {
const seconds = parseInt(headerValue);
if (!isNaN(seconds)) {
return seconds * 1000;
}

// Try parsing as HTTP date
const date = new Date(headerValue);
return date.getTime() - Date.now();

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parseRetryAfter can return negative or NaN milliseconds.

  • If headerValue is an HTTP date in the past (e.g., a slightly skewed clock), date.getTime() - Date.now() is negative.
  • If headerValue is neither a number nor a parseable date, new Date(headerValue).getTime() is NaN, so the return value is NaN.

Callers are likely to pass the result straight to sleep() / setTimeout, where NaN or negatives get coerced to 0 (fine) or can cause bugs downstream in logic that compares against maxDelayMs, etc.

Suggested fix: clamp to >= 0 and fall back on invalid input, e.g.:

export function parseRetryAfter(headerValue: string): number {
  const seconds = parseInt(headerValue, 10);
  if (!isNaN(seconds)) return Math.max(0, seconds * 1000);

  const ms = new Date(headerValue).getTime() - Date.now();
  return Number.isFinite(ms) ? Math.max(0, ms) : 0;
}

(Also, pass a radix to parseInt to satisfy common lint rules.)

}

function sleep(ms: number): Promise<void> {
return new Promise((resolve) => setTimeout(resolve, ms));
}
Loading