429 retry handling ignores Retry-After header in LLM and embedding fetchers #1620

@SirBrenton

The LLM and embedding fetchers classify HTTP 429 as transient and retry it with exponential backoff, the same path used for 5xx responses. The Retry-After header is never read.

Evidence

File: apps/memos-local-plugin/core/llm/fetcher.ts

Line 58:

const transient = resp.status >= 500 || resp.status === 429;

backoff() function, line 236:

const ms = base * 2 ** (attempt - 1) + jitter;

Neither code path reads the Retry-After header. The same pattern appears in apps/memos-local-plugin/core/embedding/fetcher.ts at line 48.

Consequence

Under provider rate limiting, retries can fire before the upstream-requested cooldown expires. Premature retries can prolong the rate limiting, increase the number of failed requests, and waste paid API calls.

Suggested fix

In both fetchers, read Retry-After before computing the backoff delay. Support both header formats: integer seconds and HTTP-date. Use the header value when present, and fall back to the existing exponential backoff when it is absent.
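
A minimal sketch of what this could look like, not the plugin's actual code. The helper names parseRetryAfterMs and retryDelayMs, their parameters, and the choice to fall back when the header cannot be parsed are all assumptions made for illustration. The fallback expression mirrors the formula quoted above from backoff().

```typescript
// Hypothetical helpers; names and signatures are illustrative, not from the repository.

// Parse a Retry-After header value into a delay in milliseconds.
// Returns null when the header is absent or cannot be parsed.
function parseRetryAfterMs(
  value: string | null,
  nowMs: number = Date.now(),
): number | null {
  if (value === null) return null;
  const trimmed = value.trim();
  if (trimmed === "") return null;

  // Integer-seconds form, e.g. "120".
  if (/^\d+$/.test(trimmed)) {
    return Number(trimmed) * 1000;
  }

  // HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
  // Date.parse is lenient about formats; a stricter parser may be preferable.
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return null;

  // A date in the past means no further wait is requested.
  return Math.max(0, dateMs - nowMs);
}

// Choose the delay before the next attempt: the header value when present,
// otherwise the existing exponential backoff (base * 2 ** (attempt - 1) + jitter).
function retryDelayMs(
  retryAfter: string | null,
  attempt: number,
  baseMs: number,
  jitterMs: number,
  nowMs: number = Date.now(),
): number {
  const fromHeader = parseRetryAfterMs(retryAfter, nowMs);
  const fallback = baseMs * 2 ** (attempt - 1) + jitterMs;
  return fromHeader ?? fallback;
}
```

Assuming resp is a Fetch API Response, the header could be passed in as resp.headers.get("retry-after"), which returns null when the header is missing.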

Related pattern

retry-after-ignored-under-concurrency
Corpus reference: https://github.com/SirBrenton/pitstop-truth

cc @CaralHsi @Ki-Seki
