Skip to content

Add call-level retry to Provider.complete()#151

Merged
chris-colinsky merged 1 commit into
mainfrom
feature/0050-call-level-retry
Jun 11, 2026
Merged

Add call-level retry to Provider.complete()#151
chris-colinsky merged 1 commit into
mainfrom
feature/0050-call-level-retry

Conversation

@chris-colinsky

Copy link
Copy Markdown
Member

Final PR in the proposal 0050 series (after #149 failure-isolation middleware and #150 the RetryConfig refactor). Adds call-level retry on the LLM provider.

What

Provider.complete() gains an optional retry: RetryConfig | None parameter. When supplied, the wire call is retried in-call on transient provider errors per the config (classifier, backoff, on_retry, max_attempts), so a node that issues several LLM calls in a loop (chunked processing, multi-step) does not re-run the already-successful calls when a later call hits a transient failure.

The implementation wraps only the wire call in a small retry helper, so the rest of complete() is unchanged:

  • The request is built and validated once; pre-send validation errors (bad request, auth) are never retried.
  • The call stays terminal-only on the observability surface: exactly one LlmCompletionEvent (eventual success) or LlmFailedEvent (exhaustion or a non-transient error) fires per complete() call, with a single call_id shared across attempts. Intermediate transient retries emit no event.
  • The classifier receives None for state at the call boundary (the default classifier ignores it).

Deferred (conformance partial)

§7.1's per-attempt span surface (N per-attempt openarmature.llm.complete spans + the openarmature.llm.attempt_index attribute) is deferred to a future within-call sub-event (LlmRetryAttemptEvent): the python LLM span is rendered from the terminal typed event, so per-attempt spans need a dedicated signal. conformance.toml marks proposal 0050 partial accordingly. Retry visibility in the meantime is the on_retry callback plus logs.

Scope

  • No spec-pin change: proposal 0050 is already within the v0.53.0 pin.
  • This completes 0050: both primitives (failure isolation + call-level retry) ship; only the per-attempt-span sub-event remains, scoped to a future cycle.

Tests

Four call-level-retry tests (transient-then-success, exhaustion emits one failed event, non-transient skips retry, on_retry fires per attempt), each asserting the terminal-only event surface and the attempt count. Full suite green (1265 passed), pyright and ruff clean, docs build clean.

Add an optional retry: RetryConfig parameter to complete(). When set,
the wire call is retried in-call on transient provider errors per the
config, so a node issuing several LLM calls in a loop does not re-run
already-successful calls when a later call hits a transient failure.

The request is built and validated once (pre-send errors are never
retried) and the call stays terminal-only on the observability surface:
exactly one LlmCompletionEvent or LlmFailedEvent fires per complete()
call, with one call_id across attempts. The per-attempt span surface is
deferred to a future sub-event; conformance.toml marks 0050 partial.

Final piece of proposal 0050 (after failure isolation + the RetryConfig
refactor). No spec-pin change.
Copilot AI review requested due to automatic review settings June 11, 2026 00:55

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds optional call-level retry support to the LLM provider complete() API so transient failures can be retried without re-running surrounding node work, while preserving a terminal-only observability event surface.

Changes:

  • Add retry: RetryConfig | None to Provider.complete() and implement an in-call retry loop around the wire call for OpenAIProvider.
  • Add unit tests covering transient-then-success, exhaustion behavior, non-transient skip, and on_retry callbacks.
  • Document call-level retry usage and update conformance/changelog to mark proposal 0050 as partial (per-attempt spans deferred).

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.

Show a summary per file
File Description
tests/unit/test_llm_provider.py Adds call-level retry unit tests validating attempt counts and terminal-only events.
src/openarmature/llm/providers/openai.py Implements complete(retry=...) via _do_complete_with_retry around the wire call.
src/openarmature/llm/provider.py Extends the Provider protocol to include the optional retry parameter and documents it.
docs/concepts/llms.md Documents call-level retry and contrasts it with node-level retry.
conformance.toml Marks proposal 0050 as partial and documents deferred per-attempt span surface.
CHANGELOG.md Records the new call-level retry feature in the Added section.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@chris-colinsky chris-colinsky merged commit 433632b into main Jun 11, 2026
7 checks passed
@chris-colinsky chris-colinsky deleted the feature/0050-call-level-retry branch June 11, 2026 02:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants