
Commit 8ccdea7

docs: add retry settings (#2655)
1 parent 1580644 commit 8ccdea7

5 files changed

Lines changed: 97 additions & 0 deletions


docs/examples.md

Lines changed: 2 additions & 0 deletions
@@ -27,6 +27,8 @@ Check out a variety of sample implementations of the SDK in the examples section

- Prompt templates
- File handling (local and remote, images and PDFs)
- Usage tracking
- Runner-managed retry settings (`examples/basic/retry.py`)
- Runner-managed retries with LiteLLM (`examples/basic/retry_litellm.py`)
- Non-strict output types
- Previous response ID usage

docs/models/index.md

Lines changed: 86 additions & 0 deletions
@@ -268,6 +268,7 @@ When you are using the OpenAI Responses API, several request fields already have

| `prompt_cache_retention` | Keep cached prompt prefixes around longer, for example with `"24h"`. |
| `response_include` | Request richer response payloads such as `web_search_call.action.sources`, `file_search_call.results`, or `reasoning.encrypted_content`. |
| `top_logprobs` | Request top-token logprobs for output text. The SDK also adds `message.output_text.logprobs` automatically. |
| `retry` | Opt in to runner-managed retry settings for model calls. See [Runner-managed retries](#runner-managed-retries). |

```python
from agents import Agent, ModelSettings

research_agent = Agent(
    # ... (body elided between diff hunks) ...
)
```

#### Runner-managed retries

Retries are runtime-only and opt in. The SDK does not retry general model requests unless you set `ModelSettings(retry=...)` and your retry policy chooses to retry.

```python
from agents import Agent, ModelRetrySettings, ModelSettings, retry_policies

agent = Agent(
    name="Assistant",
    model="gpt-5.4",
    model_settings=ModelSettings(
        retry=ModelRetrySettings(
            max_retries=4,
            backoff={
                "initial_delay": 0.5,
                "max_delay": 5.0,
                "multiplier": 2.0,
                "jitter": True,
            },
            policy=retry_policies.any(
                retry_policies.provider_suggested(),
                retry_policies.retry_after(),
                retry_policies.network_error(),
                retry_policies.http_status([408, 409, 429, 500, 502, 503, 504]),
            ),
        )
    ),
)
```

`ModelRetrySettings` has three fields:

| Field | Type | Notes |
| --- | --- | --- |
| `max_retries` | `int \| None` | Number of retry attempts allowed after the initial request. |
| `backoff` | `ModelRetryBackoffSettings \| dict \| None` | Default delay strategy when the policy retries without returning an explicit delay. |
| `policy` | `RetryPolicy \| None` | Callback that decides whether to retry. This field is runtime-only and is not serialized. |

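The backoff fields shown above describe a standard exponential backoff with jitter. As a minimal sketch of how such a delay is typically computed (the field names come from the table; the exact formula and jitter strategy here are assumptions, not the SDK's implementation):

```python
import random

def backoff_delay(attempt: int,
                  initial_delay: float = 0.5,
                  max_delay: float = 5.0,
                  multiplier: float = 2.0,
                  jitter: bool = True) -> float:
    """Delay before retry number `attempt` (1-based), capped at max_delay."""
    delay = min(max_delay, initial_delay * multiplier ** (attempt - 1))
    if jitter:
        # Full jitter: pick uniformly in [0, delay] to de-synchronize clients.
        delay = random.uniform(0.0, delay)
    return delay
```

With the defaults above, un-jittered delays grow 0.5s, 1.0s, 2.0s, 4.0s, then stay capped at 5.0s.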
A retry policy receives a [`RetryPolicyContext`][agents.retry.RetryPolicyContext] with:

- `attempt` and `max_retries` so you can make attempt-aware decisions.
- `stream` so you can branch between streamed and non-streamed behavior.
- `error` for raw inspection.
- `normalized` facts such as `status_code`, `retry_after`, `error_code`, `is_network_error`, `is_timeout`, and `is_abort`.
- `provider_advice` when the underlying model adapter can supply retry guidance.

The policy can return either:

- `True` / `False` for a simple retry decision.
- A [`RetryDecision`][agents.retry.RetryDecision] when you want to override the delay or attach a diagnostic reason.

The SDK exports ready-made helpers on `retry_policies`:

| Helper | Behavior |
| --- | --- |
| `retry_policies.never()` | Always opts out. |
| `retry_policies.provider_suggested()` | Follows provider retry advice when available. |
| `retry_policies.network_error()` | Matches transient transport and timeout failures. |
| `retry_policies.http_status([...])` | Matches selected HTTP status codes. |
| `retry_policies.retry_after()` | Retries only when a retry-after hint is available, using that delay. |
| `retry_policies.any(...)` | Retries when any nested policy opts in. |
| `retry_policies.all(...)` | Retries only when every nested policy opts in. |

When you compose policies, `provider_suggested()` is the safest first building block because it preserves provider vetoes and replay-safety approvals when the provider can distinguish them.

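The `any(...)` / `all(...)` composition semantics in the table can be sketched with plain predicate combinators. These are hypothetical stand-ins written to illustrate the documented behavior, not the SDK's helpers:

```python
from typing import Callable

# A policy here is a predicate over a plain dict of normalized error facts;
# any_of retries when one nested policy opts in, all_of only when every one does.
Policy = Callable[[dict], bool]

def http_status(codes: list[int]) -> Policy:
    return lambda ctx: ctx.get("status_code") in codes

def network_error() -> Policy:
    return lambda ctx: bool(ctx.get("is_network_error"))

def any_of(*policies: Policy) -> Policy:
    return lambda ctx: any(p(ctx) for p in policies)

def all_of(*policies: Policy) -> Policy:
    return lambda ctx: all(p(ctx) for p in policies)
```

For example, `any_of(network_error(), http_status([429, 503]))` opts in on either a transport failure or a matching status code, while `all_of(...)` would require both.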
##### Safety boundaries

Some failures are never retried automatically:

- Abort errors.
- Requests where provider advice marks replay as unsafe.
- Streamed runs after output has already started in a way that would make replay unsafe.

Stateful follow-up requests using `previous_response_id` or `conversation_id` are also treated more conservatively. For those requests, non-provider predicates such as `network_error()` or `http_status([500])` are not enough by themselves. The retry policy should include a replay-safe approval from the provider, typically via `retry_policies.provider_suggested()`.

##### Runner and agent merge behavior

`retry` is deep-merged between runner-level and agent-level `ModelSettings`:

- An agent can override only `retry.max_retries` and still inherit the runner's `policy`.
- An agent can override only part of `retry.backoff` and keep sibling backoff fields from the runner.
- `policy` is runtime-only, so serialized `ModelSettings` keep `max_retries` and `backoff` but omit the callback itself.

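The deep-merge behavior described above can be sketched as a recursive dict merge, where agent-level leaves win but sibling runner-level fields survive (a simplified illustration, not the SDK's merge code):

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge override into base; override wins on leaf conflicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

runner_retry = {
    "max_retries": 4,
    "backoff": {"initial_delay": 0.5, "max_delay": 5.0},
}
agent_retry = {"backoff": {"max_delay": 10.0}}  # override one sibling only

merged = deep_merge(runner_retry, agent_retry)
# The agent's max_delay wins; max_retries and initial_delay are inherited.
```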
For fuller examples, see [`examples/basic/retry.py`](https://github.com/openai/openai-agents-python/tree/main/examples/basic/retry.py) and [`examples/basic/retry_litellm.py`](https://github.com/openai/openai-agents-python/tree/main/examples/basic/retry_litellm.py).

Use `extra_args` when you need provider-specific or newer request fields that the SDK does not expose directly at the top level yet.

Also, when you use OpenAI's Responses API, [there are a few other optional parameters](https://platform.openai.com/docs/api-reference/responses/create) (e.g., `user`, `service_tier`, and so on). If they are not available at the top level, you can use `extra_args` to pass them as well.

docs/ref/retry.md

Lines changed: 3 additions & 0 deletions

@@ -0,0 +1,3 @@

# `Retry`

::: agents.retry

Lines changed: 3 additions & 0 deletions

@@ -0,0 +1,3 @@

# `Model Retry`

::: agents.run_internal.model_retry

docs/running_agents.md

Lines changed: 3 additions & 0 deletions

@@ -370,6 +370,9 @@ settings so the resumed turn continues in the same server-managed conversation.

`previous_response_id`, or `auto_previous_response_id`), the SDK also performs a best-effort rollback of recently persisted input items to reduce duplicate history entries after a retry.

This compatibility retry happens even if you do not configure `ModelSettings.retry`. For broader opt-in retry behavior on model requests, see [Runner-managed retries](models/index.md#runner-managed-retries).

## Hooks and customization

### Call model input filter
