
Commit 8ccdea7

docs: add retry settings (#2655)
1 parent 1580644 commit 8ccdea7

5 files changed

Lines changed: 97 additions & 0 deletions


docs/examples.md

Lines changed: 2 additions & 0 deletions
@@ -27,6 +27,8 @@ Check out a variety of sample implementations of the SDK in the examples section

- Prompt templates
- File handling (local and remote, images and PDFs)
- Usage tracking
- Runner-managed retry settings (`examples/basic/retry.py`)
- Runner-managed retries with LiteLLM (`examples/basic/retry_litellm.py`)
- Non-strict output types
- Previous response ID usage

docs/models/index.md

Lines changed: 86 additions & 0 deletions
@@ -268,6 +268,7 @@ When you are using the OpenAI Responses API, several request fields already have

| `prompt_cache_retention` | Keep cached prompt prefixes around longer, for example with `"24h"`. |
| `response_include` | Request richer response payloads such as `web_search_call.action.sources`, `file_search_call.results`, or `reasoning.encrypted_content`. |
| `top_logprobs` | Request top-token logprobs for output text. The SDK also adds `message.output_text.logprobs` automatically. |
| `retry` | Opt in to runner-managed retry settings for model calls. See [Runner-managed retries](#runner-managed-retries). |

```python
from agents import Agent, ModelSettings

research_agent = Agent(
    # ... (body elided between diff hunks) ...
)
```

#### Runner-managed retries

Retries are runtime-only and opt in. The SDK does not retry general model requests unless you set `ModelSettings(retry=...)` and your retry policy chooses to retry.

```python
from agents import Agent, ModelRetrySettings, ModelSettings, retry_policies

agent = Agent(
    name="Assistant",
    model="gpt-5.4",
    model_settings=ModelSettings(
        retry=ModelRetrySettings(
            max_retries=4,
            backoff={
                "initial_delay": 0.5,
                "max_delay": 5.0,
                "multiplier": 2.0,
                "jitter": True,
            },
            policy=retry_policies.any(
                retry_policies.provider_suggested(),
                retry_policies.retry_after(),
                retry_policies.network_error(),
                retry_policies.http_status([408, 409, 429, 500, 502, 503, 504]),
            ),
        )
    ),
)
```

`ModelRetrySettings` has three fields:

| Field | Type | Notes |
| --- | --- | --- |
| `max_retries` | `int \| None` | Number of retry attempts allowed after the initial request. |
| `backoff` | `ModelRetryBackoffSettings \| dict \| None` | Default delay strategy when the policy retries without returning an explicit delay. |
| `policy` | `RetryPolicy \| None` | Callback that decides whether to retry. This field is runtime-only and is not serialized. |

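The backoff fields shown above describe a standard exponential backoff with jitter. As a minimal sketch of how such a delay is typically computed (the field names come from the table; the exact formula and jitter strategy here are assumptions, not the SDK's implementation):

```python
import random

def backoff_delay(attempt: int,
                  initial_delay: float = 0.5,
                  max_delay: float = 5.0,
                  multiplier: float = 2.0,
                  jitter: bool = True) -> float:
    """Delay before retry number `attempt` (1-based), capped at max_delay."""
    delay = min(max_delay, initial_delay * multiplier ** (attempt - 1))
    if jitter:
        # Full jitter: pick uniformly in [0, delay] to de-synchronize clients.
        delay = random.uniform(0.0, delay)
    return delay
```

With the defaults above, un-jittered delays grow 0.5s, 1.0s, 2.0s, 4.0s, then stay capped at 5.0s.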
A retry policy receives a [`RetryPolicyContext`][agents.retry.RetryPolicyContext] with:

- `attempt` and `max_retries` so you can make attempt-aware decisions.
- `stream` so you can branch between streamed and non-streamed behavior.
- `error` for raw inspection.
- `normalized` facts such as `status_code`, `retry_after`, `error_code`, `is_network_error`, `is_timeout`, and `is_abort`.
- `provider_advice` when the underlying model adapter can supply retry guidance.

The policy can return either:

- `True` / `False` for a simple retry decision.
- A [`RetryDecision`][agents.retry.RetryDecision] when you want to override the delay or attach a diagnostic reason.

The SDK exports ready-made helpers on `retry_policies`:

| Helper | Behavior |
| --- | --- |
| `retry_policies.never()` | Always opts out. |
| `retry_policies.provider_suggested()` | Follows provider retry advice when available. |
| `retry_policies.network_error()` | Matches transient transport and timeout failures. |
| `retry_policies.http_status([...])` | Matches selected HTTP status codes. |
| `retry_policies.retry_after()` | Retries only when a retry-after hint is available, using that delay. |
| `retry_policies.any(...)` | Retries when any nested policy opts in. |
| `retry_policies.all(...)` | Retries only when every nested policy opts in. |

When you compose policies, `provider_suggested()` is the safest first building block because it preserves provider vetoes and replay-safety approvals when the provider can distinguish them.

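The `any(...)` / `all(...)` composition semantics in the table can be sketched with plain predicate combinators. These are hypothetical stand-ins written to illustrate the documented behavior, not the SDK's helpers:

```python
from typing import Callable

# A policy here is a predicate over a plain dict of normalized error facts;
# any_of retries when one nested policy opts in, all_of only when every one does.
Policy = Callable[[dict], bool]

def http_status(codes: list[int]) -> Policy:
    return lambda ctx: ctx.get("status_code") in codes

def network_error() -> Policy:
    return lambda ctx: bool(ctx.get("is_network_error"))

def any_of(*policies: Policy) -> Policy:
    return lambda ctx: any(p(ctx) for p in policies)

def all_of(*policies: Policy) -> Policy:
    return lambda ctx: all(p(ctx) for p in policies)
```

For example, `any_of(network_error(), http_status([429, 503]))` opts in on either a transport failure or a matching status code, while `all_of(...)` would require both.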
##### Safety boundaries

Some failures are never retried automatically:

- Abort errors.
- Requests where provider advice marks replay as unsafe.
- Streamed runs after output has already started in a way that would make replay unsafe.

Stateful follow-up requests using `previous_response_id` or `conversation_id` are also treated more conservatively. For those requests, non-provider predicates such as `network_error()` or `http_status([500])` are not enough by themselves. The retry policy should include a replay-safe approval from the provider, typically via `retry_policies.provider_suggested()`.

##### Runner and agent merge behavior

`retry` is deep-merged between runner-level and agent-level `ModelSettings`:

- An agent can override only `retry.max_retries` and still inherit the runner's `policy`.
- An agent can override only part of `retry.backoff` and keep sibling backoff fields from the runner.
- `policy` is runtime-only, so serialized `ModelSettings` keep `max_retries` and `backoff` but omit the callback itself.

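The deep-merge behavior described above can be sketched as a recursive dict merge, where agent-level leaves win but sibling runner-level fields survive (a simplified illustration, not the SDK's merge code):

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge override into base; override wins on leaf conflicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

runner_retry = {
    "max_retries": 4,
    "backoff": {"initial_delay": 0.5, "max_delay": 5.0},
}
agent_retry = {"backoff": {"max_delay": 10.0}}  # override one sibling only

merged = deep_merge(runner_retry, agent_retry)
# The agent's max_delay wins; max_retries and initial_delay are inherited.
```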
For fuller examples, see [`examples/basic/retry.py`](https://github.com/openai/openai-agents-python/tree/main/examples/basic/retry.py) and [`examples/basic/retry_litellm.py`](https://github.com/openai/openai-agents-python/tree/main/examples/basic/retry_litellm.py).

Use `extra_args` when you need provider-specific or newer request fields that the SDK does not expose directly at the top level yet.

Also, when you use OpenAI's Responses API, [there are a few other optional parameters](https://platform.openai.com/docs/api-reference/responses/create) (e.g., `user`, `service_tier`, and so on). If they are not available at the top level, you can use `extra_args` to pass them as well.

docs/ref/retry.md

Lines changed: 3 additions & 0 deletions

@@ -0,0 +1,3 @@

# `Retry`

::: agents.retry

Lines changed: 3 additions & 0 deletions

@@ -0,0 +1,3 @@

# `Model Retry`

::: agents.run_internal.model_retry

docs/running_agents.md

Lines changed: 3 additions & 0 deletions

@@ -370,6 +370,9 @@ settings so the resumed turn continues in the same server-managed conversation.

`previous_response_id`, or `auto_previous_response_id`), the SDK also performs a best-effort rollback of recently persisted input items to reduce duplicate history entries after a retry.

This compatibility retry happens even if you do not configure `ModelSettings.retry`. For broader opt-in retry behavior on model requests, see [Runner-managed retries](models/index.md#runner-managed-retries).

## Hooks and customization

### Call model input filter
