
Commit b4d2850

docs: structure improvements (#2538)
1 parent 4581307 commit b4d2850

9 files changed

Lines changed: 243 additions & 143 deletions


docs/human_in_the_loop.md

Lines changed: 1 addition & 1 deletion
@@ -111,7 +111,7 @@ In this example, `prompt_approval` is synchronous because it uses `input()` and
 
 To stream output while waiting for approvals, call `Runner.run_streamed`, consume `result.stream_events()` until it completes, and then follow the same `result.to_state()` and resume steps shown above.
 
-## Other patterns in this repository
+## Repository patterns and examples
 
 - **Streaming approvals**: `examples/agent_patterns/human_in_the_loop_stream.py` shows how to drain `stream_events()` and then approve pending tool calls before resuming with `Runner.run_streamed(agent, state)`.
 - **Agent as tool approvals**: `Agent.as_tool(..., needs_approval=...)` applies the same interruption flow when delegated agent tasks need review.
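The drain-then-resume flow described in this hunk can be sketched with stand-in stubs. `FakeResult` below is a hypothetical placeholder for the SDK's streamed result object; only the method names (`stream_events()`, `to_state()`) come from the docs above, so treat this as a shape sketch, not the real API.

```python
import asyncio
from dataclasses import dataclass, field

# Stand-in for the SDK's streamed result; the real object comes from
# Runner.run_streamed in the `agents` package.
@dataclass
class FakeResult:
    events: list
    pending_approvals: list = field(default_factory=list)

    async def stream_events(self):
        for event in self.events:
            yield event

    def to_state(self):
        # In the SDK, to_state() captures a resumable snapshot of the run.
        return {"pending": list(self.pending_approvals)}

async def drain_then_approve(result):
    """Drain all stream events first, then surface pending approvals."""
    seen = []
    async for event in result.stream_events():
        seen.append(event)           # e.g. text deltas, tool-call notices
    state = result.to_state()        # snapshot used to resume the run
    approved = list(state["pending"])  # approve everything in this sketch
    return seen, approved

result = FakeResult(events=["delta:hi", "tool_call:get_weather"],
                    pending_approvals=["get_weather"])
events, approved = asyncio.run(drain_then_approve(result))
print(events)    # ['delta:hi', 'tool_call:get_weather']
print(approved)  # ['get_weather']
```

The key ordering constraint is that the event stream is fully consumed before approvals are acted on, matching the "drain `stream_events()` then approve" pattern named in the bullet above.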

docs/mcp.md

Lines changed: 15 additions & 0 deletions
@@ -51,6 +51,17 @@ Notes:
 - When `failure_error_function` is unset, the SDK uses the default tool error formatter.
 - Server-level `failure_error_function` overrides `Agent.mcp_config["failure_error_function"]` for that server.
 
+## Shared patterns across transports
+
+After you choose a transport, most integrations need the same follow-up decisions:
+
+- How to expose only a subset of tools ([Tool filtering](#tool-filtering)).
+- Whether the server also provides reusable prompts ([Prompts](#prompts)).
+- Whether `list_tools()` should be cached ([Caching](#caching)).
+- How MCP activity appears in traces ([Tracing](#tracing)).
+
+For local MCP servers (`MCPServerStdio`, `MCPServerSse`, `MCPServerStreamableHttp`), approval policies and per-call `_meta` payloads are also shared concepts. The Streamable HTTP section shows the most complete examples, and the same patterns apply to the other local transports.
+
 ## 1. Hosted MCP server tools
 
 Hosted tools push the entire tool round-trip into OpenAI's infrastructure. Instead of your code listing and calling tools, the
@@ -356,6 +367,10 @@ Key behaviors:
 - Call `reconnect(failed_only=True)` to retry failed servers, or `reconnect(failed_only=False)` to restart all servers.
 - Use `connect_timeout_seconds`, `cleanup_timeout_seconds`, and `connect_in_parallel` to tune lifecycle behavior.
 
+## Common server capabilities
+
+The sections below apply across MCP server transports (with the exact API surface depending on the server class).
+
 ## Tool filtering
 
 Each MCP server supports tool filters so that you can expose only the functions that your agent needs. Filtering can happen at

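The tool-filtering idea from the mcp.md hunks above (exposing only a subset of a server's tools) can be sketched as a plain allow-list filter. The `Tool` class and `filter_tools` helper here are hypothetical stand-ins; the SDK's actual filtering hooks live on each MCP server class.

```python
from dataclasses import dataclass

# Hypothetical stand-in for a listed MCP tool (real tools carry schemas too).
@dataclass(frozen=True)
class Tool:
    name: str

def filter_tools(tools, allowed_names):
    """Expose only the tools an agent actually needs."""
    allowed = set(allowed_names)
    return [t for t in tools if t.name in allowed]

listed = [Tool("read_file"), Tool("write_file"), Tool("delete_file")]
safe = filter_tools(listed, allowed_names={"read_file"})
print([t.name for t in safe])  # ['read_file']
```

The same allow-list shape also suggests where caching fits: a cached `list_tools()` result can be filtered once and reused, rather than re-filtering on every call.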
docs/models/index.md

Lines changed: 14 additions & 2 deletions
@@ -5,6 +5,18 @@ The Agents SDK comes with out-of-the-box support for OpenAI models in two flavor
 
 - **Recommended**: the [`OpenAIResponsesModel`][agents.models.openai_responses.OpenAIResponsesModel], which calls OpenAI APIs using the new [Responses API](https://platform.openai.com/docs/api-reference/responses).
 - The [`OpenAIChatCompletionsModel`][agents.models.openai_chatcompletions.OpenAIChatCompletionsModel], which calls OpenAI APIs using the [Chat Completions API](https://platform.openai.com/docs/api-reference/chat).
 
+## Choosing a model setup
+
+Use this page in the following order depending on your setup:
+
+| Goal | Start here |
+| --- | --- |
+| Use OpenAI-hosted models with SDK defaults | [OpenAI models](#openai-models) |
+| Use OpenAI Responses API over websocket transport | [Responses WebSocket transport](#responses-websocket-transport) |
+| Use non-OpenAI providers | [Non-OpenAI models](#non-openai-models) |
+| Mix models/providers in one workflow | [Advanced model selection and mixing](#advanced-model-selection-and-mixing) and [Mixing models across providers](#mixing-models-across-providers) |
+| Debug provider compatibility issues | [Troubleshooting non-OpenAI providers](#troubleshooting-non-openai-providers) |
+
 ## OpenAI models
 
 When you don't specify a model when initializing an `Agent`, the default model will be used. The default is currently [`gpt-4.1`](https://platform.openai.com/docs/models/gpt-4.1) for compatibility and low latency. If you have access, we recommend setting your agents to [`gpt-5.2`](https://platform.openai.com/docs/models/gpt-5.2) for higher quality while keeping explicit `model_settings`.
@@ -129,7 +141,7 @@ In cases where you do not have an API key from `platform.openai.com`, we recomme
 
 In these examples, we use the Chat Completions API/model, because most LLM providers don't yet support the Responses API. If your LLM provider does support it, we recommend using Responses.
 
-## Mixing and matching models
+## Advanced model selection and mixing
 
 Within a single workflow, you may want to use different models for each agent. For example, you could use a smaller, faster model for triage, while using a larger, more capable model for complex tasks. When configuring an [`Agent`][agents.Agent], you can select a specific model by either:

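The triage/specialist split described in the hunk above can be sketched with a stand-in `Agent` that only carries a model name. The real `agents.Agent` accepts a model per agent; the `gpt-4.1-mini` name here is an illustrative assumption, while `gpt-4.1` and `gpt-5.2` appear elsewhere on this page.

```python
from dataclasses import dataclass

# Stand-in Agent with just name and model; the real agents.Agent takes a
# model name (or Model instance) per agent, so each step can use its own.
@dataclass
class Agent:
    name: str
    model: str

triage_agent = Agent(name="triage", model="gpt-4.1-mini")     # fast routing (assumed name)
specialist_agent = Agent(name="specialist", model="gpt-5.2")  # harder tasks

def pick_agent(task_is_complex: bool) -> Agent:
    """Route simple tasks to the small model, complex ones to the large model."""
    return specialist_agent if task_is_complex else triage_agent

print(pick_agent(False).model)  # gpt-4.1-mini
print(pick_agent(True).model)   # gpt-5.2
```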
@@ -204,7 +216,7 @@ english_agent = Agent(
 )
 ```
 
-## Common issues with using other LLM providers
+## Troubleshooting non-OpenAI providers
 
 ### Tracing client error 401

docs/realtime/guide.md

Lines changed: 12 additions & 8 deletions
@@ -1,4 +1,4 @@
-# Guide
+# Realtime agents guide
 
 This guide provides an in-depth look at building voice-enabled AI agents using the OpenAI Agents SDK's realtime capabilities.

@@ -123,7 +123,9 @@ main_agent = RealtimeAgent(
 )
 ```
 
-## Event handling
+## Runtime behavior and session handling
+
+### Event handling
 
 The session streams events that you can listen to by iterating over the session object. Events include audio output chunks, transcription results, tool execution start and end, agent handoffs, and errors. Key events to handle include:

@@ -136,7 +138,7 @@ The session streams events that you can listen to by iterating over the session
 
 For complete event details, see [`RealtimeSessionEvent`][agents.realtime.events.RealtimeSessionEvent].
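The iterate-over-the-session pattern from this hunk can be sketched with a stand-in async-iterable session. The event type strings below are illustrative placeholders, not the SDK's actual event classes.

```python
import asyncio

# Stand-in session that yields events when iterated, mirroring the shape
# described in the docs (audio chunks, tool activity, errors).
class FakeSession:
    def __init__(self, events):
        self._events = events

    def __aiter__(self):
        self._it = iter(self._events)
        return self

    async def __anext__(self):
        try:
            return next(self._it)
        except StopIteration:
            raise StopAsyncIteration

async def handle_events(session):
    handled = []
    async for event in session:
        kind = event["type"]
        if kind == "audio":
            handled.append("play_chunk")   # feed audio bytes to your player
        elif kind == "tool_start":
            handled.append("tool_running")
        elif kind == "error":
            handled.append("log_error")
    return handled

session = FakeSession([{"type": "audio"}, {"type": "tool_start"}, {"type": "error"}])
handled = asyncio.run(handle_events(session))
print(handled)  # ['play_chunk', 'tool_running', 'log_error']
```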

-## Guardrails
+### Guardrails
 
 Only output guardrails are supported for realtime agents. These guardrails are debounced and run periodically (not on every word) to avoid performance issues during real-time generation. The default debounce length is 100 characters, but this is configurable.

@@ -160,13 +162,15 @@ agent = RealtimeAgent(
 
 When a guardrail is triggered, it generates a `guardrail_tripped` event and can interrupt the agent's current response. The debounce behavior helps balance safety with real-time performance requirements. Unlike text agents, realtime agents do **not** raise an Exception when guardrails are tripped.
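The character-count debounce described in this hunk can be sketched as follows: buffer streamed text and run the guardrail check only each time another `debounce_len` characters accumulate, mirroring the 100-character default. `run_guardrail` is a stand-in, not the SDK's guardrail hook.

```python
def run_guardrail(text: str) -> None:
    pass  # stand-in for the real output guardrail check

def debounced_checks(chunks, debounce_len=100):
    """Run the guardrail once per `debounce_len` newly accumulated characters."""
    buffer = ""
    last_checked = 0
    checks = 0
    for chunk in chunks:
        buffer += chunk
        if len(buffer) - last_checked >= debounce_len:
            run_guardrail(buffer)
            last_checked = len(buffer)
            checks += 1
    return checks

# 250 characters streamed in 25-char chunks -> two full 100-char windows.
chunks = ["x" * 25] * 10
n = debounced_checks(chunks)
print(n)  # 2
```

Debouncing this way trades a little detection latency for far fewer guardrail runs during generation, which is the balance the paragraph above describes.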

-## Audio processing
+### Audio processing
 
 Send audio to the session using [`session.send_audio(audio_bytes)`][agents.realtime.session.RealtimeSession.send_audio] or send text using [`session.send_message()`][agents.realtime.session.RealtimeSession.send_message].

 For audio output, listen for `audio` events and play the audio data through your preferred audio library. Make sure to listen for `audio_interrupted` events to stop playback immediately and clear any queued audio when the user interrupts the agent.
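The interruption handling recommended above can be sketched as a playback queue that is flushed on `audio_interrupted`. The event dicts are illustrative stand-ins for the SDK's event objects.

```python
from collections import deque

def process_audio_events(events):
    """Queue audio chunks; drop everything queued when playback is interrupted."""
    queue = deque()
    log = []
    for event in events:
        if event["type"] == "audio":
            queue.append(event["data"])   # would be fed to your audio player
        elif event["type"] == "audio_interrupted":
            queue.clear()                 # stop playback, discard queued audio
            log.append("interrupted")
    return list(queue), log

pending, log = process_audio_events([
    {"type": "audio", "data": b"chunk1"},
    {"type": "audio", "data": b"chunk2"},
    {"type": "audio_interrupted"},
    {"type": "audio", "data": b"chunk3"},
])
print(pending)  # [b'chunk3']
print(log)      # ['interrupted']
```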

-## SIP integration
+## Advanced integrations and low-level access
+
+### SIP integration
 
 You can attach realtime agents to phone calls that arrive via the [Realtime Calls API](https://platform.openai.com/docs/guides/realtime-sip). The SDK provides [`OpenAIRealtimeSIPModel`][agents.realtime.openai_realtime.OpenAIRealtimeSIPModel], which reuses the same agent flow while negotiating media over SIP.
172176

@@ -195,7 +199,7 @@ async with await runner.run(
195199

196200
When the caller hangs up, the SIP session ends and the realtime connection closes automatically. For a complete telephony example, see [`examples/realtime/twilio_sip`](https://github.com/openai/openai-agents-python/tree/main/examples/realtime/twilio_sip).
197201

198-
## Direct model access
202+
### Direct model access
199203

200204
You can access the underlying model to add custom listeners or perform advanced operations:
201205

@@ -206,11 +210,11 @@ session.model.add_listener(my_custom_listener)
 
 This gives you direct access to the [`RealtimeModel`][agents.realtime.model.RealtimeModel] interface for advanced use cases where you need lower-level control over the connection.
 
-## Examples
+### Examples and further reading
 
 For complete working examples, check out the [examples/realtime directory](https://github.com/openai/openai-agents-python/tree/main/examples/realtime) which includes demos with and without UI components.
 
-## Azure OpenAI endpoint format
+### Azure OpenAI endpoint format
 
 When connecting to Azure OpenAI, use the GA Realtime endpoint format and pass credentials via
 headers in `model_config`:

docs/realtime/quickstart.md

Lines changed: 15 additions & 13 deletions
@@ -105,9 +105,9 @@ def _truncate_str(s: str, max_length: int) -> str:
     return s
 ```
 
-## Complete example
+## Full example (same flow in one file)
 
-Here's a complete working example:
+This is the same quickstart flow rewritten as a single script.
 
 ```python
 import asyncio
@@ -184,7 +184,9 @@ if __name__ == "__main__":
     asyncio.run(main())
 ```
 
-## Configuration options
+## Configuration and deployment notes
+
+Use these options after you have a basic session running.
 
 ### Model settings

@@ -215,15 +217,7 @@ if __name__ == "__main__":
 
 For the full schema, see the API reference for [`RealtimeRunConfig`][agents.realtime.config.RealtimeRunConfig] and [`RealtimeSessionModelSettings`][agents.realtime.config.RealtimeSessionModelSettings].
 
-## Next steps
-
-- [Learn more about realtime agents](guide.md)
-- Check out working examples in the [examples/realtime](https://github.com/openai/openai-agents-python/tree/main/examples/realtime) folder
-- Add tools to your agent
-- Implement handoffs between agents
-- Set up guardrails for safety
-
-## Authentication
+### Authentication
 
 Make sure your OpenAI API key is set in your environment:
@@ -237,7 +231,7 @@ Or pass it directly when creating the session:
 session = await runner.run(model_config={"api_key": "your-api-key"})
 ```
 
-## Azure OpenAI endpoint format
+### Azure OpenAI endpoint format
 
 If you connect to Azure OpenAI instead of OpenAI's default endpoint, pass a GA Realtime URL in
 `model_config["url"]` and set auth headers explicitly.
@@ -264,3 +258,11 @@ session = await runner.run(
 
 Avoid using the legacy beta path (`/openai/realtime?api-version=...`) with realtime agents. The
 SDK expects the GA Realtime interface.
+
+## Next steps
+
+- [Learn more about realtime agents](guide.md)
+- Check out working examples in the [examples/realtime](https://github.com/openai/openai-agents-python/tree/main/examples/realtime) folder
+- Add tools to your agent
+- Implement handoffs between agents
+- Set up guardrails for safety
