docs: updates for #3128 (#3129)

seratch · web-flow · commit 02a6b2115169 · 2026-05-06T09:37:58.000+09:00
diff --git a/docs/models/index.md b/docs/models/index.md
@@ -313,6 +313,7 @@ When you are using the OpenAI Responses API, several request fields already have
 - `parallel_tool_calls`: Allow or forbid multiple tool calls in the same turn.
 - `truncation`: Set `"auto"` to let the Responses API drop the oldest conversation items instead of failing when context would overflow.
 - `store`: Control whether the generated response is stored server-side for later retrieval. This matters for follow-up workflows that rely on response IDs, and for session compaction flows that may need to fall back to local input when `store=False`.
+- `context_management`: Configure server-side context handling such as Responses compaction with `compact_threshold`.
 - `prompt_cache_retention`: Keep cached prompt prefixes around longer, for example with `"24h"`.
 - `response_include`: Request richer response payloads such as `web_search_call.action.sources`, `file_search_call.results`, or `reasoning.encrypted_content`.
 - `top_logprobs`: Request top-token logprobs for output text. The SDK also adds `message.output_text.logprobs` automatically.
@@ -328,6 +329,7 @@ research_agent = Agent(
         parallel_tool_calls=False,
         truncation="auto",
         store=True,
+        context_management=[{"type": "compaction", "compact_threshold": 200000}],
         prompt_cache_retention="24h",
         response_include=["web_search_call.action.sources"],
         top_logprobs=5,
@@ -337,11 +339,13 @@ research_agent = Agent(
 
 When you set `store=False`, the Responses API does not keep that response available for later server-side retrieval. This is useful for stateless or zero-data-retention style flows, but it also means features that would otherwise reuse response IDs need to rely on locally managed state instead. For example, [`OpenAIResponsesCompactionSession`][agents.memory.openai_responses_compaction_session.OpenAIResponsesCompactionSession] switches its default `"auto"` compaction path to input-based compaction when the last response was not stored. See the [Sessions guide](../sessions/index.md#openai-responses-compaction-sessions).
 
+Server-side compaction is different from [`OpenAIResponsesCompactionSession`][agents.memory.openai_responses_compaction_session.OpenAIResponsesCompactionSession]. `context_management=[{"type": "compaction", "compact_threshold": ...}]` is sent with each Responses API request, and the API can emit compaction items as part of the response when the rendered context crosses the threshold. `OpenAIResponsesCompactionSession` calls the standalone `responses.compact` endpoint between turns and rewrites the local session history.
+
 ### Passing `extra_args`
 
 Use `extra_args` when you need provider-specific or newer request fields that the SDK does not expose directly at the top level yet.
 
-Also, when you use OpenAI's Responses API, [there are a few other optional parameters](https://platform.openai.com/docs/api-reference/responses/create) (e.g., `user`, `service_tier`, and so on). If they are not available at the top level, you can use `extra_args` to pass them as well.
+Also, when you use OpenAI's Responses API, [there are a few other optional parameters](https://platform.openai.com/docs/api-reference/responses/create) (e.g., `user`, `service_tier`, and so on). If they are not available at the top level, you can use `extra_args` to pass them as well. Do not also set the same request field through a direct `ModelSettings` field.
 
 ```python
 from agents import Agent, ModelSettings