diff --git a/docs/llms-full.txt b/docs/llms-full.txt
index d6996ee332..dddad2545d 100644
--- a/docs/llms-full.txt
+++ b/docs/llms-full.txt
@@ -38,8 +38,7 @@ The Agents SDK delivers a focused set of Python primitives—agents, tools, guar
 - [Realtime guide](https://openai.github.io/openai-agents-python/realtime/guide/): Deep dive into realtime session lifecycle, structured input, approvals, interruptions, and low-level transport control.
 
 ## Models and Provider Integrations
-- [Model catalog](https://openai.github.io/openai-agents-python/models/): Lists supported OpenAI and partner models with guidance on selecting capabilities for different workloads.
-- [LiteLLM integration](https://openai.github.io/openai-agents-python/models/litellm/): Configure LiteLLM as a provider, map model aliases, and route requests across heterogeneous backends.
+- [Model catalog](https://openai.github.io/openai-agents-python/models/): Covers OpenAI model selection, non-OpenAI provider patterns, websocket transport, and the SDK's best-effort LiteLLM guidance in one place.
 
 ## API Reference – Agents SDK Core
 - [API index](https://openai.github.io/openai-agents-python/ref/index/): Directory of all documented modules, classes, and functions in the SDK.
diff --git a/docs/llms.txt b/docs/llms.txt
index cbd6312a3f..a96401c0c0 100644
--- a/docs/llms.txt
+++ b/docs/llms.txt
@@ -52,8 +52,7 @@ The SDK focuses on a concise set of primitives so you can orchestrate multi-agen
 - [Extensions](https://openai.github.io/openai-agents-python/ref/extensions/handoff_filters/): Extend the SDK with custom handoff filters, prompts, LiteLLM integration, and SQLAlchemy session memory.
 
 ## Models and Providers
-- [Model catalog](https://openai.github.io/openai-agents-python/models/): Overview of supported model families and configuration guidance.
-- [LiteLLM integration](https://openai.github.io/openai-agents-python/models/litellm/): Configure LiteLLM as a provider to fan out across multiple model backends.
+- [Model catalog](https://openai.github.io/openai-agents-python/models/): Overview of OpenAI models, non-OpenAI provider patterns, websocket transport, and the SDK's best-effort LiteLLM guidance.
 
 ## Optional
 - [Release notes](https://openai.github.io/openai-agents-python/release/): Track SDK changes, migration notes, and deprecations.
diff --git a/docs/models/index.md b/docs/models/index.md
index c510f0dd85..3ec1b573c8 100644
--- a/docs/models/index.md
+++ b/docs/models/index.md
@@ -7,18 +7,21 @@ The Agents SDK comes with out-of-the-box support for OpenAI models in two flavor
 
 ## Choosing a model setup
 
-Use this page in the following order depending on your setup:
+Start with the simplest path that fits your setup:
 
-| Goal | Start here |
-| --- | --- |
-| Use OpenAI-hosted models with SDK defaults | [OpenAI models](#openai-models) |
-| Use OpenAI Responses API over websocket transport | [Responses WebSocket transport](#responses-websocket-transport) |
-| Use non-OpenAI providers | [Non-OpenAI models](#non-openai-models) |
-| Mix models/providers in one workflow | [Advanced model selection and mixing](#advanced-model-selection-and-mixing) and [Mixing models across providers](#mixing-models-across-providers) |
-| Debug provider compatibility issues | [Troubleshooting non-OpenAI providers](#troubleshooting-non-openai-providers) |
+| If you are trying to... | Recommended path | Read more |
+| --- | --- | --- |
+| Use OpenAI models only | Use the default OpenAI provider with the Responses model path | [OpenAI models](#openai-models) |
+| Use OpenAI Responses API over websocket transport | Keep the Responses model path and enable websocket transport | [Responses WebSocket transport](#responses-websocket-transport) |
+| Use one non-OpenAI provider | Start with the built-in provider integration points | [Non-OpenAI models](#non-openai-models) |
+| Mix models or providers across agents | Select providers per run or per agent and review feature differences | [Mixing models in one workflow](#mixing-models-in-one-workflow) and [Mixing models across providers](#mixing-models-across-providers) |
+| Tune advanced OpenAI Responses request settings | Use `ModelSettings` on the OpenAI Responses path | [Advanced OpenAI Responses settings](#advanced-openai-responses-settings) |
+| Use LiteLLM for non-OpenAI Chat Completions providers | Treat LiteLLM as a beta fallback | [LiteLLM](#litellm) |
 
 ## OpenAI models
 
+For most OpenAI-only apps, use string model names with the default OpenAI provider and stay on the Responses model path.
+
 When you don't specify a model when initializing an `Agent`, the default model will be used. The default is currently [`gpt-4.1`](https://developers.openai.com/api/docs/models/gpt-4.1) for compatibility and low latency. If you have access, we recommend setting your agents to [`gpt-5.4`](https://developers.openai.com/api/docs/models/gpt-5.4) for higher quality while keeping explicit `model_settings`.
 
 If you want to switch to other models like [`gpt-5.4`](https://developers.openai.com/api/docs/models/gpt-5.4), there are two ways to configure your agents.
@@ -97,6 +100,8 @@ These features are rejected on Chat Completions models and on non-Responses back
 
 By default, OpenAI Responses API requests use HTTP transport. You can opt in to websocket transport when using OpenAI-backed models.
 
+#### Basic setup
+
 ```python
 from agents import set_default_openai_responses_transport
 
@@ -107,6 +112,8 @@ This affects OpenAI Responses models resolved by the default OpenAI provider (in
 
 Transport selection happens when the SDK resolves a model name into a model instance. If you pass a concrete [`Model`][agents.models.interface.Model] object, its transport is already fixed: [`OpenAIResponsesWSModel`][agents.models.openai_responses.OpenAIResponsesWSModel] uses websocket, [`OpenAIResponsesModel`][agents.models.openai_responses.OpenAIResponsesModel] uses HTTP, and [`OpenAIChatCompletionsModel`][agents.models.openai_chatcompletions.OpenAIChatCompletionsModel] stays on Chat Completions. If you pass `RunConfig(model_provider=...)`, that provider controls transport selection instead of the global default.
 
+#### Provider or run-level setup
+
 You can also configure websocket transport per provider or per run:
 
 ```python
@@ -126,6 +133,8 @@ result = await Runner.run(
 )
 ```
 
+#### Advanced routing with `MultiProvider`
+
 If you need prefix-based model routing (for example mixing `openai/...` and `litellm/...` model names in one run), use [`MultiProvider`][agents.MultiProvider] and set `openai_use_responses_websocket=True` there instead.
 
 `MultiProvider` keeps two historical defaults:
@@ -163,7 +172,7 @@ Use `openai_prefix_mode="model_id"` when a backend expects the literal `openai/.
 
 If you use a custom OpenAI-compatible endpoint or proxy, websocket transport also requires a compatible websocket `/responses` endpoint. In those setups you may need to set `websocket_base_url` explicitly.
 
-Notes:
+#### Notes
 
 -   This is the Responses API over websocket transport, not the [Realtime API](../realtime/guide.md). It does not apply to Chat Completions or non-OpenAI providers unless they support the Responses websocket `/responses` endpoint.
 -   Install the `websockets` package if it is not already available in your environment.
@@ -171,34 +180,30 @@ Notes:
 
 ## Non-OpenAI models
 
-You can use most other non-OpenAI models via the [LiteLLM integration](./litellm.md). First, install the litellm dependency group:
-
-```bash
-pip install "openai-agents[litellm]"
-```
-
-Then, use any of the [supported models](https://docs.litellm.ai/docs/providers) with the `litellm/` prefix:
+If you need a non-OpenAI provider, start with the SDK's built-in provider integration points. In many setups, this is enough without adding LiteLLM. Examples for each pattern live in [examples/model_providers](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers/).
 
-```python
-claude_agent = Agent(model="litellm/anthropic/claude-3-5-sonnet-20240620", ...)
-gemini_agent = Agent(model="litellm/gemini/gemini-2.5-flash-preview-04-17", ...)
-```
+### Ways to integrate non-OpenAI providers
 
-### Other ways to use non-OpenAI models
+| Approach | Use it when | Scope |
+| --- | --- | --- |
+| [`set_default_openai_client`][agents.set_default_openai_client] | One OpenAI-compatible endpoint should be the default for most or all agents | Global default |
+| [`ModelProvider`][agents.models.interface.ModelProvider] | One custom provider should apply to a single run | Per run |
+| [`Agent.model`][agents.agent.Agent.model] | Different agents need different providers or concrete model objects | Per agent |
+| LiteLLM (beta) | You need LiteLLM-specific provider coverage or routing | See [LiteLLM](#litellm) |
 
-You can integrate other LLM providers in 3 more ways (examples [here](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers/)):
+You can integrate other LLM providers with these built-in paths:
 
 1. [`set_default_openai_client`][agents.set_default_openai_client] is useful in cases where you want to globally use an instance of `AsyncOpenAI` as the LLM client. This is for cases where the LLM provider has an OpenAI compatible API endpoint, and you can set the `base_url` and `api_key`. See a configurable example in [examples/model_providers/custom_example_global.py](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers/custom_example_global.py).
 2. [`ModelProvider`][agents.models.interface.ModelProvider] is at the `Runner.run` level. This lets you say "use a custom model provider for all agents in this run". See a configurable example in [examples/model_providers/custom_example_provider.py](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers/custom_example_provider.py).
-3. [`Agent.model`][agents.agent.Agent.model] lets you specify the model on a specific Agent instance. This enables you to mix and match different providers for different agents. See a configurable example in [examples/model_providers/custom_example_agent.py](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers/custom_example_agent.py). An easy way to use most available models is via the [LiteLLM integration](./litellm.md).
+3. [`Agent.model`][agents.agent.Agent.model] lets you specify the model on a specific Agent instance. This enables you to mix and match different providers for different agents. See a configurable example in [examples/model_providers/custom_example_agent.py](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers/custom_example_agent.py).
 
 In cases where you do not have an API key from `platform.openai.com`, we recommend disabling tracing via `set_tracing_disabled()`, or setting up a [different tracing processor](../tracing.md).
 
 !!! note
 
-    In these examples, we use the Chat Completions API/model, because most LLM providers don't yet support the Responses API. If your LLM provider does support it, we recommend using Responses.
+    In these examples, we use the Chat Completions API/model, because many LLM providers still do not support the Responses API. If your LLM provider does support it, we recommend using Responses.
 
-## Advanced model selection and mixing
+## Mixing models in one workflow
 
 Within a single workflow, you may want to use different models for each agent. For example, you could use a smaller, faster model for triage, while using a larger, more capable model for complex tasks. When configuring an [`Agent`][agents.Agent], you can select a specific model by either:
 
@@ -206,7 +211,7 @@ Within a single workflow, you may want to use different models for each agent. F
 2. Passing any model name + a [`ModelProvider`][agents.models.interface.ModelProvider] that can map that name to a Model instance.
 3. Directly providing a [`Model`][agents.models.interface.Model] implementation.
 
-!!!note
+!!! note
 
     While our SDK supports both the [`OpenAIResponsesModel`][agents.models.openai_responses.OpenAIResponsesModel] and the [`OpenAIChatCompletionsModel`][agents.models.openai_chatcompletions.OpenAIChatCompletionsModel] shapes, we recommend using a single model shape for each workflow because the two shapes support a different set of features and tools. If your workflow requires mixing and matching model shapes, make sure that all the features you're using are available on both.
 
@@ -257,19 +262,21 @@ english_agent = Agent(
 )
 ```
 
-#### Common advanced `ModelSettings` options
+## Advanced OpenAI Responses settings
+
+When you are on the OpenAI Responses path and need more control, start with `ModelSettings`.
+
+### Common advanced `ModelSettings` options
 
 When you are using the OpenAI Responses API, several request fields already have direct `ModelSettings` fields, so you do not need `extra_args` for them.
 
-| Field | Use it for |
-| --- | --- |
-| `parallel_tool_calls` | Allow or forbid multiple tool calls in the same turn. |
-| `truncation` | Set `"auto"` to let the Responses API drop the oldest conversation items instead of failing when context would overflow. |
-| `store` | Control whether the generated response is stored server-side for later retrieval. This matters for follow-up workflows that rely on response IDs, and for session compaction flows that may need to fall back to local input when `store=False`. |
-| `prompt_cache_retention` | Keep cached prompt prefixes around longer, for example with `"24h"`. |
-| `response_include` | Request richer response payloads such as `web_search_call.action.sources`, `file_search_call.results`, or `reasoning.encrypted_content`. |
-| `top_logprobs` | Request top-token logprobs for output text. The SDK also adds `message.output_text.logprobs` automatically. |
-| `retry` | Opt in to runner-managed retry settings for model calls. See [Runner-managed retries](#runner-managed-retries). |
+- `parallel_tool_calls`: Allow or forbid multiple tool calls in the same turn.
+- `truncation`: Set `"auto"` to let the Responses API drop the oldest conversation items instead of failing when context would overflow.
+- `store`: Control whether the generated response is stored server-side for later retrieval. This matters for follow-up workflows that rely on response IDs, and for session compaction flows that may need to fall back to local input when `store=False`.
+- `prompt_cache_retention`: Keep cached prompt prefixes around longer, for example with `"24h"`.
+- `response_include`: Request richer response payloads such as `web_search_call.action.sources`, `file_search_call.results`, or `reasoning.encrypted_content`.
+- `top_logprobs`: Request top-token logprobs for output text. The SDK also adds `message.output_text.logprobs` automatically.
+- `retry`: Opt in to runner-managed retry settings for model calls. See [Runner-managed retries](#runner-managed-retries).
 
 ```python
 from agents import Agent, ModelSettings
@@ -290,7 +297,27 @@ research_agent = Agent(
 
 When you set `store=False`, the Responses API does not keep that response available for later server-side retrieval. This is useful for stateless or zero-data-retention style flows, but it also means features that would otherwise reuse response IDs need to rely on locally managed state instead. For example, [`OpenAIResponsesCompactionSession`][agents.memory.openai_responses_compaction_session.OpenAIResponsesCompactionSession] switches its default `"auto"` compaction path to input-based compaction when the last response was not stored. See the [Sessions guide](../sessions/index.md#openai-responses-compaction-sessions).
 
-#### Runner-managed retries
+### Passing `extra_args`
+
+Use `extra_args` when you need provider-specific or newer request fields that the SDK does not expose directly at the top level yet.
+
+The OpenAI Responses API also accepts [a few other optional parameters](https://platform.openai.com/docs/api-reference/responses/create), such as `user` and `service_tier`. If they are not available at the top level, you can pass them through `extra_args` as well.
+
+```python
+from agents import Agent, ModelSettings
+
+english_agent = Agent(
+    name="English agent",
+    instructions="You only speak English",
+    model="gpt-4.1",
+    model_settings=ModelSettings(
+        temperature=0.1,
+        extra_args={"service_tier": "flex", "user": "user_12345"},
+    ),
+)
+```
+
+## Runner-managed retries
 
 Retries are runtime-only and opt in. The SDK does not retry general model requests unless you set `ModelSettings(retry=...)` and your retry policy chooses to retry.
 
@@ -322,11 +349,15 @@ agent = Agent(
 
 `ModelRetrySettings` has three fields:
 
+<div class="field-table" markdown="1">
+
 | Field | Type | Notes |
 | --- | --- | --- |
-| `max_retries` | `int \| None` | Number of retry attempts allowed after the initial request. |
-| `backoff` | `ModelRetryBackoffSettings \| dict \| None` | Default delay strategy when the policy retries without returning an explicit delay. |
-| `policy` | `RetryPolicy \| None` | Callback that decides whether to retry. This field is runtime-only and is not serialized. |
+| `max_retries` | `int | None` | Number of retry attempts allowed after the initial request. |
+| `backoff` | `ModelRetryBackoffSettings | dict | None` | Default delay strategy when the policy retries without returning an explicit delay. |
+| `policy` | `RetryPolicy | None` | Callback that decides whether to retry. This field is runtime-only and is not serialized. |
+
+</div>
 
 A retry policy receives a [`RetryPolicyContext`][agents.retry.RetryPolicyContext] with:
 
@@ -375,24 +406,6 @@ Stateful follow-up requests using `previous_response_id` or `conversation_id` ar
 
 For fuller examples, see [`examples/basic/retry.py`](https://github.com/openai/openai-agents-python/tree/main/examples/basic/retry.py) and [`examples/basic/retry_litellm.py`](https://github.com/openai/openai-agents-python/tree/main/examples/basic/retry_litellm.py).
 
-Use `extra_args` when you need provider-specific or newer request fields that the SDK does not expose directly at the top level yet.
-
-Also, when you use OpenAI's Responses API, [there are a few other optional parameters](https://platform.openai.com/docs/api-reference/responses/create) (e.g., `user`, `service_tier`, and so on). If they are not available at the top level, you can use `extra_args` to pass them as well.
-
-```python
-from agents import Agent, ModelSettings
-
-english_agent = Agent(
-    name="English agent",
-    instructions="You only speak English",
-    model="gpt-4.1",
-    model_settings=ModelSettings(
-        temperature=0.1,
-        extra_args={"service_tier": "flex", "user": "user_12345"},
-    ),
-)
-```
-
 ## Troubleshooting non-OpenAI providers
 
 ### Tracing client error 401
@@ -405,7 +418,7 @@ If you get errors related to tracing, this is because traces are uploaded to Ope
 
 ### Responses API support
 
-The SDK uses the Responses API by default, but most other LLM providers don't yet support it. You may see 404s or similar issues as a result. To resolve, you have two options:
+The SDK uses the Responses API by default, but many other LLM providers still do not support it. You may see 404s or similar issues as a result. To resolve, you have two options:
 
 1. Call [`set_default_openai_api("chat_completions")`][agents.set_default_openai_api]. This works if you are setting `OPENAI_API_KEY` and `OPENAI_BASE_URL` via environment vars.
 2. Use [`OpenAIChatCompletionsModel`][agents.models.openai_chatcompletions.OpenAIChatCompletionsModel]. There are examples [here](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers/).
@@ -429,3 +442,15 @@ You need to be aware of feature differences between model providers, or you may
 -   Don't send unsupported `tools` to providers that don't understand them
 -   Filter out multimodal inputs before calling models that are text-only
 -   Be aware that providers that don't support structured JSON outputs will occasionally produce invalid JSON.
+
+## LiteLLM
+
+LiteLLM support is included on a best-effort, beta basis for cases where you need to bring non-OpenAI providers into an Agents SDK workflow.
+
+If you are using OpenAI models with this SDK, we recommend the built-in [`OpenAIResponsesModel`][agents.models.openai_responses.OpenAIResponsesModel] path instead of LiteLLM.
+
+If you need to combine OpenAI models with non-OpenAI providers, especially through Chat Completions-compatible APIs, LiteLLM is available as a beta option, but it may not be the optimal choice for every setup.
+
+If you need LiteLLM for a non-OpenAI provider, install `openai-agents[litellm]`, then start from [`examples/model_providers/litellm_auto.py`](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers/litellm_auto.py) or [`examples/model_providers/litellm_provider.py`](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers/litellm_provider.py). You can either use `litellm/...` model names or instantiate [`LitellmModel`][agents.extensions.models.litellm_model.LitellmModel] directly.
+
+If you want LiteLLM responses to populate the SDK's usage metrics, pass `ModelSettings(include_usage=True)`.
diff --git a/docs/models/litellm.md b/docs/models/litellm.md
index d61d24cf24..cf6e971c3a 100644
--- a/docs/models/litellm.md
+++ b/docs/models/litellm.md
@@ -1,100 +1,9 @@
-# Using any model via LiteLLM
+# LiteLLM
 
-!!! note
+<script>
+  window.location.replace("../#litellm");
+</script>
 
-    The LiteLLM integration is in beta. You may run into issues with some model providers, especially smaller ones. Please report any issues via [Github issues](https://github.com/openai/openai-agents-python/issues) and we'll fix quickly.
+This page moved to the [LiteLLM section in Models](index.md#litellm).
 
-[LiteLLM](https://docs.litellm.ai/docs/) is a library that allows you to use 100+ models via a single interface. We've added a LiteLLM integration to allow you to use any AI model in the Agents SDK.
-
-## Setup
-
-You'll need to ensure `litellm` is available. You can do this by installing the optional `litellm` dependency group:
-
-```bash
-pip install "openai-agents[litellm]"
-```
-
-Once done, you can use [`LitellmModel`][agents.extensions.models.litellm_model.LitellmModel] in any agent.
-
-## Example
-
-This is a fully working example. When you run it, you'll be prompted for a model name and API key. For example, you could enter:
-
--   `openai/gpt-4.1` for the model, and your OpenAI API key
--   `anthropic/claude-3-5-sonnet-20240620` for the model, and your Anthropic API key
--   etc
-
-For a full list of models supported in LiteLLM, see the [litellm providers docs](https://docs.litellm.ai/docs/providers).
-
-```python
-from __future__ import annotations
-
-import asyncio
-
-from agents import Agent, Runner, function_tool, set_tracing_disabled
-from agents.extensions.models.litellm_model import LitellmModel
-
-@function_tool
-def get_weather(city: str):
-    print(f"[debug] getting weather for {city}")
-    return f"The weather in {city} is sunny."
-
-
-async def main(model: str, api_key: str):
-    agent = Agent(
-        name="Assistant",
-        instructions="You only respond in haikus.",
-        model=LitellmModel(model=model, api_key=api_key),
-        tools=[get_weather],
-    )
-
-    result = await Runner.run(agent, "What's the weather in Tokyo?")
-    print(result.final_output)
-
-
-if __name__ == "__main__":
-    # First try to get model/api key from args
-    import argparse
-
-    parser = argparse.ArgumentParser()
-    parser.add_argument("--model", type=str, required=False)
-    parser.add_argument("--api-key", type=str, required=False)
-    args = parser.parse_args()
-
-    model = args.model
-    if not model:
-        model = input("Enter a model name for Litellm: ")
-
-    api_key = args.api_key
-    if not api_key:
-        api_key = input("Enter an API key for Litellm: ")
-
-    asyncio.run(main(model, api_key))
-```
-
-## Tracking usage data
-
-If you want LiteLLM responses to populate the Agents SDK usage metrics, pass `ModelSettings(include_usage=True)` when creating your agent.
-
-```python
-from agents import Agent, ModelSettings
-from agents.extensions.models.litellm_model import LitellmModel
-
-agent = Agent(
-    name="Assistant",
-    model=LitellmModel(model="your/model", api_key="..."),
-    model_settings=ModelSettings(include_usage=True),
-)
-```
-
-With `include_usage=True`, LiteLLM requests report token and request counts through `result.context_wrapper.usage` just like the built-in OpenAI models.
-
-## Troubleshooting
-
-If you see Pydantic serializer warnings from LiteLLM responses, enable a small compatibility patch by setting:
-
-```bash
-export OPENAI_AGENTS_ENABLE_LITELLM_SERIALIZER_PATCH=true
-```
-
-This opt-in flag suppresses known LiteLLM serializer warnings while preserving normal behavior. Turn it off (unset or `false`) if you do not need it.
+If you are not redirected automatically, use the link above.
diff --git a/docs/stylesheets/extra.css b/docs/stylesheets/extra.css
index c6bd554287..591a4a3ef3 100644
--- a/docs/stylesheets/extra.css
+++ b/docs/stylesheets/extra.css
@@ -170,6 +170,32 @@
     font-size: 14px;
 }
 
+.md-typeset .field-table {
+    overflow-x: auto;
+}
+
+.md-typeset .field-table table:not([class]) {
+    display: table;
+    table-layout: fixed;
+    width: 100%;
+}
+
+.md-typeset .field-table table:not([class]) th:first-child,
+.md-typeset .field-table table:not([class]) td:first-child {
+    width: 11rem;
+}
+
+.md-typeset .field-table table:not([class]) th:nth-child(2),
+.md-typeset .field-table table:not([class]) td:nth-child(2) {
+    width: 18rem;
+}
+
+.md-typeset .field-table table:not([class]) th:first-child code,
+.md-typeset .field-table table:not([class]) td:first-child code {
+    white-space: nowrap;
+    word-break: normal;
+}
+
 /* Custom link styling */
 .md-content a {
     text-decoration: none;
@@ -203,3 +229,10 @@
 .md-sidebar__scrollwrap {
     scrollbar-color: auto !important;
 }
+
+/* Let the docs layout use more of large viewports without becoming fully fluid. */
+@media screen and (min-width: 76.25em) {
+    .md-grid {
+        max-width: clamp(76rem, 92vw, 92rem);
+    }
+}
diff --git a/docs/usage.md b/docs/usage.md
index 5c3e69c505..938f2467c7 100644
--- a/docs/usage.md
+++ b/docs/usage.md
@@ -31,7 +31,7 @@ Usage is aggregated across all model calls during the run (including tool calls
 
 ### Enabling usage with LiteLLM models
 
-LiteLLM providers do not report usage metrics by default. When you are using [`LitellmModel`](models/litellm.md), pass `ModelSettings(include_usage=True)` to your agent so that LiteLLM responses populate `result.context_wrapper.usage`.
+LiteLLM providers do not report usage metrics by default. When you are using [`LitellmModel`][agents.extensions.models.litellm_model.LitellmModel], pass `ModelSettings(include_usage=True)` to your agent so that LiteLLM responses populate `result.context_wrapper.usage`. See the [LiteLLM section](models/index.md#litellm) in the Models guide for setup guidance and examples.
 
 ```python
 from agents import Agent, ModelSettings, Runner
diff --git a/mkdocs.yml b/mkdocs.yml
index e5fb6abe8e..f111dc3945 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -56,9 +56,7 @@ plugins:
             - Configuration: config.md
             - Documentation:
                 - agents.md
-                - Models:
-                    - models/index.md
-                    - models/litellm.md
+                - Models: models/index.md
                 - tools.md
                 - guardrails.md
                 - running_agents.md
@@ -177,9 +175,7 @@ plugins:
             - config.md
             - ドキュメント:
                 - agents.md
-                - モデル:
-                    - models/index.md
-                    - models/litellm.md
+                - モデル: models/index.md
                 - tools.md
                 - guardrails.md
                 - running_agents.md
@@ -217,9 +213,7 @@ plugins:
             - config.md
             - 문서:
                 - agents.md
-                - 모델:
-                    - models/index.md
-                    - models/litellm.md
+                - 모델: models/index.md
                 - tools.md
                 - guardrails.md
                 - running_agents.md
@@ -257,9 +251,7 @@ plugins:
             - config.md
             - 文档:
                 - agents.md
-                - 模型:
-                    - models/index.md
-                    - models/litellm.md
+                - 模型: models/index.md
                 - tools.md
                 - guardrails.md
                 - running_agents.md