Commit 7a8a65c

micheleRP and claude committed
Update docs to match current AI Gateway UI
- Update MCP server fields to match Create MCP Server dialog (Server ID, Display Name, Server Address, Defer Loading Override, Forward OIDC Token Override)
- Update orchestrator section to reflect system-managed config with configurable blocked tool patterns
- Update deferred loading config to use per-server Defer Loading Override dropdown instead of gateway-level toggle
- Update observability references to point to gateway Overview tab
- Comment out references to UI features not yet available

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 195c95c commit 7a8a65c

7 files changed

Lines changed: 114 additions & 94 deletions

modules/ai-agents/pages/ai-gateway/admin/setup-guide.adoc

Lines changed: 82 additions & 42 deletions
Original file line number | Diff line number | Diff line change
@@ -26,7 +26,7 @@ After completing this guide, you will be able to:
2626

2727
Providers represent upstream services (Anthropic, OpenAI, Google AI) and associated credentials. Providers are disabled by default and must be enabled explicitly by an administrator.
2828

29-
. In the Redpanda Cloud Console, navigate to *AI Gateway* → *Providers*.
29+
. In the Redpanda Cloud Console, navigate to *Agentic AI* → *Providers*.
3030
. Select a provider (for example, Anthropic).
3131
. On the Configuration tab for the provider, click *Add configuration*.
3232
. Enter your API Key for the provider.
@@ -43,17 +43,15 @@ The model catalog is the set of models made available through the gateway. Model
4343

4444
The infrastructure that serves the model differs based on the provider you select. For example, OpenAI has different reliability and availability metrics than Anthropic. When you consider all metrics, you can design your gateway to use different providers for different use cases.
4545

46-
. Navigate to *AI Gateway* → *Models*.
46+
. Navigate to *Agentic AI* → *Models*.
4747
. Review the list of available models from enabled providers.
48-
. For each model you want to expose through gateways, toggle it to *Enabled*.
49-
+
50-
Common models to enable:
48+
. For each model you want to expose through gateways, toggle it to *Enabled*. For example:
5149
+
5250
--
53-
* `openai/gpt-5.2` - OpenAI's most capable model
54-
* `openai/gpt-5.2-mini` - Cost-effective OpenAI model
55-
* `anthropic/claude-sonnet-4.5` - Balanced Anthropic model
56-
* `anthropic/claude-opus-4.6` - Anthropic's most capable model
51+
* `openai/gpt-5.2`
52+
* `openai/gpt-5.2-mini`
53+
* `anthropic/claude-sonnet-4.5`
54+
* `anthropic/claude-opus-4.6`
5755
--
5856

5957
. Click *Save changes*.
@@ -62,9 +60,7 @@ Only enabled models will be accessible through gateways. You can enable or disab
6260

6361
=== Model naming convention
6462

65-
Model requests must use the `vendor/model_id` format in the model property of the request body. This format allows AI Gateway to route requests to the appropriate provider.
66-
67-
Examples:
63+
Model requests must use the `vendor/model_id` format in the model property of the request body. This format allows AI Gateway to route requests to the appropriate provider. For example:
6864

6965
* `openai/gpt-5.2`
7066
* `anthropic/claude-sonnet-4.5`
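
As an illustration of the `vendor/model_id` convention, the following minimal sketch splits a model name into its two parts on the client side. The helper name `split_model_name` is hypothetical and is not part of AI Gateway; it only shows how the format can be validated before a request is sent.

```python
def split_model_name(model: str) -> tuple[str, str]:
    """Split a 'vendor/model_id' string into (vendor, model_id).

    Hypothetical client-side helper; the gateway does its own routing.
    """
    # partition() splits on the first "/" only, so a model_id that
    # itself contains "/" would be kept whole.
    vendor, sep, model_id = model.partition("/")
    if not sep or not vendor or not model_id:
        raise ValueError(f"expected 'vendor/model_id', got {model!r}")
    return vendor, model_id


vendor, model_id = split_model_name("openai/gpt-5.2")
print(vendor, model_id)
```

A name without a vendor prefix (for example, `gpt-5.2`) raises `ValueError` in this sketch, which mirrors the requirement that requests carry the vendor prefix.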
@@ -109,7 +105,7 @@ endif::[]
109105

110106
A gateway is a logical configuration boundary (policies + routing + observability) on top of a single deployment. It's a "virtual gateway" that you can create per team, environment (staging/production), product, or customer.
111107

112-
. Navigate to *AI Gateway* → *Gateways*.
108+
. Navigate to *Agentic AI* → *Gateways*.
113109
. Click *Create Gateway*.
114110
. Configure the gateway:
115111
+
@@ -128,10 +124,12 @@ TIP: A workspace is conceptually similar to a resource group in Redpanda streami
128124
. After creation, note the following information:
129125
+
130126
--
131-
* *Gateway Endpoint*: URL for API requests (for example, `https://example/gateways/gw_abc123/v1`) - the gateway ID is embedded in the URL
127+
* *Gateway endpoint*: URL for API requests (for example, `https://example/gateways/d633lffcc16s73ct95mg/v1`)
128+
+
129+
The gateway ID is embedded in the URL.
132130
--
133131

134-
You'll share the Gateway Endpoint with users who need to access this gateway.
132+
You'll share the gateway endpoint with users who need to access this gateway.
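
Because the gateway ID is embedded in the endpoint URL, it can be recovered from the URL when needed (for example, for logging). The sketch below assumes the path layout shown in the example above (`/gateways/<gateway_id>/v1`); that layout is inferred from the example only, and the function name is hypothetical.

```python
from urllib.parse import urlparse


def gateway_id_from_endpoint(endpoint: str) -> str:
    """Return the path segment that follows 'gateways' in the endpoint URL.

    Assumes a path of the form /gateways/<gateway_id>/v1, as in the
    example endpoint; this is not a documented contract.
    """
    segments = [s for s in urlparse(endpoint).path.split("/") if s]
    try:
        return segments[segments.index("gateways") + 1]
    except (ValueError, IndexError):
        raise ValueError(f"no gateway ID found in {endpoint!r}") from None


gateway_id = gateway_id_from_endpoint(
    "https://example/gateways/d633lffcc16s73ct95mg/v1"
)
print(gateway_id)
```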
135133

136134
== Configure LLM routing
137135

@@ -228,49 +226,78 @@ TIP: Provider pool (UI) = Backend pool (API)
228226

229227
If a provider pool contains multiple providers, you can distribute traffic to balance load or optimize for cost/performance:
230228

231-
* *Round-robin*: Distribute evenly across all providers
232-
* *Weighted*: Assign weights (for example, 80% to Anthropic, 20% to OpenAI)
233-
* *Least latency*: Route to fastest provider based on recent performance
234-
* *Cost-optimized*: Route to cheapest provider for each model
229+
* Round-robin: Distribute evenly across all providers
230+
* Weighted: Assign weights (for example, 80% to Anthropic, 20% to OpenAI)
231+
* Least latency: Route to fastest provider based on recent performance
232+
* Cost-optimized: Route to cheapest provider for each model
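
To make the first two strategies concrete, here is a minimal sketch of round-robin and weighted selection. It illustrates the general techniques only and is not the gateway's implementation; the provider names and the 80/20 weights are taken from the example above, and the function names are hypothetical.

```python
import itertools
import random


def round_robin(providers):
    """Yield providers in a repeating, evenly distributed order."""
    return itertools.cycle(providers)


def weighted_choice(weights, rng):
    """Pick one provider with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Round-robin alternates between the two providers.
rr = round_robin(["anthropic", "openai"])
first_four = [next(rr) for _ in range(4)]

# Weighted: roughly 80% of picks should go to Anthropic.
rng = random.Random(0)  # seeded so the run is repeatable
weights = {"anthropic": 80, "openai": 20}
picks = [weighted_choice(weights, rng) for _ in range(10_000)]
share = picks.count("anthropic") / len(picks)
print(first_four, round(share, 2))
```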
235233

236234
== Configure MCP tools (optional)
237235

238236
If your users will build glossterm:AI agent[,AI agents] that need access to glossterm:MCP tool[,tools] via glossterm:MCP[,Model Context Protocol (MCP)], configure MCP tool aggregation.
239237

240238
On the gateway details page, select the *MCP* tab to configure tool discovery and execution. The MCP proxy aggregates multiple glossterm:MCP server[,MCP servers], allowing agents to find and call tools through a single endpoint.
241239

240+
=== Configure MCP rate limits
241+
242+
Rate limits for MCP work the same way as LLM rate limits.
243+
244+
. In the *MCP* tab, locate the *Rate Limit* section.
245+
. Click *Add rate limit*.
246+
. Configure the maximum requests per second and optional burst allowance.
247+
. Click *Save*.
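
One common way to reason about a "requests per second plus burst" limit is a token bucket. The sketch below is an assumption used for illustration; the docs do not state which algorithm the gateway uses. The class name and the explicit `now` parameter (which keeps the example deterministic) are hypothetical.

```python
class TokenBucket:
    """Token bucket: refills at `rate_per_sec`, holds at most `burst` tokens."""

    def __init__(self, rate_per_sec: float, burst: int):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = float(burst)  # start full, so a burst is allowed at once
        self.last = 0.0

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` (seconds) is allowed."""
        elapsed = now - self.last
        self.last = now
        # Refill for the elapsed time, capped at the burst capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(rate_per_sec=2, burst=3)
at_zero = [bucket.allow(0.0) for _ in range(4)]  # burst of 3, then rejected
later = bucket.allow(0.5)                        # 0.5 s * 2/s refills 1 token
print(at_zero, later)
```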
248+
242249
=== Add MCP servers
243250

244-
. In the *MCP* tab, click *Add MCP server*.
251+
. In the *MCP* tab, click *Create MCP Server*.
245252
. Configure the server:
246253
+
247254
--
248-
* *Server name*: Human-readable identifier (for example, `database-server`, `slack-server`)
249-
* *Server URL*: Endpoint for the MCP server (for example, `https://mcp-database.example.com`)
250-
* *Authentication*: Configure authentication if required (bearer token, API key, mTLS)
251-
* *Enabled tools*: Select which tools from this server to expose (or *All tools*)
255+
* *Server ID*: Unique identifier for this server
256+
* *Display Name*: Human-readable name (for example, `database-server`, `slack-server`)
257+
* *Server Address*: Endpoint URL for the MCP server (for example, `https://mcp-database.example.com`)
252258
--
253259

254-
. Click *Test connection* to verify connectivity.
255-
. Click *Save* to add the server to this gateway.
260+
. Configure server settings:
261+
+
262+
--
263+
* *Timeout (seconds)*: Maximum time to wait for a response from this server
264+
* *Enabled*: Whether this server is active and accepting requests
265+
* *Defer Loading Override*: Controls whether tools from this server are loaded upfront or on demand
266+
+
267+
[cols="1,2"]
268+
|===
269+
|Option |Description
256270

257-
Repeat for each MCP server you want to aggregate.
271+
|Inherit from gateway
272+
|Use the gateway-level deferred loading setting (default)
258273

259-
=== Configure deferred tool loading
274+
|Enabled
275+
|Always defer loading from this server. Agents receive only a search tool initially and query for specific tools when needed. This can reduce token usage by 80-90%.
260276

261-
Deferred tool loading dramatically reduces token costs by initially exposing only a search tool and orchestrator, rather than listing all available tools.
277+
|Disabled
278+
|Always load all tools from this server upfront.
279+
|===
262280

263-
. In the *MCP* tab, locate *Deferred Loading*.
264-
. Toggle *Enable deferred tool loading* to *On*.
265-
. Configure behavior:
281+
* *Forward OIDC Token Override*: Controls whether the client's OIDC token is forwarded to this MCP server
266282
+
267-
--
268-
* *Initially expose*: Search tool + orchestrator only
269-
* *Load on demand*: Tools are retrieved when agents query for them
270-
* *Token savings*: Expect 80-90% reduction in token usage for tool definitions
283+
[cols="1,2"]
284+
|===
285+
|Option |Description
286+
287+
|Inherit from gateway
288+
|Use the gateway-level OIDC forwarding setting (default)
289+
290+
|Enabled
291+
|Always forward the OIDC token to this server
292+
293+
|Disabled
294+
|Never forward the OIDC token to this server
295+
|===
271296
--
272297

273-
. Click *Save*.
298+
. Click *Save* to add the server to this gateway.
299+
300+
Repeat for each MCP server you want to aggregate.
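
The two override settings share the same three-option pattern: inherit the gateway-level value, or force the behavior on or off for this server. A minimal sketch of that resolution logic, using hypothetical names (`Override`, `resolve`) that are not part of any AI Gateway API:

```python
from enum import Enum


class Override(Enum):
    INHERIT = "Inherit from gateway"
    ENABLED = "Enabled"
    DISABLED = "Disabled"


def resolve(override: Override, gateway_setting: bool) -> bool:
    """Return the effective setting for one MCP server."""
    if override is Override.INHERIT:
        return gateway_setting  # fall back to the gateway-level value
    return override is Override.ENABLED


# A server set to Enabled defers loading even if the gateway default is off.
effective = resolve(Override.ENABLED, gateway_setting=False)
print(effective)
```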
274301

275302
See xref:ai-gateway/mcp-aggregation-guide.adoc[] for detailed information about MCP aggregation.
276303

@@ -280,11 +307,24 @@ The MCP orchestrator is a built-in MCP server that enables programmatic tool cal
280307

281308
Example: A workflow requiring 47 file reads can be reduced from 49 round trips to just 1 round trip using the orchestrator.
282309

283-
The orchestrator is enabled by default when you enable MCP tools. You can configure:
310+
The orchestrator is pre-configured when you initialize the MCP gateway. Its server configuration (Server ID, Display Name, Transport, Command, and Timeout) is system-managed and cannot be modified.
284311

285-
* *Execution timeout*: Maximum time for orchestrator workflows (for example, 30 seconds)
286-
* *Memory limit*: Maximum memory for JavaScript execution (for example, 128MB)
287-
* *Allowed operations*: Restrict which MCP tools can be called from orchestrator workflows
312+
You can configure blocked tool patterns to prevent specific tools from being called through the orchestrator:
313+
314+
. In the *MCP* tab, select the orchestrator server to edit it.
315+
. Under *Blocked Tools*, click *Add Pattern* to add glob patterns for tools that should be blocked from execution.
316+
+
317+
Example patterns:
318+
+
319+
--
320+
* `server_id:*` - Block all tools from a specific server
321+
* `*:dangerous_tool` - Block a specific tool across all servers
322+
* `specific:tool` - Block a single tool on a specific server
323+
--
324+
+
325+
NOTE: The orchestrator's own tools are blocked by default to prevent recursive execution.
326+
327+
. Click *Save*.
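
To preview which tools a set of patterns would block, the matching can be approximated with Python's `fnmatch` module. This is a sketch under assumptions: the gateway's exact glob semantics are not specified here, and the server and tool names below are hypothetical.

```python
from fnmatch import fnmatchcase


def is_blocked(server_id: str, tool: str, patterns: list[str]) -> bool:
    """Return True if 'server_id:tool' matches any blocked glob pattern."""
    name = f"{server_id}:{tool}"
    # fnmatchcase is case-sensitive; note that its "*" also matches ":".
    return any(fnmatchcase(name, pattern) for pattern in patterns)


patterns = [
    "slack:*",              # all tools from one server
    "*:dangerous_tool",     # one tool name across all servers
    "database:drop_table",  # a single tool on a single server
]

print(is_blocked("slack", "post_message", patterns))
print(is_blocked("database", "query", patterns))
```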
288328

289329
== Verify your setup
290330

@@ -318,7 +358,7 @@ Expected result: Successful completion response.
318358

319359
=== Check the gateway overview
320360

321-
. Navigate to *AI Gateway* → *Gateways* → Select your gateway → *Overview*.
361+
. Navigate to *Gateways* → Select your gateway → *Overview*.
322362
. Check the aggregate metrics to verify your test request was processed:
323363
+
324364
--

modules/ai-agents/pages/ai-gateway/builders/connect-your-agent.adoc

Lines changed: 2 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -32,9 +32,7 @@ Connecting to AI Gateway requires two configuration changes:
3232
. *Change the base URL*: Point to the gateway endpoint instead of the provider's API. The gateway ID is embedded in the endpoint URL.
3333
. *Add authentication*: Use your Redpanda Cloud token instead of provider API keys
3434

35-
That's it. Your existing application code doesn't need to change.
36-
37-
== Quick start
35+
== Quickstart
3836

3937
=== Environment variables
4038

@@ -144,9 +142,7 @@ When making requests through AI Gateway, use the `vendor/model_id` format for th
144142
* `anthropic/claude-sonnet-4.5`
145143
* `anthropic/claude-opus-4.6`
146144

147-
This format tells AI Gateway which provider to route the request to.
148-
149-
Example:
145+
This format tells AI Gateway which provider to route the request to. For example:
150146

151147
[source,python]
152148
----

modules/ai-agents/pages/ai-gateway/builders/discover-gateways.adoc

Lines changed: 3 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -27,19 +27,10 @@ After reading this page, you will be able to:
2727
====
2828
Using the Console::
2929
+
30-
. Navigate to *AI Gateway* in the Redpanda Cloud Console.
31-
. View the *My Gateways* tab (or *Gateways* if you're an administrator).
32-
. Review the list of gateways you can access:
30+
. Navigate to *Gateways* in the Redpanda Cloud Console.
31+
. Review the list of gateways you can access. For each gateway, you'll see the gateway name, ID, endpoint URL, status, available models, and provider performance.
3332
+
34-
For each gateway, you'll see:
35-
+
36-
--
37-
* *Gateway Name*: Human-readable name (for example, `production-gateway`, `team-ml-gateway`)
38-
* *Gateway Endpoint*: URL for API requests, with the gateway ID embedded in the path (for example, `https://example/gateways/gw_abc123/v1`)
39-
* *Status*: Whether the gateway is active and accepting requests
40-
* *Available Models*: Which LLM models you can access
41-
* *MCP Tools*: Whether MCP tool aggregation is enabled
42-
--
33+
Click the Configuration, API, MCP Tools, and Changelog tabs for additional information.
4334
4435
Using the API::
4536
+

modules/ai-agents/pages/ai-gateway/cel-routing-cookbook.adoc

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -745,6 +745,7 @@ test_cel_routing(
745745
----
746746

747747

748+
////
748749
=== Option 3: CLI test (if available)
749750
750751
[source,bash]
@@ -758,6 +759,7 @@ rpk cloud ai-gateway test-cel \
758759
759760
# Expected output: openai/gpt-5.2
760761
----
762+
////
761763

762764

763765
== Common CEL errors

modules/ai-agents/pages/ai-gateway/gateway-quickstart.adoc

Lines changed: 4 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -78,10 +78,10 @@ ifdef::ai-hub-available[]
7878
endif::[]
7979
. Configure the gateway:
8080
+
81-
* Display name: Choose a descriptive name (for example, `my-first-gateway`)
82-
* Workspace: Select a workspace (conceptually similar to a resource group)
83-
* Description: Add context about this gateway's purpose
84-
* Optional metadata for documentation
81+
** Display name: Choose a descriptive name (for example, `my-first-gateway`)
82+
** Workspace: Select a workspace (conceptually similar to a resource group)
83+
** Description: Add context about this gateway's purpose
84+
** Optional metadata for documentation
8585

8686
After creation, copy the gateway endpoint from the overview page. You'll need this for sending requests. The gateway ID is embedded in the endpoint URL. For example:
8787

@@ -213,7 +213,6 @@ If your request fails, check these common issues:
213213

214214
Confirm your request was routed through AI Gateway.
215215

216-
. In the sidebar, navigate to *Agentic AI > Gateways*, then select your gateway.
217216
. On the *Overview* tab, check the aggregate metrics:
218217
+
219218
* *Total Requests*: Should have incremented
@@ -225,12 +224,6 @@ Confirm your request was routed through AI Gateway.
225224
+
226225
The model you used in your request should appear with its request count, token usage (input/output), estimated cost, latency, and error rate.
227226

228-
If your request doesn't appear:
229-
230-
* Wait a few seconds for metrics to update
231-
* Verify the gateway endpoint in your request matches the gateway you're viewing
232-
* Check that your client received a successful response
233-
234227
== Configure LLM routing (optional)
235228

236229
Configure rate limits, spend limits, and provider pools with failover.

modules/ai-agents/pages/ai-gateway/mcp-aggregation-guide.adoc

Lines changed: 19 additions & 23 deletions
Original file line numberDiff line numberDiff line change
@@ -282,27 +282,27 @@ Don't use deferred loading when:
282282

283283
=== Configure deferred loading
284284

285-
// PLACEHOLDER: Add UI path or configuration method
285+
Deferred loading is configured per MCP server through the *Defer Loading Override* setting in the Create MCP Server dialog.
286286

287-
Option 1: Enable at gateway level (recommended)
287+
. Navigate to your gateway's *MCP* tab.
288+
. Create or edit an MCP server.
289+
. Under *Server Settings*, set *Defer Loading Override*:
290+
+
291+
[cols="1,2"]
292+
|===
293+
|Option |Description
288294

289-
[source,yaml]
290-
----
291-
# PLACEHOLDER: Actual configuration format
292-
mcp:
293-
deferred_loading: true # Default for all agents using this gateway
294-
----
295+
|Inherit from gateway
296+
|Use the gateway-level deferred loading setting (default)
295297

298+
|Enabled
299+
|Always defer loading from this server. Agents receive only a search tool initially and query for specific tools when needed.
296300

297-
Option 2: Enable per-request (agent-controlled)
301+
|Disabled
302+
|Always load all tools from this server upfront.
303+
|===
298304

299-
[source,python]
300-
----
301-
# Agent includes header
302-
headers = {
303-
"rp-aigw-mcp-deferred": "true" # Enable for this request
304-
}
305-
----
305+
. Click *Save*.
306306

307307

308308
=== Measure token savings
@@ -460,14 +460,12 @@ Available in workflow:
460460
Security:
461461

462462
* Sandboxed execution (no file system, network, or system access)
463-
* Timeout: // PLACEHOLDER: e.g., 30 seconds
464-
* Memory limit: // PLACEHOLDER: e.g., 128MB
463+
* Timeout and memory limits are system-managed and cannot be modified
465464

466465
Limitations:
467466

468467
* Cannot call external APIs directly (must use MCP tools)
469468
* Cannot import npm packages (built-in JS only)
470-
* // PLACEHOLDER: Other limitations?
471469

472470
=== Orchestrator example: data aggregation
473471

@@ -797,6 +795,7 @@ Solution:
797795

798796
== Security considerations
799797

798+
////
800799
=== Tool execution sandboxing
801800
802801
// PLACEHOLDER: Confirm sandboxing implementation
@@ -814,6 +813,7 @@ MCP tool execution:
814813
* Tools execute in MCP server's environment (not gateway)
815814
* Gateway does not execute tool code (only proxies requests)
816815
* Security is MCP server's responsibility
816+
////
817817

818818
=== Authentication
819819

@@ -1004,7 +1004,3 @@ response = agent.run("Find all premium users in the database")
10041004
----
10051005
--
10061006
====
1007-
1008-
1009-
== Next steps
1010-

modules/ai-agents/pages/ai-gateway/what-is-ai-gateway.adoc

Lines changed: 2 additions & 0 deletions
@@ -185,6 +185,8 @@ AI Gateway may not be necessary if:
185185

186186
Now that you understand what AI Gateway is and how it can benefit your organization:
187187

188+
* xref:ai-gateway/gateway-quickstart.adoc[Gateway Quickstart] - Get started quickly with a basic gateway setup
189+
188190
*For Administrators:*
189191

190192
* xref:ai-gateway/admin/setup-guide.adoc[Setup Guide] - Enable providers, models, and create gateways
