Commit 73be85d

Paulo Borges authored and committed
Merge branch 'DOC-1867-Document-feature-AI-Gateway-help-cloud-team-polish-clean-up' into adp-pkg1
# Conflicts: # modules/ai-agents/pages/mcp/overview.adoc
2 parents 7decbed + 038d8a6 commit 73be85d

14 files changed

Lines changed: 60 additions & 58 deletions

modules/ai-agents/pages/ai-gateway/admin/setup-guide.adoc

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -1,5 +1,5 @@
11
= AI Gateway Setup Guide
2-
:description: Complete setup guide for administrators to enable providers, configure models, create gateways, and set up routing policies.
2+
:description: Set up AI Gateway for your organization. Enable providers, configure failover for high availability, set budget controls, and create gateways with team-level isolation.
33
:page-topic-type: how-to
44
:personas: platform_admin
55
:learning-objective-1: Enable LLM providers and models in the catalog

modules/ai-agents/pages/ai-gateway/gateway-architecture.adoc

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
11
= AI Gateway Architecture
2-
:description: Technical architecture of Redpanda AI Gateway, including request lifecycle, supported providers, deployment models, and implementation details.
2+
:description: Technical architecture of Redpanda AI Gateway, including how the control plane, data plane, and observability plane deliver high availability, cost governance, and multi-tenant isolation.
33
:page-topic-type: concept
44
:personas: app_developer, platform_admin
55
:learning-objective-1: Describe the three architectural planes of AI Gateway

modules/ai-agents/pages/ai-gateway/gateway-quickstart.adoc

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
11
= AI Gateway Quickstart
2-
:description: Get started with AI Gateway by configuring providers, creating your first gateway, and routing requests through unified LLM endpoints.
2+
:description: Get started with AI Gateway. Configure providers, create your first gateway with failover and budget controls, and route your first request.
33
:page-topic-type: quickstart
44
:personas: app_developer, platform_admin
55
:learning-objective-1: Enable an LLM provider and create your first gateway
@@ -8,7 +8,7 @@
88

99
include::ai-agents:partial$ai-gateway-byoc-note.adoc[]
1010

11-
Redpanda AI Gateway provides unified access to multiple large language model (LLM) providers and glossterm:MCP[,Model Context Protocol (MCP)] servers through a single endpoint. This quickstart walks you through configuring your first gateway and routing requests through it.
11+
Redpanda AI Gateway keeps your AI-powered applications running and your costs under control by routing all LLM and MCP traffic through a single managed layer with automatic failover and budget enforcement. This quickstart walks you through configuring your first gateway and routing requests through it.
1212

1313
== Prerequisites
1414

modules/ai-agents/pages/ai-gateway/index.adoc

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
11
= AI Gateway
2-
:description: Learn about the unified access layer for LLM providers and AI tools with centralized routing, policy enforcement, cost management, and observability.
2+
:description: Keep AI-powered apps running with automatic provider failover, prevent runaway spend with centralized budget controls, and govern access across teams, apps, and service accounts.
33
:page-layout: index
44
:personas: platform_admin, app_developer, evaluator
55

modules/ai-agents/pages/ai-gateway/what-is-ai-gateway.adoc

Lines changed: 38 additions & 32 deletions
@@ -1,30 +1,54 @@
11
= What is an AI Gateway?
2-
:description: Understand what an AI Gateway is, the problems it solves, and how it benefits your AI infrastructure.
2+
:description: Understand how AI Gateway keeps AI-powered apps highly available across providers and prevents runaway AI spend with centralized cost governance.
33
:page-topic-type: concept
44
:personas: app_developer, platform_admin
5-
:learning-objective-1: Describe how AI Gateway centralizes LLM provider management and reduces operational complexity
6-
:learning-objective-2: Identify key features that address common LLM integration challenges
7-
:learning-objective-3: Determine whether AI Gateway fits your use case based on traffic volume and provider diversity
5+
:learning-objective-1: Explain how AI Gateway keeps AI-powered apps highly available through governed provider failover
6+
:learning-objective-2: Describe how AI Gateway prevents runaway AI spend with centralized budget controls and tenancy-based governance
7+
:learning-objective-3: Identify when AI Gateway fits your use case based on availability requirements, cost governance needs, and multi-provider or MCP tool usage
88

99
include::ai-agents:partial$ai-gateway-byoc-note.adoc[]
1010

11-
Redpanda AI Gateway is a unified access layer for LLM providers and AI tools that sits between your applications and the AI services they use. It provides centralized routing, policy enforcement, cost management, and observability for all your AI traffic.
11+
Redpanda AI Gateway keeps your AI-powered applications highly available and your AI spend under control. It sits between your applications and the LLM providers and AI tools they depend on, providing automatic provider failover so your apps stay up even when a provider goes down, and centralized budget controls so costs never run away. For platform teams, it adds governance at the model-fallback level, tenancy modeling for teams, individuals, apps, and service accounts, and a single proxy layer for both LLM models and MCP tool servers.
1212

1313
== The problem
1414

15-
Modern AI applications face four critical challenges that increase costs, reduce reliability, and slow down development.
15+
Modern AI applications face two business-critical challenges: staying up and staying on budget.
1616

17-
First, applications typically hardcode provider-specific SDKs. An application using OpenAI's SDK cannot easily switch to Anthropic or Google without code changes and redeployment. This tight coupling makes testing across providers time-consuming and error-prone, and means provider outages directly impact your application availability.
17+
First, applications typically hardcode provider-specific SDKs. An application using OpenAI's SDK cannot easily switch to Anthropic or Google without code changes and redeployment. When a provider hits rate limits, suffers an outage, or degrades in performance, your application goes down with it. Your end users don't care which provider you use; they care that the app works.
1818

19-
Second, costs can spiral without visibility into usage patterns. Without a centralized view of token consumption across teams and applications, it's difficult to attribute costs to specific customers, features, or environments. Testing and debugging can generate unexpected bills, and there's no way to enforce budgets or rate limits per team or customer.
19+
Second, costs can spiral without centralized controls. Without a single view of token consumption across teams and applications, it's difficult to attribute costs to specific customers, features, or environments. Testing and debugging can generate unexpected bills, and there's no way to enforce budgets or rate limits per team, application, or service account. The result: runaway spend that finance discovers only after the fact.
2020

21-
Third, glossterm:AI agent[,AI agents] that use glossterm:MCP[,Model Context Protocol (MCP)] servers face tool coordination challenges. Managing tool discovery and execution is repetitive across projects, and agents typically load all available tools upfront, which creates high token costs. There's also no centralized governance over which tools agents can access.
22-
23-
Finally, observability is fragmented across provider dashboards. You cannot reconstruct user sessions that span multiple models, compare latency and costs across providers in a unified view, or efficiently debug issues. Troubleshooting "the AI gave the wrong answer" requires manual log diving across different systems.
21+
These two challenges are compounded by fragmented observability across provider dashboards, which makes it harder to detect availability issues or cost anomalies in time to act. And as organizations adopt glossterm:AI agent[,AI agents] that call glossterm:MCP tool[,MCP tools], the lack of centralized tool governance adds another dimension of uncontrolled cost and risk.
2422

2523
== What AI Gateway solves
2624

27-
Redpanda AI Gateway addresses these challenges through the following core capabilities:
25+
Redpanda AI Gateway delivers two core business outcomes, high availability and cost governance, backed by platform-level controls that set it apart from simple proxy layers:
26+
27+
=== High availability through governed failover
28+
29+
Your end users don't care whether you use OpenAI, Anthropic, or Google; they care that your app stays up. AI Gateway lets you configure provider pools with automatic failover so that when your primary provider hits rate limits, times out, or returns errors, the gateway routes requests to a fallback provider with no code changes and no downtime for your users.
30+
31+
Unlike simple retry logic, AI Gateway provides governance at the failover level: you define which providers fail over to which, under what conditions, and with what priority. This controlled failover can significantly improve uptime even during extended provider outages.
32+
33+
=== Cost governance and budget controls
34+
35+
AI Gateway gives you centralized fiscal control over AI spend. Set monthly budget caps per gateway, enforce them automatically, and set rate limits per team, environment, or application. No more runaway costs discovered after the fact.
36+
37+
You can route requests to different models based on user attributes. For example, to direct premium users to a more capable model while routing free tier users to a cost-effective option, use a CEL expression:
38+
39+
[source,cel]
40+
----
41+
// Route premium users to best model, free users to cost-effective model
42+
request.headers["x-user-tier"] == "premium"
43+
? "anthropic/claude-opus-4.6"
44+
: "anthropic/claude-sonnet-4.5"
45+
----
46+
47+
You can also set different rate limits and spend limits per environment to prevent staging or development traffic from consuming production budgets.
48+
49+
=== Tenancy and access governance
50+
51+
AI Gateway provides multi-tenant isolation by design. Create separate gateways for teams, individual developers, applications, or service accounts, each with their own budgets, rate limits, routing policies, and observability scope. This tenancy model lets platform teams govern who uses what, how much they spend, and which models and tools they can access, without building custom authorization layers.
2852

2953
=== Unified LLM access (single endpoint for all providers)
3054
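
The governed failover described in the added "High availability through governed failover" section can be illustrated with a minimal Python sketch. This is not AI Gateway's implementation or API: `ProviderError`, `route_with_failover`, and the two stand-in provider functions are hypothetical names used only to show the idea of trying providers in a configured priority order and falling back when one fails.

```python
from __future__ import annotations

from typing import Callable, Sequence


class ProviderError(Exception):
    """Hypothetical error standing in for a rate limit, timeout, or error response."""


def route_with_failover(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each provider in priority order and return (provider_name, response).

    Raises ProviderError only if every provider in the pool fails.
    """
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            # Record the failure and move on to the next provider in the pool.
            errors.append(f"{name}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(errors))


# Stand-in providers (assumptions for illustration, not real SDK calls).
def primary(prompt: str) -> str:
    raise ProviderError("rate limited")


def fallback(prompt: str) -> str:
    return f"echo: {prompt}"


name, reply = route_with_failover(
    "hello", [("primary", primary), ("fallback", fallback)]
)
```

In this sketch the caller never changes: the pool order alone decides which provider answers, which mirrors the "no code changes" claim in the added text.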

@@ -85,27 +109,9 @@ response = client.chat.completions.create(
85109

86110
To switch providers, you change only the `model` parameter from `openai/gpt-5.2` to `anthropic/claude-sonnet-4.5`. No code changes or redeployment needed.
87111

88-
=== Policy-based routing and cost control
89-
90-
AI Gateway lets you define routing rules, rate limits, and budgets once, then enforces them automatically for all requests.
91-
92-
You can route requests to different models based on user attributes. For example, to direct premium users to a more capable model while routing free tier users to a cost-effective option, use a CEL expression:
93-
94-
[source,cel]
95-
----
96-
// Route premium users to best model, free users to cost-effective model
97-
request.headers["x-user-tier"] == "premium"
98-
? "anthropic/claude-opus-4.6"
99-
: "anthropic/claude-sonnet-4.5"
100-
----
101-
102-
You can also set different rate limits and spend limits per environment to prevent staging or development traffic from consuming production budgets.
103-
104-
For reliability, you can configure provider pools with automatic failover. If you configure OpenAI GPT-4 as your primary model and Anthropic Claude Opus as the fallback, the gateway automatically routes requests to the fallback when it detects rate limits or timeouts from the primary provider. This configuration can significantly improve uptime (potentially up to 99.9% in some configurations) even during provider outages.
105-
106-
=== MCP aggregation and orchestration
112+
=== Proxy for LLM models and MCP tool servers
107113

108-
AI Gateway aggregates multiple glossterm:MCP server[,MCP servers] and provides deferred tool loading, which dramatically reduces token costs for AI agents.
114+
AI Gateway acts as a single proxy layer for both LLM model requests and MCP tool servers. For LLM traffic, it provides the unified endpoint described above. For AI agents that use MCP tools, it aggregates multiple MCP servers and provides deferred tool loading, which dramatically reduces token costs.
109115

110116
Without AI Gateway, agents typically load all available glossterm:MCP tool[,tools] from multiple MCP servers at startup. This approach sends 50+ tool definitions with every request, creating high token costs (thousands of tokens per request), slow agent startup times, and no centralized governance over which tools agents can access.
111117
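
The changed text says deferred tool loading avoids sending 50+ tool definitions with every request, but it does not describe the mechanism. The Python sketch below shows one plausible shape under stated assumptions: the agent first searches a registry by keyword and receives only tool names, then loads the full definition of a tool it actually needs. `Tool`, `DeferredToolRegistry`, `search`, and `load` are hypothetical names, not part of AI Gateway or MCP.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    description: str
    schema: dict  # full input schema; the token-expensive part to send


class DeferredToolRegistry:
    """Hypothetical registry that hands out tool definitions on demand."""

    def __init__(self, tools):
        self._tools = {tool.name: tool for tool in tools}

    def search(self, query: str, limit: int = 3) -> list[str]:
        """Return up to `limit` tool names whose name or description contains the query."""
        q = query.lower()
        hits = [
            tool.name
            for tool in self._tools.values()
            if q in tool.name.lower() or q in tool.description.lower()
        ]
        return sorted(hits)[:limit]

    def load(self, name: str) -> Tool:
        """Return the full definition for a single tool."""
        return self._tools[name]


# Fifty placeholder tools, standing in for definitions from several MCP servers.
registry = DeferredToolRegistry(
    Tool(f"tool_{i}", f"Example tool number {i}", {"type": "object"})
    for i in range(50)
)

matching_names = registry.search("tool_4")      # names only, cheap to send
selected = registry.load(matching_names[0])     # one full definition, on demand
```

The design point is that the per-request payload scales with the few tools the agent selects rather than with every tool registered across all servers.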

modules/ai-agents/pages/index.adoc

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
11
= Agentic AI
2-
:description: Learn about the Redpanda Agentic Data Plane, including the AI Gateway, AI agents, and MCP servers.
2+
:description: Learn about the Redpanda Agentic Data Plane. Keep AI-powered apps highly available, control costs across providers, and govern access for teams, apps, and service accounts.
33
:page-layout: index
44
:page-aliases: develop:agents/about.adoc, develop:ai-agents/about.adoc

modules/ai-agents/pages/mcp/index.adoc

Lines changed: 2 additions & 2 deletions
@@ -1,8 +1,8 @@
11
= Model Context Protocol (MCP)
2-
:description: Learn about the Model Context Protocol (MCP) in Redpanda Cloud.
2+
:description: Give AI agents direct access to your databases, queues, CRMs, and other business systems without writing custom glue code.
33
:page-layout: index
44

5-
The Model Context Protocol (MCP) provides a standardized way for AI agents to connect with external data sources and tools in Redpanda Cloud.
5+
AI agents need context from your business systems. The Model Context Protocol (MCP) translates agent intent into real connections to databases, queues, CRMs, HRIS, and other systems of record, without you writing custom integration code. Redpanda's MCP servers are built on the same proven connectors that power the world's largest e-commerce, electric vehicle, energy, and AI companies.
66

77
Redpanda Cloud offers two complementary MCP options:
88

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
11
= Redpanda Cloud Management MCP Server
22
:page-beta: true
3-
:description: Find links to information about the Redpanda Cloud Management MCP Server and its features for building and managing AI agents that can interact with your Redpanda Cloud account and clusters.
3+
:description: Manage your Redpanda Cloud clusters, topics, and users through AI agents using natural language commands.
44
:page-layout: index

modules/ai-agents/pages/mcp/local/overview.adoc

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
= Redpanda Cloud Management MCP Server
22
:page-beta: true
3-
:description: Learn about the Redpanda Cloud Management MCP Server, which lets AI agents securely access and operate your Redpanda Cloud account and clusters.
3+
:description: Let AI agents securely operate your Redpanda Cloud clusters, topics, and users through natural language commands.
44
:page-topic-type: overview
55
:personas: evaluator, agent_developer, platform_admin
66
// Reader journey: "I'm new"

modules/ai-agents/pages/mcp/overview.adoc

Lines changed: 7 additions & 11 deletions
@@ -1,7 +1,7 @@
11
= MCP Servers for Redpanda Cloud Overview
2-
:description: Learn about Model Context Protocol (MCP) in Redpanda Cloud, including the two complementary options: the Redpanda Cloud Management MCP Server and Remote MCP.
2+
:description: Connect AI agents to your databases, queues, CRMs, and other business systems without writing glue code, using Redpanda's proven connectors.
33
:page-topic-type: overview
4-
:personas: evaluator, agent_developer
4+
:personas: evaluator, ai_agent_developer
55
// Reader journey: "I'm new" - understanding the landscape
66
// Learning objectives - what readers should understand after reading this page:
77
:learning-objective-1: Describe what MCP enables for AI agents
@@ -18,13 +18,9 @@ After reading this page, you will be able to:
1818
1919
== What is MCP?
2020

21-
The Model Context Protocol (MCP) provides a standardized way for AI agents to connect with external data sources and tools in Redpanda Cloud.
21+
MCP (Model Context Protocol) is an open standard that translates AI agent intent into real connections to databases, queues, CRMs, HRIS, accounting software, and other business systems. Instead of writing custom glue code for every integration, you define your tools once using MCP, and any MCP-compatible AI client can discover and use them.
2222

23-
Each MCP server hosts a set of tools that AI clients can discover and invoke. Tools are custom integrations that expose data, APIs, or workflows to AI agents.
24-
25-
Think of MCP like a universal adapter: instead of building custom integrations for every AI system, you define your tools once using MCP, and any MCP-compatible AI client can discover and use them.
26-
27-
Without MCP, connecting AI to your business systems requires custom API code, authentication handling, and response formatting for each AI platform. With MCP, you describe what a tool does and what inputs it needs, and the protocol handles the rest.
23+
Without MCP, connecting AI to your business systems requires custom API code, authentication handling, and response formatting for each AI platform. With MCP, you describe what a tool does and what inputs it needs, and the protocol handles the rest. Redpanda's MCP servers are built on the same proven connectors that power the world's largest e-commerce, electric vehicle, energy, and AI companies today.
2824

2925
== MCP options in Redpanda Cloud
3026

@@ -89,9 +85,9 @@ You can use both options together. For example, use the Redpanda Cloud Managemen
8985

9086
== Get started
9187

92-
* xref:ai-agents:mcp/local/quickstart.adoc[]
93-
* xref:ai-agents:mcp/remote/quickstart.adoc[]
88+
* xref:ai-agents:mcp/local/quickstart.adoc[]: Connect Claude to your Redpanda Cloud account
89+
* xref:ai-agents:mcp/remote/quickstart.adoc[]: Build and deploy custom MCP tools
9490

9591
== Suggested reading
9692

97-
* xref:home:ROOT:mcp-setup.adoc[]
93+
* xref:home:ROOT:mcp-setup.adoc[]: Access Redpanda documentation through AI agents (read-only, no Cloud access required)
