
Commit a944451

improved messaging from Alex for AI Gateway
1 parent 53616ba commit a944451

6 files changed

Lines changed: 44 additions & 38 deletions

modules/ai-agents/pages/ai-gateway/admin/setup-guide.adoc

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 = AI Gateway Setup Guide
-:description: Complete setup guide for administrators to enable providers, configure models, create gateways, and set up routing policies.
+:description: Set up AI Gateway for your organization. Enable providers, configure failover for high availability, set budget controls, and create gateways with team-level isolation.
 :page-topic-type: how-to
 :personas: platform_admin
 :learning-objective-1: Enable LLM providers and models in the catalog

modules/ai-agents/pages/ai-gateway/gateway-architecture.adoc

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 = AI Gateway Architecture
-:description: Technical architecture of Redpanda AI Gateway, including request lifecycle, supported providers, deployment models, and implementation details.
+:description: Technical architecture of Redpanda AI Gateway, including how the control plane, data plane, and observability plane deliver high availability, cost governance, and multi-tenant isolation.
 :page-topic-type: concept
 :personas: app_developer, platform_admin
 :learning-objective-1: Describe the three architectural planes of AI Gateway

modules/ai-agents/pages/ai-gateway/gateway-quickstart.adoc

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
 = AI Gateway Quickstart
-:description: Get started with AI Gateway by configuring providers, creating your first gateway, and routing requests through unified LLM endpoints.
+:description: Get started with AI Gateway. Configure providers, create your first gateway with failover and budget controls, and route your first request.
 :page-topic-type: quickstart
 :personas: app_developer, platform_admin
 :learning-objective-1: Enable an LLM provider and create your first gateway
@@ -8,7 +8,7 @@

 include::ai-agents:partial$ai-gateway-byoc-note.adoc[]

-Redpanda AI Gateway provides unified access to multiple large language model (LLM) providers and glossterm:MCP[,Model Context Protocol (MCP)] servers through a single endpoint. This quickstart walks you through configuring your first gateway and routing requests through it.
+Redpanda AI Gateway keeps your AI-powered applications running and your costs under control by routing all LLM and MCP traffic through a single managed layer with automatic failover and budget enforcement. This quickstart walks you through configuring your first gateway and routing requests through it.

 == Prerequisites

modules/ai-agents/pages/ai-gateway/index.adoc

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 = AI Gateway
-:description: Learn about the unified access layer for LLM providers and AI tools with centralized routing, policy enforcement, cost management, and observability.
+:description: Keep AI-powered apps running with automatic provider failover, prevent runaway spend with centralized budget controls, and govern access across teams, apps, and service accounts.
 :page-layout: index
 :personas: platform_admin, app_developer, evaluator

modules/ai-agents/pages/ai-gateway/what-is-ai-gateway.adoc

Lines changed: 38 additions & 32 deletions
@@ -1,30 +1,54 @@
 = What is an AI Gateway?
-:description: Understand what an AI Gateway is, the problems it solves, and how it benefits your AI infrastructure.
+:description: Understand how AI Gateway keeps AI-powered apps highly available across providers and prevents runaway AI spend with centralized cost governance.
 :page-topic-type: concept
 :personas: app_developer, platform_admin
-:learning-objective-1: Describe how AI Gateway centralizes LLM provider management and reduces operational complexity
-:learning-objective-2: Identify key features that address common LLM integration challenges
-:learning-objective-3: Determine whether AI Gateway fits your use case based on traffic volume and provider diversity
+:learning-objective-1: Explain how AI Gateway keeps AI-powered apps highly available through governed provider failover
+:learning-objective-2: Describe how AI Gateway prevents runaway AI spend with centralized budget controls and tenancy-based governance
+:learning-objective-3: Identify when AI Gateway fits your use case based on availability requirements, cost governance needs, and multi-provider or MCP tool usage

 include::ai-agents:partial$ai-gateway-byoc-note.adoc[]

-Redpanda AI Gateway is a unified access layer for LLM providers and AI tools that sits between your applications and the AI services they use. It provides centralized routing, policy enforcement, cost management, and observability for all your AI traffic.
+Redpanda AI Gateway keeps your AI-powered applications highly available and your AI spend under control. It sits between your applications and the LLM providers and AI tools they depend on, providing automatic provider failover so your apps stay up even when a provider goes down, and centralized budget controls so costs never run away. For platform teams, it adds governance at the model-fallback level, tenancy modeling for teams, individuals, apps, and service accounts, and a single proxy layer for both LLM models and MCP tool servers.

 == The problem

-Modern AI applications face four critical challenges that increase costs, reduce reliability, and slow down development.
+Modern AI applications face two business-critical challenges: staying up and staying on budget.

-First, applications typically hardcode provider-specific SDKs. An application using OpenAI's SDK cannot easily switch to Anthropic or Google without code changes and redeployment. This tight coupling makes testing across providers time-consuming and error-prone, and means provider outages directly impact your application availability.
+First, applications typically hardcode provider-specific SDKs. An application using OpenAI's SDK cannot easily switch to Anthropic or Google without code changes and redeployment. When a provider hits rate limits, suffers an outage, or degrades in performance, your application goes down with it. Your end users don't care which provider you use; they care that the app works.

-Second, costs can spiral without visibility into usage patterns. Without a centralized view of token consumption across teams and applications, it's difficult to attribute costs to specific customers, features, or environments. Testing and debugging can generate unexpected bills, and there's no way to enforce budgets or rate limits per team or customer.
+Second, costs can spiral without centralized controls. Without a single view of token consumption across teams and applications, it's difficult to attribute costs to specific customers, features, or environments. Testing and debugging can generate unexpected bills, and there's no way to enforce budgets or rate limits per team, application, or service account. The result: runaway spend that finance discovers only after the fact.

-Third, glossterm:AI agent[,AI agents] that use glossterm:MCP[,Model Context Protocol (MCP)] servers face tool coordination challenges. Managing tool discovery and execution is repetitive across projects, and agents typically load all available tools upfront, which creates high token costs. There's also no centralized governance over which tools agents can access.
-
-Finally, observability is fragmented across provider dashboards. You cannot reconstruct user sessions that span multiple models, compare latency and costs across providers in a unified view, or efficiently debug issues. Troubleshooting "the AI gave the wrong answer" requires manual log diving across different systems.
+These two challenges are compounded by fragmented observability across provider dashboards, which makes it harder to detect availability issues or cost anomalies in time to act. And as organizations adopt glossterm:AI agent[,AI agents] that call glossterm:MCP tool[,MCP tools], the lack of centralized tool governance adds another dimension of uncontrolled cost and risk.

 == What AI Gateway solves

-Redpanda AI Gateway addresses these challenges through the following core capabilities:
+Redpanda AI Gateway delivers two core business outcomes, high availability and cost governance, backed by platform-level controls that set it apart from simple proxy layers:
+
+=== High availability through governed failover
+
+Your end users don't care whether you use OpenAI, Anthropic, or Google; they care that your app stays up. AI Gateway lets you configure provider pools with automatic failover so that when your primary provider hits rate limits, times out, or returns errors, the gateway routes requests to a fallback provider with no code changes and no downtime for your users.
+
+Unlike simple retry logic, AI Gateway provides governance at the failover level: you define which providers fail over to which, under what conditions, and with what priority. This controlled failover can significantly improve uptime even during extended provider outages.
+
+=== Cost governance and budget controls
+
+AI Gateway gives you centralized fiscal control over AI spend. Set monthly budget caps per gateway, enforce them automatically, and set rate limits per team, environment, or application. No more runaway costs discovered after the fact.
+
+You can route requests to different models based on user attributes. For example, to direct premium users to a more capable model while routing free tier users to a cost-effective option, use a CEL expression:
+
+[source,cel]
+----
+// Route premium users to best model, free users to cost-effective model
+request.headers["x-user-tier"] == "premium"
+? "anthropic/claude-opus-4.6"
+: "anthropic/claude-sonnet-4.5"
+----
+
+You can also set different rate limits and spend limits per environment to prevent staging or development traffic from consuming production budgets.
+
+=== Tenancy and access governance
+
+AI Gateway provides multi-tenant isolation by design. Create separate gateways for teams, individual developers, applications, or service accounts, each with their own budgets, rate limits, routing policies, and observability scope. This tenancy model lets platform teams govern who uses what, how much they spend, and which models and tools they can access, without building custom authorization layers.

 === Unified LLM access (single endpoint for all providers)

@@ -85,27 +109,9 @@ response = client.chat.completions.create(

 To switch providers, you change only the `model` parameter from `openai/gpt-5.2` to `anthropic/claude-sonnet-4.5`. No code changes or redeployment needed.

-=== Policy-based routing and cost control
-
-AI Gateway lets you define routing rules, rate limits, and budgets once, then enforces them automatically for all requests.
-
-You can route requests to different models based on user attributes. For example, to direct premium users to a more capable model while routing free tier users to a cost-effective option, use a CEL expression:
-
-[source,cel]
------
-// Route premium users to best model, free users to cost-effective model
-request.headers["x-user-tier"] == "premium"
-? "anthropic/claude-opus-4.6"
-: "anthropic/claude-sonnet-4.5"
------
-
-You can also set different rate limits and spend limits per environment to prevent staging or development traffic from consuming production budgets.
-
-For reliability, you can configure provider pools with automatic failover. If you configure OpenAI GPT-4 as your primary model and Anthropic Claude Opus as the fallback, the gateway automatically routes requests to the fallback when it detects rate limits or timeouts from the primary provider. This configuration can significantly improve uptime (potentially up to 99.9% in some configurations) even during provider outages.
-
-=== MCP aggregation and orchestration
+=== Proxy for LLM models and MCP tool servers

-AI Gateway aggregates multiple glossterm:MCP server[,MCP servers] and provides deferred tool loading, which dramatically reduces token costs for AI agents.
+AI Gateway acts as a single proxy layer for both LLM model requests and MCP tool servers. For LLM traffic, it provides the unified endpoint described above. For AI agents that use MCP tools, it aggregates multiple MCP servers and provides deferred tool loading, which dramatically reduces token costs.

 Without AI Gateway, agents typically load all available glossterm:MCP tool[,tools] from multiple MCP servers at startup. This approach sends 50+ tool definitions with every request, creating high token costs (thousands of tokens per request), slow agent startup times, and no centralized governance over which tools agents can access.

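Note on the "High availability through governed failover" text added in what-is-ai-gateway.adoc: the priority-ordered failover it describes can be illustrated with a minimal Python sketch. This is not the gateway's implementation. `ProviderError`, `PoolEntry`, `complete_with_failover`, and the `send` callable are hypothetical names invented for this illustration; per the changed text, the real behavior is configured on the gateway and needs no application code.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


class ProviderError(Exception):
    """Hypothetical error standing in for a rate limit, timeout, or upstream failure."""


@dataclass(frozen=True)
class PoolEntry:
    model: str     # for example "openai/gpt-5.2"
    priority: int  # lower values are tried first


def complete_with_failover(
    pool: Sequence[PoolEntry],
    send: Callable[[str, str], str],
    prompt: str,
) -> tuple[str, str]:
    """Try each pool entry in priority order and return (model_used, response)."""
    errors = []
    for entry in sorted(pool, key=lambda e: e.priority):
        try:
            return entry.model, send(entry.model, prompt)
        except ProviderError as exc:
            # Record the failure and fall through to the next entry in the pool.
            errors.append(f"{entry.model}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(errors))
```

Here `send` stands in for whatever performs the upstream request. The sketch fails over only on `ProviderError`, which mirrors the idea in the changed text that failover happens under defined conditions rather than on every exception.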
modules/ai-agents/pages/index.adoc

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
 = Agentic AI
-:description: Learn about the Redpanda Agentic Data Plane, including the AI Gateway, AI agents, and MCP servers.
+:description: Learn about the Redpanda Agentic Data Plane. Keep AI-powered apps highly available, control costs across providers, and govern access for teams, apps, and service accounts.
 :page-layout: index
 :page-aliases: develop:agents/about.adoc, develop:ai-agents/about.adoc
