ai-agents:partial$adp-la.adoc
Configure Redpanda AI Gateway to support Claude Code clients accessing LLM providers and MCP tools through a unified endpoint.
After reading this page, you will be able to:
-
❏ Configure AI Gateway endpoints for Claude Code connectivity.
-
❏ Set up authentication and access control for Claude Code clients.
-
❏ Deploy MCP tool aggregation for Claude Code tool discovery.
-
AI Gateway deployed on a BYOC cluster running Redpanda version 25.3 or later
-
Administrator access to the AI Gateway UI
-
At least one LLM provider API key (OpenAI, Anthropic, or Google Gemini)
-
Understanding of AI Gateway concepts
Claude Code connects to AI Gateway through two primary endpoints:
-
LLM endpoint:
https://{CLUSTER_ID}.cloud.redpanda.com/ai-gateway/v1for chat completions -
MCP endpoint:
https://{CLUSTER_ID}.cloud.redpanda.com/ai-gateway/mcpfor tool discovery and execution
The gateway handles:
-
Authentication via bearer tokens in the
Authorizationheader -
Gateway selection via the endpoint URL
-
Model routing using the
vendor/model_idformat -
MCP server aggregation for multi-tool workflows
-
Request logging and cost tracking per gateway
Claude Code requires access to LLM providers through the gateway. Enable at least one provider.
Claude Code uses Anthropic models by default. To enable Anthropic:
-
Navigate to AI Gateway > Providers in the Redpanda Cloud console
-
Select Anthropic from the provider list
-
Click Add configuration
-
Enter your Anthropic API key
-
Click Save
The gateway can now route requests to Anthropic models.
To enable OpenAI as a provider:
-
Navigate to AI Gateway > Providers
-
Select OpenAI from the provider list
-
Click Add configuration
-
Enter your OpenAI API key
-
Click Save
After enabling providers, enable specific models:
-
Navigate to AI Gateway > Models
-
Enable the models you want Claude Code clients to access
Common models for Claude Code:
-
anthropic/claude-opus-4.6-5 -
anthropic/claude-sonnet-4.5 -
openai/gpt-5.2 -
openai/o1-mini
-
-
Click Save
Models appear in the catalog with the vendor/model_id format that Claude Code uses in requests.
Create a dedicated gateway to isolate Claude Code traffic and apply specific policies.
-
Navigate to Agentic > AI Gateway > Routers
-
Click Create Gateway
-
Enter gateway details:
Field Value Name
claude-code-gateway(or your preferred name)Workspace
Select the workspace for access control grouping
Description
Gateway for Claude Code IDE clients
-
Click Create
-
Copy the gateway ID from the gateway details page
The gateway ID is embedded in the gateway endpoint URL.
Set up routing policies for Claude Code requests.
Configure a primary provider with automatic failover:
-
Navigate to the gateway’s LLM tab
-
Under Routing, click Add route
-
Configure the route:
true # Matches all requests -
Add a Primary provider pool:
-
Provider: Anthropic
-
Model: All enabled Anthropic models
-
Load balancing: Round robin (if multiple Anthropic configurations exist)
-
-
Add a Fallback provider pool:
-
Provider: OpenAI
-
Model: All enabled OpenAI models
-
Failover conditions: Rate limits, timeouts, 5xx errors
-
-
Click Save
Claude Code requests route to Anthropic by default and fail over to OpenAI if Anthropic is unavailable.
Prevent runaway usage from Claude Code clients:
-
Navigate to the gateway’s LLM tab
-
Under Rate Limit, configure:
Setting Recommended Value Global rate limit
100 requests per minute
Per-user rate limit
10 requests per minute (if using user headers)
-
Click Save
The gateway blocks requests exceeding these limits and returns HTTP 429 errors.
Control LLM costs:
-
Under Spend Limit, configure:
Setting Value Monthly budget
$5,000 (adjust based on expected usage)
Enforcement
Block requests after budget exceeded
-
Click Save
The gateway tracks estimated costs per request and blocks traffic when the monthly budget is exhausted.
Enable Claude Code to discover and use tools from multiple MCP servers through a single endpoint.
-
Navigate to the gateway’s MCP tab
-
Click Add MCP Server
-
Enter server details:
Field Value Display name
Descriptive name (for example,
redpanda-data-catalog)Endpoint URL
MCP server endpoint (for example, Remote MCP server URL)
Authentication
Bearer token or other authentication mechanism
-
Click Save
Repeat for each MCP server you want to aggregate.
Reduce token costs by deferring tool discovery:
-
Under MCP Settings, enable Deferred tool loading
-
Click Save
When enabled:
-
Claude Code initially receives only a search tool and orchestrator tool
-
Claude Code queries for specific tools by name when needed
-
Token usage decreases by 80-90% for agents with many tools configured
The MCP orchestrator reduces multi-step workflows to single calls:
-
Under MCP Settings, enable MCP Orchestrator
-
Configure:
Setting Value Orchestrator model
Select a model with strong code generation capabilities (for example,
anthropic/claude-sonnet-4.5)Execution timeout
30 seconds
-
Click Save
Claude Code can now invoke the orchestrator tool to execute complex, multi-step operations in a single request.
Claude Code clients authenticate using bearer tokens.
-
Navigate to Security > API Tokens in the Redpanda Cloud console
-
Click Create Token
-
Enter token details:
Field Value Name
claude-code-accessScopes
ai-gateway:read,ai-gateway:writeExpiration
Set appropriate expiration based on security policies
-
Click Create
-
Copy the token (it appears only once)
Distribute this token to Claude Code users through secure channels.
Provide these instructions to users configuring Claude Code.
Users can configure Claude Code using the CLI:
claude mcp add \
--transport http \
redpanda-aigateway \
https://{CLUSTER_ID}.cloud.redpanda.com/ai-gateway/mcp \
--header "Authorization: Bearer YOUR_API_TOKEN"Replace:
-
{CLUSTER_ID}: Your Redpanda cluster ID -
YOUR_API_TOKEN: The API token generated earlier
Alternatively, users can edit ~/.claude.json (user-level) or .mcp.json (project-level):
{
"mcpServers": {
"redpanda-ai-gateway": {
"type": "http",
"url": "https://{CLUSTER_ID}.cloud.redpanda.com/ai-gateway/mcp",
"headers": {
"Authorization": "Bearer YOUR_API_TOKEN"
}
}
}
}This configuration:
-
Connects Claude Code to the aggregated MCP endpoint
-
Includes authentication headers
Track Claude Code activity through gateway observability features.
-
Navigate to AI Gateway > Observability > Logs
-
Filter by gateway ID:
claude-code-gateway -
Review:
-
Request timestamps and duration
-
Model used per request
-
Token usage (prompt and completion tokens)
-
Estimated cost per request
-
HTTP status codes and errors
-
-
Navigate to AI Gateway > Observability > Metrics
-
Select the Claude Code gateway
-
Review:
Metric Purpose Request volume
Identify usage patterns and peak times
Token usage
Track consumption trends
Estimated spend
Monitor costs against budget
Latency (p50, p95, p99)
Detect performance issues
Error rate
Identify failing requests or misconfigured clients
Programmatically access logs for integration with monitoring systems:
curl https://{CLUSTER_ID}.cloud.redpanda.com/api/ai-gateway/logs \
-H "Authorization: Bearer YOUR_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"gateway_id": "GATEWAY_ID",
"start_time": "2026-01-01T00:00:00Z",
"end_time": "2026-01-14T23:59:59Z",
"limit": 100
}'Apply these security best practices for Claude Code deployments.
Create tokens with minimal required scopes:
-
ai-gateway:read: Required for MCP tool discovery -
ai-gateway:write: Required for LLM requests and tool execution
Avoid granting broader scopes like admin or cluster:write.
If Claude Code clients connect from known IP ranges, configure network policies:
-
Use cloud provider security groups to restrict access to AI Gateway endpoints
-
Allowlist only the IP ranges where Claude Code clients operate
-
Monitor for unauthorized access attempts in request logs
Set short token lifetimes for high-security environments:
-
Development environments: 90 days
-
Production environments: 30 days
Automate token rotation to reduce manual overhead.
Common issues and solutions when configuring AI Gateway for Claude Code.
Symptom: Connection errors when Claude Code tries to discover tools or send LLM requests.
Causes and solutions:
-
Invalid gateway endpoint: Verify the gateway endpoint URL matches the endpoint from the console
-
Expired token: Generate a new API token and update the Claude Code configuration
-
Network connectivity: Verify the cluster endpoint is accessible from the client network
-
Provider not enabled: Ensure at least one LLM provider is enabled and has models in the catalog
Symptom: Claude Code does not discover MCP tools.
Causes and solutions:
-
MCP servers not configured: Add MCP server endpoints in the gateway’s MCP tab
-
Deferred loading enabled but search failing: Check that the search tool is correctly configured
-
MCP server authentication failing: Verify MCP server authentication credentials in the gateway configuration
Symptom: Token usage and costs exceed expectations.
Causes and solutions:
-
Deferred tool loading disabled: Enable deferred tool loading to reduce tokens by 80-90%
-
No rate limits: Apply per-minute rate limits to prevent runaway usage
-
Missing spending limits: Set monthly budget limits with blocking enforcement
-
Expensive models: Route to cost-effective models (for example, Claude Sonnet instead of Opus) for non-critical requests
Symptom: Claude Code receives HTTP 429 Too Many Requests errors.
Causes and solutions:
-
Rate limit exceeded: Review and increase rate limits if usage is legitimate
-
Upstream provider rate limits: Check if the upstream LLM provider is rate-limiting; configure failover pools
-
Budget exhausted: Verify monthly spending limit has not been reached