Configure AI Gateway for Claude Code


Configure Redpanda AI Gateway to support Claude Code clients accessing LLM providers and MCP tools through a unified endpoint.

After reading this page, you will be able to:

  • Configure AI Gateway endpoints for Claude Code connectivity.

  • Set up authentication and access control for Claude Code clients.

  • Deploy MCP tool aggregation for Claude Code tool discovery.

Prerequisites

  • AI Gateway deployed on a BYOC cluster running Redpanda version 25.3 or later

  • Administrator access to the AI Gateway UI

  • At least one LLM provider API key (OpenAI, Anthropic, or Google Gemini)

  • Understanding of AI Gateway concepts

Architecture overview

Claude Code connects to AI Gateway through two primary endpoints: one for LLM requests and one for MCP tool discovery and execution.

The gateway handles:

  1. Authentication via bearer tokens in the Authorization header

  2. Gateway selection via the endpoint URL

  3. Model routing using the vendor/model_id format

  4. MCP server aggregation for multi-tool workflows

  5. Request logging and cost tracking per gateway

Enable LLM providers

Claude Code requires access to LLM providers through the gateway. Enable at least one provider.

Configure Anthropic

Claude Code uses Anthropic models by default. To enable Anthropic:

  1. Navigate to AI Gateway > Providers in the Redpanda Cloud console

  2. Select Anthropic from the provider list

  3. Click Add configuration

  4. Enter your Anthropic API key

  5. Click Save

The gateway can now route requests to Anthropic models.

Configure OpenAI

To enable OpenAI as a provider:

  1. Navigate to AI Gateway > Providers

  2. Select OpenAI from the provider list

  3. Click Add configuration

  4. Enter your OpenAI API key

  5. Click Save

Enable models in the catalog

After enabling providers, enable specific models:

  1. Navigate to AI Gateway > Models

  2. Enable the models you want Claude Code clients to access

    Common models for Claude Code:

    • anthropic/claude-opus-4.6-5

    • anthropic/claude-sonnet-4.5

    • openai/gpt-5.2

    • openai/o1-mini

  3. Click Save

Models appear in the catalog with the vendor/model_id format that Claude Code uses in requests.
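As an illustration of the vendor/model_id format, the following Python sketch splits an identifier into its two parts. The function name parse_model_id is hypothetical and is not part of any Redpanda or Claude Code API; it only shows how the format can be read.

```python
def parse_model_id(model: str) -> tuple[str, str]:
    """Split a 'vendor/model_id' string into (vendor, model_id)."""
    vendor, sep, model_id = model.partition("/")
    if not sep or not vendor or not model_id:
        raise ValueError(f"expected 'vendor/model_id', got {model!r}")
    return vendor, model_id


print(parse_model_id("anthropic/claude-sonnet-4.5"))
```

A check like this can be useful in client-side scripts to catch identifiers that are missing the vendor prefix before a request is sent.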

Create a gateway for Claude Code clients

Create a dedicated gateway to isolate Claude Code traffic and apply specific policies.

Gateway configuration

  1. Navigate to Agentic > AI Gateway > Routers

  2. Click Create Gateway

  3. Enter gateway details:

    Field       | Value
    Name        | claude-code-gateway (or your preferred name)
    Workspace   | Select the workspace for access control grouping
    Description | Gateway for Claude Code IDE clients

  4. Click Create

  5. Copy the gateway ID from the gateway details page

The gateway ID is embedded in the gateway endpoint URL.

Configure LLM routing

Set up routing policies for Claude Code requests.

Basic routing with failover

Configure a primary provider with automatic failover:

  1. Navigate to the gateway’s LLM tab

  2. Under Routing, click Add route

  3. Configure the route:

    true  # Matches all requests

  4. Add a Primary provider pool:

    • Provider: Anthropic

    • Model: All enabled Anthropic models

    • Load balancing: Round robin (if multiple Anthropic configurations exist)

  5. Add a Fallback provider pool:

    • Provider: OpenAI

    • Model: All enabled OpenAI models

    • Failover conditions: Rate limits, timeouts, 5xx errors

  6. Click Save

Claude Code requests route to Anthropic by default and fail over to OpenAI if Anthropic is unavailable.
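The failover behavior described above can be pictured with a small Python sketch. This is a conceptual model only, not the gateway's implementation: ProviderError, should_fail_over, route, and the two pool functions are hypothetical names, and the status-code mapping (429 for rate limits, 5xx for server errors) is an assumption based on the failover conditions listed in step 5.

```python
class ProviderError(Exception):
    """Raised by a provider call; carries an HTTP-like status code."""

    def __init__(self, status: int):
        super().__init__(f"provider returned {status}")
        self.status = status


def should_fail_over(error: Exception) -> bool:
    """Fail over on rate limits (429), timeouts, and 5xx errors."""
    if isinstance(error, TimeoutError):
        return True
    if isinstance(error, ProviderError):
        return error.status == 429 or 500 <= error.status <= 599
    return False


def route(request, primary, fallback):
    """Try the primary pool first; use the fallback only on failover conditions."""
    try:
        return primary(request)
    except Exception as error:
        if should_fail_over(error):
            return fallback(request)
        raise


# Hypothetical stand-ins for the Anthropic and OpenAI provider pools.
def anthropic_pool(request):
    raise ProviderError(503)


def openai_pool(request):
    return f"openai handled: {request}"


print(route("hello", anthropic_pool, openai_pool))
```

Errors that are not failover conditions, such as a 400 response, are re-raised rather than retried against the fallback pool.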

User-based routing

Route requests based on user identity, provided that Claude Code clients send an identifying header. For example, this condition matches requests from premium-tier users:

request.headers["x-user-tier"] == "premium"

Create separate routes:

  • Premium route: Claude Opus 4.6.5 (highest quality)

  • Standard route: Claude Sonnet 4.5 (balanced cost and quality)
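The effect of the two routes can be sketched in Python as follows. The function select_model is hypothetical and only mirrors the header condition shown above; the model identifiers are taken from the catalog examples earlier on this page.

```python
def select_model(headers: dict[str, str]) -> str:
    """Pick a model the way the premium and standard routes would."""
    # Mirrors the condition request.headers["x-user-tier"] == "premium".
    if headers.get("x-user-tier") == "premium":
        return "anthropic/claude-opus-4.6-5"
    # Requests without the header, or with any other tier, use the standard route.
    return "anthropic/claude-sonnet-4.5"


print(select_model({"x-user-tier": "premium"}))
print(select_model({}))
```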

Apply rate limits

Prevent runaway usage from Claude Code clients:

  1. Navigate to the gateway’s LLM tab

  2. Under Rate Limit, configure:

    Setting             | Recommended value
    Global rate limit   | 100 requests per minute
    Per-user rate limit | 10 requests per minute (if using user headers)

  3. Click Save

The gateway blocks requests exceeding these limits and returns HTTP 429 errors.
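To show what a per-minute limit means in practice, here is a minimal fixed-window limiter in Python. It is a sketch under assumptions: the page does not say which algorithm the gateway uses, and the class name FixedWindowLimiter and its parameters are hypothetical.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` requests per `window` seconds."""

    def __init__(self, limit: int, window: float = 60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.window_start = clock()
        self.count = 0

    def check(self) -> int:
        """Return 200 if the request is allowed, 429 if it exceeds the limit."""
        now = self.clock()
        if now - self.window_start >= self.window:
            # A new window begins; reset the counter.
            self.window_start = now
            self.count = 0
        if self.count >= self.limit:
            return 429
        self.count += 1
        return 200


# Twelve back-to-back requests against a limit of 10 requests per minute.
limiter = FixedWindowLimiter(limit=10)
statuses = [limiter.check() for _ in range(12)]
print(statuses.count(200), statuses.count(429))
```

The clock parameter is injectable so the window reset can be exercised in tests without waiting a full minute.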

Set spending limits

Control LLM costs:

  1. Under Spend Limit, configure:

    Setting        | Value
    Monthly budget | $5,000 (adjust based on expected usage)
    Enforcement    | Block requests after budget exceeded

  2. Click Save

The gateway tracks estimated costs per request and blocks traffic when the monthly budget is exhausted.
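The blocking enforcement mode can be modeled with a simple running total. The Python sketch below is illustrative only: MonthlyBudget is a hypothetical class, and the choice to block once spend reaches the limit is an assumption about how "exhausted" is evaluated.

```python
class MonthlyBudget:
    """Track estimated spend and block requests once the budget is used up."""

    def __init__(self, limit_usd: float):
        self.limit_usd = limit_usd
        self.spent_usd = 0.0

    def allow_request(self) -> bool:
        """Requests are allowed while spend is below the monthly limit."""
        return self.spent_usd < self.limit_usd

    def record(self, estimated_cost_usd: float) -> None:
        """Add the estimated cost of a completed request to the running total."""
        self.spent_usd += estimated_cost_usd


budget = MonthlyBudget(limit_usd=5000.0)
budget.record(4999.0)
print(budget.allow_request())
budget.record(1.0)
print(budget.allow_request())
```

Because costs are estimates recorded after each request, the final request before blocking can push actual spend slightly past the limit.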

Configure MCP tool aggregation

Enable Claude Code to discover and use tools from multiple MCP servers through a single endpoint.

Add MCP servers

  1. Navigate to the gateway’s MCP tab

  2. Click Add MCP Server

  3. Enter server details:

    Field          | Value
    Display name   | Descriptive name (for example, redpanda-data-catalog)
    Endpoint URL   | MCP server endpoint (for example, Remote MCP server URL)
    Authentication | Bearer token or other authentication mechanism

  4. Click Save

Repeat for each MCP server you want to aggregate.

Enable deferred tool loading

Reduce token costs by deferring tool discovery:

  1. Under MCP Settings, enable Deferred tool loading

  2. Click Save

When enabled:

  • Claude Code initially receives only a search tool and an orchestrator tool

  • Claude Code queries for specific tools by name when needed

  • Token usage decreases by 80-90% for agents with many tools configured
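The idea behind deferred loading can be sketched as a catalog that is searched on demand instead of being sent in full. Everything in this Python example is hypothetical: the tool names, descriptions, and the search_tools and initial_tools functions are made up for illustration and do not reflect the gateway's actual tool names or matching logic.

```python
# Hypothetical tool catalog: names and descriptions are made up for illustration.
CATALOG = {
    "list_topics": "List topics in a cluster",
    "describe_topic": "Show the configuration of a topic",
    "query_catalog": "Look up a dataset in the data catalog",
}


def initial_tools() -> list[str]:
    """With deferred loading, only the search and orchestrator tools are listed up front."""
    return ["search_tools", "orchestrator"]


def search_tools(query: str) -> dict[str, str]:
    """Return the catalog entries whose name contains the query string."""
    return {name: desc for name, desc in CATALOG.items() if query in name}


print(initial_tools())
print(search_tools("topic"))
```

The saving comes from the client receiving two tool definitions up front rather than every definition in the catalog, with the rest fetched only when a task needs them.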

Add the MCP orchestrator

The MCP orchestrator reduces multi-step workflows to single calls:

  1. Under MCP Settings, enable MCP Orchestrator

  2. Configure:

    Setting            | Value
    Orchestrator model | Select a model with strong code generation capabilities (for example, anthropic/claude-sonnet-4.5)
    Execution timeout  | 30 seconds

  3. Click Save

Claude Code can now invoke the orchestrator tool to execute complex, multi-step operations in a single request.

Configure authentication

Claude Code clients authenticate using bearer tokens.

Generate API tokens

  1. Navigate to Security > API Tokens in the Redpanda Cloud console

  2. Click Create Token

  3. Enter token details:

    Field      | Value
    Name       | claude-code-access
    Scopes     | ai-gateway:read, ai-gateway:write
    Expiration | Set appropriate expiration based on security policies

  4. Click Create

  5. Copy the token (it appears only once)

Distribute this token to Claude Code users through secure channels.

Token rotation

Implement token rotation for security:

  1. Create a new token before the existing token expires

  2. Distribute the new token to users

  3. Monitor usage of the old token in the observability dashboard

  4. Revoke the old token after all users have migrated
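One way to decide when step 1 should happen is to subtract an overlap period from the token's expiration date. The Python sketch below assumes a seven-day overlap and uses hypothetical function names; it does not call any Redpanda API, and the dates are examples.

```python
from datetime import datetime, timedelta, timezone


def rotation_start(expires_at: datetime, overlap: timedelta) -> datetime:
    """Return when to create the replacement token so both tokens overlap."""
    return expires_at - overlap


def needs_rotation(now: datetime, expires_at: datetime, overlap: timedelta) -> bool:
    """True once the overlap period before expiration has begun."""
    return now >= rotation_start(expires_at, overlap)


expires = datetime(2026, 3, 1, tzinfo=timezone.utc)  # example expiration date
overlap = timedelta(days=7)                          # assumed migration period
print(rotation_start(expires, overlap).isoformat())
```

The overlap should be long enough for all users to switch to the new token before the old one is revoked.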

Configure Claude Code clients

Provide these instructions to users configuring Claude Code.

CLI configuration

Users can configure Claude Code using the CLI:

claude mcp add \
  --transport http \
  redpanda-ai-gateway \
  https://{CLUSTER_ID}.cloud.redpanda.com/ai-gateway/mcp \
  --header "Authorization: Bearer YOUR_API_TOKEN"

Replace:

  • {CLUSTER_ID}: Your Redpanda cluster ID

  • YOUR_API_TOKEN: The API token generated earlier

Configuration file

Alternatively, users can edit ~/.claude.json (user-level) or .mcp.json (project-level):

{
  "mcpServers": {
    "redpanda-ai-gateway": {
      "type": "http",
      "url": "https://{CLUSTER_ID}.cloud.redpanda.com/ai-gateway/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_API_TOKEN"
      }
    }
  }
}

This configuration:

  • Connects Claude Code to the aggregated MCP endpoint

  • Includes authentication headers
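For teams that distribute this configuration with a script, the same structure can be generated in Python. This is a sketch: mcp_server_entry is a hypothetical helper, and the placeholder arguments must be replaced with a real cluster ID and API token.

```python
import json


def mcp_server_entry(cluster_id: str, api_token: str) -> dict:
    """Build the mcpServers entry shown above for a given cluster and token."""
    return {
        "mcpServers": {
            "redpanda-ai-gateway": {
                "type": "http",
                "url": f"https://{cluster_id}.cloud.redpanda.com/ai-gateway/mcp",
                "headers": {"Authorization": f"Bearer {api_token}"},
            }
        }
    }


# Print the JSON; write it to .mcp.json only after reviewing it.
print(json.dumps(mcp_server_entry("CLUSTER_ID", "YOUR_API_TOKEN"), indent=2))
```

Because the output contains the API token, avoid committing a generated project-level file to version control.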

Monitor Claude Code usage

Track Claude Code activity through gateway observability features.

View request logs

  1. Navigate to AI Gateway > Observability > Logs

  2. Filter by the gateway ID of claude-code-gateway

  3. Review:

    • Request timestamps and duration

    • Model used per request

    • Token usage (prompt and completion tokens)

    • Estimated cost per request

    • HTTP status codes and errors

Analyze metrics

  1. Navigate to AI Gateway > Observability > Metrics

  2. Select the Claude Code gateway

  3. Review:

    Metric                  | Purpose
    Request volume          | Identify usage patterns and peak times
    Token usage             | Track consumption trends
    Estimated spend         | Monitor costs against budget
    Latency (p50, p95, p99) | Detect performance issues
    Error rate              | Identify failing requests or misconfigured clients
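For readers unfamiliar with the latency percentiles listed above, the following Python sketch computes p50, p95, and p99 with the nearest-rank method. The latency samples are made up, and the gateway may compute its percentiles differently.

```python
import math


def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


# Made-up request latencies in milliseconds, including one slow outlier.
latencies_ms = [120, 95, 110, 480, 105, 130, 100, 2300, 115, 125]
for p in (50, 95, 99):
    print(f"p{p}: {percentile(latencies_ms, p)} ms")
```

With these samples the median stays near typical latency while p95 and p99 are dominated by the outlier, which is why the higher percentiles are the ones to watch for performance issues.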

Query logs via API

Programmatically access logs for integration with monitoring systems:

curl https://{CLUSTER_ID}.cloud.redpanda.com/api/ai-gateway/logs \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "gateway_id": "GATEWAY_ID",
    "start_time": "2026-01-01T00:00:00Z",
    "end_time": "2026-01-14T23:59:59Z",
    "limit": 100
  }'
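The same request can be prepared from Python using only the standard library. This sketch builds the request object without sending it; build_logs_request is a hypothetical helper, the endpoint and body fields are copied from the curl example above, and the response format is not described on this page.

```python
import json
import urllib.request


def build_logs_request(cluster_id, api_token, gateway_id, start_time, end_time, limit=100):
    """Build (but do not send) the log query shown in the curl example."""
    url = f"https://{cluster_id}.cloud.redpanda.com/api/ai-gateway/logs"
    body = json.dumps({
        "gateway_id": gateway_id,
        "start_time": start_time,
        "end_time": end_time,
        "limit": limit,
    }).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",  # curl also uses POST when -d is given
    )


request = build_logs_request(
    "CLUSTER_ID", "YOUR_API_TOKEN", "GATEWAY_ID",
    "2026-01-01T00:00:00Z", "2026-01-14T23:59:59Z",
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request) (requires network access and a valid token).
```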

Security considerations

Apply these security best practices for Claude Code deployments.

Limit token scope

Create tokens with minimal required scopes:

  • ai-gateway:read: Required for MCP tool discovery

  • ai-gateway:write: Required for LLM requests and tool execution

Avoid granting broader scopes like admin or cluster:write.

Implement network restrictions

If Claude Code clients connect from known IP ranges, configure network policies:

  1. Use cloud provider security groups to restrict access to AI Gateway endpoints

  2. Allowlist only the IP ranges where Claude Code clients operate

  3. Monitor for unauthorized access attempts in request logs
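The allowlist check itself is simple to express. The Python sketch below uses the standard ipaddress module with reserved documentation ranges as placeholders; ALLOWED_RANGES and is_allowed are hypothetical names, and in practice the rule would be enforced by your cloud provider's security groups rather than application code.

```python
import ipaddress

# Placeholder ranges reserved for documentation; replace with your client networks.
ALLOWED_RANGES = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_allowed(client_ip: str) -> bool:
    """Return True if the client address falls inside any allowlisted range."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in ALLOWED_RANGES)


print(is_allowed("203.0.113.7"))
print(is_allowed("192.0.2.1"))
```

A check like this can also be run over exported request logs to flag source addresses outside the expected ranges.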

Enforce token expiration

Set short token lifetimes for high-security environments:

  • Development environments: 90 days

  • Production environments: 30 days

Automate token rotation to reduce manual overhead.

Audit tool access

Review which MCP tools Claude Code clients can access:

  1. Periodically audit the MCP servers configured in the gateway

  2. Remove unused or deprecated MCP servers

  3. Monitor tool execution logs for unexpected behavior

Troubleshooting

Common issues and solutions when configuring AI Gateway for Claude Code.

Claude Code cannot connect to gateway

Symptom: Connection errors when Claude Code tries to discover tools or send LLM requests.

Causes and solutions:

  • Invalid gateway endpoint: Verify the gateway endpoint URL matches the endpoint from the console

  • Expired token: Generate a new API token and update the Claude Code configuration

  • Network connectivity: Verify the cluster endpoint is accessible from the client network

  • Provider not enabled: Ensure at least one LLM provider is enabled and has models in the catalog

Tools not appearing in Claude Code

Symptom: Claude Code does not discover MCP tools.

Causes and solutions:

  • MCP servers not configured: Add MCP server endpoints in the gateway’s MCP tab

  • Deferred loading enabled but search failing: Check that the search tool is correctly configured

  • MCP server authentication failing: Verify MCP server authentication credentials in the gateway configuration

High costs or token usage

Symptom: Token usage and costs exceed expectations.

Causes and solutions:

  • Deferred tool loading disabled: Enable deferred tool loading to reduce tokens by 80-90%

  • No rate limits: Apply per-minute rate limits to prevent runaway usage

  • Missing spending limits: Set monthly budget limits with blocking enforcement

  • Expensive models: Route to cost-effective models (for example, Claude Sonnet instead of Opus) for non-critical requests

Requests failing with 429 errors

Symptom: Claude Code receives HTTP 429 Too Many Requests errors.

Causes and solutions:

  • Rate limit exceeded: Review and increase rate limits if usage is legitimate

  • Upstream provider rate limits: Check if the upstream LLM provider is rate-limiting; configure failover pools

  • Budget exhausted: Verify monthly spending limit has not been reached
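For custom scripts that call the gateway directly, retrying after a 429 with increasing delays can reduce pressure on the rate limit. The Python sketch below only computes a capped exponential backoff schedule; backoff_delays is a hypothetical helper, the base and cap values are assumptions, and this page does not describe how Claude Code itself retries.

```python
def backoff_delays(retries: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Return the wait time in seconds before each retry, doubling up to a cap."""
    return [min(cap, base * 2 ** attempt) for attempt in range(retries)]


print(backoff_delays(6))
```

Retrying does not help when the monthly budget is exhausted, so check the spending limit before adding retries.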

Next steps

  • ai-agents:ai-gateway/cel-routing-cookbook.adoc: Implement advanced routing rules

  • ai-agents:mcp/remote/overview.adoc: Deploy Remote MCP servers for custom tools