Skip to content

Latest commit

 

History

History
741 lines (506 loc) · 21.4 KB

File metadata and controls

741 lines (506 loc) · 21.4 KB

Configure AI Gateway for Continue.dev

ai-agents:partial$adp-la.adoc

Configure Redpanda AI Gateway to support Continue.dev clients accessing multiple LLM providers and MCP tools through flexible, native-format endpoints.

After reading this page, you will be able to:

  • ❏ Configure AI Gateway endpoints for Continue.dev connectivity.

  • ❏ Set up multi-provider backends with native format routing.

  • ❏ Deploy MCP tool aggregation for Continue.dev tool discovery.

Prerequisites

  • AI Gateway deployed on a BYOC cluster running Redpanda version 25.3 or later

  • Administrator access to the AI Gateway UI

  • API keys for at least one LLM provider (Anthropic, OpenAI, or others)

  • Understanding of AI Gateway concepts

About Continue.dev

Continue.dev is a highly configurable open-source AI coding assistant that integrates with VS Code and JetBrains IDEs. Unlike other AI assistants, Continue.dev uses native provider API formats rather than requiring transforms to a unified format. This architectural choice provides maximum flexibility but requires specific gateway configuration.

Key characteristics:

  • Uses native provider formats (Anthropic format for Anthropic, OpenAI format for OpenAI)

  • Supports multiple LLM providers simultaneously with per-provider configuration

  • Custom API endpoints via apiBase configuration

  • Custom headers via requestOptions.headers

  • Built-in MCP support for tool discovery and execution

  • Autocomplete, chat, and inline edit modes

Architecture overview

Continue.dev connects to AI Gateway differently than unified-format clients:

The gateway handles:

  1. Authentication via bearer tokens in the Authorization header

  2. Provider-specific request formats without transformation

  3. Model routing using provider-native model identifiers

  4. MCP server aggregation for multi-tool workflows

  5. Request logging and cost tracking per gateway

Enable LLM providers

Continue.dev works with multiple providers. Enable the providers your users will access.

Configure Anthropic

To enable Anthropic with native format support:

  1. Navigate to AI Gateway > Providers in the Redpanda Cloud console

  2. Select Anthropic from the provider list

  3. Click Add configuration

  4. Enter your Anthropic API key

  5. Under Format, select Native Anthropic (not OpenAI-compatible)

  6. Click Save

The gateway now accepts Anthropic’s native /v1/messages format.

Configure OpenAI

To enable OpenAI:

  1. Navigate to AI Gateway > Providers

  2. Select OpenAI from the provider list

  3. Click Add configuration

  4. Enter your OpenAI API key

  5. Under Format, select Native OpenAI

  6. Click Save

Configure additional providers

Continue.dev supports many providers. For each provider:

  1. Add the provider configuration in the gateway

  2. Ensure the format is set to the provider’s native format

  3. Do not enable format transforms (Continue.dev handles format differences in its client code)

Common additional providers:

  • Google Gemini (native Google format)

  • Mistral AI (OpenAI-compatible format)

  • Together AI (OpenAI-compatible format)

  • Ollama (OpenAI-compatible format for local models)

Enable models in the catalog

After enabling providers, enable specific models:

  1. Navigate to AI Gateway > Models

  2. Enable the models you want Continue.dev clients to access

    Common models for Continue.dev:

    • claude-opus-4.6 (Anthropic, high quality)

    • claude-sonnet-4.5 (Anthropic, balanced)

    • gpt-5.2 (OpenAI, high quality)

    • gpt-5.2-mini (OpenAI, fast autocomplete)

    • o1-mini (OpenAI, reasoning)

  3. Click Save

Continue.dev uses provider-native model identifiers (for example, claude-sonnet-4.5 not anthropic/claude-sonnet-4.5).

Create a gateway for Continue.dev clients

Create a dedicated gateway to isolate Continue.dev traffic and apply specific policies.

Gateway configuration

  1. Navigate to Agentic > AI Gateway > Routers

  2. Click Create Gateway

  3. Enter gateway details:

    Field Value

    Name

    continue-gateway (or your preferred name)

    Workspace

    Select the workspace for access control grouping

    Description

    Gateway for Continue.dev IDE clients

  4. Click Create

  5. Copy the gateway endpoint URL from the gateway details page

Configure provider-specific backends

Continue.dev requires separate backend configurations for each provider because it uses native formats.

Anthropic backend

  1. Navigate to the gateway’s Backends tab

  2. Click Add Backend

  3. Configure:

    Field Value

    Backend name

    anthropic-native

    Provider

    Anthropic

    Format

    Native Anthropic (no transform)

    Path

    /v1/anthropic

    Enabled models

    All Anthropic models you enabled in the catalog

  4. Click Save

Continue.dev will send requests to https://{CLUSTER_ID}.cloud.redpanda.com/ai-gateway/v1/anthropic using Anthropic’s native format.

OpenAI backend

  1. Click Add Backend

  2. Configure:

    Field Value

    Backend name

    openai-native

    Provider

    OpenAI

    Format

    Native OpenAI (no transform)

    Path

    /v1/openai

    Enabled models

    All OpenAI models you enabled in the catalog

  3. Click Save

Continue.dev will send requests to https://{CLUSTER_ID}.cloud.redpanda.com/ai-gateway/v1/openai using OpenAI’s native format.

Additional provider backends

Repeat the backend configuration process for each provider:

  • Google Gemini: /v1/google, native Google format

  • Mistral: /v1/mistral, OpenAI-compatible format

  • Ollama (if proxying local models): /v1/ollama, OpenAI-compatible format

Configure LLM routing

Set up routing policies for Continue.dev requests.

Per-provider routing

Configure routing rules that apply to each backend:

  1. Navigate to the gateway’s Routing tab

  2. For each backend, click Add Route

  3. Configure basic routing:

    true  # Matches all requests to this backend
  4. Add a primary provider configuration with your Anthropic API key

  5. (Optional) Add a fallback configuration for redundancy if you have multiple API keys

  6. Click Save

Provider failover

For providers with multiple API keys, configure failover:

  1. In the backend’s routing configuration, add multiple provider configurations

  2. Set failover conditions:

    • Rate limits (HTTP 429)

    • Timeouts (no response within 30 seconds)

    • 5xx errors (provider unavailable)

  3. Configure load balancing: Round robin across available keys

  4. Click Save

Continue.dev requests automatically fail over to healthy API keys when the primary key experiences issues.

Apply rate limits

Prevent runaway usage from Continue.dev clients:

  1. Navigate to the gateway’s Rate Limits tab

  2. Configure global limits:

    Setting Recommended Value

    Global rate limit

    200 requests per minute (Continue.dev autocomplete can generate many requests)

    Per-user rate limit

    20 requests per minute (if using user identification headers)

    Per-backend limits

    Vary by provider (autocomplete backends need higher limits)

  3. Click Save

The gateway blocks requests exceeding these limits and returns HTTP 429 errors.

Rate limit considerations for autocomplete

Continue.dev’s autocomplete feature generates frequent, short requests. Configure higher rate limits for autocomplete-specific backends:

  • Autocomplete models (for example, gpt-5.2-mini): 100 requests per minute per user

  • Chat models (for example, claude-sonnet-4.5): 20 requests per minute per user

Set spending limits

Control LLM costs across all providers:

  1. Navigate to the gateway’s Spend Limits tab

  2. Configure:

    Setting Value

    Monthly budget

    $10,000 (adjust based on expected usage)

    Enforcement

    Block requests after budget exceeded

    Alert threshold

    80% of budget (sends notification)

  3. Click Save

The gateway tracks estimated costs per request across all providers and blocks traffic when the monthly budget is exhausted.

Configure MCP tool aggregation

Enable Continue.dev to discover and use tools from multiple MCP servers through a single endpoint.

Add MCP servers

  1. Navigate to the gateway’s MCP tab

  2. Click Add MCP Server

  3. Enter server details:

    Field Value

    Display name

    Descriptive name (for example, redpanda-data-catalog, code-search-tools)

    Endpoint URL

    MCP server endpoint (for example, Remote MCP server URL)

    Authentication

    Bearer token or other authentication mechanism

  4. Click Save

Repeat for each MCP server you want to aggregate.

Enable deferred tool loading

Reduce token costs for Continue.dev sessions with many available tools:

  1. Under MCP Settings, enable Deferred tool loading

  2. Click Save

When enabled:

  • Continue.dev initially receives only a search tool and orchestrator tool

  • Continue.dev queries for specific tools by name when needed

  • Token usage decreases by 80-90% for configurations with many tools

This is particularly important for Continue.dev because autocomplete and chat modes both use tool discovery.

Add the MCP orchestrator

The MCP orchestrator reduces multi-step workflows to single calls:

  1. Under MCP Settings, enable MCP Orchestrator

  2. Configure:

    Setting Value

    Orchestrator model

    Select a model with strong code generation capabilities (for example, claude-sonnet-4.5)

    Execution timeout

    30 seconds

    Backend

    Select the Anthropic backend (orchestrator works best with Claude models)

  3. Click Save

Continue.dev can now invoke the orchestrator tool to execute complex, multi-step operations in a single request.

Configure authentication

Continue.dev clients authenticate using bearer tokens.

Generate API tokens

  1. Navigate to Security > API Tokens in the Redpanda Cloud console

  2. Click Create Token

  3. Enter token details:

    Field Value

    Name

    continue-access

    Scopes

    ai-gateway:read, ai-gateway:write

    Expiration

    Set appropriate expiration based on security policies

  4. Click Create

  5. Copy the token (it appears only once)

Distribute this token to Continue.dev users through secure channels.

Token rotation

Implement token rotation for security:

  1. Create a new token before the existing token expires

  2. Distribute the new token to users

  3. Monitor usage of the old token in (observability dashboard)

  4. Revoke the old token after all users have migrated

Configure Continue.dev clients

Provide these instructions to users configuring Continue.dev in their IDE.

Configuration file location

Continue.dev supports both JSON and YAML configuration formats. This guide uses YAML (config.yaml) because it supports MCP server configuration and environment variable interpolation:

  • VS Code: ~/.continue/config.yaml

  • JetBrains: ~/.continue/config.yaml

Note
While config.json is still supported for basic LLM configuration, config.yaml is required for MCP server integration.

Multi-provider configuration

Users configure Continue.dev with separate provider entries for each backend:

models:
  - title: Claude Sonnet (Redpanda)
    provider: anthropic
    model: claude-sonnet-4.5
    apiBase: https://{CLUSTER_ID}.cloud.redpanda.com/ai-gateway/v1/anthropic
    apiKey: YOUR_API_TOKEN

  - title: GPT-5.2 (Redpanda)
    provider: openai
    model: gpt-5.2
    apiBase: https://{CLUSTER_ID}.cloud.redpanda.com/ai-gateway/v1/openai
    apiKey: YOUR_API_TOKEN

  - title: GPT-5.2-mini (Autocomplete)
    provider: openai
    model: gpt-5.2-mini
    apiBase: https://{CLUSTER_ID}.cloud.redpanda.com/ai-gateway/v1/openai
    apiKey: YOUR_API_TOKEN

tabAutocompleteModel:
  title: GPT-5.2-mini (Autocomplete)
  provider: openai
  model: gpt-5.2-mini
  apiBase: https://{CLUSTER_ID}.cloud.redpanda.com/ai-gateway/v1/openai
  apiKey: YOUR_API_TOKEN

Replace:

  • {CLUSTER_ID}: Your Redpanda cluster ID

  • YOUR_API_TOKEN: The API token generated earlier

MCP server configuration

Configure Continue.dev to connect to the aggregated MCP endpoint.

The preferred method is to create MCP server configuration files in the ~/.continue/mcpServers/ directory:

  1. Create the directory: mkdir -p ~/.continue/mcpServers

  2. Create ~/.continue/mcpServers/redpanda-ai-gateway.yaml:

    transport:
      type: streamable-http
      url: https://{CLUSTER_ID}.cloud.redpanda.com/ai-gateway/mcp
      headers:
        Authorization: Bearer YOUR_API_TOKEN
    Important
    For production deployments, use environment variable interpolation with ${{ secrets.VARIABLE }} syntax instead of hardcoding tokens. See Configure with environment variables in the user guide for details.

Continue.dev automatically discovers MCP server configurations in this directory.

Alternative: Inline configuration

Alternatively, embed MCP server configuration in ~/.continue/config.yaml:

mcpServers:
  - transport:
      type: streamable-http
      url: https://{CLUSTER_ID}.cloud.redpanda.com/ai-gateway/mcp
      headers:
        Authorization: Bearer YOUR_API_TOKEN

Replace:

  • {CLUSTER_ID}: Your Redpanda cluster ID

  • YOUR_API_TOKEN: The API token generated earlier

This configuration connects Continue.dev to the aggregated MCP endpoint with authentication headers.

Model selection strategy

Configure different models for different Continue.dev modes:

Mode Recommended Model Reason

Chat

claude-sonnet-4.5 or gpt-5.2

High quality for complex questions

Autocomplete

gpt-5.2-mini

Fast, cost-effective for frequent requests

Inline edit

claude-sonnet-4.5

Balanced quality and speed for code modifications

Embeddings

text-embedding-3-small

Cost-effective for code search

Monitor Continue.dev usage

Track Continue.dev activity through gateway observability features.

View request logs

  1. Navigate to AI Gateway > Observability > Logs

  2. Filter by gateway ID: continue-gateway

  3. Review:

    • Request timestamps and duration

    • Backend and model used per request

    • Token usage (prompt and completion tokens)

    • Estimated cost per request

    • HTTP status codes and errors

Continue.dev generates different request patterns:

  • Autocomplete: Many short requests with low token counts

  • Chat: Longer requests with context and multi-turn conversations

  • Inline edit: Medium-length requests with code context

Analyze metrics

  1. Navigate to AI Gateway > Observability > Metrics

  2. Select the Continue.dev gateway

  3. Review:

    Metric Purpose

    Request volume by backend

    Identify which providers are most used

    Token usage by model

    Track consumption patterns (autocomplete vs chat)

    Estimated spend by backend

    Monitor costs across providers

    Latency (p50, p95, p99) by backend

    Detect provider-specific performance issues

    Error rate by backend

    Identify failing providers or misconfigured backends

Query logs via API

Programmatically access logs for integration with monitoring systems:

curl https://{CLUSTER_ID}.cloud.redpanda.com/api/ai-gateway/logs \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "gateway_id": "GATEWAY_ID",
    "start_time": "2026-01-01T00:00:00Z",
    "end_time": "2026-01-14T23:59:59Z",
    "limit": 100
  }'

Security considerations

Apply these security best practices for Continue.dev deployments.

Limit token scope

Create tokens with minimal required scopes:

  • ai-gateway:read: Required for MCP tool discovery

  • ai-gateway:write: Required for LLM requests and tool execution

Avoid granting broader scopes like admin or cluster:write.

Implement network restrictions

If Continue.dev clients connect from known networks, configure network policies:

  1. Use cloud provider security groups to restrict access to AI Gateway endpoints

  2. Allowlist only the IP ranges where Continue.dev clients operate

  3. Monitor for unauthorized access attempts in request logs

Enforce token expiration

Set short token lifetimes for high-security environments:

  • Development environments: 90 days

  • Production environments: 30 days

Automate token rotation to reduce manual overhead.

Audit tool access

Review which MCP tools Continue.dev clients can access:

  1. Periodically audit the MCP servers configured in the gateway

  2. Remove unused or deprecated MCP servers

  3. Monitor tool execution logs for unexpected behavior

Protect API keys in configuration

Continue.dev stores the API token in plain text in config.yaml. Remind users to:

  • Never commit config.yaml to version control

  • Use file system permissions to restrict access (for example, chmod 600 ~/.continue/config.yaml)

  • Rotate tokens if they suspect compromise

Troubleshooting

Common issues and solutions when configuring AI Gateway for Continue.dev.

Continue.dev cannot connect to gateway

Symptom: Connection errors when Continue.dev tries to discover tools or send LLM requests.

Causes and solutions:

  • Invalid gateway ID: Verify the gateway endpoint URL matches the URL from the console

  • Expired token: Generate a new API token and update the Continue.dev configuration

  • Wrong backend path: Verify apiBase matches the backend path (for example, /v1/anthropic not /v1)

  • Network connectivity: Verify the cluster endpoint is accessible from the client network

  • Provider not enabled: Ensure at least one backend is configured with models enabled

Model not found errors

Symptom: Continue.dev shows "model not found" or similar errors.

Causes and solutions:

  • Model not enabled in catalog: Enable the model in the gateway’s model catalog

  • Model identifier mismatch: Use provider-native names (for example, claude-sonnet-4.5 not anthropic/claude-sonnet-4.5)

  • Wrong backend for model: Verify the model is associated with the correct backend (Anthropic models with Anthropic backend)

Format errors or unexpected responses

Symptom: Responses are malformed or Continue.dev reports format errors.

Causes and solutions:

  • Transform enabled on backend: Ensure backend format is set to native (no OpenAI-compatible transform for Anthropic)

  • Wrong provider for apiBase: Verify Continue.dev’s provider field matches the backend’s provider

  • Headers not passed: Confirm requestOptions.headers is correctly configured

Autocomplete not working or slow

Symptom: Autocomplete suggestions don’t appear or are delayed.

Causes and solutions:

  • Wrong model for autocomplete: Use a fast model like gpt-5.2-mini in tabAutocompleteModel

  • Rate limits too restrictive: Increase rate limits for autocomplete backend

  • High backend latency: Check backend metrics and consider provider failover

  • Token exhaustion: Verify spending limits haven’t been reached

Tools not appearing in Continue.dev

Symptom: Continue.dev does not discover MCP tools.

Causes and solutions:

  • MCP configuration missing: Ensure mcpServers is configured

  • MCP servers not configured in gateway: Add MCP server endpoints in the gateway’s MCP tab

  • Deferred loading enabled but search failing: Check that the search tool is correctly configured

  • MCP server authentication failing: Verify MCP server authentication credentials in the gateway configuration

High costs or token usage

Symptom: Token usage and costs exceed expectations.

Causes and solutions:

  • Autocomplete using expensive model: Configure tabAutocompleteModel to use gpt-5.2-mini instead of larger models

  • Deferred tool loading disabled: Enable deferred tool loading to reduce tokens by 80-90%

  • No rate limits: Apply per-minute rate limits to prevent runaway usage

  • Missing spending limits: Set monthly budget limits with blocking enforcement

  • Chat using wrong model: Route chat requests to cost-effective models (for example, claude-sonnet-4.5 instead of claude-opus-4.6)

Requests failing with 429 errors

Symptom: Continue.dev receives HTTP 429 Too Many Requests errors.

Causes and solutions:

  • Rate limit exceeded: Review and increase rate limits if usage is legitimate (autocomplete needs higher limits)

  • Upstream provider rate limits: Check if the upstream LLM provider is rate-limiting; configure failover to alternate API keys

  • Budget exhausted: Verify monthly spending limit has not been reached

Different results from different providers

Symptom: Same prompt produces different results when switching providers.

This is expected behavior, not a configuration issue:

  • Different models have different capabilities and response styles

  • Continue.dev uses native formats, which may include provider-specific parameters

  • Users should select the appropriate model for their task (quality vs speed vs cost)

Next steps

  • ai-agents:ai-gateway/cel-routing-cookbook.adoc: Implement advanced routing rules

  • ai-agents:mcp/remote/overview.adoc: Deploy Remote MCP servers for custom tools