
Configure AI Gateway for GitHub Copilot

Configure Redpanda AI Gateway to support GitHub Copilot clients accessing multiple LLM providers through OpenAI-compatible endpoints with bring-your-own-key (BYOK) support.

After reading this page, you will be able to:

  • ❏ Configure AI Gateway endpoints for GitHub Copilot connectivity.

  • ❏ Deploy multi-tenant authentication strategies for Copilot clients.

  • ❏ Set up model aliasing and BYOK routing for GitHub Copilot.

Prerequisites

  • AI Gateway deployed on a BYOC cluster running Redpanda version 25.3 or later

  • Administrator access to the AI Gateway UI

  • API keys for at least one LLM provider (OpenAI, Anthropic, or others)

  • Understanding of AI Gateway concepts

  • GitHub Copilot Business or Enterprise subscription (for BYOK and custom endpoints)

About GitHub Copilot

GitHub Copilot is an AI-powered code completion tool that integrates with popular IDEs including VS Code, Visual Studio, JetBrains IDEs, and Neovim. GitHub Copilot uses OpenAI models by default but supports BYOK (bring your own key) configurations for Business and Enterprise customers.

Key characteristics:

  • Sends all requests in OpenAI-compatible format to /v1/chat/completions

  • Limited support for custom headers (similar to Cursor IDE)

  • Supports BYOK for Business/Enterprise subscriptions

  • Built-in code completion, chat, and inline editing modes

  • Configuration via IDE settings or organization policies

  • High request volume from code completion features

Architecture overview

GitHub Copilot connects to AI Gateway through standardized endpoints. The gateway handles:

  1. Authentication via bearer tokens in the Authorization header

  2. Gateway selection via URL path routing or query parameters

  3. Model routing and aliasing for friendly names

  4. Format transforms from OpenAI format to provider-native formats

  5. Request logging and cost tracking per gateway

  6. BYOK routing for different teams or users
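The request flow above can be sketched from the client side. This is a minimal illustration of the OpenAI-format request a Copilot client sends to the gateway; the cluster ID, API token, and model name are placeholders, not real values:

```python
import json

# Placeholders: substitute your cluster ID and a real API token.
BASE_URL = "https://{CLUSTER_ID}.cloud.redpanda.com/ai-gateway/v1"
API_TOKEN = "YOUR_API_TOKEN"

def build_chat_request(model: str, prompt: str) -> dict:
    """Build the OpenAI-format request Copilot sends to /v1/chat/completions."""
    return {
        "url": f"{BASE_URL}/chat/completions",
        "headers": {
            "Authorization": f"Bearer {API_TOKEN}",  # bearer-token auth
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            # The gateway inspects this field for aliasing and routing.
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }),
    }

req = build_chat_request("gpt-5.2", "Explain this function")
print(req["url"])
```

Every downstream feature on this page (aliasing, routing, transforms, rate limits) operates on requests of this shape.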

Enable LLM providers

GitHub Copilot works with multiple providers through OpenAI-compatible transforms. Enable the providers your users will access.

Configure OpenAI (default provider)

GitHub Copilot uses OpenAI by default. To enable OpenAI through the gateway:

  1. Navigate to AI Gateway > Providers in ADP

  2. Select OpenAI from the provider list

  3. Click Add configuration

  4. Enter your OpenAI API key

  5. Under Format, select Native OpenAI

  6. Click Save

Configure Anthropic with OpenAI-compatible format

For BYOK deployments, you can route GitHub Copilot to Anthropic models. Configure the gateway to transform requests:

  1. Navigate to AI Gateway > Providers

  2. Select Anthropic from the provider list

  3. Click Add configuration

  4. Enter your Anthropic API key

  5. Under Format, select OpenAI-compatible (enables automatic transform)

  6. Click Save

The gateway now transforms OpenAI-format requests to Anthropic’s native /v1/messages format.
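As a rough illustration of what that transform does (heavily simplified; the gateway's actual implementation handles many more fields, such as tools and streaming):

```python
def openai_to_anthropic(body: dict) -> dict:
    """Sketch: OpenAI chat-completions format -> Anthropic /v1/messages format."""
    system_parts = [m["content"] for m in body["messages"] if m["role"] == "system"]
    out = {
        "model": body["model"],
        # Anthropic requires max_tokens; OpenAI treats it as optional.
        "max_tokens": body.get("max_tokens", 1024),
        "messages": [m for m in body["messages"] if m["role"] != "system"],
    }
    if system_parts:
        # OpenAI carries the system prompt inside messages;
        # Anthropic uses a top-level "system" field.
        out["system"] = "\n".join(system_parts)
    return out

openai_req = {
    "model": "claude-sonnet-4.5",
    "messages": [
        {"role": "system", "content": "You are a coding assistant."},
        {"role": "user", "content": "Write a sort function."},
    ],
}
print(openai_to_anthropic(openai_req)["system"])
```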

Configure additional providers

GitHub Copilot supports multiple providers through OpenAI-compatible transforms. For each provider:

  1. Add the provider configuration in the gateway

  2. Set the format to OpenAI-compatible (the gateway handles format transformation)

  3. Enable the transform layer to convert OpenAI request format to the provider’s native format

Common additional providers:

  • Google Gemini (requires OpenAI-compatible transform)

  • Mistral AI (already OpenAI-compatible format)

  • Azure OpenAI (already OpenAI-compatible format)

Enable models in the catalog

After enabling providers, enable specific models:

  1. Navigate to AI Gateway > Models

  2. Enable the models you want GitHub Copilot clients to access

    Common models for GitHub Copilot:

    • gpt-5.2 (OpenAI)

    • gpt-5.2-mini (OpenAI)

    • o1-mini (OpenAI)

    • claude-sonnet-4.5 (Anthropic, requires alias)

  3. Click Save

GitHub Copilot typically uses model names without vendor prefixes. You’ll configure model aliasing in the next section to map friendly names to provider-specific models.

Create a gateway for GitHub Copilot clients

Create a dedicated gateway to isolate GitHub Copilot traffic and apply specific policies.

Gateway configuration

  1. Navigate to Agentic > AI Gateway > Gateways

  2. Click Create Gateway

  3. Enter gateway details:

    | Field | Value |
    | --- | --- |
    | Name | github-copilot-gateway (or your preferred name) |
    | Workspace | Select the workspace for access control grouping |
    | Description | Gateway for GitHub Copilot clients |

  4. Click Create

  5. Copy the gateway ID from the gateway details page

The gateway ID is required for routing requests to this gateway.

Configure model aliasing

GitHub Copilot expects model names like gpt-5.2 without vendor prefixes. Configure aliases to map these to provider-specific models:

  1. Navigate to the gateway’s Models tab

  2. Click Add Model Alias

  3. Configure aliases:

    | Alias Name | Target Model | Provider |
    | --- | --- | --- |
    | gpt-5.2 | openai/gpt-5.2 | OpenAI |
    | gpt-5.2-mini | openai/gpt-5.2-mini | OpenAI |
    | claude-sonnet | anthropic/claude-sonnet-4.5 | Anthropic |
    | o1-mini | openai/o1-mini | OpenAI |

  4. Click Save

When GitHub Copilot requests gpt-5.2, the gateway routes to OpenAI’s gpt-5.2 model. Users can optionally request claude-sonnet for Anthropic models if the IDE configuration supports model selection.
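The alias resolution described above amounts to a simple lookup from friendly name to vendor-prefixed model. A sketch, using the hypothetical alias names from the table:

```python
# Alias table from the steps above (names are the example values, not defaults).
MODEL_ALIASES = {
    "gpt-5.2": "openai/gpt-5.2",
    "gpt-5.2-mini": "openai/gpt-5.2-mini",
    "claude-sonnet": "anthropic/claude-sonnet-4.5",
    "o1-mini": "openai/o1-mini",
}

def resolve_model(requested: str) -> tuple[str, str]:
    """Resolve a friendly model name to (provider, provider-specific model)."""
    target = MODEL_ALIASES.get(requested, requested)
    if "/" in target:
        provider, model = target.split("/", 1)
        return provider, model
    return "unknown", target  # no alias and no vendor prefix

print(resolve_model("claude-sonnet"))  # ('anthropic', 'claude-sonnet-4.5')
```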

Configure unified LLM routing

GitHub Copilot sends all requests to a single endpoint (/v1/chat/completions). Configure the gateway to route based on the requested model name.

Model-based routing

Configure routing that inspects the model field to determine the target provider:

  1. Navigate to the gateway’s LLM tab

  2. Under Routing, click Add route

  3. Configure OpenAI routing:

    request.body.model.startsWith("gpt-") || request.body.model.startsWith("o1-")
  4. Add a Primary provider pool:

    • Provider: OpenAI

    • Model: All enabled OpenAI models

    • Transform: None (already OpenAI format)

    • Load balancing: Round robin (if multiple OpenAI configurations exist)

  5. Click Save

  6. Add another route for Anthropic models:

    request.body.model.startsWith("claude-")
  7. Add a Primary provider pool:

    • Provider: Anthropic

    • Model: All enabled Anthropic models

    • Transform: OpenAI to Anthropic

  8. Click Save

GitHub Copilot requests route to the appropriate provider based on the model alias.
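The two CEL routing rules above amount to a prefix check on the request's model field. Sketched in plain code:

```python
def select_route(body: dict) -> str:
    """Mirror the CEL rules: pick a provider pool from the model prefix."""
    model = body.get("model", "")
    if model.startswith("gpt-") or model.startswith("o1-"):
        return "openai"       # no transform needed, already OpenAI format
    if model.startswith("claude-"):
        return "anthropic"    # OpenAI-to-Anthropic transform applied
    return "default"          # falls through to the catch-all route

print(select_route({"model": "claude-sonnet-4.5"}))  # anthropic
```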

Default routing with fallback

Configure a catch-all route for requests without specific model prefixes:

true  # Matches all requests not matched by previous routes

Add a primary provider (for example, OpenAI) with fallback to Anthropic:

  • Primary: OpenAI (for requests with no specific model)

  • Fallback: Anthropic (if OpenAI is unavailable)

  • Failover conditions: Rate limits, timeouts, 5xx errors
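The primary-with-fallback behavior can be sketched as follows; the stub functions stand in for real provider calls and are purely illustrative:

```python
class ProviderError(Exception):
    """Raised on the failover conditions: rate limits, timeouts, 5xx errors."""

def call_with_fallback(request, primary, fallback):
    """Try the primary provider; fail over when it raises a ProviderError."""
    try:
        return primary(request)
    except ProviderError:
        return fallback(request)

# Stubs: primary (OpenAI) is unavailable, fallback (Anthropic) succeeds.
def openai_down(req):
    raise ProviderError("429 rate limited")

def anthropic_ok(req):
    return {"provider": "anthropic", "ok": True}

print(call_with_fallback({"model": "gpt-5.2"}, openai_down, anthropic_ok))
```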

Apply rate limits

Prevent runaway usage from GitHub Copilot clients. Code completion features generate very high request volumes.

  1. Navigate to the gateway’s LLM tab

  2. Under Rate Limit, configure:

    | Setting | Recommended Value |
    | --- | --- |
    | Global rate limit | 300 requests per minute |
    | Per-user rate limit | 30 requests per minute (if using user identification) |

  3. Click Save

The gateway blocks requests exceeding these limits and returns HTTP 429 errors.
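Clients should treat these 429 responses as retryable. A minimal client-side backoff sketch; the simulated responses stand in for real gateway calls:

```python
import time

def request_with_backoff(send, max_retries=3, base_delay=1.0):
    """Retry with exponential backoff when the gateway returns HTTP 429."""
    delay = base_delay
    status, body = send()
    for _ in range(max_retries):
        if status != 429:
            break
        time.sleep(delay)  # back off before retrying
        delay *= 2
        status, body = send()
    return status, body

# Simulated responses: two 429s, then success.
responses = iter([(429, None), (429, None), (200, "ok")])
print(request_with_backoff(lambda: next(responses), base_delay=0.01))  # (200, 'ok')
```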

Rate limit considerations for code completion

GitHub Copilot’s code completion feature generates extremely frequent requests (potentially dozens per minute per user). Consider:

  • Higher global rate limits than other AI coding assistants

  • Separate rate limits for different request types if the gateway supports request classification

  • Monitoring initial usage patterns to adjust limits appropriately

Set spending limits

Control LLM costs across all providers:

  1. Under Spend Limit, configure:

    | Setting | Value |
    | --- | --- |
    | Monthly budget | $10,000 (adjust based on expected usage) |
    | Enforcement | Block requests after budget exceeded |
    | Alert threshold | 80% of budget (sends notification) |

  2. Click Save

The gateway tracks estimated costs per request across all providers and blocks traffic when the monthly budget is exhausted.
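The budget enforcement described above amounts to the following logic. The dollar figures come from the example settings; the gateway's real per-request cost estimation is more involved:

```python
MONTHLY_BUDGET = 10_000.00   # dollars, from the example configuration
ALERT_THRESHOLD = 0.80       # notify at 80% of budget

class SpendTracker:
    """Sketch of per-gateway monthly budget enforcement."""

    def __init__(self):
        self.spent = 0.0

    def record(self, estimated_cost: float) -> str:
        self.spent += estimated_cost
        if self.spent >= MONTHLY_BUDGET:
            return "block"   # enforcement: block requests after budget exceeded
        if self.spent >= MONTHLY_BUDGET * ALERT_THRESHOLD:
            return "alert"   # threshold crossed: send notification
        return "allow"

tracker = SpendTracker()
print(tracker.record(7_999.99))  # allow
print(tracker.record(1.00))      # alert (80% of $10,000 reached)
```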

Configure authentication

GitHub Copilot clients authenticate using bearer tokens in the Authorization header.

Generate API tokens

  1. Navigate to Security > API Tokens in ADP

  2. Click Create Token

  3. Enter token details:

    | Field | Value |
    | --- | --- |
    | Name | copilot-access |
    | Scopes | ai-gateway:read, ai-gateway:write |
    | Expiration | Set appropriate expiration based on security policies |

  4. Click Create

  5. Copy the token (it appears only once)

Distribute this token to GitHub Copilot administrators through secure channels for organization-level configuration.

Token rotation

Implement token rotation for security:

  1. Create a new token before the existing token expires

  2. Update organization-level GitHub Copilot configuration with the new token

  3. Monitor usage of the old token in the observability dashboard

  4. Revoke the old token after the configuration update propagates

Multi-tenant deployment strategies

GitHub Copilot has limited support for custom headers. Because the gateway ID is embedded in the URL path, multi-tenancy is simplified. Use one of the following strategies for BYOK deployments.

Strategy 1: OAI Compatible Provider extension

For organizations using VS Code with GitHub Copilot, the OAI Compatible Provider extension enables custom headers for additional metadata.

Install the extension

  1. Navigate to VS Code Extensions Marketplace

  2. Search for "OAI Compatible Provider"

  3. Install the extension

  4. Restart VS Code

Configure the extension

  1. Open VS Code settings (JSON)

  2. Add gateway configuration:

    {
      "oai-compatible-provider.providers": [
        {
          "name": "Redpanda AI Gateway",
          "baseUrl": "https://{CLUSTER_ID}.cloud.redpanda.com/ai-gateway/v1",
          "headers": {
            "Authorization": "Bearer YOUR_API_TOKEN"
          },
          "models": [
            "gpt-5.2",
            "gpt-5.2-mini",
            "claude-sonnet"
          ]
        }
      ]
    }
  3. Replace:

    • {CLUSTER_ID}: Your Redpanda cluster ID

    • YOUR_API_TOKEN: Team-specific API token

This approach allows true multi-tenancy with proper gateway isolation per team.

Benefits:

  • Clean separation between tenants

  • Standard authentication flow

  • Works with any IDE supported by the extension

Limitations:

  • Requires VS Code and extension installation

  • Not available for all GitHub Copilot-supported IDEs

  • Users must configure extension in addition to GitHub Copilot

Strategy 2: Query parameter routing

Embed tenant identity in query parameters for multi-tenant routing without custom headers.

  1. Configure gateway routing to extract tenant from query parameters:

    request.url.query["tenant"][0] == "team-alpha"
  2. Distribute tenant-specific endpoints to each team

  3. Configure GitHub Copilot organization settings with the tenant-specific base URL

Configuration example for Team Alpha:

Organization-level GitHub Copilot settings:

{
  "copilot": {
    "api_base_url": "https://{CLUSTER_ID}.cloud.redpanda.com/ai-gateway/v1?tenant=team-alpha",
    "api_key": "TEAM_ALPHA_TOKEN"
  }
}

Benefits:

  • Works with standard GitHub Copilot configuration

  • No additional extensions required

  • Simple to implement

Limitations:

  • Tenant identity exposed in URLs and logs

  • Less clean than header-based routing

  • URL parameters may be logged by intermediate proxies
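Building and parsing the tenant-specific URL can be sketched as follows; the cluster ID and tenant name are placeholders:

```python
from urllib.parse import urlencode, urlparse, parse_qs

def tenant_base_url(cluster_id: str, tenant: str) -> str:
    """Build the tenant-specific base URL used in the Copilot settings above."""
    base = f"https://{cluster_id}.cloud.redpanda.com/ai-gateway/v1"
    return f"{base}?{urlencode({'tenant': tenant})}"

url = tenant_base_url("abc123", "team-alpha")
print(url)  # https://abc123.cloud.redpanda.com/ai-gateway/v1?tenant=team-alpha

# The gateway's routing rule reads the same parameter back out:
print(parse_qs(urlparse(url).query)["tenant"][0])  # team-alpha
```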

Strategy 3: Token-based gateway mapping

Use different API tokens to identify which gateway to route to:

  1. Generate separate API tokens for each tenant or team

  2. Tag tokens with metadata indicating the target gateway

  3. Configure gateway routing based on token identity:

    request.auth.metadata["gateway_id"] == "team-alpha-gateway"
  4. Apply tenant-specific routing, rate limits, and spending limits based on the token
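The token-to-gateway mapping in the steps above amounts to a metadata lookup. A sketch; the token names and the metadata store are hypothetical (in practice, the gateway inspects metadata attached to each API token):

```python
# Hypothetical token metadata store.
TOKEN_METADATA = {
    "tok_alpha_1": {"gateway_id": "team-alpha-gateway"},
    "tok_beta_1": {"gateway_id": "team-beta-gateway"},
}

def route_by_token(token: str) -> str:
    """Mirror the routing rule: request.auth.metadata["gateway_id"]."""
    meta = TOKEN_METADATA.get(token)
    if meta is None:
        raise PermissionError("unknown token")
    return meta["gateway_id"]

print(route_by_token("tok_alpha_1"))  # team-alpha-gateway
```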

Benefits:

  • Transparent to users

  • No URL modifications needed

  • Centralized control through token management

Limitations:

  • Requires gateway support for token metadata inspection

  • Token management overhead increases with number of tenants

  • All tenants use the same base URL

Strategy 4: Single-tenant mode

For simpler deployments, configure a single gateway with shared access:

  1. Create one gateway for all GitHub Copilot users

  2. Generate a shared API token

  3. Configure GitHub Copilot at the organization level

  4. Use rate limits and spending limits to control overall usage

Benefits:

  • Simplest configuration

  • No tenant routing complexity

  • Easy to manage

Limitations:

  • No per-team cost tracking or limits

  • Shared rate limits may impact individual teams

  • All users have the same model access

Choosing a multi-tenant strategy

| Strategy | Pros | Cons | Best For |
| --- | --- | --- | --- |
| OAI Compatible Provider | Clean tenant separation, custom headers | Requires extension, VS Code only | Organizations standardized on VS Code |
| Query parameters | No extensions needed, simple setup | Tenant exposed in URLs, less clean | Quick deployments, small teams |
| Token-based | Transparent to users, centralized control | Requires advanced gateway features | Large organizations with many teams |
| Single-tenant | Simplest configuration and management | No per-team isolation or limits | Small organizations, proof of concept |

Configure GitHub Copilot clients

Provide these instructions based on your chosen multi-tenant strategy.

Organization-level configuration (GitHub Enterprise)

For GitHub Enterprise customers, configure Copilot at the organization level:

  1. Navigate to your organization settings on GitHub

  2. Go to Copilot > Policies

  3. Enable Allow use of Copilot with custom models

  4. Configure the custom endpoint:

    {
      "api_base_url": "https://{CLUSTER_ID}.cloud.redpanda.com/ai-gateway/v1",
      "api_key": "YOUR_API_TOKEN"
    }
  5. If using query parameter routing, append the tenant identifier:

    {
      "api_base_url": "https://{CLUSTER_ID}.cloud.redpanda.com/ai-gateway/v1?tenant=YOUR_TEAM",
      "api_key": "YOUR_API_TOKEN"
    }

This configuration applies to all users in the organization.

IDE-specific configuration (individual users)

For individual users or when organization-level configuration is not available:

VS Code configuration

  1. Open VS Code settings

  2. Search for "GitHub Copilot"

  3. Configure custom endpoint (if using OAI Compatible Provider):

    {
      "github.copilot.advanced": {
        "endpoint": "https://{CLUSTER_ID}.cloud.redpanda.com/ai-gateway/v1"
      }
    }

JetBrains IDEs

  1. Open IDE Settings

  2. Navigate to Tools > GitHub Copilot

  3. Configure custom endpoint (support varies by IDE and Copilot version)

Neovim

  1. Edit Copilot configuration

  2. Add custom endpoint in the Copilot.vim or Copilot.lua configuration

  3. Refer to the Copilot.vim documentation for exact syntax

Model selection

Configure model preferences based on use case:

| Use Case | Recommended Model | Reason |
| --- | --- | --- |
| Code completion | gpt-5.2-mini | Fast, cost-effective for frequent requests |
| Code explanation | gpt-5.2 or claude-sonnet | Higher quality for complex explanations |
| Code generation | gpt-5.2 or claude-sonnet | Better at generating complete functions |
| Documentation | gpt-5.2-mini | Sufficient quality for docstrings and comments |

Model selection is typically configured at the organization level or through IDE settings.

Monitor GitHub Copilot usage

Track GitHub Copilot activity through gateway observability features.

View request logs

  1. Navigate to AI Gateway > Observability > Logs

  2. Filter by gateway ID: github-copilot-gateway

  3. Review:

    • Request timestamps and duration

    • Model used per request (including aliases)

    • Token usage (prompt and completion tokens)

    • Estimated cost per request

    • HTTP status codes and errors

    • Transform operations (OpenAI to provider-native format)

GitHub Copilot generates distinct request patterns:

  • Code completion: Very high volume, short requests with low token counts

  • Chat/explain: Medium volume, longer requests with code context

  • Code generation: Lower volume, variable length requests

Analyze metrics

  1. Navigate to AI Gateway > Observability > Metrics

  2. Select the GitHub Copilot gateway

  3. Review:

    | Metric | Purpose |
    | --- | --- |
    | Request volume by model | Identify most-used models via aliases |
    | Token usage by model | Track consumption patterns (completion vs chat) |
    | Estimated spend by provider | Monitor costs across providers with transforms |
    | Latency (p50, p95, p99) | Detect transform overhead and performance issues |
    | Error rate by provider | Identify failing providers or transform issues |
    | Transform success rate | Monitor OpenAI-to-provider format conversion success |
    | Requests per user/tenant | Track usage by team (if using multi-tenant strategies) |

Query logs via API

Programmatically access logs for integration with monitoring systems:

curl https://{CLUSTER_ID}.cloud.redpanda.com/api/ai-gateway/logs \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "gateway_id": "GATEWAY_ID",
    "start_time": "2026-01-01T00:00:00Z",
    "end_time": "2026-01-14T23:59:59Z",
    "limit": 100
  }'
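Once fetched, log entries can be aggregated locally. A sketch using a sample payload; field names beyond token counts and estimated cost are assumptions about the response shape, not a documented schema:

```python
import json

# Sample response in roughly the shape the logs API might return.
sample = json.loads("""
{"logs": [
  {"model": "gpt-5.2-mini", "prompt_tokens": 120, "completion_tokens": 30, "estimated_cost": 0.0004},
  {"model": "claude-sonnet-4.5", "prompt_tokens": 800, "completion_tokens": 400, "estimated_cost": 0.012}
]}
""")

# Aggregate token usage and spend across all returned entries.
total_tokens = sum(r["prompt_tokens"] + r["completion_tokens"] for r in sample["logs"])
total_cost = sum(r["estimated_cost"] for r in sample["logs"])
print(total_tokens, round(total_cost, 4))  # 1350 0.0124
```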

Security considerations

Apply these security best practices for GitHub Copilot deployments.

Limit token scope

Create tokens with minimal required scopes:

  • ai-gateway:read: Required for model discovery

  • ai-gateway:write: Required for LLM requests

Avoid granting broader scopes like admin or cluster:write.

Implement network restrictions

If GitHub Copilot clients connect from known networks, configure network policies:

  1. Use cloud provider security groups to restrict access to AI Gateway endpoints

  2. Allowlist only the IP ranges where GitHub Copilot clients operate

  3. Monitor for unauthorized access attempts in request logs

Enforce token expiration

Set short token lifetimes for high-security environments:

  • Development environments: 90 days

  • Production environments: 30 days

Automate token rotation to reduce manual overhead. Coordinate with GitHub organization administrators when rotating tokens.

Monitor transform operations

Because GitHub Copilot may route to non-OpenAI providers through transforms:

  1. Review transform success rates in metrics

  2. Monitor for transform failures that may leak request details

  3. Test transforms thoroughly before production deployment

  4. Keep transform logic updated as provider APIs evolve

Audit model access

Review which models GitHub Copilot clients can access:

  1. Periodically audit enabled models and aliases

  2. Remove deprecated or unused model configurations

  3. Monitor model usage logs for unexpected patterns

  4. Ensure cost-effective models are used for high-volume completion requests

Code completion security

GitHub Copilot sends code context to LLM providers. Keep the following in mind:

  • Make sure users understand what code context is sent with requests

  • Proprietary code may be included in prompts

  • Configure organization policies to limit code sharing if needed

  • Review provider data retention policies

  • Monitor logs for sensitive information in prompts (if logging includes prompt content)

Organization-level controls

For GitHub Enterprise customers:

  1. Use organization-level policies to enforce custom endpoint usage

  2. Restrict which users can configure custom endpoints

  3. Monitor organization audit logs for configuration changes

  4. Implement approval workflows for endpoint changes

Troubleshooting

Common issues and solutions when configuring AI Gateway for GitHub Copilot.

GitHub Copilot cannot connect to gateway

Symptom: Connection errors when GitHub Copilot tries to send requests.

Causes and solutions:

  • Invalid base URL: Verify the configured endpoint matches the gateway URL (including query parameters if using query-based routing)

  • Expired token: Generate a new API token and update the GitHub Copilot configuration

  • Network connectivity: Verify the cluster endpoint is accessible from the client network

  • Provider not enabled: Ensure at least one provider is enabled and has models in the catalog

  • SSL/TLS issues: Verify the cluster has valid SSL certificates

  • Organization policy blocking custom endpoints: Check GitHub organization settings

Model not found errors

Symptom: GitHub Copilot shows "model not found" or similar errors.

Causes and solutions:

  • Model not enabled in catalog: Enable the model in the gateway’s model catalog

  • Model alias missing: Create an alias for the model name GitHub Copilot expects (for example, gpt-5.2)

  • Incorrect model name: Verify GitHub Copilot is requesting a model name that exists in your aliases

  • Routing rule mismatch: Check that routing rules correctly match the requested model name

Transform errors or unexpected responses

Symptom: Responses are malformed or GitHub Copilot reports format errors.

Causes and solutions:

  • Transform disabled: Ensure OpenAI-compatible transform is enabled for non-OpenAI providers (for example, Anthropic)

  • Transform version mismatch: Verify the transform is compatible with the current provider API version

  • Model-specific transform issues: Some models may require specific transform configurations

  • Check transform logs: Review logs for transform errors and stack traces

  • Response format incompatibility: Verify the provider’s response can be transformed to OpenAI format

High costs or token usage

Symptom: Token usage and costs exceed expectations.

Causes and solutions:

  • Code completion using expensive model: Configure completion to use gpt-5.2-mini instead of larger models

  • No rate limits: Apply per-minute rate limits to prevent runaway usage

  • Missing spending limits: Set monthly budget limits with blocking enforcement

  • Chat using wrong model: Ensure chat/explanation features use cost-effective models

  • Transform overhead: Monitor if transforms add significant token overhead

  • High completion request volume: Expected behavior; adjust budgets or implement stricter rate limits

Requests failing with 429 errors

Symptom: GitHub Copilot receives HTTP 429 Too Many Requests errors.

Causes and solutions:

  • Rate limit exceeded: Review and increase rate limits if usage is legitimate (code completion needs very high limits)

  • Upstream provider rate limits: Check if the upstream LLM provider is rate-limiting; configure failover to alternate providers

  • Budget exhausted: Verify monthly spending limit has not been reached

  • Per-user limits too restrictive: Adjust per-user rate limits if using multi-tenant strategies

  • Spike in usage: Code completion can generate sudden usage spikes; consider burstable rate limits

Multi-tenant routing failures

Symptom: Requests route to wrong gateway or fail authorization.

Causes and solutions:

  • Query parameter missing: Ensure query parameter is appended to all requests if using query-based routing

  • Token metadata incorrect: Verify token is tagged with correct gateway metadata

  • Routing rule conflicts: Check for overlapping routing rules that may cause unexpected routing

  • Organization policy override: Verify GitHub organization settings aren’t overriding user configurations

  • Extension not configured: If using OAI Compatible Provider extension, verify proper installation and configuration

Performance issues

Symptom: Slow response times from GitHub Copilot.

Causes and solutions:

  • Transform latency: Monitor metrics for transform processing time overhead

  • Provider latency: Check latency metrics by provider to identify slow backends

  • Network latency: Verify cluster is in a region with good connectivity to users

  • Cold start delays: Some providers may have cold start latency on first request

  • Rate limiting overhead: Check if rate limit enforcement is adding latency

Next steps

  • Routing with CEL (ai-gateway:routing-cel.adoc): Implement advanced routing rules for model aliasing