
Configure GitHub Copilot with AI Gateway

After configuring your AI Gateway, set up GitHub Copilot to route LLM requests through the gateway for centralized observability, cost management, and provider flexibility.

After reading this page, you will be able to:

  • ❏ Configure GitHub Copilot in VS Code and JetBrains IDEs to route requests through AI Gateway.

  • ❏ Set up multi-tenancy with gateway routing for cost tracking.

  • ❏ Configure enterprise BYOK deployments for team-wide Copilot access.

Prerequisites

Before configuring GitHub Copilot, ensure you have:

  • GitHub Copilot subscription (Individual, Business, or Enterprise)

  • An active Redpanda AI Gateway with:

    • At least one LLM provider enabled (see Enable a provider)

    • A gateway created and configured (see Create a gateway)

  • Your AI Gateway credentials:

    • Gateway endpoint URL (for example, https://gw.ai.panda.com)

    • Gateway ID (for example, gateway-abc123)

    • API key with access to the gateway

  • Your IDE:

    • VS Code with GitHub Copilot extension installed

    • Or JetBrains IDE (IntelliJ IDEA, PyCharm, etc.) with GitHub Copilot plugin

About GitHub Copilot and AI Gateway

GitHub Copilot provides AI-powered code completion and chat within your IDE. By default, Copilot routes requests directly to GitHub’s infrastructure, which uses OpenAI and other LLM providers.

When you route Copilot through AI Gateway, you gain:

  • Centralized observability across all Copilot usage

  • Cost attribution per developer, team, or project

  • Provider flexibility (use your own API keys or alternative models)

  • Policy enforcement (rate limits, spend controls)

  • Multi-tenancy support for enterprise deployments

Configuration approaches

GitHub Copilot supports different configuration approaches depending on your IDE and subscription tier:

| IDE | Method | Subscription Tier | Complexity |
| --- | --- | --- | --- |
| VS Code | Custom OpenAI models | Individual, Business, Enterprise | Medium |
| VS Code | OAI Compatible Provider extension | Individual, Business, Enterprise | Low |
| JetBrains | Enterprise BYOK | Enterprise | Low |

Choose the approach that matches your environment. VS Code users have multiple options, while JetBrains users need GitHub Copilot Enterprise with BYOK support.

Configure in VS Code

VS Code offers two approaches for routing Copilot through AI Gateway:

  1. Custom OpenAI models (manual configuration)

  2. OAI Compatible Provider extension (simplified)

Option 1: Custom OpenAI models

This approach configures VS Code to recognize your AI Gateway as a custom OpenAI-compatible provider.

Configure custom models

  1. Open VS Code Settings:

    • macOS: Cmd+,

    • Windows/Linux: Ctrl+,

  2. Search for github.copilot.chat.customOAIModels

  3. Click Edit in settings.json

  4. Add the following configuration:

{
  "github.copilot.chat.customOAIModels": [
    {
      "id": "anthropic/claude-sonnet-4.5",
      "name": "Claude Sonnet 4.5 (Gateway)",
      "endpoint": "https://gw.ai.panda.com/v1",
      "provider": "redpanda-gateway"
    },
    {
      "id": "openai/gpt-5.2",
      "name": "GPT-5.2 (Gateway)",
      "endpoint": "https://gw.ai.panda.com/v1",
      "provider": "redpanda-gateway"
    }
  ]
}

Replace https://gw.ai.panda.com/v1 with your gateway endpoint.

Important
This experimental feature requires configuring API keys and custom headers through the Copilot Chat UI, not in settings.json.

Configure API key and headers via Copilot Chat UI

  1. Open Copilot Chat in VS Code (Cmd+I or Ctrl+I)

  2. Click the model selector dropdown

  3. Click Manage Models at the bottom of the dropdown

  4. Click Add Model

  5. Select your configured provider ("redpanda-gateway")

  6. Enter the connection details:

    • API Key: Your Redpanda API key

  7. Click Save

Select model

  1. Open Copilot chat with Cmd+I (macOS) or Ctrl+I (Windows/Linux)

  2. Click the model selector dropdown

  3. Choose a model from the "redpanda-gateway" provider

Option 2: OAI Compatible Provider extension

The OAI Compatible Provider extension provides enhanced support for OpenAI-compatible endpoints with custom headers.

Install extension

  1. Open VS Code Extensions (Cmd+Shift+X or Ctrl+Shift+X)

  2. Search for "OAI Compatible Provider"

  3. Click Install

Configure base URL in settings

Add the base URL configuration in VS Code settings:

  1. Open VS Code Settings (Cmd+, or Ctrl+,)

  2. Search for oaicopilot

  3. Click Edit in settings.json

  4. Add the following:

{
  "oaicopilot.baseUrl": "https://gw.ai.panda.com/v1",
  "oaicopilot.models": [
    "anthropic/claude-sonnet-4.5",
    "openai/gpt-5.2",
    "openai/gpt-5.2-mini"
  ]
}

Replace https://gw.ai.panda.com/v1 with your gateway endpoint.

Configure API key and headers via Copilot Chat UI

Important
Do not configure API keys or custom headers in settings.json. Use the Copilot Chat UI instead.

  1. Open Copilot Chat in VS Code (Cmd+I or Ctrl+I)

  2. Click the model selector dropdown

  3. Click Manage Models

  4. Find the OAI Compatible Provider in the list

  5. Click Configure or Edit

  6. Enter the connection details:

    • API Key: Your Redpanda API key

  7. Click Save

Select model

  1. Open Copilot chat with Cmd+I (macOS) or Ctrl+I (Windows/Linux)

  2. Click the model selector dropdown

  3. Choose a model from the OAI Compatible Provider

Configure in JetBrains IDEs

JetBrains IDE integration requires GitHub Copilot Enterprise with Bring Your Own Key (BYOK) support.

Prerequisites

  • GitHub Copilot Enterprise subscription

  • BYOK enabled for your organization

  • JetBrains IDE 2024.1 or later

  • GitHub Copilot plugin version 1.4.0 or later

Configure BYOK with AI Gateway

  1. Open your JetBrains IDE (IntelliJ IDEA, PyCharm, etc.)

  2. Navigate to Settings/Preferences:

    • macOS: Cmd+,

    • Windows/Linux: Ctrl+Alt+S

  3. Go to Tools > GitHub Copilot

  4. Under Advanced Settings, find Custom Model Configuration

  5. Configure the OpenAI-compatible endpoint:

Base URL: https://gw.ai.panda.com/v1
API Key: your-redpanda-api-key

Replace https://gw.ai.panda.com/v1 with your gateway endpoint and your-redpanda-api-key with your API key.

Configure model selection

In the GitHub Copilot settings:

  1. Expand Model Selection

  2. Choose your preferred models from the AI Gateway:

    • Chat model: anthropic/claude-sonnet-4.5 or openai/gpt-5.2

    • Code completion model: openai/gpt-5.2-mini (faster, cost-effective)

Model names use the vendor/model_id pattern, which the gateway uses to route each request to the appropriate provider.
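As an illustration, the split can be expressed as a small parser (a sketch; the gateway's own validation may differ):

```python
def parse_model_id(model: str) -> tuple[str, str]:
    """Split a gateway model name into (vendor, model_id).

    Raises ValueError for names missing the vendor prefix,
    which the gateway would reject as invalid.
    """
    vendor, sep, model_id = model.partition("/")
    if not sep or not vendor or not model_id:
        raise ValueError(f"expected vendor/model_id, got {model!r}")
    return vendor, model_id

print(parse_model_id("anthropic/claude-sonnet-4.5"))  # → ('anthropic', 'claude-sonnet-4.5')
```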

Test configuration

  1. Open a code file

  2. Trigger code completion (start typing)

  3. Or open Copilot chat:

    • Right-click > Copilot > Open Chat

    • Or use shortcut: Cmd+Shift+C (macOS) or Ctrl+Shift+C (Windows/Linux)

  4. Verify suggestions appear

Check the AI Gateway dashboard to confirm requests are logged.

Multi-tenancy configuration

For organizations with multiple teams or projects sharing AI Gateway, choose one of the following approaches to track usage per team.

Approach 1: One gateway per team

Create separate gateways for each team:

  • Team A Gateway: ID team-a-gateway-123

  • Team B Gateway: ID team-b-gateway-456

Each team configures their IDE with their team’s gateway endpoint URL, which includes the gateway ID in the path.

Benefits:

  • Isolated cost tracking per team

  • Team-specific rate limits and budgets

  • Separate observability dashboards

Approach 2: Shared gateway with custom headers

Use a single gateway with custom headers for attribution:

{
  "oai.provider.headers": {
    "x-team": "backend-team",
    "x-project": "api-service"
  }
}

Configure gateway CEL routing to read these headers:

request.headers["x-team"] == "backend-team" ? "openai/gpt-5.2" : "openai/gpt-5.2-mini"

Benefits:

  • Single gateway to manage

  • Flexible cost attribution

  • Header-based routing policies

Filter observability dashboard by x-team or x-project headers to generate team-specific reports.
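On the wire, these attribution headers accompany each OpenAI-compatible request. The following stdlib sketch builds (but does not send) such a request; the endpoint and API key are placeholders:

```python
import json
import urllib.request

# Placeholder values -- substitute your gateway endpoint and API key.
ENDPOINT = "https://gw.ai.panda.com/v1/chat/completions"
API_KEY = "your-redpanda-api-key"

payload = json.dumps({
    "model": "openai/gpt-5.2",
    "messages": [{"role": "user", "content": "Explain this function"}],
}).encode()

req = urllib.request.Request(
    ENDPOINT,
    data=payload,
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
        # Attribution headers the gateway can read in CEL routing:
        "x-team": "backend-team",
        "x-project": "api-service",
    },
)
print(req.get_header("X-team"))  # urllib stores header names capitalized
```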

Approach 3: Environment-based gateways

Separate development, staging, and production environments:

{
  "oai.provider.headers": {
    "x-environment": "${env:ENVIRONMENT}"
  }
}

Set environment variables per workspace:

# Development workspace
export ENVIRONMENT="development"

# Production workspace
export ENVIRONMENT="production"

Benefits:

  • Prevent development usage from affecting production metrics

  • Different rate limits and budgets per environment

  • Environment-specific model access policies

Enterprise BYOK at scale

For large organizations deploying GitHub Copilot Enterprise with AI Gateway across hundreds or thousands of developers, use the following practices.

Centralized configuration management

Distribute IDE configuration files via:

  • Git repository: Store settings.json or IDE configuration in a shared repository

  • Configuration management tools: Puppet, Chef, Ansible

  • Group Policy (Windows environments)

  • MDM solutions (macOS environments)

Example centralized configuration:

{
  "oai.provider.endpoint": "https://gw.company.com/v1",
  "oai.provider.apiKey": "${env:COPILOT_GATEWAY_KEY}",
  "oai.provider.headers": {
    "x-user-email": "${env:USER_EMAIL}",
    "x-department": "${env:DEPARTMENT}"
  }
}

Developers set environment variables locally or receive them from identity management systems.
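The ${env:NAME} placeholders above are expanded when the configuration is loaded. A Python sketch of equivalent expansion logic (the placeholder syntax is taken from the config above; the expansion code itself is illustrative):

```python
import os
import re

# Matches ${env:VAR_NAME} placeholders as used in the configuration above.
ENV_PATTERN = re.compile(r"\$\{env:([A-Za-z_][A-Za-z0-9_]*)\}")

def expand_env(value, env=None):
    """Replace ${env:NAME} placeholders; unknown names are left untouched."""
    env = dict(os.environ) if env is None else env
    return ENV_PATTERN.sub(lambda m: env.get(m.group(1), m.group(0)), value)

print(expand_env("${env:USER_EMAIL}", {"USER_EMAIL": "dev@company.com"}))  # → dev@company.com
```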

API key management

Option 1: Individual API keys

Each developer gets their own Redpanda API key:

  • Tied to their identity (email, employee ID)

  • Revocable when they leave the organization

  • Enables per-developer cost attribution

Option 2: Team API keys

Teams share API keys:

  • Simpler key management

  • Cost attribution by team, not individual

  • Use custom headers for finer-grained tracking

Option 3: Service account keys

Single key for all developers:

  • Simplest to deploy

  • No per-developer tracking

  • Use custom headers for all attribution

Automated provisioning workflow

  1. Developer joins organization

  2. Identity system (Okta, Azure AD, etc.) triggers provisioning:

    1. Create Redpanda API key

    2. Assign to appropriate gateway

    3. Generate IDE configuration file with embedded keys

    4. Distribute to developer workstation

  3. Developer installs IDE and GitHub Copilot

  4. Configuration auto-applies (via MDM or configuration management)

  5. Developer starts using Copilot immediately
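Step 2 of the workflow can be sketched as a config-generation function. The gateway key-creation API is out of scope here, so the key is passed in directly, and the endpoint mirrors the example above:

```python
import json

def build_copilot_config(email: str, department: str, api_key: str,
                         endpoint: str = "https://gw.company.com/v1") -> str:
    """Generate a per-developer IDE configuration file as JSON.

    `api_key` would come from the gateway's key-creation step;
    here it is supplied directly for illustration.
    """
    config = {
        "oai.provider.endpoint": endpoint,
        "oai.provider.apiKey": api_key,
        "oai.provider.headers": {
            "x-user-email": email,
            "x-department": department,
        },
    }
    return json.dumps(config, indent=2)

print(build_copilot_config("dev@company.com", "platform", "rp-key-123"))
```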

Observability and governance

Track usage across the organization:

  1. Navigate to AI Gateway dashboard

  2. Filter by custom headers:

    • x-department: View costs per department

    • x-user-email: Track usage per developer

    • x-project: Attribute costs to specific projects

  3. Generate reports:

    • Top 10 users by token usage

    • Departments exceeding budget

    • Projects using deprecated models

  4. Set alerts:

    • Individual developer exceeds threshold (potential misuse)

    • Department budget approaching limit

    • Unusual request patterns (security concern)

Policy enforcement

Use gateway CEL routing to enforce policies:

// Limit junior developers to cost-effective models
request.headers["x-user-level"] == "junior"
  ? "openai/gpt-5.2-mini"
  : "anthropic/claude-sonnet-4.5"

// Block access for contractors to expensive models
request.headers["x-user-type"] == "contractor" &&
request.body.model.contains("opus")
  ? error("Contractors cannot use Opus models")
  : request.body.model
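The two CEL rules above behave roughly like this Python sketch (the gateway evaluates real CEL; this only mirrors the logic for readability):

```python
def route_by_level(headers: dict) -> str:
    """First rule: junior developers get the cost-effective model."""
    if headers.get("x-user-level") == "junior":
        return "openai/gpt-5.2-mini"
    return "anthropic/claude-sonnet-4.5"

def enforce_contractor_policy(headers: dict, model: str) -> str:
    """Second rule: contractors may not use Opus-class models."""
    if headers.get("x-user-type") == "contractor" and "opus" in model:
        raise PermissionError("Contractors cannot use Opus models")
    return model

print(route_by_level({"x-user-level": "junior"}))  # → openai/gpt-5.2-mini
```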

Verify configuration

After configuring GitHub Copilot, verify it routes requests through your AI Gateway.

Test code completion

  1. Open a code file in your IDE

  2. Start typing a function definition

  3. Wait for code completion suggestions to appear

Completion requests appear in the gateway dashboard with:

  • Low token counts (typically 50-200 tokens)

  • High request frequency (as you type)

  • The completion model you configured

Test chat interface

  1. Open Copilot chat:

    • VS Code: Cmd+I (macOS) or Ctrl+I (Windows/Linux)

    • JetBrains: Right-click > Copilot > Open Chat

  2. Ask a simple question: "Explain this function"

  3. Wait for response

Chat requests appear in the gateway dashboard with:

  • Higher token counts (500-2000 tokens typical)

  • The chat model you configured

  • Response status (200 for success)

Verify in dashboard

  1. Sign in to ADP

  2. Navigate to your gateway’s observability dashboard

  3. Filter by gateway ID

  4. Verify:

    • Requests appear in logs

    • Models show correct format (for example, anthropic/claude-sonnet-4.5)

    • Token usage and cost are recorded

    • Custom headers appear (if configured)

If requests don’t appear, see Troubleshooting.

Advanced configuration

Model-specific settings

Configure different models for different tasks:

{
  "oai.provider.models": [
    {
      "id": "anthropic/claude-sonnet-4.5",
      "name": "Claude Sonnet (chat)",
      "type": "chat",
      "temperature": 0.7,
      "maxTokens": 4096
    },
    {
      "id": "openai/gpt-5.2-mini",
      "name": "GPT-5.2 Mini (completion)",
      "type": "completion",
      "temperature": 0.2,
      "maxTokens": 512
    }
  ]
}

Settings explained:

  • Chat uses Claude Sonnet with higher temperature for creative responses

  • Completion uses GPT-5.2 Mini with lower temperature for deterministic code

  • Chat allows longer responses, completion limits tokens for speed

Workspace-specific configuration

Override global settings for specific projects using workspace settings.

In VS Code, create .vscode/settings.json in your project root:

{
  "oai.provider.headers": {
    "x-project": "customer-portal"
  }
}

Benefits:

  • Route different projects through different gateways

  • Track costs per project

  • Use different models per project (cost-effective for internal, premium for customer-facing)

Custom request timeouts

Configure timeout for AI Gateway requests:

{
  "oai.provider.timeout": 30000
}

Timeout is in milliseconds. Default is typically 30000 (30 seconds).

Increase timeouts for:

  • High-latency network environments

  • Complex code generation tasks

  • Large file context

Debug mode

Enable debug logging to troubleshoot issues:

{
  "oai.provider.debug": true,
  "github.copilot.advanced": {
    "debug": true
  }
}

View debug logs:

  • VS Code: Developer Console (Help > Toggle Developer Tools > Console tab)

  • JetBrains: Help > Diagnostic Tools > Debug Log Settings > Add github.copilot

Debug mode shows:

  • HTTP request and response headers

  • Model selection decisions

  • Token usage calculations

  • Error details

Troubleshooting

Copilot shows no suggestions

Symptom: Code completion doesn’t work or Copilot shows "No suggestions available".

Causes and solutions:

  1. Configuration not loaded

    Reload your IDE to apply configuration changes:

    • VS Code: Command Palette > "Developer: Reload Window"

    • JetBrains: File > Invalidate Caches / Restart

  2. Incorrect endpoint URL

    Verify the URL format includes /v1 at the end:

    # Correct
    https://gw.ai.panda.com/v1
    
    # Incorrect
    https://gw.ai.panda.com

  3. Authentication failure

    Verify your API key is valid:

    curl -H "Authorization: Bearer YOUR_API_KEY" \
         https://gw.ai.panda.com/v1/models

    You should receive a list of available models. If you get 401 Unauthorized, regenerate your API key in ADP.

  4. Extension/plugin disabled

    Verify GitHub Copilot is enabled:

    • VS Code: Extensions view > GitHub Copilot > Ensure "Enabled"

    • JetBrains: Settings > Plugins > GitHub Copilot > Check "Enabled"

  5. Network connectivity issues

    Test connectivity to the gateway:

    curl -I https://gw.ai.panda.com/v1

    If this times out, check your network configuration, firewall rules, or VPN connection.

Requests not appearing in gateway dashboard

Symptom: Copilot works, but requests don’t appear in the AI Gateway observability dashboard.

Causes and solutions:

  1. Wrong gateway ID

    Verify the gateway ID in your endpoint URL matches the gateway you’re viewing in the dashboard (case-sensitive).

  2. Using direct GitHub connection

    If the endpoint configuration is missing or incorrect, Copilot may route directly to GitHub instead of your gateway. Verify endpoint configuration.

  3. Log ingestion delay

    Gateway logs can take 5-10 seconds to appear in the dashboard. Wait briefly and refresh.

  4. Environment variable not set

    If using environment variables like ${env:REDPANDA_API_KEY}, verify they’re set before launching the IDE:

    echo $REDPANDA_API_KEY  # Should print your API key

High latency or slow suggestions

Symptom: Code completion is slow or chat responses take a long time.

Causes and solutions:

  1. Gateway geographic distance

    If your gateway is in a different region than you or the upstream provider, this adds network latency. Check gateway region in ADP.

  2. Slow model for completion

    Use a faster model for code completion:

    {
      "oai.provider.models": [
        {
          "id": "openai/gpt-5.2-mini",
          "type": "completion"
        }
      ]
    }

    Models like GPT-5.2 Mini or Claude Haiku provide faster responses ideal for code completion.

  3. Provider pool failover

    If your gateway is configured with fallback providers, check the logs to see if requests are failing over. Failover adds latency.

  4. Rate limiting

    If you’re hitting rate limits, the gateway may be queuing requests. Check the observability dashboard for rate limit metrics.

  5. Token limit too high

    Reduce maxTokens for completion models to improve speed:

    {
      "oai.provider.models": [
        {
          "id": "openai/gpt-5.2-mini",
          "type": "completion",
          "maxTokens": 256
        }
      ]
    }

Custom headers not being sent

Symptom: Custom headers (like x-team or x-project) don’t appear in gateway logs.

Causes and solutions:

  1. Extension not installed (VS Code)

    Custom headers require the OAI Compatible Provider extension in VS Code. Install it from the Extensions marketplace.

  2. Header configuration location

    Ensure headers are in the correct configuration section:

    {
      "oai.provider.headers": {
        "x-custom": "value"
      }
    }

    Not:

    {
      "github.copilot.advanced": {
        "headers": {  // Wrong location
          "x-custom": "value"
        }
      }
    }

  3. Environment variable not expanded

    If using ${env:VAR_NAME} syntax, verify the environment variable is set before launching the IDE.

Model not recognized

Symptom: Error message "Model not found" or "Invalid model ID".

Causes and solutions:

  1. Incorrect model format

    Ensure model names use the vendor/model_id format:

    # Correct
    anthropic/claude-sonnet-4.5
    openai/gpt-5.2
    
    # Incorrect
    claude-sonnet-4.5
    gpt-5.2

  2. Model not enabled in gateway

    Verify the model is enabled in your AI Gateway configuration:

    1. Sign in to ADP

    2. Navigate to your gateway

    3. Check enabled providers and models

  3. Typo in model ID

    Double-check the model ID matches exactly (case-sensitive). Copy from the AI Gateway UI rather than typing manually.

Configuration changes not taking effect

Symptom: Changes to settings don’t apply.

Solutions:

  1. Reload IDE

    Configuration changes require reloading:

    • VS Code: Command Palette > "Developer: Reload Window"

    • JetBrains: File > Invalidate Caches / Restart

  2. Invalid JSON syntax

    Validate your settings.json file:

    python3 -m json.tool ~/.config/Code/User/settings.json

    Fix any syntax errors reported.

  3. Workspace settings override

    Check if .vscode/settings.json in your project root overrides global settings. Workspace settings take precedence over global settings.
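The precedence rule can be illustrated with a shallow-merge sketch. This is a simplification: VS Code's actual merge behavior for object-valued settings can differ by setting, so treat this only as intuition for why a workspace value hides the global one:

```python
def effective_settings(global_settings: dict, workspace_settings: dict) -> dict:
    """Workspace keys override global keys at the top level."""
    merged = dict(global_settings)
    merged.update(workspace_settings)
    return merged

global_cfg = {
    "oai.provider.endpoint": "https://gw.ai.panda.com/v1",
    "oai.provider.headers": {"x-team": "backend-team"},
}
workspace_cfg = {"oai.provider.headers": {"x-project": "customer-portal"}}

# In this sketch the workspace's headers object wins wholesale.
print(effective_settings(global_cfg, workspace_cfg))
```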

  4. File permissions

    Verify the IDE can read the configuration file:

    ls -la ~/.config/Code/User/settings.json

    Fix permissions if needed:

    chmod 600 ~/.config/Code/User/settings.json

Cost optimization tips

Use different models for chat and completion

Code completion needs speed, while chat benefits from reasoning depth:

{
  "oai.provider.models": [
    {
      "id": "anthropic/claude-sonnet-4.5",
      "type": "chat"
    },
    {
      "id": "openai/gpt-5.2-mini",
      "type": "completion"
    }
  ]
}

This can reduce costs by 5-10x for code completion while maintaining chat quality.
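A back-of-the-envelope comparison shows why; the per-token rates below are hypothetical, not real pricing:

```python
# Hypothetical prices in dollars per 1M output tokens -- illustrative only.
PRICE_PER_MTOK = {"anthropic/claude-sonnet-4.5": 15.0, "openai/gpt-5.2-mini": 2.0}

def monthly_cost(model: str, completions_per_day: int, tokens_per_completion: int,
                 workdays: int = 22) -> float:
    """Estimate monthly spend for one developer's code completions."""
    tokens = completions_per_day * tokens_per_completion * workdays
    return tokens / 1_000_000 * PRICE_PER_MTOK[model]

big = monthly_cost("anthropic/claude-sonnet-4.5", 2000, 100)
small = monthly_cost("openai/gpt-5.2-mini", 2000, 100)
print(f"${big:.2f} vs ${small:.2f} ({big / small:.1f}x cheaper)")  # → $66.00 vs $8.80 (7.5x cheaper)
```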

Limit token usage

Reduce maximum tokens for completion to prevent runaway costs:

{
  "oai.provider.models": [
    {
      "id": "openai/gpt-5.2-mini",
      "type": "completion",
      "maxTokens": 256
    }
  ]
}

Code completion rarely needs more than 256 tokens.

Monitor usage patterns

Use the AI Gateway dashboard to identify optimization opportunities:

  1. Navigate to your gateway’s observability dashboard

  2. Filter by custom headers (for example, x-team, x-user-email)

  3. Analyze:

    • Token usage per developer or team

    • Most expensive queries

    • High-frequency low-value requests

Set team-based budgets

Use separate gateways or CEL routing to enforce team budgets:

// Limit team to 1 million tokens per month
request.headers["x-team"] == "frontend" &&
monthly_tokens > 1000000
  ? error("Team budget exceeded")
  : request.body.model

Configure alerts in the dashboard when teams approach their limits.

Track costs per project

Use custom headers to attribute costs:

{
  "oai.provider.headers": {
    "x-project": "mobile-app"
  }
}

Generate project-specific cost reports from the gateway dashboard.

Next steps

  • Route with CEL (ai-gateway:routing-cel.adoc): Use CEL expressions to route Copilot requests based on context

  • Aggregation (ai-gateway:aggregation.adoc): Learn about MCP tool integration (if using Copilot Workspace)