1 | | -= Configure Your LLM Provider |
2 | | -:description: Connect AI Gateway to your preferred LLM providers. |
| 1 | += Configure an LLM Provider |
| 2 | +:description: Create an LLM provider in AI Gateway to proxy requests to OpenAI, Anthropic, Google AI, AWS Bedrock, or any OpenAI-compatible endpoint through a managed Redpanda URL. |
| 3 | +:page-topic-type: how-to |
| 4 | +:personas: platform_admin, app_developer |
| 5 | +// Page aliases for the consolidated quickstart and setup-guide redirects will land in a follow-up cleanup PR that also deletes the legacy pages (gateway-quickstart.adoc, gateway-architecture.adoc, aggregation.adoc, routing-cel.adoc, admin/setup-guide.adoc, builders/discover-gateways.adoc) and retargets the ~80 cross-module xrefs (agents, integrations, observability) that still point at them. |
| 6 | +:learning-objective-1: Create an LLM provider for OpenAI, Anthropic, Google AI, AWS Bedrock, or an OpenAI-compatible endpoint |
| 7 | +:learning-objective-2: Select the models you want to expose through the provider |
| 8 | +:learning-objective-3: Verify the provider is reachable using the built-in Test Connection control |
3 | 9 |
4 | | -// TODO: Add content |
| 10 | +include::ROOT:partial$adp-la.adoc[] |
| 11 | + |
| 12 | +An LLM provider is the primary resource in AI Gateway. When you create one, Redpanda gives you a managed proxy URL that your applications can point at: Redpanda handles the upstream API keys, forwards requests to the provider, and records usage for you. This guide walks you through creating a provider for each supported upstream. |
| 13 | + |
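The proxy URL follows a predictable pattern, so you can construct it from the provider name. A minimal sketch, assuming a hypothetical dataplane host and provider name; the `/chat/completions` suffix is the OpenAI-style path an application would typically append, not something the form shows:

```python
# Hypothetical values: substitute your own dataplane host and provider name.
DATAPLANE_HOST = "https://my-dataplane.example.com"
PROVIDER_NAME = "openai-prod"

# Managed proxy URLs follow the /llm/v1/providers/<name>/... pattern.
proxy_url = f"{DATAPLANE_HOST}/llm/v1/providers/{PROVIDER_NAME}/chat/completions"
print(proxy_url)
```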
| 14 | +After completing this guide, you will be able to: |
| 15 | + |
| 16 | +* [ ] {learning-objective-1} |
| 17 | +* [ ] {learning-objective-2} |
| 18 | +* [ ] {learning-objective-3} |
| 19 | +
| 20 | +== Prerequisites |
| 21 | + |
| 22 | +* Access to a Redpanda Cloud cluster with ADP enabled. |
| 23 | ++ |
| 24 | +// TODO: this guide describes the cluster-embedded view available today on cloud.redpanda.com. The standalone-ADP UI launches as a separate product surface; sign-in URL, IAM model, and role-permission requirements will change. Update once standalone ADP ships. |
| 25 | +* An API key (or AWS credentials for Bedrock) for the upstream provider you want to configure. |
| 26 | +* One or more secrets already created in your dataplane's secret store for the provider's credentials. Secret references must use `UPPER_SNAKE_CASE`. For example: `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `AWS_ACCESS_KEY_ID`. |
| 27 | ++ |
| 28 | +// TODO: xref the secrets-management page for ADP once confirmed. |
| 29 | + |
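The `UPPER_SNAKE_CASE` rule for secret references can be checked with a simple pattern. A sketch; the exact regex is an assumption inferred from the examples above:

```python
import re

# Assumed UPPER_SNAKE_CASE shape: uppercase letters, digits, and underscores,
# starting with a letter (matches OPENAI_API_KEY, AWS_ACCESS_KEY_ID, and so on).
SECRET_REF = re.compile(r"^[A-Z][A-Z0-9_]*$")

def is_valid_secret_ref(ref: str) -> bool:
    return bool(SECRET_REF.match(ref))

print(is_valid_secret_ref("OPENAI_API_KEY"))   # True
print(is_valid_secret_ref("openai-api-key"))   # False: lowercase and hyphens
```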
| 30 | +== Open the Create LLM provider page |
| 31 | + |
| 32 | +. Sign in to https://cloud.redpanda.com[cloud.redpanda.com] and open the cluster you want to configure. |
| 33 | +. In the sidebar, expand *ADP* and select *LLM Providers*. |
| 34 | +. Click *Create provider*. The *Create LLM provider* page opens. |
| 35 | + |
| 36 | +== Fill in the Provider card |
| 37 | + |
| 38 | +The first card on the page collects identity fields. |
| 39 | + |
| 40 | +[cols="1,1,3"] |
| 41 | +|=== |
| 42 | +|Field |Required |Notes |
| 43 | + |
| 44 | +|*Name* |
| 45 | +|Yes |
| 46 | +|Machine identifier. Lowercase letters, numbers, and hyphens only (`^[a-z][a-z0-9-]*$`), up to 63 characters. Immutable after creation. Appears in the proxy URL (`/llm/v1/providers/<name>/...`). The form auto-suggests a friendly name (for example, `red-space-bear`); override it if you want something more descriptive. |
| 47 | + |
| 48 | +|*Display name* (Advanced options) |
| 49 | +|No |
| 50 | +|Human-readable label shown in dashboards and model selectors. Up to 253 characters. Leave blank to use the *Name*. |
| 51 | +|=== |
| 52 | + |
| 53 | +Display name lives in the *Advanced options* expander on the same card. |
| 54 | + |
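The *Name* rules above can be expressed as a quick validation check. A sketch using the documented pattern and length cap:

```python
import re

# Pattern and 63-character cap from the Name field rules above.
NAME_PATTERN = re.compile(r"^[a-z][a-z0-9-]*$")

def is_valid_provider_name(name: str) -> bool:
    return len(name) <= 63 and bool(NAME_PATTERN.match(name))

print(is_valid_provider_name("red-space-bear"))  # True: auto-suggested style
print(is_valid_provider_name("My_Provider"))     # False: uppercase and underscore
```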
| 55 | +== Choose a provider type |
| 56 | + |
| 57 | +The *Provider type* card shows five options. Pick the one that matches your upstream.
| 58 | + |
| 59 | +[cols="1,3"] |
| 60 | +|=== |
| 61 | +|Type |Use when |
| 62 | + |
| 63 | +|*OpenAI* |
| 64 | +|Proxy GPT, o-series, and embeddings through the OpenAI API. Best when you already hold an OpenAI API key or want the broadest GPT model catalog. |
| 65 | + |
| 66 | +|*Anthropic* |
| 67 | +|Call Claude Opus, Sonnet, and Haiku directly. Strong at coding, long-context reasoning, and tool use. Supports forwarding client `Authorization` headers to Anthropic for enterprise and Max-plan subscription passthrough (see <<anthropic-authorization-passthrough>>). |
| 68 | + |
| 69 | +|*Google AI* |
| 70 | +|Reach Gemini Pro, Flash, and multimodal models via Google AI Studio. Ideal for long-context workloads and image/video inputs. |
| 71 | + |
| 72 | +|*AWS Bedrock* |
| 73 | +|Invoke foundation models (Claude, Llama, Titan, Nova) hosted inside your AWS account. Requires an AWS region and credentials (static, STS-assumed role, or the default credential chain). |
| 74 | + |
| 75 | +|*OpenAI-compatible* |
| 76 | +|Point at any OpenAI-compatible endpoint that serves `/v1/chat/completions` (vLLM, Ollama, LM Studio, LocalAI, Together, Groq, OpenRouter). Useful for self-hosted models and aggregator gateways. Requires a *Base URL*; authentication is optional.
| 77 | +|=== |
| 78 | + |
| 79 | +Selecting a type reveals the type-specific configuration block below the picker. |
| 80 | + |
| 81 | +== Fill in the type-specific configuration |
| 82 | + |
| 83 | +[tabs] |
| 84 | +====== |
| 85 | +OpenAI:: |
| 86 | ++ |
| 87 | +[cols="1,3"] |
| 88 | +|=== |
| 89 | +|Field |Notes |
| 90 | +
| 91 | +|*Base URL* |
| 92 | +|Optional. Leave empty for the standard OpenAI API (`https://api.openai.com/v1`). Override for Azure OpenAI or other OpenAI-hosted endpoints. |
| 93 | +
| 94 | +|*API key* |
| 95 | +|Required. Secret-store reference for the OpenAI API key. Must be `UPPER_SNAKE_CASE`, for example `OPENAI_API_KEY`. |
| 96 | +|=== |
|
| 98 | +Anthropic:: |
| 99 | ++ |
| 100 | +[cols="1,3"] |
| 101 | +|=== |
| 102 | +|Field |Notes |
| 103 | +
| 104 | +|*Base URL* |
| 105 | +|Optional. Leave empty for the standard Anthropic API (`https://api.anthropic.com`). |
| 106 | +
| 107 | +|*API key* |
| 108 | +|Required unless *Auth passthrough* is on. `UPPER_SNAKE_CASE`, for example `ANTHROPIC_API_KEY`. |
| 109 | +
| 110 | +|*Auth passthrough* |
| 111 | +|Optional toggle. When on, the client's `Authorization` header is forwarded to Anthropic instead of using a server-side API key. Used for enterprise and Max-plan OAuth passthrough: each client authenticates with its own Anthropic subscription. Leave the API key reference empty when using passthrough. |
| 112 | +|=== |
| 113 | +
| 114 | +Google AI:: |
| 115 | ++ |
| 116 | +[cols="1,3"] |
| 117 | +|=== |
| 118 | +|Field |Notes |
| 119 | +
| 120 | +|*Base URL* |
| 121 | +|Optional. Leave empty for the standard Google AI API (`https://generativelanguage.googleapis.com`). |
| 122 | +
| 123 | +|*API key* |
| 124 | +|Required. Secret-store reference for the Google AI API key. `UPPER_SNAKE_CASE`, for example `GOOGLE_AI_API_KEY`. |
| 125 | +|=== |
| 126 | ++ |
| 127 | +[IMPORTANT] |
| 128 | +==== |
| 129 | +Gemini uses the `x-goog-api-key` header for authentication, not `Authorization: Bearer`. This matters when you wire up clients. See xref:connect-agent.adoc[Connect your agent]. |
| 130 | +==== |
| 131 | +
| 132 | +AWS Bedrock:: |
| 133 | ++ |
| 134 | +[cols="1,3"] |
| 135 | +|=== |
| 136 | +|Field |Notes |
| 137 | +
| 138 | +|*Region* |
| 139 | +|Required. AWS region where the Bedrock endpoint is deployed, for example `us-east-1`. |
| 140 | +
| 141 | +|*Base URL* |
| 142 | +|Optional. Override the default regional Bedrock endpoint. |
| 143 | +
| 144 | +|*Credentials* |
| 145 | +|Choose one of: |
| 146 | +
| 147 | +* *Default credential chain* (leave the credentials selection unset). Uses environment variables, IRSA, EKS Pod Identity, or an instance profile.
| 148 | +* *Static credentials*. Secret-store references for the access key ID and secret access key, both `UPPER_SNAKE_CASE` (typically `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`). |
| 149 | +* *Assume role*. Provide a `role_arn`, optional `external_id` (required when the role's trust policy mandates it), and optional `session_name` (appears in CloudTrail audit logs).
| 150 | +|=== |
| 151 | +
| 152 | +OpenAI-compatible:: |
| 153 | ++ |
| 154 | +[cols="1,3"] |
| 155 | +|=== |
| 156 | +|Field |Notes |
| 157 | +
| 158 | +|*Base URL* |
| 159 | +|Required. URL of your OpenAI-compatible endpoint, for example `http://vllm.internal:8000/v1`, `http://ollama.local:11434/v1`, or an aggregator like Together / Groq / OpenRouter. |
| 160 | +
| 161 | +|*API key* |
| 162 | +|Optional. Leave empty for no-auth endpoints (common for local runtimes). `UPPER_SNAKE_CASE` if set. |
| 163 | +|=== |
| 164 | ++ |
| 165 | +TIP: OpenAI-compatible endpoints can serve any model. Enter the exact model identifiers your upstream server exposes (for example, `meta-llama/Llama-3.3-70B-Instruct` or `qwen3:8b`). |
| 166 | +====== |
| 167 | + |
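Whatever the type-specific settings, the request an application eventually sends through the proxy is an OpenAI-style chat payload. A sketch that only builds the request without sending it (host, provider name, model, and token are hypothetical placeholders; Google AI providers authenticate with `x-goog-api-key` rather than `Authorization`):

```python
import json

# Hypothetical values for illustration only.
PROXY_URL = "https://my-dataplane.example.com/llm/v1/providers/my-provider/chat/completions"

# OpenAI-style chat payload; the gateway forwards it to the configured upstream.
payload = {
    "model": "gpt-4o-mini",  # must be in the provider's selected models (or the list left empty)
    "messages": [{"role": "user", "content": "Hello"}],
}

# Most provider types use a bearer token; Google AI expects x-goog-api-key instead.
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer <token>",  # placeholder: your gateway or client credential
}

body = json.dumps(payload)
```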
| 168 | +[[select-models]] |
| 169 | +== Select models |
| 170 | + |
| 171 | +Models you select on this form become the catalog the provider exposes. Leave the list empty to allow every model the upstream catalog returns. |
| 172 | + |
| 173 | +For *OpenAI*, *Anthropic*, *Google AI*, and *AWS Bedrock*, the form shows a picker backed by the provider's catalog. Pick from the list, or type a model identifier the catalog doesn't show. For *OpenAI-compatible*, the form takes a freeform list: type the exact identifiers your upstream serves.
| 174 | + |
| 175 | +[NOTE] |
| 176 | +==== |
| 177 | +Models are stored as structured `ProviderModel` entries (one entry per model, with the model name as the only required field). A future Phase 2 release will add per-model metadata such as custom pricing overrides. The legacy flat `models` field still works on writes for backward compatibility. |
| 178 | +==== |
| 179 | + |
| 180 | +After you create the provider, the detail page renders each model as a card with capability badges (for example, *Vision*, *Reasoning*, *Streaming*) lifted from the catalog. |
| 181 | + |
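As a sketch of the structured entries described in the note (`name` as the only required field per the note; the surrounding payload shape is an assumption):

```python
# One ProviderModel entry per model; the model name is the only required field.
# Per-model metadata such as pricing overrides is planned for a later phase.
provider_models = [
    {"name": "gpt-4o-mini"},
    {"name": "gpt-4o"},
]

# The legacy flat models field is the same list collapsed to bare names.
legacy_models = [m["name"] for m in provider_models]
print(legacy_models)
```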
| 182 | +== Save and verify |
| 183 | + |
| 184 | +. Click *Create provider*. The button activates once *Name* and *Type* are both set; the right-hand *Summary* panel checks them off as you fill them in. |
| 185 | +. On the provider's detail page, the *Connection* card shows your *Proxy URL*, *Discovery* URL, *Base URL*, and *API key ref*. Copy the *Proxy URL*: this is where your applications point.
| 186 | +. Scroll to the *Verify connection* section. Pick a model from the dropdown and click *Test Connection*. The status updates from "Not tested yet" to a pass/fail indicator. Use the *Show commands* disclosure if you want to see the equivalent curl or SDK call. |
| 187 | +. To wire up an application, open *Connect your app* further down the page or follow xref:connect-agent.adoc[Connect your agent]. |
| 188 | + |
| 189 | +A successful Test Connection result confirms that the provider's credentials, region (for Bedrock), and network path are all correct. If the call fails, see <<troubleshooting>>.
| 190 | + |
| 191 | +[[anthropic-authorization-passthrough]] |
| 192 | +== Anthropic: authorization passthrough |
| 193 | + |
| 194 | +If you want each client to authenticate against Anthropic with its own subscription (Claude Pro, Max, Team, or enterprise), enable *Auth passthrough* instead of configuring a server-side API key. In this mode: |
| 195 | + |
| 196 | +* Leave the *API key* field empty. |
| 197 | +* Clients must send their own Anthropic `Authorization` header with every request. AI Gateway forwards it unchanged. |
| 198 | +* Use this when you want to aggregate individual client subscriptions rather than share a single API account. |
| 199 | + |
| 200 | +The provider detail page shows whether Auth passthrough is enabled in the *Connection* card. |
| 201 | + |
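In passthrough mode, each request carries the client's own credential. A sketch of the headers a client would send (the token is a placeholder; the `anthropic-version` header is an assumption based on Anthropic's standard API conventions, not something this form configures):

```python
# The gateway forwards the Authorization header to Anthropic unchanged;
# no server-side API key is configured on the provider in this mode.
client_token = "<client-anthropic-token>"  # placeholder for the client's own credential

headers = {
    "Authorization": f"Bearer {client_token}",
    "anthropic-version": "2023-06-01",  # assumption: Anthropic's API version header
    "Content-Type": "application/json",
}

print(sorted(headers))
```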
| 202 | +== Edit, disable, or delete a provider |
| 203 | + |
| 204 | +* *Edit*: click *Edit* on the detail page. You can change any field *except* `Name` and `Type`, which are immutable. Model lists, credential references, and the enabled state can all change. |
| 205 | +* *Disable*: click *Disable* on the detail page. The provider remains in the list, but requests to its proxy URL are rejected until you enable it again. Use this when you want to pause traffic without losing configuration. |
| 206 | +* *Delete*: scroll to the *Delete this provider* section at the bottom of the detail page and click *Delete*. The action is permanent; in-flight requests fail, and downstream clients receive errors until they are reconfigured.
| 207 | + |
| 208 | +[[troubleshooting]] |
| 209 | +== Troubleshooting |
| 210 | + |
| 211 | +[cols="1,2"] |
| 212 | +|=== |
| 213 | +|Symptom |What to check |
| 214 | + |
| 215 | +|`secret "<NAME>" not found` |
| 216 | +|Confirm the secret exists in your dataplane's secret store and the reference in the provider configuration is spelled identically (`UPPER_SNAKE_CASE`, no typos). |
| 217 | + |
| 218 | +|Bedrock returns `AccessDenied` or region errors |
| 219 | +|Verify the AWS region field matches the region where your Bedrock models are enabled. Bedrock model availability varies by region. |
| 220 | + |
| 221 | +|Anthropic returns 401 when passthrough is enabled |
| 222 | +|Confirm the client is sending its own `Authorization` header and the *API key* field on the provider is empty. |
| 223 | + |
| 224 | +|Gemini returns 401 |
| 225 | +|Gemini uses the `x-goog-api-key` header, not `Authorization`. If you're seeing 401s on Gemini, check that the client is sending the correct header. See xref:connect-agent.adoc[Connect your agent]. |
| 226 | + |
| 227 | +|Provider list empty or 403 |
| 228 | +|Confirm your account has the `dataplane_adp_llmprovider_*` permissions in ADP. |
| 229 | ++ |
| 230 | +// TODO: confirm the exact role/permission model once the standalone ADP UI launches. |
| 231 | +|=== |
| 232 | + |
| 233 | +// TODO: add screenshots of common error toasts once captured from the live environment. |
| 234 | + |
| 235 | +== Out of scope |
| 236 | + |
| 237 | +AI Gateway does not provide these capabilities. For current status, consult the Redpanda Cloud release notes. |
| 238 | + |
| 239 | +* *Multi-provider routing, failover, and retries across providers.* A synthetic provider that fans requests to multiple upstreams is not part of AI Gateway. |
| 240 | +* *Spend limits.* Per-user, per-org, and global cost caps are not available. The provider detail page shows a *Cost & usage* placeholder labeled "Coming soon"; see xref:governance:budgets.adoc[Token budgets and limits]. |
| 241 | +* *Rate limits.* Requests-per-second, per-minute, or per-day limits are not available. |
| 242 | +* *Managed MCP aggregation at the gateway.* Register MCP tool servers separately under *ADP* → *MCP Servers*. |
| 243 | + |
| 244 | +== Next steps |
| 245 | + |
| 246 | +* xref:connect-agent.adoc[Connect your agent]. Point your application's SDK at the proxy URL and make requests. |