Commit 39169d5

IsaiahWitzke, oz-agent, and hongyi-chen authored
Document AWS Bedrock IAM role setup for BYOLLM, including Oz cloud runs (#92)
* Document AWS Bedrock IAM role setup for BYOLLM, including Oz cloud runs

  Rewrite the BYOLLM auth section to describe the OIDC + role-assumption flow that Warp actually uses for AWS Bedrock inference, and extend the docs to cover Oz cloud agent runs in addition to interactive terminal agents.

  Highlights:
  - Replace the local AWS CLI session-credentials path with a trust policy + permissions policy + AWS CLI setup script that admins follow once per team.
  - Use scoped_principal:<team-uid>/* in the trust policy so a single role works for human-triggered interactive and cloud runs as well as named-agent service-account runs. Note the full sub claim shape so admins can scope further if they want.
  - Update prerequisites, troubleshooting, and FAQ to match the new auth model.

  Companion to the Oz Bedrock plumbing on iw/add-aws-region-to-oz-runs in warp-server.

  Co-Authored-By: Oz <oz-agent@warp.dev>

* Split BYOLLM doc: narrow team trust policy to user:*, add BYOLLM on Oz section

  The team-wide trust policy (Step 2) is now scoped to scoped_principal:<team-uid>/user:* for human team members. Named-agent runs authenticate as service accounts, not users, so they need separate handling.

  Add a new top-level 'BYOLLM on Oz' section that covers the named-agent flow:
  - Explains the actor-type difference between human runs (user:<user-uid>) and named-agent runs (service_account:<service-account-uid>).
  - Provides two patterns for authorizing named agents: a wildcard service_account:* for all team agents, or per-UID conditions for a subset.
  - Documents how to find the team UID (Admin Panel) and named-agent UIDs (GET /api/v1/agent/identities or the Oz web app URL).
  - Documents the per-agent inference_providers.aws override in the public API, including role_arn / region / disabled and the trust-policy implications.

  Co-Authored-By: Oz <oz-agent@warp.dev>

* wip

* wip2

* Iterate on BYOLLM cloud agents docs

  - Split Step 3 into team-wide and per-agent paths
  - Reorder cloud agents steps so AWS setup precedes Admin Panel config
  - Add Step for setting up Warp as an OIDC identity provider in AWS
  - Use team-wide wildcard pattern as default trust policy
  - Link to Oz web app new agent docs and credit-buckets reference

  Co-Authored-By: Oz <oz-agent@warp.dev>

* wip3

* wip4

* wip5

* Apply suggestions from code review

  Co-authored-by: Hong Yi Chen <hongyigma@gmail.com>

* wip6

* wip7

---------

Co-authored-by: Oz <oz-agent@warp.dev>
Co-authored-by: Hong Yi Chen <hongyigma@gmail.com>
1 parent 286ec5a commit 39169d5

1 file changed

Lines changed: 168 additions & 20 deletions

src/content/docs/enterprise/enterprise-features/bring-your-own-llm.mdx

@@ -11,8 +11,6 @@ This gives you control over cloud spend and model hosting, without changing how

 :::caution
 BYOLLM currently supports **AWS Bedrock** only. Coming soon: Azure Foundry and Google Vertex support.
-
-BYOLLM applies to interactive agents in the terminal. Cloud agents do not yet support BYOLLM routing.
 :::

 :::note
@@ -21,7 +19,8 @@ BYOLLM is only available on Warp's Enterprise plan. Contact [warp.dev/contact-sa

 ## Key features

-* **Cloud-native credentials** - Authenticate using each user’s AWS IAM identity. Warp does not store API keys.
+* **Cloud-native credentials** - No long-lived API keys. Interactive terminal sessions use each user's AWS CLI session credentials; cloud agent runs assume an IAM role in your AWS account via OIDC.
+* **Admin-controlled IAM** - Admins define which IAM role(s) Warp can assume and which models are available via AWS Bedrock, with the ability to disable non-Bedrock model access entirely.
 * **Admin-enforced routing** - Team admins configure which models are available to users in AWS Bedrock, with the ability to disable non-Bedrock model access entirely.
 * **Consolidated billing** - Inference costs are billed directly to your AWS account, leveraging existing cloud commitments.

@@ -33,11 +32,20 @@ When BYOLLM is enabled, Warp redirects inference calls to your AWS Bedrock envir

 Here's the high-level flow:

+**Interactive terminal flow**
+
 1. **Admin configures routing** - Your team admin sets routing policies in Warp's admin settings (e.g., "Route Claude Opus 4.7 through AWS Bedrock; disable direct Anthropic API").
 2. **Team members authenticate** - Each team member authenticates to AWS locally using the AWS CLI (`aws login`).
 3. **Warp routes requests** - When a team member uses an interactive agent in the terminal, Warp uses their short-lived session credentials to authenticate requests to your configured AWS Bedrock API endpoint.
 4. **Inference executes in your cloud** - The model runs in your AWS account. Responses return to the Warp client.

+**Cloud agent flow**
+
+1. **Admin configures routing** - Your team admin configures BYOLLM in the Admin Panel and provides an IAM role ARN that Warp can assume. See [Enabling BYOLLM for Cloud Agents](#enabling-byollm-for-cloud-agents) for setup details.
+2. **Warp assumes the role** - At run start, Warp mints an OIDC token and assumes the configured IAM role in your AWS account to obtain temporary credentials.
+3. **Warp routes requests** - The cloud agent uses those temporary credentials to call your configured AWS Bedrock endpoint.
+4. **Inference executes in your cloud** - The model runs in your AWS account. Responses return to the cloud agent worker.
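Step 2 of the cloud agent flow above maps onto the AWS STS `AssumeRoleWithWebIdentity` action. The sketch below is not Warp's implementation; it only builds the request parameters such a call carries, without sending anything, to show which inputs the exchange needs. The role ARN, session name, and token are placeholder values, and `build_assume_role_request` is a hypothetical helper name.

```python
from urllib.parse import parse_qs, urlencode

STS_ENDPOINT = "https://sts.amazonaws.com/"  # global AWS STS endpoint


def build_assume_role_request(role_arn: str, session_name: str,
                              oidc_token: str,
                              duration_seconds: int = 3600) -> str:
    """Return a URL carrying the parameters of an AssumeRoleWithWebIdentity call.

    Illustration only: nothing is sent. The OIDC token itself authenticates
    the caller, so this STS action does not need AWS credentials.
    """
    params = {
        "Action": "AssumeRoleWithWebIdentity",
        "Version": "2011-06-15",
        "RoleArn": role_arn,               # the role ARN configured by the admin
        "RoleSessionName": session_name,   # label for the temporary session
        "WebIdentityToken": oidc_token,    # token signed by the OIDC issuer
        "DurationSeconds": str(duration_seconds),
    }
    return STS_ENDPOINT + "?" + urlencode(params)


# Placeholder values, assumed for illustration.
url = build_assume_role_request(
    role_arn="arn:aws:iam::123456789012:role/example-bedrock-role",
    session_name="example-run",
    oidc_token="<oidc-token>",
)
query = parse_qs(url.split("?", 1)[1])
print(query["Action"][0], query["RoleArn"][0])
```

A successful response from STS would contain temporary credentials (access key, secret key, session token), which is what step 3 of the flow uses to call Bedrock.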
+
 ### Credential lifecycle

 BYOLLM uses **cloud-native IAM authentication**, not long-lived API keys:
@@ -73,7 +81,7 @@ Before configuring BYOLLM, confirm the following:

 In the [Admin Panel](/enterprise/team-management/admin-panel/), configure which models should route through AWS Bedrock:

-1. From the [Admin Panel](/enterprise/team-management/admin-panel/), navigate to the BYOLLM or model routing settings.
+1. From the [Admin Panel](/enterprise/team-management/admin-panel/), navigate to the **Models** page.
 2. Select which models should use your cloud provider (e.g., "Claude Opus 4.7 via AWS Bedrock").
 3. Optionally, disable direct API access to enforce provider-only routing.

@@ -105,7 +113,7 @@ Grant your team members the necessary permissions in AWS. Use least-privilege IA
 ```

 :::note
-This policy covers Warp's current usage. Warp uses global inference profiles for models when available.
+This policy covers Warp's current usage. By default, Warp uses [global inference profiles](https://docs.aws.amazon.com/bedrock/latest/userguide/cross-region-inference.html) for models when available. Admins can override the inference profile per model on the **Models** page of the [Admin Panel](/enterprise/team-management/admin-panel/).
 :::

 ### Step 3: Authenticate locally (team member)
@@ -125,22 +133,158 @@ Run a test prompt in Warp using a model configured for BYOLLM routing. Verify:
 * The request completes successfully.
 * Logs appear in AWS CloudWatch.

+## Enabling BYOLLM for cloud agents
+
+Cloud agents authenticate to AWS Bedrock differently from the local terminal flow above. Instead of relying on each user's AWS CLI session, Warp assumes an IAM role you provision in your AWS account using OIDC identity federation.
+
+### Prerequisites
+
+Before configuring BYOLLM for cloud agents, confirm the following:
+
+* You have admin access to both Warp's [Admin Panel](/enterprise/team-management/admin-panel/) and your AWS IAM settings.
+
+### Step 1: Set up Warp as an OIDC identity provider in AWS (cloud admin)
+
+Before AWS can trust tokens issued by Warp, register Warp as an OpenID Connect (OIDC) identity provider in IAM. This is a one-time setup per AWS account.
+
+1. Open the [Identity providers](https://console.aws.amazon.com/iam/home#/identity_providers) page in the AWS IAM console.
+2. Click **Add provider**.
+3. For **Provider type**, choose **OpenID Connect**.
+4. For **Provider URL**, enter `https://app.warp.dev`.
+5. For **Audience**, enter `sts.amazonaws.com`.
+6. Click **Add provider**.
+
+After the provider is created, copy its ARN — it will look like `arn:aws:iam::<aws-account-id>:oidc-provider/app.warp.dev`. You'll reference this ARN in the trust policy in the next step.
+
+For more detail, see AWS's [Create an OpenID Connect (OIDC) identity provider in IAM](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_providers_create_oidc.html) guide.
+
+### Step 2: Provision an assumable IAM role (cloud admin)
+
+Create an IAM role that Warp can assume via OIDC, then attach the minimum Bedrock permissions policy. Use least-privilege IAM policies.
+
+The role setup has two parts:
+
+1. A **trust policy** that allows Warp's OIDC identity to call `sts:AssumeRoleWithWebIdentity`.
+2. A **permissions policy** that grants the minimum Bedrock inference permissions.
+
+#### Trust policy requirements
+
+This trust policy authorizes any cloud-hosted run from your team. The `sub` claim Warp signs has the shape `scoped_principal:<team-uid>/<actor-type>:<principal-uid>`, where `<actor-type>` is `user` for user-triggered runs or `service_account` for [agent identity](/agent-platform/cloud-agents/agents/) runs. The `<team-uid>/*` pattern below covers both.
+
+**Example trust policy**
+
+```json
+{
+  "Version": "2012-10-17",
+  "Statement": [
+    {
+      "Effect": "Allow",
+      "Principal": {
+        "Federated": "arn:aws:iam::<aws-account-id>:oidc-provider/app.warp.dev"
+      },
+      "Action": "sts:AssumeRoleWithWebIdentity",
+      "Condition": {
+        "StringLike": {
+          "app.warp.dev:sub": "scoped_principal:<team-uid>/*"
+        },
+        "StringEquals": {
+          "app.warp.dev:aud": "sts.amazonaws.com"
+        }
+      }
+    }
+  ]
+}
+```
+
+Replace the account ID, issuer host, and team UID with values for your environment.
+
+The `<team-uid>` is the Warp team UID for the team that will be allowed to assume this role. You can find it in your team's [Admin Panel](/enterprise/team-management/admin-panel/) URL as the path segment after `/admin/`. For example, in `https://app.warp.dev/admin/HzjUdNkg8Uiq8gp6FMgfxe/models`, the team UID is `HzjUdNkg8Uiq8gp6FMgfxe`.
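As a rough illustration of the two pieces described above, the sketch below pulls the team UID out of an Admin Panel URL and approximates how the `StringLike` pattern on the `sub` claim admits both actor types. The helper names and the principal UIDs (`u123`, `sa456`) are hypothetical, and `fnmatchcase` is only a stand-in for IAM's wildcard matching, not the AWS evaluation logic itself.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse


def team_uid_from_admin_url(url: str) -> str:
    """Return the path segment that follows /admin/ in an Admin Panel URL.

    Raises ValueError if the path has no 'admin' segment and IndexError if
    nothing follows it.
    """
    segments = [s for s in urlparse(url).path.split("/") if s]
    return segments[segments.index("admin") + 1]


def sub_pattern(team_uid: str) -> str:
    """Build the team-wide `sub` pattern used in the example trust policy."""
    return f"scoped_principal:{team_uid}/*"


def sub_matches(pattern: str, sub: str) -> bool:
    """Approximate IAM StringLike: '*' matches any run of characters, '?' one.

    fnmatchcase is case-sensitive like StringLike, but it also gives '[...]'
    a special meaning that IAM does not, so treat this as an approximation.
    """
    return fnmatchcase(sub, pattern)


uid = team_uid_from_admin_url(
    "https://app.warp.dev/admin/HzjUdNkg8Uiq8gp6FMgfxe/models"
)
pattern = sub_pattern(uid)

# Hypothetical sub claims following the documented shape.
user_sub = f"scoped_principal:{uid}/user:u123"
agent_sub = f"scoped_principal:{uid}/service_account:sa456"
other_team_sub = "scoped_principal:SomeOtherTeam/user:u123"

for sub in (user_sub, agent_sub, other_team_sub):
    print(sub, sub_matches(pattern, sub))
```

Narrowing the pattern to `scoped_principal:<team-uid>/user:*` in the same sketch would admit only the user-triggered claim, which is the kind of further scoping the `sub` claim shape allows.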
+
+#### Permissions policy
+
+Attach the minimum Bedrock invoke permissions policy to the role:
+
+```json
+{
+  "Version": "2012-10-17",
+  "Statement": [
+    {
+      "Sid": "BedrockModelAccess",
+      "Effect": "Allow",
+      "Action": [
+        "bedrock:InvokeModel",
+        "bedrock:InvokeModelWithResponseStream"
+      ],
+      "Resource": [
+        "arn:aws:bedrock:*::foundation-model/*",
+        "arn:aws:bedrock:*:*:inference-profile/*",
+        "arn:aws:bedrock:*:*:application-inference-profile/*"
+      ]
+    }
+  ]
+}
+```
+
+:::note
+This policy covers Warp's current usage. By default, Warp uses [global inference profiles](https://docs.aws.amazon.com/bedrock/latest/userguide/cross-region-inference.html) for models when available. Admins can override the inference profile per model on the **Models** page of the [Admin Panel](/enterprise/team-management/admin-panel/).
+:::
+
+After you create the role, copy its ARN. You'll paste it into the **Models** page in the next step.
+
+### Step 3: Configure routing policies (admin)
+
+Attach the IAM role from Step 2 to your team or to a specific named agent.
+
+#### Option A: Team-wide
+
+This applies the OIDC role to all cloud agent runs on the team.
+
+1. In the [Admin Panel](/enterprise/team-management/admin-panel/), navigate to the **Models** page.
+2. Under the **AWS Bedrock** host configuration, paste the IAM role ARN from Step 2 into the **Role ARN** field.
+3. Select which models should route through AWS Bedrock.
+
+#### Option B: Per named agent
+
+This applies the OIDC role only to runs from a specific named agent.
+
+:::note
+To safely test BYOLLM, configure it on a single named agent first. Misconfigurations scoped to one agent only affect that agent's runs, not the whole team.
+:::
+
+In the Oz web app:
+
+1. [Create a new agent](/agent-platform/cloud-agents/oz-web-app/#creating-a-new-agent) or edit an existing one.
+2. In the agent form, expand the **AWS Bedrock** section.
+3. Choose **Custom** and paste the IAM role ARN from Step 2.
+4. Ensure the agent's default model is one that's enabled for Bedrock under the Admin Panel **Models** page.
+
+New runs for this agent will authenticate to Bedrock using the configured role.
+
+### Step 4: Validate the configuration
+
+Start a test cloud agent run using a model configured for BYOLLM routing. Verify:
+
+* The request completes successfully.
+* Logs appear in AWS CloudWatch.
+
 ## BYOLLM usage and billing behavior

 ### Billing

 When a request routes through BYOLLM:

-* **Warp does not consume credits** for that request.
-* Your cloud provider account receives the inference costs directly.
+* **Warp does not consume AI credits** for that request.
+* Cloud agent runs still consume platform and compute credits for orchestration and the cloud agent's compute.
+
+See [The three credit buckets](/support-and-community/plans-and-billing/platform-credits/#the-three-credit-buckets) for more on credit types.

 ### Routing behavior

 Warp's agents automatically select the best model for your task while respecting your admin's routing policies. If you configure a model for BYOLLM, requests for that model route to AWS Bedrock.

 ### Failover behavior

-If a BYOLLM request fails (e.g., due to expired credentials, insufficient permissions, or provider quota limits), Warp attempts to fall back to the next available model your admin has enabled.
+If a BYOLLM request fails (e.g., due to role assumption errors, insufficient permissions, or provider quota limits), Warp attempts to fall back to the next available model your admin has enabled.

 For example, if Claude Opus 4.7 on Bedrock fails but your admin also enabled it via direct API, Warp falls back to the direct API to avoid disruption. If a fallback uses a direct API model, that request consumes Warp credits.

@@ -173,17 +317,20 @@ However, when using BYOLLM:

 ### Common errors

-* **Missing or expired credentials** — Re-authenticate using `aws login`. To avoid interruptions, enable auto-refresh by opening **Settings** and searching for `AWS Bedrock`, or when prompted during credential expiration.
-* **Insufficient permissions** — Verify your IAM policy includes the required actions and resources.
+* **Missing or expired local credentials** (interactive terminal use) — Re-authenticate using `aws login`. To avoid interruptions, enable auto-refresh by opening **Settings** and searching for `AWS Bedrock`, or when prompted during credential expiration.
+* **Role assumption failed** (cloud agent runs) — Verify the IAM trust policy, issuer host, team UID restriction, and the configured role ARN in Warp.
+* **Missing OIDC provider** (cloud agent runs) — Confirm the OIDC provider exists in your AWS account for the issuer host referenced in the trust policy.
+* **Insufficient permissions** — Verify your IAM policy includes the required Bedrock actions and any needed resources.
 * **Region or model mismatch** — Confirm the model is enabled in your AWS region and that your environment is configured for the correct region.
 * **Provider quota limits** — Check your AWS Bedrock quota and request increases if needed.

 ### Debugging steps

-1. Verify local authentication: run `aws sts get-caller-identity`.
-2. Check your effective IAM policy for the required permissions.
-3. Confirm the model ID and region match your Warp configuration.
-4. Inspect AWS CloudWatch logs for request details and errors.
+1. Confirm the configured role ARN is the one you intended Warp to assume.
+2. Check the IAM trust policy and verify the issuer host, `sub`, and `aud` conditions match your Warp configuration.
+3. Check the attached IAM policy for the required Bedrock permissions.
+4. Confirm the model ID and region match your Warp configuration.
+5. Inspect AWS CloudWatch logs for request details and errors.
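Debugging step 2 above can be partly mechanized. The following is a minimal sketch, assuming the trust policy has been exported as JSON and loaded into a Python dict; `check_trust_policy` is a hypothetical helper, not a Warp or AWS tool, and the account ID in the example is a placeholder.

```python
def check_trust_policy(policy, issuer_host, team_uid, audience="sts.amazonaws.com"):
    """Return a list of mismatches found in an IAM trust policy (empty if none).

    Illustration only: inspects the first Allow statement that grants
    sts:AssumeRoleWithWebIdentity and ignores everything else.
    """
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement object is allowed
        statements = [statements]

    def actions_of(stmt):
        actions = stmt.get("Action", [])
        return [actions] if isinstance(actions, str) else actions

    stmt = next(
        (s for s in statements
         if s.get("Effect") == "Allow"
         and "sts:AssumeRoleWithWebIdentity" in actions_of(s)),
        None,
    )
    if stmt is None:
        return ["no Allow statement for sts:AssumeRoleWithWebIdentity"]

    problems = []

    # The federated principal should be the OIDC provider for the issuer host.
    principal = stmt.get("Principal")
    federated = principal.get("Federated", []) if isinstance(principal, dict) else []
    if isinstance(federated, str):
        federated = [federated]
    if not any(arn.endswith(f":oidc-provider/{issuer_host}") for arn in federated):
        problems.append(f"Principal.Federated does not reference oidc-provider/{issuer_host}")

    cond = stmt.get("Condition", {})

    # sub may be matched with StringLike (wildcards) or StringEquals (exact).
    # A list of values is reported as a mismatch in this simplified sketch.
    sub_key = f"{issuer_host}:sub"
    sub = cond.get("StringLike", {}).get(sub_key) or cond.get("StringEquals", {}).get(sub_key)
    if not isinstance(sub, str) or not sub.startswith(f"scoped_principal:{team_uid}/"):
        problems.append(f"{sub_key} is not scoped to team {team_uid}")

    aud_key = f"{issuer_host}:aud"
    if cond.get("StringEquals", {}).get(aud_key) != audience:
        problems.append(f"{aud_key} is not {audience}")

    return problems


# The example trust policy from this commit, with placeholder account ID.
example_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {
            "Federated": "arn:aws:iam::123456789012:oidc-provider/app.warp.dev"
        },
        "Action": "sts:AssumeRoleWithWebIdentity",
        "Condition": {
            "StringLike": {"app.warp.dev:sub": "scoped_principal:HzjUdNkg8Uiq8gp6FMgfxe/*"},
            "StringEquals": {"app.warp.dev:aud": "sts.amazonaws.com"},
        },
    }],
}

print(check_trust_policy(example_policy, "app.warp.dev", "HzjUdNkg8Uiq8gp6FMgfxe"))
```

An empty list only means this sketch found no mismatch in the three conditions it checks; it does not prove that AWS will accept the role assumption.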

 ## FAQ

@@ -196,25 +343,26 @@ However, when using BYOLLM:
 | Feature | BYOK | BYOLLM |
 | --- | --- | --- |
 | Configuration level | User | Admin/Team |
-| Authentication | API keys (local) | Cloud IAM (per-user) |
+| Authentication | API keys (local) | IAM role assumed by Warp via OIDC |
 | Billing | Direct to provider | Your cloud account |
 | Data locality | Provider infrastructure | Your cloud infrastructure |

 ### Does BYOLLM work with Auto?

-Auto model selection is disabled as soon as your admin disables **any** Direct API model, regardless of your AWS Bedrock configuration.
+Auto model selection is disabled if an admin disables **any** Direct API model, regardless of AWS Bedrock configuration.

-If all Direct API models remain enabled and BYOLLM is configured, Auto will try to use your enabled AWS Bedrock models first, falling back to Direct API only if that fails (e.g., invalid/missing AWS credentials, Bedrock outage).
+When Direct API models remain enabled and BYOLLM is configured, Auto picks the best model for the task. If the selected model is also enabled for AWS Bedrock, the request routes through Bedrock; otherwise it routes through the Direct API.

 ### Where does compute run and who pays?

-Inference runs in **your AWS account**. You pay AWS directly for compute usage. Warp does not consume credits for BYOLLM-routed requests.
+Inference runs in **your AWS account**, which AWS bills directly. Warp does not consume AI credits for BYOLLM-routed inference. Cloud agent runs continue to consume platform and compute credits for orchestration. See [The three credit buckets](/support-and-community/plans-and-billing/platform-credits/#the-three-credit-buckets) for more.

 ### What data does Warp store? Do you store our cloud credentials?

-Warp **does not store or log** your cloud session tokens. Credentials are used transiently to sign requests and are never persisted on Warp servers.
+Warp **does not store or log** your cloud credentials.

-Warp stores standard run metadata (timestamps, model used, etc.) but does not retain the content of your prompts or responses when using BYOLLM.
+* **Interactive terminal use** — Credentials are used transiently to sign requests and are never persisted on Warp servers.
+* **Cloud agent runs** — Temporary AWS credentials are used only for the duration of the run and are not retained after it ends.

 ### Can admins enforce provider-only routing and disable Warp-managed models?
