Commit 1352c3f

docs(pricing-may-2026): BYOLLM reframe, plan summary, Enterprise billing, and plans-and-billing index (#72)
* docs: update BYOLLM, plan summary, Enterprise billing, and plans-and-billing index for May 2026

  - bring-your-own-llm.mdx: Reframe as Enterprise-only managed inference (Bedrock GA, Vertex/Foundry roadmap, internal gateways). Add a comparison section that contrasts BYOLLM with BYOK and Custom inference endpoint, with a note that centrally configured BYOK/CIE for Enterprise is a fast-follow after launch.
  - plans-pricing-refunds.mdx: Add May 2026 plan summary (Free, Build, Max, Business, Enterprise) with seat limits and qualitative descriptions of each plan's value. Link out to warp.dev/pricing for current monthly credit allowances instead of hard-coding numbers. Add a Custom inference endpoint bullet to the existing sub-page list.
  - enterprise/support-and-resources/billing.mdx: Clarify that team-wide spend limits are also available on self-serve paid plans while per-user spend limits are Enterprise-only. Add a related-resources link to the Enterprise Analytics API.
  - plans-and-billing/index.mdx: Add a Custom inference endpoint bullet so the new page is discoverable from the plans-and-billing landing page.

  Part of the May 2026 Warp pricing docs overhaul (hyc/plan-updates).

  Co-Authored-By: Oz <oz-agent@warp.dev>

* docs(pricing-may-2026): correct plan seat limits per orchestrator

  Update May 2026 plan summary in plans-pricing-refunds.mdx with the actual seat limits from the pricing-context dump:

  - Free / Build / Max: Up to 10 team members
  - Business: Up to 25 team members
  - Enterprise: Unlimited team members (custom contract)

  Also reframe descriptions so 'individual' language is dropped from plans that support up to 10 team members.

  Co-Authored-By: Oz <oz-agent@warp.dev>

* docs(pricing-may-2026): add BYOK / CIE org-size disclosure (10 or fewer employees)

  Per user requirement: BYOK and custom inference endpoint are limited to individual users and orgs with 10 or fewer employees on Free / Build / Max; larger orgs require a Business or Enterprise plan.

  - bring-your-own-llm.mdx: Update the BYOK and CIE rows in the comparison matrix to reflect the org-size constraint, add a :::note disclosure callout immediately below the matrix, and rephrase the corresponding FAQ for consistency.
  - plans-pricing-refunds.mdx: Add the same :::note disclosure callout after the May 2026 plan summary bullet list, before the ZDR sentence (one callout covering the Free / Build / Max bullets).

  Co-Authored-By: Oz <oz-agent@warp.dev>

* Add Custom inference endpoint page stub for CI link check

  This file is the canonical version created on PR #71 (hyc/plan-updates-byok-cie). It is duplicated here so that the link checker on this branch can resolve the relative references to /support-and-community/plans-and-billing/custom-inference-endpoint/ that this PR introduces. When PR #71 merges into hyc/plan-updates, git will reconcile the identical file contents automatically.

  Co-Authored-By: Oz <oz-agent@warp.dev>

* docs(pricing-may-2026): BYOLLM cross-provider IAM + platform credits in plan summary, Enterprise billing

  - bring-your-own-llm.mdx: rewrite Cloud-native credentials bullet so it covers AWS IAM (GA) plus Google Cloud and Azure identities (roadmap); add a platform-credits note next to the BYOLLM 'no Warp credits consumed' framing so Enterprise readers know local agent runs still consume platform credits; add the platform-credits page to Related resources.
  - plans-pricing-refunds.mdx: append a one-liner about platform credits to the Business and Enterprise bullets of the May 2026 plan summary so readers understand when platform credits apply across customer-supplied inference.
  - enterprise/support-and-resources/billing.mdx: add the platform-credits page to Related resources alongside Add-on Credits and Credits.

  Pre-existing build/link-check failures (CIE sidebar registration; platform-credits.mdx not yet on the umbrella) are out of scope here and will be resolved by PR #71's sidebar.ts update and the umbrella rebase onto main.

  Co-Authored-By: Oz <oz-agent@warp.dev>

* Remove duplicate CIE stub; PR #71 owns the canonical custom-inference-endpoint.mdx

  The CIE page is owned by PR #71 (hyc/plan-updates-byok-cie), which also adds the matching src/sidebar.ts entry. Removing the stub here so the PR #72 → umbrella merge auto-resolves cleanly after #71 lands.

  Co-Authored-By: Oz <oz-agent@warp.dev>

---------

Co-authored-by: Oz <oz-agent@warp.dev>
1 parent 24972eb commit 1352c3f

4 files changed

Lines changed: 59 additions & 18 deletions


src/content/docs/enterprise/enterprise-features/bring-your-own-llm.mdx

Lines changed: 35 additions & 18 deletions
Original file line number | Diff line number | Diff line change
@@ -1,16 +1,16 @@
11
---
22
title: Bring your own LLM
33
description: >-
4-
Route Warp's agents through your AWS Bedrock models for billing control and
5-
infrastructure flexibility.
4+
Route Warp's agents through your organization's managed inference
5+
infrastructure for governance, billing control, and model flexibility.
66
---
77

8-
Warp supports **Bring Your Own LLM (BYOLLM)** for enterprise teams that need to run inference on their own cloud infrastructure. With BYOLLM, your team can use Warp's agents while routing inference through models hosted in your AWS Bedrock environment.
8+
Warp supports **Bring your own LLM (BYOLLM)** for Enterprise teams that want to run inference on their own managed infrastructure. BYOLLM covers two patterns: cloud-provider Model-as-a-Service (AWS Bedrock, Google Vertex AI, Azure AI Foundry) and approved internal inference gateways.
99

10-
This gives you control over cloud spend and model hosting, without changing how your team works in Warp.
10+
With BYOLLM, your team uses Warp's agents while Warp manages routing, orchestration, governance, and observability across the providers you've approved. Inference runs in your environment; admins control which models are available to whom.
1111

1212
:::caution
13-
BYOLLM currently supports **AWS Bedrock** only. Coming soon: Azure Foundry and Google Vertex support.
13+
**AWS Bedrock** is the GA implementation today. **Google Vertex AI** and **Azure AI Foundry** support is on the roadmap. Approved internal gateways are evaluated on a case-by-case basis with your Warp account team.
1414

1515
BYOLLM applies to interactive Oz agents in the terminal. Oz cloud agents do not yet support BYOLLM routing.
1616
:::
@@ -19,9 +19,29 @@ BYOLLM applies to interactive Oz agents in the terminal. Oz cloud agents do not
1919
BYOLLM is only available on Warp's Enterprise plan. Contact [warp.dev/contact-sales](https://warp.dev/contact-sales) to learn more.
2020
:::
2121

22+
## How BYOLLM differs from BYOK and Custom inference endpoint
23+
24+
Warp offers three ways to bring your own inference into the product. BYOLLM is one of them, and it serves a different use case than the others.
25+
26+
| Name | Meaning | Plans |
27+
| --- | --- | --- |
28+
| Bring your own API key (BYOK) | User-level API keys for OpenAI, Anthropic, or Google. Each user configures their own key locally; Warp uses it to call the provider directly. | Free, Build, Max (orgs with 10 or fewer employees); Business or Enterprise required for larger orgs |
29+
| Custom inference endpoint (CIE) | User-level OpenAI-compatible endpoint such as OpenRouter, LiteLLM, z.ai, or an internal gateway. Each user configures the endpoint locally. | Free, Build, Max (orgs with 10 or fewer employees); Business or Enterprise required for larger orgs |
30+
| Bring your own LLM (BYOLLM) | Enterprise-only managed inference infrastructure: cloud-provider Model-as-a-Service (Bedrock, Vertex, Foundry) or approved internal gateways. Warp manages routing, orchestration, governance, and observability for the whole team. | Enterprise |
31+
32+
:::note
33+
BYOK and custom inference endpoint support are available for individual users and organizations with 10 or fewer employees, subject to Warp's Terms of Service. Companies or organizations with more than 10 employees require a Warp Business or Enterprise plan to use these features.
34+
:::
35+
36+
Use [Bring your own API key](/support-and-community/plans-and-billing/bring-your-own-api-key/) or [Custom inference endpoint](/support-and-community/plans-and-billing/custom-inference-endpoint/) when an individual developer wants to authenticate to a provider with their own key or endpoint. Use BYOLLM when an organization wants Warp to manage inference routing across approved providers for the whole team.
37+
38+
:::note
39+
Centrally configured BYOK and Custom inference endpoint for Enterprise — where admins approve providers or endpoints for the entire organization through the Admin Panel — are a fast-follow after launch, not at launch. Until then, BYOK and CIE remain user-level configurations, and BYOLLM remains the path for admin-managed inference infrastructure.
40+
:::
41+
2242
## Key features
2343

24-
* **Cloud-native credentials** - Authenticate using each user’s AWS IAM identity. Warp does not store API keys.
44+
* **Cloud-native credentials** - Authenticate using each user's cloud-native identity (AWS IAM today; Google Cloud and Azure identities on the roadmap). Warp does not store API keys.
2545
* **Admin-enforced routing** - Team admins configure which models are available to users in AWS Bedrock, with the ability to disable non-Bedrock model access entirely.
2646
* **Consolidated billing** - Inference costs are billed directly to your AWS account, leveraging existing cloud commitments.
2747

@@ -134,6 +154,10 @@ When a request routes through BYOLLM:
134154
* **Warp does not consume credits** for that request.
135155
* Your cloud provider account receives the inference costs directly.
136156

157+
:::note
158+
BYOLLM-routed local agent runs on Enterprise still consume platform credits for Warp's platform infrastructure (run orchestration, observability, integrations). Inference costs are billed directly to your cloud provider account. See [platform credits](/support-and-community/plans-and-billing/platform-credits/) for the full breakdown.
159+
:::
160+
137161
### Routing behavior
138162

139163
Warp's agents automatically select the best model for your task while respecting your admin's routing policies. If you configure a model for BYOLLM, requests for that model route to AWS Bedrock.
@@ -187,18 +211,9 @@ However, when using BYOLLM:
187211

188212
## FAQ
189213

190-
### How is BYOLLM different from BYOK?
191-
192-
**BYOK (Bring Your Own Key)** lets individual users add their own API keys for direct model provider access (e.g., Anthropic, OpenAI, Google). Warp stores keys locally on the user's device.
214+
### How is BYOLLM different from BYOK and Custom inference endpoint?
193215

194-
**BYOLLM (Bring Your Own LLM)** routes inference through your organization's cloud infrastructure (AWS Bedrock) using cloud-native IAM. Admins configure it at the admin level and it applies to the entire team.
195-
196-
| Feature | BYOK | BYOLLM |
197-
| --- | --- | --- |
198-
| Configuration level | User | Admin/Team |
199-
| Authentication | API keys (local) | Cloud IAM (per-user) |
200-
| Billing | Direct to provider | Your cloud account |
201-
| Data locality | Provider infrastructure | Your cloud infrastructure |
216+
See [How BYOLLM differs from BYOK and Custom inference endpoint](#how-byollm-differs-from-byok-and-custom-inference-endpoint) at the top of this page for a comparison and plan-availability details. In short: BYOK and CIE are user-level configurations available to individual users and orgs with 10 or fewer employees on Free, Build, and Max, and to all users on Business and Enterprise. BYOLLM is Enterprise-only managed inference infrastructure where Warp routes the whole team's traffic through providers your admins have approved.
202217

203218
### Does BYOLLM work with Auto?
204219

@@ -222,7 +237,9 @@ Yes. Admins can configure routing policies to require specific models to use BYO
222237

223238
## Related resources
224239

225-
* [Bring Your Own API Key](/support-and-community/plans-and-billing/bring-your-own-api-key/)
240+
* [Bring Your Own API Key](/support-and-community/plans-and-billing/bring-your-own-api-key/) — User-level keys for OpenAI, Anthropic, and Google
241+
* [Custom inference endpoint](/support-and-community/plans-and-billing/custom-inference-endpoint/) — Connect an OpenAI-compatible endpoint such as OpenRouter, LiteLLM, z.ai, or an internal gateway
242+
* [platform credits](/support-and-community/plans-and-billing/platform-credits/) — Warp's platform-infrastructure credit bucket
226243
* [Model Choice](/agent-platform/capabilities/model-choice/) — Full list of supported models
227244
* [Admin Panel](/enterprise/team-management/admin-panel/) — Configure team settings
228245
* [Contact Sales](https://warp.dev/contact-sales) — Get help with enterprise setup

src/content/docs/enterprise/support-and-resources/billing.mdx

Lines changed: 6 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -72,6 +72,10 @@ Enterprise administrators can set monthly spending limits across the following f
7272

7373
Spending is tracked across all payment types (Add-on Credits, pay-as-you-go usage) so limits apply consistently regardless of how usage is funded.
7474

75+
:::note
76+
Team-wide spending limits (cloud, local, and total) are also available on Warp's self-serve paid plans through admin-managed Reload settings. **Per-user spending limits are Enterprise-only.** For deeper visibility into how individual users consume credits, see the [Enterprise Analytics API](/enterprise/enterprise-features/analytics-api/).
77+
:::
78+
7579
#### Monthly spend alerts
7680

7781
Warp sends alerts to administrators as team usage approaches each configured spending limit, so you can adjust caps, purchase more credits, or communicate with your team before agent usage is blocked at the cap.
@@ -84,6 +88,8 @@ For enterprises with credit pools, administrators receive alerts as the team cre
8488

8589
* [Credits](/support-and-community/plans-and-billing/credits/) - How credits are calculated and consumed
8690
* [Add-on Credits](/support-and-community/plans-and-billing/add-on-credits/) - Purchase additional credits and configure auto-reload
91+
* [platform credits](/support-and-community/plans-and-billing/platform-credits/) - The third credit bucket alongside AI credits and compute credits, covering Warp's platform infrastructure
8792
* [Pricing FAQs](/support-and-community/plans-and-billing/pricing-faqs/) - Common billing questions
8893
* [Bring Your Own LLM](/enterprise/enterprise-features/bring-your-own-llm/) - BYOLLM billing and configuration
94+
* [Enterprise Analytics API](/enterprise/enterprise-features/analytics-api/) - Programmatic access to team usage and spend data
8995
* [Admin Panel](/enterprise/team-management/admin-panel/) - Configure spending limits and billing settings

src/content/docs/support-and-community/plans-and-billing/index.mdx

Lines changed: 1 addition & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -11,5 +11,6 @@ Warp offers flexible plans designed for individual developers, teams, and enterp
1111
* [**Credits**](/support-and-community/plans-and-billing/credits/) - How credits are used and calculated across AI features
1212
* [**Add-on Credits**](/support-and-community/plans-and-billing/add-on-credits/) - Purchase additional credits or enable automatic reloads
1313
* [**Bring Your Own API Key**](/support-and-community/plans-and-billing/bring-your-own-api-key/) - Connect your own model provider API keys
14+
* [**Custom inference endpoint**](/support-and-community/plans-and-billing/custom-inference-endpoint/) - Connect an OpenAI-compatible endpoint such as OpenRouter, LiteLLM, z.ai, or an internal gateway
1415
* [**Overages (Legacy)**](/support-and-community/plans-and-billing/overages-legacy/) - Information for users on legacy plans with overages
1516
* [**Pricing FAQs**](/support-and-community/plans-and-billing/pricing-faqs/) - Answers to common questions about plans and billing

src/content/docs/support-and-community/plans-and-billing/plans-pricing-refunds.mdx

Lines changed: 17 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -21,9 +21,26 @@ Visit [warp.dev/pricing](https://warp.dev/pricing) to see the latest plans and w
2121
* [Credits](/support-and-community/plans-and-billing/credits/) — learn how credits are used and calculated across AI features.
2222
* [Add-on Credits](/support-and-community/plans-and-billing/add-on-credits/) — purchase additional credits or enable automatic reloads at discounted rates.
2323
* [Bring Your Own API Key](/support-and-community/plans-and-billing/bring-your-own-api-key/) — connect your own model provider API keys for custom usage and billing.
24+
* [Custom inference endpoint](/support-and-community/plans-and-billing/custom-inference-endpoint/) — connect an OpenAI-compatible endpoint such as OpenRouter, LiteLLM, z.ai, or an internal gateway.
2425
* [Overages (Legacy)](/support-and-community/plans-and-billing/overages-legacy/) — information for users on legacy plans with overages enabled.
2526
* [Pricing FAQs](/support-and-community/plans-and-billing/pricing-faqs/) — answers to common questions about plans, billing, and usage. Don’t have Warp yet? [Download Warp](https://warp.dev/download) and get started for free today.
2627

28+
### May 2026 plan summary
29+
30+
Below is a high-level summary of Warp's plans as of May 14, 2026. Visit [warp.dev/pricing](https://www.warp.dev/pricing) for current monthly credit allowances and seat pricing.
31+
32+
* **Free** — Up to 10 team members. For developers exploring Warp. Includes core terminal features and a monthly credit allowance for trying Warp's agents. Supports [Bring Your Own API Key](/support-and-community/plans-and-billing/bring-your-own-api-key/) and [custom inference endpoints](/support-and-community/plans-and-billing/custom-inference-endpoint/) so you can keep working with your own provider after the included allowance is used.
33+
* **Build** — Up to 10 team members. For developers using Warp's agents as a daily driver. Includes a higher monthly credit allowance than Free, [Add-on Credits](/support-and-community/plans-and-billing/add-on-credits/) with auto-reload, and the same BYOK and custom inference endpoint support as Free.
34+
* **Max** — Up to 10 team members. For heavy users. Includes everything in Build with a higher monthly credit allowance and a better effective Reload rate.
35+
* **Business** — Up to 25 team members. For small-to-midsize teams. Includes everything in Max plus team-wide collaboration features, Reload credits with admin-managed auto-reload and team-wide spend caps, SAML-based SSO, and admin-configurable data controls. Platform credits apply to cloud agent runs and to local runs that use customer-supplied inference (BYOK or CIE).
36+
* **Enterprise** — Unlimited team members (custom contract). For organizations with advanced security, compliance, or scale needs. Includes everything in Business plus [Bring your own LLM](/enterprise/enterprise-features/bring-your-own-llm/), the [Enterprise Analytics API](/enterprise/enterprise-features/analytics-api/), per-user spend limits, advanced admin controls, and Implementation Engineer Support (a structured multi-week implementation program with hands-on guidance from Warp engineers to help your team deploy production Oz Cloud Agent use cases). Platform credits apply to all cloud agent runs and to local runs using BYOLLM, BYOK, or CIE.
37+
38+
:::note
39+
BYOK and custom inference endpoint support are available for individual users and organizations with 10 or fewer employees, subject to Warp's Terms of Service. Companies or organizations with more than 10 employees require a Warp Business or Enterprise plan to use these features.
40+
:::
41+
42+
Model provider Zero Data Retention (ZDR) applies across all plans through Warp's contracted LLM providers. See [Pricing FAQs](/support-and-community/plans-and-billing/pricing-faqs/) for details on data controls.
43+
2744
### Warp’s refund policies
2845

2946
Please review the details of our refund policies below. To request a refund, email [**billing@warp.dev**](mailto:billing@warp.dev) with information about your situation — the more context you provide, the faster we can resolve your request.
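
Reviewer aid, not part of the commit: the plan-availability rules this change documents (BYOK and CIE on Free, Build, and Max only for individuals and orgs with 10 or fewer employees; BYOK and CIE for all users on Business and Enterprise; BYOLLM on Enterprise only) can be sketched as a small lookup. This is a minimal illustration of the documented rule, not Warp's implementation. The function name `inference_options`, the constants, and the string labels are hypothetical.

```python
# Hypothetical sketch of the plan-availability rule described in the docs diff.
# Names and labels are illustrative assumptions, not a Warp API.

SMALL_ORG_PLANS = {"Free", "Build", "Max"}   # BYOK/CIE only for small orgs
ANY_SIZE_PLANS = {"Business", "Enterprise"}  # BYOK/CIE for all users
SMALL_ORG_MAX_EMPLOYEES = 10                 # "10 or fewer employees"


def inference_options(plan: str, employees: int) -> set[str]:
    """Return the customer-supplied inference options the docs describe.

    An individual user can be modeled as employees=1.
    """
    if plan not in SMALL_ORG_PLANS | ANY_SIZE_PLANS:
        raise ValueError(f"unknown plan: {plan!r}")

    options: set[str] = set()
    if plan in ANY_SIZE_PLANS or employees <= SMALL_ORG_MAX_EMPLOYEES:
        # User-level configurations: own API key or OpenAI-compatible endpoint.
        options |= {"BYOK", "CIE"}
    if plan == "Enterprise":
        # Admin-managed inference infrastructure is Enterprise-only.
        options.add("BYOLLM")
    return options
```

For example, `inference_options("Max", 11)` yields an empty set, which matches the note that organizations with more than 10 employees need a Business or Enterprise plan to use BYOK or CIE.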
