Merged
14 commits
2 changes: 1 addition & 1 deletion release-notes.mdx
@@ -261,7 +261,7 @@ Flash now supports deploying endpoints to [multiple datacenters](/flash/configur
- **Self-service worker upgrade**: Rebuild and roll workers from the dashboard without support tickets.
- **Edit template from endpoint page**: Inline edit and redeploy the underlying template directly from the endpoint view.
- **Improved Serverless metrics page**: Refinements to charts and filters for quicker root-cause analysis.
-- [Flex and active workers](/serverless/pricing): Discounted always-on "active" capacity for baseline load with on-demand "flex" workers for bursts.
+- [Flex and active workers](/serverless/pricing): Always-on "active" workers for baseline load with on-demand "flex" workers for bursts.
- **Billing explorer**: Inspect costs by resource, region, and time to identify optimization opportunities.

</Update>
2 changes: 1 addition & 1 deletion serverless/development/optimization.mdx
@@ -52,7 +52,7 @@ For private models, [embed them in your Docker image](/serverless/workers/create

### Maintain active workers

-Set [active workers](/serverless/endpoints/endpoint-configurations#active-workers) > 0 to eliminate cold starts entirely. Active workers cost up to 30% less than flex workers.
+Set [active workers](/serverless/endpoints/endpoint-configurations#active-workers) > 0 to eliminate cold starts entirely.

**Formula**: `Active workers = (Requests/min × Request duration in seconds) / 60`

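The sizing formula above can be sketched as a small helper. This is a minimal illustration, not part of the platform's API; the function name `recommended_active_workers` is a hypothetical choice, and rounding up is an assumption so that capacity covers the steady-state load.

```python
import math

def recommended_active_workers(requests_per_min: float, request_duration_s: float) -> int:
    # Formula from the docs: Active workers = (Requests/min × Request duration in seconds) / 60
    # Round up (assumption) so the worker count covers the steady-state load.
    return math.ceil(requests_per_min * request_duration_s / 60)

# 120 requests/min, each taking 2.5 seconds, needs 5 always-warm workers.
print(recommended_active_workers(120, 2.5))  # → 5
```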
2 changes: 1 addition & 1 deletion serverless/endpoints/endpoint-configurations.mdx
@@ -55,7 +55,7 @@ For endpoints with fewer than five workers, all workers use the highest-priority

### Active workers

-Minimum number of workers that remain warm and ready at all times. Setting this to 1+ eliminates cold starts. Active workers incur charges when idle but receive a 20-30% discount.
+Minimum number of workers that remain warm and ready at all times. Setting this to 1+ eliminates cold starts. Active workers incur charges continuously, including when idle.

### Max workers

2 changes: 1 addition & 1 deletion serverless/pricing.mdx
@@ -20,7 +20,7 @@ Serverless offers pay-per-second pricing with no upfront costs. You're billed fr
| | Flex workers | Active workers |
|---|--------------|----------------|
| **Behavior** | Scale to zero when idle | Always running (24/7) |
-| **Pricing** | Standard per-second rate | 20–30% discount |
+| **Pricing** | Standard per-second rate | Discounts available through sales inquiry |
| **Best for** | Variable workloads, cost optimization | Consistent traffic, low-latency requirements |

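The flex-vs-active trade-off in the table comes down to billed seconds: flex workers bill only while running, active workers bill around the clock. A minimal sketch of that comparison, where the per-second rates and the 60% utilization figure are purely hypothetical placeholders (real rates are in the GPU pricing table, and any active-worker discount is arranged through sales):

```python
def monthly_cost(rate_per_s: float, busy_fraction: float, always_on: bool) -> float:
    # Flex workers bill only for busy seconds; active workers bill 24/7.
    seconds_per_month = 30 * 24 * 3600
    billed = seconds_per_month if always_on else seconds_per_month * busy_fraction
    return rate_per_s * billed

# Hypothetical illustrative rates only, NOT actual prices:
flex_rate = 0.00040    # per second, standard rate
active_rate = 0.00030  # per second, assuming a negotiated discount

flex = monthly_cost(flex_rate, busy_fraction=0.6, always_on=False)
active = monthly_cost(active_rate, busy_fraction=0.6, always_on=True)
print(f"flex: ${flex:.2f}/mo, active: ${active:.2f}/mo")
```

Under these assumed numbers, flex is cheaper at 60% utilization despite the higher rate; the break-even point shifts toward active workers as utilization approaches 100%.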
## GPU pricing
2 changes: 1 addition & 1 deletion serverless/workers/overview.mdx
@@ -39,7 +39,7 @@ To deploy workers with AI/ML models, follow this order of preference:

Workers can run in two modes depending on your latency and cost requirements:

-- **Active workers** run continuously (24/7) and are always ready to process requests instantly. They eliminate cold starts entirely and receive a discounted rate, making them ideal for latency-sensitive or high-traffic applications.
+- **Active workers** run continuously (24/7) and are always ready to process requests instantly. They eliminate cold starts entirely, making them ideal for latency-sensitive or high-traffic applications.

- **Flex workers** scale dynamically based on demand, spinning down to zero when idle. They incur cold starts when scaling up but cost nothing when not in use, making them ideal for variable or sporadic workloads.
