fix(docs): correct VPC endpoint cost to ~$102/mo and clarify session timeouts
VPC endpoint cost was ~$50/mo (1 AZ math), actual is ~$102/mo
(7 endpoints x 2 AZs x $0.01/hr x 730 hrs). Update baseline totals
from ~$85-95 to ~$140-150 in COST_MODEL.md and DEPLOYMENT_GUIDE.md.
Clarify the two distinct timeout limits: AgentCore 8-hour service
limit vs orchestrator 9-hour executionTimeout.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
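
The endpoint arithmetic in the commit message can be checked with a few lines of Python. This is a minimal sketch, assuming the $0.01/hr per-endpoint-per-AZ rate and the 730-hour month quoted above; the function name is hypothetical, and per-GB data processing charges are ignored.

```python
def interface_endpoint_monthly_cost(endpoints: int, azs: int,
                                    hourly_rate_usd: float = 0.01,
                                    hours_per_month: int = 730) -> float:
    """Fixed monthly cost of interface endpoints, billed per endpoint per AZ."""
    return round(endpoints * azs * hourly_rate_usd * hours_per_month, 2)


single_az = interface_endpoint_monthly_cost(7, 1)  # the earlier 1-AZ math
two_az = interface_endpoint_monthly_cost(7, 2)     # the corrected 2-AZ math
print(single_az, two_az)  # 51.1 102.2
```

The 1-AZ result matches the old ~$50/mo figure and the 2-AZ result matches the corrected ~$102/mo figure.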
| API Gateway (idle) | ~$0/month | Pay-per-request. |
- | **Total baseline** | **~$85–95/month** | |
+ | **Total baseline** | **~$140–150/month** | |

### Scale-to-zero characteristics
- Most platform components are fully serverless and incur zero cost when idle: DynamoDB (PAY_PER_REQUEST), Lambda, API Gateway, ECS Fargate (cluster is free, when enabled), AgentCore Runtime (per-session), Bedrock (per-token), and Cognito (free tier). The always-on cost floor (~$85–95/month) is dominated by VPC networking infrastructure (NAT Gateway + 7 interface endpoints) which is required for private subnet connectivity to AWS services and GitHub. See the [Deployment guide](../guides/DEPLOYMENT_GUIDE.md) for the full scale-to-zero breakdown.
+ Most platform components are fully serverless and incur zero cost when idle: DynamoDB (PAY_PER_REQUEST), Lambda, API Gateway, ECS Fargate (cluster is free, when enabled), AgentCore Runtime (per-session), Bedrock (per-token), and Cognito (free tier). The always-on cost floor (~$140–150/month) is dominated by VPC networking infrastructure (NAT Gateway + 7 interface endpoints across 2 AZs) which is required for private subnet connectivity to AWS services and GitHub. See the [Deployment guide](../guides/DEPLOYMENT_GUIDE.md) for the full scale-to-zero breakdown.

## Per-task variable costs
@@ -47,16 +47,16 @@ Assuming a typical task: 1–2 hours, Claude Sonnet, ~100K input tokens, ~20K ou
| Model choice | 5–10× between Haiku and Opus | Default to Claude Sonnet; allow per-repo override. |
| Turn count | Linear with turns | `max_turns` cap (default 100, configurable 1–500). |
| Cost budget | Hard stop at budget | `max_budget_usd` cap (configurable $0.01–$100). Agent stops when budget is reached regardless of remaining turns. |
- | Task duration | Sub-linear (compute is cheap; tokens dominate) | 8-hour max session timeout. |
+ | Task duration | Sub-linear (compute is cheap; tokens dominate) | AgentCore: 8-hour service limit; orchestrator: 9-hour `executionTimeout`. |
| Prompt caching | 50–90% token cost reduction | Enable by default; cache system prompts and repo context. |
| Concurrency | Linear with parallel tasks | Per-user and system-wide concurrency limits. |
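
The turn, budget, and duration rows above can be read as a small set of guard checks. The following is a rough sketch only: `max_turns`, `max_budget_usd`, and the 8-hour and 9-hour figures come from the table, while `TaskLimits`, `should_stop`, the default budget value, and the rule that the shorter timeout binds for an AgentCore-backed task are assumptions made for illustration.

```python
from dataclasses import dataclass

AGENTCORE_SERVICE_LIMIT_HOURS = 8         # service limit quoted in the table
ORCHESTRATOR_EXECUTION_TIMEOUT_HOURS = 9  # executionTimeout quoted in the table


@dataclass(frozen=True)
class TaskLimits:
    """Hypothetical container for the per-task caps described in the table."""
    max_turns: int = 100
    max_budget_usd: float = 100.0  # assumed default; the source gives only the range

    def __post_init__(self) -> None:
        if not 1 <= self.max_turns <= 500:
            raise ValueError("max_turns must be between 1 and 500")
        if not 0.01 <= self.max_budget_usd <= 100:
            raise ValueError("max_budget_usd must be between 0.01 and 100")


def should_stop(limits: TaskLimits, turns_used: int, spend_usd: float) -> bool:
    """Budget is a hard stop regardless of remaining turns; turns are a second cap."""
    return spend_usd >= limits.max_budget_usd or turns_used >= limits.max_turns


def effective_session_limit_hours() -> int:
    """Assumption: whichever timeout is shorter ends the session first."""
    return min(AGENTCORE_SERVICE_LIMIT_HOURS, ORCHESTRATOR_EXECUTION_TIMEOUT_HOURS)
```

For example, `should_stop(TaskLimits(max_budget_usd=5.0), turns_used=10, spend_usd=5.0)` is true even though 90 turns remain.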
Both backends are orchestrated by the same durable Lambda function. The `ComputeStrategy` interface abstracts `startSession()`, `pollSession()`, and `stopSession()` -- the ECS strategy calls `ecs:RunTask` / `ecs:DescribeTasks` / `ecs:StopTask` directly from the Lambda. No Step Functions are used.
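
One way to picture the `ComputeStrategy` abstraction is as a three-method protocol driven by a polling loop. The sketch below is illustrative and not the project's code: the method names are Python spellings of `startSession()`, `pollSession()`, and `stopSession()`, while the signatures, the status strings, `FakeEcsStrategy`, and `run_task` are all hypothetical. A real ECS strategy would call `ecs:RunTask`, `ecs:DescribeTasks`, and `ecs:StopTask` where the fake uses in-memory state.

```python
from typing import Protocol


class ComputeStrategy(Protocol):
    """Assumed shape of the interface; the real signatures are not shown in the source."""
    def start_session(self, task_id: str) -> str: ...
    def poll_session(self, session_id: str) -> str: ...
    def stop_session(self, session_id: str) -> None: ...


class FakeEcsStrategy:
    """In-memory stand-in for an ECS-backed strategy (no AWS calls are made)."""

    def __init__(self, polls_until_done: int = 2) -> None:
        self._polls_until_done = polls_until_done
        self._remaining: dict[str, int] = {}
        self.stopped: list[str] = []

    def start_session(self, task_id: str) -> str:
        session_id = f"session-{task_id}"
        self._remaining[session_id] = self._polls_until_done
        return session_id

    def poll_session(self, session_id: str) -> str:
        if self._remaining[session_id] > 0:
            self._remaining[session_id] -= 1
            return "RUNNING"
        return "STOPPED"

    def stop_session(self, session_id: str) -> None:
        self.stopped.append(session_id)


def run_task(strategy: ComputeStrategy, task_id: str, max_polls: int = 10) -> str:
    """Start a session, poll until it leaves RUNNING, and stop it if polls run out."""
    session_id = strategy.start_session(task_id)
    for _ in range(max_polls):
        status = strategy.poll_session(session_id)
        if status != "RUNNING":
            return status
    strategy.stop_session(session_id)
    return "TIMED_OUT"
```

Because the orchestrating loop depends only on the protocol, an AgentCore-backed strategy could be swapped in without changing `run_task`.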
@@ -39,15 +39,15 @@ ECS Fargate is currently **opt-in** -- the `EcsAgentCluster` construct is presen
| API Gateway (idle) | ~$0/month | Pay-per-request. |
- | **Total baseline** | **~$85–95/month** | |
+ | **Total baseline** | **~$140–150/month** | |

### Scale-to-zero characteristics
- Most platform components are fully serverless and incur zero cost when idle: DynamoDB (PAY_PER_REQUEST), Lambda, API Gateway, ECS Fargate (cluster is free, when enabled), AgentCore Runtime (per-session), Bedrock (per-token), and Cognito (free tier). The always-on cost floor (~$85–95/month) is dominated by VPC networking infrastructure (NAT Gateway + 7 interface endpoints) which is required for private subnet connectivity to AWS services and GitHub. See the [Deployment guide](/getting-started/deployment-guide) for the full scale-to-zero breakdown.
+ Most platform components are fully serverless and incur zero cost when idle: DynamoDB (PAY_PER_REQUEST), Lambda, API Gateway, ECS Fargate (cluster is free, when enabled), AgentCore Runtime (per-session), Bedrock (per-token), and Cognito (free tier). The always-on cost floor (~$140–150/month) is dominated by VPC networking infrastructure (NAT Gateway + 7 interface endpoints across 2 AZs) which is required for private subnet connectivity to AWS services and GitHub. See the [Deployment guide](/getting-started/deployment-guide) for the full scale-to-zero breakdown.

## Per-task variable costs
@@ -51,16 +51,16 @@ Assuming a typical task: 1–2 hours, Claude Sonnet, ~100K input tokens, ~20K ou
| Model choice | 5–10× between Haiku and Opus | Default to Claude Sonnet; allow per-repo override. |
| Turn count | Linear with turns | `max_turns` cap (default 100, configurable 1–500). |
| Cost budget | Hard stop at budget | `max_budget_usd` cap (configurable $0.01–$100). Agent stops when budget is reached regardless of remaining turns. |
- | Task duration | Sub-linear (compute is cheap; tokens dominate) | 8-hour max session timeout. |
+ | Task duration | Sub-linear (compute is cheap; tokens dominate) | AgentCore: 8-hour service limit; orchestrator: 9-hour `executionTimeout`. |
| Prompt caching | 50–90% token cost reduction | Enable by default; cache system prompts and repo context. |
| Concurrency | Linear with parallel tasks | Per-user and system-wide concurrency limits. |
Both backends are orchestrated by the same durable Lambda function. The `ComputeStrategy` interface abstracts `startSession()`, `pollSession()`, and `stopSession()` -- the ECS strategy calls `ecs:RunTask` / `ecs:DescribeTasks` / `ecs:StopTask` directly from the Lambda. No Step Functions are used.
@@ -43,15 +43,15 @@ ECS Fargate is currently **opt-in** -- the `EcsAgentCluster` construct is presen