Commit abe2a90

fix(docs): correct VPC endpoint cost to ~$102/mo and clarify session timeouts

VPC endpoint cost was ~$50/mo (1 AZ math), actual is ~$102/mo (7 endpoints x 2 AZs x $0.01/hr x 730 hrs). Update baseline totals from ~$85-95 to ~$140-150 in COST_MODEL.md and DEPLOYMENT_GUIDE.md. Clarify the two distinct timeout limits: AgentCore 8-hour service limit vs orchestrator 9-hour executionTimeout.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 6df700d commit abe2a90
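
The arithmetic in the commit message can be checked with a short sketch. The rate, endpoint count, AZ count, and 730-hour month come from the message itself; the function and constant names are illustrative only and do not appear in the repository.

```python
# Worked arithmetic for the VPC interface endpoint estimate.
# All figures are taken from the commit message; names are hypothetical.
HOURLY_RATE_USD = 0.01   # per endpoint, per AZ
ENDPOINTS = 7
HOURS_PER_MONTH = 730


def endpoint_monthly_cost(endpoints: int, azs: int) -> float:
    """Monthly fixed charge: endpoints x AZs x hourly rate x hours."""
    return endpoints * azs * HOURLY_RATE_USD * HOURS_PER_MONTH


# The old figure matches a single-AZ calculation; the corrected one uses 2 AZs.
old_estimate = round(endpoint_monthly_cost(ENDPOINTS, azs=1), 2)
new_estimate = round(endpoint_monthly_cost(ENDPOINTS, azs=2), 2)

print(f"1 AZ:  ~${old_estimate}/month")
print(f"2 AZs: ~${new_estimate}/month")
```

The single-AZ result lines up with the previous ~$50 figure, and the two-AZ result with the corrected ~$102 figure.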

4 files changed

Lines changed: 22 additions & 22 deletions

docs/design/COST_MODEL.md

Lines changed: 6 additions & 6 deletions
@@ -11,16 +11,16 @@ These costs are incurred regardless of task volume:
 | Component | Estimated cost | Notes |
 |---|---|---|
 | NAT Gateway (1×) | ~$32/month | Fixed hourly cost + data processing. Single AZ (see [COMPUTE.md - Network architecture](./COMPUTE.md)). |
-| VPC Interface Endpoints (7×) | ~$50/month | $0.01/hr per endpoint per AZ. |
+| VPC Interface Endpoints (7×, 2 AZs) | ~$102/month | $0.01/hr × 7 endpoints × 2 AZs × 730 hrs. |
 | VPC Flow Logs | ~$3/month | CloudWatch ingestion. |
 | DynamoDB (on-demand, idle) | ~$0/month | Pay-per-request; no cost when idle. |
 | CloudWatch Logs retention | ~$1–5/month | Depends on log volume. 90-day retention. |
 | API Gateway (idle) | ~$0/month | Pay-per-request. |
-| **Total baseline** | **~$85–95/month** | |
+| **Total baseline** | **~$140–150/month** | |

 ### Scale-to-zero characteristics

-Most platform components are fully serverless and incur zero cost when idle: DynamoDB (PAY_PER_REQUEST), Lambda, API Gateway, ECS Fargate (cluster is free, when enabled), AgentCore Runtime (per-session), Bedrock (per-token), and Cognito (free tier). The always-on cost floor (~$85–95/month) is dominated by VPC networking infrastructure (NAT Gateway + 7 interface endpoints) which is required for private subnet connectivity to AWS services and GitHub. See the [Deployment guide](../guides/DEPLOYMENT_GUIDE.md) for the full scale-to-zero breakdown.
+Most platform components are fully serverless and incur zero cost when idle: DynamoDB (PAY_PER_REQUEST), Lambda, API Gateway, ECS Fargate (cluster is free, when enabled), AgentCore Runtime (per-session), Bedrock (per-token), and Cognito (free tier). The always-on cost floor (~$140–150/month) is dominated by VPC networking infrastructure (NAT Gateway + 7 interface endpoints across 2 AZs) which is required for private subnet connectivity to AWS services and GitHub. See the [Deployment guide](../guides/DEPLOYMENT_GUIDE.md) for the full scale-to-zero breakdown.

 ## Per-task variable costs

@@ -47,16 +47,16 @@ Assuming a typical task: 1–2 hours, Claude Sonnet, ~100K input tokens, ~20K ou
 | Model choice | 5–10× between Haiku and Opus | Default to Claude Sonnet; allow per-repo override. |
 | Turn count | Linear with turns | `max_turns` cap (default 100, configurable 1–500). |
 | Cost budget | Hard stop at budget | `max_budget_usd` cap (configurable $0.01–$100). Agent stops when budget is reached regardless of remaining turns. |
-| Task duration | Sub-linear (compute is cheap; tokens dominate) | 8-hour max session timeout. |
+| Task duration | Sub-linear (compute is cheap; tokens dominate) | AgentCore: 8-hour service limit; orchestrator: 9-hour `executionTimeout`. |
 | Prompt caching | 50–90% token cost reduction | Enable by default; cache system prompts and repo context. |
 | Concurrency | Linear with parallel tasks | Per-user and system-wide concurrency limits. |

 ## Cost at scale

 | Scale | Tasks/month | Estimated monthly cost (infra + tasks) |
 |---|---|---|
-| Low (1 developer) | 30–60 | $150–500 |
-| Medium (small team) | 200–500 | $500–3,000 |
+| Low (1 developer) | 30–60 | $200–550 |
+| Medium (small team) | 200–500 | $550–3,000 |
 | High (org-wide) | 2,000–5,000 | $5,000–30,000 |

 These estimates assume Claude Sonnet with prompt caching enabled and average task complexity.

docs/guides/DEPLOYMENT_GUIDE.md

Lines changed: 5 additions & 5 deletions
@@ -13,7 +13,7 @@ ABCA deploys as a **single CDK stack** (`backgroundagent-dev`) containing all pl
 | **Orchestration** | Durable Lambda (checkpoint/replay) | Same durable Lambda via `ComputeStrategy` |
 | **Agent mode** | FastAPI server (HTTP invocation) | Batch (run-to-completion) |
 | **Startup** | ~10s (warm MicroVM) | ~60-180s (Fargate cold start) |
-| **Max duration** | 8 hours (AgentCore session) | Limited by orchestrator timeout (9 hours) |
+| **Max duration** | 8 hours (AgentCore service limit) | 9 hours (orchestrator `executionTimeout`) |

 Both backends are orchestrated by the same durable Lambda function. The `ComputeStrategy` interface abstracts `startSession()`, `pollSession()`, and `stopSession()` -- the ECS strategy calls `ecs:RunTask` / `ecs:DescribeTasks` / `ecs:StopTask` directly from the Lambda. No Step Functions are used.
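
One way to read the two limits in the **Max duration** row is as a per-backend cap. The sketch below assumes the AgentCore path is bounded by whichever limit is hit first, while the ECS path is bounded only by the orchestrator timeout; that interpretation, the backend labels, and the function name are assumptions for illustration, not code from the repository.

```python
# Sketch of the two timeout limits named in the Max duration row.
# Backend labels and the min() interpretation are assumptions.
AGENTCORE_SERVICE_LIMIT_HOURS = 8      # AgentCore service limit
ORCHESTRATOR_EXECUTION_TIMEOUT_HOURS = 9  # orchestrator executionTimeout


def effective_max_duration_hours(backend: str) -> int:
    """Return the assumed upper bound on task duration for a backend."""
    if backend == "agentcore":
        # Whichever limit is reached first ends the task.
        return min(AGENTCORE_SERVICE_LIMIT_HOURS,
                   ORCHESTRATOR_EXECUTION_TIMEOUT_HOURS)
    if backend == "ecs":
        # No AgentCore session is involved, so only the orchestrator applies.
        return ORCHESTRATOR_EXECUTION_TIMEOUT_HOURS
    raise ValueError(f"unknown backend: {backend!r}")


for name in ("agentcore", "ecs"):
    print(name, effective_max_duration_hours(name), "hours")
```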

@@ -39,15 +39,15 @@ ECS Fargate is currently **opt-in** -- the `EcsAgentCluster` construct is presen
 | Component | Est. Monthly Idle Cost | Why |
 |-----------|----------------------|-----|
 | NAT Gateway (1x) | ~$32 | $0.045/hr fixed charge |
-| VPC Interface Endpoints (7x, 2 AZs) | ~$50 | $0.01/hr per endpoint per AZ |
+| VPC Interface Endpoints (7x, 2 AZs) | ~$102 | $0.01/hr × 7 endpoints × 2 AZs × 730 hrs |
 | WAF v2 Web ACL | ~$5 | Base monthly charge |
 | CloudWatch Dashboard | ~$3 | Per-dashboard charge |
 | Secrets Manager (1+ secrets) | ~$0.40/secret | Per-secret monthly |
 | CloudWatch Alarms | ~$0.10/alarm | Per standard alarm |
 | CloudWatch Logs retention | ~$1-5 | Storage for retained logs |
-| **Total always-on baseline** | **~$85-95/month** | |
+| **Total always-on baseline** | **~$140-150/month** | |

-The dominant idle cost is VPC networking: 7 interface endpoints (~$50/month) plus the NAT Gateway (~$32/month).
+The dominant idle cost is VPC networking: 7 interface endpoints across 2 AZs (~$102/month) plus the NAT Gateway (~$32/month).

 For the full cost model including per-task costs, see [COST_MODEL.md](../design/COST_MODEL.md).

@@ -75,7 +75,7 @@ For the full cost model including per-task costs, see [COST_MODEL.md](../design/
 |---------|---------|---------------|
 | VPC (public + private subnets, 2 AZs) | All compute | N/A (no direct cost) |
 | NAT Gateway (1x) | Private subnet internet egress | **No** (~$32/mo) |
-| VPC Interface Endpoints (7x) | AWS service connectivity from private subnets | **No** (~$50/mo) |
+| VPC Interface Endpoints (7x, 2 AZs) | AWS service connectivity from private subnets | **No** (~$102/mo) |
 | VPC Gateway Endpoints (2x: S3, DynamoDB) | S3 and DynamoDB connectivity | Yes (free) |
 | Security Groups | HTTPS-only egress | N/A |
 | Route 53 Resolver DNS Firewall | Domain allowlisting for agent egress | Minimal |
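
As a sanity check on the updated always-on baseline in DEPLOYMENT_GUIDE.md, the idle-cost rows can be summed directly. The per-item figures come from the table in this diff; the secret and alarm counts (one each) are placeholders, since the table only gives per-unit prices, and the helper name is hypothetical.

```python
# Sum the idle-cost rows from the DEPLOYMENT_GUIDE.md table.
# Secret and alarm counts are assumed; the table lists only unit prices.
FIXED_MONTHLY_USD = {
    "nat_gateway": 32.0,
    "vpc_interface_endpoints": 0.01 * 7 * 2 * 730,  # ~102.2
    "waf_web_acl": 5.0,
    "cloudwatch_dashboard": 3.0,
}
SECRET_UNIT_USD = 0.40
ALARM_UNIT_USD = 0.10


def always_on_baseline(secrets: int, alarms: int, log_storage_usd: float) -> float:
    """Total monthly idle cost for the given counts and log storage spend."""
    return (sum(FIXED_MONTHLY_USD.values())
            + secrets * SECRET_UNIT_USD
            + alarms * ALARM_UNIT_USD
            + log_storage_usd)


# Log storage is listed as ~$1-5, so evaluate both ends of that range.
low = round(always_on_baseline(secrets=1, alarms=1, log_storage_usd=1.0), 2)
high = round(always_on_baseline(secrets=1, alarms=1, log_storage_usd=5.0), 2)
print(f"baseline: ~${low} to ~${high} per month")
```

With those placeholder counts the sum falls inside the ~$140-150 range stated in the table; more secrets or alarms would push it up by their unit prices.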

docs/src/content/docs/architecture/Cost-model.md

Lines changed: 6 additions & 6 deletions
@@ -15,16 +15,16 @@ These costs are incurred regardless of task volume:
 | Component | Estimated cost | Notes |
 |---|---|---|
 | NAT Gateway (1×) | ~$32/month | Fixed hourly cost + data processing. Single AZ (see [COMPUTE.md - Network architecture](/architecture/compute)). |
-| VPC Interface Endpoints (7×) | ~$50/month | $0.01/hr per endpoint per AZ. |
+| VPC Interface Endpoints (7×, 2 AZs) | ~$102/month | $0.01/hr × 7 endpoints × 2 AZs × 730 hrs. |
 | VPC Flow Logs | ~$3/month | CloudWatch ingestion. |
 | DynamoDB (on-demand, idle) | ~$0/month | Pay-per-request; no cost when idle. |
 | CloudWatch Logs retention | ~$1–5/month | Depends on log volume. 90-day retention. |
 | API Gateway (idle) | ~$0/month | Pay-per-request. |
-| **Total baseline** | **~$85–95/month** | |
+| **Total baseline** | **~$140–150/month** | |

 ### Scale-to-zero characteristics

-Most platform components are fully serverless and incur zero cost when idle: DynamoDB (PAY_PER_REQUEST), Lambda, API Gateway, ECS Fargate (cluster is free, when enabled), AgentCore Runtime (per-session), Bedrock (per-token), and Cognito (free tier). The always-on cost floor (~$85–95/month) is dominated by VPC networking infrastructure (NAT Gateway + 7 interface endpoints) which is required for private subnet connectivity to AWS services and GitHub. See the [Deployment guide](/getting-started/deployment-guide) for the full scale-to-zero breakdown.
+Most platform components are fully serverless and incur zero cost when idle: DynamoDB (PAY_PER_REQUEST), Lambda, API Gateway, ECS Fargate (cluster is free, when enabled), AgentCore Runtime (per-session), Bedrock (per-token), and Cognito (free tier). The always-on cost floor (~$140–150/month) is dominated by VPC networking infrastructure (NAT Gateway + 7 interface endpoints across 2 AZs) which is required for private subnet connectivity to AWS services and GitHub. See the [Deployment guide](/getting-started/deployment-guide) for the full scale-to-zero breakdown.

 ## Per-task variable costs

@@ -51,16 +51,16 @@ Assuming a typical task: 1–2 hours, Claude Sonnet, ~100K input tokens, ~20K ou
 | Model choice | 5–10× between Haiku and Opus | Default to Claude Sonnet; allow per-repo override. |
 | Turn count | Linear with turns | `max_turns` cap (default 100, configurable 1–500). |
 | Cost budget | Hard stop at budget | `max_budget_usd` cap (configurable $0.01–$100). Agent stops when budget is reached regardless of remaining turns. |
-| Task duration | Sub-linear (compute is cheap; tokens dominate) | 8-hour max session timeout. |
+| Task duration | Sub-linear (compute is cheap; tokens dominate) | AgentCore: 8-hour service limit; orchestrator: 9-hour `executionTimeout`. |
 | Prompt caching | 50–90% token cost reduction | Enable by default; cache system prompts and repo context. |
 | Concurrency | Linear with parallel tasks | Per-user and system-wide concurrency limits. |

 ## Cost at scale

 | Scale | Tasks/month | Estimated monthly cost (infra + tasks) |
 |---|---|---|
-| Low (1 developer) | 30–60 | $150–500 |
-| Medium (small team) | 200–500 | $500–3,000 |
+| Low (1 developer) | 30–60 | $200–550 |
+| Medium (small team) | 200–500 | $550–3,000 |
 | High (org-wide) | 2,000–5,000 | $5,000–30,000 |

 These estimates assume Claude Sonnet with prompt caching enabled and average task complexity.

docs/src/content/docs/getting-started/Deployment-guide.md

Lines changed: 5 additions & 5 deletions
@@ -17,7 +17,7 @@ ABCA deploys as a **single CDK stack** (`backgroundagent-dev`) containing all pl
 | **Orchestration** | Durable Lambda (checkpoint/replay) | Same durable Lambda via `ComputeStrategy` |
 | **Agent mode** | FastAPI server (HTTP invocation) | Batch (run-to-completion) |
 | **Startup** | ~10s (warm MicroVM) | ~60-180s (Fargate cold start) |
-| **Max duration** | 8 hours (AgentCore session) | Limited by orchestrator timeout (9 hours) |
+| **Max duration** | 8 hours (AgentCore service limit) | 9 hours (orchestrator `executionTimeout`) |

 Both backends are orchestrated by the same durable Lambda function. The `ComputeStrategy` interface abstracts `startSession()`, `pollSession()`, and `stopSession()` -- the ECS strategy calls `ecs:RunTask` / `ecs:DescribeTasks` / `ecs:StopTask` directly from the Lambda. No Step Functions are used.

@@ -43,15 +43,15 @@ ECS Fargate is currently **opt-in** -- the `EcsAgentCluster` construct is presen
 | Component | Est. Monthly Idle Cost | Why |
 |-----------|----------------------|-----|
 | NAT Gateway (1x) | ~$32 | $0.045/hr fixed charge |
-| VPC Interface Endpoints (7x, 2 AZs) | ~$50 | $0.01/hr per endpoint per AZ |
+| VPC Interface Endpoints (7x, 2 AZs) | ~$102 | $0.01/hr × 7 endpoints × 2 AZs × 730 hrs |
 | WAF v2 Web ACL | ~$5 | Base monthly charge |
 | CloudWatch Dashboard | ~$3 | Per-dashboard charge |
 | Secrets Manager (1+ secrets) | ~$0.40/secret | Per-secret monthly |
 | CloudWatch Alarms | ~$0.10/alarm | Per standard alarm |
 | CloudWatch Logs retention | ~$1-5 | Storage for retained logs |
-| **Total always-on baseline** | **~$85-95/month** | |
+| **Total always-on baseline** | **~$140-150/month** | |

-The dominant idle cost is VPC networking: 7 interface endpoints (~$50/month) plus the NAT Gateway (~$32/month).
+The dominant idle cost is VPC networking: 7 interface endpoints across 2 AZs (~$102/month) plus the NAT Gateway (~$32/month).

 For the full cost model including per-task costs, see [COST_MODEL.md](/architecture/cost-model).

@@ -79,7 +79,7 @@ For the full cost model including per-task costs, see [COST_MODEL.md](/architect
 |---------|---------|---------------|
 | VPC (public + private subnets, 2 AZs) | All compute | N/A (no direct cost) |
 | NAT Gateway (1x) | Private subnet internet egress | **No** (~$32/mo) |
-| VPC Interface Endpoints (7x) | AWS service connectivity from private subnets | **No** (~$50/mo) |
+| VPC Interface Endpoints (7x, 2 AZs) | AWS service connectivity from private subnets | **No** (~$102/mo) |
 | VPC Gateway Endpoints (2x: S3, DynamoDB) | S3 and DynamoDB connectivity | Yes (free) |
 | Security Groups | HTTPS-only egress | N/A |
 | Route 53 Resolver DNS Firewall | Domain allowlisting for agent egress | Minimal |
