Commit 4daf666

Author: bgagent
chore(docs): update docs
1 parent 0742ebe commit 4daf666

4 files changed

Lines changed: 48 additions & 18 deletions

docs/design/COST_MODEL.md

Lines changed: 12 additions & 5 deletions
@@ -13,14 +13,17 @@ These costs are incurred regardless of task volume:
 | NAT Gateway (1×) | ~$32/month | Fixed hourly cost + data processing. Single AZ (see [COMPUTE.md - Network architecture](./COMPUTE.md)). |
 | VPC Interface Endpoints (7×, 2 AZs) | ~$102/month | $0.01/hr × 7 endpoints × 2 AZs × 730 hrs. |
 | VPC Flow Logs | ~$3/month | CloudWatch ingestion. |
-| DynamoDB (on-demand, idle) | ~$0/month | Pay-per-request; no cost when idle. |
+| DynamoDB (on-demand, idle) | ~$0/month | Pay-per-request; 6 tables (Tasks, Events, Nudges, UserConcurrency, Webhooks, Repo). No cost when idle. |
+| S3 Trace Artifacts bucket (idle) | ~$0/month | 7-day lifecycle auto-expires objects; no cost when no traces are stored. |
+| EventBridge reconciler rule | <$0.01/month | Invokes Lambda every 5 min (288/day). Rule itself is free; Lambda invocation is the cost (see below). |
+| Stranded task reconciler Lambda (idle) | <$0.01/month | 288 invocations/day × 256 MB × ~100 ms avg (early exit when no stranded tasks). ~$0.005/month total (requests + duration). |
 | CloudWatch Logs retention | ~$1–5/month | Depends on log volume. 90-day retention. |
 | API Gateway (idle) | ~$0/month | Pay-per-request. |
-| **Total baseline** | **~$140–150/month** | |
+| **Total baseline** | **~$140–150/month** | Reconciler adds negligible cost; VPC networking remains dominant. |
 
 ### Scale-to-zero characteristics
 
-Most platform components are fully serverless and incur zero cost when idle: DynamoDB (PAY_PER_REQUEST), Lambda, API Gateway, ECS Fargate (cluster is free, when enabled), AgentCore Runtime (per-session), Bedrock (per-token), and Cognito (free tier). The always-on cost floor (~$140–150/month) is dominated by VPC networking infrastructure (NAT Gateway + 7 interface endpoints across 2 AZs) which is required for private subnet connectivity to AWS services and GitHub. See the [Deployment guide](../guides/DEPLOYMENT_GUIDE.md) for the full scale-to-zero breakdown.
+Most platform components are fully serverless and incur zero cost when idle: DynamoDB (PAY_PER_REQUEST, 6 tables), Lambda, API Gateway, S3 (trace artifacts auto-expire in 7 days), SQS (fanout DLQ), ECS Fargate (cluster is free, when enabled), AgentCore Runtime (per-session), Bedrock (per-token), and Cognito (free tier). The stranded task reconciler adds <$0.01/month even when idle (288 Lambda invocations/day, early-exit). The always-on cost floor (~$140–150/month) is dominated by VPC networking infrastructure (NAT Gateway + 7 interface endpoints across 2 AZs) which is required for private subnet connectivity to AWS services and GitHub. See the [Deployment guide](../guides/DEPLOYMENT_GUIDE.md) for the full scale-to-zero breakdown.
 
 ## Per-task variable costs
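As a rough cross-check of the baseline figures in the hunk above, the following Python sketch recomputes the VPC interface endpoint line and the idle reconciler line. The endpoint rate, endpoint and AZ counts, invocation interval, memory size, and average duration are taken from the table. The Lambda list prices (arm64 GB-second and per-request) and the 30-day month are assumptions added here, and the free tier is ignored.

```python
# Rough cross-check of two baseline rows; a sketch, not billing-accurate.

HOURS_PER_MONTH = 730          # from the table
DAYS_PER_MONTH = 30            # assumption for this sketch

# VPC interface endpoints (figures from the table)
ENDPOINT_USD_PER_HOUR = 0.01
ENDPOINT_COUNT = 7
AZ_COUNT = 2

# Stranded task reconciler (figures from the table)
RECONCILER_INTERVAL_MIN = 5
RECONCILER_MEMORY_GB = 256 / 1024
RECONCILER_AVG_DURATION_S = 0.100

# Assumed Lambda list prices (not stated in the source)
LAMBDA_USD_PER_GB_SECOND = 0.0000133334   # arm64, assumed
LAMBDA_USD_PER_REQUEST = 0.20 / 1_000_000  # assumed


def endpoint_monthly_usd() -> float:
    """Hourly rate x endpoints x AZs x hours per month."""
    return ENDPOINT_USD_PER_HOUR * ENDPOINT_COUNT * AZ_COUNT * HOURS_PER_MONTH


def reconciler_invocations_per_day() -> int:
    """One invocation per interval across a 24-hour day."""
    return 24 * 60 // RECONCILER_INTERVAL_MIN


def reconciler_monthly_usd() -> float:
    """Duration charge plus request charge for an always-early-exit month."""
    invocations = reconciler_invocations_per_day() * DAYS_PER_MONTH
    gb_seconds = invocations * RECONCILER_AVG_DURATION_S * RECONCILER_MEMORY_GB
    return gb_seconds * LAMBDA_USD_PER_GB_SECOND + invocations * LAMBDA_USD_PER_REQUEST


if __name__ == "__main__":
    print(f"endpoints:  ${endpoint_monthly_usd():.2f}/month")
    print(f"reconciler: ${reconciler_monthly_usd():.4f}/month")
```

Under these assumptions the endpoint line comes to about $102/month and the reconciler to roughly $0.005/month, which matches the table's rounding.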

@@ -35,10 +38,14 @@ Assuming a typical task: 1–2 hours, Claude Sonnet, ~100K input tokens, ~20K ou
 | **Bedrock tokens (dominant)** | $2–15 | Varies widely by model, task complexity, and turn count. Claude Sonnet: ~$3/M input tokens, ~$15/M output tokens. A 50-turn task with 100K input + 20K output per turn ≈ 5M input + 1M output ≈ $15 + $15 = $30 at list price. Prompt caching reduces this significantly (up to 90% for cache hits). Typical range: $2–15 after caching. |
 | AgentCore Runtime compute | $0.10–0.50 | 2 vCPU / 8 GB for 1–2 hours. Pricing model is per-session based on vCPU-hours and GB-hours. |
 | Lambda orchestrator | <$0.01 | ~10 invocations per task (admission, hydration, polling, finalization). Negligible. |
-| DynamoDB reads/writes | <$0.01 | ~20–50 operations per task (task CRUD, events, counter updates). Negligible. |
+| Lambda fanout consumer | <$0.01 | Triggered per batch of task events (batch size 100, 5 s window). Typically 5–20 invocations per task at 256 MB. Negligible. |
+| Lambda nudge / trace / events | <$0.01 | On-demand per user request. Negligible unless heavily polled. |
+| DynamoDB reads/writes | <$0.01 | ~30–80 operations per task (task CRUD, events, nudges, counter updates). Negligible. |
+| DynamoDB Streams (fanout) | <$0.01 | Stream reads charged per 25 KB. Typical task: ~20–50 event records. Negligible. |
+| S3 trace upload (if `--trace`) | <$0.01 | One PUT per task + storage (gzipped NDJSON, typically 50–500 KB, auto-expires in 7 days). |
 | NAT Gateway data | <$0.01 | GitHub API traffic: clone + push. Small repos: <10 MB. |
 | Custom step Lambdas | $0–0.05 | Only if configured. Per-invocation: ~$0.01 per step. |
-| **Total per task** | **$2–15** | Bedrock tokens dominate (>90% of per-task cost). |
+| **Total per task** | **$2–15** | Bedrock tokens dominate (>90% of per-task cost). New interactive features add <$0.01 per task. |
 
 ### Cost sensitivity analysis
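The Bedrock row in the hunk above works through a 50-turn example at list price. The Python sketch below reproduces that arithmetic and adds a deliberately simple caching model. The per-million-token rates and the 90% figure come from the table. The function names, the `cache_hit_rate` parameter, and the assumption that the discount applies only to cached input tokens (with no cache-write surcharge) are hypothetical simplifications, not the project's actual cost model.

```python
# Token-cost arithmetic for the Bedrock row; a sketch under stated assumptions.

INPUT_USD_PER_MTOK = 3.0    # ~$3/M input tokens (from the table)
OUTPUT_USD_PER_MTOK = 15.0  # ~$15/M output tokens (from the table)


def list_price_usd(turns: int, input_per_turn: int, output_per_turn: int) -> tuple[float, float]:
    """Return (input_cost, output_cost) in USD with no caching."""
    input_cost = turns * input_per_turn / 1e6 * INPUT_USD_PER_MTOK
    output_cost = turns * output_per_turn / 1e6 * OUTPUT_USD_PER_MTOK
    return input_cost, output_cost


def cached_price_usd(
    turns: int,
    input_per_turn: int,
    output_per_turn: int,
    cache_hit_rate: float,
    cache_discount: float = 0.90,
) -> float:
    """Total USD when a fraction of input tokens is served from cache.

    Assumption: only cached input tokens are discounted; output tokens
    and cache-write costs are not modeled.
    """
    input_cost, output_cost = list_price_usd(turns, input_per_turn, output_per_turn)
    return input_cost * (1.0 - cache_hit_rate * cache_discount) + output_cost


if __name__ == "__main__":
    input_cost, output_cost = list_price_usd(50, 100_000, 20_000)
    print(f"list price: ${input_cost + output_cost:.2f}")
    for rate in (0.0, 0.5, 0.9):
        total = cached_price_usd(50, 100_000, 20_000, cache_hit_rate=rate)
        print(f"cache hit rate {rate:.0%}: ${total:.2f}")
```

In this simplified model caching only reduces the input side, so the output-token cost sets a floor for the 50-turn example. The table's typical $2–15 range therefore presumably reflects lighter tasks than this worst-case illustration.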

docs/guides/DEPLOYMENT_GUIDE.md

Lines changed: 12 additions & 4 deletions
@@ -25,7 +25,7 @@ ECS Fargate is currently **opt-in** -- the `EcsAgentCluster` construct is presen
 
 | Component | Billing Model | Idle Cost |
 |-----------|--------------|-----------|
-| DynamoDB (5 tables) | PAY_PER_REQUEST | $0 |
+| DynamoDB (6 tables) | PAY_PER_REQUEST | $0 |
 | Lambda (all functions) | Per invocation | $0 |
 | API Gateway REST | Per request | $0 |
 | ECS Fargate tasks (when enabled) | Per running task | $0 (cluster is free) |
@@ -59,7 +59,7 @@ For the full cost model including per-task costs, see [COST_MODEL.md](../design/
 |---------|---------|---------------|
 | Bedrock AgentCore Runtime (MicroVMs) | Agent sessions (default) | Yes |
 | ECS Fargate (when enabled) | Agent sessions (opt-in) | Yes |
-| Lambda (Node.js 24, ARM64) | Orchestrator, API handlers, reconciler, custom resources | Yes |
+| Lambda (Node.js 24, ARM64) | Orchestrator, API handlers, fanout consumer, reconcilers, custom resources | Yes |
 
 ### AI/ML

@@ -84,8 +84,10 @@ For the full cost model including per-task costs, see [COST_MODEL.md](../design/
 
 | Service | Used By | Scales to Zero |
 |---------|---------|---------------|
-| DynamoDB (5 tables, PAY_PER_REQUEST) | Task state, events, concurrency, webhooks, repo config | Yes |
-| S3 | CDK asset bucket, ECR image layers, FUSE session storage | Minimal |
+| DynamoDB (6 tables, PAY_PER_REQUEST) | Task state, events, nudges, concurrency, webhooks, repo config | Yes |
+| DynamoDB Streams | TaskEventsTable → FanOut Consumer Lambda | Yes |
+| S3 | CDK asset bucket, ECR image layers, FUSE session storage, trace artifacts (7-day lifecycle) | Minimal |
+| SQS (DLQ) | FanOut Consumer dead-letter queue | Yes |
 | Secrets Manager | GitHub PAT, webhook HMAC secrets | **No** (~$0.40/secret/mo) |
 
 ### API / Auth
@@ -96,6 +98,12 @@ For the full cost model including per-task costs, see [COST_MODEL.md](../design/
 | Cognito User Pool | CLI/API authentication | Yes (free tier) |
 | WAF v2 | API Gateway protection (managed rules + rate limiting) | **No** (~$5/mo base) |
 
+### Scheduling
+
+| Service | Used By | Scales to Zero |
+|---------|---------|---------------|
+| EventBridge (scheduled rule) | Stranded task reconciler (every 5 min) | Yes (rule is free; Lambda invocation is the cost) |
+
 ### Observability
 
 | Service | Used By | Scales to Zero |

docs/src/content/docs/architecture/Cost-model.md

Lines changed: 12 additions & 5 deletions
@@ -17,14 +17,17 @@ These costs are incurred regardless of task volume:
 | NAT Gateway (1×) | ~$32/month | Fixed hourly cost + data processing. Single AZ (see [COMPUTE.md - Network architecture](/architecture/compute)). |
 | VPC Interface Endpoints (7×, 2 AZs) | ~$102/month | $0.01/hr × 7 endpoints × 2 AZs × 730 hrs. |
 | VPC Flow Logs | ~$3/month | CloudWatch ingestion. |
-| DynamoDB (on-demand, idle) | ~$0/month | Pay-per-request; no cost when idle. |
+| DynamoDB (on-demand, idle) | ~$0/month | Pay-per-request; 6 tables (Tasks, Events, Nudges, UserConcurrency, Webhooks, Repo). No cost when idle. |
+| S3 Trace Artifacts bucket (idle) | ~$0/month | 7-day lifecycle auto-expires objects; no cost when no traces are stored. |
+| EventBridge reconciler rule | <$0.01/month | Invokes Lambda every 5 min (288/day). Rule itself is free; Lambda invocation is the cost (see below). |
+| Stranded task reconciler Lambda (idle) | <$0.01/month | 288 invocations/day × 256 MB × ~100 ms avg (early exit when no stranded tasks). ~$0.005/month total (requests + duration). |
 | CloudWatch Logs retention | ~$1–5/month | Depends on log volume. 90-day retention. |
 | API Gateway (idle) | ~$0/month | Pay-per-request. |
-| **Total baseline** | **~$140–150/month** | |
+| **Total baseline** | **~$140–150/month** | Reconciler adds negligible cost; VPC networking remains dominant. |
 
 ### Scale-to-zero characteristics
 
-Most platform components are fully serverless and incur zero cost when idle: DynamoDB (PAY_PER_REQUEST), Lambda, API Gateway, ECS Fargate (cluster is free, when enabled), AgentCore Runtime (per-session), Bedrock (per-token), and Cognito (free tier). The always-on cost floor (~$140–150/month) is dominated by VPC networking infrastructure (NAT Gateway + 7 interface endpoints across 2 AZs) which is required for private subnet connectivity to AWS services and GitHub. See the [Deployment guide](/getting-started/deployment-guide) for the full scale-to-zero breakdown.
+Most platform components are fully serverless and incur zero cost when idle: DynamoDB (PAY_PER_REQUEST, 6 tables), Lambda, API Gateway, S3 (trace artifacts auto-expire in 7 days), SQS (fanout DLQ), ECS Fargate (cluster is free, when enabled), AgentCore Runtime (per-session), Bedrock (per-token), and Cognito (free tier). The stranded task reconciler adds <$0.01/month even when idle (288 Lambda invocations/day, early-exit). The always-on cost floor (~$140–150/month) is dominated by VPC networking infrastructure (NAT Gateway + 7 interface endpoints across 2 AZs) which is required for private subnet connectivity to AWS services and GitHub. See the [Deployment guide](/getting-started/deployment-guide) for the full scale-to-zero breakdown.
 
 ## Per-task variable costs

@@ -39,10 +42,14 @@ Assuming a typical task: 1–2 hours, Claude Sonnet, ~100K input tokens, ~20K ou
 | **Bedrock tokens (dominant)** | $2–15 | Varies widely by model, task complexity, and turn count. Claude Sonnet: ~$3/M input tokens, ~$15/M output tokens. A 50-turn task with 100K input + 20K output per turn ≈ 5M input + 1M output ≈ $15 + $15 = $30 at list price. Prompt caching reduces this significantly (up to 90% for cache hits). Typical range: $2–15 after caching. |
 | AgentCore Runtime compute | $0.10–0.50 | 2 vCPU / 8 GB for 1–2 hours. Pricing model is per-session based on vCPU-hours and GB-hours. |
 | Lambda orchestrator | <$0.01 | ~10 invocations per task (admission, hydration, polling, finalization). Negligible. |
-| DynamoDB reads/writes | <$0.01 | ~20–50 operations per task (task CRUD, events, counter updates). Negligible. |
+| Lambda fanout consumer | <$0.01 | Triggered per batch of task events (batch size 100, 5 s window). Typically 5–20 invocations per task at 256 MB. Negligible. |
+| Lambda nudge / trace / events | <$0.01 | On-demand per user request. Negligible unless heavily polled. |
+| DynamoDB reads/writes | <$0.01 | ~30–80 operations per task (task CRUD, events, nudges, counter updates). Negligible. |
+| DynamoDB Streams (fanout) | <$0.01 | Stream reads charged per 25 KB. Typical task: ~20–50 event records. Negligible. |
+| S3 trace upload (if `--trace`) | <$0.01 | One PUT per task + storage (gzipped NDJSON, typically 50–500 KB, auto-expires in 7 days). |
 | NAT Gateway data | <$0.01 | GitHub API traffic: clone + push. Small repos: <10 MB. |
 | Custom step Lambdas | $0–0.05 | Only if configured. Per-invocation: ~$0.01 per step. |
-| **Total per task** | **$2–15** | Bedrock tokens dominate (>90% of per-task cost). |
+| **Total per task** | **$2–15** | Bedrock tokens dominate (>90% of per-task cost). New interactive features add <$0.01 per task. |
 
 ### Cost sensitivity analysis

docs/src/content/docs/getting-started/Deployment-guide.md

Lines changed: 12 additions & 4 deletions
@@ -29,7 +29,7 @@ ECS Fargate is currently **opt-in** -- the `EcsAgentCluster` construct is presen
 
 | Component | Billing Model | Idle Cost |
 |-----------|--------------|-----------|
-| DynamoDB (5 tables) | PAY_PER_REQUEST | $0 |
+| DynamoDB (6 tables) | PAY_PER_REQUEST | $0 |
 | Lambda (all functions) | Per invocation | $0 |
 | API Gateway REST | Per request | $0 |
 | ECS Fargate tasks (when enabled) | Per running task | $0 (cluster is free) |
@@ -63,7 +63,7 @@ For the full cost model including per-task costs, see [COST_MODEL.md](/architect
 |---------|---------|---------------|
 | Bedrock AgentCore Runtime (MicroVMs) | Agent sessions (default) | Yes |
 | ECS Fargate (when enabled) | Agent sessions (opt-in) | Yes |
-| Lambda (Node.js 24, ARM64) | Orchestrator, API handlers, reconciler, custom resources | Yes |
+| Lambda (Node.js 24, ARM64) | Orchestrator, API handlers, fanout consumer, reconcilers, custom resources | Yes |
 
 ### AI/ML

@@ -88,8 +88,10 @@ For the full cost model including per-task costs, see [COST_MODEL.md](/architect
 
 | Service | Used By | Scales to Zero |
 |---------|---------|---------------|
-| DynamoDB (5 tables, PAY_PER_REQUEST) | Task state, events, concurrency, webhooks, repo config | Yes |
-| S3 | CDK asset bucket, ECR image layers, FUSE session storage | Minimal |
+| DynamoDB (6 tables, PAY_PER_REQUEST) | Task state, events, nudges, concurrency, webhooks, repo config | Yes |
+| DynamoDB Streams | TaskEventsTable → FanOut Consumer Lambda | Yes |
+| S3 | CDK asset bucket, ECR image layers, FUSE session storage, trace artifacts (7-day lifecycle) | Minimal |
+| SQS (DLQ) | FanOut Consumer dead-letter queue | Yes |
 | Secrets Manager | GitHub PAT, webhook HMAC secrets | **No** (~$0.40/secret/mo) |
 
 ### API / Auth
@@ -100,6 +102,12 @@ For the full cost model including per-task costs, see [COST_MODEL.md](/architect
 | Cognito User Pool | CLI/API authentication | Yes (free tier) |
 | WAF v2 | API Gateway protection (managed rules + rate limiting) | **No** (~$5/mo base) |
 
+### Scheduling
+
+| Service | Used By | Scales to Zero |
+|---------|---------|---------------|
+| EventBridge (scheduled rule) | Stranded task reconciler (every 5 min) | Yes (rule is free; Lambda invocation is the cost) |
+
 ### Observability
 
 | Service | Used By | Scales to Zero |
