Commit 6797ba2

Author: Irving Popovetsky

re-enable staging backend and update prod backend docker image tag

Signed-off-by: Irving Popovetsky <irving@honeycomb.io>

1 parent: 3098024

2 files changed, 50 additions and 39 deletions

IPv6_MIGRATION_NOTES.md

Lines changed: 28 additions & 16 deletions
@@ -27,12 +27,13 @@
 
 ### Why It Failed
 
-**Root cause:** AWS provides DNS64 but **NOT NAT64**
+**Root cause:** NAT64 requires NAT Gateway, which negates cost savings
 
 **What this means:**
 - **DNS64** (✅ provided): Translates DNS queries from A records to AAAA records using `64:ff9b::/96` prefix
-- **NAT64** (❌ NOT provided): Would translate actual IPv6 packets to IPv4 for IPv4-only services
-- Result: Instances can resolve IPv4-only services to IPv6 addresses, but packets time out with no NAT64 gateway
+- **NAT64** (✅ available via NAT Gateway): AWS NAT Gateway supports NAT64 translation when routing `64:ff9b::/96` traffic through it
+- **The problem**: NAT Gateway costs ~$32+/month base, which exceeds the ~$18/month we'd save on public IPv4 addresses
+- Additionally, SSM still requires IPv4 connectivity regardless of NAT64
 
 **Services that broke:**
 - ❌ AWS SSM Agent (IPv4-only): `dial tcp [64:ff9b::392:b12]:443: i/o timeout`
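
Review note: the address in the SSM Agent timeout sits inside the `64:ff9b::/96` prefix, which is what a DNS64-synthesized AAAA record looks like. A minimal Python sketch, assuming the standard-library `ipaddress` module, recovers the IPv4 address embedded in it. The helper name `embedded_ipv4` is hypothetical and not part of this repository.

```python
import ipaddress

# Well-known NAT64/DNS64 prefix quoted in the notes.
NAT64_PREFIX = ipaddress.ip_network("64:ff9b::/96")


def embedded_ipv4(addr: str) -> ipaddress.IPv4Address:
    """Return the IPv4 address embedded in a DNS64-synthesized IPv6 address."""
    v6 = ipaddress.IPv6Address(addr)
    if v6 not in NAT64_PREFIX:
        raise ValueError(f"{addr} is not inside {NAT64_PREFIX}")
    # With a /96 prefix, the low 32 bits carry the original IPv4 address.
    return ipaddress.IPv4Address(int(v6) & 0xFFFFFFFF)


# Address taken from the SSM Agent error in the notes.
print(embedded_ipv4("64:ff9b::392:b12"))  # -> 3.146.11.18
```

Seeing an address like this in a timeout suggests the name resolved through DNS64 while nothing on the route translated the packets back to IPv4.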
@@ -93,15 +94,14 @@ network_interfaces = [
 
 **Waiting for AWS to provide:**
 
-1. **Native NAT64 Service**
-   - Similar to NAT Gateway but for IPv6→IPv4 translation
-   - Would allow IPv6-only instances to reach IPv4-only services
-   - **This is the blocker - AWS doesn't offer this**
+1. **SSM dual-stack endpoints** (main blocker)
+   - SSM, EC2 Messages, and SSM Messages currently require IPv4
+   - Without this, managed EC2 instances cannot go IPv6-only
+   - NAT64 via NAT Gateway exists but costs ~$32+/month (negates savings)
 
-2. **Alternative: All services support dual-stack**
-   - Every AWS service with IPv6 endpoints
+2. **Alternative: All management services support dual-stack**
    - Particularly: SSM, EC2 Messages, SSM Messages
-   - Currently only ECS, ECR, CloudWatch Logs, S3 support dual-stack
+   - Currently ECS, ECR, CloudWatch Logs, S3, IAM support dual-stack
 
 **Self-managed workarounds we rejected:**
 
@@ -162,13 +162,14 @@ This indicates DNS64 translation without NAT64 gateway.
 
 **Only attempt IPv6-only again when ONE of these is true:**
 
-1.**AWS launches managed NAT64 service**
-   - Monitor AWS announcements for VPC NAT64 Gateway
-   - Similar to existing NAT Gateway but for IPv6→IPv4
-
-2.**All required AWS services support dual-stack**
+1.**SSM gets dual-stack endpoints**
    - Specifically need: SSM, EC2 Messages, SSM Messages with IPv6
-   - Check: https://docs.aws.amazon.com/general/latest/gr/aws-ipv6-support.html
+   - This is the primary blocker for managed EC2 instances
+   - Check: https://docs.aws.amazon.com/vpc/latest/userguide/aws-ipv6-support.html
+
+2.**NAT Gateway pricing drops significantly**
+   - Currently ~$32+/month base cost negates IPv4 savings
+   - Would need to be <$10/month to make economic sense
 
 3.**Public IPv4 costs exceed $20-30/month**
    - At current scale (2-4 instances), savings too small
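
Review note: the economic argument in these notes reduces to one subtraction. A rough Python sketch, using only the approximate monthly figures quoted in the notes (~$32 NAT Gateway base, ~$18 IPv4 savings, <$10 target), makes the break-even explicit. The function name is hypothetical, and the calculation ignores NAT Gateway per-GB data processing charges, so real costs would be somewhat higher.

```python
# Approximate monthly figures quoted in the notes (USD).
NAT_GATEWAY_BASE = 32.0    # "~$32+/month base"
IPV4_SAVINGS = 18.0        # "~$18/month we'd save on public IPv4 addresses"
NAT_GATEWAY_TARGET = 10.0  # "<$10/month to make economic sense"


def net_monthly_savings(ipv4_savings: float, nat_gateway_cost: float) -> float:
    """Positive result: IPv6-only plus NAT64 is cheaper than keeping public IPv4."""
    return ipv4_savings - nat_gateway_cost


print(net_monthly_savings(IPV4_SAVINGS, NAT_GATEWAY_BASE))    # -14.0
print(net_monthly_savings(IPV4_SAVINGS, NAT_GATEWAY_TARGET))  # 8.0
```

At today's quoted figures the migration loses about $14/month. It only turns positive once the NAT Gateway cost falls below the IPv4 savings.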
@@ -184,6 +185,17 @@ dig service-name.region.api.aws AAAA +short
 # If returns IPv6 address, service supports dual-stack
 ```
 
+**Track AWS IPv6 progress:**
+- Official tracker: https://docs.aws.amazon.com/vpc/latest/userguide/aws-ipv6-support.html
+- AWS What's New (filter for IPv6): https://aws.amazon.com/new/
+
+**Key services to watch for IPv6-only viability:**
+- SSM (Systems Manager) - currently IPv4-only, this is the main blocker
+- EC2 Messages - currently IPv4-only
+- SSM Messages - currently IPv4-only
+
+**Last checked:** January 2026 - SSM still requires IPv4 connectivity
+
 ## Rollback Summary
 
 **What we reverted:**
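
Review note: the `dig ... AAAA +short` check in the hunk above could also be scripted. One caveat is that a resolver with DNS64 enabled may return a synthesized `64:ff9b::/96` address for an IPv4-only service, which would look like dual-stack support when it is not. The sketch below is one possible approach using the standard-library `socket.getaddrinfo`. The function name and the injectable `resolver` parameter are hypothetical, added only to keep the logic testable without network access.

```python
import ipaddress
import socket

# DNS64-synthesized answers fall inside this prefix and do not indicate
# a native IPv6 endpoint.
NAT64_PREFIX = ipaddress.ip_network("64:ff9b::/96")


def native_ipv6_addresses(hostname, resolver=socket.getaddrinfo):
    """Return IPv6 addresses for hostname, excluding DNS64-synthesized ones."""
    try:
        results = resolver(
            hostname, 443, family=socket.AF_INET6, type=socket.SOCK_STREAM
        )
    except socket.gaierror:
        # No AAAA record (or resolution failed): treat as no IPv6 support.
        return []
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in results:
        # sockaddr[0] may carry a "%scope" suffix for link-local addresses.
        addr = ipaddress.IPv6Address(sockaddr[0].split("%")[0])
        if addr not in NAT64_PREFIX:
            addresses.append(str(addr))
    return addresses
```

Calling `native_ipv6_addresses("service-name.region.api.aws")`, with the placeholder replaced by a real endpoint name, returns an empty list when the service has no native AAAA record.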

terraform/apps.tf

Lines changed: 22 additions & 23 deletions
@@ -27,7 +27,7 @@ module "python_backend_prod" {
   logs_group = aws_cloudwatch_log_group.ecslogs.name
   ecs_cluster_id = module.ecs.cluster_id
   task_execution_role = data.aws_iam_role.ecs_task_execution_role.arn
-  image_tag = "latest"
+  image_tag = "prod"
 }
 
 resource "aws_lb_listener_rule" "python_backend_prod" {
@@ -46,31 +46,31 @@ resource "aws_lb_listener_rule" "python_backend_prod" {
 }
 
 # Backend Staging
-# module "python_backend_staging" {
-# source = "./python_backend"
+module "python_backend_staging" {
+  source = "./python_backend"
 
-# env = "staging"
-# vpc_id = data.aws_vpc.use2.id
-# logs_group = aws_cloudwatch_log_group.ecslogs.name
-# ecs_cluster_id = module.ecs.cluster_id
-# task_execution_role = data.aws_iam_role.ecs_task_execution_role.arn
-# image_tag = "latest"
-# }
+  env = "staging"
+  vpc_id = data.aws_vpc.use2.id
+  logs_group = aws_cloudwatch_log_group.ecslogs.name
+  ecs_cluster_id = module.ecs.cluster_id
+  task_execution_role = data.aws_iam_role.ecs_task_execution_role.arn
+  image_tag = "staging"
+}
 
-# resource "aws_lb_listener_rule" "python_backend_staging" {
-# listener_arn = aws_lb_listener.default_https.arn
+resource "aws_lb_listener_rule" "python_backend_staging" {
+  listener_arn = aws_lb_listener.default_https.arn
 
-# action {
-# type = "forward"
-# target_group_arn = module.python_backend_staging.lb_tg_arn
-# }
+  action {
+    type = "forward"
+    target_group_arn = module.python_backend_staging.lb_tg_arn
+  }
 
-# condition {
-# host_header {
-# values = ["backend-staging.operationcode.org", "api.staging.operationcode.org"]
-# }
-# }
-# }
+  condition {
+    host_header {
+      values = ["backend-staging.operationcode.org", "api.staging.operationcode.org"]
+    }
+  }
+}
 
 # Redirector for shut down sites
 resource "aws_lb_listener_rule" "shutdown_sites_redirector" {
@@ -92,7 +92,6 @@ resource "aws_lb_listener_rule" "shutdown_sites_redirector" {
       values = [
         "resources.operationcode.org",
         "resources-staging.operationcode.org",
-        "api.staging.operationcode.org",
       ]
     }
   }
