Summary
We are running EKS clusters on Bottlerocket across multiple regions and observe persistent DiskPressure caused by unbounded growth in:
`/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/`
Nodes fill to the kubelet eviction threshold within hours to days of launch under normal rolling-deployment churn. Kubelet image GC tuning (`image-gc-high-threshold`, `image-gc-low-threshold`) does not resolve this; the accumulation persists regardless (the tuning we tried is sketched below).
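For reference, the tuning we applied, as a sketch: threshold values here are illustrative, and the key names assume the standard Bottlerocket `settings.kubernetes` image GC settings.

```sh
# Image GC tuning applied on-node via apiclient (values illustrative).
# These back kubelet's --image-gc-high-threshold / --image-gc-low-threshold.
apiclient set \
  kubernetes.image-gc-high-threshold-percent=50 \
  kubernetes.image-gc-low-threshold-percent=40
```

Even with aggressive thresholds, usage climbs back to the eviction threshold, which is consistent with the growth not being reclaimable by image GC.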
Fill rate acceleration across versions
We have measured a significant acceleration in fill rate across Bottlerocket versions:
| Bottlerocket version | containerd | Fill rate observed | Disk / node type |
|---|---|---|---|
| 1.57.0 (k8s-1.33) | 1.7.30 | ≥ 8.9 GB/h | 500 GB / m6i.4xlarge |
| 1.58.0 | 2.1.6 | 2.9–3.3 GB/h | 500 GB / c6a.4xlarge |
| 1.59.0 | 2.1.6 | ≥ 17.2 GB/h | 500 GB / m6in.8xlarge |
Note: rates marked ≥ are lower bounds derived from the age of the youngest node to reach DiskPressure. Staging rates are measured directly via the kubelet stats summary API (sketch below). Different instance types and workloads mean this is not a controlled comparison, but the pattern is consistent across clusters.
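A minimal sketch of how the direct measurements are taken, assuming `kubectl` access and the standard kubelet stats summary endpoint; the node name is an example:

```sh
# Sample node filesystem usage twice via the kubelet stats summary API
# and derive a GB/h fill rate.
NODE=ip-10-0-12-34.eu-west-1.compute.internal
used_bytes() {
  kubectl get --raw "/api/v1/nodes/${NODE}/proxy/stats/summary" \
    | jq -r '.node.fs.usedBytes'
}
start=$(used_bytes)
sleep 3600
end=$(used_bytes)
echo "fill rate: $(( (end - start) / (1024 * 1024 * 1024) )) GB/h"
```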
What we observe on affected nodes
From logdog bundles collected via SSM on DiskPressure nodes:
- `containerd-shim-runc-v2` processes become unresponsive after container exit, logging `context deadline exceeded` every 2 seconds indefinitely via ttrpc, and are never cleaned up
- Snapshot counts significantly exceed active mount counts (e.g. 264 snapshots vs 107 active mounts, leaving 157 snapshots with no active mount)
- Zombie task directory count exceeds the number of active pods (see the diagnostic sketch after this list)
- `sync_remove = false` (AMI default): the async snapshot-removal goroutine is abandoned if the shim is killed before it completes, silently leaving the snapshot on disk
- `discard_unpacked_layers = false` (AMI default): compressed image tarballs are retained in the content store permanently, adding a second, independent growth vector (57 GB content store observed vs 8.9 GB on a clean node)
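A condensed version of the checks behind the numbers above, assuming a root shell on the node (e.g. the SSM admin container) with access to the host's containerd socket and the host paths shown:

```sh
# Compare containerd snapshots to live overlay mounts, and shim task
# directories to running tasks; on affected nodes the first number in
# each pair is far larger than the second.
SNAPSHOTS=$(ctr -n k8s.io snapshots ls | tail -n +2 | wc -l)
OVERLAY_MOUNTS=$(awk '$3 == "overlay"' /proc/mounts | wc -l)
TASK_DIRS=$(ls /run/containerd/io.containerd.runtime.v2.task/k8s.io | wc -l)
RUNNING_TASKS=$(ctr -n k8s.io tasks ls | grep -c RUNNING)
echo "snapshots=${SNAPSHOTS} overlay_mounts=${OVERLAY_MOUNTS}"
echo "task_dirs=${TASK_DIRS} running_tasks=${RUNNING_TASKS}"
# Content store size (the second growth vector):
du -sh /var/lib/containerd/io.containerd.content.v1.content
```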
Root cause hypothesis
This matches the upstream shim/snapshot cleanup bug reported and fixed in containerd via PRs #12400 and #12397 (see Questions below).
The `discard_unpacked_layers = false` default is tracked separately in #3314. We confirmed this setting is not configurable via Bottlerocket userdata or apiclient (verification sketched below).
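What we checked on-host to confirm the default, as a sketch; on containerd 2.x we believe the key lives under the CRI images plugin rather than the 1.7 CRI plugin table, but the merged config dump finds it either way:

```sh
# Dump containerd's effective merged config and locate the setting.
containerd config dump | grep -n 'discard_unpacked_layers'
# On both the 1.7.30 and 2.1.6 AMIs this prints:
#   discard_unpacked_layers = false
```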
Questions
- Which Bottlerocket release will include the containerd build containing the fixes from containerd PRs #12400 and #12397?
- Is there a configuration workaround available in the meantime, e.g. via `[plugins."io.containerd.gc.v1.scheduler"]` settings in userdata (sketched below) or a supported snapshotter alternative?
Environment
- Bottlerocket versions: 1.57.0, 1.58.0, 1.59.0
- Kubernetes: 1.33
- containerd: 1.7.30 (1.57.0), 2.1.6 (1.58.0 and 1.59.0)
- Regions: eu-west-1, ca-central-1
- Disk: 500 GB
- Workload: high-churn rolling deployments (CI/CD), ~4,000 active pods per cluster