rfc(decision): Distroless base images#157

Open
oioki wants to merge 6 commits into main from rfc/distroless-base-images

@oioki oioki commented Mar 27, 2026

@oioki oioki force-pushed the rfc/distroless-base-images branch from 21cb505 to 75cc51e Compare March 27, 2026 09:39
@oioki oioki marked this pull request as ready for review March 27, 2026 15:30

# Resolved questions

- **Long-term commitment to DHI:** Despite Docker Inc having a history of unexpected licensing and policy changes (Hub rate limiting, Desktop licensing, etc.), DHI was recently made public under Apache 2.0, and a rollback of that decision seems unlikely. If needed, Google Distroless is a practical drop-in fallback — it lags a few patch versions behind but is otherwise compatible. Other solutions may also emerge over time. We can go with DHI images as a default.
Member

Do you think we would be able to maintain the images if we had to?

Member Author

@oioki oioki May 27, 2026

We really don't want to do this (it's not Sentry's business), but I tried building the images literally from scratch in a couple of hours, and a drop-in replacement of the base image worked fine, to the point of passing all smoke tests on snuba:

So, if we had to, it is doable nowadays.

- **Snuba and getsentry:** These are the largest remaining Python services. The Snuba PoC (https://github.com/getsentry/snuba/pull/7753, https://github.com/getsentry/snuba/pull/7821, https://github.com/getsentry/snuba/pull/7829, https://github.com/getsentry/ops/pull/19824) showed it is feasible. What is the sequencing and who owns driving this to completion?
- **Local development compatibility:** Are there any blockers that might disrupt local development workflows when switching to distroless? So far this appears to be a non-issue — for example, Snuba distroless containers work fine in `sentry devservices` (https://github.com/getsentry/snuba/pull/7829).
- **Services with non-trivial runtime deps:** Some services (e.g. uptime-checker with OpenSSL for certificate validation, or services using external libraries) may need extra work. Are there any blockers that make distroless infeasible for them?
- **Public mirrors for anonymous access:** Pulling directly from `dhi.io` requires a Docker login, which complicates CI pipelines and local image builds for contributors. Should we commit to maintaining public mirrors at `ghcr.io/getsentry/dhi` to allow unauthenticated pulls? See current PoC: https://github.com/getsentry/dhi.
Member

Needing to log in could be disruptive to self-hosted users.

Member Author

We are going to mirror a limited set of images (to start, Python and Node) on artifact registries controlled by us: either GHCR (could be more native for self-hosted) or a public Artifact Registry in GCP (could be faster for SaaS builds?). In both cases, pulling base images from those won't require authentication.
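
For illustration only, here is a minimal sketch of what the mirroring step could look like. It assumes `skopeo` as the copy tool (the RFC does not name one), assumes that images live directly under `dhi.io/<name>`, and uses a hypothetical tag in the usage note. It only builds the commands; running them is left to the caller.

```python
from typing import List

# Registry names are taken from the RFC text; the repository layout
# underneath them is an assumption made for this sketch.
SOURCE_REGISTRY = "dhi.io"
MIRROR_REGISTRY = "ghcr.io/getsentry/dhi"


def mirror_commands(images: List[str]) -> List[List[str]]:
    """Return one `skopeo copy` argv per image reference, e.g. 'python:3.13'."""
    commands = []
    for image in images:
        src = f"docker://{SOURCE_REGISTRY}/{image}"
        dst = f"docker://{MIRROR_REGISTRY}/{image}"
        # --all copies every image in a multi-arch manifest list, not just
        # the one matching the machine that runs the copy.
        commands.append(["skopeo", "copy", "--all", src, dst])
    return commands
```

Calling `mirror_commands(["python:3.13"])` (tag chosen arbitrarily) yields a single argv that could be handed to `subprocess.run` in a scheduled job that is logged in to the source registry.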

Comment on lines +140 to +144
Distroless containers have no shell. You cannot `exec` into a running container and run arbitrary commands. Debugging requires:

- Attaching an ephemeral debug container with a shell to the running pod (e.g. [`sentry-kube debug`](https://github.com/getsentry/sentry-infra-tools/blob/main/sentry_kube/cli/debug.py))
- Using application-level tooling (e.g. interactive shells provided by the framework) rather than OS-level tools, e.g. `getsentry shell`
- Investing in proper observability (logs, metrics, tracing) instead of ad-hoc inspection
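
As a rough sketch of the first option, the helper below builds a plain `kubectl debug` invocation that attaches an ephemeral container with a shell to a running pod. This is not how `sentry-kube debug` is implemented (its internals are not described here); the pod name, container name, and the `busybox:1.36` debug image are hypothetical placeholders.

```python
from typing import List


def debug_command(pod: str, container: str, namespace: str = "default",
                  image: str = "busybox:1.36") -> List[str]:
    """Build a `kubectl debug` argv that attaches a shell to a running pod."""
    return [
        "kubectl", "debug", "-it", pod,
        "--namespace", namespace,
        # Image for the ephemeral container; it supplies the shell that the
        # distroless application image lacks.
        "--image", image,
        # Share the process namespace of the named application container.
        "--target", container,
        "--", "sh",
    ]
```

The resulting list can be passed to `subprocess.run`; whether process namespace targeting works depends on the cluster's container runtime.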
Member

I'd like us to put in some effort ahead of time to validate that the debugging flow is very smooth - we've definitely run into various issues attempting to attach debugger pods in some of the places we've already swapped out more minimal images.

Member

`sentry-kube debug` is a full replacement for the `kubectl exec` workflow. I'd even consider it a better experience, as it allows elevating permissions to attach runtime profilers, which isn't possible with just `kubectl exec`.

Member

+1, our existing application images are already very slim, so `exec` is not very useful. `debug` is a better experience today, and anytime I use `exec`, I immediately go into `getsentry shell`.

Member

@Dav1dde Dav1dde May 27, 2026

I guess the only downside for debugging is docker-compose in dev setups and self-hosted where sentry-kube debug doesn't work.
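
One possible workaround for plain Docker and docker-compose setups, sketched here as an assumption rather than an agreed approach, is to start a throwaway container that joins the target container's PID and network namespaces. The container name and the `busybox:1.36` image below are hypothetical.

```python
from typing import List


def compose_debug_command(container: str,
                          image: str = "busybox:1.36") -> List[str]:
    """Build a `docker run` argv for a debug sidecar next to `container`."""
    return [
        "docker", "run", "--rm", "-it",
        # Join the target's PID namespace so its processes are visible.
        f"--pid=container:{container}",
        # Join the target's network namespace to inspect sockets and ports.
        f"--network=container:{container}",
        image, "sh",
    ]
```

This gives a shell alongside the distroless container without changing its image; it does not provide the permission elevation mentioned above for `sentry-kube debug`.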

Member

@untitaker untitaker left a comment

sentry uses /dev/shm for multiprocess IPC, and /tmp for random stuff related to release artifacts and other file uploads.

some of these images explicitly say that /tmp is "hardened" (so basically unusable). while we generally mount tmpfs on /tmp into the container (making this a non-issue), i'm not sure that we do it consistently in all pods that require it. also no idea if self-hosted has this kind of setup at all.

it may be desirable to enforce in the ops repo that tmpfs is mounted in absolutely every container in /dev/shm and /tmp, rather than relying on smoke tests. we have had incidents where new deployments were missing those mounts leading to really weird issues in multiprocess consumers, so some systematic enforcement would be nice to have for other reasons.
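
a rough sketch of what such enforcement could look like, assuming pod specs are available as plain dicts (e.g. parsed from rendered manifests) and that "tmpfs" means an `emptyDir` volume with `medium: Memory`. the function name and the idea of running it as a CI check are assumptions, not something that exists in the ops repo today.

```python
from typing import Dict, List

# Paths that the comment above says must be tmpfs-backed in every container.
REQUIRED_PATHS = ("/dev/shm", "/tmp")


def missing_tmpfs_mounts(pod_spec: Dict) -> Dict[str, List[str]]:
    """Map container name -> required paths lacking a memory-backed mount."""
    # Collect names of volumes declared as emptyDir with medium "Memory".
    memory_volumes = {
        volume["name"]
        for volume in pod_spec.get("volumes", [])
        if (volume.get("emptyDir") or {}).get("medium") == "Memory"
    }
    problems: Dict[str, List[str]] = {}
    for container in pod_spec.get("containers", []):
        mounted = {
            mount["mountPath"]
            for mount in container.get("volumeMounts", [])
            if mount["name"] in memory_volumes
        }
        missing = [path for path in REQUIRED_PATHS if path not in mounted]
        if missing:
            problems[container["name"]] = missing
    return problems
```

an empty result means every container in the pod has both mounts; a check like this could fail CI for any deployment where the result is non-empty.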

5 participants