Reduce number of layers for Beam container images #38440

Merged: Abacn merged 1 commit into apache:master from Abacn:fewer-layer on May 11, 2026

@Abacn Abacn commented May 11, 2026

Reduced the number of layers in the Java container image from 23 to 14 and in the Go container image from 17 to 4.

This mitigates HTTP 429 errors on Artifact Registry GET requests when images are pulled frequently. The number of requests per pull is proportional to the number of container layers.

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request optimizes the Docker container images for both Java and Go SDKs by reducing the total number of layers. By consolidating build instructions and squashing base images, the changes aim to decrease the frequency of HTTP 429 errors encountered when pulling images from artifact registries, thereby improving deployment reliability.

Highlights

  • Go Container Optimization: Squashed the distroless base image into a single layer using a multi-stage build with 'scratch' to reduce the total layer count from 17 to 4.
  • Java Container Optimization: Consolidated multiple 'ADD' and 'COPY' instructions into fewer layers to reduce the total layer count from 23 to 14.
  • Performance Impact: Reduced the number of container layers to mitigate HTTP 429 errors from artifact registries caused by excessive image pull requests.
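
As a rough illustration of the squashing technique named in the highlights (a sketch, not the exact Dockerfile from this PR), a multi-stage build can flatten a multi-layer base image by copying its whole filesystem into `scratch` with a single COPY. The base image name, tag, and stage name below are assumptions chosen for the example.

```dockerfile
# Stage 1: pull the multi-layer base image. The image name and tag are
# assumptions for this sketch, not taken from the PR.
FROM gcr.io/distroless/base-debian12:latest AS base

# Stage 2: start from the empty image and copy the entire base filesystem
# in one instruction, so all base layers collapse into a single layer.
FROM scratch
COPY --from=base / /

# Image config (ENV, USER, WORKDIR, ENTRYPOINT) from the base stage is not
# carried over by COPY --from, so anything the container relies on has to
# be declared again here.
```

The trade-off is that the squashed layer can no longer be shared or cached with other images built on the same base.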

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request optimizes the Dockerfiles for the Go and Java SDKs by squashing layers and consolidating file copy operations. Feedback identifies a critical syntax error in the Java Dockerfile where an inline comment within a multi-line COPY command would break the build. Additionally, it is recommended to use wildcards for optional jars to ensure compatibility across Java versions and to prefer COPY over ADD for local files to align with Docker best practices.
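
The comment-placement concern mentioned above can be illustrated with a short sketch. The file names are hypothetical and this is not the code from the PR; the behavior shown follows the Dockerfile reference, where only lines that begin with `#` are treated as comments.

```dockerfile
# Works: a full-line comment between continuation lines is removed
# before the instruction is parsed.
COPY target/first.jar \
# hypothetical second artifact
     target/second.jar \
     /opt/apache/beam/jars/

# Breaks (left commented out): with text after the backslash, the backslash
# is no longer at the end of the line, so the continuation is lost, and a
# '#' that is not at the start of a line is treated as an argument.
# COPY target/first.jar \  # trailing comment
#      target/second.jar /opt/apache/beam/jars/
```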

Comment thread sdks/java/container/Dockerfile
/opt/apache/beam/jars/

# Built jars
COPY target/open-module-agent.jar target/beam-sdks-java-harness.jar /opt/apache/beam/jars/

medium

The open-module-agent.jar is only present in Java 9+ builds. Using a specific filename in COPY will cause the build to fail if the file is missing (e.g., in Java 8 environments). Using a wildcard (e.g., target/open-module-agent.jar*) makes this file optional and ensures the build remains robust across different Java versions.

COPY target/open-module-agent.jar* target/beam-sdks-java-harness.jar /opt/apache/beam/jars/

COPY target/${TARGETOS}_${TARGETARCH}/boot target/LICENSE target/NOTICE /opt/apache/beam/

# copy third party licenses
ADD target/third_party_licenses /opt/apache/beam/third_party_licenses/

medium

For local files and directories, COPY is preferred over ADD unless you specifically need the auto-extraction feature of ADD (e.g., for tarballs). Using COPY is more explicit and follows Docker best practices, consistent with other changes in this pull request.

COPY target/third_party_licenses /opt/apache/beam/third_party_licenses/

ARG TARGETOS
ARG TARGETARCH

ADD target/${TARGETOS}_${TARGETARCH}/boot /opt/apache/beam/

medium

Use COPY instead of ADD for local files when no extraction is required. This is a Docker best practice and maintains consistency with the improvements made in the Java Dockerfile.

COPY target/${TARGETOS}_${TARGETARCH}/boot /opt/apache/beam/

Abacn commented May 11, 2026

R: @shunping

@github-actions

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control. If you'd like to restart, comment "assign set of reviewers".

@shunping shunping left a comment

LGTM

@Abacn Abacn merged commit 13bbd5c into apache:master May 11, 2026
20 of 21 checks passed
@Abacn Abacn deleted the fewer-layer branch May 11, 2026 16:42
aIbrahiim pushed a commit to aIbrahiim/beam that referenced this pull request May 12, 2026