HF-PT-2.9: patch safety, pip check, and ECR scan failures #5837

Closed
Eren-Jeager123 wants to merge 7 commits into aws:master from Eren-Jeager123:kevin-dev

@Eren-Jeager123
Contributor

Purpose

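Patch the safety, pip check, and ECR scan failures for HF-PT-2.9, covering package version checks and CVEs. Findings that have no usable fix are allowlisted, with justifications below.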

Justifications for allowlists

1. gradio (safety ID 72086), spec >=0. The advisory itself states that "the supplier disputes this because the report is about a user attacking himself." The spec flags every gradio version (>=0), so there is no version to upgrade to and no fix exists. The allowlist is justified.

2. onnx (safety ID 89485), spec <=1.20.1. The latest onnx release on PyPI is 1.20.1, and the vulnerability spec covers every version up to and including it. No patched version is available, so the allowlist is justified.

3. onnx CVE-2026-28500 (ECR scan). Same package and same situation as (2). PyPI confirms 1.20.1 is the latest release, and the fixed_in field in the vulnerability data is empty ([]). No fix exists.

4. sagemaker (safety ID 88445), spec <3.4.0. The latest sagemaker on PyPI is 3.7.0, so fixed versions (3.4.0 and later) do exist. However, sagemaker 3.x is a complete rewrite that drops Estimator, Model, Predictor, and all their subclasses. This is a training container that relies on the sagemaker 2.x API, so upgrading to 3.x would break the container's core functionality. The allowlist is justified because the fix requires a breaking major-version change that is incompatible with this image.

5. sagemaker/protobuf pip check exception. This is a direct consequence of (4). We need protobuf>=6.33.5 to fix CVE-2026-0994, but sagemaker 2.257.0 caps protobuf at <6.32. Since we cannot upgrade sagemaker to 3.x (see above) and cannot drop the protobuf CVE fix, the pip check conflict is unavoidable. The exception is justified.
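The version arguments in the justifications above can be checked with a small sketch. It assumes plain dotted-integer versions and single-clause specs, which is a simplification of real PEP 440 handling (a real check would more likely use the `packaging` library). The helper names `parse_version`, `satisfies`, and `ranges_overlap` are hypothetical and exist only for this illustration.

```python
import operator

# Comparison operators supported by this simplified sketch.
_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_version(text):
    """Turn '6.33.5' into (6, 33, 5), padding short versions with zeros."""
    parts = [int(p) for p in text.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def satisfies(version, spec):
    """Return True if `version` matches a single-clause spec such as '<6.32'."""
    # Two-character operators are tried first so '>=' is not read as '>'.
    for symbol in (">=", "<=", "==", ">", "<"):
        if spec.startswith(symbol):
            bound = parse_version(spec[len(symbol):])
            return _OPS[symbol](parse_version(version), bound)
    raise ValueError(f"unsupported spec: {spec!r}")


def ranges_overlap(lower_inclusive, upper_exclusive):
    """Return True if some version v has lower_inclusive <= v < upper_exclusive."""
    return parse_version(lower_inclusive) < parse_version(upper_exclusive)


if __name__ == "__main__":
    # (1) A spec of >=0 matches every version, so no upgrade escapes it.
    print(satisfies("1.20.1", ">=0"))
    # (2) The latest onnx release is still inside the affected range.
    print(satisfies("1.20.1", "<=1.20.1"))
    # (4) sagemaker 2.x is affected; 3.7.0 is outside the affected range.
    print(satisfies("2.257.0", "<3.4.0"), satisfies("3.7.0", "<3.4.0"))
    # (5) protobuf>=6.33.5 and protobuf<6.32 share no version.
    print(ranges_overlap("6.33.5", "6.32"))
```

The last check is the pip check conflict from (5): the lower bound required by the CVE fix already sits above the cap imposed by sagemaker 2.257.0, so no protobuf version can satisfy both.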


Toggle if you are merging into master Branch

By default, docker image builds and tests are disabled. Two ways to run builds and tests:

  1. Using dlc_developer_config.toml
  2. Using this PR description (currently only supported for PyTorch, TensorFlow, vllm, and base images)

How to use the helper utility for updating dlc_developer_config.toml

Assuming your remote is called origin (you can find out more with git remote -v)...

  • Run default builds and tests for a particular buildspec (this also commits and pushes the changes to the remote). Example:

python src/prepare_dlc_dev_environment.py -b </path/to/buildspec.yml> -cp origin

  • Enable specific tests for a buildspec or set of buildspecs (this also commits and pushes the changes to the remote). Example:

python src/prepare_dlc_dev_environment.py -b </path/to/buildspec.yml> -t sanity_tests -cp origin

  • Restore TOML file when ready to merge

python src/prepare_dlc_dev_environment.py -rcp origin

NOTE: If you are creating a PR for a new framework version, please ensure success of the local, standard, rc, and efa sagemaker tests by updating the dlc_developer_config.toml file:

  • sagemaker_remote_tests = true
  • sagemaker_efa_tests = true
  • sagemaker_rc_tests = true
  • sagemaker_local_tests = true

How to use PR description

Use the code block below to uncomment commands and run the PR CodeBuild jobs. There are two commands available:
  • # /buildspec <buildspec_path>
    • e.g.: # /buildspec pytorch/training/buildspec.yml
    • If this line is commented out, dlc_developer_config.toml will be used.
  • # /tests <test_list>
    • e.g.: # /tests sanity security ec2
    • If this line is commented out, it will run the default set of tests (same as the defaults in dlc_developer_config.toml): sanity, security, ec2, ecs, eks, sagemaker, sagemaker-local.
# /buildspec <buildspec_path>
# /tests <test_list>
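As an illustration of the rules described above, here is a minimal sketch of how the two commands could be read from a PR description. This is not the repository's actual parser: the function name `parse_pr_commands` and the constant `DEFAULT_TESTS` are hypothetical, and the sketch only assumes what the template states, namely that a commented-out line is ignored and the defaults apply.

```python
# Default test list as stated in the PR template (hypothetical constant name).
DEFAULT_TESTS = [
    "sanity", "security", "ec2", "ecs", "eks", "sagemaker", "sagemaker-local",
]


def parse_pr_commands(description):
    """Return (buildspec, tests) read from a PR description.

    buildspec is None when the /buildspec line is commented out or absent,
    meaning dlc_developer_config.toml would be used instead.
    """
    buildspec = None
    tests = list(DEFAULT_TESTS)
    for raw_line in description.splitlines():
        line = raw_line.strip()
        if line.startswith("#"):
            continue  # commented-out command: keep the default
        if line.startswith("/buildspec "):
            buildspec = line.split(maxsplit=1)[1]
        elif line.startswith("/tests "):
            tests = line.split()[1:]
    return buildspec, tests


if __name__ == "__main__":
    commented = "# /buildspec <buildspec_path>\n# /tests <test_list>"
    enabled = "/buildspec pytorch/training/buildspec.yml\n/tests sanity security ec2"
    print(parse_pr_commands(commented))
    print(parse_pr_commands(enabled))
```

With both lines commented out, as in this PR, the sketch returns no buildspec and the default test list, which matches the fallback behavior the template describes.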
Toggle if you are merging into main Branch

PR Checklist

  • [] I ran pre-commit run --all-files locally before creating this PR. (Read DEVELOPMENT.md for details).

@Eren-Jeager123 requested a review from a team as a code owner March 26, 2026 21:47
@aws-deep-learning-containers-ci (bot) added labels Mar 26, 2026: authorized, build (reflects file change in build folder), huggingface (reflects file change in huggingface folder), sanity, Size:M, test (reflects file change in test folder)
@Eren-Jeager123 marked this pull request as draft March 27, 2026 22:13