feat: Add support for Python package ecosystem #2164
Draft
EyeCantCU wants to merge 12 commits into chainguard-dev:main from
Add a declarative ecosystem package system that allows installing packages from non-APK ecosystems (starting with Python/PyPI) directly into OCI images without shelling out to pip or any other tool. Packages are resolved via the PEP 503 Simple Repository API, downloaded as wheels, and extracted directly into the filesystem. The new `ecosystems.python` config block supports custom indexes, version constraints, and auto-detection of the installed Python version.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
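For readers unfamiliar with PEP 503, the lookup step can be sketched as follows. This is a minimal illustration, not the code in this PR: `normalizeName` and `simpleURL` are hypothetical helper names, and the only behavior assumed is the PEP 503 rule that runs of `-`, `_`, and `.` collapse to a single `-` and the name is lowercased before it is appended to the index URL.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// sepRun matches runs of the separator characters that PEP 503 treats as equivalent.
var sepRun = regexp.MustCompile(`[-_.]+`)

// normalizeName applies the PEP 503 normalization rule to a project name.
// (Hypothetical helper name; not taken from the PR.)
func normalizeName(name string) string {
	return strings.ToLower(sepRun.ReplaceAllString(name, "-"))
}

// simpleURL builds the per-project page URL under a Simple Repository index.
func simpleURL(index, name string) string {
	return strings.TrimRight(index, "/") + "/" + normalizeName(name) + "/"
}

func main() {
	fmt.Println(simpleURL("https://pypi.org/simple", "Typing_Extensions"))
}
```

Normalizing before the request means a config entry spelled `Typing_Extensions` and one spelled `typing-extensions` resolve to the same project page.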
Use the PyPI JSON API (pypi.org/pypi/{name}/{version}/json) to resolve
packages and discover transitive dependencies, instead of downloading
entire wheels just to read their METADATA files. The JSON API returns
clean requires_dist lists and wheel URLs with checksums in a single
request.
Falls back to the PEP 503 Simple API for non-PyPI indexes (private
registries), though without transitive resolution in that case.
Also adds environment marker evaluation (extra, os_name, sys_platform,
etc.) to correctly filter conditional dependencies, and pre-release
filtering to avoid resolving alpha/beta/rc versions unless pinned.
Tested with torch==2.6.0 which correctly resolves all 24 transitive
dependencies automatically.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rename the package directory and Go package from "pip" to "python" to match the ecosystem name used in YAML config. Update all import paths and log messages accordingly.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove readMetadata and parseRequiresDist, which are no longer used after switching to the PyPI JSON API for dependency discovery.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When `venv` is set in the python ecosystem config, packages are installed into a virtual environment with proper pyvenv.cfg and bin/python symlinks. The image environment is automatically configured with VIRTUAL_ENV and PATH prepended with the venv bin directory.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… dep resolution
Write SPDX 2.3 SBOMs into dist-info/sboms/sbom.spdx.json for Chainguard-sourced packages, enabling `chainctl libraries verify` to confirm provenance. Parse data-provenance and data-signature attributes from Simple API HTML and thread them through to ResolvedPackage. Add transitive dependency resolution for non-PyPI indexes by downloading wheels and parsing METADATA for Requires-Dist entries.
Also fixes an off-by-one bug in parseSimpleIndex tag extraction that caused data-requires-python (and provenance/signature) attributes to be attributed to the wrong link.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extend the existing origin-based layering strategy to support ecosystem packages (e.g. Python pip packages) as separate layers, without treating them as APK packages.
The approach generalizes file ownership in the filesystem to a string-based "owner" concept. APK files get their owner from the existing tar entry package metadata; ecosystem files get tagged via SetCurrentOwner during installation. The splitLayers function routes files to layers using the Owner() interface, which works for both.
Key changes:
- tarfs: Add owner field to nodes, SetCurrentOwner/OwnerSize on memFS, Owner() method on memFileInfo that returns APK pkg name or ecosystem owner
- ecosystem: Add OwnerTagger interface, OwnerName() on ResolvedPackage, InstalledSize populated after install. Installers tag files themselves.
- layers: Generalize group to carry owners[] alongside pkgs[]. Factor groupByOriginAndSize into groupAPKByOrigin + applyBudget so ecosystem groups participate in the shared budget without APK-specific logic.
- python installer: Tags files per-package around wheel extraction
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix appendAssign: use apkGroups directly instead of allGroups
- Replace map loop with maps.Copy
- Fix import ordering
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Both ecosystem callers were passing nil for the authenticator, meaning private Python indexes requiring authentication would fail. Use bc.o.Auth (from options) in the build path and auth.DefaultAuthenticators in the lock path, matching how APK repository auth is handled.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: RJ Sampson <rj.sampson@chainguard.dev>
Test data had wheel URLs pointing to files.example.com, causing DNS lookups that Harden-Runner blocks. Rewrite all test URLs to point back to the httptest server, which also serves dummy wheel responses for dependency extraction.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Stop hardcoding platform tag lists per architecture. Instead, parse wheel platform tags dynamically by checking the machine suffix and prefix (musllinux_, manylinux, linux_). Detect the image libc from /etc/os-release (ID=alpine → musl, otherwise glibc) and only accept wheels matching the correct libc. Replace the scoring system with simple binary-over-pure-python preference.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
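A sketch of the tag check described above, under stated assumptions: `detectLibc`, `tagLibc`, and `acceptsWheel` are hypothetical names, a bare `linux_<machine>` tag is treated as compatible with either libc (an assumption, since the tag does not name one), and `any` marks a pure-Python wheel.

```go
package main

import (
	"fmt"
	"strings"
)

// detectLibc reads os-release content: ID=alpine means musl, anything else glibc.
func detectLibc(osRelease string) string {
	for _, line := range strings.Split(osRelease, "\n") {
		if v, ok := strings.CutPrefix(strings.TrimSpace(line), "ID="); ok {
			if strings.Trim(v, `"'`) == "alpine" {
				return "musl"
			}
			return "glibc"
		}
	}
	return "glibc"
}

// tagLibc classifies one platform tag for the given machine (e.g. "x86_64").
// An empty libc with ok == true means the tag does not name a libc.
func tagLibc(tag, machine string) (libc string, ok bool) {
	if !strings.HasSuffix(tag, "_"+machine) {
		return "", false
	}
	switch {
	case strings.HasPrefix(tag, "musllinux_"):
		return "musl", true
	case strings.HasPrefix(tag, "manylinux"):
		return "glibc", true
	case strings.HasPrefix(tag, "linux_"):
		return "", true
	}
	return "", false
}

// acceptsWheel checks a (possibly dot-compressed) platform tag set against the image.
// binary is true for platform-specific wheels, which are preferred over pure Python.
func acceptsWheel(platformTag, machine, libc string) (ok, binary bool) {
	for _, tag := range strings.Split(platformTag, ".") {
		if tag == "any" {
			return true, false
		}
		if l, match := tagLibc(tag, machine); match && (l == "" || l == libc) {
			return true, true
		}
	}
	return false, false
}

func main() {
	libc := detectLibc("NAME=\"Alpine Linux\"\nID=alpine\n")
	ok, binary := acceptsWheel("musllinux_1_2_x86_64", "x86_64", libc)
	fmt.Println(libc, ok, binary)
}
```

With only two outcomes per wheel (binary or pure Python), the preference reduces to picking a matching binary wheel when one exists and falling back to `any` otherwise.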
And standardize introduction of other ecosystems