feat: multi-Python worker images with startup version check (AE-2827) (#89)
* feat: multi-Python worker images with startup version check (AE-2827)
Add Python 3.10 and 3.11 support to GPU worker images via side-by-side
torch install in the existing runpod/pytorch base. 3.12 keeps the fast
path (torch pre-installed) to avoid the ~7 GB reinstall cost on hot
deployments; 3.10/3.11 images pay that cost once per cold start per DC.
Sibling to flash#322, which landed the SDK-level plumbing. Tags follow
the same ``py${VERSION}-${TAG}`` scheme already in use for CPU images.
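As a rough illustration of the tag scheme, the sketch below composes tags for the three Python versions named in this commit. The helper names (`image_tag`, `tags_for_release`) and the example tag value are hypothetical and do not come from the repository.

```python
# Hypothetical helpers illustrating the py${VERSION}-${TAG} scheme.
# The Python versions are the ones listed in this commit; everything
# else (function names, example tag) is an assumption.
PYTHON_VERSIONS = ("3.10", "3.11", "3.12")


def image_tag(python_version: str, tag: str) -> str:
    """Compose a single image tag, e.g. VERSION=3.11, TAG=latest."""
    return f"py{python_version}-{tag}"


def tags_for_release(tag: str) -> list[str]:
    """Compose one tag per supported Python version."""
    return [image_tag(version, tag) for version in PYTHON_VERSIONS]


if __name__ == "__main__":
    for name in tags_for_release("latest"):
        print(name)
```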
- Dockerfile / Dockerfile-lb (GPU): accept PYTHON_VERSION build arg;
  install torch from download.pytorch.org/whl/cu128 and repoint
  /usr/local/bin/python for non-3.12 targets; validate that the
  interpreter matches the arg during build.
- Dockerfile-cpu / Dockerfile-lb-cpu (CPU): surface PYTHON_VERSION at
  runtime via FLASH_PYTHON_VERSION env so the worker's startup check
  can read it.
- src/version.py: new ``assert_python_version_matches_image``, which
  raises PythonVersionMismatchError at handler boot when
  ``sys.version_info`` disagrees with the image's stamped
  FLASH_PYTHON_VERSION. The mismatch is caught before user code runs;
  the check is skipped when the env var is unset (local dev).
- src/handler.py / src/lb_handler.py: call the assertion immediately
  after logging setup, before ``maybe_unpack()`` and handler import.
- tests/unit/test_version.py: 4 new cases covering env-unset skip,
  match, mismatch raise, and message contents.
- tests/unit/test_lb_handler.py: extend the mocked ``version`` module
  with ``assert_python_version_matches_image`` so fresh-import tests
  don't break.
- .github/workflows/ci.yml: expand CI to build GPU and LB images
  across {3.10, 3.11, 3.12}; align prod CPU and LB-CPU default to
  3.12 (matches flash's DEFAULT_PYTHON_VERSION).
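The startup check described for src/version.py might look roughly like the sketch below. Only the function name, the exception name, and the FLASH_PYTHON_VERSION variable come from this commit message. The optional `env` and `version_info` parameters, the major.minor comparison, and the error wording are assumptions made so the example is self-contained and testable.

```python
import os
import sys


class PythonVersionMismatchError(RuntimeError):
    """Raised when the interpreter differs from the image's stamped version."""


def assert_python_version_matches_image(env=None, version_info=None):
    """Fail fast if the running Python is not the one the image was built for.

    `env` and `version_info` are hypothetical injection points for tests;
    by default the real environment and interpreter are inspected.
    """
    env = os.environ if env is None else env
    version_info = sys.version_info if version_info is None else version_info

    expected = env.get("FLASH_PYTHON_VERSION")
    if not expected:
        # Env var unset (e.g. local dev): skip the check.
        return

    # Assumption: the image stamps a "major.minor" string such as "3.11".
    actual = f"{version_info[0]}.{version_info[1]}"
    if actual != expected.strip():
        raise PythonVersionMismatchError(
            f"Image was built for Python {expected.strip()}, "
            f"but the worker is running Python {actual}."
        )
```

A handler entry point would call this once, right after logging is configured and before any user code is unpacked or imported, so a mismatch surfaces as a single clear error rather than an obscure import failure later.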
* fix(dockerfile): bootstrap pip via get-pip.py for non-3.12 GPU builds
Ubuntu 22.04's system python3.10 has ensurepip disabled by Debian
policy, which broke the side-by-side torch install for 3.10 GPU images
(CI: docker-test-gpu (3.10), docker-test-lb (3.10)). python3.11 is a
separate interpreter without that restriction, so only 3.10 was affected.
Use urllib + get-pip.py instead of ensurepip: this works for any
interpreter regardless of distro patching, and urllib is stdlib, so no
curl dependency is needed.
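A minimal sketch of the urllib + get-pip.py approach follows. It is not the Dockerfile's actual command: the `bootstrap_pip` helper and its `fetch`/`run` parameters are hypothetical, and the URL is the standard PyPA bootstrap location, assumed here rather than taken from the commit.

```python
import subprocess
import tempfile
import urllib.request
from pathlib import Path

# Standard PyPA bootstrap script location (assumed; not quoted in the commit).
GET_PIP_URL = "https://bootstrap.pypa.io/get-pip.py"


def bootstrap_pip(python, url=GET_PIP_URL,
                  fetch=urllib.request.urlretrieve,
                  run=subprocess.check_call):
    """Install pip for `python` without relying on ensurepip or curl.

    `fetch` and `run` are hypothetical injection points so the function
    can be exercised without network access.
    """
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "get-pip.py"
        # Download get-pip.py using only the standard library.
        fetch(url, str(script))
        # Run it with the target interpreter so pip lands in that Python.
        run([python, str(script)])


if __name__ == "__main__":
    # Example: bootstrap pip for a side-by-side interpreter.
    bootstrap_pip("python3.10")
```

Because the download uses only the standard library, the same step works for any interpreter already present in the image.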
Also corrects the outdated deadsnakes comment on both Dockerfiles: the
runpod/pytorch base image layers alt-Python 3.11/3.12 on top of the
system 3.10, not via deadsnakes.