Summary
Currently, the runner uses a single, monolithic image built from ubuntu:22.04 with all tools baked in. This issue proposes adding multi-image support — the ability to define and select from multiple runner image variants based on workflow job labels, similar to how GitHub-hosted runners offer ubuntu-latest, windows-latest, etc.
Motivation
- Different workflows need different tools: A Python ML workflow needs CUDA + PyTorch, a Node.js workflow needs npm/yarn, a Go workflow needs the Go toolchain
- Image size vs. startup time trade-off: A one-size-fits-all image becomes bloated. Smaller, specialized images cold-start faster
- GitHub parity: GitHub offers ubuntu-latest, ubuntu-22.04, windows-latest. Users expect label-based image selection
- GPU workflows: CPU-only jobs shouldn't pay the cost of GPU driver libraries in the image
Proposed Approaches
Option A: Label-Based Image Selection (Recommended)
Workflows specify the desired image via runs-on labels:
jobs:
  python-job:
    runs-on: [self-hosted, modal, image:python]
  node-job:
    runs-on: [self-hosted, modal, image:node]
  ml-job:
    runs-on: [self-hosted, modal, image:ml, gpu:a100]
The webhook handler parses image:<name> from labels and selects the matching image definition.
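The parsing step could look roughly like the sketch below. It is only an illustration: `RunnerSelection` and `parse_runner_labels` are hypothetical names, and the assumption that the first `image:` or `gpu:` label wins is a choice made here, not something this issue specifies.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class RunnerSelection:
    """Hypothetical container for the options encoded in runs-on labels."""

    image: str = "default"
    gpu: str | None = None


def parse_runner_labels(labels: list[str]) -> RunnerSelection:
    """Extract image:<name> and gpu:<type> from a job's labels.

    Labels without a recognised prefix (self-hosted, modal, ...) are ignored.
    Assumption: the first occurrence of each prefix wins.
    """
    image: str | None = None
    gpu: str | None = None
    for label in labels:
        prefix, sep, value = label.partition(":")
        if not sep or not value:
            continue  # plain label such as "self-hosted", or an empty value
        if prefix == "image" and image is None:
            image = value
        elif prefix == "gpu" and gpu is None:
            gpu = value
    return RunnerSelection(image=image or "default", gpu=gpu)
```

For the `ml-job` example above, `parse_runner_labels(["self-hosted", "modal", "image:ml", "gpu:a100"])` yields `RunnerSelection(image="ml", gpu="a100")`; a job with no `image:` label falls back to `"default"`.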
Option B: Environment Variable Registry
Define available images via environment variables:
modal secret create github-secret \
  GITHUB_TOKEN=ghp_xxx \
  RUNNER_IMAGE_DEFAULT=ubuntu-22.04 \
  RUNNER_IMAGE_PYTHON=python-3.11 \
  RUNNER_IMAGE_ML=pytorch-2.3-cuda-12.1
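A minimal sketch of how the handler might read such a registry at startup follows. It assumes the secret's keys are visible as environment variables in the container and that an image's name is the lower-cased suffix after `RUNNER_IMAGE_`; `ENV_PREFIX` and `load_image_registry` are hypothetical names, not existing code.

```python
from __future__ import annotations

import os
from collections.abc import Mapping

# Assumption: every image variable shares this prefix.
ENV_PREFIX = "RUNNER_IMAGE_"


def load_image_registry(env: Mapping[str, str] | None = None) -> dict[str, str]:
    """Map lower-cased image names to the identifiers found in the environment.

    Passing `env` explicitly makes the function easy to test; by default it
    reads os.environ. Variables with an empty name or empty value are skipped.
    """
    source = os.environ if env is None else env
    return {
        key[len(ENV_PREFIX):].lower(): value
        for key, value in source.items()
        if key.startswith(ENV_PREFIX) and len(key) > len(ENV_PREFIX) and value
    }
```

With the secret shown above, this would return `{"default": "ubuntu-22.04", "python": "python-3.11", "ml": "pytorch-2.3-cuda-12.1"}`, leaving `GITHUB_TOKEN` out of the registry.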
Option C: External Registry Pull
Support pulling arbitrary images from Docker Hub / ECR / GCR at sandbox spawn time:
runs-on: [self-hosted, modal, image:docker-hub/catthehacker/ubuntu:runner-22.04]
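To make the label format concrete, here is a hedged sketch of splitting such a label into its parts. It assumes the shape `image:<registry-alias>/<repository>[:<tag>]` inferred from the single example above; `ExternalImageRef` and `parse_external_image` are hypothetical names, and how a registry alias maps to an actual host is left open.

```python
from __future__ import annotations

from typing import NamedTuple


class ExternalImageRef(NamedTuple):
    registry: str    # alias before the first "/", e.g. "docker-hub"
    repository: str  # e.g. "catthehacker/ubuntu"
    tag: str         # e.g. "runner-22.04"; "latest" when omitted


def parse_external_image(label: str) -> ExternalImageRef | None:
    """Parse image:<registry>/<repository>[:<tag>]; return None on no match."""
    prefix, sep, value = label.partition(":")
    if prefix != "image" or not sep:
        return None
    registry, slash, rest = value.partition("/")
    if not slash or not registry or not rest:
        return None  # a bare name such as image:python is not an external ref
    repository, colon, tag = rest.partition(":")
    if not repository:
        return None
    if not colon:
        tag = "latest"  # Docker's default tag when none is given
    elif not tag:
        return None  # trailing ":" with no tag
    return ExternalImageRef(registry, repository, tag)
```

Returning `None` for bare names lets the caller try the built-in registry from Option A first and treat only slash-qualified names as external pulls.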
Implementation Sketch
1. Define Image Registry in Code
# runner/services/sandbox_service.py
RUNNER_IMAGES = {
    "default": build_default_image(),  # Current ubuntu-based image
    "python": build_python_image(),    # + Python 3.10/3.11/3.12, pip, poetry
    "node": build_node_image(),        # + Node 18/20, npm, yarn, pnpm
    "go": build_go_image(),            # + Go 1.22, golangci-lint
    "rust": build_rust_image(),        # + Rust, cargo, clippy
    "ml": build_ml_image(),            # + PyTorch, CUDA, transformers
    "docker": build_docker_image(),    # Heavy Docker-in-Docker support
    "minimal": build_minimal_image(),  # Just runner binary, no extras
}
2. Image Resolution in Job Service
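# runner/services/job_service.py
import modal


def resolve_image(job_labels: list[str]) -> modal.Image:
    """Return the image named by the first image:<name> label, else the default."""
    for label in job_labels:
        if label.startswith("image:"):
            image_name = label.split(":", 1)[1]
            # Unknown names fall back to the default image
            return RUNNER_IMAGES.get(image_name, RUNNER_IMAGES["default"])
    return RUNNER_IMAGES["default"]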
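Because the lookup above falls back silently, a typo such as `image:pyhton` would run on the default image with no indication why. A possible variant that logs the fallback is sketched below. It takes the registry as a parameter and uses plain strings as stand-ins for `modal.Image` so it runs without Modal; `resolve_image_with_warning` is a hypothetical name.

```python
from __future__ import annotations

import logging
from collections.abc import Mapping
from typing import TypeVar

T = TypeVar("T")
logger = logging.getLogger(__name__)


def resolve_image_with_warning(job_labels: list[str], images: Mapping[str, T]) -> T:
    """Like resolve_image, but logs when an image:<name> label is unknown.

    `images` must contain a "default" entry. Only the first image: label is
    considered, matching the behaviour of the original sketch.
    """
    for label in job_labels:
        if label.startswith("image:"):
            name = label.split(":", 1)[1]
            if name in images:
                return images[name]
            logger.warning("Unknown runner image %r; using default", name)
            break
    return images["default"]


# Stand-in registry: strings instead of modal.Image objects.
example_images = {"default": "base-image", "python": "python-image"}
selected = resolve_image_with_warning(
    ["self-hosted", "modal", "image:python"], example_images
)
```

Whether an unknown name should fall back or fail the job outright is a design decision this issue leaves open; the warning only makes the current fallback visible.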
3. Spawn Sandbox with Selected Image
# In spawn_sandbox, accept image parameter
async def spawn_sandbox(
    app: modal.App,
    jit_config: str,
    job_id: int | str,
    image: modal.Image,  # New parameter
    gpu_config=None,
) -> modal.Sandbox:
    sandbox_kwargs = dict(
        image=image,  # Use selected image instead of global runner_image
        # ...
    )
Image Definitions (Examples)
Minimal Image
def build_minimal_image() -> modal.Image:
    return (
        modal.Image.from_registry("ubuntu:22.04")
        .apt_install("curl", "ca-certificates")
        .run_commands(
            # Create the target directory first; tar -C fails if it does not exist
            "mkdir -p /actions-runner",
            f"curl -L https://github.com/actions/runner/releases/download/v{RUNNER_VERSION}/... | tar -xz -C /actions-runner",
        )
    )
Python Image
def build_python_image() -> modal.Image:
    return (
        build_minimal_image()
        .apt_install("python3", "python3-pip", "python3-venv")
        .pip_install("poetry", "pipenv")
    )
ML Image
def build_ml_image() -> modal.Image:
    return (
        build_python_image()
        .pip_install("torch", "transformers", "datasets", "accelerate")
        .apt_install("libcuda1", "cuda-toolkit")
    )
Trade-offs to Consider
| Aspect | Single Image (Current) | Multi-Image |
| --- | --- | --- |
| Cold start | Consistent | Varies by image size |
| Cache efficiency | One cache entry | Multiple cache entries |
| Maintenance | One image to update | N images to update |
| Disk usage | One image stored | N images stored |
| User flexibility | Low | High |
| Complexity | Low | Medium |
Next Steps
Labels: enhancement, feature-request, architecture
Priority: P3 (nice-to-have, future roadmap)