Commit 7d8cb13

Overhaul docs: README rewrite, DEPLOY troubleshooting, CHANGELOG
README: badges, 30-sec quickstart, feature/comparison tables, GPU and network docs, star CTA. DEPLOY: troubleshooting section for 6 common issues. CHANGELOG: v0.1.0 through v0.3.0.
1 parent 3ee64a6 commit 7d8cb13

3 files changed

Lines changed: 195 additions & 31 deletions

CHANGELOG.md

Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
1+
# Changelog
2+
3+
All notable changes to this project will be documented in this file.
4+
5+
## [2025-04-25] - v0.3.0
6+
7+
### Added
8+
- Per-job GPU support via workflow labels: `gpu:t4`, `gpu:l4`, `gpu:a100`, `gpu:a100-80gb`, `gpu:h100`
9+
- Network isolation via `ALLOWED_CIDRS` environment variable
10+
- Full network blocking via `BLOCK_NETWORK` environment variable
11+
- Persistent cache volume support via `CACHE_VOLUME_NAME` environment variable
12+
13+
## [2025-04-25] - v0.2.0
14+
15+
### Added
16+
- Structured JSON logging with contextual fields (job_id, repo, duration, status, error_code)
17+
- Health check endpoint (`GET /health`)
18+
- Retry logic for GitHub API calls with exponential backoff (1s/2s/4s, 3 attempts, 429/5xx)
19+
- Hardened payload validation with specific 400 error messages
20+
- Per-repo concurrency limits via `MAX_CONCURRENT_PER_REPO` environment variable
21+
- CI linting with ruff and pyproject.toml config
22+
- Unit test suite (23 tests) with pytest and pytest-asyncio
23+
- `tests/**` added to CI path triggers
24+
25+
### Changed
26+
- Optimized runner image: removed unused packages (net-tools, sudo, jq), reordered layers for better caching
27+
- All logger calls updated from f-strings to structured `extra={}` pattern
28+
29+
## [2025-04-24] - v0.1.0
30+
31+
### Added
32+
- Initial release: Modal-powered ephemeral GitHub Actions runner
33+
- Docker-in-Sandbox support for container-based workflows
34+
- HMAC-SHA256 webhook signature verification
35+
- Repository allowlist via `ALLOWED_REPOS`
36+
- JIT runner registration for single-use tokens
37+
- Replay protection via delivery ID cache
38+
- Job deduplication cache
39+
- Configurable runner version, group ID, and labels
40+
- GitHub Enterprise domain support
41+
- Body size limit (1MB)
42+
- Automated CI/CD via GitHub Actions

DEPLOY.md

Lines changed: 26 additions & 0 deletions
@@ -101,6 +101,32 @@ This guide outlines the steps to deploy this project using Modal.
101101
102102
Every time a job is queued, Modal will spawn an ephemeral sandbox that runs the job and then exits. This ensures a clean and isolated environment for each job execution. The webhook is secured using HMAC-SHA256 signature verification.
103103
104+
### Troubleshooting
105+
106+
**Webhook signature verification fails (403)**
107+
108+
`WEBHOOK_SECRET` is mismatched between the Modal secret and GitHub webhook settings. Make sure the same secret value is used in both `modal secret create` and the GitHub webhook configuration.
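
To confirm that a secret value matches what GitHub signs with, the signature can be recomputed locally. The sketch below assumes the standard GitHub scheme, where the `X-Hub-Signature-256` header carries `sha256=` followed by the hex HMAC-SHA256 of the raw request body. The function names are illustrative and are not taken from `app.py`.

```python
import hashlib
import hmac


def sign(secret: str, body: bytes) -> str:
    """Build the header value GitHub would send for this body and secret."""
    digest = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def verify_signature(secret: str, body: bytes, signature_header: str) -> bool:
    """Return True when the header matches the HMAC-SHA256 of the raw body."""
    expected = sign(secret, body)
    # compare_digest runs in constant time, so it does not leak where
    # the two values first differ.
    return hmac.compare_digest(expected.encode(), signature_header.encode())
```

If `verify_signature` returns `False` for a payload copied from the webhook's recent deliveries, the two secrets differ. Always hash the raw bytes of the request, not a re-serialized JSON object.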
109+
110+
**JIT token generation fails (401/403)**
111+
112+
`GITHUB_TOKEN` lacks the required permissions or has expired. Use a fine-grained PAT with `Actions: Read and Write` and `Administration: Read and Write` permissions.
113+
114+
**Sandbox spawn fails with timeout**
115+
116+
The runner image build can take a while on first deploy, or the Docker-in-Sandbox setup may have failed. Check the Modal logs for build errors, and confirm that `MODAL_IMAGE_BUILDER_VERSION=2025.06` is in effect (`app.py` sets it automatically).
117+
118+
**Jobs stuck in queued state**
119+
120+
The webhook is not reaching the endpoint, or the `modal` label is missing from the workflow. Verify the webhook URL is correct and publicly accessible. Check that your workflow file has `runs-on: [self-hosted, modal]`.
121+
122+
**GPU jobs fail to start**
123+
124+
Either an invalid GPU label was used, or the GPU quota on your Modal account has been exceeded. Use one of the valid labels: `gpu:t4`, `gpu:l4`, `gpu:a100`, `gpu:a100-80gb`, `gpu:h100`. Check Modal GPU availability in your account.
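
A label check along these lines can catch typos before a job is submitted. This is a minimal sketch, not the project's parser: `SUPPORTED_GPUS` mirrors the list above, `gpu_from_labels` is a hypothetical helper, and lowercasing the label is an assumption about how matching works.

```python
from typing import Optional

# Mirrors the valid labels listed above, without the "gpu:" prefix.
SUPPORTED_GPUS = {"t4", "l4", "a100", "a100-80gb", "h100"}


def gpu_from_labels(labels: list[str]) -> Optional[str]:
    """Return the GPU type requested in runs-on labels, or None if absent."""
    for label in labels:
        prefix, sep, name = label.partition(":")
        if sep and prefix.lower() == "gpu":
            name = name.lower()  # assumption: matching is case-insensitive
            if name not in SUPPORTED_GPUS:
                raise ValueError(f"unsupported GPU label: {label!r}")
            return name
    return None
```

For example, `gpu_from_labels(["self-hosted", "modal", "gpu:a100"])` yields `"a100"`, while a label such as `gpu:v100` raises `ValueError`.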
125+
126+
**Duplicate delivery warnings in logs**
127+
128+
The same delivery ID can arrive more than once, for example when a delivery is redelivered manually or through the API. This is normal. The deduplication cache handles it automatically, so no action is needed.
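
For readers curious how such a cache can work, here is a minimal in-memory sketch keyed on the `X-GitHub-Delivery` header value. The class name, the one-hour TTL, and the injectable clock are assumptions for illustration. The project's real cache may be structured differently.

```python
import time


class DeliveryCache:
    """Remember delivery IDs for a limited time and flag repeats."""

    def __init__(self, ttl_seconds: float = 3600.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._seen: dict[str, float] = {}

    def seen_before(self, delivery_id: str) -> bool:
        """Return True for a repeat; record the ID and return False otherwise."""
        now = self._clock()
        # Drop entries older than the TTL so the cache stays bounded.
        self._seen = {k: t for k, t in self._seen.items() if now - t < self._ttl}
        if delivery_id in self._seen:
            return True
        self._seen[delivery_id] = now
        return False
```

A handler would call `seen_before` with the delivery ID and return early, logging a warning, when it gets `True`.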
129+
104130
### Environment Variables Reference
105131
106132
| Variable | Required | Default | Description |

README.md

Lines changed: 127 additions & 31 deletions
@@ -1,21 +1,111 @@
11
# Modal GitHub Runner
22

3-
[![Modal](https://img.shields.io/badge/Powered%20By-Modal-000000?style=flat-square&logo=modal&logoColor=white)](https://modal.com)
4-
[![GitHub Actions](https://img.shields.io/badge/GitHub%20Actions-Runner-2088FF?style=flat-square&logo=github-actions&logoColor=white)](https://github.com/features/actions)
5-
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg?style=flat-square)](https://opensource.org/licenses/MIT)
3+
![CI](https://github.com/manascb1344/modal-github-runner/actions/workflows/ci-cd.yml/badge.svg)
4+
![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)
5+
![Modal](https://img.shields.io/badge/Powered%20By-Modal-000000?style=flat-square&logo=modal&logoColor=white)
6+
![GitHub Actions](https://img.shields.io/badge/GitHub%20Actions-Runner-2088FF?style=flat-square&logo=github-actions&logoColor=white)
67

7-
A high-performance, ephemeral self-hosted GitHub Actions runner powered by [**Modal**](https://modal.com). Achieve zero idle costs and instant horizontal scaling with Just-In-Time (JIT) security.
8+
![Star History Chart](https://api.star-history.com/svg?repos=manascb1344/modal-github-runner&type=Date)
89

9-
## 🚀 Key Features
10+
An ephemeral, self-hosted GitHub Actions runner built on [Modal](https://modal.com). Each job runs in a fresh sandbox, you pay only while jobs execute, and credentials are scoped to a single use.
1011

11-
- **⚡ Ephemeral:** Every job runs in a fresh, hardware-isolated Modal Sandbox, ensuring a clean state and preventing side effects between runs.
12-
- **💰 Zero Idle Cost:** No long-running servers or "warm" instances. You only pay for the exact seconds your runner is executing jobs.
13-
- **🛡️ JIT Security:** Utilizes GitHub's Just-In-Time runner registration. Runners are created on-demand and automatically cleaned up by GitHub after a single use.
14-
- **📈 Horizontal Scaling:** Modal's serverless infrastructure allows you to scale to hundreds of concurrent runners instantly. Each job gets its own dedicated resources without queueing delays.
12+
## 30-second quickstart
1513

16-
## 🏗️ Architecture
14+
```bash
15+
# 1. Create a Modal secret with your GitHub PAT and webhook secret
16+
modal secret create github-secret \
17+
GITHUB_TOKEN=ghp_xxx \
18+
WEBHOOK_SECRET=$(openssl rand -hex 32)
1719

18-
The runner follows a reactive, event-driven flow:
20+
# 2. Deploy the runner
21+
modal deploy app.py
22+
23+
# 3. Add a webhook in your GitHub repo settings
24+
# Point it at the URL modal deploy prints, content type JSON,
25+
# secret = the WEBHOOK_SECRET from step 1, events = "Workflow jobs"
26+
```
27+
28+
Then update your workflow file:
29+
30+
```yaml
31+
runs-on: [self-hosted, modal]
32+
```
33+
34+
Full deployment walkthrough: [DEPLOY.md](DEPLOY.md).
35+
36+
## Features
37+
38+
| Feature | Description |
39+
|---------|-------------|
40+
| Ephemeral | Fresh Modal Sandbox per job. No state leaks between runs. |
41+
| Zero idle cost | No long-running servers. You pay for the seconds a job runs, nothing else. |
42+
| JIT security | Single-use runner tokens via GitHub's `generate-jitconfig` API. |
43+
| GPU support | T4, L4, A100, A100-80GB, H100 through workflow labels. |
44+
| Network isolation | Outbound CIDR allowlists or full network blocking. |
45+
| Cache volumes | Persistent `/cache` mount across jobs via Modal Volumes. |
46+
| Auto-retry | Exponential backoff (1s, 2s, 4s) on transient GitHub API failures. |
47+
| Structured logging | JSON logs with job ID, repo, duration, and error context. |
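
The auto-retry row can be illustrated with a small sketch. It assumes one reading of the changelog entry: up to three retries after the initial call, waiting 1s, 2s, then 4s, and retrying only on HTTP 429 or 5xx. `send` is a hypothetical callable that returns a `(status, body)` tuple. None of these names come from `app.py`.

```python
import time
from typing import Callable, Tuple


def is_retryable(status: int) -> bool:
    """Rate limiting (429) and server errors (5xx) are treated as transient."""
    return status == 429 or 500 <= status <= 599


def call_with_retry(
    send: Callable[[], Tuple[int, str]],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send(), backing off 1s, 2s, 4s between retryable failures."""
    for attempt in range(max_retries + 1):
        status, body = send()
        if not is_retryable(status) or attempt == max_retries:
            return status, body
        sleep(base_delay * 2 ** attempt)  # delay doubles on each retry
    raise AssertionError("unreachable: the loop always returns")
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting. The last response is returned even when every attempt fails, so the caller can log the final status.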
48+
49+
## Comparison
50+
51+
| | Modal Runner | ARC (K8s) | GitHub-hosted |
52+
|--|-------------|------------|---------------|
53+
| Infrastructure | None (serverless) | Kubernetes cluster | None (managed) |
54+
| Idle cost | Zero | Node compute when idle | N/A (billed per minute) |
55+
| Startup time | Sub-second sandbox | Pod scheduling (seconds to minutes) | ~10-20 seconds |
56+
| GPU support | T4, L4, A100, A100-80GB, H100 | Requires GPU nodes | Limited (GPU larger runners on paid plans) |
57+
| Horizontal scaling | Automatic, no config | Requires HPA/cluster autoscaler | Automatic |
58+
| Isolation | MicroVM sandbox | Container (shared kernel) | VM |
59+
| Kubernetes required | No | Yes | No |
60+
61+
ARC is the better choice if you already run a Kubernetes cluster and need deep integration with your existing infrastructure. Modal Runner is simpler if you want something that deploys in under a minute with no ops overhead.
62+
63+
## GPU configuration
64+
65+
Add a `gpu:` label to your workflow's `runs-on` to request a specific GPU:
66+
67+
```yaml
68+
jobs:
  train:
    runs-on: [self-hosted, modal, gpu:a100]
    steps:
      - run: python train.py
73+
```
74+
75+
Supported labels:
76+
77+
| Label | Hardware |
78+
|-------|----------|
79+
| `gpu:t4` | NVIDIA T4 (16 GB) |
80+
| `gpu:l4` | NVIDIA L4 (24 GB) |
81+
| `gpu:a100` | NVIDIA A100 (40 GB) |
82+
| `gpu:a100-80gb` | NVIDIA A100 (80 GB) |
83+
| `gpu:h100` | NVIDIA H100 (80 GB) |
84+
85+
## Network and security
86+
87+
Control outbound network access from runner sandboxes with environment variables:
88+
89+
```bash
90+
# Option A: block all outbound traffic
modal secret create github-secret \
  GITHUB_TOKEN=ghp_xxx \
  WEBHOOK_SECRET=xxx \
  BLOCK_NETWORK=true

# Option B: allow only specific ranges
modal secret create github-secret \
  GITHUB_TOKEN=ghp_xxx \
  WEBHOOK_SECRET=xxx \
  ALLOWED_CIDRS="10.0.0.0/8,192.168.0.0/16"
96+
```
97+
98+
- `BLOCK_NETWORK=true` drops all outbound connections from the sandbox.
99+
- `ALLOWED_CIDRS` takes a comma-separated list of CIDR ranges. When set, only those ranges are reachable.
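
Before deploying, it can help to sanity-check an `ALLOWED_CIDRS` value and see which addresses it would admit. The sketch below uses Python's standard `ipaddress` module. The helper names are hypothetical, and enforcement itself happens in the sandbox, not in code like this.

```python
import ipaddress


def parse_allowed_cidrs(raw: str) -> list:
    """Split a comma-separated CIDR string into network objects.

    Raises ValueError if any entry is not a valid CIDR range.
    """
    return [
        ipaddress.ip_network(part.strip())
        for part in raw.split(",")
        if part.strip()
    ]


def is_allowed(address: str, networks: list) -> bool:
    """Return True when the address falls inside any allowed network."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in networks)
```

With `"10.0.0.0/8,192.168.0.0/16"`, an address such as `10.1.2.3` is admitted and `8.8.8.8` is not. A malformed entry raises `ValueError` at parse time instead of silently widening or narrowing the allowlist.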
100+
101+
Additional security controls:
102+
103+
- `ALLOWED_REPOS` restricts which repositories can trigger runner creation.
104+
- HMAC-SHA256 signature verification on every webhook request.
105+
- Delivery ID deduplication prevents replay attacks.
106+
- Per-repo concurrency limits via `MAX_CONCURRENT_PER_REPO`.
107+
108+
## Architecture
19109

20110
```mermaid
21111
sequenceDiagram
@@ -35,35 +125,41 @@ sequenceDiagram
35125
MS->>MS: 8. Exit & Terminate Sandbox
36126
```
37127

38-
1. **Workflow Queued:** A GitHub Action workflow is triggered and a job enters the `queued` state.
39-
2. **Webhook Trigger:** GitHub sends a `workflow_job` webhook to the Modal web endpoint.
40-
3. **JIT Handshake:** The Modal app validates the request and calls the GitHub API to generate a JIT (Just-In-Time) runner configuration.
41-
4. **Sandbox Spawning:** A Modal Sandbox is provisioned immediately with the pre-configured runner image.
42-
5. **Execution & Cleanup:** The runner connects to GitHub, executes the specific job, and the Sandbox is terminated immediately upon completion.
128+
1. A workflow triggers and a job enters `queued`.
129+
2. GitHub sends a `workflow_job` webhook to the Modal endpoint.
130+
3. The endpoint verifies the HMAC-SHA256 signature, then calls GitHub's `generate-jitconfig` API.
131+
4. A Modal Sandbox is provisioned with the runner image and JIT config.
132+
5. The runner connects to GitHub, executes the job, and the sandbox terminates on completion.
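
The decision made between steps 2 and 4 can be sketched as a pure function over the `workflow_job` payload. The field names (`action`, `workflow_job.labels`, `repository.full_name`) are from GitHub's webhook payload. `should_spawn` and the exact set of checks are assumptions, since the handler in `app.py` is not shown in this diff.

```python
from typing import Optional


def should_spawn(payload: dict, allowed_repos: Optional[set] = None) -> bool:
    """Decide whether a workflow_job event should start a sandbox."""
    # Only newly queued jobs need a runner; other actions are ignored.
    if payload.get("action") != "queued":
        return False
    # The job must target this runner through the `modal` label.
    labels = payload.get("workflow_job", {}).get("labels", [])
    if "modal" not in labels:
        return False
    # An unset allowlist means every repository is accepted.
    repo = payload.get("repository", {}).get("full_name")
    return allowed_repos is None or repo in allowed_repos
```

Signature verification (step 3) would run before this check, on the raw request body.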
43133

44-
## 🏁 Quick Start
134+
## Environment variables
45135

46-
Setting up your own Modal runner takes only a few minutes.
136+
| Variable | Required | Default | Description |
137+
|----------|----------|---------|-------------|
138+
| `GITHUB_TOKEN` | Yes | | GitHub PAT for runner registration |
139+
| `WEBHOOK_SECRET` | Yes | | Secret for webhook signature validation |
140+
| `ALLOWED_REPOS` | No | (all) | Comma-separated allowlist of `owner/repo` |
141+
| `RUNNER_VERSION` | No | `2.333.1` | GitHub Actions runner version |
142+
| `RUNNER_GROUP_ID` | No | `1` | Runner group ID |
143+
| `MAX_CONCURRENT_PER_REPO` | No | (unlimited) | Max concurrent sandboxes per repo |
144+
| `ALLOWED_CIDRS` | No | (allow all) | Comma-separated CIDR ranges for outbound |
145+
| `BLOCK_NETWORK` | No | `false` | Fully isolate sandbox network |
146+
| `CACHE_VOLUME_NAME` | No | | Modal Volume name for persistent `/cache` |
147+
| `GITHUB_ENTERPRISE_DOMAIN` | No | | Custom domain for GitHub Enterprise |
47148

48-
Refer to the [**DEPLOY.md**](DEPLOY.md) for step-by-step instructions on:
49-
- Setting up Modal secrets.
50-
- Deploying the webhook endpoint.
51-
- Configuring GitHub repository webhooks.
149+
## Limitations
52150

53-
## 🛠️ Technical Details
151+
- Docker-in-Docker support uses Modal's alpha Docker-in-Sandbox feature. GitHub Actions `services:` and container actions generally work but may have edge cases.
152+
- Every job runs in a fresh sandbox. Files written during a job are lost when it completes unless they are stored on the persistent `/cache` volume (see `CACHE_VOLUME_NAME`).
54153

55-
- **Modal Sandbox:** Built on top of Modal's serverless runtime, providing sub-second startup times and robust isolation using micro-VM technology.
56-
- **JIT Configuration:** Instead of persistent runner tokens, this project uses the `generate-jitconfig` endpoint. This ensures that even if a runner environment were compromised, the credentials are valid for only one specific job.
57-
- **Custom Images:** The runner environment is defined directly within `app.py`, allowing you to easily add dependencies (e.g., specific versions of Python, Node.js, or system libraries) that are pre-baked into the runner image.
58-
- **Root Execution:** Sandboxes run with `RUNNER_ALLOW_RUNASROOT=1` in ephemeral `/tmp` directories, ensuring compatibility with all GitHub Actions features without permission hurdles.
154+
## License
59155

60-
## 📄 License
61-
62-
This project is licensed under the [MIT License](LICENSE).
156+
[MIT](LICENSE)
63157

64158
---
65159

66-
## 👤 Author
160+
If this project is useful to you, consider starring it so others can find it.
161+
162+
## Author
67163

68164
**Manas C. Bavaskar**
69165
- GitHub: [@manascb1344](https://github.com/manascb1344)
