Commit ff25c25

docs: update CHANGELOG, README, DEPLOY with fix documentation
- CHANGELOG: add entries for `& &&` fix, label contention, ip6tables, dockerd backgrounding
- README: recommend unique `job-${{ github.run_id }}` label in quickstart
- DEPLOY: update workflow examples, label contention troubleshooting, RUNNER_VERSION
1 parent e97f39c commit ff25c25

3 files changed

Lines changed: 26 additions & 6 deletions

File tree

CHANGELOG.md

Lines changed: 6 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -6,6 +6,12 @@ All notable changes to this project will be documented in this file.
66

77
### Fixed
88
- OrderedDict has no `.add()` method — use key assignment for delivery dedup cache (`66ad287`)
9+
- Bash syntax error in SANDBOX_CMD (`& &&`) preventing runner from starting (`8b9857f`)
10+
- Label contention in `test.yml` — added `job-${{ github.run_id }}` unique label for 1:1 JIT runner binding (`8b9857f`)
11+
- `ip6tables-legacy` alternative registration crash during image rebuild (`43825b0`)
12+
13+
### Changed
14+
- Dockerd wait + image load now run in background — runner starts immediately without waiting for Docker. Non-Docker jobs skip the 60s dockerd startup entirely. (`43825b0`)
915

1016
### Added
1117
- Multi-region support via `MODAL_REGION` environment variable

DEPLOY.md

Lines changed: 12 additions & 4 deletions
@@ -72,15 +72,19 @@ This guide outlines the steps to deploy this project using Modal.
7272
7373
7. **Update your GitHub Actions workflow:**
7474
* Ensure the `runs-on` field includes `modal` and `self-hosted`.
75+
* **Always include a unique label** to prevent label contention between concurrent jobs.
7576
7677
```yaml
77-
runs-on: [self-hosted, modal]
78+
runs-on: [self-hosted, modal, "job-${{ github.run_id }}"]
7879
```
7980
81+
Without the unique label, multiple queued jobs compete for the same JIT runner
82+
and get stuck in "queued" state — each job needs its own dedicated runner.
83+
8084
**GPU jobs:** Add a `gpu:` label to request GPU acceleration.
8185
8286
```yaml
83-
runs-on: [self-hosted, modal, gpu:t4]
87+
runs-on: [self-hosted, modal, "job-${{ github.run_id }}", gpu:t4]
8488
```
8589
8690
Supported GPU types: `gpu:t4`, `gpu:l4`, `gpu:a100`, `gpu:a100-80gb`, `gpu:h100`
@@ -117,7 +121,11 @@ The runner image build can take a while on first deploy, or the Docker-in-Sandbo
117121
118122
**Jobs stuck in queued state**
119123
120-
The webhook is not reaching the endpoint, or the `modal` label is missing from the workflow. Verify the webhook URL is correct and publicly accessible. Check that your workflow file has `runs-on: [self-hosted, modal]`.
124+
The webhook is not reaching the endpoint, or the `modal` label is missing from the workflow. Verify the webhook URL is correct and publicly accessible. Check that your workflow file has `runs-on: [self-hosted, modal]`. If using concurrent jobs, ensure each job has a unique label like `job-${{ github.run_id }}` to prevent label contention between JIT runners.
125+
126+
**Multiple jobs stuck queued but one completes randomly**
127+
128+
Label contention. Without unique labels in `runs-on`, a JIT runner created for job A can pick up job B instead. Fix: add `"job-${{ github.run_id }}"` to every job's `runs-on`.
121129
122130
**GPU jobs fail to start**
123131
@@ -135,7 +143,7 @@ GitHub retries webhooks when responses are slow or network issues occur. This is
135143
| `WEBHOOK_SECRET` | Yes | - | Secret for webhook signature validation |
136144
| `WEBHOOK_SECRET_OLD` | No | - | Previous secret for seamless rotation |
137145
| `ALLOWED_REPOS` | No | (all) | Comma-separated allowlist of `owner/repo` |
138-
| `RUNNER_VERSION` | No | `2.333.1` | GitHub Actions runner version |
146+
| `RUNNER_VERSION` | No | `2.334.0` | GitHub Actions runner version |
139147
| `RUNNER_GROUP_ID` | No | `1` | Runner group ID |
140148
| `MAX_CONCURRENT_PER_REPO` | No | (unlimited) | Max concurrent sandboxes per repo |
141149
| `ALLOWED_CIDRS` | No | (allow all) | Comma-separated CIDR ranges for outbound |
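
A note on the unique label recommended in the DEPLOY.md changes above: `github.run_id` identifies a workflow run, so every job in the same run evaluates `job-${{ github.run_id }}` to the same string. For a workflow with more than one job, a per-job suffix may be needed to keep the label 1:1. The fragment below is a minimal sketch under that assumption. The job names `build` and `test` and the suffixed label format are hypothetical, and this commit does not confirm whether the runner service accepts a label with an extra suffix.

```yaml
# Sketch only: job names and the "-build" / "-test" suffixes are hypothetical.
jobs:
  build:
    # github.run_id is shared by all jobs in a run, so a literal per-job
    # suffix is appended to keep each label distinct.
    runs-on: [self-hosted, modal, "job-${{ github.run_id }}-build"]
    steps:
      - run: echo "build"
  test:
    runs-on: [self-hosted, modal, "job-${{ github.run_id }}-test"]
    steps:
      - run: echo "test"
```

A literal suffix is used here instead of `github.job` because that property is documented as being populated only while a job's steps execute, so it is likely empty when `runs-on` is evaluated.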

README.md

Lines changed: 8 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -28,9 +28,15 @@ modal deploy app.py
2828
Then update your workflow file:
2929

3030
```yaml
31-
runs-on: [self-hosted, modal]
31+
jobs:
32+
test:
33+
runs-on: [self-hosted, modal, "job-${{ github.run_id }}"]
34+
steps:
35+
- run: echo "Hello from Modal!"
3236
```
3337
38+
> **Important:** Always include a unique label (`job-${{ github.run_id }}`) in `runs-on` to ensure each job gets its own dedicated runner. Without it, multiple jobs compete for the same runner and get stuck in "queued" state.
39+
3440
Full deployment walkthrough: [DEPLOY.md](DEPLOY.md).
3541

3642
## Features
@@ -72,7 +78,7 @@ Staging limits concurrent Modal sandbox spawns to ~5 at a time, preventing API o
7278
|--|-------------|------------|---------------|
7379
| Infrastructure | None (serverless) | Kubernetes cluster | None (managed) |
7480
| Idle cost | Zero | Node compute when idle | N/A (billed per minute) |
75-
| Startup time | Sub-second sandbox | Pod scheduling (seconds to minutes) | ~10-20 seconds |
81+
| Startup time | Sub-second sandbox, ~1-4min total (cold start) | Pod scheduling (seconds to minutes) | ~10-20 seconds |
7682
| GPU support | T4, L4, A100, A100-80GB, H100 | Requires GPU nodes | Limited, macOS only |
7783
| Horizontal scaling | Automatic, no config | Requires HPA/cluster autoscaler | Automatic |
7884
| Isolation | MicroVM sandbox | Container (shared kernel) | VM |
