Commit 3fbe2da
fix(dotcom): lower Zero kill_timeout to Fly's 5m API cap (tldraw#8656)
To unblock production deploys, this PR lowers the `kill_timeout` for the
Zero replication manager (RM) and view syncer (VS) from `10m` back to `5m`.
Fly's Machines API rejects anything over 5 minutes regardless of CPU kind:
```
Error: failed to update machine ...: invalid stop_config.timeout, cannot exceed 5 minutes
```
This contradicts Fly's [graceful VM exits
guide](https://fly.io/blog/graceful-vm-exits-some-dials/), which
suggests up to 24h on dedicated CPU. The 5m cap from the API is the
authoritative limit today. Drain budget is now half what Rocicorp's CZ
uses, but it's the ceiling Fly will accept.
Follow-up to tldraw#8627.
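The cap described above can be checked before a deploy reaches Fly. The following is a minimal sketch, not the project's tooling: the helper names (`parse_duration_seconds`, `exceeds_cap`) are hypothetical, the 5-minute constant is taken from the API error quoted above, and the parser assumes only simple `<number><s|m|h>` durations.

```python
import re

# Cap taken from the Machines API error quoted in this PR (5 minutes).
FLY_STOP_TIMEOUT_CAP_SECONDS = 5 * 60

# Assumption: only single-unit durations such as "5m", "300s", "24h" are handled.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_duration_seconds(value: str) -> int:
    """Convert a simple duration string into a number of seconds."""
    match = re.fullmatch(r"(\d+)([smh])", value.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_SECONDS[unit]


def exceeds_cap(kill_timeout: str) -> bool:
    """Return True when the timeout is longer than the 5-minute cap."""
    return parse_duration_seconds(kill_timeout) > FLY_STOP_TIMEOUT_CAP_SECONDS
```

Under these assumptions, `exceeds_cap("10m")` flags the previous value and `exceeds_cap("5m")` accepts the new one.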
### Change type
- [x] `bugfix`
### Test plan
1. Merge to `production` → `deploy-dotcom.yml` should complete the VS/RM
rolling update without the `invalid stop_config.timeout` error.
2. Verify generated `flyio-view-syncer.toml` and
`flyio-replication-manager.toml` contain `kill_timeout = "5m"` at top
level.
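Step 2 could be automated with a small check. This is a sketch under stated assumptions: `top_level_kill_timeout` is a hypothetical helper, and it relies on the TOML rule that top-level keys appear before the first `[table]` header. It is a simplified line scanner rather than a full TOML parser, so a multi-line array whose continuation lines start with `[` would end the scan early.

```python
import re
from typing import Optional


def top_level_kill_timeout(toml_text: str) -> Optional[str]:
    """Return the top-level kill_timeout string value, or None if absent."""
    for raw_line in toml_text.splitlines():
        line = raw_line.strip()
        # The first table header ends the top-level section of a TOML file.
        if line.startswith("["):
            return None
        # Match a quoted string value, allowing a trailing comment.
        match = re.fullmatch(r'kill_timeout\s*=\s*"([^"]*)"\s*(#.*)?', line)
        if match:
            return match.group(1)
    return None


def has_expected_timeout(toml_text: str, expected: str = "5m") -> bool:
    """Check that kill_timeout is set to the expected value at top level."""
    return top_level_kill_timeout(toml_text) == expected
```

To use it, read the contents of the generated `flyio-view-syncer.toml` and `flyio-replication-manager.toml` and pass each to `has_expected_timeout`.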
### Code changes
| Section | LOC change |
| -------------- | ---------- |
| Config/tooling | +4 / -3 |

1 parent 5dfc426 commit 3fbe2da
1 file changed
Lines changed: 4 additions & 3 deletions