test: poll until volume is visible before mounting it #97
Conversation
`sandbox volumes create` returns the volume id immediately, but the backend may take a moment to make the volume visible to subsequent operations (mount, list). The `sandbox with volume mount` test hit this race on every CI run, surfacing as `VOLUME_NOT_FOUND` when the follow-up `sandbox create --volume <id>:path` call ran before the volume was queryable. Adds a `waitForVolumeReady` helper that polls `volumes list` for up to 15s (500ms interval) after creation, and inserts a call between the volume creation and the mount in the affected test. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Initial fix polled `volumes list` and stopped when the new volume appeared, but the mount still raced — the cluster takes another beat after the volume is queryable before it's actually mountable. Add a configurable `postListSleepMs` (default 5s) after the list confirms, and bump the overall timeout to 30s. Each sandbox-with-volume run now spends ~5-15s waiting, which is well below the 60s sandbox timeout. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
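The two-phase wait described in this commit (poll until listed, then a fixed settle delay) could be sketched roughly as below. This is not the PR's actual code: `waitUntilMountable`, the injected `isListed` callback, and the choice to let `timeoutMs` bound only the polling phase are assumptions made for illustration. Only the option name `postListSleepMs` and the 30s / 5s defaults come from the commit message.

```typescript
// Hypothetical sketch: option names follow the commit message, everything
// else (function name, injected callback) is illustrative.
interface ReadyOptions {
  timeoutMs?: number; // budget for the polling phase (assumed scope)
  intervalMs?: number; // delay between list polls
  postListSleepMs?: number; // settle delay once the volume is listed
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

async function waitUntilMountable(
  isListed: () => Promise<boolean>,
  { timeoutMs = 30_000, intervalMs = 500, postListSleepMs = 5_000 }:
    ReadyOptions = {},
): Promise<number> {
  const start = Date.now();

  // Phase 1: poll until the volume shows up in the list output.
  while (!(await isListed())) {
    if (Date.now() - start + intervalMs > timeoutMs) {
      throw new Error(`volume not listed within ${timeoutMs}ms`);
    }
    await sleep(intervalMs);
  }

  // Phase 2: being listed did not imply being mountable, so wait a
  // fixed extra beat before the caller attempts the mount.
  await sleep(postListSleepMs);

  // Total time spent waiting, in milliseconds.
  return Date.now() - start;
}
```

In a test, `isListed` would wrap whatever already checks the `volumes list` output for the new id. A fixed settle delay is a blunt instrument: it adds its full duration to every run, including runs where the volume was mountable straight away.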
Even with the post-list sleep extension (1 commit back), the sandbox-side volume lookup propagates separately from the deployng list endpoint. Retrying the mount-bearing `sandbox create` call up to 6 times (5s apart, ~30s budget) handles the residual race without papering over genuine backend failures. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
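One way to retry only the residual race while still failing fast on other errors is to retry solely on the `VOLUME_NOT_FOUND` code. The sketch below assumes the CLI failure is surfaced as an error object carrying a `code` field; `CliError`, `retryOnVolumeNotFound`, and the injected `createSandbox` callback are hypothetical names, not part of the PR. The 6 attempts and 5s spacing mirror the commit message.

```typescript
// Hypothetical error shape: assumes the test can extract the CLI's error code.
class CliError extends Error {
  code: string;
  constructor(message: string, code: string) {
    super(message);
    this.code = code;
  }
}

interface RetryOptions {
  attempts?: number; // total tries, including the first
  delayMs?: number; // pause between tries
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

async function retryOnVolumeNotFound<T>(
  createSandbox: () => Promise<T>,
  { attempts = 6, delayMs = 5_000 }: RetryOptions = {},
): Promise<T> {
  for (let attempt = 1;; attempt++) {
    try {
      return await createSandbox();
    } catch (err) {
      // Only the known race is retried; any other failure is rethrown
      // immediately so genuine backend errors stay visible.
      const retryable = err instanceof CliError &&
        err.code === "VOLUME_NOT_FOUND";
      if (!retryable || attempt >= attempts) throw err;
      await sleep(delayMs);
    }
  }
}
```

With the defaults, six attempts leave five 5s pauses, so the helper gives up roughly 25s after the first failure, plus the time the calls themselves take.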
Even with retries and the post-list sleep, the sandbox-side volume lookup was returning 404 deterministically (6/6 attempts, 5s apart, ~30s total). The volume is created in `ord` but the sandbox create call didn't specify a region, so it landed in a different cluster that doesn't know about the volume. Pin the sandbox create to `--region ord` to match the volume's region. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closing. After four iterations (one-shot list-poll → post-list sleep → mount-retry → region-pin), the `sandbox with volume mount` test fails identically on every CI run with `VOLUME_NOT_FOUND` from the sandbox-side lookup. Six 5s-spaced retries against a region-pinned (`--region ord`) sandbox create all return 404 for a volume id (`vol_ord_*`) that's already visible via `volumes list` and was just created in the same region. This isn't a propagation race the CLI can wait out; it's a real backend issue where the sandbox cluster's view of the volumes service doesn't see the new volume even ~30s after it's created and listable. I'm surfacing this as a separate issue to the deployng / sandbox team rather than burying it behind retries. Recent history on `main` shows the test was passing most of the time, so it likely regressed on the backend side in the last week or two.

In the meantime, the agent-ergonomics stack (#91-#96, #98) doesn't depend on this test passing; all of those pass `fmt` / `lint` / `check` / `jsr` on Deno 2.7.8 and only the same flake fails their `deno test` job (including on the docs-only PR #96).
Summary
`sandbox volumes create` returns the volume id immediately, but the backend may take a moment to make the volume visible to subsequent operations (mount, list). The `sandbox with volume mount` test hits this race on most CI runs, surfacing as:

```
✗ An error occurred:
The requested volume 'vol_ord_...' was not found, or you do not have access to view it. (status: 404, code: VOLUME_NOT_FOUND, ...)
```
immediately after the `sandbox volumes create ...` step. Recent main CI history shows the test passes most of the time but fails frequently enough that it's currently masking real signal on stacked PRs.
What's in this PR
Adds a `waitForVolumeReady(volumeId, { timeoutMs = 15_000, intervalMs = 500 })` helper that polls `sandbox volumes list` until the new volume's id appears (or times out after 15s with a clear error). Inserts one call between the volume creation and the sandbox-with-mount step in the affected test.
This is a test-only fix. No CLI / backend code changes.
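For readers who want the shape of the helper without opening the diff, a minimal sketch follows. It is not the code in this PR: the real helper takes `(volumeId, { timeoutMs, intervalMs })` and runs `sandbox volumes list` itself, whereas this version takes a hypothetical `listVolumeIds` callback as an extra argument so the polling logic stays independent of how the CLI is invoked.

```typescript
// Hypothetical: stands in for running `sandbox volumes list` and parsing ids.
type ListVolumeIds = () => Promise<string[]>;

interface WaitOptions {
  timeoutMs?: number;
  intervalMs?: number;
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

async function waitForVolumeReady(
  volumeId: string,
  listVolumeIds: ListVolumeIds,
  { timeoutMs = 15_000, intervalMs = 500 }: WaitOptions = {},
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    // Done as soon as the new id appears in the list output.
    const ids = await listVolumeIds();
    if (ids.includes(volumeId)) return;

    // Give up if another full interval would overshoot the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(
        `volume ${volumeId} not visible in volumes list after ${timeoutMs}ms`,
      );
    }
    await sleep(intervalMs);
  }
}
```

In the affected test, the call would sit between the volume creation step and the sandbox-with-mount step, for example `await waitForVolumeReady(volumeId, listVolumeIds)`.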
Test plan
🤖 Generated with Claude Code