fix(compose): default postgres volume to per-task (concurrency-safe)#18
Merged
Merged
Conversation
The reuse postgres-volume-mode keys the writable PGDATA named volume on (dataset, schema_version) only, shared across all queries of a dataset. Under concurrency.trials>1 this corrupts the run: two same-dataset cells start within seconds of each other and both mount the one PGDATA dir; postgres locks the data dir so the second container never goes healthy, up --wait returns rc=1, and every cell of that dataset errors. Observed on dab0015 trials:2 where all 13 crmarenapro cells errored with "dependency failed to start: dab-postgres-1 is unhealthy" and "volume already exists but was created for project X (expected Y)". Flip the operator-facing default (cli.py + prepare_dataset_tasks) from reuse to fresh, which appends task_id for a per-task-unique volume. The restore-once cost is small: DAB postgres dumps are tiny (crmarenapro 8.5MB/90k lines is the worst concurrent case; PATENTS 129MB has only 3 queries) and restore well within the healthcheck budget. fresh also fixes cross-run staleness since the volume no longer survives across runs. reuse stays available for serial single-trial runs. generate_compose's low-level default stays reuse (its unit tests document that API); the run path shells out via the plugin CLI whose default now governs, and DabPluginArgs does not expose the flag, so the fix takes effect on any re-run with no task regeneration or spec change needed. TDD: test_same_dataset_tasks_get_distinct_postgres_volumes asserts two queries of one dataset get distinct PGDATA volume names; red under the old default, green now. All 173 plugin unit tests pass. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR changes the razorback-plugin-dab operator defaults to use per-task Postgres named volumes (instead of dataset-reused volumes) to avoid concurrency-related PGDATA directory collisions during concurrent DAB runs.
Changes:
- Flip the operator-facing default
postgres_volume_modefrom"reuse"to"fresh"inprepare_dataset_tasks(...). - Flip the CLI default
--postgres-volume-modefrom"reuse"to"fresh"and strengthen the help text warning about concurrency. - Add a unit test asserting that two tasks from the same dataset materialize with distinct Postgres PGDATA volume names under the operator default.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| packages/razorback-plugin-dab/tests/unit/test_compose_dataset_volume.py | Adds a regression test ensuring same-dataset tasks get distinct PGDATA volumes under the operator default. |
| packages/razorback-plugin-dab/src/razorback_plugin_dab/generate/prepare.py | Changes the default postgres_volume_mode to "fresh" and updates its documentation. |
| packages/razorback-plugin-dab/src/razorback_plugin_dab/cli.py | Changes CLI default --postgres-volume-mode to "fresh" and updates CLI help messaging. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+87
to
+91
| goes unhealthy, and the cell errors. Per-task volumes also avoid | ||
| stale cross-run data (the volume does not survive across runs as a | ||
| dataset-keyed one would). Cost: init.d (DB restore) runs per cell | ||
| instead of once per dataset; DAB postgres dumps are small | ||
| (≤8.5MB / ~90k lines for crmarenapro, the worst concurrent case; |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
A full DataAgentBench run with
concurrency.trials: 2corrupts whole Postgres-backed datasets. Observed: all 13crmarenaprocells errored withdependency failed to start: container …-dab-postgres-1 is unhealthy, and the compose stdout showed:The 13 cells all started within ~65s of each other (concurrency), dropping the entire dataset from the board and forcing abstains on other PG-backed passers mid-run (
could not translate host name).Root cause
_postgres_volume_name(...)with the defaultpostgres_volume_mode="reuse"keys the writable PGDATA named volume on(dataset_name, schema_version)only — e.g.dab-postgres-data-crmarenapro-v1— shared across every query of a dataset (an intentional "run init.d once per dataset" optimization). Two failure modes:trials:2, two same-dataset cells mount the same writable PGDATA dir simultaneously. Postgres locks its data dir, so the second container never goes healthy →up --waitrc=1 → every cell of the dataset errors.(
dab-mongois unaffected — it uses read-only bind mounts, no shared writable named volume.)Fix
Flip the operator-facing default from
reusetofresh(per-task-unique volume, keyed ontask_id) incli.pyandprepare_dataset_tasks(...). Each cell gets its own PGDATA volume (e.g.dab-postgres-data-crmarenapro-v1-crmarenapro-q7), eliminating both the intra-run collision and cross-run staleness.reusestays available and documented for serial single-trial runs.generate_compose's low-level default is intentionally leftreuseso its existing API-documenting unit tests are untouched; the change is at the operator entry points.Cost
Per-cell DB restore instead of once-per-dataset. Negligible: the worst concurrent case,
crmarenapro(13 queries), is an 8.5 MB / 90k-line dump that restores in a few seconds — well inside the pg healthcheck budget (20×5s). Most datasets are far smaller.Tests
Added
test_same_dataset_tasks_get_distinct_postgres_volumes(red under the old default, green after). All 173 plugin unit tests pass (restart-policy, healthcheck, depends_on included).🤖 Generated with Claude Code