fix(compose): default postgres volume to per-task (concurrency-safe) #18

Merged: kentwelcome merged 1 commit into main from fix/postgres-volume-concurrency-safe on Jun 19, 2026

@kentwelcome

Contributor

Problem

A full DataAgentBench run with concurrency.trials: 2 corrupts whole Postgres-backed datasets. Observed: all 13 crmarenapro cells errored with dependency failed to start: container …-dab-postgres-1 is unhealthy, and the compose stdout showed:

volume "dab-postgres-data-crmarenapro-v1" already exists but was created for
project "crmarenapro-q1__buzfvx7" (expected "crmarenapro-q1__v8mtc8f")

All 13 cells started within ~65s of each other (concurrency). The collision dropped the entire dataset from the board and forced abstains on other PG-backed passers mid-run (could not translate host name).

Root cause

_postgres_volume_name(...) with the default postgres_volume_mode="reuse" keys the writable PGDATA named volume on (dataset_name, schema_version) only — e.g. dab-postgres-data-crmarenapro-v1 — shared across every query of a dataset (an intentional "run init.d once per dataset" optimization). Two failure modes:

  1. Intra-run concurrency: under trials:2, two same-dataset cells mount the same writable PGDATA dir simultaneously. Postgres locks its data dir, so the second container never goes healthy → up --wait rc=1 → every cell of the dataset errors.
  2. Cross-run persistence: the named volume survives across runs (keyed only on dataset+schema), so a prior run's project owns it → the "created for project X (expected Y)" warning and possibly stale data.

(dab-mongo is unaffected — it uses read-only bind mounts, no shared writable named volume.)
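
To make the keying difference concrete, here is a minimal sketch of the two naming schemes. The function below is a hypothetical stand-in, not the plugin's actual _postgres_volume_name(...); its signature and the string layout are assumptions inferred from the volume names quoted in this PR.

```python
from typing import Optional


def postgres_volume_name(
    dataset_name: str,
    schema_version: str,
    task_id: Optional[str] = None,
    mode: str = "reuse",
) -> str:
    """Hypothetical stand-in for the plugin's volume-naming helper."""
    if mode not in ("reuse", "fresh"):
        raise ValueError(f"unknown postgres_volume_mode: {mode!r}")

    # "reuse": keyed on (dataset, schema_version) only, so every query of
    # the dataset resolves to the same writable PGDATA volume.
    base = f"dab-postgres-data-{dataset_name}-{schema_version}"
    if mode == "reuse":
        return base

    # "fresh": task_id is appended, so each cell gets its own volume.
    if not task_id:
        raise ValueError("fresh mode requires a task_id")
    return f"{base}-{task_id}"


tasks = ["crmarenapro-q1", "crmarenapro-q7"]
reuse_names = {postgres_volume_name("crmarenapro", "v1", t, "reuse") for t in tasks}
fresh_names = {postgres_volume_name("crmarenapro", "v1", t, "fresh") for t in tasks}

print(sorted(reuse_names))  # one shared name: the collision
print(sorted(fresh_names))  # two distinct names
```

Under "reuse" both tasks map to dab-postgres-data-crmarenapro-v1, which is the shared volume named in the error above; under "fresh" they map to two different volumes.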

Fix

Flip the operator-facing default from reuse to fresh (per-task-unique volume, keyed on task_id) in cli.py and prepare_dataset_tasks(...). Each cell gets its own PGDATA volume (e.g. dab-postgres-data-crmarenapro-v1-crmarenapro-q7), eliminating both the intra-run collision and cross-run staleness. reuse stays available and documented for serial single-trial runs.

generate_compose's low-level default is intentionally left reuse so its existing API-documenting unit tests are untouched; the change is at the operator entry points.
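
As an illustration of what flipping an operator-facing default looks like, the sketch below uses argparse. Only the flag name --postgres-volume-mode and its two values come from this PR; the parser, its prog name, and the help wording are assumptions, and the real cli.py may be built on a different CLI library.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser; only the flag name and its values are from the PR.
    parser = argparse.ArgumentParser(prog="dab-plugin-sketch")
    parser.add_argument(
        "--postgres-volume-mode",
        choices=("fresh", "reuse"),
        default="fresh",  # previously "reuse"
        help=(
            "fresh: one PGDATA volume per task (safe under concurrency). "
            "reuse: one volume per dataset; only for serial single-trial runs."
        ),
    )
    return parser


default_args = build_parser().parse_args([])
explicit_args = build_parser().parse_args(["--postgres-volume-mode", "reuse"])
print(default_args.postgres_volume_mode, explicit_args.postgres_volume_mode)
```

Because the default lives at the entry point, callers that never pass the flag pick up "fresh" automatically, while "reuse" remains an explicit opt-in.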

Cost

Per-cell DB restore instead of once-per-dataset. Negligible: the worst concurrent case, crmarenapro (13 queries), is an 8.5 MB / 90k-line dump that restores in a few seconds — well inside the pg healthcheck budget (20×5s). Most datasets are far smaller.
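
The budget arithmetic can be spelled out as follows. This sketch assumes "20×5s" means 20 healthcheck retries at a 5-second interval and ignores any start period the real healthcheck may configure; the constant and function names are illustrative only.

```python
# Assumed reading of "20x5s": 20 retries, 5 seconds apart.
HEALTHCHECK_RETRIES = 20
HEALTHCHECK_INTERVAL_S = 5

# Total time the container may take before compose marks it unhealthy.
HEALTHCHECK_BUDGET_S = HEALTHCHECK_RETRIES * HEALTHCHECK_INTERVAL_S


def fits_budget(restore_seconds: float) -> bool:
    """Return True if a restore of this duration finishes inside the budget."""
    return restore_seconds < HEALTHCHECK_BUDGET_S


print(HEALTHCHECK_BUDGET_S)  # 100
```

A restore that takes a few seconds therefore leaves most of the roughly 100-second window unused, even when it runs once per cell.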

Tests

Added test_same_dataset_tasks_get_distinct_postgres_volumes (red under the old default, green after). All 173 plugin unit tests pass (restart-policy, healthcheck, depends_on included).
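
The property the new test guards can be sketched as a collision check over (task, volume) assignments. The helper below is hypothetical and is not the test added in this PR; the task ids and volume names follow the examples quoted above.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def find_shared_volumes(
    assignments: Iterable[Tuple[str, str]],
) -> Dict[str, List[str]]:
    """Map each volume mounted by more than one task to those task ids."""
    by_volume: Dict[str, List[str]] = defaultdict(list)
    for task_id, volume in assignments:
        by_volume[volume].append(task_id)
    return {vol: ids for vol, ids in by_volume.items() if len(ids) > 1}


# Old default: both cells resolve to the dataset-keyed volume.
reuse_assignments = [
    ("crmarenapro-q1", "dab-postgres-data-crmarenapro-v1"),
    ("crmarenapro-q7", "dab-postgres-data-crmarenapro-v1"),
]
# New default: the volume name carries the task id.
fresh_assignments = [
    ("crmarenapro-q1", "dab-postgres-data-crmarenapro-v1-crmarenapro-q1"),
    ("crmarenapro-q7", "dab-postgres-data-crmarenapro-v1-crmarenapro-q7"),
]

print(find_shared_volumes(reuse_assignments))
print(find_shared_volumes(fresh_assignments))
```

A regression test in this spirit asserts that the second result is empty, which corresponds to "red under the old default, green after".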

🤖 Generated with Claude Code

The reuse postgres-volume-mode keys the writable PGDATA named volume on
(dataset, schema_version) only, shared across all queries of a dataset.
Under concurrency.trials>1 this corrupts the run: two same-dataset cells
start within seconds of each other and both mount the one PGDATA dir;
postgres locks the data dir so the second container never goes healthy,
up --wait returns rc=1, and every cell of that dataset errors. Observed
on dab0015 trials:2 where all 13 crmarenapro cells errored with
"dependency failed to start: dab-postgres-1 is unhealthy" and
"volume already exists but was created for project X (expected Y)".

Flip the operator-facing default (cli.py + prepare_dataset_tasks) from
reuse to fresh, which appends task_id for a per-task-unique volume. The
restore-once cost is small: DAB postgres dumps are tiny (crmarenapro
8.5MB/90k lines is the worst concurrent case; PATENTS 129MB has only 3
queries) and restore well within the healthcheck budget. fresh also
fixes cross-run staleness since the volume no longer survives across
runs. reuse stays available for serial single-trial runs.

generate_compose's low-level default stays reuse (its unit tests document
that API); the run path shells out via the plugin CLI whose default now
governs, and DabPluginArgs does not expose the flag, so the fix takes
effect on any re-run with no task regeneration or spec change needed.

TDD: test_same_dataset_tasks_get_distinct_postgres_volumes asserts two
queries of one dataset get distinct PGDATA volume names; red under the
old default, green now. All 173 plugin unit tests pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 19, 2026 10:19

Copilot AI left a comment


Pull request overview

This PR changes the razorback-plugin-dab operator defaults to use per-task Postgres named volumes (instead of dataset-reused volumes) to avoid PGDATA directory collisions during concurrent DAB runs.

Changes:

  • Flip the operator-facing default postgres_volume_mode from "reuse" to "fresh" in prepare_dataset_tasks(...).
  • Flip the CLI default --postgres-volume-mode from "reuse" to "fresh" and strengthen the help text warning about concurrency.
  • Add a unit test asserting that two tasks from the same dataset materialize with distinct Postgres PGDATA volume names under the operator default.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| packages/razorback-plugin-dab/tests/unit/test_compose_dataset_volume.py | Adds a regression test ensuring same-dataset tasks get distinct PGDATA volumes under the operator default. |
| packages/razorback-plugin-dab/src/razorback_plugin_dab/generate/prepare.py | Changes the default postgres_volume_mode to "fresh" and updates its documentation. |
| packages/razorback-plugin-dab/src/razorback_plugin_dab/cli.py | Changes CLI default --postgres-volume-mode to "fresh" and updates CLI help messaging. |

Comment on lines +87 to +91
goes unhealthy, and the cell errors. Per-task volumes also avoid
stale cross-run data (the volume does not survive across runs as a
dataset-keyed one would). Cost: init.d (DB restore) runs per cell
instead of once per dataset; DAB postgres dumps are small
(≤8.5MB / ~90k lines for crmarenapro, the worst concurrent case;
@kentwelcome merged commit 2ac0295 into main Jun 19, 2026
1 check passed
@kentwelcome deleted the fix/postgres-volume-concurrency-safe branch June 19, 2026 17:02