Skip to content

feat: Phase 3d+3e — ParquetMergePipeline supervisor + publisher feedback#6354

Open
g-talbot wants to merge 2 commits intogtt/parquet-merge-pipeline-3cfrom
gtt/parquet-merge-pipeline-3de
Open

feat: Phase 3d+3e — ParquetMergePipeline supervisor + publisher feedback#6354
g-talbot wants to merge 2 commits intogtt/parquet-merge-pipeline-3cfrom
gtt/parquet-merge-pipeline-3de

Conversation

@g-talbot
Copy link
Copy Markdown
Contributor

@g-talbot g-talbot commented Apr 29, 2026

Summary

Stacked on #6358 (Phase 3c). Adds the supervisor and feedback loop that make the merge pipeline runnable.

  • ParquetMergePipeline supervisor: spawns all merge actors (publisher → sequencer → uploader → executor → downloader → planner), periodic health-check supervision loop, respawn on failure with backoff, graceful shutdown via FinishPendingMergesAndShutdownPipeline that disconnects feedback and runs finalize policy for cold windows.
  • Publisher feedback: adds parquet_merge_planner_mailbox_opt to Publisher (feature-gated cfg(feature = "metrics")). After successful ParquetSplitsUpdate publish of new ingested splits, sends ParquetNewSplits to the planner. Merge outputs are not fed back (guards against infinite loops).
  • DisconnectMergePlanner extended to clear both Tantivy and Parquet planner mailboxes.

Test plan

  • 3 pipeline tests (spawn+supervise, shutdown drain, initial splits)
  • 3 existing publisher tests pass (no regression)
  • Compiles with and without metrics feature
  • cargo clippy clean, license headers OK

🤖 Generated with Claude Code

@g-talbot g-talbot force-pushed the gtt/parquet-merge-pipeline branch from ceba410 to e96a920 Compare April 29, 2026 14:06
@g-talbot g-talbot changed the base branch from gtt/parquet-merge-pipeline to gtt/parquet-merge-pipeline-3c April 29, 2026 14:07
@g-talbot g-talbot force-pushed the gtt/parquet-merge-pipeline-3c branch from ceba410 to 5937440 Compare April 29, 2026 15:31
@g-talbot g-talbot force-pushed the gtt/parquet-merge-pipeline-3de branch from d82d72d to 92cb5ed Compare April 29, 2026 15:31
@g-talbot g-talbot force-pushed the gtt/parquet-merge-pipeline-3c branch from 5937440 to 0b1c9cc Compare April 29, 2026 18:10
@g-talbot g-talbot force-pushed the gtt/parquet-merge-pipeline-3de branch from 92cb5ed to b6f8bcc Compare April 29, 2026 18:11
@g-talbot g-talbot force-pushed the gtt/parquet-merge-pipeline-3c branch from 0b1c9cc to 66b97e0 Compare April 29, 2026 18:16
@g-talbot g-talbot force-pushed the gtt/parquet-merge-pipeline-3de branch from b6f8bcc to d296774 Compare April 29, 2026 18:16
@g-talbot g-talbot force-pushed the gtt/parquet-merge-pipeline-3c branch from 66b97e0 to 0f051bc Compare April 29, 2026 18:24
@g-talbot g-talbot force-pushed the gtt/parquet-merge-pipeline-3de branch from d296774 to ededb89 Compare April 29, 2026 18:24
@g-talbot g-talbot force-pushed the gtt/parquet-merge-pipeline-3c branch from 0f051bc to 16b46d7 Compare April 29, 2026 18:40
@g-talbot g-talbot force-pushed the gtt/parquet-merge-pipeline-3de branch from ededb89 to 761b379 Compare April 29, 2026 18:42
@g-talbot g-talbot force-pushed the gtt/parquet-merge-pipeline-3c branch from 16b46d7 to 8e19b6b Compare April 29, 2026 18:51
@g-talbot g-talbot force-pushed the gtt/parquet-merge-pipeline-3de branch from 761b379 to 9b3769e Compare April 29, 2026 18:51
@g-talbot g-talbot force-pushed the gtt/parquet-merge-pipeline-3c branch from 8e19b6b to f32bd64 Compare April 29, 2026 19:05
@g-talbot g-talbot force-pushed the gtt/parquet-merge-pipeline-3de branch from 9b3769e to 927dc9f Compare April 29, 2026 19:05
@g-talbot g-talbot force-pushed the gtt/parquet-merge-pipeline-3c branch from f32bd64 to de17c0e Compare April 29, 2026 20:54
@g-talbot g-talbot force-pushed the gtt/parquet-merge-pipeline-3de branch from 927dc9f to c51a84b Compare April 29, 2026 20:54
@g-talbot g-talbot force-pushed the gtt/parquet-merge-pipeline-3c branch from de17c0e to 1f6512e Compare April 30, 2026 02:30
g-talbot and others added 2 commits April 29, 2026 22:30
…se 3d+3e)

Phase 3 pipeline integration, combined supervisor and feedback PR:

- ParquetMergePipeline supervisor: spawns all merge actors (publisher →
  sequencer → uploader → executor → downloader → planner), health-checks
  with periodic supervision loop, respawn on failure with backoff,
  graceful shutdown via FinishPendingMergesAndShutdownPipeline that
  disconnects feedback and runs finalize policy. 3 tests.

- Publisher feedback: add parquet_merge_planner_mailbox_opt to Publisher
  (feature-gated behind cfg(feature = "metrics")). After successful
  ParquetSplitsUpdate publish of new ingested splits, sends ParquetNewSplits
  to the planner. Merge outputs (non-empty replaced_split_ids) are not
  fed back to avoid infinite loops.

- DisconnectMergePlanner extended to clear both Tantivy and Parquet planner
  mailboxes, supporting shutdown drain for both pipeline types.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@g-talbot g-talbot force-pushed the gtt/parquet-merge-pipeline-3de branch from c51a84b to 804e384 Compare April 30, 2026 02:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant