Skip to content

[Data] Decouple sub-progress metrics from progress bar implementations#63195

Open
400Ping wants to merge 14 commits into
ray-project:masterfrom
400Ping:data/decouple-SubProgressMixin
Open

[Data] Decouple sub-progress metrics from progress bar implementations#63195
400Ping wants to merge 14 commits into
ray-project:masterfrom
400Ping:data/decouple-SubProgressMixin

Conversation

@400Ping
Copy link
Copy Markdown
Contributor

@400Ping 400Ping commented May 7, 2026

Description

This PR refactors Ray Data sub-progress reporting so operators expose sub-progress
as structured metrics instead of receiving or holding concrete progress bar
objects.

The main goal is to decouple execution state from display/rendering state:

Before:
Operator <-> ProgressManager <-> ProgressBar object

After:
Operator / execution path -> ProgressMetrics / SubProgressUpdater -> ProgressManager -> display backend

This addresses #60082 by removing the bidirectional coupling where progress
managers created sub-progress bars and injected them back into operators.

What Changed

Sub-progress API

  • Introduces ProgressMetrics as an immutable snapshot of sub-progress state:
    • name
    • total
    • completed
  • Updates SubProgressMixin to expose:
    • get_sub_progress_metrics()
    • get_sub_progress_updaters()
  • Adds shared helper logic for creating sub-progress metrics/updaters.

SubProgressUpdater

  • SubProgressUpdater now updates ProgressMetrics snapshots instead of
    wrapping concrete progress bar objects.
  • Removes the previous path where progress managers injected concrete display
    bars back into operators. Progress managers now attach display-sync callbacks
    to driver-side updaters instead.
  • Adds update callbacks so progress managers can sync display state immediately
    when metrics change.
  • Adds locking around metrics updates to avoid read-modify-write races.

Progress Managers

  • tqdm and rich progress managers now own their concrete display bars and
    sync them from ProgressMetrics updates.
  • logging progress manager reads metrics directly instead of using a fake
    logging sub-progress bar.
  • noop progress manager no longer needs to attach None display bars.

Operators and Schedulers

Updates sub-progress users to the new metrics/updater model, including:

  • AllToAllOperator
  • CPU hash shuffle
  • GPU shuffle
  • sort / aggregate sampling
  • pull-based shuffle scheduler
  • push-based shuffle scheduler
  • split repartition scheduler

Why This Design

Operators should own execution state, while progress managers should own
rendering.

This keeps the operator layer focused on reporting progress as data, and keeps
display-specific behavior inside progress managers. It also avoids making
operators depend on tqdm, rich, logging, or noop progress bar
implementations.

The resulting model is:

Execution state:
  ProgressMetrics(name, total, completed)

Mutation path:
  SubProgressUpdater.update(...)

Rendering path:
  ProgressManager syncs display backend from metrics

Related issues

Closes #60082

400Ping added 6 commits May 7, 2026 13:06
Signed-off-by: 400Ping <jiekaichang@apache.org>
Signed-off-by: 400Ping <jiekaichang@apache.org>
Signed-off-by: 400Ping <jiekaichang@apache.org>
Signed-off-by: 400Ping <jiekaichang@apache.org>
Signed-off-by: 400Ping <jiekaichang@apache.org>
Signed-off-by: 400Ping <jiekaichang@apache.org>
@400Ping 400Ping requested a review from a team as a code owner May 7, 2026 16:16
@400Ping 400Ping changed the title [Data] Decouple sub-progress metrics from progress bar implementations #62262 [Data] Decouple sub-progress metrics from progress bar implementations May 7, 2026
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the sub-progress tracking system in Ray Data, transitioning from direct progress bar manipulation to a metrics-based architecture. It replaces SubProgressBarMixin with SubProgressMixin and introduces ProgressMetrics for immutable state snapshots along with SubProgressUpdater for driver-side updates. Several operators and progress managers (Logging, Rich, and TQDM) have been updated to support this new system. Feedback identifies critical issues: SubProgressUpdater is missing required abstract methods (close, refresh) from its base class, which will prevent instantiation; TqdmSubProgressBar lacks the update_absolute method called by the progress manager; and the inclusion of a non-serializable threading.Lock in SubProgressUpdater will likely cause serialization failures when TaskContext is sent to worker tasks.

Comment thread python/ray/data/_internal/progress/base_progress.py
Comment thread python/ray/data/_internal/progress/tqdm_progress.py
Comment thread python/ray/data/_internal/progress/base_progress.py
Signed-off-by: 400Ping <jiekaichang@apache.org>
Comment thread python/ray/data/_internal/progress/base_progress.py Outdated
Signed-off-by: 400Ping <jiekaichang@apache.org>
Comment thread python/ray/data/_internal/progress/base_progress.py
Signed-off-by: 400Ping <jiekaichang@apache.org>
Comment thread python/ray/data/_internal/progress/rich_progress.py
Comment thread python/ray/data/_internal/progress/logging_progress.py Outdated
Signed-off-by: 400Ping <jiekaichang@apache.org>
Comment thread python/ray/data/_internal/progress/tqdm_progress.py Outdated
@ray-gardener ray-gardener Bot added data Ray Data-related issues community-contribution Contributed by the community labels May 7, 2026
400Ping added 2 commits May 8, 2026 20:39
Signed-off-by: 400Ping <jiekaichang@apache.org>
Signed-off-by: 400Ping <jiekaichang@apache.org>
Comment thread python/ray/data/_internal/progress/rich_progress.py
Signed-off-by: 400Ping <jiekaichang@apache.org>
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

Reviewed by Cursor Bugbot for commit a1fbc6c. Configure here.

Comment thread python/ray/data/_internal/execution/operators/sub_progress.py Outdated
Signed-off-by: 400Ping <jiekaichang@apache.org>
@400Ping
Copy link
Copy Markdown
Contributor Author

400Ping commented May 15, 2026

cc @kyuds to take a look

@github-actions
Copy link
Copy Markdown

This pull request has been automatically marked as stale because it has not had
any activity for 14 days. It will be closed in another 14 days if no further activity occurs.
Thank you for your contributions.

You can always ask for help on our discussion forum or Ray's public slack channel.

If you'd like to keep this open, just leave any comment, and the stale label will be removed.

@github-actions github-actions Bot added the stale The issue is stale. It will be closed within 7 days unless there are further conversation label May 30, 2026
@kyuds kyuds removed the stale The issue is stale. It will be closed within 7 days unless there are further conversation label May 30, 2026
@kyuds kyuds self-requested a review May 30, 2026 02:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

community-contribution Contributed by the community data Ray Data-related issues

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Data] Decouple SubProgressMixin from progress bar

2 participants