[data][llm] Promote max_tasks_in_flight_per_actor to a first-class config field and adjust defaults#63214

Merged
kouroshHakha merged 3 commits into master from data-llm-max-tasks-in-flight on May 8, 2026

@jeffreywang-anyscale jeffreywang-anyscale commented May 8, 2026

Why

Ray Data LLM hardcoded DEFAULT_MAX_TASKS_IN_FLIGHT = 16 instead of using Ray Data's actor-pool fallback. As a result, the in-flight cap (a) did not track max_concurrent_batches when users tuned it and (b) bypassed both DataContext.max_tasks_in_flight_per_actor and the environment-variable override of the factor.

What changes?

  • New top-level field OfflineProcessorConfig.max_tasks_in_flight_per_actor: Optional[int] = None.
  • Removed the DEFAULT_MAX_TASKS_IN_FLIGHT = 16 constant; engine processors pass config.max_tasks_in_flight_per_actor straight through to ActorPoolStrategy (including None).
  • Default in-flight cap: changes from a hardcoded 16 to max_concurrent_batches × FACTOR, resolved by Ray Data's actor pool.
  • DataContext.max_tasks_in_flight_per_actor and RAY_DATA_ACTOR_DEFAULT_MAX_TASKS_IN_FLIGHT_TO_MAX_CONCURRENCY_FACTOR are now honored (previously bypassed by the explicit 16).
  • experimental["max_tasks_in_flight_per_actor"] is deprecated: migrated to the new field at construction with a logger.warning. Top-level field wins if both are set.
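
The deprecation path in the last bullet can be sketched roughly as follows. This is a minimal, stdlib-only illustration, not the code from this PR: `SketchProcessorConfig` is a hypothetical stand-in for OfflineProcessorConfig (which is a Pydantic model), and leaving the legacy key in place inside `experimental` after migration is an assumption.

```python
import logging
from dataclasses import dataclass, field
from typing import Any, Dict, Optional

logger = logging.getLogger(__name__)

_DEPRECATED_KEY = "max_tasks_in_flight_per_actor"


@dataclass
class SketchProcessorConfig:
    """Hypothetical stand-in for OfflineProcessorConfig (not the real class)."""

    max_concurrent_batches: int = 8
    max_tasks_in_flight_per_actor: Optional[int] = None
    experimental: Dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        if _DEPRECATED_KEY not in self.experimental:
            return
        # Warn once, at construction time, whenever the legacy key is present.
        logger.warning(
            "experimental[%r] is deprecated; set the top-level "
            "max_tasks_in_flight_per_actor field instead.",
            _DEPRECATED_KEY,
        )
        # The top-level field wins if both are set; otherwise migrate the value.
        if self.max_tasks_in_flight_per_actor is None:
            self.max_tasks_in_flight_per_actor = self.experimental[_DEPRECATED_KEY]


legacy = SketchProcessorConfig(experimental={_DEPRECATED_KEY: 32})
both = SketchProcessorConfig(
    max_tasks_in_flight_per_actor=50,
    experimental={_DEPRECATED_KEY: 32},
)
print(legacy.max_tasks_in_flight_per_actor)  # 32 (migrated from experimental)
print(both.max_tasks_in_flight_per_actor)    # 50 (top-level field wins)
```

When neither value is set, the field stays None in this sketch, which corresponds to deferring to Ray Data's actor-pool default.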

Original API

OfflineProcessorConfig(
    ...,
    experimental={"max_tasks_in_flight_per_actor": 32},  # only knob
)

New API

OfflineProcessorConfig(
    ...,
    max_concurrent_batches=8,           # unchanged
    max_tasks_in_flight_per_actor=32,   # new top-level field, Optional[int]
)

Behavior changes

  • Users who set RAY_DATA_ACTOR_DEFAULT_MAX_TASKS_IN_FLIGHT_TO_MAX_CONCURRENCY_FACTOR now have their override honored by Ray Data LLM (previously ignored).
  • Setting via experimental still works but logs a deprecation warning. The top-level field overrides experimental if both are set.

| max_concurrent_batches | max_tasks_in_flight_per_actor | Ray actor max_concurrency | Effective in-flight cap |
| --- | --- | --- | --- |
| unset (default 8) | unset (None) | 8 | 16 |
| 16 | unset (None) | 16 | 32 |
| unset (default 8) | 50 | 8 | 50 |
| 16 | 50 | 16 | 50 |
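
The four rows above follow from one rule: an explicit max_tasks_in_flight_per_actor is used as-is, otherwise the cap is max_concurrent_batches × FACTOR. The sketch below reproduces the table under the assumption that FACTOR is 2 (inferred from the 8 → 16 default row). The function name `effective_in_flight_cap` is hypothetical, and overrides via DataContext or the environment variable are not modeled.

```python
from typing import Optional

ASSUMED_FACTOR = 2  # assumption: inferred from the default row (8 -> 16)


def effective_in_flight_cap(
    max_concurrent_batches: int = 8,
    max_tasks_in_flight_per_actor: Optional[int] = None,
    factor: float = ASSUMED_FACTOR,
) -> int:
    """Illustrative only: mirrors the behavior table, not Ray's implementation."""
    if max_tasks_in_flight_per_actor is not None:
        # An explicit value is independent of max_concurrent_batches.
        return max_tasks_in_flight_per_actor
    # Fallback: scale with the per-actor concurrency.
    return int(max_concurrent_batches * factor)


rows = [
    (8, None),   # both unset
    (16, None),  # only max_concurrent_batches tuned
    (8, 50),     # only the in-flight cap set
    (16, 50),    # both set
]
caps = [effective_in_flight_cap(batches, cap) for batches, cap in rows]
print(caps)  # [16, 32, 50, 50]
```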

Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
@jeffreywang-anyscale jeffreywang-anyscale requested a review from a team as a code owner May 8, 2026 00:46
@jeffreywang-anyscale jeffreywang-anyscale added the go add ONLY when ready to merge, run all tests label May 8, 2026
@jeffreywang-anyscale jeffreywang-anyscale changed the title [data][llm] Promote to a first-class config field and adjust defaults [data][llm] Promote max_tasks_in_flight_per_actor to a first-class config field and adjust defaults May 8, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request promotes max_tasks_in_flight_per_actor from an experimental configuration to a top-level field in OfflineProcessorConfig and its subclasses. It introduces deprecation warnings for the experimental key and implements a resolution strategy that defaults to a calculated value based on max_concurrent_batches. Feedback identifies a potential type mismatch where a float could be assigned to an integer field when bypassing Pydantic validation, suggesting an explicit integer cast to ensure compatibility with Ray Data's actor pool.

* DEFAULT_ACTOR_MAX_TASKS_IN_FLIGHT_TO_MAX_CONCURRENCY_FACTOR,
)
# Bypass `validate_assignment=True` so we don't re-fire the deprecation warning
object.__setattr__(self, "max_tasks_in_flight_per_actor", resolved)

Priority: high

The resolved value for max_tasks_in_flight_per_actor can be a float, particularly when calculated using DEFAULT_ACTOR_MAX_TASKS_IN_FLIGHT_TO_MAX_CONCURRENCY_FACTOR, which is defined as a float. The max_tasks_in_flight_per_actor field is typed as Optional[int], but using object.__setattr__ bypasses Pydantic's type coercion. This could result in a float value being passed to ray.data.ActorPoolStrategy, which expects an integer and may lead to unexpected behavior or a runtime error.

To ensure an integer is always assigned, the resolved value should be explicitly cast to int. This would also align with the original logic in Ray Data's actor pool, which performs this integer conversion.

Suggested change
- object.__setattr__(self, "max_tasks_in_flight_per_actor", resolved)
+ object.__setattr__(self, "max_tasks_in_flight_per_actor", int(resolved))
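
To illustrate the type concern behind this suggestion: in Python, multiplying an int by a float always yields a float, so a formula-derived cap needs an explicit cast before it is stored in an int-typed field without validation. The snippet below is a standalone sketch. `default_in_flight_cap` is a hypothetical helper, and the factor is passed as a parameter rather than read from Ray's constant.

```python
def default_in_flight_cap(max_concurrent_batches: int, factor: float) -> int:
    """Hypothetical helper: compute the formula-based cap and force an int."""
    resolved = max_concurrent_batches * factor  # int * float is a float
    return int(resolved)  # int() truncates toward zero


raw = 8 * 2.0
cap = default_in_flight_cap(8, 2.0)
print(type(raw).__name__, raw)  # float 16.0
print(type(cap).__name__, cap)  # int 16

# With a fractional factor, the cast drops the fractional part.
print(default_in_flight_cap(3, 2.5))  # 7
```

Note that `int()` truncates rather than rounds, which only matters if the factor is overridden with a non-integer value.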

Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
@ray-gardener ray-gardener Bot added the data Ray Data-related issues label May 8, 2026

@Aydin-ab Aydin-ab left a comment

Suggest making it a bit more explicit in the docs that the default is 2 * max_concurrent_batches.

Comment thread python/ray/data/llm.py Outdated
Comment thread python/ray/data/llm.py Outdated
Comment thread python/ray/llm/_internal/batch/processor/base.py Outdated

@kouroshHakha kouroshHakha left a comment

The approach is sound — using a Pydantic mode="after" validator to eagerly resolve the None sentinel is clean, and the resolution order (explicit > experimental > formula) is implemented correctly. The behavioral no-op for default users (8×2=16) is a good property.

Note

This review was co-written with AI assistance (Claude Code).

Comment thread python/ray/llm/_internal/batch/processor/base.py
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
@kouroshHakha kouroshHakha enabled auto-merge (squash) May 8, 2026 21:09
@kouroshHakha kouroshHakha merged commit 75f55e3 into master May 8, 2026
7 checks passed
@kouroshHakha kouroshHakha deleted the data-llm-max-tasks-in-flight branch May 8, 2026 21:42