[data][llm] Promote max_tasks_in_flight_per_actor to a first-class config field and adjust defaults#63214
Conversation
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
max_tasks_in_flight_per_actor to a first-class config field and adjust defaults
There was a problem hiding this comment.
Code Review
This pull request promotes max_tasks_in_flight_per_actor from an experimental configuration to a top-level field in OfflineProcessorConfig and its subclasses. It introduces deprecation warnings for the experimental key and implements a resolution strategy that defaults to a calculated value based on max_concurrent_batches. Feedback identifies a potential type mismatch where a float could be assigned to an integer field when bypassing Pydantic validation, suggesting an explicit integer cast to ensure compatibility with Ray Data's actor pool.
| * DEFAULT_ACTOR_MAX_TASKS_IN_FLIGHT_TO_MAX_CONCURRENCY_FACTOR, | ||
| ) | ||
| # Bypass `validate_assignment=True` so we don't re-fire the deprecation warning | ||
| object.__setattr__(self, "max_tasks_in_flight_per_actor", resolved) |
There was a problem hiding this comment.
The resolved value for max_tasks_in_flight_per_actor can be a float, particularly when calculated using DEFAULT_ACTOR_MAX_TASKS_IN_FLIGHT_TO_MAX_CONCURRENCY_FACTOR, which is defined as a float. The max_tasks_in_flight_per_actor field is typed as Optional[int], but using object.__setattr__ bypasses Pydantic's type coercion. This could result in a float value being passed to ray.data.ActorPoolStrategy, which expects an integer and may lead to unexpected behavior or a runtime error.
To ensure an integer is always assigned, the resolved value should be explicitly cast to int. This would also align with the original logic in Ray Data's actor pool, which performs this integer conversion.
| object.__setattr__(self, "max_tasks_in_flight_per_actor", resolved) | |
| object.__setattr__(self, "max_tasks_in_flight_per_actor", int(resolved)) |
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
Aydin-ab
left a comment
There was a problem hiding this comment.
making a bit more explicit in the doc that it's 2 * max_concurrent_batches
kouroshHakha
left a comment
There was a problem hiding this comment.
The approach is sound — using a Pydantic mode="after" validator to eagerly resolve the None sentinel is clean, and the resolution order (explicit > experimental > formula) is implemented correctly. The behavioral no-op for default users (8×2=16) is a good property.
Note
This review was co-written with AI assistance (Claude Code).
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
…config field and adjust defaults (ray-project#63214) Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
…config field and adjust defaults (ray-project#63214) Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
…config field and adjust defaults (ray-project#63214) Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com> Signed-off-by: anindyam1969 <amukherjee@kinetica.com>
…config field and adjust defaults (ray-project#63214) Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
…config field and adjust defaults (ray-project#63214) Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com> Signed-off-by: Alexandr Plashchinsky <alexandr.plashchinsky@alexandrplashchinsky-H765G66H9V.local>
Why
Ray Data LLM hardcoded
DEFAULT_MAX_TASKS_IN_FLIGHT = 16instead of using Ray Data's actor-pool fallback, which (a) didn't trackmax_concurrent_batcheswhen users tuned it and (b) bypassed bothDataContext.max_tasks_in_flight_per_actorand the env-var override of the factor.What changes?
OfflineProcessorConfig.max_tasks_in_flight_per_actor: Optional[int] = None.DEFAULT_MAX_TASKS_IN_FLIGHT = 16constant; engine processors passconfig.max_tasks_in_flight_per_actorstraight through toActorPoolStrategy(includingNone).16→max_concurrent_batches × FACTOR, resolved by Ray Data's actor pool.DataContext.max_tasks_in_flight_per_actorandRAY_DATA_ACTOR_DEFAULT_MAX_TASKS_IN_FLIGHT_TO_MAX_CONCURRENCY_FACTORare now honored (previously bypassed by the explicit16).experimental["max_tasks_in_flight_per_actor"]is deprecated: migrated to the new field at construction with alogger.warning. Top-level field wins if both are set.Original API
New API
Behavior changes
RAY_DATA_ACTOR_DEFAULT_MAX_TASKS_IN_FLIGHT_TO_MAX_CONCURRENCY_FACTORnow have their override honored by Ray Data LLM (previously ignored).experimentalstill works but logs a deprecation warning. The top-level field overridesexperimentalif both are set.max_concurrent_batchesmax_tasks_in_flight_per_actormax_concurrencyRelated issues
Additional information