[4/N] [HAProxy stability] - coalesce controller broadcasts into a single reload by harshit-anyscale · Pull Request #63623 · ray-project/ray

harshit-anyscale · 2026-05-25T11:49:30Z

Summary

Carved out of #63308 so the coalescing change can land independently of the stderr-redirect (#63621) and redispatch (#63622) changes in that PR.

Under autoscaling churn the controller's target_groups broadcast fires frequently — its replica set changes on essentially every reconcile that adds or removes a replica — and each broadcast triggers its own config regeneration and graceful reload via -sf. Both target_groups and fallback_targets are emitted from the same control-loop step, so when both change in one ~100 ms tick they reach the proxy microseconds apart and (pre-coalescing) cause two back-to-back reloads.

Both broadcasts are already change-gated on the controller (broadcast_*_if_changed), so this PR collapses genuine consecutive / same-tick changes into one reload — it is not suppressing redundant no-op broadcasts. (fallback_targets in particular is near-static in steady state; it only changes when the fallback proxy's health flips or its actor is replaced.)

This PR collapses adjacent broadcasts into a single reload:

HAProxyManager keeps two pieces of state: _update_pending: bool and _coalesce_task: Optional[asyncio.Task].
Each incoming broadcast sets _update_pending = True and arms (or re-arms) a single sleeping coalesce task.
The task sleeps for RAY_SERVE_HAPROXY_BROADCAST_COALESCE_S (default 100 ms; set to 0 to disable) and then runs one _update_haproxy_backends() call against whatever the latest state is.
Broadcasts arriving inside the window are absorbed — they flip the dirty flag, but the task is already armed, so no second reload is scheduled.

Failure handling. If the apply fails, the task re-arms itself and retries on the next tick. After 3 consecutive failures it stops re-arming and waits for the next broadcast to trigger fresh state — this prevents busy-spinning against a persistent error (e.g. HAProxy crashed and is stuck restarting) while still recovering automatically once a new broadcast lands.

Knob. RAY_SERVE_HAPROXY_BROADCAST_COALESCE_S (env var, float seconds). 100 ms by default — small enough that the worst-case extra latency from a single broadcast is bounded, large enough to catch the common back-to-back pattern observed during scale events. Set to 0 to opt out entirely.

gemini-code-assist

Code Review

This pull request introduces a coalescing mechanism for HAProxy reloads to prevent excessive churn during rapid autoscaling events. It adds a configurable window, RAY_SERVE_HAPROXY_BROADCAST_COALESCE_S, and implements a background task to batch update requests. Feedback indicates that the current retry logic is ineffective because the backend update method is synchronous and does not await the actual reload, meaning exceptions won't be caught as intended. Additionally, improvements were suggested regarding the use of standardized environment variable parsing and the removal of redundant exception information in log messages.

Under autoscaling churn the Serve controller fires `target_groups` and `fallback_targets` broadcasts independently, often only tens of ms apart. Without coalescing, each broadcast triggers its own config regeneration and graceful reload via `-sf`. HAProxyManager now marks state dirty on each broadcast and arms a single sleeping coalesce task; updates arriving during the sleep window are absorbed into the same pending apply. Window defaults to 100 ms via RAY_SERVE_HAPROXY_BROADCAST_COALESCE_S; set to 0 to disable. If an apply fails, the task re-arms and retries on the next tick (up to 3 consecutive failures, then waits for a new broadcast — avoids busy-spinning on a persistent error). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

akyang-anyscale · 2026-05-27T18:12:12Z

Under autoscaling churn the Serve controller fires target_groups and fallback_targets broadcasts independently, often only tens of ms apart.

Why are fallback_targets being broadcasted so frequently if it only include the head node serve proxy? We should fix that if that's the case.

Add a serve_haproxy_update_latency_s histogram measuring seconds from the first coalesced controller broadcast to the HAProxy reload completing. It spans the coalesce window, any time queued behind an in-flight reload, and the reload itself — so a slow-updates investigation can see which part dominates. Latency is measured from the first broadcast since the last apply (_coalesce_armed_at), so coalesced broadcasts report one end-to-end sample, and a failed-then-retried apply still reflects the full delay. Addresses review request on ray-project#63623. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Add a serve_haproxy_update_latency_s histogram measuring seconds from the first coalesced controller broadcast to the HAProxy reload completing. It spans the coalesce window, any time queued behind an in-flight reload, and the reload itself — so a slow-updates investigation can see which part dominates. Latency is measured from the first broadcast since the last apply (_coalesce_armed_at), so coalesced broadcasts report one end-to-end sample, and a failed-then-retried apply still reflects the full delay. All broadcasts now run through the apply task; a coalesce window of 0 just skips the sleep (apply immediately) instead of taking a separate synchronous path. This keeps the latency metric recorded regardless of the coalesce setting — including when it is disabled for debugging. Addresses review request on ray-project#63623. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

harshit-anyscale · 2026-05-29T11:03:16Z

Under autoscaling churn the Serve controller fires target_groups and fallback_targets broadcasts independently, often only tens of ms apart.

Why are fallback_targets being broadcasted so frequently if it only include the head node serve proxy? We should fix that if that's the case.

my bad, target_groups are getting broadcasted very frequently, not fallback. thanks for pointing, change description as well.

harshit-anyscale · 2026-05-29T11:05:22Z

You're right — fallback_targets is change-gated (broadcast_fallback_targets_if_changed) and the target it carries (the head-node proxy's ip / port / instance_id / name, a frozen pydantic Target) is stable, so it does not broadcast frequently in steady state. It only re-broadcasts when the fallback proxy's health flips (HEALTHY ↔ not, which toggles the value to/from {}) or the fallback actor is replaced.

My description overstated it: the frequent broadcaster under churn is target_groups (its replica set changes on essentially every reconcile that adds/removes a replica). Since both are emitted from the same control-loop step, coalescing mainly collapses (a) consecutive target_groups changes across reconciles and (b) the case where both change in the same ~100 ms tick and arrive microseconds apart.

The repeated fallback_targets churn we actually observed was the fallback proxy's health flapping under load — addressed separately by the lenient fallback health-check change, not by coalescing. I've updated the PR description to reflect this.

— Claude (Claude Code)

_update_haproxy_backends previously returned create_task(_reload_haproxy()) and its only caller, _coalesce_and_apply, awaited that task. The task was leftover indirection from when the synchronous long-poll callbacks called this directly; coalescing moved the sync->async boundary up into _schedule_haproxy_update, so the only caller is now async. Make _update_haproxy_backends async and await _reload_haproxy() directly — same failure propagation and serialization, one less task and no orphaned reload if the coalesce task is cancelled. Addresses review feedback on ray-project#63623. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

abrarsheikh · 2026-06-02T22:26:17Z

+        self._update_pending: bool = False
+        self._coalesce_task: Optional[asyncio.Task] = None
+        self._coalesce_armed_at: Optional[float] = None
+        self._haproxy_update_latency = metrics.Histogram(


include the metrics in the monitoring docs

abrarsheikh · 2026-06-02T22:26:41Z

+                "HAProxy reload completing. Includes the coalesce window, time "
+                "queued behind an in-flight reload, and the reload itself."
+            ),
+            boundaries=[0.05, 0.1, 0.25, 0.5, 1, 2, 5, 10, 30],


move boundaries in constants.py

harshit-anyscale requested a review from a team as a code owner May 25, 2026 11:49

harshit-anyscale self-assigned this May 25, 2026

harshit-anyscale added serve Ray Serve Related Issue go add ONLY when ready to merge, run all tests labels May 25, 2026

gemini-code-assist Bot reviewed May 25, 2026

View reviewed changes

Comment thread python/ray/serve/_private/haproxy.py

Comment thread python/ray/serve/_private/constants.py Outdated

Comment thread python/ray/serve/_private/haproxy.py Outdated

harshit-anyscale changed the title ~~[serve] HAProxy: coalesce controller broadcasts into a single reload~~ [4/N] [HAProxy stability] - coalesce controller broadcasts into a single reload May 25, 2026

harshit-anyscale force-pushed the serve-haproxy-coalesce-broadcasts branch from 9f1c879 to 4d487f0 Compare May 25, 2026 11:54

harshit-anyscale requested review from abrarsheikh and akyang-anyscale May 25, 2026 14:29

akyang-anyscale reviewed May 27, 2026

View reviewed changes

Comment thread python/ray/serve/_private/haproxy.py Outdated

harshit-anyscale force-pushed the serve-haproxy-coalesce-broadcasts branch from b785566 to 702207a Compare May 29, 2026 10:46

harshit-anyscale requested a review from akyang-anyscale May 29, 2026 11:07

Merge branch 'master' into serve-haproxy-coalesce-broadcasts

363d06d

akyang-anyscale approved these changes May 29, 2026

View reviewed changes

eicherseiji reviewed Jun 1, 2026

View reviewed changes

Comment thread python/ray/serve/_private/haproxy.py Outdated

harshit-anyscale requested a review from eicherseiji June 2, 2026 08:43

eicherseiji approved these changes Jun 2, 2026

View reviewed changes

abrarsheikh reviewed Jun 2, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[4/N] [HAProxy stability] - coalesce controller broadcasts into a single reload#63623

[4/N] [HAProxy stability] - coalesce controller broadcasts into a single reload#63623
harshit-anyscale wants to merge 4 commits into
ray-project:masterfrom
harshit-anyscale:serve-haproxy-coalesce-broadcasts

harshit-anyscale commented May 25, 2026 •

edited

Loading

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

akyang-anyscale commented May 27, 2026

Uh oh!

Uh oh!

harshit-anyscale commented May 29, 2026

Uh oh!

harshit-anyscale commented May 29, 2026

Uh oh!

Uh oh!

abrarsheikh Jun 2, 2026

Uh oh!

abrarsheikh Jun 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

harshit-anyscale commented May 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

akyang-anyscale commented May 27, 2026

Uh oh!

Uh oh!

harshit-anyscale commented May 29, 2026

Uh oh!

harshit-anyscale commented May 29, 2026

Uh oh!

Uh oh!

abrarsheikh Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

abrarsheikh Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

harshit-anyscale commented May 25, 2026 •

edited

Loading