
Add routing microbenchmark for choose_replica + dispatch pattern #63293

Merged
abrarsheikh merged 1 commit into master from decouple-routing-primitives-benchmark on Jun 1, 2026

Conversation

jeffreywang-anyscale (Contributor) commented May 12, 2026

Description

Following up on #63255 (comment), this PR adds a microbenchmark so that our DB dashboards can show the latency delta between choose_replica + dispatch and remote.

Release test results -- latencies in ms (https://buildkite.com/ray-project/release/builds/92517/canvas?sid=019e1a3c-642f-4a42-867b-2818577272b6&tab=output)

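| Percentile | remote | choose_replica + dispatch | Delta |
|------------|--------|---------------------------|-------|
| p50        | 1.040  | 1.784                     | +0.74 |
| p90        | 1.149  | 1.922                     | +0.77 |
| p95        | 1.199  | 1.970                     | +0.77 |
| p99        | 1.437  | 2.137                     | +0.70 |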
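
For readers who want to reproduce the Delta column, here is a minimal sketch. The latency values are copied from the results above; the `percentile` helper is hypothetical and uses the nearest-rank method, which may differ from how the release test actually aggregates its samples.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list (hypothetical helper)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Latencies in ms, copied from the release test results above.
remote = {"p50": 1.040, "p90": 1.149, "p95": 1.199, "p99": 1.437}
choose_dispatch = {"p50": 1.784, "p90": 1.922, "p95": 1.970, "p99": 2.137}

# Delta = choose_replica + dispatch latency minus remote latency, rounded to 2 places.
deltas = {key: round(choose_dispatch[key] - remote[key], 2) for key in remote}

for key, delta in deltas.items():
    print(f"{key}: +{delta:.2f} ms")
```

The overhead is roughly constant across percentiles (about 0.7 to 0.8 ms), which suggests a fixed per-request cost rather than one that grows in the tail.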


jeffreywang-anyscale requested a review from a team as a code owner May 12, 2026 03:10
jeffreywang-anyscale added the "go" label (add ONLY when ready to merge, run all tests) May 12, 2026
Base automatically changed from decouple-routing-primitives-3 to master May 12, 2026 03:16
jeffreywang-anyscale (Contributor, Author) commented May 12, 2026

Kicking off release tests to populate data so that we can verify the Databricks query.

gemini-code-assist Bot (Contributor) left a comment

Code Review

This pull request introduces a new benchmarking mode, 'choose_dispatch', to evaluate the latency of the choose_replica and dispatch pattern in Ray Serve. The changes include adding a new benchmarking method to the Benchmarker class, updating the handle_noop_latency script with a new CLI option, and expanding the microbenchmark workloads to include this new mode. A review comment suggests refactoring the run_latency_benchmark method to avoid duplicating the definition of the internal benchmark function, which would improve code maintainability.

Comment on lines +313 to +324
        if mode == "remote":

            async def f():
                await self.do_single_request(payload)

        elif mode == "choose_dispatch":

            async def f():
                await self.do_single_choose_dispatch(payload)

        else:
            raise ValueError(f"Unknown mode {mode!r}")

medium

The current implementation defines the local function f twice within different branches of the if statement. This logic can be refactored to be more concise and maintainable by assigning the target method to a variable first, and then defining f once.

        if mode == "remote":
            func = self.do_single_request
        elif mode == "choose_dispatch":
            func = self.do_single_choose_dispatch
        else:
            raise ValueError(f"Unknown mode {mode!r}")

        async def f():
            await func(payload)
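
A self-contained sketch of the suggested refactor, for anyone who wants to try it outside the Ray codebase. The `Benchmarker` class here is a stand-in with stubbed request methods, and the `num_requests` parameter and `calls` list are hypothetical additions for illustration; the real `run_latency_benchmark` signature is not shown in this thread.

```python
import asyncio


class Benchmarker:
    """Stand-in for the real benchmark class; request methods are stubs."""

    def __init__(self):
        # Records (mode, payload) pairs so the dispatch path can be inspected.
        self.calls = []

    async def do_single_request(self, payload):
        self.calls.append(("remote", payload))

    async def do_single_choose_dispatch(self, payload):
        self.calls.append(("choose_dispatch", payload))

    async def run_latency_benchmark(self, mode, payload, num_requests=3):
        # Pick the bound method once, then define the wrapper a single time.
        if mode == "remote":
            func = self.do_single_request
        elif mode == "choose_dispatch":
            func = self.do_single_choose_dispatch
        else:
            raise ValueError(f"Unknown mode {mode!r}")

        async def f():
            await func(payload)

        for _ in range(num_requests):
            await f()


bench = Benchmarker()
asyncio.run(bench.run_latency_benchmark("choose_dispatch", b"x"))
print(bench.calls)
```

Selecting the method up front keeps the unknown-mode check in one place, so adding a third mode only requires one more branch rather than another copy of `f`.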

ray-gardener Bot added the "serve" label (Ray Serve Related Issue) May 12, 2026
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
jeffreywang-anyscale force-pushed the decouple-routing-primitives-benchmark branch from 4ae7060 to 7da35fb May 22, 2026 06:30
abrarsheikh merged commit 02ba033 into master Jun 1, 2026
6 checks passed
abrarsheikh deleted the decouple-routing-primitives-benchmark branch June 1, 2026 18:09
