[Serve] decouple routing primitives #60865
Closed
machichima wants to merge 63 commits into ray-project:master from machichima:59792-serve-decouple-routing-primitives
Commits
4dc1d4e feat: add DeploymentHandle skeleton (machichima)
d85b8eb feat: add AsyncioRouter skeleton (machichima)
8140942 fix: type and undefined err (machichima)
1e909ca test: update FakeReplica and add TestChooseReplica (machichima)
3df3795 feat+test: choose_replica + ensure slot reserved (machichima)
07a8962 feat: enable calling from handle layer (machichima)
3faa9a8 feat: dispatch try_send_request (machichima)
470958c test: ensure dispatch send request to the correct replica (machichima)
b0c792c fix: args/kwargs type + verify same deployment id (machichima)
008b190 test: ensure single/stack pattern works (machichima)
1deff34 fix: account for reserved slots when checking avail (machichima)
88db9dd refactor: lint and format (machichima)
c6f9b45 refactor: move reserved slots metrics to RouterMetricsManager (machichima)
f7b30bb refactor: lint and docstring (machichima)
0090c5c refactor: remove unused code (machichima)
504bb8a refactor: clean-up test comments (machichima)
6171af9 Merge branch 'master' into 59792-serve-decouple-routing-primitives (machichima)
f529d68 test: add placeholder method to LocalRouter (machichima)
c460e57 fix: also release slot after dispatch (machichima)
a3048a4 test: fix param type + release slot after dispatch (machichima)
56a6a89 fix: choose_replica run on router thread loop (machichima)
732215f fix: check if already dispatched (machichima)
251f9e7 fix: dispatch inc metrics + add completion callback (machichima)
e1ef1fa fix: add wrap_queued_request in choose_replica (machichima)
4fe9ebb fix: skip choose_replica and dispatch test in local mode (machichima)
4009306 fix: manual decrease cache only when not dispatched (machichima)
642f05f fix: call request_counter.inc in DeploymentHandle._choose_replica (machichima)
a074356 refactor: precommit (machichima)
2b5ce83 Merge branch 'master' into 59792-serve-decouple-routing-primitives (machichima)
1d4d735 fix: mark dispatch after check replica available (machichima)
3ee929a fix: prevent double count in cache mode (machichima)
3327986 fix: router test error (machichima)
f7fc4c0 fix: api docs ReplicaUnavailableError issue (machichima)
1683ff4 fix: decrease queue len when finished or no callback (machichima)
edd7ccf refactor: _deployment_handle to optional (machichima)
afef7b3 refactor: add ReplicaUnavailableError to serve public API (machichima)
21547d6 Merge branch 'master' of github.com:ray-project/ray into 59792-serve-… (machichima)
2852468 fix: fix circular deps by setting deployment_id instead (machichima)
bc58182 refactor: extract common logic to utility function (machichima)
de7e6e3 fix: move import to the top of the file (machichima)
58f42f7 refactor: choose_replica to async (machichima)
174f7bd refactor: fix docstrings (machichima)
fe842d0 refactor: better document on_replica_result_finished (machichima)
c01822b fix: remove redundant code as we always in cache mode (machichima)
7fe198b refactor: make reserved_slots comment accurate (machichima)
446e7d3 fix: reuse the RequestMetadata created from choose_replica (machichima)
1c5c9dc refactor: fix docstring (machichima)
62ea65e test: multiple dispatch calls fail (machichima)
9cae538 test: dispatch won't be rejected even when reach the threshold (machichima)
80b3c07 test: ensure specific error raised in choose_replica (machichima)
ad7c561 test: streaming support integration test (machichima)
f32576a test: exception or early return release slot (machichima)
04b0c19 Merge branch 'master' of github.com:ray-project/ray into 59792-serve-… (machichima)
1af2c2e fix: add missing callback param (machichima)
9959551 fix: prevent duplicate decrease queue length cache (machichima)
779123c fix: use call_soon_threadsafe (machichima)
f978363 fix: move resolve_request_arguments into try/except block (machichima)
5d2154a refactor: ReplicaSelection to replica_wrapper (machichima)
b7e376c test: remove redundant test (machichima)
65da32e Merge branch 'master' of github.com:ray-project/ray into 59792-serve-… (machichima)
d65ad18 fix: calling on_replica_result_finished only when on_send_request is … (machichima)
5684558 Merge branch 'master' into 59792-serve-decouple-routing-primitives (jeffreywang-anyscale)
c9509df Reserve replica capacity (jeffreywang-anyscale)