fix(vllm-router): allow using prefill-decode for subset of models (by checking labels) and add a fallback routing strategy #3

Merged
nejch merged 1 commit into deploy from fix/allow-using-prefill-decode-for-subset-of-models-by-checking-labels-in-vllm-router
Jun 3, 2026

@Killusions
Member

This lets us introduce prefill-decode for only a subset of models (selected by checking labels) while still benefiting from session routing for the rest.
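To illustrate the idea, here is a minimal sketch of label-based selection with a session-routing fallback. It is not the code from this PR: the `Endpoint` type, the `label` values `"prefill"` and `"decode"`, and the `pick_endpoints` helper are all hypothetical names chosen for the example, and the hash-based session affinity is only one plausible fallback strategy.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    url: str
    model: str
    label: str = ""  # hypothetical values: "prefill", "decode", or "" for a regular pod


def pick_endpoints(endpoints, model, session_id):
    """Return ("prefill-decode", prefill_ep, decode_ep) or ("session", ep, None)."""
    candidates = [e for e in endpoints if e.model == model]
    if not candidates:
        raise LookupError(f"no endpoint serves {model!r}")

    prefill = [e for e in candidates if e.label == "prefill"]
    decode = [e for e in candidates if e.label == "decode"]
    if prefill and decode:
        # The model has both labeled roles, so route it as prefill-decode.
        # Taking the first of each is a simplification for this sketch.
        return ("prefill-decode", prefill[0], decode[0])

    # Fallback: sticky session routing. Hash the session id onto a stable
    # ordering of the model's endpoints so a session keeps hitting the same one.
    ordered = sorted(candidates, key=lambda e: e.url)
    digest = hashlib.sha256(session_id.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(ordered)
    return ("session", ordered[index], None)
```

With this shape, a model that has no labeled endpoints never enters the prefill-decode path, so it can be rolled out one model at a time.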

… checking labels) and add a fallback routing strategy
Member

@nejch nejch left a comment

@Killusions works as an interim solution until we go full Envoy I'd say, but definitely can't see this going upstream 😁

@nejch nejch merged commit 7d20695 into deploy Jun 3, 2026
4 of 8 checks passed
2 participants