Checklist
Detailed Information
Describe the bug
When using LoRA with versioned model names on the vLLM backend, runtime weight updates can make earlier versioned model routes unavailable while rollout requests are still using them.
This causes /v1/chat/completions requests to fail with 404 and breaks rollout workflow execution.
This issue is specific to the current vLLM integration path. The SGLang backend does not exhibit the same failure mode here because versioned LoRA adapters can coexist there, so loading a newer version does not immediately invalidate older versioned adapter names.
Expected behavior
Older in-flight rollout requests targeting versioned LoRA model names should not fail with a "model does not exist" error immediately after a newer LoRA version is loaded.
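One way to picture the expected behavior is a registry that keeps an older versioned adapter name routable until its in-flight requests have drained. The following is only a minimal sketch under stated assumptions: `VersionedAdapterRegistry` and all of its methods are hypothetical names for illustration, and none of this is vLLM's API or the project's actual integration code.

```python
import threading
from collections import defaultdict


class VersionedAdapterRegistry:
    """Hypothetical registry (not a vLLM API): an older adapter name stays
    routable while requests that target it are still in flight."""

    def __init__(self):
        self._lock = threading.Lock()
        self._loaded = set()                 # adapter names that can be routed
        self._retired = set()                # superseded names awaiting drain
        self._in_flight = defaultdict(int)   # name -> active request count

    def load(self, name):
        """Register a new version; older versions are retired, not dropped."""
        with self._lock:
            self._retired |= self._loaded
            self._loaded.add(name)
            self._retired.discard(name)
            self._evict_idle()

    def acquire(self, name):
        """Mark the start of a request that targets `name`."""
        with self._lock:
            if name not in self._loaded:
                raise KeyError(f"The model `{name}` does not exist.")
            self._in_flight[name] += 1

    def release(self, name):
        """Mark the end of a request; retired adapters are evicted once idle."""
        with self._lock:
            if self._in_flight[name] > 0:
                self._in_flight[name] -= 1
            self._evict_idle()

    def is_routable(self, name):
        with self._lock:
            return name in self._loaded

    def _evict_idle(self):
        # Caller must hold the lock. Drop retired adapters with no active requests.
        for name in list(self._retired):
            if self._in_flight[name] == 0:
                self._loaded.discard(name)
                self._retired.discard(name)
                self._in_flight.pop(name, None)


# Usage: a request on v10 is still running when v11 is loaded.
registry = VersionedAdapterRegistry()
registry.load("gui-lora-r64-v10")
registry.acquire("gui-lora-r64-v10")   # rollout request starts on v10
registry.load("gui-lora-r64-v11")      # weight update arrives mid-request
still_routable = registry.is_routable("gui-lora-r64-v10")
registry.release("gui-lora-r64-v10")   # request finishes, v10 can be evicted
```

In this sketch the older name is removed only after its last request is released, which is the behavior the report asks for. The actual fix may look quite different, for example keeping a fixed number of recent versions loaded.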
Full logs
HTTP request to .../v1/chat/completions failed with ClientResponseError: 404, message='Not Found' (attempt 3/3)
Error with model error=ErrorInfo(message='The model `gui-lora-r64-v11` does not exist.', type='NotFoundError', param='model', code=404)
Workflow execution failed
Another occurrence from the same failure class:
Error with model error=ErrorInfo(message='The model `gui-lora-r64` does not exist.', type='NotFoundError', param='model', code=404)