[LLM] Fix dependency graph paper cuts in serve llm#63092
Conversation
There was a problem hiding this comment.
Code Review
This pull request refactors import statements in two files. In cloud_downloader.py, the CloudFileSystem import is moved from a method to the top level of the file. Conversely, in llm_config.py, the KVConnectorBackendFactory import is moved from the top level into the _setup_kv_connector_backend method, likely to avoid circular dependencies or optimize loading. I have no feedback to provide as there were no review comments to evaluate.
828fca4 to
49438d7
Compare
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 49438d7. Configure here.
ae1c054 to
aad8e86
Compare
66b3952 to
de24b25
Compare
|
Thanks @ps2181 for taking a stab! As part of this effort, could you please audit all other imports in |
8f7b939 to
8dd5ca9
Compare
|
@jeffreywang-anyscale Thanks for the suggestion! The remaining function-body imports are intentionally deferred: vllm/transformers/sglang/huggingface_hub are optional deps, VLLMEngine/VLLMEngineConfig are explicitly lazy to avoid a hard vllm module-level dependency at import time, and the benchmark CLI uses lazy loads for startup performance. Do let me know if you have any suggestions or would like any changes in the PR !! |
01c2fe7 to
43ccccc
Compare
43ccccc to
461654b
Compare
Addresses two violations from ray-project#57890: 1. Move CloudFileSystem import from inside on_before_download_model_files_distributed() to module top level in cloud_downloader.py. 2. Audit all imports in python/ray/llm/_internal and move unjustified function-body imports to module top level: - common/utils/lora_utils.py: filelock.FileLock - batch/observability/usage_telemetry/usage.py: TagKey, HEAD_NODE_RESOURCE_NAME, PlacementGroupSchedulingStrategy - serve/observability/usage_telemetry/usage.py: TagKey, HEAD_NODE_RESOURCE_NAME, PlacementGroupSchedulingStrategy, AcceleratorType, AutoscalingConfig Remaining deferred imports are intentional: - serve/core/ingress/builder.py: LLMRouter kept lazy — router.py instantiates FastAPI() at module level with @serve.ingress(), which breaks Sphinx autodoc when fastapi is mocked. - Optional deps (vllm, transformers, sglang, huggingface_hub) stay lazy. - VLLMEngine/VLLMEngineConfig stay lazy to avoid hard vllm module-level dep. - Benchmark CLI imports stay lazy for startup performance. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: ps2181 <hellopritam31@gmail.com>
8a9b50b to
54c1509
Compare

Why
Why are these changes needed?
Addresses a tech debt item tracked in #57890 — keeping the serve llm dependency graph clean.
CloudFileSystem was imported inside a method body in cloud_downloader.py with no justification for deferring the import. Moved to module top level per standard Python convention.
Related issue number
Contributes to #57890
Related: #57890