Skip to content

[LLM] Fix dependency graph paper cuts in serve llm#63092

Open
ps2181 wants to merge 1 commit into
ray-project:masterfrom
ps2181:ps2181/llm-fix-dependency-graph-paper-cuts
Open

[LLM] Fix dependency graph paper cuts in serve llm#63092
ps2181 wants to merge 1 commit into
ray-project:masterfrom
ps2181:ps2181/llm-fix-dependency-graph-paper-cuts

Conversation

@ps2181
Copy link
Copy Markdown
Contributor

@ps2181 ps2181 commented May 3, 2026

Why

Why are these changes needed?
Addresses a tech debt item tracked in #57890 — keeping the serve llm dependency graph clean.

CloudFileSystem was imported inside a method body in cloud_downloader.py with no justification for deferring the import. Moved to module top level per standard Python convention.

Related issue number
Contributes to #57890


Related: #57890

@ps2181 ps2181 requested a review from a team as a code owner May 3, 2026 14:27
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors import statements in two files. In cloud_downloader.py, the CloudFileSystem import is moved from a method to the top level of the file. Conversely, in llm_config.py, the KVConnectorBackendFactory import is moved from the top level into the _setup_kv_connector_backend method, likely to avoid circular dependencies or optimize loading. I have no feedback to provide as there were no review comments to evaluate.

@ps2181 ps2181 force-pushed the ps2181/llm-fix-dependency-graph-paper-cuts branch 5 times, most recently from 828fca4 to 49438d7 Compare May 3, 2026 15:03
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

Reviewed by Cursor Bugbot for commit 49438d7. Configure here.

Comment thread python/ray/llm/_internal/serve/core/configs/llm_config.py Outdated
@ps2181 ps2181 force-pushed the ps2181/llm-fix-dependency-graph-paper-cuts branch 6 times, most recently from ae1c054 to aad8e86 Compare May 3, 2026 16:23
@ray-gardener ray-gardener Bot added serve Ray Serve Related Issue community-contribution Contributed by the community labels May 3, 2026
@ps2181 ps2181 force-pushed the ps2181/llm-fix-dependency-graph-paper-cuts branch 2 times, most recently from 66b3952 to de24b25 Compare May 5, 2026 17:50
@jeffreywang-anyscale
Copy link
Copy Markdown
Contributor

Thanks @ps2181 for taking a stab! As part of this effort, could you please audit all other imports in python/ray/llm/_internal and make sure they are moved to top?

@ps2181 ps2181 force-pushed the ps2181/llm-fix-dependency-graph-paper-cuts branch 3 times, most recently from 8f7b939 to 8dd5ca9 Compare May 8, 2026 14:39
@ps2181
Copy link
Copy Markdown
Contributor Author

ps2181 commented May 8, 2026

@jeffreywang-anyscale Thanks for the suggestion!
Done — audited all imports across python/ray/llm/_internal and moved the unjustified deferred ones to module top level.

The remaining function-body imports are intentionally deferred: vllm/transformers/sglang/huggingface_hub are optional deps, VLLMEngine/VLLMEngineConfig are explicitly lazy to avoid a hard vllm module-level dependency at import time, and the benchmark CLI uses lazy loads for startup performance.

Do let me know if you have any suggestions or would like any changes in the PR !!

@ps2181 ps2181 force-pushed the ps2181/llm-fix-dependency-graph-paper-cuts branch 4 times, most recently from 01c2fe7 to 43ccccc Compare May 8, 2026 15:41
@ps2181 ps2181 force-pushed the ps2181/llm-fix-dependency-graph-paper-cuts branch from 43ccccc to 461654b Compare May 19, 2026 14:39
Addresses two violations from ray-project#57890:

1. Move CloudFileSystem import from inside on_before_download_model_files_distributed()
   to module top level in cloud_downloader.py.

2. Audit all imports in python/ray/llm/_internal and move unjustified
   function-body imports to module top level:
   - common/utils/lora_utils.py: filelock.FileLock
   - batch/observability/usage_telemetry/usage.py: TagKey, HEAD_NODE_RESOURCE_NAME,
     PlacementGroupSchedulingStrategy
   - serve/observability/usage_telemetry/usage.py: TagKey, HEAD_NODE_RESOURCE_NAME,
     PlacementGroupSchedulingStrategy, AcceleratorType, AutoscalingConfig

Remaining deferred imports are intentional:
- serve/core/ingress/builder.py: LLMRouter kept lazy — router.py instantiates
  FastAPI() at module level with @serve.ingress(), which breaks Sphinx autodoc
  when fastapi is mocked.
- Optional deps (vllm, transformers, sglang, huggingface_hub) stay lazy.
- VLLMEngine/VLLMEngineConfig stay lazy to avoid hard vllm module-level dep.
- Benchmark CLI imports stay lazy for startup performance.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: ps2181 <hellopritam31@gmail.com>
@ps2181 ps2181 force-pushed the ps2181/llm-fix-dependency-graph-paper-cuts branch from 8a9b50b to 54c1509 Compare May 19, 2026 14:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

community-contribution Contributed by the community llm serve Ray Serve Related Issue

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants