lazy import for vllm benchmark #129

Merged
RuBing-Yang merged 3 commits into Tencent:main from RuBing-Yang:spec_decode
Nov 5, 2025

RuBing-Yang commented Nov 5, 2025

This pull request refactors how the speculative decoding benchmark code imports its external dependencies, switching to lazy imports for better modularity and faster startup. Direct imports of fastchat, shortuuid, and vllm are replaced with the lazy-loaded modules exposed by the new angelslim.utils.lazy_imports module, and the calling code is updated to reference them through those modules. As a result, a dependency is only imported when it is actually used, which also makes the dependencies easier to manage.

Dependency import refactoring:

  • Replaced direct imports of fastchat, shortuuid, and vllm in generate_baseline_answer.py and generate_eagle_answer.py with lazy imports from angelslim.utils.lazy_imports, and updated all usages to reference these lazy-loaded modules.
  • Updated all usages of LLM and SamplingParams from vllm to use vllm.LLM and vllm.SamplingParams via the lazy import, including type hints and instantiations.
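For readers unfamiliar with the pattern, here is a minimal sketch of what a lazy module proxy can look like. It is not the actual content of angelslim.utils.lazy_imports, which is not shown in this description; the `LazyModule` class and its attributes are hypothetical, and the standard-library `json` module stands in for vllm so the example runs without extra dependencies.

```python
import importlib


class LazyModule:
    """Hypothetical proxy that imports the wrapped module on first use."""

    def __init__(self, name):
        self._name = name      # dotted module name, e.g. "vllm"
        self._module = None    # filled in on first attribute access

    def _load(self):
        # Import only once; later calls reuse the cached module object.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return self._module

    def __getattr__(self, attr):
        # Called only for names not found on the proxy itself,
        # so _name and _module never reach this method.
        return getattr(self._load(), attr)


# Stand-in for something like `vllm = LazyModule("vllm")`.
json_lazy = LazyModule("json")

# Nothing has been imported through the proxy yet.
print(json_lazy._module is None)

# The first attribute access triggers the real import.
print(json_lazy.dumps({"id": 1}))
```

With a proxy like this, call sites change from `LLM(...)` to `vllm.LLM(...)`, which matches the shape of the edits described above: the attribute lookup on the proxy is the point at which the real import happens.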

Benchmark engine updates:

  • Changed direct import and usage of load_questions from fastchat.llm_judge.common to reference fastchat.llm_judge.common.load_questions via the lazy import in benchmark_engine.py, generate_baseline_answer.py, and generate_eagle_answer.py.
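A dotted access such as fastchat.llm_judge.common.load_questions needs the lazy object to resolve submodules as well as plain attributes, because importing a package does not necessarily import its subpackages. The sketch below shows one way this could work; the `LazyPackage` class is hypothetical and is not taken from AngelSlim, and the standard-library `xml.etree.ElementTree` module stands in for fastchat.llm_judge.common.

```python
import importlib
import importlib.util


class LazyPackage:
    """Hypothetical proxy that resolves dotted submodule access lazily."""

    def __init__(self, name):
        self._name = name

    def __getattr__(self, attr):
        module = importlib.import_module(self._name)
        child = f"{self._name}.{attr}"
        # If this is a package and `attr` names a submodule, hand back
        # another proxy so the next attribute access can import it.
        if hasattr(module, "__path__") and importlib.util.find_spec(child) is not None:
            return LazyPackage(child)
        # Otherwise it is an ordinary attribute (function, class, constant).
        return getattr(module, attr)


# Stand-in for `fastchat.llm_judge.common.load_questions(...)`.
xml_lazy = LazyPackage("xml")
element = xml_lazy.etree.ElementTree.fromstring("<question>hi</question>")
print(element.tag, element.text)
```

Each step of the chain imports only the module it needs, so the cost is paid at the call site rather than when the benchmark script is first loaded.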

Requirements update:

  • Added vllm>=0.11.0 to requirements/requirements_speculative.txt so that the vllm version the benchmark code expects is installed when the module is lazily loaded.

Miscellaneous:

  • Added from __future__ import annotations at the top of generate_baseline_answer.py and generate_eagle_answer.py to support postponed evaluation of type annotations, which helps with forward references and lazy imports.
  • Removed the now-redundant direct imports of shortuuid and vllm from the affected files, cleaning up the codebase.
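The interaction between postponed annotations and lazy imports can be illustrated with a small sketch. With `from __future__ import annotations`, annotations are stored as strings and are not evaluated when the function is defined, so a hint like `vllm.LLM` does not force the module to load. The `describe_engine` function is hypothetical, and `vllm` is deliberately left undefined here to show that the definition still succeeds.

```python
from __future__ import annotations


# `vllm` is intentionally not imported or defined in this sketch.
# Without the __future__ import above, defining this function would
# raise NameError because the annotations would be evaluated eagerly.
def describe_engine(llm: vllm.LLM, params: vllm.SamplingParams) -> str:
    return f"{type(llm).__name__} with {type(params).__name__}"


# Annotations are kept as plain strings until something asks to resolve them.
print(describe_engine.__annotations__["llm"])
```

In the PR itself the name vllm refers to the lazy module from angelslim.utils.lazy_imports rather than being undefined; the postponed evaluation simply ensures that type hints alone never trigger the import.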

Let me know if you'd like to discuss how lazy imports work or why this change improves the codebase!

RuBing-Yang merged commit 9bfa6e5 into Tencent:main on Nov 5, 2025
5 checks passed
dawnranger pushed a commit to dawnranger/AngelSlim that referenced this pull request Mar 11, 2026