[None][fix] AutoDeploy: set enable_spec_decode on ADEngine for disagg#15260
ADEngine subclasses the abstract ModelEngine and does not run PyTorchModelEngine.__init__, so it never set `enable_spec_decode`. After NVIDIA#14546 added an unguarded `self.model_engine.enable_spec_decode` read in `_prepare_disagg_gen_transmission_complete` (the disagg generation handoff path that ADEngine traverses via NVIDIA#14057 AutoDeploy Basic Disagg Support), AutoDeploy disaggregated runs crash with:

    AttributeError: 'ADEngine' object has no attribute 'enable_spec_decode'

NVIDIA#14546 and NVIDIA#14057 each passed CI independently but conflict semantically once both are on main.

Set `is_spec_decode`/`enable_spec_decode` in ADEngine.__init__, mirroring PyTorchModelEngine (enable_spec_decode == spec_config is not None), so ADEngine satisfies the ModelEngine attribute contract that shared PyExecutor code relies on.

Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
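The failure mode described above can be reproduced with a minimal sketch. The class names `ModelEngine`, `PyTorchModelEngine`, and `ADEngine` come from this PR, but the bodies below are simplified stand-ins and not the TensorRT-LLM implementation. `BrokenADEngine`, `FixedADEngine`, `shared_handoff_check`, and the `spec_config` constructor parameter on the AD engine are hypothetical names used only for illustration, and deriving `is_spec_decode` from `spec_config` is an assumption.

```python
from abc import ABC, abstractmethod


class ModelEngine(ABC):
    """Stand-in for the abstract base class; it sets no attributes itself."""

    @abstractmethod
    def forward(self):
        ...


class PyTorchModelEngine(ModelEngine):
    """Stand-in: its __init__ is where the spec-decode flags get set."""

    def __init__(self, spec_config=None):
        self.spec_config = spec_config
        # Assumption: both flags are derived from spec_config.
        self.is_spec_decode = spec_config is not None
        self.enable_spec_decode = self.is_spec_decode

    def forward(self):
        return "pytorch"


class BrokenADEngine(ModelEngine):
    """Subclasses the abstract base directly, so the flags are never set."""

    def __init__(self):
        pass

    def forward(self):
        return "autodeploy"


class FixedADEngine(ModelEngine):
    """Same engine with the flags initialized, mirroring the class above."""

    def __init__(self, spec_config=None):  # spec_config is hypothetical here
        self.spec_config = spec_config
        self.is_spec_decode = spec_config is not None
        self.enable_spec_decode = self.is_spec_decode

    def forward(self):
        return "autodeploy"


def shared_handoff_check(model_engine):
    """Hypothetical shared-code path doing an unguarded attribute read."""
    return model_engine.enable_spec_decode


if __name__ == "__main__":
    try:
        shared_handoff_check(BrokenADEngine())
    except AttributeError as exc:
        print("broken engine:", exc)
    print("fixed engine:", shared_handoff_check(FixedADEngine()))
```

Setting the flags in the engine's own initializer keeps the shared code free of `getattr` guards, at the cost of each `ModelEngine` subclass having to honor the attribute contract itself.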
|
Walkthrough: ADEngine.__init__ now initializes spec-decode compatibility flags.
|
/bot run --disable-fail-fast |
|
PR_Github #53561 [ run ] triggered by Bot. Commit: |
|
PR_Github #53561 [ run ] completed with state
|
|
/bot run --disable-fail-fast |
|
PR_Github #53583 [ run ] triggered by Bot. Commit: |
|
PR_Github #53583 [ run ] completed with state
|
|
Thank you for flagging this and submitting the PR! |
|
/bot run --disable-fail-fast |
|
PR_Github #53706 [ run ] triggered by Bot. Commit: |
|
PR_Github #53706 [ run ] completed with state
|
|
/bot run --disable-fail-fast |
|
PR_Github #53719 [ run ] triggered by Bot. Commit: |
|
PR_Github #53719 [ run ] completed with state
|
|
/bot run --disable-fail-fast |
|
PR_Github #53918 [ run ] triggered by Bot. Commit: |
|
PR_Github #53918 [ run ] completed with state |
This pull request makes a small but important change to the `ad_executor.py` file to improve compatibility between `ADEngine` and shared `PyExecutor` code. Specifically, it ensures that the spec-decode flags expected by the shared code are set correctly when `ADEngine` is used.

Sets the `is_spec_decode` and `enable_spec_decode` flags in the `ADEngine` initializer to match what shared `PyExecutor` code expects, ensuring proper handling of spec-decode features.

Description
Test Coverage
PR Checklist
Please review the following before submitting your PR:
PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
If PR introduces API changes, an appropriate PR label is added - either `api-compatible` or `api-breaking`. For `api-breaking`, include `BREAKING` in the PR title.
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.
GitHub Bot Help
To see a list of available CI bot commands, please comment `/bot help`.