bugfix: update Glm5-W8A8 & Draft Model on npu device. #1193

Merged
liutongxuan merged 2 commits into jd-opensource:main from sanlio36:dev_glm5_w8a8 on Apr 27, 2026

Conversation

sanlio36 (Collaborator) commented on Apr 6, 2026

No description provided.

gemini-code-assist (Bot, Contributor) left a comment

Code Review

This pull request adds support for the glm_moe_dsa model type to the MTP export utility and introduces an optional rot linear layer to the MTP model base. It also updates the DeepSeek V3.2 decoder loader to handle additional quantization parameters for the indexer. Feedback highlights a logic error in the Python export script regarding the detection and mapping of rot.weight and identifies several instances where constant arguments in C++ were not annotated according to the style guide.

Comment thread tools/export_mtp.py
Comment thread xllm/models/llm/npu/mtp_model_base.h Outdated
Comment thread xllm/core/layers/npu/loader/deepseek_v32_decoder_loader.cpp Outdated
sanlio36 changed the title from "Update Glm5-w8a8 & MtpLayer on npu device." to "Update Glm5-w8a8 MtpLayer on npu device." on Apr 8, 2026
sanlio36 marked this pull request as ready for review on April 8, 2026, 08:04
sanlio36 changed the title from "Update Glm5-w8a8 MtpLayer on npu device." to "Update Glm5-W8A8 & Draft Model on npu device." on Apr 10, 2026
yq33victor (Collaborator) left a comment

LGTM

yq33victor changed the title from "Update Glm5-W8A8 & Draft Model on npu device." to "bugfix: update Glm5-W8A8 & Draft Model on npu device." on Apr 10, 2026
liutongxuan merged commit d30f9c7 into jd-opensource:main on Apr 27, 2026
10 of 29 checks passed

4 participants