Refine _extract_layer_prefixes to better handle mtp modules#1124
Refine _extract_layer_prefixes to better handle mtp modules#1124Edwardf0t1 merged 2 commits intomainfrom
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
✅ Files skipped from review due to trivial changes (1)
📝 WalkthroughWalkthroughAdds top-level Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes 🚥 Pre-merge checks | ✅ 3 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@examples/llm_ptq/example_utils.py`:
- Around line 369-370: The code currently adds parts[0] into mtp_layer_prefixes
for any dotted name, which can collect non-MTP roots like "model"; update the
logic where parts is computed so you only add the prefix when it is an actual
MTP root (e.g., check parts[0] == "mtp" or otherwise validate that the token
represents an mtp root) before calling mtp_layer_prefixes.add(parts[0]),
ensuring only real "mtp" top-level prefixes are captured.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 17929680-e2d3-448f-af63-da3b24ba1883
📒 Files selected for processing (2)
examples/llm_ptq/example_utils.pyexamples/llm_ptq/hf_ptq.py
| if parts: | ||
| mtp_layer_prefixes.add(parts[0]) |
There was a problem hiding this comment.
Restrict top-level prefix capture to actual mtp roots.
At Line 369–Line 370, adding parts[0] unconditionally can capture non-MTP roots (e.g., "model"), which then propagates into very broad exclusion patterns and can disable quantization for unintended modules.
💡 Suggested fix
- if parts:
- mtp_layer_prefixes.add(parts[0])
+ mtp_root_idx = next((i for i, p in enumerate(parts) if p == "mtp"), None)
+ if mtp_root_idx is not None:
+ mtp_layer_prefixes.add(".".join(parts[: mtp_root_idx + 1]))📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if parts: | |
| mtp_layer_prefixes.add(parts[0]) | |
| mtp_root_idx = next((i for i, p in enumerate(parts) if p == "mtp"), None) | |
| if mtp_root_idx is not None: | |
| mtp_layer_prefixes.add(".".join(parts[: mtp_root_idx + 1])) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@examples/llm_ptq/example_utils.py` around lines 369 - 370, The code currently
adds parts[0] into mtp_layer_prefixes for any dotted name, which can collect
non-MTP roots like "model"; update the logic where parts is computed so you only
add the prefix when it is an actual MTP root (e.g., check parts[0] == "mtp" or
otherwise validate that the token represents an mtp root) before calling
mtp_layer_prefixes.add(parts[0]), ensuring only real "mtp" top-level prefixes
are captured.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #1124 +/- ##
==========================================
+ Coverage 74.77% 76.05% +1.28%
==========================================
Files 351 351
Lines 40072 40072
==========================================
+ Hits 29964 30478 +514
+ Misses 10108 9594 -514
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Signed-off-by: Zhiyu Cheng <zhiyuc@nvidia.com>
fac8927 to
3d9d337
Compare
What does this PR do?
Type of change: ? bugfix
Added extraction of the top-level MTP module prefix (e.g., mtp from mtp.fc.weight). Previously it only captured mtp.layers.0 because it looked exclusively for *.layers. patterns.
Now it returns {"mtp", "mtp.layers.0"} instead of just {"mtp.layers.0"}.
Updated pattern construction to handle single-component prefixes:
mtp.layers.0 (multi-component) → layers.0 (same as before)
mtp (single-component) → mtp (new — covers mtp.fc, mtp.norm, etc.)
Usage
# Add a code snippet demonstrating how to use thisTesting
Before your PR is "Ready for review"
Make sure you read and follow Contributor guidelines and your commits are signed (
git commit -s -S).Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded
trust_remote_code=True,torch.load(..., weights_only=False),pickle, etc.).CONTRIBUTING.md: ✅ / ❌ / N/AAdditional Information
Summary by CodeRabbit