feat: support Qwen down_proj fallback for compressed-tensors ignored modules #1254
Conversation
Code Review
This pull request introduces an ignored_modules configuration to QuantArgs, allowing specific modules to bypass quantization using exact names or regex patterns. The DenseMLP and Qwen decoder layers were updated to support this filtering mechanism. Review feedback highlights a performance concern regarding repeated regex compilation within a loop, the use of a magic number for the regex prefix, and a logic bug where gate_up_proj and parent modules are not correctly checked against the ignore list.
```cpp
if (pattern.size() > 3 && pattern.rfind("re:", 0) == 0) {
  try {
    if (std::regex_match(module_name, std::regex(pattern.substr(3)))) {
      return true;
    }
  } catch (const std::regex_error&) {
  }
}
```
Creating a `std::regex` object inside a loop for every module check is very inefficient, since regex compilation is an expensive operation. This can noticeably slow down model initialization, especially for models with many layers and multiple ignore patterns. Additionally, the magic number `3` should be replaced with a named constant (e.g., `kRegexPrefixLength`). Consider pre-compiling the regexes once or using a more efficient matching strategy.
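One way to address both points is to compile each `re:` pattern once up front and reuse the compiled objects for every module check. The sketch below is illustrative, not the PR's actual implementation; the class name `IgnoreMatcher` and the constant `kRegexPrefixLength` are hypothetical names introduced here:

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Length of the "re:" prefix that marks a regex pattern in the ignore
// list; replaces the magic number 3 from the original snippet.
constexpr std::size_t kRegexPrefixLength = 3;

// Splits patterns into exact names and pre-compiled regexes at
// construction time, so per-module checks never re-compile a regex.
class IgnoreMatcher {
 public:
  explicit IgnoreMatcher(const std::vector<std::string>& patterns) {
    for (const auto& pattern : patterns) {
      if (pattern.size() > kRegexPrefixLength &&
          pattern.rfind("re:", 0) == 0) {
        try {
          regexes_.emplace_back(pattern.substr(kRegexPrefixLength));
        } catch (const std::regex_error&) {
          // Skip malformed patterns rather than failing initialization.
        }
      } else {
        exact_names_.push_back(pattern);
      }
    }
  }

  // Returns true if module_name is ignored, by exact match or by a
  // full regex match against any pre-compiled pattern.
  bool Matches(const std::string& module_name) const {
    for (const auto& name : exact_names_) {
      if (name == module_name) return true;
    }
    for (const auto& re : regexes_) {
      if (std::regex_match(module_name, re)) return true;
    }
    return false;
  }

 private:
  std::vector<std::string> exact_names_;
  std::vector<std::regex> regexes_;
};
```

With this shape, the matcher is built once when `QuantArgs` is parsed and each per-layer check is a cheap lookup over already-compiled regexes.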
Force-pushed from ccfd0a7 to fb643ec.
```json
      }
    }
  },
  "ignore": [
```
Is this rule applicable to all models?
This field is part of the quantization config schema, not a model-specific rule. The JSON (including `ignore`) is generated by the quantization tool; AngelSlim, at least, produces this field. So its applicability depends on which quant tool was used, not on the model itself.
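For context, a quantization config carrying an `ignore` list might look like the following sketch. The field layout follows the compressed-tensors convention of mixing exact module names with `re:`-prefixed regex patterns, but the exact schema and module names here are illustrative and depend on the quantization tool that emitted the file:

```json
{
  "quantization_config": {
    "config_groups": {
      "group_0": {
        "weights": { "num_bits": 8, "type": "int", "symmetric": true }
      }
    },
    "ignore": [
      "lm_head",
      "re:.*down_proj"
    ]
  }
}
```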