[OP][Loader]decrease code && make code clean by zhoutianzi666 · Pull Request #7877 · PaddlePaddle/FastDeploy

zhoutianzi666 · 2026-05-21T06:33:52Z

🤖 Paddle-CI-Agent | pr_review | 2026-05-21 15:15:06

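📋 Review Summary

PR overview: extracts the weight dtype handling logic duplicated in several places in linear.py into a shared function, may_be_do_cast(), and replaces the silent .cast() conversion in the original else branch with an assert failure
Scope of change: fastdeploy/model_executor/layers/linear.py
Impact Tag: [OP] [Loader]

Issues

Level	File	Summary
🔴 Bug	`fastdeploy/model_executor/layers/linear.py:50`	`assert` is used for runtime dtype validation; it is skipped under Python `-O`, so the error goes unnoticed
🔴 Compatibility	`fastdeploy/model_executor/layers/linear.py:44`	Semantic change: the original `else` branch called `.cast()` and allowed implicit conversion; the new code fails an assert instead, which is a Breaking Change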

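📝 PR Convention Check

The PR title lacks an official Tag (currently `add assert`), and every required section of the PR description (Motivation / Modifications / Usage or Command / Accuracy Tests) is empty.

Suggested title (can be copied as-is):

[OP] Replace implicit dtype cast with assert in linear weight_loader

Suggested PR description (can be copied as-is; must reproduce the full structure of the checklist §D2 template):

## Motivation
Extract the weight dtype compatibility logic that is duplicated in 5 places in `linear.py` into a shared function, `may_be_do_cast()`, to remove the duplication. At the same time, change the `else` branch from a silent `.cast()` to an assert failure so that dtype mismatches surface as early as possible.

## Modifications
- Add the shared function `may_be_do_cast(loaded_weight, param_dtype)`, which wraps the int8→float8_e4m3fn `.view()` conversion and the assert check for every other dtype mismatch
- Replace the duplicated dtype handling logic in `ColumnParallelLinear`, `RowParallelLinear`, `QKVParallelLinear`, `MergedColumnParallelLinear`, `gate_weight_loader`, and related call sites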

## Usage or Command
N/A

## Accuracy Tests
N/A

## Checklist

- [ ] Add at least a tag in the PR title.
  - Tag list: [`[FDConfig]`,`[APIServer]`,`[Engine]`, `[Scheduler]`, `[PD Disaggregation]`, `[Executor]`, `[Graph Optimization]`, `[Speculative Decoding]`, `[RL]`, `[Models]`, `[Quantization]`, `[Loader]`, `[OP]`, `[KVCache]`, `[DataProcessor]`, `[BugFix]`, `[Docs]`, `[CI]`, `[Optimization]`, `[Feature]`, `[Benchmark]`, `[Others]`, `[XPU]`, `[HPU]`, `[GCU]`, `[DCU]`, `[Iluvatar]`, `[Metax]`]
  - You can add new tags based on the PR content, but the semantics must be clear.
- [ ] Format your code, run `pre-commit` before commit.
- [ ] Add unit tests. Please write the reason in this PR if no unit tests.
- [ ] Provide accuracy results.
- [ ] If the current PR is submitting to the `release` branch, make sure the PR has been submitted to the `develop` branch, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag.

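Overall Assessment

The PR successfully extracts 5 copies of the dtype handling logic into a shared function, which helps maintainability. Two issues still need to be fixed: ① assert should not be used for runtime validation (use raise ValueError instead); ② removing the silent .cast() from the original else branch is a Breaking Change, so the PR description should state the intent and confirm that existing model loading flows are unaffected.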
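The first issue can be reproduced without FastDeploy. The following is a minimal sketch, not code from the PR: it runs the same one-line script in two fresh interpreters, once normally and once with `-O`. The script text and the `run` helper are illustrative assumptions, and the demonstration assumes `PYTHONOPTIMIZE` is not set in the environment.

```python
import subprocess
import sys

# One-line script whose only check is an assert that always fails.
SCRIPT = "assert 1 == 2, 'dtype mismatch'; print('check skipped')"


def run(flags):
    """Run SCRIPT in a fresh interpreter with the given command-line flags."""
    result = subprocess.run(
        [sys.executable, *flags, "-c", SCRIPT],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout.strip()


# Without -O the assert fires and the interpreter exits with a non-zero code.
normal_code, normal_out = run([])

# With -O the assert statement is compiled away, so execution continues.
optimized_code, optimized_out = run(["-O"])

print("normal:", normal_code, repr(normal_out))
print("optimized:", optimized_code, repr(optimized_out))
```

Under `-O` the failing check disappears entirely, which is why the review treats an assert-based dtype check as unreliable for production loading paths.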

paddle-bot · 2026-05-21T06:34:04Z

Thanks for your contribution!

PaddlePaddle-bot · 2026-05-21T06:55:48Z

🤖 Paddle-CI-Agent | ci_status_monitor | 2026-05-21 20:19:28

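The CI report is generated from the following code (updated every 30 minutes):

PR commit: 7ae71f3
Merge base: 4402396 (branch: develop)
View full diff
CI details

1 Task Overview

No required tasks are currently failed, running, or pending. 9/10 Required tasks passed; the main test task, Run FastDeploy Unit Tests and Coverage / run_tests_with_coverage, was skipped, so please confirm that the skip is expected. Among Optional tasks, 2 failed and 1 is pending; these do not block merging.

Total runs (reruns)	Total tasks	✅ Passed	❌ Failed	⏳ Running	⏸️ Pending	Skipped
61(18)	43	38	2	0	1	2

2 Task Status Summary

Log column: failed tasks link directly to the log; running/pending tasks link to the Job or Workflow (when available).

2.1 Required tasks: 9/10 passed

Required tasks block merging; failures should be handled first.

Status	Task	Duration	Root cause	Suggested fix	Log	Rerun
⏭️	`Run FastDeploy Unit Tests and Coverage / run_tests_with_coverage`	-	Skipped	If unexpected, trigger the task or check the skip condition	-	-
✅	Remaining 9 required tasks passed	-	-	-	-	-

2.2 Optional tasks: 29/33 passed

Optional tasks do not block merging; failures are for reference only and, per the rules, receive no in-depth failure analysis.

Status	Task	Duration	Log	Rerun
❌	`Run iluvatar Tests / run_iluvatar_cases`	2m28s	Job	-
❌	`Trigger Jenkins for PR`	45s	Job	-
⏸️	`CI_HPU`	-	Workflow	-
⏭️	`cherry-pick`	-	Workflow	-
✅	Remaining 29 optional tasks passed	-	-	-

3 Failure Details (required only)

No required tasks failed, so no in-depth failure analysis was run this round.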

codecov-commenter · 2026-05-21T07:53:07Z

Codecov Report

❌ Patch coverage is 50.00000% with 6 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@4402396). Learn more about missing BASE report.

Files with missing lines	Patch %	Lines
fastdeploy/model_executor/layers/linear.py	50.00%	5 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             develop    #7877   +/-   ##
==========================================
  Coverage           ?   63.59%           
==========================================
  Files              ?      462           
  Lines              ?    64483           
  Branches           ?     9881           
==========================================
  Hits               ?    41007           
  Misses             ?    20700           
  Partials           ?     2776

Flag	Coverage Δ
GPU	`72.72% <50.00%> (?)`
XPU	`7.11% <0.00%> (?)`


PaddlePaddle-bot

🤖 Paddle-CI-Agent | pr_review | 2026-05-21 16:53:04

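📋 Review Summary

PR overview: extracts the weight dtype handling logic duplicated in 5 places in linear.py into a shared function, may_be_do_cast(), and replaces the silent .cast() conversion in the original else branch with an assert failure
Scope of change: fastdeploy/model_executor/layers/linear.py, tests/model_executor/test_linear.py
Impact Tag: [OP] [Loader]

Issues

Level	File	Summary
🔴 Bug	`fastdeploy/model_executor/layers/linear.py:54`	`assert` is used for runtime dtype validation; it is skipped under Python `-O`, so the error goes unnoticed
🔴 Compatibility	`fastdeploy/model_executor/layers/linear.py:53`	Breaking Change: the `.cast()` in the original `else` branch allowed implicit dtype conversion; the new code fails an assert instead, breaking existing model loading flows

📝 PR Convention Check

The PR title `decrease code && make code clean` lacks an official Tag, and the PR description is missing every required section (Motivation / Modifications / Usage or Command / Accuracy Tests / Checklist).

Suggested title (can be copied as-is):

[OP] Extract may_be_do_cast() to deduplicate weight dtype handling in linear layers

Suggested PR description (can be copied as-is; must reproduce the full structure of the checklist §D2 template):

## Motivation
In `linear.py`, identical weight dtype compatibility logic appears in 5 places, including `ColumnParallelLinear`, `RowParallelLinear`, `QKVParallelLinear`, `MergedColumnParallelLinear`, and `gate_weight_loader`, and this duplication makes the code harder to maintain. Extract it into a shared function, `may_be_do_cast()`, to remove the duplication. At the same time, change the `else` branch from a silent `.cast()` to an error so that dtype mismatches surface as early as possible.

## Modifications
- Add the shared function `may_be_do_cast(loaded_weight, param)`, which wraps the shape check, the int8→float8_e4m3fn `.view()` conversion, and the error raised for every other dtype mismatch
- Replace the duplicated dtype handling code in `ColumnParallelLinear.weight_loader`, `RowParallelLinear.weight_loader`, `QKVParallelLinear.qkv_weight_loader`, `MergedColumnParallelLinear.weight_loader`, and `gate_weight_loader`
- Update `tests/model_executor/test_linear.py` so that the `float16`/`int8` weights used in the tests become `float32`, matching the new strict dtype check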

## Usage or Command
N/A

## Accuracy Tests
N/A

## Checklist

- [ ] Add at least a tag in the PR title.
  - Tag list: [`[FDConfig]`,`[APIServer]`,`[Engine]`, `[Scheduler]`, `[PD Disaggregation]`, `[Executor]`, `[Graph Optimization]`, `[Speculative Decoding]`, `[RL]`, `[Models]`, `[Quantization]`, `[Loader]`, `[OP]`, `[KVCache]`, `[DataProcessor]`, `[BugFix]`, `[Docs]`, `[CI]`, `[Optimization]`, `[Feature]`, `[Benchmark]`, `[Others]`, `[XPU]`, `[HPU]`, `[GCU]`, `[DCU]`, `[Iluvatar]`, `[Metax]`]
  - You can add new tags based on the PR content, but the semantics must be clear.
- [ ] Format your code, run `pre-commit` before commit.
- [ ] Add unit tests. Please write the reason in this PR if no unit tests.
- [ ] Provide accuracy results.
- [ ] If the current PR is submitting to the `release` branch, make sure the PR has been submitted to the `develop` branch, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag.

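Overall Assessment

The PR successfully extracts 5 copies of the dtype handling logic into a shared function, which helps maintainability. Two blocking issues still need to be fixed: ① changing the else branch from .cast() to assert is a Breaking Change that makes existing implicit dtype conversion scenarios fail at runtime; ② an assert used for runtime validation is silently skipped under Python's -O optimized mode and should be replaced with raise ValueError.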

PaddlePaddle-bot · 2026-05-21T08:55:54Z

+        if loaded_weight.dtype == paddle.int8 and param.dtype == paddle.float8_e4m3fn:
+            loaded_weight = loaded_weight.view(param.dtype)
+        else:
+            assert (


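🔴 Bug: assert is used for runtime dtype validation; it is skipped under Python's -O optimized mode, so the error goes unnoticed

An assert statement is ignored entirely when Python runs with the -O flag, so it is not suitable for validating arguments at runtime.

Suggested fix:

    else:
        raise ValueError(
            f"loaded_weight.dtype: {loaded_weight.dtype}, param.dtype: {param.dtype}"
        )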
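As a minimal sketch of the suggested fix in isolation, the check below raises an exception instead of asserting. `check_dtype` is a hypothetical helper, and dtypes are represented as plain strings so the example runs without Paddle; the error message format follows the suggestion above.

```python
def check_dtype(loaded_dtype: str, param_dtype: str) -> None:
    """Hypothetical helper: reject mismatched dtypes with an exception.

    Unlike an assert, a raise statement is not removed under `python -O`.
    """
    if loaded_dtype != param_dtype:
        raise ValueError(
            f"loaded_weight.dtype: {loaded_dtype}, param.dtype: {param_dtype}"
        )


# Matching dtypes pass silently.
check_dtype("float32", "float32")

# A mismatch raises ValueError with both dtypes in the message.
try:
    check_dtype("float16", "float32")
except ValueError as err:
    message = str(err)
    print(message)
```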

PaddlePaddle-bot · 2026-05-21T08:55:55Z

+    if loaded_weight.dtype != param.dtype:
+        if loaded_weight.dtype == paddle.int8 and param.dtype == paddle.float8_e4m3fn:
+            loaded_weight = loaded_weight.view(param.dtype)
+        else:


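🔴 Compatibility: Breaking Change. The .cast() in the original else branch is replaced with assert, which breaks existing implicit dtype conversion

Original code:

    else:
        loaded_weight = loaded_weight.cast(param.dtype)

This allowed implicit dtype conversion (e.g. float16 → float32) and was compatible with multiple weight file formats. With the new code using assert / raise, every dtype mismatch other than int8→float8_e4m3fn fails at model load time, which is a Breaking Change.

The PR description should state the intent of this semantic change explicitly, and it should be confirmed that no supported model or weight format relies on implicit dtype conversion before merging.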
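The difference between the two behaviors can be sketched side by side. This is an illustration under stated assumptions, not the PR's code: `FakeTensor` is a stand-in that only tracks a dtype name, `old_behavior` and `new_behavior` are hypothetical names, and the strict variant raises `ValueError` as the review recommends rather than asserting.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class FakeTensor:
    """Stand-in for a tensor; only the dtype name matters in this sketch."""

    dtype: str

    def cast(self, dtype: str) -> "FakeTensor":
        # Models a value conversion to another dtype.
        return replace(self, dtype=dtype)

    def view(self, dtype: str) -> "FakeTensor":
        # Models reinterpreting the same bytes as another dtype.
        return replace(self, dtype=dtype)


def old_behavior(loaded: FakeTensor, param_dtype: str) -> FakeTensor:
    """Pre-PR logic as described in the review: cast on any other mismatch."""
    if loaded.dtype != param_dtype:
        if loaded.dtype == "int8" and param_dtype == "float8_e4m3fn":
            return loaded.view(param_dtype)
        return loaded.cast(param_dtype)
    return loaded


def new_behavior(loaded: FakeTensor, param_dtype: str) -> FakeTensor:
    """Strict logic: only int8 -> float8_e4m3fn is accepted, the rest is rejected."""
    if loaded.dtype != param_dtype:
        if loaded.dtype == "int8" and param_dtype == "float8_e4m3fn":
            return loaded.view(param_dtype)
        raise ValueError(
            f"loaded_weight.dtype: {loaded.dtype}, param.dtype: {param_dtype}"
        )
    return loaded


# float16 weights loaded into a float32 parameter: accepted before, rejected now.
legacy = old_behavior(FakeTensor("float16"), "float32")
try:
    new_behavior(FakeTensor("float16"), "float32")
    strict_rejected = False
except ValueError:
    strict_rejected = True

print(legacy.dtype, strict_rejected)
```

Any checkpoint that previously relied on the fallback cast would take the failing path in the strict variant, which is the compatibility risk the review asks the author to rule out.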

EmmonsCurse

LGTM～ Skip coverage check as it mainly relies on tests with float8.

commit

5c497fe

zhoutianzi666 had a problem deploying to Metax_ci May 21, 2026 06:33 — with GitHub Actions Failure

zhoutianzi666 changed the title ~~commit~~ add assert May 21, 2026

commit

4e7810c

zhoutianzi666 had a problem deploying to Metax_ci May 21, 2026 06:57 — with GitHub Actions Failure

commit

99fddf1

zhoutianzi666 had a problem deploying to Metax_ci May 21, 2026 07:04 — with GitHub Actions Failure

commit

2713666

zhoutianzi666 had a problem deploying to Metax_ci May 21, 2026 07:13 — with GitHub Actions Failure

zhoutianzi666 changed the title ~~add assert~~ decrease code May 21, 2026

zhoutianzi666 changed the title ~~decrease code~~ decrease code && make code clean May 21, 2026

commit

7ae71f3

zhoutianzi666 had a problem deploying to Metax_ci May 21, 2026 08:46 — with GitHub Actions Failure

PaddlePaddle-bot suggested changes May 21, 2026

zhoutianzi666 changed the title ~~decrease code && make code clean~~ [OP][Loader]decrease code && make code clean May 21, 2026

EmmonsCurse added the skip-ci: coverage label May 21, 2026

EmmonsCurse approved these changes May 21, 2026

zhoutianzi666 wants to merge 5 commits into PaddlePaddle:develop from zhoutianzi666:add_assert


4 participants
