[not merge][XPU] fix kvcache_transfer compile #7919
Thanks for your contribution!
PaddlePaddle-bot left a comment
🤖 Paddle-CI-Agent | pr_review | 2026-05-25 17:29:01
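📋 Review Summary
PR overview: Fixes the cross-machine incompatibility caused by compiling kvcache_transfer with -march=native by switching to a fixed microarchitecture baseline, -march=x86-64-v2.
Scope of changes: cache_manager/transfer_factory/kvcache_transfer/
Impact tags: XPU KVCache
Issues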
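| Level | File | Summary |
|---|---|---|
| 📝 PR conventions | — | The Motivation / Modifications sections are empty, so the description is not compliant |
| ❓ Question | CMakeLists.txt:17 | Rationale for choosing the -march=x86-64-v2 microarchitecture level |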
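📝 PR Convention Check
The PR title format is compliant ([XPU] is an official tag). However, both ## Motivation and ## Modifications contain only empty placeholders, which does not meet the §D2 template requirements.
Suggested title (ready to copy):
[XPU][BugFix] fix kvcache_transfer compile: replace -march=native with -march=x86-64-v2
Suggested PR description (ready to copy; reproduces the full structure of checklist §D2):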
## Motivation
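`-march=native` makes the compiler emit instructions tuned to the CPU features of the build machine (for example AVX-512). Running the resulting binary on a target machine that lacks those features triggers an illegal-instruction error (SIGILL), so kvcache_transfer cannot run correctly in cross-compilation or heterogeneous deployment scenarios such as XPU. Replacing it with `-march=x86-64-v2` ensures the binary compiles and runs on a wider range of x86 platforms.
## Modifications
- `fastdeploy/cache_manager/transfer_factory/kvcache_transfer/CMakeLists.txt`: change the compile flag from `-march=native` to `-march=x86-64-v2` to avoid the build-time or run-time incompatibilities caused by the native architecture flag.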
## Usage or Command
N/A
## Accuracy Tests
N/A
## Checklist
- [x] Add at least a tag in the PR title.
- Tag list: [`[FDConfig]`,`[APIServer]`,`[Engine]`, `[Scheduler]`, `[PD Disaggregation]`, `[Executor]`, `[Graph Optimization]`, `[Speculative Decoding]`, `[RL]`, `[Models]`, `[Quantization]`, `[Loader]`, `[OP]`, `[KVCache]`, `[DataProcessor]`, `[BugFix]`, `[Docs]`, `[CI]`, `[Optimization]`, `[Feature]`, `[Benchmark]`, `[Others]`, `[XPU]`, `[HPU]`, `[GCU]`, `[DCU]`, `[Iluvatar]`, `[Metax]`]
- You can add new tags based on the PR content, but the semantics must be clear.
- [x] Format your code, run `pre-commit` before commit.
- [x] Add unit tests. Please write the reason in this PR if no unit tests.
- [ ] Provide accuracy results.
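- [ ] If the current PR is submitting to the `release` branch, make sure the PR has been submitted to the `develop` branch, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag.
Overall assessment
The fix is heading in the right direction: replacing -march=native with a fixed baseline level effectively resolves the cross-machine compatibility problem. The description template has not been filled in yet; please complete it before merging.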
 set(CMAKE_CXX_COMPILER g++)
-set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O2 -Ofast -ffast-math -funroll-loops -march=native -std=c++11")
+set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O2 -Ofast -ffast-math -funroll-loops -march=x86-64-v2 -std=c++11")
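❓ Question: -march=x86-64-v2 corresponds to the SSE4.2/POPCNT/CX16 baseline, which is a conservative, compatibility-oriented choice.
If kvcache_transfer is meant to be deployed on modern data-center CPUs (Haswell, released in 2013, or later), -march=x86-64-v3 (which adds AVX/AVX2/FMA support) can deliver better memory bandwidth and vectorization performance.
Please confirm the rationale for this choice: is maximum compatibility the priority, or is higher performance needed? If there are no special constraints, please state the reason in the PR description.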
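To help answer the question above, the x86-64 level of each candidate deployment host can be checked before settling on v2 or v3. The following is a minimal sketch, not part of the PR: the helper names `x86_64_level` and `read_cpu_flags` are hypothetical, and the flag spellings are assumed to be the ones Linux prints on the `flags` line of `/proc/cpuinfo` (for example `pni` for SSE3 and `abm` for LZCNT).

```python
"""Classify a Linux host's x86-64 microarchitecture level from its CPU flags."""

# Flags required by each level, spelled as on the "flags" line of
# /proc/cpuinfo (assumption: Linux naming). Each level also requires
# every flag of the levels below it.
LEVEL_FLAGS = {
    1: {"lm", "cmov", "cx8", "fpu", "fxsr", "mmx", "syscall", "sse2"},
    2: {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"},
    3: {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave"},
    4: {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"},
}


def x86_64_level(flags):
    """Return the highest x86-64-vN level (0 to 4) that `flags` fully satisfies."""
    available = set(flags)
    level = 0
    for candidate in sorted(LEVEL_FLAGS):
        # Stop at the first level whose required flags are not all present.
        if not LEVEL_FLAGS[candidate] <= available:
            break
        level = candidate
    return level


def read_cpu_flags(path="/proc/cpuinfo"):
    """Read the flag set of the first CPU listed in a cpuinfo-style file."""
    with open(path) as f:
        for line in f:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()  # no "flags" line, e.g. a non-x86 host
```

On a Linux target, `x86_64_level(read_cpu_flags())` returns 2 or higher when a binary built with `-march=x86-64-v2` is expected to run, and 3 or higher when `-march=x86-64-v3` would also be safe. Separately, the `x86-64-v2`/`-v3` names are only accepted by relatively recent compilers (GCC 11 and Clang 12 onward, to the best of my knowledge), so the build toolchain version is worth confirming as well.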
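CI report generated from the following code (updated every 30 minutes):
1 Task overview
All Required tasks have passed (10/10), so from the standpoint of merge-blocking Required CI, approval is recommended. All current failures are in Optional tasks and are for reference only. The PR only modifies
2 Task status summary
Note on the log column: failed tasks link directly to the CI pre-generated logs; Required and Optional tasks are shown in strictly separate sections.
2.1 Required tasks: 10/10 passed
2.2 Optional tasks: 27/31 passed
3 Failure details (Required only)
No Required tasks failed, so there is no need to invoke
4 Brief notes on Optional failures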
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## develop #7919 +/- ##
==========================================
Coverage ? 64.03%
==========================================
Files ? 467
Lines ? 64963
Branches ? 9962
==========================================
Hits ? 41601
Misses ? 20539
Partials ? 2823