[not merge][XPU] fix kvcache_transfer compile #7919

Open
zhupengyang wants to merge 1 commit into PaddlePaddle:develop from zhupengyang:glm_pd_cache

@zhupengyang
Collaborator

Motivation

💡 If this PR is a Cherry Pick, the PR title needs to follow the format by adding the [Cherry-Pick] label at the very beginning and appending the original PR ID at the end. For example, [Cherry-Pick][CI] Add check trigger and logic(#5191)

Modifications

Usage or Command

Accuracy Tests

Checklist

  • Add at least a tag in the PR title.
    • Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • Format your code, run pre-commit before commit.
  • Add unit tests. Please write the reason in this PR if no unit tests.
  • Provide accuracy results.
  • If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

@paddle-bot

paddle-bot Bot commented May 25, 2026

Thanks for your contribution!

@zhupengyang changed the title from "[XPU] fix kvcache_transfer compile" to "[not merge][XPU] fix kvcache_transfer compile" on May 25, 2026

@PaddlePaddle-bot PaddlePaddle-bot left a comment

🤖 Paddle-CI-Agent | pr_review | 2026-05-25 17:29:01

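📋 Review summary

PR overview: Fixes the cross-machine incompatibility caused by compiling kvcache_transfer with -march=native by switching to a fixed micro-architecture baseline, -march=x86-64-v2.
Change scope: cache_manager/transfer_factory/kvcache_transfer/
Affected tags: XPU, KVCache

Issues

| Level | File | Summary |
| --- | --- | --- |
| 📝 PR convention | - | The Motivation / Modifications sections are empty, so the description does not comply |
| ❓ Question | CMakeLists.txt:17 | Rationale for choosing the -march=x86-64-v2 micro-architecture level |

📝 PR convention check

The PR title format is compliant ([XPU] is an official tag). However, the `## Motivation` and `## Modifications` sections contain only empty placeholders, which does not meet the §D2 template requirements.

Suggested title (can be copied directly):

  • [XPU][BugFix] fix kvcache_transfer compile: replace -march=native with -march=x86-64-v2

Suggested PR description (can be copied directly; it reproduces the full structure of checklist §D2):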

## Motivation
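`-march=native` makes the compiler emit instructions tuned to the CPU features of the build machine (for example AVX-512). When the resulting binary runs on a target machine that lacks those features, it fails with an illegal-instruction error (SIGILL), so kvcache_transfer cannot run correctly in cross-compilation or heterogeneous deployment scenarios such as XPU. Replacing it with `-march=x86-64-v2` ensures the binary compiles and runs on a wider range of x86 platforms.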

## Modifications
- `fastdeploy/cache_manager/transfer_factory/kvcache_transfer/CMakeLists.txt`: change the compile flag from `-march=native` to `-march=x86-64-v2` to avoid compile-time or run-time incompatibilities caused by the native architecture flag.

## Usage or Command
N/A

## Accuracy Tests
N/A

## Checklist

- [x] Add at least a tag in the PR title.
  - Tag list: [`[FDConfig]`,`[APIServer]`,`[Engine]`, `[Scheduler]`, `[PD Disaggregation]`, `[Executor]`, `[Graph Optimization]`, `[Speculative Decoding]`, `[RL]`, `[Models]`, `[Quantization]`, `[Loader]`, `[OP]`, `[KVCache]`, `[DataProcessor]`, `[BugFix]`, `[Docs]`, `[CI]`, `[Optimization]`, `[Feature]`, `[Benchmark]`, `[Others]`, `[XPU]`, `[HPU]`, `[GCU]`, `[DCU]`, `[Iluvatar]`, `[Metax]`]
  - You can add new tags based on the PR content, but the semantics must be clear.
- [x] Format your code, run `pre-commit` before commit.
- [x] Add unit tests. Please write the reason in this PR if no unit tests.
- [ ] Provide accuracy results.
- [ ] If the current PR is submitting to the `release` branch, make sure the PR has been submitted to the `develop` branch, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag.

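Overall assessment

The fix is headed in the right direction: replacing -march=native with a fixed baseline level effectively resolves the cross-machine compatibility problem. The description template has not been filled in yet; please complete it before merging.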
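
The failure mode described in the suggested Motivation can also be turned into an explicit startup check. The sketch below is not part of this PR and assumes GCC or Clang on x86-64; `cpu_matches_build` and `require_cpu_matches_build` are hypothetical names. It compares a few ISA extensions the compiler was allowed to use (visible through predefined macros such as `__AVX2__`) with what the running CPU reports, so that a mismatch produces a readable message instead of SIGILL.

```cpp
#include <cassert>

// Hypothetical helper (not part of this PR): returns true when each ISA
// extension checked below that this binary was compiled to use is also
// reported by the CPU it is running on. Only three extensions are checked.
// On other compilers or architectures it performs no check and returns true.
bool cpu_matches_build() {
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
  __builtin_cpu_init();
#ifdef __AVX512F__  // set e.g. by -march=native on an AVX-512 build host
  if (!__builtin_cpu_supports("avx512f")) return false;
#endif
#ifdef __AVX2__  // set by -march=x86-64-v3 and above
  if (!__builtin_cpu_supports("avx2")) return false;
#endif
#ifdef __SSE4_2__  // set by -march=x86-64-v2 and above
  if (!__builtin_cpu_supports("sse4.2")) return false;
#endif
#endif
  return true;
}

// Hypothetical call site: in builds without NDEBUG, stop with a readable
// assertion message instead of crashing later with an illegal instruction.
void require_cpu_matches_build() {
  assert(cpu_matches_build() &&
         "CPU lacks instructions this binary was built for");
}
```

Calling `require_cpu_matches_build()` at the top of `main` is the intended use. The check is best-effort only: the compiler is free to emit the newer instructions in code that runs before the check.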


  set(CMAKE_CXX_COMPILER g++)
- set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O2 -Ofast -ffast-math -funroll-loops -march=native -std=c++11")
+ set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O2 -Ofast -ffast-math -funroll-loops -march=x86-64-v2 -std=c++11")

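❓ Question: -march=x86-64-v2 corresponds to the SSE4.2/POPCNT/CX16 baseline, which is a conservative, compatibility-oriented choice.

If the target deployment environment for kvcache_transfer is modern data-center CPUs (Haswell, released in 2013, or newer), -march=x86-64-v3 (which adds AVX/AVX2/FMA support) may deliver better memory bandwidth and vectorization performance.

Please confirm the rationale for this choice: is maximum compatibility the priority, or is higher performance needed? Unless there are special constraints, we suggest stating the reason in the PR description.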
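
One way to ground the choice between the two levels is to check what the target machines actually support. The following is a minimal sketch, not part of this PR, assuming GCC or Clang on x86-64. `x86_64_level` and `can_run_level` are hypothetical helpers, and they test only a representative subset of each level's features (for example, CX16, LZCNT, and MOVBE are not checked).

```cpp
#include <cassert>

// Hypothetical helper (not part of this PR): estimate the highest x86-64
// micro-architecture level the current CPU supports, up to v3.
// Returns 0 on non-x86-64 builds, otherwise 1, 2, or 3.
int x86_64_level() {
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
  __builtin_cpu_init();
  // x86-64-v2 adds, among others, SSE3, SSSE3, SSE4.1, SSE4.2, and POPCNT.
  const bool v2 = __builtin_cpu_supports("sse3") &&
                  __builtin_cpu_supports("ssse3") &&
                  __builtin_cpu_supports("sse4.1") &&
                  __builtin_cpu_supports("sse4.2") &&
                  __builtin_cpu_supports("popcnt");
  // x86-64-v3 adds, among others, AVX, AVX2, BMI1, BMI2, and FMA.
  const bool v3 = v2 && __builtin_cpu_supports("avx") &&
                  __builtin_cpu_supports("avx2") &&
                  __builtin_cpu_supports("bmi") &&
                  __builtin_cpu_supports("bmi2") &&
                  __builtin_cpu_supports("fma");
  return v3 ? 3 : (v2 ? 2 : 1);
#else
  return 0;
#endif
}

// True when, according to this partial check, a binary built with
// -march=x86-64-v<built_level> should be able to run on this CPU.
bool can_run_level(int built_level) {
  assert(built_level >= 1 && built_level <= 3);  // only v1..v3 are modelled
  return x86_64_level() >= built_level;
}
```

Running this on representative deployment hosts and looking at `can_run_level(3)` would indicate whether `-march=x86-64-v3` is a safe baseline there, or whether `-march=x86-64-v2` is needed for compatibility.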

@PaddlePaddle-bot

PaddlePaddle-bot commented May 25, 2026

🤖 Paddle-CI-Agent | ci_status_monitor | 2026-05-26 20:27:25

The CI report is generated from the following code (refreshed every 30 minutes):


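1 Task overview

All Required tasks have passed (10/10), so from the perspective of the merge-blocking Required CI this PR is recommended to pass; the current failures are all in Optional tasks and are for reference only. The PR changes only one line of compile flags in fastdeploy/cache_manager/transfer_factory/kvcache_transfer/CMakeLists.txt (-march=native → -march=x86-64-v2), and the related Required XPU and general tests have all passed.

| Total runs (reruns) | Total tasks | ✅ Passed | ❌ Failed | ⏳ Running | ⏸️ Waiting | Skipped |
| --- | --- | --- | --- | --- | --- | --- |
| 41 (0) | 41 | 37 | 3 | 0 | 0 | 0 |

Note: one additional Optional workflow/job, CI_HPU, is in the cancelled state; it does not affect the Required CI conclusion.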

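2 Task status summary

About the log column: failed tasks link directly to the logs pre-generated by CI; Required and Optional tasks are shown in strictly separate sections.

2.1 Required tasks: 10/10 passed

Required tasks block merging, and their failures must be handled first. This run has no Required tasks that failed, are running, or are waiting.

| Status | Task | Duration | Root cause | Suggested fix | Log | Rerun |
| --- | --- | --- | --- | --- | --- | --- |
| ✅ | The other 10 required tasks passed | - | - | - | - | - |

2.2 Optional tasks: 27/31 passed

Optional tasks do not block merging, and their failures are for reference only; per the Skill rules, no in-depth analysis is performed.

| Status | Task | Duration | Log | Rerun |
| --- | --- | --- | --- | --- |
| ❌ | Run iluvatar Tests / run_iluvatar_cases | 2m36s | Job | - |
| ❌ | Check PR Template | 20s | Job | - |
| ❌ | Trigger Jenkins for PR | 1m3s | Job | - |
| Cancelled | CI_HPU | - | Workflow | - |
| ✅ | The other 27 optional tasks passed | - | - | - |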

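3 Failure details (Required only)

There are no failed Required tasks, so there is no need to invoke ci_failure_analyzer for in-depth analysis.

4 Brief notes on Optional failures

  • Run iluvatar Tests / run_iluvatar_cases: the log summary shows that execution failed on the self-hosted runner/container, which points to an environment or runner problem; rerun as needed.
  • Check PR Template: the PR template still contains many placeholder instructions and the Checklist is unchecked; the author should complete the PR description and test information and then rerun.
  • Trigger Jenkins for PR: the Docker build failed; this belongs to the Optional METAX Jenkins trigger chain, and the relevant maintainers should review it and rerun as needed.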

@codecov-commenter

codecov-commenter commented May 25, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (develop@e85f0e2).

Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #7919   +/-   ##
==========================================
  Coverage           ?   64.03%           
==========================================
  Files              ?      467           
  Lines              ?    64963           
  Branches           ?     9962           
==========================================
  Hits               ?    41601           
  Misses             ?    20539           
  Partials           ?     2823           
Flag Coverage Δ
GPU 73.15% <ø> (?)
XPU 7.07% <ø> (?)

