
Add benchmark-compare skill from #7803 #7847

Merged
Jiang-Jia-Jun merged 1 commit into PaddlePaddle:develop from chang-wenbin:benchmark-skill
May 19, 2026


@chang-wenbin chang-wenbin commented May 19, 2026

Collaborator

Motivation
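In day-to-day performance evaluation, we frequently need to compare two inference frameworks, FastDeploy and SGLang. Doing this by hand involves environment installation, service startup, health checks, benchmark execution, metric extraction, and report generation, which is tedious and error-prone. This PR adds an Agent Skill (.claude/skills/benchmark-compare/) that orchestrates the whole workflow automatically, so that a performance comparison can be run and a visual HTML report generated in one step, either through natural language or through the /benchmark command.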

Modifications
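Adds the .claude/skills/benchmark-compare/ directory, containing the following files:

SKILL.md: main skill definition, including the complete 12-step workflow orchestration, the parameter table, the decision tree, and two working modes (fully automated testing / report generation only)
README.md: usage documentation
scripts/launch_service.sh: generic service launch script, supporting both frameworks (FD/SG) and the single/TP/PD deployment modes
scripts/health_check.sh: service health check script that polls the /v1/models endpoint
scripts/run_benchmark.sh: wrapper script for benchmark execution
scripts/extract_metrics.py: extracts the core metrics (throughput, latency, TTFT, etc.) from benchmark result files and outputs them as JSON
scripts/generate_report.py: generates a multi-mode visual HTML comparison report
references/html_template.md: HTML report template (including CSS/JS and placeholders)
references/model_profiles.md: table of recommended deployment parameters per model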
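
For readers unfamiliar with the health-check step, the polling idea behind scripts/health_check.sh can be sketched in Python as follows. This is only an illustration and not the script shipped in this PR: the function names, the retry count, the interval, and the port in the usage note are all assumptions; only the /v1/models endpoint comes from the description above.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET on `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, HTTP error status: not ready yet.
        return False


def wait_until_healthy(
    base_url: str,
    max_attempts: int = 60,          # assumed default, not from the PR
    interval_s: float = 5.0,         # assumed default, not from the PR
    probe: Callable[[str], bool] = http_probe,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll <base_url>/v1/models until it responds or attempts run out."""
    url = base_url.rstrip("/") + "/v1/models"
    for attempt in range(1, max_attempts + 1):
        if probe(url):
            return True
        if attempt < max_attempts:
            sleep(interval_s)  # wait before the next attempt
    return False
```

A caller would then gate the benchmark on something like `wait_until_healthy("http://localhost:8000")`, where the host and port are placeholders. The `probe` and `sleep` parameters are injectable only so the loop can be exercised without a running service.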
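Supported features:

Multiple deployment modes: single-GPU, multi-GPU TP, and PD disaggregation
Quantization modes such as BF16 / FP8
Automatic detection and allocation of idle GPUs
Automatic matching of the hyperparameter YAML config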
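
One plausible way to implement the idle-GPU detection listed above is to parse `nvidia-smi` query output and keep the devices below some memory and utilization thresholds. The sketch below is hypothetical and is not taken from the skill: the function names and the thresholds (500 MiB, 5%) are assumptions chosen for illustration.

```python
import subprocess
from typing import List

# Real nvidia-smi query flags; output is one "index, memory_MiB, util_pct" row per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def query_gpus() -> str:
    """Run nvidia-smi and return its CSV output (requires an NVIDIA driver)."""
    return subprocess.run(QUERY, check=True, capture_output=True, text=True).stdout


def parse_idle_gpus(csv_text: str, max_mem_mib: int = 500, max_util_pct: int = 5) -> List[int]:
    """Return indices of GPUs at or under both (assumed) idle thresholds."""
    idle = []
    for line in csv_text.strip().splitlines():
        fields = [f.strip() for f in line.split(",")]
        if len(fields) != 3:
            continue  # skip malformed rows
        try:
            index, mem, util = (int(f) for f in fields)
        except ValueError:
            continue  # skip rows with non-numeric values such as "[N/A]"
        if mem <= max_mem_mib and util <= max_util_pct:
            idle.append(index)
    return idle


def allocate_gpus(csv_text: str, needed: int) -> List[int]:
    """Pick the first `needed` idle GPUs, or fail if too few are free."""
    idle = parse_idle_gpus(csv_text)
    if len(idle) < needed:
        raise RuntimeError(f"need {needed} idle GPUs, found {len(idle)}")
    return idle[:needed]
```

For a TP=2 run, `allocate_gpus(query_gpus(), 2)` would return two device indices, which a launcher could then export through `CUDA_VISIBLE_DEVICES`. Whether the skill does it this way is not stated in the PR description.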
Usage or Command
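Use as an Agent Skill (in Claude Code / Ducc):

Option 1: slash command

/benchmark

Option 2: natural language

Run a benchmark for me with the model at /path/to/GLM-4.7-Flash, TP=2, concurrency 64, and fp8 quantization enabled

Option 3: generate a report from existing data only

Generate an HTML comparison report from these logs for me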


paddle-bot Bot commented May 19, 2026


Thanks for your contribution!

@CLAassistant


CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


chang-wenbin seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.

@chang-wenbin chang-wenbin changed the title Add benchmark-compare skill Add benchmark-compare skill from #7803 May 19, 2026
@Jiang-Jia-Jun Jiang-Jia-Jun merged commit 4353cdf into PaddlePaddle:develop May 19, 2026
27 of 38 checks passed

3 participants