[Kunlunxin] Support DS V3/R1 fp8 cast to channel-wise int8 #208

Merged: yghstill merged 5 commits into Tencent:main from dynamicheart:fp8_cast_channel_int8 on Jan 13, 2026
@dynamicheart (Contributor) commented Jan 13, 2026

Usage

python3 fp8_cast_channel_int8.py --input-fp8-path /path/to/DeepSeek-R1 --output-int8-path /path/to/DeepSeek-R1-Channel-INT8 --num-workers 32
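
For readers unfamiliar with the conversion, here is a rough sketch of what an fp8 to channel-wise int8 cast typically involves. It is not the code from `tools/fp8_cast_channel_int8.py`. It assumes the DeepSeek V3/R1 checkpoint layout, where each FP8 weight carries a block-wise inverse scale (`weight_scale_inv`, 128x128 blocks), and it assumes symmetric int8 quantization with one scale per output channel (row). The function names are hypothetical, and plain Python lists stand in for tensors so the example has no dependencies.

```python
def dequantize_blockwise(weight, scale_inv, block=128):
    """Recover full-precision weights from block-scaled values.

    weight:    2-D list of floats (the decoded FP8 values).
    scale_inv: 2-D list with one scale per (block x block) tile.
    block:     tile size; 128 is an assumption based on DeepSeek V3 checkpoints.
    """
    return [
        [w * scale_inv[i // block][j // block] for j, w in enumerate(row)]
        for i, row in enumerate(weight)
    ]


def quantize_per_channel_int8(weight):
    """Symmetric int8 quantization with one scale per row (output channel).

    Returns (int8_rows, scales) such that int8_rows[i][j] * scales[i]
    approximates weight[i][j].
    """
    q_rows, scales = [], []
    for row in weight:
        amax = max(abs(w) for w in row)
        # An all-zero row gets scale 1.0 to avoid dividing by zero.
        scale = amax / 127.0 if amax > 0 else 1.0
        q_rows.append([max(-127, min(127, round(w / scale))) for w in row])
        scales.append(scale)
    return q_rows, scales
```

In this sketch a weight matrix would be cast by calling `dequantize_blockwise` first and then `quantize_per_channel_int8` on the result. The rounding error per element is bounded by half of that row's scale.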

Verified on the model DeepSeek-V3.1-Terminus

image

vLLM v0.13.0 H20 Test Result

$ curl http://127.0.0.1:30000/v1/chat/completions  -H "Content-Type: application/json" -d '{"model": "base_model", "messages": [{"content":"北京的天气如何", "role": "user"}]}'
{"id":"chatcmpl-b1e81e0764eacaa8","object":"chat.completion","created":1768308647,"model":"base_model","choices":[{"index":0,"message":{"role":"assistant","content":"我需要知道您想查询的具体时间才能给出准确的天气信息。不过,我可以为您提供查看北京实时天气的方法:\n\n**当前查看方式:**\n1. **天气应用/网站**:打开手机天气应用或搜索引擎(如中国天气网、Weather.com),输入“北京”即可查看实时天气、温度、湿度等详细信息。\n2. **语音助手**:直接对手机说“今天北京天气”(如Siri、小爱同学等)。\n\n**如果您需要未来天气预报,请告诉我具体日期(如明天、本周日),我会为您简要总结!**\n\n**近期北京天气特点(一般规律):**\n- 夏季(6-8月):炎热多雨,午后可能有雷阵雨,近期温度多在25°C-35°C之间。\n- 其他季节:春秋温和,冬季干燥寒冷,需注意防风保暖。\n\n希望这些信息对您有帮助! 🌞","refusal":null,"annotations":null,"audio":null,"function_call":null,"tool_calls":[],"reasoning":null,"reasoning_content":null},"logprobs":null,"finish_reason":"stop","stop_reason":null,"token_ids":null}],"service_tier":null,"system_fingerprint":null,"usage":{"prompt_tokens":7,"total_tokens":193,"completion_tokens":186,"prompt_tokens_details":null},"prompt_logprobs":null,"prompt_token_ids":null,"kv_transfer_params":null}
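
The same smoke test can be scripted. The sketch below builds the request shown in the `curl` command (whose prompt is Chinese for "What is the weather like in Beijing?") and pulls the assistant text out of a response shaped like the log above. It uses only the Python standard library. The helper names `build_chat_request` and `extract_reply` are hypothetical, and the address and model name are simply the ones from the test log.

```python
import json
import urllib.request


def build_chat_request(base_url, model, prompt):
    """Build a POST request for the OpenAI-compatible chat completions endpoint."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response_body):
    """Return the assistant message text from a chat completion response body."""
    return json.loads(response_body)["choices"][0]["message"]["content"]
```

With a server running, `urllib.request.urlopen(build_chat_request("http://127.0.0.1:30000", "base_model", "北京的天气如何"))` sends the request, and passing the bytes from `.read()` to `extract_reply` yields the reply text.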

vLLM v0.11.0 P800 Test Result

image

@dynamicheart changed the title from "Support DS V3/R1 fp8 cast to channel-wise int8" to "[Kunlunxin] Support DS V3/R1 fp8 cast to channel-wise int8" on Jan 13, 2026
@yghstill (Collaborator) commented

@dynamicheart Please run the code formatting checks:
pip3 install pre-commit black isort flake8
pre-commit run --all-files

Comment thread tools/fp8_cast_channel_int8.py Outdated
@yghstill yghstill merged commit 197e21d into Tencent:main Jan 13, 2026
5 checks passed
yghstill pushed a commit that referenced this pull request Jan 16, 2026
Co-authored-by: Jianbang Yang <yangjianbang112@gmail.com>
dawnranger pushed a commit to dawnranger/AngelSlim that referenced this pull request Mar 11, 2026
3 participants