Commit 8196f35

remove nccl pd mode. (#1342)
1 parent 2740083 commit 8196f35

89 files changed

Lines changed: 1095 additions & 4192 deletions

docs/CN/source/tutorial/deepseek_deployment.rst

Lines changed: 3 additions & 1 deletion
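@@ -175,6 +175,7 @@ PD (Prefill-Decode) disaggregation mode deploys the prefill and decode stages separately, which can

 # PD prefill mode for DeepSeek-R1 (DP+EP) on H200
 # Usage: sh pd_prefill.sh <host> <pd_master_ip>
+# NIXL transport is used by default; to use the NCCL data plane, set LIGHTLLM_PD_KV_TRANSPORT_BACKEND=nccl
 # nvidia-cuda-mps-control -d, run MPS (optional; performance is much better with MPS support, but enabling MPS is error-prone on some GPU and driver environments, so upgrading to a newer driver version is recommended, especially for H-series cards)

 export host=$1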
@@ -201,6 +202,7 @@ PD (Prefill-Decode) disaggregation mode deploys the prefill and decode stages separately, which can

 # PD decode mode for DeepSeek-R1 (DP+EP) on H200
 # Usage: sh pd_decode.sh <host> <pd_master_ip>
+# NIXL transport is used by default; to use the NCCL data plane, set LIGHTLLM_PD_KV_TRANSPORT_BACKEND=nccl
 export host=$1
 export pd_master_ip=$2
 nvidia-cuda-mps-control -d
@@ -336,4 +338,4 @@ PD (Prefill-Decode) disaggregation mode deploys the prefill and decode stages separately, which can
 --tokenizer_path /path/DeepSeek-R1/ \
 --url http://127.0.0.1:8088/generate_stream

-All of the scripts above can be found in the `test/start_scripts/multi_pd_master/` directory.
+All of the scripts above can be found in the `test/start_scripts/multi_pd_master/` directory.

docs/EN/source/tutorial/deepseek_deployment.rst

Lines changed: 3 additions & 1 deletion
@@ -175,6 +175,7 @@ PD (Prefill-Decode) disaggregation mode separates prefill and decode stages for

 # PD prefill mode for DeepSeek-R1 (DP+EP) on H200
 # Usage: sh pd_prefill.sh <host> <pd_master_ip>
+# NIXL is used by default. To use NCCL as the data-plane backend, set LIGHTLLM_PD_KV_TRANSPORT_BACKEND=nccl.
 # nvidia-cuda-mps-control -d, run MPS (optional, performance will be much better with mps support, but some GPUs may encounter errors when enabling mps, it's recommended to upgrade to a higher driver version, especially for H-series cards)

 export host=$1
@@ -198,6 +199,7 @@ PD (Prefill-Decode) disaggregation mode separates prefill and decode stages for

 # PD decode mode for DeepSeek-R1 (DP+EP) on H200
 # Usage: sh pd_decode.sh <host> <pd_master_ip>
+# NIXL is used by default. To use NCCL as the data-plane backend, set LIGHTLLM_PD_KV_TRANSPORT_BACKEND=nccl.
 export host=$1
 export pd_master_ip=$2
 nvidia-cuda-mps-control -d
@@ -333,4 +335,4 @@ Supports multiple PD Master nodes, providing better load balancing and high avai
 --tokenizer_path /path/DeepSeek-R1/ \
 --url http://127.0.0.1:8088/generate_stream

-All the above scripts can be referenced from the scripts in the `test/start_scripts/multi_pd_master/` directory.
+All the above scripts can be referenced from the scripts in the `test/start_scripts/multi_pd_master/` directory.
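
The comment added to both PD scripts says that NIXL is the default and that `LIGHTLLM_PD_KV_TRANSPORT_BACKEND=nccl` selects NCCL for the data plane. The following is a minimal sketch of that "default unless overridden" pattern, not code from this commit: the helper name `resolve_kv_transport_backend`, the set of accepted values, and the case-insensitive matching are all assumptions made for illustration.

```python
import os

# Environment variable name taken from the doc comment in this commit.
ENV_NAME = "LIGHTLLM_PD_KV_TRANSPORT_BACKEND"

# Assumed set of accepted values; the commit only mentions NIXL and NCCL.
SUPPORTED_BACKENDS = ("nixl", "nccl")


def resolve_kv_transport_backend(env=None):
    """Hypothetical helper: return the backend name, defaulting to "nixl".

    `env` is any mapping of variable names to values; it defaults to the
    process environment so the function can be tested without mutating it.
    """
    env = os.environ if env is None else env
    # Normalizing whitespace and case is an assumption, not documented behavior.
    value = env.get(ENV_NAME, "nixl").strip().lower()
    if value not in SUPPORTED_BACKENDS:
        raise ValueError(f"{ENV_NAME}={value!r} is not one of {SUPPORTED_BACKENDS}")
    return value
```

Under these assumptions, calling `resolve_kv_transport_backend()` in a shell where the variable is unset yields the NIXL default, while running `export LIGHTLLM_PD_KV_TRANSPORT_BACKEND=nccl` before `sh pd_prefill.sh <host> <pd_master_ip>` would select NCCL.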

lightllm/common/basemodel/basemodel.py

Lines changed: 0 additions & 6 deletions
@@ -110,7 +110,6 @@ def __init__(self, kvargs):
         # This may take up a large amount of GPU memory, so the mem_manager held by req_manager is assigned only after the mem manager has been initialized
         self.req_manager.mem_manager = self.mem_manager

-        self._init_kv_move_buffer()
         self._check_mem_size()
         self._init_infer_layer()
         self._init_some_value()
@@ -197,11 +196,6 @@ def _init_mem_manager(self):
         )
         return

-    def _init_kv_move_buffer(self):
-        # This initialization is only needed in PD-disaggregated inference mode
-        if self.run_mode in ["prefill", "decode"]:
-            self.mem_manager.alloc_kv_move_buffer(self.mem_manager.size)
-
     def _check_mem_size(self):
         self.max_total_token_num = self.mem_manager.size


lightllm/common/basemodel/infer_lock.py

Lines changed: 0 additions & 138 deletions
This file was deleted.
