Commit 446b26b

[Feature] support blackwell gemm in ht (#7053)
* [Feature] support blackwell gemm in ht
* [Feature] support ops for convert
* fix cuda error 716
* fix cuda error
* opt memory
* remove unused code
1 parent 334b02c commit 446b26b

5 files changed

Lines changed: 1031 additions & 2 deletions

fastdeploy/envs.py

Lines changed: 2 additions & 0 deletions
@@ -88,6 +88,8 @@ def _validate_split_kv_size(value: int) -> int:
     "FD_PD_CHANGEABLE": lambda: os.getenv("FD_PD_CHANGEABLE", "0"),
     # Whether to use DeepGemm for FP8 blockwise MoE.
     "FD_USE_DEEP_GEMM": lambda: bool(int(os.getenv("FD_USE_DEEP_GEMM", "0"))),
+    # Whether to use DeepGemm for FP8 blockwise MoE.
+    "FD_USE_BLACKWELL_GEMM": lambda: bool(int(os.getenv("FD_USE_BLACKWELL_GEMM", "0"))),
     # Whether to use PFCCLab/DeepEP.
     "FD_USE_PFCC_DEEP_EP": lambda: bool(int(os.getenv("FD_USE_PFCC_DEEP_EP", "0"))),
     # Whether to use aggregate send.
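
The new `FD_USE_BLACKWELL_GEMM` entry follows the same `bool(int(os.getenv(...)))` pattern as `FD_USE_DEEP_GEMM`, and its comment in the commit repeats the DeepGemm comment verbatim. The sketch below shows how a flag parsed this way behaves for a few values. The dict name `environment_variables` and the helper `read_flag` are assumptions made for illustration; only the lambda itself is taken from the diff.

```python
import os

# Hypothetical stand-in for the mapping in fastdeploy/envs.py.
# Only the lambda body is copied from the diff; the dict name is assumed.
environment_variables = {
    "FD_USE_BLACKWELL_GEMM": lambda: bool(int(os.getenv("FD_USE_BLACKWELL_GEMM", "0"))),
}


def read_flag(name: str) -> bool:
    """Evaluate the lambda for `name` at call time (hypothetical helper)."""
    return environment_variables[name]()


if __name__ == "__main__":
    # Unset: falls back to the default "0", so the flag is off.
    os.environ.pop("FD_USE_BLACKWELL_GEMM", None)
    print(read_flag("FD_USE_BLACKWELL_GEMM"))

    # "1" (or any non-zero integer string) turns the flag on.
    os.environ["FD_USE_BLACKWELL_GEMM"] = "1"
    print(read_flag("FD_USE_BLACKWELL_GEMM"))

    # Non-integer strings such as "true" are rejected by int().
    os.environ["FD_USE_BLACKWELL_GEMM"] = "true"
    try:
        read_flag("FD_USE_BLACKWELL_GEMM")
    except ValueError as exc:
        print(f"invalid value: {exc}")
```

Because the value passes through `int()` first, the flag must be set to an integer string such as `0` or `1`; spellings like `true` or `yes` raise `ValueError` instead of enabling the feature.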
