Add CUDA graph capture/replay for qwen 3.5 moe decode method #226

Job Run time
37m 32s
12m 9s
36m 5s
8m 46s
33m 18s
33m 48s
9m 10s
12m 59s
10m 30s
10m 33s
9m 52s
10m 33s
10m 54s
10m 39s
9m 45s
10m 19s
10m 26s
10m 9s
9m 53s
10m 29s
11m 5s
Total: 5h 18m 54s