Commit f284d7d

[NV] llm-d: enforce-eager on H200 prefill to skip cudagraph OOM
Signed-off-by: Ezra Silvera <ezra@il.ibm.com>
1 parent 170ee9f commit f284d7d

1 file changed

Lines changed: 1 addition & 0 deletions

benchmarks/multi_node/llm-d-recipes/dsr1-fp8-h200-1p1d-simple.yaml

@@ -98,6 +98,7 @@ prefill:
 --max-model-len 16384
 --block-size 256
 --no-enable-prefix-caching
+--enforce-eager
 env: {}
 
 decode:
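
The change appends a single flag to the prefill argument list. In vLLM, `--enforce-eager` runs the model in eager-mode PyTorch and disables CUDA graph capture, which matches the commit title's aim of avoiding the cudagraph OOM. As a minimal sketch of the same edit applied programmatically, the snippet below adds the flag only when it is missing. The helper name `ensure_flag` is hypothetical and not part of llm-d or the recipe, and the flag string holds only the three context lines visible in the diff, not the full prefill command.

```python
import shlex


def ensure_flag(command: str, flag: str) -> str:
    """Return `command` with `flag` appended unless it is already present."""
    tokens = shlex.split(command)
    if flag not in tokens:
        tokens.append(flag)
    return shlex.join(tokens)


# Only the flags shown as diff context; the rest of the command is not visible.
prefill_flags = "--max-model-len 16384 --block-size 256 --no-enable-prefix-caching"

updated = ensure_flag(prefill_flags, "--enforce-eager")
print(updated)
```

Calling `ensure_flag` a second time on the result leaves it unchanged, so the edit is safe to re-run over a recipe that already carries the flag.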
