Commit 6aded53

fixup
1 parent 75515f5 commit 6aded53

1 file changed

Lines changed: 1 addition & 0 deletions

deepmd/utils/argcheck.py

@@ -3948,6 +3948,7 @@ def training_args(
 "but reduces optimizer memory to 1/N per GPU. "
 "2: FSDP2 stage-2, shards optimizer states and gradients; same communication "
 "volume as stage-1 but further reduces gradient memory to 1/N per GPU. "
+"Stages 2 and 3 require FSDP2, which is available in PyTorch >= 2.6. "
 "Note: FSDP2 introduces DTensor dispatch overhead that can slow down "
 "models with many small layers; use torch.compile to mitigate. "
 "3: FSDP2 stage-3, shards parameters as well; maximum memory savings but "
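
The added docstring line states a version requirement (stages 2 and 3 need PyTorch >= 2.6), but this diff does not show whether it is enforced anywhere. As a rough sketch only, a guard along these lines could turn the documented requirement into an early error. The names `check_zero_stage`, `zero_stage`, and `torch_version` are hypothetical and do not come from this commit; the version string is passed in as an argument so the sketch stays independent of any installed PyTorch.

```python
import re


def parse_major_minor(version: str) -> tuple:
    """Extract (major, minor) from a version string such as '2.6.0+cu124'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def check_zero_stage(zero_stage: int, torch_version: str) -> None:
    """Hypothetical guard: stages 2 and 3 rely on FSDP2 (PyTorch >= 2.6).

    Both the function name and the `zero_stage` parameter are assumptions
    made for illustration; they are not taken from deepmd/utils/argcheck.py.
    """
    if zero_stage >= 2 and parse_major_minor(torch_version) < (2, 6):
        raise RuntimeError(
            f"stage {zero_stage} requires FSDP2 (PyTorch >= 2.6), "
            f"but the detected version is {torch_version}"
        )
```

In a real training script the second argument would most likely be `torch.__version__`. Comparing integer tuples rather than raw strings keeps a version like `2.10` from sorting below `2.6`.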
