
Commit d10da7d

docs: update output structure with complete checkpoint details
- Add optimizer_0/ directory for optimizer states
- Add pytorch_model_fsdp_0/ for FSDP sharded model states
- Add random_states_*.pkl for reproducibility (per-rank random states)
- Add README.md in transformer/ directory
- Clarify EMA states are optional (if enabled)
1 parent 7123f4b commit d10da7d

1 file changed

Lines changed: 9 additions & 2 deletions

README.md

@@ -388,8 +388,15 @@ logs/
 └── <run_name>_<timestamp>/
     ├── 📁 checkpoints/ # Periodic checkpoints
     │   └── checkpoint-{step}/
-    │       ├── ema/ # EMA states
-    │       ├── unwrapped_model/transformer/ # Model weights
+    │       ├── ema/ # EMA states (if enabled)
+    │       ├── unwrapped_model/ # Model weights
+    │       │   └── transformer/
+    │       │       ├── adapter_config.json # LoRA config (if LoRA)
+    │       │       ├── adapter_model.safetensors # LoRA weights (if LoRA)
+    │       │       └── README.md # Model card
+    │       ├── optimizer_0/ # Optimizer states (for resuming)
+    │       ├── pytorch_model_fsdp_0/ # FSDP sharded model states
+    │       ├── random_states_*.pkl # Random states for each rank (for reproducibility)
     │       └── metadata.json # Step & config metadata
     ├── 📁 final_model/ # Final trained model
     │   └── transformer/
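
The layout documented in this diff can be checked against a real run with a few lines of Python. The following is only a sketch: the function names `describe_checkpoint` and `latest_checkpoint` are hypothetical and not part of the repository, and it assumes that `{step}` in `checkpoint-{step}` is a plain decimal integer. Because several entries are conditional (EMA, LoRA, FSDP), nothing is treated as mandatory; the code only reports what it finds.

```python
import re
from pathlib import Path

# Entries named in the updated tree. Which ones appear depends on the run
# (EMA enabled, LoRA used, FSDP used), so none is treated as required here.
EXPECTED_ENTRIES = {
    "ema": "ema",
    "lora_config": "unwrapped_model/transformer/adapter_config.json",
    "lora_weights": "unwrapped_model/transformer/adapter_model.safetensors",
    "optimizer": "optimizer_0",
    "fsdp_model": "pytorch_model_fsdp_0",
    "metadata": "metadata.json",
}


def describe_checkpoint(checkpoint_dir):
    """Report which documented entries exist under one checkpoint-{step}/ directory."""
    root = Path(checkpoint_dir)
    report = {name: (root / rel).exists() for name, rel in EXPECTED_ENTRIES.items()}
    # random_states_*.pkl is written per rank, so count the matching files.
    report["random_state_files"] = len(list(root.glob("random_states_*.pkl")))
    return report


def latest_checkpoint(checkpoints_dir):
    """Return the checkpoint-{step} directory with the highest step, or None.

    Assumption: {step} is a decimal integer; other names are ignored.
    """
    best_step, best_path = -1, None
    for path in Path(checkpoints_dir).glob("checkpoint-*"):
        match = re.fullmatch(r"checkpoint-(\d+)", path.name)
        if path.is_dir() and match and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), path
    return best_path
```

Calling `describe_checkpoint(latest_checkpoint("logs/<run_name>_<timestamp>/checkpoints"))` on a finished run would then show, for example, whether optimizer states are available for resuming and how many per-rank random-state files were saved.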
