Commit 9812b96
Cache already downloaded HuggingFace shards.
Currently, shards seem to be redownloaded every time they are required,
causing slowdowns in conversion. Running the script with the changes
shows significant improvements.
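The caching idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `ShardCache`, `download_fn`, and `fake_download` are hypothetical names, and the fake download only stands in for a real fetch (in a real script it might wrap something like `huggingface_hub.hf_hub_download`, which is an assumption here).

```python
from typing import Callable, Dict, List


class ShardCache:
    """Remember the local path of every shard that was already fetched."""

    def __init__(self, download_fn: Callable[[str], str]) -> None:
        # download_fn is a hypothetical callable: shard name -> local path.
        self._download_fn = download_fn
        self._paths: Dict[str, str] = {}

    def get(self, shard_name: str) -> str:
        # Download only on the first request; reuse the stored path afterwards.
        if shard_name not in self._paths:
            self._paths[shard_name] = self._download_fn(shard_name)
        return self._paths[shard_name]


# Stand-in for a real download; it records each call so reuse is visible.
download_calls: List[str] = []


def fake_download(shard_name: str) -> str:
    download_calls.append(shard_name)
    return f"/tmp/shards/{shard_name}"


cache = ShardCache(fake_download)
first = cache.get("model-00001-of-00002.safetensors")
second = cache.get("model-00001-of-00002.safetensors")
print(len(download_calls))  # the shard was requested twice but fetched once
```

Keeping the cache in a small object (rather than a module-level dict) makes its lifetime explicit: it lasts for one conversion run and is dropped afterwards.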
Benchmark: 2-Layer Qwen3 MoE Checkpoint Conversion (Lazy Loading Enabled)
| Metric | Baseline (Cached) | Optimized (Phase 1 Only) | Speedup |
|------------------------------|-------------------|--------------------------|----------|
| Sharding (Materialization) | 81.6s (1.36 min) | 16.2s (0.27 min) | **5.0x** |
| Overall Elapsed | 83.4s (1.39 min) | 17.4s (0.29 min) | **4.8x** |
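The speedup column is simply baseline time divided by optimized time. A quick check of the table's figures, with `speedup` as a hypothetical helper name:

```python
def speedup(baseline_s: float, optimized_s: float) -> float:
    """Ratio of baseline to optimized wall-clock time."""
    return baseline_s / optimized_s


sharding = speedup(81.6, 16.2)  # Sharding (Materialization) row
overall = speedup(83.4, 17.4)   # Overall row
print(f"{sharding:.1f}x {overall:.1f}x")  # prints "5.0x 4.8x"
```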
Integration Tests (tests/integration/checkpoint_conversion_test.py):
- Baseline: 148.73s (2:28)
- Optimized: 77.33s (1:17) -> **1.9x speedup overall** (includes model download)

1 parent a8be563 commit 9812b96
1 file changed
Lines changed: 16 additions & 10 deletions
| Hunk | Original lines | New lines | Additions | Deletions |
|---|---|---|---|---|
| 1 | 116-121 | 116-123 | 2 | 0 |
| 2 | 183-199 | 185-205 | 14 | 10 |