Commit fa0a2c5
fix(lm): retry HTTP 408 from the HF CDN in the hub retry backend (#561)
Job 703812 died at dataloader setup when a ranged parquet read got a 408
(Request Time-out) from the HF CDN. The #557 retry backend was active but
its status_forcelist only covered 429/5xx, so urllib3 returned the 408
unretried and hf_raise_for_status raised, tearing down all 16 ranks.
408 belongs to the same transient-timeout class the backend exists to
handle, and only idempotent methods are retried, so adding it is safe.
Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
1 parent c62531d commit fa0a2c5
1 file changed
Lines changed: 1 addition & 1 deletion
Diff content not captured. Line 7 of the changed file was replaced (1 deletion, 1 addition); lines 4-6 and 8-10 are unchanged context.
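The commit message describes the fix as adding 408 to the retry backend's `status_forcelist`. The sketch below is a simplified, stdlib-only model of that retry decision, not the repository's code and not urllib3 itself. `RetryPolicy`, `BEFORE`, and `AFTER` are hypothetical names, and the exact pre-fix status list is an assumption, since the message only says "429/5xx". The method set mirrors urllib3's documented default for `Retry.allowed_methods`; `Retry-After` handling is left out.

```python
from dataclasses import dataclass

# Idempotent methods, matching urllib3's documented default allowed_methods.
IDEMPOTENT_METHODS = frozenset({"HEAD", "GET", "PUT", "DELETE", "OPTIONS", "TRACE"})


@dataclass(frozen=True)
class RetryPolicy:
    """Hypothetical stand-in for the status/method part of a retry config."""

    status_forcelist: frozenset
    allowed_methods: frozenset = IDEMPOTENT_METHODS

    def should_retry(self, method: str, status: int) -> bool:
        # Retry only when the method is idempotent AND the status is listed.
        return method.upper() in self.allowed_methods and status in self.status_forcelist


# Assumed pre-fix list: the commit message only says "429/5xx".
BEFORE = RetryPolicy(frozenset({429, 500, 502, 503, 504}))
# Post-fix: the same list with 408 (Request Timeout) added.
AFTER = RetryPolicy(BEFORE.status_forcelist | {408})

if __name__ == "__main__":
    for name, policy in (("before", BEFORE), ("after", AFTER)):
        print(name, "GET 408 retried:", policy.should_retry("GET", 408))
```

Under this model a ranged `GET` that receives a 408 is passed through unretried by the pre-fix policy and retried by the post-fix one, while a non-idempotent `POST` is never retried in either case. That is the safety argument the commit message makes.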