Commit e0cda1b
Add layerwise calibration for large models
This PR does three things:
1. Rename sequential_calibrate to layerwise_calibrate to better describe
the layer-by-layer algorithm (use_sequential -> use_layerwise,
_seq_calib -> _layerwise_calib).
2. Make layerwise calibration performant: persistent_materialization
keeps the active layer on GPU for the entire calibration step,
and _SkipLayer replaces fully-calibrated layers with parameter-free
dummies so framework hooks (accelerate, FSDP2) skip materialization.
3. Add checkpoint save/resume so calibration of large models can be
interrupted and restarted from the last completed layer.
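
The layer-by-layer flow with save/resume described in items 2 and 3 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `layerwise_calibrate_sketch`, `calibrate_layer`, and the `progress.json` file layout are hypothetical names, and plain Python lists stand in for tensors and modules.

```python
import json
import tempfile
from pathlib import Path


def layerwise_calibrate_sketch(layers, inputs, calibrate_layer, checkpoint_dir=None):
    """Calibrate layers one at a time, resuming from the last completed layer.

    Hypothetical sketch: names and the on-disk format are assumptions,
    not the ModelOpt implementation.
    """
    start, state = 0, inputs
    progress = Path(checkpoint_dir) / "progress.json" if checkpoint_dir else None

    if progress is not None and progress.exists():
        saved = json.loads(progress.read_text())
        # Refuse to resume against a model with a different number of layers.
        if saved["num_layers"] != len(layers):
            raise RuntimeError(
                f"checkpoint has {saved['num_layers']} layers, model has {len(layers)}"
            )
        start, state = saved["next_layer"], saved["activations"]

    for idx in range(start, len(layers)):
        # Only the active layer is processed; its outputs feed the next layer.
        state = calibrate_layer(layers[idx], state)
        if progress is not None:
            # Record progress after every completed layer so a restart can skip it.
            progress.write_text(
                json.dumps(
                    {"num_layers": len(layers), "next_layer": idx + 1, "activations": state}
                )
            )
    return state
```

Here each "layer" could be as simple as a scale factor with `calibrate_layer = lambda layer, xs: [x * layer for x in xs]`; an interrupted run followed by a second call with the same `checkpoint_dir` picks up at the first layer that has not been completed.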
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: realAsma <akuriparambi@nvidia.com>
Move checkpoint_dir helpers from library to examples/llm_ptq
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: realAsma <akuriparambi@nvidia.com>
Rename layerwise config fields and enable layerwise on experts-only recipe
- use_layerwise -> layerwise, checkpoint_dir -> layerwise_checkpoint_dir
- Enable layerwise calibration + checkpointing on nvfp4_experts_only-fp8_kv recipe
- Add layerwise_checkpoint_dir to nvfp4_default-none_kv_gptq recipe
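
For anyone carrying older configs across the rename, the field mapping above could be applied with a small helper like the one below. This is only a sketch: `migrate_algorithm_config` and `_RENAMED_FIELDS` are hypothetical names that do not come from this commit, and the assumption that the fields live in a flat dict is illustrative.

```python
# Old field name -> new field name, as listed in the commit message.
_RENAMED_FIELDS = {
    "use_layerwise": "layerwise",
    "checkpoint_dir": "layerwise_checkpoint_dir",
}


def migrate_algorithm_config(algorithm):
    """Return a copy of an algorithm config dict with renamed keys (hypothetical helper)."""
    # Non-dict values (for example a plain algorithm name string) pass through unchanged.
    if not isinstance(algorithm, dict):
        return algorithm
    return {_RENAMED_FIELDS.get(key, key): value for key, value in algorithm.items()}
```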
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: realAsma <akuriparambi@nvidia.com>
Address PR review feedback for layerwise calibration
- Add inline security comments for all torch.load(weights_only=False) calls
- Replace bare assert with RuntimeError for unsupported offload hook layout
- Write back buffers (not just parameters) in _writeback_params_to_weights_map
- Add cross-field validator rejecting layerwise_checkpoint_dir without layerwise=True
- Validate num_layers mismatch on checkpoint resume
- Handle integer device ordinals in _get_execution_device_from_hook
- Clean up stale layer artifacts in partial-checkpoint tests
- Guard non-dict algorithm values in needs_checkpoint_path_update
- Add comment explaining dummy output_meta for last layer
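
The cross-field check mentioned above (rejecting `layerwise_checkpoint_dir` without `layerwise=True`) might look roughly like this. The sketch uses a standard-library dataclass purely for illustration; `CalibConfigSketch` is a hypothetical class, and the actual config class and validation mechanism in the repository may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CalibConfigSketch:
    """Hypothetical stand-in for the calibration config; not the real class."""

    layerwise: bool = False
    layerwise_checkpoint_dir: Optional[str] = None

    def __post_init__(self):
        # A checkpoint directory only makes sense when layerwise calibration is on.
        if self.layerwise_checkpoint_dir is not None and not self.layerwise:
            raise ValueError("layerwise_checkpoint_dir requires layerwise=True")
```

Raising at construction time surfaces the misconfiguration before any calibration work starts, rather than silently ignoring the directory.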
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: realAsma <akuriparambi@nvidia.com>
1 parent 361f7e3 commit e0cda1b
26 files changed (+1915, -503)
- examples/llm_ptq
- modelopt_recipes/general/ptq
- modelopt/torch
  - quantization
    - plugins
    - utils
  - utils
- tests
  - gpu/torch/quantization
    - plugins
  - unit/torch/quantization
    - plugins