Commit f3dd299

Add a tutorial for a mip sweeper.
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
1 parent a68963a commit f3dd299

3 files changed

Lines changed: 33 additions & 0 deletions

examples/puzzletron/README.md

Lines changed: 30 additions & 0 deletions
@@ -197,6 +197,36 @@ block_13: attention no_op ffn intermediate_11520
block_14: attention no_op ffn intermediate_3072
```

### MIP Sweep Mode

The **MIP sweep mode** lets you explore multiple memory compression rates in a single run and compare the accuracy-memory trade-offs.

#### Quick Start

1. Enable sweep in your config YAML (e.g., `llama-3_1-8B_pruneffn_memory.yaml`):

```yaml
mip:
  sweep:
    enabled: true
    memory_compression_rates: [0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
    output_csv: ${puzzle_dir}/mip_sweep_results.csv
```

2. Run the sweep:

```bash
torchrun --nproc_per_node 2 examples/puzzletron/main.py --config examples/puzzletron/configs/llama-3_1-8B_pruneffn_memory/llama-3_1-8B_pruneffn_memory.yaml --mip-only 2>&1 | tee ./log.txt | grep "Puzzletron Progress"
```

3. View results: The CSV file contains compression rates, memory usage, and accuracy metrics for each configuration.
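
Once the sweep has written its CSV, a few lines of Python are enough to pick a configuration from it. The sketch below is illustrative only: the column names `compression_rate` and `token_accuracy` are assumptions (the README does not list the actual header), so check your own file and adjust the two constants. The helper names are hypothetical as well and are not part of Puzzletron.

```python
import csv
from pathlib import Path

# Hypothetical column names: check the header of your own CSV and adjust.
RATE_COL = "compression_rate"
ACCURACY_COL = "token_accuracy"


def load_sweep_rows(csv_path):
    """Read the sweep CSV into a list of dicts, converting numeric cells to float."""
    rows = []
    with Path(csv_path).open(newline="") as handle:
        for record in csv.DictReader(handle):
            parsed = {}
            for key, value in record.items():
                try:
                    parsed[key] = float(value)
                except (TypeError, ValueError):
                    parsed[key] = value  # keep non-numeric cells unchanged
            rows.append(parsed)
    return rows


def most_compressed_above(rows, min_accuracy):
    """Return the row with the lowest rate whose accuracy is at least min_accuracy.

    Returns None when no row reaches the accuracy threshold.
    """
    eligible = [row for row in rows if row[ACCURACY_COL] >= min_accuracy]
    return min(eligible, key=lambda row: row[RATE_COL], default=None)
```

For example, `most_compressed_above(load_sweep_rows("mip_sweep_results.csv"), 0.8)` would return the smallest-memory configuration that still meets an accuracy floor of 0.8, assuming the file sits in the current directory and uses the column names above.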

#### Example Results

![MIP Sweep Results](mip_sweep_example.png)

The plot shows how token accuracy changes with different compression rates. Higher compression (0.5 = 50% of original memory) reduces accuracy, while lower compression maintains accuracy closer to the teacher model.
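
To make the rate convention concrete, the sketch below turns each rate from the example config into a memory target, reading a rate as the fraction of the original memory that is kept (0.5 means 50%). This is only the arithmetic implied by the sentence above, not Puzzletron's internal computation. The function names, the baseline value you pass in, and the restriction of rates to the range (0, 1] are all assumptions made for illustration.

```python
# Rates copied from the example config in this README.
RATES = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0]


def target_memory(baseline_memory, compression_rate):
    """Return the memory budget for one rate, in the same unit as baseline_memory.

    Assumption: a rate is the fraction of the original memory that is kept,
    so only values in (0, 1] are accepted here.
    """
    if not 0.0 < compression_rate <= 1.0:
        raise ValueError(f"compression rate must be in (0, 1], got {compression_rate}")
    return baseline_memory * compression_rate


def sweep_targets(baseline_memory, rates=RATES):
    """Map every rate in the sweep to its memory budget."""
    return {rate: target_memory(baseline_memory, rate) for rate in rates}
```

Calling `sweep_targets(baseline)` with the teacher model's memory footprint gives one budget per sweep point, with the rate 1.0 entry equal to the baseline itself.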

## Evaluation

Once the model is ready, you can evaluate it using the [Language Model Evaluation Harness](https://pypi.org/project/lm-eval/). For example, run the following to evaluate the model on the [Massive Multitask Language Understanding](https://huggingface.co/datasets/cais/mmlu) benchmark.

examples/puzzletron/main.py

Lines changed: 3 additions & 0 deletions
@@ -150,6 +150,9 @@ def run_mip_only(hydra_config_path: str):
    # Check if sweep mode is enabled
    if hasattr(hydra_cfg.mip, "sweep") and hydra_cfg.mip.sweep.get("enabled", False):
        mprint(
            "Puzzletron Progress 7/8: running MIP sweep for multiple compression rates (multi-gpu)"
        )
        sweep.run_mip_sweep(hydra_cfg)
    else:
        # mip_and_realize_models (distributed processing)