Commit 883af35 (parent 2e43c80)

Minor M-Bridge Pruning updates

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

2 files changed: 15 additions & 4 deletions

examples/megatron_bridge/README.md

Lines changed: 14 additions & 3 deletions
@@ -18,7 +18,7 @@ This directory contains examples of using Model Optimizer with [NeMo Megatron-Br
 
 Running these examples requires many additional dependencies to be installed (e.g., Megatron-Bridge, Megatron-core, etc.), hence we strongly recommend directly using the NeMo container (e.g., `nvcr.io/nvidia/nemo:26.02`) which has all the dependencies installed.
 
-To get the latest ModelOpt features and examples, you can mount your latest ModelOpt cloned repository to the container at `/opt/Model-Optimizer` or pull the latest changes once inside the docker container (`cd /opt/Model-Optimizer && git checkout main && git pull`).
+To get the latest ModelOpt features and examples, you can mount your latest ModelOpt cloned repository to the container at `/opt/Megatron-Bridge/3rdparty/Model-Optimizer` or pull the latest changes once inside the docker container (`cd /opt/Megatron-Bridge/3rdparty/Model-Optimizer && git checkout main && git pull`).
 
 ## Pruning
 

@@ -30,17 +30,28 @@ Example usage to prune Qwen3-8B to 6B on 2-GPUs (Pipeline Parallelism = 2) while
 top-10 candidates are evaluated for MMLU score (5% sampled data) to select the best model.
 
 ```bash
-torchrun --nproc_per_node 2 /opt/Model-Optimizer/examples/megatron_bridge/prune_minitron.py \
+torchrun --nproc_per_node 2 /opt/Megatron-Bridge/3rdparty/Model-Optimizer/examples/megatron_bridge/prune_minitron.py \
     --hf_model_name_or_path Qwen/Qwen3-8B \
     --prune_target_params 6e9 \
     --hparams_to_skip num_attention_heads \
     --output_hf_path /tmp/Qwen3-8B-Pruned-6B
 ```
 
+Example usage for manually pruning to a specific architecture, using the following defaults:
+1024 samples from [`nemotron-post-training-dataset-v2`](https://huggingface.co/datasets/nvidia/Nemotron-Post-Training-Dataset-v2) for calibration.
+
+```bash
+torchrun --nproc_per_node 2 /opt/Megatron-Bridge/3rdparty/Model-Optimizer/examples/megatron_bridge/prune_minitron.py \
+    --hf_model_name_or_path Qwen/Qwen3-8B \
+    --prune_export_config '{"hidden_size": 3072, "ffn_hidden_size": 9216}' \
+    --hparams_to_skip num_attention_heads \
+    --output_hf_path /tmp/Qwen3-8B-Pruned-6B
+```
+
 To see the full usage for advanced configurations, run:
 
 ```bash
-python /opt/Model-Optimizer/examples/megatron_bridge/prune_minitron.py --help
+python /opt/Megatron-Bridge/3rdparty/Model-Optimizer/examples/megatron_bridge/prune_minitron.py --help
 ```
 
 > [!TIP]
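
For context on the path change above, here is a minimal sketch of launching the NeMo container with a local ModelOpt checkout mounted at the new location; the host path and the exact `docker run` flags are illustrative assumptions, not part of this commit:

```bash
# Sketch only: mount a local ModelOpt clone at the path used by these examples.
# /path/to/Model-Optimizer is a hypothetical host path; adjust to your checkout.
docker run --gpus all -it --rm \
    -v /path/to/Model-Optimizer:/opt/Megatron-Bridge/3rdparty/Model-Optimizer \
    nvcr.io/nvidia/nemo:26.02 bash
```

Once inside the container, the `torchrun` commands above should work unchanged.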

examples/megatron_bridge/prune_minitron.py

Lines changed: 1 addition & 1 deletion
@@ -377,8 +377,8 @@ def score_func_mmlu(m):
 
 
 if __name__ == "__main__":
-    args = get_args()
     dist.setup()
+    args = get_args()
     try:
         main(args)
     finally:
