Commit fdfa85f

minor
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
1 parent ca688d1 commit fdfa85f

3 files changed, +5 -8 lines changed

CHANGELOG.rst

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ Changelog
 
 **Bug Fixes**
 
-- Fix Megatron utility functions for generation (with pipeline parallelism) and MMLU score evaluation (10x speedup).
+- Fix Megatron utility functions for generation (with pipeline parallelism) and ~10x speedup in MMLU score evaluation (by batching prefill passes).
 - Fix Minitron pruning (``mcore_minitron``) for MoE models. Importance estimation hooks were incorrectly registered for MoE modules and NAS step was hanging before this.
 - Fix TRT support for remote autotuning in ONNX Autotune from 10.16+ to 10.15+ and fix TRT versioning check to the ``trtexec`` version instead of the TRT Python API when using ``trtexec`` backend.
 

examples/megatron_bridge/prune_minitron.py

Lines changed: 3 additions & 7 deletions
@@ -299,17 +299,13 @@ def main(args: argparse.Namespace):
     match = re.fullmatch(r"mmlu_(\d+)pct", args.prune_score_func)
     if not match:
         raise ValueError(
-            f"Invalid score function: {args.prune_score_func}. "
-            "Expected format: mmlu_<N>pct (e.g. mmlu_10pct)"
+            f"Invalid score function: {args.prune_score_func}. Expected format: mmlu_<N>pct (e.g. mmlu_10pct)"
         )
-    mmlu_pct = int(match.group(1))
-    if not 0 < mmlu_pct <= 100:
-        raise ValueError("--prune_score_func percentage must be in the range [1, 100].")
-    _mmlu_frac = mmlu_pct / 100.0
+    mmlu_frac = float(match.group(1)) / 100.0
 
     def score_func(m):
         return megatron_mmlu(
-            m, tokenizer, few_shots=0, fraction=_mmlu_frac, batch_size=args.calib_mbs
+            m, tokenizer, few_shots=0, fraction=mmlu_frac, batch_size=args.calib_mbs
         )
 
     pruning_config["score_func"] = score_func
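
Review note: the parsing step on the new side of this hunk can be sketched in isolation as follows. This is a minimal sketch, not the script's own code: `parse_mmlu_fraction` is a hypothetical helper name introduced only for illustration, while the regex, the error message, and the division mirror the lines shown in the diff. Because the range check was removed from this file, out-of-range inputs such as `mmlu_0pct` or `mmlu_150pct` now parse successfully here and are only rejected later by the `assert` added in `megatron_mmlu`.

```python
import re


def parse_mmlu_fraction(score_func_name: str) -> float:
    """Hypothetical helper: turn 'mmlu_<N>pct' into the fraction N / 100."""
    match = re.fullmatch(r"mmlu_(\d+)pct", score_func_name)
    if not match:
        raise ValueError(
            f"Invalid score function: {score_func_name}. "
            "Expected format: mmlu_<N>pct (e.g. mmlu_10pct)"
        )
    # No range check at this point, mirroring the new code path in the diff.
    return float(match.group(1)) / 100.0


if __name__ == "__main__":
    print(parse_mmlu_fraction("mmlu_10pct"))   # in range
    print(parse_mmlu_fraction("mmlu_150pct"))  # parses, but is out of range
```

The trade-off in this design is that validation lives in one place (the evaluation function) at the cost of a later, less specific failure for a bad command-line value.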

modelopt/torch/utils/plugins/megatron_mmlu.py

Lines changed: 1 addition & 0 deletions
@@ -78,6 +78,7 @@ def megatron_mmlu(
         f"\nMMLU ({fraction * 100}%, {few_shots}-shot, Batch Size: {batch_size}) evaluation started...\n"
         "First batch may take longer to evaluate for Pipeline Parallel models."
     )
+    assert 0 < fraction <= 1, "Fraction must be between 0 and 1"
 
     # Token IDs for " A", " B", " C", " D" — the last subword handles edge cases.
     choice_ids = [tokenizer.encode(f" {c}", add_special_tokens=False)[-1] for c in _CHOICES]
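
Review note: the added `assert` accepts values in the half-open interval (0, 1], so 0 is rejected and 1 is allowed. The sketch below reproduces that condition in a standalone function, under the assumption that a caller might prefer an explicit `ValueError`, since `assert` statements are skipped when Python runs with the `-O` flag. `check_fraction` is a hypothetical name and is not part of `megatron_mmlu`.

```python
def check_fraction(fraction: float) -> float:
    """Hypothetical helper: enforce the same 0 < fraction <= 1 condition as the added assert."""
    if not 0 < fraction <= 1:
        # Unlike assert, this check still runs under `python -O`.
        raise ValueError(f"Fraction must be in the range (0, 1], got {fraction}")
    return fraction


if __name__ == "__main__":
    print(check_fraction(0.1))
    print(check_fraction(1))
```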
