Commit 20d3c5b
feat(utils): add pack mode to get_dataset_dataloader
`pack=False` (default) tokenizes each calibration sample with
`padding=True, truncation=True, max_length=...` — on long-document
datasets like cnn_dailymail that discards most of each article and
pads short samples up to the max, feeding calibration heavily padded
and context-impoverished batches.
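
To make the cost concrete, here is a minimal sketch of what pad-and-truncate does to already-tokenized samples. It uses plain Python lists instead of a real tokenizer; the helper name `pad_and_truncate`, the `pad_id` value, and the toy token ids are illustrative assumptions, not code from this commit.

```python
def pad_and_truncate(samples, max_length, pad_id=0):
    """Hypothetical stand-in for padding=True, truncation=True, max_length=...

    `samples` is a list of token-id lists. Returns the rectangular batch plus
    the number of real tokens dropped and the number of pad tokens added.
    """
    # Truncation: everything past max_length is thrown away.
    truncated = [s[:max_length] for s in samples]
    # Padding: every row is padded up to the longest remaining sample.
    width = max(len(s) for s in truncated)
    batch = [s + [pad_id] * (width - len(s)) for s in truncated]
    dropped = sum(len(s) - len(t) for s, t in zip(samples, truncated))
    padded = sum(width - len(t) for t in truncated)
    return batch, dropped, padded


# One long "article" (10 tokens) and two short samples (2 and 3 tokens).
samples = [list(range(1, 11)), [1, 2], [1, 2, 3]]
batch, dropped, padded = pad_and_truncate(samples, max_length=4)
print(batch)            # [[1, 2, 3, 4], [1, 2, 0, 0], [1, 2, 3, 0]]
print(dropped, padded)  # 6 3
```

In this toy batch, 6 of the 15 real tokens are discarded and 3 of the 12 batch slots hold padding.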
`pack=True` concatenates the token streams of all raw samples
(separated by `tokenizer.eos_token_id`) and slices into uniform
`max_sample_length` chunks. Long documents are kept in full (split
across chunks rather than truncated), padding tokens disappear, and
every chunk is natural-length context.
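
The packing step can be sketched as follows. This is an illustration under stated assumptions, not the implementation in this commit: the function name `pack_token_streams` is hypothetical, an EOS id is appended after every sample (including the last), and a trailing remainder shorter than one chunk is dropped. The actual code may handle both details differently.

```python
def pack_token_streams(token_lists, eos_token_id, chunk_len):
    """Concatenate token streams with EOS separators, then cut fixed-size chunks.

    Assumptions (not confirmed by the commit): EOS follows every sample, and
    any leftover tokens that do not fill a whole chunk are discarded.
    """
    stream = []
    for tokens in token_lists:
        stream.extend(tokens)
        stream.append(eos_token_id)
    n_chunks = len(stream) // chunk_len
    return [stream[i * chunk_len:(i + 1) * chunk_len] for i in range(n_chunks)]


# Three toy samples of different lengths, EOS id 0, chunk length 4.
chunks = pack_token_streams([[1, 2, 3], [4, 5], [6, 7, 8, 9]], eos_token_id=0, chunk_len=4)
print(chunks)  # [[1, 2, 3, 0], [4, 5, 0, 6], [7, 8, 9, 0]]
```

Every chunk has the same length and contains no pad tokens; a document longer than `chunk_len` simply continues into the next chunk.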
Measured on a Qwen3-8B Minitron prune to 30L/3584/11776
(cnn_dailymail, 256 samples, seq_length 512):
  pack=False: MMLU 0.486
  pack=True:  MMLU 0.544 (+5.8 pts; Megatron-Bridge ref 0.563)
Default stays False for back-compat, with a `warn_rank_0` message nudging
callers toward `pack=True`; downstream examples (hf_ptq.py, vlm_ptq.py,
Megatron-LM prune.py / quantize.py) can opt in incrementally.
Tests: extend `_FakeTokenizer` with `encode()` + `eos_token_id` and
flip `TestGetDatasetDataloaderBlending` / HF tiny-dataset tests to
`pack=True`.
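
For readers unfamiliar with this style of test double, a fake tokenizer that exposes `encode()` and `eos_token_id` might look like the sketch below. The class `ToyTokenizer` and its word-to-id scheme are hypothetical and are not the `_FakeTokenizer` from the test suite; the point is only that these two members can be provided without loading a real model.

```python
class ToyTokenizer:
    """Hypothetical test double: whitespace tokenizer with a growing vocab."""

    eos_token_id = 0  # id 0 is reserved for EOS in this toy scheme

    def __init__(self):
        self._vocab = {}

    def encode(self, text):
        """Map each whitespace-separated word to a stable positive integer id."""
        ids = []
        for word in text.split():
            if word not in self._vocab:
                self._vocab[word] = len(self._vocab) + 1
            ids.append(self._vocab[word])
        return ids


tok = ToyTokenizer()
first = tok.encode("a b a")
second = tok.encode("c a")
print(first, second)  # [1, 2, 1] [3, 1]
```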
CHANGELOG: pack entry under New Features; fused-TE-spec import fix
entry under Bug Fixes (covering Qwen3-style attention/MLP norm
loading via the new per-context rule keys).
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
1 parent df7ab63 commit 20d3c5b
3 files changed
Lines changed: 110 additions & 34 deletions
File tree
- modelopt/torch/utils
- tests/unit/torch/utils