Merge puzzletron compression algorithm #1121
Merged
106 commits
6c038f9
Add modelopt/torch/_compress CODEOWNERS
kevalmorabia97 230cee1
Merge branch 'main' into feature/compress
kevalmorabia97 54c5f0f
Remove llm_ptq example tests from CICD
kevalmorabia97 9eeee25
E2E test for the experimental compress algorithm based on https://arx…
danielkorzekwa ad1d18e
Merge branch 'main' into feature/compress
kevalmorabia97 cef3655
Add convert_llama3_config_to_decilm_config + unit test (#465)
danielkorzekwa 002b8b5
Implement nas.convert() api for the compress algorithm (#482)
danielkorzekwa 1c12fd8
modelopt nas search() implementation for the compress algorithm (#490)
danielkorzekwa f7d547f
Add decilm modelling code (#505)
danielkorzekwa 50a580c
Compress tutorial (PoC) (#492)
danielkorzekwa b121945
Add llama converter (no dependency on internal Nvidia code) - part 1/…
danielkorzekwa 866e400
llama converter is self-contained now (no dependency on internal nvid…
danielkorzekwa 0868f1c
Add integration test for attention pruning (#562)
danielkorzekwa 69726cc
Merge branch 'main' into feature/compress
kevalmorabia97 07ca24d
Merge branch 'main' into feature/compress
kevalmorabia97 1dde209
Add score_pruning_activations (step 2/6) (#563)
danielkorzekwa 2e559e7
Update README.md
kevalmorabia97 f10be0d
Add activation hooks used for pruning (#576)
danielkorzekwa 194b532
Add sewing kit and utilities used for pruning scoring - pruning scori…
danielkorzekwa 8c9cdd4
Add L2NormHook and use it in megatron.py (#599)
danielkorzekwa 1f72466
Add pruning checkpoints for the compress algorithm (#607)
danielkorzekwa 97fe7f0
Add build replacement library to the compress algorithm. (#616)
danielkorzekwa 954103e
Add subblock stats to the compress algorithm (#623)
danielkorzekwa dcc425f
Add 1-block scoring to the compress algorithm (#625)
danielkorzekwa 56d95de
Add checkpoint save/load to ForwardHook + add IterativeChannelContrib…
danielkorzekwa 74aae83
Add MIP step to the compress algorithm (#627)
danielkorzekwa a1f63bc
Merge branch 'main' into feature/compress
kevalmorabia97 a99f503
Remove unused mip functions + fix multi-gpu test (#660)
kevalmorabia97 67489f4
Fix a bug in IterativeChannelContributionHook + tools for activation …
danielkorzekwa 1d8bd20
Remove runtime.py and directly use torch dist utils + remove unused f…
kevalmorabia97 f7a0cb0
Use shared activation hooks component in the puzzle algorithm (#687)
danielkorzekwa db866d9
Clean up Puzzle Compress Tutorial (#711)
LianaMikael 2e813bf
Two bug fixes: mix checkpointing and dtype (#718)
danielkorzekwa 83ac3b1
Merge remote-tracking branch 'origin/main' into feature/compress
kevalmorabia97 0eecfc6
Fix test assertions for 2-gpu (#772)
kevalmorabia97 43b3cfa
Rename compress to puzzletron (#776)
kevalmorabia97 4c30bd5
Add NeMo Conversion Scripts to Puzzletron (#784)
LianaMikael 96bb0ba
Merge branch 'main' into feature/compress
kevalmorabia97 8c84fee
[CI] Update to only run puzzletron tests
kevalmorabia97 5812777
Merge branch 'main' into feature/puzzletron
kevalmorabia97 5f77c81
Pin torchprofile==0.0.4 to fix CI
kevalmorabia97 82df595
Add anymodel-core to feature/puzzletron (#974)
danielkorzekwa 4dc9932
Draft: anymodel activation scoring (#989)
danielkorzekwa d358eb3
Draft: Merge anymodel pruning (#990)
danielkorzekwa 8e827f3
Draft: Merging anymodel:build_library_and_stats (#993)
danielkorzekwa eb4b210
Draft: merge any model calc one block scores (#994)
danielkorzekwa 8fe318d
Draft: merge any_model: mip_and_realize_models (#995)
danielkorzekwa 2fbdf0e
Update uv.lock for nspect puzzletron scanning
kevalmorabia97 1b42f0b
Dkorzekwa/any model other models (#1007)
danielkorzekwa 67999eb
Dkorzekwa/anymodel gptoss (#1020)
danielkorzekwa 660dc17
Merge any_model tutorial (#1035)
danielkorzekwa 01cba6a
Merge mbridge distillation for any_model (#1036)
danielkorzekwa 2b6572c
MR branch for the remaining difference between dkorzekwa/any_model an…
danielkorzekwa 110316a
Dkorzekwa/decilm hf code cleanup (#1071)
danielkorzekwa 4190275
Dkorzekwa/decilm hf code cleanup 2 (#1073)
danielkorzekwa 0708ca2
Dkorzekwa/anymodel subblock stats (#1085)
danielkorzekwa 3193f30
Dkorzekwa/anymodel subblock stats nodecilm (#1102)
danielkorzekwa 928036e
Dkorzekwa/decilm cleanup post subblockstats (#1103)
danielkorzekwa e508b76
code clean up (#1110)
danielkorzekwa f460d16
Merge branch 'main' into feature/puzzletron
kevalmorabia97 2f55c73
Dkorzekwa/puzzletron use importance hooks from prune (#1115)
danielkorzekwa c5ec50b
Merge remote-tracking branch 'origin/main' into feature/puzzletron
kevalmorabia97 d257871
Merge branch 'main' into feature/puzzletron
kevalmorabia97 7e15fdd
Revert CICD and other config changes
kevalmorabia97 d0209dc
Make Qwen and QwenVL descriptor generic so can be used for other vari…
kevalmorabia97 d987bad
Set strict=True in distill_hf export
kevalmorabia97 75651cc
add basic ruff fixes
kevalmorabia97 03118ce
Apply coderabbit suggestions
kevalmorabia97 2a170b9
Set weights_only=True in checkpoint_utils.py
kevalmorabia97 d6f8ddb
More fixes
kevalmorabia97 4621b65
reuse puzzletron tokenizer in other tests
kevalmorabia97 be4bd3a
disable puzzletron in coverage check as its covered in gpu tests only
kevalmorabia97 45426ca
Remove custom DistillationProvider and simplify mbridge distillation …
kevalmorabia97 5429d86
Merge branch 'main' into feature/puzzletron
kevalmorabia97 41b8ca7
fix test
kevalmorabia97 33b9230
Merge branch 'main' into feature/puzzletron
kevalmorabia97 25266b8
fix hydra config dtype resolution in puzzletron validation tools (#1202)
j-rausch fd5694d
Consolidate lm-eval scripts: merge AnyModel auto-detection into lm_ev…
j-rausch dedcad0
Merge remote-tracking branch 'origin/main' into feature/puzzletron
kevalmorabia97 d0cdbfd
minor cleanup
kevalmorabia97 ee01ace
Move block_config out of deci_lm_hf_code folder
kevalmorabia97 7ce3332
Fix critical bugs flagged by codeRabbit in PR #1121
kevalmorabia97 c7700a9
Fix critical and major bugs flagged by codeRabbit in PR #1121
kevalmorabia97 9f3cc2d
Fix minor bugs flagged by codeRabbit in PR #1121
kevalmorabia97 66fccd2
fix decoder_layer_cls failure on trust_remote_code models (#1222)
j-rausch 7053c61
fix puzzletron container test path; add NeMo setup docs (#1231)
j-rausch 05c6d3b
Add MoE/Nemotron fixes to support Transformers 5.5
kevalmorabia97 0d6eb7e
Update changelog
kevalmorabia97 ac8397b
Refactor puzzletron imports: relative imports, public API, logger fix
kevalmorabia97 977d60a
Merge remote-tracking branch 'origin/main' into feature/puzzletron
kevalmorabia97 01a4e55
Fix custom tiny tokenizer
kevalmorabia97 f361ca6
Fix `RuntimeError: pidfd_getfd: Operation not permitted`
kevalmorabia97 a7eedf8
Add __all__ for modules
kevalmorabia97 ed5fd68
Fix test_puzzletron assertions for transformers v5.5
kevalmorabia97 547e76d
Fix doc building
kevalmorabia97 62070ae
Fix Qwen2.5 test assertion as per CI machine
kevalmorabia97 38d9522
Address coderabbit comments
kevalmorabia97 6395b1e
copy custom modeling files to pruned checkpoint dirs (#1245)
j-rausch d88dfcb
consolidate mbridge distillation: merge distill_hf.py into distill.py…
j-rausch 06eaf74
Merge branch 'main' into feature/puzzletron
kevalmorabia97 3f41819
Address minor coderabbit comments
kevalmorabia97 ad8cf9a
fix lm-eval version conflict in puzzletron requirements (#1257)
j-rausch 5e4c43e
fix hybrid model subblock param counting: all FFN sizes reported iden…
j-rausch 2345af7
Merge branch 'main' into feature/puzzletron
kevalmorabia97 e0bb89d
Fix test
kevalmorabia97 47a612e
Fix test path
kevalmorabia97