Support multimethod in export_llama_lib #17231
meta-codesync[bot] merged 11 commits into gh/lucylq/134/base
Conversation
TODO: add CI test. Differential Revision: [D92315602](https://our.internmc.facebook.com/intern/diff/D92315602/) [ghstack-poisoned]
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17231
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure, 4 Unrelated Failures as of commit 4aa2372 with merge base aa2f683.
NEW FAILURE - The following job has failed:
FLAKY - The following jobs failed but were likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This PR needs a
Pull Request resolved: #17231

TODO: add CI test.

Note: multimethod export is currently limited to:
- xnnpack or portable lib
- only lora (does not support arbitrary nn.Modules in each method)
- if quant is enabled, lora models must share quant schemes at source transformation time
- no pt2e quant, as each model could have slightly different results after calibration

ghstack-source-id: 338339885
@exported-using-ghexport
Differential Revision: [D92315602](https://our.internmc.facebook.com/intern/diff/D92315602/)
Pull request overview
This pull request adds multimethod export support to the Llama export library, enabling the export of multiple methods (base model and LoRA variants) into a single .pte file. This is part of a stack of PRs (#17228-#17231) that collectively add multimethod and LoRA support to the ExecuTorch runtime.
Changes:
- Added `_export_llama_multimethod()` function to handle exporting multiple methods to a single .pte file
- Added helper functions `_get_xnnpack_partitioners()` and `_get_output_filename()` to support multimethod export
- Added validation logic to ensure multimethod export only works with the XNNPACK backend or portable ops
- Added a configuration file for Qwen3 multimethod export with LoRA
- Added a comprehensive test script for validating multimethod export functionality
- Updated build dependencies to include required tokenizer and weight conversion modules
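The validation described above can be sketched as a small check; this is a hypothetical illustration of the constraint (multimethod export supports only XNNPACK or portable ops), and the function name and signature are assumptions, not the actual export_llama_lib code.

```python
def validate_multimethod_backends(requested_backends: list) -> None:
    """Reject backends other than XNNPACK; an empty list means portable ops only.

    Hypothetical sketch of the constraint described in the PR, not the real
    implementation in examples/models/llama/export_llama_lib.py.
    """
    unsupported = [b for b in requested_backends if b != "xnnpack"]
    if unsupported:
        raise ValueError(
            "Multimethod export currently supports only the XNNPACK backend "
            f"or portable ops; got unsupported backends: {unsupported}"
        )

validate_multimethod_backends(["xnnpack"])  # ok
validate_multimethod_backends([])           # ok: portable ops only
```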
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file

| File | Description |
|---|---|
| `examples/models/llama/export_llama_lib.py` | Core implementation of multimethod export logic, including validation, helper functions, and the main export function |
| `examples/models/qwen3/config/qwen3_multimethod.yaml` | Configuration file demonstrating multimethod export with LoRA and base methods using environment variable interpolation |
| `examples/models/llama/BUCK` | Added convert_weights.py to export library sources for proper dependency resolution |
| `examples/models/llama/runner/targets.bzl` | Added regex_lookahead tokenizer dependency for enhanced tokenizer support |
| `.ci/scripts/test_lora_multimethod.sh` | Comprehensive test script validating both LoRA and base method execution with expected output verification |
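The table above mentions a multimethod config using environment variable interpolation. A hypothetical sketch of what such a config might look like follows; the field names and `${oc.env:...}` OmegaConf-style interpolation are assumptions for illustration, not the contents of the actual qwen3_multimethod.yaml.

```yaml
# Hypothetical structure; see examples/models/qwen3/config/qwen3_multimethod.yaml
# in the repo for the real file.
base:
  checkpoint: ${oc.env:QWEN3_CHECKPOINT}   # resolved from the environment
multimethod:
  base_method: {}                          # base model, no adapter
  lora_method:
    adapter_checkpoint: ${oc.env:LORA_ADAPTER}
```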
kimishpatel left a comment:
Review automatically exported from Phabricator review in Meta.
Pull Request resolved: #17231

Note: multimethod export is currently limited to:
- xnnpack or portable lib
- only lora (does not support arbitrary nn.Modules in each method)
- if quant is enabled, lora models must share quant schemes at source transformation time
- no pt2e quant, as each model could have slightly different results after calibration

Changes:
1. Add MultimethodLoraConfig to yaml
2. Deepcopy yaml config. Move each lora_config into base.lora_config.
3. Create and export the model
4. Repeat 2-3 for each method.
5. Pass a dict of method_name: ep to `to_edge_transform_and_lower`

ghstack-source-id: 340935057
@exported-using-ghexport
Differential Revision: [D92315602](https://our.internmc.facebook.com/intern/diff/D92315602/)
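The deepcopy-and-export loop described in those steps can be sketched as below. This is a minimal stand-in: `export_method` here is a placeholder for the real flow, which exports each model with torch.export and then passes the resulting dict of method_name to ExportedProgram into executorch's `to_edge_transform_and_lower`; the config shape and function names are assumptions for illustration.

```python
import copy

def export_method(config: dict) -> str:
    # Stand-in for torch.export; returns a placeholder "exported program"
    # tagged with the lora_config that was moved into base.lora_config.
    return f"ep({config['base']['lora_config']})"

def export_multimethod(yaml_config: dict) -> dict:
    method_programs = {}
    for method_name, lora_config in yaml_config["multimethod_lora_configs"].items():
        # Step 2: deepcopy the yaml config and move this method's lora_config
        # into base.lora_config so the single-method export path can run unchanged.
        cfg = copy.deepcopy(yaml_config)
        cfg["base"]["lora_config"] = lora_config
        # Steps 3-4: create and export the model for this method.
        method_programs[method_name] = export_method(cfg)
    # Step 5: the real flow passes this dict of method_name -> ExportedProgram
    # to to_edge_transform_and_lower to produce a single .pte file.
    return method_programs

cfg = {
    "base": {"lora_config": None},
    "multimethod_lora_configs": {"base_method": None, "lora_method": "adapter.pt"},
}
print(sorted(export_multimethod(cfg)))  # -> ['base_method', 'lora_method']
```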
Merged commit 943e869 into gh/lucylq/134/base
This PR was created by the merge bot to help merge the original PR into the main branch.

ghstack PR number: #17231 by @lucylq ^ Please use this as the source of truth for the PR details, comments, and reviews
ghstack PR base: https://github.com/pytorch/executorch/tree/gh/lucylq/134/base
ghstack PR head: https://github.com/pytorch/executorch/tree/gh/lucylq/134/head
Merge bot PR base: https://github.com/pytorch/executorch/tree/gh/lucylq/133/orig
Merge bot PR head: https://github.com/pytorch/executorch/tree/gh/lucylq/134/orig
Differential Revision: [D92315602](https://our.internmc.facebook.com/intern/diff/D92315602/)
@diff-train-skip-merge

Co-authored-by: Github Executorch <github_executorch@arm.com>
Co-authored-by: lucylq <lfq@meta.com>
Stack from ghstack (oldest at bottom):
TODO: add CI test.
Differential Revision: D92315602