exir: add flatbuffer Program serialization for performance #17333
JacobSzwejbka merged 12 commits into
Pull request overview
Adds a new FlatBuffer-based serializer for Program and switches the default serialization path to it for major performance/memory improvements, while keeping a JSON-based fallback for robustness.
Changes:
- Introduces direct Python FlatBuffer packing for `Program` (`_flatbuffer_program.py`) and uses it by default with JSON fallback + logging.
- Adds tooling/docs + CI for regenerating FlatBuffer Python bindings from `schema/*.fbs`.
- Adds unit tests + BUCK targets for FlatBuffer serialization behavior.
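The "fast path by default, JSON fallback + logging" behavior listed above could look roughly like the sketch below. The names `serialize_with_fallback`, `fast_serialize`, and `json_serialize` are hypothetical stand-ins and not the functions in this PR; only the try / log / fall back pattern comes from the description.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger(__name__)


def serialize_with_fallback(
    program: Any,
    fast_serialize: Callable[[Any], bytes],
    json_serialize: Callable[[Any], bytes],
) -> bytes:
    """Try the fast serializer first; on any failure, log it and fall back.

    Both callables are hypothetical: they stand in for a direct FlatBuffer
    packer and a JSON-based serializer respectively.
    """
    try:
        return fast_serialize(program)
    except Exception:
        # Assumed policy: record the full traceback, then use the slow path
        # so that a bug in the fast path never blocks serialization.
        logger.exception("Fast FlatBuffer path failed; falling back to JSON path")
        return json_serialize(program)
```

Catching a broad `Exception` here is a deliberate choice for the sketch: the fallback is only useful if it also covers failure modes nobody anticipated, and the logged traceback keeps those failures visible.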
Reviewed changes
Copilot reviewed 11 out of 54 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| schema/README.md | Documents how to regenerate committed FlatBuffer Python bindings. |
| exir/_serialize/test/test_flatbuffer_program.py | Adds unit tests for roundtrip and parity between direct vs JSON-based FlatBuffer generation. |
| exir/_serialize/test/BUCK | Registers the new unit test target. |
| exir/_serialize/generate_program.py | Adds a generator script to run flatc and rewrite imports/init files for vendored bindings. |
| exir/_serialize/_program.py | Switches default serialization to FlatBuffer fast path with JSON fallback + logging. |
| exir/_serialize/_flatbuffer_program.py | Implements direct FlatBuffer packing for Program without invoking flatc. |
| exir/_serialize/_flatbuffer.py | Extends schema prep to extract file_identifier and effective alignment values. |
| exir/_serialize/BUCK | Exposes the new serializer module in the library target. |
| .lintrunner.toml | Excludes generated FlatBuffer bindings from lintrunner. |
| .github/workflows/validate_flatbuffer_gen.yml | Adds CI workflow intended to validate generated bindings are up to date. |
| .flake8 | Excludes generated FlatBuffer bindings from flake8. |
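As an illustration of the `_flatbuffer.py` row above, extracting a `file_identifier` and `force_align` values from schema text could be sketched with regular expressions as follows. This is an assumption about the general approach, not the PR's code; the helper names and the sample schema are hypothetical, and only the `file_identifier "...";` and `force_align: N` schema syntax is standard FlatBuffers.

```python
import re
from typing import List, Optional

# Matches a schema line such as:  file_identifier "AB12";
_FILE_ID_RE = re.compile(r'^\s*file_identifier\s+"([^"]*)"\s*;', re.MULTILINE)
# Matches attributes such as:  (force_align: 16)
_FORCE_ALIGN_RE = re.compile(r"force_align\s*:\s*(\d+)")


def extract_file_identifier(schema_text: str) -> Optional[str]:
    """Return the declared file identifier, or None if the schema has none."""
    match = _FILE_ID_RE.search(schema_text)
    return match.group(1) if match else None


def extract_force_align_values(schema_text: str) -> List[int]:
    """Return every force_align value in the schema, in order of appearance."""
    return [int(value) for value in _FORCE_ALIGN_RE.findall(schema_text)]


# Hypothetical schema fragment, used only to exercise the helpers.
SAMPLE_SCHEMA = (
    "namespace demo;\n"
    'file_identifier "AB12";\n'
    "table Buffer { storage: [ubyte] (force_align: 16); }\n"
    "root_type Buffer;\n"
)
```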
Pull request overview
Copilot reviewed 11 out of 54 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
exir/_serialize/_flatbuffer_program.py:1
- The shebang line '#!/usr/bin/env fbpython' is specific to Facebook's internal Python environment. For an open source repository, this should be either removed or changed to a standard Python shebang like '#!/usr/bin/env python3'.
# Copyright 2026 Arm Limited and/or its affiliates.
This is the updated version of PR #16691.

        return _BackendDelegateInlineData.BackendDelegateInlineDataEnd(builder)


    def _install_fast_packers() -> None:

Does this make a big difference in serialization speed?
Yes, it gives a very big speed improvement.
A quick test on resnet18 shows an improvement from ~3.54 sec to 0.07 sec (~50x).
Wow, that's a big speed up!
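For anyone wanting to reproduce a comparison like the one quoted above, a minimal timing harness might look like this. It is a generic sketch, not the script used for the resnet18 measurement; `time_call` and `speedup` are hypothetical helper names, and the callables you pass in would wrap whichever serializers you want to compare.

```python
import time
from typing import Any, Callable


def time_call(fn: Callable[[], Any], repeats: int = 5) -> float:
    """Return the best wall-clock time in seconds over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """Return how many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds
```

With the figures from the thread, `speedup(3.54, 0.07)` is about 50.6, which matches the "~50x" quoted above.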
Pull request overview
Copilot reviewed 11 out of 54 changed files in this pull request and generated 5 comments.
96943c2 to 247294d
Pull request overview
Copilot reviewed 13 out of 56 changed files in this pull request and generated 2 comments.

    def _convert_evalue(val: EValue) -> Any:
        result = _flatbuffer_t_class(EValue)()
        union_val = val.val
        if union_val is None:
            result.valType = _KernelTypes.KernelTypes.NONE
            result.val = None
            return result
        union_name = type(union_val).__name__
        result.valType = getattr(_KernelTypes.KernelTypes, union_name)
        result.val = _convert_value(union_val)
        return result


    def _convert_instruction(val: Instruction) -> Any:
        result = _flatbuffer_t_class(Instruction)()
        union_val = val.instr_args
        if union_val is None:
            result.instrArgsType = _InstructionArguments.InstructionArguments.NONE
            result.instrArgs = None
            return result
        union_name = type(union_val).__name__
        result.instrArgsType = getattr(
            _InstructionArguments.InstructionArguments, union_name
        )
        result.instrArgs = _convert_value(union_val)
        return result

The functions _convert_evalue and _convert_instruction use type(union_val).__name__ to get the union type name and then use getattr() to retrieve the corresponding enum value. If the union contains a type that doesn't have a corresponding enum value in _KernelTypes.KernelTypes or _InstructionArguments.InstructionArguments, this will raise an AttributeError at runtime. Consider adding error handling or validation to provide a more informative error message if an unexpected type is encountered.
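One way to act on this suggestion is to wrap the `getattr()` lookup so an unknown union member produces a descriptive error. The sketch below is self-contained and uses hypothetical stand-ins (`KernelTypesDemo`, `Int`, `Unknown`, `lookup_union_tag`) rather than the generated bindings; it only illustrates the error-handling pattern.

```python
from typing import Any


class KernelTypesDemo:
    """Hypothetical stand-in for a generated FlatBuffers union enum class."""

    NONE = 0
    Int = 1
    Bool = 2


class Int:
    """Hypothetical union member that has a matching enum entry."""


class Unknown:
    """Hypothetical union member with no matching enum entry."""


def lookup_union_tag(enum_cls: type, union_val: Any) -> int:
    """Map a union value to its enum tag by class name, failing loudly."""
    union_name = type(union_val).__name__
    try:
        return getattr(enum_cls, union_name)
    except AttributeError:
        # List the public names on the enum class so the caller can see
        # which union members are actually supported.
        valid = sorted(n for n in vars(enum_cls) if not n.startswith("_"))
        raise TypeError(
            f"Unsupported union member {union_name!r} for {enum_cls.__name__}; "
            f"expected one of {valid}"
        ) from None
```

Raising `TypeError` instead of letting the bare `AttributeError` escape is a design choice for the sketch: the message then names both the offending type and the accepted members.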

        return val


    def convert_program(val: Program) -> ProgramT:

The function convert_program is publicly exposed (no underscore prefix) but lacks documentation and type annotations for its return type. While it's used internally, if it's intended to be part of the public API, it should have proper documentation. If it's meant to be internal, it should be prefixed with an underscore.
Suggested change:

    def convert_program(val: Program) -> ProgramT:
        """
        Convert a :class:`executorch.exir.schema.Program` instance into its
        corresponding FlatBuffers table type (:class:`ProgramT`).
        This helper is used when serializing a high-level Program schema object
        into the low-level FlatBuffers representation expected by the generated
        executorch flatbuffer bindings.
        Args:
            val: The high-level Program object to convert.
        Returns:
            The corresponding ProgramT instance suitable for FlatBuffers
            serialization.
        """

Pulling internally again. Will play around with the bitwise differences to make sure they all check out.

    @@ -19,8 +20,8 @@
     from executorch.exir._serialize._flatbuffer import (
         _FlatbufferResult,
         _program_flatbuffer_to_json,

This is used in the deserialize_pte_binary function, which still relies on JSON:

    program: Program = _json_to_program(
        _program_flatbuffer_to_json(program_data[:program_size])
    )

As I understand it, deserialize_pte_binary is used only for testing and tooling, not in production code. However, since it is part of the public API, I am not entirely sure whether that assumption is correct.
Please let me know if you would like me to remove _program_flatbuffer_to_json from deserialize_pte_binary, or even remove it entirely from the codebase.
Alternatively, let me know if you think it is fine to keep it as it is.
We should keep deserialize_pte_binary - is it possible to deserialize using the new python path instead of json?
That is not a small change.
I will open another PR for this.
It's all good on my side. Just need the merge conflict to be resolved. I'm happy if you want to take a look at deserialize in a follow-up; I don't think it's blocking for this one. @chizkiyahu
Signed-off-by: Chizkiyahu Raful <chizkiyahu.raful@arm.com> Change-Id: Ie72323e17bcab7999bed89702bc42dc9df665b93
Change-Id: I23787d086aa9229f5d93d1e16c6f6297119e7f5a Signed-off-by: Chizkiyahu Raful <chizkiyahu.raful@arm.com>
Done.
Pull request overview
Copilot reviewed 13 out of 56 changed files in this pull request and generated 2 comments.

    @functools.lru_cache(maxsize=1)
    def _install_fast_packers() -> None:
        _Buffer.BufferT.Pack = _pack_buffer
        _BackendDelegateInlineData.BackendDelegateInlineDataT.Pack = (
            _pack_backend_delegate_inline_data
        )

The _install_fast_packers function is decorated with @functools.lru_cache(maxsize=1), which permanently monkey-patches the Pack methods on BufferT and BackendDelegateInlineDataT class objects the first time it is called. Because these are class-level mutations (not instance-level), the patched Pack method will be active for all subsequent uses of those classes throughout the entire process lifetime — including any parallel uses, tests that import these classes independently, or any code path that calls BufferT.Pack or BackendDelegateInlineDataT.Pack without going through _program_to_flatbuffer. This is a global side effect that is never undone, which makes the code fragile and hard to reason about in concurrent or test environments.

    def _pack_buffer(self: Any, builder: Any) -> int:
        storage = 0
        if self.storage is not None:
            storage = _create_aligned_byte_vector(
                builder, _coerce_bytes(self.storage), _BUFFER_ALIGNMENT.get()
            )
        _Buffer.BufferStart(builder)
        if storage:
            _Buffer.BufferAddStorage(builder, storage)
        return _Buffer.BufferEnd(builder)

The _pack_buffer function checks if storage: (line 97) to decide whether to call BufferAddStorage. Since storage is a FlatBuffers offset (an integer), this check is falsy when the offset is 0. A FlatBuffers offset of 0 would only occur for an empty vector and 0 is not a valid table offset in practice, but this subtle reliance on the offset never being 0 for a present field is fragile. Using an explicit if self.storage is not None: guard mirroring the outer check would be clearer and safer. The same issue applies to _pack_backend_delegate_inline_data at line 109.
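The difference between the two guards can be shown in isolation. The helpers below are hypothetical and do not touch FlatBuffers at all; they only contrast a truthiness check on an integer offset with an explicit `is not None` check on the source field.

```python
from typing import Optional


def should_add_field_truthy(offset: int) -> bool:
    """Mirror of `if storage:`; an offset of 0 is treated as 'absent'."""
    return bool(offset)


def should_add_field_explicit(payload: Optional[bytes]) -> bool:
    """Mirror of `if self.storage is not None:`; only None means 'absent'."""
    return payload is not None
```

The explicit form decides presence from the field itself, so it gives the same answer for an empty payload as for a non-empty one and does not depend on what offset the builder happens to return.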
Just wanted to say thanks again @chizkiyahu. This is an awesome change!
This continues the work from pytorch#17333. Change-Id: I35ac4cd5f6430ea89939453344c13e056b5c746c Signed-off-by: Chizkiyahu Raful <chizkiyahu.raful@arm.com>
…7333)

## **Title**
Introduce flatbuffer serializer as default for program serialization

## Summary
Add a FlatBuffers-based serializer at `exir/_serialize/_flatbuffer_program.py` and make it the default serializer for program artifacts. The new serializer achieves large efficiency gains across our model suite while retaining robustness via a JSON fallback and detailed error logging.

### Before
python -> json -> binary

### After
python -> binary

## Performance
Measured across ~600 models:
* ~36.3× average speedup (serialize latency).
* ~61.6% average reduction in memory usage during serialization.

## Tests
Unit tests added at:
* `exir/_serialize/test/test_flatbuffer_program.py`

cc @digantdesai @SS-JIA @freddan80 @per @zingo @oscarandersson8218 @mansnils @Sebastian-Larsson @robell

---------

Signed-off-by: Chizkiyahu Raful <chizkiyahu.raful@arm.com>
Co-authored-by: Jacob Szwejbka <jakeszwe@meta.com>
Title
Introduce flatbuffer serializer as default for program serialization
Summary
Add a FlatBuffers-based serializer at `exir/_serialize/_flatbuffer_program.py` and make it the default serializer for program artifacts. The new serializer achieves large efficiency gains across our model suite while retaining robustness via a JSON fallback and detailed error logging.
Before
python -> json -> binary
After
python -> binary
Performance
Measured across ~600 models:
* ~36.3× average speedup (serialize latency).
* ~61.6% average reduction in memory usage during serialization.
Tests
Unit tests added at:
* `exir/_serialize/test/test_flatbuffer_program.py`

cc @digantdesai @SS-JIA @freddan80 @per @zingo @oscarandersson8218 @mansnils @Sebastian-Larsson @robell