exir: add flatbuffer Program serialization for performance #17333

Merged
JacobSzwejbka merged 12 commits into pytorch:main from chizkiyahu:exir-flatbuffer-serialize-fastpath_v2 on Mar 6, 2026
Conversation

@chizkiyahu
Contributor

@chizkiyahu chizkiyahu commented Feb 10, 2026

Title

Introduce flatbuffer serializer as default for program serialization

Summary

Add a FlatBuffers-based serializer at exir/_serialize/_flatbuffer_program.py and make it the default serializer for program artifacts. The new serializer achieves large efficiency gains across our model suite while retaining robustness via a JSON fallback and detailed error logging.

Before

python -> json -> binary

After

python -> binary
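
The difference between the two pipelines can be illustrated with a toy, stdlib-only sketch. It does not use FlatBuffers or any ExecuTorch API: `ToyProgram`, `serialize_via_json`, and `serialize_direct` are hypothetical names, and `struct` packing stands in for the real binary format. The only point is that the direct path skips building and re-parsing an intermediate JSON string.

```python
import json
import struct
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class ToyProgram:
    """Hypothetical stand-in for a schema object; not the real Program."""

    version: int
    values: List[int]


def _pack(version: int, values: List[int]) -> bytes:
    # Little-endian layout: version (u32), count (u32), then each value (i32).
    return struct.pack(f"<II{len(values)}i", version, len(values), *values)


def serialize_via_json(program: ToyProgram) -> bytes:
    # "python -> json -> binary": materialize JSON text, parse it, then pack.
    text = json.dumps(asdict(program))
    parsed = json.loads(text)
    return _pack(parsed["version"], parsed["values"])


def serialize_direct(program: ToyProgram) -> bytes:
    # "python -> binary": pack straight from the in-memory object.
    return _pack(program.version, program.values)
```

Both functions produce the same bytes for the same input; the direct variant simply avoids the intermediate text representation.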

Performance

Measured across ~600 models:

  • ~36.3× average speedup (serialize latency).
  • ~61.6% average reduction in memory usage during serialization.
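
The PR does not say how these numbers were collected. As a rough sketch of one way to measure serialize latency and peak Python memory for a single call, assuming only the standard library (`measure`, `speedup`, and `memory_reduction_pct` are hypothetical helper names):

```python
import time
import tracemalloc
from typing import Callable, Tuple


def measure(serialize: Callable[[], bytes]) -> Tuple[float, int]:
    """Return (elapsed_seconds, peak_traced_bytes) for one call."""
    tracemalloc.start()
    try:
        start = time.perf_counter()
        serialize()
        elapsed = time.perf_counter() - start
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return elapsed, peak


def speedup(baseline_s: float, candidate_s: float) -> float:
    # Ratio of baseline latency to candidate latency.
    return baseline_s / candidate_s


def memory_reduction_pct(baseline_bytes: int, candidate_bytes: int) -> float:
    # Percentage by which the candidate's peak is lower than the baseline's.
    return 100.0 * (1.0 - candidate_bytes / baseline_bytes)
```

Note that `tracemalloc` slows execution and only sees allocations made through Python's allocators, so timing and memory are better measured in separate runs when precision matters.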

Tests

Unit tests added at:

  • exir/_serialize/test/test_flatbuffer_program.py

cc @digantdesai @SS-JIA @freddan80 @per @zingo @oscarandersson8218 @mansnils @Sebastian-Larsson @robell

Copilot AI review requested due to automatic review settings February 10, 2026 13:47
@pytorch-bot

pytorch-bot Bot commented Feb 10, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17333

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 2 Unrelated Failures

As of commit b313378 with merge base d660dce (image):

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla Bot added the "CLA Signed" label Feb 10, 2026
@chizkiyahu
Contributor Author

@pytorchbot label "partner: arm"

@pytorch-bot pytorch-bot Bot added the "partner: arm" label Feb 10, 2026
@chizkiyahu
Contributor Author

@pytorchbot label ciflow/trunk

@pytorch-bot

pytorch-bot Bot commented Feb 10, 2026

To add these label(s) (ciflow/trunk) to the PR, please first approve the workflows that are awaiting approval (scroll to the bottom of this page).

This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.

Contributor

Copilot AI left a comment

Pull request overview

Adds a new FlatBuffer-based serializer for Program and switches the default serialization path to it for major performance/memory improvements, while keeping a JSON-based fallback for robustness.

Changes:

  • Introduces direct Python FlatBuffer packing for Program (_flatbuffer_program.py) and uses it by default with JSON fallback + logging.
  • Adds tooling/docs + CI for regenerating FlatBuffer Python bindings from schema/*.fbs.
  • Adds unit tests + BUCK targets for FlatBuffer serialization behavior.

Reviewed changes

Copilot reviewed 11 out of 54 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| schema/README.md | Documents how to regenerate committed FlatBuffer Python bindings. |
| exir/_serialize/test/test_flatbuffer_program.py | Adds unit tests for roundtrip and parity between direct vs JSON-based FlatBuffer generation. |
| exir/_serialize/test/BUCK | Registers the new unit test target. |
| exir/_serialize/generate_program.py | Adds a generator script to run flatc and rewrite imports/init files for vendored bindings. |
| exir/_serialize/_program.py | Switches default serialization to FlatBuffer fast path with JSON fallback + logging. |
| exir/_serialize/_flatbuffer_program.py | Implements direct FlatBuffer packing for Program without invoking flatc. |
| exir/_serialize/_flatbuffer.py | Extends schema prep to extract file_identifier and effective alignment values. |
| exir/_serialize/BUCK | Exposes the new serializer module in the library target. |
| .lintrunner.toml | Excludes generated FlatBuffer bindings from lintrunner. |
| .github/workflows/validate_flatbuffer_gen.yml | Adds CI workflow intended to validate generated bindings are up to date. |
| .flake8 | Excludes generated FlatBuffer bindings from flake8. |
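
As an illustration of what "extract file_identifier" from a schema could look like, here is a small regex-based sketch. It assumes the standard FlatBuffers schema statement `file_identifier "XXXX";` with a four-character identifier; `extract_file_identifier` is a hypothetical helper and is not taken from `_flatbuffer.py`.

```python
import re
from typing import Optional

# Matches a schema line such as: file_identifier "ABCD";
_FILE_IDENTIFIER_RE = re.compile(
    r'^\s*file_identifier\s+"([^"]{4})"\s*;', re.MULTILINE
)


def extract_file_identifier(schema_text: str) -> Optional[str]:
    """Return the 4-character file identifier declared in the schema, if any."""
    match = _FILE_IDENTIFIER_RE.search(schema_text)
    return match.group(1) if match else None
```

A regex like this ignores comments and other edge cases; a real implementation may need to be stricter.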


Comment thread .github/workflows/validate_flatbuffer_gen.yml Outdated
Comment thread .github/workflows/validate_flatbuffer_gen.yml Outdated
Comment thread schema/README.md Outdated
Comment thread exir/_serialize/generate_program.py Outdated
Comment thread exir/_serialize/generate_program.py Outdated
Comment thread exir/_serialize/_flatbuffer.py
Comment thread exir/_serialize/_flatbuffer_program.py
@chizkiyahu
Contributor Author

@pytorchbot label "release notes: exir"

@pytorch-bot pytorch-bot Bot added the "release notes: exir" label Feb 10, 2026
@pytorch-bot

pytorch-bot Bot commented Feb 10, 2026

To add the ciflow label ciflow/trunk please first approve the workflows that are awaiting approval (scroll to the bottom of this page).

This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.

Copilot AI review requested due to automatic review settings February 10, 2026 16:27
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 11 out of 54 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

exir/_serialize/_flatbuffer_program.py:1

  • The shebang line '#!/usr/bin/env fbpython' is specific to Facebook's internal Python environment. For an open source repository, this should be either removed or changed to a standard Python shebang like '#!/usr/bin/env python3'.
# Copyright 2026 Arm Limited and/or its affiliates.


Comment thread exir/_serialize/_flatbuffer.py
@chizkiyahu
Contributor Author

@mergennachin

This is the updated version of PR #16691.
Thanks.

    return _BackendDelegateInlineData.BackendDelegateInlineDataEnd(builder)


def _install_fast_packers() -> None:
Contributor

Does this make a big difference in serialization speed?

Contributor Author

Yes, it gives a very big speed improvement.

Contributor Author

A quick test on resnet18 shows an improvement from ~3.54 s to ~0.07 s (~50x).

Contributor

wow, that's a big speed up!

Comment thread exir/_serialize/_flatbuffer_program.py Outdated
Copilot AI review requested due to automatic review settings February 11, 2026 23:25
@chizkiyahu
Contributor Author

@pytorchbot label ciflow/trunk

@pytorch-bot

pytorch-bot Bot commented Feb 11, 2026

To add these label(s) (ciflow/trunk) to the PR, please first approve the workflows that are awaiting approval (scroll to the bottom of this page).

This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 11 out of 54 changed files in this pull request and generated 5 comments.



Comment thread exir/_serialize/_flatbuffer_program.py
Comment thread exir/_serialize/_flatbuffer_program.py
Comment thread exir/_serialize/_flatbuffer_program.py
Comment thread exir/_serialize/_program.py Outdated
Comment thread exir/_serialize/test/test_flatbuffer_program.py
Comment thread exir/_serialize/_flatbuffer_program.py
Comment thread exir/_serialize/generate_program.py
@chizkiyahu chizkiyahu force-pushed the exir-flatbuffer-serialize-fastpath_v2 branch from 96943c2 to 247294d Compare February 12, 2026 08:24
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 56 changed files in this pull request and generated 2 comments.



Comment on lines +138 to +163
def _convert_evalue(val: EValue) -> Any:
    result = _flatbuffer_t_class(EValue)()
    union_val = val.val
    if union_val is None:
        result.valType = _KernelTypes.KernelTypes.NONE
        result.val = None
        return result
    union_name = type(union_val).__name__
    result.valType = getattr(_KernelTypes.KernelTypes, union_name)
    result.val = _convert_value(union_val)
    return result


def _convert_instruction(val: Instruction) -> Any:
    result = _flatbuffer_t_class(Instruction)()
    union_val = val.instr_args
    if union_val is None:
        result.instrArgsType = _InstructionArguments.InstructionArguments.NONE
        result.instrArgs = None
        return result
    union_name = type(union_val).__name__
    result.instrArgsType = getattr(
        _InstructionArguments.InstructionArguments, union_name
    )
    result.instrArgs = _convert_value(union_val)
    return result

Copilot AI Feb 25, 2026

The functions _convert_evalue and _convert_instruction use type(union_val).__name__ to get the union type name and then use getattr() to retrieve the corresponding enum value. If the union contains a type that doesn't have a corresponding enum value in _KernelTypes.KernelTypes or _InstructionArguments.InstructionArguments, this will raise an AttributeError at runtime. Consider adding error handling or validation to provide a more informative error message if an unexpected type is encountered.
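
One way to act on this suggestion is to wrap the `getattr` lookup so that an unknown union member produces a descriptive error. The sketch below is illustrative only: `ToyKernelTypes` is a hypothetical stand-in for a generated union enum class, and `union_type_id` is not a function from the PR.

```python
from typing import Any


class ToyKernelTypes:
    """Hypothetical stand-in for a generated FlatBuffers union enum class."""

    NONE = 0
    Int = 1
    Bool = 2


def union_type_id(enum_cls: type, union_val: Any) -> int:
    """Map a union member to its enum value, with a descriptive error."""
    union_name = type(union_val).__name__
    try:
        return getattr(enum_cls, union_name)
    except AttributeError:
        # Replace the bare AttributeError with a message naming both sides.
        raise TypeError(
            f"{enum_cls.__name__} has no member for union type {union_name!r}"
        ) from None
```

Whether to raise `TypeError` or another exception type is a design choice; the key point is that the message names both the enum and the offending type.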

    return val


def convert_program(val: Program) -> ProgramT:

Copilot AI Feb 25, 2026

The function convert_program is publicly exposed (no underscore prefix) but lacks documentation and type annotations for its return type. While it's used internally, if it's intended to be part of the public API, it should have proper documentation. If it's meant to be internal, it should be prefixed with an underscore.

Suggested change
- def convert_program(val: Program) -> ProgramT:
+ def convert_program(val: Program) -> ProgramT:
+     """
+     Convert a :class:`executorch.exir.schema.Program` instance into its
+     corresponding FlatBuffers table type (:class:`ProgramT`).
+     This helper is used when serializing a high-level Program schema object
+     into the low-level FlatBuffers representation expected by the generated
+     executorch flatbuffer bindings.
+     Args:
+         val: The high-level Program object to convert.
+     Returns:
+         The corresponding ProgramT instance suitable for FlatBuffers
+         serialization.
+     """

@pytorch-bot

pytorch-bot Bot commented Feb 25, 2026

To add the ciflow label ciflow/trunk please first approve the workflows that are awaiting approval (scroll to the bottom of this page).

This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.

@JacobSzwejbka
Contributor

Pulling internally again. Will play around with the bitwise differences to make sure they all check out.

@@ -19,8 +20,8 @@
from executorch.exir._serialize._flatbuffer import (
    _FlatbufferResult,
    _program_flatbuffer_to_json,
Contributor

do we still need this?

Contributor Author

This is used in the deserialize_pte_binary function, which still relies on JSON:

program: Program = _json_to_program(
    _program_flatbuffer_to_json(program_data[:program_size])
)

As I understand it, deserialize_pte_binary is used only for testing and tooling, not in production code. However, since it is part of the public API, I am not entirely sure whether that assumption is correct.

Contributor Author

@Gasoonjia

Please let me know if you would like me to remove _program_flatbuffer_to_json from deserialize_pte_binary, or even remove it entirely from the codebase.

Alternatively, let me know if you think it is fine to keep it as it is.

Contributor

We should keep deserialize_pte_binary - is it possible to deserialize using the new python path instead of json?

Contributor Author

It is not a small change.
I will open another PR for this.

@JacobSzwejbka
Contributor

JacobSzwejbka commented Mar 3, 2026

It's all good on my side. Just need the merge conflict to be resolved. I'm happy if you want to take a look at deserialize in a follow-up; I don't think it's blocking for this one. @chizkiyahu

Signed-off-by: Chizkiyahu Raful <chizkiyahu.raful@arm.com>
Change-Id: Ie72323e17bcab7999bed89702bc42dc9df665b93
Change-Id: I23787d086aa9229f5d93d1e16c6f6297119e7f5a
Signed-off-by: Chizkiyahu Raful <chizkiyahu.raful@arm.com>
Copilot AI review requested due to automatic review settings March 3, 2026 10:40
@chizkiyahu
Contributor Author

> Just need the merge conflict to be resolved

Done.

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 56 changed files in this pull request and generated 2 comments.



Comment on lines +114 to +119
@functools.lru_cache(maxsize=1)
def _install_fast_packers() -> None:
    _Buffer.BufferT.Pack = _pack_buffer
    _BackendDelegateInlineData.BackendDelegateInlineDataT.Pack = (
        _pack_backend_delegate_inline_data
    )

Copilot AI Mar 3, 2026

The _install_fast_packers function is decorated with @functools.lru_cache(maxsize=1), which permanently monkey-patches the Pack methods on BufferT and BackendDelegateInlineDataT class objects the first time it is called. Because these are class-level mutations (not instance-level), the patched Pack method will be active for all subsequent uses of those classes throughout the entire process lifetime — including any parallel uses, tests that import these classes independently, or any code path that calls BufferT.Pack or BackendDelegateInlineDataT.Pack without going through _program_to_flatbuffer. This is a global side effect that is never undone, which makes the code fragile and hard to reason about in concurrent or test environments.

Comment on lines +90 to +99
def _pack_buffer(self: Any, builder: Any) -> int:
    storage = 0
    if self.storage is not None:
        storage = _create_aligned_byte_vector(
            builder, _coerce_bytes(self.storage), _BUFFER_ALIGNMENT.get()
        )
    _Buffer.BufferStart(builder)
    if storage:
        _Buffer.BufferAddStorage(builder, storage)
    return _Buffer.BufferEnd(builder)

Copilot AI Mar 3, 2026

The _pack_buffer function checks if storage: (line 97) to decide whether to call BufferAddStorage. Since storage is a FlatBuffers offset (an integer), this check is falsy when the offset is 0. A FlatBuffers offset of 0 would only occur for an empty vector and 0 is not a valid table offset in practice, but this subtle reliance on the offset never being 0 for a present field is fragile. Using an explicit if self.storage is not None: guard mirroring the outer check would be clearer and safer. The same issue applies to _pack_backend_delegate_inline_data at line 109.
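
The behavioral difference between the two guards can be shown in isolation. This is a toy sketch with hypothetical function names; it does not call FlatBuffers and makes no claim about whether a 0 offset can actually occur in practice.

```python
from typing import Optional


def should_add_field_truthy(offset: int) -> bool:
    # Mirrors `if storage:`; an offset of 0 is treated as "field absent".
    return bool(offset)


def should_add_field_explicit(value: Optional[bytes]) -> bool:
    # Mirrors `if self.storage is not None:`; depends only on presence.
    return value is not None
```

With the explicit guard, an empty but present value (`b""`) is still written, whereas the truthiness check depends on the numeric offset the builder happens to return.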

Copilot AI review requested due to automatic review settings March 6, 2026 04:37
Contributor

Copilot AI left a comment

Copilot was unable to review this pull request because the user who requested the review is ineligible. To be eligible to request a review, you need a paid Copilot license, or your organization must enable Copilot code review.

@JacobSzwejbka JacobSzwejbka merged commit be6b986 into pytorch:main Mar 6, 2026
155 of 161 checks passed
@JacobSzwejbka
Contributor

Just wanted to say thanks again @chizkiyahu. This is an awesome change!

chizkiyahu added a commit to chizkiyahu/executorch that referenced this pull request Mar 12, 2026
This continues the work from pytorch#17333.

Change-Id: I35ac4cd5f6430ea89939453344c13e056b5c746c
Signed-off-by: Chizkiyahu Raful <chizkiyahu.raful@arm.com>
jpiat pushed a commit to jpiat/executorch that referenced this pull request Mar 17, 2026
…7333)

## **Title**
Introduce flatbuffer serializer as default for program serialization

## Summary

Add a FlatBuffers-based serializer at
`exir/_serialize/_flatbuffer_program.py` and make it the default
serializer for program artifacts. The new serializer achieves large
efficiency gains across our model suite while retaining robustness via a
JSON fallback and detailed error logging.

### Before
python -> json -> binary 

### After
python  -> binary 

## Performance

Measured across ~600 models:

* ~36.3× average speedup (serialize latency).
* ~61.6% average reduction in memory usage during serialization.

## Tests

Unit tests added at:

* `exir/_serialize/test/test_flatbuffer_program.py`

cc @digantdesai @SS-JIA @freddan80 @per @zingo @oscarandersson8218
@mansnils @Sebastian-Larsson @robell

---------

Signed-off-by: Chizkiyahu Raful <chizkiyahu.raful@arm.com>
Co-authored-by: Jacob Szwejbka <jakeszwe@meta.com>
chizkiyahu added a commit to chizkiyahu/executorch that referenced this pull request May 13, 2026
This continues the work from pytorch#17333.

Change-Id: I35ac4cd5f6430ea89939453344c13e056b5c746c
Signed-off-by: Chizkiyahu Raful <chizkiyahu.raful@arm.com>
chizkiyahu added a commit to chizkiyahu/executorch that referenced this pull request May 21, 2026
This continues the work from pytorch#17333.

Change-Id: I35ac4cd5f6430ea89939453344c13e056b5c746c
Signed-off-by: Chizkiyahu Raful <chizkiyahu.raful@arm.com>

Labels

  • CLA Signed (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed)
  • partner: arm (for backend delegation, kernels, demo, etc. from the 3rd-party partner, Arm)
  • release notes: exir (changes to any dialects and passes on these dialects, such as memory planning)

8 participants