Serialize NamedData in PTE file by lucylq · Pull Request #8696 · pytorch/executorch

lucylq · 2025-02-25T20:10:23Z

Stack from ghstack (oldest at bottom):

1. Serialize NamedData in PTE file
2. Add NamedDataStore to EdgeProgramManager

Serializing NamedData is slightly different from serializing constant/delegate data, as each segment comes with its own alignment.

An example:
Given NamedData = {"key": data}. Data is 250 bytes.

BackendA requires data with alignment=3
BackendB requires data with alignment=4

Then data should be serialized with an alignment of lcm(3, 4) = 12.

At serialization, ExecuTorch applies a 'segment_alignment' that defaults to 128, so the data is serialized with an alignment of lcm(12, 128) = 384.

Inside the DataSegment, we store the original size of the data (250). Because the 250 bytes are padded up to the 384-byte alignment, the offset of the subsequent DataSegment is 384 bytes after the start of this one.
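
The arithmetic in this example can be checked with a small sketch. This is only an illustration, not the code in this PR: `padded_size` is a hypothetical helper, and it assumes the next segment is aligned to the same 384-byte boundary.

```python
import math


def padded_size(size: int, alignment: int) -> int:
    """Round size up to the next multiple of alignment (hypothetical helper)."""
    return -(-size // alignment) * alignment


# Values taken from the example above.
backend_alignments = [3, 4]   # BackendA and BackendB
segment_alignment = 128       # default segment_alignment
data_size = 250               # original size stored in the DataSegment

# Alignment that satisfies every backend using the data.
buffer_alignment = math.lcm(*backend_alignments)

# Alignment that also satisfies the file-level segment alignment.
final_alignment = math.lcm(buffer_alignment, segment_alignment)

# Distance from the start of this segment to the start of the next one,
# assuming the next segment uses the same alignment.
next_offset = padded_size(data_size, final_alignment)

print(buffer_alignment, final_alignment, next_offset)
```

Note that `math.lcm` requires Python 3.9 or later.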

Design
Introduce a new dataclass 'AlignedData' that stores the buffer and any alignment that's required. This is used when assembling Program.segments to ensure we get lcm(buffer_alignment, segment_alignment).
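
A minimal sketch of how such a dataclass might be used when laying out segments. The field names, the `layout_segments` helper, and the choice to pad so that each segment starts on an aligned offset are assumptions for illustration; the actual definitions in this PR may differ.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class AlignedData:
    """Hypothetical stand-in: a buffer plus the alignment it requires, if any."""

    data: bytes
    alignment: Optional[int] = None


def layout_segments(
    items: List[AlignedData], segment_alignment: int = 128
) -> List[Tuple[int, int]]:
    """Return (offset, size) per segment; size is the original, unpadded length."""
    layout = []
    cursor = 0  # end of the previously written segment
    for item in items:
        # Combine the buffer's own alignment with the file-level alignment.
        alignment = math.lcm(item.alignment or 1, segment_alignment)
        # Round the cursor up so the segment starts on an aligned offset.
        offset = -(-cursor // alignment) * alignment
        layout.append((offset, len(item.data)))
        cursor = offset + len(item.data)
    return layout


segments = layout_segments(
    [
        AlignedData(b"\x00" * 250, alignment=12),
        AlignedData(b"\x01" * 10, alignment=12),
    ]
)
print(segments)
```

With two buffers that both require alignment 12, the first segment records its original size of 250 and the second starts 384 bytes later, matching the example above.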

Note: The default segment_alignment can be overridden inside 'ExecutorchBackendConfig'.

Differential Revision: D69764150


pytorch-bot · 2025-02-25T20:10:26Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/8696

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 657228e with merge base bc55c01:

NEW FAILURE - The following job has failed:

Check Labels / Check labels (gh)

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2025-02-25T20:10:31Z

This pull request was exported from Phabricator. Differential Revision: D69764150

ghstack-source-id: 268331257 Pull Request resolved: #8696

github-actions · 2025-02-25T20:11:50Z

This PR needs a `release notes:` label

If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.


Pull Request resolved: #8696 ghstack-source-id: 268381885 @exported-using-ghexport

lucylq requested review from JacobSzwejbka, larryliu0820 and tarun292 as code owners February 25, 2025 20:10

facebook-github-bot added the CLA Signed and fb-exported labels Feb 25, 2025

lucylq mentioned this pull request Feb 25, 2025

Introduce NamedDataStore #8587

Merged

lucylq mentioned this pull request Feb 25, 2025

[executorch][weight sharing] Introduce NamedData to PTE schema #8695

Closed

lucylq mentioned this pull request Feb 27, 2025

[executorch][runtime] Introduce CoreDataMap for weight sharing #8751

Closed

tarun292 approved these changes Feb 27, 2025


lucylq closed this Mar 25, 2025

lucylq had a problem deploying to cherry-pick-bot March 25, 2025 18:00 — with GitHub Actions Failure

lucylq wants to merge 2 commits into gh/lucylq/45/base from gh/lucylq/45/head


3 participants
