Convert contiguous select_copy to zero-copy view in ReplaceViewCopyWithViewPass (pytorch#19198)
Summary:
Extends the ReplaceViewCopyWithViewPass to convert `select_copy` ops to zero-copy `memory.select` views when the output is a contiguous sub-region of the base tensor. This is the same pattern used for `view_copy` -> `memory.view`, but for select operations.
The pass checks that the base is densely packed, non-constant, and static, and that the selected output is itself densely packed. It uses the base spec's `dim_order` to compute the actual memory strides, so this works for any contiguous layout (C-contiguous, channels-last, etc.).
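As a rough illustration of the contiguity check described above, the following is a minimal sketch, not the code in the pass. The helper names `strides_for_dim_order` and `select_output_is_dense` are hypothetical, `dim_order` is assumed to list dimensions from outermost to innermost, and zero-sized dimensions are not handled.

```python
from typing import List, Sequence


def strides_for_dim_order(sizes: Sequence[int], dim_order: Sequence[int]) -> List[int]:
    """Element strides of a densely packed tensor with the given dim_order.

    Hypothetical helper: dim_order is assumed to go from the outermost
    (largest stride) dimension to the innermost (stride 1) dimension.
    """
    strides = [0] * len(sizes)
    running = 1
    for dim in reversed(dim_order):
        strides[dim] = running
        running *= sizes[dim]
    return strides


def select_output_is_dense(sizes: Sequence[int], dim_order: Sequence[int], dim: int) -> bool:
    """Return True if selecting one index along `dim` leaves a dense sub-region."""
    strides = strides_for_dim_order(sizes, dim_order)
    # Drop the selected dim; size-1 dims never affect packing, so skip them too.
    remaining = [
        (strides[d], sizes[d])
        for d in range(len(sizes))
        if d != dim and sizes[d] != 1
    ]
    # Walk from the innermost dim outward; each stride must equal the
    # number of elements spanned by the dims inside it.
    expected = 1
    for stride, size in sorted(remaining):
        if stride != expected:
            return False
        expected *= size
    return True
```

Under these assumptions, selecting along dim 0 of a C-contiguous `[2, 3, 4]` tensor is dense, while selecting along dim 1 is not, because the remaining outer dimension would have to skip over the unselected rows.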
For static memory-planned subviews, the emitter elides the op entirely (no runtime instruction) by serializing tensor metadata with `mem_offset = base_offset + byte_delta`. For dynamic shapes, a new `executorch_prim::et_select` runtime op sets the output data pointer to `self.data_ptr + offset`.
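The offset arithmetic for the static case can be sketched as follows. This is an illustrative sketch only: `planned_select_offset` is a hypothetical name, and it assumes `byte_delta` is the selected index times the element stride of the selected dimension times the element size, with `dim_order` listing dimensions from outermost to innermost.

```python
from typing import Sequence


def planned_select_offset(
    base_offset: int,
    sizes: Sequence[int],
    dim_order: Sequence[int],
    dim: int,
    index: int,
    itemsize: int,
) -> int:
    """Return base_offset + byte_delta for a select along `dim` at `index`.

    Hypothetical helper; assumes the base tensor is densely packed.
    """
    if not -sizes[dim] <= index < sizes[dim]:
        raise IndexError(f"index {index} out of range for size {sizes[dim]}")
    if index < 0:
        index += sizes[dim]  # normalize negative indices

    # Element stride of `dim`: product of the sizes of all dims inside it.
    stride = 1
    for d in reversed(dim_order):
        if d == dim:
            break
        stride *= sizes[d]

    byte_delta = index * stride * itemsize
    return base_offset + byte_delta


# float32 base of shape [2, 3, 4] planned at byte offset 64:
# selecting index 1 along dim 0 skips 12 elements, i.e. 48 bytes.
offset = planned_select_offset(64, [2, 3, 4], [0, 1, 2], dim=0, index=1, itemsize=4)
```

With the values above, `offset` is 112 (64 + 48). In the dynamic case the same element offset would be applied to the data pointer at runtime instead of being baked into the serialized metadata.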
Changes:
- `exir/memory.py`: Added `memory.select` function
- `exir/passes/replace_view_copy_with_view_pass.py`: Extended `_ViewSpec` with `byte_offset`, `stride`, `dim_order` params; added contiguity check using `stride_from_dim_order`; added select_copy handling in the pass
- Pipeline integration: memory planner, to_out_var skiplist, emitter, serialization
- `kernels/prim_ops/et_select.{h,cpp}`: C++ runtime op for dynamic select views
- Tests: 5 new Python tests + 1 C++ test
Authored with Claude.
Differential Revision: D102396195