[vulkan] Add a buffer index offset during codegen to allow arbitrary crop offsets by derek-gerstmann · Pull Request #8954 · halide/Halide

derek-gerstmann · 2026-02-18T23:40:04Z

This PR enables arbitrary offsets for accessing buffers and avoids restrictive Vulkan alignment constraints.

In particular, if the device memory requirements have a minimum alignment that is larger than the natural scalar size of a channel, we can encounter errors when trying to access arbitrary crops since we may bind a vk_buffer with an offset less than the minimum required alignment. To avoid this, we always provide an index offset to all storage buffers, which provides a mechanism to handle crops in a more generic manner.

CodeGen now adds one int32 buffer offset param for each buffer after all other scalar args.
The runtime packs these params into the uniform buffer for each storage buffer.
Crop device now computes an index offset (instead of a byte offset).
Copy to/from device recomputes a byte offset from this index offset.

Addresses #8944

…cessing buffers to avoid restrictive alignment constraints. CodeGen now adds one int32 buffer offset param for each buffer after all other scalar args The runtime packs these params into the uniform buffer for each storage buffer Crop device now computes an index offset (instead of a byte offset Copy to/from device recomputes a byte offset from this index offset

mcourteaux · 2026-02-19T12:30:52Z

Can you explain what the difference exactly is between the offset field and the index_offset field in the MemoryRegion struct? Seems odd that'd you'd need two scalar offsets?

Adjusted relative offsets to use indexing for buffer copies. Added RegionAllocation and RegionIndexing to clarify mapping. Updated all affected interfaces.

derek-gerstmann · 2026-02-19T20:53:40Z

Can you explain what the difference exactly is between the offset field and the index_offset field in the MemoryRegion struct? Seems odd that'd you'd need two scalar offsets?

The refactoring cleanup should help. The original "offset" was a "byte offset" pointing to the reserved area within the "block" that the "region" occupied. This is the actual allocation, and we insure this is aligned to an appropriate address for the device memory pool.

Previously, we also had a special "range" that was used for cropping regions, that also added a "head_offset" and "tail_offset" in bytes to the existing "byte offset" for where in the block the cropped region should occupy. This was overly complicated, and ultimately not compatible across all device memory pools. Some have specific alignment requirements, which are usually much bigger than a byte, so using a byte offset could trigger errors.

This PR introduces an "index offset" shader parameter to support crops at arbitrary offsets. This is now defined in the MemoryRegion.indexing field.

mcourteaux · 2026-02-19T21:12:09Z

I still don't understand why there is even a sub-region in a "parent block"? Why not always reference the "parent" and just use a byte-granular offset into the shader? I understand that the shader-parameter will be necessary in any case. I don't understand why the way of referencing this memory seems to be:

block.base_address + region.offset + extra_index_offset_introduced_in_this_pr

instead of:

block.base_address + index_offset_introduced_in_this_pr

and remove this extra "layer" of subregioning.

derek-gerstmann · 2026-02-20T01:03:34Z

I still don't understand why there is even a sub-region in a "parent block"? Why not always reference the "parent" and just use a byte-granular offset into the shader? I understand that the shader-parameter will be necessary in any case. I don't understand why the way of referencing this memory seems to be:
block.base_address + region.offset + extra_index_offset_introduced_in_this_pr
instead of:
block.base_address + index_offset_introduced_in_this_pr
and remove this extra "layer" of subregioning.

We can't do byte-granular offsets in the shader, since not every device supports 8-bit or 16-bit scalar types. So we have to use the natural indexing.

The block base address is the address allocated by the device driver for a chunk of device memory. One allocation has to be mapped to multiple vk_buffers, each of which are used as backing storage for the halide_buffer_t. The region offset is managed by our vulkan allocator and simply adjusts offsets for mapping vk_buffer objects that point into the allocated block. So we need to maintain this offset everytime we bind a vk_buffer.

Refactor cleanup to remove unused MemoryRegion.range.

70830d4

Adjusted relative offsets to use indexing for buffer copies. Added RegionAllocation and RegionIndexing to clarify mapping. Updated all affected interfaces.

Clang formatting pass

bc65d21

derek-gerstmann requested review from abadams and alexreinking February 20, 2026 01:07

abadams approved these changes Feb 20, 2026

View reviewed changes

derek-gerstmann merged commit d5a30fb into main Feb 20, 2026
17 checks passed

derek-gerstmann mentioned this pull request Feb 20, 2026

[vulkan] Add a buffer index offset to support crops at arbitrary alignments #8944

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[vulkan] Add a buffer index offset during codegen to allow arbitrary crop offsets #8954

[vulkan] Add a buffer index offset during codegen to allow arbitrary crop offsets #8954
derek-gerstmann merged 3 commits into
mainfrom
dg/vulkan_buffer_offset_param

derek-gerstmann commented Feb 18, 2026 •

edited

Loading

Uh oh!

mcourteaux commented Feb 19, 2026

Uh oh!

derek-gerstmann commented Feb 19, 2026

Uh oh!

mcourteaux commented Feb 19, 2026

Uh oh!

derek-gerstmann commented Feb 20, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

derek-gerstmann commented Feb 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mcourteaux commented Feb 19, 2026

Uh oh!

derek-gerstmann commented Feb 19, 2026

Uh oh!

mcourteaux commented Feb 19, 2026

Uh oh!

derek-gerstmann commented Feb 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

derek-gerstmann commented Feb 18, 2026 •

edited

Loading

derek-gerstmann commented Feb 20, 2026 •

edited

Loading