[ET-VK] Plumb subgroup property queries + VK_EXT_subgroup_size_control#19403
meta-codesync[bot] merged commit 1f8a6e6 into gh/SS-JIA/530/base
Adds infrastructure for querying GPU subgroup capabilities and pinning the required subgroup size at pipeline creation time, sourced from the existing `SUBGROUP_SIZE` yaml template parameter. This is the foundation for writing subgroup-using shaders (e.g. cooperative GEMV variants) that remain portable across GPUs with different subgroup widths (Adreno=64, Mali=16, NVIDIA=32, etc.).

`PhysicalDevice` now chains `VkPhysicalDeviceSubgroupProperties` and `VkPhysicalDeviceSubgroupSizeControlProperties` into `vkGetPhysicalDeviceProperties2`, plus `VkPhysicalDeviceSubgroupSizeControlFeatures` into `vkGetPhysicalDeviceFeatures2`. The `Adapter` exposes accessors for the subgroup size, supported subgroup ops/stages, the [min, max] subgroup size range, and whether the driver supports a per-pipeline required subgroup size for the COMPUTE stage. `VK_EXT_subgroup_size_control` is added to the requested extension list, and the size-control features are chained into the device-create pNext when supported.

`ComputePipeline::Descriptor` gains a `required_subgroup_size` field that, when nonzero, chains `VkPipelineShaderStageRequiredSubgroupSizeCreateInfoEXT` into pipeline creation (both the on-demand `retrieve` path and the batch `create_pipelines` path). The pipeline cache key includes the field, so pipelines compiled for different subgroup widths cache independently. `ShaderInfo` carries the same field so it can be plumbed from the shader yaml through to the pipeline descriptor.

The existing `SUBGROUP_SIZE` yaml template parameter is now the single source of truth: `gen_vulkan_spv.py` substitutes it into GLSL as before and also emits it as `ShaderInfo::required_subgroup_size`. At dispatch, `vkapi::resolve_required_subgroup_size` validates that the value is within the adapter's [min, max] range and throws `ShaderNotSupportedError` if the extension is unsupported or the value is out of range, surfacing a clear failure rather than silently miscompiling a shader whose algorithm depends on the pinned subgroup width.
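The dispatch-time validation described above can be sketched as follows. This is a minimal Python model of the semantics, not the actual implementation (which is C++ in `vkapi`); the function signature and the convention that 0 means "no pinning" are assumptions made for illustration.

```python
class ShaderNotSupportedError(Exception):
    """Raised when a shader's pinned subgroup size cannot be honored."""
    pass


def resolve_required_subgroup_size(
    required: int,
    supports_size_control: bool,
    min_size: int,
    max_size: int,
) -> int:
    """Return the subgroup size to pin at pipeline creation, or 0 for none.

    Models the check performed at dispatch: a shader that declared
    SUBGROUP_SIZE must fail loudly if the driver cannot guarantee it.
    """
    if required == 0:
        # Shader did not declare SUBGROUP_SIZE; nothing to pin.
        return 0
    if not supports_size_control:
        raise ShaderNotSupportedError(
            "VK_EXT_subgroup_size_control is not supported by this driver")
    if not (min_size <= required <= max_size):
        raise ShaderNotSupportedError(
            f"required subgroup size {required} is outside "
            f"[{min_size}, {max_size}]")
    return required
```

The key design point mirrored here is that an unsatisfiable pin is an error, never a silent fallback, because the shader's algorithm may be correct only at the declared width.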
No shader yamls are modified by this change; subsequent commits opt their shaders into the pinning by declaring `SUBGROUP_SIZE` in their yamls.

Differential Revision: [D104456803](https://our.internmc.facebook.com/intern/diff/D104456803/)
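The single-source-of-truth flow from yaml to `ShaderInfo` can be sketched like this. It is a hypothetical miniature of the `gen_vulkan_spv.py` behavior described above: the `${...}` template syntax, the `generate` helper, and the dataclass fields other than `required_subgroup_size` are illustrative assumptions, not the tool's real interface.

```python
from dataclasses import dataclass


@dataclass
class ShaderInfo:
    name: str
    glsl: str
    required_subgroup_size: int = 0  # 0 means "do not pin a subgroup size"


def generate(name: str, template: str, params: dict) -> ShaderInfo:
    """Substitute yaml template parameters into GLSL and record metadata.

    The same SUBGROUP_SIZE value drives both outputs, so the GLSL source
    and the pipeline's pinned subgroup size cannot drift apart.
    """
    glsl = template
    for key, value in params.items():
        glsl = glsl.replace("${" + key + "}", str(value))
    return ShaderInfo(
        name=name,
        glsl=glsl,
        required_subgroup_size=int(params.get("SUBGROUP_SIZE", 0)),
    )


# A shader yaml that declares SUBGROUP_SIZE: 64 yields GLSL specialized
# for width 64 AND a ShaderInfo that pins the pipeline to width 64.
info = generate(
    "coop_gemv",
    "layout(local_size_x = ${SUBGROUP_SIZE}) in;",
    {"SUBGROUP_SIZE": 64},
)
```

A shader yaml that omits `SUBGROUP_SIZE` leaves `required_subgroup_size` at 0, which is why existing shaders are unaffected until they opt in.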