// Copyright 2021-2023 The Khronos Group Inc.
//
// SPDX-License-Identifier: CC-BY-4.0

= VK_EXT_mutable_descriptor_type
:toc: left
:refpage: https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/
:sectnums:

This extension enables applications to alias multiple descriptor types onto the same binding, reducing friction when porting between Vulkan and DirectX 12.

NOTE: This extension is a direct promotion of link:{refpage}VK_VALVE_mutable_descriptor_type.html[VK_VALVE_mutable_descriptor_type]. As that extension already shipped before proposal documents existed, this document has been written retroactively during promotion to EXT.


== Problem Statement

Applications porting to Vulkan from DirectX 12, or layers emulating DX12 on Vulkan, are faced with two major performance hurdles when dealing with descriptors due to a mismatch in how the two APIs handle descriptors and descriptor uploads.

In DirectX 12, resource descriptors are stored in a uniform array of bindings in the API, such that the same array can contain both texture and buffer descriptors.
This manifests in particular when using Shader Model 6.6, where this uniform array of bindings is exposed directly to the shader.
In addition to that, when using DirectX 12, users can create a CPU-local heap used for manipulation, before uploading that to device memory.
This allows for a lot of manipulation on the host without saturating system bandwidth to VRAM for discrete GPUs, and can result in improved descriptor upload performance.

In core Vulkan, there is no way to store different types of descriptors in a single array - each descriptor type has its own array bindings, and there is no way to index between them.
Emulating this reliably means creating multiple parallel arrays of each resource type, which can result in a significant memory hit compared to DirectX.
In Vulkan, this would be covered by 6 different descriptor types:

 - `VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER` (SRV)
 - `VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER` (UAV)
 - `VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE` (SRV)
 - `VK_DESCRIPTOR_TYPE_STORAGE_IMAGE` (UAV)
 - `VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER` (CBV)
 - `VK_DESCRIPTOR_TYPE_STORAGE_BUFFER` (SRV or UAV depending on read-only)

`VK_DESCRIPTOR_TYPE_ACCELERATION_STRUCTURE_KHR` can also be added, but its support is optional and it is awkward to use when porting from DirectX 12 due to its use of GPU VA without a pResource.

There is also no way to flag a descriptor as being used for host manipulation in Vulkan, so managing descriptors the way a DirectX 12 application would results in significantly worse performance, and can actually be a bottleneck in dynamic systems.

There are other notable differences between the two APIs in terms of descriptor management, but no other difference has such an outsized impact on performance or memory consumption, so this extension proposal is limited to addressing these specific issues.


== Solution Space

There are a handful of ways of dealing with these issues that have been considered:

. Solve this in external software
. Add the ability to alias descriptors and specify host-only descriptor sets
. Replace Vulkan's descriptor management APIs wholesale

Firstly, solving this in external software has been attempted (notably by https://github.com/ValveSoftware/vkd3d[vkd3d]) and no satisfactory options could be identified; there are workarounds but they are either too slow or too memory intensive to emulate DirectX 12 content at native performance.

Adding descriptor aliasing and host-only descriptor pools is a simple point fix that applications and layers would be able to integrate relatively easily, without hugely impacting existing software decisions.
More notably, no significant changes to shaders are required, other than changing descriptor sets and binding decorations.

Replacing Vulkan's descriptor management more generally is possible, but ultimately would require significantly more work than option 2, both in design and in application software stacks to make use of it.
This could be considered for future extensions, but for the problems identified here, it would be overkill.


== Proposal


=== Mutable Descriptor Type

Typically when specifying a link:{refpage}VkDescriptorSetLayoutBinding.html[descriptor set layout binding], applications have to choose one of the available link:{refpage}VkDescriptorType.html[descriptor types] that will occupy that binding.
This extension adds a new descriptor type:

[source,c]
----
VK_DESCRIPTOR_TYPE_MUTABLE_EXT = 1000351000
----

When this descriptor type is specified, the binding is treated as a union of other descriptor types, which are listed for each binding with the following structures:

[source,c]
----
typedef struct VkMutableDescriptorTypeCreateInfoEXT {
    VkStructureType                          sType;
    const void*                              pNext;
    uint32_t                                 mutableDescriptorTypeListCount;
    const VkMutableDescriptorTypeListEXT*    pMutableDescriptorTypeLists;
} VkMutableDescriptorTypeCreateInfoEXT;

typedef struct VkMutableDescriptorTypeListEXT {
    uint32_t                   descriptorTypeCount;
    const VkDescriptorType*    pDescriptorTypes;
} VkMutableDescriptorTypeListEXT;
----

`VkMutableDescriptorTypeCreateInfoEXT` can be added to the `pNext` chain of link:{refpage}VkDescriptorSetLayoutCreateInfo.html[VkDescriptorSetLayoutCreateInfo], where each entry in `pMutableDescriptorTypeLists` corresponds to a binding at the same index in `pBindings`.
The list of descriptor types in `VkMutableDescriptorTypeListEXT` then defines the set of types which can be used in that binding.
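
The index correspondence between the type lists and the bindings can be sketched as a small lookup.
This is a minimal illustration, not API code: the `Sketch*` types and `sketch_type_allowed` are hypothetical stand-ins that only mirror the shape of `VkMutableDescriptorTypeListEXT`, so that the snippet builds without the Vulkan headers.
It also assumes that a binding with no corresponding list entry behaves like one with an empty list.

[source,c]
----
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for a few VkDescriptorType values. */
typedef enum SketchDescriptorType {
    SKETCH_SAMPLED_IMAGE,
    SKETCH_STORAGE_IMAGE,
    SKETCH_UNIFORM_BUFFER,
    SKETCH_STORAGE_BUFFER
} SketchDescriptorType;

/* Mirrors the shape of VkMutableDescriptorTypeListEXT. */
typedef struct SketchTypeList {
    uint32_t                    descriptorTypeCount;
    const SketchDescriptorType *pDescriptorTypes;
} SketchTypeList;

/* Returns true if `type` appears in the list stored at the same index as
 * the binding, i.e. lists[bindingIndex] describes pBindings[bindingIndex]. */
static bool sketch_type_allowed(const SketchTypeList *lists, uint32_t listCount,
                                uint32_t bindingIndex, SketchDescriptorType type)
{
    assert(lists != NULL || listCount == 0);
    if (bindingIndex >= listCount)
        return false; /* assumption: no entry is treated as an empty list */

    const SketchTypeList *list = &lists[bindingIndex];
    for (uint32_t i = 0; i < list->descriptorTypeCount; ++i) {
        if (list->pDescriptorTypes[i] == type)
            return true;
    }
    return false;
}
----

For example, with an empty list at index 0 and an image-only list at index 1, only the binding at index 1 of `pBindings` accepts `SKETCH_STORAGE_IMAGE`.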

When writing a descriptor to such a binding in a descriptor set, the actual type of the descriptor must be specified, and it must be one of the types specified in this list when the set layout was created.

A mutable descriptor can be consumed as the descriptor type it was updated with.
For example, if a mutable descriptor was updated with a `STORAGE_IMAGE` it can be consumed as a `STORAGE_IMAGE` in the shader.
Consuming the descriptor as any other descriptor type is undefined behavior.
Descriptor types are inherited through descriptor copies as well, where the type of the source descriptor is made active in the destination descriptor.

==== Supported descriptor types

As a baseline, the extension guarantees that any combination of these descriptor types is supported, which aims to mirror DirectX 12:

 - `VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER` (SRV)
 - `VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER` (UAV)
 - `VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE` (SRV)
 - `VK_DESCRIPTOR_TYPE_STORAGE_IMAGE` (UAV)
 - `VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER` (CBV)
 - `VK_DESCRIPTOR_TYPE_STORAGE_BUFFER` (SRV or UAV depending on read-only)

NOTE: Samplers live in separate heaps in DirectX 12, and do not need to be mutable like this.

Support can be restricted if the descriptor type in question cannot be used with the descriptor flags in question.
An example here would be `VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER` which may not be supported with update-after-bind on some implementations.
In these situations, applications need to use `VK_DESCRIPTOR_TYPE_STORAGE_BUFFER` and modify the shaders accordingly, but ideally, plain uniform buffers should be used instead if possible.

It is possible to go beyond the minimum supported set. For this purpose, the desired descriptor set layout can be queried with link:{refpage}vkGetDescriptorSetLayoutSupport.html[vkGetDescriptorSetLayoutSupport].
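
Whether a requested type list stays inside the baseline set can be expressed as a small decision helper.
The following is a minimal sketch rather than API code: the `Sketch*` names are hypothetical stand-ins so that it builds without the Vulkan headers, and it deliberately ignores descriptor flags, which can restrict support even for baseline types as described above.

[source,c]
----
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the VkDescriptorType values named in this section. */
typedef enum SketchType {
    SKETCH_UNIFORM_TEXEL_BUFFER,
    SKETCH_STORAGE_TEXEL_BUFFER,
    SKETCH_SAMPLED_IMAGE,
    SKETCH_STORAGE_IMAGE,
    SKETCH_UNIFORM_BUFFER,
    SKETCH_STORAGE_BUFFER,
    SKETCH_ACCELERATION_STRUCTURE, /* outside the baseline set */
    SKETCH_COMBINED_IMAGE_SAMPLER  /* outside the baseline set */
} SketchType;

/* Returns true if every requested type is one of the six baseline types. */
static bool sketch_within_baseline(const SketchType *types, size_t count)
{
    assert(types != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        switch (types[i]) {
        case SKETCH_UNIFORM_TEXEL_BUFFER:
        case SKETCH_STORAGE_TEXEL_BUFFER:
        case SKETCH_SAMPLED_IMAGE:
        case SKETCH_STORAGE_IMAGE:
        case SKETCH_UNIFORM_BUFFER:
        case SKETCH_STORAGE_BUFFER:
            break;
        default:
            return false; /* beyond the baseline: a support query is needed */
        }
    }
    return true;
}
----

A `false` result means the layout goes beyond the guaranteed set and needs to be confirmed with `vkGetDescriptorSetLayoutSupport` before it is created.
Querying is a reasonable default even when the result is `true` if non-trivial binding flags are involved.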

The interactions between descriptor types and flags can be complicated enough that it is non-trivial to report a list of supported descriptor types at the physical device level.

NOTE: Acceleration structures can also be implemented as a buffer containing `uint64_t` addresses using `OpConvertUToAccelerationStructureKHR`. No descriptor is required. Alternatively, a separate descriptor set for acceleration structures can also be used.

NOTE: While it is valid to expose `VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER`, implementations are discouraged from doing so due to their large sizes and potentially awkward memory layout. Applications should never aim to use combined image samplers with mutable descriptors.

==== Performance considerations

A mutable descriptor is expected to consume as much memory as the largest descriptor type it supports,
and it is expected that there will be holes in GPU memory between descriptors when smaller descriptor types are used.
Using mutable descriptor types should only be considered when it is meaningful, e.g. when the alternative is emitting 6+ large descriptor arrays as a workaround in bindless DirectX 12 emulation or similar.
Using mutable descriptor types as a lazy workaround for using concrete descriptor types will likely lead to lower GPU performance.
It might also disable certain fast-paths in implementations since the descriptor types are no longer statically known at layout creation time.

=== Host-Only Descriptor Sets

In order to enable better host write performance for descriptors, new flags are added to descriptor pools and descriptor set layouts to specify that descriptor sets created with them will be accessed in host-local memory, and do not need to be directly visible to the device.
Without these flags, implementations may favor device-local memory with better device access performance characteristics, at the expense of host access performance.
These flags allow device access performance to be disregarded, enabling memory with better host access performance to be used.
Host-only descriptor sets cannot be bound to a command buffer, and their contents must be copied to a non-host-only set using link:{refpage}vkUpdateDescriptorSets.html[vkUpdateDescriptorSets] before those descriptors can be used.

Descriptor pools are specified as host-only using a new link:{refpage}VkDescriptorPoolCreateFlagBits.html[create flag]:

[source,c]
----
VK_DESCRIPTOR_POOL_CREATE_HOST_ONLY_BIT_EXT = 0x00000004
----

Any descriptor set created from a pool with this flag set is a host-only descriptor set.

The memory layout of a descriptor set may also be optimized for device access rather than host access, so a new link:{refpage}VkDescriptorSetLayoutCreateFlagBits.html[create flag] is provided to specify when a layout will be used with a host-only pool:

[source,c]
----
VK_DESCRIPTOR_SET_LAYOUT_CREATE_HOST_ONLY_POOL_BIT_EXT = 0x00000004
----

Descriptor set layouts created with this flag must only be used to create descriptor sets from host-only pools, and descriptor sets created from host-only pools must be created with layouts that specify this flag.
In addition, as such layouts are not valid for device access, link:{refpage}VkPipelineLayout.html[VkPipelineLayout] objects cannot be created with such descriptor set layouts.

Host-only descriptor sets do not consume device-global descriptor resources (e.g. `maxUpdateAfterBindDescriptorsInAllPools`),
and they support concurrent descriptor set updates similar to update-after-bind.
The intention is that a host-only descriptor set can be implemented with a simple `malloc` to back the descriptor set payload.
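
The pairing rules between host-only pools, host-only layouts, and pipeline layouts can be summarized as two predicates.
This is a sketch under stated assumptions, not validation-layer code: the `SKETCH_` names and the helper functions are hypothetical, and only the bit values are taken from the flag definitions in this section.

[source,c]
----
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit values as listed in this section; the SKETCH_ names are stand-ins. */
#define SKETCH_POOL_HOST_ONLY_BIT        0x00000004u
#define SKETCH_LAYOUT_HOST_ONLY_POOL_BIT 0x00000004u

/* A set can be allocated only if pool and layout agree on host-only use. */
static bool sketch_can_allocate(uint32_t poolFlags, uint32_t layoutFlags)
{
    bool hostOnlyPool   = (poolFlags & SKETCH_POOL_HOST_ONLY_BIT) != 0;
    bool hostOnlyLayout = (layoutFlags & SKETCH_LAYOUT_HOST_ONLY_POOL_BIT) != 0;
    return hostOnlyPool == hostOnlyLayout;
}

/* Host-only layouts are not valid for device access, so they cannot be
 * used to create a pipeline layout. */
static bool sketch_can_use_in_pipeline_layout(uint32_t layoutFlags)
{
    return (layoutFlags & SKETCH_LAYOUT_HOST_ONLY_POOL_BIT) == 0;
}

/* Usage example exercising the rules above. */
void sketch_host_only_rules_demo(void)
{
    assert(sketch_can_allocate(SKETCH_POOL_HOST_ONLY_BIT,
                               SKETCH_LAYOUT_HOST_ONLY_POOL_BIT));
    assert(!sketch_can_allocate(SKETCH_POOL_HOST_ONLY_BIT, 0u));
    assert(!sketch_can_use_in_pipeline_layout(SKETCH_LAYOUT_HOST_ONLY_POOL_BIT));
}
----

In practice this means an application that stages descriptors on the host keeps two layouts for the same bindings: one with the host-only flag for the staging sets, and one without it for the sets that are bound and referenced by pipeline layouts.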

=== Features

A single new feature enables all the functionality of this extension:

[source,c]
----
typedef struct VkPhysicalDeviceMutableDescriptorTypeFeaturesEXT {
    VkStructureType    sType;
    void*              pNext;
    VkBool32           mutableDescriptorType;
} VkPhysicalDeviceMutableDescriptorTypeFeaturesEXT;
----


== Examples


=== Specifying a descriptor binding equivalent to a DirectX 12 CBV_SRV_UAV heap

DirectX 12 descriptor heaps can be specified for general resources containing all types of buffer and image descriptors using the https://docs.microsoft.com/en-us/windows/win32/api/d3d12/ne-d3d12-d3d12_descriptor_heap_type[D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV] type.
The following example shows a binding specification in Vulkan that would allow it to be used with the same descriptor types as are valid in DirectX 12.

[source,c]
----
VkDescriptorType cbvSrvUavTypes[] = {
    VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE,
    VK_DESCRIPTOR_TYPE_STORAGE_IMAGE,
    VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER,
    VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER,
    VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER,
    VK_DESCRIPTOR_TYPE_STORAGE_BUFFER,
    VK_DESCRIPTOR_TYPE_ACCELERATION_STRUCTURE_KHR /* Need to check support if this is desired.
*/};

VkMutableDescriptorTypeListEXT cbvSrvUavTypeList = {
    .descriptorTypeCount = sizeof(cbvSrvUavTypes)/sizeof(VkDescriptorType),
    .pDescriptorTypes = cbvSrvUavTypes};

VkMutableDescriptorTypeCreateInfoEXT mutableTypeInfo = {
    .sType = VK_STRUCTURE_TYPE_MUTABLE_DESCRIPTOR_TYPE_CREATE_INFO_EXT,
    .pNext = NULL,
    .mutableDescriptorTypeListCount = 1,
    .pMutableDescriptorTypeLists = &cbvSrvUavTypeList};

VkDescriptorSetLayoutBinding cbvSrvUavBinding = {
    .binding = 0,
    .descriptorType = VK_DESCRIPTOR_TYPE_MUTABLE_EXT,
    .descriptorCount = /*...*/,
    .stageFlags = /*...*/,
    .pImmutableSamplers = NULL};

VkDescriptorSetLayoutCreateInfo createInfo = {
    .sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO,
    .pNext = &mutableTypeInfo,
    .flags = /*...*/,
    .bindingCount = 1,
    .pBindings = &cbvSrvUavBinding};

// To use optional features, need to query first.
VkDescriptorSetLayoutSupport support = { .sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_SUPPORT };
vkGetDescriptorSetLayoutSupport(device, &createInfo, &support);

if (support.supported) {
    VkDescriptorSetLayout layout;
    VkResult result = vkCreateDescriptorSetLayout(device, &createInfo, NULL, &layout);
} else {
    // Fallback
}
----

=== Accessing a mutable descriptor in a shader

Very little needs to change, but multiple descriptors can alias over the same binding.

==== GLSL

[source,c]
----
layout(set = 0, binding = 0) uniform texture2D Tex2DHeap[];
layout(set = 0, binding = 0) uniform texture3D Tex3DHeap[];
layout(set = 0, binding = 0) uniform textureCube TexCubeHeap[];
layout(set = 0, binding = 0) uniform textureBuffer TexelBufferHeap[];
layout(set = 0, binding = 0) uniform image2D RWTex2DHeap[];
layout(set = 0, binding = 0) uniform image3D RWTex3DHeap[];
layout(set = 0, binding = 0) uniform imageBuffer StorageTexelBufferHeap[];
// The block name must differ from the instance name.
layout(set = 0, binding = 0) uniform CBVBlock { vec4 data[4096]; } CBVHeap[];
// Can alias freely. Might need Aliased decorations if the same SSBO is accessed with different data types.
// SRV raw buffers (interface blocks need a block name)
layout(set = 0, binding = 0) readonly buffer SRVFloat { float data[]; } SRVFloatHeap[];
layout(set = 0, binding = 0) readonly buffer SRVFloat2 { vec2 data[]; } SRVFloat2Heap[];
layout(set = 0, binding = 0) readonly buffer SRVFloat4 { vec4 data[]; } SRVFloat4Heap[];
// UAV raw buffers
layout(set = 0, binding = 0) buffer UAVFloat { float data[]; } UAVFloatHeap[];
layout(set = 0, binding = 0) buffer UAVFloat2 { vec2 data[]; } UAVFloat2Heap[];
layout(set = 0, binding = 0) buffer UAVFloat4 { vec4 data[]; } UAVFloat4Heap[];

void main()
{
    // Access the heap freely, as in SM 6.6. All variables alias on top of the same descriptor array.
    texelFetch(Tex2DHeap[index0], ...);
    texelFetch(Tex3DHeap[index1], ...);
    vec4 data = CBVHeap[index2].data[offset];
}
----

The ergonomics here are somewhat awkward, but it is possible to move the resource declarations to a common header if desired.

For this to be well defined, `VK_DESCRIPTOR_BINDING_FLAG_PARTIALLY_BOUND_BIT` must be used on the mutable binding, since descriptor validity is only checked when a descriptor is dynamically accessed.

==== HLSL

The example above can mirror HLSL using `\[[vk::]]` attributes, but for a more direct SM 6.6-style integration, it is possible to implement this in an HLSL frontend as follows:

 - Application specifies that resource heap lives in a specific set / binding.
 - To fall back to non-mutable support, it is possible to support a different set / binding for each Vulkan descriptor type.
 - HLSL frontend emits `OpVariable` runtime array aliases as required when a descriptor is loaded in `ResourceDescriptorHeap[]` or `SamplerDescriptorHeap[]`.
 - The set / binding is provided by application.
 - The index into that array maps 1:1 to the index in the HLSL source.
 - NonUniformResourceIndex must be forwarded to where the resource is accessed.
 - https://github.com/HansKristian-Work/dxil-spirv[dxil-spirv] implements this.