// Copyright 2015-2021 The Khronos Group, Inc.
//
// SPDX-License-Identifier: CC-BY-4.0

[[shaders]]
= Shaders

A shader specifies programmable operations that execute for each vertex,
control point, tessellated vertex, primitive, fragment, or workgroup in the
corresponding stage(s) of the graphics and compute pipelines.

Graphics pipelines include vertex shader execution as a result of
<<drawing,primitive assembly>>, followed, if enabled, by tessellation
control and evaluation shaders operating on <<drawing-patch-lists,patches>>,
geometry shaders, if enabled, operating on primitives, and fragment shaders,
if present, operating on fragments generated by <<primsrast,Rasterization>>.
In this specification, vertex, tessellation control, tessellation evaluation
and geometry shaders are collectively referred to as
<<pipeline-graphics-subsets-pre-rasterization,pre-rasterization shader
stage>>s and occur in the logical pipeline before rasterization.
The fragment shader occurs logically after rasterization.

Only the compute shader stage is included in a compute pipeline.
Compute shaders operate on compute invocations in a workgroup.

Shaders can: read from input variables, and read from and write to output
variables.
Input and output variables can: be used to transfer data between shader
stages, or to allow the shader to interact with values that exist in the
execution environment.
Similarly, the execution environment provides constants describing
capabilities.

Shader variables are associated with execution environment-provided inputs
and outputs using _built-in_ decorations in the shader.
The available decorations for each stage are documented in the following
subsections.


[[shader-modules]]
== Shader Modules

[open,refpage='VkShaderModule',desc='Opaque handle to a shader module object',type='handles']
--
_Shader modules_ contain _shader code_ and one or more entry points.
Shaders are selected from a shader module by specifying an entry point as
part of <<pipelines,pipeline>> creation.
The stages of a pipeline can: use shaders that come from different modules.
The shader code defining a shader module must: be in the SPIR-V format, as
described by the <<spirvenv,Vulkan Environment for SPIR-V>> appendix.

Shader modules are represented by sname:VkShaderModule handles:

include::{generated}/api/handles/VkShaderModule.txt[]
--

[open,refpage='vkCreateShaderModule',desc='Creates a new shader module object',type='protos']
--
To create a shader module, call:

include::{generated}/api/protos/vkCreateShaderModule.txt[]

  * pname:device is the logical device that creates the shader module.
  * pname:pCreateInfo is a pointer to a slink:VkShaderModuleCreateInfo
    structure.
  * pname:pAllocator controls host memory allocation as described in the
    <<memory-allocation, Memory Allocation>> chapter.
  * pname:pShaderModule is a pointer to a slink:VkShaderModule handle in
    which the resulting shader module object is returned.

Once a shader module has been created, any entry points it contains can: be
used in pipeline shader stages as described in <<pipelines-compute,Compute
Pipelines>> and <<pipelines-graphics,Graphics Pipelines>>.

include::{generated}/validity/protos/vkCreateShaderModule.txt[]
--
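
[NOTE]
.Note
====
The following sketch is informative and not part of the normative text.
It shows one way an application might create a shader module from a SPIR-V
binary and then reference one of its entry points in a pipeline shader
stage; the parameter names (`spirvWords`, `spirvSizeInBytes`) and the use of
the `"main"` entry point are hypothetical.

[source,c]
----
#include <vulkan/vulkan.h>

// Informal sketch: create a shader module from a SPIR-V binary and describe
// a vertex shader stage that uses its "main" entry point.
// spirvWords/spirvSizeInBytes are assumed to hold valid SPIR-V code.
static VkResult createVertexStage(VkDevice device,
                                  const uint32_t* spirvWords,
                                  size_t spirvSizeInBytes,
                                  VkPipelineShaderStageCreateInfo* outStage)
{
    const VkShaderModuleCreateInfo createInfo = {
        .sType    = VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO,
        .codeSize = spirvSizeInBytes,  // size in bytes, must be a multiple of 4
        .pCode    = spirvWords,        // pointer to the SPIR-V words
    };

    VkShaderModule module;
    VkResult result = vkCreateShaderModule(device, &createInfo, NULL, &module);
    if (result != VK_SUCCESS)
        return result;

    // The entry point is selected here, at pipeline creation time, by name.
    const VkPipelineShaderStageCreateInfo stage = {
        .sType  = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO,
        .stage  = VK_SHADER_STAGE_VERTEX_BIT,
        .module = module,
        .pName  = "main",              // name of an OpEntryPoint in the module
    };
    *outStage = stage;
    return VK_SUCCESS;
}
----
====
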
[open,refpage='VkShaderModuleCreateInfo',desc='Structure specifying parameters of a newly created shader module',type='structs']
--
The sname:VkShaderModuleCreateInfo structure is defined as:

include::{generated}/api/structs/VkShaderModuleCreateInfo.txt[]

  * pname:sType is the type of this structure.
  * pname:pNext is `NULL` or a pointer to a structure extending this
    structure.
  * pname:flags is reserved for future use.
  * pname:codeSize is the size, in bytes, of the code pointed to by
    pname:pCode.
  * pname:pCode is a pointer to code that is used to create the shader
    module.
    The type and format of the code is determined from the content of the
    memory addressed by pname:pCode.

.Valid Usage
****
  * [[VUID-VkShaderModuleCreateInfo-codeSize-01085]]
    pname:codeSize must: be greater than 0
ifndef::VK_NV_glsl_shader[]
  * [[VUID-VkShaderModuleCreateInfo-codeSize-01086]]
    pname:codeSize must: be a multiple of 4
  * [[VUID-VkShaderModuleCreateInfo-pCode-01087]]
    pname:pCode must: point to valid SPIR-V code, formatted and packed as
    described by the <<spirv-spec,Khronos SPIR-V Specification>>
  * [[VUID-VkShaderModuleCreateInfo-pCode-01088]]
    pname:pCode must: adhere to the validation rules described by the
    <<spirvenv-module-validation, Validation Rules within a Module>> section
    of the <<spirvenv-capabilities,SPIR-V Environment>> appendix
endif::VK_NV_glsl_shader[]
ifdef::VK_NV_glsl_shader[]
  * [[VUID-VkShaderModuleCreateInfo-pCode-01376]]
    If pname:pCode is a pointer to SPIR-V code, pname:codeSize must: be a
    multiple of 4
  * [[VUID-VkShaderModuleCreateInfo-pCode-01377]]
    pname:pCode must: point to either valid SPIR-V code, formatted and
    packed as described by the <<spirv-spec,Khronos SPIR-V Specification>>
    or valid GLSL code which must: be written to the `GL_KHR_vulkan_glsl`
    extension specification
  * [[VUID-VkShaderModuleCreateInfo-pCode-01378]]
    If pname:pCode is a pointer to SPIR-V code, that code must: adhere to
    the validation rules described by the <<spirvenv-module-validation,
    Validation Rules within a Module>> section of the
    <<spirvenv-capabilities,SPIR-V Environment>> appendix
  * [[VUID-VkShaderModuleCreateInfo-pCode-01379]]
    If pname:pCode is a pointer to GLSL code, it must: be valid GLSL code
    written to the `GL_KHR_vulkan_glsl` GLSL extension specification
endif::VK_NV_glsl_shader[]
  * [[VUID-VkShaderModuleCreateInfo-pCode-01089]]
    pname:pCode must: declare the code:Shader capability for SPIR-V code
  * [[VUID-VkShaderModuleCreateInfo-pCode-01090]]
    pname:pCode must: not declare any capability that is not supported by
    the API, as described by the <<spirvenv-module-validation,
    Capabilities>> section of the <<spirvenv-capabilities,SPIR-V
    Environment>> appendix
  * [[VUID-VkShaderModuleCreateInfo-pCode-01091]]
    If pname:pCode declares any of the capabilities listed in the
    <<spirvenv-capabilities-table,SPIR-V Environment>> appendix, one of the
    corresponding requirements must: be satisfied
  * [[VUID-VkShaderModuleCreateInfo-pCode-04146]]
    pname:pCode must: not declare any SPIR-V extension that is not supported
    by the API, as described by the <<spirvenv-extensions, Extension>>
    section of the <<spirvenv-capabilities,SPIR-V Environment>> appendix
  * [[VUID-VkShaderModuleCreateInfo-pCode-04147]]
    If pname:pCode declares any of the SPIR-V extensions listed in the
    <<spirvenv-extensions-table,SPIR-V Environment>> appendix, one of the
    corresponding requirements must: be satisfied
****

include::{generated}/validity/structs/VkShaderModuleCreateInfo.txt[]
--
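
[NOTE]
.Note
====
As an informal illustration of the valid usage rules above, an application
loading a SPIR-V binary from a file might sanity-check it before passing it
to flink:vkCreateShaderModule.
The check below is only a sketch; the first word of a SPIR-V module is the
magic number 0x07230203, and pname:codeSize is expressed in bytes even
though pname:pCode is typed as a pointer to 32-bit words.

[source,c]
----
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Informal sketch: basic sanity checks on a buffer that is about to be used
// as VkShaderModuleCreateInfo::pCode / codeSize.
static bool looksLikeSpirv(const uint32_t* words, size_t sizeInBytes)
{
    if (sizeInBytes == 0 || (sizeInBytes % 4) != 0)
        return false;                 // codeSize must be a non-zero multiple of 4
    return words[0] == 0x07230203u;   // SPIR-V magic number, first word of the module
}
----
====
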
[open,refpage='VkShaderModuleCreateFlags',desc='Reserved for future use',type='flags']
--
include::{generated}/api/flags/VkShaderModuleCreateFlags.txt[]

tname:VkShaderModuleCreateFlags is a bitmask type for setting a mask, but is
currently reserved for future use.
--

ifdef::VK_EXT_validation_cache[]
include::{chapters}/VK_EXT_validation_cache/shader-module-validation-cache.txt[]
endif::VK_EXT_validation_cache[]


[open,refpage='vkDestroyShaderModule',desc='Destroy a shader module',type='protos']
--
To destroy a shader module, call:

include::{generated}/api/protos/vkDestroyShaderModule.txt[]

  * pname:device is the logical device that destroys the shader module.
  * pname:shaderModule is the handle of the shader module to destroy.
  * pname:pAllocator controls host memory allocation as described in the
    <<memory-allocation, Memory Allocation>> chapter.

A shader module can: be destroyed while pipelines created using its shaders
are still in use.

.Valid Usage
****
  * [[VUID-vkDestroyShaderModule-shaderModule-01092]]
    If sname:VkAllocationCallbacks were provided when pname:shaderModule was
    created, a compatible set of callbacks must: be provided here
  * [[VUID-vkDestroyShaderModule-shaderModule-01093]]
    If no sname:VkAllocationCallbacks were provided when pname:shaderModule
    was created, pname:pAllocator must: be `NULL`
****

include::{generated}/validity/protos/vkDestroyShaderModule.txt[]
--


[[shaders-execution]]
== Shader Execution

At each stage of the pipeline, multiple invocations of a shader may: execute
simultaneously.
Further, invocations of a single shader produced as the result of different
commands may: execute simultaneously.
The relative execution order of invocations of the same shader type is
undefined:.
Shader invocations may: complete in a different order than that in which the
primitives they originated from were drawn or dispatched by the application.
However, fragment shader outputs are written to attachments in
<<primsrast-order,rasterization order>>.

The relative execution order of invocations of different shader types is
largely undefined:.
However, when invoking a shader whose inputs are generated from a previous
pipeline stage, the shader invocations from the previous stage are
guaranteed to have executed far enough to generate input values for all
required inputs.


[[shaders-execution-memory-ordering]]
== Shader Memory Access Ordering

The order in which image or buffer memory is read or written by shaders is
largely undefined:.
For some shader types (vertex, tessellation evaluation, and in some cases,
fragment), even the number of shader invocations that may: perform loads and
stores is undefined:.

In particular, the following rules apply:

  * <<shaders-vertex-execution,Vertex>> and
    <<shaders-tessellation-evaluation-execution,tessellation evaluation>>
    shaders will be invoked at least once for each unique vertex, as defined
    in those sections.
  * <<fragops-shader,Fragment>> shaders will be invoked zero or more times,
    as defined in that section.
  * The relative execution order of invocations of the same shader type is
    undefined:.
    A store issued by a shader when working on primitive B might complete
    prior to a store for primitive A, even if primitive A is specified prior
    to primitive B. This applies even to fragment shaders; while fragment
    shader outputs are always written to the framebuffer in
    <<primsrast-order, rasterization order>>, stores executed by fragment
    shader invocations are not.
  * The relative execution order of invocations of different shader types is
    largely undefined:.

[NOTE]
.Note
====
The above limitations on shader invocation order make some forms of
synchronization between shader invocations within a single set of primitives
unimplementable.
For example, having one invocation poll memory written by another invocation
assumes that the other invocation has been launched and will complete its
writes in finite time.
====

ifdef::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[]

The <<memory-model,Memory Model>> appendix defines the terminology and rules
for how to correctly communicate between shader invocations, such as when a
write is <<memory-model-visible-to,Visible-To>> a read, and what constitutes
a <<memory-model-access-data-race,Data Race>>.

Applications must: not cause a data race.

endif::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[]

ifndef::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[]

Stores issued to different memory locations within a single shader
invocation may: not be visible to other invocations, or may: not become
visible in the order they were performed.

The code:OpMemoryBarrier instruction can: be used to provide stronger
ordering of reads and writes performed by a single invocation.
code:OpMemoryBarrier guarantees that any memory transactions issued by the
shader invocation prior to the instruction complete prior to the memory
transactions issued after the instruction.
Memory barriers are needed for algorithms that require multiple invocations
to access the same memory and require the operations to be performed in a
partially-defined relative order.
For example, if one shader invocation does a series of writes, followed by
an code:OpMemoryBarrier instruction, followed by another write, then the
results of the series of writes before the barrier become visible to other
shader invocations no later than when the results of the final write become
visible to those invocations.
In practice, this means that another invocation that sees the results of the
final write would also see the previous writes.
Without the memory barrier, the final write may: be visible before the
previous writes.

Writes that are the result of shader stores through a variable decorated
with code:Coherent automatically have available writes to the same buffer,
buffer view, or image view made visible to them, and are themselves
automatically made available to access by the same buffer, buffer view, or
image view.
Reads that are the result of shader loads through a variable decorated with
code:Coherent automatically have available writes to the same buffer, buffer
view, or image view made visible to them.
The order that coherent writes to different locations become available is
undefined:, unless enforced by a memory barrier instruction or other memory
dependency.
[NOTE]
.Note
====
Explicit memory dependencies must: still be used to guarantee availability
and visibility for access via other buffers, buffer views, or image views.
====

The built-in atomic memory transaction instructions can: be used to read and
write a given memory address atomically.
While built-in atomic functions issued by multiple shader invocations are
executed in undefined: order relative to each other, these functions perform
both a read and a write of a memory address and guarantee that no other
memory transaction will write to the underlying memory between the read and
write.
Atomic operations ensure automatic availability and visibility for writes
and reads in the same way as those to code:Coherent variables.

[NOTE]
.Note
====
Memory accesses performed on different resource descriptors with the same
memory backing may: not be well-defined even with the code:Coherent
decoration or via atomics, due to things such as image layouts or ownership
of the resource - as described in the <<synchronization, Synchronization and
Cache Control>> chapter.
====

[NOTE]
.Note
====
Atomics allow shaders to use shared global addresses for mutual exclusion or
as counters, among other uses.
====

endif::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[]

The SPIR-V *SubgroupMemory*, *CrossWorkgroupMemory*, and
*AtomicCounterMemory* memory semantics are ignored.
Sequentially consistent atomics and barriers are not supported and
*SequentiallyConsistent* is treated as *AcquireRelease*.
*SequentiallyConsistent* should: not be used.


[[shaders-inputs]]
== Shader Inputs and Outputs

Data is passed into and out of shaders using variables with input or output
storage class, respectively.
User-defined inputs and outputs are connected between stages by matching
their code:Location decorations.
Additionally, data can: be provided by or communicated to special functions
provided by the execution environment using code:BuiltIn decorations.

In many cases, the same code:BuiltIn decoration can: be used in multiple
shader stages with similar meaning.
The specific behavior of variables decorated as code:BuiltIn is documented
in the following sections.


ifdef::VK_NV_mesh_shader[]
[[shaders-task]]
== Task Shaders

Task shaders operate in conjunction with the mesh shaders to produce a
collection of primitives that will be processed by subsequent stages of the
graphics pipeline.
Their primary purpose is to create a variable number of subsequent mesh
shader invocations.

Task shaders are invoked via the execution of the
<<drawing-mesh-shading,programmable mesh shading>> pipeline.

The task shader has no fixed-function inputs other than variables
identifying the specific workgroup and invocation.
The only fixed output of the task shader is a task count, identifying the
number of mesh shader workgroups to create.
The task shader can write additional outputs to task memory, which can be
read by all of the mesh shader workgroups it created.


=== Task Shader Execution

Task workloads are formed from groups of work items called workgroups and
processed by the task shader in the current graphics pipeline.
A workgroup is a collection of shader invocations that execute the same
shader, potentially in parallel.
Task shaders execute in _global workgroups_ which are divided into a number
of _local workgroups_ with a size that can: be set by assigning a value to
the code:LocalSize
ifdef::VK_KHR_maintenance4[or code:LocalSizeId]
execution mode or via an object decorated by the code:WorkgroupSize
decoration.
An invocation within a local workgroup can: share data with other members of
the local workgroup through shared variables and issue memory and control
flow barriers to synchronize with other members of the local workgroup.


[[shaders-mesh]]
== Mesh Shaders

Mesh shaders operate in workgroups to produce a collection of primitives
that will be processed by subsequent stages of the graphics pipeline.
Each workgroup emits zero or more output primitives and the group of
vertices and their associated data required for each output primitive.

Mesh shaders are invoked via the execution of the
<<drawing-mesh-shading,programmable mesh shading>> pipeline.

The only inputs available to the mesh shader are variables identifying the
specific workgroup and invocation and, if applicable, any outputs written to
task memory by the task shader that spawned the mesh shader's workgroup.
The mesh shader can operate without a task shader as well.

The invocations of the mesh shader workgroup write an output mesh,
comprising a set of primitives with per-primitive attributes, a set of
vertices with per-vertex attributes, and an array of indices identifying the
mesh vertices that belong to each primitive.
The primitives of this mesh are then processed by subsequent graphics
pipeline stages, where the outputs of the mesh shader form an interface with
the fragment shader.


=== Mesh Shader Execution

Mesh workloads are formed from groups of work items called workgroups and
processed by the mesh shader in the current graphics pipeline.
A workgroup is a collection of shader invocations that execute the same
shader, potentially in parallel.
Mesh shaders execute in _global workgroups_ which are divided into a number
of _local workgroups_ with a size that can: be set by assigning a value to
the code:LocalSize
ifdef::VK_KHR_maintenance4[or code:LocalSizeId]
execution mode or via an object decorated by the code:WorkgroupSize
decoration.
An invocation within a local workgroup can: share data with other members of
the local workgroup through shared variables and issue memory and control
flow barriers to synchronize with other members of the local workgroup.

The _global workgroups_ may be generated explicitly via the API, or
implicitly through the task shader's work creation mechanism.
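
[NOTE]
.Note
====
As an informal illustration (not part of the normative text), the explicit
path launches task or mesh shader global workgroups directly from a draw
command recorded in a command buffer; the task count and first task values
below are arbitrary examples.

[source,c]
----
#include <vulkan/vulkan.h>

// Informal sketch: launch 64 task shader workgroups (or, if the bound mesh
// shading pipeline has no task shader, 64 mesh shader workgroups), starting
// at task index 0. The command comes from VK_NV_mesh_shader and is
// typically obtained via vkGetDeviceProcAddr.
static void recordMeshDraw(VkCommandBuffer commandBuffer)
{
    vkCmdDrawMeshTasksNV(commandBuffer, 64u /*taskCount*/, 0u /*firstTask*/);
}
----
====
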
endif::VK_NV_mesh_shader[]


[[shaders-vertex]]
== Vertex Shaders

Each vertex shader invocation operates on one vertex and its associated
<<fxvertex-attrib,vertex attribute>> data, and outputs one vertex and
associated data.
ifndef::VK_NV_mesh_shader[]
Graphics pipelines must: include a vertex shader, and the vertex shader
stage is always the first shader stage in the graphics pipeline.
endif::VK_NV_mesh_shader[]
ifdef::VK_NV_mesh_shader[]
Graphics pipelines using primitive shading must: include a vertex shader,
and the vertex shader stage is always the first shader stage in the graphics
pipeline.
endif::VK_NV_mesh_shader[]


[[shaders-vertex-execution]]
=== Vertex Shader Execution

A vertex shader must: be executed at least once for each vertex specified by
a drawing command.
ifdef::VK_VERSION_1_1,VK_KHR_multiview[]
If the subpass includes multiple views in its view mask, the shader may: be
invoked separately for each view.
endif::VK_VERSION_1_1,VK_KHR_multiview[]
During execution, the shader is presented with the index of the vertex and
instance for which it has been invoked.
Input variables declared in the vertex shader are filled by the
implementation with the values of vertex attributes associated with the
invocation being executed.

If the same vertex is specified multiple times in a drawing command (e.g. by
including the same index value multiple times in an index buffer) the
implementation may: reuse the results of vertex shading if it can statically
determine that the vertex shader invocations will produce identical results.

[NOTE]
.Note
====
It is implementation-dependent when and if results of vertex shading are
reused, and thus how many times the vertex shader will be executed.
This is true also if the vertex shader contains stores or atomic operations
(see <<features-vertexPipelineStoresAndAtomics,
pname:vertexPipelineStoresAndAtomics>>).
====


[[shaders-tessellation-control]]
== Tessellation Control Shaders

The tessellation control shader is used to read an input patch provided by
the application and to produce an output patch.
Each tessellation control shader invocation operates on an input patch
(after all control points in the patch are processed by a vertex shader) and
its associated data, and outputs a single control point of the output patch
and its associated data, and can: also output additional per-patch data.
The input patch is sized according to the pname:patchControlPoints member of
slink:VkPipelineTessellationStateCreateInfo, as part of input assembly.

ifdef::VK_EXT_extended_dynamic_state2[]
The input patch can also be dynamically sized with the
pname:patchControlPoints parameter of flink:vkCmdSetPatchControlPointsEXT.

[open,refpage='vkCmdSetPatchControlPointsEXT',desc='Specify the number of control points per patch dynamically for a command buffer',type='protos']
--
To <<pipelines-dynamic-state, dynamically set>> the number of control points
per patch, call:

include::{generated}/api/protos/vkCmdSetPatchControlPointsEXT.txt[]

  * pname:commandBuffer is the command buffer into which the command will be
    recorded.
  * pname:patchControlPoints specifies the number of control points per
    patch.

This command sets the number of control points per patch for subsequent
drawing commands when the graphics pipeline is created with
ename:VK_DYNAMIC_STATE_PATCH_CONTROL_POINTS_EXT set in
slink:VkPipelineDynamicStateCreateInfo::pname:pDynamicStates.
Otherwise, this state is specified by the
slink:VkPipelineTessellationStateCreateInfo::pname:patchControlPoints value
used to create the currently active pipeline.

.Valid Usage
****
  * [[VUID-vkCmdSetPatchControlPointsEXT-None-04873]]
    The <<features-extendedDynamicState2PatchControlPoints,
    extendedDynamicState2PatchControlPoints>> feature must: be enabled
  * [[VUID-vkCmdSetPatchControlPointsEXT-patchControlPoints-04874]]
    pname:patchControlPoints must: be greater than zero and less than or
    equal to sname:VkPhysicalDeviceLimits::pname:maxTessellationPatchSize
****

include::{generated}/validity/protos/vkCmdSetPatchControlPointsEXT.txt[]
--
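
[NOTE]
.Note
====
The following sketch is informative only.
It shows a pipeline declaring the patch control point count as dynamic
state, with the count then set at record time; the surrounding pipeline
setup is omitted and the chosen value of 4 control points is arbitrary.

[source,c]
----
#include <vulkan/vulkan.h>

// Informal sketch: leave the patch control point count dynamic at pipeline
// creation, then set it while recording. Requires the
// extendedDynamicState2PatchControlPoints feature; the command comes from
// VK_EXT_extended_dynamic_state2 and is typically obtained via
// vkGetDeviceProcAddr.
static const VkDynamicState dynamicStates[] = {
    VK_DYNAMIC_STATE_PATCH_CONTROL_POINTS_EXT,
};

static const VkPipelineDynamicStateCreateInfo dynamicStateInfo = {
    .sType             = VK_STRUCTURE_TYPE_PIPELINE_DYNAMIC_STATE_CREATE_INFO,
    .dynamicStateCount = 1,
    .pDynamicStates    = dynamicStates,
};
// dynamicStateInfo would be referenced by VkGraphicsPipelineCreateInfo::pDynamicState.

static void recordPatchDraw(VkCommandBuffer commandBuffer)
{
    // Subsequent patch-list draws consume 4 control points per patch.
    vkCmdSetPatchControlPointsEXT(commandBuffer, 4u);
    vkCmdDraw(commandBuffer, 4u /*vertexCount*/, 1u, 0u, 0u);
}
----
====
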
endif::VK_EXT_extended_dynamic_state2[]

The size of the output patch is controlled by the code:OpExecutionMode
code:OutputVertices specified in the tessellation control or tessellation
evaluation shaders, which must: be specified in at least one of the shaders.
The size of the input and output patches must: each be greater than zero and
less than or equal to
sname:VkPhysicalDeviceLimits::pname:maxTessellationPatchSize.


[[shaders-tessellation-control-execution]]
=== Tessellation Control Shader Execution

A tessellation control shader is invoked at least once for each _output_
vertex in a patch.
ifdef::VK_VERSION_1_1,VK_KHR_multiview[]
If the subpass includes multiple views in its view mask, the shader may: be
invoked separately for each view.
endif::VK_VERSION_1_1,VK_KHR_multiview[]

Inputs to the tessellation control shader are generated by the vertex
shader.
Each invocation of the tessellation control shader can: read the attributes
of any incoming vertices and their associated data.
The invocations corresponding to a given patch execute logically in
parallel, with undefined: relative execution order.
However, the code:OpControlBarrier instruction can: be used to provide
limited control of the execution order by synchronizing invocations within a
patch, effectively dividing tessellation control shader execution into a set
of phases.
Tessellation control shaders will read undefined: values if one invocation
reads a per-vertex or per-patch output written by another invocation at any
point during the same phase, or if two invocations attempt to write
different values to the same per-patch output in a single phase.


[[shaders-tessellation-evaluation]]
== Tessellation Evaluation Shaders

The tessellation evaluation shader operates on an input patch of control
points and their associated data, and a single input barycentric coordinate
indicating the invocation's relative position within the subdivided patch,
and outputs a single vertex and its associated data.


[[shaders-tessellation-evaluation-execution]]
=== Tessellation Evaluation Shader Execution

A tessellation evaluation shader is invoked at least once for each unique
vertex generated by the tessellator.
ifdef::VK_VERSION_1_1,VK_KHR_multiview[]
If the subpass includes multiple views in its view mask, the shader may: be
invoked separately for each view.
endif::VK_VERSION_1_1,VK_KHR_multiview[]


[[shaders-geometry]]
== Geometry Shaders

The geometry shader operates on a group of vertices and their associated
data assembled from a single input primitive, and emits zero or more output
primitives and the group of vertices and their associated data required for
each output primitive.
[[shaders-geometry-execution]]
=== Geometry Shader Execution

A geometry shader is invoked at least once for each primitive produced by
the tessellation stages, or at least once for each primitive generated by
<<drawing,primitive assembly>> when tessellation is not in use.
A shader can request that the geometry shader runs multiple
<<geometry-invocations, instances>>.
A geometry shader is invoked at least once for each instance.
ifdef::VK_VERSION_1_1,VK_KHR_multiview[]
If the subpass includes multiple views in its view mask, the shader may: be
invoked separately for each view.
endif::VK_VERSION_1_1,VK_KHR_multiview[]


[[shaders-fragment]]
== Fragment Shaders

Fragment shaders are invoked as a <<fragops-shader, fragment operation>> in
a graphics pipeline.
Each fragment shader invocation operates on a single fragment and its
associated data.
With few exceptions, fragment shaders do not have access to any data
associated with other fragments and are considered to execute in isolation
of fragment shader invocations associated with other fragments.


[[shaders-compute]]
== Compute Shaders

Compute shaders are invoked via flink:vkCmdDispatch and
flink:vkCmdDispatchIndirect commands.
In general, they have access to similar resources as shader stages executing
as part of a graphics pipeline.

Compute workloads are formed from groups of work items called workgroups and
processed by the compute shader in the current compute pipeline.
A workgroup is a collection of shader invocations that execute the same
shader, potentially in parallel.
Compute shaders execute in _global workgroups_ which are divided into a
number of _local workgroups_ with a size that can: be set by assigning a
value to the code:LocalSize
ifdef::VK_KHR_maintenance4[or code:LocalSizeId]
execution mode or via an object decorated by the code:WorkgroupSize
decoration.
An invocation within a local workgroup can: share data with other members of
the local workgroup through shared variables and issue memory and control
flow barriers to synchronize with other members of the local workgroup.
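
[NOTE]
.Note
====
As an informal, non-normative illustration, the total number of compute
shader invocations is the product of the global workgroup counts passed to
the dispatch command and the local workgroup size declared by the shader.
The 8x8x1 local size and 1920x1080 problem size below are arbitrary example
values.

[source,c]
----
#include <vulkan/vulkan.h>

// Informal sketch: dispatch enough 8x8x1 local workgroups to cover a
// 1920x1080 grid, rounding up so every element is covered.
// The bound compute pipeline's shader is assumed to declare LocalSize 8 8 1.
static void recordDispatch(VkCommandBuffer commandBuffer)
{
    const uint32_t localSizeX = 8u, localSizeY = 8u;
    const uint32_t width = 1920u, height = 1080u;

    vkCmdDispatch(commandBuffer,
                  (width  + localSizeX - 1u) / localSizeX,   // groupCountX = 240
                  (height + localSizeY - 1u) / localSizeY,   // groupCountY = 135
                  1u);                                       // groupCountZ
}
----
====
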
ifdef::VK_NV_ray_tracing,VK_KHR_ray_tracing_pipeline[]
[[shaders-raytracing-shaders]]
[[shaders-ray-generation]]
== Ray Generation Shaders

A ray generation shader is similar to a compute shader.
Its main purpose is to execute ray tracing queries using code:OpTraceRayKHR
instructions and process the results.


[[shaders-ray-generation-execution]]
=== Ray Generation Shader Execution

One ray generation shader is executed per ray tracing dispatch.
Its location in the shader binding table (see <<shader-binding-table,Shader
Binding Table>> for details) is passed directly into flink:vkCmdTraceRaysKHR
using the pname:pRaygenShaderBindingTable parameter.


[[shaders-intersection]]
== Intersection Shaders

Intersection shaders enable the implementation of arbitrary,
application-defined geometric primitives.
An intersection shader for a primitive is executed whenever its axis-aligned
bounding box is hit by a ray.

Like other ray tracing shader domains, an intersection shader operates on a
single ray at a time.
It also operates on a single primitive at a time.
It is therefore the purpose of an intersection shader to compute the
ray-primitive intersections and report them.
To report an intersection, the shader calls the code:OpReportIntersectionKHR
instruction.

An intersection shader communicates with any-hit and closest hit shaders by
generating attribute values that they can: read.
Intersection shaders cannot: read or modify the ray payload.


[[shaders-intersection-execution]]
=== Intersection Shader Execution

The order in which intersections are found along a ray, and therefore the
order in which intersection shaders are executed, is unspecified.

The intersection shader of the closest AABB which intersects the ray is
guaranteed to be executed at some point during traversal, unless the ray is
forcibly terminated.


[[shaders-any-hit]]
== Any-Hit Shaders

The any-hit shader is executed after the intersection shader reports an
intersection that lies within the current [eq]#[t~min~,t~max~]# of the ray.
The main use of any-hit shaders is to programmatically decide whether or not
an intersection will be accepted.
The intersection will be accepted unless the shader calls the
code:OpIgnoreIntersectionKHR instruction.
Any-hit shaders have read-only access to the attributes generated by the
corresponding intersection shader, and can: read or modify the ray payload.


[[shaders-any-hit-execution]]
=== Any-Hit Shader Execution

The order in which intersections are found along a ray, and therefore the
order in which any-hit shaders are executed, is unspecified.

The any-hit shader of the closest hit is guaranteed to be executed at some
point during traversal, unless the ray is forcibly terminated.


[[shaders-closest-hit]]
== Closest Hit Shaders

Closest hit shaders have read-only access to the attributes generated by the
corresponding intersection shader, and can: read or modify the ray payload.
They also have access to a number of system-generated values.
Closest hit shaders can: call code:OpTraceRayKHR to recursively trace rays.


[[shaders-closest-hit-execution]]
=== Closest Hit Shader Execution

Exactly one closest hit shader is executed when traversal is finished and an
intersection has been found and accepted.


[[shaders-miss]]
== Miss Shaders

Miss shaders can: access the ray payload and can: trace new rays through the
code:OpTraceRayKHR instruction, but cannot: access attributes since they are
not associated with an intersection.


[[shaders-miss-execution]]
=== Miss Shader Execution

A miss shader is executed instead of a closest hit shader if no intersection
was found during traversal.


[[shaders-callable]]
== Callable Shaders

Callable shaders can: access a callable payload that works similarly to ray
payloads to do subroutine work.


[[shaders-callable-execution]]
=== Callable Shader Execution

A callable shader is executed by calling code:OpExecuteCallableKHR from an
allowed shader stage.
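
[NOTE]
.Note
====
The following non-normative sketch ties the ray tracing stages above
together: a trace command selects the single ray generation shader through
the ray generation region of the shader binding table, while the miss, hit,
and callable regions select the remaining stages.
The device addresses, strides, and sizes in the regions are assumed to have
been set up when the shader binding table buffer was built, and the
1920x1080x1 dispatch size is an arbitrary example.

[source,c]
----
#include <vulkan/vulkan.h>

// Informal sketch: launch a 1920x1080 ray tracing dispatch.
// Each VkStridedDeviceAddressRegionKHR is assumed to point at the
// corresponding group of shader group handles in a shader binding table.
static void recordTraceRays(VkCommandBuffer commandBuffer,
                            const VkStridedDeviceAddressRegionKHR* raygenRegion,
                            const VkStridedDeviceAddressRegionKHR* missRegion,
                            const VkStridedDeviceAddressRegionKHR* hitRegion,
                            const VkStridedDeviceAddressRegionKHR* callableRegion)
{
    vkCmdTraceRaysKHR(commandBuffer,
                      raygenRegion,    // one ray generation shader per dispatch
                      missRegion,
                      hitRegion,       // intersection/any-hit/closest hit groups
                      callableRegion,
                      1920u, 1080u, 1u);
}
----
====
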
endif::VK_NV_ray_tracing,VK_KHR_ray_tracing_pipeline[]


[[shaders-interpolation-decorations]]
== Interpolation Decorations

Interpolation decorations control the behavior of attribute interpolation in
the fragment shader stage.
Interpolation decorations can: be applied to code:Input storage class
variables in the fragment shader stage's interface, and control the
interpolation behavior of those variables.

Inputs that could be interpolated can: be decorated by at most one of the
following decorations:

  * code:Flat: no interpolation
  * code:NoPerspective: linear interpolation (for
    <<line_linear_interpolation,lines>> and
    <<triangle_linear_interpolation,polygons>>)
ifdef::VK_NV_fragment_shader_barycentric[]
  * code:PerVertexNV: values fetched from a shader-specified vertex of the
    primitive
endif::VK_NV_fragment_shader_barycentric[]

Fragment input variables decorated with neither code:Flat nor
code:NoPerspective use perspective-correct interpolation (for
<<line_perspective_interpolation,lines>> and
<<triangle_perspective_interpolation,polygons>>).

The presence and type of interpolation is controlled by the above
interpolation decorations as well as the auxiliary decorations code:Centroid
and code:Sample.

A variable decorated with code:Flat will not be interpolated.
Instead, it will have the same value for every fragment within a triangle.
This value will come from a single <<vertexpostproc-flatshading,provoking
vertex>>.
A variable decorated with code:Flat can: also be decorated with
code:Centroid or code:Sample, which will mean the same thing as decorating
it only as code:Flat.

For fragment shader input variables decorated with neither code:Centroid nor
code:Sample, the assigned variable may: be interpolated anywhere within the
fragment and a single value may: be assigned to each sample within the
fragment.

If a fragment shader input is decorated with code:Centroid, a single value
may: be assigned to that variable for all samples in the fragment, but that
value must: be interpolated to a location that lies in both the fragment and
in the primitive being rendered, including any of the fragment's samples
covered by the primitive.
Because the location at which the variable is interpolated may: be different
in neighboring fragments, and derivatives may: be computed by computing
differences between neighboring fragments, derivatives of centroid-sampled
inputs may: be less accurate than those for non-centroid interpolated
variables.
ifdef::VK_EXT_post_depth_coverage[]
The code:PostDepthCoverage execution mode does not affect the determination
of the centroid location.
endif::VK_EXT_post_depth_coverage[]

If a fragment shader input is decorated with code:Sample, a separate value
must: be assigned to that variable for each covered sample in the fragment,
and that value must: be sampled at the location of the individual sample.
When pname:rasterizationSamples is ename:VK_SAMPLE_COUNT_1_BIT, the fragment
center must: be used for code:Centroid, code:Sample, and undecorated
attribute interpolation.

Fragment shader inputs that are signed or unsigned integers, integer
vectors, or any double-precision floating-point type must: be decorated with
code:Flat.

ifdef::VK_AMD_shader_explicit_vertex_parameter[]
When the `apiext:VK_AMD_shader_explicit_vertex_parameter` device extension
is enabled, inputs can: also be decorated with the code:CustomInterpAMD
interpolation decoration, including fragment shader inputs that are signed
or unsigned integers, integer vectors, or any double-precision
floating-point type.
Inputs decorated with code:CustomInterpAMD can: only be accessed by the
extended instruction code:InterpolateAtVertexAMD, which allows accessing the
value of the input for individual vertices of the primitive.
endif::VK_AMD_shader_explicit_vertex_parameter[]

ifdef::VK_NV_fragment_shader_barycentric[]
[[shaders-interpolation-decorations-pervertexnv]]
When the pname:fragmentShaderBarycentric feature is enabled, inputs can:
also be decorated with the code:PerVertexNV interpolation decoration,
including fragment shader inputs that are signed or unsigned integers,
integer vectors, or any double-precision floating-point type.
Inputs decorated with code:PerVertexNV can: only be accessed using an extra
array dimension, where the extra index identifies one of the vertices of the
primitive that produced the fragment.
endif::VK_NV_fragment_shader_barycentric[]


[[shaders-staticuse]]
== Static Use

A SPIR-V module declares a global object in memory using the code:OpVariable
instruction, which results in a pointer code:x to that object.
A specific entry point in a SPIR-V module is said to _statically use_ that
object if that entry point's call tree contains a function containing a
memory instruction or image instruction with code:x as an code:id operand.
See the "`Memory Instructions`" and "`Image Instructions`" subsections of
section 3 "`Binary Form`" of the SPIR-V specification for the complete list
of SPIR-V memory instructions.

Static use is not used to control the behavior of variables with code:Input
and code:Output storage.
The effects of those variables are applied based only on whether they are
present in a shader entry point's interface.


[[shaders-scope]]
== Scope

A _scope_ describes a set of shader invocations, where each such set is a
_scope instance_.
Each invocation belongs to one or more scope instances, but belongs to no
more than one scope instance for each scope.

The operations available between invocations in a given scope instance vary,
with smaller scopes generally able to perform more operations, and with
greater efficiency.


[[shaders-scope-cross-device]]
=== Cross Device

All invocations executed in a Vulkan instance fall into a single _cross
device scope instance_.

Whilst the code:CrossDevice scope is defined in SPIR-V, it is disallowed in
Vulkan.
API <<synchronization, synchronization>> commands can: be used to
communicate between devices.


[[shaders-scope-device]]
=== Device

All invocations executed on a single device form a _device scope instance_.

ifdef::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[]
If the <<features-vulkanMemoryModel,pname:vulkanMemoryModel>> and
<<features-vulkanMemoryModelDeviceScope,
pname:vulkanMemoryModelDeviceScope>> features are enabled, this scope is
represented in SPIR-V by the code:Device code:Scope, which can: be used as a
code:Memory code:Scope for barrier and atomic operations.

ifdef::VK_KHR_shader_clock[]
If both the <<features-shaderDeviceClock, pname:shaderDeviceClock>> and
<<features-vulkanMemoryModelDeviceScope,
pname:vulkanMemoryModelDeviceScope>> features are enabled, using the
code:Device code:Scope with the code:OpReadClockKHR instruction will read
from a clock that is consistent across invocations in the same device scope
instance.
endif::VK_KHR_shader_clock[]
endif::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[]

There is no method to synchronize the execution of these invocations within
SPIR-V, and this can: only be done with API synchronization primitives.

ifdef::VK_VERSION_1_1,VK_KHR_device_group[]
Invocations executing on different devices in a device group operate in
separate device scope instances.
endif::VK_VERSION_1_1,VK_KHR_device_group[]

ifndef::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[]
The scope only extends to the queue family, not the whole device.
endif::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[]


[[shaders-scope-queue-family]]
=== Queue Family

Invocations executed by queues in a given queue family form a _queue family
scope instance_.

This scope is identified in SPIR-V as the
ifdef::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[]
code:QueueFamily code:Scope if the
<<features-vulkanMemoryModel,pname:vulkanMemoryModel>> feature is enabled,
or if not, the
endif::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[]
code:Device code:Scope, which can: be used as a code:Memory code:Scope for
barrier and atomic operations.

ifdef::VK_KHR_shader_clock[]
If the <<features-shaderDeviceClock, pname:shaderDeviceClock>> feature is
enabled,
ifdef::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[]
but the <<features-vulkanMemoryModelDeviceScope,
pname:vulkanMemoryModelDeviceScope>> feature is not enabled,
endif::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[]
using the code:Device code:Scope with the code:OpReadClockKHR instruction
will read from a clock that is consistent across invocations in the same
queue family scope instance.
endif::VK_KHR_shader_clock[]

There is no method to synchronize the execution of these invocations within
SPIR-V, and this can: only be done with API synchronization primitives.

Each invocation in a queue family scope instance must: be in the same
<<shaders-scope-device, device scope instance>>.


[[shaders-scope-command]]
=== Command

Any shader invocations executed as the result of a single command such as
flink:vkCmdDispatch or flink:vkCmdDraw form a _command scope instance_.
For indirect drawing commands with pname:drawCount greater than one,
invocations from separate draws are in separate command scope instances.
ifdef::VK_KHR_ray_tracing_pipeline,VK_NV_ray_tracing[]
For ray tracing shaders, an invocation group is an implementation-dependent
subset of the set of shader invocations of a given shader stage which are
produced by a single trace rays command.
endif::VK_KHR_ray_tracing_pipeline,VK_NV_ray_tracing[]

There is no specific code:Scope for communication across invocations in a
command scope instance.
As this has a clear boundary at the API level, coordination here can: be
performed in the API, rather than in SPIR-V.

Each invocation in a command scope instance must: be in the same
<<shaders-scope-queue-family, queue family scope instance>>.

For shaders without defined <<shaders-scope-workgroup, workgroups>>, this
set of invocations forms an _invocation group_ as defined in the
<<spirv-spec,SPIR-V specification>>.


[[shaders-scope-primitive]]
=== Primitive

Any fragment shader invocations executed as the result of rasterization of a
single primitive form a _primitive scope instance_.
There is no specific code:Scope for communication across invocations in a
primitive scope instance.

Any generated <<shaders-helper-invocations, helper invocations>> are
included in this scope instance.

Each invocation in a primitive scope instance must: be in the same
<<shaders-scope-command, command scope instance>>.

Any input variables decorated with code:Flat are uniform within a primitive
scope instance.


// intentionally no VK_NV_ray_tracing here since this scope does not exist there
ifdef::VK_KHR_ray_tracing_pipeline[]
[[shaders-scope-shadercall]]
=== Shader Call

Any <<shader-call-related,shader-call-related>> invocations that are
executed in one or more ray tracing execution models form a _shader call
scope instance_.

The code:ShaderCallKHR code:Scope can be used as a code:Memory code:Scope
for barrier and atomic operations.

Each invocation in a shader call scope instance must: be in the same
<<shaders-scope-queue-family, queue family scope instance>>.
endif::VK_KHR_ray_tracing_pipeline[]


[[shaders-scope-workgroup]]
=== Workgroup

A _local workgroup_ is a set of invocations that can synchronize and share
data with each other using memory in the code:Workgroup storage class.

The code:Workgroup code:Scope can be used as both an code:Execution
code:Scope and code:Memory code:Scope for barrier and atomic operations.

Each invocation in a local workgroup must: be in the same
<<shaders-scope-command, command scope instance>>.

Only
ifdef::VK_NV_mesh_shader[]
task, mesh, and
endif::VK_NV_mesh_shader[]
compute shaders have defined workgroups - other shader types cannot: use
workgroup functionality.
For shaders that have defined workgroups, this set of invocations forms an
_invocation group_ as defined in the <<spirv-spec,SPIR-V specification>>.


ifdef::VK_VERSION_1_1[]
[[shaders-scope-subgroup]]
=== Subgroup

A _subgroup_ (see the subsection "`Control Flow`" of section 2 of the SPIR-V
1.3 Revision 1 specification) is a set of invocations that can synchronize
and share data with each other efficiently.

The code:Subgroup code:Scope can be used as both an code:Execution
code:Scope and code:Memory code:Scope for barrier and atomic operations.
Other <<VkSubgroupFeatureFlagBits, subgroup features>> allow the use of
<<shaders-group-operations, group operations>> with subgroup scope.

ifdef::VK_KHR_shader_clock[]
If the <<features-shaderSubgroupClock, pname:shaderSubgroupClock>> feature
is enabled, using the code:Subgroup code:Scope with the code:OpReadClockKHR
instruction will read from a clock that is consistent across invocations in
the same subgroup.
endif::VK_KHR_shader_clock[]

For <<shaders-scope-workgroup, shaders that have defined workgroups>>, each
invocation in a subgroup must: be in the same <<shaders-scope-workgroup,
local workgroup>>.

In other shader stages, each invocation in a subgroup must: be in the same
<<shaders-scope-device, device scope instance>>.

Only <<limits-subgroup-supportedStages, shader stages that support subgroup
operations>> have defined subgroups.
endif::VK_VERSION_1_1[]


[[shaders-scope-quad]]
=== Quad

A _quad scope instance_ is formed of four shader invocations.
In a fragment shader, a quad scope instance is formed of invocations in
neighboring framebuffer locations [eq]#(x~i~, y~i~)#, where:

  * [eq]#i# is the index of the invocation within the scope instance.
  * [eq]#w# and [eq]#h# are the number of pixels the fragment covers in the
    [eq]#x# and [eq]#y# axes.
  * [eq]#w# and [eq]#h# are identical for all participating invocations.
  * [eq]#(x~0~) = (x~1~ - w) = (x~2~) = (x~3~ - w)#
  * [eq]#(y~0~) = (y~1~) = (y~2~ - h) = (y~3~ - h)#
  * Each invocation has the same layer and sample indices.

ifdef::VK_NV_compute_shader_derivatives[]
In a compute shader, if the code:DerivativeGroupQuadsNV execution mode is
specified, a quad scope instance is formed of invocations with adjacent
local invocation IDs [eq]#(x~i~, y~i~)#, where:

  * [eq]#i# is the index of the invocation within the quad scope instance.
  * [eq]#(x~0~) = (x~1~ - 1) = (x~2~) = (x~3~ - 1)#
  * [eq]#(y~0~) = (y~1~) = (y~2~ - 1) = (y~3~ - 1)#
  * [eq]#x~0~# and [eq]#y~0~# are integer multiples of 2.
  * Each invocation has the same [eq]#z# coordinate.

In a compute shader, if the code:DerivativeGroupLinearNV execution mode is
specified, a quad scope instance is formed of invocations with adjacent
local invocation indices [eq]#(l~i~)#, where:

  * [eq]#i# is the index of the invocation within the quad scope instance.
  * [eq]#(l~0~) = (l~1~ - 1) = (l~2~ - 2) = (l~3~ - 3)#
  * [eq]#l~0~# is an integer multiple of 4.

endif::VK_NV_compute_shader_derivatives[]

ifdef::VK_VERSION_1_1[]
In all shaders, a quad scope instance is formed of invocations with adjacent
subgroup invocation indices [eq]#(s~i~)#, where:

  * [eq]#i# is the index of the invocation within the quad scope instance.
  * [eq]#(s~0~) = (s~1~ - 1) = (s~2~ - 2) = (s~3~ - 3)#
  * [eq]#s~0~# is an integer multiple of 4.

Each invocation in a quad scope instance must: be in the same
<<shaders-scope-subgroup, subgroup>>.
endif::VK_VERSION_1_1[]

ifndef::VK_VERSION_1_1[]
The specific set of invocations that make up a quad scope instance in other
shader stages is undefined:.
endif::VK_VERSION_1_1[]

In a fragment shader, each invocation in a quad scope instance must: be in
the same <<shaders-scope-primitive, primitive scope instance>>.

ifndef::VK_VERSION_1_1[]
For <<shaders-scope-workgroup, shaders that have defined workgroups>>, each
invocation in a quad scope instance must: be in the same
<<shaders-scope-workgroup, local workgroup>>.

In other shader stages, each invocation in a quad scope instance must: be in
the same <<shaders-scope-device, device scope instance>>.
endif::VK_VERSION_1_1[]

Fragment
ifdef::VK_NV_compute_shader_derivatives,VK_VERSION_1_1[]
and compute
endif::VK_NV_compute_shader_derivatives,VK_VERSION_1_1[]
shaders have defined quad scope instances.
ifdef::VK_VERSION_1_1[]
If the <<limits-subgroup-quadOperationsInAllStages,
pname:quadOperationsInAllStages>> limit is supported, any
<<limits-subgroup-supportedStages, shader stages that support subgroup
operations>> also have defined quad scope instances.
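
[NOTE]
.Note
====
As a non-normative illustration of the index mappings above, the following
helper computes, for a given subgroup invocation index, the subgroup
invocation indices of all four members of its quad scope instance.

[source,c]
----
#include <stdint.h>

// Informal sketch: invocations s0..s3 of a quad have consecutive subgroup
// invocation indices starting at a multiple of 4, so the quad containing
// invocation s starts at s rounded down to a multiple of 4.
static void quadMembers(uint32_t s, uint32_t members[4])
{
    const uint32_t s0 = s & ~3u;   // s0 is an integer multiple of 4
    for (uint32_t i = 0; i < 4; ++i)
        members[i] = s0 + i;       // s_i = s_0 + i
}
----
====
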
endif::VK_VERSION_1_1[]


ifdef::VK_EXT_fragment_shader_interlock[]
[[shaders-scope-fragment-interlock]]
=== Fragment Interlock

A _fragment interlock scope instance_ is formed of fragment shader
invocations based on their framebuffer locations [eq]#(x,y,layer,sample)#,
executed by commands inside a single <<renderpass,subpass>>.

The specific set of invocations included varies based on the execution mode
as follows:

  * If the code:SampleInterlockOrderedEXT or
    code:SampleInterlockUnorderedEXT execution modes are used, only
    invocations with identical framebuffer locations
    [eq]#(x,y,layer,sample)# are included.
  * If the code:PixelInterlockOrderedEXT or code:PixelInterlockUnorderedEXT
    execution modes are used, fragments with different sample ids are also
    included.
ifdef::VK_NV_shading_rate_image,VK_KHR_fragment_shading_rate[]
  * If the code:ShadingRateInterlockOrderedEXT or
    code:ShadingRateInterlockUnorderedEXT execution modes are used,
    fragments from neighboring framebuffer locations are also included, as
    <<primsrast-shading-rate-image,determined by the shading rate>>.
endif::VK_NV_shading_rate_image,VK_KHR_fragment_shading_rate[]

Only fragment shaders with one of the above execution modes have defined
fragment interlock scope instances.

There is no specific code:Scope value for communication across invocations
in a fragment interlock scope instance.
However, this is implicitly used as a memory scope by
code:OpBeginInvocationInterlockEXT and code:OpEndInvocationInterlockEXT.

Each invocation in a fragment interlock scope instance must: be in the same
<<shaders-scope-queue-family, queue family scope instance>>.
endif::VK_EXT_fragment_shader_interlock[]


[[shaders-scope-invocation]]
=== Invocation

The smallest _scope_ is a single invocation; this is represented by the
code:Invocation code:Scope in SPIR-V.

Fragment shader invocations must: be in a <<shaders-scope-primitive,
primitive scope instance>>.

ifdef::VK_EXT_fragment_shader_interlock[]
Invocations in <<shaders-scope-fragment-interlock, fragment shaders that
have a defined fragment interlock scope>> must: be in a
<<shaders-scope-fragment-interlock, fragment interlock scope instance>>.
endif::VK_EXT_fragment_shader_interlock[]

Invocations in <<shaders-scope-workgroup, shaders that have defined
workgroups>> must: be in a <<shaders-scope-workgroup, local workgroup>>.

ifdef::VK_VERSION_1_1[]
Invocations in <<shaders-scope-subgroup, shaders that have a defined
subgroup scope>> must: be in a <<shaders-scope-subgroup, subgroup>>.
endif::VK_VERSION_1_1[]

Invocations in <<shaders-scope-quad, shaders that have a defined quad
scope>> must: be in a <<shaders-scope-quad, quad scope instance>>.

All invocations in all stages must: be in a <<shaders-scope-command,command
scope instance>>.


ifdef::VK_VERSION_1_1[]
[[shaders-group-operations]]
== Group Operations

_Group operations_ are executed by multiple invocations within a
<<shaders-scope, scope instance>>, with each invocation involved in
calculating the result.
This provides a mechanism for efficient communication between invocations in
a particular scope instance.

Group operations all take a code:Scope defining the desired
<<shaders-scope,scope instance>> to operate within.
Only the code:Subgroup scope can: be used for these operations; the
<<limits-subgroupSupportedOperations, pname:subgroupSupportedOperations>>
limit defines which types of operation can: be used.
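
[NOTE]
.Note
====
As an informal illustration, an application can query which classes of group
operation, and which shader stages, are supported at subgroup scope through
slink:VkPhysicalDeviceSubgroupProperties; the check below for arithmetic
operations in compute shaders is just an example.

[source,c]
----
#include <stdbool.h>
#include <vulkan/vulkan.h>

// Informal sketch: query subgroup properties (Vulkan 1.1) and test whether
// arithmetic group operations are supported in compute shaders.
static bool supportsComputeSubgroupArithmetic(VkPhysicalDevice physicalDevice)
{
    VkPhysicalDeviceSubgroupProperties subgroupProperties = {
        .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SUBGROUP_PROPERTIES,
    };
    VkPhysicalDeviceProperties2 properties2 = {
        .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROPERTIES_2,
        .pNext = &subgroupProperties,
    };
    vkGetPhysicalDeviceProperties2(physicalDevice, &properties2);

    return (subgroupProperties.supportedStages & VK_SHADER_STAGE_COMPUTE_BIT) &&
           (subgroupProperties.supportedOperations & VK_SUBGROUP_FEATURE_ARITHMETIC_BIT);
}
----
====
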
[[shaders-group-operations-basic]]
=== Basic Group Operations

Basic group operations include the use of code:OpGroupNonUniformElect,
code:OpControlBarrier, code:OpMemoryBarrier, and atomic operations.

code:OpGroupNonUniformElect can: be used to choose a single invocation to
perform a task for the whole group.
Only the invocation with the lowest id in the group will return code:true.

The <<memory-model,Memory Model>> appendix defines the operation of barriers
and atomics.


[[shaders-group-operations-vote]]
=== Vote Group Operations

The vote group operations allow invocations within a group to compare values
across a group.
The types of votes enabled are:

  * Do all active group invocations agree that an expression is true?
  * Do any active group invocations evaluate an expression to true?
  * Do all active group invocations have the same value of an expression?

[NOTE]
.Note
====
These operations are useful in combination with control flow in that they
allow developers to check whether conditions match across the group and
choose potentially faster code-paths in these cases.
====


[[shaders-group-operations-arithmetic]]
=== Arithmetic Group Operations

The arithmetic group operations allow invocations to perform scans and
reductions across a group.
The operators supported are add, mul, min, max, and, or, xor.

For reductions, every invocation in a group will obtain the cumulative
result of these operators applied to all values in the group.
For exclusive scans, each invocation in a group will obtain the cumulative
result of these operators applied to all values in invocations with a lower
index in the group.
Inclusive scans are identical to exclusive scans, except the cumulative
result includes the operator applied to the value in the current invocation.

The order in which these operators are applied is implementation-dependent.


[[shaders-group-operations-ballot]]
=== Ballot Group Operations

The ballot group operations allow invocations to perform more complex votes
across the group.
The ballot functionality allows all invocations within a group to provide a
boolean value and to retrieve the boolean value that each invocation
provided.
The broadcast functionality allows values to be broadcast from an invocation
to all other invocations within the group.


[[shaders-group-operations-shuffle]]
=== Shuffle Group Operations

The shuffle group operations allow invocations to read values from other
invocations within a group.


[[shaders-group-operations-shuffle-relative]]
=== Shuffle Relative Group Operations

The shuffle relative group operations allow invocations to read values from
other invocations within the group relative to the current invocation in the
group.
The relative operations supported allow data to be shifted up and down
through the invocations within a group.


[[shaders-group-operations-clustered]]
=== Clustered Group Operations

The clustered group operations allow invocations to perform an operation
among partitions of a group, such that the operation is only performed
among the invocations within a partition.
The partitions for clustered group operations are consecutive power-of-two
size groups of invocations, and the cluster size must: be known at pipeline
creation time.
The operations supported are add, mul, min, max, and, or, xor.


[[shaders-quad-operations]]
== Quad Group Operations

Quad group operations (code:OpGroupNonUniformQuad*) are a specialized type
of <<shaders-group-operations, group operations>> that only operate on
<<shaders-scope-quad, quad scope instances>>.
Whilst these instructions do include a code:Scope parameter, this scope is
always overridden; only the <<shaders-scope-quad, quad scope instance>> is
included in its execution scope.

Fragment shaders that statically execute quad group operations must: launch
sufficient invocations to ensure their correct operation; additional
<<shaders-helper-invocations, helper invocations>> are launched for
framebuffer locations not covered by rasterized fragments if necessary.

The index used to select participating invocations is [eq]#i#, as described
for a <<shaders-scope-quad, quad scope instance>>, defined as the _quad
index_ in the <<spirv-spec,SPIR-V specification>>.

For code:OpGroupNonUniformQuadBroadcast, this value is equal to code:Index.
For code:OpGroupNonUniformQuadSwap, it is equal to the implicit code:Index
used by each participating invocation.
endif::VK_VERSION_1_1[]


[[shaders-derivative-operations]]
== Derivative Operations

Derivative operations calculate the partial derivative for an expression
[eq]#P# as a function of an invocation's [eq]#x# and [eq]#y# coordinates.

Derivative operations operate on a set of invocations known as a
_derivative group_ as defined in the <<spirv-spec,SPIR-V specification>>.
A derivative group is equivalent to
ifdef::VK_NV_compute_shader_derivatives[]
the <<shaders-scope-quad, quad scope instance>> for a compute shader
invocation, or
endif::VK_NV_compute_shader_derivatives[]
the <<shaders-scope-primitive, primitive scope instance>> for a fragment
shader invocation.

Derivatives are calculated assuming that [eq]#P# is piecewise linear and
continuous within the derivative group.
All dynamic instances of explicit derivative instructions (code:OpDPdx*,
code:OpDPdy*, and code:OpFwidth*) must: be executed in control flow that is
uniform within a derivative group.
For other derivative operations, results are undefined: if a dynamic
instance is executed in control flow that is not uniform within the
derivative group.

Fragment shaders that statically execute derivative operations must: launch
sufficient invocations to ensure their correct operation; additional
<<shaders-helper-invocations, helper invocations>> are launched for
framebuffer locations not covered by rasterized fragments if necessary.

ifdef::VK_NV_compute_shader_derivatives[]
[NOTE]
.Note
====
In a compute shader, it is the application's responsibility to ensure that
sufficient invocations are launched.
====
endif::VK_NV_compute_shader_derivatives[]

Derivative operations calculate their results as the difference between the
results of [eq]#P# across invocations in the quad.
For fine derivative operations (code:OpDPdxFine and code:OpDPdyFine), the
values of [eq]#DPdx(P~i~)# are calculated as

  {empty}:: [eq]#DPdx(P~0~) = DPdx(P~1~) = P~1~ - P~0~#
  {empty}:: [eq]#DPdx(P~2~) = DPdx(P~3~) = P~3~ - P~2~#

and the values of [eq]#DPdy(P~i~)# are calculated as

  {empty}:: [eq]#DPdy(P~0~) = DPdy(P~2~) = P~2~ - P~0~#
  {empty}:: [eq]#DPdy(P~1~) = DPdy(P~3~) = P~3~ - P~1~#

where [eq]#i# is the index of each invocation as described in
<<shaders-scope-quad>>.

Coarse derivative operations (code:OpDPdxCoarse and code:OpDPdyCoarse)
calculate their results in roughly the same manner, but may: only calculate
two values instead of four (one for each of [eq]#DPdx# and [eq]#DPdy#),
reusing the same result no matter the originating invocation.
If an implementation does this, it should: use the fine derivative
calculations described for [eq]#P~0~#.

[NOTE]
.Note
====
Derivative values are calculated between fragments rather than pixels.
If the fragment shader invocations involved in the calculation cover
multiple pixels, these operations cover a wider area, resulting in larger
derivative values.
This in turn will result in a coarser level of detail being selected for
image sampling operations using derivatives.

Applications may want to account for this when using multi-pixel fragments;
if pixel derivatives are desired, applications should use explicit
derivative operations and divide the results by the size of the fragment in
each dimension as follows:

  {empty}:: [eq]#DPdx(P~n~)' = DPdx(P~n~) / w#
  {empty}:: [eq]#DPdy(P~n~)' = DPdy(P~n~) / h#

where [eq]#w# and [eq]#h# are the size of the fragments in the quad, and
[eq]#DPdx(P~n~)'# and [eq]#DPdy(P~n~)'# are the pixel derivatives.
====

The results for code:OpDPdx and code:OpDPdy may: be calculated as either
fine or coarse derivatives, with implementations favoring the most
efficient approach.
Implementations must: choose coarse or fine consistently between the two.

Executing code:OpFwidthFine, code:OpFwidthCoarse, or code:OpFwidth is
equivalent to executing the corresponding code:OpDPdx* and code:OpDPdy*
instructions, taking the absolute value of the results, and summing them.

Executing an code:OpImage*Sample*ImplicitLod instruction is equivalent to
executing code:OpDPdx(code:Coordinate) and code:OpDPdy(code:Coordinate), and
passing the results as the code:Grad operands code:dx and code:dy.

[NOTE]
.Note
====
It is expected that using the code:ImplicitLod variants of sampling
functions will be substantially more efficient than using the
code:ExplicitLod variants with explicitly generated derivatives.
====


[[shaders-helper-invocations]]
== Helper Invocations

When performing <<shaders-derivative-operations, derivative>>
ifdef::VK_VERSION_1_1[]
or <<shaders-quad-operations, quad group>>
endif::VK_VERSION_1_1[]
operations in a fragment shader, additional invocations may: be spawned in
order to ensure correct results.
These additional invocations are known as _helper invocations_ and can: be
identified by a non-zero value in the code:HelperInvocation built-in.
Stores and atomics performed by helper invocations must: not have any
effect on memory, and values returned by atomic instructions in helper
invocations are undefined:.

For <<shaders-group-operations, group operations>> other than
<<shaders-derivative-operations, derivative>>
ifdef::VK_VERSION_1_1[]
and <<shaders-quad-operations, quad group>>
endif::VK_VERSION_1_1[]
operations, helper invocations may: be treated as inactive even if they
would otherwise be considered active.

ifdef::VK_EXT_shader_demote_to_helper_invocation[]
Helper invocations may: become permanently inactive if all invocations in a
quad scope instance become helper invocations.
endif::VK_EXT_shader_demote_to_helper_invocation[]


ifdef::VK_NV_cooperative_matrix[]
== Cooperative Matrices

A _cooperative matrix_ type is a SPIR-V type where the storage for, and
computations performed on, the matrix are spread across the invocations in
a scope instance.
These types give the implementation freedom in how to optimize matrix
multiplies.

SPIR-V defines the types and instructions, but does not specify rules about
which sizes and combinations are valid, and it is expected that different
implementations may: support different sizes.

[open,refpage='vkGetPhysicalDeviceCooperativeMatrixPropertiesNV',desc='Returns properties describing what cooperative matrix types are supported',type='protos']
--
To enumerate the supported cooperative matrix types and operations, call:

include::{generated}/api/protos/vkGetPhysicalDeviceCooperativeMatrixPropertiesNV.txt[]

 * pname:physicalDevice is the physical device.
 * pname:pPropertyCount is a pointer to an integer related to the number of
   cooperative matrix properties available or queried.
 * pname:pProperties is either `NULL` or a pointer to an array of
   slink:VkCooperativeMatrixPropertiesNV structures.

If pname:pProperties is `NULL`, then the number of cooperative matrix
properties available is returned in pname:pPropertyCount.
Otherwise, pname:pPropertyCount must: point to a variable set by the user to
the number of elements in the pname:pProperties array, and on return the
variable is overwritten with the number of structures actually written to
pname:pProperties.
If pname:pPropertyCount is less than the number of cooperative matrix
properties available, at most pname:pPropertyCount structures will be
written, and ename:VK_INCOMPLETE will be returned instead of
ename:VK_SUCCESS, to indicate that not all the available cooperative matrix
properties were returned.

include::{generated}/validity/protos/vkGetPhysicalDeviceCooperativeMatrixPropertiesNV.txt[]
--

[open,refpage='VkCooperativeMatrixPropertiesNV',desc='Structure specifying cooperative matrix properties',type='structs']
--
Each sname:VkCooperativeMatrixPropertiesNV structure describes a single
supported combination of types for a matrix multiply/add operation
(code:OpCooperativeMatrixMulAddNV).
The multiply can: be described in terms of the following variables and types
(in SPIR-V pseudocode):

[source,c]
~~~~
 %A is of type OpTypeCooperativeMatrixNV %AType %scope %MSize %KSize
 %B is of type OpTypeCooperativeMatrixNV %BType %scope %KSize %NSize
 %C is of type OpTypeCooperativeMatrixNV %CType %scope %MSize %NSize
 %D is of type OpTypeCooperativeMatrixNV %DType %scope %MSize %NSize

 %D = %A * %B + %C // using OpCooperativeMatrixMulAddNV
~~~~

A matrix multiply with these dimensions is known as an _MxNxK_ matrix
multiply.

The sname:VkCooperativeMatrixPropertiesNV structure is defined as:

include::{generated}/api/structs/VkCooperativeMatrixPropertiesNV.txt[]

 * pname:sType is the type of this structure.
 * pname:pNext is `NULL` or a pointer to a structure extending this
   structure.
 * pname:MSize is the number of rows in matrices A, C, and D.
 * pname:KSize is the number of columns in matrix A and rows in matrix B.
 * pname:NSize is the number of columns in matrices B, C, and D.
 * pname:AType is the component type of matrix A, of type
   elink:VkComponentTypeNV.
 * pname:BType is the component type of matrix B, of type
   elink:VkComponentTypeNV.
 * pname:CType is the component type of matrix C, of type
   elink:VkComponentTypeNV.
 * pname:DType is the component type of matrix D, of type
   elink:VkComponentTypeNV.
 * pname:scope is the scope of all the matrix types, of type
   elink:VkScopeNV.

If some types are preferred over other types (e.g. for performance), they
should: appear earlier in the list enumerated by
flink:vkGetPhysicalDeviceCooperativeMatrixPropertiesNV.

At least one entry in the list must: have power-of-two values for all of
pname:MSize, pname:KSize, and pname:NSize.

include::{generated}/validity/structs/VkCooperativeMatrixPropertiesNV.txt[]
--

[open,refpage='VkScopeNV',desc='Specify SPIR-V scope',type='enums']
--
Possible values for elink:VkScopeNV include:

include::{generated}/api/enums/VkScopeNV.txt[]

 * ename:VK_SCOPE_DEVICE_NV corresponds to SPIR-V code:Device scope.
 * ename:VK_SCOPE_WORKGROUP_NV corresponds to SPIR-V code:Workgroup scope.
 * ename:VK_SCOPE_SUBGROUP_NV corresponds to SPIR-V code:Subgroup scope.
 * ename:VK_SCOPE_QUEUE_FAMILY_NV corresponds to SPIR-V code:QueueFamily
   scope.

All enum values match the corresponding SPIR-V value.
--

[open,refpage='VkComponentTypeNV',desc='Specify SPIR-V cooperative matrix component type',type='enums']
--
Possible values for elink:VkComponentTypeNV include:

include::{generated}/api/enums/VkComponentTypeNV.txt[]

 * ename:VK_COMPONENT_TYPE_FLOAT16_NV corresponds to SPIR-V
   code:OpTypeFloat 16.
 * ename:VK_COMPONENT_TYPE_FLOAT32_NV corresponds to SPIR-V
   code:OpTypeFloat 32.
 * ename:VK_COMPONENT_TYPE_FLOAT64_NV corresponds to SPIR-V
   code:OpTypeFloat 64.
 * ename:VK_COMPONENT_TYPE_SINT8_NV corresponds to SPIR-V
   code:OpTypeInt 8 1.
 * ename:VK_COMPONENT_TYPE_SINT16_NV corresponds to SPIR-V
   code:OpTypeInt 16 1.
 * ename:VK_COMPONENT_TYPE_SINT32_NV corresponds to SPIR-V
   code:OpTypeInt 32 1.
 * ename:VK_COMPONENT_TYPE_SINT64_NV corresponds to SPIR-V
   code:OpTypeInt 64 1.
 * ename:VK_COMPONENT_TYPE_UINT8_NV corresponds to SPIR-V
   code:OpTypeInt 8 0.
 * ename:VK_COMPONENT_TYPE_UINT16_NV corresponds to SPIR-V
   code:OpTypeInt 16 0.
 * ename:VK_COMPONENT_TYPE_UINT32_NV corresponds to SPIR-V
   code:OpTypeInt 32 0.
 * ename:VK_COMPONENT_TYPE_UINT64_NV corresponds to SPIR-V
   code:OpTypeInt 64 0.
--
endif::VK_NV_cooperative_matrix[]


ifdef::VK_EXT_validation_cache[]
[[shaders-validation-cache]]
== Validation Cache

[open,refpage='VkValidationCacheEXT',desc='Opaque handle to a validation cache object',type='handles']
--
Validation cache objects allow the result of internal validation to be
reused, both within a single application run and between multiple runs.
Reuse within a single run is achieved by passing the same validation cache
object when creating supported Vulkan objects.
Reuse across runs of an application is achieved by retrieving validation
cache contents in one run of an application, saving the contents, and using
them to preinitialize a validation cache on a subsequent run.
The contents of the validation cache objects are managed by the validation
layers.
Applications can: manage the host memory consumed by a validation cache
object and control the amount of data retrieved from a validation cache
object.

Validation cache objects are represented by sname:VkValidationCacheEXT
handles:

include::{generated}/api/handles/VkValidationCacheEXT.txt[]
--

[open,refpage='vkCreateValidationCacheEXT',desc='Creates a new validation cache',type='protos']
--
To create validation cache objects, call:

include::{generated}/api/protos/vkCreateValidationCacheEXT.txt[]

 * pname:device is the logical device that creates the validation cache
   object.
 * pname:pCreateInfo is a pointer to a slink:VkValidationCacheCreateInfoEXT
   structure containing the initial parameters for the validation cache
   object.
 * pname:pAllocator controls host memory allocation as described in the
   <<memory-allocation, Memory Allocation>> chapter.
 * pname:pValidationCache is a pointer to a slink:VkValidationCacheEXT
   handle in which the resulting validation cache object is returned.

[NOTE]
.Note
====
Applications can: track and manage the total host memory size of a
validation cache object using the pname:pAllocator.
Applications can: limit the amount of data retrieved from a validation
cache object in fname:vkGetValidationCacheDataEXT.
Implementations should: not internally limit the total number of entries
added to a validation cache object or the total host memory consumed.
====

Once created, a validation cache can: be passed to the
fname:vkCreateShaderModule command by adding this object to the
slink:VkShaderModuleCreateInfo structure's pname:pNext chain.
If a slink:VkShaderModuleValidationCacheCreateInfoEXT object is included in
the slink:VkShaderModuleCreateInfo::pname:pNext chain, and its
pname:validationCache field is not dlink:VK_NULL_HANDLE, the implementation
will query it for possible reuse opportunities and update it with new
content.
The use of the validation cache object in these commands is internally
synchronized, and the same validation cache object can: be used in multiple
threads simultaneously.
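
[NOTE]
.Note
====
As an informal illustration, the following sketch shows one way an
application might chain a validation cache into shader module creation as
described above.
The function name and parameters are illustrative, and the SPIR-V code is
assumed to have been loaded elsewhere.

[source,c]
~~~~
#include <stddef.h>
#include <stdint.h>
#include <vulkan/vulkan.h>

// Create a shader module, passing an existing validation cache through the
// VkShaderModuleCreateInfo::pNext chain so that previously cached
// validation results can be reused.
static VkResult createShaderModuleWithValidationCache(
    VkDevice device,
    VkValidationCacheEXT validationCache,
    const uint32_t *spirvCode,   /* SPIR-V code words */
    size_t codeSize,             /* size in bytes, a multiple of 4 */
    VkShaderModule *pShaderModule)
{
    VkShaderModuleValidationCacheCreateInfoEXT validationCacheInfo = {
        .sType = VK_STRUCTURE_TYPE_SHADER_MODULE_VALIDATION_CACHE_CREATE_INFO_EXT,
        .validationCache = validationCache,
    };
    VkShaderModuleCreateInfo createInfo = {
        .sType = VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO,
        .pNext = &validationCacheInfo,
        .codeSize = codeSize,
        .pCode = spirvCode,
    };
    return vkCreateShaderModule(device, &createInfo, NULL, pShaderModule);
}
~~~~
====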

[NOTE]
.Note
====
Implementations should: make every effort to limit any critical sections to
the actual accesses to the cache, which are expected to be significantly
shorter than the duration of the fname:vkCreateShaderModule command.
====

include::{generated}/validity/protos/vkCreateValidationCacheEXT.txt[]
--

[open,refpage='VkValidationCacheCreateInfoEXT',desc='Structure specifying parameters of a newly created validation cache',type='structs']
--
The sname:VkValidationCacheCreateInfoEXT structure is defined as:

include::{generated}/api/structs/VkValidationCacheCreateInfoEXT.txt[]

 * pname:sType is the type of this structure.
 * pname:pNext is `NULL` or a pointer to a structure extending this
   structure.
 * pname:flags is reserved for future use.
 * pname:initialDataSize is the number of bytes in pname:pInitialData.
   If pname:initialDataSize is zero, the validation cache will initially be
   empty.
 * pname:pInitialData is a pointer to previously retrieved validation cache
   data.
   If the validation cache data is incompatible (as defined below) with the
   device, the validation cache will be initially empty.
   If pname:initialDataSize is zero, pname:pInitialData is ignored.

.Valid Usage
****
 * [[VUID-VkValidationCacheCreateInfoEXT-initialDataSize-01534]]
   If pname:initialDataSize is not `0`, it must: be equal to the size of
   pname:pInitialData, as returned by fname:vkGetValidationCacheDataEXT
   when pname:pInitialData was originally retrieved
 * [[VUID-VkValidationCacheCreateInfoEXT-initialDataSize-01535]]
   If pname:initialDataSize is not `0`, pname:pInitialData must: have been
   retrieved from a previous call to fname:vkGetValidationCacheDataEXT
****

include::{generated}/validity/structs/VkValidationCacheCreateInfoEXT.txt[]
--

[open,refpage='VkValidationCacheCreateFlagsEXT',desc='Reserved for future use',type='flags']
--
include::{generated}/api/flags/VkValidationCacheCreateFlagsEXT.txt[]

tname:VkValidationCacheCreateFlagsEXT is a bitmask type for setting a mask,
but is currently reserved for future use.
--

[open,refpage='vkMergeValidationCachesEXT',desc='Combine the data stores of validation caches',type='protos']
--
Validation cache objects can: be merged using the command:

include::{generated}/api/protos/vkMergeValidationCachesEXT.txt[]

 * pname:device is the logical device that owns the validation cache
   objects.
 * pname:dstCache is the handle of the validation cache to merge results
   into.
 * pname:srcCacheCount is the length of the pname:pSrcCaches array.
 * pname:pSrcCaches is a pointer to an array of validation cache handles,
   which will be merged into pname:dstCache.
   The previous contents of pname:dstCache are included after the merge.

[NOTE]
.Note
====
The details of the merge operation are implementation-dependent, but
implementations should: merge the contents of the specified validation
caches and prune duplicate entries.
====

.Valid Usage
****
 * [[VUID-vkMergeValidationCachesEXT-dstCache-01536]]
   pname:dstCache must: not appear in the list of source caches
****

include::{generated}/validity/protos/vkMergeValidationCachesEXT.txt[]
--

[open,refpage='vkGetValidationCacheDataEXT',desc='Get the data store from a validation cache',type='protos']
--
Data can: be retrieved from a validation cache object using the command:

include::{generated}/api/protos/vkGetValidationCacheDataEXT.txt[]

 * pname:device is the logical device that owns the validation cache.
 * pname:validationCache is the validation cache to retrieve data from.
 * pname:pDataSize is a pointer to a value related to the amount of data in
   the validation cache, as described below.
 * pname:pData is either `NULL` or a pointer to a buffer.

If pname:pData is `NULL`, then the maximum size of the data that can: be
retrieved from the validation cache, in bytes, is returned in
pname:pDataSize.
Otherwise, pname:pDataSize must: point to a variable set by the user to the
size of the buffer, in bytes, pointed to by pname:pData, and on return the
variable is overwritten with the amount of data actually written to
pname:pData.
If pname:pDataSize is less than the maximum size that can: be retrieved by
the validation cache, at most pname:pDataSize bytes will be written to
pname:pData, and fname:vkGetValidationCacheDataEXT will return
ename:VK_INCOMPLETE instead of ename:VK_SUCCESS, to indicate that not all of
the validation cache was returned.

Any data written to pname:pData is valid and can: be provided as the
pname:pInitialData member of the slink:VkValidationCacheCreateInfoEXT
structure passed to fname:vkCreateValidationCacheEXT.

Two calls to fname:vkGetValidationCacheDataEXT with the same parameters
must: retrieve the same data unless a command that modifies the contents of
the cache is called between them.

[[validation-cache-header]]
Applications can: store the data retrieved from the validation cache, and
use this data, possibly in a future run of the application, to populate new
validation cache objects.
The results of validation, however, may: depend on the vendor ID, device
ID, driver version, and other details of the device.
To enable applications to detect when previously retrieved data is
incompatible with the device, the initial bytes written to pname:pData must:
be a header consisting of the following members:

.Layout for validation cache header version ename:VK_VALIDATION_CACHE_HEADER_VERSION_ONE_EXT
[width="85%",cols="8%,21%,71%",options="header"]
|====
| Offset | Size                | Meaning
| 0      | 4                   | length in bytes of the entire validation
                                 cache header written as a stream of bytes,
                                 with the least significant byte first
| 4      | 4                   | a elink:VkValidationCacheHeaderVersionEXT
                                 value written as a stream of bytes, with
                                 the least significant byte first
| 8      | ename:VK_UUID_SIZE  | a layer commit ID expressed as a UUID,
                                 which uniquely identifies the version of
                                 the validation layers used to generate
                                 these validation results
|====

The first four bytes encode the length of the entire validation cache
header, in bytes.
This value includes all fields in the header, including the validation
cache version field and the size of the length field.
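
[NOTE]
.Note
====
As an informal illustration, the following sketch shows one way an
application might check a previously saved blob against this header layout
before passing it as pname:pInitialData.
The helper names are illustrative, and the expected layer commit ID is
assumed to have been obtained by the application, e.g. from the header of
data retrieved in the current environment.

[source,c]
~~~~
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <vulkan/vulkan.h>

// Read a 32-bit value stored with the least significant byte first.
static uint32_t readLe32(const uint8_t *bytes)
{
    return (uint32_t)bytes[0] |
           ((uint32_t)bytes[1] << 8) |
           ((uint32_t)bytes[2] << 16) |
           ((uint32_t)bytes[3] << 24);
}

// Check that saved validation cache data begins with a version-one header
// and was produced by the expected validation layer commit.
static bool validationCacheDataCompatible(
    const uint8_t *data, size_t dataSize,
    const uint8_t expectedLayerCommitId[VK_UUID_SIZE])
{
    const size_t headerSize = 8 + VK_UUID_SIZE;

    if (dataSize < headerSize)
        return false;

    uint32_t headerLength = readLe32(data + 0);
    uint32_t headerVersion = readLe32(data + 4);

    if (headerLength < headerSize || headerLength > dataSize)
        return false;
    if (headerVersion != VK_VALIDATION_CACHE_HEADER_VERSION_ONE_EXT)
        return false;

    return memcmp(data + 8, expectedLayerCommitId, VK_UUID_SIZE) == 0;
}
~~~~
====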

The next four bytes encode the validation cache version, as described for
elink:VkValidationCacheHeaderVersionEXT.
A consumer of the validation cache should: use the cache version to
interpret the remainder of the cache header.

If pname:pDataSize is less than what is necessary to store this header,
nothing will be written to pname:pData and zero will be written to
pname:pDataSize.

include::{generated}/validity/protos/vkGetValidationCacheDataEXT.txt[]
--

[open,refpage='VkValidationCacheHeaderVersionEXT',desc='Encode validation cache version',type='enums',xrefs='vkCreateValidationCacheEXT vkGetValidationCacheDataEXT']
--
Possible values of the second group of four bytes in the header returned by
flink:vkGetValidationCacheDataEXT, encoding the validation cache version,
are:

include::{generated}/api/enums/VkValidationCacheHeaderVersionEXT.txt[]

 * ename:VK_VALIDATION_CACHE_HEADER_VERSION_ONE_EXT specifies version one
   of the validation cache.
--

[open,refpage='vkDestroyValidationCacheEXT',desc='Destroy a validation cache object',type='protos']
--
To destroy a validation cache, call:

include::{generated}/api/protos/vkDestroyValidationCacheEXT.txt[]

 * pname:device is the logical device that destroys the validation cache
   object.
 * pname:validationCache is the handle of the validation cache to destroy.
 * pname:pAllocator controls host memory allocation as described in the
   <<memory-allocation, Memory Allocation>> chapter.

.Valid Usage
****
 * [[VUID-vkDestroyValidationCacheEXT-validationCache-01537]]
   If sname:VkAllocationCallbacks were provided when pname:validationCache
   was created, a compatible set of callbacks must: be provided here
 * [[VUID-vkDestroyValidationCacheEXT-validationCache-01538]]
   If no sname:VkAllocationCallbacks were provided when
   pname:validationCache was created, pname:pAllocator must: be `NULL`
****

include::{generated}/validity/protos/vkDestroyValidationCacheEXT.txt[]
--
endif::VK_EXT_validation_cache[]