1// Copyright 2015-2021 The Khronos Group, Inc.
2//
3// SPDX-License-Identifier: CC-BY-4.0
4
5[[shaders]]
6= Shaders
7
8A shader specifies programmable operations that execute for each vertex,
9control point, tessellated vertex, primitive, fragment, or workgroup in the
10corresponding stage(s) of the graphics and compute pipelines.
11
12Graphics pipelines include vertex shader execution as a result of
13<<drawing,primitive assembly>>, followed, if enabled, by tessellation
14control and evaluation shaders operating on <<drawing-patch-lists,patches>>,
15geometry shaders, if enabled, operating on primitives, and fragment shaders,
16if present, operating on fragments generated by <<primsrast,Rasterization>>.
17In this specification, vertex, tessellation control, tessellation evaluation
18and geometry shaders are collectively referred to as
19<<pipeline-graphics-subsets-pre-rasterization,pre-rasterization shader
20stage>>s and occur in the logical pipeline before rasterization.
21The fragment shader occurs logically after rasterization.
22
23Only the compute shader stage is included in a compute pipeline.
24Compute shaders operate on compute invocations in a workgroup.
25
26Shaders can: read from input variables, and read from and write to output
27variables.
28Input and output variables can: be used to transfer data between shader
29stages, or to allow the shader to interact with values that exist in the
30execution environment.
31Similarly, the execution environment provides constants describing
32capabilities.
33
34Shader variables are associated with execution environment-provided inputs
35and outputs using _built-in_ decorations in the shader.
36The available decorations for each stage are documented in the following
37subsections.
38
39
40[[shader-modules]]
41== Shader Modules
42
43[open,refpage='VkShaderModule',desc='Opaque handle to a shader module object',type='handles']
44--
45_Shader modules_ contain _shader code_ and one or more entry points.
46Shaders are selected from a shader module by specifying an entry point as
47part of <<pipelines,pipeline>> creation.
48The stages of a pipeline can: use shaders that come from different modules.
49The shader code defining a shader module must: be in the SPIR-V format, as
50described by the <<spirvenv,Vulkan Environment for SPIR-V>> appendix.
51
52Shader modules are represented by sname:VkShaderModule handles:
53
54include::{generated}/api/handles/VkShaderModule.txt[]
55--
56
57[open,refpage='vkCreateShaderModule',desc='Creates a new shader module object',type='protos']
58--
59To create a shader module, call:
60
61include::{generated}/api/protos/vkCreateShaderModule.txt[]
62
63  * pname:device is the logical device that creates the shader module.
64  * pname:pCreateInfo is a pointer to a slink:VkShaderModuleCreateInfo
65    structure.
66  * pname:pAllocator controls host memory allocation as described in the
67    <<memory-allocation, Memory Allocation>> chapter.
68  * pname:pShaderModule is a pointer to a slink:VkShaderModule handle in
69    which the resulting shader module object is returned.
70
71Once a shader module has been created, any entry points it contains can: be
72used in pipeline shader stages as described in <<pipelines-compute,Compute
73Pipelines>> and <<pipelines-graphics,Graphics Pipelines>>.
74
75include::{generated}/validity/protos/vkCreateShaderModule.txt[]
76--
77
78[open,refpage='VkShaderModuleCreateInfo',desc='Structure specifying parameters of a newly created shader module',type='structs']
79--
80The sname:VkShaderModuleCreateInfo structure is defined as:
81
82include::{generated}/api/structs/VkShaderModuleCreateInfo.txt[]
83
84  * pname:sType is the type of this structure.
85  * pname:pNext is `NULL` or a pointer to a structure extending this
86    structure.
87  * pname:flags is reserved for future use.
88  * pname:codeSize is the size, in bytes, of the code pointed to by
89    pname:pCode.
90  * pname:pCode is a pointer to code that is used to create the shader
91    module.
92    The type and format of the code is determined from the content of the
93    memory addressed by pname:pCode.
94
95.Valid Usage
96****
97  * [[VUID-VkShaderModuleCreateInfo-codeSize-01085]]
98    pname:codeSize must: be greater than 0
99ifndef::VK_NV_glsl_shader[]
100  * [[VUID-VkShaderModuleCreateInfo-codeSize-01086]]
101    pname:codeSize must: be a multiple of 4
102  * [[VUID-VkShaderModuleCreateInfo-pCode-01087]]
103    pname:pCode must: point to valid SPIR-V code, formatted and packed as
104    described by the <<spirv-spec,Khronos SPIR-V Specification>>
105  * [[VUID-VkShaderModuleCreateInfo-pCode-01088]]
106    pname:pCode must: adhere to the validation rules described by the
107    <<spirvenv-module-validation, Validation Rules within a Module>> section
108    of the <<spirvenv-capabilities,SPIR-V Environment>> appendix
109endif::VK_NV_glsl_shader[]
110ifdef::VK_NV_glsl_shader[]
111  * [[VUID-VkShaderModuleCreateInfo-pCode-01376]]
112    If pname:pCode is a pointer to SPIR-V code, pname:codeSize must: be a
113    multiple of 4
114  * [[VUID-VkShaderModuleCreateInfo-pCode-01377]]
115    pname:pCode must: point to either valid SPIR-V code, formatted and
116    packed as described by the <<spirv-spec,Khronos SPIR-V Specification>>
117    or valid GLSL code which must: be written to the `GL_KHR_vulkan_glsl`
118    extension specification
119  * [[VUID-VkShaderModuleCreateInfo-pCode-01378]]
120    If pname:pCode is a pointer to SPIR-V code, that code must: adhere to
121    the validation rules described by the <<spirvenv-module-validation,
122    Validation Rules within a Module>> section of the
123    <<spirvenv-capabilities,SPIR-V Environment>> appendix
124  * [[VUID-VkShaderModuleCreateInfo-pCode-01379]]
125    If pname:pCode is a pointer to GLSL code, it must: be valid GLSL code
126    written to the `GL_KHR_vulkan_glsl` GLSL extension specification
127endif::VK_NV_glsl_shader[]
128  * [[VUID-VkShaderModuleCreateInfo-pCode-01089]]
129    pname:pCode must: declare the code:Shader capability for SPIR-V code
130  * [[VUID-VkShaderModuleCreateInfo-pCode-01090]]
131    pname:pCode must: not declare any capability that is not supported by
132    the API, as described by the <<spirvenv-module-validation,
133    Capabilities>> section of the <<spirvenv-capabilities,SPIR-V
134    Environment>> appendix
135  * [[VUID-VkShaderModuleCreateInfo-pCode-01091]]
136    If pname:pCode declares any of the capabilities listed in the
137    <<spirvenv-capabilities-table,SPIR-V Environment>> appendix, one of the
138    corresponding requirements must: be satisfied
139  * [[VUID-VkShaderModuleCreateInfo-pCode-04146]]
140    pname:pCode must: not declare any SPIR-V extension that is not supported
141    by the API, as described by the <<spirvenv-extensions, Extension>>
142    section of the <<spirvenv-capabilities,SPIR-V Environment>> appendix
143  * [[VUID-VkShaderModuleCreateInfo-pCode-04147]]
144    If pname:pCode declares any of the SPIR-V extensions listed in the
145    <<spirvenv-extensions-table,SPIR-V Environment>> appendix, one of the
146    corresponding requirements must: be satisfied
147****
148
149include::{generated}/validity/structs/VkShaderModuleCreateInfo.txt[]
150--
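
As an informal illustration of the first few valid usage rules above, the following sketch pre-checks a code blob before it is passed in pname:pCode.
The helper name `looks_like_spirv` is hypothetical and not part of the API, and the check assumes that only SPIR-V input is accepted (that is, `VK_NV_glsl_shader` is not in use).
It tests only the size rules and the SPIR-V magic number, so it is not a substitute for full SPIR-V validation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* First word of every SPIR-V module, as defined by the SPIR-V specification. */
#define SPIRV_MAGIC 0x07230203u

/* Hypothetical helper (not a Vulkan entry point): cheap sanity checks on
 * the values an application intends to store in codeSize and pCode. */
bool looks_like_spirv(const uint32_t *code, size_t code_size)
{
    if (code == NULL || code_size == 0) {
        return false;               /* codeSize must be greater than 0 */
    }
    if (code_size % 4 != 0) {
        return false;               /* codeSize must be a multiple of 4 */
    }
    return code[0] == SPIRV_MAGIC;  /* necessary, but far from sufficient */
}
```

A blob that passes this check can still violate the remaining rules, such as declaring an unsupported capability or extension.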
151
152[open,refpage='VkShaderModuleCreateFlags',desc='Reserved for future use',type='flags']
153--
154include::{generated}/api/flags/VkShaderModuleCreateFlags.txt[]
155
156tname:VkShaderModuleCreateFlags is a bitmask type for setting a mask, but is
157currently reserved for future use.
158--
159
160ifdef::VK_EXT_validation_cache[]
161include::{chapters}/VK_EXT_validation_cache/shader-module-validation-cache.txt[]
162endif::VK_EXT_validation_cache[]
163
164
165[open,refpage='vkDestroyShaderModule',desc='Destroy a shader module',type='protos']
166--
167To destroy a shader module, call:
168
169include::{generated}/api/protos/vkDestroyShaderModule.txt[]
170
171  * pname:device is the logical device that destroys the shader module.
172  * pname:shaderModule is the handle of the shader module to destroy.
173  * pname:pAllocator controls host memory allocation as described in the
174    <<memory-allocation, Memory Allocation>> chapter.
175
176A shader module can: be destroyed while pipelines created using its shaders
177are still in use.
178
179.Valid Usage
180****
181  * [[VUID-vkDestroyShaderModule-shaderModule-01092]]
182    If sname:VkAllocationCallbacks were provided when pname:shaderModule was
183    created, a compatible set of callbacks must: be provided here
184  * [[VUID-vkDestroyShaderModule-shaderModule-01093]]
185    If no sname:VkAllocationCallbacks were provided when pname:shaderModule
186    was created, pname:pAllocator must: be `NULL`
187****
188
189include::{generated}/validity/protos/vkDestroyShaderModule.txt[]
190--
191
192
193[[shaders-execution]]
194== Shader Execution
195
196At each stage of the pipeline, multiple invocations of a shader may: execute
197simultaneously.
198Further, invocations of a single shader produced as the result of different
199commands may: execute simultaneously.
200The relative execution order of invocations of the same shader type is
201undefined:.
202Shader invocations may: complete in a different order than that in which the
203primitives they originated from were drawn or dispatched by the application.
204However, fragment shader outputs are written to attachments in
205<<primsrast-order,rasterization order>>.
206
207The relative execution order of invocations of different shader types is
208largely undefined:.
209However, when invoking a shader whose inputs are generated from a previous
210pipeline stage, the shader invocations from the previous stage are
211guaranteed to have executed far enough to generate input values for all
212required inputs.
213
214
215[[shaders-execution-memory-ordering]]
216== Shader Memory Access Ordering
217
218The order in which image or buffer memory is read or written by shaders is
219largely undefined:.
220For some shader types (vertex, tessellation evaluation, and in some cases,
221fragment), even the number of shader invocations that may: perform loads and
222stores is undefined:.
223
224In particular, the following rules apply:
225
226  * <<shaders-vertex-execution,Vertex>> and
227    <<shaders-tessellation-evaluation-execution,tessellation evaluation>>
228    shaders will be invoked at least once for each unique vertex, as defined
229    in those sections.
230  * <<fragops-shader,Fragment>> shaders will be invoked zero or more times,
231    as defined in that section.
232  * The relative execution order of invocations of the same shader type is
233    undefined:.
234    A store issued by a shader when working on primitive B might complete
235    prior to a store for primitive A, even if primitive A is specified prior
236    to primitive B. This applies even to fragment shaders; while fragment
237    shader outputs are always written to the framebuffer in
238    <<primsrast-order, rasterization order>>, stores executed by fragment
239    shader invocations are not.
240  * The relative execution order of invocations of different shader types is
241    largely undefined:.
242
243[NOTE]
244.Note
245====
246The above limitations on shader invocation order make some forms of
247synchronization between shader invocations within a single set of primitives
248unimplementable.
249For example, having one invocation poll memory written by another invocation
250assumes that the other invocation has been launched and will complete its
251writes in finite time.
252====
253
254ifdef::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[]
255
256The <<memory-model,Memory Model>> appendix defines the terminology and rules
257for how to correctly communicate between shader invocations, such as when a
258write is <<memory-model-visible-to,Visible-To>> a read, and what constitutes
259a <<memory-model-access-data-race,Data Race>>.
260
261Applications must: not cause a data race.
262
263endif::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[]
264
265ifndef::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[]
266
267Stores issued to different memory locations within a single shader
268invocation may: not be visible to other invocations, or may: not become
269visible in the order they were performed.
270
271The code:OpMemoryBarrier instruction can: be used to provide stronger
272ordering of reads and writes performed by a single invocation.
273code:OpMemoryBarrier guarantees that any memory transactions issued by the
274shader invocation prior to the instruction complete prior to the memory
275transactions issued after the instruction.
276Memory barriers are needed for algorithms that require multiple invocations
277to access the same memory and require the operations to be performed in a
278partially-defined relative order.
279For example, if one shader invocation does a series of writes, followed by
280an code:OpMemoryBarrier instruction, followed by another write, then the
281results of the series of writes before the barrier become visible to other
282shader invocations at a time earlier or equal to when the results of the
283final write become visible to those invocations.
In practice, this means that another invocation that sees the results of the
285final write would also see the previous writes.
286Without the memory barrier, the final write may: be visible before the
287previous writes.
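
The write, barrier, write pattern described above has a loose host-side analogue in C11 atomics, sketched below.
This is only an analogy under stated assumptions: the `Mailbox` type and both functions are hypothetical, the release fence stands in for code:OpMemoryBarrier, and the acquire fence on the reading side is something the C11 model requires, not something this chapter describes.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shared record: a series of writes plus one final write. */
typedef struct {
    uint32_t payload[4];   /* the "series of writes"            */
    atomic_uint ready;     /* the "final write" that others poll */
} Mailbox;

/* Writer: payload first, then a fence, then the final write. */
void producer(Mailbox *m)
{
    for (uint32_t i = 0; i < 4; ++i) {
        m->payload[i] = i + 1u;
    }
    atomic_thread_fence(memory_order_release);   /* plays the barrier's role */
    atomic_store_explicit(&m->ready, 1u, memory_order_relaxed);
}

/* Reader: if the final write is seen, the earlier writes are seen too. */
bool consumer(Mailbox *m, uint32_t *sum)
{
    if (atomic_load_explicit(&m->ready, memory_order_relaxed) == 0u) {
        return false;                            /* final write not seen yet */
    }
    atomic_thread_fence(memory_order_acquire);
    *sum = m->payload[0] + m->payload[1] + m->payload[2] + m->payload[3];
    return true;
}
```

Without the fence in `producer`, a reader could observe `ready` set while still reading stale payload values, which mirrors the last sentence of the paragraph above.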
288
289Writes that are the result of shader stores through a variable decorated
290with code:Coherent automatically have available writes to the same buffer,
291buffer view, or image view made visible to them, and are themselves
292automatically made available to access by the same buffer, buffer view, or
293image view.
294Reads that are the result of shader loads through a variable decorated with
295code:Coherent automatically have available writes to the same buffer, buffer
296view, or image view made visible to them.
297The order that coherent writes to different locations become available is
298undefined:, unless enforced by a memory barrier instruction or other memory
299dependency.
300
301[NOTE]
302.Note
303====
304Explicit memory dependencies must: still be used to guarantee availability
305and visibility for access via other buffers, buffer views, or image views.
306====
307
308The built-in atomic memory transaction instructions can: be used to read and
309write a given memory address atomically.
310While built-in atomic functions issued by multiple shader invocations are
311executed in undefined: order relative to each other, these functions perform
312both a read and a write of a memory address and guarantee that no other
313memory transaction will write to the underlying memory between the read and
314write.
315Atomic operations ensure automatic availability and visibility for writes
316and reads in the same way as those to code:Coherent variables.
317
318[NOTE]
319.Note
320====
321Memory accesses performed on different resource descriptors with the same
322memory backing may: not be well-defined even with the code:Coherent
323decoration or via atomics, due to things such as image layouts or ownership
324of the resource - as described in the <<synchronization, Synchronization and
325Cache Control>> chapter.
326====
327
328[NOTE]
329.Note
330====
331Atomics allow shaders to use shared global addresses for mutual exclusion or
332as counters, among other uses.
333====
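
The counter use mentioned in the note can be sketched on the host with C11 atomics.
The function name `claim_slot` is hypothetical, and the C11 call is only an analogue of a shader atomic add: because the read and the write happen as one indivisible step, every caller receives a distinct value even though the order of callers is undefined.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper: atomically reserve the next free slot index.
 * Returns the counter value from before the increment. */
uint32_t claim_slot(atomic_uint *counter)
{
    return atomic_fetch_add_explicit(counter, 1u, memory_order_relaxed);
}
```

Which caller gets which slot is not defined, only that no two callers get the same one.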
334
335endif::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[]
336
337The SPIR-V *SubgroupMemory*, *CrossWorkgroupMemory*, and
338*AtomicCounterMemory* memory semantics are ignored.
339Sequentially consistent atomics and barriers are not supported and
340*SequentiallyConsistent* is treated as *AcquireRelease*.
341*SequentiallyConsistent* should: not be used.
342
343
344[[shaders-inputs]]
345== Shader Inputs and Outputs
346
347Data is passed into and out of shaders using variables with input or output
348storage class, respectively.
349User-defined inputs and outputs are connected between stages by matching
350their code:Location decorations.
351Additionally, data can: be provided by or communicated to special functions
352provided by the execution environment using code:BuiltIn decorations.
353
354In many cases, the same code:BuiltIn decoration can: be used in multiple
355shader stages with similar meaning.
356The specific behavior of variables decorated as code:BuiltIn is documented
357in the following sections.
358
359
360ifdef::VK_NV_mesh_shader[]
361[[shaders-task]]
362== Task Shaders
363
364Task shaders operate in conjunction with the mesh shaders to produce a
365collection of primitives that will be processed by subsequent stages of the
366graphics pipeline.
Their primary purpose is to create a variable number of subsequent mesh shader
invocations.
369
370Task shaders are invoked via the execution of the
371<<drawing-mesh-shading,programmable mesh shading>> pipeline.
372
373The task shader has no fixed-function inputs other than variables
374identifying the specific workgroup and invocation.
375The only fixed output of the task shader is a task count, identifying the
376number of mesh shader workgroups to create.
377The task shader can write additional outputs to task memory, which can be
378read by all of the mesh shader workgroups it created.
379
380
381=== Task Shader Execution
382
383Task workloads are formed from groups of work items called workgroups and
384processed by the task shader in the current graphics pipeline.
385A workgroup is a collection of shader invocations that execute the same
386shader, potentially in parallel.
387Task shaders execute in _global workgroups_ which are divided into a number
388of _local workgroups_ with a size that can: be set by assigning a value to
389the code:LocalSize
390ifdef::VK_KHR_maintenance4[or code:LocalSizeId]
391execution mode or via an object decorated by the code:WorkgroupSize
392decoration.
393An invocation within a local workgroup can: share data with other members of
394the local workgroup through shared variables and issue memory and control
395flow barriers to synchronize with other members of the local workgroup.
396
397
398[[shaders-mesh]]
399== Mesh Shaders
400
401Mesh shaders operate in workgroups to produce a collection of primitives
402that will be processed by subsequent stages of the graphics pipeline.
403Each workgroup emits zero or more output primitives and the group of
404vertices and their associated data required for each output primitive.
405
406Mesh shaders are invoked via the execution of the
407<<drawing-mesh-shading,programmable mesh shading>> pipeline.
408
409The only inputs available to the mesh shader are variables identifying the
410specific workgroup and invocation and, if applicable, any outputs written to
411task memory by the task shader that spawned the mesh shader's workgroup.
412The mesh shader can operate without a task shader as well.
413
414The invocations of the mesh shader workgroup write an output mesh,
415comprising a set of primitives with per-primitive attributes, a set of
416vertices with per-vertex attributes, and an array of indices identifying the
417mesh vertices that belong to each primitive.
418The primitives of this mesh are then processed by subsequent graphics
419pipeline stages, where the outputs of the mesh shader form an interface with
420the fragment shader.
421
422
423=== Mesh Shader Execution
424
425Mesh workloads are formed from groups of work items called workgroups and
426processed by the mesh shader in the current graphics pipeline.
427A workgroup is a collection of shader invocations that execute the same
428shader, potentially in parallel.
429Mesh shaders execute in _global workgroups_ which are divided into a number
430of _local workgroups_ with a size that can: be set by assigning a value to
431the code:LocalSize
432ifdef::VK_KHR_maintenance4[or code:LocalSizeId]
433execution mode or via an object decorated by the code:WorkgroupSize
434decoration.
435An invocation within a local workgroup can: share data with other members of
436the local workgroup through shared variables and issue memory and control
437flow barriers to synchronize with other members of the local workgroup.
438
The _global workgroups_ may be generated explicitly via the API, or
440implicitly through the task shader's work creation mechanism.
441endif::VK_NV_mesh_shader[]
442
443
444[[shaders-vertex]]
445== Vertex Shaders
446
447Each vertex shader invocation operates on one vertex and its associated
448<<fxvertex-attrib,vertex attribute>> data, and outputs one vertex and
449associated data.
450ifndef::VK_NV_mesh_shader[]
451Graphics pipelines must: include a vertex shader, and the vertex shader
452stage is always the first shader stage in the graphics pipeline.
453endif::VK_NV_mesh_shader[]
454ifdef::VK_NV_mesh_shader[]
455Graphics pipelines using primitive shading must: include a vertex shader,
456and the vertex shader stage is always the first shader stage in the graphics
457pipeline.
458endif::VK_NV_mesh_shader[]
459
460
461[[shaders-vertex-execution]]
462=== Vertex Shader Execution
463
464A vertex shader must: be executed at least once for each vertex specified by
465a drawing command.
466ifdef::VK_VERSION_1_1,VK_KHR_multiview[]
467If the subpass includes multiple views in its view mask, the shader may: be
468invoked separately for each view.
469endif::VK_VERSION_1_1,VK_KHR_multiview[]
470During execution, the shader is presented with the index of the vertex and
471instance for which it has been invoked.
472Input variables declared in the vertex shader are filled by the
473implementation with the values of vertex attributes associated with the
474invocation being executed.
475
476If the same vertex is specified multiple times in a drawing command (e.g. by
477including the same index value multiple times in an index buffer) the
478implementation may: reuse the results of vertex shading if it can statically
479determine that the vertex shader invocations will produce identical results.
480
481[NOTE]
482.Note
483====
484It is implementation-dependent when and if results of vertex shading are
485reused, and thus how many times the vertex shader will be executed.
486This is true also if the vertex shader contains stores or atomic operations
487(see <<features-vertexPipelineStoresAndAtomics,
488pname:vertexPipelineStoresAndAtomics>>).
489====
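
To make the effect of reuse concrete, the sketch below counts the distinct index values in an index buffer.
It assumes a single-instance, single-view indexed draw, and the function name is hypothetical.
The result is the number of vertex shader invocations an implementation would perform if it reused every repeated vertex; an implementation that reuses nothing would run the shader once per index, and the actual count is implementation-dependent.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of distinct values among the first `count`
 * indices. Quadratic on purpose, to keep the sketch short. */
size_t count_unique_indices(const uint32_t *indices, size_t count)
{
    size_t unique = 0;
    for (size_t i = 0; i < count; ++i) {
        bool seen_before = false;
        for (size_t j = 0; j < i; ++j) {
            if (indices[j] == indices[i]) {
                seen_before = true;
                break;
            }
        }
        if (!seen_before) {
            ++unique;
        }
    }
    return unique;
}
```

For two triangles sharing an edge, with indices `0, 1, 2, 2, 1, 3`, this yields 4 distinct vertices out of 6 indices.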
490
491
492[[shaders-tessellation-control]]
493== Tessellation Control Shaders
494
495The tessellation control shader is used to read an input patch provided by
496the application and to produce an output patch.
497Each tessellation control shader invocation operates on an input patch
498(after all control points in the patch are processed by a vertex shader) and
499its associated data, and outputs a single control point of the output patch
500and its associated data, and can: also output additional per-patch data.
501The input patch is sized according to the pname:patchControlPoints member of
502slink:VkPipelineTessellationStateCreateInfo, as part of input assembly.
503
504ifdef::VK_EXT_extended_dynamic_state2[]
The input patch can also be dynamically sized with the pname:patchControlPoints
parameter of flink:vkCmdSetPatchControlPointsEXT.
507
508[open,refpage='vkCmdSetPatchControlPointsEXT',desc='Specify the number of control points per patch dynamically for a command buffer',type='protos']
509--
510To <<pipelines-dynamic-state, dynamically set>> the number of control points
511per patch, call:
512
513include::{generated}/api/protos/vkCmdSetPatchControlPointsEXT.txt[]
514
515  * pname:commandBuffer is the command buffer into which the command will be
516    recorded.
517  * pname:patchControlPoints specifies the number of control points per
518    patch.
519
520This command sets the number of control points per patch for subsequent
521drawing commands when the graphics pipeline is created with
522ename:VK_DYNAMIC_STATE_PATCH_CONTROL_POINTS_EXT set in
523slink:VkPipelineDynamicStateCreateInfo::pname:pDynamicStates.
524Otherwise, this state is specified by the
525slink:VkPipelineTessellationStateCreateInfo::pname:patchControlPoints value
526used to create the currently active pipeline.
527
528.Valid Usage
529****
530  * [[VUID-vkCmdSetPatchControlPointsEXT-None-04873]]
531    The <<features-extendedDynamicState2PatchControlPoints,
532    extendedDynamicState2PatchControlPoints>> feature must: be enabled
533  * [[VUID-vkCmdSetPatchControlPointsEXT-patchControlPoints-04874]]
534    pname:patchControlPoints must: be greater than zero and less than or
535    equal to sname:VkPhysicalDeviceLimits::pname:maxTessellationPatchSize
536****
537
538include::{generated}/validity/protos/vkCmdSetPatchControlPointsEXT.txt[]
539--
540endif::VK_EXT_extended_dynamic_state2[]
541
542The size of the output patch is controlled by the code:OpExecutionMode
543code:OutputVertices specified in the tessellation control or tessellation
544evaluation shaders, which must: be specified in at least one of the shaders.
545The size of the input and output patches must: each be greater than zero and
546less than or equal to
547sname:VkPhysicalDeviceLimits::pname:maxTessellationPatchSize.
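
The bound on patch sizes can be expressed as a one-line check, sketched here.
The function name is hypothetical, and the limit is passed as a plain integer that the application is assumed to have read from sname:VkPhysicalDeviceLimits::pname:maxTessellationPatchSize.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if a patch with `control_points` control points
 * respects the given maxTessellationPatchSize limit. */
bool patch_size_is_valid(uint32_t control_points,
                         uint32_t max_tessellation_patch_size)
{
    return control_points > 0u &&
           control_points <= max_tessellation_patch_size;
}
```

The same check applies to both the input patch size and the output patch size.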
548
549
550[[shaders-tessellation-control-execution]]
551=== Tessellation Control Shader Execution
552
553A tessellation control shader is invoked at least once for each _output_
554vertex in a patch.
555ifdef::VK_VERSION_1_1,VK_KHR_multiview[]
556If the subpass includes multiple views in its view mask, the shader may: be
557invoked separately for each view.
558endif::VK_VERSION_1_1,VK_KHR_multiview[]
559
560Inputs to the tessellation control shader are generated by the vertex
561shader.
562Each invocation of the tessellation control shader can: read the attributes
563of any incoming vertices and their associated data.
564The invocations corresponding to a given patch execute logically in
565parallel, with undefined: relative execution order.
566However, the code:OpControlBarrier instruction can: be used to provide
567limited control of the execution order by synchronizing invocations within a
568patch, effectively dividing tessellation control shader execution into a set
569of phases.
570Tessellation control shaders will read undefined: values if one invocation
571reads a per-vertex or per-patch output written by another invocation at any
572point during the same phase, or if two invocations attempt to write
573different values to the same per-patch output in a single phase.
574
575
576[[shaders-tessellation-evaluation]]
577== Tessellation Evaluation Shaders
578
The tessellation evaluation shader operates on an input patch of control
580points and their associated data, and a single input barycentric coordinate
581indicating the invocation's relative position within the subdivided patch,
582and outputs a single vertex and its associated data.
583
584
585[[shaders-tessellation-evaluation-execution]]
586=== Tessellation Evaluation Shader Execution
587
588A tessellation evaluation shader is invoked at least once for each unique
589vertex generated by the tessellator.
590ifdef::VK_VERSION_1_1,VK_KHR_multiview[]
591If the subpass includes multiple views in its view mask, the shader may: be
592invoked separately for each view.
593endif::VK_VERSION_1_1,VK_KHR_multiview[]
594
595
596[[shaders-geometry]]
597== Geometry Shaders
598
599The geometry shader operates on a group of vertices and their associated
600data assembled from a single input primitive, and emits zero or more output
601primitives and the group of vertices and their associated data required for
602each output primitive.
603
604
605[[shaders-geometry-execution]]
606=== Geometry Shader Execution
607
608A geometry shader is invoked at least once for each primitive produced by
609the tessellation stages, or at least once for each primitive generated by
610<<drawing,primitive assembly>> when tessellation is not in use.
611A shader can request that the geometry shader runs multiple
612<<geometry-invocations, instances>>.
613A geometry shader is invoked at least once for each instance.
614ifdef::VK_VERSION_1_1,VK_KHR_multiview[]
615If the subpass includes multiple views in its view mask, the shader may: be
616invoked separately for each view.
617endif::VK_VERSION_1_1,VK_KHR_multiview[]
618
619
620[[shaders-fragment]]
621== Fragment Shaders
622
623Fragment shaders are invoked as a <<fragops-shader, fragment operation>> in
624a graphics pipeline.
625Each fragment shader invocation operates on a single fragment and its
626associated data.
627With few exceptions, fragment shaders do not have access to any data
628associated with other fragments and are considered to execute in isolation
629of fragment shader invocations associated with other fragments.
630
631
632[[shaders-compute]]
633== Compute Shaders
634
635Compute shaders are invoked via flink:vkCmdDispatch and
636flink:vkCmdDispatchIndirect commands.
In general, they have access to resources similar to those available to
shader stages executing as part of a graphics pipeline.
639
640Compute workloads are formed from groups of work items called workgroups and
641processed by the compute shader in the current compute pipeline.
642A workgroup is a collection of shader invocations that execute the same
643shader, potentially in parallel.
644Compute shaders execute in _global workgroups_ which are divided into a
645number of _local workgroups_ with a size that can: be set by assigning a
646value to the code:LocalSize
647ifdef::VK_KHR_maintenance4[or code:LocalSizeId]
648execution mode or via an object decorated by the code:WorkgroupSize
649decoration.
650An invocation within a local workgroup can: share data with other members of
651the local workgroup through shared variables and issue memory and control
652flow barriers to synchronize with other members of the local workgroup.
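
The relationship between the global workgroup, the local workgroup size, and individual invocations can be sketched with host-side arithmetic.
The type and function names below are hypothetical; the second function mirrors the way a global invocation index is derived from the workgroup index, the local size, and the local invocation index.

```c
#include <stdint.h>

typedef struct {
    uint32_t x, y, z;
} Extent3;

/* Number of local workgroups needed along one dimension so that
 * `item_count` work items are covered (rounds up, avoids overflow).
 * `local_size` must be non-zero. */
uint32_t group_count_for(uint32_t item_count, uint32_t local_size)
{
    return item_count / local_size + (item_count % local_size != 0u ? 1u : 0u);
}

/* Global invocation index = workgroup index * local size + local index,
 * computed independently per dimension. */
Extent3 global_invocation_id(Extent3 workgroup_id, Extent3 local_size,
                             Extent3 local_id)
{
    Extent3 id;
    id.x = workgroup_id.x * local_size.x + local_id.x;
    id.y = workgroup_id.y * local_size.y + local_id.y;
    id.z = workgroup_id.z * local_size.z + local_id.z;
    return id;
}
```

With a local size of 64 along X, covering 1000 items takes 16 workgroups, and the last workgroup contains invocations whose global index is past the end of the data, so a shader would typically bounds-check its index.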
653
654
655ifdef::VK_NV_ray_tracing,VK_KHR_ray_tracing_pipeline[]
656[[shaders-raytracing-shaders]]
657[[shaders-ray-generation]]
658== Ray Generation Shaders
659
660A ray generation shader is similar to a compute shader.
661Its main purpose is to execute ray tracing queries using code:OpTraceRayKHR
662instructions and process the results.
663
664
665[[shaders-ray-generation-execution]]
666=== Ray Generation Shader Execution
667
668One ray generation shader is executed per ray tracing dispatch.
669Its location in the shader binding table (see <<shader-binding-table,Shader
670Binding Table>> for details) is passed directly into fname:vkCmdTraceRaysKHR
671using the pname:raygenShaderBindingTableBuffer and
672pname:raygenShaderBindingOffset parameters.
673
674
675[[shaders-intersection]]
676== Intersection Shaders
677
Intersection shaders enable the implementation of arbitrary,
application-defined geometric primitives.
680An intersection shader for a primitive is executed whenever its axis-aligned
681bounding box is hit by a ray.
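
How an implementation tests a ray against a bounding box is not specified here.
Purely as an illustration, the sketch below shows the common slab method on the host; the `Ray` and `Aabb` types and the function name are hypothetical, and the ray interval is assumed to be given as a minimum and maximum parameter value.

```c
#include <stdbool.h>

typedef struct {
    float origin[3];
    float dir[3];
    float t_min;
    float t_max;
} Ray;

typedef struct {
    float lo[3];
    float hi[3];
} Aabb;

/* Slab test: clip the ray interval against each pair of axis-aligned planes.
 * On a hit, *t_enter receives the parameter at which the ray enters the box
 * (or t_min if the ray starts inside it). */
bool ray_hits_aabb(const Ray *ray, const Aabb *box, float *t_enter)
{
    float t0 = ray->t_min;
    float t1 = ray->t_max;
    for (int axis = 0; axis < 3; ++axis) {
        float o = ray->origin[axis];
        float d = ray->dir[axis];
        if (d == 0.0f) {
            /* Parallel to this slab: a miss unless the origin lies inside it. */
            if (o < box->lo[axis] || o > box->hi[axis]) {
                return false;
            }
            continue;
        }
        float near_t = (box->lo[axis] - o) / d;
        float far_t = (box->hi[axis] - o) / d;
        if (near_t > far_t) {
            float tmp = near_t;
            near_t = far_t;
            far_t = tmp;
        }
        if (near_t > t0) t0 = near_t;
        if (far_t < t1) t1 = far_t;
        if (t0 > t1) {
            return false;   /* the clipped interval became empty */
        }
    }
    *t_enter = t0;
    return true;
}
```

A hit on the box only means that the intersection shader runs; whether the ray actually intersects the application-defined primitive inside the box is for that shader to decide.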
682
683Like other ray tracing shader domains, an intersection shader operates on a
684single ray at a time.
685It also operates on a single primitive at a time.
686It is therefore the purpose of an intersection shader to compute the
687ray-primitive intersections and report them.
688To report an intersection, the shader calls the code:OpReportIntersectionKHR
689instruction.
690
An intersection shader communicates with any-hit and closest hit shaders by
692generating attribute values that they can: read.
693Intersection shaders cannot: read or modify the ray payload.
694
695
696[[shaders-intersection-execution]]
697=== Intersection Shader Execution
698The order in which intersections are found along a ray, and therefore the
699order in which intersection shaders are executed, is unspecified.
700
701The intersection shader of the closest AABB which intersects the ray is
702guaranteed to be executed at some point during traversal, unless the ray is
703forcibly terminated.
704
705
706[[shaders-any-hit]]
707== Any-Hit Shaders
708
709The any-hit shader is executed after the intersection shader reports an
710intersection that lies within the current [eq]#[t~min~,t~max~]# of the ray.
711The main use of any-hit shaders is to programmatically decide whether or not
712an intersection will be accepted.
713The intersection will be accepted unless the shader calls the
714code:OpIgnoreIntersectionKHR instruction.
715Any-hit shaders have read-only access to the attributes generated by the
716corresponding intersection shader, and can: read or modify the ray payload.
717
718
719[[shaders-any-hit-execution]]
720=== Any-Hit Shader Execution
721
722The order in which intersections are found along a ray, and therefore the
723order in which any-hit shaders are executed, is unspecified.
724
725The any-hit shader of the closest hit is guaranteed to be executed at some
726point during traversal, unless the ray is forcibly terminated.
727
728
729[[shaders-closest-hit]]
730== Closest Hit Shaders
731
732Closest hit shaders have read-only access to the attributes generated by the
733corresponding intersection shader, and can: read or modify the ray payload.
734They also have access to a number of system-generated values.
735Closest hit shaders can: call code:OpTraceRayKHR to recursively trace rays.
736
737
738[[shaders-closest-hit-execution]]
739=== Closest Hit Shader Execution
740
741Exactly one closest hit shader is executed when traversal is finished and an
742intersection has been found and accepted.
743
744
745[[shaders-miss]]
746== Miss Shaders
747
748Miss shaders can: access the ray payload and can: trace new rays through the
749code:OpTraceRayKHR instruction, but cannot: access attributes since they are
750not associated with an intersection.
751
752
753[[shaders-miss-execution]]
754=== Miss Shader Execution
755
756A miss shader is executed instead of a closest hit shader if no intersection
757was found during traversal.
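
As a rough illustration of how the any-hit, closest hit, and miss stages relate, the following C sketch resolves a list of reported intersections for one ray.
It is a simplified model, not the traversal algorithm of any implementation: the names `candidate_hit` and `resolve_ray` are hypothetical, the array order merely stands in for the unspecified traversal order, and the sketch assumes that accepting an intersection shrinks the ray interval to that intersection's distance.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of one intersection reported during traversal. */
typedef struct {
    float t;        /* distance along the ray of the reported intersection */
    bool  ignored;  /* true if the any-hit shader ignored the intersection */
} candidate_hit;

/*
 * Returns the index of the accepted intersection with the smallest t,
 * or -1 when nothing was accepted (the miss shader case).
 */
static int resolve_ray(const candidate_hit hits[], size_t count,
                       float t_min, float t_max)
{
    int closest = -1;
    for (size_t i = 0; i < count; ++i) {
        if (hits[i].t < t_min || hits[i].t > t_max)
            continue;            /* outside the current interval: no any-hit */
        if (hits[i].ignored)
            continue;            /* any-hit shader rejected the intersection */
        closest = (int)i;        /* accepted: this is now the closest hit */
        t_max = hits[i].t;       /* assumption: accepting shrinks the interval */
    }
    return closest;
}
```

A non-negative result corresponds to running the closest hit shader for that intersection; a result of -1 corresponds to running the miss shader.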
758
759
760[[shaders-callable]]
761== Callable Shaders
762
763Callable shaders can: access a callable payload that works similarly to ray
764payloads to do subroutine work.
765
766
767[[shaders-callable-execution]]
768=== Callable Shader Execution
769
770A callable shader is executed by calling code:OpExecuteCallableKHR from an
771allowed shader stage.
772
773endif::VK_NV_ray_tracing,VK_KHR_ray_tracing_pipeline[]
774
775
776[[shaders-interpolation-decorations]]
777== Interpolation Decorations
778
779Interpolation decorations control the behavior of attribute interpolation in
780the fragment shader stage.
781Interpolation decorations can: be applied to code:Input storage class
782variables in the fragment shader stage's interface, and control the
783interpolation behavior of those variables.
784
785Inputs that could be interpolated can: be decorated by at most one of the
786following decorations:
787
788  * code:Flat: no interpolation
789  * code:NoPerspective: linear interpolation (for
790    <<line_linear_interpolation,lines>> and
791    <<triangle_linear_interpolation,polygons>>)
792ifdef::VK_NV_fragment_shader_barycentric[]
793  * code:PerVertexNV: values fetched from shader-specified primitive vertex
794endif::VK_NV_fragment_shader_barycentric[]
795
796Fragment input variables decorated with neither code:Flat nor
797code:NoPerspective use perspective-correct interpolation (for
798<<line_perspective_interpolation,lines>> and
799<<triangle_perspective_interpolation,polygons>>).
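
The three behaviors can be sketched in C for a single triangle.
This is a minimal illustration rather than normative text: the function names are hypothetical, `a` is assumed to hold the barycentric weights of the interpolation location, `w` the clip-space w of each vertex, and `provoking` the index of the provoking vertex.

```c
#include <stddef.h>

/* Hypothetical helper names; none of these are part of the Vulkan API. */

/* Flat: no interpolation; the value of the provoking vertex is used. */
static float interp_flat(const float f[3], size_t provoking)
{
    return f[provoking];
}

/* NoPerspective: linear interpolation using the barycentric weights. */
static float interp_noperspective(const float f[3], const float a[3])
{
    return a[0] * f[0] + a[1] * f[1] + a[2] * f[2];
}

/* Undecorated: perspective-correct; each term is weighted by 1/w. */
static float interp_perspective(const float f[3], const float a[3],
                                const float w[3])
{
    float num = 0.0f;
    float den = 0.0f;
    for (size_t i = 0; i < 3; ++i) {
        num += a[i] * f[i] / w[i];
        den += a[i] / w[i];
    }
    return num / den;
}
```

When all three vertices share the same w, the perspective-correct and linear results coincide.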
800
801The presence of and type of interpolation is controlled by the above
802interpolation decorations as well as the auxiliary decorations code:Centroid
803and code:Sample.
804
805A variable decorated with code:Flat will not be interpolated.
806Instead, it will have the same value for every fragment within a triangle.
807This value will come from a single <<vertexpostproc-flatshading,provoking
808vertex>>.
809A variable decorated with code:Flat can: also be decorated with
810code:Centroid or code:Sample, which will mean the same thing as decorating
811it only as code:Flat.
812
813For fragment shader input variables decorated with neither code:Centroid nor
814code:Sample, the assigned variable may: be interpolated anywhere within the
815fragment and a single value may: be assigned to each sample within the
816fragment.
817
818If a fragment shader input is decorated with code:Centroid, a single value
819may: be assigned to that variable for all samples in the fragment, but that
820value must: be interpolated to a location that lies in both the fragment and
821in the primitive being rendered, including any of the fragment's samples
822covered by the primitive.
823Because the location at which the variable is interpolated may: be different
824in neighboring fragments, and derivatives may: be computed by computing
825differences between neighboring fragments, derivatives of centroid-sampled
826inputs may: be less accurate than those for non-centroid interpolated
827variables.
828ifdef::VK_EXT_post_depth_coverage[]
829The code:PostDepthCoverage execution mode does not affect the determination
830of the centroid location.
831endif::VK_EXT_post_depth_coverage[]
832
833If a fragment shader input is decorated with code:Sample, a separate value
834must: be assigned to that variable for each covered sample in the fragment,
835and that value must: be sampled at the location of the individual sample.
836When pname:rasterizationSamples is ename:VK_SAMPLE_COUNT_1_BIT, the fragment
837center must: be used for code:Centroid, code:Sample, and undecorated
838attribute interpolation.
839
840Fragment shader inputs that are signed or unsigned integers, integer
841vectors, or any double-precision floating-point type must: be decorated with
842code:Flat.
843
844ifdef::VK_AMD_shader_explicit_vertex_parameter[]
When the `apiext:VK_AMD_shader_explicit_vertex_parameter` device extension
is enabled, inputs can: also be decorated with the code:CustomInterpAMD
interpolation decoration, including fragment shader inputs that are signed
or unsigned integers, integer vectors, or any double-precision
floating-point type.
Inputs decorated with code:CustomInterpAMD can: only be accessed by the
extended instruction code:InterpolateAtVertexAMD, which allows accessing the
value of the input for individual vertices of the primitive.
853endif::VK_AMD_shader_explicit_vertex_parameter[]
854
855ifdef::VK_NV_fragment_shader_barycentric[]
856[[shaders-interpolation-decorations-pervertexnv]]
When the pname:fragmentShaderBarycentric feature is enabled, inputs can:
also be decorated with the code:PerVertexNV interpolation decoration, including
859fragment shader inputs that are signed or unsigned integers, integer
860vectors, or any double-precision floating-point type.
861Inputs decorated with code:PerVertexNV can: only be accessed using an extra
862array dimension, where the extra index identifies one of the vertices of the
863primitive that produced the fragment.
864endif::VK_NV_fragment_shader_barycentric[]
865
866
867[[shaders-staticuse]]
868== Static Use
869
870A SPIR-V module declares a global object in memory using the code:OpVariable
871instruction, which results in a pointer code:x to that object.
872A specific entry point in a SPIR-V module is said to _statically use_ that
873object if that entry point's call tree contains a function containing a
874memory instruction or image instruction with code:x as an code:id operand.
See the "`Memory Instructions`" and "`Image Instructions`" subsections of
section 3 "`Binary Form`" of the SPIR-V specification for the complete lists
of SPIR-V memory and image instructions.
878
879Static use is not used to control the behavior of variables with code:Input
880and code:Output storage.
881The effects of those variables are applied based only on whether they are
882present in a shader entry point's interface.
883
884
885[[shaders-scope]]
886== Scope
887
888A _scope_ describes a set of shader invocations, where each such set is a
889_scope instance_.
890Each invocation belongs to one or more scope instances, but belongs to no
891more than one scope instance for each scope.
892
893The operations available between invocations in a given scope instance vary,
894with smaller scopes generally able to perform more operations, and with
895greater efficiency.
896
897
898[[shaders-scope-cross-device]]
899=== Cross Device
900
901All invocations executed in a Vulkan instance fall into a single _cross
902device scope instance_.
903
904Whilst the code:CrossDevice scope is defined in SPIR-V, it is disallowed in
905Vulkan.
906API <<synchronization, synchronization>> commands can: be used to
907communicate between devices.
908
909
910[[shaders-scope-device]]
911=== Device
912
913All invocations executed on a single device form a _device scope instance_.
914
915ifdef::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[]
916If the <<features-vulkanMemoryModel,pname:vulkanMemoryModel>> and
917<<features-vulkanMemoryModelDeviceScope,
918pname:vulkanMemoryModelDeviceScope>> features are enabled, this scope is
919represented in SPIR-V by the code:Device code:Scope, which can: be used as a
920code:Memory code:Scope for barrier and atomic operations.
921
922ifdef::VK_KHR_shader_clock[]
923If both the <<features-shaderDeviceClock, pname:shaderDeviceClock>> and
924<<features-vulkanMemoryModelDeviceScope,
925pname:vulkanMemoryModelDeviceScope>> features are enabled, using the
926code:Device code:Scope with the code:OpReadClockKHR instruction will read
927from a clock that is consistent across invocations in the same device scope
928instance.
929endif::VK_KHR_shader_clock[]
930endif::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[]
931
932There is no method to synchronize the execution of these invocations within
933SPIR-V, and this can: only be done with API synchronization primitives.
934
935ifdef::VK_VERSION_1_1,VK_KHR_device_group[]
936Invocations executing on different devices in a device group operate in
937separate device scope instances.
938endif::VK_VERSION_1_1,VK_KHR_device_group[]
939
940ifndef::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[]
941The scope only extends to the queue family, not the whole device.
942endif::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[]
943
944
945[[shaders-scope-queue-family]]
946=== Queue Family
947
948Invocations executed by queues in a given queue family form a _queue family
949scope instance_.
950
951This scope is identified in SPIR-V as the
952ifdef::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[]
953code:QueueFamily code:Scope if the
954<<features-vulkanMemoryModel,pname:vulkanMemoryModel>> feature is enabled,
955or if not, the
956endif::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[]
957code:Device code:Scope, which can: be used as a code:Memory code:Scope for
958barrier and atomic operations.
959
960ifdef::VK_KHR_shader_clock[]
961If the <<features-shaderDeviceClock, pname:shaderDeviceClock>> feature is
962enabled,
963ifdef::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[]
964but the <<features-vulkanMemoryModelDeviceScope,
965pname:vulkanMemoryModelDeviceScope>> feature is not enabled,
966endif::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[]
967using the code:Device code:Scope with the code:OpReadClockKHR instruction
968will read from a clock that is consistent across invocations in the same
969queue family scope instance.
970endif::VK_KHR_shader_clock[]
971
972There is no method to synchronize the execution of these invocations within
973SPIR-V, and this can: only be done with API synchronization primitives.
974
975Each invocation in a queue family scope instance must: be in the same
976<<shaders-scope-device, device scope instance>>.
977
978
979[[shaders-scope-command]]
980=== Command
981
982Any shader invocations executed as the result of a single command such as
983flink:vkCmdDispatch or flink:vkCmdDraw form a _command scope instance_.
984For indirect drawing commands with pname:drawCount greater than one,
985invocations from separate draws are in separate command scope instances.
986ifdef::VK_KHR_ray_tracing_pipeline,VK_NV_ray_tracing[]
987For ray tracing shaders, an invocation group is an implementation-dependent
988subset of the set of shader invocations of a given shader stage which are
989produced by a single trace rays command.
990endif::VK_KHR_ray_tracing_pipeline,VK_NV_ray_tracing[]
991
992There is no specific code:Scope for communication across invocations in a
993command scope instance.
994As this has a clear boundary at the API level, coordination here can: be
995performed in the API, rather than in SPIR-V.
996
997Each invocation in a command scope instance must: be in the same
998<<shaders-scope-queue-family, queue-family scope instance>>.
999
1000For shaders without defined <<shaders-scope-workgroup, workgroups>>, this
1001set of invocations forms an _invocation group_ as defined in the
1002<<spirv-spec,SPIR-V specification>>.
1003
1004
1005[[shaders-scope-primitive]]
1006=== Primitive
1007
1008Any fragment shader invocations executed as the result of rasterization of a
1009single primitive form a _primitive scope instance_.
1010
1011There is no specific code:Scope for communication across invocations in a
1012primitive scope instance.
1013
1014Any generated <<shaders-helper-invocations, helper invocations>> are
1015included in this scope instance.
1016
1017Each invocation in a primitive scope instance must: be in the same
1018<<shaders-scope-command, command scope instance>>.
1019
1020Any input variables decorated with code:Flat are uniform within a primitive
1021scope instance.
1022
1023
1024// intentionally no VK_NV_ray_tracing here since this scope does not exist there
1025ifdef::VK_KHR_ray_tracing_pipeline[]
1026[[shaders-scope-shadercall]]
1027=== Shader Call
1028
1029Any <<shader-call-related,shader-call-related>> invocations that are
1030executed in one or more ray tracing execution models form a _shader call
1031scope instance_.
1032
The code:ShaderCallKHR code:Scope can: be used as a code:Memory code:Scope
for barrier and atomic operations.
1035
1036Each invocation in a shader call scope instance must: be in the same
1037<<shaders-scope-queue-family, queue family scope instance>>.
1038endif::VK_KHR_ray_tracing_pipeline[]
1039
1040
1041[[shaders-scope-workgroup]]
1042=== Workgroup
1043
1044A _local workgroup_ is a set of invocations that can synchronize and share
1045data with each other using memory in the code:Workgroup storage class.
1046
The code:Workgroup code:Scope can: be used as both an code:Execution
code:Scope and code:Memory code:Scope for barrier and atomic operations.
1049
1050Each invocation in a local workgroup must: be in the same
1051<<shaders-scope-command, command scope instance>>.
1052
1053Only
1054ifdef::VK_NV_mesh_shader[]
1055task, mesh, and
1056endif::VK_NV_mesh_shader[]
compute shaders have defined workgroups; other shader types cannot: use
workgroup functionality.
1059For shaders that have defined workgroups, this set of invocations forms an
1060_invocation group_ as defined in the <<spirv-spec,SPIR-V specification>>.
1061
1062
1063ifdef::VK_VERSION_1_1[]
1064[[shaders-scope-subgroup]]
1065=== Subgroup
1066
1067A _subgroup_ (see the subsection "`Control Flow`" of section 2 of the SPIR-V
10681.3 Revision 1 specification) is a set of invocations that can synchronize
1069and share data with each other efficiently.
1070
The code:Subgroup code:Scope can: be used as both an code:Execution
code:Scope and code:Memory code:Scope for barrier and atomic operations.
1073Other <<VkSubgroupFeatureFlagBits, subgroup features>> allow the use of
1074<<shaders-group-operations, group operations>> with subgroup scope.
1075
1076ifdef::VK_KHR_shader_clock[]
1077If the <<features-shaderSubgroupClock, pname:shaderSubgroupClock>> feature
1078is enabled, using the code:Subgroup code:Scope with the code:OpReadClockKHR
1079instruction will read from a clock that is consistent across invocations in
1080the same subgroup.
1081endif::VK_KHR_shader_clock[]
1082
1083For <<shaders-scope-workgroup, shaders that have defined workgroups>>, each
1084invocation in a subgroup must: be in the same <<shaders-scope-workgroup,
1085local workgroup>>.
1086
1087In other shader stages, each invocation in a subgroup must: be in the same
1088<<shaders-scope-device, device scope instance>>.
1089
1090Only <<limits-subgroup-supportedStages, shader stages that support subgroup
1091operations>> have defined subgroups.
1092endif::VK_VERSION_1_1[]
1093
1094
1095[[shaders-scope-quad]]
1096=== Quad
1097
1098A _quad scope instance_ is formed of four shader invocations.
1099
In a fragment shader, a quad scope instance is formed of invocations in
neighboring framebuffer locations [eq]#(x~i~, y~i~)#, where:
1102
1103  * [eq]#i# is the index of the invocation within the scope instance.
1104  * [eq]#w# and [eq]#h# are the number of pixels the fragment covers in the
1105    [eq]#x# and [eq]#y# axes.
1106  * [eq]#w# and [eq]#h# are identical for all participating invocations.
1107  * [eq]#(x~0~) = (x~1~ - w) = (x~2~) = (x~3~ - w)#
1108  * [eq]#(y~0~) = (y~1~) = (y~2~ - h) = (y~3~ - h)#
1109  * Each invocation has the same layer and sample indices.
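
The coordinate relations above can be sketched in C.
The type `quad_loc` and the function `fragment_quad_locations` are hypothetical names used only for illustration; `(x0, y0)` is taken to be the framebuffer location of invocation 0.

```c
/* Hypothetical type: framebuffer location of one invocation in the quad. */
typedef struct {
    int x;
    int y;
} quad_loc;

/*
 * Fills out[i] with the location of invocation i, given the location of
 * invocation 0 and the fragment size (w, h) shared by the whole quad.
 */
static void fragment_quad_locations(int x0, int y0, int w, int h,
                                    quad_loc out[4])
{
    for (int i = 0; i < 4; ++i) {
        out[i].x = x0 + (i & 1) * w;         /* i = 1, 3: x is x0 + w */
        out[i].y = y0 + ((i >> 1) & 1) * h;  /* i = 2, 3: y is y0 + h */
    }
}
```

Bit 0 of the index selects the column and bit 1 selects the row, which is why invocations 1 and 3 share an x coordinate and invocations 2 and 3 share a y coordinate.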
1110
1111ifdef::VK_NV_compute_shader_derivatives[]
In a compute shader, if the code:DerivativeGroupQuadsNV execution mode is
specified, a quad scope instance is formed of invocations with adjacent
local invocation IDs [eq]#(x~i~, y~i~)#, where:
1115
1116  * [eq]#i# is the index of the invocation within the quad scope instance.
1117  * [eq]#(x~0~) = (x~1~ - 1) = (x~2~) = (x~3~ - 1)#
1118  * [eq]#(y~0~) = (y~1~) = (y~2~ - 1) = (y~3~ - 1)#
1119  * [eq]#x~0~# and [eq]#y~0~# are integer multiples of 2.
1120  * Each invocation has the same [eq]#z# coordinate.
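
Read in the other direction, these relations let an invocation derive its quad index from its local invocation ID.
The following C sketch shows that mapping; the function names are hypothetical, and `x` and `y` are assumed to be the first two components of the local invocation ID.

```c
/* Quad index i: bit 0 comes from the parity of x, bit 1 from the parity of y. */
static unsigned quad_index_from_local_id(unsigned x, unsigned y)
{
    return (x & 1u) | ((y & 1u) << 1);
}

/* Local ID of invocation 0 of the quad: both components rounded down to even. */
static void quad_origin_from_local_id(unsigned x, unsigned y,
                                      unsigned *x0, unsigned *y0)
{
    *x0 = x & ~1u;
    *y0 = y & ~1u;
}
```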
1121
In a compute shader, if the code:DerivativeGroupLinearNV execution mode is
specified, a quad scope instance is formed of invocations with adjacent
local invocation indices [eq]#(l~i~)#, where:
1125
1126  * [eq]#i# is the index of the invocation within the quad scope instance.
1127  * [eq]#(l~0~) = (l~1~ - 1) = (l~2~ - 2) = (l~3~ - 3)#
1128  * [eq]#l~0~# is an integer multiple of 4.
1129
1130endif::VK_NV_compute_shader_derivatives[]
1131
1132ifdef::VK_VERSION_1_1[]
In all shaders, a quad scope instance is formed of invocations with adjacent
subgroup invocation indices [eq]#(s~i~)#, where:
1135
1136  * [eq]#i# is the index of the invocation within the quad scope instance.
1137  * [eq]#(s~0~) = (s~1~ - 1) = (s~2~ - 2) = (s~3~ - 3)#
1138  * [eq]#s~0~# is an integer multiple of 4.
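
For the linear layout, the quad index and the index of invocation 0 follow directly from the subgroup invocation index.
A minimal C sketch, with hypothetical function names, is:

```c
/* Quad index i of the invocation with subgroup invocation index s. */
static unsigned quad_index_from_subgroup_id(unsigned s)
{
    return s & 3u;
}

/* Subgroup invocation index s0 of invocation 0 in the same quad. */
static unsigned quad_base_from_subgroup_id(unsigned s)
{
    return s & ~3u;
}
```

For example, subgroup invocation index 13 has quad index 1 in the quad that starts at index 12.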
1139
1140Each invocation in a quad scope instance must: be in the same
1141<<shaders-scope-subgroup, subgroup>>.
1142endif::VK_VERSION_1_1[]
1143
1144ifndef::VK_VERSION_1_1[]
1145The specific set of invocations that make up a quad scope instance in other
1146shader stages is undefined:.
1147endif::VK_VERSION_1_1[]
1148
1149In a fragment shader, each invocation in a quad scope instance must: be in
1150the same <<shaders-scope-primitive, primitive scope instance>>.
1151
1152ifndef::VK_VERSION_1_1[]
1153For <<shaders-scope-workgroup, shaders that have defined workgroups>>, each
1154invocation in a quad scope instance must: be in the same
1155<<shaders-scope-workgroup, local workgroup>>.
1156
1157In other shader stages, each invocation in a quad scope instance must: be in
1158the same <<shaders-scope-device, device scope instance>>.
1159endif::VK_VERSION_1_1[]
1160
1161Fragment
1162ifdef::VK_NV_compute_shader_derivatives,VK_VERSION_1_1[]
1163and compute
1164endif::VK_NV_compute_shader_derivatives,VK_VERSION_1_1[]
1165shaders have defined quad scope instances.
1166ifdef::VK_VERSION_1_1[]
1167If the <<limits-subgroup-quadOperationsInAllStages,
1168pname:quadOperationsInAllStages>> limit is supported, any
1169<<limits-subgroup-supportedStages, shader stages that support subgroup
1170operations>> also have defined quad scope instances.
1171endif::VK_VERSION_1_1[]
1172
1173
1174ifdef::VK_EXT_fragment_shader_interlock[]
1175[[shaders-scope-fragment-interlock]]
1176=== Fragment Interlock
1177
1178A _fragment interlock scope instance_ is formed of fragment shader
1179invocations based on their framebuffer locations [eq]#(x,y,layer,sample)#,
1180executed by commands inside a single <<renderpass,subpass>>.
1181
1182The specific set of invocations included varies based on the execution mode
1183as follows:
1184
1185  * If the code:SampleInterlockOrderedEXT or
1186    code:SampleInterlockUnorderedEXT execution modes are used, only
1187    invocations with identical framebuffer locations
1188    [eq]#(x,y,layer,sample)# are included.
1189  * If the code:PixelInterlockOrderedEXT or code:PixelInterlockUnorderedEXT
1190    execution modes are used, fragments with different sample ids are also
1191    included.
1192ifdef::VK_NV_shading_rate_image,VK_KHR_fragment_shading_rate[]
1193  * If the code:ShadingRateInterlockOrderedEXT or
1194    code:ShadingRateInterlockUnorderedEXT execution modes are used,
1195    fragments from neighbouring framebuffer locations are also included, as
1196    <<primsrast-shading-rate-image,determined by the shading rate>>.
1197endif::VK_NV_shading_rate_image,VK_KHR_fragment_shading_rate[]
1198
1199Only fragment shaders with one of the above execution modes have defined
1200fragment interlock scope instances.
1201
1202There is no specific code:Scope value for communication across invocations
1203in a fragment interlock scope instance.
1204However, this is implicitly used as a memory scope by
1205code:OpBeginInvocationInterlockEXT and code:OpEndInvocationInterlockEXT.
1206
1207Each invocation in a fragment interlock scope instance must: be in the same
1208<<shaders-scope-queue-family, queue family scope instance>>.
1209endif::VK_EXT_fragment_shader_interlock[]
1210
1211
1212[[shaders-scope-invocation]]
1213=== Invocation
1214
1215The smallest _scope_ is a single invocation; this is represented by the
1216code:Invocation code:Scope in SPIR-V.
1217
1218Fragment shader invocations must: be in a <<shaders-scope-primitive,
1219primitive scope instance>>.
1220
1221ifdef::VK_EXT_fragment_shader_interlock[]
1222Invocations in <<shaders-scope-fragment-interlock, fragment shaders that
1223have a defined fragment interlock scope>> must: be in a
1224<<shaders-scope-fragment-interlock, fragment interlock scope instance>>.
1225endif::VK_EXT_fragment_shader_interlock[]
1226
1227Invocations in <<shaders-scope-workgroup, shaders that have defined
1228workgroups>> must: be in a <<shaders-scope-workgroup, local workgroup>>.
1229
1230ifdef::VK_VERSION_1_1[]
1231Invocations in <<shaders-scope-subgroup, shaders that have a defined
1232subgroup scope>> must: be in a <<shaders-scope-subgroup, subgroup>>.
1233endif::VK_VERSION_1_1[]
1234
1235Invocations in <<shaders-scope-quad, shaders that have a defined quad
1236scope>> must: be in a <<shaders-scope-quad, quad scope instance>>.
1237
1238All invocations in all stages must: be in a <<shaders-scope-command,command
1239scope instance>>.
1240
1241
1242ifdef::VK_VERSION_1_1[]
1243[[shaders-group-operations]]
1244== Group Operations
1245
_Group operations_ are executed by multiple invocations within a
<<shaders-scope, scope instance>>, with each invocation involved in
calculating the result.
1249This provides a mechanism for efficient communication between invocations in
1250a particular scope instance.
1251
1252Group operations all take a code:Scope defining the desired
1253<<shaders-scope,scope instance>> to operate within.
1254Only the code:Subgroup scope can: be used for these operations; the
1255<<limits-subgroupSupportedOperations, pname:subgroupSupportedOperations>>
1256limit defines which types of operation can: be used.
1257
1258
1259[[shaders-group-operations-basic]]
1260=== Basic Group Operations
1261
1262Basic group operations include the use of code:OpGroupNonUniformElect,
1263code:OpControlBarrier, code:OpMemoryBarrier, and atomic operations.
1264
1265code:OpGroupNonUniformElect can: be used to choose a single invocation to
1266perform a task for the whole group.
Only the active invocation with the lowest id in the group will return
code:true.
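
The election rule can be modeled on the host as follows.
This is only a sketch: the function name `elect` is hypothetical, and `active` is an assumed array marking which invocations of the group currently participate.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true only for the lowest-numbered active invocation. */
static bool elect(const bool active[], size_t id)
{
    if (!active[id])
        return false;
    for (size_t i = 0; i < id; ++i) {
        if (active[i])
            return false;  /* a lower-numbered active invocation exists */
    }
    return true;
}
```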
1268
1269The <<memory-model,Memory Model>> appendix defines the operation of barriers
1270and atomics.
1271
1272
1273[[shaders-group-operations-vote]]
1274=== Vote Group Operations
1275
1276The vote group operations allow invocations within a group to compare values
1277across a group.
1278The types of votes enabled are:
1279
1280  * Do all active group invocations agree that an expression is true?
1281  * Do any active group invocations evaluate an expression to true?
1282  * Do all active group invocations have the same value of an expression?
1283
1284[NOTE]
1285.Note
1286====
1287These operations are useful in combination with control flow in that they
1288allow for developers to check whether conditions match across the group and
1289choose potentially faster code-paths in these cases.
1290====
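
The three vote types can be modeled on the host as shown below.
The function names are hypothetical, and the `active` array is an assumed stand-in for the set of invocations participating in the operation; inactive invocations do not contribute.

```c
#include <stdbool.h>
#include <stddef.h>

/* True if every active invocation evaluated the predicate to true. */
static bool vote_all(const bool active[], const bool pred[], size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        if (active[i] && !pred[i])
            return false;
    }
    return true;
}

/* True if at least one active invocation evaluated the predicate to true. */
static bool vote_any(const bool active[], const bool pred[], size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        if (active[i] && pred[i])
            return true;
    }
    return false;
}

/* True if every active invocation holds the same value. */
static bool vote_all_equal(const bool active[], const int value[], size_t n)
{
    bool have_first = false;
    int first = 0;
    for (size_t i = 0; i < n; ++i) {
        if (!active[i])
            continue;
        if (!have_first) {
            first = value[i];
            have_first = true;
        } else if (value[i] != first) {
            return false;
        }
    }
    return true;
}
```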
1291
1292
1293[[shaders-group-operations-arithmetic]]
1294=== Arithmetic Group Operations
1295
1296The arithmetic group operations allow invocations to perform scans and
1297reductions across a group.
1298The operators supported are add, mul, min, max, and, or, xor.
1299
1300For reductions, every invocation in a group will obtain the cumulative
1301result of these operators applied to all values in the group.
1302For exclusive scans, each invocation in a group will obtain the cumulative
1303result of these operators applied to all values in invocations with a lower
1304index in the group.
1305Inclusive scans are identical to exclusive scans, except the cumulative
1306result includes the operator applied to the value in the current invocation.
1307
1308The order in which these operators are applied is implementation-dependent.
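
The difference between a reduction, an exclusive scan, and an inclusive scan can be illustrated with the add operator.
The following C sketch uses a hypothetical function `group_add`, assumes every invocation in the group is active, and assumes that the first invocation of an exclusive scan receives the identity value of the operator (0 for add).

```c
#include <stddef.h>

/*
 * v[i] is the value contributed by invocation i. On return, reduce[i],
 * excl[i] and incl[i] hold what invocation i would obtain from an add
 * reduction, exclusive scan and inclusive scan respectively.
 */
static void group_add(const int v[], size_t n,
                      int reduce[], int excl[], int incl[])
{
    int running = 0;  /* identity for add */
    for (size_t i = 0; i < n; ++i) {
        excl[i] = running;   /* sum of values at lower indices only */
        running += v[i];
        incl[i] = running;   /* also includes the invocation's own value */
    }
    for (size_t i = 0; i < n; ++i)
        reduce[i] = running; /* every invocation sees the full result */
}
```

The sequential loop is only a convenience of the model; because the order of application is implementation-dependent, results for floating-point operators may differ from a strict left-to-right evaluation.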
1309
1310
1311[[shaders-group-operations-ballot]]
1312=== Ballot Group Operations
1313
1314The ballot group operations allow invocations to perform more complex votes
1315across the group.
The ballot functionality allows all invocations within a group to provide a
boolean value and receive, as the result, the boolean value that each
invocation provided.
1319The broadcast functionality allows values to be broadcast from an invocation
1320to all other invocations within the group.
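
A host-side model of both functions is sketched below.
It is a simplification with hypothetical names: the group is assumed to have at most 32 invocations, all of them active, so that the ballot fits in a single 32-bit mask.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit i of the result is set if invocation i provided true (n <= 32). */
static uint32_t ballot(const bool pred[], size_t n)
{
    uint32_t mask = 0;
    for (size_t i = 0; i < n; ++i) {
        if (pred[i])
            mask |= (uint32_t)1 << i;
    }
    return mask;
}

/* Every invocation obtains the value held by invocation src. */
static int broadcast(const int value[], size_t src)
{
    return value[src];
}
```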
1321
1322
1323[[shaders-group-operations-shuffle]]
1324=== Shuffle Group Operations
1325
1326The shuffle group operations allow invocations to read values from other
1327invocations within a group.
1328
1329
1330[[shaders-group-operations-shuffle-relative]]
1331=== Shuffle Relative Group Operations
1332
1333The shuffle relative group operations allow invocations to read values from
1334other invocations within the group relative to the current invocation in the
1335group.
1336The relative operations supported allow data to be shifted up and down
1337through the invocations within a group.
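
The two shift directions can be modeled as follows.
The function names are hypothetical; the sketch assumes that shifting up reads from the invocation `delta` positions lower in the group and shifting down reads from the invocation `delta` positions higher, and it returns `false` where the source would fall outside the group rather than producing a value.

```c
#include <stdbool.h>
#include <stddef.h>

/* Invocation id reads the value of invocation id - delta. */
static bool shuffle_up(const int v[], size_t n, size_t id, size_t delta,
                       int *out)
{
    if (id >= n || delta > id)
        return false;        /* source index would be below 0 */
    *out = v[id - delta];
    return true;
}

/* Invocation id reads the value of invocation id + delta. */
static bool shuffle_down(const int v[], size_t n, size_t id, size_t delta,
                         int *out)
{
    if (id >= n || delta >= n - id)
        return false;        /* source index would be n or greater */
    *out = v[id + delta];
    return true;
}
```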
1338
1339
1340[[shaders-group-operations-clustered]]
1341=== Clustered Group Operations
1342
The clustered group operations allow invocations to perform an operation
among partitions of a group, such that the operation is only performed
among the invocations within a partition.
1346The partitions for clustered group operations are consecutive power-of-two
1347size groups of invocations and the cluster size must: be known at pipeline
1348creation time.
1349The operations supported are add, mul, min, max, and, or, xor.
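
As an illustration, a clustered add can be modeled on the host as shown below.
The function name `clustered_add` is hypothetical, and the sketch assumes that all invocations are active and that `cluster` is a power of two.

```c
#include <stddef.h>

/*
 * out[i] receives the sum of the values of all invocations in the same
 * cluster as invocation i. Clusters are consecutive runs of `cluster`
 * invocations starting at index 0.
 */
static void clustered_add(const int v[], size_t n, size_t cluster, int out[])
{
    for (size_t base = 0; base < n; base += cluster) {
        size_t end = (base + cluster < n) ? base + cluster : n;
        int sum = 0;
        for (size_t i = base; i < end; ++i)
            sum += v[i];
        for (size_t i = base; i < end; ++i)
            out[i] = sum;
    }
}
```

With eight invocations and a cluster size of 4, the first four invocations share one result and the last four share another.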
1350
1351
1352[[shaders-quad-operations]]
1353== Quad Group Operations
1354
1355Quad group operations (code:OpGroupNonUniformQuad*) are a specialized type
1356of <<shaders-group-operations, group operations>> that only operate on
1357<<shaders-scope-quad, quad scope instances>>.
1358Whilst these instructions do include a code:Scope parameter, this scope is
1359always overridden; only the <<shaders-scope-quad, quad scope instance>> is
1360included in its execution scope.
1361
1362Fragment shaders that statically execute quad group operations must: launch
1363sufficient invocations to ensure their correct operation; additional
1364<<shaders-helper-invocations, helper invocations>> are launched for
1365framebuffer locations not covered by rasterized fragments if necessary.
1366
1367The index used to select participating invocations is [eq]#i#, as described
1368for a <<shaders-scope-quad, quad scope instance>>, defined as the _quad
1369index_ in the <<spirv-spec,SPIR-V specification>>.
1370
1371For code:OpGroupNonUniformQuadBroadcast this value is equal to code:Index.
1372For code:OpGroupNonUniformQuadSwap, it is equal to the implicit code:Index
1373used by each participating invocation.
1374endif::VK_VERSION_1_1[]
1375
1376
1377[[shaders-derivative-operations]]
1378== Derivative Operations
1379
1380Derivative operations calculate the partial derivative for an expression
1381[eq]#P# as a function of an invocation's [eq]#x# and [eq]#y# coordinates.
1382
1383Derivative operations operate on a set of invocations known as a _derivative
1384group_ as defined in the <<spirv-spec,SPIR-V specification>>.
1385A derivative group is equivalent to
1386ifdef::VK_NV_compute_shader_derivatives[]
1387the <<shaders-scope-quad, quad scope instance>> for a compute shader
1388invocation, or
1389endif::VK_NV_compute_shader_derivatives[]
1390the <<shaders-scope-primitive, primitive scope instance>> for a fragment
1391shader invocation.
1392
1393Derivatives are calculated assuming that [eq]#P# is piecewise linear and
1394continuous within the derivative group.
1395All dynamic instances of explicit derivative instructions (code:OpDPdx*,
1396code:OpDPdy*, and code:OpFwidth*) must: be executed in control flow that is
1397uniform within a derivative group.
1398For other derivative operations, results are undefined: if a dynamic
1399instance is executed in control flow that is not uniform within the
1400derivative group.
1401
1402Fragment shaders that statically execute derivative operations must: launch
1403sufficient invocations to ensure their correct operation; additional
1404<<shaders-helper-invocations, helper invocations>> are launched for
1405framebuffer locations not covered by rasterized fragments if necessary.
1406
1407ifdef::VK_NV_compute_shader_derivatives[]
1408[NOTE]
1409.Note
1410====
1411In a compute shader, it is the application's responsibility to ensure that
1412sufficient invocations are launched.
1413====
1414endif::VK_NV_compute_shader_derivatives[]
1415
Derivative operations calculate their results as the difference between the
values of [eq]#P# across invocations in the quad.
1418For fine derivative operations (code:OpDPdxFine and code:OpDPdyFine), the
1419values of [eq]#DPdx(P~i~)# are calculated as
1420
1421  {empty}:: [eq]#DPdx(P~0~) = DPdx(P~1~) = P~1~ - P~0~#
1422  {empty}:: [eq]#DPdx(P~2~) = DPdx(P~3~) = P~3~ - P~2~#
1423
1424and the values of [eq]#DPdy(P~i~)# are calculated as
1425
1426  {empty}:: [eq]#DPdy(P~0~) = DPdy(P~2~) = P~2~ - P~0~#
1427  {empty}:: [eq]#DPdy(P~1~) = DPdy(P~3~) = P~3~ - P~1~#
1428
1429where [eq]#i# is the index of each invocation as described in
1430<<shaders-scope-quad>>.
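
The fine derivative equations can be written out in C as a host-side model.
The function names are hypothetical; `p[i]` is the value of [eq]#P# in the invocation with quad index [eq]#i#, and `out[i]` is the derivative that invocation obtains.

```c
/* Fine x derivative: each row of the quad uses its own horizontal difference. */
static void dpdx_fine(const float p[4], float out[4])
{
    out[0] = out[1] = p[1] - p[0];
    out[2] = out[3] = p[3] - p[2];
}

/* Fine y derivative: each column of the quad uses its own vertical difference. */
static void dpdy_fine(const float p[4], float out[4])
{
    out[0] = out[2] = p[2] - p[0];
    out[1] = out[3] = p[3] - p[1];
}
```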
1431
Coarse derivative operations (code:OpDPdxCoarse and code:OpDPdyCoarse)
calculate their results in roughly the same manner, but may: only calculate
1434two values instead of four (one for each of [eq]#DPdx# and [eq]#DPdy#),
1435reusing the same result no matter the originating invocation.
1436If an implementation does this, it should: use the fine derivative
1437calculations described for [eq]#P~0~#.
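
One permitted coarse behavior, in which a single value per direction is shared by the whole quad, can be modeled as below.
The function names are hypothetical, and the sketch shows only the case where the implementation reuses the fine calculation described for [eq]#P~0~#; an implementation is also free to return per-invocation results.

```c
/* Coarse x derivative: every invocation receives P1 - P0. */
static void dpdx_coarse(const float p[4], float out[4])
{
    const float d = p[1] - p[0];
    for (int i = 0; i < 4; ++i)
        out[i] = d;
}

/* Coarse y derivative: every invocation receives P2 - P0. */
static void dpdy_coarse(const float p[4], float out[4])
{
    const float d = p[2] - p[0];
    for (int i = 0; i < 4; ++i)
        out[i] = d;
}
```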
1438
1439[NOTE]
1440.Note
1441====
1442Derivative values are calculated between fragments rather than pixels.
1443If the fragment shader invocations involved in the calculation cover
1444multiple pixels, these operations cover a wider area, resulting in larger
1445derivative values.
1446This in turn will result in a coarser level of detail being selected for
1447image sampling operations using derivatives.
1448
1449Applications may want to account for this when using multi-pixel fragments;
1450if pixel derivatives are desired, applications should use explicit
1451derivative operations and divide the results by the size of the fragment in
1452each dimension as follows:
1453
1454  {empty}:: [eq]#DPdx(P~n~)' = DPdx(P~n~) / w#
1455  {empty}:: [eq]#DPdy(P~n~)' = DPdy(P~n~) / h#
1456
1457where [eq]#w# and [eq]#h# are the size of the fragments in the quad, and
1458[eq]#DPdx(P~n~)'# and [eq]#DPdy(P~n~)'# are the pixel derivatives.
1459====
1460
1461The results for code:OpDPdx and code:OpDPdy may: be calculated as either
1462fine or coarse derivatives, with implementations favouring the most
1463efficient approach.
Implementations must: make the same choice, coarse or fine, for both
instructions.
1465
1466Executing code:OpFwidthFine, code:OpFwidthCoarse, or code:OpFwidth is
1467equivalent to executing the corresponding code:OpDPdx* and code:OpDPdy*
1468instructions, taking the absolute value of the results, and summing them.
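
As a sketch of that equivalence, the following C code computes the result of
code:OpFwidthFine for each invocation in a quad from the fine derivative
equations given earlier in this section.
The function name `quad_fwidth_fine` and the array layout (one value of
[eq]#P# per quad invocation index) are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

static float abs_f(float x)
{
    return x < 0.0f ? -x : x;
}

/* p[i] is the value of P in quad invocation i.
 * out[i] = |DPdx(P_i)| + |DPdy(P_i)| using the fine derivatives. */
void quad_fwidth_fine(const float p[4], float out[4])
{
    assert(p != NULL && out != NULL);
    const float dx_top    = p[1] - p[0]; /* DPdx for invocations 0 and 1 */
    const float dx_bottom = p[3] - p[2]; /* DPdx for invocations 2 and 3 */
    const float dy_left   = p[2] - p[0]; /* DPdy for invocations 0 and 2 */
    const float dy_right  = p[3] - p[1]; /* DPdy for invocations 1 and 3 */

    out[0] = abs_f(dx_top)    + abs_f(dy_left);
    out[1] = abs_f(dx_top)    + abs_f(dy_right);
    out[2] = abs_f(dx_bottom) + abs_f(dy_left);
    out[3] = abs_f(dx_bottom) + abs_f(dy_right);
}
```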
1469
1470Executing an code:OpImage*Sample*ImplicitLod instruction is equivalent to
1471executing code:OpDPdx(code:Coordinate) and code:OpDPdy(code:Coordinate), and
1472passing the results as the code:Grad operands code:dx and code:dy.
1473
1474[NOTE]
1475.Note
1476====
1477It is expected that using the code:ImplicitLod variants of sampling
1478functions will be substantially more efficient than using the
1479code:ExplicitLod variants with explicitly generated derivatives.
1480====
1481
1482
1483[[shaders-helper-invocations]]
1484== Helper Invocations
1485
1486When performing <<shaders-derivative-operations, derivative>>
1487ifdef::VK_VERSION_1_1[]
1488or <<shaders-quad-operations, quad group>>
1489endif::VK_VERSION_1_1[]
1490operations in a fragment shader, additional invocations may: be spawned in
1491order to ensure correct results.
1492These additional invocations are known as _helper invocations_ and can: be
1493identified by a non-zero value in the code:HelperInvocation built-in.
1494Stores and atomics performed by helper invocations must: not have any effect
1495on memory, and values returned by atomic instructions in helper invocations
1496are undefined:.
1497
1498For <<shaders-group-operations, group operations>> other than
1499<<shaders-derivative-operations, derivative>>
1500ifdef::VK_VERSION_1_1[]
1501and <<shaders-quad-operations, quad group>>
1502endif::VK_VERSION_1_1[]
1503operations, helper invocations may: be treated as inactive even if they
1504would be considered otherwise active.
1505
1506ifdef::VK_EXT_shader_demote_to_helper_invocation[]
1507Helper invocations may: become permanently inactive if all invocations in a
1508quad scope instance become helper invocations.
1509endif::VK_EXT_shader_demote_to_helper_invocation[]
1510
1511
1512ifdef::VK_NV_cooperative_matrix[]
1513== Cooperative Matrices
1514
1515A _cooperative matrix_ type is a SPIR-V type where the storage for and
1516computations performed on the matrix are spread across the invocations in a
1517scope instance.
1518These types give the implementation freedom in how to optimize matrix
1519multiplies.
1520
1521SPIR-V defines the types and instructions, but does not specify rules about
1522what sizes/combinations are valid, and it is expected that different
1523implementations may: support different sizes.
1524
1525[open,refpage='vkGetPhysicalDeviceCooperativeMatrixPropertiesNV',desc='Returns properties describing what cooperative matrix types are supported',type='protos']
1526--
1527To enumerate the supported cooperative matrix types and operations, call:
1528
1529include::{generated}/api/protos/vkGetPhysicalDeviceCooperativeMatrixPropertiesNV.txt[]
1530
1531  * pname:physicalDevice is the physical device.
1532  * pname:pPropertyCount is a pointer to an integer related to the number of
1533    cooperative matrix properties available or queried.
1534  * pname:pProperties is either `NULL` or a pointer to an array of
1535    slink:VkCooperativeMatrixPropertiesNV structures.
1536
1537If pname:pProperties is `NULL`, then the number of cooperative matrix
1538properties available is returned in pname:pPropertyCount.
1539Otherwise, pname:pPropertyCount must: point to a variable set by the user to
1540the number of elements in the pname:pProperties array, and on return the
1541variable is overwritten with the number of structures actually written to
1542pname:pProperties.
1543If pname:pPropertyCount is less than the number of cooperative matrix
1544properties available, at most pname:pPropertyCount structures will be
1545written, and ename:VK_INCOMPLETE will be returned instead of
1546ename:VK_SUCCESS, to indicate that not all the available cooperative matrix
1547properties were returned.
1548
1549include::{generated}/validity/protos/vkGetPhysicalDeviceCooperativeMatrixPropertiesNV.txt[]
1550--
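
The count-then-fill calling convention described above can be sketched in C
as follows.
To keep the example self-contained it does not call the real command:
`mock_get_matrix_props`, `MockMatrixProps`, and `MockResult` are hypothetical
stand-ins that only imitate the pname:pPropertyCount and pname:pProperties
semantics, and the property values in the table are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins, not Vulkan types. */
typedef enum { MOCK_SUCCESS, MOCK_INCOMPLETE } MockResult;
typedef struct { uint32_t MSize, NSize, KSize; } MockMatrixProps;

/* Made-up data standing in for what a device might report. */
static const MockMatrixProps k_available[] = {
    {16, 16, 16}, {8, 8, 32}, {16, 8, 8}
};

/* Imitates the query: count only when pProperties is NULL, otherwise
 * write at most *pPropertyCount entries and report how many were written. */
MockResult mock_get_matrix_props(uint32_t *pPropertyCount,
                                 MockMatrixProps *pProperties)
{
    const uint32_t available =
        (uint32_t)(sizeof k_available / sizeof k_available[0]);
    assert(pPropertyCount != NULL);

    if (pProperties == NULL) {
        *pPropertyCount = available;
        return MOCK_SUCCESS;
    }
    const uint32_t capacity = *pPropertyCount;
    const uint32_t n = capacity < available ? capacity : available;
    for (uint32_t i = 0; i < n; ++i) {
        pProperties[i] = k_available[i];
    }
    *pPropertyCount = n;
    return capacity < available ? MOCK_INCOMPLETE : MOCK_SUCCESS;
}

/* Caller side: query the count, allocate, then query the contents.
 * Returns a malloc'd array (caller frees) or NULL on failure. */
MockMatrixProps *enumerate_matrix_props(uint32_t *outCount)
{
    uint32_t count = 0;
    assert(outCount != NULL);
    *outCount = 0;
    if (mock_get_matrix_props(&count, NULL) != MOCK_SUCCESS || count == 0) {
        return NULL;
    }
    MockMatrixProps *props = malloc(count * sizeof *props);
    if (props == NULL) {
        return NULL;
    }
    if (mock_get_matrix_props(&count, props) != MOCK_SUCCESS) {
        free(props);
        return NULL;
    }
    *outCount = count;
    return props;
}
```

With the real command, each element of the array would additionally need its
pname:sType member initialized before the second call.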
1551
1552[open,refpage='VkCooperativeMatrixPropertiesNV',desc='Structure specifying cooperative matrix properties',type='structs']
1553--
1554Each sname:VkCooperativeMatrixPropertiesNV structure describes a single
1555supported combination of types for a matrix multiply/add operation
1556(code:OpCooperativeMatrixMulAddNV).
1557The multiply can: be described in terms of the following variables and types
1558(in SPIR-V pseudocode):
1559
1560[source,c]
1561~~~~
1562    %A is of type OpTypeCooperativeMatrixNV %AType %scope %MSize %KSize
1563    %B is of type OpTypeCooperativeMatrixNV %BType %scope %KSize %NSize
1564    %C is of type OpTypeCooperativeMatrixNV %CType %scope %MSize %NSize
1565    %D is of type OpTypeCooperativeMatrixNV %DType %scope %MSize %NSize
1566
1567    %D = %A * %B + %C // using OpCooperativeMatrixMulAddNV
1568~~~~
1569
1570A matrix multiply with these dimensions is known as an _MxNxK_ matrix
1571multiply.
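
To make the roles of _M_, _N_, and _K_ concrete, the following C function is
a scalar reference for the arithmetic only.
It is a sketch under stated assumptions: all four matrices use `float`
components and row-major storage, and the hypothetical name `matrix_mul_add`
is not part of any API.
It says nothing about how an implementation distributes storage or
computation across invocations.

```c
#include <assert.h>
#include <stddef.h>

/* D = A * B + C where A is M x K, B is K x N, and C and D are M x N.
 * All matrices are stored row-major in flat arrays. */
void matrix_mul_add(size_t M, size_t N, size_t K,
                    const float *A, const float *B,
                    const float *C, float *D)
{
    assert(A != NULL && B != NULL && C != NULL && D != NULL);
    for (size_t i = 0; i < M; ++i) {
        for (size_t j = 0; j < N; ++j) {
            float acc = C[i * N + j];
            for (size_t k = 0; k < K; ++k) {
                acc += A[i * K + k] * B[k * N + j];
            }
            D[i * N + j] = acc;
        }
    }
}
```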
1572
1573The sname:VkCooperativeMatrixPropertiesNV structure is defined as:
1574
1575include::{generated}/api/structs/VkCooperativeMatrixPropertiesNV.txt[]
1576
1577  * pname:sType is the type of this structure.
1578  * pname:pNext is `NULL` or a pointer to a structure extending this
1579    structure.
1580  * pname:MSize is the number of rows in matrices A, C, and D.
1581  * pname:KSize is the number of columns in matrix A and rows in matrix B.
  * pname:NSize is the number of columns in matrices B, C, and D.
1583  * pname:AType is the component type of matrix A, of type
1584    elink:VkComponentTypeNV.
1585  * pname:BType is the component type of matrix B, of type
1586    elink:VkComponentTypeNV.
1587  * pname:CType is the component type of matrix C, of type
1588    elink:VkComponentTypeNV.
1589  * pname:DType is the component type of matrix D, of type
1590    elink:VkComponentTypeNV.
1591  * pname:scope is the scope of all the matrix types, of type
1592    elink:VkScopeNV.
1593
1594If some types are preferred over other types (e.g. for performance), they
1595should: appear earlier in the list enumerated by
1596flink:vkGetPhysicalDeviceCooperativeMatrixPropertiesNV.
1597
At least one entry in the list must: have power-of-two values for all of
1599pname:MSize, pname:KSize, and pname:NSize.
1600
1601include::{generated}/validity/structs/VkCooperativeMatrixPropertiesNV.txt[]
1602--
1603
1604[open,refpage='VkScopeNV',desc='Specify SPIR-V scope',type='enums']
1605--
1606Possible values for elink:VkScopeNV include:
1607
1608include::{generated}/api/enums/VkScopeNV.txt[]
1609
1610  * ename:VK_SCOPE_DEVICE_NV corresponds to SPIR-V code:Device scope.
1611  * ename:VK_SCOPE_WORKGROUP_NV corresponds to SPIR-V code:Workgroup scope.
1612  * ename:VK_SCOPE_SUBGROUP_NV corresponds to SPIR-V code:Subgroup scope.
1613  * ename:VK_SCOPE_QUEUE_FAMILY_NV corresponds to SPIR-V code:QueueFamily
1614    scope.
1615
1616All enum values match the corresponding SPIR-V value.
1617--
1618
1619[open,refpage='VkComponentTypeNV',desc='Specify SPIR-V cooperative matrix component type',type='enums']
1620--
1621Possible values for elink:VkComponentTypeNV include:
1622
1623include::{generated}/api/enums/VkComponentTypeNV.txt[]
1624
1625  * ename:VK_COMPONENT_TYPE_FLOAT16_NV corresponds to SPIR-V
1626    code:OpTypeFloat 16.
1627  * ename:VK_COMPONENT_TYPE_FLOAT32_NV corresponds to SPIR-V
1628    code:OpTypeFloat 32.
1629  * ename:VK_COMPONENT_TYPE_FLOAT64_NV corresponds to SPIR-V
1630    code:OpTypeFloat 64.
1631  * ename:VK_COMPONENT_TYPE_SINT8_NV corresponds to SPIR-V code:OpTypeInt 8 1.
1632  * ename:VK_COMPONENT_TYPE_SINT16_NV corresponds to SPIR-V code:OpTypeInt
1633    16 1.
1634  * ename:VK_COMPONENT_TYPE_SINT32_NV corresponds to SPIR-V code:OpTypeInt
1635    32 1.
1636  * ename:VK_COMPONENT_TYPE_SINT64_NV corresponds to SPIR-V code:OpTypeInt
1637    64 1.
1638  * ename:VK_COMPONENT_TYPE_UINT8_NV corresponds to SPIR-V code:OpTypeInt 8 0.
1639  * ename:VK_COMPONENT_TYPE_UINT16_NV corresponds to SPIR-V code:OpTypeInt
1640    16 0.
1641  * ename:VK_COMPONENT_TYPE_UINT32_NV corresponds to SPIR-V code:OpTypeInt
1642    32 0.
1643  * ename:VK_COMPONENT_TYPE_UINT64_NV corresponds to SPIR-V code:OpTypeInt
1644    64 0.
1645--
1646endif::VK_NV_cooperative_matrix[]
1647
1648
1649ifdef::VK_EXT_validation_cache[]
1650[[shaders-validation-cache]]
1651== Validation Cache
1652
1653[open,refpage='VkValidationCacheEXT',desc='Opaque handle to a validation cache object',type='handles']
1654--
1655Validation cache objects allow the result of internal validation to be
1656reused, both within a single application run and between multiple runs.
1657Reuse within a single run is achieved by passing the same validation cache
1658object when creating supported Vulkan objects.
1659Reuse across runs of an application is achieved by retrieving validation
1660cache contents in one run of an application, saving the contents, and using
1661them to preinitialize a validation cache on a subsequent run.
1662The contents of the validation cache objects are managed by the validation
1663layers.
1664Applications can: manage the host memory consumed by a validation cache
1665object and control the amount of data retrieved from a validation cache
1666object.
1667
1668Validation cache objects are represented by sname:VkValidationCacheEXT
1669handles:
1670
1671include::{generated}/api/handles/VkValidationCacheEXT.txt[]
1672--
1673
1674[open,refpage='vkCreateValidationCacheEXT',desc='Creates a new validation cache',type='protos']
1675--
1676To create validation cache objects, call:
1677
1678include::{generated}/api/protos/vkCreateValidationCacheEXT.txt[]
1679
1680  * pname:device is the logical device that creates the validation cache
1681    object.
1682  * pname:pCreateInfo is a pointer to a slink:VkValidationCacheCreateInfoEXT
1683    structure containing the initial parameters for the validation cache
1684    object.
1685  * pname:pAllocator controls host memory allocation as described in the
1686    <<memory-allocation, Memory Allocation>> chapter.
1687  * pname:pValidationCache is a pointer to a slink:VkValidationCacheEXT
1688    handle in which the resulting validation cache object is returned.
1689
1690[NOTE]
1691.Note
1692====
1693Applications can: track and manage the total host memory size of a
1694validation cache object using the pname:pAllocator.
1695Applications can: limit the amount of data retrieved from a validation cache
1696object in fname:vkGetValidationCacheDataEXT.
1697Implementations should: not internally limit the total number of entries
1698added to a validation cache object or the total host memory consumed.
1699====
1700
1701Once created, a validation cache can: be passed to the
1702fname:vkCreateShaderModule command by adding this object to the
1703slink:VkShaderModuleCreateInfo structure's pname:pNext chain.
1704If a slink:VkShaderModuleValidationCacheCreateInfoEXT object is included in
1705the slink:VkShaderModuleCreateInfo::pname:pNext chain, and its
1706pname:validationCache field is not dlink:VK_NULL_HANDLE, the implementation
1707will query it for possible reuse opportunities and update it with new
1708content.
1709The use of the validation cache object in these commands is internally
1710synchronized, and the same validation cache object can: be used in multiple
1711threads simultaneously.
1712
1713[NOTE]
1714.Note
1715====
1716Implementations should: make every effort to limit any critical sections to
1717the actual accesses to the cache, which is expected to be significantly
1718shorter than the duration of the fname:vkCreateShaderModule command.
1719====
1720
1721include::{generated}/validity/protos/vkCreateValidationCacheEXT.txt[]
1722--
1723
1724[open,refpage='VkValidationCacheCreateInfoEXT',desc='Structure specifying parameters of a newly created validation cache',type='structs']
1725--
1726The sname:VkValidationCacheCreateInfoEXT structure is defined as:
1727
1728include::{generated}/api/structs/VkValidationCacheCreateInfoEXT.txt[]
1729
1730  * pname:sType is the type of this structure.
1731  * pname:pNext is `NULL` or a pointer to a structure extending this
1732    structure.
1733  * pname:flags is reserved for future use.
1734  * pname:initialDataSize is the number of bytes in pname:pInitialData.
1735    If pname:initialDataSize is zero, the validation cache will initially be
1736    empty.
1737  * pname:pInitialData is a pointer to previously retrieved validation cache
1738    data.
1739    If the validation cache data is incompatible (as defined below) with the
1740    device, the validation cache will be initially empty.
1741    If pname:initialDataSize is zero, pname:pInitialData is ignored.
1742
1743.Valid Usage
1744****
1745  * [[VUID-VkValidationCacheCreateInfoEXT-initialDataSize-01534]]
1746    If pname:initialDataSize is not `0`, it must: be equal to the size of
1747    pname:pInitialData, as returned by fname:vkGetValidationCacheDataEXT
1748    when pname:pInitialData was originally retrieved
1749  * [[VUID-VkValidationCacheCreateInfoEXT-initialDataSize-01535]]
1750    If pname:initialDataSize is not `0`, pname:pInitialData must: have been
1751    retrieved from a previous call to fname:vkGetValidationCacheDataEXT
1752****
1753
1754include::{generated}/validity/structs/VkValidationCacheCreateInfoEXT.txt[]
1755--
1756
1757[open,refpage='VkValidationCacheCreateFlagsEXT',desc='Reserved for future use',type='flags']
1758--
1759include::{generated}/api/flags/VkValidationCacheCreateFlagsEXT.txt[]
1760
1761tname:VkValidationCacheCreateFlagsEXT is a bitmask type for setting a mask,
1762but is currently reserved for future use.
1763--
1764
1765[open,refpage='vkMergeValidationCachesEXT',desc='Combine the data stores of validation caches',type='protos']
1766--
1767Validation cache objects can: be merged using the command:
1768
1769include::{generated}/api/protos/vkMergeValidationCachesEXT.txt[]
1770
1771  * pname:device is the logical device that owns the validation cache
1772    objects.
1773  * pname:dstCache is the handle of the validation cache to merge results
1774    into.
1775  * pname:srcCacheCount is the length of the pname:pSrcCaches array.
1776  * pname:pSrcCaches is a pointer to an array of validation cache handles,
1777    which will be merged into pname:dstCache.
1778    The previous contents of pname:dstCache are included after the merge.
1779
1780[NOTE]
1781.Note
1782====
1783The details of the merge operation are implementation-dependent, but
1784implementations should: merge the contents of the specified validation
1785caches and prune duplicate entries.
1786====
1787
1788.Valid Usage
1789****
1790  * [[VUID-vkMergeValidationCachesEXT-dstCache-01536]]
1791    pname:dstCache must: not appear in the list of source caches
1792****
1793
1794include::{generated}/validity/protos/vkMergeValidationCachesEXT.txt[]
1795--
1796
1797[open,refpage='vkGetValidationCacheDataEXT',desc='Get the data store from a validation cache',type='protos']
1798--
1799Data can: be retrieved from a validation cache object using the command:
1800
1801include::{generated}/api/protos/vkGetValidationCacheDataEXT.txt[]
1802
1803  * pname:device is the logical device that owns the validation cache.
1804  * pname:validationCache is the validation cache to retrieve data from.
1805  * pname:pDataSize is a pointer to a value related to the amount of data in
1806    the validation cache, as described below.
1807  * pname:pData is either `NULL` or a pointer to a buffer.
1808
1809If pname:pData is `NULL`, then the maximum size of the data that can: be
1810retrieved from the validation cache, in bytes, is returned in
1811pname:pDataSize.
1812Otherwise, pname:pDataSize must: point to a variable set by the user to the
1813size of the buffer, in bytes, pointed to by pname:pData, and on return the
1814variable is overwritten with the amount of data actually written to
1815pname:pData.
If pname:pDataSize is less than the maximum size that can: be retrieved from
the validation cache, at most pname:pDataSize bytes will be written to
1818pname:pData, and fname:vkGetValidationCacheDataEXT will return
1819ename:VK_INCOMPLETE instead of ename:VK_SUCCESS, to indicate that not all of
1820the validation cache was returned.
1821
1822Any data written to pname:pData is valid and can: be provided as the
1823pname:pInitialData member of the slink:VkValidationCacheCreateInfoEXT
1824structure passed to fname:vkCreateValidationCacheEXT.
1825
1826Two calls to fname:vkGetValidationCacheDataEXT with the same parameters
1827must: retrieve the same data unless a command that modifies the contents of
1828the cache is called between them.
1829
1830[[validation-cache-header]]
1831Applications can: store the data retrieved from the validation cache, and
use this data, possibly in a future run of the application, to populate new
1833validation cache objects.
1834The results of validation, however, may: depend on the vendor ID, device ID,
1835driver version, and other details of the device.
1836To enable applications to detect when previously retrieved data is
1837incompatible with the device, the initial bytes written to pname:pData must:
1838be a header consisting of the following members:
1839
1840.Layout for validation cache header version ename:VK_VALIDATION_CACHE_HEADER_VERSION_ONE_EXT
1841[width="85%",cols="8%,21%,71%",options="header"]
1842|====
1843| Offset | Size | Meaning
1844| 0 | 4                    | length in bytes of the entire validation cache header
1845                             written as a stream of bytes, with the least
1846                             significant byte first
1847| 4 | 4                    | a elink:VkValidationCacheHeaderVersionEXT value
1848                             written as a stream of bytes, with the least
1849                             significant byte first
1850| 8 | ename:VK_UUID_SIZE   | a layer commit ID expressed as a UUID, which uniquely
1851                             identifies the version of the validation layers used
1852                             to generate these validation results
1853|====
1854
1855The first four bytes encode the length of the entire validation cache
1856header, in bytes.
1857This value includes all fields in the header including the validation cache
1858version field and the size of the length field.
1859
1860The next four bytes encode the validation cache version, as described for
1861elink:VkValidationCacheHeaderVersionEXT.
1862A consumer of the validation cache should: use the cache version to
1863interpret the remainder of the cache header.
1864
1865If pname:pDataSize is less than what is necessary to store this header,
1866nothing will be written to pname:pData and zero will be written to
1867pname:pDataSize.
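
An application that stores cache data between runs might check the header
before reusing a saved blob.
The following C code is a minimal sketch of decoding the three header fields
from the table above; the names `CacheHeader`, `parse_cache_header`, and
`cache_header_matches` are hypothetical, `CACHE_UUID_SIZE` is assumed to
equal ename:VK_UUID_SIZE, and the caller is assumed to know the header
version and layer UUID it expects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_UUID_SIZE 16u /* assumed to equal VK_UUID_SIZE */

typedef struct {
    uint32_t headerLength;  /* bytes 0..3, least significant byte first */
    uint32_t headerVersion; /* bytes 4..7, least significant byte first */
    uint8_t  uuid[CACHE_UUID_SIZE];
} CacheHeader;

static uint32_t read_u32_le(const uint8_t *b)
{
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Returns 1 and fills *out on success, or 0 if the blob is too short
 * or its length field is inconsistent with the blob size. */
int parse_cache_header(const uint8_t *data, size_t size, CacheHeader *out)
{
    const size_t min_size = 8u + CACHE_UUID_SIZE;
    assert(out != NULL);
    if (data == NULL || size < min_size) {
        return 0;
    }
    out->headerLength  = read_u32_le(data);
    out->headerVersion = read_u32_le(data + 4);
    if (out->headerLength < min_size || out->headerLength > size) {
        return 0;
    }
    memcpy(out->uuid, data + 8, CACHE_UUID_SIZE);
    return 1;
}

/* Returns 1 if the header carries the version and UUID the caller expects. */
int cache_header_matches(const CacheHeader *h, uint32_t expectedVersion,
                         const uint8_t expectedUuid[CACHE_UUID_SIZE])
{
    assert(h != NULL && expectedUuid != NULL);
    return h->headerVersion == expectedVersion &&
           memcmp(h->uuid, expectedUuid, CACHE_UUID_SIZE) == 0;
}
```

Reading the fields byte by byte keeps the decoding independent of host
endianness, which matches the byte-stream description in the table.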
1868
1869include::{generated}/validity/protos/vkGetValidationCacheDataEXT.txt[]
1870--
1871
1872[open,refpage='VkValidationCacheHeaderVersionEXT',desc='Encode validation cache version',type='enums',xrefs='vkCreateValidationCacheEXT vkGetValidationCacheDataEXT']
1873--
1874Possible values of the second group of four bytes in the header returned by
1875flink:vkGetValidationCacheDataEXT, encoding the validation cache version,
1876are:
1877
1878include::{generated}/api/enums/VkValidationCacheHeaderVersionEXT.txt[]
1879
1880  * ename:VK_VALIDATION_CACHE_HEADER_VERSION_ONE_EXT specifies version one
1881    of the validation cache.
1882--
1883
1884[open,refpage='vkDestroyValidationCacheEXT',desc='Destroy a validation cache object',type='protos']
1885--
1886To destroy a validation cache, call:
1887
1888include::{generated}/api/protos/vkDestroyValidationCacheEXT.txt[]
1889
1890  * pname:device is the logical device that destroys the validation cache
1891    object.
1892  * pname:validationCache is the handle of the validation cache to destroy.
1893  * pname:pAllocator controls host memory allocation as described in the
1894    <<memory-allocation, Memory Allocation>> chapter.
1895
1896.Valid Usage
1897****
1898  * [[VUID-vkDestroyValidationCacheEXT-validationCache-01537]]
1899    If sname:VkAllocationCallbacks were provided when pname:validationCache
1900    was created, a compatible set of callbacks must: be provided here
1901  * [[VUID-vkDestroyValidationCacheEXT-validationCache-01538]]
1902    If no sname:VkAllocationCallbacks were provided when
1903    pname:validationCache was created, pname:pAllocator must: be `NULL`
1904****
1905
1906include::{generated}/validity/protos/vkDestroyValidationCacheEXT.txt[]
1907--
1908endif::VK_EXT_validation_cache[]
1909