// Copyright 2021-2024 The Khronos Group Inc.
//
// SPDX-License-Identifier: CC-BY-4.0
= VK_EXT_shader_module_identifier

This extension adds functionality to avoid having to provide complete SPIR-V for shaders in situations
where we speculate that an implementation already has a pipeline blob in cache, so that conversion to SPIR-V is not needed to begin with.

== Problem Statement

In some applications, SPIR-V is generated on-the-fly, usually by translating from some other representation.
API translation libraries and emulators in particular frequently run into the problems described here.

In such applications, the overhead required to obtain valid SPIR-V before any pipeline creation call can be problematic.
Especially in graphics API translation layers, applications expect that compilation with hot caches is "instant",
as that is how a native driver would behave. Translating to SPIR-V can therefore become a performance problem.
There are two common scenarios:

 - Applications compile pipeline state objects (PSOs) late, which causes stutter that the translation layer is expected to mitigate.
 - Applications compile a lot of PSOs early, which avoids stutter but can lead to excessive load times, even on subsequent runs of the application.

For translation layers, there are currently two options we can consider to mitigate the issue:

 - Optimize the translation
 - Cache converted SPIR-V on disk

Neither option may be good enough. Disk requirements for large application caches can be impractical on some platforms,
since we might end up having to store on the order of 100k SPIR-V modules, easily in the gigabyte range.
Optimizing the translation might not be enough when faced with tens of thousands of pipelines being compiled at once.

== Solution Space

The solution this extension proposes is the minimum viable approach to fix the problem.
The main idea is that when pipeline caches are primed, SPIR-V modules are largely useless,
since most implementations are likely to only hash the SPIR-V and never look at it again.
We can just hand the hash back to the implementation instead.

The extension is designed to work on top of `VK_EXT_pipeline_creation_cache_control`.
We can reuse the main idea of a "non-blocking" compile, where pipeline creation returns early if compilation is required
and the translation layer can then build SPIR-V as needed. Next time, we are likely to hit in cache.

An important consideration here is that this solution is intended to aid internal implementation caching,
i.e. a "magic disk cache", which most desktop implementations of graphics APIs are expected to have.

For explicit application-side caching mechanisms, larger cache sizes are reasonable and expected,
but we are more constrained with internal caches. These should be as lean as possible,
and they are also more "fuzzy" in nature. Spurious failure is okay; a "best effort" approach
is suitable for this use case.

One could extend this idea to full PSO keys as well, but that is better left to other proposals.

== Example use case

One scenario where this extension has been found to be particularly useful is D3D12 to Vulkan translation.
The translation layers need to translate DXBC and DXIL code to SPIR-V, which is then translated to GPU ISA.
SPIR-V to ISA translation is cached by Vulkan pipeline caches or in-driver caches,
but caching of the DXBC/DXIL -> SPIR-V step is not covered by the API.

We can store SPIR-V on disk and reload it in response to a pipeline creation call,
but storing SPIR-V on disk, validating it, decompressing it, etc., adds significant overhead that can be avoided,
with storage space being the most significant problem.
If the final ISA is present in pipeline caches, we do not really need the SPIR-V at all.

We have observed >95% disk savings in this scenario, which is transformative since it makes it practical to share this cache across different machines.
This hypothetically allows an end-user to never observe shader compilation stutter or excessive load times on the first run of a game.

== Proposal

=== Querying identifier

After the application has converted a shader to SPIR-V and compiled a pipeline, `vkGetShaderModuleIdentifierEXT` is used to obtain a shader identifier.
This identifier can be stored on disk for later use. (`vkGetShaderModuleCreateInfoIdentifierEXT` can be used as an object-less alternative.)
`VkPhysicalDeviceShaderModuleIdentifierPropertiesEXT::shaderModuleIdentifierAlgorithmUUID`
must be stored as well, so that applications know when to throw away any caches that use the identifier.
This should only happen across different driver implementations. Different versions of the same driver are not expected to change hashing algorithms.
For drivers sharing the same framework (e.g. Mesa), the module hashing algorithm could even be the same one.

To make the API friendly to applications, there is a small upper bound on how large an identifier may be,
so that identifiers can be retrieved without memory allocation.
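As a minimal sketch of the bookkeeping described above, an application could tag its identifier cache with the algorithm UUID and discard the cache on a mismatch. The struct and function names are hypothetical, and `UUID_SIZE` and `MAX_IDENTIFIER_SIZE` are local constants mirroring `VK_UUID_SIZE` and `VK_MAX_SHADER_MODULE_IDENTIFIER_SIZE_EXT`, so that the sketch compiles without the Vulkan headers.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Local mirrors of VK_UUID_SIZE and VK_MAX_SHADER_MODULE_IDENTIFIER_SIZE_EXT;
 * real code should take both from the Vulkan headers. */
#define UUID_SIZE 16
#define MAX_IDENTIFIER_SIZE 32

/* Hypothetical on-disk header: records which hashing algorithm produced
 * the identifiers stored in the rest of the cache file. */
typedef struct {
    uint8_t algorithm_uuid[UUID_SIZE];
} IdentifierCacheHeader;

/* Hypothetical fixed-size record. Because identifiers have a small upper
 * bound, a record never needs a heap allocation. */
typedef struct {
    uint32_t size;
    uint8_t  bytes[MAX_IDENTIFIER_SIZE];
} StoredIdentifier;

/* Returns true when the cached identifiers were produced by the same
 * hashing algorithm that the current device reports. */
bool cache_is_usable(const IdentifierCacheHeader *header,
                     const uint8_t device_uuid[UUID_SIZE])
{
    return memcmp(header->algorithm_uuid, device_uuid, UUID_SIZE) == 0;
}

/* Copies an identifier into a record; rejects empty or oversized input. */
bool store_identifier(StoredIdentifier *out, const uint8_t *bytes, uint32_t size)
{
    if (size == 0 || size > MAX_IDENTIFIER_SIZE)
        return false;
    memset(out->bytes, 0, sizeof(out->bytes));
    memcpy(out->bytes, bytes, size);
    out->size = size;
    return true;
}
```

On startup, the application would compare the stored header against the UUID reported by the device and simply start with an empty identifier cache when `cache_is_usable` returns false.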

=== `VK_NULL_HANDLE` module proxy

On subsequent runs of an application, we speculate that the driver caches (or VkPipelineCaches) are primed, and thus having SPIR-V is not useful anymore.
We then set `VkPipelineShaderStageCreateInfo::module` to `VK_NULL_HANDLE` and chain in `VkPipelineShaderStageModuleIdentifierCreateInfoEXT` as a proxy.
This allows a driver to generate the same internal PSO key that it would generate if we passed in actual SPIR-V.
`VK_PIPELINE_CREATE_FAIL_ON_COMPILE_REQUIRED_BIT` must be set in this situation, since this is a speculative compile by definition.

=== Handling fallbacks

In a situation where we do not have the pipeline cached, we receive `VK_PIPELINE_COMPILE_REQUIRED`, and fall back to re-creating SPIR-V as usual.
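The speculative attempt and its fallback can be modelled with a toy example. This is not real Vulkan code: `ToyDriver` is a hypothetical stand-in for the implementation's cache, `Result` mirrors `VK_SUCCESS` and `VK_PIPELINE_COMPILE_REQUIRED`, and all function names are assumptions made for illustration.

```c
#include <stdbool.h>

/* Stand-ins for VK_SUCCESS and VK_PIPELINE_COMPILE_REQUIRED. */
typedef enum { RESULT_SUCCESS, RESULT_COMPILE_REQUIRED } Result;

/* Toy model of an implementation: a single flag instead of a real PSO cache. */
typedef struct {
    bool pipeline_cached; /* is the final pipeline already in cache? */
    int  translations;    /* how many times SPIR-V had to be produced */
} ToyDriver;

/* Models pipeline creation with module == VK_NULL_HANDLE, an identifier
 * chained in, and the fail-on-compile-required flag set: it either hits
 * the cache or returns early without compiling anything. */
Result create_from_identifier(const ToyDriver *drv)
{
    return drv->pipeline_cached ? RESULT_SUCCESS : RESULT_COMPILE_REQUIRED;
}

/* Models the slow path: translate to SPIR-V, compile, and prime the cache. */
Result create_from_spirv(ToyDriver *drv)
{
    drv->translations++;
    drv->pipeline_cached = true;
    return RESULT_SUCCESS;
}

/* Tries the identifier-only path first and falls back to full translation
 * when no identifier is stored or the implementation would have to compile. */
Result create_pipeline(ToyDriver *drv, bool have_identifier)
{
    if (have_identifier && create_from_identifier(drv) == RESULT_SUCCESS)
        return RESULT_SUCCESS;
    return create_from_spirv(drv);
}
```

In this model, the first call on a cold cache pays for one translation, and later calls with a stored identifier succeed without producing SPIR-V again.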

=== Soft guarantees of successfully compiling pipelines

The proposal as-is states that implementations may fail compilation for any reason. This is a defensive measure
to make it possible for this extension to interoperate with layers, validation, debug tooling, etc., without too many problems.
Most such layers need to parse the SPIR-V itself to obtain information required for correct operation.
While the ICD might recognize an identifier, a layer might not, and the layer therefore needs an escape hatch that lets it spuriously fail compilation.

This effectively makes the spec somewhat vague, and what ICDs do becomes a quality-of-implementation issue.
This is no different from what implementations already do. After all, you may or may not have a PSO in disk cache, and that is okay.

== Issues

=== RESOLVED: Should applications be allowed to specify their own shader module identifier?

NO.

It is plausible that applications might want to generate their own keys instead of using driver-generated keys.
For this to be useful, an application will need to generate a key which depends
on input data/shaders, the revision of the code which performs runtime conversion to SPIR-V, and potentially the driver kind or any configuration options
which affect shader conversion. A typical problem which comes up when doing forward hashing like this is that hashes can change for every revision of the application,
even if the resulting SPIR-V ends up being identical. This will easily contribute to pipeline cache bloat, since the exact same pipelines might end up in cache with
different hashes. Implementations can be defensive about this and introduce extra identifier indirections, e.g. have an extra hashmap from application identifier
to driver identifier, but ideally, this extension should not introduce extra implementation complexity to support it well.

Applications could also hash the resulting SPIR-V and ensure non-duplicated identifiers this way,
but this is not meaningfully different from just using the driver identifier, and using the driver identifier also avoids added implementation complexity.

=== RESOLVED: How does this interact with VK_KHR_ray_tracing_pipeline, VK_KHR_pipeline_library and VK_EXT_graphics_pipeline_library?

SUPPORTED.

When using pipeline libraries, there are two points where pipeline creation can fail if we only have an identifier:
at creation time of the library, and at consumption of that library.

There are at least three strategies an implementation could consider when building libraries and consuming them:

- Generate final code when creating the library, so that the link step is trivial. Ray tracing pipeline libraries may be implemented like this.
- Generate code when creating the library, but allow link-time optimization later. Graphics pipeline libraries are a common case here.
- Just retain a reference to the shader module and perform actual compilation during linking. This is another strategy for ray tracing libraries.

In the latter two scenarios, it is reasonable to assume that compilation may happen during the final pipeline build,
and that compilation would spuriously fail if the source module was only defined by identifier and the final PSO did not exist in cache.
If we did not allow compilation to fail with `VK_PIPELINE_CREATE_FAIL_ON_COMPILE_REQUIRED_BIT` here, it would not be safe to return
`VK_SUCCESS` from the library creation step, which would be unfortunate.

For scenarios where the implementation may generate code later, we require that pipelines linking any pipeline libraries
which were created with identifiers inherit the requirement of using `VK_PIPELINE_CREATE_FAIL_ON_COMPILE_REQUIRED_BIT`.
This allows applications to speculatively create link-time optimized pipelines from identifiers only, as well as
ray-tracing pipelines from libraries.
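One way an application might track this inherited requirement is sketched below. The `LibraryInfo` bookkeeping struct and `link_flags` helper are hypothetical, and `FAIL_ON_COMPILE_REQUIRED_BIT` is a local stand-in for `VK_PIPELINE_CREATE_FAIL_ON_COMPILE_REQUIRED_BIT` so that the sketch compiles without the Vulkan headers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for VK_PIPELINE_CREATE_FAIL_ON_COMPILE_REQUIRED_BIT; the value is
 * illustrative, real code should take the flag from the Vulkan headers. */
#define FAIL_ON_COMPILE_REQUIRED_BIT 0x100u

/* Hypothetical application-side bookkeeping for one pipeline library. */
typedef struct {
    bool created_from_identifier; /* some stage was given only an identifier */
} LibraryInfo;

/* Returns the create flags for a pipeline that links the given libraries,
 * adding the fail-on-compile-required bit when any of the libraries was
 * created from identifiers. */
uint32_t link_flags(uint32_t base_flags, const LibraryInfo *libs, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (libs[i].created_from_identifier)
            return base_flags | FAIL_ON_COMPILE_REQUIRED_BIT;
    }
    return base_flags;
}
```

If the linked pipeline then fails with `VK_PIPELINE_COMPILE_REQUIRED`, the application would rebuild the affected libraries from real SPIR-V before linking again.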

=== RESOLVED: Should there be stronger guarantees on when pipeline compilation with identifier must succeed?

NO.

The existing proposal gives implementations a lot of leeway to spuriously fail compilation when `module` is `VK_NULL_HANDLE`.
It might be possible to give stronger guarantees with tighter spec language.

CTS testing will report quality warnings if identifiers cannot be used with `VkPipelineCache`,
as there is no good excuse why an implementation should not be able to satisfy those pipelines.

== Further Functionality

N/A