// Copyright 2021-2023 The Khronos Group Inc.
//
// SPDX-License-Identifier: CC-BY-4.0

= VK_KHR_video_queue
:toc: left
:refpage: https://registry.khronos.org/vulkan/specs/1.2-extensions/man/html/
:sectnums:

This document outlines a proposal to enable performing video coding operations in Vulkan.


== Problem Statement

Integrating video coding operations into Vulkan applications enables a wide set of new usage scenarios including, but not limited to, the following examples:

  * Applying post-processing on top of video frames decoded from a compressed video stream
  * Sourcing dynamic texture data from compressed video streams
  * Recording the output of rendering operations
  * Efficiently transferring rendering results over network (video conferencing, game streaming, etc.)

It is also not uncommon for Vulkan capable devices to feature dedicated hardware acceleration for video compression and decompression.

The goal of this proposal is to enable these use cases, allow exposing the underlying hardware capabilities, and provide tight integration with other functionalities of the Vulkan API.


== Solution Space

The following options have been considered:

  1. Rely on external sharing capabilities to interact with existing video APIs
  2. Add new dedicated APIs to Vulkan separately for video decoding and video encoding
  3. Add a common set of APIs to Vulkan enabling video coding operations in general

Option 1 has the advantage of being the least invasive in terms of API changes. The disadvantage is that there is a wide range of video APIs out there, most of them platform or vendor specific, which makes creating portable applications difficult. Cross-API interaction also often comes with undesired performance costs, and it makes it difficult, if not impossible, to take advantage of all the existing features of Vulkan in such scenarios.

Option 2 enables integrating video coding operations into the API and leveraging all the other capabilities of Vulkan including, but not limited to, explicit resource management and synchronization. Besides that, an integrated solution greatly reduces application complexity and allows for better portability.

Option 3 improves on option 2 by acknowledging that many facilities could be shared across different video coding operations like video decoding and encoding. Accordingly, this proposal follows option 3 to introduce a set of concepts, object types, and commands that form the foundation of the video coding capabilities of Vulkan, upon which additional functionalities can be layered, providing specific video coding operations like video decoding or encoding, and support for individual video compression standards.


== Proposal

=== Video Std Headers

As each video compression standard requires a large set of codec-specific parameters that are orthogonal to the Vulkan API itself, the definitions of those are not part of the Vulkan headers. Instead, these definitions are provided separately for each codec-specific extension in corresponding video std headers.


=== Video Profiles

This extension introduces the concept of video profiles. A video profile in Vulkan loosely resembles similar concepts defined in video compression standards; however, it is a more generic concept that encompasses additional information like the specific video coding operation, the content type/format, and any other information related to the video coding scenario.

A video profile in Vulkan is defined using the following structure:

[source,c]
----
typedef struct VkVideoProfileInfoKHR {
    VkStructureType                     sType;
    const void*                         pNext;
    VkVideoCodecOperationFlagBitsKHR    videoCodecOperation;
    VkVideoChromaSubsamplingFlagsKHR    chromaSubsampling;
    VkVideoComponentBitDepthFlagsKHR    lumaBitDepth;
    VkVideoComponentBitDepthFlagsKHR    chromaBitDepth;
} VkVideoProfileInfoKHR;
----

A complete video profile definition includes an instance of the structure above with additional codec and use case specific parameters provided through its `pNext` chain.

The `videoCodecOperation` member identifies the particular video codec and video coding operation, while the other members provide information about the content type/format, including the chroma subsampling mode and the bit depths used by the compressed video stream.

This extension does not define any video codec operations. Instead, it is left to codec-specific extensions layered on top of this proposal to provide those.


=== Video Queues

Support for video coding operations is exposed through new commands available for use on video-capable queue families. As it is not uncommon for devices to have separate dedicated hardware for accelerating video compression and decompression, possibly separate ones for different video codecs, implementations may expose multiple queue families with different video coding capabilities, although it is also possible for implementations to support video coding operations on the usual graphics or compute capable queue families.

The set of video codec operations supported by a queue family can be retrieved using queue family property queries by including the following new output structure:

[source,c]
----
typedef struct VkQueueFamilyVideoPropertiesKHR {
    VkStructureType                  sType;
    void*                            pNext;
    VkVideoCodecOperationFlagsKHR    videoCodecOperations;
} VkQueueFamilyVideoPropertiesKHR;
----

After a successful query, the `videoCodecOperations` member will contain bits corresponding to the individual video codec operations supported by the queue family in question.

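As a rough sketch of how an application might act on this query result, the helper below picks the first queue family whose reported mask contains every required operation. The `find_video_queue_family` name, the `NO_QUEUE_FAMILY` sentinel, and the plain `uint32_t` mask array are assumptions made so that the example compiles without the Vulkan headers; a real application would copy each mask from `videoCodecOperations` after chaining `VkQueueFamilyVideoPropertiesKHR` into the `pNext` chain of `VkQueueFamilyProperties2` and calling `vkGetPhysicalDeviceQueueFamilyProperties2`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sentinel returned when no queue family qualifies (hypothetical). */
#define NO_QUEUE_FAMILY UINT32_MAX

/*
 * masks[i] stands in for the videoCodecOperations value reported for
 * queue family i. Returns the first family supporting all required_ops.
 */
uint32_t find_video_queue_family(const uint32_t *masks, uint32_t count,
                                 uint32_t required_ops)
{
    assert(masks != NULL || count == 0);
    for (uint32_t i = 0; i < count; ++i) {
        /* Every required bit must be set, not just one of them. */
        if ((masks[i] & required_ops) == required_ops) {
            return i;
        }
    }
    return NO_QUEUE_FAMILY;
}
```

A request that combines several operation bits only matches a family that reports all of them; an application that is content with separate families for separate operations would call the helper once per operation instead.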

=== Video Picture Resources

Pictures used by video coding operations are referred to as video picture resources, and are provided to the video coding APIs through instances of the following new structure:

[source,c]
----
typedef struct VkVideoPictureResourceInfoKHR {
    VkStructureType    sType;
    const void*        pNext;
    VkOffset2D         codedOffset;
    VkExtent2D         codedExtent;
    uint32_t           baseArrayLayer;
    VkImageView        imageViewBinding;
} VkVideoPictureResourceInfoKHR;
----

Each video picture resource is backed by a subregion within a layer of an image object. `baseArrayLayer` specifies the array layer index used relative to the image view specified in `imageViewBinding`. Depending on the specific video codec operation, `codedOffset` can specify an additional offset within the image subresource to read/write picture data from/to, while `codedExtent` typically specifies the size of the video frame.

Actual semantics of `codedOffset` and `codedExtent` are specific to the video profile in use, as the capabilities and semantics of individual codecs vary.


=== Decoded Picture Buffer

The chosen video compression standard may require the use of reference pictures. Such reference pictures are used by video coding operations to provide predictions of the values of samples of subsequently decoded or encoded pictures. Just like any other picture data, the decoded picture buffer (DPB) is backed by image layers. In this extension reference pictures are represented by video picture resources and corresponding image views. The DPB is the logical structure that holds this pool of reference pictures.

The DPB is an indexed data structure, and individual indexed entries of the DPB are referred to as the DPB slots. The range of valid DPB slot indices is between zero and `N-1`, where `N` is the capacity of the DPB. Each DPB slot can refer to one or more reference pictures. In case of typical progressive content each DPB slot usually refers to a single picture containing a video frame, but other content types like multiview or interlaced video allow multiple pictures to be associated with each slot. If a DPB slot has any pictures associated with it, then it is an active DPB slot, otherwise it is an inactive DPB slot.

DPB slots can be activated with reference pictures in response to video coding operations requesting such activations. This extension does not introduce any video coding operations. Instead, layered extensions provide those. However, this extension does provide facilities to deactivate currently active DPB slots, as discussed later.

In this extension, the state and the backing store of the DPB are separated as follows:

  * The state of individual DPB slots is maintained by video session objects.
  * The backing store of DPB slots is provided by video picture resources and the underlying images.

A single non-mipmapped image with a layer count equaling the number of DPB slots can be used as the backing store of the DPB, where the picture corresponding to a particular DPB slot index is stored in the layer with the same index. The API also allows arbitrary mapping of image layers to DPB slots. Furthermore, if the `VK_VIDEO_CAPABILITY_SEPARATE_REFERENCE_IMAGES_BIT_KHR` capability flag is supported by the implementation for a specific video profile, then individual DPB slots can be backed by different images, potentially using a separate image for each DPB slot.

Depending on the used video profile, a single DPB slot may contain more than just one picture (e.g. in case of multiview and interlaced content). In such cases the number of needed image layers may be larger than the number of DPB slots, hence the image(s) used as the backing store of the DPB have to be sized accordingly.

There may also be video compression standards, video profiles, or use cases that do not require or do not support reference pictures at all. In such cases a DPB is not needed either.

The responsibility of managing the DPB is split between the application and the implementation as follows:

  * The application maintains the association between DPB slot indices and corresponding video picture resources.
  * The implementation maintains global and per-slot opaque reference picture metadata.

In addition, the application is also responsible for managing the mapping between the codec-specific picture IDs and DPB slots, and any other codec-specific states.

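The application-side half of this split can be pictured as a small table indexed by DPB slot. The sketch below is only one possible bookkeeping scheme, not something this extension defines: `DpbTable`, `DpbSlotEntry`, the fixed `DPB_CAPACITY`, and all `dpb_*` helpers are hypothetical names, and the single `picture_id` integer stands in for whatever codec-specific picture identification the application needs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DPB_CAPACITY 4  /* hypothetical; would match maxDpbSlots of the session */

/* Application-side record for one DPB slot (hypothetical layout). */
typedef struct DpbSlotEntry {
    bool     active;       /* slot currently has a picture associated */
    uint32_t image_layer;  /* layer of the backing image holding the picture */
    int32_t  picture_id;   /* codec-specific picture ID tracked by the app */
} DpbSlotEntry;

typedef struct DpbTable {
    DpbSlotEntry slots[DPB_CAPACITY];
} DpbTable;

/* Marks every slot inactive, mirroring the state after a session reset. */
void dpb_reset(DpbTable *dpb)
{
    for (int32_t i = 0; i < DPB_CAPACITY; ++i) {
        dpb->slots[i].active = false;
    }
}

/* Returns the index of the first inactive slot, or -1 if the DPB is full. */
int32_t dpb_find_free_slot(const DpbTable *dpb)
{
    for (int32_t i = 0; i < DPB_CAPACITY; ++i) {
        if (!dpb->slots[i].active) {
            return i;
        }
    }
    return -1;
}

/* Records that a slot now refers to the picture stored in image_layer. */
void dpb_activate(DpbTable *dpb, int32_t slot, uint32_t image_layer,
                  int32_t picture_id)
{
    assert(slot >= 0 && slot < DPB_CAPACITY);
    dpb->slots[slot].active = true;
    dpb->slots[slot].image_layer = image_layer;
    dpb->slots[slot].picture_id = picture_id;
}

/* Records that a slot no longer has any picture associated with it. */
void dpb_deactivate(DpbTable *dpb, int32_t slot)
{
    assert(slot >= 0 && slot < DPB_CAPACITY);
    dpb->slots[slot].active = false;
}

/* Returns the active slot holding picture_id, or -1 if there is none. */
int32_t dpb_find_picture(const DpbTable *dpb, int32_t picture_id)
{
    for (int32_t i = 0; i < DPB_CAPACITY; ++i) {
        if (dpb->slots[i].active && dpb->slots[i].picture_id == picture_id) {
            return i;
        }
    }
    return -1;
}
```

The table only mirrors what the application believes the session state to be; the opaque per-slot metadata stays with the implementation, so the two have to be kept in step by updating the table whenever a recorded operation activates or deactivates a slot.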

=== Video Session

Before performing any video coding operations, the application needs to create a video session object using the following new command:

[source,c]
----
VKAPI_ATTR VkResult VKAPI_CALL vkCreateVideoSessionKHR(
    VkDevice                                    device,
    const VkVideoSessionCreateInfoKHR*          pCreateInfo,
    const VkAllocationCallbacks*                pAllocator,
    VkVideoSessionKHR*                          pVideoSession);
----

The creation parameters are as follows:

[source,c]
----
typedef struct VkVideoSessionCreateInfoKHR {
    VkStructureType                 sType;
    const void*                     pNext;
    uint32_t                        queueFamilyIndex;
    VkVideoSessionCreateFlagsKHR    flags;
    const VkVideoProfileInfoKHR*    pVideoProfile;
    VkFormat                        pictureFormat;
    VkExtent2D                      maxCodedExtent;
    VkFormat                        referencePictureFormat;
    uint32_t                        maxDpbSlots;
    uint32_t                        maxActiveReferencePictures;
    const VkExtensionProperties*    pStdHeaderVersion;
} VkVideoSessionCreateInfoKHR;
----

A video session object is created against a specific video profile and the implementation uses it to maintain video coding related state. The creation parameters of a video session object include the following:

  * The queue family the video session can be used with (`queueFamilyIndex`)
  * A video profile definition specifying the particular video compression standard and video coding operation type the video session can be used with (`pVideoProfile`)
  * The maximum size of the coded frames the video session can be used with (`maxCodedExtent`)
  * The capacity of the DPB (`maxDpbSlots`)
  * The maximum number of reference pictures that can be used in a single operation (`maxActiveReferencePictures`)
  * The used picture formats (`pictureFormat` and `referencePictureFormat`)
  * The used video compression standard header (`pStdHeaderVersion`)

A video session object can be used to perform video coding operations on a single video stream at a time. After the application has finished processing a video stream, it can reuse the object to process another video stream, provided that the configuration parameters between the two streams are compatible (as determined by the video compression standard in use).

Once a video session has been created, the video compression standard and profiles, picture formats, and other settings like the maximum coded extent cannot be changed. However, many parameters of video coding operations may change between subsequent operations, subject to restrictions imposed on parameter updates by the video compression standard, e.g.:

  * The size of the decoded or encoded pictures
  * The number of active DPB slots
  * The number of reference pictures in use

In particular, a given video session can be reused to process video streams with different extents, as long as the used coded extent does not exceed the maximum coded extent the video session was created with. This can be useful to reduce latency/overhead when processing video content that may dynamically change the video resolution as part of adjusting to varying network conditions, for example.

After creating a video session, and before using the object in command buffer commands, the application has to allocate and bind device memory to the video session. Implementations may require one or more memory bindings to be bound with compatible device memory, as reported by the following new command:

[source,c]
----
VKAPI_ATTR VkResult VKAPI_CALL vkGetVideoSessionMemoryRequirementsKHR(
    VkDevice                                    device,
    VkVideoSessionKHR                           videoSession,
    uint32_t*                                   pMemoryRequirementsCount,
    VkVideoSessionMemoryRequirementsKHR*        pMemoryRequirements);
----

For each memory binding the following information is returned:

[source,c]
----
typedef struct VkVideoSessionMemoryRequirementsKHR {
    VkStructureType         sType;
    void*                   pNext;
    uint32_t                memoryBindIndex;
    VkMemoryRequirements    memoryRequirements;
} VkVideoSessionMemoryRequirementsKHR;
----

`memoryBindIndex` is a unique identifier of the corresponding memory binding and can have any value, and `memoryRequirements` contains the memory requirements corresponding to the memory binding.

The application can bind compatible device memory ranges for each binding through one or more calls to the following new command:

[source,c]
----
VKAPI_ATTR VkResult VKAPI_CALL vkBindVideoSessionMemoryKHR(
    VkDevice                                    device,
    VkVideoSessionKHR                           videoSession,
    uint32_t                                    bindSessionMemoryInfoCount,
    const VkBindVideoSessionMemoryInfoKHR*      pBindSessionMemoryInfos);
----

The parameters of a memory binding are as follows:

[source,c]
----
typedef struct VkBindVideoSessionMemoryInfoKHR {
    VkStructureType    sType;
    const void*        pNext;
    uint32_t           memoryBindIndex;
    VkDeviceMemory     memory;
    VkDeviceSize       memoryOffset;
    VkDeviceSize       memorySize;
} VkBindVideoSessionMemoryInfoKHR;
----

The application does not have to bind memory to each memory binding with a single call, but before being able to use the video session in video coding operations, all memory bindings have to be bound to compatible device memory, and the bindings are immutable for the lifetime of the video session.

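Because `memoryBindIndex` values are identifiers rather than dense array positions, and binding may be spread over several calls, an application may want to check that nothing was missed before first using the session. The following is a minimal sketch of such a check under the assumption that the application records the `memoryBindIndex` of every binding it has bound so far; `all_bindings_bound` and both plain index arrays are hypothetical and stand in for data the application would collect from `vkGetVideoSessionMemoryRequirementsKHR` and its own `vkBindVideoSessionMemoryKHR` calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * required[] holds the memoryBindIndex values reported for the session,
 * bound[] the values the application has bound so far. Indices are
 * compared by value because they are identifiers, not array positions.
 */
bool all_bindings_bound(const uint32_t *required, uint32_t required_count,
                        const uint32_t *bound, uint32_t bound_count)
{
    assert(required != NULL || required_count == 0);
    assert(bound != NULL || bound_count == 0);
    for (uint32_t i = 0; i < required_count; ++i) {
        bool found = false;
        for (uint32_t j = 0; j < bound_count; ++j) {
            if (bound[j] == required[i]) {
                found = true;
                break;
            }
        }
        if (!found) {
            return false;  /* at least one binding is still missing */
        }
    }
    return true;
}
```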
Once a video session object is no longer needed (and is no longer used by any pending command buffers), it can be destroyed with the following new command:

[source,c]
----
VKAPI_ATTR void VKAPI_CALL vkDestroyVideoSessionKHR(
    VkDevice                                    device,
    VkVideoSessionKHR                           videoSession,
    const VkAllocationCallbacks*                pAllocator);
----


=== Video Session Parameters

Most video compression standards require parameters that are in use across multiple video coding operations, potentially across the entire video stream. For example, the H.264/AVC and H.265/HEVC standards require sequence and picture parameter sets (SPS and PPS) that apply to multiple video frames, layers, and sub-layers.

This extension uses video session parameters objects to store such standard parameters. These objects enable storing such codec-specific parameters in a preprocessed form and enable reducing the number of parameters needed to be provided and processed by the implementation while recording video coding operations into command buffers.

Video session parameters objects use a key-value storage. How keys are derived from the provided parameters is codec-specific (e.g. in case of H.264/AVC picture parameter sets the key consists of an SPS and PPS ID pair).

The application can create a video session parameters object against a video session with the following new command:

[source,c]
----
VKAPI_ATTR VkResult VKAPI_CALL vkCreateVideoSessionParametersKHR(
    VkDevice                                    device,
    const VkVideoSessionParametersCreateInfoKHR* pCreateInfo,
    const VkAllocationCallbacks*                pAllocator,
    VkVideoSessionParametersKHR*                pVideoSessionParameters);
----

The creation parameters are as follows:

[source,c]
----
typedef struct VkVideoSessionParametersCreateInfoKHR {
    VkStructureType                           sType;
    const void*                               pNext;
    VkVideoSessionParametersCreateFlagsKHR    flags;
    VkVideoSessionParametersKHR               videoSessionParametersTemplate;
    VkVideoSessionKHR                         videoSession;
} VkVideoSessionParametersCreateInfoKHR;
----

Layered extensions may provide mechanisms to specify an initial set of parameters at creation time, and the application can also specify a video session parameters object in `videoSessionParametersTemplate` that will be used as a template for the new object. Applying a template happens by first adding any parameters specified in the codec-specific creation parameters, followed by adding any parameters from the template object that have a key that does not match the key of any of the already added parameters.

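The template rule can be modeled as a two-pass merge over key-value pairs. The sketch below is an illustration only: `Param`, `has_key`, and `apply_template` are hypothetical, a single integer stands in for the codec-specific key and payload, and returning `-1` when the capacity would be exceeded is an assumption of the model rather than behavior described by this proposal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for one stored parameter; real keys and payloads are codec-specific. */
typedef struct Param {
    uint32_t key;
    int      value;
} Param;

/* True if any of the first count entries uses the given key. */
static bool has_key(const Param *params, size_t count, uint32_t key)
{
    for (size_t i = 0; i < count; ++i) {
        if (params[i].key == key) {
            return true;
        }
    }
    return false;
}

/*
 * Pass 1 copies the explicitly specified parameters. Pass 2 copies only
 * those template entries whose key has not been added yet. Returns the
 * number of entries written to out[], or -1 if capacity is too small.
 */
int apply_template(const Param *explicit_params, size_t explicit_count,
                   const Param *template_params, size_t template_count,
                   Param *out, size_t capacity)
{
    assert(out != NULL || capacity == 0);
    size_t n = 0;
    for (size_t i = 0; i < explicit_count; ++i) {
        if (n == capacity) {
            return -1;
        }
        out[n++] = explicit_params[i];
    }
    for (size_t i = 0; i < template_count; ++i) {
        if (has_key(out, n, template_params[i].key)) {
            continue;  /* the explicit entry takes precedence */
        }
        if (n == capacity) {
            return -1;
        }
        out[n++] = template_params[i];
    }
    return (int)n;
}
```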
Parameters stored in video session parameters objects are immutable to facilitate the concurrent use of the stored parameters in multiple threads. However, new parameters can be added to existing objects using the following new command:

[source,c]
----
VKAPI_ATTR VkResult VKAPI_CALL vkUpdateVideoSessionParametersKHR(
    VkDevice                                    device,
    VkVideoSessionParametersKHR                 videoSessionParameters,
    const VkVideoSessionParametersUpdateInfoKHR* pUpdateInfo);
----

The base parameters to the command are as follows:

[source,c]
----
typedef struct VkVideoSessionParametersUpdateInfoKHR {
    VkStructureType    sType;
    const void*        pNext;
    uint32_t           updateSequenceCount;
} VkVideoSessionParametersUpdateInfoKHR;
----

The `updateSequenceCount` parameter is used to ensure that the video session parameters objects are updated in order. To support concurrent use of the stored immutable parameters while also allowing the video session parameters object to be extended with new parameters, each object maintains an _update sequence counter_ that is set to `0` at object creation time and has to be incremented by each subsequent update operation by specifying an `updateSequenceCount` that equals the current update sequence counter of the object plus one.

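The counter rule is simple enough to model in a few lines. In the sketch below, `ParamsObjectModel` and `try_update` are hypothetical names for an application-side mirror of the object's counter. What an implementation does with an out-of-order value is not covered here, so returning `false` is purely a property of this model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Application-side mirror of one parameters object's counter (hypothetical). */
typedef struct ParamsObjectModel {
    uint32_t update_sequence_counter;  /* 0 right after object creation */
} ParamsObjectModel;

/*
 * Accepts the update only if update_sequence_count equals the current
 * counter plus one, then advances the counter to that value.
 */
bool try_update(ParamsObjectModel *obj, uint32_t update_sequence_count)
{
    assert(obj != NULL);
    if (update_sequence_count != obj->update_sequence_counter + 1u) {
        return false;  /* stale, repeated, or skipped sequence number */
    }
    obj->update_sequence_counter = update_sequence_count;
    return true;
}
```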
Some codecs permit updating previously supplied parameters. As the parameters stored in the video session parameters objects are immutable, if a parameter update is necessary, the application has the following options:

  * Cache the set of parameters on the application side and create a new video session parameters object adding all the parameters with appropriate changes, as necessary; or
  * Create a new video session parameters object providing only the updated parameters and the previously used object as the template, which ensures that parameters not specified at creation time will be copied unmodified from the template object.

Another case when a new video session parameters object may need to be created is when the capacity of the current object is exhausted. Each video session parameters object is created with a specific capacity, hence if that capacity later turns out to be insufficient, a new object with a larger capacity should be created, typically using the old one as a template.

The application has to track the capacity and the keys of currently stored parameters for each video session parameters object in order to be able to determine when a new object needs to be created due to a change to an existing parameter or due to exceeding the capacity of the existing object.

During command buffer recording, it is the responsibility of the application to provide the video session parameters object containing the necessary parameters for processing the portion of the video stream in question.

The expected usage model for video session parameters objects is a single-producer-multiple-consumer one. Typically a single thread processing the video stream is expected to update the corresponding parameters object, or create new ones when necessary, while at the same time any thread can record video coding operations into command buffers referring to parameters previously added to the object. If, for some reason, the application wants to update a given video session parameters object from multiple threads, it is responsible for providing appropriate mutual exclusion so that no two threads update the same object concurrently, and for ensuring that the used `updateSequenceCount` values are sequentially increasing.

Once a video session parameters object is no longer needed (and is no longer used by any pending command buffers), it can be destroyed with the following new command:

[source,c]
----
VKAPI_ATTR void VKAPI_CALL vkDestroyVideoSessionParametersKHR(
    VkDevice                                    device,
    VkVideoSessionParametersKHR                 videoSessionParameters,
    const VkAllocationCallbacks*                pAllocator);
----

This extension does not define any parameter types. Instead, layered codec-specific extensions define those. Some codecs may not need parameters at all, in which case no video session parameters objects need to be created or managed.


=== Command Buffer Commands

This extension does not introduce any specific video coding operations; however, it does introduce new commands that can be recorded into video-capable command buffers (created from command pools that target queue families with video capabilities).

Applications can record video coding operations into such a command buffer only within a _video coding scope_. The following new command begins such a video coding scope within the command buffer:

[source,c]
----
VKAPI_ATTR void VKAPI_CALL vkCmdBeginVideoCodingKHR(
    VkCommandBuffer                             commandBuffer,
    const VkVideoBeginCodingInfoKHR*            pBeginInfo);
----

This command takes the following parameters:

[source,c]
----
typedef struct VkVideoBeginCodingInfoKHR {
    VkStructureType                       sType;
    const void*                           pNext;
    VkVideoBeginCodingFlagsKHR            flags;
    VkVideoSessionKHR                     videoSession;
    VkVideoSessionParametersKHR           videoSessionParameters;
    uint32_t                              referenceSlotCount;
    const VkVideoReferenceSlotInfoKHR*    pReferenceSlots;
} VkVideoBeginCodingInfoKHR;
----

The mandatory `videoSession` parameter specifies the video session object used to process the video coding operations within the video coding scope. As the video session object is a stateful object providing the device state context needed to perform video coding operations, portions of a video stream can be processed across multiple video coding scopes and multiple command buffers using the same video session object. It is typical, for example, to submit a single command buffer with a single video coding scope encapsulating a single video coding operation (let that be a video decode or encode operation) that performs the decompression or compression of a single video frame produced or consumed by other Vulkan commands.

`videoSessionParameters` provides the optional parameters object to use with the video coding operations, depending on whether one is needed according to the codec-specific requirements.

This command binds the specified video session and (if present) video session parameters objects to the command buffer for the duration of the video coding scope.

In addition, the application can provide a list of reference picture resources, with initial information about which DPB slots they may be currently associated with. This information is provided through an array of the following new structure:

[source,c]
----
typedef struct VkVideoReferenceSlotInfoKHR {
    VkStructureType                         sType;
    const void*                             pNext;
    int32_t                                 slotIndex;
    const VkVideoPictureResourceInfoKHR*    pPictureResource;
} VkVideoReferenceSlotInfoKHR;
----

The list of video picture resources provided here is needed because the `vkCmdBeginVideoCodingKHR` command also acts as a resource binding command, as the provided list defines the set of resources that can be used as reconstructed or reference pictures by video coding operations within the video coding scope.

The DPB slot association information needs to be provided because it is the application's responsibility to maintain the association between DPB slot indices and corresponding video picture resources. If a video picture resource is not currently associated with any DPB slot, but it is planned to be associated with one within this video coding scope (e.g. by using it as the target of picture reconstruction), then it has to be included in the list with a negative `slotIndex` value, indicating that it is a bound reference picture resource, but one that is not currently associated with any DPB slot.

The `vkCmdBeginVideoCodingKHR` command also allows the application to deactivate previously activated DPB slots. This can be done by passing the index of the DPB slot to deactivate in `slotIndex` but not specifying any associated picture resource (`pPictureResource = NULL`). Deactivating the DPB slot removes all associated reference pictures, which allows the application to e.g. reuse or deallocate the corresponding memory resources.

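The three roles an entry of the reference slot list can play follow from the sign of `slotIndex` and from whether a picture resource is given. The sketch below spells that out with a stand-in structure so that it compiles without the Vulkan headers: `RefSlot`, `RefSlotKind`, and `classify_ref_slot` are hypothetical, and treating an entry with a negative index and no resource as invalid is an assumption of this model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the two relevant members of VkVideoReferenceSlotInfoKHR. */
typedef struct RefSlot {
    int32_t     slot_index;        /* mirrors slotIndex */
    const void *picture_resource;  /* mirrors pPictureResource; NULL if none */
} RefSlot;

typedef enum RefSlotKind {
    REF_SLOT_ASSOCIATED,    /* resource currently associated with the slot */
    REF_SLOT_UNASSOCIATED,  /* bound resource without a DPB slot yet */
    REF_SLOT_DEACTIVATE,    /* slot index without a resource: deactivate */
    REF_SLOT_INVALID        /* neither slot nor resource (model assumption) */
} RefSlotKind;

/* Derives the role of an entry from the index sign and resource pointer. */
RefSlotKind classify_ref_slot(const RefSlot *slot)
{
    assert(slot != NULL);
    if (slot->slot_index < 0) {
        return slot->picture_resource != NULL ? REF_SLOT_UNASSOCIATED
                                              : REF_SLOT_INVALID;
    }
    return slot->picture_resource != NULL ? REF_SLOT_ASSOCIATED
                                          : REF_SLOT_DEACTIVATE;
}
```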
The associations between these bound video picture resources and DPB slots can also change during the course of the video coding scope in response to video coding operations.

Control and state changing operations can be issued within a video coding scope with the following new command:

[source,c]
----
VKAPI_ATTR void VKAPI_CALL vkCmdControlVideoCodingKHR(
    VkCommandBuffer                             commandBuffer,
    const VkVideoCodingControlInfoKHR*          pCodingControlInfo);
----

This extension introduces only a single control flag called `VK_VIDEO_CODING_CONTROL_RESET_BIT_KHR` that is used to initialize the video session object. Before being able to record actual video coding operations against a bound video session object, it has to be initialized (reset) using this command by including the `VK_VIDEO_CODING_CONTROL_RESET_BIT_KHR` flag. The reset operation also returns all DPB slots of the video session to the inactive state and removes any DPB slot index associations.

After processing a video stream using a video session, the reset operation can also be used to return the video session back to the initial state. This enables reusing a single video session object to process different, independent video sequences.

A video coding scope can be ended with the following new command:

[source,c]
----
VKAPI_ATTR void VKAPI_CALL vkCmdEndVideoCodingKHR(
    VkCommandBuffer                             commandBuffer,
    const VkVideoEndCodingInfoKHR*              pEndCodingInfo);
----


415=== Status Queries
416
Compressing and decompressing video content is a non-trivial process that involves complex codec-specific semantics and requirements. Accordingly, it is possible for a video coding operation to fail when processing input content that does not conform to the rules defined by the video compression standard in use. Whether a particular video coding operation completed successfully can therefore only be determined at runtime.

In order to facilitate this, this extension also introduces a new `VK_QUERY_TYPE_RESULT_STATUS_ONLY_KHR` query type that enables getting feedback about the status of operations. Support for this new query type can be queried for each queue family index through the following new output structure:

[source,c]
----
typedef struct VkQueueFamilyQueryResultStatusPropertiesKHR {
    VkStructureType    sType;
    void*              pNext;
    VkBool32           queryResultStatusSupport;
} VkQueueFamilyQueryResultStatusPropertiesKHR;
----

Queries also work slightly differently within a video coding scope due to the special behavior of video coding operations. Instead of a query being bound to the scope determined by the corresponding `vkCmdBeginQuery` and `vkCmdEndQuery` calls, in case of video coding each video coding operation consumes its own query slot. Thus if a command issues multiple video coding operations, then those may consume multiple subsequent query slots within the query pool. However, as no new commands are introduced by this extension to start queries with multiple activatable query slots, currently only a single video coding operation is allowed between a `vkCmdBeginQuery` and `vkCmdEndQuery` call.

An unsuccessfully completed video coding operation may also have an effect on subsequently executed video coding operations against the same video session. In particular, if a video coding operation requests the setup (activation) of a DPB slot with a reference picture and that video coding operation completes unsuccessfully, then the corresponding DPB slot will end up having an invalid picture reference. This will cause subsequent video coding operations using reference pictures associated with that DPB slot to produce unexpected results, and may even cause such dependent video coding operations themselves to complete unsuccessfully in response to the invalid input data.

Thus applications have to make sure that they use queries to determine the completion status of video coding operations in order to be able to detect if outputs may contain undefined data and potentially drop those, depending on the particular use case.
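
As a rough illustration of the bookkeeping this implies, the following sketch tracks on the application side which DPB slots may hold an invalid picture reference. It is plain C with no Vulkan dependency. `DpbTracker`, `dpb_tracker_record`, and the capacity of 16 slots are hypothetical names and values chosen for illustration and are not part of this extension. Treating the output of a successful operation as untrusted when it referenced an invalid slot is a conservative policy assumed here, not something the extension mandates.

[source,c]
----
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_MAX_DPB_SLOTS 16 /* hypothetical capacity for this sketch */

/* Application-side record of which DPB slots hold a usable reference picture. */
typedef struct DpbTracker {
    bool slotValid[EXAMPLE_MAX_DPB_SLOTS];
} DpbTracker;

/*
 * Records the outcome of one video coding operation.
 *   status       - value read back from the result status query
 *                  (negative: failure, positive: success)
 *   setupSlot    - DPB slot the operation set up, or -1 if none
 *   refSlots     - DPB slots used as reference pictures (indices assumed in range)
 * Returns true if the output can be trusted, i.e. the operation succeeded
 * and did not depend on a slot already known to be invalid.
 */
static bool dpb_tracker_record(DpbTracker* tracker, int32_t status, int32_t setupSlot,
                               const int32_t* refSlots, uint32_t refSlotCount)
{
    bool trusted = status > 0;
    for (uint32_t i = 0; i < refSlotCount; ++i) {
        if (!tracker->slotValid[refSlots[i]]) {
            trusted = false;
        }
    }
    if (setupSlot >= 0) {
        /* A slot set up by an untrusted operation is treated as invalid. */
        tracker->slotValid[setupSlot] = trusted;
    }
    return trusted;
}
----

An application following this approach would call `dpb_tracker_record` once per video coding operation after reading back its query result, and drop or conceal any picture for which it returns `false`.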

The mechanisms introduced by the new query type are designed to be generic. While video coding scopes only allow using `VK_QUERY_TYPE_RESULT_STATUS_ONLY_KHR` queries (at least without layered extensions introducing further video-compatible query types), the new `VK_QUERY_RESULT_WITH_STATUS_BIT_KHR` bit can also be used with other query types, replacing the traditional boolean availability information with an enumeration-based status value:

[source,c]
----
typedef enum VkQueryResultStatusKHR {
    VK_QUERY_RESULT_STATUS_ERROR_KHR = -1,
    VK_QUERY_RESULT_STATUS_NOT_READY_KHR = 0,
    VK_QUERY_RESULT_STATUS_COMPLETE_KHR = 1,
    VK_QUERY_RESULT_STATUS_MAX_ENUM_KHR = 0x7FFFFFFF
} VkQueryResultStatusKHR;
----

In general, when retrieving the result status of a query, negative values indicate some sort of failure (unsuccessful completion of operations) and positive values indicate success.


=== Device Memory Management

In this extension the application has complete control over how and when system resources are used. This extension provides the following tools to enable optimal usage of device and host memory resources:

  * The application can manage the number of allocated output and input pictures, and can dynamically grow or shrink the DPB holding the reference pictures, based on the changing video content requirements.
  * Individual video picture resources can be shared across different contexts, e.g. reference pictures can be shared between video decoding and encoding workloads, and the output of a video decode operation can be used as an input to a video encode operation.
  * The images backing the video picture resources can also be used in other non-video-related operations, e.g. video decode operations may directly output to presentable swapchain images, or to images that can be subsequently sampled by graphics operations, subject to appropriate implementation capabilities.
  * The application can also use sparse memory bindings for the images backing the video picture resources. The use of sparse memory bindings allows the application to unbind the device memory backing of the images when the corresponding DPB slot is not in active use.

These general Vulkan capabilities enable this extension to provide seamless and efficient integration across different types of workloads in a "zero-copy" fashion and with minimal synchronization overhead.


=== Resource Creation

This extension stores video picture resources in image objects. As the device memory requirements of video picture resources may be specific to the video profile used, when creating images with any video-specific usage the application has to provide information about the video profiles the image will be used with. As a single image may be reused across video sessions using different video profiles (e.g. to use the decoded output picture as an input picture to subsequent encode operations), the following new structure is introduced to provide a list of video profiles:

[source,c]
----
typedef struct VkVideoProfileListInfoKHR {
    VkStructureType                 sType;
    const void*                     pNext;
    uint32_t                        profileCount;
    const VkVideoProfileInfoKHR*    pProfiles;
} VkVideoProfileListInfoKHR;
----

As multiple profiles are expected to be specified only in video transcoding use cases, the list can include at most one video decode profile and one or more video encode profiles.
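
The constraint on the contents of the list could be checked before resource creation along the following lines. This is only a sketch: `ExampleProfileKind` and `example_profile_list_is_valid` are hypothetical stand-ins, and real code would instead classify each entry by inspecting the `videoCodecOperation` member of its `VkVideoProfileInfoKHR` structure.

[source,c]
----
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for "is this a decode or an encode profile". */
typedef enum ExampleProfileKind {
    EXAMPLE_PROFILE_DECODE,
    EXAMPLE_PROFILE_ENCODE
} ExampleProfileKind;

/* Returns true if the list contains at most one decode profile. */
static bool example_profile_list_is_valid(const ExampleProfileKind* kinds, size_t count)
{
    size_t decodeCount = 0;
    for (size_t i = 0; i < count; ++i) {
        if (kinds[i] == EXAMPLE_PROFILE_DECODE) {
            ++decodeCount;
        }
    }
    return decodeCount <= 1;
}
----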

When an instance of this structure is included in the `pNext` chain of `VkImageCreateInfo` to a `vkCreateImage` call, the created image will be usable in video coding operations recorded against video sessions using any of the specified video profiles.

Similarly, buffers used as the backing store for video bitstreams have to be created with the `pNext` chain of `VkBufferCreateInfo` including a profile list structure when calling `vkCreateBuffer` in order to make the resulting buffer compatible with video sessions using any of the specified video profiles.

Query pools are also video-profile-specific. In particular, in order to create a `VK_QUERY_TYPE_RESULT_STATUS_ONLY_KHR` query pool compatible with a particular video profile, the application has to include an instance of the `VkVideoProfileInfoKHR` structure in the `pNext` chain of `VkQueryPoolCreateInfo`. Unlike buffers and images, query pools are not reusable across video sessions using different video profiles, hence the used structure is `VkVideoProfileInfoKHR` instead of `VkVideoProfileListInfoKHR`.


=== Protected Content Support

This extension also enables support of video coding operations using protected content. Whether a particular implementation supports coding protected content is indicated by the `VK_VIDEO_CAPABILITY_PROTECTED_CONTENT_BIT_KHR` capability flag.

Just like in all other Vulkan operations using protected content, the resources participating in those must be either all protected or all unprotected. This applies to the command buffer (and the command pool it is allocated from), to the queue the command buffer is submitted to, to the buffers and images used within those command buffers, as well as to the video session objects used for video coding.

If the `VK_VIDEO_CAPABILITY_PROTECTED_CONTENT_BIT_KHR` capability flag is supported, the application can create protected-capable video sessions using the `VK_VIDEO_SESSION_CREATE_PROTECTED_CONTENT_BIT_KHR` flag.


=== Capabilities

The generic capabilities of the implementation for a given video profile can be queried using the following new command:

[source,c]
----
VKAPI_ATTR VkResult VKAPI_CALL vkGetPhysicalDeviceVideoCapabilitiesKHR(
    VkPhysicalDevice                            physicalDevice,
    const VkVideoProfileInfoKHR*                pVideoProfile,
    VkVideoCapabilitiesKHR*                     pCapabilities);
----

The output structure contains only common capabilities that are relevant for all video profiles:

[source,c]
----
typedef struct VkVideoCapabilitiesKHR {
    VkStructureType              sType;
    void*                        pNext;
    VkVideoCapabilityFlagsKHR    flags;
    VkDeviceSize                 minBitstreamBufferOffsetAlignment;
    VkDeviceSize                 minBitstreamBufferSizeAlignment;
    VkExtent2D                   pictureAccessGranularity;
    VkExtent2D                   minCodedExtent;
    VkExtent2D                   maxCodedExtent;
    uint32_t                     maxDpbSlots;
    uint32_t                     maxActiveReferencePictures;
    VkExtensionProperties        stdHeaderVersion;
} VkVideoCapabilitiesKHR;
----

In particular, it contains information about the following:

  * Offset and (range) size alignment requirements of the video bitstream buffer ranges
  * Access granularity of video picture resources
  * Minimum and maximum size of coded pictures
  * Maximum number of DPB slots and active reference pictures
  * Name and maximum supported version of the codec-specific video std headers
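
As an example of consuming the first of these, the sketch below rounds a bitstream buffer offset or range size up to the reported alignment. `example_align_up` is a hypothetical helper, not an API of this extension. It uses `uint64_t` to mirror `VkDeviceSize` and does not assume that the alignment is a power of two.

[source,c]
----
#include <stdint.h>

/* Rounds value up to the next multiple of alignment (alignment must be non-zero). */
static uint64_t example_align_up(uint64_t value, uint64_t alignment)
{
    uint64_t remainder = value % alignment;
    return remainder == 0 ? value : value + (alignment - remainder);
}
----

An application would typically apply this with `minBitstreamBufferOffsetAlignment` when choosing where a bitstream range starts and with `minBitstreamBufferSizeAlignment` when choosing how large the range is.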

While these capabilities are generic, each video profile may have its own set of capabilities. In addition, layered extensions will include additional capabilities specific to the type of video coding operation and video compression standard.

The application has to pay particular attention to the picture access granularity. Video coding hardware can often access memory only at a particular granularity (block size) that may span multiple rows or columns of the picture data. This means that when a video coding operation writes data to a video picture resource it is possible that texels outside of the effective extents of the picture will also get modified. Writes to such padding texels will result in undefined texel values, thus the application has to make sure not to assume any particular values in these "shoulder" areas. This is especially important when the application chooses to reuse the same video picture resources to process video frames larger than those the resource was previously used with. To avoid reading undefined values in such cases, applications should clear the image subresources used as video picture resources when the resolution of the video content changes, or otherwise ensure that these padding texels contain well-defined data (e.g. by writing to them) before being read from.
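
To get a feel for the size of these "shoulder" areas, the following sketch rounds a coded extent up to the access granularity. It assumes that accesses start at the picture origin and ignores any clamping to the image bounds, so it is only an upper-bound estimate. `ExampleExtent2D` mirrors the layout of `VkExtent2D`, and both it and `example_accessed_extent` are hypothetical names used for illustration.

[source,c]
----
#include <stdint.h>

/* Mirrors VkExtent2D so that the sketch needs no Vulkan headers. */
typedef struct ExampleExtent2D {
    uint32_t width;
    uint32_t height;
} ExampleExtent2D;

/* Rounds value up to the next multiple of granularity (granularity must be non-zero). */
static uint32_t example_round_up(uint32_t value, uint32_t granularity)
{
    return ((value + granularity - 1) / granularity) * granularity;
}

/* Estimates the extent that may be touched when coding a picture of the given size. */
static ExampleExtent2D example_accessed_extent(ExampleExtent2D coded, ExampleExtent2D granularity)
{
    ExampleExtent2D accessed;
    accessed.width = example_round_up(coded.width, granularity.width);
    accessed.height = example_round_up(coded.height, granularity.height);
    return accessed;
}
----

For instance, a 1920x1080 picture with a 16x16 granularity yields 1920x1088 under these assumptions, i.e. eight rows of padding texels whose contents must not be relied upon.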

Besides the global capabilities of a video profile, the set of image formats usable with video coding operations is also specific to each video profile. The following new query enables the application to enumerate the list and properties of the image formats supported by a given set of video profiles:

[source,c]
----
VKAPI_ATTR VkResult VKAPI_CALL vkGetPhysicalDeviceVideoFormatPropertiesKHR(
    VkPhysicalDevice                            physicalDevice,
    const VkPhysicalDeviceVideoFormatInfoKHR*   pVideoFormatInfo,
    uint32_t*                                   pVideoFormatPropertyCount,
    VkVideoFormatPropertiesKHR*                 pVideoFormatProperties);
----

The input to this query includes the needed image usage flags, which typically include some video-specific usage flags, and the list of video profiles provided through a `VkVideoProfileListInfoKHR` structure included in the `pNext` of the following new structure:

[source,c]
----
typedef struct VkPhysicalDeviceVideoFormatInfoKHR {
    VkStructureType      sType;
    const void*          pNext;
    VkImageUsageFlags    imageUsage;
} VkPhysicalDeviceVideoFormatInfoKHR;
----

The query returns the following new output structure:

[source,c]
----
typedef struct VkVideoFormatPropertiesKHR {
    VkStructureType       sType;
    void*                 pNext;
    VkFormat              format;
    VkComponentMapping    componentMapping;
    VkImageCreateFlags    imageCreateFlags;
    VkImageType           imageType;
    VkImageTiling         imageTiling;
    VkImageUsageFlags     imageUsageFlags;
} VkVideoFormatPropertiesKHR;
----

Alongside the format and the supported image creation values/flags, `componentMapping` indicates how the video coding operations interpret the individual components of video picture resources using this format. For example, if the implementation produces video decode output with the `VK_FORMAT_G8_B8R8_2PLANE_420_UNORM` format where the blue and red chrominance channels are swapped then `componentMapping` will have the following values:

[source,c]
----
components.r = VK_COMPONENT_SWIZZLE_B;        // Cb component
components.g = VK_COMPONENT_SWIZZLE_IDENTITY; // Y component
components.b = VK_COMPONENT_SWIZZLE_R;        // Cr component
components.a = VK_COMPONENT_SWIZZLE_IDENTITY; // unused, defaults to 1.0
----

The query may return multiple `VkVideoFormatPropertiesKHR` entries with the same format, but otherwise different values for other members (e.g. with different image type or image tiling). In addition, a different set of entries may be returned depending on the input image usage flags specified, even for the same set of video profiles, for example, based on whether input, output, or DPB usage is requested.

The application can select the parameters from a returned entry and use compatible parameters when creating images to be used as video picture resources with any of the video profiles provided in the input list.
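
One possible selection policy is sketched below: take the first returned entry that has the wanted tiling and whose usage flags include every required bit. `ExampleFormatEntry` is a hypothetical stand-in for the relevant subset of `VkVideoFormatPropertiesKHR`, using plain integers in place of the Vulkan types, and `example_pick_format` is an illustrative helper rather than part of the API. Real applications may well prefer other criteria.

[source,c]
----
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the subset of VkVideoFormatPropertiesKHR used here. */
typedef struct ExampleFormatEntry {
    uint32_t format;          /* stands in for VkFormat */
    uint32_t imageTiling;     /* stands in for VkImageTiling */
    uint32_t imageUsageFlags; /* stands in for VkImageUsageFlags */
} ExampleFormatEntry;

/*
 * Returns the index of the first entry with the wanted tiling whose usage
 * flags include all required bits, or -1 if no entry matches.
 */
static int example_pick_format(const ExampleFormatEntry* entries, size_t count,
                               uint32_t wantedTiling, uint32_t requiredUsage)
{
    for (size_t i = 0; i < count; ++i) {
        if (entries[i].imageTiling == wantedTiling &&
            (entries[i].imageUsageFlags & requiredUsage) == requiredUsage) {
            return (int)i;
        }
    }
    return -1;
}
----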


== Examples

=== Select queue family with support for a given video codec operation and result status queries

[source,c]
----
VkVideoCodecOperationFlagBitsKHR neededVideoCodecOp = ...
uint32_t queueFamilyIndex;
uint32_t queueFamilyCount;

vkGetPhysicalDeviceQueueFamilyProperties2(physicalDevice, &queueFamilyCount, NULL);

VkQueueFamilyProperties2* props = calloc(queueFamilyCount,
    sizeof(VkQueueFamilyProperties2));
VkQueueFamilyVideoPropertiesKHR* videoProps = calloc(queueFamilyCount,
    sizeof(VkQueueFamilyVideoPropertiesKHR));
VkQueueFamilyQueryResultStatusPropertiesKHR* queryResultStatusProps = calloc(queueFamilyCount,
    sizeof(VkQueueFamilyQueryResultStatusPropertiesKHR));

for (queueFamilyIndex = 0; queueFamilyIndex < queueFamilyCount; ++queueFamilyIndex) {
    props[queueFamilyIndex].sType = VK_STRUCTURE_TYPE_QUEUE_FAMILY_PROPERTIES_2;
    props[queueFamilyIndex].pNext = &videoProps[queueFamilyIndex];

    videoProps[queueFamilyIndex].sType = VK_STRUCTURE_TYPE_QUEUE_FAMILY_VIDEO_PROPERTIES_KHR;
    videoProps[queueFamilyIndex].pNext = &queryResultStatusProps[queueFamilyIndex];

    queryResultStatusProps[queueFamilyIndex].sType = VK_STRUCTURE_TYPE_QUEUE_FAMILY_QUERY_RESULT_STATUS_PROPERTIES_KHR;
}

vkGetPhysicalDeviceQueueFamilyProperties2(physicalDevice, &queueFamilyCount, props);

for (queueFamilyIndex = 0; queueFamilyIndex < queueFamilyCount; ++queueFamilyIndex) {
    if ((videoProps[queueFamilyIndex].videoCodecOperations & neededVideoCodecOp) != 0 &&
        (queryResultStatusProps[queueFamilyIndex].queryResultStatusSupport == VK_TRUE)) {
        break;
    }
}

if (queueFamilyIndex < queueFamilyCount) {
    // Found appropriate queue family
    ...
} else {
    // Did not find a queue family with the needed capabilities
    ...
}
----


=== Check support and query the capabilities for a video profile

[source,c]
----
VkResult result;

VkVideoProfileInfoKHR profileInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_PROFILE_INFO_KHR,
    .pNext = ... // pointer to additional profile information structures specific to the codec and use case
    .videoCodecOperation = ... // used video codec operation
    .chromaSubsampling = VK_VIDEO_CHROMA_SUBSAMPLING_420_BIT_KHR,
    .lumaBitDepth = VK_VIDEO_COMPONENT_BIT_DEPTH_8_BIT_KHR,
    .chromaBitDepth = VK_VIDEO_COMPONENT_BIT_DEPTH_8_BIT_KHR
};

VkVideoCapabilitiesKHR capabilities = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_CAPABILITIES_KHR,
    .pNext = ... // pointer to additional capability structures specific to the type of video coding operation and codec
};

result = vkGetPhysicalDeviceVideoCapabilitiesKHR(physicalDevice, &profileInfo, &capabilities);

if (result == VK_SUCCESS) {
    // Profile is supported, check additional capabilities
    ...
} else {
    // Profile is not supported, result provides additional information about why
    ...
}
----


=== Enumerate supported formats for a video profile with a given usage

[source,c]
----
uint32_t formatCount;

VkVideoProfileInfoKHR profileInfo = {
    ...
};

VkVideoProfileListInfoKHR profileListInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_PROFILE_LIST_INFO_KHR,
    .pNext = NULL,
    .profileCount = 1,
    .pProfiles = &profileInfo
};
// NOTE: Add any additional profiles to the list for e.g. video transcoding use cases

VkPhysicalDeviceVideoFormatInfoKHR formatInfo = {
    .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VIDEO_FORMAT_INFO_KHR,
    .pNext = &profileListInfo,
    .imageUsage = ... // expected image usage, e.g. DPB, input, or output
};

vkGetPhysicalDeviceVideoFormatPropertiesKHR(physicalDevice, &formatInfo, &formatCount, NULL);

VkVideoFormatPropertiesKHR* formatProps = calloc(formatCount, sizeof(VkVideoFormatPropertiesKHR));

// Each output structure needs its sType set before the second call
for (uint32_t i = 0; i < formatCount; ++i) {
    formatProps[i].sType = VK_STRUCTURE_TYPE_VIDEO_FORMAT_PROPERTIES_KHR;
}

vkGetPhysicalDeviceVideoFormatPropertiesKHR(physicalDevice, &formatInfo, &formatCount, formatProps);

for (uint32_t i = 0; i < formatCount; ++i) {
    // Find format and image creation capabilities best suited for the use case
    ...
}
----


=== Create video session for a video profile

[source,c]
----
VkVideoSessionKHR videoSession = VK_NULL_HANDLE;

VkVideoSessionCreateInfoKHR createInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_SESSION_CREATE_INFO_KHR,
    .pNext = NULL,
    .queueFamilyIndex = ... // index of queue family that supports the video codec operation
    .flags = 0,
    .pVideoProfile = ... // pointer to video profile information structure chain
    .pictureFormat = ... // image format to use for input/output pictures
    .maxCodedExtent = ... // maximum extent of coded pictures supported by the session
    .referencePictureFormat = ... // image format to use for reference pictures (if used)
    .maxDpbSlots = ... // DPB slot capacity to use (if needed)
    .maxActiveReferencePictures = ... // maximum number of reference pictures used by any operation (if needed)
    .pStdHeaderVersion = ... // pointer to the video std header information (typically the same as reported in the capabilities)
};

vkCreateVideoSessionKHR(device, &createInfo, NULL, &videoSession);
----


=== Query memory requirements and bind memory to a video session

[source,c]
----
uint32_t memReqCount;

vkGetVideoSessionMemoryRequirementsKHR(device, videoSession, &memReqCount, NULL);

VkVideoSessionMemoryRequirementsKHR* memReqs = calloc(memReqCount, sizeof(VkVideoSessionMemoryRequirementsKHR));

// Each output structure needs its sType set before the second call
for (uint32_t i = 0; i < memReqCount; ++i) {
    memReqs[i].sType = VK_STRUCTURE_TYPE_VIDEO_SESSION_MEMORY_REQUIREMENTS_KHR;
}

vkGetVideoSessionMemoryRequirementsKHR(device, videoSession, &memReqCount, memReqs);

for (uint32_t i = 0; i < memReqCount; ++i) {
    // Allocate memory compatible with the given memory binding
    VkDeviceMemory memory = ...

    // Bind the memory to the memory binding
    VkBindVideoSessionMemoryInfoKHR bindInfo = {
        .sType = VK_STRUCTURE_TYPE_BIND_VIDEO_SESSION_MEMORY_INFO_KHR,
        .pNext = NULL,
        .memoryBindIndex = memReqs[i].memoryBindIndex,
        .memory = ... // memory object to bind
        .memoryOffset = ... // offset to bind
        .memorySize = ... // size to bind
    };

    vkBindVideoSessionMemoryKHR(device, videoSession, 1, &bindInfo);
}
// NOTE: Alternatively, all memory bindings can be bound with a single call
----


=== Create and update video session parameters objects

[source,c]
----
VkVideoSessionParametersKHR videoSessionParams = VK_NULL_HANDLE;

VkVideoSessionParametersCreateInfoKHR createInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_SESSION_PARAMETERS_CREATE_INFO_KHR,
    .pNext = ... // pointer to codec-specific parameters creation information
    .flags = 0,
    .videoSessionParametersTemplate = ... // template to use or VK_NULL_HANDLE
    .videoSession = videoSession
};

vkCreateVideoSessionParametersKHR(device, &createInfo, NULL, &videoSessionParams);

...

VkVideoSessionParametersUpdateInfoKHR updateInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_SESSION_PARAMETERS_UPDATE_INFO_KHR,
    .pNext = ... // pointer to codec-specific parameters update information
    .updateSequenceCount = 1 // incremented for each subsequent update
};

// The parameters object is passed by handle, not by pointer
vkUpdateVideoSessionParametersKHR(device, videoSessionParams, &updateInfo);
----


=== Create bitstream buffer

[source,c]
----
VkBuffer buffer = VK_NULL_HANDLE;

VkVideoProfileListInfoKHR profileListInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_PROFILE_LIST_INFO_KHR,
    .pNext = NULL,
    .profileCount = ... // number of video profiles to use the bitstream buffer with
    .pProfiles = ... // pointer to an array of video profile information structure chains
};

VkBufferCreateInfo createInfo = {
    .sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO,
    .pNext = &profileListInfo,
    ... // buffer creation parameters including one or more video-specific usage flags
};

vkCreateBuffer(device, &createInfo, NULL, &buffer);
----


=== Create image and image view backing video picture resources

[source,c]
----
VkImage image = VK_NULL_HANDLE;
VkImageView imageView = VK_NULL_HANDLE;

VkVideoProfileListInfoKHR profileListInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_PROFILE_LIST_INFO_KHR,
    .pNext = NULL,
    .profileCount = ... // number of video profiles to use the image with
    .pProfiles = ... // pointer to an array of video profile information structure chains
};

VkImageCreateInfo imageCreateInfo = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO,
    .pNext = &profileListInfo,
    ... // image creation parameters including one or more video-specific usage flags
};

vkCreateImage(device, &imageCreateInfo, NULL, &image);

VkImageViewUsageCreateInfo imageViewUsageInfo = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_USAGE_CREATE_INFO,
    .pNext = NULL,
    .usage = ... // video-specific usage flags
};

VkImageViewCreateInfo imageViewCreateInfo = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO,
    .pNext = &imageViewUsageInfo,
    .flags = 0,
    .image = image,
    .viewType = ... // image view type (only 2D or 2D_ARRAY is supported)
    ... // other image view creation parameters
};

vkCreateImageView(device, &imageViewCreateInfo, NULL, &imageView);
----


=== Record video coding operations into command buffers

[source,c]
----
VkCommandBuffer commandBuffer = ... // allocate command buffer for a queue family supporting the video profile

vkBeginCommandBuffer(commandBuffer, ...);
...

// Begin video coding scope with given video session, parameters, and reference picture resources
VkVideoBeginCodingInfoKHR beginInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_BEGIN_CODING_INFO_KHR,
    .pNext = NULL,
    .flags = 0,
    .videoSession = videoSession,
    .videoSessionParameters = videoSessionParams,
    .referenceSlotCount = ...
    .pReferenceSlots = ...
};

vkCmdBeginVideoCodingKHR(commandBuffer, &beginInfo);

// Reset video session before starting to use it for video coding operations
// (only needed when starting to process a new video stream)
VkVideoCodingControlInfoKHR controlInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_CODING_CONTROL_INFO_KHR,
    .pNext = NULL,
    .flags = VK_VIDEO_CODING_CONTROL_RESET_BIT_KHR
};

vkCmdControlVideoCodingKHR(commandBuffer, &controlInfo);

// Issue video coding operations against the video session
...

// End video coding scope
VkVideoEndCodingInfoKHR endInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_END_CODING_INFO_KHR,
    .pNext = NULL,
    .flags = 0
};

vkCmdEndVideoCodingKHR(commandBuffer, &endInfo);

...
vkEndCommandBuffer(commandBuffer);
----


=== Create and use result status query pool with a video session

[source,c]
----
VkQueryPool queryPool = VK_NULL_HANDLE;

VkVideoProfileInfoKHR profileInfo = {
    ...
};

VkQueryPoolCreateInfo createInfo = {
    .sType = VK_STRUCTURE_TYPE_QUERY_POOL_CREATE_INFO,
    .pNext = &profileInfo,
    .flags = 0,
    .queryType = VK_QUERY_TYPE_RESULT_STATUS_ONLY_KHR,
    ...
};

vkCreateQueryPool(device, &createInfo, NULL, &queryPool);

...
vkBeginCommandBuffer(commandBuffer, ...);
...
vkCmdBeginVideoCodingKHR(commandBuffer, ...);
...
vkCmdBeginQuery(commandBuffer, queryPool, 0, 0);
// Issue video coding operation
...
vkCmdEndQuery(commandBuffer, queryPool, 0);
...
vkCmdEndVideoCodingKHR(commandBuffer, ...);
...
vkEndCommandBuffer(commandBuffer);
...

VkQueryResultStatusKHR status;
vkGetQueryPoolResults(device, queryPool, 0, 1,
                      sizeof(status), &status, sizeof(status),
                      VK_QUERY_RESULT_WITH_STATUS_BIT_KHR);

if (status == VK_QUERY_RESULT_STATUS_NOT_READY_KHR /* 0 */) {
    // Query result not ready yet
    ...
} else if (status > 0) {
    // Video coding operation was successful, enum values indicate specific success status code
    ...
} else if (status < 0) {
    // Video coding operation was unsuccessful, enum values indicate specific failure status code
    ...
}
----


== Issues

=== RESOLVED: What is within the scope of this extension?

The goal of this extension is to include all infrastructure APIs that are shareable across all video coding use cases, including video decoding and video encoding, independent of the video compression standard used. While there is a large set of parameters and semantics that are specific to the particular video coding operation and video codec used, many fundamental concepts and APIs are common across those, including:

  * The concept of video profiles that describe the video content and video coding use cases
  * The concept of video picture resources and decoded picture buffers
  * Queries that allow the application to determine if a video profile is supported, the capabilities of each video profile, and the supported video picture resource formats that can be used in conjunction with particular sets of video profiles
  * Video session objects that provide the device state context for video coding operations
  * Video session parameters objects that provide the means to reuse large sets of codec-specific parameters across video coding operations
  * General command buffer commands and semantics to build command sequences working on video streams using a video session
  * Feedback mechanisms that enable tracking the status of individual video coding operations

These APIs are designed to be used in conjunction with layered extensions that introduce support for specific video coding operations and video compression standards.


=== RESOLVED: Are Vulkan video profiles equivalent to the corresponding concepts of video compression standards?

Not exactly. While they do encompass actual video compression standard profile information, they also contain other information related to the type of the video content and additional use case scenario specific information.

The video coding operation and the used video compression standard are identified by bits in the new `VkVideoCodecOperationFlagBitsKHR` type. While this extension does not define any valid values, layered codec-specific extensions are expected to add corresponding bits in the form `VK_VIDEO_CODEC_OPERATION_<operationType>_<codec>_BIT`.


=== RESOLVED: Do we need a query to be able to enumerate all supported video profiles?

Enumerating individual video profiles is a non-trivial problem due to the parameter combinatorics and the interaction between individual parameters. As Vulkan video profiles also include additional use case scenario specific information, it gets even more complicated. It is also expected that most use cases (especially video decoding) will want to target specific video profiles anyway, so this extension does not include an enumeration API for video profiles. Instead, it provides the mechanisms to determine support for specific ones. Nonetheless, a more generic enumeration API may be considered for inclusion in future extensions.


=== RESOLVED: Do we need queries that allow determining how multiple video profiles can be used in conjunction?

Video transcoding is an important use case, so this extension does allow queries and other APIs to take a list of video profiles, when applicable, that enable the application to determine how to use a particular set of video decode and video encode profiles in conjunction, and thus support video transcoding without the need to copy video picture data, when possible.


=== RESOLVED: What kind of capability queries do we need?

First, this extension enables the application to query the video codec operations supported by each queue family with the new output structure `VkQueueFamilyVideoPropertiesKHR`.

Second, the new `vkGetPhysicalDeviceVideoCapabilitiesKHR` command enables checking support for individual video profiles, and querying their general capabilities. This API also enables layered extensions to add new output structures to retrieve additional capabilities specific to the used video coding operation and video compression standard.

Besides those, as the set of image formats and other image creation parameters compatible with video coding varies across video profiles, the new `vkGetPhysicalDeviceVideoFormatPropertiesKHR` command is introduced to query the set of image parameters that are compatible with a given set of video profiles and usage. In addition, the existing `vkGetPhysicalDeviceImageFormatProperties2` command is also extended to be able to take a list of video profiles as input to query video-specific image format capabilities.


=== RESOLVED: What kind of command buffer commands do we need?

This extension does not introduce any specific video coding operations (e.g. video decode or encode operations). However, it does introduce a set of command buffer commands that enable defining scopes within command buffers where layered extensions can record video coding operations against a specific video session to process a video sequence. These video coding scopes are delimited by the new `vkCmdBeginVideoCodingKHR` and `vkCmdEndVideoCodingKHR` commands.

In addition, the `vkCmdControlVideoCodingKHR` command is introduced to allow layered extensions to modify dynamic context state, and control video session state in general.


=== RESOLVED: How can the application get feedback about the status of video coding operations?

This extension uses queries for the purpose and even introduces a new query type (`VK_QUERY_TYPE_RESULT_STATUS_ONLY_KHR`) that only includes status information. Layered extensions may also introduce other query types to enable retrieving any additional feedback that may be needed in the specific video coding use case.

Such queries can be issued within video coding scopes using the existing `vkCmdBeginQuery` and `vkCmdEndQuery` commands (and their variants). However, the behavior of queries within video coding scopes is slightly different. Instead of a single query capturing the overall result of a series of commands, queries in video coding scopes produce separate results for each video coding operation, hence each video coding operation needs to consume a separate query slot.


=== RESOLVED: Do we need to introduce new `vkCmdBeginQueryRangeKHR` and `vkCmdEndQueryRangeKHR` commands to allow capturing feedback about multiple video coding operations using a single scope?

Not in this extension. For now each layered extension is expected to introduce commands that each issue only a single video coding operation, hence using the existing `vkCmdBeginQuery` and `vkCmdEndQuery` commands to surround each such command separately is sufficient. However, future extensions may introduce such commands if needed.
1026
1027
1028=== RESOLVED: Can resources be shared across video sessions, potentially ones using different video profiles?
1029
1030Yes, we need to support resource sharing at least for video bitstream buffers and video picture resources. This is important for the purposes of supporting efficient video transcoding.
1031
1032Subject to the capabilities of the implementation, buffers and image resources can be created to be shareable across video sessions by including the list of video profiles used by each video session in the object creation parameters.
1033
1034Query pools, however, are always specific to a video profile, as there is little use to share them across video sessions, and typically the contents of the query results are specific to the used video profile anyway.


=== RESOLVED: How are video coding operations synchronized with respect to other Vulkan operations?

Synchronization works in the same way as elsewhere in the API. Command buffers targeting video-capable queues can use `vkCmdPipelineBarrier` or any of the other synchronization commands both inside and outside of video coding scopes. While this extension does not include any new pipeline stages, access flags, or image layouts, the layered extensions introducing particular video coding operations do.


=== RESOLVED: Why do some of the members of `VkVideoProfileInfoKHR` have `Flags` types instead of `FlagBits` types when only a single bit can be used?

While this extension allows specifying only a single bit in the `chromaSubsampling`, `lumaBitDepth`, and `chromaBitDepth` members of `VkVideoProfileInfoKHR`, it is expected that future extensions may relax those requirements.
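As an illustration of the current single-bit restriction, the following self-contained sketch checks that each of the three members has exactly one bit set. `SketchProfile` is a hypothetical stand-in that only mirrors the three member names mentioned above; it is not the real `VkVideoProfileInfoKHR` structure, and the check reflects only the statement in this section rather than the complete valid usage rules.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in mirroring three members of VkVideoProfileInfoKHR. */
typedef struct {
    uint32_t chromaSubsampling;
    uint32_t lumaBitDepth;
    uint32_t chromaBitDepth;
} SketchProfile;

/* Returns true if exactly one bit is set in the flags value. */
bool sketch_is_single_bit(uint32_t flags)
{
    /* Clearing the lowest set bit leaves zero only for a power of two. */
    return flags != 0 && (flags & (flags - 1)) == 0;
}

/* Returns true if every member of the sketch profile names a single bit. */
bool sketch_profile_uses_single_bits(const SketchProfile *profile)
{
    return sketch_is_single_bit(profile->chromaSubsampling) &&
           sketch_is_single_bit(profile->lumaBitDepth) &&
           sketch_is_single_bit(profile->chromaBitDepth);
}
```

Because the members are `Flags` types, a future relaxation that permits several bits would not require changing the member types; only a check like this one would change.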


=== RESOLVED: Can the application create video sessions with any `maxDpbSlots` and `maxActiveReferencePictures` values within the supported capabilities?

Yes. It is quite common for video compression standards to define these values. In particular, a given video profile usually supports a specific number of DPB slots, and video compression standards typically allow all reference pictures associated with active DPB slots to be used as active reference pictures in a video coding operation. However, depending on the specific use case, the application can choose to use lower values.

For example, if the application knows that the video content always uses at most a single reference picture for each frame, and that it only ever uses a single DPB slot, using `1` as the value for both `maxDpbSlots` and `maxActiveReferencePictures` can enable the application to limit the memory requirements of the DPB.
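The choice described above can be sketched as a small helper that requests only what the content needs, provided the implementation supports it. The sketch is self-contained and does not use the Vulkan headers: `SketchDpbLimits` and `sketch_choose_dpb_limits` are hypothetical names, and the two fields merely mirror the `maxDpbSlots` and `maxActiveReferencePictures` values discussed here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in holding the two values discussed in this issue. */
typedef struct {
    uint32_t maxDpbSlots;
    uint32_t maxActiveReferencePictures;
} SketchDpbLimits;

/* Picks the values to create a video session with.
 * caps:    what the implementation reports as supported
 * needed:  what the application knows the video content requires
 * Writes the needed values to *out and returns true when they fit within
 * the capabilities; returns false (leaving *out untouched) otherwise. */
bool sketch_choose_dpb_limits(SketchDpbLimits caps, SketchDpbLimits needed,
                              SketchDpbLimits *out)
{
    if (needed.maxDpbSlots > caps.maxDpbSlots ||
        needed.maxActiveReferencePictures > caps.maxActiveReferencePictures) {
        return false; /* content cannot be handled by this implementation */
    }
    *out = needed; /* request no more than the content actually uses */
    return true;
}
```

For content that uses a single DPB slot and a single reference picture, `needed` would be `{ 1, 1 }` regardless of how large the reported capabilities are.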

Nonetheless, it is the application's responsibility to make sure that it creates video sessions with appropriate values to be able to handle the video content at hand.


=== RESOLVED: Are `VkVideoSessionParametersKHR` objects internally or externally synchronized?

Video session parameters objects have special synchronization requirements. Typically they will only get updated by a single thread that processes the video stream, but they may be consumed concurrently by multiple command buffer recording threads.

Accordingly, they are defined to be logically internally synchronized, but in practice concurrent updates of the same object are disallowed by the requirement that the application has to increment the update sequence counter of the object with each update call. This model enables implementations to allow concurrent consumption of already stored parameters with minimal to no synchronization overhead.
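The effect of the update sequence counter can be modeled with a short, self-contained sketch. It does not use the Vulkan headers: `SketchSessionParameters` and `sketch_update_parameters` are hypothetical, and the sketch assumes that the counter starts at zero and that each update must supply the current counter value plus one. In the actual API a mismatched value is an application error rather than a reported failure; the sketch returns `false` purely for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a video session parameters object. */
typedef struct {
    uint32_t update_sequence_counter; /* assumed to start at 0 on creation */
    uint32_t stored_parameter_count;  /* stands in for the stored parameters */
} SketchSessionParameters;

/* Applies an update that adds added_count parameters.
 * The caller must pass the current counter value plus one; two racing
 * callers that both read the same counter cannot both satisfy this, which
 * is how the rule excludes concurrent updates of the same object. */
bool sketch_update_parameters(SketchSessionParameters *params,
                              uint32_t update_sequence_count,
                              uint32_t added_count)
{
    if (update_sequence_count != params->update_sequence_counter + 1) {
        return false; /* stale or out-of-order update */
    }
    params->stored_parameter_count += added_count;
    params->update_sequence_counter = update_sequence_count;
    return true;
}
```

Readers of already stored parameters never touch the counter in this model, which matches the intent that consumption can proceed with little or no synchronization.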


== Further Functionality

This extension is meant to provide only common video coding functionality, thus support for individual video coding operations and video compression standards is left for extensions layered on top of the infrastructure provided here.

Currently the following layered extensions are available:

  * `VK_KHR_video_decode_queue` - adds general support for video decode operations
  * `VK_KHR_video_decode_h264` - adds support for decoding H.264/AVC video sequences
  * `VK_KHR_video_decode_h265` - adds support for decoding H.265/HEVC video sequences
  * `VK_KHR_video_encode_queue` (provisional) - adds general support for video encode operations
  * `VK_EXT_video_encode_h264` (provisional) - adds support for encoding H.264/AVC video sequences
  * `VK_EXT_video_encode_h265` (provisional) - adds support for encoding H.265/HEVC video sequences
