// Copyright 2021-2023 The Khronos Group Inc.
//
// SPDX-License-Identifier: CC-BY-4.0

= VK_KHR_video_queue
:toc: left
:refpage: https://registry.khronos.org/vulkan/specs/1.2-extensions/man/html/
:sectnums:

This document outlines a proposal to enable performing video coding operations in Vulkan.


== Problem Statement

Integrating video coding operations into Vulkan applications enables a wide set of new usage scenarios including, but not limited to, the following examples:

 * Applying post-processing on top of video frames decoded from a compressed video stream
 * Sourcing dynamic texture data from compressed video streams
 * Recording the output of rendering operations
 * Efficiently transferring rendering results over a network (video conferencing, game streaming, etc.)

It is also not uncommon for Vulkan capable devices to feature dedicated hardware acceleration for video compression and decompression.

The goal of this proposal is to enable these use cases, allow exposing the underlying hardware capabilities, and provide tight integration with other functionalities of the Vulkan API.


== Solution Space

The following options have been considered:

 1. Rely on external sharing capabilities to interact with existing video APIs
 2. Add new dedicated APIs to Vulkan separately for video decoding and video encoding
 3. Add a common set of APIs to Vulkan enabling video coding operations in general

Option 1 has the advantage of being the least invasive in terms of API changes. The disadvantage is that there is a wide range of video APIs out there, most of them being platform or vendor specific, which makes creating portable applications difficult. Cross-API interaction also often comes with undesired performance costs and it makes it difficult, if not impossible, to take advantage of all the existing features of Vulkan in such scenarios.

Option 2 enables integrating video coding operations into the API and leveraging all the other capabilities of Vulkan including, but not limited to, explicit resource management and synchronization. Besides that, an integrated solution greatly reduces application complexity and allows for better portability.

Option 3 improves option 2 by acknowledging that there are a lot of facilities that could be shared across different video coding operations like video decoding and encoding. Accordingly, this proposal follows option 3 to introduce a set of concepts, object types, and commands that form the foundation of the video coding capabilities of Vulkan upon which additional functionalities can be layered providing specific video coding operations like video decoding or encoding, and support for individual video compression standards.


== Proposal

=== Video Std Headers

As each video compression standard requires a large set of codec-specific parameters that are orthogonal to the Vulkan API itself, the definitions of those are not part of the Vulkan headers. Instead, these definitions are provided separately for each codec-specific extension in corresponding video std headers.


=== Video Profiles

This extension introduces the concept of video profiles. A video profile in Vulkan loosely resembles similar concepts defined in video compression standards; however, it is a more generic concept that encompasses additional information like the specific video coding operation, the content type/format, and any other information related to the video coding scenario.

A video profile in Vulkan is defined using the following structure:

[source,c]
----
typedef struct VkVideoProfileInfoKHR {
    VkStructureType                     sType;
    const void*                         pNext;
    VkVideoCodecOperationFlagBitsKHR    videoCodecOperation;
    VkVideoChromaSubsamplingFlagsKHR    chromaSubsampling;
    VkVideoComponentBitDepthFlagsKHR    lumaBitDepth;
    VkVideoComponentBitDepthFlagsKHR    chromaBitDepth;
} VkVideoProfileInfoKHR;
----

A complete video profile definition includes an instance of the structure above with additional codec and use case specific parameters provided through its `pNext` chain.

The `videoCodecOperation` member identifies the particular video codec and video coding operation, while the other members provide information about the content type/format, including the chroma subsampling mode and the bit depths used by the compressed video stream.

This extension does not define any video codec operations. Instead, it is left to codec-specific extensions layered on top of this proposal to provide those.


=== Video Queues

Support for video coding operations is exposed through new commands available for use on video-capable queue families. As it is not uncommon for devices to have separate dedicated hardware for accelerating video compression and decompression, possibly separate ones for different video codecs, implementations may expose multiple queue families with different video coding capabilities, although it is also possible for implementations to support video coding operations on the usual graphics or compute capable queue families.

The set of video codec operations supported by a queue family can be retrieved using queue family property queries by including the following new output structure:

[source,c]
----
typedef struct VkQueueFamilyVideoPropertiesKHR {
    VkStructureType                  sType;
    void*                            pNext;
    VkVideoCodecOperationFlagsKHR    videoCodecOperations;
} VkQueueFamilyVideoPropertiesKHR;
----

After a successful query, the `videoCodecOperations` member will contain bits corresponding to the individual video codec operations supported by the queue family in question.


=== Video Picture Resources

Pictures used by video coding operations are referred to as video picture resources, and are provided to the video coding APIs through instances of the following new structure:

[source,c]
----
typedef struct VkVideoPictureResourceInfoKHR {
    VkStructureType    sType;
    const void*        pNext;
    VkOffset2D         codedOffset;
    VkExtent2D         codedExtent;
    uint32_t           baseArrayLayer;
    VkImageView        imageViewBinding;
} VkVideoPictureResourceInfoKHR;
----

Each video picture resource is backed by a subregion within a layer of an image object. `baseArrayLayer` specifies the array layer index used relative to the image view specified in `imageViewBinding`. Depending on the specific video codec operation, `codedOffset` can specify an additional offset within the image subresource to read/write picture data from/to, while `codedExtent` typically specifies the size of the video frame.

Actual semantics of `codedOffset` and `codedExtent` are specific to the video profile in use, as the capabilities and semantics of individual codecs vary.


=== Decoded Picture Buffer

The chosen video compression standard may require the use of reference pictures. Such reference pictures are used by video coding operations to provide predictions of the values of samples of subsequently decoded or encoded pictures.
Just like any other picture data, the decoded picture buffer (DPB) is backed by image layers. In this extension reference pictures are represented by video picture resources and corresponding image views. The DPB is the logical structure that holds this pool of reference pictures.

The DPB is an indexed data structure, and individual indexed entries of the DPB are referred to as the DPB slots. The range of valid DPB slot indices is between zero and `N-1`, where `N` is the capacity of the DPB. Each DPB slot can refer to one or more reference pictures. In case of typical progressive content each DPB slot usually refers to a single picture containing a video frame, but other content types like multiview or interlaced video allow multiple pictures to be associated with each slot. If a DPB slot has any pictures associated with it, then it is an active DPB slot, otherwise it is an inactive DPB slot.

DPB slots can be activated with reference pictures in response to video coding operations requesting such activations. This extension does not introduce any video coding operations. Instead, layered extensions provide those. However, this extension does provide facilities to deactivate currently active DPB slots, as discussed later.

In this extension, the state and the backing store of the DPB are separated as follows:

 * The state of individual DPB slots is maintained by video session objects.
 * The backing store of DPB slots is provided by video picture resources and the underlying images.

A single non-mipmapped image with a layer count equaling the number of DPB slots can be used as the backing store of the DPB, where the picture corresponding to a particular DPB slot index is stored in the layer with the same index. The API also allows arbitrary mapping of image layers to DPB slots.
Furthermore, if the `VK_VIDEO_CAPABILITY_SEPARATE_REFERENCE_IMAGES_BIT_KHR` capability flag is supported by the implementation for a specific video profile, then individual DPB slots can be backed by different images, potentially using a separate image for each DPB slot.

Depending on the used video profile, a single DPB slot may contain more than just one picture (e.g. in case of multiview and interlaced content). In such cases the number of needed image layers may be larger than the number of DPB slots, hence the image(s) used as the backing store of the DPB have to be sized accordingly.

There may also be video compression standards, video profiles, or use cases that do not require or do not support reference pictures at all. In such cases a DPB is not needed either.

The responsibility of managing the DPB is split between the application and the implementation as follows:

 * The application maintains the association between DPB slot indices and corresponding video picture resources.
 * The implementation maintains global and per-slot opaque reference picture metadata.

In addition, the application is also responsible for managing the mapping between the codec-specific picture IDs and DPB slots, and any other codec-specific states.
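
The application-side half of this split can be sketched in plain C. This is only an illustration under stated assumptions, not part of the proposal: `DpbTracker` and the `dpb_*` helpers are hypothetical names, an integer `resource_id` stands in for a video picture resource, and each slot holds a single picture (as with typical progressive content).

```c
#include <assert.h>
#include <stdint.h>

#define DPB_MAX_SLOTS   16    /* hypothetical upper bound used by this sketch */
#define DPB_NO_RESOURCE (-1)  /* marks an inactive slot */

/* Application-side record of which picture resource backs each DPB slot. */
typedef struct DpbTracker {
    uint32_t capacity;                    /* N: valid slot indices are 0..N-1 */
    int32_t  resource_id[DPB_MAX_SLOTS];  /* hypothetical resource handle per slot */
} DpbTracker;

/* Starts with every slot inactive. */
static void dpb_init(DpbTracker *dpb, uint32_t capacity) {
    assert(capacity <= DPB_MAX_SLOTS);
    dpb->capacity = capacity;
    for (uint32_t i = 0; i < DPB_MAX_SLOTS; ++i)
        dpb->resource_id[i] = DPB_NO_RESOURCE;
}

/* A slot is active if it has a picture resource associated with it. */
static int dpb_is_active(const DpbTracker *dpb, uint32_t slot) {
    return slot < dpb->capacity && dpb->resource_id[slot] != DPB_NO_RESOURCE;
}

/* Returns the lowest inactive slot index, or -1 if every slot is active. */
static int32_t dpb_find_free_slot(const DpbTracker *dpb) {
    for (uint32_t i = 0; i < dpb->capacity; ++i)
        if (dpb->resource_id[i] == DPB_NO_RESOURCE)
            return (int32_t)i;
    return -1;
}

/* Records that `slot` now refers to `resource_id`; the slot becomes active. */
static void dpb_associate(DpbTracker *dpb, uint32_t slot, int32_t resource_id) {
    assert(slot < dpb->capacity && resource_id >= 0);
    dpb->resource_id[slot] = resource_id;
}

/* Records that `slot` no longer refers to any picture resource. */
static void dpb_deactivate(DpbTracker *dpb, uint32_t slot) {
    assert(slot < dpb->capacity);
    dpb->resource_id[slot] = DPB_NO_RESOURCE;
}
```

A real application would keep a table like this in sync with the slot activations and deactivations it requests, and would additionally map codec-specific picture IDs to slot indices.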


=== Video Session

Before performing any video coding operations, the application needs to create a video session object using the following new command:

[source,c]
----
VKAPI_ATTR VkResult VKAPI_CALL vkCreateVideoSessionKHR(
    VkDevice                                    device,
    const VkVideoSessionCreateInfoKHR*          pCreateInfo,
    const VkAllocationCallbacks*                pAllocator,
    VkVideoSessionKHR*                          pVideoSession);
----

The creation parameters are as follows:

[source,c]
----
typedef struct VkVideoSessionCreateInfoKHR {
    VkStructureType                 sType;
    const void*                     pNext;
    uint32_t                        queueFamilyIndex;
    VkVideoSessionCreateFlagsKHR    flags;
    const VkVideoProfileInfoKHR*    pVideoProfile;
    VkFormat                        pictureFormat;
    VkExtent2D                      maxCodedExtent;
    VkFormat                        referencePictureFormat;
    uint32_t                        maxDpbSlots;
    uint32_t                        maxActiveReferencePictures;
    const VkExtensionProperties*    pStdHeaderVersion;
} VkVideoSessionCreateInfoKHR;
----

A video session object is created against a specific video profile and the implementation uses it to maintain video coding related state. The creation parameters of a video session object include the following:

 * The queue family the video session can be used with (`queueFamilyIndex`)
 * A video profile definition specifying the particular video compression standard and video coding operation type the video session can be used with (`pVideoProfile`)
 * The maximum size of the coded frames the video session can be used with (`maxCodedExtent`)
 * The capacity of the DPB (`maxDpbSlots`)
 * The maximum number of reference pictures that can be used in a single operation (`maxActiveReferencePictures`)
 * The used picture formats (`pictureFormat` and `referencePictureFormat`)
 * The used video compression standard header (`pStdHeaderVersion`)

A video session object can be used to perform video coding operations on a single video stream at a time.
After the application has finished processing a video stream, it can reuse the object to process another video stream, provided that the configuration parameters between the two streams are compatible (as determined by the video compression standard in use).

Once a video session has been created, the video compression standard and profiles, picture formats, and other settings like the maximum coded extent cannot be changed. However, many parameters of video coding operations may change between subsequent operations, subject to restrictions imposed on parameter updates by the video compression standard, e.g.:

 * The size of the decoded or encoded pictures
 * The number of active DPB slots
 * The number of reference pictures in use

In particular, a given video session can be reused to process video streams with different extents, as long as the used coded extent does not exceed the maximum coded extent the video session was created with. This can be useful to reduce latency/overhead when processing video content that may dynamically change the video resolution as part of adjusting to varying network conditions, for example.

After creating a video session, and before using the object in command buffer commands, the application has to allocate and bind device memory to the video session.
Implementations may require one or more memory bindings to be bound with compatible device memory, as reported by the following new command:

[source,c]
----
VKAPI_ATTR VkResult VKAPI_CALL vkGetVideoSessionMemoryRequirementsKHR(
    VkDevice                                    device,
    VkVideoSessionKHR                           videoSession,
    uint32_t*                                   pMemoryRequirementsCount,
    VkVideoSessionMemoryRequirementsKHR*        pMemoryRequirements);
----

For each memory binding the following information is returned:

[source,c]
----
typedef struct VkVideoSessionMemoryRequirementsKHR {
    VkStructureType         sType;
    void*                   pNext;
    uint32_t                memoryBindIndex;
    VkMemoryRequirements    memoryRequirements;
} VkVideoSessionMemoryRequirementsKHR;
----

`memoryBindIndex` is a unique identifier of the corresponding memory binding and can have any value, and `memoryRequirements` contains the memory requirements corresponding to the memory binding.

The application can bind compatible device memory ranges for each binding through one or more calls to the following new command:

[source,c]
----
VKAPI_ATTR VkResult VKAPI_CALL vkBindVideoSessionMemoryKHR(
    VkDevice                                    device,
    VkVideoSessionKHR                           videoSession,
    uint32_t                                    bindSessionMemoryInfoCount,
    const VkBindVideoSessionMemoryInfoKHR*      pBindSessionMemoryInfos);
----

The parameters of a memory binding are as follows:

[source,c]
----
typedef struct VkBindVideoSessionMemoryInfoKHR {
    VkStructureType    sType;
    const void*        pNext;
    uint32_t           memoryBindIndex;
    VkDeviceMemory     memory;
    VkDeviceSize       memoryOffset;
    VkDeviceSize       memorySize;
} VkBindVideoSessionMemoryInfoKHR;
----

The application does not have to bind memory to each memory binding with a single call, but before being able to use the video session in video coding operations, all memory bindings have to be bound to compatible device memory, and the bindings are immutable for the lifetime of the video session.
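
The bookkeeping implied by these rules can be modeled without any Vulkan calls. The following is a rough sketch under stated assumptions: `SessionBindings` and the `bindings_*` helpers are hypothetical names invented for this illustration, and the array of indices stands in for the `memoryBindIndex` values an implementation would report. It tracks which bindings have been bound, rejects a second bind of the same index (bindings are immutable), and reports when the session would be ready for use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_BINDINGS 8  /* hypothetical limit used by this sketch */

/* Stand-in for one reported memory binding of a video session. */
typedef struct BindingState {
    uint32_t memory_bind_index;  /* opaque identifier; may have any value */
    bool     bound;              /* set once memory has been bound */
} BindingState;

typedef struct SessionBindings {
    uint32_t     count;
    BindingState entries[MAX_BINDINGS];
} SessionBindings;

/* Records the set of bind indices reported for a session; none are bound yet. */
static void bindings_init(SessionBindings *s, const uint32_t *indices, uint32_t count) {
    assert(count <= MAX_BINDINGS);
    s->count = count;
    for (uint32_t i = 0; i < count; ++i) {
        s->entries[i].memory_bind_index = indices[i];
        s->entries[i].bound = false;
    }
}

/* Marks one binding as bound. Returns false if the index is unknown or if the
 * binding was already bound, since bindings cannot be changed afterwards. */
static bool bindings_mark_bound(SessionBindings *s, uint32_t memory_bind_index) {
    for (uint32_t i = 0; i < s->count; ++i) {
        if (s->entries[i].memory_bind_index == memory_bind_index) {
            if (s->entries[i].bound)
                return false;
            s->entries[i].bound = true;
            return true;
        }
    }
    return false;
}

/* The session is usable in video coding operations only once this is true. */
static bool bindings_all_bound(const SessionBindings *s) {
    for (uint32_t i = 0; i < s->count; ++i)
        if (!s->entries[i].bound)
            return false;
    return true;
}
```

Because the bind indices are opaque identifiers rather than array positions, the sketch looks them up by value instead of using them to index the table directly.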

Once a video session object is no longer needed (and is no longer used by any pending command buffers), it can be destroyed with the following new command:

[source,c]
----
VKAPI_ATTR void VKAPI_CALL vkDestroyVideoSessionKHR(
    VkDevice                                    device,
    VkVideoSessionKHR                           videoSession,
    const VkAllocationCallbacks*                pAllocator);
----


=== Video Session Parameters

Most video compression standards require parameters that are in use across multiple video coding operations, potentially across the entire video stream. For example, the H.264/AVC and H.265/HEVC standards require sequence and picture parameter sets (SPS and PPS) that apply to multiple video frames, layers, and sub-layers.

This extension uses video session parameters objects to store such standard parameters. These objects enable storing such codec-specific parameters in a preprocessed form and enable reducing the number of parameters needed to be provided and processed by the implementation while recording video coding operations into command buffers.

Video session parameters objects use a key-value storage. How keys are derived from the provided parameters is codec-specific (e.g. in case of H.264/AVC picture parameter sets the key consists of an SPS and PPS ID pair).
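
As a rough model of such a key-value storage, the sketch below keys entries on an (SPS ID, PPS ID) pair, following the H.264/AVC example above. It is an illustration only: `PpsKey`, `PpsStore`, and the `pps_*` helpers are hypothetical names, the `int` payload stands in for the actual parameter set data, the fixed capacity is arbitrary, and rejecting an already present key is a simplifying assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STORE_CAPACITY 4  /* arbitrary capacity for this sketch */

/* Key of a picture parameter set: the pair of SPS ID and PPS ID. */
typedef struct PpsKey {
    uint8_t sps_id;
    uint8_t pps_id;
} PpsKey;

typedef struct PpsEntry {
    PpsKey key;
    int    payload;  /* placeholder for the stored parameter set */
} PpsEntry;

typedef struct PpsStore {
    size_t   count;
    PpsEntry entries[STORE_CAPACITY];
} PpsStore;

static bool pps_key_equal(PpsKey a, PpsKey b) {
    return a.sps_id == b.sps_id && a.pps_id == b.pps_id;
}

/* Returns the entry stored under `key`, or NULL if there is none. */
static const PpsEntry *pps_find(const PpsStore *s, PpsKey key) {
    for (size_t i = 0; i < s->count; ++i)
        if (pps_key_equal(s->entries[i].key, key))
            return &s->entries[i];
    return NULL;
}

/* Adds a new entry. Fails if the key is already present (existing entries
 * are never overwritten in this sketch) or if the store is full. */
static bool pps_add(PpsStore *s, PpsKey key, int payload) {
    if (pps_find(s, key) != NULL || s->count == STORE_CAPACITY)
        return false;
    s->entries[s->count].key = key;
    s->entries[s->count].payload = payload;
    s->count++;
    return true;
}
```

Two picture parameter sets with the same PPS ID but different SPS IDs occupy distinct entries in this model, which is the point of using the pair as the key.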

The application can create a video session parameters object against a video session with the following new command:

[source,c]
----
VKAPI_ATTR VkResult VKAPI_CALL vkCreateVideoSessionParametersKHR(
    VkDevice                                        device,
    const VkVideoSessionParametersCreateInfoKHR*    pCreateInfo,
    const VkAllocationCallbacks*                    pAllocator,
    VkVideoSessionParametersKHR*                    pVideoSessionParameters);
----

The creation parameters are as follows:

[source,c]
----
typedef struct VkVideoSessionParametersCreateInfoKHR {
    VkStructureType                           sType;
    const void*                               pNext;
    VkVideoSessionParametersCreateFlagsKHR    flags;
    VkVideoSessionParametersKHR               videoSessionParametersTemplate;
    VkVideoSessionKHR                         videoSession;
} VkVideoSessionParametersCreateInfoKHR;
----

Layered extensions may provide mechanisms to specify an initial set of parameters at creation time, and the application can also specify a video session parameters object in `videoSessionParametersTemplate` that will be used as a template for the new object. Applying a template happens by first adding any parameters specified in the codec-specific creation parameters, followed by adding any parameters from the template object that have a key that does not match the key of any of the already added parameters.

Parameters stored in video session parameters objects are immutable to facilitate the concurrent use of the stored parameters in multiple threads.
However, new parameters can be added to existing objects using the following new command:

[source,c]
----
VKAPI_ATTR VkResult VKAPI_CALL vkUpdateVideoSessionParametersKHR(
    VkDevice                                        device,
    VkVideoSessionParametersKHR                     videoSessionParameters,
    const VkVideoSessionParametersUpdateInfoKHR*    pUpdateInfo);
----

The base parameters to the command are as follows:

[source,c]
----
typedef struct VkVideoSessionParametersUpdateInfoKHR {
    VkStructureType    sType;
    const void*        pNext;
    uint32_t           updateSequenceCount;
} VkVideoSessionParametersUpdateInfoKHR;
----

The `updateSequenceCount` parameter is used to ensure that the video session parameters objects are updated in order. To support concurrent use of the stored immutable parameters while also allowing the video session parameters object to be extended with new parameters, each object maintains an _update sequence counter_ that is set to `0` at object creation time and has to be incremented by each subsequent update operation by specifying an `updateSequenceCount` that equals the current update sequence counter of the object plus one.

Some codecs permit updating previously supplied parameters. As the parameters stored in the video session parameters objects are immutable, if a parameter update is necessary, the application has the following options:

 * Cache the set of parameters on the application side and create a new video session parameters object adding all the parameters with appropriate changes, as necessary; or
 * Create a new video session parameters object providing only the updated parameters and the previously used object as the template, which ensures that parameters not specified at creation time will be copied unmodified from the template object.

Another case when a new video session parameters object may need to be created is when the capacity of the current object is exhausted.
Each video session parameters object is created with a specific capacity, hence if that capacity later turns out to be insufficient, a new object with a larger capacity should be created, typically using the old one as a template.

The application has to track the capacity and the keys of currently stored parameters for each video session parameters object in order to be able to determine when a new object needs to be created due to a change to an existing parameter or due to exceeding the capacity of the existing object.

During command buffer recording, it is the responsibility of the application to provide the video session parameters object containing the necessary parameters for processing the portion of the video stream in question.

The expected usage model for video session parameters objects is a single-producer-multiple-consumer one. Typically a single thread processing the video stream is expected to update the corresponding parameters object, or create new ones when necessary, while at the same time any thread can record video coding operations into command buffers referring to parameters previously added to the object. If, for some reason, the application wants to update a given video session parameters object from multiple threads, it is responsible for providing appropriate mutual exclusion so that no two threads update the same object concurrently, and that the used `updateSequenceCount` values are sequentially increasing.

Once a video session parameters object is no longer needed (and is no longer used by any pending command buffers), it can be destroyed with the following new command:

[source,c]
----
VKAPI_ATTR void VKAPI_CALL vkDestroyVideoSessionParametersKHR(
    VkDevice                                    device,
    VkVideoSessionParametersKHR                 videoSessionParameters,
    const VkAllocationCallbacks*                pAllocator);
----

This extension does not define any parameter types. Instead, layered codec-specific extensions define those.
Some codecs may not need parameters at all, in which case no video session parameters objects need to be created or managed.


=== Command Buffer Commands

This extension does not introduce any specific video coding operations; however, it does introduce new commands that can be recorded into video-capable command buffers (created from command pools that target queue families with video capabilities).

Applications can record video coding operations into such a command buffer only within a _video coding scope_. The following new command begins such a video coding scope within the command buffer:

[source,c]
----
VKAPI_ATTR void VKAPI_CALL vkCmdBeginVideoCodingKHR(
    VkCommandBuffer                             commandBuffer,
    const VkVideoBeginCodingInfoKHR*            pBeginInfo);
----

This command takes the following parameters:

[source,c]
----
typedef struct VkVideoBeginCodingInfoKHR {
    VkStructureType                       sType;
    const void*                           pNext;
    VkVideoBeginCodingFlagsKHR            flags;
    VkVideoSessionKHR                     videoSession;
    VkVideoSessionParametersKHR           videoSessionParameters;
    uint32_t                              referenceSlotCount;
    const VkVideoReferenceSlotInfoKHR*    pReferenceSlots;
} VkVideoBeginCodingInfoKHR;
----

The mandatory `videoSession` parameter specifies the video session object used to process the video coding operations within the video coding scope. As the video session object is a stateful object providing the device state context needed to perform video coding operations, portions of a video stream can be processed across multiple video coding scopes and multiple command buffers using the same video session object. It is typical, for example, to submit a single command buffer with a single video coding scope encapsulating a single video coding operation (let that be a video decode or encode operation) that performs the decompression or compression of a single video frame produced or consumed by other Vulkan commands.

`videoSessionParameters` provides the optional parameters object to use with the video coding operations, depending on whether one is needed according to the codec-specific requirements.

This command binds the specified video session and (if present) video session parameters objects to the command buffer for the duration of the video coding scope.

In addition, the application can provide a list of reference picture resources, with initial information about which DPB slots they may be currently associated with. This information is provided through an array of the following new structure:

[source,c]
----
typedef struct VkVideoReferenceSlotInfoKHR {
    VkStructureType                         sType;
    const void*                             pNext;
    int32_t                                 slotIndex;
    const VkVideoPictureResourceInfoKHR*    pPictureResource;
} VkVideoReferenceSlotInfoKHR;
----

The list of video picture resources provided here is needed because the `vkCmdBeginVideoCodingKHR` command also acts as a resource binding command, as the provided list defines the set of resources that can be used as reconstructed or reference pictures by video coding operations within the video coding scope.

The DPB slot association information needs to be provided because it is the application's responsibility to maintain the association between DPB slot indices and corresponding video picture resources. If a video picture resource is not currently associated with any DPB slot, but it is planned to be associated with one within this video coding scope (e.g. by using it as the target of picture reconstruction), then it has to be included in the list with a negative `slotIndex` value, indicating that it is a bound reference picture resource, but one that is not currently associated with any DPB slot.

The `vkCmdBeginVideoCodingKHR` command also allows the application to deactivate previously activated DPB slots.
This can be done by passing the index of the DPB slot to deactivate in `slotIndex` but not specifying any associated picture resource (`pPictureResource = NULL`). Deactivating the DPB slot removes all associated reference pictures, which allows the application to, for example, reuse or deallocate the corresponding memory resources.

The associations between these bound video picture resources and DPB slots can also change during the course of the video coding scope in response to video coding operations.

Control and state changing operations can be issued within a video coding scope with the following new command:

[source,c]
----
VKAPI_ATTR void VKAPI_CALL vkCmdControlVideoCodingKHR(
    VkCommandBuffer                             commandBuffer,
    const VkVideoCodingControlInfoKHR*          pCodingControlInfo);
----

This extension introduces only a single control flag called `VK_VIDEO_CODING_CONTROL_RESET_BIT_KHR` that is used to initialize the video session object. Before being able to record actual video coding operations against a bound video session object, it has to be initialized (reset) using this command by including the `VK_VIDEO_CODING_CONTROL_RESET_BIT_KHR` flag. The reset operation also returns all DPB slots of the video session to the inactive state and removes any DPB slot index associations.

After processing a video stream using a video session, the reset operation can also be used to return the video session back to the initial state. This enables reusing a single video session object to process different, independent video sequences.

A video coding scope can be ended with the following new command:

[source,c]
----
VKAPI_ATTR void VKAPI_CALL vkCmdEndVideoCodingKHR(
    VkCommandBuffer                             commandBuffer,
    const VkVideoEndCodingInfoKHR*              pEndCodingInfo);
----


=== Status Queries

Compressing and decompressing video content is a non-trivial process that involves complex codec-specific semantics and requirements.
Accordingly, it is possible for a video coding operation to fail when processing input content that is not conformant to the rules defined by the used video compression standard, thus determining whether a particular video coding operation completed successfully can only happen at runtime.

In order to facilitate this, this extension also introduces a new `VK_QUERY_TYPE_RESULT_STATUS_ONLY_KHR` query type that enables getting feedback about the status of operations. Support for this new query type can be queried for each queue family index through the following new output structure:

[source,c]
----
typedef struct VkQueueFamilyQueryResultStatusPropertiesKHR {
    VkStructureType    sType;
    void*              pNext;
    VkBool32           queryResultStatusSupport;
} VkQueueFamilyQueryResultStatusPropertiesKHR;
----

Queries also work slightly differently within a video coding scope due to the special behavior of video coding operations. Instead of a query being bound to the scope determined by the corresponding `vkCmdBeginQuery` and `vkCmdEndQuery` calls, in case of video coding each video coding operation consumes its own query slot. Thus if a command issues multiple video coding operations, then those may consume multiple subsequent query slots within the query pool. However, as no new commands are introduced by this extension to start queries with multiple activatable query slots, currently only a single video coding operation is allowed between a `vkCmdBeginQuery` and `vkCmdEndQuery` call.

An unsuccessfully completed video coding operation may also have an effect on subsequently executed video coding operations against the same video session. In particular, if a video coding operation requests the setup (activation) of a DPB slot with a reference picture and that video coding operation completes unsuccessfully, then the corresponding DPB slot will end up having an invalid picture reference.
This will cause subsequent video coding operations using reference pictures associated with that DPB slot to produce unexpected results, and may even cause such dependent video coding operations themselves to complete unsuccessfully in response to the invalid input data.

Thus applications have to make sure that they use queries to determine the completion status of video coding operations in order to be able to detect if outputs may contain undefined data and potentially drop those, depending on the particular use case.

The mechanisms introduced by the new query type are designed to be generic. While video coding scopes only allow using `VK_QUERY_TYPE_RESULT_STATUS_ONLY_KHR` queries (at least without layered extensions introducing further video-compatible query types), the new `VK_QUERY_RESULT_WITH_STATUS_BIT_KHR` bit can also be used with other query types, replacing the traditional boolean availability information with an enumeration based status value:

[source,c]
----
typedef enum VkQueryResultStatusKHR {
    VK_QUERY_RESULT_STATUS_ERROR_KHR = -1,
    VK_QUERY_RESULT_STATUS_NOT_READY_KHR = 0,
    VK_QUERY_RESULT_STATUS_COMPLETE_KHR = 1,
    VK_QUERY_RESULT_STATUS_MAX_ENUM_KHR = 0x7FFFFFFF
} VkQueryResultStatusKHR;
----

In general, when retrieving the result status of a query, negative values indicate some sort of failure (unsuccessful completion of operations) and positive values indicate success.


=== Device Memory Management

In this extension the application has complete control over how and when system resources are used. This extension provides the following tools to enable optimal usage of device and host memory resources:

 * The application can manage the number of allocated output and input pictures, and can dynamically grow or shrink the DPB holding the reference pictures, based on the changing video content requirements.
 * Individual video picture resources can be shared across different contexts, e.g. reference pictures can be shared between video decoding and encoding workloads, and the output of a video decode operation can be used as an input to a video encode operation.
 * The images backing the video picture resources can also be used in other non-video-related operations, e.g. video decode operations may directly output to presentable swapchain images, or to images that can be subsequently sampled by graphics operations, subject to appropriate implementation capabilities.
 * The application can also use sparse memory bindings for the images backing the video picture resources. The use of sparse memory bindings allows the application to unbind the device memory backing of the images when the corresponding DPB slot is not in active use.

These general Vulkan capabilities enable this extension to provide seamless and efficient integration across different types of workloads in a "zero-copy" fashion and with minimal synchronization overhead.


=== Resource Creation

This extension stores video picture resources in image objects. As the device memory requirements of video picture resources may be specific to the video profile used, when creating images with any video-specific usage the application has to provide information about the video profiles the image will be used with. As a single image may be reused across video sessions using different video profiles (e.g.
to use the decoded output picture as an input picture to subsequent encode operations), the following new structure is introduced to provide a list of video profiles:

[source,c]
----
typedef struct VkVideoProfileListInfoKHR {
    VkStructureType                 sType;
    const void*                     pNext;
    uint32_t                        profileCount;
    const VkVideoProfileInfoKHR*    pProfiles;
} VkVideoProfileListInfoKHR;
----

As multiple profiles are expected to be specified only in video transcoding use cases, the list can include at most one video decode profile and one or more video encode profiles.

When an instance of this structure is included in the `pNext` chain of the `VkImageCreateInfo` passed to a `vkCreateImage` call, the created image will be usable in video coding operations recorded against video sessions using any of the specified video profiles.

Similarly, buffers used as the backing store for video bitstreams have to be created with the `pNext` chain of `VkBufferCreateInfo` including a profile list structure when calling `vkCreateBuffer` in order to make the resulting buffer compatible with video sessions using any of the specified video profiles.

Query pools are also video-profile-specific. In particular, in order to create a `VK_QUERY_TYPE_RESULT_STATUS_ONLY_KHR` query pool compatible with a particular video profile, the application has to include an instance of the `VkVideoProfileInfoKHR` structure in the `pNext` chain of `VkQueryPoolCreateInfo`. Unlike buffers and images, query pools are not reusable across video sessions using different video profiles, hence the structure used is `VkVideoProfileInfoKHR` instead of `VkVideoProfileListInfoKHR`.


=== Protected Content Support

This extension also enables support of video coding operations using protected content. Whether a particular implementation supports coding protected content is indicated by the `VK_VIDEO_CAPABILITY_PROTECTED_CONTENT_BIT_KHR` capability flag.

Just like in all other Vulkan operations using protected content, the resources participating in those must either all be protected or all be unprotected. This applies to the command buffer (and the command pool it is allocated from), to the queue the command buffer is submitted to, to the buffers and images used within those command buffers, as well as to the video session objects used for video coding.

If the `VK_VIDEO_CAPABILITY_PROTECTED_CONTENT_BIT_KHR` capability flag is supported, the application can create protected-capable video sessions using the `VK_VIDEO_SESSION_CREATE_PROTECTED_CONTENT_BIT_KHR` flag.


=== Capabilities

The generic capabilities of the implementation for a given video profile can be queried using the following new command:

[source,c]
----
VKAPI_ATTR VkResult VKAPI_CALL vkGetPhysicalDeviceVideoCapabilitiesKHR(
    VkPhysicalDevice                            physicalDevice,
    const VkVideoProfileInfoKHR*                pVideoProfile,
    VkVideoCapabilitiesKHR*                     pCapabilities);
----

The output structure contains only common capabilities that are relevant for all video profiles:

[source,c]
----
typedef struct VkVideoCapabilitiesKHR {
    VkStructureType              sType;
    void*                        pNext;
    VkVideoCapabilityFlagsKHR    flags;
    VkDeviceSize                 minBitstreamBufferOffsetAlignment;
    VkDeviceSize                 minBitstreamBufferSizeAlignment;
    VkExtent2D                   pictureAccessGranularity;
    VkExtent2D                   minCodedExtent;
    VkExtent2D                   maxCodedExtent;
    uint32_t                     maxDpbSlots;
    uint32_t                     maxActiveReferencePictures;
    VkExtensionProperties        stdHeaderVersion;
} VkVideoCapabilitiesKHR;
----

In particular, it contains information about the following:

 * Buffer offset and (range) size alignment requirements of the video bitstream buffer ranges
 * Access granularity of video picture resources
 * Minimum and maximum size of coded pictures
 * Maximum number of DPB slots and active reference pictures
 * Name and maximum supported version of the
   codec-specific video std headers

While these capabilities are generic, each video profile may report its own values for them. In addition, layered extensions will include additional capabilities specific to the type of video coding operation and video compression standard.

The picture access granularity is something the application has to pay particular attention to. Video coding hardware can often access memory only at a particular granularity (block size) that may span multiple rows or columns of the picture data. This means that when a video coding operation writes data to a video picture resource it is possible that texels outside of the effective extents of the picture will also get modified. Writes to such padding texels will result in undefined texel values, thus the application has to make sure not to assume any particular values in these "shoulder" areas. This is especially important when the application chooses to reuse the same video picture resources to process video frames larger than the resource was previously used with. To avoid reading undefined values in such cases, applications should clear the image subresources used as video picture resources when the resolution of the video content changes, or otherwise ensure that these padding texels contain well-defined data (e.g. by writing to them) before being read from.

Besides the global capabilities of a video profile, the set of image formats usable with video coding operations is also specific to each video profile.
The following new query enables the application to enumerate the list and properties of the image formats supported by a given set of video profiles:

[source,c]
----
VKAPI_ATTR VkResult VKAPI_CALL vkGetPhysicalDeviceVideoFormatPropertiesKHR(
    VkPhysicalDevice                            physicalDevice,
    const VkPhysicalDeviceVideoFormatInfoKHR*   pVideoFormatInfo,
    uint32_t*                                   pVideoFormatPropertyCount,
    VkVideoFormatPropertiesKHR*                 pVideoFormatProperties);
----

The input to this query includes the needed image usage flags, which typically include some video-specific usage flags, and the list of video profiles provided through a `VkVideoProfileListInfoKHR` structure included in the `pNext` chain of the following new structure:

[source,c]
----
typedef struct VkPhysicalDeviceVideoFormatInfoKHR {
    VkStructureType      sType;
    const void*          pNext;
    VkImageUsageFlags    imageUsage;
} VkPhysicalDeviceVideoFormatInfoKHR;
----

The query returns the following new output structure:

[source,c]
----
typedef struct VkVideoFormatPropertiesKHR {
    VkStructureType       sType;
    void*                 pNext;
    VkFormat              format;
    VkComponentMapping    componentMapping;
    VkImageCreateFlags    imageCreateFlags;
    VkImageType           imageType;
    VkImageTiling         imageTiling;
    VkImageUsageFlags     imageUsageFlags;
} VkVideoFormatPropertiesKHR;
----

Alongside the format and the supported image creation values/flags, `componentMapping` indicates how the video coding operations interpret the individual components of video picture resources using this format.
For example, if the implementation produces video decode output with the `VK_FORMAT_G8_B8R8_2PLANE_420_UNORM` format where the blue and red chrominance channels are swapped, then `componentMapping` will have the following values:

[source,c]
----
components.r = VK_COMPONENT_SWIZZLE_B;        // Cb component
components.g = VK_COMPONENT_SWIZZLE_IDENTITY; // Y component
components.b = VK_COMPONENT_SWIZZLE_R;        // Cr component
components.a = VK_COMPONENT_SWIZZLE_IDENTITY; // unused, defaults to 1.0
----

The query may return multiple `VkVideoFormatPropertiesKHR` entries with the same format, but otherwise different values for other members (e.g. with different image type or image tiling). In addition, a different set of entries may be returned depending on the input image usage flags specified, even for the same set of video profiles, for example, based on whether input, output, or DPB usage is requested.

The application can select the parameters from a returned entry and use compatible parameters when creating images to be used as video picture resources with any of the video profiles provided in the input list.


== Examples

=== Select queue family with support for a given video codec operation and result status queries

[source,c]
----
VkVideoCodecOperationFlagBitsKHR neededVideoCodecOp = ...
uint32_t queueFamilyIndex;
uint32_t queueFamilyCount;

vkGetPhysicalDeviceQueueFamilyProperties2(physicalDevice, &queueFamilyCount, NULL);

VkQueueFamilyProperties2* props = calloc(queueFamilyCount,
    sizeof(VkQueueFamilyProperties2));
VkQueueFamilyVideoPropertiesKHR* videoProps = calloc(queueFamilyCount,
    sizeof(VkQueueFamilyVideoPropertiesKHR));
VkQueueFamilyQueryResultStatusPropertiesKHR* queryResultStatusProps = calloc(queueFamilyCount,
    sizeof(VkQueueFamilyQueryResultStatusPropertiesKHR));

for (queueFamilyIndex = 0; queueFamilyIndex < queueFamilyCount; ++queueFamilyIndex) {
    props[queueFamilyIndex].sType = VK_STRUCTURE_TYPE_QUEUE_FAMILY_PROPERTIES_2;
    props[queueFamilyIndex].pNext = &videoProps[queueFamilyIndex];

    videoProps[queueFamilyIndex].sType = VK_STRUCTURE_TYPE_QUEUE_FAMILY_VIDEO_PROPERTIES_KHR;
    videoProps[queueFamilyIndex].pNext = &queryResultStatusProps[queueFamilyIndex];

    queryResultStatusProps[queueFamilyIndex].sType = VK_STRUCTURE_TYPE_QUEUE_FAMILY_QUERY_RESULT_STATUS_PROPERTIES_KHR;
}

vkGetPhysicalDeviceQueueFamilyProperties2(physicalDevice, &queueFamilyCount, props);

for (queueFamilyIndex = 0; queueFamilyIndex < queueFamilyCount; ++queueFamilyIndex) {
    if ((videoProps[queueFamilyIndex].videoCodecOperations & neededVideoCodecOp) != 0 &&
        (queryResultStatusProps[queueFamilyIndex].queryResultStatusSupport == VK_TRUE)) {
        break;
    }
}

if (queueFamilyIndex < queueFamilyCount) {
    // Found appropriate queue family
    ...
} else {
    // Did not find a queue family with the needed capabilities
    ...
}
----


=== Check support and query the capabilities for a video profile

[source,c]
----
VkResult result;

VkVideoProfileInfoKHR profileInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_PROFILE_INFO_KHR,
    .pNext = ...
        // pointer to additional profile information structures specific to the codec and use case
    .videoCodecOperation = ... // used video codec operation
    .chromaSubsampling = VK_VIDEO_CHROMA_SUBSAMPLING_420_BIT_KHR,
    .lumaBitDepth = VK_VIDEO_COMPONENT_BIT_DEPTH_8_BIT_KHR,
    .chromaBitDepth = VK_VIDEO_COMPONENT_BIT_DEPTH_8_BIT_KHR
};

VkVideoCapabilitiesKHR capabilities = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_CAPABILITIES_KHR,
    .pNext = ... // pointer to additional capability structures specific to the type of video coding operation and codec
};

result = vkGetPhysicalDeviceVideoCapabilitiesKHR(physicalDevice, &profileInfo, &capabilities);

if (result == VK_SUCCESS) {
    // Profile is supported, check additional capabilities
    ...
} else {
    // Profile is not supported, result provides additional information about why
    ...
}
----


=== Enumerate supported formats for a video profile with a given usage

[source,c]
----
uint32_t formatCount;

VkVideoProfileInfoKHR profileInfo = {
    ...
};

VkVideoProfileListInfoKHR profileListInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_PROFILE_LIST_INFO_KHR,
    .pNext = NULL,
    .profileCount = 1,
    .pProfiles = &profileInfo
};
// NOTE: Add any additional profiles to the list for e.g. video transcoding use cases

VkPhysicalDeviceVideoFormatInfoKHR formatInfo = {
    .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VIDEO_FORMAT_INFO_KHR,
    .pNext = &profileListInfo,
    .imageUsage = ... // expected image usage, e.g.
                      // DPB, input, or output
};

vkGetPhysicalDeviceVideoFormatPropertiesKHR(physicalDevice, &formatInfo, &formatCount, NULL);

VkVideoFormatPropertiesKHR* formatProps = calloc(formatCount, sizeof(VkVideoFormatPropertiesKHR));

for (uint32_t i = 0; i < formatCount; ++i) {
    // Each output element needs its own sType initialized before the second call
    formatProps[i].sType = VK_STRUCTURE_TYPE_VIDEO_FORMAT_PROPERTIES_KHR;
}

vkGetPhysicalDeviceVideoFormatPropertiesKHR(physicalDevice, &formatInfo, &formatCount, formatProps);

for (uint32_t i = 0; i < formatCount; ++i) {
    // Find format and image creation capabilities best suited for the use case
    ...
}
----


=== Create video session for a video profile

[source,c]
----
VkVideoSessionKHR videoSession = VK_NULL_HANDLE;

VkVideoSessionCreateInfoKHR createInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_SESSION_CREATE_INFO_KHR,
    .pNext = NULL,
    .queueFamilyIndex = ... // index of queue family that supports the video codec operation
    .flags = 0,
    .pVideoProfile = ... // pointer to video profile information structure chain
    .pictureFormat = ... // image format to use for input/output pictures
    .maxCodedExtent = ... // maximum extent of coded pictures supported by the session
    .referencePictureFormat = ... // image format to use for reference pictures (if used)
    .maxDpbSlots = ... // DPB slot capacity to use (if needed)
    .maxActiveReferencePictures = ... // maximum number of reference pictures used by any operation (if needed)
    .pStdHeaderVersion = ...
        // pointer to the video std header information (typically the same as reported in the capabilities)
};

vkCreateVideoSessionKHR(device, &createInfo, NULL, &videoSession);
----


=== Query memory requirements and bind memory to a video session

[source,c]
----
uint32_t memReqCount;

vkGetVideoSessionMemoryRequirementsKHR(device, videoSession, &memReqCount, NULL);

VkVideoSessionMemoryRequirementsKHR* memReqs = calloc(memReqCount, sizeof(VkVideoSessionMemoryRequirementsKHR));

for (uint32_t i = 0; i < memReqCount; ++i) {
    // Each output element needs its own sType initialized before the second call
    memReqs[i].sType = VK_STRUCTURE_TYPE_VIDEO_SESSION_MEMORY_REQUIREMENTS_KHR;
}

vkGetVideoSessionMemoryRequirementsKHR(device, videoSession, &memReqCount, memReqs);

for (uint32_t i = 0; i < memReqCount; ++i) {
    // Allocate memory compatible with the given memory binding
    VkDeviceMemory memory = ...

    // Bind the memory to the memory binding
    VkBindVideoSessionMemoryInfoKHR bindInfo = {
        .sType = VK_STRUCTURE_TYPE_BIND_VIDEO_SESSION_MEMORY_INFO_KHR,
        .pNext = NULL,
        .memoryBindIndex = memReqs[i].memoryBindIndex,
        .memory = ... // memory object to bind
        .memoryOffset = ... // offset to bind
        .memorySize = ... // size to bind
    };

    vkBindVideoSessionMemoryKHR(device, videoSession, 1, &bindInfo);
}
// NOTE: Alternatively, all memory bindings can be bound with a single call
----


=== Create and update video session parameters objects

[source,c]
----
VkVideoSessionParametersKHR videoSessionParams = VK_NULL_HANDLE;

VkVideoSessionParametersCreateInfoKHR createInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_SESSION_PARAMETERS_CREATE_INFO_KHR,
    .pNext = ... // pointer to codec-specific parameters creation information
    .flags = 0,
    .videoSessionParametersTemplate = ...
        // template to use or VK_NULL_HANDLE
    .videoSession = videoSession
};

vkCreateVideoSessionParametersKHR(device, &createInfo, NULL, &videoSessionParams);

...

VkVideoSessionParametersUpdateInfoKHR updateInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_SESSION_PARAMETERS_UPDATE_INFO_KHR,
    .pNext = ... // pointer to codec-specific parameters update information
    .updateSequenceCount = 1 // incremented for each subsequent update
};

// The object is passed by handle, not by pointer
vkUpdateVideoSessionParametersKHR(device, videoSessionParams, &updateInfo);
----


=== Create bitstream buffer

[source,c]
----
VkBuffer buffer = VK_NULL_HANDLE;

VkVideoProfileListInfoKHR profileListInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_PROFILE_LIST_INFO_KHR,
    .pNext = NULL,
    .profileCount = ... // number of video profiles to use the bitstream buffer with
    .pProfiles = ... // pointer to an array of video profile information structure chains
};

VkBufferCreateInfo createInfo = {
    .sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO,
    .pNext = &profileListInfo,
    ... // buffer creation parameters including one or more video-specific usage flags
};

vkCreateBuffer(device, &createInfo, NULL, &buffer);
----


=== Create image and image view backing video picture resources

[source,c]
----
VkImage image = VK_NULL_HANDLE;
VkImageView imageView = VK_NULL_HANDLE;

VkVideoProfileListInfoKHR profileListInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_PROFILE_LIST_INFO_KHR,
    .pNext = NULL,
    .profileCount = ... // number of video profiles to use the image with
    .pProfiles = ... // pointer to an array of video profile information structure chains
};

VkImageCreateInfo imageCreateInfo = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO,
    .pNext = &profileListInfo,
    ...
        // image creation parameters including one or more video-specific usage flags
};

vkCreateImage(device, &imageCreateInfo, NULL, &image);

VkImageViewUsageCreateInfo imageViewUsageInfo = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_USAGE_CREATE_INFO,
    .pNext = NULL,
    .usage = ... // video-specific usage flags
};

VkImageViewCreateInfo imageViewCreateInfo = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO,
    .pNext = &imageViewUsageInfo,
    .flags = 0,
    .image = image,
    .viewType = ... // image view type (only 2D or 2D_ARRAY is supported)
    ... // other image view creation parameters
};

vkCreateImageView(device, &imageViewCreateInfo, NULL, &imageView);
----


=== Record video coding operations into command buffers

[source,c]
----
VkCommandBuffer commandBuffer = ... // allocate command buffer for a queue family supporting the video profile

vkBeginCommandBuffer(commandBuffer, ...);
...

// Begin video coding scope with given video session, parameters, and reference picture resources
VkVideoBeginCodingInfoKHR beginInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_BEGIN_CODING_INFO_KHR,
    .pNext = NULL,
    .flags = 0,
    .videoSession = videoSession,
    .videoSessionParameters = videoSessionParams,
    .referenceSlotCount = ...
    .pReferenceSlots = ...
};

vkCmdBeginVideoCodingKHR(commandBuffer, &beginInfo);

// Reset video session before starting to use it for video coding operations
// (only needed when starting to process a new video stream)
VkVideoCodingControlInfoKHR controlInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_CODING_CONTROL_INFO_KHR,
    .pNext = NULL,
    .flags = VK_VIDEO_CODING_CONTROL_RESET_BIT_KHR
};

vkCmdControlVideoCodingKHR(commandBuffer, &controlInfo);

// Issue video coding operations against the video session
...

// End video coding scope
VkVideoEndCodingInfoKHR endInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_END_CODING_INFO_KHR,
    .pNext = NULL,
    .flags = 0
};

vkCmdEndVideoCodingKHR(commandBuffer, &endInfo);

...
vkEndCommandBuffer(commandBuffer);
----


=== Create and use result status query pool with a video session

[source,c]
----
VkQueryPool queryPool = VK_NULL_HANDLE;

VkVideoProfileInfoKHR profileInfo = {
    ...
};

VkQueryPoolCreateInfo createInfo = {
    .sType = VK_STRUCTURE_TYPE_QUERY_POOL_CREATE_INFO,
    .pNext = &profileInfo,
    .flags = 0,
    .queryType = VK_QUERY_TYPE_RESULT_STATUS_ONLY_KHR,
    ...
};

vkCreateQueryPool(device, &createInfo, NULL, &queryPool);

...
vkBeginCommandBuffer(commandBuffer, ...);
...
vkCmdBeginVideoCodingKHR(commandBuffer, ...);
...
vkCmdBeginQuery(commandBuffer, queryPool, 0, 0);
// Issue video coding operation
...
vkCmdEndQuery(commandBuffer, queryPool, 0);
...
vkCmdEndVideoCodingKHR(commandBuffer, ...);
...
vkEndCommandBuffer(commandBuffer);
...

VkQueryResultStatusKHR status;
vkGetQueryPoolResults(device, queryPool, 0, 1,
                      sizeof(status), &status, sizeof(status),
                      VK_QUERY_RESULT_WITH_STATUS_BIT_KHR);

if (status == VK_QUERY_RESULT_STATUS_NOT_READY_KHR /* 0 */) {
    // Query result not ready yet
    ...
} else if (status > 0) {
    // Video coding operation was successful, enum values indicate specific success status code
    ...
} else if (status < 0) {
    // Video coding operation was unsuccessful, enum values indicate specific failure status code
    ...
}
----


== Issues

=== RESOLVED: What is within the scope of this extension?

The goal of this extension is to include all infrastructure APIs that are shareable across all video coding use cases, including video decoding and video encoding, independent of the video compression standard used. While there is a large set of parameters and semantics that are specific to the particular video coding operation and video codec used, many fundamental concepts and APIs are common across those, including:

 * The concept of video profiles that describe the video content and video coding use cases
 * The concept of video picture resources and decoded picture buffers
 * Queries that allow the application to determine if a video profile is supported, the capabilities of each video profile, and the supported video picture resource formats that can be used in conjunction with particular sets of video profiles
 * Video session objects that provide the device state context for video coding operations
 * Video session parameters objects that provide the means to reuse large sets of codec-specific parameters across video coding operations
 * General command buffer commands and semantics to build command sequences working on video streams using a video session
 * Feedback mechanisms that enable tracking the status of individual video coding operations

These APIs are designed to be used in conjunction with layered extensions that introduce support for specific video coding operations and video compression standards.


=== RESOLVED: Are Vulkan video profiles equivalent to the corresponding concepts of video compression standards?

Not exactly. While they do encompass actual video compression standard profile information, they also contain other information related to the type of the video content and additional use case scenario specific information.

The video coding operation and the video compression standard used are identified by bits in the new `VkVideoCodecOperationFlagBitsKHR` type.
While this extension does not define any valid values, layered codec-specific extensions are expected to add corresponding bits in the form `VK_VIDEO_CODEC_OPERATION_<operationType>_<codec>_BIT`.


=== RESOLVED: Do we need a query to be able to enumerate all supported video profiles?

Enumerating individual video profiles is a non-trivial problem due to the parameter combinatorics and the interaction between individual parameters. As Vulkan video profiles also include additional use case scenario specific information, it gets even more complicated. It is also expected that most use cases (especially video decoding) will want to target specific video profiles anyway, so this extension does not include an enumeration API for video profiles; rather, it provides the mechanisms to determine support for specific ones. Nonetheless, a more generic enumeration API is being considered for inclusion in future extensions.


=== RESOLVED: Do we need queries that allow determining how multiple video profiles can be used in conjunction?

Video transcoding is an important use case, so this extension does allow queries and other APIs to take a list of video profiles, when applicable, that enable the application to determine how to use a particular set of video decode and video encode profiles in conjunction, and thus support video transcoding without the need to copy video picture data, when possible.


=== RESOLVED: What kind of capability queries do we need?

First, this extension enables the application to query the video codec operations supported by each queue family with the new output structure `VkQueueFamilyVideoPropertiesKHR`.

Second, the new `vkGetPhysicalDeviceVideoCapabilitiesKHR` command enables checking support for individual video profiles, and querying their general capabilities.
This API also enables layered extensions to add new output structures to retrieve additional capabilities specific to the used video coding operation and video compression standard.

Besides those, as the set of image formats and other image creation parameters compatible with video coding varies across video profiles, the new `vkGetPhysicalDeviceVideoFormatPropertiesKHR` command is introduced to query the set of image parameters that are compatible with a given set of video profiles and usage. In addition, the existing `vkGetPhysicalDeviceImageFormatProperties2` command is also extended to be able to take a list of video profiles as input to query video-specific image format capabilities.


=== RESOLVED: What kind of command buffer commands do we need?

This extension does not introduce any specific video coding operations (e.g. video decode or encode operations). However, it does introduce a set of command buffer commands that enable defining scopes within command buffers where layered extensions can record video coding operations against a specific video session to process a video sequence. These video coding scopes are delimited by the new `vkCmdBeginVideoCodingKHR` and `vkCmdEndVideoCodingKHR` commands.

In addition, the `vkCmdControlVideoCodingKHR` command is introduced to allow layered extensions to modify dynamic context state, and control video session state in general.


=== RESOLVED: How can the application get feedback about the status of video coding operations?

This extension uses queries for this purpose and introduces a new query type (`VK_QUERY_TYPE_RESULT_STATUS_ONLY_KHR`) that only includes status information. Layered extensions may also introduce other query types to enable retrieving any additional feedback that may be needed in the specific video coding use case.
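
As a rough illustration of how an application might interpret the status values returned by such queries, the following sketch classifies raw status values by sign, following the convention described for `VkQueryResultStatusKHR` (negative means failure, zero means not ready, positive means success). The constants are local stand-ins mirroring the values listed in this proposal so that the sketch compiles without the Vulkan headers, and `OperationOutcome`, `classify_status`, and `count_failed` are hypothetical application-side names, not part of the API.

```c
#include <stdint.h>

/* Local stand-ins mirroring the VkQueryResultStatusKHR values quoted in this
 * proposal, so that the sketch compiles without the Vulkan headers. */
enum {
    STATUS_ERROR     = -1,
    STATUS_NOT_READY = 0,
    STATUS_COMPLETE  = 1
};

/* Hypothetical application-side view of a single video coding operation. */
typedef enum OperationOutcome {
    OUTCOME_PENDING,
    OUTCOME_SUCCEEDED,
    OUTCOME_FAILED
} OperationOutcome;

/* Classify by sign only, so that any additional positive or negative
 * status codes are still handled. */
OperationOutcome classify_status(int32_t status)
{
    if (status == STATUS_NOT_READY) {
        return OUTCOME_PENDING;
    }
    return (status > 0) ? OUTCOME_SUCCEEDED : OUTCOME_FAILED;
}

/* Count the failed operations among `count` status values, one per query. */
uint32_t count_failed(const int32_t *statuses, uint32_t count)
{
    uint32_t failed = 0;
    for (uint32_t i = 0; i < count; ++i) {
        if (classify_status(statuses[i]) == OUTCOME_FAILED) {
            ++failed;
        }
    }
    return failed;
}
```

Classifying by sign rather than by exact enumerant is a design choice of this sketch: it keeps the application code tolerant of further status codes that extensions may add.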

Such queries can be issued within video coding scopes using the existing `vkCmdBeginQuery` and `vkCmdEndQuery` commands (and their variants), however, the behavior of queries within video coding scopes is slightly different. Instead of a single query capturing the overall result of a series of commands, queries in video coding scopes produce separate results for each video coding operation, hence multiple video coding operations need to consume a separate query slot each.


=== RESOLVED: Do we need to introduce new `vkCmdBeginQueryRangeKHR` and `vkCmdEndQueryRangeKHR` commands to allow capturing feedback about multiple video coding operations using a single scope?

Not in this extension. For now each layered extension is expected to introduce commands that result in the issue of only a single video coding operation, hence using the existing `vkCmdBeginQuery` and `vkCmdEndQuery` commands to surround each such command separately is sufficient. However, future extensions may introduce such commands if needed.


=== RESOLVED: Can resources be shared across video sessions, potentially ones using different video profiles?

Yes, we need to support resource sharing at least for video bitstream buffers and video picture resources. This is important for the purposes of supporting efficient video transcoding.

Subject to the capabilities of the implementation, buffers and image resources can be created to be shareable across video sessions by including the list of video profiles used by each video session in the object creation parameters.

Query pools, however, are always specific to a video profile, as there is little use in sharing them across video sessions, and typically the contents of the query results are specific to the used video profile anyway.


=== RESOLVED: How are video coding operations synchronized with respect to other Vulkan operations?

Synchronization works in the same way as elsewhere in the API. Command buffers targeting video-capable queues can use `vkCmdPipelineBarrier` or any of the other synchronization commands both inside and outside of video coding scopes. While this extension does not include any new pipeline stages, access flags, or image layouts, the layered extensions introducing particular video coding operations do.


=== RESOLVED: Why do some of the members of `VkVideoProfileInfoKHR` have `Flags` types instead of `FlagBits` types when only a single bit can be used?

While this extension allows specifying only a single bit in the `chromaSubsampling`, `lumaBitDepth`, and `chromaBitDepth` members of `VkVideoProfileInfoKHR`, it is expected that future extensions may relax those requirements.


=== RESOLVED: Can the application create video sessions with any `maxDpbSlots` and `maxActiveReferencePictures` values within the supported capabilities?

Yes. It is quite common for video compression standards to define these values: a given video profile usually supports a specific number of DPB slots, and video compression standards typically allow all reference pictures associated with active DPB slots to be used as active reference pictures in a video coding operation. However, depending on the specific use case, the application can choose to use lower values.

For example, if the application knows that the video content always uses at most a single reference picture for each frame, and that it only ever uses a single DPB slot, using `1` as the value for both `maxDpbSlots` and `maxActiveReferencePictures` can enable the application to limit the memory requirements of the DPB.

Nonetheless, it is the application's responsibility to make sure that it creates video sessions with appropriate values to be able to handle the video content at hand.
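
The selection described above could be sketched as follows. This is a minimal illustration, assuming the application already knows how many DPB slots and active reference pictures its content needs; `DpbConfig` and `choose_dpb_config` are hypothetical names introduced only for this example, and the capability limits are passed as plain integers (as would be read from the `maxDpbSlots` and `maxActiveReferencePictures` members of `VkVideoCapabilitiesKHR`) so that the sketch compiles without the Vulkan headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the two session creation values discussed above. */
typedef struct DpbConfig {
    uint32_t maxDpbSlots;
    uint32_t maxActiveReferencePictures;
} DpbConfig;

/* Returns true and fills *out when the content's needs fit within the
 * reported capabilities; returns false (leaving *out untouched) otherwise.
 * The chosen values are exactly what the content needs, which may be lower
 * than the capability limits. */
bool choose_dpb_config(uint32_t capsMaxDpbSlots,
                       uint32_t capsMaxActiveReferencePictures,
                       uint32_t neededDpbSlots,
                       uint32_t neededActiveReferencePictures,
                       DpbConfig *out)
{
    if (neededDpbSlots > capsMaxDpbSlots ||
        neededActiveReferencePictures > capsMaxActiveReferencePictures) {
        return false;
    }
    out->maxDpbSlots = neededDpbSlots;
    out->maxActiveReferencePictures = neededActiveReferencePictures;
    return true;
}
```

For the single-reference example above, calling `choose_dpb_config(caps.maxDpbSlots, caps.maxActiveReferencePictures, 1, 1, &config)` would yield `1` for both values on any implementation that supports at least one DPB slot and one active reference picture.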


=== RESOLVED: Are `VkVideoSessionParametersKHR` objects internally or externally synchronized?

Video session parameters objects have special synchronization requirements. Typically they will only get updated by a single thread that processes the video stream but they may be consumed concurrently by multiple command buffer recording threads.

Accordingly, they are defined to be logically internally synchronized, but in practice concurrent updates of the same object are disallowed by the requirement that the application has to increment the update sequence counter of the object with each update call. This model enables implementations to allow concurrent consumption of already stored parameters with minimal to no synchronization overhead.


== Further Functionality

This extension is meant to provide only common video coding functionality, thus support for individual video coding operations and video compression standards is left for extensions layered on top of the infrastructure provided here.

Currently the following layered extensions are available:

 * `VK_KHR_video_decode_queue` - adds general support for video decode operations
 * `VK_KHR_video_decode_h264` - adds support for decoding H.264/AVC video sequences
 * `VK_KHR_video_decode_h265` - adds support for decoding H.265/HEVC video sequences
 * `VK_KHR_video_encode_queue` (provisional) - adds general support for video encode operations
 * `VK_EXT_video_encode_h264` (provisional) - adds support for encoding H.264/AVC video sequences
 * `VK_EXT_video_encode_h265` (provisional) - adds support for encoding H.265/HEVC video sequences
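
Because the functionality is layered, an application typically has to confirm that the base extension and every layered extension it relies on are all available before using them. The following is a minimal sketch of such a check over a plain list of extension name strings (for example, names copied out of the results of `vkEnumerateDeviceExtensionProperties`); `has_extension` and `has_all_extensions` are hypothetical helper names, and the sketch deliberately avoids the Vulkan headers so that it stays self-contained.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if `wanted` appears in the list of available extension names. */
bool has_extension(const char *const *available, size_t availableCount,
                   const char *wanted)
{
    for (size_t i = 0; i < availableCount; ++i) {
        if (strcmp(available[i], wanted) == 0) {
            return true;
        }
    }
    return false;
}

/* Returns true only if every required extension name is available. */
bool has_all_extensions(const char *const *available, size_t availableCount,
                        const char *const *required, size_t requiredCount)
{
    for (size_t i = 0; i < requiredCount; ++i) {
        if (!has_extension(available, availableCount, required[i])) {
            return false;
        }
    }
    return true;
}
```

For instance, an H.264 decoding application would pass `VK_KHR_video_queue`, `VK_KHR_video_decode_queue`, and `VK_KHR_video_decode_h264` as the required list.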