// Copyright 2021-2024 The Khronos Group Inc.
//
// SPDX-License-Identifier: CC-BY-4.0

= VK_KHR_video_encode_queue
:toc: left
:refpage: https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/
:sectnums:

This document outlines a proposal to enable performing video encode operations in Vulkan.

== Problem Statement

Integrating video encode operations into Vulkan applications enables a wide set of new usage scenarios including, but not limited to, the following examples:

  * Recording the output of rendering operations
  * Efficiently transferring rendering results over a network (video conferencing, game streaming, etc.)

It is also not uncommon for Vulkan capable devices to feature dedicated hardware acceleration for video compression.

The goal of this proposal is to enable these use cases, expose the underlying hardware capabilities, and provide tight integration with other functionalities of the Vulkan API.


== Solution Space

The following options have been considered:

  1. Rely on external sharing capabilities to interact with existing video encode APIs
  2. Add new dedicated APIs to Vulkan specific to video encoding
  3. Build upon a common set of APIs that enable video coding operations in general

As discussed in the proposal for the `VK_KHR_video_queue` extension, reusing a common, shared infrastructure across all video coding functionalities that leverage existing Vulkan capabilities was preferred, hence this extension follows option 3.

As a further sub-option, it was considered whether a common set of APIs could be used to enable video encoding in general, upon which codec-specific extensions can be built. As API reuse is as feasible within the domain of video encoding as it is for video coding in general, this proposal follows the same principle and extends `VK_KHR_video_queue` with codec-independent video encoding capabilities.



== Proposal

=== Video Encode Queues

While `VK_KHR_video_queue` already includes support for a more fine-grained query to determine the set of supported video codec operations for a given queue family, this extension introduces an explicit queue flag called `VK_QUEUE_VIDEO_ENCODE_BIT_KHR` to indicate support for video encoding.

Applications can use this flag bit to identify video encode capable queue families in general, if needed, before querying more details about the individual video codec operations supported through the use of the `VkQueueFamilyVideoPropertiesKHR` structure. It also indicates support for the set of command buffer commands available on video encode queues, which include the following:

  * Pipeline barrier and event handling commands used for synchronization
  * Basic query commands to begin, end, and reset queries
  * Timestamp write commands
  * Generic video coding commands
  * The new video encode command introduced by this extension

For the full list of individual commands supported by video encode queues, and whether any command is supported inside/outside of video coding scopes, refer to the manual page of the corresponding command.


=== Video Encode Profiles

Video encode profiles are defined using a `VkVideoProfileInfoKHR` structure that specifies a `videoCodecOperation` value identifying a video encode operation. This extension does not introduce any video encode operation flags, as that is left to the codec-specific encode extensions.

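
To illustrate how the queue flag described under Video Encode Queues and the `videoCodecOperation` of a video profile fit together, the following sketch picks a queue family that advertises video encoding in general and also supports a given codec operation. It is a minimal sketch under stated assumptions, not part of the proposal: `DemoQueueFamily`, `DEMO_QUEUE_VIDEO_ENCODE_BIT`, `DEMO_NO_QUEUE_FAMILY`, and `demo_find_encode_queue_family` are hypothetical stand-ins so that the example compiles without the Vulkan headers. A real application would obtain this data from `vkGetPhysicalDeviceQueueFamilyProperties2` with a chained `VkQueueFamilyVideoPropertiesKHR` structure and test `VK_QUEUE_VIDEO_ENCODE_BIT_KHR`.

[source,c]
----
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-queue-family data an application
 * would query; not a Vulkan type. */
typedef struct DemoQueueFamily {
    uint32_t queueFlags;           /* plays the role of VkQueueFamilyProperties::queueFlags */
    uint32_t videoCodecOperations; /* plays the role of VkQueueFamilyVideoPropertiesKHR::videoCodecOperations */
} DemoQueueFamily;

/* Assumed to match the value of VK_QUEUE_VIDEO_ENCODE_BIT_KHR. */
#define DEMO_QUEUE_VIDEO_ENCODE_BIT 0x00000040u
#define DEMO_NO_QUEUE_FAMILY        UINT32_MAX

/* Returns the index of the first queue family that has the encode queue
 * flag and supports every bit in codecOperation, or DEMO_NO_QUEUE_FAMILY.
 * Passing codecOperation == 0 only checks the queue flag. */
static uint32_t demo_find_encode_queue_family(const DemoQueueFamily *families,
                                              uint32_t count,
                                              uint32_t codecOperation)
{
    assert(families != NULL || count == 0);
    for (uint32_t i = 0; i < count; ++i) {
        if ((families[i].queueFlags & DEMO_QUEUE_VIDEO_ENCODE_BIT) == 0) {
            continue; /* not a video encode capable queue family */
        }
        if ((families[i].videoCodecOperations & codecOperation) == codecOperation) {
            return i;
        }
    }
    return DEMO_NO_QUEUE_FAMILY;
}
----

Checking the coarse queue flag first and the codec operation second mirrors the order described in the text: the flag identifies encode capable queue families in general, and the finer-grained query narrows the choice down to a specific codec.
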

On the other hand, this extension allows the application to specify usage information specific to video encoding by chaining the following new structure to `VkVideoProfileInfoKHR`:

[source,c]
----
typedef struct VkVideoEncodeUsageInfoKHR {
    VkStructureType                 sType;
    const void*                     pNext;
    VkVideoEncodeUsageFlagsKHR      videoUsageHints;
    VkVideoEncodeContentFlagsKHR    videoContentHints;
    VkVideoEncodeTuningModeKHR      tuningMode;
} VkVideoEncodeUsageInfoKHR;
----

This structure contains two hints specific to the encoding use case and the content to be encoded, respectively, as well as a tuning mode.

The usage hint flags introduced by this extension are as follows:

  * `VK_VIDEO_ENCODE_USAGE_TRANSCODING_BIT_KHR` should be used in video transcoding use cases
  * `VK_VIDEO_ENCODE_USAGE_STREAMING_BIT_KHR` should be used when encoding video content streamed over network
  * `VK_VIDEO_ENCODE_USAGE_RECORDING_BIT_KHR` should be used in real-time recording but offline consumption use cases
  * `VK_VIDEO_ENCODE_USAGE_CONFERENCING_BIT_KHR` should be used for video conferencing use cases

The content hint flags introduced are as follows:

  * `VK_VIDEO_ENCODE_CONTENT_CAMERA_BIT_KHR` should be used when encoding images captured using a camera
  * `VK_VIDEO_ENCODE_CONTENT_DESKTOP_BIT_KHR` should be used when encoding desktop screen captures
  * `VK_VIDEO_ENCODE_CONTENT_RENDERED_BIT_KHR` should be used when encoding rendered (e.g. game) content

These usage hints do not provide any restrictions or guarantees, so any combination of flags can be used, but they allow the application to better communicate the intended use case scenario so that implementations can make appropriate choices based on it.

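
Because any combination of hint flags is allowed, an application can describe a mixed scenario by combining bits. The following sketch shows this for a screen sharing session with an optional camera overlay. It is only an illustration: `DemoEncodeUsageInfo`, the `DEMO_*` constants, and `demo_screen_share_hints` are hypothetical stand-ins for `VkVideoEncodeUsageInfoKHR` and the flag bits listed above, defined locally so the example compiles without the Vulkan headers; the numeric values are placeholders chosen for the example.

[source,c]
----
#include <assert.h>
#include <stdint.h>

/* Placeholder bit values for the usage and content hints listed above;
 * real code would use the VK_VIDEO_ENCODE_USAGE_* and
 * VK_VIDEO_ENCODE_CONTENT_* enumerants from the Vulkan headers. */
enum {
    DEMO_USAGE_TRANSCODING  = 0x1,
    DEMO_USAGE_STREAMING    = 0x2,
    DEMO_USAGE_RECORDING    = 0x4,
    DEMO_USAGE_CONFERENCING = 0x8
};
enum {
    DEMO_CONTENT_CAMERA   = 0x1,
    DEMO_CONTENT_DESKTOP  = 0x2,
    DEMO_CONTENT_RENDERED = 0x4
};

/* Hypothetical stand-in for VkVideoEncodeUsageInfoKHR without sType/pNext. */
typedef struct DemoEncodeUsageInfo {
    uint32_t videoUsageHints;
    uint32_t videoContentHints;
    uint32_t tuningMode; /* left at 0 here; tuning modes are a separate setting */
} DemoEncodeUsageInfo;

/* Builds the hints for a conferencing session that streams a desktop
 * capture, optionally mixed with camera footage. */
static DemoEncodeUsageInfo demo_screen_share_hints(int withCameraOverlay)
{
    DemoEncodeUsageInfo info = {0};
    assert(withCameraOverlay == 0 || withCameraOverlay == 1);
    info.videoUsageHints   = DEMO_USAGE_STREAMING | DEMO_USAGE_CONFERENCING;
    info.videoContentHints = DEMO_CONTENT_DESKTOP;
    if (withCameraOverlay) {
        info.videoContentHints |= DEMO_CONTENT_CAMERA;
    }
    return info;
}
----
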

Logically, however, the `VkVideoEncodeUsageInfoKHR` structure is part of the video profile definition, so capabilities may vary across video encode profiles that only differ in terms of video encode usage hints. It also affects video profile compatibility between resources and video sessions, so the same `VkVideoEncodeUsageInfoKHR` structure has to be included wherever the specific video encode profile is used. The contemporary extension `VK_KHR_video_maintenance1`, however, does allow creating buffer and image resources that are compatible with multiple video profiles when they are created with the `VK_BUFFER_CREATE_VIDEO_PROFILE_INDEPENDENT_BIT_KHR` or `VK_IMAGE_CREATE_VIDEO_PROFILE_INDEPENDENT_BIT_KHR` flags, respectively, introduced by that extension.

Unlike the hints, `tuningMode` is an explicit mode setting parameter that has functional implications and is expected to limit encoding capabilities to fit the usage scenario. The following tuning mode values are introduced by this extension:

  * `VK_VIDEO_ENCODE_TUNING_MODE_DEFAULT_KHR` is the default tuning mode
  * `VK_VIDEO_ENCODE_TUNING_MODE_HIGH_QUALITY_KHR` tunes encoding for high quality and will likely impose latency and performance compromises
  * `VK_VIDEO_ENCODE_TUNING_MODE_LOW_LATENCY_KHR` tunes encoding for low latency and will likely impose quality compromises for better performance
  * `VK_VIDEO_ENCODE_TUNING_MODE_ULTRA_LOW_LATENCY_KHR` tunes encoding for ultra-low latency with further quality compromises for maximum performance
  * `VK_VIDEO_ENCODE_TUNING_MODE_LOSSLESS_KHR` tunes encoding to produce lossless output

In practice, not all codecs and profiles will support every tuning mode.
The new query command `vkGetPhysicalDeviceVideoEncodeQualityLevelPropertiesKHR`, as described later, may also return different recommended configuration parameters based on the tuning mode specified in the video profile in order to further aid application developers in choosing the most suitable settings for the encoding scenario at hand.


=== New Pipeline Stage and Access Flags

This extension also introduces a new pipeline stage identified by the `VK_PIPELINE_STAGE_2_VIDEO_ENCODE_BIT_KHR` flag to enable synchronizing video encode operations with respect to other Vulkan operations.

In addition, two new access flags are introduced to indicate reads and writes, respectively, performed by the video encode pipeline stage:

  * `VK_ACCESS_2_VIDEO_ENCODE_READ_BIT_KHR`
  * `VK_ACCESS_2_VIDEO_ENCODE_WRITE_BIT_KHR`

As these flags no longer fit into the legacy 32-bit enums, this extension requires the `VK_KHR_synchronization2` extension and relies on the 64-bit versions of the pipeline stage and access mask flags to handle synchronization specific to video encode operations.

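
The reliance on 64-bit masks can be made concrete with a small sketch of a memory dependency that makes color attachment writes visible to video encode reads. This is an illustration under stated assumptions, not an excerpt from the specification: `DemoFlags64` and `DemoMemoryBarrier2` are hypothetical, reduced stand-ins for `VkFlags64` and `VkMemoryBarrier2`, and the `DEMO_*` values are assumed to match the corresponding `VK_PIPELINE_STAGE_2_*` and `VK_ACCESS_2_*` constants in the Vulkan headers. A real hand-off from rendering to encoding typically also involves an image layout transition, which is omitted here.

[source,c]
----
#include <assert.h>
#include <stdint.h>

typedef uint64_t DemoFlags64; /* plays the role of VkFlags64 */

/* Assumed to match the values in the Vulkan headers. */
#define DEMO_STAGE_2_COLOR_ATTACHMENT_OUTPUT 0x00000400ULL
#define DEMO_STAGE_2_VIDEO_ENCODE            0x08000000ULL
#define DEMO_ACCESS_2_COLOR_ATTACHMENT_WRITE 0x00000100ULL
#define DEMO_ACCESS_2_VIDEO_ENCODE_READ      0x2000000000ULL

/* Reduced stand-in for VkMemoryBarrier2: only the four 64-bit masks. */
typedef struct DemoMemoryBarrier2 {
    DemoFlags64 srcStageMask;
    DemoFlags64 srcAccessMask;
    DemoFlags64 dstStageMask;
    DemoFlags64 dstAccessMask;
} DemoMemoryBarrier2;

/* Describes a dependency from rendering output to video encode reads. */
static DemoMemoryBarrier2 demo_render_to_encode_barrier(void)
{
    DemoMemoryBarrier2 barrier;
    barrier.srcStageMask  = DEMO_STAGE_2_COLOR_ATTACHMENT_OUTPUT;
    barrier.srcAccessMask = DEMO_ACCESS_2_COLOR_ATTACHMENT_WRITE;
    barrier.dstStageMask  = DEMO_STAGE_2_VIDEO_ENCODE;
    barrier.dstAccessMask = DEMO_ACCESS_2_VIDEO_ENCODE_READ;
    /* The encode read bit lies above bit 31, so a 32-bit access mask
     * could not represent it. */
    assert((barrier.dstAccessMask >> 32) != 0);
    return barrier;
}
----
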


=== New Buffer and Image Usage Flags

This extension introduces the following new buffer usage flags:

  * `VK_BUFFER_USAGE_VIDEO_ENCODE_SRC_BIT_KHR` is reserved for future use
  * `VK_BUFFER_USAGE_VIDEO_ENCODE_DST_BIT_KHR` allows using the buffer as a video bitstream buffer in video encode operations

This extension also introduces the following new image usage flags:

  * `VK_IMAGE_USAGE_VIDEO_ENCODE_SRC_BIT_KHR` allows using the image as an encode input picture
  * `VK_IMAGE_USAGE_VIDEO_ENCODE_DST_BIT_KHR` is reserved for future use
  * `VK_IMAGE_USAGE_VIDEO_ENCODE_DPB_BIT_KHR` allows using the image as an encode DPB picture (reconstructed/reference picture)

Specifying these usage flags alone is not sufficient to create a buffer or image that is compatible with a video session created against any particular video profile. In fact, when specifying any of these usage flags at resource creation time, the application has to include a `VkVideoProfileListInfoKHR` structure in the `pNext` chain of the corresponding create info structure with `VkVideoProfileListInfoKHR::pProfiles` including a video encode profile. The created resources will be compatible only with the included video encode profiles (and a video decode profile, if one is also specified in the list).


=== New Format Feature Flags

To indicate which formats are compatible with video encode usage, the following new format feature flags are introduced:

  * `VK_FORMAT_FEATURE_VIDEO_ENCODE_INPUT_BIT_KHR` indicates support for encode input picture usage
  * `VK_FORMAT_FEATURE_VIDEO_ENCODE_DPB_BIT_KHR` indicates support for encode DPB picture usage

The presence of the format flags alone, as returned by the various format queries, is not sufficient to indicate that an image with that format is usable with video encoding using any particular video encode profile.
Actual compatibility with a specific video encode profile has to be verified using the `vkGetPhysicalDeviceVideoFormatPropertiesKHR` command.


=== Basic Operation

Video encode operations can be recorded into command buffers allocated from command pools created against queue families that support the `VK_QUEUE_VIDEO_ENCODE_BIT_KHR` flag.

Recording video encode operations happens through the use of the following new command:

[source,c]
----
VKAPI_ATTR void VKAPI_CALL vkCmdEncodeVideoKHR(
    VkCommandBuffer                             commandBuffer,
    const VkVideoEncodeInfoKHR*                 pEncodeInfo);
----

The common, codec-independent parameters of the video encode operation are provided using the following new structure:

[source,c]
----
typedef struct VkVideoEncodeInfoKHR {
    VkStructureType                       sType;
    const void*                           pNext;
    VkVideoEncodeFlagsKHR                 flags;
    VkBuffer                              dstBuffer;
    VkDeviceSize                          dstBufferOffset;
    VkDeviceSize                          dstBufferRange;
    VkVideoPictureResourceInfoKHR         srcPictureResource;
    const VkVideoReferenceSlotInfoKHR*    pSetupReferenceSlot;
    uint32_t                              referenceSlotCount;
    const VkVideoReferenceSlotInfoKHR*    pReferenceSlots;
    uint32_t                              precedingExternallyEncodedBytes;
} VkVideoEncodeInfoKHR;
----

Executing such a video encode operation results in the compression of a single picture (unless otherwise defined by layered extensions), and, if there is an active `VK_QUERY_TYPE_RESULT_STATUS_ONLY_KHR` query, the status of the video encode operation is recorded into the active query slot.

In addition to `VK_QUERY_TYPE_RESULT_STATUS_ONLY_KHR` queries, applications can use the new `VK_QUERY_TYPE_VIDEO_ENCODE_FEEDBACK_KHR` queries to retrieve additional feedback about the encoded picture including the offset and size of the bitstream written to the specified video bitstream buffer range, as discussed later.


If the encode operation requires additional codec-specific parameters, then such parameters are provided in the `pNext` chain of the structure above. Whether such codec-specific information is necessary, and what it may contain is up to the codec-specific extensions.

`dstBuffer`, `dstBufferOffset`, and `dstBufferRange` provide information about the target video bitstream buffer range. The video encode operation writes the compressed picture data to this buffer range.

The application has to create the video bitstream buffer with the new `VK_BUFFER_USAGE_VIDEO_ENCODE_DST_BIT_KHR` usage flag, and must also include the used video session's video profile in the `VkVideoProfileListInfoKHR` structure specified at buffer creation time.

The data written to the video bitstream buffer range depends on the specific video codec used, as defined by corresponding codec-specific extensions built upon this proposal.

The `srcPictureResource`, `pSetupReferenceSlot`, and `pReferenceSlots` members specify the encode input picture, reconstructed picture, and reference pictures, respectively, used by the video encode operation, as discussed in later sections of this proposal.

The `precedingExternallyEncodedBytes` member specifies the number of bytes externally encoded into the bitstream by the application. This value is used to update the implementation's rate control algorithm for the rate control layer this encode operation belongs to, by accounting for the bitrate budget consumed by these externally encoded bytes. This parameter is respected by the implementation only if the `VK_VIDEO_ENCODE_CAPABILITY_PRECEDING_EXTERNALLY_ENCODED_BYTES_BIT_KHR` capability is supported.


=== Encode Input Picture

`srcPictureResource` defines the parameters of the video picture resource to use as the encode input picture. The video encode operation reads the picture data to compress from this video picture resource.
As such, it is a mandatory parameter of the operation.

The application has to create the image view specified in `srcPictureResource.imageViewBinding` with the new `VK_IMAGE_USAGE_VIDEO_ENCODE_SRC_BIT_KHR` usage flag, and must also include the used video session's video profile in the `VkVideoProfileListInfoKHR` structure specified at image creation time.

The image subresource backing the encode input picture has to be in the new `VK_IMAGE_LAYOUT_VIDEO_ENCODE_SRC_KHR` layout at the time the video encode operation is executed.


=== Reconstructed Picture

`pSetupReferenceSlot` is an optional parameter specifying the video picture resource and DPB slot index to use for the reconstructed picture. Implementations use the reconstructed picture for one of the following purposes:

  1. When the encoded picture is requested to be set up as a reference, according to the codec-specific semantics, the video encode operation will perform picture reconstruction, output the results to this picture, and activate the reconstructed picture's DPB slot with the picture in order to enable using the picture as a reference picture in future video encode operations.
  2. When the encoded picture is not requested to be set up as a reference, implementations may use the reconstructed picture's resource and/or DPB slot for intermediate data required by the encoding process.

Accordingly, `pSetupReferenceSlot` must never be `NULL`, except when the video session was created without any DPB slots.

[NOTE]
.Note
====
The original version of this extension only required the specification of the reconstructed picture information (i.e. a non-`NULL` `pSetupReferenceSlot`) when the application intended to set up a reference picture by activating a DPB slot. Consequently, the presence of reconstructed picture information always implied DPB slot activation.
This was changed in revision 12 of the extension, and whether DPB slot activation happens is now subject to codec-specific semantics. More details on this change are discussed in the corresponding issue in this proposal document.
====

In summary, for encoded pictures requested to be set up as reference, this parameter can be used to add new reference pictures to the DPB, and change the association between DPB slot indices and video picture resources. That also implies that the application has to specify a video picture resource in `pSetupReferenceSlot->pPictureResource` that was included in the set of bound reference picture resources specified when the video coding scope was started (in one of the elements of `VkVideoBeginCodingInfoKHR::pReferenceSlots`). No similar requirement exists for the encode input picture specified by `srcPictureResource` which can refer to any video picture resource.

The application has to create the image view specified in `pSetupReferenceSlot->pPictureResource->imageViewBinding` with the new `VK_IMAGE_USAGE_VIDEO_ENCODE_DPB_BIT_KHR` usage flag, and must also include the used video session's video profile in the `VkVideoProfileListInfoKHR` structure specified at image creation time.

The image subresource backing the reconstructed picture has to be in the new `VK_IMAGE_LAYOUT_VIDEO_ENCODE_DPB_KHR` layout at the time the video encode operation is executed.

If the video profile in use requires additional codec-specific parameters for the reconstructed picture, then such parameters are provided in the `pNext` chain of `pSetupReferenceSlot`. Whether such codec-specific reconstructed picture information is necessary, and what it may contain is up to the codec-specific extensions.


=== Reference Pictures

If the video session allows, reference pictures can be specified in the `pReferenceSlots` array to provide predictions of the values of samples of the encoded picture.


Each entry in the `pReferenceSlots` array adds one or more pictures, currently associated with the DPB slot specified in the element's `slotIndex` member and stored in the video picture resource specified in the element's `pPictureResource` member, to the list of active reference pictures to use in the video encode operation.

The application has to make sure to specify each video picture resource used as a reference picture in a video encode operation, beforehand, in the set of bound reference picture resources specified when the video coding scope was started (in one of the elements of `VkVideoBeginCodingInfoKHR::pReferenceSlots`).

The application has to create the image view specified in `pPictureResource->imageViewBinding` of the elements of `pReferenceSlots` with the new `VK_IMAGE_USAGE_VIDEO_ENCODE_DPB_BIT_KHR` usage flag, and must also include the used video session's video profile in the `VkVideoProfileListInfoKHR` structure specified at image creation time.

The image subresources backing the reference pictures have to be in the new `VK_IMAGE_LAYOUT_VIDEO_ENCODE_DPB_KHR` layout at the time the video encode operation is executed.

Typically the number of elements in `pReferenceSlots` equals the number of reference pictures added, but in certain cases (depending on the used video codec and video profile) there may be multiple pictures in the same DPB slot resource.

If the video profile in use requires additional codec-specific parameters for the reference pictures, then such parameters are provided in the `pNext` chain of the elements of `pReferenceSlots`. Whether such codec-specific reference picture information is necessary, and what it may contain is up to the codec-specific extensions.


=== Video Encode Parameter Overrides

Encoder implementations usually only support a subset of the available encoding tools defined by the corresponding video compression standards.
This may prevent some implementations from being able to respect certain codec-specific parameters, or specific parameter values.

Exhaustively enumerating all of these constraints and defining application queryable capabilities for them is not practical: it would potentially require separate capabilities for almost every single codec-specific parameter, parameter value, and combinations of those, as there are usually complicated interactions between those codec-specific parameters. Instead, this proposal approaches this problem from the other direction.

Instead of defining capabilities for each of these constraints, implementations are allowed to override codec-specific parameter values or combinations thereof, so that the resulting overridden codec-specific parameters comply with the constraints of the target implementation. This has multiple benefits:

  * Enables the video encode APIs to be supported on a much wider set of hardware implementations, as the codec-specific extensions layered on top of this extension would not have codec-specific requirements that assume implementations to support certain, potentially not universally available, encoding tools
  * Enables implementations to expose all of the encoding tools they support for a particular video compression standard, which typically is not possible in other video APIs as, without overrides, implementations may not be able to expose a large set of their encoding tools just because they do not comply with the exact wording of the capabilities defined by that API
  * Enables writing portable applications without getting lost in myriads of capabilities

Allowing implementations to override codec-specific parameters does not mean, however, that implementations can do any overrides they wish. The base parameter override mechanism is reserved to deal with implementation limitations only.
Thus, by default, implementations are expected to override codec-specific parameters only if it is absolutely paramount for the correct functioning of their encoder hardware.

In certain cases, applications may want to allow the implementation to make its own choices about certain codec-specific parameters. Such choices are not driven by implementation constraints, but rather aim to let the implementation pick parameters and encoding tools that fit the usage scenario described by the video profile and other parameters, like the encode quality level, better than the ones the application specified. This proposal introduces a new video session creation flag called `VK_VIDEO_SESSION_CREATE_ALLOW_ENCODE_PARAMETER_OPTIMIZATIONS_BIT_KHR` that enables the application to opt in to such optimization overrides.

There are certain rules that implementations need to follow in all cases where they may apply codec-specific parameter overrides. In particular:

  * Certain codec-specific parameters are defined by layered codec-specific extensions to be always respected, and thus cannot be overridden, which is generally expected to be the case for all parameters that may affect the overall behavior of video encoding, or any bitstream elements that are not encoded in any fashion by the implementation, so that applications still have the necessary freedom to encode such auxiliary bitstream elements the way they wish
  * In a similar vein, implementation overrides cannot affect the compliance of the generated bitstream to the video compression standard

The details of these rules can be found in the specification language of this extension, and any layered extension built upon it.

In general, there are two categories of codec-specific parameters to which implementation overrides may be applied:

  1. Codec-specific parameters stored in video session parameters objects, if any
  2. Codec-specific parameters provided to video encode commands

Both of these codec-specific parameter categories may have an effect on the video bitstream data produced by video encode operations. However, parameters falling into the first category are particularly important as it is common for applications to encode the codec-specific parameters stored in video session parameters on their own.

In order to enable the application to deal with parameter overrides applied to video session parameters, this proposal introduces the following new command:

[source,c]
----
VKAPI_ATTR VkResult VKAPI_CALL vkGetEncodedVideoSessionParametersKHR(
    VkDevice                                            device,
    const VkVideoEncodeSessionParametersGetInfoKHR*     pVideoSessionParametersInfo,
    VkVideoEncodeSessionParametersFeedbackInfoKHR*      pFeedbackInfo,
    size_t*                                             pDataSize,
    void*                                               pData);
----

The main input to this command is the video session parameters object in question, with layered extensions adding additional chainable structures to provide additional codec-specific input parameters:

[source,c]
----
typedef struct VkVideoEncodeSessionParametersGetInfoKHR {
    VkStructureType                sType;
    const void*                    pNext;
    VkVideoSessionParametersKHR    videoSessionParameters;
} VkVideoEncodeSessionParametersGetInfoKHR;
----

This command has multiple purposes.


First, by providing a non-`NULL` `pFeedbackInfo` parameter, the application can get feedback about whether the implementation applied any parameter overrides to the video session parameters in question through the following output structure:

[source,c]
----
typedef struct VkVideoEncodeSessionParametersFeedbackInfoKHR {
    VkStructureType    sType;
    void*              pNext;
    VkBool32           hasOverrides;
} VkVideoEncodeSessionParametersFeedbackInfoKHR;
----

The `hasOverrides` member will be set to `VK_TRUE` if implementation overrides were applied, and layered extensions may provide additional chainable output structures that return further (typically codec-specific) information about the applied overrides.

When this feedback indicates that implementation overrides were applied, the application needs to retrieve the encoded video session parameters containing the overrides in order to be able to produce a compliant bitstream. This can be done in the usual fashion by providing a non-`NULL` `pDataSize` parameter to retrieve the size of the encoded parameter data, and then calling the command again with a non-`NULL` `pData` pointer to retrieve the data.

The application can choose to use the `vkGetEncodedVideoSessionParametersKHR` command to encode the video session parameters even if the implementation did not override any of the parameters, but in this case it can also choose to encode the respective bitstream elements on its own.

It is worth calling out, though, that the application risks producing a video bitstream that does not comply with the video compression standard if it encodes the respective bitstream elements with its own choice of codec-specific parameters and either does not use this command to determine whether video session parameter overrides happened, or does not use the encoded parameters retrievable using this command when such overrides happened.

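
The two-call retrieval pattern described above can be sketched as follows. To keep the example self-contained, `mock_get_encoded_params` is a hypothetical stand-in for `vkGetEncodedVideoSessionParametersKHR` that is reduced to the override feedback and the size/data pair; its byte contents and return codes are made up for illustration. `demo_fetch_encoded_params` is likewise a hypothetical helper that shows the calling sequence an application might use, under the assumption that the first call reports the required size and the second call fills the buffer.

[source,c]
----
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for vkGetEncodedVideoSessionParametersKHR.
 * Returns 0 on success and 1 if the provided buffer was too small. */
static int mock_get_encoded_params(int *pHasOverrides, size_t *pDataSize, void *pData)
{
    static const unsigned char encoded[] = { 0x00, 0x00, 0x01, 0x67, 0x42 }; /* made-up bytes */
    if (pHasOverrides != NULL) {
        *pHasOverrides = 1; /* pretend the implementation applied overrides */
    }
    if (pDataSize == NULL) {
        return 0; /* caller only asked for feedback */
    }
    if (pData == NULL) {
        *pDataSize = sizeof(encoded); /* first call: report required size */
        return 0;
    }
    if (*pDataSize < sizeof(encoded)) {
        return 1; /* buffer too small */
    }
    memcpy(pData, encoded, sizeof(encoded)); /* second call: write data */
    *pDataSize = sizeof(encoded);
    return 0;
}

/* Queries the size, allocates a buffer, then retrieves the encoded
 * parameters. Returns a malloc'ed buffer the caller must free, or NULL. */
static unsigned char *demo_fetch_encoded_params(size_t *pSize, int *pHasOverrides)
{
    size_t size = 0;
    unsigned char *data;
    assert(pSize != NULL && pHasOverrides != NULL);
    if (mock_get_encoded_params(pHasOverrides, &size, NULL) != 0 || size == 0) {
        return NULL;
    }
    data = malloc(size);
    if (data == NULL) {
        return NULL;
    }
    if (mock_get_encoded_params(NULL, &size, data) != 0) {
        free(data);
        return NULL;
    }
    *pSize = size;
    return data;
}
----

An application following this pattern would check the override feedback from the first call and, when overrides were applied, place the retrieved bytes into the bitstream instead of encoding the parameters on its own.
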


=== Capabilities

Querying capabilities specific to video encoding happens through the query mechanisms introduced by the `VK_KHR_video_queue` extension.

Support for individual video encode operations can be retrieved for each queue family using the `VkQueueFamilyVideoPropertiesKHR` structure, as discussed earlier.

The application can also use the `vkGetPhysicalDeviceVideoCapabilitiesKHR` command to query the capabilities of a specific video encode profile. In case of video encode profiles, the following new structure has to be included in the `pNext` chain of the `VkVideoCapabilitiesKHR` structure used to retrieve the general video encode capabilities:

[source,c]
----
typedef struct VkVideoEncodeCapabilitiesKHR {
    VkStructureType                         sType;
    void*                                   pNext;
    VkVideoEncodeCapabilityFlagsKHR         flags;
    VkVideoEncodeRateControlModeFlagsKHR    rateControlModes;
    uint32_t                                maxRateControlLayers;
    uint64_t                                maxBitrate;
    uint32_t                                maxQualityLevels;
    VkExtent2D                              encodeInputPictureGranularity;
    VkVideoEncodeFeedbackFlagsKHR           supportedEncodeFeedbackFlags;
} VkVideoEncodeCapabilitiesKHR;
----

This structure contains a new encode-specific `flags` member that indicates support for various video encode capabilities, like the support for the `precedingExternallyEncodedBytes` parameter discussed before.

The `rateControlModes` and `maxRateControlLayers` members provide information about the supported rate control modes and maximum number of rate control layers that can be used in a video session, as discussed later.

The `maxBitrate` member provides information about the maximum bitrate supported for the video profile.

The `maxQualityLevels` member specifies the number of different video encode quality level values supported by the video encode profile in question which are identified with numbers in the range `0..maxQualityLevels-1`.
The number and implementation effect of the quality levels are expected to vary across video encode profiles, even in video encode profiles using the same video codec operation (e.g. due to the use of different tuning modes), as discussed later.

The `encodeInputPictureGranularity` member indicates the granularity at which data from the encode input picture is used for encoding individual codec-specific coding blocks. If this capability is not `{1,1}`, then it is recommended for applications to initialize the data in the encode input picture at this granularity, as the encoder will use data in such padding texels during the encoding, which may affect the quality and efficiency of the encoding.

The `supportedEncodeFeedbackFlags` member indicates the set of supported encode feedback flags for the `VK_QUERY_TYPE_VIDEO_ENCODE_FEEDBACK_KHR` queries described later.

The `vkGetPhysicalDeviceVideoFormatPropertiesKHR` command can be used to query the supported image/picture formats for a given set of video profiles, as described in the `VK_KHR_video_queue` extension.

In particular, if the application would like to query the list of format properties supported for encode input pictures, then it should include the new `VK_IMAGE_USAGE_VIDEO_ENCODE_SRC_BIT_KHR` usage flag in `VkPhysicalDeviceVideoFormatInfoKHR::imageUsage`.

Similarly, to query the list of format properties supported for encode DPB pictures (reconstructed/reference pictures), the application should include the new `VK_IMAGE_USAGE_VIDEO_ENCODE_DPB_BIT_KHR` usage flag in `VkPhysicalDeviceVideoFormatInfoKHR::imageUsage`.


=== Video Encode Quality Levels

This proposal introduces the concept of video encode quality levels, which can be thought of as encoder presets that control the number and type of implementation-specific encoding tools and algorithms utilized in the encoding process.
Implementations can expose support for one or more such video encode quality levels for each video profile. By default, video encode quality level index zero is used, unless otherwise specified.

Generally, using higher video encode quality levels may produce higher quality video streams at the cost of additional processing time. However, as the final quality of an encoded picture depends on the contents of the encode input picture, the contents of the active reference pictures, the codec-specific encode parameters, and the particular implementation-specific tools used corresponding to the individual video encode quality levels, there are no guarantees that using a higher video encode quality level will always produce a higher quality encoded picture for any given set of inputs.

The chosen quality level may also affect the optimization overrides applied by implementations when using the `VK_VIDEO_SESSION_CREATE_ALLOW_ENCODE_PARAMETER_OPTIMIZATIONS_BIT_KHR` flag, and thus codec-specific parameters stored in video session parameters may be affected by the used video encode quality level. As such, video session parameters objects are always created with respect to a specific video encode quality level. The application can choose to create a video session parameters object with a video encode quality level index different than the default quality level of zero by including the following new structure in the `pNext` chain of `VkVideoSessionParametersCreateInfoKHR`:

[source,c]
----
typedef struct VkVideoEncodeQualityLevelInfoKHR {
    VkStructureType    sType;
    const void*        pNext;
    uint32_t           qualityLevel;
} VkVideoEncodeQualityLevelInfoKHR;
----

Where `qualityLevel` specifies the used video encode quality level.

Video sessions created against a video encode profile allow changing the used video encode quality level dynamically.
After creation, the video session is configured with the default quality level of zero. This can then be changed by including the new `VK_VIDEO_CODING_CONTROL_ENCODE_QUALITY_LEVEL_BIT_KHR` flag in the `flags` member of the `VkVideoCodingControlInfoKHR` structure passed to the `vkCmdControlVideoCodingKHR` command and including an instance of the `VkVideoEncodeQualityLevelInfoKHR` structure in the `VkVideoCodingControlInfoKHR::pNext` chain specifying the new quality level to set for the video session.

If video session parameters objects are used by a particular video encode command, then the video encode quality level the parameters object was created with has to match the currently configured quality level for the bound video session.

Implementations may have certain recommendations for encoding parameters and configuration (e.g. for rate control) specific to each supported video encode quality level. These recommendations and other quality level related properties can be queried for a specific video encode profile using the following new command:

[source,c]
----
VKAPI_ATTR VkResult VKAPI_CALL vkGetPhysicalDeviceVideoEncodeQualityLevelPropertiesKHR(
    VkPhysicalDevice physicalDevice,
    const VkPhysicalDeviceVideoEncodeQualityLevelInfoKHR* pQualityLevelInfo,
    VkVideoEncodeQualityLevelPropertiesKHR* pQualityLevelProperties);
----

The input to the command is a structure that specifies the video encode profile and quality level to query properties for:

[source,c]
----
typedef struct VkPhysicalDeviceVideoEncodeQualityLevelInfoKHR {
    VkStructureType sType;
    const void* pNext;
    const VkVideoProfileInfoKHR* pVideoProfile;
    uint32_t qualityLevel;
} VkPhysicalDeviceVideoEncodeQualityLevelInfoKHR;
----

This proposal allows retrieving the following codec-independent quality level properties:

[source,c]
----
typedef struct VkVideoEncodeQualityLevelPropertiesKHR {
    VkStructureType sType;
    void* pNext;
    VkVideoEncodeRateControlModeFlagBitsKHR preferredRateControlMode;
    uint32_t preferredRateControlLayerCount;
} VkVideoEncodeQualityLevelPropertiesKHR;
----

Layered extensions may add additional (typically codec-specific) property structures that can be chained to the base output structure defined above.


=== Video Encode Feedback Queries

The new `VK_QUERY_TYPE_VIDEO_ENCODE_FEEDBACK_KHR` query type works similarly to pipeline statistics queries in that it can report multiple distinct values about the video encode operations it collects feedback about. When creating a query pool with this type, the following new structure specifies the selected feedback values:

[source,c]
----
typedef struct VkQueryPoolVideoEncodeFeedbackCreateInfoKHR {
    VkStructureType sType;
    const void* pNext;
    VkVideoEncodeFeedbackFlagsKHR encodeFeedbackFlags;
} VkQueryPoolVideoEncodeFeedbackCreateInfoKHR;
----

This extension adds support for the following video encode feedback flags:

 * `VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_BUFFER_OFFSET_BIT_KHR` requests capturing the offset relative to `dstBufferOffset` where the bitstream data corresponding to the video encode operation is written to
 * `VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_BYTES_WRITTEN_BIT_KHR` requests capturing the number of bytes written by the video encode operation to the bitstream buffer
 * `VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_HAS_OVERRIDES_BIT_KHR` requests capturing information about whether the implementation overrode any codec-specific parameters in the generated bitstream data with respect to the parameter values supplied by the application

All implementations are expected to support `VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_BUFFER_OFFSET_BIT_KHR` and `VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_BYTES_WRITTEN_BIT_KHR`, but `VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_HAS_OVERRIDES_BIT_KHR` is optional, as not all implementations may be
able to provide feedback about overrides performed on the encoded bitstream data.

The reported offset for `VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_BUFFER_OFFSET_BIT_KHR` is currently defined to always be zero unless defined otherwise by a layered extension.


=== Video Encode Rate Control

A key aspect of video encoding is to control the size of the encoded bitstream. This happens through the application of rate control. Rate control settings consist of both codec-independent and codec-specific parameters; this extension only includes the codec-independent ones.

The following rate control modes are introduced by this extension:

 * `VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR` for disabling rate control
 * `VK_VIDEO_ENCODE_RATE_CONTROL_MODE_CBR_BIT_KHR` for constant bitrate (CBR) rate control
 * `VK_VIDEO_ENCODE_RATE_CONTROL_MODE_VBR_BIT_KHR` for variable bitrate (VBR) rate control

In addition, the `VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DEFAULT_KHR` constant is used to set rate control configuration to implementation-dependent default settings. This is the initial rate control mode of newly created video sessions, and it leaves rate control entirely under the implementation's control.

Certain codecs define a concept typically referred to as _video coding layers_. The semantics of these layers are defined by the corresponding video compression standards. However, some implementations allow certain configuration parameters of rate control to be specified separately for each such video coding layer, thus this proposal introduces the concept of rate control layers which enable the application to explicitly control these parameters on a per-layer basis.

When a single rate control layer is configured, it is applied to all encoded pictures. In contrast, when multiple rate control layers are configured, then each rate control layer is applied only to encoded pictures targeting a specific video coding layer.
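
The relationship between rate control layers and video coding layers can be sketched as a small selection helper. This is only an illustration: `selectRateControlLayer` is a hypothetical application-side function, not part of the API, and the one-to-one mapping between rate control layer index and video coding layer index is an assumption, as the actual mapping is defined by the codec-specific extensions.

[source,c]
----
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of the Vulkan API.
 * Returns the index of the rate control layer that would apply to a picture
 * targeting video coding layer codingLayerIndex, or -1 if no layer applies. */
static int32_t selectRateControlLayer(uint32_t layerCount, uint32_t codingLayerIndex)
{
    if (layerCount == 0) {
        /* No rate control layers are configured */
        return -1;
    }
    if (layerCount == 1) {
        /* A single rate control layer applies to all encoded pictures */
        return 0;
    }
    /* ASSUMPTION: rate control layer N corresponds to video coding layer N */
    assert(codingLayerIndex < layerCount);
    return (int32_t)codingLayerIndex;
}
----

With a single configured layer, `selectRateControlLayer(1, 3)` yields layer zero regardless of the targeted video coding layer, while with three configured layers `selectRateControlLayer(3, 2)` yields the layer matching the video coding layer.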

After a video session is reset using `VK_VIDEO_CODING_CONTROL_RESET_BIT_KHR`, its rate control settings are initialized to implementation-specific defaults. Applications can change these by calling `vkCmdControlVideoCodingKHR` and specifying the `VK_VIDEO_CODING_CONTROL_ENCODE_RATE_CONTROL_BIT_KHR` flag. When this flag is present, the following new structure included in the `pNext` chain of `VkVideoCodingControlInfoKHR` specifies the rate control configuration:

[source,c]
----
typedef struct VkVideoEncodeRateControlInfoKHR {
    VkStructureType sType;
    const void* pNext;
    VkVideoEncodeRateControlFlagsKHR flags;
    VkVideoEncodeRateControlModeFlagBitsKHR rateControlMode;
    uint32_t layerCount;
    const VkVideoEncodeRateControlLayerInfoKHR* pLayers;
    uint32_t virtualBufferSizeInMs;
    uint32_t initialVirtualBufferSizeInMs;
} VkVideoEncodeRateControlInfoKHR;
----

`rateControlMode` specifies the rate control mode to set.

`layerCount` specifies the number of rate control layers to use from this point, and `pLayers` specifies the configuration of each layer. Rate control layers can only be specified when rate control is neither disabled nor set to the implementation-specific defaults.

`virtualBufferSizeInMs` and `initialVirtualBufferSizeInMs` specify the size and initial occupancy, respectively, in milliseconds of the leaky bucket model virtual buffer.

The `VkVideoEncodeRateControlLayerInfoKHR` structure is defined as follows:

[source,c]
----
typedef struct VkVideoEncodeRateControlLayerInfoKHR {
    VkStructureType sType;
    const void* pNext;
    uint64_t averageBitrate;
    uint64_t maxBitrate;
    uint32_t frameRateNumerator;
    uint32_t frameRateDenominator;
} VkVideoEncodeRateControlLayerInfoKHR;
----

`averageBitrate` and `maxBitrate` specify the target and peak bitrate that the rate control layer should use in bits/second.
In CBR mode, the two values have to match.

`frameRateNumerator` and `frameRateDenominator` specify the numerator and denominator of the frame rate used by the video sequence.

The exact behavior of rate control is implementation-specific but it is typically constrained by the video compression standard corresponding to the used video profile. Implementations are expected to implement rate control as follows:

 * In case of CBR mode the bitrate should stay as close to the specified `averageBitrate` as possible within the virtual buffer window.
 * In case of VBR mode the bitrate should not exceed the value of `maxBitrate` while also trying to get close to the target bitrate specified by `averageBitrate` within the virtual buffer window.

Codec-specific video encode extensions can include both global and per-layer codec-specific rate control configurations by chaining codec-specific parameters to the `VkVideoEncodeRateControlInfoKHR` and `VkVideoEncodeRateControlLayerInfoKHR` structures, respectively.

Some implementations do not track the current rate control configuration as part of the device state maintained in the video session object, but the current rate control configuration may affect the device commands recorded in response to video encode operations. In order to enable implementations to have access to the current rate control configuration when recording video encoding commands into command buffers, this proposal requires the current rate control configuration to also be specified when calling `vkCmdBeginVideoCodingKHR` by including the `VkVideoEncodeRateControlInfoKHR` structure describing it in the `pNext` chain of the `pBeginCodingInfo` parameter. When this information is not included, it is assumed that the currently expected rate control configuration is the default one, i.e. the implementation-specific rate control mode indicated by `VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DEFAULT_KHR`.
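
One way an application might keep the information passed to `vkCmdBeginVideoCodingKHR` consistent with the session state is to shadow the rate control state on the host while recording. The following is a minimal sketch under stated assumptions: the `RcMode` and `RcShadow` types and the `rcShadow*` functions are hypothetical stand-ins rather than Vulkan definitions, and the sketch assumes that command buffers execute in the order in which they were recorded.

[source,c]
----
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the rate control modes; these are not the Vulkan enums */
typedef enum RcMode {
    RC_MODE_DEFAULT,
    RC_MODE_DISABLED,
    RC_MODE_CBR,
    RC_MODE_VBR
} RcMode;

/* Hypothetical application-side shadow of a video session's rate control state */
typedef struct RcShadow {
    RcMode mode;
    uint32_t layerCount;
} RcShadow;

/* Mirrors recording a control command with the reset flag */
static void rcShadowReset(RcShadow* shadow)
{
    shadow->mode = RC_MODE_DEFAULT;
    shadow->layerCount = 0;
}

/* Mirrors recording a control command with the rate control flag */
static void rcShadowSet(RcShadow* shadow, RcMode mode, uint32_t layerCount)
{
    /* Layers may only be specified for modes other than default and disabled */
    assert(layerCount == 0 || mode == RC_MODE_CBR || mode == RC_MODE_VBR);
    shadow->mode = mode;
    shadow->layerCount = layerCount;
}

/* True if the next begin command has to describe the rate control state explicitly */
static bool rcShadowMustDescribeAtBegin(const RcShadow* shadow)
{
    return shadow->mode != RC_MODE_DEFAULT;
}
----

In this sketch the application would consult `rcShadowMustDescribeAtBegin` before recording each `vkCmdBeginVideoCodingKHR` command and, when it returns true, chain a `VkVideoEncodeRateControlInfoKHR` structure filled in from the shadowed state.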

It is important to note that specifying the rate control configuration when calling `vkCmdBeginVideoCodingKHR` does not change the current rate control configuration. For that the `vkCmdControlVideoCodingKHR` command must be used with the `VK_VIDEO_CODING_CONTROL_ENCODE_RATE_CONTROL_BIT_KHR` flag, as discussed earlier. The rate control configuration specified to `vkCmdBeginVideoCodingKHR` serves only to make the information about the current rate control state available to implementations during command recording and is expected to always match the effective current rate control state at the time the command is executed on the device.


=== Usage Summary

To summarize the usage of the video encoding features introduced by this extension, let us take a look at a typical usage scenario when using this extension to encode a video stream.

Before the application can start recording command buffers with video encode operations, it has to do the following:

 . Ensure that the implementation can encode the video content by first querying the video codec operations supported by each queue family using the `vkGetPhysicalDeviceQueueFamilyProperties2` command and the `VkQueueFamilyVideoPropertiesKHR` output structure.
 . If needed, the application has to also retrieve the `VkQueueFamilyQueryResultStatusPropertiesKHR` output structure for the queue family to check support for `VK_QUERY_TYPE_RESULT_STATUS_ONLY_KHR` queries.
 . Construct the `VkVideoProfileInfoKHR` structure describing the entire video profile, including the video codec operation, chroma subsampling, bit depths, and any other usage or codec-specific parameters.
 . Ensure that the specific video profile is supported by the implementation using the `vkGetPhysicalDeviceVideoCapabilitiesKHR` command and retrieve the general, encode-specific, and codec-specific capabilities at the same time.
 . Query the list of image/picture format properties supported for the video profile using the `vkGetPhysicalDeviceVideoFormatPropertiesKHR` command, and select a suitable format for the DPB and encode input pictures.
 . Create an image corresponding to the encode input picture with the appropriate usage flags and video profile list, as described earlier, and bind suitable device memory to the image. Also create an image view with the appropriate usage flags to use in the video encode operations.
 . If needed, create one or more images corresponding to the DPB pictures with the appropriate usage flags and video profile list, as described earlier, and bind suitable device memory to them. Also create any image views with the appropriate usage flags to use in the video encode operations.
 . Create a buffer with the `VK_BUFFER_USAGE_VIDEO_ENCODE_DST_BIT_KHR` usage flag and the video profile list, to use as the destination video bitstream buffer. If the buffer is expected to be consumed by the CPU, consider binding compatible host-visible device memory to the buffer.
 . If result status or video encode feedback queries are needed and supported (as determined earlier), create a query pool with the corresponding query type and the used video encode profile.
 . Create the video session using the video encode profile and appropriate parameters within the capabilities supported by the profile, as determined earlier. Bind suitable device memory to each memory binding index of the video session.
 . If needed, create a video session parameters object for the video session.

Recording video encode operations into command buffers typically consists of the following sequence:

 . Start a video coding scope with the created video session (and parameters) object using the `vkCmdBeginVideoCodingKHR` command. Make sure to include all video picture resources in `VkVideoBeginCodingInfoKHR::pReferenceSlots` that may be used as reconstructed or reference pictures within the video coding scope, and ensure that the DPB slots specified for each reflect the current DPB slot association for the resource.
 . If this is the first video coding scope the video session is used in, reset the video session to the initial state by recording a `vkCmdControlVideoCodingKHR` command with the `VK_VIDEO_CODING_CONTROL_RESET_BIT_KHR` flag.
 . If needed, also update the rate control state or the used video encode quality level for the video session by recording a `vkCmdControlVideoCodingKHR` command with the `VK_VIDEO_CODING_CONTROL_ENCODE_RATE_CONTROL_BIT_KHR` and/or `VK_VIDEO_CODING_CONTROL_ENCODE_QUALITY_LEVEL_BIT_KHR` flags (can be done in the same command that resets the video session, if needed).
 . If needed, start a result status or video encode feedback query using `vkCmdBeginQuery`. Reset the query beforehand using `vkCmdResetQueryPool`, as needed.
 . Issue a video encode operation using the `vkCmdEncodeVideoKHR` command with appropriate parameters, as discussed earlier.
 . If needed, end the started query using `vkCmdEndQuery`.
 . Record any further control or encode operations into the video coding scope, as needed.
 . End the video coding scope using the `vkCmdEndVideoCodingKHR` command.

Video profiles that require the use of video session parameters objects may also require the application to encode the stored codec-specific parameters separately into the final bitstream. Applications are expected to encode these parameters according to the following steps:

 . If the application wants to encode such parameters on its own, when possible, it should first call the `vkGetEncodedVideoSessionParametersKHR` command with a non-NULL `pFeedbackInfo` parameter to retrieve information about whether the implementation applied any overrides to the codec-specific parameters in question.
 . If the results of the previous step indicate that no implementation overrides were applied, then the application can choose to encode the codec-specific parameters in question on its own and ignore the rest of the steps listed here.
 . Otherwise, the application has to retrieve the encoded codec-specific parameters by calling the `vkGetEncodedVideoSessionParametersKHR` command twice: first to retrieve the size, then to retrieve the data of the encoded codec-specific parameters in question, as discussed earlier.


== Examples

=== Select queue family with video encode support for a given video codec operation

[source,c]
----
VkVideoCodecOperationFlagBitsKHR neededVideoEncodeOp = ...
uint32_t queueFamilyIndex;
uint32_t queueFamilyCount;

vkGetPhysicalDeviceQueueFamilyProperties2(physicalDevice, &queueFamilyCount, NULL);

VkQueueFamilyProperties2* props = calloc(queueFamilyCount,
    sizeof(VkQueueFamilyProperties2));
VkQueueFamilyVideoPropertiesKHR* videoProps = calloc(queueFamilyCount,
    sizeof(VkQueueFamilyVideoPropertiesKHR));

for (queueFamilyIndex = 0; queueFamilyIndex < queueFamilyCount; ++queueFamilyIndex) {
    props[queueFamilyIndex].sType = VK_STRUCTURE_TYPE_QUEUE_FAMILY_PROPERTIES_2;
    props[queueFamilyIndex].pNext = &videoProps[queueFamilyIndex];

    videoProps[queueFamilyIndex].sType = VK_STRUCTURE_TYPE_QUEUE_FAMILY_VIDEO_PROPERTIES_KHR;
}

vkGetPhysicalDeviceQueueFamilyProperties2(physicalDevice, &queueFamilyCount, props);

for (queueFamilyIndex = 0; queueFamilyIndex < queueFamilyCount; ++queueFamilyIndex) {
    if ((props[queueFamilyIndex].queueFamilyProperties.queueFlags & VK_QUEUE_VIDEO_ENCODE_BIT_KHR) != 0 &&
        (videoProps[queueFamilyIndex].videoCodecOperations & neededVideoEncodeOp) != 0) {
        break;
    }
}

if (queueFamilyIndex < queueFamilyCount) {
    // Found appropriate queue family
    ...
} else {
    // Did not find a queue family with the needed capabilities
    ...
}
----


=== Check support and query the capabilities for a video encode profile

[source,c]
----
VkResult result;

// We also include the optional encode usage information here
VkVideoEncodeUsageInfoKHR profileUsageInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_ENCODE_USAGE_INFO_KHR,
    .pNext = ... // pointer to codec-specific profile structure
    .videoUsageHints = VK_VIDEO_ENCODE_USAGE_DEFAULT_KHR,
    .videoContentHints = VK_VIDEO_ENCODE_CONTENT_DEFAULT_KHR,
    .tuningMode = VK_VIDEO_ENCODE_TUNING_MODE_DEFAULT_KHR
};

VkVideoProfileInfoKHR profileInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_PROFILE_INFO_KHR,
    .pNext = &profileUsageInfo,
    .videoCodecOperation = ... // used video encode operation
    .chromaSubsampling = VK_VIDEO_CHROMA_SUBSAMPLING_420_BIT_KHR,
    .lumaBitDepth = VK_VIDEO_COMPONENT_BIT_DEPTH_8_BIT_KHR,
    .chromaBitDepth = VK_VIDEO_COMPONENT_BIT_DEPTH_8_BIT_KHR
};

VkVideoEncodeCapabilitiesKHR encodeCapabilities = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_ENCODE_CAPABILITIES_KHR,
    .pNext = ... // pointer to codec-specific capability structure
};

VkVideoCapabilitiesKHR capabilities = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_CAPABILITIES_KHR,
    .pNext = &encodeCapabilities
};

result = vkGetPhysicalDeviceVideoCapabilitiesKHR(physicalDevice, &profileInfo, &capabilities);

if (result == VK_SUCCESS) {
    // Profile is supported, check additional capabilities
    ...
} else {
    // Profile is not supported, result provides additional information about why
    ...
}
----


=== Select encode input and DPB formats supported by the video encode profile

[source,c]
----
VkVideoProfileInfoKHR profileInfo = {
    ...
};

VkVideoProfileListInfoKHR profileListInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_PROFILE_LIST_INFO_KHR,
    .pNext = NULL,
    .profileCount = 1,
    .pProfiles = &profileInfo
};

VkPhysicalDeviceVideoFormatInfoKHR formatInfo = {
    .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VIDEO_FORMAT_INFO_KHR,
    .pNext = &profileListInfo
};

VkVideoFormatPropertiesKHR* formatProps = NULL;
uint32_t formatCount = 0;

// First query encode input formats
formatInfo.imageUsage = VK_IMAGE_USAGE_VIDEO_ENCODE_SRC_BIT_KHR;

vkGetPhysicalDeviceVideoFormatPropertiesKHR(physicalDevice, &formatInfo, &formatCount, NULL);
formatProps = calloc(formatCount, sizeof(VkVideoFormatPropertiesKHR));
for (uint32_t i = 0; i < formatCount; ++i) {
    formatProps[i].sType = VK_STRUCTURE_TYPE_VIDEO_FORMAT_PROPERTIES_KHR;
}
vkGetPhysicalDeviceVideoFormatPropertiesKHR(physicalDevice, &formatInfo, &formatCount, formatProps);

for (uint32_t i = 0; i < formatCount; ++i) {
    // Select encode input format and image creation capabilities best suited for the use case
    ...
}
free(formatProps);

// Then query DPB formats
formatInfo.imageUsage = VK_IMAGE_USAGE_VIDEO_ENCODE_DPB_BIT_KHR;

vkGetPhysicalDeviceVideoFormatPropertiesKHR(physicalDevice, &formatInfo, &formatCount, NULL);
formatProps = calloc(formatCount, sizeof(VkVideoFormatPropertiesKHR));
for (uint32_t i = 0; i < formatCount; ++i) {
    formatProps[i].sType = VK_STRUCTURE_TYPE_VIDEO_FORMAT_PROPERTIES_KHR;
}
vkGetPhysicalDeviceVideoFormatPropertiesKHR(physicalDevice, &formatInfo, &formatCount, formatProps);

for (uint32_t i = 0; i < formatCount; ++i) {
    // Select DPB format and image creation capabilities best suited for the use case
    ...
}
free(formatProps);
----


=== Create bitstream buffer

[source,c]
----
VkBuffer bitstreamBuffer = VK_NULL_HANDLE;

VkVideoProfileListInfoKHR profileListInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_PROFILE_LIST_INFO_KHR,
    .pNext = NULL,
    .profileCount = ... // number of video profiles to use the bitstream buffer with
    .pProfiles = ... // pointer to an array of video profile information structure chains
};

VkBufferCreateInfo createInfo = {
    .sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO,
    .pNext = &profileListInfo,
    ...
    .usage = VK_BUFFER_USAGE_VIDEO_ENCODE_DST_BIT_KHR | ... // any other usages that may be needed
    ...
};

vkCreateBuffer(device, &createInfo, NULL, &bitstreamBuffer);
----


=== Create encode input image and image view

[source,c]
----
VkImage inputImage = VK_NULL_HANDLE;
VkImageView inputImageView = VK_NULL_HANDLE;

VkVideoProfileListInfoKHR profileListInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_PROFILE_LIST_INFO_KHR,
    .pNext = NULL,
    .profileCount = ... // number of video profiles to use the encode input image with
    .pProfiles = ... // pointer to an array of video profile information structure chains
};

VkImageCreateInfo imageCreateInfo = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO,
    .pNext = &profileListInfo,
    ...
    .usage = VK_IMAGE_USAGE_VIDEO_ENCODE_SRC_BIT_KHR | ... // any other usages that may be needed
    ...
};

vkCreateImage(device, &imageCreateInfo, NULL, &inputImage);

VkImageViewUsageCreateInfo imageViewUsageInfo = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_USAGE_CREATE_INFO,
    .pNext = NULL,
    .usage = VK_IMAGE_USAGE_VIDEO_ENCODE_SRC_BIT_KHR
};

VkImageViewCreateInfo imageViewCreateInfo = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO,
    .pNext = &imageViewUsageInfo,
    .flags = 0,
    .image = inputImage,
    .viewType = ... // image view type (only 2D or 2D_ARRAY is supported)
    ... // other image view creation parameters
};

vkCreateImageView(device, &imageViewCreateInfo, NULL, &inputImageView);
----


=== Create DPB image and image view

[source,c]
----
// NOTE: This example creates a single image and image view that is used to back all DPB pictures
// but, depending on the support of the VK_VIDEO_CAPABILITY_SEPARATE_REFERENCE_IMAGES_BIT_KHR
// capability flag, the application can choose to create separate images for each DPB slot or
// picture

VkImage dpbImage = VK_NULL_HANDLE;
VkImageView dpbImageView = VK_NULL_HANDLE;

VkVideoProfileListInfoKHR profileListInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_PROFILE_LIST_INFO_KHR,
    .pNext = NULL,
    .profileCount = ... // number of video profiles to use the encode DPB image with
    .pProfiles = ... // pointer to an array of video profile information structure chains
};

VkImageCreateInfo imageCreateInfo = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO,
    .pNext = &profileListInfo,
    ...
    .usage = VK_IMAGE_USAGE_VIDEO_ENCODE_DPB_BIT_KHR | ... // any other usages that may be needed
    ...
    .arrayLayers = ... // typically equal to the DPB slot count
};

vkCreateImage(device, &imageCreateInfo, NULL, &dpbImage);

VkImageViewUsageCreateInfo imageViewUsageInfo = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_USAGE_CREATE_INFO,
    .pNext = NULL,
    .usage = VK_IMAGE_USAGE_VIDEO_ENCODE_DPB_BIT_KHR
};

VkImageViewCreateInfo imageViewCreateInfo = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO,
    .pNext = &imageViewUsageInfo,
    .flags = 0,
    .image = dpbImage,
    .viewType = ... // image view type (only 2D or 2D_ARRAY is supported)
    ... // other image view creation parameters
};

vkCreateImageView(device, &imageViewCreateInfo, NULL, &dpbImageView);
----


=== Create and use video encode feedback query pool with a video session

[source,c]
----
VkQueryPool queryPool = VK_NULL_HANDLE;

VkVideoProfileInfoKHR profileInfo = {
    ...
};

// We will capture both bitstream offset and bitstream bytes written in the feedback
VkVideoEncodeFeedbackFlagsKHR capturedEncodeFeedbackValues =
    VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_BUFFER_OFFSET_BIT_KHR |
    VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_BYTES_WRITTEN_BIT_KHR;

// NOTE: Only the encode feedback values listed above are required to be supported by all
// video encode implementations. So if the application intends to use other encode
// feedback values like VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_HAS_OVERRIDES_BIT_KHR, then
// it must first check support for it as indicated by the supportedEncodeFeedbackFlags
// capability for the video encode profile in question.

VkQueryPoolVideoEncodeFeedbackCreateInfoKHR feedbackInfo = {
    .sType = VK_STRUCTURE_TYPE_QUERY_POOL_VIDEO_ENCODE_FEEDBACK_CREATE_INFO_KHR,
    .pNext = &profileInfo,
    .encodeFeedbackFlags = capturedEncodeFeedbackValues
};

VkQueryPoolCreateInfo createInfo = {
    .sType = VK_STRUCTURE_TYPE_QUERY_POOL_CREATE_INFO,
    .pNext = &feedbackInfo,
    .flags = 0,
    .queryType = VK_QUERY_TYPE_VIDEO_ENCODE_FEEDBACK_KHR,
    ...
};

vkCreateQueryPool(device, &createInfo, NULL, &queryPool);

...
vkBeginCommandBuffer(commandBuffer, ...);
...
vkCmdBeginVideoCodingKHR(commandBuffer, ...);
...
vkCmdBeginQuery(commandBuffer, queryPool, 0, 0);
// Issue video encode operation
...
vkCmdEndQuery(commandBuffer, queryPool, 0);
...
vkCmdEndVideoCodingKHR(commandBuffer, ...);
...
vkEndCommandBuffer(commandBuffer);
...

// We retrieve the captured feedback values as well as the status
struct {
    uint32_t bitstreamBufferOffset;
    uint32_t bitstreamBytesWritten;
    VkQueryResultStatusKHR status;
} results;
vkGetQueryPoolResults(device, queryPool, 0, 1,
                      sizeof(results), &results, sizeof(results),
                      VK_QUERY_RESULT_WITH_STATUS_BIT_KHR);

if (results.status == VK_QUERY_RESULT_STATUS_NOT_READY_KHR /* 0 */) {
    // Query result not ready yet
    ...
} else if (results.status > 0) {
    // Video encode operation was successful, we can use bitstream feedback data
    ...
} else if (results.status < 0) {
    // Video encode operation was unsuccessful, feedback data is undefined
    ...
}
----


=== Record encode operation (video session without DPB slots)

[source,c]
----
vkCmdBeginVideoCodingKHR(commandBuffer, ...);

VkVideoPictureResourceInfoKHR encodeInputPictureResource = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_PICTURE_RESOURCE_INFO_KHR,
    .pNext = NULL,
    .codedOffset = ... // offset within the image subresource (typically { 0, 0 })
    .codedExtent = ... // extent of encoded picture (typically the video frame size)
    .baseArrayLayer = 0,
    .imageViewBinding = inputImageView
};

VkVideoEncodeInfoKHR encodeInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_ENCODE_INFO_KHR,
    .pNext = ... // pointer to codec-specific picture information structure
    .flags = 0,
    .dstBuffer = bitstreamBuffer,
    .dstBufferOffset = ... // offset where the encoded bitstream is written
    .dstBufferRange = ... // maximum size in bytes of the written bitstream data
    .srcPictureResource = encodeInputPictureResource,
    .pSetupReferenceSlot = NULL,
    .referenceSlotCount = 0,
    .pReferenceSlots = NULL,
    .precedingExternallyEncodedBytes = ...
};

vkCmdEncodeVideoKHR(commandBuffer, &encodeInfo);

vkCmdEndVideoCodingKHR(commandBuffer, ...);
----


=== Record encode operation with reconstructed picture information

[source,c]
----
// Bound reference resource list provided has to include reconstructed picture resource
vkCmdBeginVideoCodingKHR(commandBuffer, ...);

VkVideoPictureResourceInfoKHR encodeInputPictureResource = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_PICTURE_RESOURCE_INFO_KHR,
    .pNext = NULL,
    .codedOffset = ... // offset within the image subresource (typically { 0, 0 })
    .codedExtent = ... // extent of encoded picture (typically the video frame size)
    .baseArrayLayer = 0,
    .imageViewBinding = inputImageView
};

VkVideoPictureResourceInfoKHR reconstructedPictureResource = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_PICTURE_RESOURCE_INFO_KHR,
    .pNext = NULL,
    .codedOffset = ... // offset within the image subresource (typically { 0, 0 })
    .codedExtent = ... // extent of reconstructed picture (typically the video frame size)
    .baseArrayLayer = ... // layer to use for setup picture in DPB
    .imageViewBinding = dpbImageView
};

VkVideoReferenceSlotInfoKHR setupSlotInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_REFERENCE_SLOT_INFO_KHR,
    .pNext = ... // pointer to codec-specific reconstructed picture information structure
    .slotIndex = ... // DPB slot index to use with the reconstructed picture
                     // (optionally activated per the codec-specific semantics)
    .pPictureResource = &reconstructedPictureResource
};

VkVideoEncodeInfoKHR encodeInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_ENCODE_INFO_KHR,
    .pNext = ... // pointer to codec-specific picture information structure
    ...
    .srcPictureResource = encodeInputPictureResource,
    .pSetupReferenceSlot = &setupSlotInfo,
    ...
957}; 958 959vkCmdEncodeVideoKHR(commandBuffer, &encodeInfo); 960 961vkCmdEndVideoCodingKHR(commandBuffer, ...); 962---- 963 964 965=== Record encode operation with reference picture list 966 967[source,c] 968---- 969// Bound reference resource list provided has to include all used reference picture resources 970vkCmdBeginVideoCodingKHR(commandBuffer, ...); 971 972VkVideoPictureResourceInfoKHR referencePictureResources[] = { 973 { 974 .sType = VK_STRUCTURE_TYPE_VIDEO_PICTURE_RESOURCE_INFO_KHR, 975 .pNext = NULL, 976 .codedOffset = ... // offset within the image subresource (typically { 0, 0 }) 977 .codedExtent = ... // extent of reference picture (typically the video frame size) 978 .baseArrayLayer = ... // layer of first reference picture resource 979 .imageViewBinding = dpbImageView 980 }, 981 { 982 .sType = VK_STRUCTURE_TYPE_VIDEO_PICTURE_RESOURCE_INFO_KHR, 983 .pNext = NULL, 984 .codedOffset = ... // offset within the image subresource (typically { 0, 0 }) 985 .codedExtent = ... // extent of reference picture (typically the video frame size) 986 .baseArrayLayer = ... // layer of second reference picture resource 987 .imageViewBinding = dpbImageView 988 }, 989 ... 990}; 991// NOTE: Individual resources do not have to refer to the same image view, e.g. if different 992// image views are created for each picture resource, or if the 993// VK_VIDEO_CAPABILITY_SEPARATE_REFERENCE_IMAGES_BIT_KHR capability is supported and the 994// application created separate images for the reference pictures. 995 996VkVideoReferenceSlotInfoKHR referenceSlotInfo[] = { 997 { 998 .sType = VK_STRUCTURE_TYPE_VIDEO_REFERENCE_SLOT_INFO_KHR, 999 .pNext = ... // pointer to codec-specific reference picture information structure 1000 .slotIndex = ... // DPB slot index of the first reference picture 1001 .pPictureResource = &referencePictureResource[0] 1002 }, 1003 { 1004 .sType = VK_STRUCTURE_TYPE_VIDEO_REFERENCE_SLOT_INFO_KHR, 1005 .pNext = ... 
        // pointer to codec-specific reference picture information structure
        .slotIndex = ... // DPB slot index of the second reference picture
        .pPictureResource = &referencePictureResources[1]
    },
    ...
};

VkVideoEncodeInfoKHR encodeInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_ENCODE_INFO_KHR,
    .pNext = ... // pointer to codec-specific picture information structure
    ...
    .referenceSlotCount = sizeof(referenceSlotInfo) / sizeof(referenceSlotInfo[0]),
    .pReferenceSlots = &referenceSlotInfo[0]
};

vkCmdEncodeVideoKHR(commandBuffer, &encodeInfo);

vkCmdEndVideoCodingKHR(commandBuffer, ...);
----


=== Encode codec-specific parameters stored in video session parameters objects

[source,c]
----
VkVideoEncodeSessionParametersGetInfoKHR getInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_ENCODE_SESSION_PARAMETERS_GET_INFO_KHR,
    .pNext = ... // pointer to any codec-specific parameters, if needed
    .videoSessionParameters = ... // video session parameters object to query
};

// VK_TRUE, if application prefers to encode the stored codec-specific parameters
// itself, if possible, VK_FALSE otherwise
VkBool32 preferApplicationParameterEncode = ...;

VkBool32 parametersContainOverrides = VK_FALSE;

if (preferApplicationParameterEncode) {
    VkVideoEncodeSessionParametersFeedbackInfoKHR feedbackInfo = {
        .sType = VK_STRUCTURE_TYPE_VIDEO_ENCODE_SESSION_PARAMETERS_FEEDBACK_INFO_KHR,
        .pNext = ... // pointer to any codec-specific feedback info, if needed
        .hasOverrides = VK_FALSE
    };

    // Query only the feedback information (no encoded data is requested here)
    vkGetEncodedVideoSessionParametersKHR(device, &getInfo, &feedbackInfo, NULL, NULL);

    parametersContainOverrides = feedbackInfo.hasOverrides;
}

if (preferApplicationParameterEncode && !parametersContainOverrides) {
    // Encode codec-specific parameters manually
    ...
} else {
    // Retrieve encoded codec-specific parameters from implementation
    size_t dataSize = 0;
    vkGetEncodedVideoSessionParametersKHR(device, &getInfo, NULL, &dataSize, NULL);

    // Pointer to CPU buffer with at least dataSize number of bytes of storage
    // (allocate it on demand or use an existing pool used for bitstream storage)
    void* data = ...;
    vkGetEncodedVideoSessionParametersKHR(device, &getInfo, NULL, &dataSize, data);
}
----


=== Change the rate control configuration of a video encode session

[source,c]
----
vkCmdBeginVideoCodingKHR(commandBuffer, ...);

VkVideoEncodeRateControlLayerInfoKHR rateControlLayers[] = {
    {
        .sType = VK_STRUCTURE_TYPE_VIDEO_ENCODE_RATE_CONTROL_LAYER_INFO_KHR,
        .pNext = ... // pointer to optional codec-specific rate control layer configuration
        .averageBitrate = 2000000, // 2 Mbps target bitrate
        .maxBitrate = 5000000, // 5 Mbps peak bitrate
        .frameRateNumerator = 30000, // 29.97 fps numerator
        .frameRateDenominator = 1001 // 29.97 fps denominator
    },
    ...
};

VkVideoEncodeRateControlInfoKHR rateControlInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_ENCODE_RATE_CONTROL_INFO_KHR,
    .pNext = ...
    // pointer to optional codec-specific rate control configuration
    .flags = 0,
    .rateControlMode = VK_VIDEO_ENCODE_RATE_CONTROL_MODE_VBR_BIT_KHR, // variable bitrate mode
    .layerCount = sizeof(rateControlLayers) / sizeof(rateControlLayers[0]),
    .pLayers = rateControlLayers,
    .virtualBufferSizeInMs = 2000, // virtual buffer size is 2 seconds
    .initialVirtualBufferSizeInMs = 0
};

// Change the rate control configuration for the video session
VkVideoCodingControlInfoKHR controlInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_CODING_CONTROL_INFO_KHR,
    .pNext = &rateControlInfo,
    .flags = VK_VIDEO_CODING_CONTROL_ENCODE_RATE_CONTROL_BIT_KHR
};

vkCmdControlVideoCodingKHR(commandBuffer, &controlInfo);

...

vkCmdEndVideoCodingKHR(commandBuffer, ...);
----


=== Change the video encode quality level used by a video encode session

[source,c]
----
vkCmdBeginVideoCodingKHR(commandBuffer, ...);

VkVideoEncodeQualityLevelInfoKHR qualityLevelInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_ENCODE_QUALITY_LEVEL_INFO_KHR,
    .pNext = NULL,
    .qualityLevel = ... // the new quality level to set
};

VkVideoCodingControlInfoKHR controlInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_CODING_CONTROL_INFO_KHR,
    .pNext = &qualityLevelInfo,
    .flags = VK_VIDEO_CODING_CONTROL_ENCODE_QUALITY_LEVEL_BIT_KHR
};

vkCmdControlVideoCodingKHR(commandBuffer, &controlInfo);

...

vkCmdEndVideoCodingKHR(commandBuffer, ...);
----


=== Initialize a video encode session with a specific quality level and corresponding recommended rate control settings

[source,c]
----
// Construct the video encode profile with appropriate usage scenario information
// We also include the optional encode usage information here
VkVideoEncodeUsageInfoKHR profileUsageInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_ENCODE_USAGE_INFO_KHR,
    .pNext = ... // pointer to codec-specific profile structure
    .videoUsageHints = ... // usage hints
    .videoContentHints = ... // content hints
    .tuningMode = ... // tuning mode
};

VkVideoProfileInfoKHR profileInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_PROFILE_INFO_KHR,
    .pNext = &profileUsageInfo,
    ...
};

// Query the video encode profile capabilities to determine maxQualityLevels
VkVideoEncodeCapabilitiesKHR encodeCapabilities = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_ENCODE_CAPABILITIES_KHR,
    .pNext = ...
    // pointer to codec-specific capability structure
};

VkVideoCapabilitiesKHR capabilities = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_CAPABILITIES_KHR,
    .pNext = &encodeCapabilities
};

result = vkGetPhysicalDeviceVideoCapabilitiesKHR(physicalDevice, &profileInfo, &capabilities);

// Select a quality level to use between 0 and maxQualityLevels-1
uint32_t selectedQualityLevel = selectQualityLevelFrom(0, encodeCapabilities.maxQualityLevels - 1);

// Query recommended settings for the selected video encode quality level
VkPhysicalDeviceVideoEncodeQualityLevelInfoKHR qualityLevelInfo = {
    .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VIDEO_ENCODE_QUALITY_LEVEL_INFO_KHR,
    .pNext = NULL,
    .pVideoProfile = &profileInfo,
    .qualityLevel = selectedQualityLevel
};

VkVideoEncodeQualityLevelPropertiesKHR qualityLevelProps = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_ENCODE_QUALITY_LEVEL_PROPERTIES_KHR,
    .pNext = ... // pointer to any codec-specific parameters, if needed
};

result = vkGetPhysicalDeviceVideoEncodeQualityLevelPropertiesKHR(physicalDevice, &qualityLevelInfo, &qualityLevelProps);

...

// Video session parameters are always created with respect to the used
// video encode quality level, so create one accordingly
VkVideoEncodeQualityLevelInfoKHR paramsQualityLevelInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_ENCODE_QUALITY_LEVEL_INFO_KHR,
    .pNext = ... // pointer to codec-specific parameters creation information
    .qualityLevel = selectedQualityLevel
};

VkVideoSessionParametersCreateInfoKHR paramsCreateInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_SESSION_PARAMETERS_CREATE_INFO_KHR,
    .pNext = &paramsQualityLevelInfo,
    ...
};

VkVideoSessionParametersKHR params = VK_NULL_HANDLE;
result = vkCreateVideoSessionParametersKHR(device, &paramsCreateInfo, NULL, &params);

...

vkCmdBeginVideoCodingKHR(commandBuffer, ...);

// Initialize the video session, set the quality level, and the
// recommended rate control configuration
// NOTE: The application can choose other rate control settings as the
// quality level properties only indicate preference, not a requirement

// Include rate control information
VkVideoEncodeRateControlInfoKHR rateControlInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_ENCODE_RATE_CONTROL_INFO_KHR,
    .pNext = ... // pointer to optional codec-specific rate control configuration
    .flags = 0,
    .rateControlMode = qualityLevelProps.preferredRateControlMode,
    .layerCount = qualityLevelProps.preferredRateControlLayerCount,
    ...
};

// Include quality level information
VkVideoEncodeQualityLevelInfoKHR encodeQualityLevelInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_ENCODE_QUALITY_LEVEL_INFO_KHR,
    .pNext = &rateControlInfo,
    .qualityLevel = selectedQualityLevel
};

// Include all of the RESET, ENCODE_QUALITY_LEVEL, and RATE_CONTROL bits
// because in this example we do an initialization followed by an immediate
// update to the quality level and rate control states
VkVideoCodingControlInfoKHR controlInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_CODING_CONTROL_INFO_KHR,
    .pNext = &encodeQualityLevelInfo,
    .flags = VK_VIDEO_CODING_CONTROL_RESET_BIT_KHR
           | VK_VIDEO_CODING_CONTROL_ENCODE_QUALITY_LEVEL_BIT_KHR
           | VK_VIDEO_CODING_CONTROL_ENCODE_RATE_CONTROL_BIT_KHR
};

vkCmdControlVideoCodingKHR(commandBuffer, &controlInfo);

...

vkCmdEndVideoCodingKHR(commandBuffer, ...);
----


== Issues

=== RESOLVED: Why is there no `VK_PIPELINE_STAGE_VIDEO_ENCODE_BIT_KHR`?

This extension requires the `VK_KHR_synchronization2` extension because the new access flags introduced did not fit in the 32-bit enum `VkAccessFlagBits`.
Accordingly, all new pipeline stage and access flags have been added to the corresponding 64-bit enums and no new flags have been added to the legacy 32-bit enums. While the new pipeline stage flag introduced uses bit #27, which would also fit in the legacy `VkPipelineStageFlagBits` enum, there is no real benefit to including it. Instead, the bit is marked reserved.


=== RESOLVED: How can layered codec-specific encode extensions enable applications to provide the necessary codec-specific picture information, parameter sets, etc. that may be needed to perform the video coding operations?

There are multiple points where codec-specific picture information can be provided to a video encode operation. This extension suggests the following convention:

  * Codec-specific encode parameters are expected to be provided in the `pNext` chain of `VkVideoEncodeInfoKHR`.
  * Codec-specific reconstructed picture information is expected to be provided in the `pNext` chain of `VkVideoEncodeInfoKHR::pSetupReferenceSlot`.
  * Codec-specific reference picture information is expected to be provided in the `pNext` chain of the elements of the `VkVideoEncodeInfoKHR::pReferenceSlots` array.


=== RESOLVED: Can `vkCmdEncodeVideoKHR` only encode frames? What about field encoding, slice encoding, etc.?

This extension does not define the types of pictures or sub-picture content that can be encoded by a `vkCmdEncodeVideoKHR` command. It is expected that the codec-specific encode extensions built upon this extension define the types of pictures that can be encoded. Furthermore, both codec-specific and codec-independent extensions can expand the set of capabilities introduced here to enable more advanced use cases, as needed.


=== RESOLVED: What is the effect of the flags provided in `VkVideoEncodeUsageInfoKHR::videoUsageHints` and `VkVideoEncodeUsageInfoKHR::videoContentHints`?

There are no specific behavioral effects associated with any of the video encode usage and content hints, so the application can specify any combination of these flags. They are included to enable the application to better communicate the intended use case scenario to the implementation.

However, just like any other additional video profile information included in the `pNext` chain of `VkVideoProfileInfoKHR` structures, they are part of the video profile definition. Hence, whenever matching video profiles have to be provided to an API call, be that a query or a resource creation structure, the application must provide identical video encode usage and content hint values. This also applies if the application does not include the `VkVideoEncodeUsageInfoKHR` structure, which is treated equivalently to specifying the structure with `videoUsageHints`, `videoContentHints`, and `tuningMode` equal to `VK_VIDEO_ENCODE_USAGE_DEFAULT_KHR`, `VK_VIDEO_ENCODE_CONTENT_DEFAULT_KHR`, and `VK_VIDEO_ENCODE_TUNING_MODE_DEFAULT_KHR` (or zero), respectively, per the usual conventions of Vulkan.


=== RESOLVED: What is the effect of the tuning mode provided in `VkVideoEncodeUsageInfoKHR::tuningMode`?

Unlike the other fields in `VkVideoEncodeUsageInfoKHR`, the tuning mode affects the behavior of video session objects created with a video profile that specifies it. Different tuning modes may put the hardware in a different mode of operation tuned for the particular use case, with significantly different capabilities as well as quality and performance characteristics.


=== RESOLVED: How should we expose video encoding feedback values (e.g. encoded bitstream size)?

Through a new query type.
We follow the model of pipeline statistics queries so that additional feedback values can be added to the query later. Accordingly, this extension introduces a new `VK_QUERY_TYPE_VIDEO_ENCODE_FEEDBACK_KHR` query type that can report the offset and size of the bitstream data produced by video encode operations (amongst other feedback values). We expect that video decode operations will need to support similar feedback values in the future, in which case a similar query type for video decode operations can be introduced by another extension.


=== RESOLVED: Do result status queries need to be used in conjunction with video encode feedback queries?

No. In fact, only a single query can ever be active within a video coding scope, hence executing a result status query as well as a video encode feedback query for the same video encode operation is not possible. It is also not needed, as all query types allow returning a result status, just like an availability status. Thus, in practice, result status queries only need to be used when no other query type is supported in the particular context, and video encoding applications are expected to only use video encode feedback queries within a video coding scope.


=== RESOLVED: Why is there a need to allow implementations to override codec-specific parameters?

As described in the corresponding section earlier, encoder implementations usually only support a subset of the available encoding tools defined by the corresponding video compression standards. Exhaustively enumerating all of these constraints would be impractical and could result in a combinatorial explosion of codec-specific capabilities. Instead, this proposal allows implementations to override any codec-specific parameter values or combinations thereof, so that the resulting parameters comply with the constraints of the target implementation.

Some other video encode APIs do not support implementation overrides. The drawback of that choice is that implementations may not be able to expose a potentially large set of their encoding tools just because they do not comply with the exact wording of the capabilities defined by these APIs, so this proposal chose to maximize the exposed capabilities instead.

Such minimal and necessary implementation overrides are expected to be applied only when they are absolutely necessary for the correct functioning of the underlying encoder hardware. Additional, optimizing overrides can, however, be explicitly enabled by the application using the `VK_VIDEO_SESSION_CREATE_ALLOW_ENCODE_PARAMETER_OPTIMIZATIONS_BIT_KHR` video session creation flag.


=== RESOLVED: Can the application disable all implementation overrides?

No. Without the ability to override codec-specific parameters, as necessitated by the constraints of the target implementation, the implementation may not be able to guarantee that the generated bitstreams will be compliant with the video compression standard in question.

Accordingly, if the API allowed the application to disable all implementation overrides, that would, for all practical purposes, be equivalent to a flag enabling undefined behavior from the perspective of video compression standard compliance.

For the same reason, if the application chooses to encode the codec-specific parameters stored in a video session parameters object on its own, regardless of whether the implementation had to apply overrides to them (as reported by `vkGetEncodedVideoSessionParametersKHR`), it risks producing a non-compliant final bitstream.

Applications seeking to only accept bitstreams produced exactly according to the codec-specific parameters they requested can choose to treat the presence of any overrides as an encoding error.
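
The choices discussed in this issue can be sketched as a small decision function. This is only an illustration under stated assumptions: `ParameterSource`, `chooseParameterSource`, and the `rejectOverrides` policy flag are hypothetical names that are not part of the Vulkan API, and plain `bool` stands in for `VkBool32` so that the snippet does not depend on the Vulkan headers. The `hasOverrides` input is meant to carry the value reported in `VkVideoEncodeSessionParametersFeedbackInfoKHR::hasOverrides`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical application-side result; not a Vulkan type. */
typedef enum ParameterSource {
    PARAMETER_SOURCE_APPLICATION,    /* application encodes the parameters itself */
    PARAMETER_SOURCE_IMPLEMENTATION, /* use the data returned by the implementation */
    PARAMETER_SOURCE_REJECTED        /* strict application: overrides are an error */
} ParameterSource;

/* Decide where the encoded codec-specific parameters should come from.
 * hasOverrides:    value reported by the implementation for the parameters object
 * rejectOverrides: hypothetical policy flag of a strict application */
ParameterSource chooseParameterSource(bool preferApplicationEncode,
                                      bool hasOverrides,
                                      bool rejectOverrides)
{
    if (hasOverrides && rejectOverrides) {
        return PARAMETER_SOURCE_REJECTED;
    }
    if (preferApplicationEncode && !hasOverrides) {
        return PARAMETER_SOURCE_APPLICATION;
    }
    /* Either the application has no preference, or overrides were applied and
     * self-encoding would risk a non-compliant bitstream. */
    return PARAMETER_SOURCE_IMPLEMENTATION;
}

/* Usage sketch: expected outcome for a few input combinations. */
void chooseParameterSourceExample(void)
{
    assert(chooseParameterSource(true, false, false) == PARAMETER_SOURCE_APPLICATION);
    assert(chooseParameterSource(true, true, false) == PARAMETER_SOURCE_IMPLEMENTATION);
    assert(chooseParameterSource(true, true, true) == PARAMETER_SOURCE_REJECTED);
}
```

The fallback to implementation-encoded parameters whenever overrides are present mirrors the reasoning above; an application may of course choose a different policy.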


=== RESOLVED: Can implementations override any codec-specific parameter?

No. First, there is a set of rules that implementations have to comply with when applying any parameter overrides, as defined in detail in the specification. In addition, codec-specific extensions layered on top of this proposal can define their own restrictions on which parameters implementations can override. In practice, it is expected that certain codec-specific parameters that affect the overall behavior of the encoder, and that could have an impact on any additional bitstream elements that need to be encoded by the application, will never be overridden by the implementation, and thus will be excluded from the set of overridable parameters in the corresponding codec-specific extension.

Over time, it is expected that the set of these guarantees will grow (e.g. by exposing additional capabilities) according to the needs of encoder applications.


=== RESOLVED: Do all implementations have to implement the same rate control algorithms corresponding to the rate control modes defined by this proposal?

No. While the high-level rate control modes (CBR and VBR) defined by this proposal are fairly universal, each rate control mode can be implemented in many different ways while still complying with the fundamental model of the mode itself. In practice, the rate control algorithms employed by implementations differ significantly.

Accordingly, this proposal does not try to describe any specific rate control algorithm for any of the rate control modes introduced. Rather, it provides a high-level description of the modes and the underlying leaky bucket model used by them.
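
To make the leaky bucket idea more concrete, the following is a minimal, non-normative sketch of one common formulation: each encoded frame adds its size in bits to a virtual buffer, and the buffer drains at the average bitrate over one frame interval. It is not the algorithm of any particular implementation. `LeakyBucket`, `leakyBucketInit`, and `leakyBucketAddFrame` are hypothetical names; the parameters merely borrow the names of the `averageBitrate`, `virtualBufferSizeInMs`, `initialVirtualBufferSizeInMs`, `frameRateNumerator`, and `frameRateDenominator` fields used in the rate control example earlier in this document.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type for illustration; not part of the Vulkan API. */
typedef struct LeakyBucket {
    uint64_t capacityBits;      /* size of the virtual buffer in bits */
    uint64_t fullnessBits;      /* bits currently held in the virtual buffer */
    uint64_t drainPerFrameBits; /* bits drained during one frame interval */
} LeakyBucket;

/* Derive the bucket state from rate control style parameters.
 * averageBitrate is in bits per second, buffer sizes are in milliseconds. */
LeakyBucket leakyBucketInit(uint64_t averageBitrate,
                            uint32_t virtualBufferSizeInMs,
                            uint32_t initialVirtualBufferSizeInMs,
                            uint32_t frameRateNumerator,
                            uint32_t frameRateDenominator)
{
    assert(frameRateNumerator > 0 && frameRateDenominator > 0);
    assert(initialVirtualBufferSizeInMs <= virtualBufferSizeInMs);

    LeakyBucket bucket;
    bucket.capacityBits = averageBitrate * virtualBufferSizeInMs / 1000;
    bucket.fullnessBits = averageBitrate * initialVirtualBufferSizeInMs / 1000;
    /* One frame lasts frameRateDenominator / frameRateNumerator seconds. */
    bucket.drainPerFrameBits = averageBitrate * frameRateDenominator / frameRateNumerator;
    return bucket;
}

/* Add one encoded frame, then drain one frame interval worth of bits.
 * Returns the number of bits by which the frame overflowed the buffer
 * (0 if it fit), which a rate controller would try to avoid. */
uint64_t leakyBucketAddFrame(LeakyBucket *bucket, uint64_t frameBits)
{
    uint64_t overflowBits = 0;

    bucket->fullnessBits += frameBits;
    if (bucket->fullnessBits > bucket->capacityBits) {
        overflowBits = bucket->fullnessBits - bucket->capacityBits;
        bucket->fullnessBits = bucket->capacityBits;
    }

    if (bucket->fullnessBits > bucket->drainPerFrameBits) {
        bucket->fullnessBits -= bucket->drainPerFrameBits;
    } else {
        bucket->fullnessBits = 0;
    }
    return overflowBits;
}
```

With a 2 Mbps average bitrate and a 2000 ms virtual buffer, the capacity in this sketch is 4,000,000 bits; a real encoder would use the fullness to steer quantization rather than merely report overflow.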

The only case where the effects of rate control are defined exactly is when rate control is disabled (using `VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR`), where implementations must encode the pictures exactly per the application-specified codec-specific quantization parameters.


=== RESOLVED: Do rate control implementations guarantee to respect the average/max bitrates, or frame sizes configured for the video session?

Unfortunately, implementations cannot provide hard guarantees about always respecting these rate control parameters. The ability to conform to them is affected by the input content (including the contents of future pictures, which implementations cannot predict) and by the encoding tools of the video compression standard or the implementation.

However, for all practical purposes, these rate control parameters are expected to be respected when the application chooses them in a way that is in line with the encoded content and the characteristics of the used video compression standard.


=== RESOLVED: Are video session parameters objects dependent on the used video encode quality level?

Some implementations may support different hardware modes that are enabled in response to the used video encode quality level. This may also have an effect on the constraints related to the available encoding tools and as such may also affect the necessary codec-specific parameter overrides the implementation has to apply.
As video session parameters objects are expected to store the already overridden codec-specific parameters, typically in an encoded or otherwise optimized format, using a video session parameters object with any video encode quality level would require implementations to also store the original parameters in order to be able to re-encode them according to the needs of the target video encode quality level, which would partially defeat the purpose of video session parameters objects.

Instead, this proposal defines video session parameters objects to be created with respect to a specific video encode quality level (when using a video encode profile), and applications have to make sure that they use a compatible video session parameters object in their encode commands according to the current quality level state of the video session.

In practice, this should not have any effect on most encoder applications, as they usually use a single video encode quality level throughout the lifetime of the video session. The additional complexity resulting from this specialization will only affect advanced applications that may need to operate using different video encode quality levels within a single video stream.


=== RESOLVED: Are video encode quality levels and rate control mutually exclusive?

No, they are completely orthogonal, as they control different aspects of the encoder, and both are in effect at all times. There is always a currently active video encode quality level and rate control state, which default to quality level zero and an implementation-specific rate control state, respectively, when the video encode session is initialized. The used video encode quality level and the rate control settings can be updated subsequently, potentially independently, or together with initialization per the application's needs.
The only relation between video encode quality levels and rate control is that the application can query, for each video encode profile and video encode quality level, the implementation-recommended settings (using `vkGetPhysicalDeviceVideoEncodeQualityLevelPropertiesKHR`) that are best suited for the selected quality level and the usage scenario information included in the video encode profile. These include recommendations on the rate control mode to use, amongst other codec-independent and codec-specific suggestions. Nonetheless, these are only recommendations and the application can diverge from them if deemed necessary.


=== RESOLVED: Does specifying `VkVideoEncodeRateControlInfoKHR` in the `pNext` chain of the `pBeginCodingInfo` parameter of `vkCmdBeginVideoCodingKHR` change the current rate control configuration?

No. The rate control information specified to `vkCmdBeginVideoCodingKHR` does not change the state of the video session. It is only expected to specify the current rate control configuration (previously already set through the execution of an appropriate `vkCmdControlVideoCodingKHR` command). This information is needed by some implementations in order to be aware of the current rate control configuration of the video session while recording commands, as some of the rate control state may affect the recorded device commands. When this information is not specified, the implementation will assume that the current rate control mode is set to `VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DEFAULT_KHR`.

The validation layers are expected to detect at command buffer submission time if there is any mismatch between the expected rate control configuration specified to the `vkCmdBeginVideoCodingKHR` command and the actual rate control configuration of the video session at the time the video coding scope is started on the device timeline.
If these two sets of state do not match, then the behavior of the implementation is undefined and may result in any sort of misbehavior permitted by the Vulkan specification when valid usage conditions are not met. Accordingly, applications have to make sure to track and specify the expected rate control configuration at the beginning of every video coding scope performing video encode operations in order to attain correct encoder behavior.


=== RESOLVED: When is it mandatory to specify reconstructed picture information in `VkVideoEncodeInfoKHR::pSetupReferenceSlot`?

In line with the `VK_KHR_video_decode_queue` extension, due to foreseeable implementation limitations that may require the presence of a reconstructed picture resource and/or DPB slot for encoding, revision 12 of this extension changed the requirements on reconstructed picture information as follows:

  1. Specifying reconstructed picture information (i.e. a non-`NULL` `pSetupReferenceSlot`) is made mandatory for all cases except when the video session was created with no DPB slots
  2. Reference picture setup (and, inherently, DPB slot activation) was changed to be subject to codec-specific behavior, meaning that specifying a non-`NULL` `pSetupReferenceSlot` will only trigger reference picture setup if the appropriate codec-specific parameters or semantics indicate so (typically in the form of marking the encoded picture as reference)

As some implementations may use the reconstructed picture resource and/or DPB slot as transient storage during the encoding process, if a non-`NULL` `pSetupReferenceSlot` is specified but no reference picture setup is requested, then the contents of the reconstructed picture resource become undefined and some of the picture references associated with the reconstructed picture's DPB slot may get invalidated.
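
The two rules above, together with the note on transient storage, can be summarized by the following sketch. It is only an illustration: `SetupOutcome`, `setupSlotRequired`, and `classifySetup` are hypothetical application-side helpers that are not part of the Vulkan API, `maxDpbSlots` is assumed to be the value the video session was created with, and plain `bool` is used instead of Vulkan types to keep the snippet free of header dependencies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical classification of what an encode operation does to the
 * reconstructed picture resource; not a Vulkan type. */
typedef enum SetupOutcome {
    SETUP_OUTCOME_NONE,            /* no setup slot specified, no reference picture setup */
    SETUP_OUTCOME_REFERENCE_SETUP, /* reference picture setup, DPB slot is activated */
    SETUP_OUTCOME_UNDEFINED        /* resource may be used as scratch, contents undefined */
} SetupOutcome;

/* Rule 1: a non-NULL pSetupReferenceSlot is mandatory unless the video
 * session was created with no DPB slots. */
bool setupSlotRequired(uint32_t maxDpbSlots)
{
    return maxDpbSlots > 0;
}

/* Rule 2: reference picture setup only happens when the codec-specific
 * parameters request it (e.g. the picture is marked as a reference). */
SetupOutcome classifySetup(bool hasSetupSlot, bool codecRequestsReferenceSetup)
{
    if (!hasSetupSlot) {
        return SETUP_OUTCOME_NONE;
    }
    return codecRequestsReferenceSetup ? SETUP_OUTCOME_REFERENCE_SETUP
                                       : SETUP_OUTCOME_UNDEFINED;
}

/* Usage sketch: expected outcome for a few input combinations. */
void classifySetupExample(void)
{
    assert(!setupSlotRequired(0));
    assert(setupSlotRequired(4));
    assert(classifySetup(true, true) == SETUP_OUTCOME_REFERENCE_SETUP);
    assert(classifySetup(true, false) == SETUP_OUTCOME_UNDEFINED);
}
```

An application could use such a classification to decide whether the contents of a DPB slot can still be relied upon after an encode operation that did not request reference picture setup.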


== Further Functionality

This extension is meant to provide only common video encode functionality, thus support for individual video encode profiles using specific video compression standards is left for extensions layered on top of the infrastructure provided here.

Currently the following layered extensions are available:

  * `VK_KHR_video_encode_h264` - adds support for encoding H.264/AVC video sequences
  * `VK_KHR_video_encode_h265` - adds support for encoding H.265/HEVC video sequences