// Copyright 2021-2024 The Khronos Group Inc.
//
// SPDX-License-Identifier: CC-BY-4.0

= VK_KHR_video_encode_queue
:toc: left
:refpage: https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/
:sectnums:

This document outlines a proposal to enable performing video encode operations in Vulkan.

== Problem Statement

Integrating video encode operations into Vulkan applications enables a wide set of new usage scenarios including, but not limited to, the following examples:

  * Recording the output of rendering operations
  * Efficiently transferring rendering results over a network (video conferencing, game streaming, etc.)

It is also not uncommon for Vulkan-capable devices to feature dedicated hardware acceleration for video compression.

The goal of this proposal is to enable these use cases, expose the underlying hardware capabilities, and provide tight integration with other functionalities of the Vulkan API.


== Solution Space

The following options have been considered:

  1. Rely on external sharing capabilities to interact with existing video encode APIs
  2. Add new dedicated APIs to Vulkan specific to video encoding
  3. Build upon a common set of APIs that enable video coding operations in general

As discussed in the proposal for the `VK_KHR_video_queue` extension, reusing a common, shared infrastructure across all video coding functionalities that leverage existing Vulkan capabilities was preferred, hence this extension follows option 3.

Further sub-options were considered regarding whether a common set of APIs could be used to enable video encoding in general, upon which codec-specific extensions can be built. As API reuse is just as feasible within the domain of video encoding as it is for video coding in general, this proposal follows the same principle and extends `VK_KHR_video_queue` with codec-independent video encoding capabilities.


== Proposal

=== Video Encode Queues

While `VK_KHR_video_queue` already includes support for a more fine-grained query to determine the set of supported video codec operations for a given queue family, this extension introduces an explicit queue flag called `VK_QUEUE_VIDEO_ENCODE_BIT_KHR` to indicate support for video encoding.

Applications can use this flag bit to identify video encode capable queue families in general, if needed, before querying more details about the individual video codec operations supported through the use of the `VkQueueFamilyVideoPropertiesKHR` structure. It also indicates support for the set of command buffer commands available on video encode queues, which include the following:

  * Pipeline barrier and event handling commands used for synchronization
  * Basic query commands to begin, end, and reset queries
  * Timestamp write commands
  * Generic video coding commands
  * The new video encode command introduced by this extension

For the full list of individual commands supported by video encode queues, and whether any command is supported inside/outside of video coding scopes, refer to the manual page of the corresponding command.
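
As a minimal sketch of the queue family selection described above, the following helper scans an array of queue flag masks, as an application might copy them from `VkQueueFamilyProperties::queueFlags`, for the video encode bit. The helper name and the `EXAMPLE_` constants are hypothetical stand-ins so that the snippet compiles without the Vulkan headers; the bit value is assumed to mirror `VK_QUEUE_VIDEO_ENCODE_BIT_KHR`, and real code would use the Vulkan definitions directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in assumed to mirror VK_QUEUE_VIDEO_ENCODE_BIT_KHR. */
#define EXAMPLE_QUEUE_VIDEO_ENCODE_BIT 0x00000040u
/* Hypothetical sentinel meaning "no suitable queue family found". */
#define EXAMPLE_NO_QUEUE_FAMILY UINT32_MAX

/* Returns the index of the first queue family whose flags contain the
 * video encode bit, or EXAMPLE_NO_QUEUE_FAMILY if none does. */
static uint32_t find_video_encode_queue_family(const uint32_t *queueFlags,
                                               uint32_t familyCount)
{
    assert(queueFlags != NULL || familyCount == 0);
    for (uint32_t i = 0; i < familyCount; ++i) {
        if (queueFlags[i] & EXAMPLE_QUEUE_VIDEO_ENCODE_BIT) {
            return i;
        }
    }
    return EXAMPLE_NO_QUEUE_FAMILY;
}
```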


=== Video Encode Profiles

Video encode profiles are defined using a `VkVideoProfileInfoKHR` structure that specifies a `videoCodecOperation` value identifying a video encode operation. This extension does not introduce any video encode operation flags, as that is left to the codec-specific encode extensions.

On the other hand, this extension allows the application to specify usage information specific to video encoding by chaining the following new structure to `VkVideoProfileInfoKHR`:

[source,c]
----
typedef struct VkVideoEncodeUsageInfoKHR {
    VkStructureType               sType;
    const void*                   pNext;
    VkVideoEncodeUsageFlagsKHR    videoUsageHints;
    VkVideoEncodeContentFlagsKHR  videoContentHints;
    VkVideoEncodeTuningModeKHR    tuningMode;
} VkVideoEncodeUsageInfoKHR;
----

This structure contains two hints specific to the encoding use case and the content to be encoded, respectively, as well as a tuning mode.

The usage hint flags introduced by this extension are as follows:

  * `VK_VIDEO_ENCODE_USAGE_TRANSCODING_BIT_KHR` should be used in video transcoding use cases
  * `VK_VIDEO_ENCODE_USAGE_STREAMING_BIT_KHR` should be used when encoding video content streamed over a network
  * `VK_VIDEO_ENCODE_USAGE_RECORDING_BIT_KHR` should be used when recording in real time for offline consumption
  * `VK_VIDEO_ENCODE_USAGE_CONFERENCING_BIT_KHR` should be used for video conferencing use cases

The content hint flags introduced are as follows:

  * `VK_VIDEO_ENCODE_CONTENT_CAMERA_BIT_KHR` should be used when encoding images captured using a camera
  * `VK_VIDEO_ENCODE_CONTENT_DESKTOP_BIT_KHR` should be used when encoding desktop screen captures
  * `VK_VIDEO_ENCODE_CONTENT_RENDERED_BIT_KHR` should be used when encoding rendered (e.g. game) content

These hints do not impose any restrictions or provide any guarantees, so any combination of flags can be used, but they allow the application to better communicate the intended use case scenario so that implementations can make appropriate choices based on it.

Logically, however, the usage information is part of the video profile definition, so capabilities may vary across video encode profiles that differ only in their video encode usage hints. It also affects video profile compatibility between resources and video sessions, so the same `VkVideoEncodeUsageInfoKHR` structure has to be included everywhere the specific video encode profile is used. The contemporary extension `VK_KHR_video_maintenance1`, however, does allow creating buffer and image resources that are compatible with multiple video profiles when they are created with the `VK_BUFFER_CREATE_VIDEO_PROFILE_INDEPENDENT_BIT_KHR` or `VK_IMAGE_CREATE_VIDEO_PROFILE_INDEPENDENT_BIT_KHR` flags, respectively, introduced by that extension.

Unlike the hints, `tuningMode` is an explicit mode setting parameter that has functional implications and is expected to limit encoding capabilities to fit the usage scenario. The following tuning mode values are introduced by this extension:

  * `VK_VIDEO_ENCODE_TUNING_MODE_DEFAULT_KHR` is the default tuning mode
  * `VK_VIDEO_ENCODE_TUNING_MODE_HIGH_QUALITY_KHR` tunes encoding for high quality and will likely impose latency and performance compromises
  * `VK_VIDEO_ENCODE_TUNING_MODE_LOW_LATENCY_KHR` tunes encoding for low latency and will likely impose quality compromises for better performance
  * `VK_VIDEO_ENCODE_TUNING_MODE_ULTRA_LOW_LATENCY_KHR` tunes encoding for ultra-low latency with further quality compromises for maximum performance
  * `VK_VIDEO_ENCODE_TUNING_MODE_LOSSLESS_KHR` tunes encoding to produce lossless output

In practice, not all codecs and profiles will support every tuning mode. The new query command `vkGetPhysicalDeviceVideoEncodeQualityLevelPropertiesKHR`, as described later, may also return different recommended configuration parameters based on the tuning mode specified in the video profile in order to further aid application developers in choosing the most suitable settings for the encoding scenario at hand.


=== New Pipeline Stage and Access Flags

This extension also introduces a new pipeline stage identified by the `VK_PIPELINE_STAGE_2_VIDEO_ENCODE_BIT_KHR` flag to enable synchronizing video encode operations with respect to other Vulkan operations.

In addition, two new access flags are introduced to indicate reads and writes, respectively, performed by the video encode pipeline stage:

  * `VK_ACCESS_2_VIDEO_ENCODE_READ_BIT_KHR`
  * `VK_ACCESS_2_VIDEO_ENCODE_WRITE_BIT_KHR`

As these flags no longer fit into the legacy 32-bit enums, this extension requires the `VK_KHR_synchronization2` extension and relies on the 64-bit versions of the pipeline stage and access mask flags to handle synchronization specific to video encode operations.


=== New Buffer and Image Usage Flags

This extension introduces the following new buffer usage flags:

  * `VK_BUFFER_USAGE_VIDEO_ENCODE_SRC_BIT_KHR` is reserved for future use
  * `VK_BUFFER_USAGE_VIDEO_ENCODE_DST_BIT_KHR` allows using the buffer as a video bitstream buffer in video encode operations

This extension also introduces the following new image usage flags:

  * `VK_IMAGE_USAGE_VIDEO_ENCODE_SRC_BIT_KHR` allows using the image as an encode input picture
  * `VK_IMAGE_USAGE_VIDEO_ENCODE_DST_BIT_KHR` is reserved for future use
  * `VK_IMAGE_USAGE_VIDEO_ENCODE_DPB_BIT_KHR` allows using the image as an encode DPB picture (reconstructed/reference picture)

Specifying these usage flags alone is not sufficient to create a buffer or image that is compatible with a video session created against any particular video profile. In fact, when specifying any of these usage flags at resource creation time, the application has to include a `VkVideoProfileListInfoKHR` structure in the `pNext` chain of the corresponding create info structure with `VkVideoProfileListInfoKHR::pProfiles` including a video encode profile. The created resources will be compatible only with the included video encode profiles (and a video decode profile, if one is also specified in the list).


=== New Format Feature Flags

To indicate which formats are compatible with video encode usage, the following new format feature flags are introduced:

  * `VK_FORMAT_FEATURE_VIDEO_ENCODE_INPUT_BIT_KHR` indicates support for encode input picture usage
  * `VK_FORMAT_FEATURE_VIDEO_ENCODE_DPB_BIT_KHR` indicates support for encode DPB picture usage

The presence of the format flags alone, as returned by the various format queries, is not sufficient to indicate that an image with that format is usable with video encoding using any particular video encode profile. Actual compatibility with a specific video encode profile has to be verified using the `vkGetPhysicalDeviceVideoFormatPropertiesKHR` command.


=== Basic Operation

Video encode operations can be recorded into command buffers allocated from command pools created against queue families that support the `VK_QUEUE_VIDEO_ENCODE_BIT_KHR` flag.

Recording video encode operations happens through the use of the following new command:

[source,c]
----
VKAPI_ATTR void VKAPI_CALL vkCmdEncodeVideoKHR(
    VkCommandBuffer                             commandBuffer,
    const VkVideoEncodeInfoKHR*                 pEncodeInfo);
----

The common, codec-independent parameters of the video encode operation are provided using the following new structure:

[source,c]
----
typedef struct VkVideoEncodeInfoKHR {
    VkStructureType                       sType;
    const void*                           pNext;
    VkVideoEncodeFlagsKHR                 flags;
    VkBuffer                              dstBuffer;
    VkDeviceSize                          dstBufferOffset;
    VkDeviceSize                          dstBufferRange;
    VkVideoPictureResourceInfoKHR         srcPictureResource;
    const VkVideoReferenceSlotInfoKHR*    pSetupReferenceSlot;
    uint32_t                              referenceSlotCount;
    const VkVideoReferenceSlotInfoKHR*    pReferenceSlots;
    uint32_t                              precedingExternallyEncodedBytes;
} VkVideoEncodeInfoKHR;
----

Executing such a video encode operation results in the compression of a single picture (unless otherwise defined by layered extensions), and, if there is an active `VK_QUERY_TYPE_RESULT_STATUS_ONLY_KHR` query, the status of the video encode operation is recorded into the active query slot.

In addition to `VK_QUERY_TYPE_RESULT_STATUS_ONLY_KHR` queries, applications can use the new `VK_QUERY_TYPE_VIDEO_ENCODE_FEEDBACK_KHR` queries to retrieve additional feedback about the encoded picture including the offset and size of the bitstream written to the specified video bitstream buffer range, as discussed later.

If the encode operation requires additional codec-specific parameters, then such parameters are provided in the `pNext` chain of the structure above. Whether such codec-specific information is necessary, and what it may contain is up to the codec-specific extensions.

`dstBuffer`, `dstBufferOffset`, and `dstBufferRange` provide information about the target video bitstream buffer range. The video encode operation writes the compressed picture data to this buffer range.

The application has to create the video bitstream buffer with the new `VK_BUFFER_USAGE_VIDEO_ENCODE_DST_BIT_KHR` usage flag, and must also include the used video session's video profile in the `VkVideoProfileListInfoKHR` structure specified at buffer creation time.

The data written to the video bitstream buffer range depends on the specific video codec used, as defined by corresponding codec-specific extensions built upon this proposal.

The `srcPictureResource`, `pSetupReferenceSlot`, and `pReferenceSlots` members specify the encode input picture, reconstructed picture, and reference pictures, respectively, used by the video encode operation, as discussed in later sections of this proposal.

The `precedingExternallyEncodedBytes` member specifies the number of bytes externally encoded into the bitstream by the application. This value is used to update the implementation's rate control algorithm for the rate control layer this encode operation belongs to, by accounting for the bitrate budget consumed by these externally encoded bytes. This parameter is respected by the implementation only if the `VK_VIDEO_ENCODE_CAPABILITY_PRECEDING_EXTERNALLY_ENCODED_BYTES_BIT_KHR` capability is supported.


=== Encode Input Picture

`srcPictureResource` defines the parameters of the video picture resource to use as the encode input picture. The video encode operation reads the picture data to compress from this video picture resource. As such it is a mandatory parameter of the operation.

The application has to create the image view specified in `srcPictureResource.imageViewBinding` with the new `VK_IMAGE_USAGE_VIDEO_ENCODE_SRC_BIT_KHR` usage flag, and must also include the used video session's video profile in the `VkVideoProfileListInfoKHR` structure specified at image creation time.

The image subresource backing the encode input picture has to be in the new `VK_IMAGE_LAYOUT_VIDEO_ENCODE_SRC_KHR` layout at the time the video encode operation is executed.


=== Reconstructed Picture

`pSetupReferenceSlot` is an optional parameter specifying the video picture resource and DPB slot index to use for the reconstructed picture. Implementations use the reconstructed picture for one of the following purposes:

  1. When the encoded picture is requested to be set up as a reference, according to the codec-specific semantics, the video encode operation will perform picture reconstruction, output the results to this picture, and activate the reconstructed picture's DPB slot with the picture in order to enable using the picture as a reference picture in future video encode operations.
  2. When the encoded picture is not requested to be set up as a reference, implementations may use the reconstructed picture's resource and/or DPB slot for intermediate data required by the encoding process.

Accordingly, `pSetupReferenceSlot` must never be `NULL`, except when the video session was created without any DPB slots.

[NOTE]
.Note
====
The original version of this extension only required the specification of the reconstructed picture information (i.e. a non-`NULL` `pSetupReferenceSlot`) when the application intended to set up a reference picture by activating a DPB slot. Consequently, the presence of reconstructed picture information always implied DPB slot activation. This was changed in revision 12 of the extension, and whether DPB slot activation happens is now subject to codec-specific semantics. More details on this change are discussed in the corresponding issue in this proposal document.
====

In summary, for encoded pictures requested to be set up as reference, this parameter can be used to add new reference pictures to the DPB, and change the association between DPB slot indices and video picture resources. That also implies that the application has to specify a video picture resource in `pSetupReferenceSlot->pPictureResource` that was included in the set of bound reference picture resources specified when the video coding scope was started (in one of the elements of `VkVideoBeginCodingInfoKHR::pReferenceSlots`). No similar requirement exists for the encode input picture specified by `srcPictureResource` which can refer to any video picture resource.

The application has to create the image view specified in `pSetupReferenceSlot->pPictureResource->imageViewBinding` with the new `VK_IMAGE_USAGE_VIDEO_ENCODE_DPB_BIT_KHR` usage flag, and must also include the used video session's video profile in the `VkVideoProfileListInfoKHR` structure specified at image creation time.

The image subresource backing the reconstructed picture has to be in the new `VK_IMAGE_LAYOUT_VIDEO_ENCODE_DPB_KHR` layout at the time the video encode operation is executed.

If the video profile in use requires additional codec-specific parameters for the reconstructed picture, then such parameters are provided in the `pNext` chain of `pSetupReferenceSlot`. Whether such codec-specific reconstructed picture information is necessary, and what it may contain is up to the codec-specific extensions.


=== Reference Pictures

If the video session allows, reference pictures can be specified in the `pReferenceSlots` array to provide predictions of the values of samples of the encoded picture.

Each entry in the `pReferenceSlots` array adds one or more pictures, currently associated with the DPB slot specified in the element's `slotIndex` member and stored in the video picture resource specified in the element's `pPictureResource` member, to the list of active reference pictures to use in the video encode operation.

The application has to make sure to specify each video picture resource used as a reference picture in a video encode operation, beforehand, in the set of bound reference picture resources specified when the video coding scope was started (in one of the elements of `VkVideoBeginCodingInfoKHR::pReferenceSlots`).

The application has to create the image view specified in `pPictureResource->imageViewBinding` of the elements of `pReferenceSlots` with the new `VK_IMAGE_USAGE_VIDEO_ENCODE_DPB_BIT_KHR` usage flag, and must also include the used video session's video profile in the `VkVideoProfileListInfoKHR` structure specified at image creation time.

The image subresources backing the reference pictures have to be in the new `VK_IMAGE_LAYOUT_VIDEO_ENCODE_DPB_KHR` layout at the time the video encode operation is executed.

Typically the number of elements in `pReferenceSlots` equals the number of reference pictures added, but in certain cases (depending on the used video codec and video profile) there may be multiple pictures in the same DPB slot resource.

If the video profile in use requires additional codec-specific parameters for the reference pictures, then such parameters are provided in the `pNext` chain of the elements of `pReferenceSlots`. Whether such codec-specific reference picture information is necessary, and what it may contain is up to the codec-specific extensions.


=== Video Encode Parameter Overrides

Encoder implementations usually only support a subset of the available encoding tools defined by the corresponding video compression standards. This may prevent some implementations from being able to respect certain codec-specific parameters, or specific parameter values.

Exhaustively enumerating all of these constraints and defining application-queryable capabilities for them is not practical, as it would potentially require separate capabilities for almost every single codec-specific parameter, parameter value, and combination of those, given the complicated interactions that usually exist between codec-specific parameters. Instead, this proposal approaches this problem from the other direction.

Rather than defining capabilities for each of these constraints, this proposal allows implementations to override codec-specific parameter values or combinations thereof, so that the resulting codec-specific parameters comply with the constraints of the target implementation. This has multiple benefits:

  * Enables the video encode APIs to be supported on a much wider set of hardware implementations, as the codec-specific extensions layered on top of this extension would not have codec-specific requirements that assume implementations to support certain, potentially not universally available, encoding tools
  * Enables implementations to expose all of the encoding tools they support for a particular video compression standard, which typically is not possible in other video APIs as, without overrides, implementations may not be able to expose a large set of their encoding tools just because they do not comply with the exact wording of the capabilities defined by that API
  * Enables writing portable applications without getting lost in myriads of capabilities

Allowing implementations to override codec-specific parameters does not mean, however, that implementations can do any overrides they wish. The base parameter override mechanism is reserved to deal with implementation limitations only. Thus, by default, implementations are expected to override codec-specific parameters only if it is absolutely necessary for the correct functioning of their encoder hardware.

In certain cases, applications may want to allow the implementation to make its own choices about certain codec-specific parameters. Such choices are not driven by implementation constraints, but rather aim to select parameters and encoding tools that fit the usage scenario described by the video profile and other parameters, like the encode quality level, better than the ones the application specified. This proposal introduces a new video session creation flag called `VK_VIDEO_SESSION_CREATE_ALLOW_ENCODE_PARAMETER_OPTIMIZATIONS_BIT_KHR` that enables the application to opt in to such optimization overrides.

There are certain rules that implementations need to follow in all cases where they may apply codec-specific parameter overrides. In particular:

  * Certain codec-specific parameters are defined by layered codec-specific extensions to be always respected, and thus cannot be overridden, which is generally expected to be the case for all parameters that may affect the overall behavior of video encoding, or any bitstream elements that are not encoded in any fashion by the implementation, so that applications still have the necessary freedom to encode such auxiliary bitstream elements the way they wish
  * In a similar vein, implementation overrides cannot affect the compliance of the generated bitstream with the video compression standard

The details of these rules can be found in the specification language of this extension, and any layered extension built upon it.

In general, there are two categories of codec-specific parameters to which implementation overrides may be applied:

  1. Codec-specific parameters stored in video session parameters objects, if any
  2. Codec-specific parameters provided to video encode commands

Both of these codec-specific parameter categories may have an effect on the video bitstream data produced by video encode operations. However, parameters falling into the first category are particularly important as it is common for applications to encode the codec-specific parameters stored in video session parameters on their own.

In order to enable the application to deal with parameter overrides applied to video session parameters, this proposal introduces the following new command:

[source,c]
----
VKAPI_ATTR VkResult VKAPI_CALL vkGetEncodedVideoSessionParametersKHR(
    VkDevice                                        device,
    const VkVideoEncodeSessionParametersGetInfoKHR* pVideoSessionParametersInfo,
    VkVideoEncodeSessionParametersFeedbackInfoKHR*  pFeedbackInfo,
    size_t*                                         pDataSize,
    void*                                           pData);
----

The main input to this command is the video session parameters object in question, with layered extensions adding additional chainable structures to provide additional codec-specific input parameters:

[source,c]
----
typedef struct VkVideoEncodeSessionParametersGetInfoKHR {
    VkStructureType                sType;
    const void*                    pNext;
    VkVideoSessionParametersKHR    videoSessionParameters;
} VkVideoEncodeSessionParametersGetInfoKHR;
----

This command has multiple purposes.

First, by providing a non-`NULL` `pFeedbackInfo` parameter, the application can get feedback about whether the implementation applied any parameter overrides to the video session parameters in question through the following output structure:

[source,c]
----
typedef struct VkVideoEncodeSessionParametersFeedbackInfoKHR {
    VkStructureType    sType;
    void*              pNext;
    VkBool32           hasOverrides;
} VkVideoEncodeSessionParametersFeedbackInfoKHR;
----

The `hasOverrides` member will be set to `VK_TRUE` if implementation overrides were applied, and layered extensions may provide additional chainable output structures that return further (typically codec-specific) information about the applied overrides.

When this feedback indicates that implementation overrides were applied, the application needs to retrieve the encoded video session parameters containing the overrides in order to be able to produce a compliant bitstream. This can be done in the usual fashion by providing a non-`NULL` `pDataSize` parameter to retrieve the size of the encoded parameter data, and then calling the command again with a non-`NULL` `pData` pointer to retrieve the data.
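
The size-then-data retrieval described above can be sketched as follows. To keep the snippet compilable without the Vulkan headers or a device, the actual command is hidden behind a hypothetical callback type, `example_get_encoded_fn`, which stands in for `vkGetEncodedVideoSessionParametersKHR` with its other arguments already bound and returns zero on success. The helper, the callback, and the mock are illustrative assumptions, not part of this extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback standing in for the real command. With
 * pData == NULL it only writes the required size to *pDataSize.
 * Returns 0 on success, non-zero on failure. */
typedef int (*example_get_encoded_fn)(void *userData, size_t *pDataSize,
                                      void *pData);

/* Retrieves the encoded parameter data using two calls. On success,
 * returns a malloc'ed buffer (owned by the caller) and stores its size
 * in *pSize. Returns NULL on failure or when there is no data. */
static void *example_fetch_encoded_params(example_get_encoded_fn get,
                                          void *userData, size_t *pSize)
{
    assert(get != NULL && pSize != NULL);
    *pSize = 0;

    /* First call: query the required size only. */
    size_t size = 0;
    if (get(userData, &size, NULL) != 0 || size == 0) {
        return NULL;
    }

    void *data = malloc(size);
    if (data == NULL) {
        return NULL;
    }

    /* Second call: retrieve the data into the allocated buffer. */
    if (get(userData, &size, data) != 0) {
        free(data);
        return NULL;
    }

    *pSize = size;
    return data;
}

/* Mock used only to exercise the helper; it treats userData as a
 * NUL-terminated string holding the "encoded" bytes. */
static int example_mock_get(void *userData, size_t *pDataSize, void *pData)
{
    const char *payload = (const char *)userData;
    size_t needed = strlen(payload);
    if (pData == NULL) {
        *pDataSize = needed;
        return 0;
    }
    if (*pDataSize < needed) {
        return 1; /* provided buffer is too small */
    }
    memcpy(pData, payload, needed);
    *pDataSize = needed;
    return 0;
}
```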

The application can choose to use the `vkGetEncodedVideoSessionParametersKHR` command to encode the video session parameters even if the implementation did not override any of the parameters, but in this case it can also choose to encode the respective bitstream elements on its own.

It is worth calling out, though, that the application risks producing a video bitstream that does not comply with the video compression standard if it does not use this command to determine whether video session parameter overrides happened, or does not use the encoded parameters retrievable using this command when such overrides happened, and instead encodes the respective bitstream elements with its own choice of codec-specific parameters.


=== Capabilities

Querying capabilities specific to video encoding happens through the query mechanisms introduced by the `VK_KHR_video_queue` extension.

Support for individual video encode operations can be retrieved for each queue family using the `VkQueueFamilyVideoPropertiesKHR` structure, as discussed earlier.

The application can also use the `vkGetPhysicalDeviceVideoCapabilitiesKHR` command to query the capabilities of a specific video encode profile. In case of video encode profiles, the following new structure has to be included in the `pNext` chain of the `VkVideoCapabilitiesKHR` structure used to retrieve the general video encode capabilities:

[source,c]
----
typedef struct VkVideoEncodeCapabilitiesKHR {
    VkStructureType                         sType;
    void*                                   pNext;
    VkVideoEncodeCapabilityFlagsKHR         flags;
    VkVideoEncodeRateControlModeFlagsKHR    rateControlModes;
    uint32_t                                maxRateControlLayers;
    uint64_t                                maxBitrate;
    uint32_t                                maxQualityLevels;
    VkExtent2D                              encodeInputPictureGranularity;
    VkVideoEncodeFeedbackFlagsKHR           supportedEncodeFeedbackFlags;
} VkVideoEncodeCapabilitiesKHR;
----

This structure contains a new encode-specific `flags` member that indicates support for various video encode capabilities, like the support for the `precedingExternallyEncodedBytes` parameter discussed before.

The `rateControlModes` and `maxRateControlLayers` members provide information about the supported rate control modes and maximum number of rate control layers that can be used in a video session, as discussed later.

The `maxBitrate` member provides information about the maximum bitrate supported for the video profile.

The `maxQualityLevels` member specifies the number of different video encode quality level values supported by the video encode profile in question, which are identified with numbers in the range `0..maxQualityLevels-1`. The number of quality levels and their implementation-specific effect are expected to vary across video encode profiles, even in video encode profiles using the same video codec operation (e.g. due to the use of different tuning modes), as discussed later.

The `encodeInputPictureGranularity` member indicates the granularity at which data from the encode input picture is used for encoding individual codec-specific coding blocks. If this capability is not `{1,1}`, then it is recommended for applications to initialize the data in the encode input picture at this granularity, as the encoder will use data in such padding texels during the encoding, which may affect the quality and efficiency of the encoding.
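
One way to read the recommendation above is that the region the encoder may consume is the coded extent rounded up to the next multiple of the granularity in each dimension. The following sketch computes that padded extent under this assumption. `ExampleExtent2D` is a hypothetical stand-in mirroring the layout of `VkExtent2D`, used only so that the snippet compiles without the Vulkan headers, and the helper names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in mirroring the layout of VkExtent2D. */
typedef struct ExampleExtent2D {
    uint32_t width;
    uint32_t height;
} ExampleExtent2D;

/* Rounds value up to the next multiple of granularity.
 * Assumes granularity > 0 and that the sum does not overflow. */
static uint32_t example_align_up(uint32_t value, uint32_t granularity)
{
    assert(granularity > 0);
    return ((value + granularity - 1) / granularity) * granularity;
}

/* Returns the extent, in texels, that should hold initialized data for a
 * picture of the given coded extent. Texels outside codedExtent but
 * inside the result are the padding texels the encoder may read. */
static ExampleExtent2D example_padded_extent(ExampleExtent2D codedExtent,
                                             ExampleExtent2D granularity)
{
    ExampleExtent2D padded = {
        example_align_up(codedExtent.width, granularity.width),
        example_align_up(codedExtent.height, granularity.height),
    };
    return padded;
}
```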

The `supportedEncodeFeedbackFlags` member indicates the set of supported encode feedback flags for the `VK_QUERY_TYPE_VIDEO_ENCODE_FEEDBACK_KHR` queries described later.

The `vkGetPhysicalDeviceVideoFormatPropertiesKHR` command can be used to query the supported image/picture formats for a given set of video profiles, as described in the `VK_KHR_video_queue` extension.

In particular, if the application would like to query the list of format properties supported for encode input pictures, then it should include the new `VK_IMAGE_USAGE_VIDEO_ENCODE_SRC_BIT_KHR` usage flag in `VkPhysicalDeviceVideoFormatInfoKHR::imageUsage`.

Similarly, to query the list of format properties supported for encode DPB pictures (reconstructed/reference pictures), the application should include the new `VK_IMAGE_USAGE_VIDEO_ENCODE_DPB_BIT_KHR` usage flag in `VkPhysicalDeviceVideoFormatInfoKHR::imageUsage`.


=== Video Encode Quality Levels

This proposal introduces the concept of video encode quality levels, which can be thought of as encoder presets that control the number and type of implementation-specific encoding tools and algorithms utilized in the encoding process. Implementations can expose support for one or more such video encode quality levels for each video profile. By default, video encode quality level index zero is used, unless otherwise specified.

Generally, using higher video encode quality levels may produce higher quality video streams at the cost of additional processing time. However, as the final quality of an encoded picture depends on the contents of the encode input picture, the contents of the active reference pictures, the codec-specific encode parameters, and the particular implementation-specific tools used corresponding to the individual video encode quality levels, there are no guarantees that using a higher video encode quality level will always produce a higher quality encoded picture for any given set of inputs.

The chosen quality level may also affect the optimization overrides applied by implementations when using the `VK_VIDEO_SESSION_CREATE_ALLOW_ENCODE_PARAMETER_OPTIMIZATIONS_BIT_KHR` flag, and thus codec-specific parameters stored in video session parameters may be affected by the used video encode quality level. As such, video session parameters objects are always created with respect to a specific video encode quality level. The application can choose to create a video session parameters object with a video encode quality level index different from the default quality level of zero by including the following new structure in the `pNext` chain of `VkVideoSessionParametersCreateInfoKHR`:

[source,c]
----
typedef struct VkVideoEncodeQualityLevelInfoKHR {
    VkStructureType    sType;
    const void*        pNext;
    uint32_t           qualityLevel;
} VkVideoEncodeQualityLevelInfoKHR;
----

Where `qualityLevel` specifies the used video encode quality level.

Video sessions created against a video encode profile allow changing the used video encode quality level dynamically. After creation, the video session is configured with the default quality level of zero, which then can be changed by including the new `VK_VIDEO_CODING_CONTROL_ENCODE_QUALITY_LEVEL_BIT_KHR` flag in the `flags` member of the `VkVideoCodingControlInfoKHR` structure passed to the `vkCmdControlVideoCodingKHR` command and including an instance of the `VkVideoEncodeQualityLevelInfoKHR` structure in the `VkVideoCodingControlInfoKHR::pNext` chain specifying the new quality level to set for the video session.

If video session parameters objects are used by a particular video encode command, then the video encode quality level the parameters object was created with has to match the currently configured quality level for the bound video session.
377
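The matching rule above can be modeled on the application side by tracking the quality level last set for the session. The following is a minimal sketch and not part of the API: `EncodeSessionState`, `session_state_init`, `session_set_quality_level`, and `params_compatible` are hypothetical names, the `maxQualityLevels` argument is assumed to come from the encode capabilities of the video profile, and the application is assumed to update the tracked state in the order in which the corresponding control commands execute.

[source,c]
----
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical application-side mirror of the session's current quality level */
typedef struct EncodeSessionState {
    uint32_t currentQualityLevel;
} EncodeSessionState;

/* A newly created video session starts at the default quality level of zero */
static EncodeSessionState session_state_init(void) {
    EncodeSessionState state = { 0 };
    return state;
}

/* Call when a control command sets a new quality level for the session;
   maxQualityLevels is the (assumed) capability limit for the video profile */
static void session_set_quality_level(EncodeSessionState* state,
                                      uint32_t qualityLevel,
                                      uint32_t maxQualityLevels) {
    assert(qualityLevel < maxQualityLevels);
    state->currentQualityLevel = qualityLevel;
}

/* True if a parameters object created with paramsQualityLevel may be used now */
static bool params_compatible(const EncodeSessionState* state,
                              uint32_t paramsQualityLevel) {
    return state->currentQualityLevel == paramsQualityLevel;
}
----
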
Implementations may have certain recommendations for encoding parameters and configuration (e.g. for rate control) specific to each supported video encode quality level. These recommendations and other quality level related properties can be queried for a specific video encode profile using the following new command:

[source,c]
----
VKAPI_ATTR VkResult VKAPI_CALL vkGetPhysicalDeviceVideoEncodeQualityLevelPropertiesKHR(
    VkPhysicalDevice                                        physicalDevice,
    const VkPhysicalDeviceVideoEncodeQualityLevelInfoKHR*   pQualityLevelInfo,
    VkVideoEncodeQualityLevelPropertiesKHR*                 pQualityLevelProperties);
----

The input to the command is a structure that specifies the video encode profile and quality level to query properties for:

[source,c]
----
typedef struct VkPhysicalDeviceVideoEncodeQualityLevelInfoKHR {
    VkStructureType                 sType;
    const void*                     pNext;
    const VkVideoProfileInfoKHR*    pVideoProfile;
    uint32_t                        qualityLevel;
} VkPhysicalDeviceVideoEncodeQualityLevelInfoKHR;
----

This proposal allows retrieving the following codec-independent quality level properties:

[source,c]
----
typedef struct VkVideoEncodeQualityLevelPropertiesKHR {
    VkStructureType                            sType;
    void*                                      pNext;
    VkVideoEncodeRateControlModeFlagBitsKHR    preferredRateControlMode;
    uint32_t                                   preferredRateControlLayerCount;
} VkVideoEncodeQualityLevelPropertiesKHR;
----

Layered extensions may add additional (typically codec-specific) property structures that can be chained to the base output structure defined above.


=== Video Encode Feedback Queries

The new `VK_QUERY_TYPE_VIDEO_ENCODE_FEEDBACK_KHR` query type works similarly to pipeline statistics queries in that it can report multiple distinct values about the video encode operations it collects feedback about. When creating a query pool with this type, the following new structure specifies the selected feedback values:

[source,c]
----
typedef struct VkQueryPoolVideoEncodeFeedbackCreateInfoKHR {
    VkStructureType                         sType;
    const void*                             pNext;
    VkVideoEncodeFeedbackFlagsKHR           encodeFeedbackFlags;
} VkQueryPoolVideoEncodeFeedbackCreateInfoKHR;
----

This extension adds support for the following video encode feedback flags:

  * `VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_BUFFER_OFFSET_BIT_KHR` requests capturing the offset relative to `dstBufferOffset` where the bitstream data corresponding to the video encode operation is written to
  * `VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_BYTES_WRITTEN_BIT_KHR` requests capturing the number of bytes written by the video encode operation to the bitstream buffer
  * `VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_HAS_OVERRIDES_BIT_KHR` requests capturing information about whether the implementation overrode any codec-specific parameters in the generated bitstream data with respect to the parameter values supplied by the application

All implementations are expected to support `VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_BUFFER_OFFSET_BIT_KHR` and `VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_BYTES_WRITTEN_BIT_KHR`, but `VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_HAS_OVERRIDES_BIT_KHR` is optional, as not all implementations may be able to provide feedback about overrides performed on the encoded bitstream data.

The reported offset for `VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_BUFFER_OFFSET_BIT_KHR` is currently defined to be always zero until otherwise defined by any layered extension.


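When reading back results, the position of each feedback value depends on which flags the query pool was created with. The sketch below assumes that the values are written in the order of the flag bits, which is consistent with the query pool example later in this document, where the offset precedes the bytes written. The `FEEDBACK_*` constants are local stand-ins assumed to mirror the corresponding `VK_VIDEO_ENCODE_FEEDBACK_*_BIT_KHR` values, and `feedback_value_count` and `feedback_value_index` are hypothetical helpers.

[source,c]
----
#include <assert.h>
#include <stdint.h>

/* Local stand-ins assumed to mirror the VK_VIDEO_ENCODE_FEEDBACK_*_BIT_KHR values */
#define FEEDBACK_BITSTREAM_BUFFER_OFFSET 0x00000001u
#define FEEDBACK_BITSTREAM_BYTES_WRITTEN 0x00000002u
#define FEEDBACK_BITSTREAM_HAS_OVERRIDES 0x00000004u

/* Counts the bits set in mask */
static uint32_t count_bits(uint32_t mask) {
    uint32_t count = 0;
    while (mask != 0) {
        mask &= mask - 1; /* clears the lowest set bit */
        ++count;
    }
    return count;
}

/* Number of feedback values written per query (excluding any status value) */
static uint32_t feedback_value_count(uint32_t enabledFlags) {
    return count_bits(enabledFlags);
}

/* Index of the value for a single enabled flag within the per-query results */
static uint32_t feedback_value_index(uint32_t enabledFlags, uint32_t flag) {
    assert(flag != 0 && (flag & (flag - 1)) == 0); /* exactly one bit set */
    assert((enabledFlags & flag) != 0);            /* the flag must be enabled */
    return count_bits(enabledFlags & (flag - 1));
}
----

With the offset and bytes written flags enabled, the bytes written value lands at index 1, and a status value requested with the query results would follow at index 2.
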
=== Video Encode Rate Control

A key aspect of video encoding is to control the size of the encoded bitstream. This happens through the application of rate control. Rate control settings consist of codec-independent and codec-specific parameters, hence this extension only includes the common parameters.

The following rate control modes are introduced by this extension:

  * `VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR` for disabling rate control
  * `VK_VIDEO_ENCODE_RATE_CONTROL_MODE_CBR_BIT_KHR` for constant bitrate (CBR) rate control
  * `VK_VIDEO_ENCODE_RATE_CONTROL_MODE_VBR_BIT_KHR` for variable bitrate (VBR) rate control

In addition, the `VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DEFAULT_KHR` constant is used to set rate control configuration to implementation-dependent default settings. This is the initial rate control mode that is set for newly created video sessions, which leaves rate control entirely in the implementation's control.

Certain codecs define a concept typically referred to as _video coding layers_. The semantics of these layers are defined by the corresponding video compression standards. However, some implementations allow certain configuration parameters of rate control to be specified separately for each such video coding layer, thus this proposal introduces the concept of rate control layers which enable the application to explicitly control these parameters on a per layer basis.

When a single rate control layer is configured, it is applied to all encoded pictures. In contrast, when multiple rate control layers are configured, then each rate control layer is applied only to encoded pictures targeting a specific video coding layer.

After a video session is reset using `VK_VIDEO_CODING_CONTROL_RESET_BIT_KHR`, its rate control settings are initialized to implementation-specific defaults. Applications can change these by calling `vkCmdControlVideoCodingKHR` and specifying the `VK_VIDEO_CODING_CONTROL_ENCODE_RATE_CONTROL_BIT_KHR` flag. When this flag is present, the following new structure chained to the `pNext` chain of `VkVideoCodingControlInfoKHR` specifies the rate control configuration:

[source,c]
----
typedef struct VkVideoEncodeRateControlInfoKHR {
    VkStructureType                                sType;
    const void*                                    pNext;
    VkVideoEncodeRateControlFlagsKHR               flags;
    VkVideoEncodeRateControlModeFlagBitsKHR        rateControlMode;
    uint32_t                                       layerCount;
    const VkVideoEncodeRateControlLayerInfoKHR*    pLayers;
    uint32_t                                       virtualBufferSizeInMs;
    uint32_t                                       initialVirtualBufferSizeInMs;
} VkVideoEncodeRateControlInfoKHR;
----

`rateControlMode` specifies the rate control mode to set.

`layerCount` specifies the number of rate control layers to use from this point, and `pLayers` specifies the configuration of each layer. Rate control layers can only be specified when rate control is neither disabled nor set to the implementation-specific defaults.

`virtualBufferSizeInMs` and `initialVirtualBufferSizeInMs` specify the size and initial occupancy, respectively, in milliseconds of the leaky bucket model virtual buffer.

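To get a feel for these values, a virtual buffer duration can be converted into a size in bits for a given bitrate. For example, at 5 Mbit/s a `virtualBufferSizeInMs` of 2000 corresponds to 10,000,000 bits, and an `initialVirtualBufferSizeInMs` of 500 corresponds to 2,500,000 bits. This is only a sketch of the arithmetic, under the assumption that the duration is interpreted relative to a chosen bitrate such as the average bitrate; `virtual_buffer_bits` is a hypothetical helper, and the exact leaky bucket behavior remains implementation-specific.

[source,c]
----
#include <assert.h>
#include <stdint.h>

/* Size in bits of a buffer holding durationInMs worth of data at bitsPerSecond */
static uint64_t virtual_buffer_bits(uint64_t bitsPerSecond, uint32_t durationInMs) {
    /* Guard against 64-bit overflow of the intermediate product */
    assert(durationInMs == 0 || bitsPerSecond <= UINT64_MAX / durationInMs);
    return (bitsPerSecond * durationInMs) / 1000u;
}
----
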
The `VkVideoEncodeRateControlLayerInfoKHR` structure is defined as follows:

[source,c]
----
typedef struct VkVideoEncodeRateControlLayerInfoKHR {
    VkStructureType    sType;
    const void*        pNext;
    uint64_t           averageBitrate;
    uint64_t           maxBitrate;
    uint32_t           frameRateNumerator;
    uint32_t           frameRateDenominator;
} VkVideoEncodeRateControlLayerInfoKHR;
----

`averageBitrate` and `maxBitrate` specify the target and peak bitrate that the rate control layer should use in bits/second. In case of CBR mode the two values have to match.

`frameRateNumerator` and `frameRateDenominator` specify the numerator and denominator of the frame rate used by the video sequence.

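The constraints and units above can be checked with a little arithmetic before submitting a configuration. The following sketch uses `RateControlLayer`, a hypothetical stand-in carrying the same four values as `VkVideoEncodeRateControlLayerInfoKHR`, so that it compiles without the Vulkan headers; `layer_is_consistent` and `average_bits_per_frame` are illustrative helpers. The checks go slightly beyond the text: requiring non-zero frame rate terms and an `averageBitrate` that does not exceed `maxBitrate` are assumptions made here.

[source,c]
----
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the values of VkVideoEncodeRateControlLayerInfoKHR */
typedef struct RateControlLayer {
    uint64_t averageBitrate;       /* bits/second */
    uint64_t maxBitrate;           /* bits/second */
    uint32_t frameRateNumerator;
    uint32_t frameRateDenominator;
} RateControlLayer;

/* Returns true if the layer values are mutually consistent for the given mode */
static bool layer_is_consistent(const RateControlLayer* layer, bool isCbr) {
    if (layer->frameRateNumerator == 0 || layer->frameRateDenominator == 0) {
        return false;
    }
    if (layer->averageBitrate > layer->maxBitrate) {
        return false;
    }
    /* In CBR mode the average and peak bitrate have to match */
    if (isCbr && layer->averageBitrate != layer->maxBitrate) {
        return false;
    }
    return true;
}

/* Average bits available per frame: averageBitrate / (numerator / denominator) */
static uint64_t average_bits_per_frame(const RateControlLayer* layer) {
    assert(layer->frameRateNumerator != 0);
    return (layer->averageBitrate * layer->frameRateDenominator) /
           layer->frameRateNumerator;
}
----
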
The exact behavior of rate control is implementation-specific, but it is typically constrained by the video compression standard corresponding to the used video profile. Implementations are expected to implement rate control as follows:

  * In case of CBR mode the bitrate should stay as close to the specified `averageBitrate` as possible within the virtual buffer window.
  * In case of VBR mode the bitrate should not exceed the value of `maxBitrate` while also trying to get close to the target bitrate specified by `averageBitrate` within the virtual buffer window.

Codec-specific video encode extensions can include both global and per-layer codec-specific rate control configurations by chaining codec-specific parameters to the `VkVideoEncodeRateControlInfoKHR` and `VkVideoEncodeRateControlLayerInfoKHR` structures, respectively.

Some implementations do not track the current rate control configuration as part of the device state maintained in the video session object, but the current rate control configuration may affect the device commands recorded in response to video encode operations. In order to enable implementations to have access to the current rate control configuration when recording video encoding commands into command buffers, this proposal requires the current rate control configuration to be also specified when calling `vkCmdBeginVideoCodingKHR` by including the `VkVideoEncodeRateControlInfoKHR` structure describing it in the `pNext` chain of the `pBeginCodingInfo` parameter. When this information is not included, it is assumed that the currently expected rate control configuration is the default one, i.e. the implementation-specific rate control mode indicated by `VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DEFAULT_KHR`.

It is important to note that specifying the rate control configuration when calling `vkCmdBeginVideoCodingKHR` does not change the current rate control configuration. For that, the `vkCmdControlVideoCodingKHR` command must be used with the `VK_VIDEO_CODING_CONTROL_ENCODE_RATE_CONTROL_BIT_KHR` flag, as discussed earlier. The rate control configuration specified to `vkCmdBeginVideoCodingKHR` serves only to make the information about the current rate control state available to implementations during command recording and is expected to always match the effective current rate control state at the time the command is executed on the device.


=== Usage Summary

To summarize the usage of the video encoding features introduced by this extension, let us take a look at a typical usage scenario when using this extension to encode a video stream.

Before the application can start recording command buffers with video encode operations, it has to do the following:

  . Ensure that the implementation can encode the video content by first querying the video codec operations supported by each queue family using the `vkGetPhysicalDeviceQueueFamilyProperties2` command and the `VkQueueFamilyVideoPropertiesKHR` output structure.
  . If needed, the application has to also retrieve the `VkQueueFamilyQueryResultStatusPropertiesKHR` output structure for the queue family to check support for `VK_QUERY_TYPE_RESULT_STATUS_ONLY_KHR` queries.
  . Construct the `VkVideoProfileInfoKHR` structure describing the entire video profile, including the video codec operation, chroma subsampling, bit depths, and any other usage or codec-specific parameters.
  . Ensure that the specific video profile is supported by the implementation using the `vkGetPhysicalDeviceVideoCapabilitiesKHR` command and retrieve the general, encode-specific, and codec-specific capabilities at the same time.
  . Query the list of image/picture format properties supported for the video profile using the `vkGetPhysicalDeviceVideoFormatPropertiesKHR` command, and select a suitable format for the DPB and encode input pictures.
  . Create an image corresponding to the encode input picture with the appropriate usage flags and video profile list, as described earlier, and bind suitable device memory to the image. Also create an image view with the appropriate usage flags to use in the video encode operations.
  . If needed, create one or more images corresponding to the DPB pictures with the appropriate usage flags and video profile list, as described earlier, and bind suitable device memory to them. Also create any image views with the appropriate usage flags to use in the video encode operations.
  . Create a buffer with the `VK_BUFFER_USAGE_VIDEO_ENCODE_DST_BIT_KHR` usage flag and the video profile list, to use as the destination video bitstream buffer. If the buffer is expected to be consumed using the CPU, consider binding compatible host-visible device memory to the buffer.
  . If result status or video encode feedback queries are needed and supported (as determined earlier), create a query pool with the corresponding query type and the used video encode profile.
  . Create the video session using the video encode profile and appropriate parameters within the capabilities supported by the profile, as determined earlier. Bind suitable device memory to each memory binding index of the video session.
  . If needed, create a video session parameters object for the video session.

Recording video encode operations into command buffers typically consists of the following sequence:

  . Start a video coding scope with the created video session (and parameters) object using the `vkCmdBeginVideoCodingKHR` command. Make sure to include all video picture resources in `VkVideoBeginCodingInfoKHR::pReferenceSlots` that may be used as reconstructed or reference pictures within the video coding scope, and ensure that the DPB slots specified for each reflect the current DPB slot association for the resource.
  . If this is the first video coding scope the video session is used in, reset the video session to the initial state by recording a `vkCmdControlVideoCodingKHR` command with the `VK_VIDEO_CODING_CONTROL_RESET_BIT_KHR` flag.
  . If needed, also update the rate control state or the used video encode quality level for the video session by recording a `vkCmdControlVideoCodingKHR` command with the `VK_VIDEO_CODING_CONTROL_ENCODE_RATE_CONTROL_BIT_KHR` and/or `VK_VIDEO_CODING_CONTROL_ENCODE_QUALITY_LEVEL_BIT_KHR` flags (can be done in the same command that resets the video session, if needed).
  . If needed, start a result status or video encode feedback query using `vkCmdBeginQuery`. Reset the query using `vkCmdResetQueryPool` beforehand, as needed.
  . Issue a video encode operation using the `vkCmdEncodeVideoKHR` command with appropriate parameters, as discussed earlier.
  . If needed, end the started query using `vkCmdEndQuery`.
  . Record any further control or encode operations into the video coding scope, as needed.
  . End the video coding scope using the `vkCmdEndVideoCodingKHR` command.

Video profiles that require the use of video session parameters objects may also require the application to encode the stored codec-specific parameters separately into the final bitstream. Applications are expected to encode these parameters according to the following steps:

  . If the application wants to encode such parameters on its own, when possible, it should first call the `vkGetEncodedVideoSessionParametersKHR` command with a non-NULL `pFeedbackInfo` parameter to retrieve information about whether the implementation applied any overrides to the codec-specific parameters in question.
  . If the results of the previous step indicate that no implementation overrides were applied, then the application can choose to encode the codec-specific parameters in question on its own and ignore the rest of the steps listed here.
  . Otherwise, the application has to retrieve the encoded codec-specific parameters by calling the `vkGetEncodedVideoSessionParametersKHR` command twice: first to retrieve the size, then to retrieve the data of the encoded codec-specific parameters in question, as discussed earlier.


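The two-call pattern in the last step above can be sketched as follows. To stay self-contained, the sketch does not call Vulkan: `GetEncodedParamsFn` is a hypothetical callback type standing in for an application wrapper around `vkGetEncodedVideoSessionParametersKHR`, `fetch_encoded_params` is an illustrative helper, and `fake_get_params` exists only to exercise the sketch with arbitrary bytes. It assumes the usual convention that passing a NULL data pointer returns the required size.

[source,c]
----
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper signature: when data is NULL, writes the required size to
   *size; otherwise writes up to *size bytes to data. Returns 0 on success. */
typedef int (*GetEncodedParamsFn)(void* userData, size_t* size, void* data);

/* Retrieves the encoded parameters into a newly allocated buffer.
   Returns NULL on failure; the caller owns the returned buffer and must free() it. */
static void* fetch_encoded_params(GetEncodedParamsFn getParams, void* userData,
                                  size_t* outSize) {
    size_t size = 0;

    assert(getParams != NULL && outSize != NULL);

    /* First call: query the size of the encoded parameter data */
    if (getParams(userData, &size, NULL) != 0 || size == 0) {
        return NULL;
    }

    void* data = malloc(size);
    if (data == NULL) {
        return NULL;
    }

    /* Second call: retrieve the encoded parameter data itself */
    if (getParams(userData, &size, data) != 0) {
        free(data);
        return NULL;
    }

    *outSize = size;
    return data;
}

/* Fake data source with arbitrary bytes, used only to exercise the sketch */
static int fake_get_params(void* userData, size_t* size, void* data) {
    static const unsigned char bytes[] = { 1, 2, 3, 4 };
    (void)userData;
    if (data == NULL) {
        *size = sizeof(bytes);
        return 0;
    }
    if (*size < sizeof(bytes)) {
        return -1;
    }
    memcpy(data, bytes, sizeof(bytes));
    *size = sizeof(bytes);
    return 0;
}
----

An application would call `fetch_encoded_params` only when the feedback from the first step reports overrides, or when it prefers not to encode the parameters on its own.
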
== Examples

=== Select queue family with video encode support for a given video codec operation

[source,c]
----
VkVideoCodecOperationFlagBitsKHR neededVideoEncodeOp = ...
uint32_t queueFamilyIndex;
uint32_t queueFamilyCount;

vkGetPhysicalDeviceQueueFamilyProperties2(physicalDevice, &queueFamilyCount, NULL);

VkQueueFamilyProperties2* props = calloc(queueFamilyCount,
    sizeof(VkQueueFamilyProperties2));
VkQueueFamilyVideoPropertiesKHR* videoProps = calloc(queueFamilyCount,
    sizeof(VkQueueFamilyVideoPropertiesKHR));

for (queueFamilyIndex = 0; queueFamilyIndex < queueFamilyCount; ++queueFamilyIndex) {
    props[queueFamilyIndex].sType = VK_STRUCTURE_TYPE_QUEUE_FAMILY_PROPERTIES_2;
    props[queueFamilyIndex].pNext = &videoProps[queueFamilyIndex];

    videoProps[queueFamilyIndex].sType = VK_STRUCTURE_TYPE_QUEUE_FAMILY_VIDEO_PROPERTIES_KHR;
}

vkGetPhysicalDeviceQueueFamilyProperties2(physicalDevice, &queueFamilyCount, props);

for (queueFamilyIndex = 0; queueFamilyIndex < queueFamilyCount; ++queueFamilyIndex) {
    if ((props[queueFamilyIndex].queueFamilyProperties.queueFlags & VK_QUEUE_VIDEO_ENCODE_BIT_KHR) != 0 &&
        (videoProps[queueFamilyIndex].videoCodecOperations & neededVideoEncodeOp) != 0) {
        break;
    }
}

if (queueFamilyIndex < queueFamilyCount) {
    // Found appropriate queue family
    ...
} else {
    // Did not find a queue family with the needed capabilities
    ...
}
----


=== Check support and query the capabilities for a video encode profile

[source,c]
----
VkResult result;

// We also include the optional encode usage information here
VkVideoEncodeUsageInfoKHR profileUsageInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_ENCODE_USAGE_INFO_KHR,
    .pNext = ... // pointer to codec-specific profile structure
    .videoUsageHints = VK_VIDEO_ENCODE_USAGE_DEFAULT_KHR,
    .videoContentHints = VK_VIDEO_ENCODE_CONTENT_DEFAULT_KHR,
    .tuningMode = VK_VIDEO_ENCODE_TUNING_MODE_DEFAULT_KHR
};

VkVideoProfileInfoKHR profileInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_PROFILE_INFO_KHR,
    .pNext = &profileUsageInfo,
    .videoCodecOperation = ... // used video encode operation
    .chromaSubsampling = VK_VIDEO_CHROMA_SUBSAMPLING_420_BIT_KHR,
    .lumaBitDepth = VK_VIDEO_COMPONENT_BIT_DEPTH_8_BIT_KHR,
    .chromaBitDepth = VK_VIDEO_COMPONENT_BIT_DEPTH_8_BIT_KHR
};

VkVideoEncodeCapabilitiesKHR encodeCapabilities = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_ENCODE_CAPABILITIES_KHR,
    .pNext = ... // pointer to codec-specific capability structure
};

VkVideoCapabilitiesKHR capabilities = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_CAPABILITIES_KHR,
    .pNext = &encodeCapabilities
};

result = vkGetPhysicalDeviceVideoCapabilitiesKHR(physicalDevice, &profileInfo, &capabilities);

if (result == VK_SUCCESS) {
    // Profile is supported, check additional capabilities
    ...
} else {
    // Profile is not supported, result provides additional information about why
    ...
}
----


=== Select encode input and DPB formats supported by the video encode profile

[source,c]
----
VkVideoProfileInfoKHR profileInfo = {
    ...
};

VkVideoProfileListInfoKHR profileListInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_PROFILE_LIST_INFO_KHR,
    .pNext = NULL,
    .profileCount = 1,
    .pProfiles = &profileInfo
};

VkPhysicalDeviceVideoFormatInfoKHR formatInfo = {
    .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VIDEO_FORMAT_INFO_KHR,
    .pNext = &profileListInfo
};

VkVideoFormatPropertiesKHR* formatProps = NULL;
uint32_t formatCount = 0;

// First query encode input formats
formatInfo.imageUsage = VK_IMAGE_USAGE_VIDEO_ENCODE_SRC_BIT_KHR;

vkGetPhysicalDeviceVideoFormatPropertiesKHR(physicalDevice, &formatInfo, &formatCount, NULL);
formatProps = calloc(formatCount, sizeof(VkVideoFormatPropertiesKHR));
for (uint32_t i = 0; i < formatCount; ++i) {
    // Each array element needs its own sType set before the second call
    formatProps[i].sType = VK_STRUCTURE_TYPE_VIDEO_FORMAT_PROPERTIES_KHR;
}
vkGetPhysicalDeviceVideoFormatPropertiesKHR(physicalDevice, &formatInfo, &formatCount, formatProps);

for (uint32_t i = 0; i < formatCount; ++i) {
    // Select encode input format and image creation capabilities best suited for the use case
    ...
}
free(formatProps);

// Then query DPB formats
formatInfo.imageUsage = VK_IMAGE_USAGE_VIDEO_ENCODE_DPB_BIT_KHR;

vkGetPhysicalDeviceVideoFormatPropertiesKHR(physicalDevice, &formatInfo, &formatCount, NULL);
formatProps = calloc(formatCount, sizeof(VkVideoFormatPropertiesKHR));
for (uint32_t i = 0; i < formatCount; ++i) {
    formatProps[i].sType = VK_STRUCTURE_TYPE_VIDEO_FORMAT_PROPERTIES_KHR;
}
vkGetPhysicalDeviceVideoFormatPropertiesKHR(physicalDevice, &formatInfo, &formatCount, formatProps);

for (uint32_t i = 0; i < formatCount; ++i) {
    // Select DPB format and image creation capabilities best suited for the use case
    ...
}
free(formatProps);
----


=== Create bitstream buffer

[source,c]
----
VkBuffer bitstreamBuffer = VK_NULL_HANDLE;

VkVideoProfileListInfoKHR profileListInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_PROFILE_LIST_INFO_KHR,
    .pNext = NULL,
    .profileCount = ... // number of video profiles to use the bitstream buffer with
    .pProfiles = ... // pointer to an array of video profile information structure chains
};

VkBufferCreateInfo createInfo = {
    .sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO,
    .pNext = &profileListInfo,
    ...
    .usage = VK_BUFFER_USAGE_VIDEO_ENCODE_DST_BIT_KHR | ... // any other usages that may be needed
    ...
};

vkCreateBuffer(device, &createInfo, NULL, &bitstreamBuffer);
----


=== Create encode input image and image view

[source,c]
----
VkImage inputImage = VK_NULL_HANDLE;
VkImageView inputImageView = VK_NULL_HANDLE;

VkVideoProfileListInfoKHR profileListInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_PROFILE_LIST_INFO_KHR,
    .pNext = NULL,
    .profileCount = ... // number of video profiles to use the encode input image with
    .pProfiles = ... // pointer to an array of video profile information structure chains
};

VkImageCreateInfo imageCreateInfo = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO,
    .pNext = &profileListInfo,
    ...
    .usage = VK_IMAGE_USAGE_VIDEO_ENCODE_SRC_BIT_KHR | ... // any other usages that may be needed
    ...
};

vkCreateImage(device, &imageCreateInfo, NULL, &inputImage);

VkImageViewUsageCreateInfo imageViewUsageInfo = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_USAGE_CREATE_INFO,
    .pNext = NULL,
    .usage = VK_IMAGE_USAGE_VIDEO_ENCODE_SRC_BIT_KHR
};

VkImageViewCreateInfo imageViewCreateInfo = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO,
    .pNext = &imageViewUsageInfo,
    .flags = 0,
    .image = inputImage,
    .viewType = ... // image view type (only 2D or 2D_ARRAY is supported)
    ... // other image view creation parameters
};

vkCreateImageView(device, &imageViewCreateInfo, NULL, &inputImageView);
----


=== Create DPB image and image view

[source,c]
----
// NOTE: This example creates a single image and image view that is used to back all DPB pictures
// but, depending on the support of the VK_VIDEO_CAPABILITY_SEPARATE_REFERENCE_IMAGES_BIT_KHR
// capability flag, the application can choose to create separate images for each DPB slot or
// picture

VkImage dpbImage = VK_NULL_HANDLE;
VkImageView dpbImageView = VK_NULL_HANDLE;

VkVideoProfileListInfoKHR profileListInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_PROFILE_LIST_INFO_KHR,
    .pNext = NULL,
    .profileCount = ... // number of video profiles to use the encode DPB image with
    .pProfiles = ... // pointer to an array of video profile information structure chains
};

VkImageCreateInfo imageCreateInfo = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO,
    .pNext = &profileListInfo,
    ...
    .usage = VK_IMAGE_USAGE_VIDEO_ENCODE_DPB_BIT_KHR | ... // any other usages that may be needed
    ...
    .arrayLayers = ... // typically equal to the DPB slot count
};

vkCreateImage(device, &imageCreateInfo, NULL, &dpbImage);

VkImageViewUsageCreateInfo imageViewUsageInfo = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_USAGE_CREATE_INFO,
    .pNext = NULL,
    .usage = VK_IMAGE_USAGE_VIDEO_ENCODE_DPB_BIT_KHR
};

VkImageViewCreateInfo imageViewCreateInfo = {
    .sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO,
    .pNext = &imageViewUsageInfo,
    .flags = 0,
    .image = dpbImage,
    .viewType = ... // image view type (only 2D or 2D_ARRAY is supported)
    ... // other image view creation parameters
};

vkCreateImageView(device, &imageViewCreateInfo, NULL, &dpbImageView);
----


=== Create and use video encode feedback query pool with a video session

[source,c]
----
VkQueryPool queryPool = VK_NULL_HANDLE;

VkVideoProfileInfoKHR profileInfo = {
    ...
};

// We will capture both bitstream offset and bitstream bytes written in the feedback
VkVideoEncodeFeedbackFlagsKHR capturedEncodeFeedbackValues =
    VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_BUFFER_OFFSET_BIT_KHR |
    VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_BYTES_WRITTEN_BIT_KHR;

// NOTE: Only the encode feedback values listed above are required to be supported by all
// video encode implementations. So if the application intends to use other encode
// feedback values like VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_HAS_OVERRIDES_BIT_KHR, then
// it must first check support for it as indicated by the supportedEncodeFeedbackFlags
// capability for the video encode profile in question.

VkQueryPoolVideoEncodeFeedbackCreateInfoKHR feedbackInfo = {
    .sType = VK_STRUCTURE_TYPE_QUERY_POOL_VIDEO_ENCODE_FEEDBACK_CREATE_INFO_KHR,
    .pNext = &profileInfo,
    .encodeFeedbackFlags = capturedEncodeFeedbackValues
};

VkQueryPoolCreateInfo createInfo = {
    .sType = VK_STRUCTURE_TYPE_QUERY_POOL_CREATE_INFO,
    .pNext = &feedbackInfo,
    .flags = 0,
    .queryType = VK_QUERY_TYPE_VIDEO_ENCODE_FEEDBACK_KHR,
    ...
};

vkCreateQueryPool(device, &createInfo, NULL, &queryPool);

...
vkBeginCommandBuffer(commandBuffer, ...);
...
vkCmdBeginVideoCodingKHR(commandBuffer, ...);
...
vkCmdBeginQuery(commandBuffer, queryPool, 0, 0);
// Issue video encode operation
...
vkCmdEndQuery(commandBuffer, queryPool, 0);
...
vkCmdEndVideoCodingKHR(commandBuffer, ...);
...
vkEndCommandBuffer(commandBuffer);
...

// We retrieve the captured feedback values as well as the status
struct {
    uint32_t                bitstreamBufferOffset;
    uint32_t                bitstreamBytesWritten;
    VkQueryResultStatusKHR  status;
} results;
vkGetQueryPoolResults(device, queryPool, 0, 1,
                      sizeof(results), &results, sizeof(results),
                      VK_QUERY_RESULT_WITH_STATUS_BIT_KHR);

if (results.status == VK_QUERY_RESULT_STATUS_NOT_READY_KHR /* 0 */) {
    // Query result not ready yet
    ...
} else if (results.status > 0) {
    // Video encode operation was successful, we can use bitstream feedback data
    ...
} else if (results.status < 0) {
    // Video encode operation was unsuccessful, feedback data is undefined
    ...
}
----


=== Record encode operation (video session without DPB slots)

[source,c]
----
vkCmdBeginVideoCodingKHR(commandBuffer, ...);

VkVideoPictureResourceInfoKHR encodeInputPictureResource = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_PICTURE_RESOURCE_INFO_KHR,
    .pNext = NULL,
    .codedOffset = ... // offset within the image subresource (typically { 0, 0 })
    .codedExtent = ... // extent of encoded picture (typically the video frame size)
    .baseArrayLayer = 0,
    .imageViewBinding = inputImageView
};

VkVideoEncodeInfoKHR encodeInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_ENCODE_INFO_KHR,
    .pNext = ... // pointer to codec-specific picture information structure
    .flags = 0,
    .dstBuffer = bitstreamBuffer,
    .dstBufferOffset = ... // offset where the encoded bitstream is written
    .dstBufferRange = ... // maximum size in bytes of the written bitstream data
    .srcPictureResource = encodeInputPictureResource,
    .pSetupReferenceSlot = NULL,
    .referenceSlotCount = 0,
    .pReferenceSlots = NULL,
    .precedingExternallyEncodedBytes = ...
};

vkCmdEncodeVideoKHR(commandBuffer, &encodeInfo);

vkCmdEndVideoCodingKHR(commandBuffer, ...);
----


917=== Record encode operation with reconstructed picture information
918
919[source,c]
920----
921// Bound reference resource list provided has to include reconstructed picture resource
922vkCmdBeginVideoCodingKHR(commandBuffer, ...);
923
924VkVideoPictureResourceInfoKHR encodeInputPictureResource = {
925    .sType = VK_STRUCTURE_TYPE_VIDEO_PICTURE_RESOURCE_INFO_KHR,
926    .pNext = NULL,
927    .codedOffset = ... // offset within the image subresource (typically { 0, 0 })
928    .codedExtent = ... // extent of encoded picture (typically the video frame size)
929    .baseArrayLayer = 0,
930    .imageViewBinding = inputImageView
931};
932
933VkVideoPictureResourceInfoKHR reconstructedPictureResource = {
934    .sType = VK_STRUCTURE_TYPE_VIDEO_PICTURE_RESOURCE_INFO_KHR,
935    .pNext = NULL,
936    .codedOffset = ... // offset within the image subresource (typically { 0, 0 })
937    .codedExtent = ... // extent of reconstructed picture (typically the video frame size)
938    .baseArrayLayer = ... // layer to use for setup picture in DPB
939    .imageViewBinding = dpbImageView
940};
941
942VkVideoReferenceSlotInfoKHR setupSlotInfo = {
943    .sType = VK_STRUCTURE_TYPE_VIDEO_REFERENCE_SLOT_INFO_KHR,
944    .pNext = ... // pointer to codec-specific reconstructed picture information structure
945    .slotIndex = ... // DPB slot index to use with the reconstructed picture
946                     // (optionally activated per the codec-specific semantics)
947    .pPictureResource = &reconstructedPictureResource
948};
949
950VkVideoEncodeInfoKHR encodeInfo = {
951    .sType = VK_STRUCTURE_TYPE_VIDEO_ENCODE_INFO_KHR,
952    .pNext = ... // pointer to codec-specific picture information structure
953    ...
954    .srcPictureResource = encodeInputPictureResource,
955    .pSetupReferenceSlot = &setupSlotInfo,
956    ...
957};
958
959vkCmdEncodeVideoKHR(commandBuffer, &encodeInfo);
960
961vkCmdEndVideoCodingKHR(commandBuffer, ...);
962----


=== Record encode operation with reference picture list

[source,c]
----
// Bound reference resource list provided has to include all used reference picture resources
vkCmdBeginVideoCodingKHR(commandBuffer, ...);

VkVideoPictureResourceInfoKHR referencePictureResources[] = {
    {
        .sType = VK_STRUCTURE_TYPE_VIDEO_PICTURE_RESOURCE_INFO_KHR,
        .pNext = NULL,
        .codedOffset = ... // offset within the image subresource (typically { 0, 0 })
        .codedExtent = ... // extent of reference picture (typically the video frame size)
        .baseArrayLayer = ... // layer of first reference picture resource
        .imageViewBinding = dpbImageView
    },
    {
        .sType = VK_STRUCTURE_TYPE_VIDEO_PICTURE_RESOURCE_INFO_KHR,
        .pNext = NULL,
        .codedOffset = ... // offset within the image subresource (typically { 0, 0 })
        .codedExtent = ... // extent of reference picture (typically the video frame size)
        .baseArrayLayer = ... // layer of second reference picture resource
        .imageViewBinding = dpbImageView
    },
    ...
};
// NOTE: Individual resources do not have to refer to the same image view, e.g. if different
// image views are created for each picture resource, or if the
// VK_VIDEO_CAPABILITY_SEPARATE_REFERENCE_IMAGES_BIT_KHR capability is supported and the
// application created separate images for the reference pictures.

VkVideoReferenceSlotInfoKHR referenceSlotInfo[] = {
    {
        .sType = VK_STRUCTURE_TYPE_VIDEO_REFERENCE_SLOT_INFO_KHR,
        .pNext = ... // pointer to codec-specific reference picture information structure
        .slotIndex = ... // DPB slot index of the first reference picture
        .pPictureResource = &referencePictureResources[0]
    },
    {
        .sType = VK_STRUCTURE_TYPE_VIDEO_REFERENCE_SLOT_INFO_KHR,
        .pNext = ... // pointer to codec-specific reference picture information structure
        .slotIndex = ... // DPB slot index of the second reference picture
        .pPictureResource = &referencePictureResources[1]
    },
    ...
};

VkVideoEncodeInfoKHR encodeInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_ENCODE_INFO_KHR,
    .pNext = ... // pointer to codec-specific picture information structure
    ...
    .referenceSlotCount = sizeof(referenceSlotInfo) / sizeof(referenceSlotInfo[0]),
    .pReferenceSlots = &referenceSlotInfo[0]
};

vkCmdEncodeVideoKHR(commandBuffer, &encodeInfo);

vkCmdEndVideoCodingKHR(commandBuffer, ...);
----


=== Encode codec-specific parameters stored in video session parameters objects

[source,c]
----
VkVideoEncodeSessionParametersGetInfoKHR getInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_ENCODE_SESSION_PARAMETERS_GET_INFO_KHR,
    .pNext = ... // pointer to any codec-specific parameters, if needed
    .videoSessionParameters = ... // video session parameters object to query
};

// VK_TRUE, if application prefers to encode the stored codec-specific parameters
// itself, if possible, VK_FALSE otherwise
VkBool32 preferApplicationParameterEncode = ...;

VkBool32 parametersContainOverrides = VK_FALSE;

if (preferApplicationParameterEncode) {
    VkVideoEncodeSessionParametersFeedbackInfoKHR feedbackInfo = {
        .sType = VK_STRUCTURE_TYPE_VIDEO_ENCODE_SESSION_PARAMETERS_FEEDBACK_INFO_KHR,
        .pNext = ... // pointer to any codec-specific feedback info, if needed
        .hasOverrides = VK_FALSE
    };

    vkGetEncodedVideoSessionParametersKHR(device, &getInfo, &feedbackInfo, NULL, NULL);

    parametersContainOverrides = feedbackInfo.hasOverrides;
}

if (preferApplicationParameterEncode && !parametersContainOverrides) {
    // Encode codec-specific parameters manually
    ...
} else {
    // Retrieve encoded codec-specific parameters from implementation
    size_t dataSize = 0;
    vkGetEncodedVideoSessionParametersKHR(device, &getInfo, NULL, &dataSize, NULL);

    // Pointer to CPU buffer with at least dataSize number of bytes of storage
    // (allocate it on demand or use an existing pool used for bitstream storage)
    void* data = ...;
    vkGetEncodedVideoSessionParametersKHR(device, &getInfo, NULL, &dataSize, data);
}
----


=== Change the rate control configuration of a video encode session

[source,c]
----
vkCmdBeginVideoCodingKHR(commandBuffer, ...);

VkVideoEncodeRateControlLayerInfoKHR rateControlLayers[] = {
    {
        .sType = VK_STRUCTURE_TYPE_VIDEO_ENCODE_RATE_CONTROL_LAYER_INFO_KHR,
        .pNext = ... // pointer to optional codec-specific rate control layer configuration
        .averageBitrate = 2000000, // 2 Mbps target bitrate
        .maxBitrate = 5000000, // 5 Mbps peak bitrate
        .frameRateNumerator = 30000, // 29.97 fps numerator
        .frameRateDenominator = 1001 // 29.97 fps denominator
    },
    ...
};

VkVideoEncodeRateControlInfoKHR rateControlInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_ENCODE_RATE_CONTROL_INFO_KHR,
    .pNext = ... // pointer to optional codec-specific rate control configuration
    .flags = 0,
    .rateControlMode = VK_VIDEO_ENCODE_RATE_CONTROL_MODE_VBR_BIT_KHR, // variable bitrate mode
    .layerCount = sizeof(rateControlLayers) / sizeof(rateControlLayers[0]),
    .pLayers = rateControlLayers,
    .virtualBufferSizeInMs = 2000, // virtual buffer size is 2 seconds
    .initialVirtualBufferSizeInMs = 0
};

// Change the rate control configuration for the video session
VkVideoCodingControlInfoKHR controlInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_CODING_CONTROL_INFO_KHR,
    .pNext = &rateControlInfo,
    .flags = VK_VIDEO_CODING_CONTROL_ENCODE_RATE_CONTROL_BIT_KHR
};

vkCmdControlVideoCodingKHR(commandBuffer, &controlInfo);

...

vkCmdEndVideoCodingKHR(commandBuffer, ...);
----


=== Change the video encode quality level used by a video encode session

[source,c]
----
vkCmdBeginVideoCodingKHR(commandBuffer, ...);

VkVideoEncodeQualityLevelInfoKHR qualityLevelInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_ENCODE_QUALITY_LEVEL_INFO_KHR,
    .pNext = NULL,
    .qualityLevel = ... // the new quality level to set
};

VkVideoCodingControlInfoKHR controlInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_CODING_CONTROL_INFO_KHR,
    .pNext = &qualityLevelInfo,
    .flags = VK_VIDEO_CODING_CONTROL_ENCODE_QUALITY_LEVEL_BIT_KHR
};

vkCmdControlVideoCodingKHR(commandBuffer, &controlInfo);

...

vkCmdEndVideoCodingKHR(commandBuffer, ...);
----


=== Initialize a video encode session with a specific quality level and corresponding recommended rate control settings

[source,c]
----
// Construct the video encode profile with appropriate usage scenario information
// We also include the optional encode usage information here
VkVideoEncodeUsageInfoKHR profileUsageInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_ENCODE_USAGE_INFO_KHR,
    .pNext = ... // pointer to codec-specific profile structure
    .videoUsageHints = ... // usage hints
    .videoContentHints = ... // content hints
    .tuningMode = ... // tuning mode
};

VkVideoProfileInfoKHR profileInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_PROFILE_INFO_KHR,
    .pNext = &profileUsageInfo,
    ...
};

// Query the video encode profile capabilities to determine maxQualityLevels
VkVideoEncodeCapabilitiesKHR encodeCapabilities = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_ENCODE_CAPABILITIES_KHR,
    .pNext = ... // pointer to codec-specific capability structure
};

VkVideoCapabilitiesKHR capabilities = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_CAPABILITIES_KHR,
    .pNext = &encodeCapabilities
};

result = vkGetPhysicalDeviceVideoCapabilitiesKHR(physicalDevice, &profileInfo, &capabilities);

// Select a quality level to use between 0 and maxQualityLevels-1
uint32_t selectedQualityLevel = selectQualityLevelFrom(0, encodeCapabilities.maxQualityLevels - 1);

// Query recommended settings for the selected video encode quality level
VkPhysicalDeviceVideoEncodeQualityLevelInfoKHR qualityLevelInfo = {
    .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VIDEO_ENCODE_QUALITY_LEVEL_INFO_KHR,
    .pNext = NULL,
    .pVideoProfile = &profileInfo,
    .qualityLevel = selectedQualityLevel
};

VkVideoEncodeQualityLevelPropertiesKHR qualityLevelProps = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_ENCODE_QUALITY_LEVEL_PROPERTIES_KHR,
    .pNext = ... // pointer to any codec-specific parameters, if needed
};

result = vkGetPhysicalDeviceVideoEncodeQualityLevelPropertiesKHR(physicalDevice, &qualityLevelInfo, &qualityLevelProps);

...

// Video session parameters are always created with respect to the used
// video encode quality level, so create one accordingly
VkVideoEncodeQualityLevelInfoKHR paramsQualityLevelInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_ENCODE_QUALITY_LEVEL_INFO_KHR,
    .pNext = ... // pointer to codec-specific parameters creation information
    .qualityLevel = selectedQualityLevel
};

VkVideoSessionParametersCreateInfoKHR paramsCreateInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_SESSION_PARAMETERS_CREATE_INFO_KHR,
    .pNext = &paramsQualityLevelInfo,
    ...
};

VkVideoSessionParametersKHR params = VK_NULL_HANDLE;
result = vkCreateVideoSessionParametersKHR(device, &paramsCreateInfo, NULL, &params);

...

vkCmdBeginVideoCodingKHR(commandBuffer, ...);

// Initialize the video session, set the quality level, and the
// recommended rate control configuration
// NOTE: The application can choose other rate control settings as the
// quality level properties only indicate preference, not a requirement

// Include rate control information
VkVideoEncodeRateControlInfoKHR rateControlInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_ENCODE_RATE_CONTROL_INFO_KHR,
    .pNext = ... // pointer to optional codec-specific rate control configuration
    .flags = 0,
    .rateControlMode = qualityLevelProps.preferredRateControlMode,
    .layerCount = qualityLevelProps.preferredRateControlLayerCount,
    ...
};

// Include quality level information
VkVideoEncodeQualityLevelInfoKHR qualityLevelInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_ENCODE_QUALITY_LEVEL_INFO_KHR,
    .pNext = &rateControlInfo,
    .qualityLevel = selectedQualityLevel
};

// Include all of the RESET, ENCODE_QUALITY_LEVEL, and RATE_CONTROL bits
// because in this example we do an initialization followed by an immediate
// update to the quality level and rate control states
VkVideoCodingControlInfoKHR controlInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_CODING_CONTROL_INFO_KHR,
    .pNext = &qualityLevelInfo,
    .flags = VK_VIDEO_CODING_CONTROL_RESET_BIT_KHR
           | VK_VIDEO_CODING_CONTROL_ENCODE_QUALITY_LEVEL_BIT_KHR
           | VK_VIDEO_CODING_CONTROL_ENCODE_RATE_CONTROL_BIT_KHR
};

vkCmdControlVideoCodingKHR(commandBuffer, &controlInfo);

...

vkCmdEndVideoCodingKHR(commandBuffer, ...);
----


== Issues

=== RESOLVED: Why is there no `VK_PIPELINE_STAGE_VIDEO_ENCODE_BIT_KHR`?

This extension requires the `VK_KHR_synchronization2` extension because the new access flags introduced did not fit in the 32-bit enum `VkAccessFlagBits`. Accordingly, all new pipeline stage and access flags have been added to the corresponding 64-bit enums and no new flags have been added to the legacy 32-bit enums. While the new pipeline stage flag introduced uses bit #27, which would also fit in the legacy `VkPipelineStageFlagBits` enum, there is no real benefit in including it. Instead, the bit is marked reserved.
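The size argument can be checked with a few lines of C. This is only an illustrative sketch: the constants are local stand-ins that are assumed to mirror the values of `VK_PIPELINE_STAGE_2_VIDEO_ENCODE_BIT_KHR`, `VK_ACCESS_2_VIDEO_ENCODE_READ_BIT_KHR`, and `VK_ACCESS_2_VIDEO_ENCODE_WRITE_BIT_KHR` in the Vulkan headers, and the helper name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins; the values are assumed to match the Vulkan headers
   and should be verified against vulkan_core.h before relying on them. */
static const uint64_t STAGE_2_VIDEO_ENCODE_BIT        = 0x08000000ULL;   /* bit 27 */
static const uint64_t ACCESS_2_VIDEO_ENCODE_READ_BIT  = 0x2000000000ULL; /* bit 37 */
static const uint64_t ACCESS_2_VIDEO_ENCODE_WRITE_BIT = 0x4000000000ULL; /* bit 38 */

/* Hypothetical helper: true if the flag value is representable
   in a legacy 32-bit flags enum. */
static bool fits_in_legacy_32bit_flags(uint64_t flag)
{
    return flag <= UINT32_MAX;
}
```

Under these assumptions the stage flag fits in 32 bits while both access flags do not, which is why only the 64-bit enums of `VK_KHR_synchronization2` can carry the complete set.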


=== RESOLVED: How can layered codec-specific encode extensions enable applications to provide the necessary codec-specific picture information, parameter sets, etc. that may be needed to perform the video coding operations?

There are multiple points where codec-specific picture information can be provided to a video encode operation. This extension suggests the following convention:

  * Codec-specific encode parameters are expected to be provided in the `pNext` chain of `VkVideoEncodeInfoKHR`.
  * Codec-specific reconstructed picture information is expected to be provided in the `pNext` chain of `VkVideoEncodeInfoKHR::pSetupReferenceSlot`.
  * Codec-specific reference picture information is expected to be provided in the `pNext` chain of the elements of the `VkVideoEncodeInfoKHR::pReferenceSlots` array.
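The convention above relies on the usual `pNext` chain mechanism. As a rough sketch of how a consumer of such a chain could locate a codec-specific structure, the following uses a stand-in type modeled on `VkBaseInStructure`; the type, the structure type identifiers, and the function name are all hypothetical and exist only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in modeled on VkBaseInStructure: every chainable structure
   begins with a type tag and a pointer to the next structure. */
typedef struct BaseIn {
    uint32_t             sType;
    const struct BaseIn *pNext;
} BaseIn;

/* Hypothetical structure type identifiers, for illustration only. */
enum {
    STYPE_ENCODE_INFO        = 1,
    STYPE_CODEC_PICTURE_INFO = 2,
    STYPE_UNRELATED          = 3
};

/* Walks a pNext chain and returns the first structure with the
   requested type tag, or NULL if the chain does not contain one. */
static const BaseIn *find_in_chain(const void *pNext, uint32_t sType)
{
    for (const BaseIn *it = (const BaseIn *)pNext; it != NULL; it = it->pNext) {
        if (it->sType == sType) {
            return it;
        }
    }
    return NULL;
}
```

The same lookup applies to each of the three attachment points listed above; only the head of the chain differs.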


=== RESOLVED: Can `vkCmdEncodeVideoKHR` only encode frames? What about field encoding, slice encoding, etc.?

This extension does not define the types of pictures or sub-picture content that can be encoded by a `vkCmdEncodeVideoKHR` command. It is expected that the codec-specific encode extensions built upon this extension define the types of pictures that can be encoded. Furthermore, both codec-specific and codec-independent extensions can expand the set of capabilities introduced here to enable more advanced use cases, as needed.


=== RESOLVED: What is the effect of the flags provided in `VkVideoEncodeUsageInfoKHR::videoUsageHints` and `VkVideoEncodeUsageInfoKHR::videoContentHints`?

There are no specific behavioral effects associated with any of the video encode usage and content hints, so the application can specify any combination of these flags. They are included to enable the application to better communicate the intended use case scenario to the implementation.

However, just like any other additional video profile information included in the `pNext` chain of `VkVideoProfileInfoKHR` structures, they are part of the video profile definition, hence whenever matching video profiles have to be provided to an API call, be that queries or resource creation structures, the application must provide identical video encode usage and content hint values. This also applies if the application does not include the `VkVideoEncodeUsageInfoKHR` structure, which is treated equivalently to specifying the structure with `videoUsageHints`, `videoContentHints`, and `tuningMode` equal to `VK_VIDEO_ENCODE_USAGE_DEFAULT_KHR`, `VK_VIDEO_ENCODE_CONTENT_DEFAULT_KHR`, and `VK_VIDEO_ENCODE_TUNING_MODE_DEFAULT_KHR` (or zero), respectively, per the usual conventions of Vulkan.
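The matching rule, including the treatment of an absent structure, can be sketched as a comparison helper. The `UsageInfo` type below is a reduced stand-in for `VkVideoEncodeUsageInfoKHR`, and the function names are hypothetical; the sketch only assumes, as stated above, that the default values are zero.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced stand-in for VkVideoEncodeUsageInfoKHR (illustration only). */
typedef struct UsageInfo {
    uint32_t videoUsageHints;
    uint32_t videoContentHints;
    uint32_t tuningMode;
} UsageInfo;

/* A missing structure is treated as if all fields had their
   default (zero) values. */
static UsageInfo effective_usage(const UsageInfo *info)
{
    UsageInfo defaults = { 0u, 0u, 0u };
    return (info != NULL) ? *info : defaults;
}

/* Two profiles match with respect to usage information only if all
   three effective values are identical. */
static bool usage_info_matches(const UsageInfo *a, const UsageInfo *b)
{
    UsageInfo ea = effective_usage(a);
    UsageInfo eb = effective_usage(b);
    return ea.videoUsageHints == eb.videoUsageHints
        && ea.videoContentHints == eb.videoContentHints
        && ea.tuningMode == eb.tuningMode;
}
```

With this model, omitting the structure in one call and passing an all-zero structure in another still describes the same video profile, whereas changing a single hint bit does not.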


=== RESOLVED: What is the effect of the tuning mode provided in `VkVideoEncodeUsageInfoKHR::tuningMode`?

Unlike the other fields in `VkVideoEncodeUsageInfoKHR`, the tuning mode affects the behavior of video session objects created with a video profile that specifies it. Different tuning modes may put the hardware in a different mode of operation tuned for the particular use case, with significantly different capabilities as well as quality and performance characteristics.


=== RESOLVED: How should we expose video encoding feedback values (e.g. encoded bitstream size)?

Through a new query type. We follow the model of pipeline statistics queries so that additional feedback values can be added to the query later. Thus, this extension introduces a new `VK_QUERY_TYPE_VIDEO_ENCODE_FEEDBACK_KHR` query type that can report the offset and size of the bitstream data produced by video encode operations (amongst other feedback values). We expect that video decode operations will need to support similar feedback values in the future, in which case a similar query type for video decode operations can be introduced by another extension.
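As a hedged sketch of what consuming such a query result could look like, the helper below unpacks one result under the assumption that, as with pipeline statistics queries, one 32-bit value is written per enabled feedback flag in ascending bit order, followed by a status word when a result status is requested. The flag values, the `EncodeFeedback` type, and the function name are stand-ins for illustration; the specification remains the authority on the actual layout.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the feedback flag bits (values assumed for illustration). */
#define FEEDBACK_BUFFER_OFFSET_BIT 0x1u
#define FEEDBACK_BYTES_WRITTEN_BIT 0x2u
#define FEEDBACK_HAS_OVERRIDES_BIT 0x4u

typedef struct EncodeFeedback {
    uint32_t bitstreamOffset;
    uint32_t bytesWritten;
    bool     hasOverrides;
    int32_t  status; /* assumed: positive means complete, negative means error */
} EncodeFeedback;

/* Unpacks one query result: one value per enabled flag in ascending
   bit order, then a trailing status word (assumed layout). */
static EncodeFeedback unpack_feedback(uint32_t enabledFlags, const uint32_t *values)
{
    EncodeFeedback fb = { 0u, 0u, false, 0 };
    size_t i = 0;
    if (enabledFlags & FEEDBACK_BUFFER_OFFSET_BIT) {
        fb.bitstreamOffset = values[i++];
    }
    if (enabledFlags & FEEDBACK_BYTES_WRITTEN_BIT) {
        fb.bytesWritten = values[i++];
    }
    if (enabledFlags & FEEDBACK_HAS_OVERRIDES_BIT) {
        fb.hasOverrides = (values[i++] != 0u);
    }
    fb.status = (int32_t)values[i];
    return fb;
}
```

An application would typically check `status` first and only then trust the offset and size to locate the encoded data in the bitstream buffer.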


=== RESOLVED: Do result status queries need to be used in conjunction with video encode feedback queries?

No. In fact, only a single query can ever be active within a video coding scope, hence executing both a result status query and a video encode feedback query for the same video encode operation is not possible. It is also not needed, as all query types allow returning a result status, just like an availability status. Thus, in practice, result status queries only need to be used when no other query type is supported in the particular context, and, in the case of video encoding, applications are expected to use only video encode feedback queries within a video coding scope.


=== RESOLVED: Why is there a need to allow implementations to override codec-specific parameters?

As described in the corresponding section earlier, encoder implementations usually only support a subset of the available encoding tools defined by the corresponding video compression standards, and exhaustively enumerating all of these constraints would be impractical and could result in a combinatorial explosion of codec-specific capabilities. Instead, this proposal allows implementations to override any codec-specific parameter values or combinations thereof, so that the resulting parameters comply with the constraints of the target implementation.

Some other video encode APIs do not support implementation overrides, but the drawback of that choice is that implementations may not be able to expose a potentially large set of their encoding tools just because they do not comply with the exact wording of the capabilities defined by these APIs, so this proposal chose to maximize the exposed capabilities instead.

Such minimal and necessary implementation overrides are expected to be applied only when they are absolutely necessary for the correct functioning of the underlying encoder hardware. Additional, optimizing overrides can, however, be explicitly enabled by the application using the `VK_VIDEO_SESSION_CREATE_ALLOW_ENCODE_PARAMETER_OPTIMIZATIONS_BIT_KHR` video session creation flag.


=== RESOLVED: Can the application disable all implementation overrides?

No. Without the ability to override codec-specific parameters, as necessitated by the constraints of the target implementation, the implementation may not be able to guarantee that the generated bitstreams will be compliant with the video compression standard in question.

Accordingly, if the API allowed the application to disable all implementation overrides, that would, for all practical purposes, be equivalent to a flag enabling undefined behavior from the perspective of video compression standard compliance.

For the same reason, if the application chooses to encode the codec-specific parameters stored in a video session parameters object on its own, regardless of whether the implementation had to apply overrides to them, as reported by `vkGetEncodedVideoSessionParametersKHR`, it risks producing a non-compliant final bitstream.

Applications seeking to only accept bitstreams produced exactly according to the codec-specific parameters they requested can choose to treat the presence of any overrides as an encoding error.
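The choices described in this issue can be summarized as a small application-side policy. The sketch below is not part of the API: the enum, the function, and the notion of a "strict" application are hypothetical names, and the `hasOverrides` input is assumed to come from `VkVideoEncodeSessionParametersFeedbackInfoKHR::hasOverrides`.

```c
#include <stdbool.h>

/* Hypothetical application-side outcomes, for illustration only. */
typedef enum ParameterSource {
    PARAMS_ENCODED_BY_APPLICATION,
    PARAMS_ENCODED_BY_IMPLEMENTATION,
    PARAMS_REJECTED_AS_ERROR
} ParameterSource;

/* Decides who produces the encoded parameter sets.
   - strict applications treat any override as an encoding error
   - the application may encode the parameters itself only when the
     implementation reported no overrides
   - otherwise the implementation-encoded parameters are retrieved */
static ParameterSource choose_parameter_source(bool preferApplicationEncode,
                                               bool hasOverrides,
                                               bool strict)
{
    if (hasOverrides && strict) {
        return PARAMS_REJECTED_AS_ERROR;
    }
    if (preferApplicationEncode && !hasOverrides) {
        return PARAMS_ENCODED_BY_APPLICATION;
    }
    return PARAMS_ENCODED_BY_IMPLEMENTATION;
}
```

The important property is that the application never encodes parameters on its own once overrides have been reported, since that is the case in which bitstream compliance would be at risk.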


=== RESOLVED: Can implementations override any codec-specific parameter?

No. First, there is a set of rules that implementations have to comply with when applying any parameter overrides, as defined in detail in the specification. In addition, codec-specific extensions layered on top of this proposal can define their own restrictions on which parameters implementations can override. In practice, it is expected that certain codec-specific parameters that affect the overall behavior of the encoder and that could have an impact on any additional bitstream elements that need to be encoded by the application will never be overridden by the implementation, and thus will be excluded from the set of overridable parameters in the corresponding codec-specific extension.

Over time, it is expected that the set of these guarantees will grow (e.g. by exposing additional capabilities) according to the needs of encoder applications.


=== RESOLVED: Do all implementations have to implement the same rate control algorithms corresponding to the rate control modes defined by this proposal?

No. While the high-level rate control modes (CBR and VBR) defined by this proposal are fairly universal, each rate control mode can be implemented in many different ways while still complying with the fundamental model of the mode itself. In practice, the rate control algorithms employed by implementations differ significantly.

Accordingly, this proposal does not try to describe any specific rate control algorithm for any of the rate control modes introduced; rather, it provides a high-level description of the modes and the underlying leaky bucket model used by them.

The only case where the effects of rate control are defined exactly is when rate control is disabled (using `VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR`), where implementations must encode the pictures exactly per the application-specified codec-specific quantization parameters.
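To give a feel for the leaky bucket model mentioned above, here is a deliberately simplified sketch. It is not the algorithm of any implementation: it assumes the virtual buffer holds `averageBitrate * virtualBufferSizeInMs / 1000` bits, starts at the occupancy implied by `initialVirtualBufferSizeInMs`, fills by the size of each encoded frame, and drains by the average number of bits per frame interval. The type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified virtual buffer state (illustration only). */
typedef struct LeakyBucket {
    double capacityBits;      /* size of the virtual buffer in bits */
    double drainBitsPerFrame; /* bits leaving the buffer per frame interval */
    double fullnessBits;      /* current occupancy in bits */
} LeakyBucket;

static LeakyBucket bucket_init(uint64_t averageBitrate,
                               uint32_t virtualBufferSizeInMs,
                               uint32_t initialVirtualBufferSizeInMs,
                               uint32_t frameRateNumerator,
                               uint32_t frameRateDenominator)
{
    LeakyBucket b;
    b.capacityBits = (double)averageBitrate * virtualBufferSizeInMs / 1000.0;
    b.fullnessBits = (double)averageBitrate * initialVirtualBufferSizeInMs / 1000.0;
    b.drainBitsPerFrame = (double)averageBitrate * frameRateDenominator / frameRateNumerator;
    return b;
}

/* Adds one encoded frame and then drains one frame interval.
   Returns false if the frame overflowed the virtual buffer. */
static bool bucket_add_frame(LeakyBucket *b, double frameBits)
{
    b->fullnessBits += frameBits;
    bool fits = b->fullnessBits <= b->capacityBits;
    if (!fits) {
        b->fullnessBits = b->capacityBits; /* clamp on overflow */
    }
    b->fullnessBits -= b->drainBitsPerFrame;
    if (b->fullnessBits < 0.0) {
        b->fullnessBits = 0.0; /* the buffer cannot underflow below empty */
    }
    return fits;
}
```

With a 2 Mbps average bitrate, a 2000 ms virtual buffer, and a 30000/1001 frame rate, the capacity is 4,000,000 bits and roughly 66,733 bits drain per frame, so short bursts above the average are absorbed while sustained overshoot eventually overflows. A real rate control algorithm reacts to the occupancy by adjusting quantization, which this sketch does not attempt to model.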


=== RESOLVED: Do rate control implementations guarantee to respect the average/max bitrates, or frame sizes configured for the video session?

Unfortunately, implementations cannot provide hard guarantees about always respecting these rate control parameters, as the ability to conform to them depends on the encoding tools of the video compression standard and of the implementation, and on the input content, including the contents of future pictures, which implementations cannot predict.

However, for all practical purposes, these rate control parameters are expected to be respected when the application chooses them in a way that is in line with the encoded content and the characteristics of the used video compression standard.


=== RESOLVED: Are video session parameters objects dependent on the used video encode quality level?

Some implementations may support different hardware modes that are enabled in response to the used video encode quality level. This may also have an effect on the constraints related to the available encoding tools and as such may also affect the necessary codec-specific parameter overrides the implementation has to apply. As video session parameters objects are expected to store the already overridden codec-specific parameters, typically in an encoded or otherwise optimized format, using a video session parameters object with any video encode quality level would require implementations to also store the original parameters in order to be able to re-encode them according to the needs of the target video encode quality level, which would partially defeat the purpose of video session parameters objects.

Instead, this proposal defines video session parameters objects to be created with respect to a specific video encode quality level (when using a video encode profile) and applications have to make sure that they use a compatible video session parameters object in their encode commands according to the current quality level state of the video session.

In practice, this should not have any effect on most encoder applications, as usually they use a single video encode quality level throughout the lifetime of the video session, so the additional complexity resulting from this specialization will only affect advanced applications that may need to operate using different video encode quality levels within a single video stream.
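For applications that do switch quality levels, the bookkeeping amounts to remembering which quality level each parameters object was created for. The following is a minimal sketch with hypothetical tracking structures kept by the application; neither type corresponds to an API object.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical application-side record for a video session
   parameters object and the quality level it was created with. */
typedef struct TrackedSessionParams {
    uint32_t createdForQualityLevel;
} TrackedSessionParams;

/* Hypothetical application-side record of the quality level the
   video session is expected to be in when the commands execute. */
typedef struct TrackedSessionState {
    uint32_t currentQualityLevel;
} TrackedSessionState;

/* A parameters object is usable in an encode command only if it was
   created for the session's current quality level. */
static bool params_compatible(const TrackedSessionParams *params,
                              const TrackedSessionState *session)
{
    return params->createdForQualityLevel == session->currentQualityLevel;
}
```

An application using several quality levels within one stream would keep one parameters object per level and select the matching one before recording each encode command.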


=== RESOLVED: Are video encode quality levels and rate control mutually exclusive?

No, they are completely orthogonal, as they control different aspects of the encoder, and they are both always in effect. There is always a currently active video encode quality level and rate control state, which default to quality level zero and implementation-specific rate control state, respectively, when the video encode session is initialized. The used video encode quality level and the rate control settings can be updated subsequently, potentially independently, or together with initialization per the application's needs.

The only relation between video encode quality levels and rate control is that the application can query, for each video encode profile and video encode quality level, the implementation-recommended settings (using `vkGetPhysicalDeviceVideoEncodeQualityLevelPropertiesKHR`) that are best suited for the selected quality level and the usage scenario information included in the video encode profile. These include recommendations on the rate control mode to use amongst other codec-independent and codec-specific suggestions.

Nonetheless, these are only recommendations and the application can diverge from these if deemed necessary.


=== RESOLVED: Does specifying `VkVideoEncodeRateControlInfoKHR` in the `pNext` chain of the `pBeginCodingInfo` parameter of `vkCmdBeginVideoCodingKHR` change the current rate control configuration?

No. The rate control information specified to `vkCmdBeginVideoCodingKHR` does not change the state of the video session; it is only expected to specify the current rate control configuration (previously already set through the execution of an appropriate `vkCmdControlVideoCodingKHR` command). This information is needed by some implementations in order to be aware of the current rate control configuration of the video session while recording commands, as some of the rate control state may affect the recorded device commands. When this information is not specified, the implementation will assume that the current rate control mode is set to `VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DEFAULT_KHR`.

The validation layers are expected to detect at command buffer submission time if there is any mismatch between the expected rate control configuration specified to the `vkCmdBeginVideoCodingKHR` command and the actual rate control configuration of the video session at the time the video coding scope is started on the device timeline. If these two sets of state do not match, then the behavior of the implementations is undefined and may result in any sort of misbehavior permitted by the Vulkan specification when valid usage conditions are not met. Accordingly, applications have to make sure to track and specify the expected rate control configuration at the beginning of every video coding scope performing video encode operations in order to attain correct encoder behavior.
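The tracking requirement can be sketched as follows. The `RateControlState` type is a reduced, hypothetical model that compares only the mode and the layer count, whereas a real application (or validation layer) would compare the complete configuration; `RATE_CONTROL_MODE_DEFAULT` is a local stand-in for `VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DEFAULT_KHR`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DEFAULT_KHR. */
#define RATE_CONTROL_MODE_DEFAULT 0u

/* Reduced model of a rate control configuration (illustration only). */
typedef struct RateControlState {
    uint32_t mode;
    uint32_t layerCount;
} RateControlState;

/* State assumed to be in effect right after the session is initialized. */
static RateControlState rate_control_after_reset(void)
{
    RateControlState s = { RATE_CONTROL_MODE_DEFAULT, 0u };
    return s;
}

/* Checks whether the configuration declared when beginning a video
   coding scope matches the actual session state. A NULL declaration
   models omitting the rate control information, in which case the
   default mode is assumed. */
static bool begin_info_matches_session(const RateControlState *declared,
                                       const RateControlState *actual)
{
    if (declared == NULL) {
        return actual->mode == RATE_CONTROL_MODE_DEFAULT;
    }
    return declared->mode == actual->mode
        && declared->layerCount == actual->layerCount;
}
```

In this model the application updates its tracked state whenever it records a rate control change and passes the tracked state at the start of every later video coding scope, so that a mismatch can only arise from a bookkeeping error.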


=== RESOLVED: When is it mandatory to specify reconstructed picture information in `VkVideoEncodeInfoKHR::pSetupReferenceSlot`?

In line with the `VK_KHR_video_decode_queue` extension, due to foreseeable implementation limitations that may require the presence of a reconstructed picture resource and/or DPB slot for encoding, revision 12 of this extension changed the requirements on reconstructed picture information as follows:

  1. Specifying reconstructed picture information (i.e. a non-`NULL` `pSetupReferenceSlot`) is made mandatory for all cases except when the video session was created with no DPB slots
  2. Reference picture setup (and, inherently, DPB slot activation) was changed to be subject to codec-specific behavior, meaning that specifying a non-`NULL` `pSetupReferenceSlot` will only trigger reference picture setup if the appropriate codec-specific parameters or semantics indicate so (typically in the form of marking the encoded picture as reference)

As some implementations may use the reconstructed picture resource and/or DPB slot as transient storage during the encoding process, if a non-`NULL` `pSetupReferenceSlot` is specified but no reference picture setup is requested, then the contents of the reconstructed picture resource become undefined and some of the picture references associated with the reconstructed picture's DPB slot may get invalidated.
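The two rules and the resulting outcomes can be condensed into a short decision sketch. The function and enum names are hypothetical, and "marked as reference" stands in for whatever codec-specific parameters or semantics request reference picture setup.

```c
#include <stdbool.h>
#include <stdint.h>

/* Rule 1: a setup reference slot is mandatory unless the video
   session was created with no DPB slots. */
static bool setup_slot_required(uint32_t sessionMaxDpbSlots)
{
    return sessionMaxDpbSlots > 0u;
}

/* Hypothetical outcomes for the reconstructed picture resource. */
typedef enum SetupOutcome {
    SETUP_NOT_PERFORMED,       /* no setup slot was specified */
    SETUP_REFERENCE_ACTIVATED, /* DPB slot activated with the reconstructed picture */
    SETUP_CONTENTS_UNDEFINED   /* resource may have been used as transient storage */
} SetupOutcome;

/* Rule 2: specifying a setup slot only triggers reference picture
   setup when the codec-specific semantics request it. */
static SetupOutcome setup_outcome(bool hasSetupSlot, bool markedAsReference)
{
    if (!hasSetupSlot) {
        return SETUP_NOT_PERFORMED;
    }
    return markedAsReference ? SETUP_REFERENCE_ACTIVATED
                             : SETUP_CONTENTS_UNDEFINED;
}
```

The third outcome is the one applications need to plan for: a picture resource passed as the setup slot of a non-reference picture should not be expected to retain meaningful contents afterwards.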


== Further Functionality

This extension is meant to provide only common video encode functionality, thus support for individual video encode profiles using specific video compression standards is left for extensions layered on top of the infrastructure provided here.

Currently the following layered extensions are available:

  * `VK_KHR_video_encode_h264` - adds support for encoding H.264/AVC video sequences
  * `VK_KHR_video_encode_h265` - adds support for encoding H.265/HEVC video sequences
