// Copyright 2021-2024 The Khronos Group Inc.
//
// SPDX-License-Identifier: CC-BY-4.0

= VK_KHR_video_decode_h264
:toc: left
:refpage: https://registry.khronos.org/vulkan/specs/1.2-extensions/man/html/
:sectnums:

This document outlines a proposal to enable H.264/AVC video decode operations in Vulkan.

== Problem Statement

The `VK_KHR_video_queue` extension introduces support for video coding operations and the `VK_KHR_video_decode_queue` extension further extends this with APIs specific to video decoding.

The goal of this proposal is to build upon this infrastructure to introduce support for decoding elementary video stream sequences compliant with the H.264/AVC video compression standard.


== Solution Space

As the `VK_KHR_video_queue` and `VK_KHR_video_decode_queue` extensions have already laid down the architecture for how codec-specific video decode extensions need to be designed, this extension only needs to define the APIs that provide the necessary codec-specific parameters at various points during the use of the codec-independent APIs. In particular:

  * APIs for specifying H.264 sequence and picture parameter sets (SPS, PPS) to be stored in video session parameters objects
  * APIs for specifying H.264 information specific to the decoded picture, including references to previously stored SPS and PPS entries
  * APIs for specifying H.264 reference picture information specific to the active reference pictures and optional reconstructed picture used in video decode operations

The following options have been considered for the structure of these definitions:

  1. Allow specifying packed codec-specific data to the APIs in the form they appear in bitstreams
  2. Specify codec-specific parameters through custom type definitions that the application can populate after parsing the corresponding data elements in the bitstreams

Option (1) would allow for a simpler API, but it requires implementations to include an appropriate parser for these data elements. As decoding applications typically parse these data elements for other reasons anyway, this proposal chooses option (2) to enable the application to provide the needed parameters through custom definitions provided by a video std header dedicated to H.264 video decoding.

The following additional options have been considered for how this video std header is defined:

  1. Include all definitions in this H.264 video decode std header
  2. Add a separate video std header containing H.264 parameter definitions that can be shared across video decoding and video encoding use cases, have the H.264 video decode std header depend on it, and only include decode-specific definitions in the H.264 video decode std header

Both options are reasonable. However, as the H.264 video decoding and H.264 video encoding functionalities were designed in parallel, this extension uses option (2) and introduces the following new video std headers:

  * `vulkan_video_codec_h264std` - containing common definitions for all H.264 video coding operations
  * `vulkan_video_codec_h264std_decode` - containing definitions specific to H.264 video decoding operations

These headers can be included as follows:

[source,c]
----
#include <vk_video/vulkan_video_codec_h264std.h>
#include <vk_video/vulkan_video_codec_h264std_decode.h>
----


== Proposal

=== Video Std Headers

This extension uses the new `vulkan_video_codec_h264std_decode` video std header. Implementations must always support at least version 1.0.0 of this video std header.


=== H.264 Decode Profiles

This extension introduces the new video codec operation `VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR`. This flag can be used to check whether a particular queue family supports decoding H.264/AVC content, as returned in `VkQueueFamilyVideoPropertiesKHR`.

An H.264 decode profile can be defined through a `VkVideoProfileInfoKHR` structure using this new video codec operation and by including the following new codec-specific profile information structure in the `pNext` chain:

[source,c]
----
typedef struct VkVideoDecodeH264ProfileInfoKHR {
    VkStructureType                              sType;
    const void*                                  pNext;
    StdVideoH264ProfileIdc                       stdProfileIdc;
    VkVideoDecodeH264PictureLayoutFlagBitsKHR    pictureLayout;
} VkVideoDecodeH264ProfileInfoKHR;
----

`stdProfileIdc` specifies the H.264 profile indicator, while `pictureLayout` provides information about the representation of pictures usable with a video session created with such a video profile, and takes its value from the following new flag bit type:

[source,c]
----
typedef enum VkVideoDecodeH264PictureLayoutFlagBitsKHR {
    VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_PROGRESSIVE_KHR = 0,
    VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_INTERLEAVED_LINES_BIT_KHR = 0x00000001,
    VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_SEPARATE_PLANES_BIT_KHR = 0x00000002,
    VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_FLAG_BITS_MAX_ENUM_KHR = 0x7FFFFFFF
} VkVideoDecodeH264PictureLayoutFlagBitsKHR;
----

If `pictureLayout` is zero (`VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_PROGRESSIVE_KHR`), then the video profile only allows producing and consuming progressive frames. Otherwise, it also supports interlaced frames, and the individual bits indicate the way individual fields of such interlaced frames are represented within the images backing the video picture resources. In particular:

  * `VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_INTERLEAVED_LINES_BIT_KHR` indicates that the top and bottom fields are stored interleaved across the scanlines of the video picture resources, with all lines belonging to the top field being stored at even-numbered lines within the picture resource, and all lines belonging to the bottom field being stored at odd-numbered lines within the picture resource.
  * `VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_SEPARATE_PLANES_BIT_KHR` indicates that the top and bottom fields are stored separately, i.e. all lines belonging to a field are grouped together in a single image subregion. The two fields comprising the frame thus can be stored in separate image subregions of the same image subresource or in separate image subresources.
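
The interleaved-lines layout described above can be illustrated with a minimal sketch that maps a line of a field to the scanline it would occupy in the picture resource. `ExampleField` and `example_interleaved_row` are hypothetical names used only for illustration; they are not part of this extension or of the video std headers.

```c
#include <stdint.h>

/* Illustrative only: identifies the field a line belongs to. */
typedef enum ExampleField {
    EXAMPLE_FIELD_TOP = 0,
    EXAMPLE_FIELD_BOTTOM = 1
} ExampleField;

/* Returns the scanline within the picture resource that holds line
 * `fieldLine` of the given field when the two fields are interleaved:
 * top field lines land on even rows, bottom field lines on odd rows. */
static uint32_t example_interleaved_row(ExampleField field, uint32_t fieldLine)
{
    return 2u * fieldLine + (field == EXAMPLE_FIELD_BOTTOM ? 1u : 0u);
}
```

Under this mapping, line 3 of the top field would be found on scanline 6 and line 3 of the bottom field on scanline 7.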

It is expected that most implementations will at least support the `VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_INTERLEAVED_LINES_BIT_KHR` picture layout, but support for any particular interlaced picture layout is not mandatory. Applications need to verify support for individual H.264 decode profiles specifying particular picture layouts using the `vkGetPhysicalDeviceVideoCapabilitiesKHR` command. The `VK_ERROR_VIDEO_PICTURE_LAYOUT_NOT_SUPPORTED_KHR` error code indicates that the chosen picture layout is not supported by the implementation.


=== H.264 Decode Capabilities

Applications need to include the following new structure in the `pNext` chain of `VkVideoCapabilitiesKHR` when calling the `vkGetPhysicalDeviceVideoCapabilitiesKHR` command to retrieve the capabilities specific to H.264 video decoding:

[source,c]
----
typedef struct VkVideoDecodeH264CapabilitiesKHR {
    VkStructureType         sType;
    void*                   pNext;
    StdVideoH264LevelIdc    maxLevelIdc;
    VkOffset2D              fieldOffsetGranularity;
} VkVideoDecodeH264CapabilitiesKHR;
----

`maxLevelIdc` indicates the maximum supported H.264 level indicator, while `fieldOffsetGranularity` indicates the alignment requirements of the `codedOffset` values specified for video picture resources when using the `VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_SEPARATE_PLANES_BIT_KHR` picture layout.
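
As a rough sketch of how an application might validate a `codedOffset` against `fieldOffsetGranularity`, the helper below checks that each component is a whole multiple of the reported granularity. This reading of "alignment requirements" is an assumption of the sketch, and `ExampleOffset2D` is a hypothetical stand-in for `VkOffset2D` so that the code compiles without the Vulkan headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for VkOffset2D (same member names and types). */
typedef struct ExampleOffset2D {
    int32_t x;
    int32_t y;
} ExampleOffset2D;

/* Checks one component. A granularity of 0 is treated as "only offset 0 is
 * acceptable"; that is a choice made by this sketch, not a documented rule. */
static bool example_component_is_aligned(int32_t offset, int32_t granularity)
{
    if (granularity == 0) {
        return offset == 0;
    }
    return (offset % granularity) == 0;
}

/* Returns true if both components of `codedOffset` are whole multiples of
 * the corresponding components of `granularity`. */
static bool example_field_offset_is_aligned(ExampleOffset2D codedOffset,
                                            ExampleOffset2D granularity)
{
    return example_component_is_aligned(codedOffset.x, granularity.x) &&
           example_component_is_aligned(codedOffset.y, granularity.y);
}
```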


=== H.264 Decode Parameter Sets

The use of video session parameters objects is mandatory when decoding H.264 video streams. Applications need to include the following new structure in the `pNext` chain of `VkVideoSessionParametersCreateInfoKHR` when creating video session parameters objects for H.264 decode use, to specify the parameter set capacity of the created objects:

[source,c]
----
typedef struct VkVideoDecodeH264SessionParametersCreateInfoKHR {
    VkStructureType                                        sType;
    const void*                                            pNext;
    uint32_t                                               maxStdSPSCount;
    uint32_t                                               maxStdPPSCount;
    const VkVideoDecodeH264SessionParametersAddInfoKHR*    pParametersAddInfo;
} VkVideoDecodeH264SessionParametersCreateInfoKHR;
----

The optional `pParametersAddInfo` member also allows specifying an initial set of parameter sets to add to the created object:

[source,c]
----
typedef struct VkVideoDecodeH264SessionParametersAddInfoKHR {
    VkStructureType                            sType;
    const void*                                pNext;
    uint32_t                                   stdSPSCount;
    const StdVideoH264SequenceParameterSet*    pStdSPSs;
    uint32_t                                   stdPPSCount;
    const StdVideoH264PictureParameterSet*     pStdPPSs;
} VkVideoDecodeH264SessionParametersAddInfoKHR;
----

This structure can also be included in the `pNext` chain of `VkVideoSessionParametersUpdateInfoKHR` used in video session parameters update operations to add further parameter sets to an object after its creation.

Individual parameter sets are stored using parameter set IDs as their keys, specifically:

  * H.264 SPS entries are identified using a `seq_parameter_set_id` value
  * H.264 PPS entries are identified using a pair of `seq_parameter_set_id` and `pic_parameter_set_id` values

The H.264/AVC video compression standard always requires an SPS and PPS, hence the application has to add an instance of each parameter set to the used parameters object before being able to record video decode operations.

Furthermore, the H.264/AVC video compression standard also allows modifying existing parameter sets, but as parameters already stored in video session parameters objects cannot be changed in Vulkan, the application has to create new parameters objects in such cases, as described in the proposal for `VK_KHR_video_queue`.
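
The keying rules above suggest one way an application could decide when a new parameters object is needed. The following is a minimal sketch of application-side bookkeeping, not the implementation's storage; `ExampleParamKeys`, `example_add_sps`, and `example_add_pps` are hypothetical names. It relies on the ID ranges defined by the H.264/AVC standard (`seq_parameter_set_id` in 0..31, `pic_parameter_set_id` in 0..255).

```c
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_MAX_SPS_ID_COUNT 32   /* seq_parameter_set_id is 0..31 */
#define EXAMPLE_MAX_PPS_ID_COUNT 256  /* pic_parameter_set_id is 0..255 */

/* Tracks which keys have already been added to one parameters object. */
typedef struct ExampleParamKeys {
    bool spsUsed[EXAMPLE_MAX_SPS_ID_COUNT];
    bool ppsUsed[EXAMPLE_MAX_SPS_ID_COUNT][EXAMPLE_MAX_PPS_ID_COUNT];
} ExampleParamKeys;

/* Records an SPS key. Returns false if the ID is out of range or the key is
 * already present, in which case the stored entry cannot be overwritten and
 * the application would need a new parameters object. */
static bool example_add_sps(ExampleParamKeys* keys, uint32_t spsId)
{
    if (spsId >= EXAMPLE_MAX_SPS_ID_COUNT || keys->spsUsed[spsId]) {
        return false;
    }
    keys->spsUsed[spsId] = true;
    return true;
}

/* Records a PPS key, which is the pair (spsId, ppsId). */
static bool example_add_pps(ExampleParamKeys* keys, uint32_t spsId, uint32_t ppsId)
{
    if (spsId >= EXAMPLE_MAX_SPS_ID_COUNT || ppsId >= EXAMPLE_MAX_PPS_ID_COUNT ||
        keys->ppsUsed[spsId][ppsId]) {
        return false;
    }
    keys->ppsUsed[spsId][ppsId] = true;
    return true;
}
```

Because a PPS is keyed by the pair, the same `pic_parameter_set_id` can be stored once per `seq_parameter_set_id` without a conflict.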


=== H.264 Decoding Parameters

Decode parameters specific to H.264 need to be provided by the application through the `pNext` chain of `VkVideoDecodeInfoKHR`, using the following new structure:

[source,c]
----
typedef struct VkVideoDecodeH264PictureInfoKHR {
    VkStructureType                         sType;
    const void*                             pNext;
    const StdVideoDecodeH264PictureInfo*    pStdPictureInfo;
    uint32_t                                sliceCount;
    const uint32_t*                         pSliceOffsets;
} VkVideoDecodeH264PictureInfoKHR;
----

`pStdPictureInfo` points to the codec-specific decode parameters defined in the `vulkan_video_codec_h264std_decode` video std header, while the `pSliceOffsets` array contains the relative offset of individual slices of the picture within the video bitstream range used by the video decode operation.
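
One possible way to collect slice offsets is sketched below for a byte stream in the H.264 Annex B format. The sketch makes several assumptions that should be checked against the specification and the application's bitstream handling: that `data` is exactly the bitstream range passed to the decode operation, that each offset should point at the three-byte start code prefix (`00 00 01`) rather than at the NAL unit header, and that only NAL unit types 1 and 5 (coded slices of non-IDR and IDR pictures) are of interest. `example_find_slice_offsets` is a hypothetical helper.

```c
#include <stddef.h>
#include <stdint.h>

/* Scans `data` for start code prefixes (00 00 01) and stores the offset of
 * each prefix that is followed by a coded slice NAL unit (type 1 or 5).
 * For a four-byte start code (00 00 00 01) the stored offset is that of the
 * trailing three bytes. Returns the number of slices found, which may exceed
 * `maxOffsets`; only the first `maxOffsets` offsets are written. */
static uint32_t example_find_slice_offsets(const uint8_t* data, size_t size,
                                           uint32_t* offsets, uint32_t maxOffsets)
{
    uint32_t count = 0;
    for (size_t i = 0; i + 3 < size; ++i) {
        if (data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1) {
            /* nal_unit_type is the low five bits of the NAL unit header. */
            uint8_t nalType = (uint8_t)(data[i + 3] & 0x1F);
            if (nalType == 1 || nalType == 5) {
                if (count < maxOffsets) {
                    offsets[count] = (uint32_t)i;
                }
                ++count;
            }
            i += 2; /* continue scanning after the start code prefix */
        }
    }
    return count;
}
```

The returned count and the filled array would then correspond to `sliceCount` and `pSliceOffsets`, subject to the assumptions listed above.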

Specific flags within the codec-specific decode parameters are used to determine whether the picture to be decoded is a frame or a field, according to the table below:

|===
| **field_pic_flag** | **bottom_field_flag** | **frame / field**
| 0 | _ignored_ | frame
| 1 | 0 | top field
| 1 | 1 | bottom field
|===
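
The table above can be expressed as a small decision function. This is a sketch only: `ExamplePictureKind` and `example_decoded_picture_kind` are hypothetical names, and the flags are passed as plain integers so that the code does not depend on the video std headers.

```c
/* Illustrative classification of the picture being decoded. */
typedef enum ExamplePictureKind {
    EXAMPLE_PICTURE_FRAME,
    EXAMPLE_PICTURE_TOP_FIELD,
    EXAMPLE_PICTURE_BOTTOM_FIELD
} ExamplePictureKind;

/* Mirrors the table: bottom_field_flag is only consulted when
 * field_pic_flag is set. */
static ExamplePictureKind example_decoded_picture_kind(unsigned field_pic_flag,
                                                       unsigned bottom_field_flag)
{
    if (!field_pic_flag) {
        return EXAMPLE_PICTURE_FRAME; /* bottom_field_flag is ignored */
    }
    return bottom_field_flag ? EXAMPLE_PICTURE_BOTTOM_FIELD
                             : EXAMPLE_PICTURE_TOP_FIELD;
}
```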

The active SPS and PPS (sourced from the bound video session parameters object) are identified by the `seq_parameter_set_id` and `pic_parameter_set_id` parameters.

Picture information specific to H.264 for the active reference pictures and the optional reconstructed picture needs to be provided by the application through the `pNext` chain of corresponding elements of `VkVideoDecodeInfoKHR::pReferenceSlots` and the `pNext` chain of `VkVideoDecodeInfoKHR::pSetupReferenceSlot`, respectively, using the following new structure:

[source,c]
----
typedef struct VkVideoDecodeH264DpbSlotInfoKHR {
    VkStructureType                           sType;
    const void*                               pNext;
    const StdVideoDecodeH264ReferenceInfo*    pStdReferenceInfo;
} VkVideoDecodeH264DpbSlotInfoKHR;
----

`pStdReferenceInfo` points to the codec-specific reference picture parameters defined in the `vulkan_video_codec_h264std_decode` video std header.

Specific flags within the codec-specific reference picture parameters are used to determine whether the picture is a frame or a field, according to the table below:

|===
| **top_field_flag** | **bottom_field_flag** | **frame / field**
| 0 | 0 | frame
| 1 | 0 | top field
| 0 | 1 | bottom field
| 1 | 1 | both fields (for active reference pictures only)
|===

The ability to specify both fields is specific to the list of active reference pictures provided in `VkVideoDecodeInfoKHR::pReferenceSlots` and is needed to allow the application to use both fields of an interlaced frame when the two fields are stored in the same video picture resource, which happens when using the `VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_INTERLEAVED_LINES_BIT_KHR` picture layout. As a consequence, the value of `VkVideoDecodeInfoKHR::referenceSlotCount` is not always indicative of the total number of active reference pictures used by a video decode operation, as a single element of `pReferenceSlots` may refer to two reference pictures in this case.
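
To make the difference between the slot count and the reference picture count concrete, the sketch below counts the reference pictures implied by a list of per-slot field flags. `ExampleRefFlags` is a hypothetical structure that only mirrors the two flags from the table above; it is not the type defined by the video std header.

```c
#include <stdint.h>

/* Hypothetical mirror of the two field flags of a reference picture. */
typedef struct ExampleRefFlags {
    unsigned top_field_flag : 1;
    unsigned bottom_field_flag : 1;
} ExampleRefFlags;

/* Returns the number of reference pictures referred to by `slotCount` slots:
 * a slot with both field flags set refers to two field pictures, while any
 * other combination (frame, top field, or bottom field) refers to one. */
static uint32_t example_count_reference_pictures(const ExampleRefFlags* slots,
                                                 uint32_t slotCount)
{
    uint32_t total = 0;
    for (uint32_t i = 0; i < slotCount; ++i) {
        total += (slots[i].top_field_flag && slots[i].bottom_field_flag) ? 2u : 1u;
    }
    return total;
}
```

For example, three slots describing a frame, a top field, and both fields of an interleaved frame would yield four reference pictures.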

It is the application's responsibility to specify video bitstream buffer data and codec-specific parameters that are compliant with the rules defined by the H.264/AVC video compression standard. While it is not illegal, from an API usage point of view, to specify non-compliant inputs, they may cause the video decode operation to complete unsuccessfully and will cause the output pictures (decode output and reconstructed pictures) to have undefined contents after the execution of the operation.

For more information about how to parse individual H.264 bitstream syntax elements, calculate derived values, and, in general, how to interpret these parameters, please refer to the corresponding sections of the https://www.itu.int/rec/T-REC-H.264-202108-I/[ITU-T H.264 Specification].


== Examples

=== Select queue family with H.264 decode support

[source,c]
----
uint32_t queueFamilyIndex;
uint32_t queueFamilyCount;

// First call only retrieves the number of queue families
vkGetPhysicalDeviceQueueFamilyProperties2(physicalDevice, &queueFamilyCount, NULL);

VkQueueFamilyProperties2* props = calloc(queueFamilyCount,
    sizeof(VkQueueFamilyProperties2));
VkQueueFamilyVideoPropertiesKHR* videoProps = calloc(queueFamilyCount,
    sizeof(VkQueueFamilyVideoPropertiesKHR));

// Chain a video properties structure to each queue family properties structure
for (queueFamilyIndex = 0; queueFamilyIndex < queueFamilyCount; ++queueFamilyIndex) {
    props[queueFamilyIndex].sType = VK_STRUCTURE_TYPE_QUEUE_FAMILY_PROPERTIES_2;
    props[queueFamilyIndex].pNext = &videoProps[queueFamilyIndex];

    videoProps[queueFamilyIndex].sType = VK_STRUCTURE_TYPE_QUEUE_FAMILY_VIDEO_PROPERTIES_KHR;
}

vkGetPhysicalDeviceQueueFamilyProperties2(physicalDevice, &queueFamilyCount, props);

for (queueFamilyIndex = 0; queueFamilyIndex < queueFamilyCount; ++queueFamilyIndex) {
    if ((props[queueFamilyIndex].queueFamilyProperties.queueFlags & VK_QUEUE_VIDEO_DECODE_BIT_KHR) != 0 &&
        (videoProps[queueFamilyIndex].videoCodecOperations & VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR) != 0) {
        break;
    }
}

if (queueFamilyIndex < queueFamilyCount) {
    // Found appropriate queue family
    ...
} else {
    // Did not find a queue family with the needed capabilities
    ...
}

// Release the temporary query arrays
free(props);
free(videoProps);
----


=== Check support and query the capabilities for an H.264 decode profile

[source,c]
----
VkResult result;

VkVideoDecodeH264ProfileInfoKHR decodeH264ProfileInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_DECODE_H264_PROFILE_INFO_KHR,
    .pNext = NULL,
    .stdProfileIdc = STD_VIDEO_H264_PROFILE_IDC_BASELINE,
    .pictureLayout = VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_PROGRESSIVE_KHR
};

VkVideoProfileInfoKHR profileInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_PROFILE_INFO_KHR,
    .pNext = &decodeH264ProfileInfo,
    .videoCodecOperation = VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR,
    .chromaSubsampling = VK_VIDEO_CHROMA_SUBSAMPLING_420_BIT_KHR,
    .lumaBitDepth = VK_VIDEO_COMPONENT_BIT_DEPTH_8_BIT_KHR,
    .chromaBitDepth = VK_VIDEO_COMPONENT_BIT_DEPTH_8_BIT_KHR
};

VkVideoDecodeH264CapabilitiesKHR decodeH264Capabilities = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_DECODE_H264_CAPABILITIES_KHR,
    .pNext = NULL,
};

VkVideoDecodeCapabilitiesKHR decodeCapabilities = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_DECODE_CAPABILITIES_KHR,
    .pNext = &decodeH264Capabilities
};

VkVideoCapabilitiesKHR capabilities = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_CAPABILITIES_KHR,
    .pNext = &decodeCapabilities
};

result = vkGetPhysicalDeviceVideoCapabilitiesKHR(physicalDevice, &profileInfo, &capabilities);

if (result == VK_SUCCESS) {
    // Profile is supported, check additional capabilities
    ...
} else {
    // Profile is not supported, result provides additional information about why
    ...
}
----

=== Create and update H.264 video session parameters objects

[source,c]
----
VkVideoSessionParametersKHR videoSessionParams = VK_NULL_HANDLE;

VkVideoDecodeH264SessionParametersCreateInfoKHR decodeH264CreateInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_DECODE_H264_SESSION_PARAMETERS_CREATE_INFO_KHR,
    .pNext = NULL,
    .maxStdSPSCount = ..., // SPS capacity
    .maxStdPPSCount = ..., // PPS capacity
    .pParametersAddInfo = ... // parameters to add at creation time or NULL
};

VkVideoSessionParametersCreateInfoKHR createInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_SESSION_PARAMETERS_CREATE_INFO_KHR,
    .pNext = &decodeH264CreateInfo,
    .flags = 0,
    .videoSessionParametersTemplate = ..., // template to use or VK_NULL_HANDLE
    .videoSession = videoSession
};

vkCreateVideoSessionParametersKHR(device, &createInfo, NULL, &videoSessionParams);

...

StdVideoH264SequenceParameterSet sps = {0};
// parse and populate SPS parameters
...

StdVideoH264PictureParameterSet pps = {0};
// parse and populate PPS parameters
...

VkVideoDecodeH264SessionParametersAddInfoKHR decodeH264AddInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_DECODE_H264_SESSION_PARAMETERS_ADD_INFO_KHR,
    .pNext = NULL,
    .stdSPSCount = 1,
    .pStdSPSs = &sps,
    .stdPPSCount = 1,
    .pStdPPSs = &pps
};

VkVideoSessionParametersUpdateInfoKHR updateInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_SESSION_PARAMETERS_UPDATE_INFO_KHR,
    .pNext = &decodeH264AddInfo,
    .updateSequenceCount = 1 // incremented for each subsequent update
};

// The parameters object is passed by handle, not by pointer
vkUpdateVideoSessionParametersKHR(device, videoSessionParams, &updateInfo);
----


=== Record H.264 decode operation (video session without DPB slots)

[source,c]
----
vkCmdBeginVideoCodingKHR(commandBuffer, ...);

StdVideoDecodeH264PictureInfo stdPictureInfo = {0};
// parse and populate picture info from slice header data
...

VkVideoDecodeH264PictureInfoKHR decodeH264PictureInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_DECODE_H264_PICTURE_INFO_KHR,
    .pNext = NULL,
    .pStdPictureInfo = &stdPictureInfo,
    .sliceCount = ..., // number of slices
    .pSliceOffsets = ... // array of slice offsets relative to the bitstream buffer range
};

VkVideoDecodeInfoKHR decodeInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_DECODE_INFO_KHR,
    .pNext = &decodeH264PictureInfo,
    ...
    // reconstructed picture is not needed if video session was created without DPB slots
    .pSetupReferenceSlot = NULL,
    .referenceSlotCount = 0,
    .pReferenceSlots = NULL
};

vkCmdDecodeVideoKHR(commandBuffer, &decodeInfo);

vkCmdEndVideoCodingKHR(commandBuffer, ...);
----


=== Record H.264 decode operation with optional reference picture setup

[source,c]
----
vkCmdBeginVideoCodingKHR(commandBuffer, ...);

StdVideoDecodeH264ReferenceInfo stdReferenceInfo = {0};
// parse and populate reconstructed reference picture info from slice header data
...

VkVideoDecodeH264DpbSlotInfoKHR decodeH264DpbSlotInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_DECODE_H264_DPB_SLOT_INFO_KHR,
    .pNext = NULL,
    .pStdReferenceInfo = &stdReferenceInfo
};

VkVideoReferenceSlotInfoKHR setupSlotInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_REFERENCE_SLOT_INFO_KHR,
    .pNext = &decodeH264DpbSlotInfo,
    ...
};

StdVideoDecodeH264PictureInfo stdPictureInfo = {0};
// parse and populate picture info from slice header data
...
if (stdPictureInfo.flags.is_reference) {
    // reconstructed picture will be used for reference picture setup and DPB slot activation
} else {
    // reconstructed picture and slot may only be used by implementations as transient resource
}

VkVideoDecodeH264PictureInfoKHR decodeH264PictureInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_DECODE_H264_PICTURE_INFO_KHR,
    .pNext = NULL,
    .pStdPictureInfo = &stdPictureInfo,
    .sliceCount = ..., // number of slices
    .pSliceOffsets = ... // array of slice offsets relative to the bitstream buffer range
};

VkVideoDecodeInfoKHR decodeInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_DECODE_INFO_KHR,
    .pNext = &decodeH264PictureInfo,
    ...
    .pSetupReferenceSlot = &setupSlotInfo,
    ...
};

vkCmdDecodeVideoKHR(commandBuffer, &decodeInfo);

vkCmdEndVideoCodingKHR(commandBuffer, ...);
----


=== Record H.264 decode operation with reference picture list

[source,c]
----
vkCmdBeginVideoCodingKHR(commandBuffer, ...);

// one element per active reference picture
StdVideoDecodeH264ReferenceInfo stdReferenceInfo[] = { ... };
// populate reference picture info for each active reference picture
...

VkVideoDecodeH264DpbSlotInfoKHR decodeH264DpbSlotInfo[] = {
    {
        .sType = VK_STRUCTURE_TYPE_VIDEO_DECODE_H264_DPB_SLOT_INFO_KHR,
        .pNext = NULL,
        .pStdReferenceInfo = &stdReferenceInfo[0]
    },
    {
        .sType = VK_STRUCTURE_TYPE_VIDEO_DECODE_H264_DPB_SLOT_INFO_KHR,
        .pNext = NULL,
        .pStdReferenceInfo = &stdReferenceInfo[1]
    },
    ...
};


VkVideoReferenceSlotInfoKHR referenceSlotInfo[] = {
    {
        .sType = VK_STRUCTURE_TYPE_VIDEO_REFERENCE_SLOT_INFO_KHR,
        .pNext = &decodeH264DpbSlotInfo[0],
        ...
    },
    {
        .sType = VK_STRUCTURE_TYPE_VIDEO_REFERENCE_SLOT_INFO_KHR,
        .pNext = &decodeH264DpbSlotInfo[1],
        ...
    },
    ...
};


StdVideoDecodeH264PictureInfo stdPictureInfo = {0};
// parse and populate picture info from slice header data
...
if (stdPictureInfo.flags.is_reference) {
    // reconstructed picture will be used for reference picture setup and DPB slot activation
} else {
    // reconstructed picture and slot may only be used by implementations as transient resource
}

VkVideoDecodeH264PictureInfoKHR decodeH264PictureInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_DECODE_H264_PICTURE_INFO_KHR,
    .pNext = NULL,
    .pStdPictureInfo = &stdPictureInfo,
    .sliceCount = ..., // number of slices
    .pSliceOffsets = ... // array of slice offsets relative to the bitstream buffer range
};

VkVideoDecodeInfoKHR decodeInfo = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_DECODE_INFO_KHR,
    .pNext = &decodeH264PictureInfo,
    ...
    .referenceSlotCount = sizeof(referenceSlotInfo) / sizeof(referenceSlotInfo[0]),
    .pReferenceSlots = &referenceSlotInfo[0]
};

vkCmdDecodeVideoKHR(commandBuffer, &decodeInfo);

vkCmdEndVideoCodingKHR(commandBuffer, ...);
----


== Issues

=== RESOLVED: In what form should codec-specific parameters be provided?

In the form of structures defined by the `vulkan_video_codec_h264std_decode` and `vulkan_video_codec_h264std` video std headers. Applications are responsible for parsing parameter sets and slice header data and for using the parsed data to populate the structures defined by the video std headers. It is also the application's responsibility to maintain and manage these data structures so that it can provide them as inputs to video decode operations where needed.


=== RESOLVED: Why does the `vulkan_video_codec_h264std` video std header not have a version number?

The `vulkan_video_codec_h264std` video std header was introduced to share common definitions used in both H.264/AVC video decoding and video encoding, as the two functionalities were designed in parallel. However, as no video coding extension uses this video std header directly, but only as a dependency of the video std header specific to the particular video coding operation, no separate versioning scheme was deemed necessary.


=== RESOLVED: What are the requirements for the codec-specific input parameters and bitstream data?

It is legal from an API usage perspective for the application to provide any values for the codec-specific input parameters (parameter sets, picture information, etc.) or video bitstream data. However, if the input data does not conform to the requirements of the H.264/AVC video compression standard, then video decode operations may complete unsuccessfully and, in general, the outputs produced by the video decode operation will have undefined contents.


=== RESOLVED: Why is there a need for the application to specify the offset of individual slices of the decoded pictures?

Implementations can take advantage of having access to the offsets of individual slices within the video bitstream buffer range provided to the video decode operations, hence this extension requires the application to provide these offsets as input.


=== RESOLVED: Are interlaced frames supported?

Yes, through specifying an interlaced picture layout in the H.264 decode profile.

Video sessions created with an interlaced picture layout can be used to decode field pictures, as well as progressive frame pictures. This also enables support for decoding PAFF and MBAFF content.


=== RESOLVED: How are interlaced frames stored?

Depending on the picture layout used, interlaced frames may be stored _interleaved_, with the top and bottom fields occupying the even and odd scanlines of a single video picture resource, respectively, or in _separate planes_. In the latter case the two fields comprising an interlaced frame may be stored in different subregions of a single image array layer, in separate image array layers, or in entirely separate images.


=== RESOLVED: How should DPB images be created in case of interlaced frame support?

Typically, interlaced frames are stored with one frame in each image array layer, hence the total number of layers across the DPB image(s) usually still matches the DPB slot capacity. The only exception is when the `VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_SEPARATE_PLANES_BIT_KHR` picture layout is used and the application wants to store individual fields in separate image array layers, in which case the total number of layers across the DPB image(s) may need to be twice as large as the DPB slot capacity.
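
The sizing rule above can be summarized in a short sketch. `example_dpb_layer_count` and its parameters are hypothetical; the function returns the upper bound suggested by the text (the answer says the count "may need to be" doubled), not a value mandated by the extension.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns an upper bound on the total number of image array layers needed
 * across the DPB image(s).
 *  - dpbSlotCapacity: number of DPB slots of the video session
 *  - separatePlanesLayout: true if the separate-planes interlaced picture
 *    layout is used
 *  - fieldsInSeparateLayers: true if the application chooses to store each
 *    field in its own image array layer */
static uint32_t example_dpb_layer_count(uint32_t dpbSlotCapacity,
                                        bool separatePlanesLayout,
                                        bool fieldsInSeparateLayers)
{
    if (separatePlanesLayout && fieldsInSeparateLayers) {
        return 2u * dpbSlotCapacity; /* one layer per field */
    }
    return dpbSlotCapacity; /* one layer per frame */
}
```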


=== RESOLVED: How should both fields of an interlaced frame be specified as part of the active reference picture list?

How both fields of an interlaced frame can be included in the list of active reference pictures depends on the picture layout used.

If `VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_INTERLEAVED_LINES_BIT_KHR` is used, then both fields of an interlaced frame are stored in the same video picture resource, hence the application has to refer to both fields using a single `VkVideoReferenceSlotInfoKHR` structure with `StdVideoDecodeH264ReferenceInfo` having both `top_field_flag` and `bottom_field_flag` set to `1`.

If `VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_SEPARATE_PLANES_BIT_KHR` is used, then each field is stored in a separate video picture resource (even if backed by the same image array layer), hence the application has to refer to each field using a separate `VkVideoReferenceSlotInfoKHR` structure with `StdVideoDecodeH264ReferenceInfo` setting only the field flag corresponding to the field picture in question.


=== RESOLVED: Is H.264 Multiview content supported?

Not as part of this extension, but future extensions can add support for that. Although the provisional `VK_EXT_video_decode_h264` extension, from which this extension was promoted, did include support for H.264 MVC, the corresponding APIs were not considered to be mature enough to be included in this extension.


=== RESOLVED: Why are H.264 level indicator values specified differently than the way they are defined in the codec specification?

For historical reasons, the `StdVideoH264LevelIdc` type is defined with ordinal enum constant values, which does not match the decimal encoding used by the H.264/AVC video compression standard specification. All APIs defined by this extension and the used video std headers accept and report H.264 levels using the enum constants `STD_VIDEO_H264_LEVEL_IDC_<major>_<minor>`, not the decimal encoding used within raw H.264/AVC bitstreams.
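
Applications that parse `level_idc` from a bitstream therefore need a translation step. The sketch below maps the decimal encoding (for example 41 for level 4.1) to an ordinal index. It assumes that the ordinals of the level enum constants in the video std header start at 0 for level 1.0 and increase in the order listed in the table; that ordering is an assumption of this sketch and should be verified against the `vulkan_video_codec_h264std` header in use. `example_level_ordinal_from_level_idc` is a hypothetical helper, and level 1b is deliberately not handled.

```c
#include <stddef.h>
#include <stdint.h>

/* Decimal level_idc values in ascending order: 1.0, 1.1, 1.2, 1.3, 2.0, 2.1,
 * 2.2, 3.0, 3.1, 3.2, 4.0, 4.1, 4.2, 5.0, 5.1, 5.2, 6.0, 6.1, 6.2.
 * ASSUMPTION: position i in this table matches the ordinal value of the
 * corresponding level enum constant in the video std header. */
static const uint32_t exampleLevelIdcTable[] = {
    10, 11, 12, 13, 20, 21, 22, 30, 31, 32,
    40, 41, 42, 50, 51, 52, 60, 61, 62
};

/* Returns the ordinal index for a decimal level_idc, or -1 if the value is
 * not in the table. */
static int example_level_ordinal_from_level_idc(uint32_t levelIdc)
{
    const size_t count = sizeof(exampleLevelIdcTable) / sizeof(exampleLevelIdcTable[0]);
    for (size_t i = 0; i < count; ++i) {
        if (exampleLevelIdcTable[i] == levelIdc) {
            return (int)i;
        }
    }
    return -1;
}
```

The returned ordinal could then be compared against `maxLevelIdc`, provided the ordering assumption holds for the header version in use.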


=== RESOLVED: How is reference picture setup requested for H.264 decode operations?

As specifying a reconstructed picture DPB slot and resource is always required per the latest revision of the video extensions, additional codec syntax controls whether reference picture setup is requested and, in response, the DPB slot is activated with the reconstructed picture.

For H.264 decode, reference picture setup is requested and the DPB slot specified for the reconstructed picture is activated with the picture if and only if the `StdVideoDecodeH264PictureInfo::flags.is_reference` flag is set.


== Further Functionality

Future extensions can further extend the capabilities provided here, e.g. exposing support to decode H.264 Multiview content.
