// Copyright 2018-2022 The Khronos Group Inc.
//
// SPDX-License-Identifier: CC-BY-4.0

[[video-encode-operations]]
== Video Encode Operations

Before the application can start recording Vulkan command buffers for the Video Encode Operations, it must: do the following:

. Ensure that the implementation can encode the Video Content by querying the supported codec operations and profiles using flink:vkGetPhysicalDeviceQueueFamilyProperties2.
. By using flink:vkGetPhysicalDeviceVideoFormatPropertiesKHR and providing one or more video profiles, choose the Vulkan formats supported by the implementation.
The formats for <> and <> pictures must: be queried and chosen separately.
Refer to the section on <>.
. Before creating an image to be used as a video picture resource, obtain the supported image creation parameters by querying with flink:vkGetPhysicalDeviceFormatProperties2 and flink:vkGetPhysicalDeviceImageFormatProperties2 using one of the reported formats and adding slink:VkVideoProfileListInfoKHR to the pname:pNext chain of slink:VkFormatProperties2.
When querying the parameters with flink:vkGetPhysicalDeviceImageFormatProperties2 for images targeting <> and <> pictures, the slink:VkPhysicalDeviceImageFormatInfo2::pname:usage field should contain ename:VK_IMAGE_USAGE_VIDEO_ENCODE_SRC_BIT_KHR and ename:VK_IMAGE_USAGE_VIDEO_ENCODE_DPB_BIT_KHR, respectively.
. Create none, some, or all of the required <> for the <> and <> pictures.
More Video Picture Resources can: be created at some later point if needed while processing the content to be encoded.
Also, if the size of the picture to be encoded is expected to change, the images can: be created based on the maximum expected content size.
. Create the <> to be used for video encode operations.
Before creating the Encode Video Session, the encode capabilities should: be queried with flink:vkGetPhysicalDeviceVideoCapabilitiesKHR to obtain the limits of the parameters allowed by the implementation for a particular codec profile.
. Bind memory resources with the encode video session by calling flink:vkBindVideoSessionMemoryKHR.
The video session cannot: be used until memory resources are allocated and bound to it.
In order to determine the required memory sizes and heap types of the device memory allocations, flink:vkGetVideoSessionMemoryRequirementsKHR should: be called.
. Create one or more <> for use across command buffer recording operations, if required by the codec extension in use.
These objects must: be created against a <> with the parameters required by the codec.
Each <> created is a child object of the associated <> and cannot: be bound in the command buffer with any other <>.

The recording of Video Encode Commands against a Vulkan Command Buffer consists of the following sequence:

. flink:vkCmdBeginVideoCodingKHR starts the recording of one or more Video Encode operations in the command buffer.
For each Video Encode Command operation, a Video Session must: be bound to the command buffer within this command.
This command establishes a Vulkan Video Encode Context that consists of the bound Video Session Object, Session Parameters Object, and the required Video Picture Resources.
The established Video Encode Context is in effect until the flink:vkCmdEndVideoCodingKHR command is recorded.
If more Video Encode operations are required after the flink:vkCmdEndVideoCodingKHR command, another Video Encode Context can: be started with the flink:vkCmdBeginVideoCodingKHR command.
. flink:vkCmdEncodeVideoKHR specifies one or more frames to be encoded.
The slink:VkVideoEncodeInfoKHR parameters, and the codec extension structures chained to this, specify the details of the encode operation.
. flink:vkCmdControlVideoCodingKHR records operations against the encoded data, encoding device, or the Video Session state.
. flink:vkCmdEndVideoCodingKHR signals the end of the recording of the Vulkan Video Encode Context, as established by flink:vkCmdBeginVideoCodingKHR.

In addition to the above, the following commands can: be recorded between flink:vkCmdBeginVideoCodingKHR and flink:vkCmdEndVideoCodingKHR:

* Query operations
* Global Memory Barriers
* Buffer Memory Barriers
* Image Memory Barriers (these must: be used to transition the Video Picture Resources to the proper ename:VK_IMAGE_LAYOUT_VIDEO_ENCODE_SRC_KHR and ename:VK_IMAGE_LAYOUT_VIDEO_ENCODE_DPB_KHR layouts).
* Pipeline Barriers
* Events
* Timestamps
* Device Groups (device mask)

The following Video Encode related commands must: be recorded *outside* the Vulkan Video Encode Context established with the flink:vkCmdBeginVideoCodingKHR and flink:vkCmdEndVideoCodingKHR commands:

* Sparse Memory Binding
* Copy Commands
* Clear Commands

[[encode-input-picture]]
=== Encode Input Picture

The primary source of input pixels for the video encoding process is the _Encode Input Picture_, represented by a slink:VkImageView.
It may: also be a direct target of
ifdef::VK_KHR_video_decode_queue[]
video decode,
endif::VK_KHR_video_decode_queue[]
graphics, or compute operations
ifdef::VK_KHR_surface[]
, or with <> APIs
endif::VK_KHR_surface[]
.

=== Capabilities

[open,refpage='VkVideoEncodeCapabilitiesKHR',desc='Structure specifying encode capabilities',type='structs']
--
When calling flink:vkGetPhysicalDeviceVideoCapabilitiesKHR with pname:pVideoProfile->videoCodecOperation specified as one of the encode operation bits, the slink:VkVideoEncodeCapabilitiesKHR structure must: be included in the pname:pNext chain of the slink:VkVideoCapabilitiesKHR structure to retrieve capabilities specific to video encoding.

The sname:VkVideoEncodeCapabilitiesKHR structure is defined as:

include::{generated}/api/structs/VkVideoEncodeCapabilitiesKHR.adoc[]

* pname:sType is the type of this structure.
* pname:pNext is `NULL` or a pointer to a structure extending this structure.
* pname:flags is a bitmask of elink:VkVideoEncodeCapabilityFlagBitsKHR describing supported encoding features.
* pname:rateControlModes is a bitmask of elink:VkVideoEncodeRateControlModeFlagBitsKHR describing supported rate control modes.
All implementations must: support ename:VK_VIDEO_ENCODE_RATE_CONTROL_MODE_NONE_BIT_KHR.
* pname:rateControlLayerCount reports the maximum number of rate control layers supported.
Implementations must: report at least 1.
* pname:qualityLevelCount is the number of discrete quality levels supported.
Implementations must: report at least 1.
* pname:inputImageDataFillAlignment reports alignment of data that should be filled in the input image horizontally and vertically in pixels before encode operations are performed on the input image.

The input content and encode resolution (specified in pname:codedExtent of slink:VkVideoEncodeInfoKHR::pname:srcPictureResource) may not be aligned with the codec-specific coding block size.
For example, the input content may be 1920x1080 and the coding block size may be 16x16 pixel blocks.
In this example, the content is horizontally aligned with the coding block size, but not vertically aligned with the coding block size.
Encoding of the last row of blocks may be impacted by contents of the input image in pixel rows 1081 to 1088 (the next vertical alignment with the coding block size).
In general, to ensure efficient encoding for the last row/column of blocks, and/or to ensure consistent encoding results between repeated encoding of the same input content, these extra pixel rows/columns should be filled to known values up to the coding block size alignment before encoding operations are performed.
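
The arithmetic behind the 1920x1080 example can be sketched in C as follows.
This is only an illustration: the helper names `align_up` and `extra_samples` are hypothetical and not part of the API, and the coding block size is assumed to be known to the application from the codec in use.

```c
#include <stdint.h>

/* Round value up to the next multiple of alignment.
 * Assumes alignment is nonzero (hypothetical helper, not a Vulkan function). */
static uint32_t align_up(uint32_t value, uint32_t alignment)
{
    return (value + alignment - 1u) / alignment * alignment;
}

/* Number of extra pixel columns (or rows) between the coded size and the
 * next coding block boundary. For a coded height of 1080 and a block height
 * of 16 this is 8, matching pixel rows 1081 to 1088 in the text. */
static uint32_t extra_samples(uint32_t codedSize, uint32_t blockSize)
{
    return align_up(codedSize, blockSize) - codedSize;
}
```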
Some implementations support performing auto-fill of unaligned pixels beyond a specific alignment, which is reported in pname:inputImageDataFillAlignment.
For example, if an implementation reports 1x1 in pname:inputImageDataFillAlignment, then the implementation will perform auto-fill for any unaligned pixels beyond the encode resolution up to the next coding block size.
For a coding block size of 16x16, if the implementation reports 16x16 in pname:inputImageDataFillAlignment, then it is the application's responsibility to fill any unaligned pixels, if desired.
If not, it may impact the encoding efficiency, but it will not affect the validity of the generated bitstream.
If the implementation reports 8x8 in pname:inputImageDataFillAlignment, then for the 1920x1080 example, since the content is aligned to 8 pixels vertically, the implementation will auto-fill pixel rows 1081 to 1088 (up to the 16x16 coding block size in the example).
The auto-fill value(s) are implementation-specific.
The auto-fill value(s) are not written to the input image memory, but are used as part of the encoding operation on the input image.

include::{generated}/validity/structs/VkVideoEncodeCapabilitiesKHR.adoc[]
--

[open,refpage='VkVideoEncodeCapabilityFlagsKHR',desc='Bitmask of VkVideoEncodeCapabilityFlagBitsKHR',type='flags']
--
include::{generated}/api/flags/VkVideoEncodeCapabilityFlagsKHR.adoc[]

tname:VkVideoEncodeCapabilityFlagsKHR is a bitmask type for setting a mask of zero or more elink:VkVideoEncodeCapabilityFlagBitsKHR.
--

[open,refpage='VkVideoEncodeCapabilityFlagBitsKHR',desc='Video encode capability flags',type='enums']
--
Bits which may: be set in slink:VkVideoEncodeCapabilitiesKHR::pname:flags, indicating the encoding tools supported, are:

include::{generated}/api/enums/VkVideoEncodeCapabilityFlagBitsKHR.adoc[]

* ename:VK_VIDEO_ENCODE_CAPABILITY_PRECEDING_EXTERNALLY_ENCODED_BYTES_BIT_KHR reports that the implementation supports use of slink:VkVideoEncodeInfoKHR::pname:precedingExternallyEncodedBytes.
--

=== Video Encode Commands

[open,refpage='vkCmdEncodeVideoKHR',desc='Encode operation for bitstream generation',type='protos']
--
To launch video encode operations, call:

include::{generated}/api/protos/vkCmdEncodeVideoKHR.adoc[]

* pname:commandBuffer is the command buffer in which to record the encode command that generates the bitstream.
* pname:pEncodeInfo is a pointer to a slink:VkVideoEncodeInfoKHR structure.

Each call issues one or more video encode operations.
The implicit parameter pname:opCount corresponds to the number of video encode operations issued by the command.
After calling this command, the <> of each <> query is incremented by pname:opCount.

Currently each call to this command results in the issue of a single video encode operation.

.Valid Usage
****
* [[VUID-vkCmdEncodeVideoKHR-opCount-07174]]
For each <> query, the <> corresponding to the query type of that query plus pname:opCount must: be less than or equal to the <> corresponding to the query type of that query plus one
****

include::{generated}/validity/protos/vkCmdEncodeVideoKHR.adoc[]
--

[open,refpage='VkVideoEncodeInfoKHR',desc='Structure to chain codec-specific structures to',type='structs']
--
The sname:VkVideoEncodeInfoKHR structure is defined as:

include::{generated}/api/structs/VkVideoEncodeInfoKHR.adoc[]

* pname:sType is the type of this structure.
* pname:pNext is a pointer to a structure extending this structure.
A codec-specific extension structure must: be chained to specify what bitstream unit to generate with this encode operation.
* pname:flags is reserved for future use.
* pname:qualityLevel is the coding quality level of the encoding.
It is defined by the codec-specific extensions.
* pname:dstBitstreamBuffer is the buffer where the encoded bitstream output will be produced.
* pname:dstBitstreamBufferOffset is the offset in the pname:dstBitstreamBuffer where the encoded bitstream output will start.
pname:dstBitstreamBufferOffset's value must: be aligned to slink:VkVideoCapabilitiesKHR::pname:minBitstreamBufferOffsetAlignment, as reported by the implementation.
* pname:dstBitstreamBufferMaxRange is the maximum size of the pname:dstBitstreamBuffer that can be used while the encoded bitstream output is produced.
pname:dstBitstreamBufferMaxRange's value must: be aligned to slink:VkVideoCapabilitiesKHR::pname:minBitstreamBufferSizeAlignment, as reported by the implementation.
* pname:srcPictureResource is the Picture Resource of the <> to be encoded by the operation.
* pname:pSetupReferenceSlot is a pointer to a slink:VkVideoReferenceSlotInfoKHR structure used for generating a reconstructed reference slot and Picture Resource.
pname:pSetupReferenceSlot->slotIndex specifies the slot index number to use as a target for producing the Reconstructed (DPB) data.
pname:pSetupReferenceSlot must: be one of the entries provided in slink:VkVideoBeginCodingInfoKHR via the pname:pReferenceSlots within the flink:vkCmdBeginVideoCodingKHR command that established the Vulkan Video Encode Context for this command.
* pname:referenceSlotCount is the number of Reconstructed Reference Pictures that will be used when this encoding operation is executing.
* pname:pReferenceSlots is `NULL` or a pointer to an array of slink:VkVideoReferenceSlotInfoKHR structures that will be used when this encoding operation is executing.
Each entry in pname:pReferenceSlots must: be one of the entries provided in slink:VkVideoBeginCodingInfoKHR via the pname:pReferenceSlots within the flink:vkCmdBeginVideoCodingKHR command that established the Vulkan Video Encode Context for this command.
* pname:precedingExternallyEncodedBytes is the number of bytes externally encoded for insertion in the active video encode session overall bitstream prior to the bitstream that will be generated by the implementation for this instance of sname:VkVideoEncodeInfoKHR.
Valid when slink:VkVideoEncodeRateControlInfoKHR::pname:rateControlMode is not ename:VK_VIDEO_ENCODE_RATE_CONTROL_MODE_NONE_BIT_KHR.
The value provided is used to update the implementation's rate control algorithm for the rate control layer this instance of sname:VkVideoEncodeInfoKHR belongs to, by accounting for the bitrate budget consumed by these externally encoded bytes.
See slink:VkVideoEncodeRateControlInfoKHR for additional information about encode rate control.

The coded size of the encode operation is specified in pname:codedExtent of pname:srcPictureResource.

Multiple flink:vkCmdEncodeVideoKHR commands may: be recorded within a Vulkan Video Encode Context.
The execution of each flink:vkCmdEncodeVideoKHR command will result in generating codec-specific bitstream units.
These bitstream units are generated consecutively into the bitstream buffer specified in pname:dstBitstreamBuffer of a sname:VkVideoEncodeInfoKHR structure within the flink:vkCmdBeginVideoCodingKHR command.
The produced bitstream is the sum of all these bitstream units, including any padding between the bitstream units.
Any bitstream padding must: be filled with data compliant to the codec standard so as not to cause any syntax errors during decoding of the bitstream units with the padding included.
The range of the bitstream buffer written can: be queried via <>.

.Valid Usage
****
* [[VUID-VkVideoEncodeInfoKHR-None-07012]]
The bound video session must: not be in <> state at the time the command is executed on the device
****

include::{generated}/validity/structs/VkVideoEncodeInfoKHR.adoc[]
--

[open,refpage='VkVideoEncodeFlagsKHR',desc='Reserved for future use',type='flags']
--
include::{generated}/api/flags/VkVideoEncodeFlagsKHR.adoc[]

tlink:VkVideoEncodeFlagsKHR is a bitmask type for setting a mask, but is currently reserved for future use.
--

[open,refpage='VkVideoEncodeRateControlInfoKHR',desc='Structure to set encode stream rate control parameters',type='structs']
--
The sname:VkVideoEncodeRateControlInfoKHR structure is defined as:

include::{generated}/api/structs/VkVideoEncodeRateControlInfoKHR.adoc[]

* pname:sType is the type of this structure.
* pname:pNext is `NULL` or a pointer to a structure extending this structure.
* pname:flags is reserved for future use.
* pname:rateControlMode is a elink:VkVideoEncodeRateControlModeFlagBitsKHR value specifying the encode stream rate control mode.
* pname:layerCount specifies the number of rate control layers in the video encode stream.
* pname:pLayerConfigs is a pointer to an array of slink:VkVideoEncodeRateControlLayerInfoKHR structures specifying the rate control configurations of pname:layerCount rate control layers.

Including this structure in the pname:pNext chain of slink:VkVideoCodingControlInfoKHR and including ename:VK_VIDEO_CODING_CONTROL_ENCODE_RATE_CONTROL_BIT_KHR in slink:VkVideoCodingControlInfoKHR::pname:flags will define stream rate control settings for video encoding.

Additional structures providing codec-specific rate control parameters may: need to be included in the pname:pNext chain of sname:VkVideoCodingControlInfoKHR depending on the codec profile the bound video session was created with and the parameters specified in sname:VkVideoEncodeRateControlInfoKHR (see <>).

To ensure that the video session is properly initialized with stream-level rate control settings, the application must: call flink:vkCmdControlVideoCodingKHR with stream-level rate control settings at least once in execution order before the first flink:vkCmdEncodeVideoKHR command that is executed after video session reset.
If not provided, default implementation-specific stream rate control settings will be used.

Stream rate control settings can: also be re-initialized during an active video encoding session.
The re-initialization takes effect whenever the sname:VkVideoEncodeRateControlInfoKHR structure is included in the pname:pNext chain of the slink:VkVideoCodingControlInfoKHR structure in the call to flink:vkCmdControlVideoCodingKHR, and only impacts flink:vkCmdEncodeVideoKHR operations that follow in execution order.

include::{generated}/validity/structs/VkVideoEncodeRateControlInfoKHR.adoc[]
--

[open,refpage='VkVideoEncodeRateControlFlagsKHR',desc='Reserved for future use',type='flags']
--
include::{generated}/api/flags/VkVideoEncodeRateControlFlagsKHR.adoc[]

tname:VkVideoEncodeRateControlFlagsKHR is a bitmask type for setting a mask, but is currently reserved for future use.
--

[open,refpage='VkVideoEncodeRateControlModeFlagBitsKHR',desc='Video encode rate control modes',type='enums']
--
The rate control modes are defined with the following enums:

include::{generated}/api/enums/VkVideoEncodeRateControlModeFlagBitsKHR.adoc[]

* ename:VK_VIDEO_ENCODE_RATE_CONTROL_MODE_NONE_BIT_KHR for disabling rate control.
* ename:VK_VIDEO_ENCODE_RATE_CONTROL_MODE_CBR_BIT_KHR for constant bitrate rate control mode.
* ename:VK_VIDEO_ENCODE_RATE_CONTROL_MODE_VBR_BIT_KHR for variable bitrate rate control mode.
--

[open,refpage='VkVideoEncodeRateControlModeFlagsKHR',desc='Bitmask of VkVideoEncodeRateControlModeFlagBitsKHR', type='flags']
--
include::{generated}/api/flags/VkVideoEncodeRateControlModeFlagsKHR.adoc[]

tname:VkVideoEncodeRateControlModeFlagsKHR is a bitmask type for setting a mask of zero or more elink:VkVideoEncodeRateControlModeFlagBitsKHR.
--

[open,refpage='VkVideoEncodeRateControlLayerInfoKHR',desc='Structure to set encode per-layer rate control parameters',type='structs']
--
The sname:VkVideoEncodeRateControlLayerInfoKHR structure is defined as:

include::{generated}/api/structs/VkVideoEncodeRateControlLayerInfoKHR.adoc[]

* pname:sType is the type of this structure.
* pname:pNext is a pointer to a structure extending this structure.
* pname:averageBitrate is the average bitrate in bits/second.
Valid when rate control mode is not ename:VK_VIDEO_ENCODE_RATE_CONTROL_MODE_NONE_BIT_KHR.
* pname:maxBitrate is the peak bitrate in bits/second.
Valid when rate control mode is ename:VK_VIDEO_ENCODE_RATE_CONTROL_MODE_VBR_BIT_KHR.
* pname:frameRateNumerator is the numerator of the frame rate.
Valid when rate control mode is not ename:VK_VIDEO_ENCODE_RATE_CONTROL_MODE_NONE_BIT_KHR.
* pname:frameRateDenominator is the denominator of the frame rate.
Valid when rate control mode is not ename:VK_VIDEO_ENCODE_RATE_CONTROL_MODE_NONE_BIT_KHR.
* pname:virtualBufferSizeInMs is the leaky bucket model virtual buffer size in milliseconds, with respect to peak bitrate.
Valid when rate control mode is not ename:VK_VIDEO_ENCODE_RATE_CONTROL_MODE_NONE_BIT_KHR.
For example, virtual buffer size is (pname:virtualBufferSizeInMs {times} pname:maxBitrate / 1000).
* pname:initialVirtualBufferSizeInMs is the initial occupancy in milliseconds of the virtual buffer in the leaky bucket model.
Valid when the rate control mode is not ename:VK_VIDEO_ENCODE_RATE_CONTROL_MODE_NONE_BIT_KHR.

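
The virtual buffer size formula above, and the relationship between the bitrate and frame rate members, can be sketched in C.
Both function names are hypothetical, the 64-bit parameter types are an assumption chosen to avoid overflow rather than a statement about the member types, and the per-frame budget is shown only as ordinary arithmetic (bitrate divided by frame rate), not as a description of any implementation's rate control algorithm.

```c
#include <stdint.h>

/* Virtual buffer size in bits, following the formula in the text:
 * virtualBufferSizeInMs * maxBitrate / 1000. */
static uint64_t virtual_buffer_size_bits(uint64_t virtualBufferSizeInMs,
                                         uint64_t maxBitrate)
{
    return virtualBufferSizeInMs * maxBitrate / 1000u;
}

/* Average number of bits available per frame for a given average bitrate
 * and a frame rate of frameRateNumerator / frameRateDenominator.
 * Returns 0 when the numerator is 0 to avoid dividing by zero
 * (a choice made for this sketch only). */
static uint64_t average_bits_per_frame(uint64_t averageBitrate,
                                       uint32_t frameRateNumerator,
                                       uint32_t frameRateDenominator)
{
    if (frameRateNumerator == 0) {
        return 0;
    }
    return averageBitrate * frameRateDenominator / frameRateNumerator;
}
```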
A codec-specific structure specifying additional per-layer rate control settings must: be chained to sname:VkVideoEncodeRateControlLayerInfoKHR.
If multiple rate control layers are enabled (slink:VkVideoEncodeRateControlInfoKHR::pname:layerCount is greater than 1), then the chained codec-specific extension structure also identifies the specific video coding layer its parent sname:VkVideoEncodeRateControlLayerInfoKHR applies to.
If multiple rate control layers are enabled, the number of rate control layers must: match the number of video coding layers.
The specification for an encode codec-specific extension would describe how multiple video coding layers are enabled for the corresponding codec.

Per-layer rate control settings for all enabled rate control layers must: be initialized or re-initialized whenever stream rate control settings are provided via slink:VkVideoEncodeRateControlInfoKHR.
This is done by specifying settings for all enabled rate control layers in slink:VkVideoEncodeRateControlInfoKHR::pname:pLayerConfigs.

Including this structure in the pname:pNext chain of slink:VkVideoCodingControlInfoKHR and including ename:VK_VIDEO_CODING_CONTROL_ENCODE_RATE_CONTROL_LAYER_BIT_KHR in slink:VkVideoCodingControlInfoKHR::pname:flags will define stream rate control settings for individual layers during video encoding.
This adjustment only impacts the specified layer without impacting the rate control settings or implementation rate control algorithm behavior for any other enabled rate control layers.
The adjustment takes effect whenever the corresponding flink:vkCmdControlVideoCodingKHR is executed, and only impacts flink:vkCmdEncodeVideoKHR operations pertaining to the corresponding video coding layer that follow in execution order.

It is possible for an application to enable multiple video coding layers (via codec-specific extensions to encoding operations) while only enabling a single layer of rate control for the entire video stream.
To achieve this, pname:layerCount in slink:VkVideoEncodeRateControlInfoKHR must: be set to 1, and the single sname:VkVideoEncodeRateControlLayerInfoKHR provided in pname:pLayerConfigs would apply to all encoded segments of the video stream, regardless of which codec-defined video coding layer they belong to.
In this case, the implementation decides bitrate distribution across video coding layers (if applicable to the specified stream rate control mode).

include::{generated}/validity/structs/VkVideoEncodeRateControlLayerInfoKHR.adoc[]
--
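
The layer count rules described for rate control layers can be summarized in a short C sketch.
The function `rate_control_layer_count_ok` is a hypothetical application-side check, not part of the API; `maxRateControlLayers` stands in for the value reported in slink:VkVideoEncodeCapabilitiesKHR::pname:rateControlLayerCount, and rejecting a count of zero is an assumption made for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if rateControlLayerCount is consistent with the rules in the
 * text: a single rate control layer may cover any number of video coding
 * layers, while multiple rate control layers must match the number of video
 * coding layers. The count must also not exceed the reported maximum. */
static bool rate_control_layer_count_ok(uint32_t rateControlLayerCount,
                                        uint32_t videoCodingLayerCount,
                                        uint32_t maxRateControlLayers)
{
    /* Assumption for this sketch: zero layers is treated as invalid. */
    if (rateControlLayerCount == 0) {
        return false;
    }
    if (rateControlLayerCount > maxRateControlLayers) {
        return false;
    }
    if (rateControlLayerCount == 1) {
        /* One layer applies to the whole stream. */
        return true;
    }
    return rateControlLayerCount == videoCodingLayerCount;
}
```

For instance, with three video coding layers an application could pass either 1 or 3 as the rate control layer count, provided the implementation reports support for at least that many layers.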