Video Coding

Vulkan implementations may expose one or more queue families supporting video coding operations. These operations are performed by recording them into a command buffer within a video coding scope, and submitting them to queues with compatible video coding capabilities.

The Vulkan video functionalities are designed to be made available through a set of APIs built on top of each other, consisting of:

  • A core API providing common video coding functionalities,

  • APIs providing codec-independent video decode and video encode related functionalities, respectively,

  • Additional codec-specific APIs built on top of those.

This chapter details the fundamental components and operations of these.

Video Picture Resources

In the context of video coding, multidimensional arrays of image data that can be used as the source or target of video coding operations are referred to as video picture resources. They may store additional metadata that includes implementation-private information used during the execution of video coding operations, as discussed later.

Video picture resources are backed by VkImage objects. Individual subregions of VkImageView objects created from such resources can be used as decode output pictures, encode input pictures, reconstructed pictures, and/or reference pictures.

The parameters of a video picture resource are specified using a VkVideoPictureResourceInfoKHR structure.

The VkVideoPictureResourceInfoKHR structure is defined as:

// Provided by VK_KHR_video_queue
typedef struct VkVideoPictureResourceInfoKHR {
    VkStructureType    sType;
    const void*        pNext;
    VkOffset2D         codedOffset;
    VkExtent2D         codedExtent;
    uint32_t           baseArrayLayer;
    VkImageView        imageViewBinding;
} VkVideoPictureResourceInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • codedOffset is the offset in texels of the image subregion to use.

  • codedExtent is the size in pixels of the coded image data.

  • baseArrayLayer is the array layer of the image view specified in imageViewBinding to use as the video picture resource.

  • imageViewBinding is an image view representing the video picture resource.

The image subresource referred to by such a structure is defined as the image array layer index specified in baseArrayLayer relative to the image subresource range the image view specified in imageViewBinding was created with.

The meaning of the codedOffset and codedExtent depends on the command and context the video picture resource is used in, as well as on the used video profile and corresponding codec-specific semantics, as described later.

A video picture resource is uniquely defined by the image subresource referred to by an instance of this structure, together with the codedOffset and codedExtent members that identify the image subregion within the image subresource referenced corresponding to the video picture resource according to the particular codec-specific semantics.

Accesses to image data within a video picture resource happen at the granularity indicated by VkVideoCapabilitiesKHR::pictureAccessGranularity, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the used video profile. As a result, given an effective image subregion corresponding to a video picture resource, the actual image subregion accessed may be larger than that as it may include additional padding texels due to the picture access granularity. Any writes performed by video coding operations to such padding texels will result in undefined texel values.

Two video picture resources match if they refer to the same image subresource and they specify identical codedOffset and codedExtent values.

Valid Usage
  • VUID-VkVideoPictureResourceInfoKHR-baseArrayLayer-07175
    baseArrayLayer must be less than the VkImageViewCreateInfo::subresourceRange.layerCount specified when the image view imageViewBinding was created

Valid Usage (Implicit)
  • VUID-VkVideoPictureResourceInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_PICTURE_RESOURCE_INFO_KHR

  • VUID-VkVideoPictureResourceInfoKHR-pNext-pNext
    pNext must be NULL

  • VUID-VkVideoPictureResourceInfoKHR-imageViewBinding-parameter
    imageViewBinding must be a valid VkImageView handle

Decoded Picture Buffer

An integral part of video coding pipelines is the reconstruction of pictures from a compressed video bitstream. A reconstructed picture is a video picture resource resulting from this process.

Such reconstructed pictures can be used as reference pictures in subsequent video coding operations to provide predictions of the values of samples of subsequently decoded or encoded pictures. The correct use of such reconstructed pictures as reference pictures is driven by the video compression standard, the implementation, and the application-specific use cases.

The list of reference pictures used to provide such predictions within a single video coding operation is referred to as the list of active reference pictures.

The decoded picture buffer (DPB) is an indexed data structure that maintains the set of reference pictures available to be used in video coding operations. Individual indexed entries of the DPB are referred to as the decoded picture buffer (DPB) slots. The range of valid DPB slot indices is between zero and N-1, where N is the capacity of the DPB. Each DPB slot can refer to a reference picture containing a video frame or can refer to up to two reference pictures containing the top and/or bottom fields that, when both present, together represent a full video frame .

In Vulkan, the state and the backing store of the DPB is separated as follows:

In addition, the implementation may also maintain opaque metadata associated with DPB slots, including:

  • Reference picture metadata corresponding to the video picture resource associated with the DPB slot.

Such metadata may be stored by the implementation as part of the DPB slot state maintained by the video session, or as part of the video picture resource backing the DPB slot.

Any metadata stored in the video picture resources backing DPB slots are independent of the video session used to store it, hence such video picture resources can be shared with other video sessions. Correspondingly, any metadata that is dependent on the video session will always be stored as part of the DPB slot state maintained by that video session.

The responsibility of managing the DPB is split between the application and the implementation as follows:

In addition, the application is also responsible for managing the mapping between the codec-specific picture IDs and DPB slots, and any other codec-specific states unless otherwise specified.

DPB Slot States

At a given time, each DPB slot is either in active or inactive state. Initially, all DPB slots managed by a video session are in inactive state.

A DPB slot can be activated by using it as the target of picture reconstruction in a video coding operation with the reconstructed picture requested to be set up as a reference picture, according to the codec-specific semantics, changing its state to active and associating it with a picture reference to the reconstructed pictures.

Some video coding standards allow multiple picture references to be associated with a single DPB slot. In this case the state of the individual picture references can be independently updated.

As an example, H.264 decoding allows associating a separate top field and bottom field picture with the same DPB slot.

As part of reference picture setup, the implementation may also generate reference picture metadata. Such reference picture metadata is specific to each picture reference associated with the DPB slot.

If such a video coding operation completes successfully, the activated DPB slot will have a valid picture reference and the reconstructed picture is associated with the DPB slot. This is true even if the DPB slot is used as the target of a picture reconstruction that only sets up a top field or bottom field reference picture and thus does not yet refer to a complete frame. However, if any data provided as input to such a video coding operation is not compliant with the video compression standard used, that video coding operation may complete unsuccessfully, in which case the activated DPB slot will have an invalid picture reference. This is true even if the DPB slot previously had a valid picture reference to a top field or bottom field reference picture, but the reconstruction of the other field corresponding to the DPB slot failed.

The application can use queries to get feedback about the outcome of video coding operations and use the resulting VkQueryResultStatusKHR value to determine whether the video coding operation completed successfully (result status is positive) or unsuccessfully (result status is negative).

Using a reference picture associated with a DPB slot that has an invalid picture reference as an active reference picture in subsequent video coding operations is legal, however, the contents of the outputs of such operations are undefined, and any DPB slots activated by such video coding operations will also have an invalid picture reference. This is true even if such video coding operations may otherwise complete successfully.

A DPB slot can also be deactivated by the application, changing its state to inactive and invalidating any picture references and reference picture metadata associated with the DPB slot.

If an already active DPB slot is used as the target of picture reconstruction in a video coding operation, but the decoded picture is not requested to be set up as a reference picture, according to the codec-specific semantics, no reference picture setup happens and the corresponding picture reference and reference picture metadata is invalidated within the DPB slot. If the DPB slot no longer has any associated picture references after such an operation, the DPB slot is implicitly deactivated.

If an already active DPB slot is used as the target of picture reconstruction when decoding a field picture that is not marked as reference, then the behavior is as follows:

  • If the DPB slot is currently associated with a frame, then the DPB slot is deactivated.

  • If the DPB slot is not currently associated with a top field picture and the decoded picture is a top field picture, or if the DPB slot is not currently associated with a bottom field picture and the decoded picture is a bottom field picture, then the other field picture association of the DPB slot, if any, is not disturbed.

  • If the DPB slot is currently associated with a top field picture and the decoded picture is a top field picture, or if the DPB slot is currently associated with a bottom field picture and the decoded picture is a bottom field picture, then that picture association is invalidated, without disturbing the other field picture association, if any. If the DPB slot no longer has any associated picture references after such an operation, the DPB slot is implicitly deactivated.

A DPB slot can be activated with a new frame even if it is already active. In this case all previous associations of the DPB slots with reference pictures are replaced with an association with the reconstructed picture used to activate it.

If an already active DPB slot is activated with a reconstructed field picture, then the behavior is as follows:

  • If the DPB slot is currently associated with a frame, then that association is replaced with an association with the reconstructed field picture used to activate it.

  • If the DPB slot is not currently associated with a top field picture and the DPB slot is activated with a top field picture, or if the DPB slot is not currently associated with a bottom field picture and the DPB slot is activated with a bottom field picture, then the DPB slot is associated with the reconstructed field picture used to activate it, without disturbing the other field picture association, if any.

  • If the DPB slot is currently associated with a top field picture and the DPB slot is activated with a new top field picture, or if the DPB slot is currently associated with a bottom field picture and the DPB slot is activated with a new bottom field picture, then that association is replaced with an association with the reconstructed field picture used to activate it, without disturbing the other field picture association, if any.

Video Profiles

The VkVideoProfileInfoKHR structure is defined as follows:

// Provided by VK_KHR_video_queue
typedef struct VkVideoProfileInfoKHR {
    VkStructureType                     sType;
    const void*                         pNext;
    VkVideoCodecOperationFlagBitsKHR    videoCodecOperation;
    VkVideoChromaSubsamplingFlagsKHR    chromaSubsampling;
    VkVideoComponentBitDepthFlagsKHR    lumaBitDepth;
    VkVideoComponentBitDepthFlagsKHR    chromaBitDepth;
} VkVideoProfileInfoKHR;

Video profiles are provided as input to video capability queries such as vkGetPhysicalDeviceVideoCapabilitiesKHR or vkGetPhysicalDeviceVideoFormatPropertiesKHR, as well as when creating resources to be used by video coding operations such as images, buffers, query pools, and video sessions.

The full description of a video profile is specified by an instance of this structure, and the codec-specific and auxiliary structures provided in its pNext chain.

When this structure is specified as an input parameter to vkGetPhysicalDeviceVideoCapabilitiesKHR, or through the pProfiles member of a VkVideoProfileListInfoKHR structure in the pNext chain of the input parameter of a query command such as vkGetPhysicalDeviceVideoFormatPropertiesKHR or vkGetPhysicalDeviceImageFormatProperties2, the following error codes indicate specific causes of the failure of the query operation:

  • VK_ERROR_VIDEO_PICTURE_LAYOUT_NOT_SUPPORTED_KHR specifies that the requested video picture layout (e.g. through the pictureLayout member of a VkVideoDecodeH264ProfileInfoKHR structure included in the pNext chain of VkVideoProfileInfoKHR) is not supported.

  • VK_ERROR_VIDEO_PROFILE_OPERATION_NOT_SUPPORTED_KHR specifies that a video profile operation specified by videoCodecOperation is not supported.

  • VK_ERROR_VIDEO_PROFILE_FORMAT_NOT_SUPPORTED_KHR specifies that video format parameters specified by chromaSubsampling, lumaBitDepth, or chromaBitDepth are not supported.

  • VK_ERROR_VIDEO_PROFILE_CODEC_NOT_SUPPORTED_KHR specifies that the codec-specific parameters corresponding to the video codec operation are not supported.

Valid Usage
  • VUID-VkVideoProfileInfoKHR-chromaSubsampling-07013
    chromaSubsampling must have a single bit set

  • VUID-VkVideoProfileInfoKHR-lumaBitDepth-07014
    lumaBitDepth must have a single bit set

  • VUID-VkVideoProfileInfoKHR-chromaSubsampling-07015
    If chromaSubsampling is not VK_VIDEO_CHROMA_SUBSAMPLING_MONOCHROME_BIT_KHR, then chromaBitDepth must have a single bit set

  • VUID-VkVideoProfileInfoKHR-videoCodecOperation-07179
    If videoCodecOperation is VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR, then the pNext chain must include a VkVideoDecodeH264ProfileInfoKHR structure

  • VUID-VkVideoProfileInfoKHR-videoCodecOperation-07180
    If videoCodecOperation is VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR, then the pNext chain must include a VkVideoDecodeH265ProfileInfoKHR structure

  • VUID-VkVideoProfileInfoKHR-videoCodecOperation-09256
    If videoCodecOperation is VK_VIDEO_CODEC_OPERATION_DECODE_AV1_BIT_KHR, then the pNext chain must include a VkVideoDecodeAV1ProfileInfoKHR structure

  • VUID-VkVideoProfileInfoKHR-videoCodecOperation-07181
    If videoCodecOperation is VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR, then the pNext chain must include a VkVideoEncodeH264ProfileInfoKHR structure

  • VUID-VkVideoProfileInfoKHR-videoCodecOperation-07182
    If videoCodecOperation is VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, then the pNext chain must include a VkVideoEncodeH265ProfileInfoKHR structure

  • VUID-VkVideoProfileInfoKHR-videoCodecOperation-10262
    If videoCodecOperation is VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR, then the pNext chain must include a VkVideoEncodeAV1ProfileInfoKHR structure

Valid Usage (Implicit)
  • VUID-VkVideoProfileInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_PROFILE_INFO_KHR

  • VUID-VkVideoProfileInfoKHR-videoCodecOperation-parameter
    videoCodecOperation must be a valid VkVideoCodecOperationFlagBitsKHR value

  • VUID-VkVideoProfileInfoKHR-chromaSubsampling-parameter
    chromaSubsampling must be a valid combination of VkVideoChromaSubsamplingFlagBitsKHR values

  • VUID-VkVideoProfileInfoKHR-chromaSubsampling-requiredbitmask
    chromaSubsampling must not be 0

  • VUID-VkVideoProfileInfoKHR-lumaBitDepth-parameter
    lumaBitDepth must be a valid combination of VkVideoComponentBitDepthFlagBitsKHR values

  • VUID-VkVideoProfileInfoKHR-lumaBitDepth-requiredbitmask
    lumaBitDepth must not be 0

  • VUID-VkVideoProfileInfoKHR-chromaBitDepth-parameter
    chromaBitDepth must be a valid combination of VkVideoComponentBitDepthFlagBitsKHR values

Possible values of VkVideoProfileInfoKHR::videoCodecOperation, specifying the type of video coding operation and video compression standard used by a video profile, are:

// Provided by VK_KHR_video_queue
typedef enum VkVideoCodecOperationFlagBitsKHR {
    VK_VIDEO_CODEC_OPERATION_NONE_KHR = 0,
  // Provided by VK_KHR_video_encode_h264
    VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR = 0x00010000,
  // Provided by VK_KHR_video_encode_h265
    VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR = 0x00020000,
  // Provided by VK_KHR_video_decode_h264
    VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR = 0x00000001,
  // Provided by VK_KHR_video_decode_h265
    VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR = 0x00000002,
  // Provided by VK_KHR_video_decode_av1
    VK_VIDEO_CODEC_OPERATION_DECODE_AV1_BIT_KHR = 0x00000004,
  // Provided by VK_KHR_video_encode_av1
    VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR = 0x00040000,
} VkVideoCodecOperationFlagBitsKHR;
// Provided by VK_KHR_video_queue
typedef VkFlags VkVideoCodecOperationFlagsKHR;

VkVideoCodecOperationFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoCodecOperationFlagBitsKHR.

The video format chroma subsampling is defined with the following enums:

// Provided by VK_KHR_video_queue
typedef enum VkVideoChromaSubsamplingFlagBitsKHR {
    VK_VIDEO_CHROMA_SUBSAMPLING_INVALID_KHR = 0,
    VK_VIDEO_CHROMA_SUBSAMPLING_MONOCHROME_BIT_KHR = 0x00000001,
    VK_VIDEO_CHROMA_SUBSAMPLING_420_BIT_KHR = 0x00000002,
    VK_VIDEO_CHROMA_SUBSAMPLING_422_BIT_KHR = 0x00000004,
    VK_VIDEO_CHROMA_SUBSAMPLING_444_BIT_KHR = 0x00000008,
} VkVideoChromaSubsamplingFlagBitsKHR;
  • VK_VIDEO_CHROMA_SUBSAMPLING_MONOCHROME_BIT_KHR specifies that the format is monochrome.

  • VK_VIDEO_CHROMA_SUBSAMPLING_420_BIT_KHR specified that the format is 4:2:0 chroma subsampled, i.e. the two chroma components are sampled horizontally and vertically at half the sample rate of the luma component.

  • VK_VIDEO_CHROMA_SUBSAMPLING_422_BIT_KHR - the format is 4:2:2 chroma subsampled, i.e. the two chroma components are sampled horizontally at half the sample rate of luma component.

  • VK_VIDEO_CHROMA_SUBSAMPLING_444_BIT_KHR - the format is 4:4:4 chroma sampled, i.e. all three components of the Y′CBCR format are sampled at the same rate, thus there is no chroma subsampling.

Chroma subsampling is described in more detail in the Chroma Reconstruction section.

// Provided by VK_KHR_video_queue
typedef VkFlags VkVideoChromaSubsamplingFlagsKHR;

VkVideoChromaSubsamplingFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoChromaSubsamplingFlagBitsKHR.

Possible values for the video format component bit depth are:

// Provided by VK_KHR_video_queue
typedef enum VkVideoComponentBitDepthFlagBitsKHR {
    VK_VIDEO_COMPONENT_BIT_DEPTH_INVALID_KHR = 0,
    VK_VIDEO_COMPONENT_BIT_DEPTH_8_BIT_KHR = 0x00000001,
    VK_VIDEO_COMPONENT_BIT_DEPTH_10_BIT_KHR = 0x00000004,
    VK_VIDEO_COMPONENT_BIT_DEPTH_12_BIT_KHR = 0x00000010,
} VkVideoComponentBitDepthFlagBitsKHR;
  • VK_VIDEO_COMPONENT_BIT_DEPTH_8_BIT_KHR specifies a component bit depth of 8 bits.

  • VK_VIDEO_COMPONENT_BIT_DEPTH_10_BIT_KHR specifies a component bit depth of 10 bits.

  • VK_VIDEO_COMPONENT_BIT_DEPTH_12_BIT_KHR specifies a component bit depth of 12 bits.

// Provided by VK_KHR_video_queue
typedef VkFlags VkVideoComponentBitDepthFlagsKHR;

VkVideoComponentBitDepthFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoComponentBitDepthFlagBitsKHR.

Additional information about the video decode use case can be provided by adding a VkVideoDecodeUsageInfoKHR structure to the pNext chain of VkVideoProfileInfoKHR.

The VkVideoDecodeUsageInfoKHR structure is defined as:

// Provided by VK_KHR_video_decode_queue
typedef struct VkVideoDecodeUsageInfoKHR {
    VkStructureType               sType;
    const void*                   pNext;
    VkVideoDecodeUsageFlagsKHR    videoUsageHints;
} VkVideoDecodeUsageInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • videoUsageHints is a bitmask of VkVideoDecodeUsageFlagBitsKHR specifying hints about the intended use of the video decode profile.

Valid Usage (Implicit)
  • VUID-VkVideoDecodeUsageInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_DECODE_USAGE_INFO_KHR

  • VUID-VkVideoDecodeUsageInfoKHR-videoUsageHints-parameter
    videoUsageHints must be a valid combination of VkVideoDecodeUsageFlagBitsKHR values

The following bits can be specified in VkVideoDecodeUsageInfoKHR::videoUsageHints as a hint about the video decode use case:

// Provided by VK_KHR_video_decode_queue
typedef enum VkVideoDecodeUsageFlagBitsKHR {
    VK_VIDEO_DECODE_USAGE_DEFAULT_KHR = 0,
    VK_VIDEO_DECODE_USAGE_TRANSCODING_BIT_KHR = 0x00000001,
    VK_VIDEO_DECODE_USAGE_OFFLINE_BIT_KHR = 0x00000002,
    VK_VIDEO_DECODE_USAGE_STREAMING_BIT_KHR = 0x00000004,
} VkVideoDecodeUsageFlagBitsKHR;
  • VK_VIDEO_DECODE_USAGE_TRANSCODING_BIT_KHR specifies that video decoding is intended to be used in conjunction with video encoding to transcode a video bitstream with the same and/or different codecs.

  • VK_VIDEO_DECODE_USAGE_OFFLINE_BIT_KHR specifies that video decoding is intended to be used to consume a local video bitstream.

  • VK_VIDEO_DECODE_USAGE_STREAMING_BIT_KHR specifies that video decoding is intended to be used to consume a video bitstream received as a continuous flow over network.

There are no restrictions on the combination of bits that can be specified by the application. However, applications should use reasonable combinations in order for the implementation to be able to select the most appropriate mode of operation for the particular use case.

// Provided by VK_KHR_video_decode_queue
typedef VkFlags VkVideoDecodeUsageFlagsKHR;

VkVideoDecodeUsageFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoDecodeUsageFlagBitsKHR.

Additional information about the video encode use case can be provided by adding a VkVideoEncodeUsageInfoKHR structure to the pNext chain of VkVideoProfileInfoKHR.

The VkVideoEncodeUsageInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_queue
typedef struct VkVideoEncodeUsageInfoKHR {
    VkStructureType                 sType;
    const void*                     pNext;
    VkVideoEncodeUsageFlagsKHR      videoUsageHints;
    VkVideoEncodeContentFlagsKHR    videoContentHints;
    VkVideoEncodeTuningModeKHR      tuningMode;
} VkVideoEncodeUsageInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • videoUsageHints is a bitmask of VkVideoEncodeUsageFlagBitsKHR specifying hints about the intended use of the video encode profile.

  • videoContentHints is a bitmask of VkVideoEncodeContentFlagBitsKHR specifying hints about the content to be encoded using the video encode profile.

  • tuningMode is a VkVideoEncodeTuningModeKHR value specifying the tuning mode to use when encoding with the video profile.

Valid Usage (Implicit)
  • VUID-VkVideoEncodeUsageInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_USAGE_INFO_KHR

  • VUID-VkVideoEncodeUsageInfoKHR-videoUsageHints-parameter
    videoUsageHints must be a valid combination of VkVideoEncodeUsageFlagBitsKHR values

  • VUID-VkVideoEncodeUsageInfoKHR-videoContentHints-parameter
    videoContentHints must be a valid combination of VkVideoEncodeContentFlagBitsKHR values

  • VUID-VkVideoEncodeUsageInfoKHR-tuningMode-parameter
    If tuningMode is not 0, tuningMode must be a valid VkVideoEncodeTuningModeKHR value

The following bits can be specified in VkVideoEncodeUsageInfoKHR::videoUsageHints as a hint about the video encode use case:

// Provided by VK_KHR_video_encode_queue
typedef enum VkVideoEncodeUsageFlagBitsKHR {
    VK_VIDEO_ENCODE_USAGE_DEFAULT_KHR = 0,
    VK_VIDEO_ENCODE_USAGE_TRANSCODING_BIT_KHR = 0x00000001,
    VK_VIDEO_ENCODE_USAGE_STREAMING_BIT_KHR = 0x00000002,
    VK_VIDEO_ENCODE_USAGE_RECORDING_BIT_KHR = 0x00000004,
    VK_VIDEO_ENCODE_USAGE_CONFERENCING_BIT_KHR = 0x00000008,
} VkVideoEncodeUsageFlagBitsKHR;
  • VK_VIDEO_ENCODE_USAGE_TRANSCODING_BIT_KHR specifies that video encoding is intended to be used in conjunction with video decoding to transcode a video bitstream with the same and/or different codecs.

  • VK_VIDEO_ENCODE_USAGE_STREAMING_BIT_KHR specifies that video encoding is intended to be used to produce a video bitstream that is expected to be sent as a continuous flow over network.

  • VK_VIDEO_ENCODE_USAGE_RECORDING_BIT_KHR specifies that video encoding is intended to be used for real-time recording for offline consumption.

  • VK_VIDEO_ENCODE_USAGE_CONFERENCING_BIT_KHR specifies that video encoding is intended to be used in a video conferencing scenario.

There are no restrictions on the combination of bits that can be specified by the application. However, applications should use reasonable combinations in order for the implementation to be able to select the most appropriate mode of operation for the particular use case.

// Provided by VK_KHR_video_encode_queue
typedef VkFlags VkVideoEncodeUsageFlagsKHR;

VkVideoEncodeUsageFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoEncodeUsageFlagBitsKHR.

The following bits can be specified in VkVideoEncodeUsageInfoKHR::videoContentHints as a hint about the encoded video content:

// Provided by VK_KHR_video_encode_queue
typedef enum VkVideoEncodeContentFlagBitsKHR {
    VK_VIDEO_ENCODE_CONTENT_DEFAULT_KHR = 0,
    VK_VIDEO_ENCODE_CONTENT_CAMERA_BIT_KHR = 0x00000001,
    VK_VIDEO_ENCODE_CONTENT_DESKTOP_BIT_KHR = 0x00000002,
    VK_VIDEO_ENCODE_CONTENT_RENDERED_BIT_KHR = 0x00000004,
} VkVideoEncodeContentFlagBitsKHR;
  • VK_VIDEO_ENCODE_CONTENT_CAMERA_BIT_KHR specifies that video encoding is intended to be used to encode camera content.

  • VK_VIDEO_ENCODE_CONTENT_DESKTOP_BIT_KHR specifies that video encoding is intended to be used to encode desktop content.

  • VK_VIDEO_ENCODE_CONTENT_RENDERED_BIT_KHR specified that video encoding is intended to be used to encode rendered (e.g. game) content.

There are no restrictions on the combination of bits that can be specified by the application. However, applications should use reasonable combinations in order for the implementation to be able to select the most appropriate mode of operation for the particular content type.

// Provided by VK_KHR_video_encode_queue
typedef VkFlags VkVideoEncodeContentFlagsKHR;

VkVideoEncodeContentFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoEncodeContentFlagBitsKHR.

Possible video encode tuning mode values are as follows:

// Provided by VK_KHR_video_encode_queue
typedef enum VkVideoEncodeTuningModeKHR {
    VK_VIDEO_ENCODE_TUNING_MODE_DEFAULT_KHR = 0,
    VK_VIDEO_ENCODE_TUNING_MODE_HIGH_QUALITY_KHR = 1,
    VK_VIDEO_ENCODE_TUNING_MODE_LOW_LATENCY_KHR = 2,
    VK_VIDEO_ENCODE_TUNING_MODE_ULTRA_LOW_LATENCY_KHR = 3,
    VK_VIDEO_ENCODE_TUNING_MODE_LOSSLESS_KHR = 4,
} VkVideoEncodeTuningModeKHR;
  • VK_VIDEO_ENCODE_TUNING_MODE_DEFAULT_KHR specifies the default tuning mode.

  • VK_VIDEO_ENCODE_TUNING_MODE_HIGH_QUALITY_KHR specifies that video encoding is tuned for high quality. When using this tuning mode, the implementation may compromise the latency of video encoding operations to improve quality.

  • VK_VIDEO_ENCODE_TUNING_MODE_LOW_LATENCY_KHR specifies that video encoding is tuned for low latency. When using this tuning mode, the implementation may compromise quality to increase the performance and lower the latency of video encode operations.

  • VK_VIDEO_ENCODE_TUNING_MODE_ULTRA_LOW_LATENCY_KHR specifies that video encoding is tuned for ultra-low latency. When using this tuning mode, the implementation may compromise quality to maximize the performance and minimize the latency of video encoding operations.

  • VK_VIDEO_ENCODE_TUNING_MODE_LOSSLESS_KHR specifies that video encoding is tuned for lossless encoding. When using this tuning mode, video encode operations produce lossless output.

The VkVideoProfileListInfoKHR structure is defined as:

// Provided by VK_KHR_video_queue
typedef struct VkVideoProfileListInfoKHR {
    VkStructureType                 sType;
    const void*                     pNext;
    uint32_t                        profileCount;
    const VkVideoProfileInfoKHR*    pProfiles;
} VkVideoProfileListInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • profileCount is the number of elements in the pProfiles array.

  • pProfiles is a pointer to an array of VkVideoProfileInfoKHR structures.

Video transcoding is an example of a use case that necessitates the specification of multiple profiles in various contexts.

When the application provides a video decode profile and one or more video encode profiles in the profile list, the implementation ensures that any capabilitities returned or resources created are suitable for the video transcoding use cases without the need for manual data transformations.

Valid Usage
  • VUID-VkVideoProfileListInfoKHR-pProfiles-06813
    pProfiles must not contain more than one element whose videoCodecOperation member specifies a decode operation

Valid Usage (Implicit)
  • VUID-VkVideoProfileListInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_PROFILE_LIST_INFO_KHR

  • VUID-VkVideoProfileListInfoKHR-pProfiles-parameter
    If profileCount is not 0, pProfiles must be a valid pointer to an array of profileCount valid VkVideoProfileInfoKHR structures

Video Capabilities

Video Coding Capabilities

To query video coding capabilities for a specific video profile, call:

// Provided by VK_KHR_video_queue
VkResult vkGetPhysicalDeviceVideoCapabilitiesKHR(
    VkPhysicalDevice                            physicalDevice,
    const VkVideoProfileInfoKHR*                pVideoProfile,
    VkVideoCapabilitiesKHR*                     pCapabilities);
  • physicalDevice is the physical device from which to query the video decode or encode capabilities.

  • pVideoProfile is a pointer to a VkVideoProfileInfoKHR structure.

  • pCapabilities is a pointer to a VkVideoCapabilitiesKHR structure in which the capabilities are returned.

If the video profile described by pVideoProfile is supported by the implementation, then this command returns VK_SUCCESS and pCapabilities is filled with the capabilities supported with the specified video profile. Otherwise, one of the video-profile-specific error codes are returned.

Valid Usage
  • VUID-vkGetPhysicalDeviceVideoCapabilitiesKHR-pVideoProfile-07183
    If pVideoProfile->videoCodecOperation specifies a decode operation, then the pNext chain of pCapabilities must include a VkVideoDecodeCapabilitiesKHR structure

  • VUID-vkGetPhysicalDeviceVideoCapabilitiesKHR-pVideoProfile-07184
    If pVideoProfile->videoCodecOperation is VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR, then the pNext chain of pCapabilities must include a VkVideoDecodeH264CapabilitiesKHR structure

  • VUID-vkGetPhysicalDeviceVideoCapabilitiesKHR-pVideoProfile-07185
    If pVideoProfile->videoCodecOperation is VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR, then the pNext chain of pCapabilities must include a VkVideoDecodeH265CapabilitiesKHR structure

  • VUID-vkGetPhysicalDeviceVideoCapabilitiesKHR-pVideoProfile-09257
    If pVideoProfile->videoCodecOperation is VK_VIDEO_CODEC_OPERATION_DECODE_AV1_BIT_KHR, then the pNext chain of pCapabilities must include a VkVideoDecodeAV1CapabilitiesKHR structure

  • VUID-vkGetPhysicalDeviceVideoCapabilitiesKHR-pVideoProfile-07186
    If pVideoProfile->videoCodecOperation specifies an encode operation, then the pNext chain of pCapabilities must include a VkVideoEncodeCapabilitiesKHR structure

  • VUID-vkGetPhysicalDeviceVideoCapabilitiesKHR-pVideoProfile-07187
    If pVideoProfile->videoCodecOperation is VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR, then the pNext chain of pCapabilities must include a VkVideoEncodeH264CapabilitiesKHR structure

  • VUID-vkGetPhysicalDeviceVideoCapabilitiesKHR-pVideoProfile-07188
    If pVideoProfile->videoCodecOperation is VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, then the pNext chain of pCapabilities must include a VkVideoEncodeH265CapabilitiesKHR structure

  • VUID-vkGetPhysicalDeviceVideoCapabilitiesKHR-pVideoProfile-10263
    If pVideoProfile->videoCodecOperation is VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR, then the pNext chain of pCapabilities must include a VkVideoEncodeAV1CapabilitiesKHR structure

Valid Usage (Implicit)
  • VUID-vkGetPhysicalDeviceVideoCapabilitiesKHR-physicalDevice-parameter
    physicalDevice must be a valid VkPhysicalDevice handle

  • VUID-vkGetPhysicalDeviceVideoCapabilitiesKHR-pVideoProfile-parameter
    pVideoProfile must be a valid pointer to a valid VkVideoProfileInfoKHR structure

  • VUID-vkGetPhysicalDeviceVideoCapabilitiesKHR-pCapabilities-parameter
    pCapabilities must be a valid pointer to a VkVideoCapabilitiesKHR structure

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

  • VK_ERROR_VIDEO_PROFILE_OPERATION_NOT_SUPPORTED_KHR

  • VK_ERROR_VIDEO_PROFILE_FORMAT_NOT_SUPPORTED_KHR

  • VK_ERROR_VIDEO_PICTURE_LAYOUT_NOT_SUPPORTED_KHR

  • VK_ERROR_VIDEO_PROFILE_CODEC_NOT_SUPPORTED_KHR

The VkVideoCapabilitiesKHR structure is defined as:

// Provided by VK_KHR_video_queue
typedef struct VkVideoCapabilitiesKHR {
    VkStructureType              sType;
    void*                        pNext;
    VkVideoCapabilityFlagsKHR    flags;
    VkDeviceSize                 minBitstreamBufferOffsetAlignment;
    VkDeviceSize                 minBitstreamBufferSizeAlignment;
    VkExtent2D                   pictureAccessGranularity;
    VkExtent2D                   minCodedExtent;
    VkExtent2D                   maxCodedExtent;
    uint32_t                     maxDpbSlots;
    uint32_t                     maxActiveReferencePictures;
    VkExtensionProperties        stdHeaderVersion;
} VkVideoCapabilitiesKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • flags is a bitmask of VkVideoCapabilityFlagBitsKHR specifying capability flags.

  • minBitstreamBufferOffsetAlignment is the minimum alignment for bitstream buffer offsets.

  • minBitstreamBufferSizeAlignment is the minimum alignment for bitstream buffer range sizes.

  • pictureAccessGranularity is the granularity at which image access to video picture resources happen.

  • minCodedExtent is the minimum width and height of the coded frames.

  • maxCodedExtent is the maximum width and height of the coded frames.

  • maxDpbSlots is the maximum number of DPB slots supported by a single video session.

  • maxActiveReferencePictures is the maximum number of active reference pictures a single video coding operation can use.

  • stdHeaderVersion is a VkExtensionProperties structure reporting the Video Std header name and version supported for the video profile.

It is common for video compression standards to allow using all reference pictures associated with active DPB slots as active reference pictures, hence for video decode profiles the values returned in maxDpbSlots and maxActiveReferencePictures are often equal. Similarly, in case of video decode profiles supporting field pictures the value of maxActiveReferencePictures often equals maxDpbSlots × 2.

Valid Usage (Implicit)

Bits which can be set in VkVideoCapabilitiesKHR::flags are:

// Provided by VK_KHR_video_queue
typedef enum VkVideoCapabilityFlagBitsKHR {
    VK_VIDEO_CAPABILITY_PROTECTED_CONTENT_BIT_KHR = 0x00000001,
    VK_VIDEO_CAPABILITY_SEPARATE_REFERENCE_IMAGES_BIT_KHR = 0x00000002,
} VkVideoCapabilityFlagBitsKHR;
  • VK_VIDEO_CAPABILITY_PROTECTED_CONTENT_BIT_KHR specifies that video sessions support producing and consuming protected content.

  • VK_VIDEO_CAPABILITY_SEPARATE_REFERENCE_IMAGES_BIT_KHR indicates that the video picture resources associated with the DPB slots of a video session can be backed by separate VkImage objects. If this capability flag is not present, then all DPB slots of a video session must be associated with video picture resources backed by the same VkImage object (e.g. using different layers of the same image).

// Provided by VK_KHR_video_queue
typedef VkFlags VkVideoCapabilityFlagsKHR;

VkVideoCapabilityFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoCapabilityFlagBitsKHR.

Video Format Capabilities

To enumerate the supported video formats and corresponding capabilities for a specific video profile, call:

// Provided by VK_KHR_video_queue
VkResult vkGetPhysicalDeviceVideoFormatPropertiesKHR(
    VkPhysicalDevice                            physicalDevice,
    const VkPhysicalDeviceVideoFormatInfoKHR*   pVideoFormatInfo,
    uint32_t*                                   pVideoFormatPropertyCount,
    VkVideoFormatPropertiesKHR*                 pVideoFormatProperties);
  • physicalDevice is the physical device from which to query the video format properties.

  • pVideoFormatInfo is a pointer to a VkPhysicalDeviceVideoFormatInfoKHR structure specifying the usage and video profiles for which supported image formats and capabilities are returned.

  • pVideoFormatPropertyCount is a pointer to an integer related to the number of video format properties available or queried, as described below.

  • pVideoFormatProperties is a pointer to an array of VkVideoFormatPropertiesKHR structures in which supported image formats and capabilities are returned.

If pVideoFormatProperties is NULL, then the number of video format properties supported for the given physicalDevice is returned in pVideoFormatPropertyCount. Otherwise, pVideoFormatPropertyCount must point to a variable set by the application to the number of elements in the pVideoFormatProperties array, and on return the variable is overwritten with the number of values actually written to pVideoFormatProperties. If the value of pVideoFormatPropertyCount is less than the number of video format properties supported, at most pVideoFormatPropertyCount values will be written to pVideoFormatProperties, and VK_INCOMPLETE will be returned instead of VK_SUCCESS, to indicate that not all the available values were returned.

Video format properties are always queried with respect to a specific set of video profiles. These are specified by chaining the VkVideoProfileListInfoKHR structure to pVideoFormatInfo.

For most use cases, the images are used by a single video session and a single video profile is provided. For a use case such as video transcoding, where a decode session output image can be used as encode input in one or more encode sessions, multiple video profiles corresponding to the video sessions that will share the image must be provided.

If any of the video profiles specified via VkVideoProfileListInfoKHR::pProfiles are not supported, then this command returns one of the video-profile-specific error codes. Furthermore, if VkPhysicalDeviceVideoFormatInfoKHR::imageUsage includes any image usage flags not supported by the specified video profiles, then this command returns VK_ERROR_IMAGE_USAGE_NOT_SUPPORTED_KHR.

This command also returns VK_ERROR_IMAGE_USAGE_NOT_SUPPORTED_KHR if VkPhysicalDeviceVideoFormatInfoKHR::imageUsage does not include the appropriate flags as dictated by the decode capability flags returned in VkVideoDecodeCapabilitiesKHR::flags for any of the profiles specified in the VkVideoProfileListInfoKHR structure provided in the pNext chain of pVideoFormatInfo.

If the decode capability flags include VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_COINCIDE_BIT_KHR but not VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_DISTINCT_BIT_KHR, then in order to query video format properties for decode DPB and output usage, VkPhysicalDeviceVideoFormatInfoKHR::imageUsage must include both VK_IMAGE_USAGE_VIDEO_DECODE_DPB_BIT_KHR and VK_IMAGE_USAGE_VIDEO_DECODE_DST_BIT_KHR. Otherwise, the call will fail with VK_ERROR_IMAGE_USAGE_NOT_SUPPORTED_KHR.

If the decode capability flags include VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_DISTINCT_BIT_KHR but not VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_COINCIDE_BIT_KHR, then in order to query video format properties for decode DPB usage, VkPhysicalDeviceVideoFormatInfoKHR::imageUsage must include VK_IMAGE_USAGE_VIDEO_DECODE_DPB_BIT_KHR, but not VK_IMAGE_USAGE_VIDEO_DECODE_DST_BIT_KHR. Otherwise, the call will fail with VK_ERROR_IMAGE_USAGE_NOT_SUPPORTED_KHR. Similarly, to query video format properties for decode output usage, VkPhysicalDeviceVideoFormatInfoKHR::imageUsage must include VK_IMAGE_USAGE_VIDEO_DECODE_DST_BIT_KHR, but not VK_IMAGE_USAGE_VIDEO_DECODE_DPB_BIT_KHR. Otherwise, the call will fail with VK_ERROR_IMAGE_USAGE_NOT_SUPPORTED_KHR.

The imageUsage member of the VkPhysicalDeviceVideoFormatInfoKHR structure specifies the expected video usage flags that the returned video formats must support. Correspondingly, the imageUsageFlags member of each VkVideoFormatPropertiesKHR structure returned will contain at least the same set of image usage flags.

If the implementation supports using images of a particular format in operations other than video decode/encode then the imageUsageFlags member of the corresponding VkVideoFormatPropertiesKHR structure returned will include additional image usage flags indicating that.

For most use cases, only decode or encode related usage flags are going to be specified. For a use case such as transcode, if the image were to be shared between decode and encode session(s), then both decode and encode related usage flags can be set.

Multiple VkVideoFormatPropertiesKHR entries may be returned with the same format member with different componentMapping, imageType, or imageTiling values, as described later.

If VkPhysicalDeviceVideoFormatInfoKHR::imageUsageFlags includes VK_IMAGE_USAGE_VIDEO_ENCODE_QUANTIZATION_DELTA_MAP_BIT_KHR or VK_IMAGE_USAGE_VIDEO_ENCODE_EMPHASIS_MAP_BIT_KHR, multiple VkVideoFormatPropertiesKHR entries may be returned with the same format, componentMapping, imageType, and imageTiling member values, but different quantizationMapTexelSize returned in the VkVideoFormatQuantizationMapPropertiesKHR structure, if one is included in the VkVideoFormatPropertiesKHR::pNext chain, when the queried quantization map type supports multiple distinct quantization map texel sizes.

In addition, a different set of VkVideoFormatPropertiesKHR entries may be returned depending on the imageUsage member of the VkPhysicalDeviceVideoFormatInfoKHR structure, even for the same set of video profiles, for example, based on whether encode input, encode DPB, decode output, and/or decode DPB usage is requested.

The application can select the parameters returned in the VkVideoFormatPropertiesKHR entries and use compatible parameters when creating the input, output, and DPB images. The implementation will report all image creation and usage flags that are valid for images used with the requested video profiles but applications should create images only with those that are necessary for the particular use case.

Before creating an image, the application can obtain the complete set of supported image format features by calling vkGetPhysicalDeviceImageFormatProperties2 using parameters derived from the members of one of the reported VkVideoFormatPropertiesKHR entries and adding the same VkVideoProfileListInfoKHR structure to the pNext chain of VkPhysicalDeviceImageFormatInfo2.

The following applies to all VkVideoFormatPropertiesKHR entries returned by vkGetPhysicalDeviceVideoFormatPropertiesKHR:

  • vkGetPhysicalDeviceFormatProperties2 must succeed when called with VkVideoFormatPropertiesKHR::format

  • If VkVideoFormatPropertiesKHR::imageTiling is VK_IMAGE_TILING_OPTIMAL, then the optimalTilingFeatures returned by vkGetPhysicalDeviceFormatProperties2 must include all format features required by the image usage flags reported in VkVideoFormatPropertiesKHR::imageUsageFlags for the format, as indicated in the Format Feature Dependent Usage Flags section.

  • If VkVideoFormatPropertiesKHR::imageTiling is VK_IMAGE_TILING_LINEAR, then the linearTilingFeatures returned by vkGetPhysicalDeviceFormatProperties2 must include all format features required by the image usage flags reported in VkVideoFormatPropertiesKHR::imageUsageFlags for the format, as indicated in the Format Feature Dependent Usage Flags section.

  • vkGetPhysicalDeviceImageFormatProperties2 must succeed when called with a VkPhysicalDeviceImageFormatInfo2 structure containing the following information:

    • The pNext chain including the same VkVideoProfileListInfoKHR structure used to call vkGetPhysicalDeviceVideoFormatPropertiesKHR.

    • format set to the value of VkVideoFormatPropertiesKHR::format.

    • type set to the value of VkVideoFormatPropertiesKHR::imageType.

    • tiling set to the value of VkVideoFormatPropertiesKHR::imageTiling.

    • usage set to the value of VkVideoFormatPropertiesKHR::imageUsageFlags.

    • flags set to the value of VkVideoFormatPropertiesKHR::imageCreateFlags.

The componentMapping member of VkVideoFormatPropertiesKHR defines the ordering of the Y′CBCR color channels from the perspective of the video codec operations specified in VkVideoProfileListInfoKHR. For example, if the implementation produces video decode output with the format VK_FORMAT_G8_B8R8_2PLANE_420_UNORM where the blue and red chrominance channels are swapped then the componentMapping member of the corresponding VkVideoFormatPropertiesKHR structure will have the following member values:

components.r = VK_COMPONENT_SWIZZLE_B;        // Cb component
components.g = VK_COMPONENT_SWIZZLE_IDENTITY; // Y component
components.b = VK_COMPONENT_SWIZZLE_R;        // Cr component
components.a = VK_COMPONENT_SWIZZLE_IDENTITY; // unused, defaults to 1.0
Valid Usage
  • VUID-vkGetPhysicalDeviceVideoFormatPropertiesKHR-pNext-06812
    The pNext chain of pVideoFormatInfo must include a VkVideoProfileListInfoKHR structure with profileCount greater than 0

Valid Usage (Implicit)
  • VUID-vkGetPhysicalDeviceVideoFormatPropertiesKHR-physicalDevice-parameter
    physicalDevice must be a valid VkPhysicalDevice handle

  • VUID-vkGetPhysicalDeviceVideoFormatPropertiesKHR-pVideoFormatInfo-parameter
    pVideoFormatInfo must be a valid pointer to a valid VkPhysicalDeviceVideoFormatInfoKHR structure

  • VUID-vkGetPhysicalDeviceVideoFormatPropertiesKHR-pVideoFormatPropertyCount-parameter
    pVideoFormatPropertyCount must be a valid pointer to a uint32_t value

  • VUID-vkGetPhysicalDeviceVideoFormatPropertiesKHR-pVideoFormatProperties-parameter
    If the value referenced by pVideoFormatPropertyCount is not 0, and pVideoFormatProperties is not NULL, pVideoFormatProperties must be a valid pointer to an array of pVideoFormatPropertyCount VkVideoFormatPropertiesKHR structures

Return Codes
Success
  • VK_SUCCESS

  • VK_INCOMPLETE

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

  • VK_ERROR_IMAGE_USAGE_NOT_SUPPORTED_KHR

  • VK_ERROR_VIDEO_PROFILE_OPERATION_NOT_SUPPORTED_KHR

  • VK_ERROR_VIDEO_PROFILE_FORMAT_NOT_SUPPORTED_KHR

  • VK_ERROR_VIDEO_PICTURE_LAYOUT_NOT_SUPPORTED_KHR

  • VK_ERROR_VIDEO_PROFILE_CODEC_NOT_SUPPORTED_KHR

The VkPhysicalDeviceVideoFormatInfoKHR structure is defined as:

// Provided by VK_KHR_video_queue
typedef struct VkPhysicalDeviceVideoFormatInfoKHR {
    VkStructureType      sType;
    const void*          pNext;
    VkImageUsageFlags    imageUsage;
} VkPhysicalDeviceVideoFormatInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • imageUsage is a bitmask of VkImageUsageFlagBits specifying the intended usage of the video images.

Valid Usage (Implicit)
  • VUID-VkPhysicalDeviceVideoFormatInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VIDEO_FORMAT_INFO_KHR

  • VUID-VkPhysicalDeviceVideoFormatInfoKHR-pNext-pNext
    pNext must be NULL or a pointer to a valid instance of VkVideoProfileListInfoKHR

  • VUID-VkPhysicalDeviceVideoFormatInfoKHR-sType-unique
    The sType value of each struct in the pNext chain must be unique

  • VUID-VkPhysicalDeviceVideoFormatInfoKHR-imageUsage-parameter
    imageUsage must be a valid combination of VkImageUsageFlagBits values

  • VUID-VkPhysicalDeviceVideoFormatInfoKHR-imageUsage-requiredbitmask
    imageUsage must not be 0

The VkVideoFormatPropertiesKHR structure is defined as:

// Provided by VK_KHR_video_queue
typedef struct VkVideoFormatPropertiesKHR {
    VkStructureType       sType;
    void*                 pNext;
    VkFormat              format;
    VkComponentMapping    componentMapping;
    VkImageCreateFlags    imageCreateFlags;
    VkImageType           imageType;
    VkImageTiling         imageTiling;
    VkImageUsageFlags     imageUsageFlags;
} VkVideoFormatPropertiesKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • format is a VkFormat that specifies the format that can be used with the specified video profiles and image usages.

  • componentMapping defines the color channel order used for the format. format along with componentMapping describe how the color channels are ordered when producing video decoder output or are expected to be ordered in video encoder input, when applicable. If the format reported does not require component swizzling then all members of componentMapping will be set to VK_COMPONENT_SWIZZLE_IDENTITY.

  • imageCreateFlags is a bitmask of VkImageCreateFlagBits specifying the supported image creation flags for the format.

  • imageType is a VkImageType that specifies the image type the format can be used with.

  • imageTiling is a VkImageTiling that specifies the image tiling the format can be used with.

  • imageUsageFlags is a bitmask of VkImageUsageFlagBits specifying the supported image usage flags for the format.

Valid Usage (Implicit)

The list of supported video format properties for a set of image usage flags with respect to a video profile is defined as the list of VkVideoFormatPropertiesKHR structures and any structures included in its pNext chain, obtained by calling vkGetPhysicalDeviceVideoFormatPropertiesKHR with VkPhysicalDeviceVideoFormatInfoKHR::imageUsage equal to the VkImageUsageFlags in question and the VkPhysicalDeviceVideoFormatInfoKHR::pNext chain including a VkVideoProfileListInfoKHR structure with its pProfiles member containing a single array element specifying the VkVideoProfileInfoKHR structure chain describing the video profile in question.

Video Sessions

Video sessions are objects that represent and maintain the state needed to perform video decode or encode operations using a specific video profile.

In case of video encode profiles this includes the current rate control configuration and the currently set video encode quality level.

Video sessions are represented by VkVideoSessionKHR handles:

// Provided by VK_KHR_video_queue
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkVideoSessionKHR)

Creating a Video Session

To create a video session object, call:

// Provided by VK_KHR_video_queue
VkResult vkCreateVideoSessionKHR(
    VkDevice                                    device,
    const VkVideoSessionCreateInfoKHR*          pCreateInfo,
    const VkAllocationCallbacks*                pAllocator,
    VkVideoSessionKHR*                          pVideoSession);
  • device is the logical device that creates the video session.

  • pCreateInfo is a pointer to a VkVideoSessionCreateInfoKHR structure containing parameters to be used to create the video session.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

  • pVideoSession is a pointer to a VkVideoSessionKHR handle in which the resulting video session object is returned.

The resulting video session object is said to be created with the video codec operation specified in pCreateInfo->pVideoProfile→videoCodecOperation.

The name and version of the codec-specific Video Std header to be used with the video session is specified by the VkExtensionProperties structure pointed to by pCreateInfo->pStdHeaderVersion. If a non-existent or unsupported Video Std header version is specified in pCreateInfo->pStdHeaderVersion→specVersion, then this command returns VK_ERROR_VIDEO_STD_VERSION_NOT_SUPPORTED_KHR.

Video session objects are created in uninitialized state. In order to transition the video session into initial state, the application must issue a vkCmdControlVideoCodingKHR command with VkVideoCodingControlInfoKHR::flags including VK_VIDEO_CODING_CONTROL_RESET_BIT_KHR.

Video session objects also maintain the state of the DPB. The number of DPB slots usable with the created video session is specified in pCreateInfo->maxDpbSlots, and each slot is initially in the inactive state.

Each DPB slot maintained by the created video session can refer to a reference picture representing a video frame.

In addition, if the videoCodecOperation member of the VkVideoProfileInfoKHR structure pointed to by pCreateInfo->pVideoProfile is VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR and the pictureLayout member of the VkVideoDecodeH264ProfileInfoKHR structure provided in the VkVideoProfileInfoKHR::pNext chain is not VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_PROGRESSIVE_KHR, then the created video session supports interlaced frames and each DPB slot maintained by the created video session can instead refer to separate top field and bottom field reference pictures that together can represent a full video frame. In this case, it is up to the application, driven by the video content, whether it associates any individual DPB slot with separate top and/or bottom field pictures or a single picture representing a full frame.

The created video session can be used to perform video coding operations using video frames up to the maximum size specified in pCreateInfo->maxCodedExtent. The minimum frame size allowed is implicitly derived from VkVideoCapabilitiesKHR::minCodedExtent, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile specified by pCreateInfo->pVideoProfile. Accordingly, the created video session is said to be created with a minCodedExtent equal to that.

In case of video session objects created with a video encode operation, implementations may return the VK_ERROR_INVALID_VIDEO_STD_PARAMETERS_KHR error if any of the specified Video Std parameters do not adhere to the syntactic or semantic requirements of the used video compression standard, or if values derived from parameters according to the rules defined by the used video compression standard do not adhere to the capabilities of the video compression standard or the implementation.

Applications should not rely on the VK_ERROR_INVALID_VIDEO_STD_PARAMETERS_KHR error being returned by any command as a means to verify Video Std parameters, as implementations are not required to report the error in any specific set of cases.

Valid Usage (Implicit)
  • VUID-vkCreateVideoSessionKHR-device-parameter
    device must be a valid VkDevice handle

  • VUID-vkCreateVideoSessionKHR-pCreateInfo-parameter
    pCreateInfo must be a valid pointer to a valid VkVideoSessionCreateInfoKHR structure

  • VUID-vkCreateVideoSessionKHR-pAllocator-parameter
    If pAllocator is not NULL, pAllocator must be a valid pointer to a valid VkAllocationCallbacks structure

  • VUID-vkCreateVideoSessionKHR-pVideoSession-parameter
    pVideoSession must be a valid pointer to a VkVideoSessionKHR handle

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

  • VK_ERROR_INITIALIZATION_FAILED

  • VK_ERROR_VIDEO_STD_VERSION_NOT_SUPPORTED_KHR

  • VK_ERROR_INVALID_VIDEO_STD_PARAMETERS_KHR

The VkVideoSessionCreateInfoKHR structure is defined as:

// Provided by VK_KHR_video_queue
typedef struct VkVideoSessionCreateInfoKHR {
    VkStructureType                 sType;
    const void*                     pNext;
    uint32_t                        queueFamilyIndex;
    VkVideoSessionCreateFlagsKHR    flags;
    const VkVideoProfileInfoKHR*    pVideoProfile;
    VkFormat                        pictureFormat;
    VkExtent2D                      maxCodedExtent;
    VkFormat                        referencePictureFormat;
    uint32_t                        maxDpbSlots;
    uint32_t                        maxActiveReferencePictures;
    const VkExtensionProperties*    pStdHeaderVersion;
} VkVideoSessionCreateInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • queueFamilyIndex is the index of the queue family the created video session will be used with.

  • flags is a bitmask of VkVideoSessionCreateFlagBitsKHR specifying creation flags.

  • pVideoProfile is a pointer to a VkVideoProfileInfoKHR structure specifying the video profile the created video session will be used with.

  • pictureFormat is the image format the created video session will be used with. If pVideoProfile->videoCodecOperation specifies a decode operation, then pictureFormat is the image format of decode output pictures usable with the created video session. If pVideoProfile->videoCodecOperation specifies an encode operation, then pictureFormat is the image format of encode input pictures usable with the created video session.

  • maxCodedExtent is the maximum width and height of the coded frames the created video session will be used with.

  • referencePictureFormat is the image format of reference pictures stored in the DPB the created video session will be used with.

  • maxDpbSlots is the maximum number of DPB Slots that can be used with the created video session.

  • maxActiveReferencePictures is the maximum number of active reference pictures that can be used in a single video coding operation using the created video session.

  • pStdHeaderVersion is a pointer to a VkExtensionProperties structure requesting the Video Std header version to use for the videoCodecOperation specified in pVideoProfile.

Valid Usage
  • VUID-VkVideoSessionCreateInfoKHR-protectedMemory-07189
    If the protectedMemory feature is not enabled or if VkVideoCapabilitiesKHR::flags does not include VK_VIDEO_CAPABILITY_PROTECTED_CONTENT_BIT_KHR, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile specified by pVideoProfile, then flags must not include VK_VIDEO_SESSION_CREATE_PROTECTED_CONTENT_BIT_KHR

  • VUID-VkVideoSessionCreateInfoKHR-flags-08371
    If flags includes VK_VIDEO_SESSION_CREATE_INLINE_QUERIES_BIT_KHR, then videoMaintenance1 must be enabled

  • VUID-VkVideoSessionCreateInfoKHR-flags-10264
    If flags includes VK_VIDEO_SESSION_CREATE_ALLOW_ENCODE_QUANTIZATION_DELTA_MAP_BIT_KHR or VK_VIDEO_SESSION_CREATE_ALLOW_ENCODE_EMPHASIS_MAP_BIT_KHR, then the videoEncodeQuantizationMap feature must be enabled

  • VUID-VkVideoSessionCreateInfoKHR-flags-10265
    If flags includes VK_VIDEO_SESSION_CREATE_ALLOW_ENCODE_QUANTIZATION_DELTA_MAP_BIT_KHR or VK_VIDEO_SESSION_CREATE_ALLOW_ENCODE_EMPHASIS_MAP_BIT_KHR, then pVideoProfile->videoCodecOperation must specify an encode operation

  • VUID-VkVideoSessionCreateInfoKHR-flags-10266
    If flags includes VK_VIDEO_SESSION_CREATE_ALLOW_ENCODE_QUANTIZATION_DELTA_MAP_BIT_KHR, then it must not also include VK_VIDEO_SESSION_CREATE_ALLOW_ENCODE_EMPHASIS_MAP_BIT_KHR

  • VUID-VkVideoSessionCreateInfoKHR-flags-10267
    If VkVideoEncodeCapabilitiesKHR::flags does not include VK_VIDEO_ENCODE_CAPABILITY_QUANTIZATION_DELTA_MAP_BIT_KHR, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile specified by pVideoProfile, then flags must not include VK_VIDEO_SESSION_CREATE_ALLOW_ENCODE_QUANTIZATION_DELTA_MAP_BIT_KHR

  • VUID-VkVideoSessionCreateInfoKHR-flags-10268
    If VkVideoEncodeCapabilitiesKHR::flags does not include VK_VIDEO_ENCODE_CAPABILITY_EMPHASIS_MAP_BIT_KHR, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile specified by pVideoProfile, then flags must not include VK_VIDEO_SESSION_CREATE_ALLOW_ENCODE_EMPHASIS_MAP_BIT_KHR

  • VUID-VkVideoSessionCreateInfoKHR-pVideoProfile-04845
    pVideoProfile must be a supported video profile

  • VUID-VkVideoSessionCreateInfoKHR-maxDpbSlots-04847
    maxDpbSlots must be less than or equal to VkVideoCapabilitiesKHR::maxDpbSlots, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile specified by pVideoProfile

  • VUID-VkVideoSessionCreateInfoKHR-maxActiveReferencePictures-04849
    maxActiveReferencePictures must be less than or equal to VkVideoCapabilitiesKHR::maxActiveReferencePictures, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile specified by pVideoProfile

  • VUID-VkVideoSessionCreateInfoKHR-maxDpbSlots-04850
    If either maxDpbSlots or maxActiveReferencePictures is 0, then both must be 0

  • VUID-VkVideoSessionCreateInfoKHR-maxCodedExtent-04851
    maxCodedExtent must be between VkVideoCapabilitiesKHR::minCodedExtent and VkVideoCapabilitiesKHR::maxCodedExtent, inclusive, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile specified by pVideoProfile

  • VUID-VkVideoSessionCreateInfoKHR-referencePictureFormat-04852
    If pVideoProfile->videoCodecOperation specifies a decode operation and maxActiveReferencePictures is greater than 0, then referencePictureFormat must be one of the supported decode DPB formats, as returned by vkGetPhysicalDeviceVideoFormatPropertiesKHR in VkVideoFormatPropertiesKHR::format when called with the imageUsage member of its pVideoFormatInfo parameter containing VK_IMAGE_USAGE_VIDEO_DECODE_DPB_BIT_KHR, and with a VkVideoProfileListInfoKHR structure specified in the pNext chain of its pVideoFormatInfo parameter whose pProfiles member contains an element matching pVideoProfile

  • VUID-VkVideoSessionCreateInfoKHR-referencePictureFormat-06814
    If pVideoProfile->videoCodecOperation specifies an encode operation and maxActiveReferencePictures is greater than 0, then referencePictureFormat must be one of the supported decode DPB formats, as returned by then referencePictureFormat must be one of the supported encode DPB formats, as returned by vkGetPhysicalDeviceVideoFormatPropertiesKHR in VkVideoFormatPropertiesKHR::format when called with the imageUsage member of its pVideoFormatInfo parameter containing VK_IMAGE_USAGE_VIDEO_ENCODE_DPB_BIT_KHR, and with a VkVideoProfileListInfoKHR structure specified in the pNext chain of its pVideoFormatInfo parameter whose pProfiles member contains an element matching pVideoProfile

  • VUID-VkVideoSessionCreateInfoKHR-pictureFormat-04853
    If pVideoProfile->videoCodecOperation specifies a decode operation, then pictureFormat must be one of the supported decode output formats, as returned by vkGetPhysicalDeviceVideoFormatPropertiesKHR in VkVideoFormatPropertiesKHR::format when called with the imageUsage member of its pVideoFormatInfo parameter containing VK_IMAGE_USAGE_VIDEO_DECODE_DST_BIT_KHR, and with a VkVideoProfileListInfoKHR structure specified in the pNext chain of its pVideoFormatInfo parameter whose pProfiles member contains an element matching pVideoProfile

  • VUID-VkVideoSessionCreateInfoKHR-pictureFormat-04854
    If pVideoProfile->videoCodecOperation specifies an encode operation, then pictureFormat must be one of the supported encode input formats, as returned by vkGetPhysicalDeviceVideoFormatPropertiesKHR in VkVideoFormatPropertiesKHR::format when called with the imageUsage member of its pVideoFormatInfo parameter containing VK_IMAGE_USAGE_VIDEO_ENCODE_SRC_BIT_KHR, and with a VkVideoProfileListInfoKHR structure specified in the pNext chain of its pVideoFormatInfo parameter whose pProfiles member contains an element matching pVideoProfile

  • VUID-VkVideoSessionCreateInfoKHR-pStdHeaderVersion-07190
    pStdHeaderVersion->extensionName must match VkVideoCapabilitiesKHR::stdHeaderVersion.extensionName, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile specified by pVideoProfile

  • VUID-VkVideoSessionCreateInfoKHR-pStdHeaderVersion-07191
    pStdHeaderVersion->specVersion must be less than or equal to VkVideoCapabilitiesKHR::stdHeaderVersion.specVersion, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile specified by pVideoProfile

  • VUID-VkVideoSessionCreateInfoKHR-pVideoProfile-08251
    If pVideoProfile->videoCodecOperation is VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR and the pNext chain of this structure includes a VkVideoEncodeH264SessionCreateInfoKHR structure, then its maxLevelIdc member must be less than or equal to VkVideoEncodeH264CapabilitiesKHR::maxLevelIdc, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile specified in pVideoProfile

  • VUID-VkVideoSessionCreateInfoKHR-pVideoProfile-08252
    If pVideoProfile->videoCodecOperation is VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR and the pNext chain of this structure includes a VkVideoEncodeH265SessionCreateInfoKHR structure, then its maxLevelIdc member must be less than or equal to VkVideoEncodeH265CapabilitiesKHR::maxLevelIdc, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile specified in pVideoProfile

  • VUID-VkVideoSessionCreateInfoKHR-pVideoProfile-10269
    If pVideoProfile->videoCodecOperation is VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR, then the videoEncodeAV1 feature must be enabled

  • VUID-VkVideoSessionCreateInfoKHR-pVideoProfile-10270
    If pVideoProfile->videoCodecOperation is VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR and the pNext chain of this structure includes a VkVideoEncodeAV1SessionCreateInfoKHR structure, then its maxLevel member must be less than or equal to VkVideoEncodeAV1CapabilitiesKHR::maxLevel, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile specified in pVideoProfile

Valid Usage (Implicit)
  • VUID-VkVideoSessionCreateInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_SESSION_CREATE_INFO_KHR

  • VUID-VkVideoSessionCreateInfoKHR-pNext-pNext
    Each pNext member of any structure (including this one) in the pNext chain must be either NULL or a pointer to a valid instance of VkVideoEncodeAV1SessionCreateInfoKHR, VkVideoEncodeH264SessionCreateInfoKHR, or VkVideoEncodeH265SessionCreateInfoKHR

  • VUID-VkVideoSessionCreateInfoKHR-sType-unique
    The sType value of each struct in the pNext chain must be unique

  • VUID-VkVideoSessionCreateInfoKHR-flags-parameter
    flags must be a valid combination of VkVideoSessionCreateFlagBitsKHR values

  • VUID-VkVideoSessionCreateInfoKHR-pVideoProfile-parameter
    pVideoProfile must be a valid pointer to a valid VkVideoProfileInfoKHR structure

  • VUID-VkVideoSessionCreateInfoKHR-pictureFormat-parameter
    pictureFormat must be a valid VkFormat value

  • VUID-VkVideoSessionCreateInfoKHR-referencePictureFormat-parameter
    referencePictureFormat must be a valid VkFormat value

  • VUID-VkVideoSessionCreateInfoKHR-pStdHeaderVersion-parameter
    pStdHeaderVersion must be a valid pointer to a valid VkExtensionProperties structure

Bits which can be set in VkVideoSessionCreateInfoKHR::flags are:

// Provided by VK_KHR_video_queue
typedef enum VkVideoSessionCreateFlagBitsKHR {
    VK_VIDEO_SESSION_CREATE_PROTECTED_CONTENT_BIT_KHR = 0x00000001,
  // Provided by VK_KHR_video_encode_queue
    VK_VIDEO_SESSION_CREATE_ALLOW_ENCODE_PARAMETER_OPTIMIZATIONS_BIT_KHR = 0x00000002,
  // Provided by VK_KHR_video_maintenance1
    VK_VIDEO_SESSION_CREATE_INLINE_QUERIES_BIT_KHR = 0x00000004,
  // Provided by VK_KHR_video_encode_quantization_map
    VK_VIDEO_SESSION_CREATE_ALLOW_ENCODE_QUANTIZATION_DELTA_MAP_BIT_KHR = 0x00000008,
  // Provided by VK_KHR_video_encode_quantization_map
    VK_VIDEO_SESSION_CREATE_ALLOW_ENCODE_EMPHASIS_MAP_BIT_KHR = 0x00000010,
} VkVideoSessionCreateFlagBitsKHR;
  • VK_VIDEO_SESSION_CREATE_PROTECTED_CONTENT_BIT_KHR specifies that the video session uses protected video content.

  • VK_VIDEO_SESSION_CREATE_ALLOW_ENCODE_PARAMETER_OPTIMIZATIONS_BIT_KHR specifies that the implementation is allowed to override video session parameters and other codec-specific encoding parameters to optimize video encode operations based on the use case information specified in the video profile and the used video encode quality level.

    Not specifying VK_VIDEO_SESSION_CREATE_ALLOW_ENCODE_PARAMETER_OPTIMIZATIONS_BIT_KHR does not guarantee that the implementation will not do any codec-specific parameter overrides, as certain overrides are necessary for the correct operation of the video encoder implementation due to limitations to the available encoding tools on that implementation. This flag, however, enables the implementation to apply further optimizing overrides.

  • VK_VIDEO_SESSION_CREATE_INLINE_QUERIES_BIT_KHR specifies that queries within video coding scopes using the created video session are executed inline with video coding operations.

  • VK_VIDEO_SESSION_CREATE_ALLOW_ENCODE_QUANTIZATION_DELTA_MAP_BIT_KHR specifies that the video session can be used to encode pictures with quantization delta maps.

  • VK_VIDEO_SESSION_CREATE_ALLOW_ENCODE_EMPHASIS_MAP_BIT_KHR specifies that the video session can be used to encode pictures with emphasis maps.

// Provided by VK_KHR_video_queue
typedef VkFlags VkVideoSessionCreateFlagsKHR;

VkVideoSessionCreateFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoSessionCreateFlagBitsKHR.

Destroying a Video Session

To destroy a video session, call:

// Provided by VK_KHR_video_queue
void vkDestroyVideoSessionKHR(
    VkDevice                                    device,
    VkVideoSessionKHR                           videoSession,
    const VkAllocationCallbacks*                pAllocator);
  • device is the logical device that destroys the video session.

  • videoSession is the video session to destroy.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

Valid Usage
  • VUID-vkDestroyVideoSessionKHR-videoSession-07192
    All submitted commands that refer to videoSession must have completed execution

  • VUID-vkDestroyVideoSessionKHR-videoSession-07193
    If VkAllocationCallbacks were provided when videoSession was created, a compatible set of callbacks must be provided here

  • VUID-vkDestroyVideoSessionKHR-videoSession-07194
    If no VkAllocationCallbacks were provided when videoSession was created, pAllocator must be NULL

Valid Usage (Implicit)
  • VUID-vkDestroyVideoSessionKHR-device-parameter
    device must be a valid VkDevice handle

  • VUID-vkDestroyVideoSessionKHR-videoSession-parameter
    If videoSession is not VK_NULL_HANDLE, videoSession must be a valid VkVideoSessionKHR handle

  • VUID-vkDestroyVideoSessionKHR-pAllocator-parameter
    If pAllocator is not NULL, pAllocator must be a valid pointer to a valid VkAllocationCallbacks structure

  • VUID-vkDestroyVideoSessionKHR-videoSession-parent
    If videoSession is a valid handle, it must have been created, allocated, or retrieved from device

Host Synchronization
  • Host access to videoSession must be externally synchronized

Video Session Memory Association

After creating a video session object, and before the object can be used to record video coding operations into command buffers using it, the application must allocate and bind device memory to the video session. Device memory is allocated separately (see Device Memory) and then associated with the video session.

Video sessions may have multiple memory bindings identified by unique unsigned integer values. Appropriate device memory must be bound to each such memory binding before using the video session to record command buffer commands with it.

To determine the memory requirements for a video session object, call:

// Provided by VK_KHR_video_queue
VkResult vkGetVideoSessionMemoryRequirementsKHR(
    VkDevice                                    device,
    VkVideoSessionKHR                           videoSession,
    uint32_t*                                   pMemoryRequirementsCount,
    VkVideoSessionMemoryRequirementsKHR*        pMemoryRequirements);
  • device is the logical device that owns the video session.

  • videoSession is the video session to query.

  • pMemoryRequirementsCount is a pointer to an integer related to the number of memory binding requirements available or queried, as described below.

  • pMemoryRequirements is NULL or a pointer to an array of VkVideoSessionMemoryRequirementsKHR structures in which the memory binding requirements of the video session are returned.

If pMemoryRequirements is NULL, then the number of memory bindings required for the video session is returned in pMemoryRequirementsCount. Otherwise, pMemoryRequirementsCount must point to a variable set by the application to the number of elements in the pMemoryRequirements array, and on return the variable is overwritten with the number of memory binding requirements actually written to pMemoryRequirements. If pMemoryRequirementsCount is less than the number of memory bindings required for the video session, then at most pMemoryRequirementsCount elements will be written to pMemoryRequirements, and VK_INCOMPLETE will be returned, instead of VK_SUCCESS, to indicate that not all required memory binding requirements were returned.

Valid Usage (Implicit)
  • VUID-vkGetVideoSessionMemoryRequirementsKHR-device-parameter
    device must be a valid VkDevice handle

  • VUID-vkGetVideoSessionMemoryRequirementsKHR-videoSession-parameter
    videoSession must be a valid VkVideoSessionKHR handle

  • VUID-vkGetVideoSessionMemoryRequirementsKHR-pMemoryRequirementsCount-parameter
    pMemoryRequirementsCount must be a valid pointer to a uint32_t value

  • VUID-vkGetVideoSessionMemoryRequirementsKHR-pMemoryRequirements-parameter
    If the value referenced by pMemoryRequirementsCount is not 0, and pMemoryRequirements is not NULL, pMemoryRequirements must be a valid pointer to an array of pMemoryRequirementsCount VkVideoSessionMemoryRequirementsKHR structures

  • VUID-vkGetVideoSessionMemoryRequirementsKHR-videoSession-parent
    videoSession must have been created, allocated, or retrieved from device

Return Codes
Success
  • VK_SUCCESS

  • VK_INCOMPLETE

Failure

None

The VkVideoSessionMemoryRequirementsKHR structure is defined as:

// Provided by VK_KHR_video_queue
typedef struct VkVideoSessionMemoryRequirementsKHR {
    VkStructureType         sType;
    void*                   pNext;
    uint32_t                memoryBindIndex;
    VkMemoryRequirements    memoryRequirements;
} VkVideoSessionMemoryRequirementsKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • memoryBindIndex is the index of the memory binding.

  • memoryRequirements is a VkMemoryRequirements structure in which the requested memory binding requirements for the binding index specified by memoryBindIndex are returned.

Valid Usage (Implicit)
  • VUID-VkVideoSessionMemoryRequirementsKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_SESSION_MEMORY_REQUIREMENTS_KHR

  • VUID-VkVideoSessionMemoryRequirementsKHR-pNext-pNext
    pNext must be NULL

To attach memory to a video session object, call:

// Provided by VK_KHR_video_queue
VkResult vkBindVideoSessionMemoryKHR(
    VkDevice                                    device,
    VkVideoSessionKHR                           videoSession,
    uint32_t                                    bindSessionMemoryInfoCount,
    const VkBindVideoSessionMemoryInfoKHR*      pBindSessionMemoryInfos);
  • device is the logical device that owns the video session.

  • videoSession is the video session to be bound with device memory.

  • bindSessionMemoryInfoCount is the number of elements in pBindSessionMemoryInfos.

  • pBindSessionMemoryInfos is a pointer to an array of bindSessionMemoryInfoCount VkBindVideoSessionMemoryInfoKHR structures specifying memory regions to be bound to specific memory bindings of the video session.

The valid usage statements below refer to the VkMemoryRequirements structure corresponding to a specific element of pBindSessionMemoryInfos, which is defined as follows:

  • If the memoryBindIndex member of the element of pBindSessionMemoryInfos in question matches the memoryBindIndex member of one of the elements returned in pMemoryRequirements when vkGetVideoSessionMemoryRequirementsKHR is called with the same videoSession and with pMemoryRequirementsCount equal to bindSessionMemoryInfoCount, then the memoryRequirements member of that element of pMemoryRequirements is the VkMemoryRequirements structure corresponding to the element of pBindSessionMemoryInfos in question.

  • Otherwise the element of pBindSessionMemoryInfos in question is said to not have a corresponding VkMemoryRequirements structure.

Valid Usage
  • VUID-vkBindVideoSessionMemoryKHR-videoSession-07195
    The memory binding of videoSession identified by the memoryBindIndex member of any element of pBindSessionMemoryInfos must not already be backed by a memory object

  • VUID-vkBindVideoSessionMemoryKHR-memoryBindIndex-07196
    The memoryBindIndex member of each element of pBindSessionMemoryInfos must be unique within pBindSessionMemoryInfos

  • VUID-vkBindVideoSessionMemoryKHR-pBindSessionMemoryInfos-07197
    Each element of pBindSessionMemoryInfos must have a corresponding VkMemoryRequirements structure

  • VUID-vkBindVideoSessionMemoryKHR-pBindSessionMemoryInfos-07198
    If an element of pBindSessionMemoryInfos has a corresponding VkMemoryRequirements structure, then the memory member of that element of pBindSessionMemoryInfos must have been allocated using one of the memory types allowed in the memoryTypeBits member of the corresponding VkMemoryRequirements structure

  • VUID-vkBindVideoSessionMemoryKHR-pBindSessionMemoryInfos-07199
    If an element of pBindSessionMemoryInfos has a corresponding VkMemoryRequirements structure, then the memoryOffset member of that element of pBindSessionMemoryInfos must be an integer multiple of the alignment member of the corresponding VkMemoryRequirements structure

  • VUID-vkBindVideoSessionMemoryKHR-pBindSessionMemoryInfos-07200
    If an element of pBindSessionMemoryInfos has a corresponding VkMemoryRequirements structure, then the memorySize member of that element of pBindSessionMemoryInfos must equal the size member of the corresponding VkMemoryRequirements structure

Valid Usage (Implicit)
  • VUID-vkBindVideoSessionMemoryKHR-device-parameter
    device must be a valid VkDevice handle

  • VUID-vkBindVideoSessionMemoryKHR-videoSession-parameter
    videoSession must be a valid VkVideoSessionKHR handle

  • VUID-vkBindVideoSessionMemoryKHR-pBindSessionMemoryInfos-parameter
    pBindSessionMemoryInfos must be a valid pointer to an array of bindSessionMemoryInfoCount valid VkBindVideoSessionMemoryInfoKHR structures

  • VUID-vkBindVideoSessionMemoryKHR-bindSessionMemoryInfoCount-arraylength
    bindSessionMemoryInfoCount must be greater than 0

  • VUID-vkBindVideoSessionMemoryKHR-videoSession-parent
    videoSession must have been created, allocated, or retrieved from device

Host Synchronization
  • Host access to videoSession must be externally synchronized

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

The VkBindVideoSessionMemoryInfoKHR structure is defined as:

// Provided by VK_KHR_video_queue
typedef struct VkBindVideoSessionMemoryInfoKHR {
    VkStructureType    sType;
    const void*        pNext;
    uint32_t           memoryBindIndex;
    VkDeviceMemory     memory;
    VkDeviceSize       memoryOffset;
    VkDeviceSize       memorySize;
} VkBindVideoSessionMemoryInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • memoryBindIndex is the memory binding index to bind memory to.

  • memory is the allocated device memory to be bound to the video session’s memory binding with index memoryBindIndex.

  • memoryOffset is the start offset of the region of memory which is to be bound.

  • memorySize is the size in bytes of the region of memory, starting from memoryOffset bytes, to be bound.

Valid Usage
  • VUID-VkBindVideoSessionMemoryInfoKHR-memoryOffset-07201
    memoryOffset must be less than the size of memory

  • VUID-VkBindVideoSessionMemoryInfoKHR-memorySize-07202
    memorySize must be less than or equal to the size of memory minus memoryOffset

Valid Usage (Implicit)
  • VUID-VkBindVideoSessionMemoryInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_BIND_VIDEO_SESSION_MEMORY_INFO_KHR

  • VUID-VkBindVideoSessionMemoryInfoKHR-pNext-pNext
    pNext must be NULL

  • VUID-VkBindVideoSessionMemoryInfoKHR-memory-parameter
    memory must be a valid VkDeviceMemory handle

Video Profile Compatibility

Resources and query pools used with a particular video session must be compatible with the video profile the video session was created with.

A VkBuffer is compatible with a video profile if it was created with the VkBufferCreateInfo::pNext chain including a VkVideoProfileListInfoKHR structure with its pProfiles member containing an element matching the VkVideoProfileInfoKHR structure chain describing the video profile, and VkBufferCreateInfo::usage including at least one bit specific to video coding usage.

  • VK_BUFFER_USAGE_VIDEO_DECODE_SRC_BIT_KHR

  • VK_BUFFER_USAGE_VIDEO_DECODE_DST_BIT_KHR

  • VK_BUFFER_USAGE_VIDEO_ENCODE_SRC_BIT_KHR

  • VK_BUFFER_USAGE_VIDEO_ENCODE_DST_BIT_KHR

A VkBuffer is also compatible with a video profile if it was created with VkBufferCreateInfo::flags including VK_BUFFER_CREATE_VIDEO_PROFILE_INDEPENDENT_BIT_KHR.

A VkImage is compatible with a video profile if it was created with the VkImageCreateInfo::pNext chain including a VkVideoProfileListInfoKHR structure with its pProfiles member containing an element matching the VkVideoProfileInfoKHR structure chain describing the video profile, and VkImageCreateInfo::usage including at least one bit specific to video coding usage.

  • VK_IMAGE_USAGE_VIDEO_DECODE_SRC_BIT_KHR

  • VK_IMAGE_USAGE_VIDEO_DECODE_DST_BIT_KHR

  • VK_IMAGE_USAGE_VIDEO_DECODE_DPB_BIT_KHR

  • VK_IMAGE_USAGE_VIDEO_ENCODE_SRC_BIT_KHR

  • VK_IMAGE_USAGE_VIDEO_ENCODE_DST_BIT_KHR

  • VK_IMAGE_USAGE_VIDEO_ENCODE_DPB_BIT_KHR

  • VK_IMAGE_USAGE_VIDEO_ENCODE_QUANTIZATION_DELTA_MAP_BIT_KHR

  • VK_IMAGE_USAGE_VIDEO_ENCODE_EMPHASIS_MAP_BIT_KHR

A VkImage is also compatible with a video profile if all of the following conditions are true for the VkImageCreateInfo structure the image was created with:

While some of these rules allow creating buffer or image resources that may be compatible with any video profile, applications should still prefer to include the specific video profiles the buffer or image resource is expected to be used with (through a VkVideoProfileListInfoKHR structure included in the pNext chain of the corresponding create info structure) whenever the information about the complete set of video profiles is available at resource creation time, to enable the implementation to optimize the created resource for the specific use case. In the absence of that information, the implementation may have to make conservative decisions about the memory requirements or representation of the resource.

A VkImageView is compatible with a video profile if the VkImage it was created from is also compatible with that video profile.

A VkQueryPool is compatible with a video profile if it was created with the VkQueryPoolCreateInfo::pNext chain including a VkVideoProfileInfoKHR structure chain describing the same video profile, and VkQueryPoolCreateInfo::queryType having one of the following values:

  • VK_QUERY_TYPE_RESULT_STATUS_ONLY_KHR

  • VK_QUERY_TYPE_VIDEO_ENCODE_FEEDBACK_KHR

Video Session Parameters

Video session parameters objects can store preprocessed codec-specific parameters used with a compatible video session, and enable reducing the number of parameters needed to be provided and processed by the implementation while recording video coding operations into command buffers.

Parameters stored in such objects are immutable to facilitate the concurrent use of the stored parameters in multiple threads. At the same time, new parameters can be added to existing objects using the vkUpdateVideoSessionParametersKHR command.

In order to support concurrent use of the stored immutable parameters while also allowing the video session parameters object to be extended with new parameters, each video session parameters object maintains an update sequence counter that is set to 0 at object creation time and must be incremented by each subsequent update operation.

Certain video sequences that adhere to particular video compression standards permit updating previously supplied parameters. If a parameter update is necessary, the application has the following options:

  • Cache the set of parameters on the application side and create a new video session parameters object adding all the parameters with appropriate changes, as necessary; or

  • Create a new video session parameters object providing only the updated parameters and the previously used object as the template, which ensures that parameters not specified at creation time will be copied unmodified from the template object.

The actual types of parameters that can be stored and the capacity for individual parameter types, and the methods of initializing, updating, and referring to individual parameters are specific to the video codec operation the video session parameters object was created with.

Video session parameters objects created with an encode operation are further specialized based on the video encode quality level the video session parameters are used with, as implementations may apply different sets of parameter overrides depending on the used quality level. This enables implementations to store the potentially optimized set of parameters in these objects, further limiting the necessary processing required while recording video encode operations into command buffers.

Video session parameters are represented by VkVideoSessionParametersKHR handles:

// Provided by VK_KHR_video_queue
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkVideoSessionParametersKHR)

Creating Video Session Parameters

To create a video session parameters object, call:

// Provided by VK_KHR_video_queue
VkResult vkCreateVideoSessionParametersKHR(
    VkDevice                                    device,
    const VkVideoSessionParametersCreateInfoKHR* pCreateInfo,
    const VkAllocationCallbacks*                pAllocator,
    VkVideoSessionParametersKHR*                pVideoSessionParameters);
  • device is the logical device that creates the video session parameters object.

  • pCreateInfo is a pointer to VkVideoSessionParametersCreateInfoKHR structure containing parameters to be used to create the video session parameters object.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

  • pVideoSessionParameters is a pointer to a VkVideoSessionParametersKHR handle in which the resulting video session parameters object is returned.

The resulting video session parameters object is said to be created with the video codec operation pCreateInfo->videoSession was created with.

Video session parameters objects created with an encode operation are always created with respect to a video encode quality level. By default, the created video session parameters objects are created with quality level zero, unless otherwise specified by including a VkVideoEncodeQualityLevelInfoKHR structure in the pCreateInfo->pNext chain, in which case the video session parameters object is created with the quality level specified in VkVideoEncodeQualityLevelInfoKHR::qualityLevel.

If pCreateInfo->videoSessionParametersTemplate is not VK_NULL_HANDLE, then it will be used as a template for constructing the new video session parameters object. This happens by first adding any parameters according to the additional creation parameters provided in the pCreateInfo->pNext chain, followed by adding any parameters from the template object that have a key that does not match the key of any of the already added parameters.

For video session parameters objects created with an encode operation, the template object specified in pCreateInfo->videoSessionParametersTemplate must have been created with the same video encode quality level as the newly created object.

This means that codec-specific parameters stored in video session parameters objects can only be reused across different video encode quality levels by re-specifying them, as previously created video session parameters against other quality levels cannot be used as template because the original codec-specific parameters (before the implementation may have applied parameter overrides) may no longer be available in them for the purposes of constructing the derived object.

Video session parameters objects are only compatible with quantization maps if they are created with pCreateInfo->flags including VK_VIDEO_SESSION_PARAMETERS_CREATE_QUANTIZATION_MAP_COMPATIBLE_BIT_KHR.

Video session parameters objects created with VK_VIDEO_SESSION_PARAMETERS_CREATE_QUANTIZATION_MAP_COMPATIBLE_BIT_KHR against a video session object that was created with VK_VIDEO_SESSION_CREATE_ALLOW_ENCODE_QUANTIZATION_DELTA_MAP_BIT_KHR or VK_VIDEO_SESSION_CREATE_ALLOW_ENCODE_EMPHASIS_MAP_BIT_KHR are created with a specific compatible quantization map texel size specified in the quantizationMapTexelSize member of the VkVideoEncodeQuantizationMapSessionParametersCreateInfoKHR structure included in the pNext chain of pCreateInfo.

This means that the quantization map texel size that such a video session parameters object is compatible with is fixed for the lifetime of the object. Applications have to create separate video session parameters objects to use different quantization map texel sizes with a single video session object. This is necessary because the used quantization map texel size may affect the parameter overrides the implementation has to perform and thus the final values of the used codec-specific parameters.

For video session parameters objects created with VK_VIDEO_SESSION_PARAMETERS_CREATE_QUANTIZATION_MAP_COMPATIBLE_BIT_KHR, the template object specified in pCreateInfo->videoSessionParametersTemplate must also have been created with VK_VIDEO_SESSION_PARAMETERS_CREATE_QUANTIZATION_MAP_COMPATIBLE_BIT_KHR and the same compatible quantization map texel size specified in VkVideoEncodeQuantizationMapSessionParametersCreateInfoKHR::quantizationMapTexelSize.

This means that codec-specific parameters stored in video session parameters objects can only be reused with different quantization map texel sizes by re-specifying them, as previously created video session parameters against other quantization map texel sizes cannot be used as template because the original codec-specific parameters (before the implementation may have applied parameter overrides) may no longer be available in them for the purposes of constructing the derived object.

For video session parameters objects created without VK_VIDEO_SESSION_PARAMETERS_CREATE_QUANTIZATION_MAP_COMPATIBLE_BIT_KHR, the template object specified in pCreateInfo->videoSessionParametersTemplate must also have been created without VK_VIDEO_SESSION_PARAMETERS_CREATE_QUANTIZATION_MAP_COMPATIBLE_BIT_KHR.

If pCreateInfo->videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR, then the created video session parameters object will initially contain the following sets of parameter entries:

  • StdVideoH264SequenceParameterSet structures representing H.264 SPS entries, as follows:

    • If the pParametersAddInfo member of the VkVideoDecodeH264SessionParametersCreateInfoKHR structure provided in the pCreateInfo->pNext chain is not NULL, then the set of StdVideoH264SequenceParameterSet entries specified in pParametersAddInfo->pStdSPSs are added first;

    • If pCreateInfo->videoSessionParametersTemplate is not VK_NULL_HANDLE, then each StdVideoH264SequenceParameterSet entry stored in it is copied to the created video session parameters object if the created object does not already contain such an entry with the same seq_parameter_set_id.

  • StdVideoH264PictureParameterSet structures representing H.264 PPS entries, as follows:

    • If the pParametersAddInfo member of the VkVideoDecodeH264SessionParametersCreateInfoKHR structure provided in the pCreateInfo->pNext chain is not NULL, then the set of StdVideoH264PictureParameterSet entries specified in pParametersAddInfo->pStdPPSs are added first;

    • If pCreateInfo->videoSessionParametersTemplate is not VK_NULL_HANDLE, then each StdVideoH264PictureParameterSet entry stored in it is copied to the created video session parameters object if the created object does not already contain such an entry with the same seq_parameter_set_id and pic_parameter_set_id.

If pCreateInfo->videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR, then the created video session parameters object will initially contain the following sets of parameter entries:

  • StdVideoH265VideoParameterSet structures representing H.265 VPS entries, as follows:

    • If the pParametersAddInfo member of the VkVideoDecodeH265SessionParametersCreateInfoKHR structure provided in the pCreateInfo->pNext chain is not NULL, then the set of StdVideoH265VideoParameterSet entries specified in pParametersAddInfo->pStdVPSs are added first;

    • If pCreateInfo->videoSessionParametersTemplate is not VK_NULL_HANDLE, then each StdVideoH265VideoParameterSet entry stored in it is copied to the created video session parameters object if the created object does not already contain such an entry with the same vps_video_parameter_set_id.

  • StdVideoH265SequenceParameterSet structures representing H.265 SPS entries, as follows:

    • If the pParametersAddInfo member of the VkVideoDecodeH265SessionParametersCreateInfoKHR structure provided in the pCreateInfo->pNext chain is not NULL, then the set of StdVideoH265SequenceParameterSet entries specified in pParametersAddInfo->pStdSPSs are added first;

    • If pCreateInfo->videoSessionParametersTemplate is not VK_NULL_HANDLE, then each StdVideoH265SequenceParameterSet entry stored in it is copied to the created video session parameters object if the created object does not already contain such an entry with the same sps_video_parameter_set_id and sps_seq_parameter_set_id.

  • StdVideoH265PictureParameterSet structures representing H.265 PPS entries, as follows:

    • If the pParametersAddInfo member of the VkVideoDecodeH265SessionParametersCreateInfoKHR structure provided in the pCreateInfo->pNext chain is not NULL, then the set of StdVideoH265PictureParameterSet entries specified in pParametersAddInfo->pStdPPSs are added first;

    • If pCreateInfo->videoSessionParametersTemplate is not VK_NULL_HANDLE, then each StdVideoH265PictureParameterSet entry stored in it is copied to the created video session parameters object if the created object does not already contain such an entry with the same sps_video_parameter_set_id, pps_seq_parameter_set_id, and pps_pic_parameter_set_id.

If pCreateInfo->videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_AV1_BIT_KHR, then the created video session parameters object will contain a single AV1 sequence header represented by a StdVideoAV1SequenceHeader structure specified through the pStdSequenceHeader member of the VkVideoDecodeAV1SessionParametersCreateInfoKHR structure provided in the pCreateInfo->pNext chain. As such video session parameters objects can only contain a single AV1 sequence header, it is not possible to use a previously created object as a template or subsequently update the created video session parameters object.

If pCreateInfo->videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR, then the created video session parameters object will initially contain the following sets of parameter entries:

  • StdVideoH264SequenceParameterSet structures representing H.264 SPS entries, as follows:

    • If the pParametersAddInfo member of the VkVideoEncodeH264SessionParametersCreateInfoKHR structure provided in the pCreateInfo->pNext chain is not NULL, then the set of StdVideoH264SequenceParameterSet entries specified in pParametersAddInfo->pStdSPSs are added first;

    • If pCreateInfo->videoSessionParametersTemplate is not VK_NULL_HANDLE, then each StdVideoH264SequenceParameterSet entry stored in it is copied to the created video session parameters object if the created object does not already contain such an entry with the same seq_parameter_set_id.

  • StdVideoH264PictureParameterSet structures representing H.264 PPS entries, as follows:

    • If the pParametersAddInfo member of the VkVideoEncodeH264SessionParametersCreateInfoKHR structure provided in the pCreateInfo->pNext chain is not NULL, then the set of StdVideoH264PictureParameterSet entries specified in pParametersAddInfo->pStdPPSs are added first;

    • If pCreateInfo->videoSessionParametersTemplate is not VK_NULL_HANDLE, then each StdVideoH264PictureParameterSet entry stored in it is copied to the created video session parameters object if the created object does not already contain such an entry with the same seq_parameter_set_id and pic_parameter_set_id.

If pCreateInfo->videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, then the created video session parameters object will initially contain the following sets of parameter entries:

  • StdVideoH265VideoParameterSet structures representing H.265 VPS entries, as follows:

    • If the pParametersAddInfo member of the VkVideoEncodeH265SessionParametersCreateInfoKHR structure provided in the pCreateInfo->pNext chain is not NULL, then the set of StdVideoH265VideoParameterSet entries specified in pParametersAddInfo->pStdVPSs are added first;

    • If pCreateInfo->videoSessionParametersTemplate is not VK_NULL_HANDLE, then each StdVideoH265VideoParameterSet entry stored in it is copied to the created video session parameters object if the created object does not already contain such an entry with the same vps_video_parameter_set_id.

  • StdVideoH265SequenceParameterSet structures representing H.265 SPS entries, as follows:

    • If the pParametersAddInfo member of the VkVideoEncodeH265SessionParametersCreateInfoKHR structure provided in the pCreateInfo->pNext chain is not NULL, then the set of StdVideoH265SequenceParameterSet entries specified in pParametersAddInfo->pStdSPSs are added first;

    • If pCreateInfo->videoSessionParametersTemplate is not VK_NULL_HANDLE, then each StdVideoH265SequenceParameterSet entry stored in it is copied to the created video session parameters object if the created object does not already contain such an entry with the same sps_video_parameter_set_id and sps_seq_parameter_set_id.

  • StdVideoH265PictureParameterSet structures representing H.265 PPS entries, as follows:

    • If the pParametersAddInfo member of the VkVideoEncodeH265SessionParametersCreateInfoKHR structure provided in the pCreateInfo->pNext chain is not NULL, then the set of StdVideoH265PictureParameterSet entries specified in pParametersAddInfo->pStdPPSs are added first;

    • If pCreateInfo->videoSessionParametersTemplate is not VK_NULL_HANDLE, then each StdVideoH265PictureParameterSet entry stored in it is copied to the created video session parameters object if the created object does not already contain such an entry with the same sps_video_parameter_set_id, pps_seq_parameter_set_id, and pps_pic_parameter_set_id.

If pCreateInfo->videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR, then the created video session parameters object will contain a single AV1 sequence header specified through the members of the VkVideoEncodeAV1SessionParametersCreateInfoKHR structure provided in the pCreateInfo->pNext chain. As such video session parameters objects can only contain a single AV1 sequence header, it is not possible to use a previously created object as a template or subsequently update the created video session parameters object.

In case of video session parameters objects created with a video encode operation, implementations may return the VK_ERROR_INVALID_VIDEO_STD_PARAMETERS_KHR error if any of the specified Video Std parameters do not adhere to the syntactic or semantic requirements of the used video compression standard, or if values derived from parameters according to the rules defined by the used video compression standard do not adhere to the capabilities of the video compression standard or the implementation.

Applications should not rely on the VK_ERROR_INVALID_VIDEO_STD_PARAMETERS_KHR error being returned by any command as a means to verify Video Std parameters, as implementations are not required to report the error in any specific set of cases.

Valid Usage (Implicit)
  • VUID-vkCreateVideoSessionParametersKHR-device-parameter
    device must be a valid VkDevice handle

  • VUID-vkCreateVideoSessionParametersKHR-pCreateInfo-parameter
    pCreateInfo must be a valid pointer to a valid VkVideoSessionParametersCreateInfoKHR structure

  • VUID-vkCreateVideoSessionParametersKHR-pAllocator-parameter
    If pAllocator is not NULL, pAllocator must be a valid pointer to a valid VkAllocationCallbacks structure

  • VUID-vkCreateVideoSessionParametersKHR-pVideoSessionParameters-parameter
    pVideoSessionParameters must be a valid pointer to a VkVideoSessionParametersKHR handle

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

  • VK_ERROR_INITIALIZATION_FAILED

  • VK_ERROR_INVALID_VIDEO_STD_PARAMETERS_KHR

The VkVideoSessionParametersCreateInfoKHR structure is defined as:

// Provided by VK_KHR_video_queue
typedef struct VkVideoSessionParametersCreateInfoKHR {
    VkStructureType                           sType;
    const void*                               pNext;
    VkVideoSessionParametersCreateFlagsKHR    flags;
    VkVideoSessionParametersKHR               videoSessionParametersTemplate;
    VkVideoSessionKHR                         videoSession;
} VkVideoSessionParametersCreateInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • flags is a bitmask of VkVideoSessionParametersCreateFlagBitsKHR specifying create flags.

  • videoSessionParametersTemplate is VK_NULL_HANDLE or a valid handle to a VkVideoSessionParametersKHR object used as a template for constructing the new video session parameters object.

  • videoSession is the video session object against which the video session parameters object is going to be created.

Limiting values are defined below that are referenced by the relevant valid usage statements of this structure.

  • If videoSession was created with the codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR, then let StdVideoH264SequenceParameterSet spsAddList[] be the list of H.264 SPS entries to add to the created video session parameters object, defined as follows:

    • If the pParametersAddInfo member of the VkVideoDecodeH264SessionParametersCreateInfoKHR structure provided in the pNext chain is not NULL, then the set of StdVideoH264SequenceParameterSet entries specified in pParametersAddInfo->pStdSPSs are added to spsAddList;

    • If videoSessionParametersTemplate is not VK_NULL_HANDLE, then each StdVideoH264SequenceParameterSet entry stored in it with seq_parameter_set_id not matching any of the entries already in spsAddList is added to spsAddList.

  • If videoSession was created with the codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR, then let StdVideoH264PictureParameterSet ppsAddList[] be the list of H.264 PPS entries to add to the created video session parameters object, defined as follows:

    • If the pParametersAddInfo member of the VkVideoDecodeH264SessionParametersCreateInfoKHR structure provided in the pNext chain is not NULL, then the set of StdVideoH264PictureParameterSet entries specified in pParametersAddInfo->pStdPPSs are added to ppsAddList;

    • If videoSessionParametersTemplate is not VK_NULL_HANDLE, then each StdVideoH264PictureParameterSet entry stored in it with seq_parameter_set_id or pic_parameter_set_id not matching any of the entries already in ppsAddList is added to ppsAddList.

  • If videoSession was created with the codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR, then let StdVideoH265VideoParameterSet vpsAddList[] be the list of H.265 VPS entries to add to the created video session parameters object, defined as follows:

    • If the pParametersAddInfo member of the VkVideoDecodeH265SessionParametersCreateInfoKHR structure provided in the pNext chain is not NULL, then the set of StdVideoH265VideoParameterSet entries specified in pParametersAddInfo->pStdVPSs are added to vpsAddList;

    • If videoSessionParametersTemplate is not VK_NULL_HANDLE, then each StdVideoH265VideoParameterSet entry stored in it with vps_video_parameter_set_id not matching any of the entries already in vpsAddList is added to vpsAddList.

  • If videoSession was created with the codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR, then let StdVideoH265SequenceParameterSet spsAddList[] be the list of H.265 SPS entries to add to the created video session parameters object, defined as follows:

    • If the pParametersAddInfo member of the VkVideoDecodeH265SessionParametersCreateInfoKHR structure provided in the pNext chain is not NULL, then the set of StdVideoH265SequenceParameterSet entries specified in pParametersAddInfo->pStdSPSs are added to spsAddList;

    • If videoSessionParametersTemplate is not VK_NULL_HANDLE, then each StdVideoH265SequenceParameterSet entry stored in it with sps_video_parameter_set_id or sps_seq_parameter_set_id not matching any of the entries already in spsAddList is added to spsAddList.

  • If videoSession was created with the codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR, then let StdVideoH265PictureParameterSet ppsAddList[] be the list of H.265 PPS entries to add to the created video session parameters object, defined as follows:

    • If the pParametersAddInfo member of the VkVideoDecodeH265SessionParametersCreateInfoKHR structure provided in the pNext chain is not NULL, then the set of StdVideoH265PictureParameterSet entries specified in pParametersAddInfo->pStdPPSs are added to ppsAddList;

    • If videoSessionParametersTemplate is not VK_NULL_HANDLE, then each StdVideoH265PictureParameterSet entry stored in it with sps_video_parameter_set_id, pps_seq_parameter_set_id, or pps_pic_parameter_set_id not matching any of the entries already in ppsAddList is added to ppsAddList.

  • If videoSession was created with an encode operation, then let uint32_t qualityLevel be the video encode quality level of the created video session parameters object, defined as follows:

  • If videoSession was created with the codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR, then let StdVideoH264SequenceParameterSet spsAddList[] be the list of H.264 SPS entries to add to the created video session parameters object, defined as follows:

    • If the pParametersAddInfo member of the VkVideoEncodeH264SessionParametersCreateInfoKHR structure provided in the pNext chain is not NULL, then the set of StdVideoH264SequenceParameterSet entries specified in pParametersAddInfo->pStdSPSs are added to spsAddList;

    • If videoSessionParametersTemplate is not VK_NULL_HANDLE, then each StdVideoH264SequenceParameterSet entry stored in it with seq_parameter_set_id not matching any of the entries already in spsAddList is added to spsAddList.

  • If videoSession was created with the codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR, then let StdVideoH264PictureParameterSet ppsAddList[] be the list of H.264 PPS entries to add to the created video session parameters object, defined as follows:

    • If the pParametersAddInfo member of the VkVideoEncodeH264SessionParametersCreateInfoKHR structure provided in the pNext chain is not NULL, then the set of StdVideoH264PictureParameterSet entries specified in pParametersAddInfo->pStdPPSs are added to ppsAddList;

    • If videoSessionParametersTemplate is not VK_NULL_HANDLE, then each StdVideoH264PictureParameterSet entry stored in it with seq_parameter_set_id or pic_parameter_set_id not matching any of the entries already in ppsAddList is added to ppsAddList.

  • If videoSession was created with the codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, then let StdVideoH265VideoParameterSet vpsAddList[] be the list of H.265 VPS entries to add to the created video session parameters object, defined as follows:

    • If the pParametersAddInfo member of the VkVideoEncodeH265SessionParametersCreateInfoKHR structure provided in the pNext chain is not NULL, then the set of StdVideoH265VideoParameterSet entries specified in pParametersAddInfo->pStdVPSs are added to vpsAddList;

    • If videoSessionParametersTemplate is not VK_NULL_HANDLE, then each StdVideoH265VideoParameterSet entry stored in it with vps_video_parameter_set_id not matching any of the entries already in vpsAddList is added to vpsAddList.

  • If videoSession was created with the codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, then let StdVideoH265SequenceParameterSet spsAddList[] be the list of H.265 SPS entries to add to the created video session parameters object, defined as follows:

    • If the pParametersAddInfo member of the VkVideoEncodeH265SessionParametersCreateInfoKHR structure provided in the pNext chain is not NULL, then the set of StdVideoH265SequenceParameterSet entries specified in pParametersAddInfo->pStdSPSs are added to spsAddList;

    • If videoSessionParametersTemplate is not VK_NULL_HANDLE, then each StdVideoH265SequenceParameterSet entry stored in it with sps_video_parameter_set_id or sps_seq_parameter_set_id not matching any of the entries already in spsAddList is added to spsAddList.

  • If videoSession was created with the codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, then let StdVideoH265PictureParameterSet ppsAddList[] be the list of H.265 PPS entries to add to the created video session parameters object, defined as follows:

    • If the pParametersAddInfo member of the VkVideoEncodeH265SessionParametersCreateInfoKHR structure provided in the pNext chain is not NULL, then the set of StdVideoH265PictureParameterSet entries specified in pParametersAddInfo->pStdPPSs are added to ppsAddList;

    • If videoSessionParametersTemplate is not VK_NULL_HANDLE, then each StdVideoH265PictureParameterSet entry stored in it with sps_video_parameter_set_id, pps_seq_parameter_set_id, or pps_pic_parameter_set_id not matching any of the entries already in ppsAddList is added to ppsAddList.

Valid Usage
  • VUID-VkVideoSessionParametersCreateInfoKHR-videoSessionParametersTemplate-04855
    If videoSessionParametersTemplate is not VK_NULL_HANDLE, it must have been created against videoSession

  • VUID-VkVideoSessionParametersCreateInfoKHR-videoSessionParametersTemplate-08310
    If videoSessionParametersTemplate is not VK_NULL_HANDLE and videoSession was created with an encode operation, then qualityLevel must equal the video encode quality level videoSessionParametersTemplate was created with

  • VUID-VkVideoSessionParametersCreateInfoKHR-flags-10271
    If flags includes VK_VIDEO_SESSION_PARAMETERS_CREATE_QUANTIZATION_MAP_COMPATIBLE_BIT_KHR, then videoSession must have been created with VK_VIDEO_SESSION_CREATE_ALLOW_ENCODE_QUANTIZATION_DELTA_MAP_BIT_KHR or VK_VIDEO_SESSION_CREATE_ALLOW_ENCODE_EMPHASIS_MAP_BIT_KHR

  • VUID-VkVideoSessionParametersCreateInfoKHR-flags-10272
    If flags includes VK_VIDEO_SESSION_PARAMETERS_CREATE_QUANTIZATION_MAP_COMPATIBLE_BIT_KHR, then the pNext chain must include a VkVideoEncodeQuantizationMapSessionParametersCreateInfoKHR structure

  • VUID-VkVideoSessionParametersCreateInfoKHR-flags-10273
    If flags includes VK_VIDEO_SESSION_PARAMETERS_CREATE_QUANTIZATION_MAP_COMPATIBLE_BIT_KHR and videoSession was created with VK_VIDEO_SESSION_CREATE_ALLOW_ENCODE_QUANTIZATION_DELTA_MAP_BIT_KHR, then the list of video format properties supported for the image usage flag VK_IMAGE_USAGE_VIDEO_ENCODE_QUANTIZATION_DELTA_MAP_BIT_KHR must have an element for which VkVideoFormatQuantizationMapPropertiesKHR::quantizationMapTexelSize equals the quantizationMapTexelSize member of the VkVideoEncodeQuantizationMapSessionParametersCreateInfoKHR structure included in the pNext chain

  • VUID-VkVideoSessionParametersCreateInfoKHR-flags-10274
    If flags includes VK_VIDEO_SESSION_PARAMETERS_CREATE_QUANTIZATION_MAP_COMPATIBLE_BIT_KHR and videoSession was created with VK_VIDEO_SESSION_CREATE_ALLOW_ENCODE_EMPHASIS_MAP_BIT_KHR, then the list of video format properties supported for the image usage flag VK_IMAGE_USAGE_VIDEO_ENCODE_EMPHASIS_MAP_BIT_KHR must have an element for which VkVideoFormatQuantizationMapPropertiesKHR::quantizationMapTexelSize equals the quantizationMapTexelSize member of the VkVideoEncodeQuantizationMapSessionParametersCreateInfoKHR structure included in the pNext chain

  • VUID-VkVideoSessionParametersCreateInfoKHR-videoSessionParametersTemplate-10275
    If videoSessionParametersTemplate is not VK_NULL_HANDLE and flags includes VK_VIDEO_SESSION_PARAMETERS_CREATE_QUANTIZATION_MAP_COMPATIBLE_BIT_KHR, then videoSessionParametersTemplate must have been created with VK_VIDEO_SESSION_PARAMETERS_CREATE_QUANTIZATION_MAP_COMPATIBLE_BIT_KHR

  • VUID-VkVideoSessionParametersCreateInfoKHR-videoSessionParametersTemplate-10276
    If videoSessionParametersTemplate is not VK_NULL_HANDLE and flags includes VK_VIDEO_SESSION_PARAMETERS_CREATE_QUANTIZATION_MAP_COMPATIBLE_BIT_KHR, then videoSessionParametersTemplate must have been created with the same quantization map texel size as the one specified in the quantizationMapTexelSize member of the VkVideoEncodeQuantizationMapSessionParametersCreateInfoKHR structure included in the pNext chain

  • VUID-VkVideoSessionParametersCreateInfoKHR-videoSessionParametersTemplate-10277
    If videoSessionParametersTemplate is not VK_NULL_HANDLE and flags does not include VK_VIDEO_SESSION_PARAMETERS_CREATE_QUANTIZATION_MAP_COMPATIBLE_BIT_KHR, then videoSessionParametersTemplate must have been created without VK_VIDEO_SESSION_PARAMETERS_CREATE_QUANTIZATION_MAP_COMPATIBLE_BIT_KHR

  • VUID-VkVideoSessionParametersCreateInfoKHR-videoSession-07203
    If videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR, then the pNext chain must include a VkVideoDecodeH264SessionParametersCreateInfoKHR structure

  • VUID-VkVideoSessionParametersCreateInfoKHR-videoSession-07204
    If videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR, then the number of elements of spsAddList must be less than or equal to the maxStdSPSCount specified in the VkVideoDecodeH264SessionParametersCreateInfoKHR structure included in the pNext chain

  • VUID-VkVideoSessionParametersCreateInfoKHR-videoSession-07205
    If videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR, then the number of elements of ppsAddList must be less than or equal to the maxStdPPSCount specified in the VkVideoDecodeH264SessionParametersCreateInfoKHR structure included in the pNext chain

  • VUID-VkVideoSessionParametersCreateInfoKHR-videoSession-07206
    If videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR, then the pNext chain must include a VkVideoDecodeH265SessionParametersCreateInfoKHR structure

  • VUID-VkVideoSessionParametersCreateInfoKHR-videoSession-07207
    If videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR, then the number of elements of vpsAddList must be less than or equal to the maxStdVPSCount specified in the VkVideoDecodeH265SessionParametersCreateInfoKHR structure included in the pNext chain

  • VUID-VkVideoSessionParametersCreateInfoKHR-videoSession-07208
    If videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR, then the number of elements of spsAddList must be less than or equal to the maxStdSPSCount specified in the VkVideoDecodeH265SessionParametersCreateInfoKHR structure included in the pNext chain

  • VUID-VkVideoSessionParametersCreateInfoKHR-videoSession-07209
    If videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR, then the number of elements of ppsAddList must be less than or equal to the maxStdPPSCount specified in the VkVideoDecodeH265SessionParametersCreateInfoKHR structure included in the pNext chain

  • VUID-VkVideoSessionParametersCreateInfoKHR-videoSession-09258
    If videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_AV1_BIT_KHR, then videoSessionParametersTemplate must be VK_NULL_HANDLE

  • VUID-VkVideoSessionParametersCreateInfoKHR-videoSession-09259
    If videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_AV1_BIT_KHR, then the pNext chain must include a VkVideoDecodeAV1SessionParametersCreateInfoKHR structure

  • VUID-VkVideoSessionParametersCreateInfoKHR-videoSession-07210
    If videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR, then the pNext chain must include a VkVideoEncodeH264SessionParametersCreateInfoKHR structure

  • VUID-VkVideoSessionParametersCreateInfoKHR-videoSession-04839
    If videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR, then the number of elements of spsAddList must be less than or equal to the maxStdSPSCount specified in the VkVideoEncodeH264SessionParametersCreateInfoKHR structure included in the pNext chain

  • VUID-VkVideoSessionParametersCreateInfoKHR-videoSession-04840
    If videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR, then the number of elements of ppsAddList must be less than or equal to the maxStdPPSCount specified in the VkVideoEncodeH264SessionParametersCreateInfoKHR structure included in the pNext chain

  • VUID-VkVideoSessionParametersCreateInfoKHR-videoSession-07211
    If videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, then the pNext chain must include a VkVideoEncodeH265SessionParametersCreateInfoKHR structure

  • VUID-VkVideoSessionParametersCreateInfoKHR-videoSession-04841
    If videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, then the number of elements of vpsAddList must be less than or equal to the maxStdVPSCount specified in the VkVideoEncodeH265SessionParametersCreateInfoKHR structure included in the pNext chain

  • VUID-VkVideoSessionParametersCreateInfoKHR-videoSession-04842
    If videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, then the number of elements of spsAddList must be less than or equal to the maxStdSPSCount specified in the VkVideoEncodeH265SessionParametersCreateInfoKHR structure included in the pNext chain

  • VUID-VkVideoSessionParametersCreateInfoKHR-videoSession-04843
    If videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, then the number of elements of ppsAddList must be less than or equal to the maxStdPPSCount specified in the VkVideoEncodeH265SessionParametersCreateInfoKHR structure included in the pNext chain

  • VUID-VkVideoSessionParametersCreateInfoKHR-videoSession-08319
    If videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, then num_tile_columns_minus1 must be less than VkVideoEncodeH265CapabilitiesKHR::maxTiles.width, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile videoSession was created with, for each element of ppsAddList

  • VUID-VkVideoSessionParametersCreateInfoKHR-videoSession-08320
    If videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, then num_tile_rows_minus1 must be less than VkVideoEncodeH265CapabilitiesKHR::maxTiles.height, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile videoSession was created with, for each element of ppsAddList

  • VUID-VkVideoSessionParametersCreateInfoKHR-videoSession-10278
    If videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR, then videoSessionParametersTemplate must be VK_NULL_HANDLE

  • VUID-VkVideoSessionParametersCreateInfoKHR-videoSession-10279
    If videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR, then the pNext chain must include a VkVideoEncodeAV1SessionParametersCreateInfoKHR structure

  • VUID-VkVideoSessionParametersCreateInfoKHR-videoSession-10280
    If videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR, then the stdOperatingPointCount member of the VkVideoEncodeAV1SessionParametersCreateInfoKHR structure included in the pNext chain must be less than or equal to VkVideoEncodeAV1CapabilitiesKHR::maxOperatingPoints, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile videoSession was created with

Valid Usage (Implicit)

Bits which can be set in VkVideoSessionParametersCreateInfoKHR::flags are:

// Provided by VK_KHR_video_encode_quantization_map
typedef enum VkVideoSessionParametersCreateFlagBitsKHR {
  // Provided by VK_KHR_video_encode_quantization_map
    VK_VIDEO_SESSION_PARAMETERS_CREATE_QUANTIZATION_MAP_COMPATIBLE_BIT_KHR = 0x00000001,
} VkVideoSessionParametersCreateFlagBitsKHR;
  • VK_VIDEO_SESSION_PARAMETERS_CREATE_QUANTIZATION_MAP_COMPATIBLE_BIT_KHR specifies that the created video session parameters object can be used with quantization maps.

// Provided by VK_KHR_video_queue
typedef VkFlags VkVideoSessionParametersCreateFlagsKHR;

VkVideoSessionParametersCreateFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoSessionParametersCreateFlagBitsKHR.

The VkVideoEncodeQuantizationMapSessionParametersCreateInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_quantization_map
typedef struct VkVideoEncodeQuantizationMapSessionParametersCreateInfoKHR {
    VkStructureType    sType;
    const void*        pNext;
    VkExtent2D         quantizationMapTexelSize;
} VkVideoEncodeQuantizationMapSessionParametersCreateInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • quantizationMapTexelSize specifies the quantization map texel size a video session parameters object created with VK_VIDEO_SESSION_PARAMETERS_CREATE_QUANTIZATION_MAP_COMPATIBLE_BIT_KHR is compatible with.

Valid Usage (Implicit)
  • VUID-VkVideoEncodeQuantizationMapSessionParametersCreateInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_QUANTIZATION_MAP_SESSION_PARAMETERS_CREATE_INFO_KHR

Destroying Video Session Parameters

To destroy a video session parameters object, call:

// Provided by VK_KHR_video_queue
void vkDestroyVideoSessionParametersKHR(
    VkDevice                                    device,
    VkVideoSessionParametersKHR                 videoSessionParameters,
    const VkAllocationCallbacks*                pAllocator);
  • device is the logical device that destroys the video session parameters object.

  • videoSessionParameters is the video session parameters object to destroy.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

Valid Usage
  • VUID-vkDestroyVideoSessionParametersKHR-videoSessionParameters-07212
    All submitted commands that refer to videoSessionParameters must have completed execution

  • VUID-vkDestroyVideoSessionParametersKHR-videoSessionParameters-07213
    If VkAllocationCallbacks were provided when videoSessionParameters was created, a compatible set of callbacks must be provided here

  • VUID-vkDestroyVideoSessionParametersKHR-videoSessionParameters-07214
    If no VkAllocationCallbacks were provided when videoSessionParameters was created, pAllocator must be NULL

Valid Usage (Implicit)
  • VUID-vkDestroyVideoSessionParametersKHR-device-parameter
    device must be a valid VkDevice handle

  • VUID-vkDestroyVideoSessionParametersKHR-videoSessionParameters-parameter
    If videoSessionParameters is not VK_NULL_HANDLE, videoSessionParameters must be a valid VkVideoSessionParametersKHR handle

  • VUID-vkDestroyVideoSessionParametersKHR-pAllocator-parameter
    If pAllocator is not NULL, pAllocator must be a valid pointer to a valid VkAllocationCallbacks structure

  • VUID-vkDestroyVideoSessionParametersKHR-videoSessionParameters-parent
    If videoSessionParameters is a valid handle, it must have been created, allocated, or retrieved from device

Host Synchronization
  • Host access to videoSessionParameters must be externally synchronized

Updating Video Session Parameters

To update video session parameters object with new parameters, call:

// Provided by VK_KHR_video_queue
VkResult vkUpdateVideoSessionParametersKHR(
    VkDevice                                    device,
    VkVideoSessionParametersKHR                 videoSessionParameters,
    const VkVideoSessionParametersUpdateInfoKHR* pUpdateInfo);
  • device is the logical device that updates the video session parameters.

  • videoSessionParameters is the video session parameters object to update.

  • pUpdateInfo is a pointer to a VkVideoSessionParametersUpdateInfoKHR structure specifying the parameter update information.

After a successful call to this command, the update sequence counter of videoSessionParameters is changed to the value specified in pUpdateInfo->updateSequenceCount.

As each update issued to a video session parameters object needs to specify the next available update sequence count value, concurrent updates of the same video session parameters object are inherently disallowed. However, recording video coding operations to command buffers referring to parameters previously added to the video session parameters object is allowed, even if there is a concurrent update in progress adding some new entries to the object.

If videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR and the pUpdateInfo->pNext chain includes a VkVideoDecodeH264SessionParametersAddInfoKHR structure, then this command adds the following parameter entries to videoSessionParameters:

If videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR and the pUpdateInfo->pNext chain includes a VkVideoDecodeH265SessionParametersAddInfoKHR structure, then this command adds the following parameter entries to videoSessionParameters:

If videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR and the pUpdateInfo->pNext chain includes a VkVideoEncodeH264SessionParametersAddInfoKHR structure, then this command adds the following parameter entries to videoSessionParameters:

If videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR and the pUpdateInfo->pNext chain includes a VkVideoEncodeH265SessionParametersAddInfoKHR structure, then this command adds the following parameter entries to videoSessionParameters:

In case of video session parameters objects created with a video encode operation, implementations may return the VK_ERROR_INVALID_VIDEO_STD_PARAMETERS_KHR error if any of the specified Video Std parameters do not adhere to the syntactic or semantic requirements of the used video compression standard, or if values derived from parameters according to the rules defined by the used video compression standard do not adhere to the capabilities of the video compression standard or the implementation.

Applications should not rely on the VK_ERROR_INVALID_VIDEO_STD_PARAMETERS_KHR error being returned by any command as a means to verify Video Std parameters, as implementations are not required to report the error in any specific set of cases.

Valid Usage
  • VUID-vkUpdateVideoSessionParametersKHR-pUpdateInfo-07215
    pUpdateInfo->updateSequenceCount must equal the current update sequence counter of videoSessionParameters plus one

  • VUID-vkUpdateVideoSessionParametersKHR-videoSessionParameters-07216
    If videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR and the pNext chain of pUpdateInfo includes a VkVideoDecodeH264SessionParametersAddInfoKHR structure, then videoSessionParameters must not already contain a StdVideoH264SequenceParameterSet entry with seq_parameter_set_id matching any of the elements of VkVideoDecodeH264SessionParametersAddInfoKHR::pStdSPSs

  • VUID-vkUpdateVideoSessionParametersKHR-videoSessionParameters-07217
    If videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR, then the number of StdVideoH264SequenceParameterSet entries already stored in it plus the value of the stdSPSCount member of the VkVideoDecodeH264SessionParametersAddInfoKHR structure included in the pUpdateInfo->pNext chain must be less than or equal to the VkVideoDecodeH264SessionParametersCreateInfoKHR::maxStdSPSCount videoSessionParameters was created with

  • VUID-vkUpdateVideoSessionParametersKHR-videoSessionParameters-07218
    If videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR and the pNext chain of pUpdateInfo includes a VkVideoDecodeH264SessionParametersAddInfoKHR structure, then videoSessionParameters must not already contain a StdVideoH264PictureParameterSet entry with both seq_parameter_set_id and pic_parameter_set_id matching any of the elements of VkVideoDecodeH264SessionParametersAddInfoKHR::pStdPPSs

  • VUID-vkUpdateVideoSessionParametersKHR-videoSessionParameters-07219
    If videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR, then the number of StdVideoH264PictureParameterSet entries already stored in it plus the value of the stdPPSCount member of the VkVideoDecodeH264SessionParametersAddInfoKHR structure included in the pUpdateInfo->pNext chain must be less than or equal to the VkVideoDecodeH264SessionParametersCreateInfoKHR::maxStdPPSCount videoSessionParameters was created with

  • VUID-vkUpdateVideoSessionParametersKHR-videoSessionParameters-07220
    If videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR and the pNext chain of pUpdateInfo includes a VkVideoDecodeH265SessionParametersAddInfoKHR structure, then videoSessionParameters must not already contain a StdVideoH265VideoParameterSet entry with vps_video_parameter_set_id matching any of the elements of VkVideoDecodeH265SessionParametersAddInfoKHR::pStdVPSs

  • VUID-vkUpdateVideoSessionParametersKHR-videoSessionParameters-07221
    If videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR, then the number of StdVideoH265VideoParameterSet entries already stored in it plus the value of the stdVPSCount member of the VkVideoDecodeH265SessionParametersAddInfoKHR structure included in the pUpdateInfo->pNext chain must be less than or equal to the VkVideoDecodeH265SessionParametersCreateInfoKHR::maxStdVPSCount videoSessionParameters was created with

  • VUID-vkUpdateVideoSessionParametersKHR-videoSessionParameters-07222
    If videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR and the pNext chain of pUpdateInfo includes a VkVideoDecodeH265SessionParametersAddInfoKHR structure, then videoSessionParameters must not already contain a StdVideoH265SequenceParameterSet entry with both sps_video_parameter_set_id and sps_seq_parameter_set_id matching any of the elements of VkVideoDecodeH265SessionParametersAddInfoKHR::pStdSPSs

  • VUID-vkUpdateVideoSessionParametersKHR-videoSessionParameters-07223
    If videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR, then the number of StdVideoH265SequenceParameterSet entries already stored in it plus the value of the stdSPSCount member of the VkVideoDecodeH265SessionParametersAddInfoKHR structure included in the pUpdateInfo->pNext chain must be less than or equal to the VkVideoDecodeH265SessionParametersCreateInfoKHR::maxStdSPSCount videoSessionParameters was created with

  • VUID-vkUpdateVideoSessionParametersKHR-videoSessionParameters-07224
    If videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR and the pNext chain of pUpdateInfo includes a VkVideoDecodeH265SessionParametersAddInfoKHR structure, then videoSessionParameters must not already contain a StdVideoH265PictureParameterSet entry with sps_video_parameter_set_id, pps_seq_parameter_set_id, and pps_pic_parameter_set_id all matching any of the elements of VkVideoDecodeH265SessionParametersAddInfoKHR::pStdPPSs

  • VUID-vkUpdateVideoSessionParametersKHR-videoSessionParameters-07225
    If videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR, then the number of StdVideoH265PictureParameterSet entries already stored in it plus the value of the stdPPSCount member of the VkVideoDecodeH265SessionParametersAddInfoKHR structure included in the pUpdateInfo->pNext chain must be less than or equal to the VkVideoDecodeH265SessionParametersCreateInfoKHR::maxStdPPSCount videoSessionParameters was created with

  • VUID-vkUpdateVideoSessionParametersKHR-videoSessionParameters-09260
    videoSessionParameters must not have been created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_AV1_BIT_KHR

  • VUID-vkUpdateVideoSessionParametersKHR-videoSessionParameters-07226
    If videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR and the pNext chain of pUpdateInfo includes a VkVideoEncodeH264SessionParametersAddInfoKHR structure, then videoSessionParameters must not already contain a StdVideoH264SequenceParameterSet entry with seq_parameter_set_id matching any of the elements of VkVideoEncodeH264SessionParametersAddInfoKHR::pStdSPSs

  • VUID-vkUpdateVideoSessionParametersKHR-videoSessionParameters-06441
    If videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR, then the number of StdVideoH264SequenceParameterSet entries already stored in it plus the value of the stdSPSCount member of the VkVideoEncodeH264SessionParametersAddInfoKHR structure included in the pUpdateInfo->pNext chain must be less than or equal to the VkVideoEncodeH264SessionParametersCreateInfoKHR::maxStdSPSCount videoSessionParameters was created with

  • VUID-vkUpdateVideoSessionParametersKHR-videoSessionParameters-07227
    If videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR and the pNext chain of pUpdateInfo includes a VkVideoEncodeH264SessionParametersAddInfoKHR structure, then videoSessionParameters must not already contain a StdVideoH264PictureParameterSet entry with both seq_parameter_set_id and pic_parameter_set_id matching any of the elements of VkVideoEncodeH264SessionParametersAddInfoKHR::pStdPPSs

  • VUID-vkUpdateVideoSessionParametersKHR-videoSessionParameters-06442
    If videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR, then the number of StdVideoH264PictureParameterSet entries already stored in it plus the value of the stdPPSCount member of the VkVideoEncodeH264SessionParametersAddInfoKHR structure included in the pUpdateInfo->pNext chain must be less than or equal to the VkVideoEncodeH264SessionParametersCreateInfoKHR::maxStdPPSCount videoSessionParameters was created with

  • VUID-vkUpdateVideoSessionParametersKHR-videoSessionParameters-07228
    If videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR and the pNext chain of pUpdateInfo includes a VkVideoEncodeH265SessionParametersAddInfoKHR structure, then videoSessionParameters must not already contain a StdVideoH265VideoParameterSet entry with vps_video_parameter_set_id matching any of the elements of VkVideoEncodeH265SessionParametersAddInfoKHR::pStdVPSs

  • VUID-vkUpdateVideoSessionParametersKHR-videoSessionParameters-06443
    If videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, then the number of StdVideoH265VideoParameterSet entries already stored in it plus the value of the stdVPSCount member of the VkVideoEncodeH265SessionParametersAddInfoKHR structure included in the pUpdateInfo->pNext chain must be less than or equal to the VkVideoEncodeH265SessionParametersCreateInfoKHR::maxStdVPSCount videoSessionParameters was created with

  • VUID-vkUpdateVideoSessionParametersKHR-videoSessionParameters-07229
    If videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR and the pNext chain of pUpdateInfo includes a VkVideoEncodeH265SessionParametersAddInfoKHR structure, then videoSessionParameters must not already contain a StdVideoH265SequenceParameterSet entry with both sps_video_parameter_set_id and sps_seq_parameter_set_id matching any of the elements of VkVideoEncodeH265SessionParametersAddInfoKHR::pStdSPSs

  • VUID-vkUpdateVideoSessionParametersKHR-videoSessionParameters-06444
    If videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, then the number of StdVideoH265SequenceParameterSet entries already stored in it plus the value of the stdSPSCount member of the VkVideoEncodeH265SessionParametersAddInfoKHR structure included in the pUpdateInfo->pNext chain must be less than or equal to the VkVideoEncodeH265SessionParametersCreateInfoKHR::maxStdSPSCount videoSessionParameters was created with

  • VUID-vkUpdateVideoSessionParametersKHR-videoSessionParameters-07230
    If videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR and the pNext chain of pUpdateInfo includes a VkVideoEncodeH265SessionParametersAddInfoKHR structure, then videoSessionParameters must not already contain a StdVideoH265PictureParameterSet entry with sps_video_parameter_set_id, pps_seq_parameter_set_id, and pps_pic_parameter_set_id all matching any of the elements of VkVideoEncodeH265SessionParametersAddInfoKHR::pStdPPSs

  • VUID-vkUpdateVideoSessionParametersKHR-videoSessionParameters-06445
    If videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, then the number of StdVideoH265PictureParameterSet entries already stored in it plus the value of the stdPPSCount member of the VkVideoEncodeH265SessionParametersAddInfoKHR structure included in the pUpdateInfo->pNext chain must be less than or equal to the VkVideoEncodeH265SessionParametersCreateInfoKHR::maxStdPPSCount videoSessionParameters was created with

  • VUID-vkUpdateVideoSessionParametersKHR-videoSessionParameters-08321
    If videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR and the pNext chain of pUpdateInfo includes a VkVideoEncodeH265SessionParametersAddInfoKHR structure, then num_tile_columns_minus1 must be less than VkVideoEncodeH265CapabilitiesKHR::maxTiles.width, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile videoSessionParameters was created with, for each element of VkVideoEncodeH265SessionParametersAddInfoKHR::pStdPPSs

  • VUID-vkUpdateVideoSessionParametersKHR-videoSessionParameters-08322
    If videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR and the pNext chain of pUpdateInfo includes a VkVideoEncodeH265SessionParametersAddInfoKHR structure, then num_tile_rows_minus1 must be less than VkVideoEncodeH265CapabilitiesKHR::maxTiles.height, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile videoSessionParameters was created with, for each element of VkVideoEncodeH265SessionParametersAddInfoKHR::pStdPPSs

  • VUID-vkUpdateVideoSessionParametersKHR-videoSessionParameters-10281
    videoSessionParameters must not have been created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR

Valid Usage (Implicit)
  • VUID-vkUpdateVideoSessionParametersKHR-device-parameter
    device must be a valid VkDevice handle

  • VUID-vkUpdateVideoSessionParametersKHR-videoSessionParameters-parameter
    videoSessionParameters must be a valid VkVideoSessionParametersKHR handle

  • VUID-vkUpdateVideoSessionParametersKHR-pUpdateInfo-parameter
    pUpdateInfo must be a valid pointer to a valid VkVideoSessionParametersUpdateInfoKHR structure

  • VUID-vkUpdateVideoSessionParametersKHR-videoSessionParameters-parent
    videoSessionParameters must have been created, allocated, or retrieved from device

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

  • VK_ERROR_INVALID_VIDEO_STD_PARAMETERS_KHR

The VkVideoSessionParametersUpdateInfoKHR structure is defined as:

// Provided by VK_KHR_video_queue
typedef struct VkVideoSessionParametersUpdateInfoKHR {
    VkStructureType    sType;
    const void*        pNext;
    uint32_t           updateSequenceCount;
} VkVideoSessionParametersUpdateInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • updateSequenceCount is the new update sequence count to set for the video session parameters object.

Valid Usage (Implicit)

Video Coding Scope

Applications can record video coding commands for a video session only within a video coding scope.

To begin a video coding scope, call:

// Provided by VK_KHR_video_queue
void vkCmdBeginVideoCodingKHR(
    VkCommandBuffer                             commandBuffer,
    const VkVideoBeginCodingInfoKHR*            pBeginInfo);
  • commandBuffer is the command buffer in which to record the command.

  • pBeginInfo is a pointer to a VkVideoBeginCodingInfoKHR structure specifying the parameters of the video coding scope, including the video session and video session parameters object to use.

After beginning a video coding scope, the video session object specified in pBeginInfo->videoSession is bound to the command buffer, and the command buffer is ready to record video coding operations. Similarly, if pBeginInfo->videoSessionParameters is not VK_NULL_HANDLE, it is also bound to the command buffer, and video coding operations can refer to the codec-specific parameters stored in it.

This command also establishes the set of bound reference picture resources that can be used as reconstructed pictures or reference pictures within the video coding scope. Each element of this set consists of a video picture resource and the DPB slot index associated with it, if there is one.

The set of bound reference picture resources is immutable within a video coding scope, however, the DPB slot index associated with any of the bound reference picture resources can change during the video coding scope in response to video coding operations.

The VkVideoReferenceSlotInfoKHR structures provided as the elements of pBeginInfo->pReferenceSlots are interpreted by this command as follows:

  • If slotIndex is non-negative and pPictureResource is not NULL, then the video picture resource defined by the VkVideoPictureResourceInfoKHR structure pointed to by pPictureResource is added to the set of bound reference picture resources and is associated with the DPB slot index specified in slotIndex.

  • If slotIndex is non-negative and pPictureResource is NULL, then the DPB slot with index slotIndex is deactivated by this command.

  • If slotIndex is negative and pPictureResource is not NULL, then the video picture resource defined by the VkVideoPictureResourceInfoKHR structure pointed to by pPictureResource is added to the set of bound reference picture resources without an associated DPB slot. Such a picture resource can be subsequently used as a reconstructed picture to associate it with a DPB slot.

  • If slotIndex is negative and pPictureResource is NULL, then the element is ignored.

It is possible for multiple bound reference picture resources to be associated with the same DPB slot index, or for a single bound reference picture to refer to multiple separate reference pictures. For example, in case of an H.264 decode profile with interlaced frame support a single DPB slot can refer to two separate pictures for the top and bottom fields. Depending on the picture layout used by the H.264 decode profile, the following special cases may arise:

  • If the picture layout is VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_INTERLEAVED_LINES_BIT_KHR, then the top and bottom field pictures are physically co-located in the same video picture resource with even scanlines corresponding to the top field and odd scanlines corresponding to the bottom field, respectively.

  • If the picture layout is VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_SEPARATE_PLANES_BIT_KHR, then the top and bottom field pictures are stored in separate video picture resources (in separate subregions of the same image layer, in separate layers of the same image, or in entirely separate images), hence two elements of VkVideoBeginCodingInfoKHR::pReferenceSlots can contain the same slotIndex but specify different video picture resources in their pPictureResource members.

All non-negative slotIndex values specified in the elements of pBeginInfo->pReferenceSlots must identify DPB slots of the video session that are in the active state at the time this command is executed on the device.

The application does not have to specify an entry in pBeginInfo->pReferenceSlots corresponding to all active DPB slots of the video session, but only for those which are intended to be used in the video coding scope. This way the application can avoid any potential runtime cost associated with binding the corresponding picture resources to the command buffer.

In case of a video encode session, the application is also responsible for providing information about the current rate control state configured for the video session by including an instance of the VkVideoEncodeRateControlInfoKHR structure in the pNext chain of pBeginInfo. If no VkVideoEncodeRateControlInfoKHR is included, then the presence of an empty VkVideoEncodeRateControlInfoKHR structure is implied which indicates that the current rate control mode is VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DEFAULT_KHR. The specified state must match the effective rate control state configured for the video session at the time the recorded command is executed on the device.

Including an instance of the VkVideoEncodeRateControlInfoKHR structure in the pNext chain of pBeginInfo does not change the rate control state configured for the video session, but only specifies the expected rate control state configured at the time the recorded command is executed on the device which allows the implementation to have information about the configured rate control state at command buffer recording time. In order to change the current rate control state of a video session, the application has to issue an appropriate vkCmdControlVideoCodingKHR command as described in the Video Coding Control and Rate Control State sections.

Valid Usage
  • VUID-vkCmdBeginVideoCodingKHR-commandBuffer-07231
    The VkCommandPool that commandBuffer was allocated from must support the video codec operation pBeginInfo->videoSession was created with, as returned by vkGetPhysicalDeviceQueueFamilyProperties2 in VkQueueFamilyVideoPropertiesKHR::videoCodecOperations

  • VUID-vkCmdBeginVideoCodingKHR-None-07232
    There must be no active queries

  • VUID-vkCmdBeginVideoCodingKHR-commandBuffer-07233
    If commandBuffer is an unprotected command buffer and protectedNoFault is not supported, then pBeginInfo->videoSession must not have been created with VK_VIDEO_SESSION_CREATE_PROTECTED_CONTENT_BIT_KHR

  • VUID-vkCmdBeginVideoCodingKHR-commandBuffer-07234
    If commandBuffer is a protected command buffer and protectedNoFault is not supported, then pBeginInfo->videoSession must have been created with VK_VIDEO_SESSION_CREATE_PROTECTED_CONTENT_BIT_KHR

  • VUID-vkCmdBeginVideoCodingKHR-commandBuffer-07235
    If commandBuffer is an unprotected command buffer, protectedNoFault is not supported, and the pPictureResource member of any element of pBeginInfo->pReferenceSlots is not NULL, then pPictureResource->imageViewBinding for that element must not specify an image view created from a protected image

  • VUID-vkCmdBeginVideoCodingKHR-commandBuffer-07236
    If commandBuffer is a protected command buffer protectedNoFault is not supported, and the pPictureResource member of any element of pBeginInfo->pReferenceSlots is not NULL, then pPictureResource->imageViewBinding for that element must specify an image view created from a protected image

  • VUID-vkCmdBeginVideoCodingKHR-slotIndex-07239
    If the slotIndex member of any element of pBeginInfo->pReferenceSlots is not negative, then it must specify the index of a DPB slot that is in the active state in pBeginInfo->videoSession at the time the command is executed on the device

  • VUID-vkCmdBeginVideoCodingKHR-pPictureResource-07265
    Each video picture resource specified by any non-NULL pPictureResource member specified in the elements of pBeginInfo->pReferenceSlots for which slotIndex is not negative must match one of the video picture resources currently associated with the DPB slot index of pBeginInfo->videoSession specified by slotIndex at the time the command is executed on the device

  • VUID-vkCmdBeginVideoCodingKHR-pBeginInfo-08253
    If pBeginInfo->videoSession was created with a video encode operation and the pNext chain of pBeginInfo does not include an instance of the VkVideoEncodeRateControlInfoKHR structure, then the rate control mode configured for pBeginInfo->videoSession at the time the command is executed on the device must be VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DEFAULT_KHR

  • VUID-vkCmdBeginVideoCodingKHR-pBeginInfo-08254
    If pBeginInfo->videoSession was created with a video encode operation and the pNext chain of pBeginInfo includes an instance of the VkVideoEncodeRateControlInfoKHR structure, then it must match the rate control state configured for pBeginInfo->videoSession at the time the command is executed on the device

  • VUID-vkCmdBeginVideoCodingKHR-pBeginInfo-08255
    If pBeginInfo->videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR, the current rate control mode is not VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DEFAULT_KHR or VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR, and VkVideoEncodeH264CapabilitiesKHR::requiresGopRemainingFrames is VK_TRUE, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the pBeginInfo->videoSession was created with, then the pNext chain of pBeginInfo must include an instance of the VkVideoEncodeH264GopRemainingFrameInfoKHR with its useGopRemainingFrames member set to VK_TRUE

  • VUID-vkCmdBeginVideoCodingKHR-pBeginInfo-08256
    If pBeginInfo->videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, the current rate control mode is not VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DEFAULT_KHR or VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR, and VkVideoEncodeH265CapabilitiesKHR::requiresGopRemainingFrames is VK_TRUE, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the pBeginInfo->videoSession was created with, then the pNext chain of pBeginInfo must include an instance of the VkVideoEncodeH265GopRemainingFrameInfoKHR with its useGopRemainingFrames member set to VK_TRUE

  • VUID-vkCmdBeginVideoCodingKHR-pBeginInfo-10282
    If pBeginInfo->videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR, the current rate control mode is not VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DEFAULT_KHR or VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR, and VkVideoEncodeAV1CapabilitiesKHR::requiresGopRemainingFrames is VK_TRUE, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the pBeginInfo->videoSession was created with, then the pNext chain of pBeginInfo must include an instance of the VkVideoEncodeAV1GopRemainingFrameInfoKHR with its useGopRemainingFrames member set to VK_TRUE

Valid Usage (Implicit)
  • VUID-vkCmdBeginVideoCodingKHR-commandBuffer-parameter
    commandBuffer must be a valid VkCommandBuffer handle

  • VUID-vkCmdBeginVideoCodingKHR-pBeginInfo-parameter
    pBeginInfo must be a valid pointer to a valid VkVideoBeginCodingInfoKHR structure

  • VUID-vkCmdBeginVideoCodingKHR-commandBuffer-recording
    commandBuffer must be in the recording state

  • VUID-vkCmdBeginVideoCodingKHR-commandBuffer-cmdpool
    The VkCommandPool that commandBuffer was allocated from must support decode, or encode operations

  • VUID-vkCmdBeginVideoCodingKHR-renderpass
    This command must only be called outside of a render pass instance

  • VUID-vkCmdBeginVideoCodingKHR-videocoding
    This command must only be called outside of a video coding scope

  • VUID-vkCmdBeginVideoCodingKHR-bufferlevel
    commandBuffer must be a primary VkCommandBuffer

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
Command Buffer Levels Render Pass Scope Video Coding Scope Supported Queue Types Command Type

Primary

Outside

Outside

Decode
Encode

Action
State

The VkVideoBeginCodingInfoKHR structure is defined as:

// Provided by VK_KHR_video_queue
typedef struct VkVideoBeginCodingInfoKHR {
    VkStructureType                       sType;
    const void*                           pNext;
    VkVideoBeginCodingFlagsKHR            flags;
    VkVideoSessionKHR                     videoSession;
    VkVideoSessionParametersKHR           videoSessionParameters;
    uint32_t                              referenceSlotCount;
    const VkVideoReferenceSlotInfoKHR*    pReferenceSlots;
} VkVideoBeginCodingInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • flags is reserved for future use.

  • videoSession is the video session object to be bound for the processing of the video commands.

  • videoSessionParameters is VK_NULL_HANDLE or a handle of a VkVideoSessionParametersKHR object to be used for the processing of the video commands. If VK_NULL_HANDLE, then no video session parameters object is bound for the duration of the video coding scope.

  • referenceSlotCount is the number of elements in the pReferenceSlots array.

  • pReferenceSlots is a pointer to an array of VkVideoReferenceSlotInfoKHR structures specifying the information used to determine the set of bound reference picture resources for the video coding scope and their initial association with DPB slot indices.

Limiting values are defined below that are referenced by the relevant valid usage statements of this structure.

Valid Usage
  • VUID-VkVideoBeginCodingInfoKHR-videoSession-07237
    videoSession must have memory bound to all of its memory bindings returned by vkGetVideoSessionMemoryRequirementsKHR for videoSession

  • VUID-VkVideoBeginCodingInfoKHR-slotIndex-04856
    Each non-negative VkVideoReferenceSlotInfoKHR::slotIndex specified in the elements of pReferenceSlots must be less than the VkVideoSessionCreateInfoKHR::maxDpbSlots specified when videoSession was created

  • VUID-VkVideoBeginCodingInfoKHR-pPictureResource-07238
    Each video picture resource corresponding to any non-NULL pPictureResource member specified in the elements of pReferenceSlots must be unique within pReferenceSlots

  • VUID-VkVideoBeginCodingInfoKHR-pPictureResource-07240
    If the pPictureResource member of any element of pReferenceSlots is not NULL, then the image view specified in pPictureResource->imageViewBinding for that element must be compatible with the video profile videoSession was created with

  • VUID-VkVideoBeginCodingInfoKHR-pPictureResource-07241
    If the pPictureResource member of any element of pReferenceSlots is not NULL, then the format of the image view specified in pPictureResource->imageViewBinding for that element must match the VkVideoSessionCreateInfoKHR::referencePictureFormat videoSession was created with

  • VUID-VkVideoBeginCodingInfoKHR-pPictureResource-07242
    If the pPictureResource member of any element of pReferenceSlots is not NULL, then its codedOffset member must be an integer multiple of codedOffsetGranularity

  • VUID-VkVideoBeginCodingInfoKHR-pPictureResource-07243
    If the pPictureResource member of any element of pReferenceSlots is not NULL, then its codedExtent member must be between minCodedExtent and maxCodedExtent, inclusive, videoSession was created with

  • VUID-VkVideoBeginCodingInfoKHR-flags-07244
    If VkVideoCapabilitiesKHR::flags does not include VK_VIDEO_CAPABILITY_SEPARATE_REFERENCE_IMAGES_BIT_KHR, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile videoSession was created with, then pPictureResource->imageViewBinding of all elements of pReferenceSlots with a non-NULL pPictureResource member must specify image views created from the same image

  • VUID-VkVideoBeginCodingInfoKHR-slotIndex-07245
    If videoSession was created with a decode operation and the slotIndex member of any element of pReferenceSlots is not negative, then the image view specified in pPictureResource->imageViewBinding for that element must have been created with VK_IMAGE_USAGE_VIDEO_DECODE_DPB_BIT_KHR

  • VUID-VkVideoBeginCodingInfoKHR-slotIndex-07246
    If videoSession was created with an encode operation and the slotIndex member of any element of pReferenceSlots is not negative, then the image view specified in pPictureResource->imageViewBinding for that element must have been created with VK_IMAGE_USAGE_VIDEO_ENCODE_DPB_BIT_KHR

  • VUID-VkVideoBeginCodingInfoKHR-videoSession-07247
    If videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR, then videoSessionParameters must not be VK_NULL_HANDLE

  • VUID-VkVideoBeginCodingInfoKHR-videoSession-07248
    If videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR, then videoSessionParameters must not be VK_NULL_HANDLE

  • VUID-VkVideoBeginCodingInfoKHR-videoSession-09261
    If videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_AV1_BIT_KHR, then videoSessionParameters must not be VK_NULL_HANDLE

  • VUID-VkVideoBeginCodingInfoKHR-videoSession-07249
    If videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR, then videoSessionParameters must not be VK_NULL_HANDLE

  • VUID-VkVideoBeginCodingInfoKHR-videoSession-07250
    If videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, then videoSessionParameters must not be VK_NULL_HANDLE

  • VUID-VkVideoBeginCodingInfoKHR-videoSession-10283
    If videoSession was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR, then videoSessionParameters must not be VK_NULL_HANDLE

  • VUID-VkVideoBeginCodingInfoKHR-videoSessionParameters-04857
    If videoSessionParameters is not VK_NULL_HANDLE, it must have been created with videoSession specified in VkVideoSessionParametersCreateInfoKHR::videoSession

Valid Usage (Implicit)
// Provided by VK_KHR_video_queue
typedef VkFlags VkVideoBeginCodingFlagsKHR;

VkVideoBeginCodingFlagsKHR is a bitmask type for setting a mask, but is currently reserved for future use.

The VkVideoReferenceSlotInfoKHR structure is defined as:

// Provided by VK_KHR_video_queue
typedef struct VkVideoReferenceSlotInfoKHR {
    VkStructureType                         sType;
    const void*                             pNext;
    int32_t                                 slotIndex;
    const VkVideoPictureResourceInfoKHR*    pPictureResource;
} VkVideoReferenceSlotInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • slotIndex is the index of the DPB slot or a negative integer value.

  • pPictureResource is NULL or a pointer to a VkVideoPictureResourceInfoKHR structure describing the video picture resource associated with the DPB slot index specified by slotIndex.

Valid Usage (Implicit)

To end a video coding scope, call:

// Provided by VK_KHR_video_queue
void vkCmdEndVideoCodingKHR(
    VkCommandBuffer                             commandBuffer,
    const VkVideoEndCodingInfoKHR*              pEndCodingInfo);
  • commandBuffer is the command buffer in which to record the command.

  • pEndCodingInfo is a pointer to a VkVideoEndCodingInfoKHR structure specifying the parameters for ending the video coding scope.

After ending a video coding scope, the video session object, the optional video session parameters object, and all reference picture resources previously bound by the corresponding vkCmdBeginVideoCodingKHR command are unbound.

Valid Usage
  • VUID-vkCmdEndVideoCodingKHR-None-07251
    There must be no active queries

Valid Usage (Implicit)
  • VUID-vkCmdEndVideoCodingKHR-commandBuffer-parameter
    commandBuffer must be a valid VkCommandBuffer handle

  • VUID-vkCmdEndVideoCodingKHR-pEndCodingInfo-parameter
    pEndCodingInfo must be a valid pointer to a valid VkVideoEndCodingInfoKHR structure

  • VUID-vkCmdEndVideoCodingKHR-commandBuffer-recording
    commandBuffer must be in the recording state

  • VUID-vkCmdEndVideoCodingKHR-commandBuffer-cmdpool
    The VkCommandPool that commandBuffer was allocated from must support decode, or encode operations

  • VUID-vkCmdEndVideoCodingKHR-renderpass
    This command must only be called outside of a render pass instance

  • VUID-vkCmdEndVideoCodingKHR-videocoding
    This command must only be called inside of a video coding scope

  • VUID-vkCmdEndVideoCodingKHR-bufferlevel
    commandBuffer must be a primary VkCommandBuffer

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
Command Buffer Levels Render Pass Scope Video Coding Scope Supported Queue Types Command Type

Primary

Outside

Inside

Decode
Encode

Action
State

The VkVideoEndCodingInfoKHR structure is defined as:

// Provided by VK_KHR_video_queue
typedef struct VkVideoEndCodingInfoKHR {
    VkStructureType             sType;
    const void*                 pNext;
    VkVideoEndCodingFlagsKHR    flags;
} VkVideoEndCodingInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • flags is reserved for future use.

Valid Usage (Implicit)
  • VUID-VkVideoEndCodingInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_END_CODING_INFO_KHR

  • VUID-VkVideoEndCodingInfoKHR-pNext-pNext
    pNext must be NULL

  • VUID-VkVideoEndCodingInfoKHR-flags-zerobitmask
    flags must be 0

// Provided by VK_KHR_video_queue
typedef VkFlags VkVideoEndCodingFlagsKHR;

VkVideoEndCodingFlagsKHR is a bitmask type for setting a mask, but is currently reserved for future use.

Video Coding Control

To apply dynamic controls to the bound video session object, call:

// Provided by VK_KHR_video_queue
void vkCmdControlVideoCodingKHR(
    VkCommandBuffer                             commandBuffer,
    const VkVideoCodingControlInfoKHR*          pCodingControlInfo);
  • commandBuffer is the command buffer in which to record the command.

  • pCodingControlInfo is a pointer to a VkVideoCodingControlInfoKHR structure specifying the control parameters.

The control parameters provided in this call are applied to the video session at the time the command executes on the device and are in effect until a subsequent call to this command with the same video session bound changes the corresponding control parameters.

A newly created video session must be reset before performing video coding operations using it by including VK_VIDEO_CODING_CONTROL_RESET_BIT_KHR in pCodingControlInfo->flags. The reset operation also returns all DPB slots of the video session to the inactive state. Correspondingly, any DPB slot index associated with the bound reference picture resources is removed.

For encode sessions, the reset operation returns rate control configuration to implementation default settings and sets the video encode quality level to zero.

After video coding operations are performed using a video session, the reset operation can be used to return the video session to the same initial state as after the reset of a newly created video session. This can be used, for example, when different video sequences are needed to be processed with the same video session object.

If pCodingControlInfo->flags includes VK_VIDEO_CODING_CONTROL_ENCODE_RATE_CONTROL_BIT_KHR, then the command replaces the rate control configuration maintained by the video session with the configuration specified in the VkVideoEncodeRateControlInfoKHR structure included in the pCodingControlInfo->pNext chain.

If pCodingControlInfo->flags includes VK_VIDEO_CODING_CONTROL_ENCODE_QUALITY_LEVEL_BIT_KHR, then the command changes the current video encode quality level to the value specified in the qualityLevel member of the VkVideoEncodeQualityLevelInfoKHR structure included in the pCodingControlInfo->pNext chain.

Valid Usage
  • VUID-vkCmdControlVideoCodingKHR-flags-07017
    If pCodingControlInfo->flags does not include VK_VIDEO_CODING_CONTROL_RESET_BIT_KHR, then the bound video session must not be in uninitialized state at the time the command is executed on the device

  • VUID-vkCmdControlVideoCodingKHR-pCodingControlInfo-08243
    If the bound video session was not created with an encode operation, then pCodingControlInfo->flags must not include VK_VIDEO_CODING_CONTROL_ENCODE_RATE_CONTROL_BIT_KHR or VK_VIDEO_CODING_CONTROL_ENCODE_QUALITY_LEVEL_BIT_KHR

Valid Usage (Implicit)
  • VUID-vkCmdControlVideoCodingKHR-commandBuffer-parameter
    commandBuffer must be a valid VkCommandBuffer handle

  • VUID-vkCmdControlVideoCodingKHR-pCodingControlInfo-parameter
    pCodingControlInfo must be a valid pointer to a valid VkVideoCodingControlInfoKHR structure

  • VUID-vkCmdControlVideoCodingKHR-commandBuffer-recording
    commandBuffer must be in the recording state

  • VUID-vkCmdControlVideoCodingKHR-commandBuffer-cmdpool
    The VkCommandPool that commandBuffer was allocated from must support decode, or encode operations

  • VUID-vkCmdControlVideoCodingKHR-renderpass
    This command must only be called outside of a render pass instance

  • VUID-vkCmdControlVideoCodingKHR-videocoding
    This command must only be called inside of a video coding scope

  • VUID-vkCmdControlVideoCodingKHR-bufferlevel
    commandBuffer must be a primary VkCommandBuffer

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
Command Buffer Levels Render Pass Scope Video Coding Scope Supported Queue Types Command Type

Primary

Outside

Inside

Decode
Encode

Action

The VkVideoCodingControlInfoKHR structure is defined as:

// Provided by VK_KHR_video_queue
typedef struct VkVideoCodingControlInfoKHR {
    VkStructureType                 sType;
    const void*                     pNext;
    VkVideoCodingControlFlagsKHR    flags;
} VkVideoCodingControlInfoKHR;
Valid Usage
  • VUID-VkVideoCodingControlInfoKHR-flags-07018
    If flags includes VK_VIDEO_CODING_CONTROL_ENCODE_RATE_CONTROL_BIT_KHR, then the pNext chain must include a VkVideoEncodeRateControlInfoKHR structure

  • VUID-VkVideoCodingControlInfoKHR-flags-08349
    If flags includes VK_VIDEO_CODING_CONTROL_ENCODE_QUALITY_LEVEL_BIT_KHR, then the pNext chain must include a VkVideoEncodeQualityLevelInfoKHR structure

Valid Usage (Implicit)

Bits which can be set in VkVideoCodingControlInfoKHR::flags, specifying the video coding control parameters to be modified, are:

// Provided by VK_KHR_video_queue
typedef enum VkVideoCodingControlFlagBitsKHR {
    VK_VIDEO_CODING_CONTROL_RESET_BIT_KHR = 0x00000001,
  // Provided by VK_KHR_video_encode_queue
    VK_VIDEO_CODING_CONTROL_ENCODE_RATE_CONTROL_BIT_KHR = 0x00000002,
  // Provided by VK_KHR_video_encode_queue
    VK_VIDEO_CODING_CONTROL_ENCODE_QUALITY_LEVEL_BIT_KHR = 0x00000004,
} VkVideoCodingControlFlagBitsKHR;
  • VK_VIDEO_CODING_CONTROL_RESET_BIT_KHR specifies a request for the bound video session to be reset before other coding control parameters are applied.

  • VK_VIDEO_CODING_CONTROL_ENCODE_RATE_CONTROL_BIT_KHR specifies that the coding control parameters include video encode rate control parameters (see VkVideoEncodeRateControlInfoKHR).

  • VK_VIDEO_CODING_CONTROL_ENCODE_QUALITY_LEVEL_BIT_KHR specifies that the coding control parameters include video encode quality level parameters (see VkVideoEncodeQualityLevelInfoKHR).

// Provided by VK_KHR_video_queue
typedef VkFlags VkVideoCodingControlFlagsKHR;

VkVideoCodingControlFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoCodingControlFlagBitsKHR.

Inline Queries

If a video session was created with VK_VIDEO_SESSION_CREATE_INLINE_QUERIES_BIT_KHR, beginning queries using commands such as vkCmdBeginQuery within a video coding scope is not allowed. Instead, queries are executed inline by including an instance of the VkVideoInlineQueryInfoKHR structure in the pNext chain of the parameters of one of the video coding commands, with its queryPool member set to a valid VkQueryPool handle.

The VkVideoInlineQueryInfoKHR structure is defined as:

// Provided by VK_KHR_video_maintenance1
typedef struct VkVideoInlineQueryInfoKHR {
    VkStructureType    sType;
    const void*        pNext;
    VkQueryPool        queryPool;
    uint32_t           firstQuery;
    uint32_t           queryCount;
} VkVideoInlineQueryInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • queryPool is VK_NULL_HANDLE or a valid handle to a VkQueryPool object that will manage the results of the queries.

  • firstQuery is the query index within the query pool that will contain the query results for the first video coding operation. The query results of subsequent video coding operations will be contained by subsequent query indices.

  • queryCount is the number of queries to execute.

    In practice, if queryPool is not VK_NULL_HANDLE, then queryCount will always have to match the number of video coding operations issued by the video coding command this structure is specified to, meaning that using inline queries in a video coding command will always execute a query for each issued video coding operation.

This structure can be included in the pNext chain of the input parameter structure of video coding commands.

  • In the pNext chain of the pDecodeInfo parameter of the vkCmdDecodeVideoKHR command to execute a query for each video decode operation issued by the command.

  • In the pNext chain of the pEncodeInfo parameter of the vkCmdEncodeVideoKHR command to execute a query for each video encode operation issued by the command.

Valid Usage
  • VUID-VkVideoInlineQueryInfoKHR-queryPool-08372
    If queryPool is not VK_NULL_HANDLE, then firstQuery must be less than the number of queries in queryPool

  • VUID-VkVideoInlineQueryInfoKHR-queryPool-08373
    If queryPool is not VK_NULL_HANDLE, then the sum of firstQuery and queryCount must be less than or equal to the number of queries in queryPool

Valid Usage (Implicit)
  • VUID-VkVideoInlineQueryInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_INLINE_QUERY_INFO_KHR

  • VUID-VkVideoInlineQueryInfoKHR-queryPool-parameter
    If queryPool is not VK_NULL_HANDLE, queryPool must be a valid VkQueryPool handle

Video Decode Operations

Video decode operations consume compressed video data from a video bitstream buffer and zero or more reference pictures, and produce a decode output picture and an optional reconstructed picture.

Such decode output pictures can be shared with the Decoded Picture Buffer, and can also be used as the input of video encode operations, with graphics or compute operations, or with Window System Integration APIs, depending on the capabilities of the implementation.

Video decode operations may access the following resources in the VK_PIPELINE_STAGE_2_VIDEO_DECODE_BIT_KHR stage:

The image subresource of each video picture resource accessed by the video decode operation is specified using a corresponding VkVideoPictureResourceInfoKHR structure. Each such image subresource must be in the appropriate image layout as follows:

  • If the image subresource is used in the video decode operation only as decode output picture, then it must be in the VK_IMAGE_LAYOUT_VIDEO_DECODE_DST_KHR layout.

  • If the image subresource is used in the video decode operation both as decode output picture and reconstructed picture, then it must be in the VK_IMAGE_LAYOUT_VIDEO_DECODE_DPB_KHR layout.

  • If the image subresource is used in the video decode operation only as reconstructed picture, then it must be in the VK_IMAGE_LAYOUT_VIDEO_DECODE_DPB_KHR layout.

  • If the image subresource is used in the video decode operation as a reference picture, then it must be in the VK_IMAGE_LAYOUT_VIDEO_DECODE_DPB_KHR layout.

A video decode operation may complete unsuccessfully. In this case the decode output picture will have undefined contents. Similarly, if reference picture setup is requested, the reconstructed picture will also have undefined contents, and the activated DPB slot will have an invalid picture reference.

Codec-Specific Semantics

The following aspects of video decode operations are codec-specific:

  • The interpretation of the contents of the source video bitstream buffer range.

  • The construction and interpretation of the list of active reference pictures and the interpretation of the picture data referred to by the corresponding image subregions.

  • The construction and interpretation of information related to the decode output picture and the generation of picture data to the corresponding image subregion.

  • The decision on reference picture setup.

  • The construction and interpretation of information related to the optional reconstructed picture and the generation of picture data to the corresponding image subregion.

These codec-specific behaviors are defined for each video codec operation separately.

  • If the used video codec operation is VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR, then the codec-specific aspects of the video decoding process are performed as defined in the H.264 Decode Operations section.

  • If the used video codec operation is VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR, then the codec-specific aspects of the video decoding process are performed as defined in the H.265 Decode Operations section.

  • If the used video codec operation is VK_VIDEO_CODEC_OPERATION_DECODE_AV1_BIT_KHR, then the codec-specific aspects of the video decoding process are performed as defined in the AV1 Decode Operations section.

Video Decode Operation Steps

Each video decode operation performs the following steps in the VK_PIPELINE_STAGE_2_VIDEO_DECODE_BIT_KHR stage:

  1. Reads the encoded video data from the source video bitstream buffer range.

  2. Performs picture reconstruction of the encoded video data according to the codec-specific semantics, applying any prediction data read from the active reference pictures in the process;

  3. Writes the decoded picture data to the decode output picture, and optionally to the reconstructed picture, if one is specified and is different from the decode output picture, according to the codec-specific semantics;

  4. If reference picture setup is requested, the DPB slot index specified in the reconstructed picture information is activated with the reconstructed picture.

When reconstructed picture information is provided, the specified DPB slot index is associated with the corresponding bound reference picture resource, indifferent of whether reference picture setup is requested.

Capabilities

When calling vkGetPhysicalDeviceVideoCapabilitiesKHR with pVideoProfile->videoCodecOperation specifying a decode operation, the VkVideoDecodeCapabilitiesKHR structure must be included in the pNext chain of the VkVideoCapabilitiesKHR structure to retrieve capabilities specific to video decoding.

The VkVideoDecodeCapabilitiesKHR structure is defined as:

// Provided by VK_KHR_video_decode_queue
typedef struct VkVideoDecodeCapabilitiesKHR {
    VkStructureType                    sType;
    void*                              pNext;
    VkVideoDecodeCapabilityFlagsKHR    flags;
} VkVideoDecodeCapabilitiesKHR;
Valid Usage (Implicit)
  • VUID-VkVideoDecodeCapabilitiesKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_DECODE_CAPABILITIES_KHR

Bits which may be set in VkVideoDecodeCapabilitiesKHR::flags, indicating the decoding capabilities supported, are:

// Provided by VK_KHR_video_decode_queue
typedef enum VkVideoDecodeCapabilityFlagBitsKHR {
    VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_COINCIDE_BIT_KHR = 0x00000001,
    VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_DISTINCT_BIT_KHR = 0x00000002,
} VkVideoDecodeCapabilityFlagBitsKHR;
  • VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_COINCIDE_BIT_KHR indicates support for using the same video picture resource as the reconstructed picture and decode output picture in a video decode operation.

  • VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_DISTINCT_BIT_KHR indicates support for using distinct video picture resources as the reconstructed picture and decode output picture in a video decode operation.

    Some video profiles allow using distinct video picture resources as the reconstructed picture and decode output picture in specific video decode operations even when the video decode profile does not support VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_DISTINCT_BIT_KHR. Even if the implementation only reports coincide, the decode output picture for film grain enabled frames must be a different video picture resource from the reconstructed picture because film grain is applied outside of the coding loop.

Implementations are only required to support one of VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_COINCIDE_BIT_KHR and VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_DISTINCT_BIT_KHR. Accordingly, applications should handle both cases to maximize portability.

If both VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_COINCIDE_BIT_KHR and VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_DISTINCT_BIT_KHR are supported, an application can choose to create separate images for decode DPB and decode output. E.g. in cases when linear tiling is preferred (and supported) for the decode output picture and the DPB requires optimal tiling, this avoids the need for a separate copy at the expense of additional memory bandwidth requirements during decoding.

// Provided by VK_KHR_video_decode_queue
typedef VkFlags VkVideoDecodeCapabilityFlagsKHR;

VkVideoDecodeCapabilityFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoDecodeCapabilityFlagBitsKHR.

Video Decode Commands

To launch video decode operations, call:

// Provided by VK_KHR_video_decode_queue
void vkCmdDecodeVideoKHR(
    VkCommandBuffer                             commandBuffer,
    const VkVideoDecodeInfoKHR*                 pDecodeInfo);
  • commandBuffer is the command buffer in which to record the command.

  • pDecodeInfo is a pointer to a VkVideoDecodeInfoKHR structure specifying the parameters of the video decode operations.

Each call issues one or more video decode operations. The implicit parameter opCount corresponds to the number of video decode operations issued by the command. After calling this command, the active query index of each active query is incremented by opCount.

Currently each call to this command results in the issue of a single video decode operation.

If the bound video session was created with VK_VIDEO_SESSION_CREATE_INLINE_QUERIES_BIT_KHR and the pNext chain of pDecodeInfo includes a VkVideoInlineQueryInfoKHR structure with its queryPool member specifying a valid VkQueryPool handle, then this command will execute a query for each video decode operation issued by it.

Active Reference Picture Information

The list of active reference pictures used by a video decode operation is a list of image subregions used as the source of reference picture data and related parameters, and is derived from the VkVideoReferenceSlotInfoKHR structures provided as the elements of the pDecodeInfo->pReferenceSlots array. For each element of pDecodeInfo->pReferenceSlots, one or more elements are added to the active reference picture list, as defined by the codec-specific semantics. Each element of this list contains the following information:

  • The image subregion within the image subresource referred to by the video picture resource used as the reference picture.

  • The DPB slot index the reference picture is associated with.

  • The codec-specific reference information related to the reference picture.

Reconstructed Picture Information

Information related to the optional reconstructed picture used by a video decode operation is derived from the VkVideoReferenceSlotInfoKHR structure pointed to by pDecodeInfo->pSetupReferenceSlot, if not NULL, as defined by the codec-specific semantics, and consists of the following:

  • The image subregion within the image subresource referred to by the video picture resource used as the reconstructed picture.

  • The DPB slot index to use for picture reconstruction.

  • The codec-specific reference information related to the reconstructed picture.

Specifying a valid VkVideoReferenceSlotInfoKHR structure in pDecodeInfo->pSetupReferenceSlot is always required, unless the video session was created with VkVideoSessionCreateInfoKHR::maxDpbSlot equal to zero. However, the DPB slot identified by pDecodeInfo->pSetupReferenceSlot→slotIndex is only activated with the reconstructed picture specified in pDecodeInfo->pSetupReferenceSlot→pPictureResource if reference picture setup is requested according to the codec-specific semantics.

If reconstructed picture information is specified, and pDecodeInfo->pSetupReferenceSlot→pPictureResource refers to a video picture resource different than that of the decode output picture, but reference picture setup is not requested, the contents of the video picture resource corresponding to the reconstructed picture will be undefined after the video decode operation.

Some implementations may always output the reconstructed picture or use it as temporary storage during the video decode operation even when the reconstructed picture is not marked for future reference.

Decode Output Picture Information

Information related to the decode output picture used by a video decode operation is derived from pDecodeInfo->dstPictureResource and any codec-specific parameters provided in the pDecodeInfo->pNext chain, as defined by the codec-specific semantics, and consists of the following:

  • The image subregion within the image subresource referred to by the video picture resource used as the decode output picture.

  • The codec-specific picture information related to the decode output picture.

Several limiting values are defined below that are referenced by the relevant valid usage statements of this command.

  • Let uint32_t activeReferencePictureCount be the size of the list of active reference pictures used by the video decode operation. Unless otherwise defined, activeReferencePictureCount is set to the value of pDecodeInfo->referenceSlotCount.

    • If the bound video session was created with an H.264 decode profile, then let activeReferencePictureCount be the value of pDecodeInfo->referenceSlotCount plus the number of elements of the pDecodeInfo->pReferenceSlots array that have a VkVideoDecodeH264DpbSlotInfoKHR structure included in their pNext chain with both pStdReferenceInfo->flags.top_field_flag and pStdReferenceInfo->flags.bottom_field_flag set.

      This means that the elements of pDecodeInfo->pReferenceSlots that include both a top and bottom field reference are counted as two separate active reference pictures, as described in the active reference picture list construction rules for H.264 decode operations.

  • Let VkOffset2D codedOffsetGranularity be the minimum alignment requirement for the coded offset of video picture resources. Unless otherwise defined, the value of the x and y members of codedOffsetGranularity are 0.

  • Let uint32_t dpbFrameUseCount[] be an array of size maxDpbSlots, where maxDpbSlots is the VkVideoSessionCreateInfoKHR::maxDpbSlots the bound video session was created with, with each element indicating the number of times a frame associated with the corresponding DPB slot index is referred to by the video coding operation. Let the initial value of each element of the array be 0.

    • If pDecodeInfo->pSetupReferenceSlot is not NULL, then dpbFrameUseCount[i] is incremented by one, where i equals pDecodeInfo->pSetupReferenceSlot→slotIndex. If the bound video session object was created with an H.264 decode profile, then dpbFrameUseCount[i] is decremented by one if either pStdReferenceInfo->flags.top_field_flag or pStdReferenceInfo->flags.bottom_field_flag is set in the VkVideoDecodeH264DpbSlotInfoKHR structure in the pDecodeInfo->pSetupReferenceSlot→pNext chain.

    • For each element of pDecodeInfo->pReferenceSlots, dpbFrameUseCount[i] is incremented by one, where i equals the slotIndex member of the corresponding element. If the bound video session object was created with an H.264 decode profile, then dpbFrameUseCount[i] is decremented by one if either pStdReferenceInfo->flags.top_field_flag or pStdReferenceInfo->flags.bottom_field_flag is set in the VkVideoDecodeH264DpbSlotInfoKHR structure in the pNext chain of the corresponding element of pDecodeInfo->pReferenceSlots.

  • Let uint32_t dpbTopFieldUseCount[] and uint32_t dpbBottomFieldUseCount[] be arrays of size maxDpbSlots, where maxDpbSlots is the VkVideoSessionCreateInfoKHR::maxDpbSlots the bound video session was created with, with each element indicating the number of times the top field or the bottom field, respectively, associated with the corresponding DPB slot index is referred to by the video coding operation. Let the initial value of each element of the arrays be 0.

    • If the bound video session object was created with an H.264 decode profile and pDecodeInfo->pSetupReferenceSlot is not NULL, then perform the following:

      • If pStdReferenceInfo->flags.top_field_flag is set in the VkVideoDecodeH264DpbSlotInfoKHR structure in the pDecodeInfo->pSetupReferenceSlot→pNext chain, then dpbTopFieldUseCount[i] is incremented by one, where i equals pDecodeInfo->pSetupReferenceSlot→slotIndex.

      • If pStdReferenceInfo->flags.bottom_field_flag is set in the VkVideoDecodeH264DpbSlotInfoKHR structure in the pDecodeInfo->pSetupReferenceSlot→pNext chain, then dpbBottomFieldUseCount[i] is incremented by one, where i equals pDecodeInfo->pSetupReferenceSlot→slotIndex.

    • If the bound video session object was created with an H.264 decode profile, then perform the following for each element of pDecodeInfo->pReferenceSlots:

      • If pStdReferenceInfo->flags.top_field_flag is set in the VkVideoDecodeH264DpbSlotInfoKHR structure in the pNext chain of the element, then dpbTopFieldUseCount[i] is incremented by one, where i equals the slotIndex member of the element.

      • If pStdReferenceInfo->flags.bottom_field_flag is set in the VkVideoDecodeH264DpbSlotInfoKHR structure in the pNext chain of the element, then dpbBottomFieldUseCount[i] is incremented by one, where i equals the slotIndex member of the element.

Valid Usage
  • VUID-vkCmdDecodeVideoKHR-None-08249
    The bound video session must have been created with a decode operation

  • VUID-vkCmdDecodeVideoKHR-None-07011
    The bound video session must not be in uninitialized state at the time the command is executed on the device

  • VUID-vkCmdDecodeVideoKHR-opCount-07134
    For each active query, the active query index corresponding to the query type of that query plus opCount must be less than or equal to the last activatable query index corresponding to the query type of that query plus one

  • VUID-vkCmdDecodeVideoKHR-pNext-08365
    If the bound video session was created with VK_VIDEO_SESSION_CREATE_INLINE_QUERIES_BIT_KHR, and the pNext chain of pDecodeInfo includes a VkVideoInlineQueryInfoKHR structure with its queryPool member specifying a valid VkQueryPool handle, then VkVideoInlineQueryInfoKHR::queryCount must equal opCount

  • VUID-vkCmdDecodeVideoKHR-pNext-08366
    If the bound video session was created with VK_VIDEO_SESSION_CREATE_INLINE_QUERIES_BIT_KHR, and the pNext chain of pDecodeInfo includes a VkVideoInlineQueryInfoKHR structure with its queryPool member specifying a valid VkQueryPool handle, then all the queries used by the command, as specified by the VkVideoInlineQueryInfoKHR structure, must be unavailable

  • VUID-vkCmdDecodeVideoKHR-queryType-08367
    If the bound video session was created with VK_VIDEO_SESSION_CREATE_INLINE_QUERIES_BIT_KHR, then the queryType used to create the queryPool specified in the VkVideoInlineQueryInfoKHR structure included in the pNext chain of pDecodeInfo must be VK_QUERY_TYPE_RESULT_STATUS_ONLY_KHR

  • VUID-vkCmdDecodeVideoKHR-queryPool-08368
    If the bound video session was created with VK_VIDEO_SESSION_CREATE_INLINE_QUERIES_BIT_KHR, then the queryPool specified in the VkVideoInlineQueryInfoKHR structure included in the pNext chain of pDecodeInfo must have been created with a VkVideoProfileInfoKHR structure included in the pNext chain of VkQueryPoolCreateInfo identical to the one specified in VkVideoSessionCreateInfoKHR::pVideoProfile the bound video session was created with

  • VUID-vkCmdDecodeVideoKHR-queryType-08369
    If the bound video session was created with VK_VIDEO_SESSION_CREATE_INLINE_QUERIES_BIT_KHR, and the queryType used to create the queryPool specified in the VkVideoInlineQueryInfoKHR structure included in the pNext chain of pDecodeInfo is VK_QUERY_TYPE_RESULT_STATUS_ONLY_KHR, then the VkCommandPool that commandBuffer was allocated from must have been created with a queue family index that supports result status queries, as indicated by VkQueueFamilyQueryResultStatusPropertiesKHR::queryResultStatusSupport

  • VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07135
    pDecodeInfo->srcBuffer must be compatible with the video profile the bound video session was created with

  • VUID-vkCmdDecodeVideoKHR-commandBuffer-07136
    If commandBuffer is an unprotected command buffer and protectedNoFault is not supported, then pDecodeInfo->srcBuffer must not be a protected buffer

  • VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07138
    pDecodeInfo->srcBufferOffset must be an integer multiple of VkVideoCapabilitiesKHR::minBitstreamBufferOffsetAlignment, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with

  • VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07139
    pDecodeInfo->srcBufferRange must be an integer multiple of VkVideoCapabilitiesKHR::minBitstreamBufferSizeAlignment, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with

  • VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07140
    If pDecodeInfo->pSetupReferenceSlot is not NULL and VkVideoDecodeCapabilitiesKHR::flags does not include VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_COINCIDE_BIT_KHR, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with, then the video picture resources specified by pDecodeInfo->dstPictureResource and pDecodeInfo->pSetupReferenceSlot→pPictureResource must not match

  • VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07141
    If pDecodeInfo->pSetupReferenceSlot is not NULL and none of the following is true:

    then the video picture resources specified by pDecodeInfo->dstPictureResource and pDecodeInfo->pSetupReferenceSlot→pPictureResource must match

  • VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07142
    pDecodeInfo->dstPictureResource.imageViewBinding must be compatible with the video profile the bound video session was created with

  • VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07143
    The format of pDecodeInfo->dstPictureResource.imageViewBinding must match the VkVideoSessionCreateInfoKHR::pictureFormat the bound video session was created with

  • VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07144
    pDecodeInfo->dstPictureResource.codedOffset must be an integer multiple of codedOffsetGranularity

  • VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07145
    pDecodeInfo->dstPictureResource.codedExtent must be between minCodedExtent and maxCodedExtent, inclusive, the bound video session was created with

  • VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07146
    pDecodeInfo->dstPictureResource.imageViewBinding must have been created with VK_IMAGE_USAGE_VIDEO_DECODE_DST_BIT_KHR

  • VUID-vkCmdDecodeVideoKHR-commandBuffer-07147
    If commandBuffer is an unprotected command buffer and protectedNoFault is not supported, then pDecodeInfo->dstPictureResource.imageViewBinding must not have been created from a protected image

  • VUID-vkCmdDecodeVideoKHR-commandBuffer-07148
    If commandBuffer is a protected command buffer and protectedNoFault is not supported, then pDecodeInfo->dstPictureResource.imageViewBinding must have been created from a protected image

  • VUID-vkCmdDecodeVideoKHR-pDecodeInfo-08376
    pDecodeInfo->pSetupReferenceSlot must not be NULL unless the bound video session was created with VkVideoSessionCreateInfoKHR::maxDpbSlots equal to zero

  • VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07170
    If pDecodeInfo->pSetupReferenceSlot is not NULL, then pDecodeInfo->pSetupReferenceSlot→slotIndex must be less than the VkVideoSessionCreateInfoKHR::maxDpbSlots specified when the bound video session was created

  • VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07173
    If pDecodeInfo->pSetupReferenceSlot is not NULL, then pDecodeInfo->pSetupReferenceSlot→pPictureResource→codedOffset must be an integer multiple of codedOffsetGranularity

  • VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07149
    If pDecodeInfo->pSetupReferenceSlot is not NULL, then pDecodeInfo->pSetupReferenceSlot→pPictureResource must match one of the bound reference picture resource

  • VUID-vkCmdDecodeVideoKHR-activeReferencePictureCount-07150
    activeReferencePictureCount must be less than or equal to the VkVideoSessionCreateInfoKHR::maxActiveReferencePictures specified when the bound video session was created

  • VUID-vkCmdDecodeVideoKHR-slotIndex-07256
    The slotIndex member of each element of pDecodeInfo->pReferenceSlots must be less than the VkVideoSessionCreateInfoKHR::maxDpbSlots specified when the bound video session was created

  • VUID-vkCmdDecodeVideoKHR-codedOffset-07257
    The codedOffset member of the VkVideoPictureResourceInfoKHR structure pointed to by the pPictureResource member of each element of pDecodeInfo->pReferenceSlots must be an integer multiple of codedOffsetGranularity

  • VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07151
    The pPictureResource member of each element of pDecodeInfo->pReferenceSlots must match one of the bound reference picture resource associated with the DPB slot index specified in the slotIndex member of that element

  • VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07264
    Each video picture resource corresponding to the pPictureResource member specified in the elements of pDecodeInfo->pReferenceSlots must be unique within pDecodeInfo->pReferenceSlots

  • VUID-vkCmdDecodeVideoKHR-dpbFrameUseCount-07176
    All elements of dpbFrameUseCount must be less than or equal to 1

  • VUID-vkCmdDecodeVideoKHR-dpbTopFieldUseCount-07177
    All elements of dpbTopFieldUseCount must be less than or equal to 1

  • VUID-vkCmdDecodeVideoKHR-dpbBottomFieldUseCount-07178
    All elements of dpbBottomFieldUseCount must be less than or equal to 1

  • VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07252
    If pDecodeInfo->pSetupReferenceSlot is NULL or pDecodeInfo->pSetupReferenceSlot→pPictureResource does not refer to the same image subresource as pDecodeInfo->dstPictureResource, then the image subresource referred to by pDecodeInfo->dstPictureResource must be in the VK_IMAGE_LAYOUT_VIDEO_DECODE_DST_KHR layout at the time the video decode operation is executed on the device

  • VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07253
    If pDecodeInfo->pSetupReferenceSlot is not NULL and pDecodeInfo->pSetupReferenceSlot→pPictureResource refers to the same image subresource as pDecodeInfo->dstPictureResource, then the image subresource referred to by pDecodeInfo->dstPictureResource must be in the VK_IMAGE_LAYOUT_VIDEO_DECODE_DPB_KHR layout at the time the video decode operation is executed on the device

  • VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07254
    If pDecodeInfo->pSetupReferenceSlot is not NULL, then the image subresource referred to by pDecodeInfo->pSetupReferenceSlot→pPictureResource must be in the VK_IMAGE_LAYOUT_VIDEO_DECODE_DPB_KHR layout at the time the video decode operation is executed on the device

  • VUID-vkCmdDecodeVideoKHR-pPictureResource-07255
    The image subresource referred to by the pPictureResource member of each element of pDecodeInfo->pReferenceSlots must be in the VK_IMAGE_LAYOUT_VIDEO_DECODE_DPB_KHR layout at the time the video decode operation is executed on the device

  • VUID-vkCmdDecodeVideoKHR-pNext-07152
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR, then the pNext chain of pDecodeInfo must include a VkVideoDecodeH264PictureInfoKHR structure

  • VUID-vkCmdDecodeVideoKHR-None-07258
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR but was not created with interlaced frame support, then the decode output picture must represent a frame

  • VUID-vkCmdDecodeVideoKHR-pSliceOffsets-07153
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR, then all elements of the pSliceOffsets member of the VkVideoDecodeH264PictureInfoKHR structure included in the pNext chain of pDecodeInfo must be less than pDecodeInfo->srcBufferRange

  • VUID-vkCmdDecodeVideoKHR-StdVideoH264SequenceParameterSet-07154
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR, then the bound video session parameters object must contain a StdVideoH264SequenceParameterSet entry with seq_parameter_set_id matching StdVideoDecodeH264PictureInfo::seq_parameter_set_id that is provided in the pStdPictureInfo member of the VkVideoDecodeH264PictureInfoKHR structure included in the pNext chain of pDecodeInfo

  • VUID-vkCmdDecodeVideoKHR-StdVideoH264PictureParameterSet-07155
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR, then the bound video session parameters object must contain a StdVideoH264PictureParameterSet entry with seq_parameter_set_id and pic_parameter_set_id matching StdVideoDecodeH264PictureInfo::seq_parameter_set_id and StdVideoDecodeH264PictureInfo::pic_parameter_set_id, respectively, that are provided in the pStdPictureInfo member of the VkVideoDecodeH264PictureInfoKHR structure included in the pNext chain of pDecodeInfo

  • VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07156
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR and pDecodeInfo->pSetupReferenceSlot is not NULL, then the pNext chain of pDecodeInfo->pSetupReferenceSlot must include a VkVideoDecodeH264DpbSlotInfoKHR structure

  • VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07259
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR but was not created with interlaced frame support, and pDecodeInfo->pSetupReferenceSlot is not NULL, then the reconstructed picture must represent a frame

  • VUID-vkCmdDecodeVideoKHR-pNext-07157
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR, then the pNext chain of each element of pDecodeInfo->pReferenceSlots must include a VkVideoDecodeH264DpbSlotInfoKHR structure

  • VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07260
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR but was not created with interlaced frame support, then each active reference picture corresponding to the elements of pDecodeInfo->pReferenceSlots must represent a frame

  • VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07261
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR, pDecodeInfo->pSetupReferenceSlot is not NULL, and the decode output picture represents a frame, then the reconstructed picture must also represent a frame

  • VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07262
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR, pDecodeInfo->pSetupReferenceSlot is not NULL, and the decode output picture represents a top field, then the reconstructed picture must also represent a top field

  • VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07263
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR, pDecodeInfo->pSetupReferenceSlot is not NULL, and the decode output picture represents a bottom field, then the reconstructed picture must also represent a bottom field

  • VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07266
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR and an active reference picture corresponding to any element of pDecodeInfo->pReferenceSlots represents a frame, then the DPB slot index of the bound video session specified by the slotIndex member of that element must be currently associated with a frame picture matching the video picture resource specified by the pPictureResource member of the same element at the time the command is executed on the device

  • VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07267
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR and an active reference picture corresponding to any element of pDecodeInfo->pReferenceSlots represents a top field, then the DPB slot index of the bound video session specified by the slotIndex member of that element must be currently associated with a top field picture matching the video picture resource specified by the pPictureResource member of the same element at the time the command is executed on the device

  • VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07268
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR and an active reference picture corresponding to any element of pDecodeInfo->pReferenceSlots represents a bottom field, then the DPB slot index of the bound video session specified by the slotIndex member of that element must be currently associated with a bottom field picture matching the video picture resource specified by the pPictureResource member of the same element at the time the command is executed on the device

  • VUID-vkCmdDecodeVideoKHR-pNext-07158
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR, then the pNext chain of pDecodeInfo must include a VkVideoDecodeH265PictureInfoKHR structure

  • VUID-vkCmdDecodeVideoKHR-pSliceSegmentOffsets-07159
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR, then all elements of the pSliceSegmentOffsets member of the VkVideoDecodeH265PictureInfoKHR structure included in the pNext chain of pDecodeInfo must be less than pDecodeInfo->srcBufferRange

  • VUID-vkCmdDecodeVideoKHR-StdVideoH265VideoParameterSet-07160
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR, then the bound video session parameters object must contain a StdVideoH265VideoParameterSet entry with vps_video_parameter_set_id matching StdVideoDecodeH265PictureInfo::sps_video_parameter_set_id that is provided in the pStdPictureInfo member of the VkVideoDecodeH265PictureInfoKHR structure included in the pNext chain of pDecodeInfo

  • VUID-vkCmdDecodeVideoKHR-StdVideoH265SequenceParameterSet-07161
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR, then the bound video session parameters object must contain a StdVideoH265SequenceParameterSet entry with sps_video_parameter_set_id and sps_seq_parameter_set_id matching StdVideoDecodeH265PictureInfo::sps_video_parameter_set_id and StdVideoDecodeH265PictureInfo::pps_seq_parameter_set_id, respectively, that are provided in the pStdPictureInfo member of the VkVideoDecodeH265PictureInfoKHR structure included in the pNext chain of pDecodeInfo

  • VUID-vkCmdDecodeVideoKHR-StdVideoH265PictureParameterSet-07162
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR, then the bound video session parameters object must contain a StdVideoH265PictureParameterSet entry with sps_video_parameter_set_id, pps_seq_parameter_set_id, and pps_pic_parameter_set_id matching StdVideoDecodeH265PictureInfo::sps_video_parameter_set_id, StdVideoDecodeH265PictureInfo::pps_seq_parameter_set_id, and StdVideoDecodeH265PictureInfo::pps_pic_parameter_set_id, respectively, that are provided in the pStdPictureInfo member of the VkVideoDecodeH265PictureInfoKHR structure included in the pNext chain of pDecodeInfo

  • VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07163
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR and pDecodeInfo->pSetupReferenceSlot is not NULL, then the pNext chain of pDecodeInfo->pSetupReferenceSlot must include a VkVideoDecodeH265DpbSlotInfoKHR structure

  • VUID-vkCmdDecodeVideoKHR-pNext-07164
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR, then the pNext chain of each element of pDecodeInfo->pReferenceSlots must include a VkVideoDecodeH265DpbSlotInfoKHR structure

  • VUID-vkCmdDecodeVideoKHR-filmGrainSupport-09248
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_AV1_BIT_KHR and VkVideoDecodeAV1ProfileInfoKHR::filmGrainSupport set to VK_FALSE, then film grain must not be enabled for the decoded picture

  • VUID-vkCmdDecodeVideoKHR-pDecodeInfo-09249
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_AV1_BIT_KHR, pDecodeInfo->pSetupReferenceSlot is not NULL, and film grain is enabled for the decoded picture, then the video picture resources specified by pDecodeInfo->dstPictureResource and pDecodeInfo->pSetupReferenceSlot→pPictureResource must not match

  • VUID-vkCmdDecodeVideoKHR-pNext-09250
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_AV1_BIT_KHR, then the pNext chain of pDecodeInfo must include a VkVideoDecodeAV1PictureInfoKHR structure

  • VUID-vkCmdDecodeVideoKHR-frameHeaderOffset-09251
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_AV1_BIT_KHR, then the frameHeaderOffset member of the VkVideoDecodeAV1PictureInfoKHR structure included in the pNext chain of pDecodeInfo must be less than the minimum of pDecodeInfo->srcBufferRange

  • VUID-vkCmdDecodeVideoKHR-pTileOffsets-09253
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_AV1_BIT_KHR, then all elements of the pTileOffsets member of the VkVideoDecodeAV1PictureInfoKHR structure included in the pNext chain of pDecodeInfo must be less than pDecodeInfo->srcBufferRange

  • VUID-vkCmdDecodeVideoKHR-pTileOffsets-09252
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_AV1_BIT_KHR, then for each element i of the pTileOffsets and pTileSizes members of the VkVideoDecodeAV1PictureInfoKHR structure included in the pNext chain of pDecodeInfo the sum of pTileOffsets[i] and pTileSizes[i] must be less than or equal to pDecodeInfo->srcBufferRange

  • VUID-vkCmdDecodeVideoKHR-pDecodeInfo-09254
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_AV1_BIT_KHR and pDecodeInfo->pSetupReferenceSlot is not NULL, then the pNext chain of pDecodeInfo->pSetupReferenceSlot must include a VkVideoDecodeAV1DpbSlotInfoKHR structure

  • VUID-vkCmdDecodeVideoKHR-pNext-09255
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_AV1_BIT_KHR, then the pNext chain of each element of pDecodeInfo->pReferenceSlots must include a VkVideoDecodeAV1DpbSlotInfoKHR structure

  • VUID-vkCmdDecodeVideoKHR-referenceNameSlotIndices-09262
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_AV1_BIT_KHR, then each element of the referenceNameSlotIndices array member of the VkVideoDecodeAV1PictureInfoKHR structure included in the pNext chain of pDecodeInfo must either be negative or must equal the slotIndex member of one of the elements of pDecodeInfo->pReferenceSlots

  • VUID-vkCmdDecodeVideoKHR-slotIndex-09263
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_AV1_BIT_KHR, then the slotIndex member of each element of pDecodeInfo->pReferenceSlots must equal one of the elements of the referenceNameSlotIndices array member of the VkVideoDecodeAV1PictureInfoKHR structure included in the pNext chain of pDecodeInfo

Valid Usage (Implicit)
  • VUID-vkCmdDecodeVideoKHR-commandBuffer-parameter
    commandBuffer must be a valid VkCommandBuffer handle

  • VUID-vkCmdDecodeVideoKHR-pDecodeInfo-parameter
    pDecodeInfo must be a valid pointer to a valid VkVideoDecodeInfoKHR structure

  • VUID-vkCmdDecodeVideoKHR-commandBuffer-recording
    commandBuffer must be in the recording state

  • VUID-vkCmdDecodeVideoKHR-commandBuffer-cmdpool
    The VkCommandPool that commandBuffer was allocated from must support decode operations

  • VUID-vkCmdDecodeVideoKHR-renderpass
    This command must only be called outside of a render pass instance

  • VUID-vkCmdDecodeVideoKHR-videocoding
    This command must only be called inside of a video coding scope

  • VUID-vkCmdDecodeVideoKHR-bufferlevel
    commandBuffer must be a primary VkCommandBuffer

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
Command Buffer Levels Render Pass Scope Video Coding Scope Supported Queue Types Command Type

Primary

Outside

Inside

Decode

Action

The VkVideoDecodeInfoKHR structure is defined as:

// Provided by VK_KHR_video_decode_queue
typedef struct VkVideoDecodeInfoKHR {
    VkStructureType                       sType;
    const void*                           pNext;
    VkVideoDecodeFlagsKHR                 flags;
    VkBuffer                              srcBuffer;
    VkDeviceSize                          srcBufferOffset;
    VkDeviceSize                          srcBufferRange;
    VkVideoPictureResourceInfoKHR         dstPictureResource;
    const VkVideoReferenceSlotInfoKHR*    pSetupReferenceSlot;
    uint32_t                              referenceSlotCount;
    const VkVideoReferenceSlotInfoKHR*    pReferenceSlots;
} VkVideoDecodeInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • flags is reserved for future use.

  • srcBuffer is the source video bitstream buffer to read the encoded bitstream from.

  • srcBufferOffset is the starting offset in bytes from the start of srcBuffer to read the encoded bitstream from.

  • srcBufferRange is the size in bytes of the encoded bitstream to decode from srcBuffer, starting from srcBufferOffset.

  • dstPictureResource is the video picture resource to use as the decode output picture.

  • pSetupReferenceSlot is NULL or a pointer to a VkVideoReferenceSlotInfoKHR structure specifying the reconstructed picture information.

  • referenceSlotCount is the number of elements in the pReferenceSlots array.

  • pReferenceSlots is NULL or a pointer to an array of VkVideoReferenceSlotInfoKHR structures describing the DPB slots and corresponding reference picture resources to use in this video decode operation (the set of active reference pictures).

Valid Usage
  • VUID-VkVideoDecodeInfoKHR-srcBuffer-07165
    srcBuffer must have been created with VK_BUFFER_USAGE_VIDEO_DECODE_SRC_BIT_KHR set

  • VUID-VkVideoDecodeInfoKHR-srcBufferOffset-07166
    srcBufferOffset must be less than the size of srcBuffer

  • VUID-VkVideoDecodeInfoKHR-srcBufferRange-07167
    srcBufferRange must be less than or equal to the size of srcBuffer minus srcBufferOffset

  • VUID-VkVideoDecodeInfoKHR-pSetupReferenceSlot-07168
    If pSetupReferenceSlot is not NULL, then its slotIndex member must not be negative

  • VUID-VkVideoDecodeInfoKHR-pSetupReferenceSlot-07169
    If pSetupReferenceSlot is not NULL, then its pPictureResource must not be NULL

  • VUID-VkVideoDecodeInfoKHR-slotIndex-07171
    The slotIndex member of each element of pReferenceSlots must not be negative

  • VUID-VkVideoDecodeInfoKHR-pPictureResource-07172
    The pPictureResource member of each element of pReferenceSlots must not be NULL

Valid Usage (Implicit)
  • VUID-VkVideoDecodeInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_DECODE_INFO_KHR

  • VUID-VkVideoDecodeInfoKHR-pNext-pNext
    Each pNext member of any structure (including this one) in the pNext chain must be either NULL or a pointer to a valid instance of VkVideoDecodeAV1PictureInfoKHR, VkVideoDecodeH264PictureInfoKHR, VkVideoDecodeH265PictureInfoKHR, or VkVideoInlineQueryInfoKHR

  • VUID-VkVideoDecodeInfoKHR-sType-unique
    The sType value of each struct in the pNext chain must be unique

  • VUID-VkVideoDecodeInfoKHR-flags-zerobitmask
    flags must be 0

  • VUID-VkVideoDecodeInfoKHR-srcBuffer-parameter
    srcBuffer must be a valid VkBuffer handle

  • VUID-VkVideoDecodeInfoKHR-dstPictureResource-parameter
    dstPictureResource must be a valid VkVideoPictureResourceInfoKHR structure

  • VUID-VkVideoDecodeInfoKHR-pSetupReferenceSlot-parameter
    If pSetupReferenceSlot is not NULL, pSetupReferenceSlot must be a valid pointer to a valid VkVideoReferenceSlotInfoKHR structure

  • VUID-VkVideoDecodeInfoKHR-pReferenceSlots-parameter
    If referenceSlotCount is not 0, pReferenceSlots must be a valid pointer to an array of referenceSlotCount valid VkVideoReferenceSlotInfoKHR structures

// Provided by VK_KHR_video_decode_queue
typedef VkFlags VkVideoDecodeFlagsKHR;

VkVideoDecodeFlagsKHR is a bitmask type for setting a mask, but is currently reserved for future use.

H.264 Decode Operations

Video decode operations using an H.264 decode profile can be used to decode elementary video stream sequences compliant to the ITU-T H.264 Specification.

Refer to the Preamble for information on how the Khronos Intellectual Property Rights Policy relates to normative references to external materials not created by Khronos.

This process is performed according to the video decode operation steps with the codec-specific semantics defined in section 8 of the ITU-T H.264 Specification as follows:

If the parameters and the bitstream adhere to the syntactic and semantic requirements defined in the corresponding sections of the ITU-T H.264 Specification, as described above, and the DPB slots associated with the active reference pictures all refer to valid picture references, then the video decode operation will complete successfully. Otherwise, the video decode operation may complete unsuccessfully.

H.264 Decode Bitstream Data Access

If the target decode output picture is a frame, then the video bitstream buffer range should contain a VCL NAL unit comprised of the slice headers and data of a picture representing an entire frame, as defined in sections 7.3.3 and 7.3.4, and this data is interpreted as defined in sections 7.4.3 and 7.4.4 of the ITU-T H.264 Specification, respectively.

If the target decode output picture is a field, then the video bitstream buffer range should contain a VCL NAL unit comprised of the slice headers and data of a picture representing a field, as defined in sections 7.3.3 and 7.3.4, and this data is interpreted as defined in sections 7.4.3 and 7.4.4 of the ITU-T H.264 Specification, respectively.

The offsets provided in VkVideoDecodeH264PictureInfoKHR::pSliceOffsets should specify the starting offsets corresponding to each slice header within the video bitstream buffer range.

H.264 Decode Picture Data Access

The effective imageOffset and imageExtent corresponding to a decode output picture, reference picture, or reconstructed picture used in video decode operations with an H.264 decode profile are defined as follows:

  • imageOffset is (codedOffset.x,codedOffset.y) and imageExtent is (codedExtent.width, codedExtent.height), if the picture represents a frame.

  • imageOffset is (codedOffset.x,codedOffset.y) and imageExtent is (codedExtent.width, codedExtent.height), if the picture represents a field and the picture layout of the used H.264 decode profile is VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_INTERLEAVED_LINES_BIT_KHR.

  • imageOffset is (codedOffset.x,codedOffset.y) and imageExtent is (codedExtent.width, codedExtent.height / 2), if the picture represents a field and the picture layout of the used H.264 decode profile is VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_SEPARATE_PLANES_BIT_KHR.

Where codedOffset and codedExtent are the members of the VkVideoPictureResourceInfoKHR structure corresponding to the picture.

However, accesses to image data within a video picture resource happen at the granularity indicated by VkVideoCapabilitiesKHR::pictureAccessGranularity, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the used video profile. This means that the complete image subregion accessed by video coding operations using an H.264 decode profile for the video picture resource is defined as the set of texels within the coordinate range:

([startX,endX), [startY,endY))

Where:

  • startX equals imageOffset.x rounded down to the nearest integer multiple of pictureAccessGranularity.width;

  • endX equals imageOffset.x + imageExtent.width rounded up to the nearest integer multiple of pictureAccessGranularity.width and clamped to the width of the image subresource referred to by the corresponding VkVideoPictureResourceInfoKHR structure;

  • startY equals imageOffset.y rounded down to the nearest integer multiple of pictureAccessGranularity.height;

  • endY equals imageOffset.y + imageExtent.height rounded up to the nearest integer multiple of pictureAccessGranularity.height and clamped to the height of the image subresource referred to by the corresponding VkVideoPictureResourceInfoKHR structure.

In case of video decode operations using an H.264 decode profile, any access to a picture at the coordinates (x,y), as defined by the ITU-T H.264 Specification, is an access to the image subresource referred to by the corresponding VkVideoPictureResourceInfoKHR structure at the texel coordinates specified below:

  • (x,y), if the accessed picture represents a frame.

  • (x,y × 2), if the accessed picture represents a top field and the picture layout of the used H.264 decode profile is VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_INTERLEAVED_LINES_BIT_KHR.

  • (x,y × 2 + 1), if the accessed picture represents a bottom field and the picture layout of the used H.264 decode profile is VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_INTERLEAVED_LINES_BIT_KHR.

  • (x,y), if the accessed picture represents a top field and the picture layout of the used H.264 decode profile is VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_SEPARATE_PLANES_BIT_KHR.

  • (codedOffset.x + x,codedOffset.y + y), if the accessed picture represents a bottom field and the picture layout of the used H.264 decode profile is VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_SEPARATE_PLANES_BIT_KHR.

Where codedOffset is the member of the corresponding VkVideoPictureResourceInfoKHR structure.

H.264 Decode Profile

A video profile supporting H.264 video decode operations is specified by setting VkVideoProfileInfoKHR::videoCodecOperation to VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR and adding a VkVideoDecodeH264ProfileInfoKHR structure to the VkVideoProfileInfoKHR::pNext chain.

The VkVideoDecodeH264ProfileInfoKHR structure is defined as:

// Provided by VK_KHR_video_decode_h264
typedef struct VkVideoDecodeH264ProfileInfoKHR {
    VkStructureType                              sType;
    const void*                                  pNext;
    StdVideoH264ProfileIdc                       stdProfileIdc;
    VkVideoDecodeH264PictureLayoutFlagBitsKHR    pictureLayout;
} VkVideoDecodeH264ProfileInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • stdProfileIdc is a StdVideoH264ProfileIdc value specifying the H.264 codec profile IDC, as defined in section A.2 of the ITU-T H.264 Specification.

  • pictureLayout is a VkVideoDecodeH264PictureLayoutFlagBitsKHR value specifying the picture layout used by the H.264 video sequence to be decoded.

Valid Usage (Implicit)
  • VUID-VkVideoDecodeH264ProfileInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_DECODE_H264_PROFILE_INFO_KHR

  • VUID-VkVideoDecodeH264ProfileInfoKHR-pictureLayout-parameter
    If pictureLayout is not 0, pictureLayout must be a valid VkVideoDecodeH264PictureLayoutFlagBitsKHR value

The H.264 video decode picture layout flags are defined as follows:

// Provided by VK_KHR_video_decode_h264
typedef enum VkVideoDecodeH264PictureLayoutFlagBitsKHR {
    VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_PROGRESSIVE_KHR = 0,
    VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_INTERLEAVED_LINES_BIT_KHR = 0x00000001,
    VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_SEPARATE_PLANES_BIT_KHR = 0x00000002,
} VkVideoDecodeH264PictureLayoutFlagBitsKHR;
  • VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_PROGRESSIVE_KHR specifies support for progressive content. This flag has the value 0.

  • VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_INTERLEAVED_LINES_BIT_KHR specifies support for or use of a picture layout for interlaced content where all lines belonging to the top field are decoded to the even-numbered lines within the picture resource, and all lines belonging to the bottom field are decoded to the odd-numbered lines within the picture resource.

  • VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_SEPARATE_PLANES_BIT_KHR specifies support for or use of a picture layout for interlaced content where all lines belonging to a field are grouped together in a single image subregion, and the two fields comprising the frame can be stored in separate image subregions of the same image subresource or in separate image subresources.

// Provided by VK_KHR_video_decode_h264
typedef VkFlags VkVideoDecodeH264PictureLayoutFlagsKHR;

VkVideoDecodeH264PictureLayoutFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoDecodeH264PictureLayoutFlagBitsKHR.

H.264 Decode Capabilities

When calling vkGetPhysicalDeviceVideoCapabilitiesKHR to query the capabilities for an H.264 decode profile, the VkVideoCapabilitiesKHR::pNext chain must include a VkVideoDecodeH264CapabilitiesKHR structure that will be filled with the profile-specific capabilities.

The VkVideoDecodeH264CapabilitiesKHR structure is defined as:

// Provided by VK_KHR_video_decode_h264
typedef struct VkVideoDecodeH264CapabilitiesKHR {
    VkStructureType         sType;
    void*                   pNext;
    StdVideoH264LevelIdc    maxLevelIdc;
    VkOffset2D              fieldOffsetGranularity;
} VkVideoDecodeH264CapabilitiesKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • maxLevelIdc is a StdVideoH264LevelIdc value indicating the maximum H.264 level supported by the profile, where enum constant STD_VIDEO_H264_LEVEL_IDC_<major>_<minor> identifies H.264 level <major>.<minor> as defined in section A.3 of the ITU-T H.264 Specification.

  • fieldOffsetGranularity is the minimum alignment for VkVideoPictureResourceInfoKHR::codedOffset specified for a video picture resource when using the picture layout VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_SEPARATE_PLANES_BIT_KHR.

Valid Usage (Implicit)
  • VUID-VkVideoDecodeH264CapabilitiesKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_DECODE_H264_CAPABILITIES_KHR

H.264 Decode Parameter Sets

Video session parameters objects created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR can contain the following types of parameters:

H.264 Sequence Parameter Sets (SPS)

Represented by StdVideoH264SequenceParameterSet structures and interpreted as follows:

  • reserved1 and reserved2 are used only for padding purposes and are otherwise ignored;

  • seq_parameter_set_id is used as the key of the SPS entry;

  • level_idc is one of the enum constants STD_VIDEO_H264_LEVEL_IDC_<major>_<minor> identifying the H.264 level <major>.<minor> as defined in section A.3 of the ITU-T H.264 Specification;

  • if flags.seq_scaling_matrix_present_flag is set, then the StdVideoH264ScalingLists structure pointed to by pScalingLists is interpreted as follows:

    • scaling_list_present_mask is a bitmask where bit index i corresponds to seq_scaling_list_present_flag[i] as defined in section 7.4.2.1 of the ITU-T H.264 Specification;

    • use_default_scaling_matrix_mask is a bitmask where bit index i corresponds to UseDefaultScalingMatrix4x4Flag[i], when i < 6, or corresponds to UseDefaultScalingMatrix8x8Flag[i-6], otherwise, as defined in section 7.3.2.1 of the ITU-T H.264 Specification;

    • ScalingList4x4 and ScalingList8x8 correspond to the identically named syntax elements defined in section 7.3.2.1 of the ITU-T H.264 Specification;

  • if flags.vui_parameters_present_flag is set, then pSequenceParameterSetVui is a pointer to a StdVideoH264SequenceParameterSetVui structure that is interpreted as follows:

    • reserved1 is used only for padding purposes and is otherwise ignored;

    • flags.color_description_present_flag is interpreted as the value of colour_description_present_flag, as defined in section E.2.1 of the ITU-T H.264 Specification;

      The name of colour_description_present_flag was misspelled in the Video Std header.

    • if flags.nal_hrd_parameters_present_flag or flags.vcl_hrd_parameters_present_flag is set, then the StdVideoH264HrdParameters structure pointed to by pHrdParameters is interpreted as follows:

      • reserved1 is used only for padding purposes and is otherwise ignored;

      • all other members of StdVideoH264HrdParameters are interpreted as defined in section E.2.2 of the ITU-T H.264 Specification;

    • all other members of StdVideoH264SequenceParameterSetVui are interpreted as defined in section E.2.1 of the ITU-T H.264 Specification;

  • all other members of StdVideoH264SequenceParameterSet are interpreted as defined in section 7.4.2.1 of the ITU-T H.264 Specification.

H.264 Picture Parameter Sets (PPS)

Represented by StdVideoH264PictureParameterSet structures and interpreted as follows:

  • the pair constructed from seq_parameter_set_id and pic_parameter_set_id is used as the key of the PPS entry;

  • if flags.pic_scaling_matrix_present_flag is set, then the StdVideoH264ScalingLists structure pointed to by pScalingLists is interpreted as follows:

    • scaling_list_present_mask is a bitmask where bit index i corresponds to pic_scaling_list_present_flag[i] as defined in section 7.4.2.2 of the ITU-T H.264 Specification;

    • use_default_scaling_matrix_mask is a bitmask where bit index i corresponds to UseDefaultScalingMatrix4x4Flag[i], when i < 6, or corresponds to UseDefaultScalingMatrix8x8Flag[i-6], otherwise, as defined in section 7.3.2.2 of the ITU-T H.264 Specification;

    • ScalingList4x4 and ScalingList8x8 correspond to the identically named syntax elements defined in section 7.3.2.2 of the ITU-T H.264 Specification;

  • all other members of StdVideoH264PictureParameterSet are interpreted as defined in section 7.4.2.2 of the ITU-T H.264 Specification.

When a video session parameters object is created with the codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR, the VkVideoSessionParametersCreateInfoKHR::pNext chain must include a VkVideoDecodeH264SessionParametersCreateInfoKHR structure specifying the capacity and initial contents of the object.

The VkVideoDecodeH264SessionParametersCreateInfoKHR structure is defined as:

// Provided by VK_KHR_video_decode_h264
typedef struct VkVideoDecodeH264SessionParametersCreateInfoKHR {
    VkStructureType                                        sType;
    const void*                                            pNext;
    uint32_t                                               maxStdSPSCount;
    uint32_t                                               maxStdPPSCount;
    const VkVideoDecodeH264SessionParametersAddInfoKHR*    pParametersAddInfo;
} VkVideoDecodeH264SessionParametersCreateInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • maxStdSPSCount is the maximum number of H.264 SPS entries the created VkVideoSessionParametersKHR can contain.

  • maxStdPPSCount is the maximum number of H.264 PPS entries the created VkVideoSessionParametersKHR can contain.

  • pParametersAddInfo is NULL or a pointer to a VkVideoDecodeH264SessionParametersAddInfoKHR structure specifying H.264 parameters to add upon object creation.

Valid Usage (Implicit)
  • VUID-VkVideoDecodeH264SessionParametersCreateInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_DECODE_H264_SESSION_PARAMETERS_CREATE_INFO_KHR

  • VUID-VkVideoDecodeH264SessionParametersCreateInfoKHR-pParametersAddInfo-parameter
    If pParametersAddInfo is not NULL, pParametersAddInfo must be a valid pointer to a valid VkVideoDecodeH264SessionParametersAddInfoKHR structure

The VkVideoDecodeH264SessionParametersAddInfoKHR structure is defined as:

// Provided by VK_KHR_video_decode_h264
typedef struct VkVideoDecodeH264SessionParametersAddInfoKHR {
    VkStructureType                            sType;
    const void*                                pNext;
    uint32_t                                   stdSPSCount;
    const StdVideoH264SequenceParameterSet*    pStdSPSs;
    uint32_t                                   stdPPSCount;
    const StdVideoH264PictureParameterSet*     pStdPPSs;
} VkVideoDecodeH264SessionParametersAddInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • stdSPSCount is the number of elements in the pStdSPSs array.

  • pStdSPSs is a pointer to an array of StdVideoH264SequenceParameterSet structures describing the H.264 SPS entries to add.

  • stdPPSCount is the number of elements in the pStdPPSs array.

  • pStdPPSs is a pointer to an array of StdVideoH264PictureParameterSet structures describing the H.264 PPS entries to add.

This structure can be specified in the following places:

Valid Usage
  • VUID-VkVideoDecodeH264SessionParametersAddInfoKHR-None-04825
    The seq_parameter_set_id member of each StdVideoH264SequenceParameterSet structure specified in the elements of pStdSPSs must be unique within pStdSPSs

  • VUID-VkVideoDecodeH264SessionParametersAddInfoKHR-None-04826
    The pair constructed from the seq_parameter_set_id and pic_parameter_set_id members of each StdVideoH264PictureParameterSet structure specified in the elements of pStdPPSs must be unique within pStdPPSs

Valid Usage (Implicit)
  • VUID-VkVideoDecodeH264SessionParametersAddInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_DECODE_H264_SESSION_PARAMETERS_ADD_INFO_KHR

  • VUID-VkVideoDecodeH264SessionParametersAddInfoKHR-pStdSPSs-parameter
    If stdSPSCount is not 0, pStdSPSs must be a valid pointer to an array of stdSPSCount StdVideoH264SequenceParameterSet values

  • VUID-VkVideoDecodeH264SessionParametersAddInfoKHR-pStdPPSs-parameter
    If stdPPSCount is not 0, pStdPPSs must be a valid pointer to an array of stdPPSCount StdVideoH264PictureParameterSet values

H.264 Decoding Parameters

The VkVideoDecodeH264PictureInfoKHR structure is defined as:

// Provided by VK_KHR_video_decode_h264
typedef struct VkVideoDecodeH264PictureInfoKHR {
    VkStructureType                         sType;
    const void*                             pNext;
    const StdVideoDecodeH264PictureInfo*    pStdPictureInfo;
    uint32_t                                sliceCount;
    const uint32_t*                         pSliceOffsets;
} VkVideoDecodeH264PictureInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • pStdPictureInfo is a pointer to a StdVideoDecodeH264PictureInfo structure specifying H.264 picture information.

  • sliceCount is the number of elements in pSliceOffsets.

  • pSliceOffsets is a pointer to an array of sliceCount offsets specifying the start offset of the slices of the picture within the video bitstream buffer range specified in VkVideoDecodeInfoKHR.

This structure is specified in the pNext chain of the VkVideoDecodeInfoKHR structure passed to vkCmdDecodeVideoKHR to specify the codec-specific picture information for an H.264 decode operation.

Decode Output Picture Information

When this structure is specified in the pNext chain of the VkVideoDecodeInfoKHR structure passed to vkCmdDecodeVideoKHR, the information related to the decode output picture is defined as follows:

  • If pStdPictureInfo->flags.field_pic_flag is not set, then the picture represents a frame.

  • If pStdPictureInfo->flags.field_pic_flag is set, then the picture represents a field. Specifically:

    • If pStdPictureInfo->flags.bottom_field_flag is not set, then the picture represents the top field of the frame.

    • If pStdPictureInfo->flags.bottom_field_flag is set, then the picture represents the bottom field of the frame.

  • The image subregion used is determined according to the H.264 Decode Picture Data Access section.

  • The decode output picture is associated with the H.264 picture information provided in pStdPictureInfo.

Std Picture Information

The members of the StdVideoDecodeH264PictureInfo structure pointed to by pStdPictureInfo are interpreted as follows:

  • reserved1 and reserved2 are used only for padding purposes and are otherwise ignored;

  • flags.is_intra as defined in section 3.73 of the ITU-T H.264 Specification;

  • flags.is_reference as defined in section 3.136 of the ITU-T H.264 Specification;

  • flags.complementary_field_pair as defined in section 3.35 of the ITU-T H.264 Specification;

  • seq_parameter_set_id and pic_parameter_set_id are used to identify the active parameter sets, as described below;

  • all other members are interpreted as defined in section 7.4.3 of the ITU-T H.264 Specification.

Reference picture setup is controlled by the value of StdVideoDecodeH264PictureInfo::flags.is_reference. If it is set and a reconstructed picture is specified, then the latter is used as the target of picture reconstruction to activate the DPB slot specified in pDecodeInfo->pSetupReferenceSlot→slotIndex. If StdVideoDecodeH264PictureInfo::flags.is_reference is not set, but a reconstructed picture is specified, then the corresponding picture reference associated with the DPB slot is invalidated, as described in the DPB Slot States section.

Active Parameter Sets

The members of the StdVideoDecodeH264PictureInfo structure pointed to by pStdPictureInfo are used to select the active parameter sets to use from the bound video session parameters object, as follows:

  • The active SPS is the SPS identified by the key specified in StdVideoDecodeH264PictureInfo::seq_parameter_set_id.

  • The active PPS is the PPS identified by the key specified by the pair constructed from StdVideoDecodeH264PictureInfo::seq_parameter_set_id and StdVideoDecodeH264PictureInfo::pic_parameter_set_id.

Valid Usage (Implicit)
  • VUID-VkVideoDecodeH264PictureInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_DECODE_H264_PICTURE_INFO_KHR

  • VUID-VkVideoDecodeH264PictureInfoKHR-pStdPictureInfo-parameter
    pStdPictureInfo must be a valid pointer to a valid StdVideoDecodeH264PictureInfo value

  • VUID-VkVideoDecodeH264PictureInfoKHR-pSliceOffsets-parameter
    pSliceOffsets must be a valid pointer to an array of sliceCount uint32_t values

  • VUID-VkVideoDecodeH264PictureInfoKHR-sliceCount-arraylength
    sliceCount must be greater than 0

The VkVideoDecodeH264DpbSlotInfoKHR structure is defined as:

// Provided by VK_KHR_video_decode_h264
typedef struct VkVideoDecodeH264DpbSlotInfoKHR {
    VkStructureType                           sType;
    const void*                               pNext;
    const StdVideoDecodeH264ReferenceInfo*    pStdReferenceInfo;
} VkVideoDecodeH264DpbSlotInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • pStdReferenceInfo is a pointer to a StdVideoDecodeH264ReferenceInfo structure specifying H.264 reference information.

This structure is specified in the pNext chain of VkVideoDecodeInfoKHR::pSetupReferenceSlot, if not NULL, and the pNext chain of the elements of VkVideoDecodeInfoKHR::pReferenceSlots to specify the codec-specific reference picture information for an H.264 decode operation.

Active Reference Picture Information

When this structure is specified in the pNext chain of the elements of VkVideoDecodeInfoKHR::pReferenceSlots, one or two elements are added to the list of active reference pictures used by the video decode operation for each element of VkVideoDecodeInfoKHR::pReferenceSlots as follows:

  • If neither pStdReferenceInfo->flags.top_field_flag nor pStdReferenceInfo->flags.bottom_field_flag is set, then the picture is added as a frame reference to the list of active reference pictures.

  • If pStdReferenceInfo->flags.top_field_flag is set, then the picture is added as a top field reference to the list of active reference pictures.

  • If pStdReferenceInfo->flags.bottom_field_flag is set, then the picture is added as a bottom field reference to the list of active reference pictures.

  • For each added reference picture, the corresponding image subregion used is determined according to the H.264 Decode Picture Data Access section.

  • Each added reference picture is associated with the DPB slot index specified in the slotIndex member of the corresponding element of VkVideoDecodeInfoKHR::pReferenceSlots.

  • Each added reference picture is associated with the H.264 reference information provided in pStdReferenceInfo.

When both the top and bottom field of an interlaced frame currently associated with a DPB slot is intended to be used as an active reference picture and both fields are stored in the same image subregion (which is the case when using VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_INTERLEAVED_LINES_BIT_KHR which stores the two fields at even and odd scanlines of the same image subregion), both references have to be provided through a single VkVideoReferenceSlotInfoKHR structure that has both flags.top_field_flag and flags.bottom_field_flag set in the StdVideoDecodeH264ReferenceInfo structure pointed to by the pStdReferenceInfo member of the VkVideoDecodeH264DpbSlotInfoKHR structure included in the corresponding VkVideoReferenceSlotInfoKHR structure’s pNext chain. However, this approach can only be used when both fields are stored in the same image subregion. If that is not the case (e.g. when using VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_SEPARATE_PLANES_BIT_KHR which requires separate codedOffset values for the two fields and also allows storing the two fields of a frame in separate image layers or entirely separate images), then a separate VkVideoReferenceSlotInfoKHR structure needs to be provided for referencing the two fields, each only setting one of flags.top_field_flag or flags.bottom_field_flag, and providing the appropriate video picture resource information in VkVideoReferenceSlotInfoKHR::pPictureResource.

Reconstructed Picture Information

When this structure is specified in the pNext chain of VkVideoDecodeInfoKHR::pSetupReferenceSlot, the information related to the reconstructed picture is defined as follows:

  • If neither pStdReferenceInfo->flags.top_field_flag nor pStdReferenceInfo->flags.bottom_field_flag is set, then the picture represents a frame.

  • If pStdReferenceInfo->flags.top_field_flag is set, then the picture represents a field, specifically, the top field of the frame.

  • If pStdReferenceInfo->flags.bottom_field_flag is set, then the picture represents a field, specifically, the bottom field of the frame.

  • The image subregion used is determined according to the H.264 Decode Picture Data Access section.

  • If reference picture setup is requested, then the reconstructed picture is used to activate the DPB slot with the index specified in VkVideoDecodeInfoKHR::pSetupReferenceSlot->slotIndex.

  • The reconstructed picture is associated with the H.264 reference information provided in pStdReferenceInfo.

Std Reference Information

The members of the StdVideoDecodeH264ReferenceInfo structure pointed to by pStdReferenceInfo are interpreted as follows:

  • flags.top_field_flag is used to indicate whether the reference is used as top field reference;

  • flags.bottom_field_flag is used to indicate whether the reference is used as bottom field reference;

  • flags.used_for_long_term_reference is used to indicate whether the picture is marked as “used for long-term reference” as defined in section 8.2.5.1 of the ITU-T H.264 Specification;

  • flags.is_non_existing is used to indicate whether the picture is marked as “non-existing” as defined in section 8.2.5.2 of the ITU-T H.264 Specification;

  • all other members are interpreted as defined in section 8.2 of the ITU-T H.264 Specification.

Valid Usage (Implicit)
  • VUID-VkVideoDecodeH264DpbSlotInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_DECODE_H264_DPB_SLOT_INFO_KHR

  • VUID-VkVideoDecodeH264DpbSlotInfoKHR-pStdReferenceInfo-parameter
    pStdReferenceInfo must be a valid pointer to a valid StdVideoDecodeH264ReferenceInfo value

H.264 Decode Requirements

This section describes the required H.264 decoding capabilities for physical devices that have at least one queue family that supports the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H264_BIT_KHR, as returned by vkGetPhysicalDeviceQueueFamilyProperties2 in VkQueueFamilyVideoPropertiesKHR::videoCodecOperations.

Table 1. Required Video Std Header Versions
Video Std Header Name Version

vulkan_video_codec_h264std_decode

1.0.0

Table 2. Required Video Capabilities
Video Capability Requirement Requirement Type1

VkVideoCapabilitiesKHR

flags

-

min

minBitstreamBufferOffsetAlignment

4096

max

minBitstreamBufferSizeAlignment

4096

max

pictureAccessGranularity

(64,64)

max

minCodedExtent

-

max

maxCodedExtent

-

min

maxDpbSlots

0

min

maxActiveReferencePictures

0

min

VkVideoDecodeCapabilitiesKHR

flags

VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_COINCIDE_BIT_KHR or VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_DISTINCT_BIT_KHR

min

VkVideoDecodeH264CapabilitiesKHR

maxLevelIdc

STD_VIDEO_H264_LEVEL_IDC_1_0

min

fieldOffsetGranularity

(0,0) except for profiles using VK_VIDEO_DECODE_H264_PICTURE_LAYOUT_INTERLACED_SEPARATE_PLANES_BIT_KHR

implementation-dependent

1

The Requirement Type column specifies the requirement is either the minimum value all implementations must support, the maximum value all implementations must support, or the exact value all implementations must support. For bitmasks a minimum value is the least bits all implementations must set, but they may have additional bits set beyond this minimum.

H.265 Decode Operations

Video decode operations using an H.265 decode profile can be used to decode elementary video stream sequences compliant to the ITU-T H.265 Specification.

Refer to the Preamble for information on how the Khronos Intellectual Property Rights Policy relates to normative references to external materials not created by Khronos.

This process is performed according to the video decode operation steps with the codec-specific semantics defined in section 8 of ITU-T H.265 Specification:

If the parameters and the bitstream adhere to the syntactic and semantic requirements defined in the corresponding sections of the ITU-T H.265 Specification, as described above, and the DPB slots associated with the active reference pictures all refer to valid picture references, then the video decode operation will complete successfully. Otherwise, the video decode operation may complete unsuccessfully.

H.265 Decode Bitstream Data Access

The video bitstream buffer range should contain a VCL NAL unit comprised of the slice segment headers and data of a picture representing a frame, as defined in sections 7.3.6 and 7.3.8, and this data is interpreted as defined in sections 7.4.7 and 7.4.9 of the ITU-T H.265 Specification, respectively.

The offsets provided in VkVideoDecodeH265PictureInfoKHR::pSliceSegmentOffsets should specify the starting offsets corresponding to each slice segment header within the video bitstream buffer range.

H.265 Decode Picture Data Access

Accesses to image data within a video picture resource happen at the granularity indicated by VkVideoCapabilitiesKHR::pictureAccessGranularity, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the used video profile. Accordingly, the complete image subregion of a decode output picture, reference picture, or reconstructed picture accessed by video coding operations using an H.265 decode profile is defined as the set of texels within the coordinate range:

([0,endX), [0,endY))

Where:

  • endX equals codedExtent.width rounded up to the nearest integer multiple of pictureAccessGranularity.width and clamped to the width of the image subresource referred to by the corresponding VkVideoPictureResourceInfoKHR structure;

  • endY equals codedExtent.height rounded up to the nearest integer multiple of pictureAccessGranularity.height and clamped to the height of the image subresource referred to by the corresponding VkVideoPictureResourceInfoKHR structure;

Where codedExtent is the member of the VkVideoPictureResourceInfoKHR structure corresponding to the picture.

In case of video decode operations using an H.265 decode profile, any access to a picture at the coordinates (x,y), as defined by the ITU-T H.265 Specification, is an access to the image subresource referred to by the corresponding VkVideoPictureResourceInfoKHR structure at the texel coordinates (x,y).

H.265 Decode Profile

A video profile supporting H.265 video decode operations is specified by setting VkVideoProfileInfoKHR::videoCodecOperation to VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR and adding a VkVideoDecodeH265ProfileInfoKHR structure to the VkVideoProfileInfoKHR::pNext chain.

The VkVideoDecodeH265ProfileInfoKHR structure is defined as:

// Provided by VK_KHR_video_decode_h265
typedef struct VkVideoDecodeH265ProfileInfoKHR {
    VkStructureType           sType;
    const void*               pNext;
    StdVideoH265ProfileIdc    stdProfileIdc;
} VkVideoDecodeH265ProfileInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • stdProfileIdc is a StdVideoH265ProfileIdc value specifying the H.265 codec profile IDC, as defined in section A.3 of the ITU-T H.265 Specification.

Valid Usage (Implicit)
  • VUID-VkVideoDecodeH265ProfileInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_DECODE_H265_PROFILE_INFO_KHR

H.265 Decode Capabilities

When calling vkGetPhysicalDeviceVideoCapabilitiesKHR to query the capabilities for an H.265 decode profile, the VkVideoCapabilitiesKHR::pNext chain must include a VkVideoDecodeH265CapabilitiesKHR structure that will be filled with the profile-specific capabilities.

The VkVideoDecodeH265CapabilitiesKHR structure is defined as:

// Provided by VK_KHR_video_decode_h265
typedef struct VkVideoDecodeH265CapabilitiesKHR {
    VkStructureType         sType;
    void*                   pNext;
    StdVideoH265LevelIdc    maxLevelIdc;
} VkVideoDecodeH265CapabilitiesKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • maxLevelIdc is a StdVideoH265LevelIdc value indicating the maximum H.265 level supported by the profile, where enum constant STD_VIDEO_H265_LEVEL_IDC_<major>_<minor> identifies H.265 level <major>.<minor> as defined in section A.4 of the ITU-T H.265 Specification.

Valid Usage (Implicit)
  • VUID-VkVideoDecodeH265CapabilitiesKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_DECODE_H265_CAPABILITIES_KHR

H.265 Decode Parameter Sets

Video session parameters objects created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR can contain the following types of parameters:

H.265 Video Parameter Sets (VPS)

Represented by StdVideoH265VideoParameterSet structures and interpreted as follows:

  • reserved1, reserved2, and reserved3 are used only for padding purposes and are otherwise ignored;

  • vps_video_parameter_set_id is used as the key of the VPS entry;

  • the max_latency_increase_plus1, max_dec_pic_buffering_minus1, and max_num_reorder_pics members of the StdVideoH265DecPicBufMgr structure pointed to by pDecPicBufMgr correspond to vps_max_latency_increase_plus1, vps_max_dec_pic_buffering_minus1, and vps_max_num_reorder_pics, respectively, as defined in section 7.4.3.1 of the ITU-T H.265 Specification;

  • the StdVideoH265HrdParameters structure pointed to by pHrdParameters is interpreted as follows:

    • reserved is used only for padding purposes and is otherwise ignored;

    • flags.fixed_pic_rate_general_flag is a bitmask where bit index i corresponds to fixed_pic_rate_general_flag[i] as defined in section E.3.2 of the ITU-T H.265 Specification;

    • flags.fixed_pic_rate_within_cvs_flag is a bitmask where bit index i corresponds to fixed_pic_rate_within_cvs_flag[i] as defined in section E.3.2 of the ITU-T H.265 Specification;

    • flags.low_delay_hrd_flag is a bitmask where bit index i corresponds to low_delay_hrd_flag[i] as defined in section E.3.2 of the ITU-T H.265 Specification;

    • if flags.nal_hrd_parameters_present_flag is set, then pSubLayerHrdParametersNal is a pointer to an array of vps_max_sub_layers_minus1 + 1 number of StdVideoH265SubLayerHrdParameters structures where vps_max_sub_layers_minus1 is the corresponding member of the encompassing StdVideoH265VideoParameterSet structure and each element is interpreted as follows:

      • cbr_flag is a bitmask where bit index i corresponds to cbr_flag[i] as defined in section E.3.3 of the ITU-T H.265 Specification;

      • all other members of the StdVideoH265SubLayerHrdParameters structure are interpreted as defined in section E.3.3 of the ITU-T H.265 Specification;

    • if flags.vcl_hrd_parameters_present_flag is set, then pSubLayerHrdParametersVcl is a pointer to an array of vps_max_sub_layers_minus1 + 1 number of StdVideoH265SubLayerHrdParameters structures where vps_max_sub_layers_minus1 is the corresponding member of the encompassing StdVideoH265VideoParameterSet structure and each element is interpreted as follows:

      • cbr_flag is a bitmask where bit index i corresponds to cbr_flag[i] as defined in section E.3.3 of the ITU-T H.265 Specification;

      • all other members of the StdVideoH265SubLayerHrdParameters structure are interpreted as defined in section E.3.3 of the ITU-T H.265 Specification;

    • all other members of StdVideoH265HrdParameters are interpreted as defined in section E.3.2 of the ITU-T H.265 Specification;

  • the StdVideoH265ProfileTierLevel structure pointed to by pProfileTierLevel are interpreted as follows:

    • general_level_idc is one of the enum constants STD_VIDEO_H265_LEVEL_IDC_<major>_<minor> identifying the H.265 level <major>.<minor> as defined in section A.4 of the ITU-T H.265 Specification;

    • all other members of StdVideoH265ProfileTierLevel are interpreted as defined in section 7.4.4 of the ITU-T H.265 Specification;

  • all other members of StdVideoH265VideoParameterSet are interpreted as defined in section 7.4.3.1 of the ITU-T H.265 Specification.

H.265 Sequence Parameter Sets (SPS)

Represented by StdVideoH265SequenceParameterSet structures and interpreted as follows:

  • reserved1 and reserved2 are used only for padding purposes and are otherwise ignored;

  • the pair constructed from sps_video_parameter_set_id and sps_seq_parameter_set_id is used as the key of the SPS entry;

  • the StdVideoH265ProfileTierLevel structure pointed to by pProfileTierLevel are interpreted as follows:

    • general_level_idc is one of the enum constants STD_VIDEO_H265_LEVEL_IDC_<major>_<minor> identifying the H.265 level <major>.<minor> as defined in section A.4 of the ITU-T H.265 Specification;

    • all other members of StdVideoH265ProfileTierLevel are interpreted as defined in section 7.4.4 of the ITU-T H.265 Specification;

  • the max_latency_increase_plus1, max_dec_pic_buffering_minus1, and max_num_reorder_pics members of the StdVideoH265DecPicBufMgr structure pointed to by pDecPicBufMgr correspond to sps_max_latency_increase_plus1, sps_max_dec_pic_buffering_minus1, and sps_max_num_reorder_pics, respectively, as defined in section 7.4.3.2 of the ITU-T H.265 Specification;

  • if flags.sps_scaling_list_data_present_flag is set, then the StdVideoH265ScalingLists structure pointed to by pScalingLists is interpreted as follows:

    • ScalingList4x4, ScalingList8x8, ScalingList16x16, and ScalingList32x32 correspond to ScalingList[0], ScalingList[1], ScalingList[2], and ScalingList[3], respectively, as defined in section 7.3.4 of the ITU-T H.265 Specification;

    • ScalingListDCCoef16x16 and ScalingListDCCoef32x32 correspond to scaling_list_dc_coef_minus8[0] and scaling_list_dc_coef_minus8[1], respectively, as defined in section 7.3.4 of the ITU-T H.265 Specification;

  • pShortTermRefPicSet is a pointer to an array of num_short_term_ref_pic_sets number of StdVideoH265ShortTermRefPicSet structures where each element is interpreted as follows:

    • reserved1, reserved2, and reserved3 are used only for padding purposes and are otherwise ignored;

    • used_by_curr_pic_flag is a bitmask where bit index i corresponds to used_by_curr_pic_flag[i] as defined in section 7.4.8 of the ITU-T H.265 Specification;

    • use_delta_flag is a bitmask where bit index i corresponds to use_delta_flag[i] as defined in section 7.4.8 of the ITU-T H.265 Specification;

    • used_by_curr_pic_s0_flag is a bitmask where bit index i corresponds to used_by_curr_pic_s0_flag[i] as defined in section 7.4.8 of the ITU-T H.265 Specification;

    • used_by_curr_pic_s1_flag is a bitmask where bit index i corresponds to used_by_curr_pic_s1_flag[i] as defined in section 7.4.8 of the ITU-T H.265 Specification;

    • all other members of StdVideoH265ShortTermRefPicSet are interpreted as defined in section 7.4.8 of the ITU-T H.265 Specification;

  • if flags.long_term_ref_pics_present_flag is set then the StdVideoH265LongTermRefPicsSps structure pointed to by pLongTermRefPicsSps is interpreted as follows:

    • used_by_curr_pic_lt_sps_flag is a bitmask where bit index i corresponds to used_by_curr_pic_lt_sps_flag[i] as defined in section 7.4.3.2 of the ITU-T H.265 Specification;

    • all other members of StdVideoH265LongTermRefPicsSps are interpreted as defined in section 7.4.3.2 of the ITU-T H.265 Specification;

  • if flags.vui_parameters_present_flag is set, then the StdVideoH265SequenceParameterSetVui structure pointed to by pSequenceParameterSetVui is interpreted as follows:

    • reserved1, reserved2, and reserved3 are used only for padding purposes and are otherwise ignored;

    • the StdVideoH265HrdParameters structure pointed to by pHrdParameters is interpreted as follows:

      • flags.fixed_pic_rate_general_flag is a bitmask where bit index i corresponds to fixed_pic_rate_general_flag[i] as defined in section E.3.2 of the ITU-T H.265 Specification;

      • flags.fixed_pic_rate_within_cvs_flag is a bitmask where bit index i corresponds to fixed_pic_rate_within_cvs_flag[i] as defined in section E.3.2 of the ITU-T H.265 Specification;

      • flags.low_delay_hrd_flag is a bitmask where bit index i corresponds to low_delay_hrd_flag[i] as defined in section E.3.2 of the ITU-T H.265 Specification;

      • if flags.nal_hrd_parameters_present_flag is set, then pSubLayerHrdParametersNal is a pointer to an array of sps_max_sub_layers_minus1 + 1 number of StdVideoH265SubLayerHrdParameters structures where sps_max_sub_layers_minus1 is the corresponding member of the encompassing StdVideoH265SequenceParameterSet structure and each element is interpreted as follows:

        • cbr_flag is a bitmask where bit index i corresponds to cbr_flag[i] as defined in section E.3.3 of the ITU-T H.265 Specification;

        • all other members of the StdVideoH265SubLayerHrdParameters structure are interpreted as defined in section E.3.3 of the ITU-T H.265 Specification;

      • if flags.vcl_hrd_parameters_present_flag is set, then pSubLayerHrdParametersVcl is a pointer to an array of sps_max_sub_layers_minus1 + 1 number of StdVideoH265SubLayerHrdParameters structures where sps_max_sub_layers_minus1 is the corresponding member of the encompassing StdVideoH265SequenceParameterSet structure and each element is interpreted as follows:

        • cbr_flag is a bitmask where bit index i corresponds to cbr_flag[i] as defined in section E.3.3 of the ITU-T H.265 Specification;

        • all other members of the StdVideoH265SubLayerHrdParameters structure are interpreted as defined in section E.3.3 of the ITU-T H.265 Specification;

      • all other members of StdVideoH265HrdParameters are interpreted as defined in section E.3.2 of the ITU-T H.265 Specification;

    • all other members of pSequenceParameterSetVui are interpreted as defined in section E.3.1 of the ITU-T H.265 Specification;

  • if flags.sps_palette_predictor_initializer_present_flag is set, then the PredictorPaletteEntries member of the StdVideoH265PredictorPaletteEntries structure pointed to by pPredictorPaletteEntries is interpreted as defined in section 7.4.9.13 of the ITU-T H.265 Specification;

  • all other members of StdVideoH265SequenceParameterSet are interpreted as defined in section 7.4.3.1 of the ITU-T H.265 Specification.

H.265 Picture Parameter Sets (PPS)

Represented by StdVideoH265PictureParameterSet structures and interpreted as follows:

  • reserved1, reserved2, and reserved3 are used only for padding purposes and are otherwise ignored;

  • the triplet constructed from sps_video_parameter_set_id, pps_seq_parameter_set_id, and pps_pic_parameter_set_id is used as the key of the PPS entry;

  • if flags.pps_scaling_list_data_present_flag is set, then the StdVideoH265ScalingLists structure pointed to by pScalingLists is interpreted as follows:

    • ScalingList4x4, ScalingList8x8, ScalingList16x16, and ScalingList32x32 correspond to ScalingList[0], ScalingList[1], ScalingList[2], and ScalingList[3], respectively, as defined in section 7.3.4 of the ITU-T H.265 Specification;

    • ScalingListDCCoef16x16 and ScalingListDCCoef32x32 correspond to scaling_list_dc_coef_minus8[0] and scaling_list_dc_coef_minus8[1], respectively, as defined in section 7.3.4 of the ITU-T H.265 Specification;

  • if flags.pps_palette_predictor_initializer_present_flag is set, then the PredictorPaletteEntries member of the StdVideoH265PredictorPaletteEntries structure pointed to by pPredictorPaletteEntries is interpreted as defined in section 7.4.9.13 of the ITU-T H.265 Specification;

  • all other members of StdVideoH265PictureParameterSet are interpreted as defined in section 7.4.3.3 of the ITU-T H.265 Specification.

When a video session parameters object is created with the codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR, the VkVideoSessionParametersCreateInfoKHR::pNext chain must include a VkVideoDecodeH265SessionParametersCreateInfoKHR structure specifying the capacity and initial contents of the object.

The VkVideoDecodeH265SessionParametersCreateInfoKHR structure is defined as:

// Provided by VK_KHR_video_decode_h265
typedef struct VkVideoDecodeH265SessionParametersCreateInfoKHR {
    VkStructureType                                        sType;
    const void*                                            pNext;
    uint32_t                                               maxStdVPSCount;
    uint32_t                                               maxStdSPSCount;
    uint32_t                                               maxStdPPSCount;
    const VkVideoDecodeH265SessionParametersAddInfoKHR*    pParametersAddInfo;
} VkVideoDecodeH265SessionParametersCreateInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • maxStdVPSCount is the maximum number of H.265 VPS entries the created VkVideoSessionParametersKHR can contain.

  • maxStdSPSCount is the maximum number of H.265 SPS entries the created VkVideoSessionParametersKHR can contain.

  • maxStdPPSCount is the maximum number of H.265 PPS entries the created VkVideoSessionParametersKHR can contain.

  • pParametersAddInfo is NULL or a pointer to a VkVideoDecodeH265SessionParametersAddInfoKHR structure specifying H.265 parameters to add upon object creation.

Valid Usage (Implicit)
  • VUID-VkVideoDecodeH265SessionParametersCreateInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_DECODE_H265_SESSION_PARAMETERS_CREATE_INFO_KHR

  • VUID-VkVideoDecodeH265SessionParametersCreateInfoKHR-pParametersAddInfo-parameter
    If pParametersAddInfo is not NULL, pParametersAddInfo must be a valid pointer to a valid VkVideoDecodeH265SessionParametersAddInfoKHR structure

The VkVideoDecodeH265SessionParametersAddInfoKHR structure is defined as:

// Provided by VK_KHR_video_decode_h265
typedef struct VkVideoDecodeH265SessionParametersAddInfoKHR {
    VkStructureType                            sType;
    const void*                                pNext;
    uint32_t                                   stdVPSCount;
    const StdVideoH265VideoParameterSet*       pStdVPSs;
    uint32_t                                   stdSPSCount;
    const StdVideoH265SequenceParameterSet*    pStdSPSs;
    uint32_t                                   stdPPSCount;
    const StdVideoH265PictureParameterSet*     pStdPPSs;
} VkVideoDecodeH265SessionParametersAddInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • stdVPSCount is the number of elements in the pStdVPSs array.

  • pStdVPSs is a pointer to an array of StdVideoH265VideoParameterSet structures describing the H.265 VPS entries to add.

  • stdSPSCount is the number of elements in the pStdSPSs array.

  • pStdSPSs is a pointer to an array of StdVideoH265SequenceParameterSet structures describing the H.265 SPS entries to add.

  • stdPPSCount is the number of elements in the pStdPPSs array.

  • pStdPPSs is a pointer to an array of StdVideoH265PictureParameterSet structures describing the H.265 PPS entries to add.

This structure can be specified in the following places:

Valid Usage
  • VUID-VkVideoDecodeH265SessionParametersAddInfoKHR-None-04833
    The vps_video_parameter_set_id member of each StdVideoH265VideoParameterSet structure specified in the elements of pStdVPSs must be unique within pStdVPSs

  • VUID-VkVideoDecodeH265SessionParametersAddInfoKHR-None-04834
    The pair constructed from the sps_video_parameter_set_id and sps_seq_parameter_set_id members of each StdVideoH265SequenceParameterSet structure specified in the elements of pStdSPSs must be unique within pStdSPSs

  • VUID-VkVideoDecodeH265SessionParametersAddInfoKHR-None-04835
    The triplet constructed from the sps_video_parameter_set_id, pps_seq_parameter_set_id, and pps_pic_parameter_set_id members of each StdVideoH265PictureParameterSet structure specified in the elements of pStdPPSs must be unique within pStdPPSs

Valid Usage (Implicit)
  • VUID-VkVideoDecodeH265SessionParametersAddInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_DECODE_H265_SESSION_PARAMETERS_ADD_INFO_KHR

  • VUID-VkVideoDecodeH265SessionParametersAddInfoKHR-pStdVPSs-parameter
    If stdVPSCount is not 0, pStdVPSs must be a valid pointer to an array of stdVPSCount StdVideoH265VideoParameterSet values

  • VUID-VkVideoDecodeH265SessionParametersAddInfoKHR-pStdSPSs-parameter
    If stdSPSCount is not 0, pStdSPSs must be a valid pointer to an array of stdSPSCount StdVideoH265SequenceParameterSet values

  • VUID-VkVideoDecodeH265SessionParametersAddInfoKHR-pStdPPSs-parameter
    If stdPPSCount is not 0, pStdPPSs must be a valid pointer to an array of stdPPSCount StdVideoH265PictureParameterSet values

H.265 Decoding Parameters

The VkVideoDecodeH265PictureInfoKHR structure is defined as:

// Provided by VK_KHR_video_decode_h265
typedef struct VkVideoDecodeH265PictureInfoKHR {
    VkStructureType                         sType;
    const void*                             pNext;
    const StdVideoDecodeH265PictureInfo*    pStdPictureInfo;
    uint32_t                                sliceSegmentCount;
    const uint32_t*                         pSliceSegmentOffsets;
} VkVideoDecodeH265PictureInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • pStdPictureInfo is a pointer to a StdVideoDecodeH265PictureInfo structure specifying H.265 picture information.

  • sliceSegmentCount is the number of elements in pSliceSegmentOffsets.

  • pSliceSegmentOffsets is a pointer to an array of sliceSegmentCount offsets specifying the start offset of the slice segments of the picture within the video bitstream buffer range specified in VkVideoDecodeInfoKHR.

This structure is specified in the pNext chain of the VkVideoDecodeInfoKHR structure passed to vkCmdDecodeVideoKHR to specify the codec-specific picture information for an H.265 decode operation.

Decode Output Picture Information

When this structure is specified in the pNext chain of the VkVideoDecodeInfoKHR structure passed to vkCmdDecodeVideoKHR, the information related to the decode output picture is defined as follows:

Std Picture Information

The members of the StdVideoDecodeH265PictureInfo structure pointed to by pStdPictureInfo are interpreted as follows:

  • reserved is used only for padding purposes and is otherwise ignored;

  • flags.IrapPicFlag as defined in section 3.73 of the ITU-T H.265 Specification;

  • flags.IdrPicFlag as defined in section 3.67 of the ITU-T H.265 Specification;

  • flags.IsReference as defined in section 3.132 of the ITU-T H.265 Specification;

  • sps_video_parameter_set_id, pps_seq_parameter_set_id, and pps_pic_parameter_set_id are used to identify the active parameter sets, as described below;

  • PicOrderCntVal as defined in section 8.3.1 of the ITU-T H.265 Specification;

  • NumBitsForSTRefPicSetInSlice is the number of bits used in st_ref_pic_set when short_term_ref_pic_set_sps_flag is 0, or 0 otherwise, as defined in sections 7.4.7 and 7.4.8 of the ITU-T H.265 Specification;

  • NumDeltaPocsOfRefRpsIdx is the value of NumDeltaPocs[RefRpsIdx] when short_term_ref_pic_set_sps_flag is 1, or 0 otherwise, as defined in sections 7.4.7 and 7.4.8 of the ITU-T H.265 Specification;

  • RefPicSetStCurrBefore, RefPicSetStCurrAfter, and RefPicSetLtCurr are interpreted as defined in section 8.3.2 of the ITU-T H.265 Specification where each element of these arrays either identifies an active reference picture using its DPB slot index or contains the value STD_VIDEO_H265_NO_REFERENCE_PICTURE to indicate “no reference picture”;

  • all other members are interpreted as defined in section 8.3.2 of the ITU-T H.265 Specification.

Reference picture setup is controlled by the value of StdVideoDecodeH265PictureInfo::flags.IsReference. If it is set and a reconstructed picture is specified, then the latter is used as the target of picture reconstruction to activate the corresponding DPB slot. If StdVideoDecodeH265PictureInfo::flags.IsReference is not set, but a reconstructed picture is specified, then the corresponding picture reference associated with the DPB slot is invalidated, as described in the DPB Slot States section.

Active Parameter Sets

The members of the StdVideoDecodeH265PictureInfo structure pointed to by pStdPictureInfo are used to select the active parameter sets to use from the bound video session parameters object, as follows:

  • The active VPS is the VPS identified by the key specified in StdVideoDecodeH265PictureInfo::sps_video_parameter_set_id.

  • The active SPS is the SPS identified by the key specified by the pair constructed from StdVideoDecodeH265PictureInfo::sps_video_parameter_set_id and StdVideoDecodeH265PictureInfo::pps_seq_parameter_set_id.

  • The active PPS is the PPS identified by the key specified by the triplet constructed from StdVideoDecodeH265PictureInfo::sps_video_parameter_set_id, StdVideoDecodeH265PictureInfo::pps_seq_parameter_set_id, and StdVideoDecodeH265PictureInfo::pps_pic_parameter_set_id.

Valid Usage (Implicit)
  • VUID-VkVideoDecodeH265PictureInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_DECODE_H265_PICTURE_INFO_KHR

  • VUID-VkVideoDecodeH265PictureInfoKHR-pStdPictureInfo-parameter
    pStdPictureInfo must be a valid pointer to a valid StdVideoDecodeH265PictureInfo value

  • VUID-VkVideoDecodeH265PictureInfoKHR-pSliceSegmentOffsets-parameter
    pSliceSegmentOffsets must be a valid pointer to an array of sliceSegmentCount uint32_t values

  • VUID-VkVideoDecodeH265PictureInfoKHR-sliceSegmentCount-arraylength
    sliceSegmentCount must be greater than 0

The VkVideoDecodeH265DpbSlotInfoKHR structure is defined as:

// Provided by VK_KHR_video_decode_h265
typedef struct VkVideoDecodeH265DpbSlotInfoKHR {
    VkStructureType                           sType;
    const void*                               pNext;
    const StdVideoDecodeH265ReferenceInfo*    pStdReferenceInfo;
} VkVideoDecodeH265DpbSlotInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • pStdReferenceInfo is a pointer to a StdVideoDecodeH265ReferenceInfo structure specifying reference picture information described in section 8.3 of the ITU-T H.265 Specification.

This structure is specified in the pNext chain of VkVideoDecodeInfoKHR::pSetupReferenceSlot, if not NULL, and the pNext chain of the elements of VkVideoDecodeInfoKHR::pReferenceSlots to specify the codec-specific reference picture information for an H.265 decode operation.

Active Reference Picture Information

When this structure is specified in the pNext chain of the elements of VkVideoDecodeInfoKHR::pReferenceSlots, one element is added to the list of active reference pictures used by the video decode operation for each element of VkVideoDecodeInfoKHR::pReferenceSlots as follows:

Reconstructed Picture Information

When this structure is specified in the pNext chain of VkVideoDecodeInfoKHR::pSetupReferenceSlot, the information related to the reconstructed picture is defined as follows:

Std Reference Information

The members of the StdVideoDecodeH265ReferenceInfo structure pointed to by pStdReferenceInfo are interpreted as follows:

  • flags.used_for_long_term_reference is used to indicate whether the picture is marked as “used for long-term reference” as defined in section 8.3.2 of the ITU-T H.265 Specification;

  • flags.unused_for_reference is used to indicate whether the picture is marked as “unused for reference” as defined in section 8.3.2 of the ITU-T H.265 Specification;

  • all other members are interpreted as defined in section 8.3 of the ITU-T H.265 Specification.

Valid Usage (Implicit)
  • VUID-VkVideoDecodeH265DpbSlotInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_DECODE_H265_DPB_SLOT_INFO_KHR

  • VUID-VkVideoDecodeH265DpbSlotInfoKHR-pStdReferenceInfo-parameter
    pStdReferenceInfo must be a valid pointer to a valid StdVideoDecodeH265ReferenceInfo value

H.265 Decode Requirements

This section describes the required H.265 decoding capabilities for physical devices that have at least one queue family that supports the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_H265_BIT_KHR, as returned by vkGetPhysicalDeviceQueueFamilyProperties2 in VkQueueFamilyVideoPropertiesKHR::videoCodecOperations.

Table 3. Required Video Std Header Versions
Video Std Header Name Version

vulkan_video_codec_h265std_decode

1.0.0

Table 4. Required Video Capabilities
Video Capability Requirement Requirement Type1

VkVideoCapabilitiesKHR

flags

-

min

minBitstreamBufferOffsetAlignment

4096

max

minBitstreamBufferSizeAlignment

4096

max

pictureAccessGranularity

(64,64)

max

minCodedExtent

-

max

maxCodedExtent

-

min

maxDpbSlots

0

min

maxActiveReferencePictures

0

min

VkVideoDecodeCapabilitiesKHR

flags

VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_COINCIDE_BIT_KHR or VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_DISTINCT_BIT_KHR

min

VkVideoDecodeH265CapabilitiesKHR

maxLevelIdc

STD_VIDEO_H265_LEVEL_IDC_1_0

min

1

The Requirement Type column specifies the requirement is either the minimum value all implementations must support, the maximum value all implementations must support, or the exact value all implementations must support. For bitmasks a minimum value is the least bits all implementations must set, but they may have additional bits set beyond this minimum.

AV1 Decode Operations

Video decode operations using an AV1 decode profile can be used to decode elementary video stream sequences compliant with the AV1 Specification.

Refer to the Preamble for information on how the Khronos Intellectual Property Rights Policy relates to normative references to external materials not created by Khronos.

This process is performed according to the video decode operation steps with the codec-specific semantics defined in section 7 of the AV1 Specification:

If the parameters and the bitstream adhere to the syntactic and semantic requirements defined in the corresponding sections of the AV1 Specification, as described above, and the DPB slots associated with the active reference pictures all refer to valid picture references, then the video decode operation will complete successfully. Otherwise, the video decode operation may complete unsuccessfully.

AV1 Decode Bitstream Data Access

The video bitstream buffer range should contain one or more frame OBUs, comprised of a frame header OBU and tile group OBU, that together represent an entire frame, as defined in sections 5.10, 5.9, and 5.11, and this data is interpreted as defined in sections 6.9, 6.8, and 6.10 of the AV1 Specification, respectively.

The offset specified in VkVideoDecodeAV1PictureInfoKHR::frameHeaderOffset should specify the starting offset of the frame header OBU of the frame.

When the tiles of the frame are encoded into multiple tile groups, each frame OBU has a separate frame header OBU but their content is expected to match per the requirements of the AV1 Specification. Accordingly, the offset specified in frameHeaderOffset can be the offset of any of the otherwise identical frame header OBUs when multiple tile groups are present.

The offsets and sizes provided in VkVideoDecodeAV1PictureInfoKHR::pTileOffsets and VkVideoDecodeAV1PictureInfoKHR::pTileSizes, respectively, should specify the starting offsets and sizes corresponding to each tile within the video bitstream buffer range.

AV1 Decode Picture Data Access

Accesses to image data within a video picture resource happen at the granularity indicated by VkVideoCapabilitiesKHR::pictureAccessGranularity, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the used video profile. Accordingly, the complete image subregion of a decode output picture, reference picture, or reconstructed picture accessed by video coding operations using an AV1 decode profile is defined as the set of texels within the coordinate range:

([0,endX), [0,endY))

Where:

  • endX equals codedExtent.width rounded up to the nearest integer multiple of pictureAccessGranularity.width and clamped to the width of the image subresource referred to by the corresponding VkVideoPictureResourceInfoKHR structure;

  • endY equals codedExtent.height rounded up to the nearest integer multiple of pictureAccessGranularity.height and clamped to the height of the image subresource referred to by the corresponding VkVideoPictureResourceInfoKHR structure;

Where codedExtent is the member of the VkVideoPictureResourceInfoKHR structure corresponding to the picture.

In case of video decode operations using an AV1 decode profile, any access to a picture at the coordinates (x,y), as defined by the AV1 Specification, is an access to the image subresource referred to by the corresponding VkVideoPictureResourceInfoKHR structure at the texel coordinates (x,y).

AV1 Reference Names and Semantics

Individual reference frames used in the decoding process have different semantics, as defined in section 6.10.24 of the AV1 Specification. The AV1 semantics associated with a reference picture are indicated by the corresponding enumeration constant defined in the Video Std enumeration type StdVideoAV1ReferenceName:

  • STD_VIDEO_AV1_REFERENCE_NAME_INTRA_FRAME identifies the reference used for intra coding (INTRA_FRAME), as defined in sections 2 and 7.11.2 of the AV1 Specification.

  • All other enumeration constants refer to backward or forward references used for inter coding, as defined in sections 2 and 7.11.3 of the AV1 Specification:

    • STD_VIDEO_AV1_REFERENCE_NAME_LAST_FRAME identifies the LAST_FRAME reference

    • STD_VIDEO_AV1_REFERENCE_NAME_LAST2_FRAME identifies the LAST2_FRAME reference

    • STD_VIDEO_AV1_REFERENCE_NAME_LAST3_FRAME identifies the LAST3_FRAME reference

    • STD_VIDEO_AV1_REFERENCE_NAME_GOLDEN_FRAME identifies the GOLDEN_FRAME reference

    • STD_VIDEO_AV1_REFERENCE_NAME_BWDREF_FRAME identifies the BWDREF_FRAME reference

    • STD_VIDEO_AV1_REFERENCE_NAME_ALTREF2_FRAME identifies the ALTREF2_FRAME reference

    • STD_VIDEO_AV1_REFERENCE_NAME_ALTREF_FRAME identifies the ALTREF_FRAME reference

These enumeration constants are not directly used in any APIs but are used to indirectly index into certain Video Std and Vulkan API parameter arrays.

AV1 Decode Profile

A video profile supporting AV1 video decode operations is specified by setting VkVideoProfileInfoKHR::videoCodecOperation to VK_VIDEO_CODEC_OPERATION_DECODE_AV1_BIT_KHR and adding a VkVideoDecodeAV1ProfileInfoKHR structure to the VkVideoProfileInfoKHR::pNext chain.

The VkVideoDecodeAV1ProfileInfoKHR structure is defined as:

// Provided by VK_KHR_video_decode_av1
typedef struct VkVideoDecodeAV1ProfileInfoKHR {
    VkStructureType       sType;
    const void*           pNext;
    StdVideoAV1Profile    stdProfile;
    VkBool32              filmGrainSupport;
} VkVideoDecodeAV1ProfileInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • stdProfile is a StdVideoAV1Profile value specifying the AV1 codec profile, as defined in section A.2 of the AV1 Specification.

  • filmGrainSupport specifies whether AV1 film grain, as defined in section 7.8.3 of the AV1 Specification, can be used with the video profile. When this member is VK_TRUE, video session objects created against the video profile will be able to decode pictures that have film grain enabled.

Enabling filmGrainSupport may increase the memory requirements of video sessions and/or video picture resources on some implementations.

Valid Usage (Implicit)
  • VUID-VkVideoDecodeAV1ProfileInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_DECODE_AV1_PROFILE_INFO_KHR

AV1 Decode Capabilities

When calling vkGetPhysicalDeviceVideoCapabilitiesKHR to query the capabilities for an AV1 decode profile, the VkVideoCapabilitiesKHR::pNext chain must include a VkVideoDecodeAV1CapabilitiesKHR structure that will be filled with the profile-specific capabilities.

The VkVideoDecodeAV1CapabilitiesKHR structure is defined as:

// Provided by VK_KHR_video_decode_av1
typedef struct VkVideoDecodeAV1CapabilitiesKHR {
    VkStructureType     sType;
    void*               pNext;
    StdVideoAV1Level    maxLevel;
} VkVideoDecodeAV1CapabilitiesKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • maxLevel is a StdVideoAV1Level value specifying the maximum AV1 level supported by the profile, as defined in section A.3 of the AV1 Specification.

Valid Usage (Implicit)
  • VUID-VkVideoDecodeAV1CapabilitiesKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_DECODE_AV1_CAPABILITIES_KHR

AV1 Decode Parameter Sets

Video session parameters objects created with the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_AV1_BIT_KHR contain a single instance of the following parameter set:

AV1 Sequence Header

Represented by StdVideoAV1SequenceHeader structures and interpreted as follows:

  • flags.reserved and reserved1 are used only for padding purposes and are otherwise ignored;

  • the StdVideoAV1ColorConfig structure pointed to by pColorConfig is interpreted as follows:

    • flags.reserved and reserved1 are used only for padding purposes and are otherwise ignored;

    • all other members of StdVideoAV1ColorConfig are interpreted as defined in section 6.4.2 of the AV1 Specification;

  • if flags.timing_info_present_flag is set, then the StdVideoAV1TimingInfo structure pointed to by pTimingInfo is interpreted as follows:

    • flags.reserved is used only for padding purposes and is otherwise ignored;

    • all other members of StdVideoAV1TimingInfo are interpreted as defined in section 6.4.3 of the AV1 Specification;

  • all other members of StdVideoAV1SequenceHeader are interpreted as defined in section 6.4 of the AV1 Specification.

When a video session parameters object is created with the codec operation VK_VIDEO_CODEC_OPERATION_DECODE_AV1_BIT_KHR, the VkVideoSessionParametersCreateInfoKHR::pNext chain must include a VkVideoDecodeAV1SessionParametersCreateInfoKHR structure specifying the contents of the object.

The VkVideoDecodeAV1SessionParametersCreateInfoKHR structure is defined as:

// Provided by VK_KHR_video_decode_av1
typedef struct VkVideoDecodeAV1SessionParametersCreateInfoKHR {
    VkStructureType                     sType;
    const void*                         pNext;
    const StdVideoAV1SequenceHeader*    pStdSequenceHeader;
} VkVideoDecodeAV1SessionParametersCreateInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • pStdSequenceHeader is a pointer to a StdVideoAV1SequenceHeader structure describing the AV1 sequence header entry to store in the created object.

As AV1 video session parameters objects will only ever contain a single AV1 sequence header, this has to be specified at object creation time and such video session parameters objects cannot be updated using the vkUpdateVideoSessionParametersKHR command. When a new AV1 sequence header is decoded from the input video bitstream the application needs to create a new video session parameters object to store it.

Valid Usage (Implicit)
  • VUID-VkVideoDecodeAV1SessionParametersCreateInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_DECODE_AV1_SESSION_PARAMETERS_CREATE_INFO_KHR

  • VUID-VkVideoDecodeAV1SessionParametersCreateInfoKHR-pStdSequenceHeader-parameter
    pStdSequenceHeader must be a valid pointer to a valid StdVideoAV1SequenceHeader value

AV1 Decoding Parameters

The VkVideoDecodeAV1PictureInfoKHR structure is defined as:

// Provided by VK_KHR_video_decode_av1
typedef struct VkVideoDecodeAV1PictureInfoKHR {
    VkStructureType                        sType;
    const void*                            pNext;
    const StdVideoDecodeAV1PictureInfo*    pStdPictureInfo;
    int32_t                                referenceNameSlotIndices[VK_MAX_VIDEO_AV1_REFERENCES_PER_FRAME_KHR];
    uint32_t                               frameHeaderOffset;
    uint32_t                               tileCount;
    const uint32_t*                        pTileOffsets;
    const uint32_t*                        pTileSizes;
} VkVideoDecodeAV1PictureInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • pStdPictureInfo is a pointer to a StdVideoDecodeAV1PictureInfo structure specifying AV1 picture information.

  • referenceNameSlotIndices is an array of seven (VK_MAX_VIDEO_AV1_REFERENCES_PER_FRAME_KHR, which is equal to the Video Std definition STD_VIDEO_AV1_REFS_PER_FRAME) signed integer values specifying the index of the DPB slot or a negative integer value for each AV1 reference name used for inter coding. In particular, the DPB slot index for the AV1 reference name frame is specified in referenceNameSlotIndices[frame - STD_VIDEO_AV1_REFERENCE_NAME_LAST_FRAME].

  • frameHeaderOffset is the byte offset of the AV1 frame header OBU, as defined in section 5.9 of the AV1 Specification, within the video bitstream buffer range specified in VkVideoDecodeInfoKHR.

  • tileCount is the number of elements in pTileOffsets and pTileSizes.

  • pTileOffsets is a pointer to an array of tileCount integers specifying the byte offset of the tiles of the picture within the video bitstream buffer range specified in VkVideoDecodeInfoKHR.

  • pTileSizes is a pointer to an array of tileCount integers specifying the byte size of the tiles of the picture within the video bitstream buffer range specified in VkVideoDecodeInfoKHR.

This structure is specified in the pNext chain of the VkVideoDecodeInfoKHR structure passed to vkCmdDecodeVideoKHR to specify the codec-specific picture information for an AV1 decode operation.

Decode Output Picture Information

When this structure is specified in the pNext chain of the VkVideoDecodeInfoKHR structure passed to vkCmdDecodeVideoKHR, the information related to the decode output picture is defined as follows:

Std Picture Information

The members of the StdVideoDecodeAV1PictureInfo structure pointed to by pStdPictureInfo are interpreted as follows:

  • flags.reserved, reserved1, and reserved2 are used only for padding purposes and are otherwise ignored;

  • flags.apply_grain indicates that film grain is enabled for the decoded picture, as defined in section 6.8.20 of the AV1 Specification;

  • OrderHint, OrderHints, and expectedFrameId are interpreted as defined in section 6.8.2 of the AV1 Specification;

  • the StdVideoAV1TileInfo structure pointed to by pTileInfo is interpreted as follows:

    • flags.reserved and reserved1 are used only for padding purposes and are otherwise ignored;

    • pMiColStarts is a pointer to an array of TileCols number of unsigned integers that corresponds to MiColStarts defined in section 6.8.14 of the AV1 Specification;

    • pMiRowStarts is a pointer to an array of TileRows number of unsigned integers that corresponds to MiRowStarts defined in section 6.8.14 of the AV1 Specification;

    • pWidthInSbsMinus1 is a pointer to an array of TileCols number of unsigned integers that corresponds to width_in_sbs_minus_1 defined in section 6.8.14 of the AV1 Specification;

    • pHeightInSbsMinus1 is a pointer to an array of TileRows number of unsigned integers that corresponds to height_in_sbs_minus_1 defined in section 6.8.14 of the AV1 Specification;

    • all other members of StdVideoAV1TileInfo are interpreted as defined in section 6.8.14 of the AV1 Specification;

  • the StdVideoAV1Quantization structure pointed to by pQuantization is interpreted as follows:

    • flags.reserved is used only for padding purposes and is otherwise ignored;

    • all other members of StdVideoAV1Quantization are interpreted as defined in section 6.8.11 of the AV1 Specification;

  • if flags.segmentation_enabled is set, then the StdVideoAV1Segmentation structure pointed to by pSegmentation is interpreted as follows:

    • the elements of FeatureEnabled are bitmasks where bit index j of element i corresponds to FeatureEnabled[i][j] as defined in section 6.8.13 of the AV1 Specification;

    • FeatureData is interpreted as defined in section 6.8.13 of the AV1 Specification;

  • the StdVideoAV1LoopFilter structure pointed to by pLoopFilter is interpreted as follows:

    • flags.reserved is used only for padding purposes and is otherwise ignored;

    • update_ref_delta is a bitmask where bit index i is interpreted as the value of update_ref_delta corresponding to element i of loop_filter_ref_deltas as defined in section 6.8.10 of the AV1 Specification;

    • update_mode_delta is a bitmask where bit index i is interpreted as the value of update_mode_delta corresponding to element i of loop_filter_mode_deltas as defined in section 6.8.10 of the AV1 Specification;

    • all other members of StdVideoAV1LoopFilter are interpreted as defined in section 6.8.10 of the AV1 Specification;

  • if flags.enable_cdef is set in the active sequence header, then the members of the StdVideoAV1CDEF structure pointed to by pCDEF are interpreted as follows:

    • cdef_y_sec_strength and cdef_uv_sec_strength are the bitstream values of the corresponding syntax elements defined in section 5.9.19 of the AV1 Specification;

    • all other members of StdVideoAV1CDEF are interpreted as defined in section 6.10.14 of the AV1 Specification;

  • the StdVideoAV1LoopRestoration structure pointed to by pLoopRestoration is interpreted as follows:

    • LoopRestorationSize[plane] is interpreted as log2(size) - 5, where size is the value of LoopRestorationSize[plane] as defined in section 6.10.15 of the AV1 Specification.

    • all other members of StdVideoAV1LoopRestoration are defined as in section 6.10.15 of the AV1 Specification;

  • the members of the StdVideoAV1GlobalMotion structure provided in global_motion are interpreted as defined in section 7.10 of the AV1 Specification;

  • if flags.film_grain_params_present is set in the active sequence header, then the StdVideoAV1FilmGrain structure pointed to by pFilmGrain is interpreted as follows:

    • flags.reserved is used only for padding purposes and is otherwise ignored;

    • all other members of StdVideoAV1FilmGrain are interpreted as defined in section 6.8.20 of the AV1 Specification;

  • all other members are interpreted as defined in section 6.8 of the AV1 Specification.

When film grain is enabled for the decoded frame, the flags.update_grain and film_grain_params_ref_idx values specified in StdVideoAV1FilmGrain are ignored by AV1 decode operations and the load_grain_params function, as defined in section 6.8.20 of the AV1 Specification, is not executed. Instead, the application is responsible for specifying the effective film grain parameters for the frame in StdVideoAV1FilmGrain.

When film grain is enabled for the decoded frame, the application is required to specify a different decode output picture resource in VkVideoDecodeInfoKHR::dstPictureResource compared to the reconstructed picture specified in VkVideoDecodeInfoKHR::pSetupReferenceSlot->pPictureResource even if the implementation does not report support for VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_DISTINCT_BIT_KHR in VkVideoDecodeCapabilitiesKHR::flags for the video decode profile.

Reference picture setup is controlled by the value of StdVideoDecodeAV1PictureInfo::refresh_frame_flags. If it is not zero and a reconstructed picture is specified, then the latter is used as the target of picture reconstruction to activate the DPB slot specified in pDecodeInfo->pSetupReferenceSlot→slotIndex. If StdVideoDecodeAV1PictureInfo::refresh_frame_flags is zero, but a reconstructed picture is specified, then the corresponding picture reference associated with the DPB slot is invalidated, as described in the DPB Slot States section.

Active Parameter Sets

The active sequence header is the AV1 sequence header stored in the bound video session parameters object.

Valid Usage (Implicit)
  • VUID-VkVideoDecodeAV1PictureInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_DECODE_AV1_PICTURE_INFO_KHR

  • VUID-VkVideoDecodeAV1PictureInfoKHR-pStdPictureInfo-parameter
    pStdPictureInfo must be a valid pointer to a valid StdVideoDecodeAV1PictureInfo value

  • VUID-VkVideoDecodeAV1PictureInfoKHR-pTileOffsets-parameter
    pTileOffsets must be a valid pointer to an array of tileCount uint32_t values

  • VUID-VkVideoDecodeAV1PictureInfoKHR-pTileSizes-parameter
    pTileSizes must be a valid pointer to an array of tileCount uint32_t values

  • VUID-VkVideoDecodeAV1PictureInfoKHR-tileCount-arraylength
    tileCount must be greater than 0

The VkVideoDecodeAV1DpbSlotInfoKHR structure is defined as:

// Provided by VK_KHR_video_decode_av1
typedef struct VkVideoDecodeAV1DpbSlotInfoKHR {
    VkStructureType                          sType;
    const void*                              pNext;
    const StdVideoDecodeAV1ReferenceInfo*    pStdReferenceInfo;
} VkVideoDecodeAV1DpbSlotInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • pStdReferenceInfo is a pointer to a StdVideoDecodeAV1ReferenceInfo structure specifying AV1 reference information.

This structure is specified in the pNext chain of VkVideoDecodeInfoKHR::pSetupReferenceSlot, if not NULL, and the pNext chain of the elements of VkVideoDecodeInfoKHR::pReferenceSlots to specify the codec-specific reference picture information for an AV1 decode operation.

Active Reference Picture Information

When this structure is specified in the pNext chain of the elements of VkVideoDecodeInfoKHR::pReferenceSlots, one element is added to the list of active reference pictures used by the video decode operation for each element of VkVideoDecodeInfoKHR::pReferenceSlots as follows:

Reconstructed Picture Information

When this structure is specified in the pNext chain of VkVideoDecodeInfoKHR::pSetupReferenceSlot, the information related to the reconstructed picture is defined as follows:

Std Reference Information

The members of the StdVideoDecodeAV1ReferenceInfo structure pointed to by pStdReferenceInfo are interpreted as follows:

  • flags.reserved and reserved1 are used only for padding purposes and are otherwise ignored;

  • flags.disable_frame_end_update_cdf is interpreted as defined in section 6.8.2 of the AV1 Specification;

  • flags.segmentation_enabled is interpreted as defined in section 6.8.13 of the AV1 Specification;

  • frame_type is interpreted as defined in section 6.8.2 of the AV1 Specification;

    The frame_type member is defined with the type uint8_t, but it takes the same values defined in the StdVideoAV1FrameType enumeration type as StdVideoDecodeAV1PictureInfo::frame_type.

  • RefFrameSignBias is a bitmask where bit index i corresponds to RefFrameSignBias[i] as defined in section 6.8.2 of the AV1 Specification;

  • OrderHint is interpreted as defined in section 6.8.2 of the AV1 Specification;

  • SavedOrderHints is interpreted as defined in section 7.20 of the AV1 Specification.

    When the AV1 reference information is provided for the reconstructed picture, certain parameters (e.g. frame_type) are specified both in the AV1 picture information and in the AV1 reference information. This is necessary because unlike the AV1 picture information, which is only used for the purposes of the video decode operation in question, the AV1 reference information specified for the reconstructed picture may be associated with the activated DPB slot, meaning that some implementations may maintain it as part of the reference picture metadata corresponding to the video picture resource associated with the DPB slot. When the AV1 reference information is provided for an active reference picture, the specified parameters correspond to the parameters specified when the DPB slot was activated (set up) with the reference picture, as usual, in order to communicate these parameters for implementations that do not maintain any subset of these parameters as part of the DPB slot’s reference picture metadata.

Valid Usage (Implicit)
  • VUID-VkVideoDecodeAV1DpbSlotInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_DECODE_AV1_DPB_SLOT_INFO_KHR

  • VUID-VkVideoDecodeAV1DpbSlotInfoKHR-pStdReferenceInfo-parameter
    pStdReferenceInfo must be a valid pointer to a valid StdVideoDecodeAV1ReferenceInfo value

AV1 Decode Requirements

This section describes the required AV1 decoding capabilities for physical devices that have at least one queue family that supports the video codec operation VK_VIDEO_CODEC_OPERATION_DECODE_AV1_BIT_KHR, as returned by vkGetPhysicalDeviceQueueFamilyProperties2 in VkQueueFamilyVideoPropertiesKHR::videoCodecOperations.

Table 5. Required Video Std Header Versions
Video Std Header Name Version

vulkan_video_codec_av1std_decode

1.0.0

Table 6. Required Video Capabilities
Video Capability Requirement Requirement Type1

VkVideoCapabilitiesKHR

flags

-

min

minBitstreamBufferOffsetAlignment

4096

max

minBitstreamBufferSizeAlignment

4096

max

pictureAccessGranularity

(64,64)

max

minCodedExtent

-

max

maxCodedExtent

-

min

maxDpbSlots

0

min

maxActiveReferencePictures

0

min

VkVideoDecodeCapabilitiesKHR

flags

VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_COINCIDE_BIT_KHR or VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_DISTINCT_BIT_KHR

min

VkVideoDecodeAV1CapabilitiesKHR

maxLevel

STD_VIDEO_AV1_LEVEL_2_0

min

1

The Requirement Type column specifies the requirement is either the minimum value all implementations must support, the maximum value all implementations must support, or the exact value all implementations must support. For bitmasks a minimum value is the least bits all implementations must set, but they may have additional bits set beyond this minimum.

Video Encode Operations

Video encode operations consume an encode input picture and zero or more reference pictures, and produce compressed video data to a video bitstream buffer and an optional reconstructed picture.

Such encode input pictures can be used as the output of video decode operations, with graphics or compute operations, or with Window System Integration APIs, depending on the capabilities of the implementation.

Video encode operations may access the following resources in the VK_PIPELINE_STAGE_2_VIDEO_ENCODE_BIT_KHR stage:

The image subresource of each video picture resource accessed by the video encode operation is specified using a corresponding VkVideoPictureResourceInfoKHR structure. Each such image subresource must be in the appropriate image layout as follows:

  • If the image subresource is used in the video encode operation as an encode input picture, then it must be in the VK_IMAGE_LAYOUT_VIDEO_ENCODE_SRC_KHR layout.

  • If the image subresource is used in the video encode operation as a reconstructed picture or reference picture, then it must be in the VK_IMAGE_LAYOUT_VIDEO_ENCODE_DPB_KHR layout.

  • If the image subresource is used in the video encode operation as a quantization map, then it must be in the VK_IMAGE_LAYOUT_VIDEO_ENCODE_QUANTIZATION_MAP_KHR layout.

A video encode operation may complete unsuccessfully. In this case the target video bitstream buffer will have undefined contents. Similarly, if reference picture setup is requested, the reconstructed-picture will also have undefined contents, and the activated DPB slot will have an invalid picture reference.

If a video encode operation completes successfully and the codec-specific parameters provided by the application adhere to the syntactic and semantic requirements defined in the corresponding video compression standard, then the target video bitstream buffer will contain compressed video data after the execution of the video encode operation according to the respective codec-specific semantics.

Codec-Specific Semantics

The following aspects of video encode operations are codec-specific:

  • The compressed video data written to the target video bitstream buffer range.

  • The construction and interpretation of the list of active reference pictures and the interpretation of the picture data referred to by the corresponding image subregions.

  • The construction and interpretation of information related to the encode input picture and the interpretation of the picture data referred to by the corresponding image subregion.

  • The decision on reference picture setup.

  • The construction and interpretation of information related to the optional reconstructed picture and the generation of picture data to the corresponding image subregion.

  • Certain aspects of rate control.

These codec-specific behaviors are defined for each video codec operation separately.

  • If the used video codec operation is VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR, then the codec-specific aspects of the video encoding process are performed as defined in the H.264 Encode Operations section.

  • If the used video codec operation is VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, then the codec-specific aspects of the video encoding process are performed as defined in the H.265 Encode Operations section.

  • If the used video codec operation is VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR, then the codec-specific aspects of the video encoding process are performed as defined in the AV1 Encode Operations section.

Video Encode Parameter Overrides

Implementations supporting video encode operations for any particular video codec operation often support only a subset of the available encoding tools defined by the corresponding video compression standards. Accordingly, certain implementation-dependent limitations may apply to codec-specific parameters provided through the structures defined in the Video Std headers corresponding to the used video codec operation.

Exposing all of these restrictions on particular codec-specific parameter values or combinations thereof in the form of application-queryable capabilities is impractical, hence this specification allows implementations to override the value of any of the codec-specific parameters, unless otherwise specified, as long as all of the following conditions are met:

  • If the application-provided codec-specific parameters adhere to the syntactic and semantic requirements and rules defined by the used video compression standard, and thus would be usable to produce a video bitstream compliant with that standard, then the codec-specific parameters resulting from the process of implementation overrides must also adhere to the same requirements and rules, and any video bitstream produced using the overridden parameters must also be compliant.

  • The overridden codec-specific parameter values must not have an impact on the codec-independent behaviors defined for video encode operations.

  • The implementation must not override any codec-specific parameters specified to a command that may cause application-provided codec-specific parameters specified to subsequent commands to no longer adhere to the semantic requirements and rules defined by the used video compression standard, unless the implementation also overrides those parameters to adhere to any such requirements and rules.

  • The overridden codec-specific parameter values must not have an impact on the codec-specific picture data access semantics.

  • The overridden codec-specific parameter values may change the contents of the codec-specific bitstream elements produced by video encode operations or otherwise retrieved by the application (e.g. using the vkGetEncodedVideoSessionParametersKHR command) but must still adhere to the codec-specific semantics defined for that video codec operation, including, but not limited to, the number, type, and order of the encoded codec-specific bitstream elements.

Besides codec-specific parameter overrides performed for implementation-dependent reasons, applications can enable the implementation to apply additional optimizing overrides that may improve the efficiency or performance of video encoding operations. However, implementations must meet the conditions listed above even in case of such optimizing overrides.

Unless the application opts in for optimizing overrides, implementations are not expected to override any of the codec-specific parameters, except when such overrides are necessary for the correct operation of video encoder implementation due to limitations to the available encoding tools on that implementation.

Video Encode Operation Steps

Each video encode operation performs the following steps in the VK_PIPELINE_STAGE_2_VIDEO_ENCODE_BIT_KHR stage:

  1. Reads the input picture data from the encode input picture;

  2. Determine derived encoding quality parameters according to the codec-specific semantics and the current rate control state;

  3. Compresses the input picture data according to the codec-specific semantics, applying any prediction data read from the active reference pictures and rate control restrictions in the process;

  4. Writes the encoded bitstream data to the destination video bitstream buffer range;

  5. Performs picture reconstruction of the encoded video data according to the codec-specific semantics, applying any prediction data read from the active reference pictures in the process, if a reconstructed picture is specified and reference picture setup is requested;

  6. If reference picture setup is requested, the DPB slot index specified in the reconstructed picture information is activated with the reconstructed picture;

  7. Writes the reconstructed picture data to the reconstructed picture, if one is specified, according to the codec-specific semantics.

When reconstructed picture information is provided, the specified DPB slot index is associated with the corresponding bound reference picture resource, indifferent of whether reference picture setup is requested.

Capabilities

When calling vkGetPhysicalDeviceVideoCapabilitiesKHR with pVideoProfile->videoCodecOperation specifying an encode operation, the VkVideoEncodeCapabilitiesKHR structure must be included in the pNext chain of the VkVideoCapabilitiesKHR structure to retrieve capabilities specific to video encoding.

The VkVideoEncodeCapabilitiesKHR structure is defined as:

// Provided by VK_KHR_video_encode_queue
typedef struct VkVideoEncodeCapabilitiesKHR {
    VkStructureType                         sType;
    void*                                   pNext;
    VkVideoEncodeCapabilityFlagsKHR         flags;
    VkVideoEncodeRateControlModeFlagsKHR    rateControlModes;
    uint32_t                                maxRateControlLayers;
    uint64_t                                maxBitrate;
    uint32_t                                maxQualityLevels;
    VkExtent2D                              encodeInputPictureGranularity;
    VkVideoEncodeFeedbackFlagsKHR           supportedEncodeFeedbackFlags;
} VkVideoEncodeCapabilitiesKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • flags is a bitmask of VkVideoEncodeCapabilityFlagBitsKHR describing supported encoding features.

  • rateControlModes is a bitmask of VkVideoEncodeRateControlModeFlagBitsKHR indicating supported rate control modes.

  • maxRateControlLayers indicates the maximum number of rate control layers supported.

  • maxBitrate indicates the maximum supported bitrate.

  • maxQualityLevels indicates the number of discrete video encode quality levels supported. Implementations must support at least one quality level.

  • encodeInputPictureGranularity indicates the granularity at which encode input picture data is encoded and may indicate a texel granularity up to the size of the largest supported codec-specific coding block. This capability does not impose any valid usage constraints on the application, however, depending on the contents of the encode input picture, it may have effects on the encoded bitstream, as described in more detail below.

  • supportedEncodeFeedbackFlags is a bitmask of VkVideoEncodeFeedbackFlagBitsKHR values specifying the supported flags for video encode feedback queries.

Implementations must include support for at least VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_BUFFER_OFFSET_BIT_KHR and VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_BYTES_WRITTEN_BIT_KHR in supportedEncodeFeedbackFlags.

encodeInputPictureGranularity provides information about the way encode input picture data is used as input to video encode operations. In particular, some implementations may not be able to limit the set of texels used to encode the output video bitstream to the image subregion specified in the VkVideoPictureResourceInfoKHR structure corresponding to the encode input picture (i.e. to the resolution of the image data to encode specified in its codedExtent member).

For example, the application requests the coded extent to be 1920x1080, but the implementation is only able to source the encode input picture data at the granularity of the codec-specific coding block size which is 16x16 pixels (or as otherwise indicated in encodeInputPictureGranularity). In this example, the content is horizontally aligned with the coding block size, but not vertically aligned with it. Thus encoding of the last row of coding blocks will be impacted by the contents of the input image at texel rows 1080 to 1087 (the latter being the next row which is vertically aligned with the coding block size, assuming a zero-based texel row index).

If codedExtent rounded up to the next integer multiple of encodeInputPictureGranularity is greater than the extent of the image subresource specified for the encode input picture, then the texel values corresponding to texel coordinates outside of the bounds of the image subresource may be undefined. However, implementations should use well-defined default values for such texels in order to maximize the encoding efficiency for the last coding block row/column, and/or to ensure consistent encoding results across repeated encoding of the same input content. Nonetheless, the values used for such texels must not have an effect on whether the video encode operation produces a compliant bitstream, and must not have any other effects on the encoded picture data beyond what may otherwise result from using these texel values as input to any compression algorithm, as defined in the used video compression standard.

While not required, it is generally a good practice for applications to make sure that the image subresource used for the encode input picture has an extent that is an integer multiple of the codec-specific coding block size (or at least encodeInputPictureGranularity) and that this padding area is filled with known values in order to improve encoding efficiency, portability, and reproducibility.

Valid Usage (Implicit)
  • VUID-VkVideoEncodeCapabilitiesKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_CAPABILITIES_KHR

Bits which may be set in VkVideoEncodeCapabilitiesKHR::flags, indicating the encoding tools supported, are:

// Provided by VK_KHR_video_encode_queue
typedef enum VkVideoEncodeCapabilityFlagBitsKHR {
    VK_VIDEO_ENCODE_CAPABILITY_PRECEDING_EXTERNALLY_ENCODED_BYTES_BIT_KHR = 0x00000001,
    VK_VIDEO_ENCODE_CAPABILITY_INSUFFICIENT_BITSTREAM_BUFFER_RANGE_DETECTION_BIT_KHR = 0x00000002,
  // Provided by VK_KHR_video_encode_quantization_map
    VK_VIDEO_ENCODE_CAPABILITY_QUANTIZATION_DELTA_MAP_BIT_KHR = 0x00000004,
  // Provided by VK_KHR_video_encode_quantization_map
    VK_VIDEO_ENCODE_CAPABILITY_EMPHASIS_MAP_BIT_KHR = 0x00000008,
} VkVideoEncodeCapabilityFlagBitsKHR;
  • VK_VIDEO_ENCODE_CAPABILITY_PRECEDING_EXTERNALLY_ENCODED_BYTES_BIT_KHR specifies that the implementation supports the use of VkVideoEncodeInfoKHR::precedingExternallyEncodedBytes.

  • VK_VIDEO_ENCODE_CAPABILITY_INSUFFICIENT_BITSTREAM_BUFFER_RANGE_DETECTION_BIT_KHR specifies that the implementation is able to detect and report when the destination video bitstream buffer range provided by the application is not sufficiently large to fit the encoded bitstream data produced by a video encode operation by reporting the VK_QUERY_RESULT_STATUS_INSUFFICIENT_BITSTREAM_BUFFER_RANGE_KHR query result status code.

    Some implementations may not be able to reliably detect insufficient bitstream buffer range conditions in all situations. Such implementations will not report support for the VK_VIDEO_ENCODE_CAPABILITY_INSUFFICIENT_BITSTREAM_BUFFER_RANGE_DETECTION_BIT_KHR encode capability flag for the video profile, but may still report the VK_QUERY_RESULT_STATUS_INSUFFICIENT_BITSTREAM_BUFFER_RANGE_KHR query result status code in certain cases. Applications should always check for the specific query result status code VK_QUERY_RESULT_STATUS_INSUFFICIENT_BITSTREAM_BUFFER_RANGE_KHR even when this encode capability flag is not supported by the implementation for the video profile in question. However, applications must not assume that a different negative query result status code indicating an unsuccessful completion of a video encode operation is not the result of an insufficient bitstream buffer condition unless this encode capability flag is supported.

  • VK_VIDEO_ENCODE_CAPABILITY_QUANTIZATION_DELTA_MAP_BIT_KHR indicates support for using quantization delta maps.

  • VK_VIDEO_ENCODE_CAPABILITY_EMPHASIS_MAP_BIT_KHR specifies support for using emphasis maps.

// Provided by VK_KHR_video_encode_queue
typedef VkFlags VkVideoEncodeCapabilityFlagsKHR;

VkVideoEncodeCapabilityFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoEncodeCapabilityFlagBitsKHR.

Video Encode Quality Levels

Implementations can support more than one video encode quality levels for a video encode profile, which control the number and type of implementation-specific encoding tools and algorithms utilized in the encoding process.

Generally, using higher video encode quality levels may produce higher quality video streams at the cost of additional processing time. However, as the final quality of an encoded picture depends on the contents of the encode input picture, the contents of the active reference pictures, the codec-specific encode parameters, and the particular implementation-specific tools used corresponding to the individual video encode quality levels, there are no guarantees that using a higher video encode quality level will always produce a higher quality encoded picture for any given set of inputs.

To query properties for a specific video encode quality level supported by a video encode profile, call:

// Provided by VK_KHR_video_encode_queue
VkResult vkGetPhysicalDeviceVideoEncodeQualityLevelPropertiesKHR(
    VkPhysicalDevice                            physicalDevice,
    const VkPhysicalDeviceVideoEncodeQualityLevelInfoKHR* pQualityLevelInfo,
    VkVideoEncodeQualityLevelPropertiesKHR*     pQualityLevelProperties);
Valid Usage
  • VUID-vkGetPhysicalDeviceVideoEncodeQualityLevelPropertiesKHR-pQualityLevelInfo-08257
    If pQualityLevelInfo->pVideoProfile→videoCodecOperation is VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR, then the pNext chain of pQualityLevelProperties must include a VkVideoEncodeH264QualityLevelPropertiesKHR structure

  • VUID-vkGetPhysicalDeviceVideoEncodeQualityLevelPropertiesKHR-pQualityLevelInfo-08258
    If pQualityLevelInfo->pVideoProfile→videoCodecOperation is VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, then the pNext chain of pQualityLevelProperties must include a VkVideoEncodeH265QualityLevelPropertiesKHR structure

  • VUID-vkGetPhysicalDeviceVideoEncodeQualityLevelPropertiesKHR-pQualityLevelInfo-10305
    If pQualityLevelInfo->pVideoProfile→videoCodecOperation is VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR, then the pNext chain of pQualityLevelProperties must include a VkVideoEncodeAV1QualityLevelPropertiesKHR structure

Valid Usage (Implicit)
  • VUID-vkGetPhysicalDeviceVideoEncodeQualityLevelPropertiesKHR-physicalDevice-parameter
    physicalDevice must be a valid VkPhysicalDevice handle

  • VUID-vkGetPhysicalDeviceVideoEncodeQualityLevelPropertiesKHR-pQualityLevelInfo-parameter
    pQualityLevelInfo must be a valid pointer to a valid VkPhysicalDeviceVideoEncodeQualityLevelInfoKHR structure

  • VUID-vkGetPhysicalDeviceVideoEncodeQualityLevelPropertiesKHR-pQualityLevelProperties-parameter
    pQualityLevelProperties must be a valid pointer to a VkVideoEncodeQualityLevelPropertiesKHR structure

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

  • VK_ERROR_VIDEO_PROFILE_OPERATION_NOT_SUPPORTED_KHR

  • VK_ERROR_VIDEO_PROFILE_FORMAT_NOT_SUPPORTED_KHR

  • VK_ERROR_VIDEO_PICTURE_LAYOUT_NOT_SUPPORTED_KHR

  • VK_ERROR_VIDEO_PROFILE_CODEC_NOT_SUPPORTED_KHR

The VkPhysicalDeviceVideoEncodeQualityLevelInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_queue
typedef struct VkPhysicalDeviceVideoEncodeQualityLevelInfoKHR {
    VkStructureType                 sType;
    const void*                     pNext;
    const VkVideoProfileInfoKHR*    pVideoProfile;
    uint32_t                        qualityLevel;
} VkPhysicalDeviceVideoEncodeQualityLevelInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • pVideoProfile is a pointer to a VkVideoProfileInfoKHR structure specifying the video profile to query the video encode quality level properties for.

  • qualityLevel is the video encode quality level to query properties for.

Valid Usage
  • VUID-VkPhysicalDeviceVideoEncodeQualityLevelInfoKHR-pVideoProfile-08259
    pVideoProfile must be a supported video profile

  • VUID-VkPhysicalDeviceVideoEncodeQualityLevelInfoKHR-pVideoProfile-08260
    pVideoProfile->videoCodecOperation must specify an encode operation

  • VUID-VkPhysicalDeviceVideoEncodeQualityLevelInfoKHR-qualityLevel-08261
    qualityLevel must be less than VkVideoEncodeCapabilitiesKHR::maxQualityLevels, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile specified in pVideoProfile

Valid Usage (Implicit)
  • VUID-VkPhysicalDeviceVideoEncodeQualityLevelInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VIDEO_ENCODE_QUALITY_LEVEL_INFO_KHR

  • VUID-VkPhysicalDeviceVideoEncodeQualityLevelInfoKHR-pNext-pNext
    pNext must be NULL

  • VUID-VkPhysicalDeviceVideoEncodeQualityLevelInfoKHR-pVideoProfile-parameter
    pVideoProfile must be a valid pointer to a valid VkVideoProfileInfoKHR structure

The VkVideoEncodeQualityLevelPropertiesKHR structure is defined as:

// Provided by VK_KHR_video_encode_queue
typedef struct VkVideoEncodeQualityLevelPropertiesKHR {
    VkStructureType                            sType;
    void*                                      pNext;
    VkVideoEncodeRateControlModeFlagBitsKHR    preferredRateControlMode;
    uint32_t                                   preferredRateControlLayerCount;
} VkVideoEncodeQualityLevelPropertiesKHR;
Valid Usage (Implicit)

The VkVideoEncodeQualityLevelInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_queue
typedef struct VkVideoEncodeQualityLevelInfoKHR {
    VkStructureType    sType;
    const void*        pNext;
    uint32_t           qualityLevel;
} VkVideoEncodeQualityLevelInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • qualityLevel is the used video encode quality level.

This structure can be specified in the following places:

  • In the pNext chain of VkVideoSessionParametersCreateInfoKHR to specify the video encode quality level to use for a video session parameters object created for a video encode session. If no instance of this structure is included in the pNext chain of VkVideoSessionParametersCreateInfoKHR, then the video session parameters object is created with a video encode quality level of zero.

  • In the pNext chain of VkVideoCodingControlInfoKHR to change the video encode quality level state of the bound video session.

Valid Usage
Valid Usage (Implicit)
  • VUID-VkVideoEncodeQualityLevelInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_QUALITY_LEVEL_INFO_KHR

Retrieving Encoded Session Parameters

Any codec-specific parameters stored in video session parameters objects may need to be separately encoded and included in the final video bitstream data, depending on the used video compression standard. In such cases the application must call the vkGetEncodedVideoSessionParametersKHR command to retrieve the encoded parameter data from the used video session parameters object in order to be able to produce a compliant video bitstream.

This is needed because implementations may have changed some of the codec-specific parameters stored in the video session parameters object, as defined in the Video Encode Parameter Overrides section. In addition, the vkGetEncodedVideoSessionParametersKHR command enables the application to retrieve the encoded parameter data without having to encode these codec-specific parameters manually.

Encoded parameter data can be retrieved from a video session parameters object created with a video encode operation using the command:

// Provided by VK_KHR_video_encode_queue
VkResult vkGetEncodedVideoSessionParametersKHR(
    VkDevice                                    device,
    const VkVideoEncodeSessionParametersGetInfoKHR* pVideoSessionParametersInfo,
    VkVideoEncodeSessionParametersFeedbackInfoKHR* pFeedbackInfo,
    size_t*                                     pDataSize,
    void*                                       pData);
  • device is the logical device that owns the video session parameters object.

  • pVideoSessionParametersInfo is a pointer to a VkVideoEncodeSessionParametersGetInfoKHR structure specifying the parameters of the encoded parameter data to retrieve.

  • pFeedbackInfo is either NULL or a pointer to a VkVideoEncodeSessionParametersFeedbackInfoKHR structure in which feedback about the requested parameter data is returned.

  • pDataSize is a pointer to a size_t value related to the amount of encode parameter data returned, as described below.

  • pData is either NULL or a pointer to a buffer to write the encoded parameter data to.

If pData is NULL, then the size of the encoded parameter data, in bytes, that can be retrieved is returned in pDataSize. Otherwise, pDataSize must point to a variable set by the application to the size of the buffer, in bytes, pointed to by pData, and on return the variable is overwritten with the number of bytes actually written to pData. If pDataSize is less than the size of the encoded parameter data that can be retrieved, then no data will be written to pData, zero will be written to pDataSize, and VK_INCOMPLETE will be returned instead of VK_SUCCESS, to indicate that no encoded parameter data was returned.

If pFeedbackInfo is not NULL then the members of the VkVideoEncodeSessionParametersFeedbackInfoKHR structure and any additional structures included in its pNext chain that are applicable to the video session parameters object specified in pVideoSessionParametersInfo->videoSessionParameters will be filled with feedback about the requested parameter data on all successful calls to this command.

This includes the cases when pData is NULL or when VK_INCOMPLETE is returned by the command, and enables the application to determine whether the implementation overrode any of the requested video session parameters without actually needing to retrieve the encoded parameter data itself.

Valid Usage
  • VUID-vkGetEncodedVideoSessionParametersKHR-pVideoSessionParametersInfo-08359
    pVideoSessionParametersInfo->videoSessionParameters must have been created with an encode operation

  • VUID-vkGetEncodedVideoSessionParametersKHR-pVideoSessionParametersInfo-08262
    If pVideoSessionParametersInfo->videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR, then the pNext chain of pVideoSessionParametersInfo must include a VkVideoEncodeH264SessionParametersGetInfoKHR structure

  • VUID-vkGetEncodedVideoSessionParametersKHR-pVideoSessionParametersInfo-08263
    If pVideoSessionParametersInfo->videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR, then for the VkVideoEncodeH264SessionParametersGetInfoKHR structure included in the pNext chain of pVideoSessionParametersInfo, if its writeStdSPS member is VK_TRUE, then pVideoSessionParametersInfo->videoSessionParameters must contain a StdVideoH264SequenceParameterSet entry with seq_parameter_set_id matching VkVideoEncodeH264SessionParametersGetInfoKHR::stdSPSId

  • VUID-vkGetEncodedVideoSessionParametersKHR-pVideoSessionParametersInfo-08264
    If pVideoSessionParametersInfo->videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR, then for the VkVideoEncodeH264SessionParametersGetInfoKHR structure included in the pNext chain of pVideoSessionParametersInfo, if its writeStdPPS member is VK_TRUE, then pVideoSessionParametersInfo->videoSessionParameters must contain a StdVideoH264PictureParameterSet entry with seq_parameter_set_id and pic_parameter_set_id matching VkVideoEncodeH264SessionParametersGetInfoKHR::stdSPSId and VkVideoEncodeH264SessionParametersGetInfoKHR::stdPPSId, respectively

  • VUID-vkGetEncodedVideoSessionParametersKHR-pVideoSessionParametersInfo-08265
    If pVideoSessionParametersInfo->videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, then the pNext chain of pVideoSessionParametersInfo must include a VkVideoEncodeH265SessionParametersGetInfoKHR structure

  • VUID-vkGetEncodedVideoSessionParametersKHR-pVideoSessionParametersInfo-08266
    If pVideoSessionParametersInfo->videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, then for the VkVideoEncodeH265SessionParametersGetInfoKHR structure included in the pNext chain of pVideoSessionParametersInfo, if its writeStdVPS member is VK_TRUE, then pVideoSessionParametersInfo->videoSessionParameters must contain a StdVideoH265VideoParameterSet entry with vps_video_parameter_set_id matching VkVideoEncodeH265SessionParametersGetInfoKHR::stdVPSId

  • VUID-vkGetEncodedVideoSessionParametersKHR-pVideoSessionParametersInfo-08267
    If pVideoSessionParametersInfo->videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, then for the VkVideoEncodeH265SessionParametersGetInfoKHR structure included in the pNext chain of pVideoSessionParametersInfo, if its writeStdSPS member is VK_TRUE, then pVideoSessionParametersInfo->videoSessionParameters must contain a StdVideoH265SequenceParameterSet entry with sps_video_parameter_set_id and sps_seq_parameter_set_id matching VkVideoEncodeH265SessionParametersGetInfoKHR::stdVPSId and VkVideoEncodeH265SessionParametersGetInfoKHR::stdSPSId, respectively

  • VUID-vkGetEncodedVideoSessionParametersKHR-pVideoSessionParametersInfo-08268
    If pVideoSessionParametersInfo->videoSessionParameters was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, then for the VkVideoEncodeH265SessionParametersGetInfoKHR structure included in the pNext chain of pVideoSessionParametersInfo, if its writeStdPPS member is VK_TRUE, then pVideoSessionParametersInfo->videoSessionParameters must contain a StdVideoH265PictureParameterSet entry with sps_video_parameter_set_id, pps_seq_parameter_set_id, and pps_pic_parameter_set_id matching VkVideoEncodeH265SessionParametersGetInfoKHR::stdVPSId, VkVideoEncodeH265SessionParametersGetInfoKHR::stdSPSId, and VkVideoEncodeH265SessionParametersGetInfoKHR::stdPPSId, respectively

Valid Usage (Implicit)
  • VUID-vkGetEncodedVideoSessionParametersKHR-device-parameter
    device must be a valid VkDevice handle

  • VUID-vkGetEncodedVideoSessionParametersKHR-pVideoSessionParametersInfo-parameter
    pVideoSessionParametersInfo must be a valid pointer to a valid VkVideoEncodeSessionParametersGetInfoKHR structure

  • VUID-vkGetEncodedVideoSessionParametersKHR-pFeedbackInfo-parameter
    If pFeedbackInfo is not NULL, pFeedbackInfo must be a valid pointer to a VkVideoEncodeSessionParametersFeedbackInfoKHR structure

  • VUID-vkGetEncodedVideoSessionParametersKHR-pDataSize-parameter
    pDataSize must be a valid pointer to a size_t value

  • VUID-vkGetEncodedVideoSessionParametersKHR-pData-parameter
    If the value referenced by pDataSize is not 0, and pData is not NULL, pData must be a valid pointer to an array of pDataSize bytes

Return Codes
Success
  • VK_SUCCESS

  • VK_INCOMPLETE

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

The VkVideoEncodeSessionParametersGetInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_queue
typedef struct VkVideoEncodeSessionParametersGetInfoKHR {
    VkStructureType                sType;
    const void*                    pNext;
    VkVideoSessionParametersKHR    videoSessionParameters;
} VkVideoEncodeSessionParametersGetInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • videoSessionParameters is the VkVideoSessionParametersKHR object to retrieve encoded parameter data from.

Depending on the used video encode operation, additional codec-specific structures may need to be included in the pNext chain of this structure to identify the specific video session parameters to retrieve encoded parameter data for, as described in the corresponding sections.

Valid Usage (Implicit)
  • VUID-VkVideoEncodeSessionParametersGetInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_SESSION_PARAMETERS_GET_INFO_KHR

  • VUID-VkVideoEncodeSessionParametersGetInfoKHR-pNext-pNext
    Each pNext member of any structure (including this one) in the pNext chain must be either NULL or a pointer to a valid instance of VkVideoEncodeH264SessionParametersGetInfoKHR or VkVideoEncodeH265SessionParametersGetInfoKHR

  • VUID-VkVideoEncodeSessionParametersGetInfoKHR-sType-unique
    The sType value of each struct in the pNext chain must be unique

  • VUID-VkVideoEncodeSessionParametersGetInfoKHR-videoSessionParameters-parameter
    videoSessionParameters must be a valid VkVideoSessionParametersKHR handle

The VkVideoEncodeSessionParametersFeedbackInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_queue
typedef struct VkVideoEncodeSessionParametersFeedbackInfoKHR {
    VkStructureType    sType;
    void*              pNext;
    VkBool32           hasOverrides;
} VkVideoEncodeSessionParametersFeedbackInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • hasOverrides indicates whether any of the requested parameter data were overridden by the implementation.

Depending on the used video encode operation, additional codec-specific structures can be included in the pNext chain of this structure to capture codec-specific feedback information about the requested parameter data, as described in the corresponding sections.

Valid Usage (Implicit)
  • VUID-VkVideoEncodeSessionParametersFeedbackInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_SESSION_PARAMETERS_FEEDBACK_INFO_KHR

  • VUID-VkVideoEncodeSessionParametersFeedbackInfoKHR-pNext-pNext
    Each pNext member of any structure (including this one) in the pNext chain must be either NULL or a pointer to a valid instance of VkVideoEncodeH264SessionParametersFeedbackInfoKHR or VkVideoEncodeH265SessionParametersFeedbackInfoKHR

  • VUID-VkVideoEncodeSessionParametersFeedbackInfoKHR-sType-unique
    The sType value of each struct in the pNext chain must be unique

Video Encode Commands

To launch video encode operations, call:

// Provided by VK_KHR_video_encode_queue
void vkCmdEncodeVideoKHR(
    VkCommandBuffer                             commandBuffer,
    const VkVideoEncodeInfoKHR*                 pEncodeInfo);
  • commandBuffer is the command buffer in which to record the command.

  • pEncodeInfo is a pointer to a VkVideoEncodeInfoKHR structure specifying the parameters of the video encode operations.

Each call issues one or more video encode operations. The implicit parameter opCount corresponds to the number of video encode operations issued by the command. After calling this command, the active query index of each active query is incremented by opCount.

Currently each call to this command results in the issue of a single video encode operation.

If the bound video session was created with VK_VIDEO_SESSION_CREATE_INLINE_QUERIES_BIT_KHR and the pNext chain of pEncodeInfo includes a VkVideoInlineQueryInfoKHR structure with its queryPool member specifying a valid VkQueryPool handle, then this command will execute a query for each video encode operation issued by it.

Active Reference Picture Information

The list of active reference pictures used by a video encode operation is a list of image subregions used as the source of reference picture data and related parameters, and is derived from the VkVideoReferenceSlotInfoKHR structures provided as the elements of the pEncodeInfo->pReferenceSlots array. For each element of pEncodeInfo->pReferenceSlots, one or more elements are added to the active reference picture list, as defined by the codec-specific semantics. Each element of this list contains the following information:

  • The image subregion within the image subresource referred to by the video picture resource used as the reference picture.

  • The DPB slot index the reference picture is associated with.

  • The codec-specific reference information related to the reference picture.

Reconstructed Picture Information

Information related to the optional reconstructed picture used by a video encode operation is derived from the VkVideoReferenceSlotInfoKHR structure pointed to by pEncodeInfo->pSetupReferenceSlot, if not NULL, as defined by the codec-specific semantics, and consists of the following:

  • The image subregion within the image subresource referred to by the video picture resource used as the reconstructed picture.

  • The DPB slot index to use for picture reconstruction.

  • The codec-specific reference information related to the reconstructed picture.

Specifying a valid VkVideoReferenceSlotInfoKHR structure in pEncodeInfo->pSetupReferenceSlot is always required, unless the video session was created with VkVideoSessionCreateInfoKHR::maxDpbSlot equal to zero. However, the DPB slot identified by pEncodeInfo->pSetupReferenceSlot→slotIndex is only activated with the reconstructed picture specified in pEncodeInfo->pSetupReferenceSlot→pPictureResource if reference picture setup is requested according to the codec-specific semantics.

If reconstructed picture information is specified, but reference picture setup is not requested, according to the codec-specific semantics, the contents of the video picture resource corresponding to the reconstructed picture will be undefined after the video encode operation.

Some implementations may always output the reconstructed picture or use it as temporary storage during the video encode operation even when the reconstructed picture is not marked for future reference.

Encode Input Picture Information

Information related to the encode input picture used by a video encode operation is derived from pEncodeInfo->srcPictureResource and any codec-specific parameters provided in the pEncodeInfo->pNext chain, as defined by the codec-specific semantics, and consists of the following:

  • The image subregion within the image subresource referred to by the video picture resource used as the encode input picture.

  • The codec-specific picture information related to the encoded picture.

Several limiting values are defined below that are referenced by the relevant valid usage statements of this command.

  • Let uint32_t activeReferencePictureCount be the size of the list of active reference pictures used by the video encode operation. Unless otherwise defined, activeReferencePictureCount is set to the value of pEncodeInfo->referenceSlotCount.

  • Let VkOffset2D codedOffsetGranularity be the minimum alignment requirement for the coded offset of video picture resources. Unless otherwise defined, the value of the x and y members of codedOffsetGranularity are 0.

  • Let uint32_t dpbFrameUseCount[] be an array of size maxDpbSlots, where maxDpbSlots is the VkVideoSessionCreateInfoKHR::maxDpbSlots the bound video session was created with, with each element indicating the number of times a frame associated with the corresponding DPB slot index is referred to by the video coding operation. Let the initial value of each element of the array be 0.

    • If pEncodeInfo->pSetupReferenceSlot is not NULL, then dpbFrameUseCount[i] is incremented by one, where i equals pEncodeInfo->pSetupReferenceSlot→slotIndex.

    • For each element of pEncodeInfo->pReferenceSlots, dpbFrameUseCount[i] is incremented by one, where i equals the slotIndex member of the corresponding element.

  • If there is a bound video session parameters object created with VK_VIDEO_SESSION_PARAMETERS_CREATE_QUANTIZATION_MAP_COMPATIBLE_BIT_KHR, then let VkExtent2D quantizationMapTexelSize be the quantization map texel size the bound video session parameters object was created with.

  • Let VkExtent2D maxCodingBlockSize be the maximum codec-specific coding block size that may be used by the video encode operation.

  • If maxCodingBlockSize is defined, then let VkExtent2D minCodingBlockExtent be the coded extent of the encode input picture expressed in terms of codec-specific coding blocks, assuming the maximum size of such coding blocks, as defined by maxCodingBlockSize, calculated from the value of the codedExtent member of pEncodeInfo->srcPictureResource as follows:

    • minCodingBlockExtent.width = (codedExtent.width
      maxCodingBlockSize.width - 1) / maxCodingBlockSize.width

    • minCodingBlockExtent.height = (codedExtent.height
      maxCodingBlockSize.height - 1) / maxCodingBlockSize.height

  • If the bound video session object was created with an H.264 encode profile, then:

    • Let StdVideoH264PictureType h264PictureType be the picture type of the encoded picture set to the value of pStdPictureInfo->primary_pic_type specified in the VkVideoEncodeH264PictureInfoKHR structure included in the pEncodeInfo->pNext chain.

    • Let StdVideoH264PictureType h264L0PictureTypes[] and StdVideoH264PictureType h264L1PictureTypes[] be the picture types of the reference pictures in the L0 and L1 reference lists, respectively. If pStdPictureInfo->pRefLists specified in the VkVideoEncodeH264PictureInfoKHR structure included in the pEncodeInfo->pNext chain is not NULL, then for each reference index specified in the elements of the pStdPictureInfo->pRefLists→RefPicList0 and pStdPictureInfo->pRefLists→RefPicList1 arrays, if the reference index is not STD_VIDEO_H264_NO_REFERENCE_PICTURE, pStdReferenceInfo->primary_pic_type is added to h264L0PictureTypes or h264L1PictureTypes, respectively, where pStdReferenceInfo is the member of the VkVideoEncodeH264DpbSlotInfoKHR structure included in the pNext chain of the element of pEncodeInfo->pReferenceSlots for which slotIndex equals the reference index in question.

  • If the bound video session object was created with an H.265 encode profile, then:

    • Let StdVideoH265PictureType h265PictureType be the picture type of the encoded picture set to the value of pStdPictureInfo->pic_type specified in the VkVideoEncodeH265PictureInfoKHR structure included in the pEncodeInfo->pNext chain.

    • Let StdVideoH265PictureType h265L0PictureTypes[] and StdVideoH265PictureType h265L1PictureTypes[] be the picture types of the reference pictures in the L0 and L1 reference lists, respectively. If pStdPictureInfo->pRefLists specified in the VkVideoEncodeH265PictureInfoKHR structure included in the pEncodeInfo->pNext chain is not NULL, then for each reference index specified in the elements of the pStdPictureInfo->pRefLists→RefPicList0 and pStdPictureInfo->pRefLists→RefPicList1 arrays, if the reference index is not STD_VIDEO_H265_NO_REFERENCE_PICTURE, pStdReferenceInfo->pic_type is added to h265L0PictureTypes or h265L1PictureTypes, respectively, where pStdReferenceInfo is the member of the VkVideoEncodeH265DpbSlotInfoKHR structure included in the pNext chain of the element of pEncodeInfo->pReferenceSlots for which slotIndex equals the reference index in question.

  • If the bound video session object was created with an AV1 encode profile, then:

    • If the primaryReferenceCdfOnly member of the VkVideoEncodeAV1PictureInfoKHR structure included in the pEncodeInfo->pNext chain is set to VK_TRUE, then let int32_t cdfOnlyReferenceIndex be the value of VkVideoEncodeAV1PictureInfoKHR::pStdPictureInfo->primary_ref_frame.

    • Otherwise let int32_t cdfOnlyReferenceIndex be -1.

Valid Usage
  • VUID-vkCmdEncodeVideoKHR-None-08250
    The bound video session must have been created with an encode operation

  • VUID-vkCmdEncodeVideoKHR-None-07012
    The bound video session must not be in uninitialized state at the time the command is executed on the device

  • VUID-vkCmdEncodeVideoKHR-None-08318
    The bound video session parameters object must have been created with the currently set video encode quality level for the bound video session at the time the command is executed on the device

  • VUID-vkCmdEncodeVideoKHR-opCount-07174
    For each active query, the active query index corresponding to the query type of that query plus opCount must be less than or equal to the last activatable query index corresponding to the query type of that query plus one

  • VUID-vkCmdEncodeVideoKHR-pNext-08360
    If the bound video session was created with VK_VIDEO_SESSION_CREATE_INLINE_QUERIES_BIT_KHR, and the pNext chain of pEncodeInfo includes a VkVideoInlineQueryInfoKHR structure with its queryPool member specifying a valid VkQueryPool handle, then VkVideoInlineQueryInfoKHR::queryCount must equal opCount

  • VUID-vkCmdEncodeVideoKHR-pNext-08361
    If the bound video session was created with VK_VIDEO_SESSION_CREATE_INLINE_QUERIES_BIT_KHR, and the pNext chain of pEncodeInfo includes a VkVideoInlineQueryInfoKHR structure with its queryPool member specifying a valid VkQueryPool handle, then all the queries used by the command, as specified by the VkVideoInlineQueryInfoKHR structure, must be unavailable

  • VUID-vkCmdEncodeVideoKHR-queryType-08362
    If the bound video session was created with VK_VIDEO_SESSION_CREATE_INLINE_QUERIES_BIT_KHR, then the queryType used to create the queryPool specified in the VkVideoInlineQueryInfoKHR structure included in the pNext chain of pEncodeInfo must be VK_QUERY_TYPE_RESULT_STATUS_ONLY_KHR or VK_QUERY_TYPE_VIDEO_ENCODE_FEEDBACK_KHR

  • VUID-vkCmdEncodeVideoKHR-queryPool-08363
    If the bound video session was created with VK_VIDEO_SESSION_CREATE_INLINE_QUERIES_BIT_KHR, then the queryPool specified in the VkVideoInlineQueryInfoKHR structure included in the pNext chain of pEncodeInfo must have been created with a VkVideoProfileInfoKHR structure included in the pNext chain of VkQueryPoolCreateInfo identical to the one specified in VkVideoSessionCreateInfoKHR::pVideoProfile the bound video session was created with

  • VUID-vkCmdEncodeVideoKHR-queryType-08364
    If the bound video session was created with VK_VIDEO_SESSION_CREATE_INLINE_QUERIES_BIT_KHR, and the queryType used to create the queryPool specified in the VkVideoInlineQueryInfoKHR structure included in the pNext chain of pEncodeInfo is VK_QUERY_TYPE_RESULT_STATUS_ONLY_KHR, then the VkCommandPool that commandBuffer was allocated from must have been created with a queue family index that supports result status queries, as indicated by VkQueueFamilyQueryResultStatusPropertiesKHR::queryResultStatusSupport

  • VUID-vkCmdEncodeVideoKHR-pEncodeInfo-08201
    pEncodeInfo->dstBuffer must be compatible with the video profile the bound video session was created with

  • VUID-vkCmdEncodeVideoKHR-commandBuffer-08202
    If commandBuffer is an unprotected command buffer and protectedNoFault is not supported, then pEncodeInfo->dstBuffer must not be a protected buffer

  • VUID-vkCmdEncodeVideoKHR-commandBuffer-08203
    If commandBuffer is a protected command buffer and protectedNoFault is not supported, then pEncodeInfo->dstBuffer must be a protected buffer

  • VUID-vkCmdEncodeVideoKHR-pEncodeInfo-08204
    pEncodeInfo->dstBufferOffset must be an integer multiple of VkVideoCapabilitiesKHR::minBitstreamBufferOffsetAlignment, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with

  • VUID-vkCmdEncodeVideoKHR-pEncodeInfo-08205
    pEncodeInfo->dstBufferRange must be an integer multiple of VkVideoCapabilitiesKHR::minBitstreamBufferSizeAlignment, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with

  • VUID-vkCmdEncodeVideoKHR-pEncodeInfo-08206
    pEncodeInfo->srcPictureResource.imageViewBinding must be compatible with the video profile the bound video session was created with

  • VUID-vkCmdEncodeVideoKHR-pEncodeInfo-08207
    The format of pEncodeInfo->srcPictureResource.imageViewBinding must match the VkVideoSessionCreateInfoKHR::pictureFormat the bound video session was created with

  • VUID-vkCmdEncodeVideoKHR-pEncodeInfo-08208
    pEncodeInfo->srcPictureResource.codedOffset must be an integer multiple of codedOffsetGranularity

  • VUID-vkCmdEncodeVideoKHR-pEncodeInfo-08209
    pEncodeInfo->srcPictureResource.codedExtent must be between minCodedExtent and maxCodedExtent, inclusive, the bound video session was created with

  • VUID-vkCmdEncodeVideoKHR-pEncodeInfo-08210
    pEncodeInfo->srcPictureResource.imageViewBinding must have been created with VK_IMAGE_USAGE_VIDEO_ENCODE_SRC_BIT_KHR

  • VUID-vkCmdEncodeVideoKHR-commandBuffer-08211
    If commandBuffer is an unprotected command buffer and protectedNoFault is not supported, then pEncodeInfo->srcPictureResource.imageViewBinding must not have been created from a protected image

  • VUID-vkCmdEncodeVideoKHR-pEncodeInfo-08377
    pEncodeInfo->pSetupReferenceSlot must not be NULL unless the bound video session was created with VkVideoSessionCreateInfoKHR::maxDpbSlots equal to zero

  • VUID-vkCmdEncodeVideoKHR-pEncodeInfo-08213
    If pEncodeInfo->pSetupReferenceSlot is not NULL, then pEncodeInfo->pSetupReferenceSlot→slotIndex must be less than the VkVideoSessionCreateInfoKHR::maxDpbSlots specified when the bound video session was created

  • VUID-vkCmdEncodeVideoKHR-pEncodeInfo-08214
    If pEncodeInfo->pSetupReferenceSlot is not NULL, then pEncodeInfo->pSetupReferenceSlot→pPictureResource→codedOffset must be an integer multiple of codedOffsetGranularity

  • VUID-vkCmdEncodeVideoKHR-pEncodeInfo-08215
    If pEncodeInfo->pSetupReferenceSlot is not NULL, then pEncodeInfo->pSetupReferenceSlot→pPictureResource must match one of the bound reference picture resource

  • VUID-vkCmdEncodeVideoKHR-activeReferencePictureCount-08216
    activeReferencePictureCount must be less than or equal to the VkVideoSessionCreateInfoKHR::maxActiveReferencePictures specified when the bound video session was created

  • VUID-vkCmdEncodeVideoKHR-slotIndex-08217
    The slotIndex member of each element of pEncodeInfo->pReferenceSlots must be less than the VkVideoSessionCreateInfoKHR::maxDpbSlots specified when the bound video session was created

  • VUID-vkCmdEncodeVideoKHR-codedOffset-08218
    The codedOffset member of the VkVideoPictureResourceInfoKHR structure pointed to by the pPictureResource member of each element of pEncodeInfo->pReferenceSlots must be an integer multiple of codedOffsetGranularity

  • VUID-vkCmdEncodeVideoKHR-pPictureResource-08219
    The pPictureResource member of each element of pEncodeInfo->pReferenceSlots must match one of the bound reference picture resource associated with the DPB slot index specified in the slotIndex member of that element

  • VUID-vkCmdEncodeVideoKHR-pPictureResource-08220
    Each video picture resource corresponding to the pPictureResource member specified in the elements of pEncodeInfo->pReferenceSlots must be unique within pEncodeInfo->pReferenceSlots

  • VUID-vkCmdEncodeVideoKHR-dpbFrameUseCount-08221
    All elements of dpbFrameUseCount must be less than or equal to 1

  • VUID-vkCmdEncodeVideoKHR-pEncodeInfo-08222
    The image subresource referred to by pEncodeInfo->srcPictureResource must be in the VK_IMAGE_LAYOUT_VIDEO_ENCODE_SRC_KHR layout at the time the video encode operation is executed on the device

  • VUID-vkCmdEncodeVideoKHR-pEncodeInfo-08223
    If pEncodeInfo->pSetupReferenceSlot is not NULL, then the image subresource referred to by pEncodeInfo->pSetupReferenceSlot→pPictureResource must be in the VK_IMAGE_LAYOUT_VIDEO_ENCODE_DPB_KHR layout at the time the video encode operation is executed on the device

  • VUID-vkCmdEncodeVideoKHR-pPictureResource-08224
    The image subresource referred to by the pPictureResource member of each element of pEncodeInfo->pReferenceSlots must be in the VK_IMAGE_LAYOUT_VIDEO_ENCODE_DPB_KHR layout at the time the video encode operation is executed on the device

  • VUID-vkCmdEncodeVideoKHR-pEncodeInfo-10306
    If pEncodeInfo->flags includes VK_VIDEO_ENCODE_WITH_QUANTIZATION_DELTA_MAP_BIT_KHR, then the bound video session must have been created with VK_VIDEO_SESSION_CREATE_ALLOW_ENCODE_QUANTIZATION_DELTA_MAP_BIT_KHR

  • VUID-vkCmdEncodeVideoKHR-pEncodeInfo-10307
    If pEncodeInfo->flags includes VK_VIDEO_ENCODE_WITH_EMPHASIS_MAP_BIT_KHR, then the bound video session must have been created with VK_VIDEO_SESSION_CREATE_ALLOW_ENCODE_EMPHASIS_MAP_BIT_KHR

  • VUID-vkCmdEncodeVideoKHR-pEncodeInfo-10308
    If the current rate control mode is VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DEFAULT_KHR or VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR, then pEncodeInfo->flags must not include VK_VIDEO_ENCODE_WITH_EMPHASIS_MAP_BIT_KHR

  • VUID-vkCmdEncodeVideoKHR-pEncodeInfo-10309
    If pEncodeInfo->flags includes VK_VIDEO_ENCODE_WITH_QUANTIZATION_DELTA_MAP_BIT_KHR or VK_VIDEO_ENCODE_WITH_EMPHASIS_MAP_BIT_KHR, then the pNext chain of pEncodeInfo must include a VkVideoEncodeQuantizationMapInfoKHR structure with its quantizationMap member specifying a valid VkImageView handle

  • VUID-vkCmdEncodeVideoKHR-pEncodeInfo-10310
    If pEncodeInfo->flags includes VK_VIDEO_ENCODE_WITH_QUANTIZATION_DELTA_MAP_BIT_KHR or VK_VIDEO_ENCODE_WITH_EMPHASIS_MAP_BIT_KHR, then the VkImageView specified by the quantizationMap member of the VkVideoEncodeQuantizationMapInfoKHR structure included in the pNext chain must be compatible with the video profile the bound video session was created with

  • VUID-vkCmdEncodeVideoKHR-pEncodeInfo-10311
    If pEncodeInfo->flags includes VK_VIDEO_ENCODE_WITH_QUANTIZATION_DELTA_MAP_BIT_KHR, then the VkImageView specified by the quantizationMap member of the VkVideoEncodeQuantizationMapInfoKHR structure included in the pNext chain of pEncodeInfo must have been created with VK_IMAGE_USAGE_VIDEO_ENCODE_QUANTIZATION_DELTA_MAP_BIT_KHR

  • VUID-vkCmdEncodeVideoKHR-pEncodeInfo-10312
    If pEncodeInfo->flags includes VK_VIDEO_ENCODE_WITH_EMPHASIS_MAP_BIT_KHR, then the VkImageView specified by the quantizationMap member of the VkVideoEncodeQuantizationMapInfoKHR structure included in the pNext chain of pEncodeInfo must have been created with VK_IMAGE_USAGE_VIDEO_ENCODE_EMPHASIS_MAP_BIT_KHR

  • VUID-vkCmdEncodeVideoKHR-pNext-10313
    If an instance of the VkVideoEncodeQuantizationMapInfoKHR structure is included in the pNext chain of pEncodeInfo, its quantizationMap member is not VK_NULL_HANDLE, commandBuffer is an unprotected command buffer, and protectedNoFault is not supported, then VkVideoEncodeQuantizationMapInfoKHR::quantizationMap must not have been created from a protected image

  • VUID-vkCmdEncodeVideoKHR-pNext-10314
    If an instance of the VkVideoEncodeQuantizationMapInfoKHR structure is included in the pNext chain of pEncodeInfo and its quantizationMap member is not VK_NULL_HANDLE, then the image subresource range referenced by quantizationMap must be in the VK_IMAGE_LAYOUT_VIDEO_ENCODE_QUANTIZATION_MAP_KHR layout at the time the video encode operation is executed on the device

  • VUID-vkCmdEncodeVideoKHR-pNext-10315
    If an instance of the VkVideoEncodeQuantizationMapInfoKHR structure is included in the pNext chain of pEncodeInfo and its quantizationMap member is not VK_NULL_HANDLE, and there is a bound video session parameters object, then the bound video session parameters object must have been created with VK_VIDEO_SESSION_PARAMETERS_CREATE_QUANTIZATION_MAP_COMPATIBLE_BIT_KHR

  • VUID-vkCmdEncodeVideoKHR-pNext-10316
    If an instance of the VkVideoEncodeQuantizationMapInfoKHR structure is included in the pNext chain of pEncodeInfo, its quantizationMap member is not VK_NULL_HANDLE, and there is a bound video session parameters object created with VK_VIDEO_SESSION_PARAMETERS_CREATE_QUANTIZATION_MAP_COMPATIBLE_BIT_KHR, then quantizationMapExtent must equal pEncodeInfo->srcPictureResource.codedExtent / quantizationMapTexelSize

  • VUID-vkCmdEncodeVideoKHR-pNext-08225
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR, then the pNext chain of pEncodeInfo must include a VkVideoEncodeH264PictureInfoKHR structure

  • VUID-vkCmdEncodeVideoKHR-StdVideoH264SequenceParameterSet-08226
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR, then the bound video session parameters object must contain a StdVideoH264SequenceParameterSet entry with seq_parameter_set_id matching StdVideoEncodeH264PictureInfo::seq_parameter_set_id that is provided in the pStdPictureInfo member of the VkVideoEncodeH264PictureInfoKHR structure included in the pNext chain of pEncodeInfo

  • VUID-vkCmdEncodeVideoKHR-StdVideoH264PictureParameterSet-08227
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR, then the bound video session parameters object must contain a StdVideoH264PictureParameterSet entry with seq_parameter_set_id and pic_parameter_set_id matching StdVideoEncodeH264PictureInfo::seq_parameter_set_id and StdVideoEncodeH264PictureInfo::pic_parameter_set_id, respectively, that are provided in the pStdPictureInfo member of the VkVideoEncodeH264PictureInfoKHR structure included in the pNext chain of pEncodeInfo

  • VUID-vkCmdEncodeVideoKHR-pEncodeInfo-08228
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR and pEncodeInfo->pSetupReferenceSlot is not NULL, then the pNext chain of pEncodeInfo->pSetupReferenceSlot must include a VkVideoEncodeH264DpbSlotInfoKHR structure

  • VUID-vkCmdEncodeVideoKHR-pNext-08229
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR, then the pNext chain of each element of pEncodeInfo->pReferenceSlots must include a VkVideoEncodeH264DpbSlotInfoKHR structure

  • VUID-vkCmdEncodeVideoKHR-constantQp-08269
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR and the current rate control mode is not VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR, then VkVideoEncodeH264NaluSliceInfoKHR::constantQp must be zero for each element of the pNaluSliceEntries member of the VkVideoEncodeH264PictureInfoKHR structure included in the pNext chain of pEncodeInfo

  • VUID-vkCmdEncodeVideoKHR-constantQp-08270
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR and the current rate control mode is VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR, then VkVideoEncodeH264NaluSliceInfoKHR::constantQp must be between VkVideoEncodeH264CapabilitiesKHR::minQp and VkVideoEncodeH264CapabilitiesKHR::maxQp, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with, for each element of the pNaluSliceEntries member of the VkVideoEncodeH264PictureInfoKHR structure included in the pNext chain of pEncodeInfo

  • VUID-vkCmdEncodeVideoKHR-constantQp-08271
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR and VkVideoEncodeH264CapabilitiesKHR::flags does not include VK_VIDEO_ENCODE_H264_CAPABILITY_PER_SLICE_CONSTANT_QP_BIT_KHR, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with, then VkVideoEncodeH264NaluSliceInfoKHR::constantQp must have the same value for each element of the pNaluSliceEntries member of the VkVideoEncodeH264PictureInfoKHR structure included in the pNext chain of pEncodeInfo

  • VUID-vkCmdEncodeVideoKHR-naluSliceEntryCount-08302
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR, then the naluSliceEntryCount member of the VkVideoEncodeH264PictureInfoKHR structure included in the pNext chain of pEncodeInfo must be less than or equal to minCodingBlockExtent.width multiplied by minCodingBlockExtent.height

  • VUID-vkCmdEncodeVideoKHR-naluSliceEntryCount-08312
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR and VkVideoEncodeH264CapabilitiesKHR::flags does not include VK_VIDEO_ENCODE_H264_CAPABILITY_ROW_UNALIGNED_SLICE_BIT_KHR, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with, then the naluSliceEntryCount member of the VkVideoEncodeH264PictureInfoKHR structure included in the pNext chain of pEncodeInfo must be less than or equal to minCodingBlockExtent.height

  • VUID-vkCmdEncodeVideoKHR-pNext-08352
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR, the pNext chain of pEncodeInfo includes a VkVideoEncodeH264PictureInfoKHR structure, and pEncodeInfo->referenceSlotCount is greater than zero, then VkVideoEncodeH264PictureInfoKHR::pStdPictureInfo->pRefLists must not be NULL

  • VUID-vkCmdEncodeVideoKHR-pNext-08339
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR, the pNext chain of pEncodeInfo includes a VkVideoEncodeH264PictureInfoKHR structure, and VkVideoEncodeH264PictureInfoKHR::pStdPictureInfo->pRefLists is not NULL, then each element of the RefPicList0 and RefPicList1 array members of the StdVideoEncodeH264ReferenceListsInfo structure pointed to by VkVideoEncodeH264PictureInfoKHR::pStdPictureInfo->pRefLists must either be STD_VIDEO_H264_NO_REFERENCE_PICTURE or must equal the slotIndex member of one of the elements of pEncodeInfo->pReferenceSlots

  • VUID-vkCmdEncodeVideoKHR-pNext-08353
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR, the pNext chain of pEncodeInfo includes a VkVideoEncodeH264PictureInfoKHR structure, and pEncodeInfo->referenceSlotCount is greater than zero, then the slotIndex member of each element of pEncodeInfo->pReferenceSlots must equal one of the elements of the RefPicList0 or RefPicList1 array members of the StdVideoEncodeH264ReferenceListsInfo structure pointed to by VkVideoEncodeH264PictureInfoKHR::pStdPictureInfo->pRefLists

  • VUID-vkCmdEncodeVideoKHR-maxPPictureL0ReferenceCount-08340
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR and VkVideoEncodeH264CapabilitiesKHR::maxPPictureL0ReferenceCount is zero, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with, then h264PictureType and each element of h264L0PictureTypes and h264L1PictureTypes must not be STD_VIDEO_H264_PICTURE_TYPE_P

  • VUID-vkCmdEncodeVideoKHR-maxBPictureL0ReferenceCount-08341
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR and VkVideoEncodeH264CapabilitiesKHR::maxBPictureL0ReferenceCount and VkVideoEncodeH264CapabilitiesKHR::maxL1ReferenceCount are both zero, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with, then h264PictureType and each element of h264L0PictureTypes and h264L1PictureTypes must not be STD_VIDEO_H264_PICTURE_TYPE_B

  • VUID-vkCmdEncodeVideoKHR-flags-08342
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR and VkVideoEncodeH264CapabilitiesKHR::flags does not include VK_VIDEO_ENCODE_H264_CAPABILITY_B_FRAME_IN_L0_LIST_BIT_KHR, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with, then each element of h264L0PictureTypes must not be STD_VIDEO_H264_PICTURE_TYPE_B

  • VUID-vkCmdEncodeVideoKHR-flags-08343
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR and VkVideoEncodeH264CapabilitiesKHR::flags does not include VK_VIDEO_ENCODE_H264_CAPABILITY_B_FRAME_IN_L1_LIST_BIT_KHR, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with, then each element of h264L1PictureTypes must not be STD_VIDEO_H264_PICTURE_TYPE_B

  • VUID-vkCmdEncodeVideoKHR-pNext-08230
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, then the pNext chain of pEncodeInfo must include a VkVideoEncodeH265PictureInfoKHR structure

  • VUID-vkCmdEncodeVideoKHR-StdVideoH265VideoParameterSet-08231
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, then the bound video session parameters object must contain a StdVideoH265VideoParameterSet entry with vps_video_parameter_set_id matching StdVideoEncodeH265PictureInfo::sps_video_parameter_set_id that is provided in the pStdPictureInfo member of the VkVideoEncodeH265PictureInfoKHR structure included in the pNext chain of pEncodeInfo

  • VUID-vkCmdEncodeVideoKHR-StdVideoH265SequenceParameterSet-08232
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, then the bound video session parameters object must contain a StdVideoH265SequenceParameterSet entry with sps_video_parameter_set_id and sps_seq_parameter_set_id matching StdVideoEncodeH265PictureInfo::sps_video_parameter_set_id and StdVideoEncodeH265PictureInfo::pps_seq_parameter_set_id, respectively, that are provided in the pStdPictureInfo member of the VkVideoEncodeH265PictureInfoKHR structure included in the pNext chain of pEncodeInfo

  • VUID-vkCmdEncodeVideoKHR-StdVideoH265PictureParameterSet-08233
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, then the bound video session parameters object must contain a StdVideoH265PictureParameterSet entry with sps_video_parameter_set_id, pps_seq_parameter_set_id, and pps_pic_parameter_set_id matching StdVideoEncodeH265PictureInfo::sps_video_parameter_set_id, StdVideoEncodeH265PictureInfo::pps_seq_parameter_set_id, and StdVideoEncodeH265PictureInfo::pps_pic_parameter_set_id, respectively, that are provided in the pStdPictureInfo member of the VkVideoEncodeH265PictureInfoKHR structure included in the pNext chain of pEncodeInfo

  • VUID-vkCmdEncodeVideoKHR-pEncodeInfo-08234
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR and pEncodeInfo->pSetupReferenceSlot is not NULL, then the pNext chain of pEncodeInfo->pSetupReferenceSlot must include a VkVideoEncodeH265DpbSlotInfoKHR structure

  • VUID-vkCmdEncodeVideoKHR-pNext-08235
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, then the pNext chain of each element of pEncodeInfo->pReferenceSlots must include a VkVideoEncodeH265DpbSlotInfoKHR structure

  • VUID-vkCmdEncodeVideoKHR-constantQp-08272
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR and the current rate control mode is not VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR, then VkVideoEncodeH265NaluSliceSegmentInfoKHR::constantQp must be zero for each element of the pNaluSliceSegmentEntries member of the VkVideoEncodeH265PictureInfoKHR structure included in the pNext chain of pEncodeInfo

  • VUID-vkCmdEncodeVideoKHR-constantQp-08273
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR and the current rate control mode is VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR, then VkVideoEncodeH265NaluSliceSegmentInfoKHR::constantQp must be between VkVideoEncodeH265CapabilitiesKHR::minQp and VkVideoEncodeH265CapabilitiesKHR::maxQp, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with, for each element of the pNaluSliceSegmentEntries member of the VkVideoEncodeH265PictureInfoKHR structure included in the pNext chain of pEncodeInfo

  • VUID-vkCmdEncodeVideoKHR-constantQp-08274
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR and VkVideoEncodeH265CapabilitiesKHR::flags does not include VK_VIDEO_ENCODE_H265_CAPABILITY_PER_SLICE_SEGMENT_CONSTANT_QP_BIT_KHR, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with, then VkVideoEncodeH265NaluSliceSegmentInfoKHR::constantQp must have the same value for each element of the pNaluSliceSegmentEntries member of the VkVideoEncodeH264PictureInfoKHR structure included in the pNext chain of pEncodeInfo

  • VUID-vkCmdEncodeVideoKHR-naluSliceSegmentEntryCount-08307
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, then the naluSliceSegmentEntryCount member of the VkVideoEncodeH265PictureInfoKHR structure included in the pNext chain of pEncodeInfo must be less than or equal to minCodingBlockExtent.width multiplied by minCodingBlockExtent.height

  • VUID-vkCmdEncodeVideoKHR-naluSliceSegmentEntryCount-08313
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR and VkVideoEncodeH265CapabilitiesKHR::flags does not include VK_VIDEO_ENCODE_H265_CAPABILITY_ROW_UNALIGNED_SLICE_SEGMENT_BIT_KHR, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with, then the naluSliceSegmentEntryCount member of the VkVideoEncodeH265PictureInfoKHR structure included in the pNext chain of pEncodeInfo must be less than or equal to minCodingBlockExtent.height

  • VUID-vkCmdEncodeVideoKHR-pNext-08354
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, the pNext chain of pEncodeInfo includes a VkVideoEncodeH265PictureInfoKHR structure, and pEncodeInfo->referenceSlotCount is greater than zero, then VkVideoEncodeH265PictureInfoKHR::pStdPictureInfo->pRefLists must not be NULL

  • VUID-vkCmdEncodeVideoKHR-pNext-08344
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, the pNext chain of pEncodeInfo includes a VkVideoEncodeH265PictureInfoKHR structure, and VkVideoEncodeH265PictureInfoKHR::pStdPictureInfo->pRefLists is not NULL, then each element of the RefPicList0 and RefPicList1 array members of the StdVideoEncodeH265ReferenceListsInfo structure pointed to by VkVideoEncodeH265PictureInfoKHR::pStdPictureInfo->pRefLists must either be STD_VIDEO_H265_NO_REFERENCE_PICTURE or must equal the slotIndex member of one of the elements of pEncodeInfo->pReferenceSlots

  • VUID-vkCmdEncodeVideoKHR-pNext-08355
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, the pNext chain of pEncodeInfo includes a VkVideoEncodeH265PictureInfoKHR structure, and pEncodeInfo->referenceSlotCount is greater than zero, then the slotIndex member of each element of pEncodeInfo->pReferenceSlots must equal one of the elements of the RefPicList0 or RefPicList1 array members of the StdVideoEncodeH265ReferenceListsInfo structure pointed to by VkVideoEncodeH265PictureInfoKHR::pStdPictureInfo->pRefLists

  • VUID-vkCmdEncodeVideoKHR-maxPPictureL0ReferenceCount-08345
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR and VkVideoEncodeH265CapabilitiesKHR::maxPPictureL0ReferenceCount is zero, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with, then h265PictureType and each element of h265L0PictureTypes and h265L1PictureTypes must not be STD_VIDEO_H265_PICTURE_TYPE_P

  • VUID-vkCmdEncodeVideoKHR-maxBPictureL0ReferenceCount-08346
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR and VkVideoEncodeH265CapabilitiesKHR::maxBPictureL0ReferenceCount and VkVideoEncodeH265CapabilitiesKHR::maxL1ReferenceCount are both zero, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with, then h265PictureType and each element of h265L0PictureTypes and h265L1PictureTypes must not be STD_VIDEO_H265_PICTURE_TYPE_B

  • VUID-vkCmdEncodeVideoKHR-flags-08347
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR and VkVideoEncodeH265CapabilitiesKHR::flags does not include VK_VIDEO_ENCODE_H265_CAPABILITY_B_FRAME_IN_L0_LIST_BIT_KHR, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with, then each element of h265L0PictureTypes must not be STD_VIDEO_H264_PICTURE_TYPE_B

  • VUID-vkCmdEncodeVideoKHR-flags-08348
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR and VkVideoEncodeH265CapabilitiesKHR::flags does not include VK_VIDEO_ENCODE_H265_CAPABILITY_B_FRAME_IN_L1_LIST_BIT_KHR, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with, then each element of h265L1PictureTypes must not be STD_VIDEO_H265_PICTURE_TYPE_B

  • VUID-vkCmdEncodeVideoKHR-pNext-10317
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR, then the pNext chain of pEncodeInfo must include a VkVideoEncodeAV1PictureInfoKHR structure

  • VUID-vkCmdEncodeVideoKHR-pEncodeInfo-10318
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR and pEncodeInfo->pSetupReferenceSlot is not NULL, then the pNext chain of pEncodeInfo->pSetupReferenceSlot must include a VkVideoEncodeAV1DpbSlotInfoKHR structure

  • VUID-vkCmdEncodeVideoKHR-pNext-10319
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR, then the pNext chain of each element of pEncodeInfo->pReferenceSlots must include a VkVideoEncodeAV1DpbSlotInfoKHR structure

  • VUID-vkCmdEncodeVideoKHR-constantQIndex-10320
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR and the current rate control mode is not VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR, then the constantQIndex member of the VkVideoEncodeAV1PictureInfoKHR structure included in the pNext chain of pEncodeInfo must be zero

  • VUID-vkCmdEncodeVideoKHR-constantQIndex-10321
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR and the current rate control mode is VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR, then the constantQIndex member of the VkVideoEncodeAV1PictureInfoKHR structure included in the pNext chain of pEncodeInfo must be between VkVideoEncodeAV1CapabilitiesKHR::minQIndex and VkVideoEncodeAV1CapabilitiesKHR::maxQIndex, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with

  • VUID-vkCmdEncodeVideoKHR-flags-10322
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR and VkVideoEncodeAV1CapabilitiesKHR::flags does not include VK_VIDEO_ENCODE_AV1_CAPABILITY_FRAME_SIZE_OVERRIDE_BIT_KHR, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with, then VkVideoEncodeAV1PictureInfoKHR::pStdPictureInfo->flags.frame_size_override_flag must be zero for the VkVideoEncodeAV1PictureInfoKHR structure included in the pNext chain of pEncodeInfo

  • VUID-vkCmdEncodeVideoKHR-flags-10323
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR and VkVideoEncodeAV1CapabilitiesKHR::flags does not include VK_VIDEO_ENCODE_AV1_CAPABILITY_FRAME_SIZE_OVERRIDE_BIT_KHR, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with, then pEncodeInfo->srcPictureResource.codedExtent.width must equal StdVideoAV1SequenceHeader::max_frame_width_minus_1 + 1 of the active AV1 sequence header

  • VUID-vkCmdEncodeVideoKHR-flags-10324
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR and VkVideoEncodeAV1CapabilitiesKHR::flags does not include VK_VIDEO_ENCODE_AV1_CAPABILITY_FRAME_SIZE_OVERRIDE_BIT_KHR, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with, then pEncodeInfo->srcPictureResource.codedExtent.height must equal StdVideoAV1SequenceHeader::max_frame_height_minus_1 + 1 of the active AV1 sequence header

  • VUID-vkCmdEncodeVideoKHR-flags-10325
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR and VkVideoEncodeAV1CapabilitiesKHR::flags does not include VK_VIDEO_ENCODE_AV1_CAPABILITY_MOTION_VECTOR_SCALING_BIT_KHR, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with, then for each element i of pEncodeInfo->pReferenceSlots pEncodeInfo->pReferenceSlots[i].pPictureResource->codedExtent must match pEncodeInfo->srcPictureResource.codedExtent

  • VUID-vkCmdEncodeVideoKHR-predictionMode-10326
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR and the predictionMode member of the VkVideoEncodeAV1PictureInfoKHR structure included in the pNext chain of pEncodeInfo is VK_VIDEO_ENCODE_AV1_PREDICTION_MODE_INTRA_ONLY_KHR, then VkVideoEncodeAV1PictureInfoKHR::pStdPictureInfo->frame_type must be STD_VIDEO_AV1_FRAME_TYPE_KEY or STD_VIDEO_AV1_FRAME_TYPE_INTRA_ONLY

  • VUID-vkCmdEncodeVideoKHR-pStdPictureInfo-10327
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR and pStdPictureInfo->frame_type for the pStdPictureInfo member of the VkVideoEncodeAV1PictureInfoKHR structure included in the pNext chain of pEncodeInfo is STD_VIDEO_AV1_FRAME_TYPE_KEY or STD_VIDEO_AV1_FRAME_TYPE_INTRA_ONLY, then VkVideoEncodeAV1PictureInfoKHR::predictionMode must be VK_VIDEO_ENCODE_AV1_PREDICTION_MODE_INTRA_ONLY_KHR

  • VUID-vkCmdEncodeVideoKHR-maxSingleReferenceCount-10328
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR and VkVideoEncodeAV1CapabilitiesKHR::maxSingleReferenceCount is zero, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with, then the predictionMode member of the VkVideoEncodeAV1PictureInfoKHR structure included in the pNext chain of pEncodeInfo must not be VK_VIDEO_ENCODE_AV1_PREDICTION_MODE_SINGLE_REFERENCE_KHR

  • VUID-vkCmdEncodeVideoKHR-predictionMode-10329
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR and the predictionMode member of the VkVideoEncodeAV1PictureInfoKHR structure included in the pNext chain of pEncodeInfo is VK_VIDEO_ENCODE_AV1_PREDICTION_MODE_SINGLE_REFERENCE_KHR, then there must be at least one non-negative element of VkVideoEncodeAV1PictureInfoKHR::referenceNameSlotIndices with element index i that does not equal cdfOnlyReferenceIndex and for which bit index i is set in VkVideoEncodeAV1CapabilitiesKHR::singleReferenceNameMask, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with

  • VUID-vkCmdEncodeVideoKHR-maxUnidirectionalCompoundReferenceCount-10330
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR and VkVideoEncodeAV1CapabilitiesKHR::maxUnidirectionalCompoundReferenceCount is zero, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with, then the predictionMode member of the VkVideoEncodeAV1PictureInfoKHR structure included in the pNext chain of pEncodeInfo must not be VK_VIDEO_ENCODE_AV1_PREDICTION_MODE_UNIDIRECTIONAL_COMPOUND_KHR

  • VUID-vkCmdEncodeVideoKHR-predictionMode-10331
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR and the predictionMode member of the VkVideoEncodeAV1PictureInfoKHR structure included in the pNext chain of pEncodeInfo is VK_VIDEO_ENCODE_AV1_PREDICTION_MODE_UNIDIRECTIONAL_COMPOUND_KHR, then there must be at least two non-negative elements of VkVideoEncodeAV1PictureInfoKHR::referenceNameSlotIndices with element indices i and j where (i,j) ∈ {(0,1),(0,2),(0,3),(4,6)}, such that neither element equals cdfOnlyReferenceIndex and for which bit indices i and j are set in VkVideoEncodeAV1CapabilitiesKHR::unidirectionalCompoundReferenceNameMask, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with

  • VUID-vkCmdEncodeVideoKHR-maxBidirectionalCompoundReferenceCount-10332
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR and VkVideoEncodeAV1CapabilitiesKHR::maxBidirectionalCompoundReferenceCount is zero, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with, then the predictionMode member of the VkVideoEncodeAV1PictureInfoKHR structure included in the pNext chain of pEncodeInfo must not be VK_VIDEO_ENCODE_AV1_PREDICTION_MODE_BIDIRECTIONAL_COMPOUND_KHR

  • VUID-vkCmdEncodeVideoKHR-predictionMode-10333
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR and the predictionMode member of the VkVideoEncodeAV1PictureInfoKHR structure included in the pNext chain of pEncodeInfo is VK_VIDEO_ENCODE_AV1_PREDICTION_MODE_BIDIRECTIONAL_COMPOUND_KHR, then there must be at least two non-negative elements of VkVideoEncodeAV1PictureInfoKHR::referenceNameSlotIndices with element indices i ∈ {0,1,2,3} and j ∈ {4,5,6}, respectively, such that neither element equals cdfOnlyReferenceIndex, and for which bit indices i and j are set in VkVideoEncodeAV1CapabilitiesKHR::bidirectionalCompoundReferenceNameMask, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with

  • VUID-vkCmdEncodeVideoKHR-referenceNameSlotIndices-10334
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR, then each element of the referenceNameSlotIndices array member of the VkVideoEncodeAV1PictureInfoKHR structure included in the pNext chain of pEncodeInfo must either be negative or must equal the slotIndex member of one of the elements of pEncodeInfo->pReferenceSlots

  • VUID-vkCmdEncodeVideoKHR-slotIndex-10335
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR, then the slotIndex member of each element of pEncodeInfo->pReferenceSlots must equal one of the elements of the referenceNameSlotIndices array member of the VkVideoEncodeAV1PictureInfoKHR structure included in the pNext chain of pEncodeInfo

  • VUID-vkCmdEncodeVideoKHR-pExtensionHeader-10336
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR and the pExtensionHeader member of VkVideoEncodeAV1PictureInfoKHR::pStdPictureInfo for the VkVideoEncodeAV1PictureInfoKHR structure included in the pNext chain of pEncodeInfo is not NULL, then pExtensionHeader->temporal_id must be less than VkVideoEncodeAV1CapabilitiesKHR::maxTemporalLayerCount, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with

  • VUID-vkCmdEncodeVideoKHR-pExtensionHeader-10337
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR and the pExtensionHeader member of VkVideoEncodeAV1PictureInfoKHR::pStdPictureInfo for the VkVideoEncodeAV1PictureInfoKHR structure included in the pNext chain of pEncodeInfo is not NULL, then pExtensionHeader->spatial_id must be less than VkVideoEncodeAV1CapabilitiesKHR::maxSpatialLayerCount, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with

  • VUID-vkCmdEncodeVideoKHR-pEncodeInfo-10338
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR and pEncodeInfo->pSetupReferenceSlot is not NULL, then the pExtensionHeader member of VkVideoEncodeAV1DpbSlotInfoKHR::pStdReferenceInfo for the VkVideoEncodeAV1DpbSlotInfoKHR structure included in the pNext chain of pEncodeInfo->pSetupReferenceSlot and the pExtensionHeader member of VkVideoEncodeAV1PictureInfoKHR::pStdPictureInfo for the VkVideoEncodeAV1PictureInfoKHR structure included in the pNext chain of pEncodeInfo must both be NULL or not NULL

  • VUID-vkCmdEncodeVideoKHR-pEncodeInfo-10339
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR, pEncodeInfo->pSetupReferenceSlot is not NULL, and the pExtensionHeader member of VkVideoEncodeAV1DpbSlotInfoKHR::pStdReferenceInfo for the VkVideoEncodeAV1DpbSlotInfoKHR structure included in the pNext chain of pEncodeInfo->pSetupReferenceSlot is not NULL, then pExtensionHeader->temporal_id must equal pStdPictureInfo->pExtensionHeader→temporal_id in the VkVideoEncodeAV1PictureInfoKHR structure included in the pNext chain of pEncodeInfo

  • VUID-vkCmdEncodeVideoKHR-pEncodeInfo-10340
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR, pEncodeInfo->pSetupReferenceSlot is not NULL, and the pExtensionHeader member of VkVideoEncodeAV1DpbSlotInfoKHR::pStdReferenceInfo for the VkVideoEncodeAV1DpbSlotInfoKHR structure included in the pNext chain of pEncodeInfo->pSetupReferenceSlot is not NULL, then pExtensionHeader->spatial_id must equal pStdPictureInfo->pExtensionHeader→spatial_id in the VkVideoEncodeAV1PictureInfoKHR structure included in the pNext chain of pEncodeInfo

  • VUID-vkCmdEncodeVideoKHR-pExtensionHeader-10341
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR and the pExtensionHeader member of VkVideoEncodeAV1DpbSlotInfoKHR::pStdReferenceInfo for any of the VkVideoEncodeAV1DpbSlotInfoKHR structures included in the pNext chain of any element of pEncodeInfo->pReferenceSlots is not NULL, then pExtensionHeader->temporal_id must be less than VkVideoEncodeAV1CapabilitiesKHR::maxTemporalLayerCount, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with

  • VUID-vkCmdEncodeVideoKHR-pExtensionHeader-10342
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR and the pExtensionHeader member of VkVideoEncodeAV1DpbSlotInfoKHR::pStdReferenceInfo for any of the VkVideoEncodeAV1DpbSlotInfoKHR structures included in the pNext chain of any element of pEncodeInfo->pReferenceSlots is not NULL, then pExtensionHeader->spatial_id must be less than VkVideoEncodeAV1CapabilitiesKHR::maxSpatialLayerCount, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with

  • VUID-vkCmdEncodeVideoKHR-pTileInfo-10343
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR and the pTileInfo member of VkVideoEncodeAV1PictureInfoKHR::pStdPictureInfo for the VkVideoEncodeAV1PictureInfoKHR structure included in the pNext chain of pEncodeInfo is not NULL, then pTileInfo->TileCols must be greater than 0

  • VUID-vkCmdEncodeVideoKHR-pTileInfo-10344
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR and the pTileInfo member of VkVideoEncodeAV1PictureInfoKHR::pStdPictureInfo for the VkVideoEncodeAV1PictureInfoKHR structure included in the pNext chain of pEncodeInfo is not NULL, then pTileInfo->TileRows must be greater than 0

  • VUID-vkCmdEncodeVideoKHR-pTileInfo-10345
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR and the pTileInfo member of VkVideoEncodeAV1PictureInfoKHR::pStdPictureInfo for the VkVideoEncodeAV1PictureInfoKHR structure included in the pNext chain of pEncodeInfo is not NULL, then pTileInfo->TileCols must be less than or equal to VkVideoEncodeAV1CapabilitiesKHR::maxTiles.width, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with

  • VUID-vkCmdEncodeVideoKHR-pTileInfo-10346
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR and the pTileInfo member of VkVideoEncodeAV1PictureInfoKHR::pStdPictureInfo for the VkVideoEncodeAV1PictureInfoKHR structure included in the pNext chain of pEncodeInfo is not NULL, then pTileInfo->TileRows must be less than or equal to VkVideoEncodeAV1CapabilitiesKHR::maxTiles.height, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with

  • VUID-vkCmdEncodeVideoKHR-pTileInfo-10347
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR and the pTileInfo member of VkVideoEncodeAV1PictureInfoKHR::pStdPictureInfo for the VkVideoEncodeAV1PictureInfoKHR structure included in the pNext chain of pEncodeInfo is not NULL, then pEncodeInfo->srcPictureResource.codedExtent.width / pTileInfo->TileCols must be between VkVideoEncodeAV1CapabilitiesKHR::minTileSize.width and VkVideoEncodeAV1CapabilitiesKHR::maxTileSize.width, inclusive, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with

  • VUID-vkCmdEncodeVideoKHR-pTileInfo-10348
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR and the pTileInfo member of VkVideoEncodeAV1PictureInfoKHR::pStdPictureInfo for the VkVideoEncodeAV1PictureInfoKHR structure included in the pNext chain of pEncodeInfo is not NULL, then pEncodeInfo->srcPictureResource.codedExtent.height / pTileInfo->TileRows must be between VkVideoEncodeAV1CapabilitiesKHR::minTileSize.height and VkVideoEncodeAV1CapabilitiesKHR::maxTileSize.height, inclusive, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile the bound video session was created with

  • VUID-vkCmdEncodeVideoKHR-pStdPictureInfo-10349
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR, then VkVideoEncodeAV1PictureInfoKHR::pStdPictureInfo->flags.segmentation_enabled for the VkVideoEncodeAV1PictureInfoKHR structure included in the pNext chain of pEncodeInfo must be zero

  • VUID-vkCmdEncodeVideoKHR-pStdPictureInfo-10350
    If the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR, then VkVideoEncodeAV1PictureInfoKHR::pStdPictureInfo->pSegmentation for the VkVideoEncodeAV1PictureInfoKHR structure included in the pNext chain of pEncodeInfo must be NULL

Valid Usage (Implicit)
  • VUID-vkCmdEncodeVideoKHR-commandBuffer-parameter
    commandBuffer must be a valid VkCommandBuffer handle

  • VUID-vkCmdEncodeVideoKHR-pEncodeInfo-parameter
    pEncodeInfo must be a valid pointer to a valid VkVideoEncodeInfoKHR structure

  • VUID-vkCmdEncodeVideoKHR-commandBuffer-recording
    commandBuffer must be in the recording state

  • VUID-vkCmdEncodeVideoKHR-commandBuffer-cmdpool
    The VkCommandPool that commandBuffer was allocated from must support encode operations

  • VUID-vkCmdEncodeVideoKHR-renderpass
    This command must only be called outside of a render pass instance

  • VUID-vkCmdEncodeVideoKHR-videocoding
    This command must only be called inside of a video coding scope

  • VUID-vkCmdEncodeVideoKHR-bufferlevel
    commandBuffer must be a primary VkCommandBuffer

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
Command Buffer Levels Render Pass Scope Video Coding Scope Supported Queue Types Command Type

Primary

Outside

Inside

Encode

Action

The VkVideoEncodeInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_queue
typedef struct VkVideoEncodeInfoKHR {
    VkStructureType                       sType;
    const void*                           pNext;
    VkVideoEncodeFlagsKHR                 flags;
    VkBuffer                              dstBuffer;
    VkDeviceSize                          dstBufferOffset;
    VkDeviceSize                          dstBufferRange;
    VkVideoPictureResourceInfoKHR         srcPictureResource;
    const VkVideoReferenceSlotInfoKHR*    pSetupReferenceSlot;
    uint32_t                              referenceSlotCount;
    const VkVideoReferenceSlotInfoKHR*    pReferenceSlots;
    uint32_t                              precedingExternallyEncodedBytes;
} VkVideoEncodeInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is a pointer to a structure extending this structure.

  • flags is a bitmask of VkVideoEncodeFlagBitsKHR indicating video encode command flags.

  • dstBuffer is the destination video bitstream buffer to write the encoded bitstream to.

  • dstBufferOffset is the starting offset in bytes from the start of dstBuffer to write the encoded bitstream to.

  • dstBufferRange is the maximum bitstream size in bytes that can be written to dstBuffer, starting from dstBufferOffset.

  • srcPictureResource is the video picture resource to use as the encode input picture.

  • pSetupReferenceSlot is NULL or a pointer to a VkVideoReferenceSlotInfoKHR structure specifying the reconstructed picture information.

  • referenceSlotCount is the number of elements in the pReferenceSlots array.

  • pReferenceSlots is NULL or a pointer to an array of VkVideoReferenceSlotInfoKHR structures describing the DPB slots and corresponding reference picture resources to use in this video encode operation (the set of active reference pictures).

  • precedingExternallyEncodedBytes is the number of bytes externally encoded by the application to the video bitstream and is used to update the internal state of the implementation’s rate control algorithm to account for the bitrate budget consumed by these externally encoded bytes.

Valid Usage
  • VUID-VkVideoEncodeInfoKHR-dstBuffer-08236
    dstBuffer must have been created with VK_BUFFER_USAGE_VIDEO_ENCODE_DST_BIT_KHR set

  • VUID-VkVideoEncodeInfoKHR-dstBufferOffset-08237
    dstBufferOffset must be less than the size of dstBuffer

  • VUID-VkVideoEncodeInfoKHR-dstBufferRange-08238
    dstBufferRange must be less than or equal to the size of dstBuffer minus dstBufferOffset

  • VUID-VkVideoEncodeInfoKHR-pSetupReferenceSlot-08239
    If pSetupReferenceSlot is not NULL, then its slotIndex member must not be negative

  • VUID-VkVideoEncodeInfoKHR-pSetupReferenceSlot-08240
    If pSetupReferenceSlot is not NULL, then its pPictureResource must not be NULL

  • VUID-VkVideoEncodeInfoKHR-slotIndex-08241
    The slotIndex member of each element of pReferenceSlots must not be negative

  • VUID-VkVideoEncodeInfoKHR-pPictureResource-08242
    The pPictureResource member of each element of pReferenceSlots must not be NULL

Valid Usage (Implicit)

Bits which can be set in VkVideoEncodeInfoKHR::flags, specifying video encode flags, are:

// Provided by VK_KHR_video_encode_quantization_map
typedef enum VkVideoEncodeFlagBitsKHR {
  // Provided by VK_KHR_video_encode_quantization_map
    VK_VIDEO_ENCODE_WITH_QUANTIZATION_DELTA_MAP_BIT_KHR = 0x00000001,
  // Provided by VK_KHR_video_encode_quantization_map
    VK_VIDEO_ENCODE_WITH_EMPHASIS_MAP_BIT_KHR = 0x00000002,
} VkVideoEncodeFlagBitsKHR;
// Provided by VK_KHR_video_encode_queue
typedef VkFlags VkVideoEncodeFlagsKHR;

VkVideoEncodeFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoEncodeFlagBitsKHR.

Video Encode Rate Control

The size of the encoded bitstream data produced by video encode operations is a function of the following set of constraints:

  • The capabilities of the compression algorithms defined and employed by the used video compression standard;

  • Restrictions imposed by the selected video profile according to the rules defined by the used video compression standard;

  • Further restrictions imposed by the capabilities supported by the implementation for the selected video profile;

  • The image data in the encode input picture and the set of active reference pictures (as these affect the effectiveness of the compression algorithms employed by the video encode operations);

  • The set of codec-specific and codec-independent encoding parameters provided by the application.

These also inherently define the set of decoder capabilities required for reconstructing and processing the picture data in the encoded bitstream.

Video coding uses bitrate as the quantitative metric associated with encoded bitstream data size which expresses the rate at which video bitstream data can be transferred or processed, measured in number of bits per second. This bitrate is both a function of the encoded bitstream data size of the encoded pictures as well as the frame rate used by the video sequence.

Rate control algorithms are used by video encode operations to enable adjusting encoding parameters to achieve a target bitrate, or otherwise directly or indirectly control the bitrate of the generated video bitstream data. These algorithms are usually not defined by the used video compression standard, although some video compression standards do provide non-normative guidelines for implementations.

Accordingly, this specification does not mandate implementations to produce identical encoded bitstream data outputs in response to video encode operations, however, it does define a set of codec-independent and codec-specific parameters that enable the application to control the behavior of the rate control algorithms supported by the implementation. Some of these parameters guarantee certain implementation behavior while others provide guidance for implementations to apply various rate control heuristics.

Applications need to make sure that they configure rate control parameters appropriately and that they follow the promises made to the implementation through parameters providing guidance for the implementation’s rate control algorithms and heuristics in order to be able to get the desired rate control behavior and to be able to hit the set bitrate targets. In addition, the behavior of rate control may also differ across implementations even if the capabilities of the used video profile match between those implementations. This may happen due to implementations applying different rate control algorithms or heuristics internally, and thus even the same set of guidance parameter values may have different effects on the rate control behavior across implementations.

Rate Control Modes

After a video session is reset to the initial state, the default behavior and parameters of video encode rate control are entirely implementation-dependent and the application cannot affect the bitrate or quality parameters of the encoded bitstream data produced by video encode operations unless the application changes the rate control configuration of the video session, as described in the Video Coding Control section.

For each supported video profile, the implementation may expose a set of rate control modes that are available for use by the application when encoding bitstreams targeting that video profile. These modes allow using different rate control algorithms that fall into one of the following two categories:

  1. Per-operation rate control

  2. Stream-level rate control

In case of per-operation rate control, the bitrate of the generated video bitstream data is indirectly controlled by quality, size, or other encoding parameters specified by the application for each individual video encode operation.

In case of stream-level rate control, the application can directly specify target bitrates besides other encoding parameters to control the behavior of the rate control algorithm used by the implementation across multiple video encode operations.

The rate control modes are defined with the following enums:

// Provided by VK_KHR_video_encode_queue
typedef enum VkVideoEncodeRateControlModeFlagBitsKHR {
    VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DEFAULT_KHR = 0,
    VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR = 0x00000001,
    VK_VIDEO_ENCODE_RATE_CONTROL_MODE_CBR_BIT_KHR = 0x00000002,
    VK_VIDEO_ENCODE_RATE_CONTROL_MODE_VBR_BIT_KHR = 0x00000004,
} VkVideoEncodeRateControlModeFlagBitsKHR;
  • VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DEFAULT_KHR specifies the use of implementation-specific rate control.

  • VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR specifies that rate control is disabled and the application will specify per-operation rate control parameters controlling the encoding quality. In this mode implementations will encode pictures independently of the output bitrate of prior video encode operations.

  • VK_VIDEO_ENCODE_RATE_CONTROL_MODE_CBR_BIT_KHR specifies the use of constant bitrate (CBR) rate control mode. In this mode the implementation will attempt to produce the encoded bitstream at a constant bitrate while conforming to the constraints of other rate control parameters.

  • VK_VIDEO_ENCODE_RATE_CONTROL_MODE_VBR_BIT_KHR specifies the use of variable bitrate (VBR) rate control mode. In this mode the implementation will produce the encoded bitstream at a variable bitrate according to the constraints of other rate control parameters.

// Provided by VK_KHR_video_encode_queue
typedef VkFlags VkVideoEncodeRateControlModeFlagsKHR;

VkVideoEncodeRateControlModeFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoEncodeRateControlModeFlagBitsKHR.

Leaky Bucket Model

Video encoding implementations use the leaky bucket model for stream-level rate control. The leaky bucket is a concept referring to the interface between the video encoder and the consumer (for example, a network connection), where the video encoder produces encoded bitstream data corresponding to the encoded pictures and adds them in the leaky bucket while its content are drained by the consumer.

Analogously, a similar leaky bucket is considered to exist at the input interface of a video decoder, into which encoded bitstream data is continuously added and is subsequently consumed by the video decoder. It is desirable to avoid overflowing or underflowing this leaky bucked because:

  • In case of an underflow, the video decoder will be unable to consume encoded bitstream data in order to decode pictures (and optionally display them).

  • In case of an overflow, the leaky bucket will be unable to accommodate more encoded bitstream data and such data may need to be thrown away, leading to the loss of the corresponding encoded pictures.

These requirements can be satisfied by imposing various constraints on the encoder-side leaky bucket to avoid its overflow or underflow, depending on the used rate control algorithm and codec parameters. However, enumerating these constraints is outside the scope of this specification.

The term virtual buffer is often used as an alternative to refer to the leaky bucket.

This virtual buffer model is defined by the following parameters:

  • The bitrate (R) at which the encoded bitstream is expected to be processed.

  • The size (B) of the virtual buffer.

  • The initial occupancy (F) of the virtual buffer.

In this model the virtual buffer is used to smooth out fluctuations in the bitrate of the encoded bitstream over time without experiencing buffer overflow or underflow, as long as the bitrate of the encoded stream does not diverge from the target bitrate for extended periods of time.

This buffering may inherently impose a processing delay, as the goal of the model is to enable decoders maintain a consistent processing rate of an encoded bitstream with varying data rate.

The initial or start-up delay (D) is computed as:

D = F / R

Applications need to configure the virtual buffer with sufficient size to avoid or minimize buffer overflows and underflows while also keeping it small enough to meet their latency goals.

Rate Control Layers

Some video compression standards and video profiles allow associating encoded pictures with specific video coding layers. The name, identification, and semantics associated with such video coding layers are defined by the corresponding video compression standards.

Analogously, stream-level rate control can be configured to use one or more rate control layers:

  • When a single rate control layer is configured, it is applied to all encoded pictures, regardless of the picture’s video coding layer. In this case the distribution of the available bitrate budget across video coding layers is implementation-dependent.

  • When multiple rate control layers are configured, each rate control layer is applied to the corresponding video coding layer, i.e. only across encoded pictures pertaining to the corresponding video coding layer.

Individual rate control layers are identified using layer indices between zero and N-1, where N is the number of active rate control layers.

Rate control layers are only applicable when using stream-level rate control modes.

Rate Control State

Rate control state is maintained by the implementation in the video session objects and its parameters are specified using an instance of the VkVideoEncodeRateControlInfoKHR structure. The complete rate control state of a video session is defined by the following set of parameters:

Two rate control states match if all the parameters listed above match between them.

The VkVideoEncodeRateControlInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_queue
typedef struct VkVideoEncodeRateControlInfoKHR {
    VkStructureType                                sType;
    const void*                                    pNext;
    VkVideoEncodeRateControlFlagsKHR               flags;
    VkVideoEncodeRateControlModeFlagBitsKHR        rateControlMode;
    uint32_t                                       layerCount;
    const VkVideoEncodeRateControlLayerInfoKHR*    pLayers;
    uint32_t                                       virtualBufferSizeInMs;
    uint32_t                                       initialVirtualBufferSizeInMs;
} VkVideoEncodeRateControlInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • flags is reserved for future use.

  • rateControlMode is a VkVideoEncodeRateControlModeFlagBitsKHR value specifying the rate control mode.

  • layerCount specifies the number of rate control layers to use.

  • pLayers is a pointer to an array of layerCount VkVideoEncodeRateControlLayerInfoKHR structures, each specifying the rate control configuration of the corresponding rate control layer.

  • virtualBufferSizeInMs is the size in milliseconds of the virtual buffer used by the implementation’s rate control algorithm for the leaky bucket model, with respect to the average bitrate of the stream calculated by summing the values of the averageBitrate members of the elements of the pLayers array.

  • initialVirtualBufferSizeInMs is the initial occupancy in milliseconds of the virtual buffer used by the implementation’s rate control algorithm for the leaky bucket model.

If layerCount is zero then the values of virtualBufferSizeInMs and initialVirtualBufferSizeInMs are ignored.

This structure can be specified in the following places:

Including this structure in the pNext chain of VkVideoCodingControlInfoKHR and including VK_VIDEO_CODING_CONTROL_ENCODE_RATE_CONTROL_BIT_KHR in VkVideoCodingControlInfoKHR::flags enables updating the rate control configuration of the bound video session. This replaces the entire rate control configuration of the bound video session and may reset the state of all enabled rate control layers to an initial state according to the codec-specific rate control semantics defined in the corresponding sections listed below.

When layerCount is greater than one, multiple rate control layers are configured, and each rate control layer is applied to the corresponding video coding layer identified by the index of the corresponding element of pLayer.

  • If the video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR, then this index specifies the H.264 temporal layer ID of the video coding layer the rate control layer is applied to.

  • If the video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, then this index specifies the H.265 temporal ID of the video coding layer the rate control layer is applied to.

  • If the video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR, then this index specifies the AV1 temporal ID of the temporal layer the rate control layer is applied to.

Additional structures providing codec-specific rate control parameters can be included in the pNext chain of VkVideoCodingControlInfoKHR depending on the video profile the bound video session was created. For further details see:

The new rate control configuration takes effect when the corresponding vkCmdControlVideoCodingKHR is executed on the device, and only impacts video encode operations that follow in execution order.

Valid Usage
  • VUID-VkVideoEncodeRateControlInfoKHR-rateControlMode-08248
    If rateControlMode is VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DEFAULT_KHR or VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR, then layerCount must be 0

  • VUID-VkVideoEncodeRateControlInfoKHR-rateControlMode-08275
    If rateControlMode is VK_VIDEO_ENCODE_RATE_CONTROL_MODE_CBR_BIT_KHR or VK_VIDEO_ENCODE_RATE_CONTROL_MODE_VBR_BIT_KHR, then layerCount must be greater than 0

  • VUID-VkVideoEncodeRateControlInfoKHR-rateControlMode-08244
    If rateControlMode is not VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DEFAULT_KHR, then it must specify one of the bits included in VkVideoEncodeCapabilitiesKHR::rateControlModes, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the used video profile

  • VUID-VkVideoEncodeRateControlInfoKHR-layerCount-08245
    layerCount member must be less than or equal to VkVideoEncodeCapabilitiesKHR::maxRateControlLayers, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the used video profile

  • VUID-VkVideoEncodeRateControlInfoKHR-pLayers-08276
    For each element of pLayers, its averageBitrate member must be between 1 and VkVideoEncodeCapabilitiesKHR::maxBitrate, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the used video profile

  • VUID-VkVideoEncodeRateControlInfoKHR-pLayers-08277
    For each element of pLayers, its maxBitrate member must be between 1 and VkVideoEncodeCapabilitiesKHR::maxBitrate, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the used video profile

  • VUID-VkVideoEncodeRateControlInfoKHR-rateControlMode-08356
    If rateControlMode is VK_VIDEO_ENCODE_RATE_CONTROL_MODE_CBR_BIT_KHR, then for each element of pLayers, its averageBitrate member must equal its maxBitrate member

  • VUID-VkVideoEncodeRateControlInfoKHR-rateControlMode-08278
    If rateControlMode is VK_VIDEO_ENCODE_RATE_CONTROL_MODE_VBR_BIT_KHR, then for each element of pLayers, its averageBitrate member must be less than or equal to its maxBitrate member

  • VUID-VkVideoEncodeRateControlInfoKHR-layerCount-08357
    If layerCount is not zero, then virtualBufferSizeInMs must be greater than zero

  • VUID-VkVideoEncodeRateControlInfoKHR-layerCount-08358
    If layerCount is not zero, then initialVirtualBufferSizeInMs must be less than or equal to virtualBufferSizeInMs

  • VUID-VkVideoEncodeRateControlInfoKHR-videoCodecOperation-07022
    If the videoCodecOperation of the used video profile is VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR, the pNext chain this structure is included in also includes an instance of the VkVideoEncodeH264RateControlInfoKHR structure, and layerCount is greater than 1, then layerCount must equal VkVideoEncodeH264RateControlInfoKHR::temporalLayerCount

  • VUID-VkVideoEncodeRateControlInfoKHR-videoCodecOperation-07025
    If the videoCodecOperation of the used video profile is VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, the pNext chain this structure is included in also includes an instance of the VkVideoEncodeH265RateControlInfoKHR structure, and layerCount is greater than 1, then layerCount must equal VkVideoEncodeH265RateControlInfoKHR::subLayerCount

  • VUID-VkVideoEncodeRateControlInfoKHR-videoCodecOperation-10351
    If the videoCodecOperation of the used video profile is VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR, the pNext chain this structure is included in also includes an instance of the VkVideoEncodeAV1RateControlInfoKHR structure, and layerCount is greater than 1, then layerCount must equal VkVideoEncodeAV1RateControlInfoKHR::temporalLayerCount

Valid Usage (Implicit)
  • VUID-VkVideoEncodeRateControlInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_RATE_CONTROL_INFO_KHR

  • VUID-VkVideoEncodeRateControlInfoKHR-flags-zerobitmask
    flags must be 0

  • VUID-VkVideoEncodeRateControlInfoKHR-rateControlMode-parameter
    If rateControlMode is not 0, rateControlMode must be a valid VkVideoEncodeRateControlModeFlagBitsKHR value

  • VUID-VkVideoEncodeRateControlInfoKHR-pLayers-parameter
    If layerCount is not 0, pLayers must be a valid pointer to an array of layerCount valid VkVideoEncodeRateControlLayerInfoKHR structures

// Provided by VK_KHR_video_encode_queue
typedef VkFlags VkVideoEncodeRateControlFlagsKHR;

VkVideoEncodeRateControlFlagsKHR is a bitmask type for setting a mask, but currently reserved for future use.

Rate Control Layer State

The configuration of individual rate control layers is specified using an instance of the VkVideoEncodeRateControlLayerInfoKHR structure.

The VkVideoEncodeRateControlLayerInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_queue
typedef struct VkVideoEncodeRateControlLayerInfoKHR {
    VkStructureType    sType;
    const void*        pNext;
    uint64_t           averageBitrate;
    uint64_t           maxBitrate;
    uint32_t           frameRateNumerator;
    uint32_t           frameRateDenominator;
} VkVideoEncodeRateControlLayerInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is a pointer to a structure extending this structure.

  • averageBitrate is the average bitrate to be targeted by the implementation’s rate control algorithm.

  • maxBitrate is the peak bitrate to be targeted by the implementation’s rate control algorithm.

  • frameRateNumerator is the numerator of the frame rate assumed by the implementation’s rate control algorithm.

  • frameRateDenominator is the denominator of the frame rate assumed by the implementation’s rate control algorithm.

The ability of the implementation’s rate control algorithm to be able to match the requested average and/or peak bitrates may be limited by the set of other codec-independent and codec-specific rate control parameters specified by the application, the input content, as well as the application conforming to the rate control guidance provided to the implementation, as described earlier.

Additional structures providing codec-specific rate control parameters can be included in the pNext chain of VkVideoEncodeRateControlLayerInfoKHR depending on the video profile the bound video session was created with. For further details see:

Valid Usage
  • VUID-VkVideoEncodeRateControlLayerInfoKHR-frameRateNumerator-08350
    frameRateNumerator must be greater than zero

  • VUID-VkVideoEncodeRateControlLayerInfoKHR-frameRateDenominator-08351
    frameRateDenominator must be greater than zero

Valid Usage (Implicit)

Video Encode Quantization Maps

Quantization maps are VkImage objects that are used in video encode operations to control the relative quantization parameter values across the encoded picture. Each texel in the quantization map controls the relative quantization parameter values used to encode the corresponding rectangular block of texels in the encode input picture.

The size of the rectangular block of texels each quantization map texel covers is referred to as the quantization map texel size.

The extent of the image subresource used as a quantization map when encoding a picture with a coded extent of (width,height) thus has to be at least (⌈width / texelSize.width⌉, ⌈height / texelSize.height⌉), where texelSize is the used quantization map texel size.

In particular, the quantization map texel at location (x,y) contains relative quantization parameter values used when encoding the texelSize sized rectangular block of the encode input picture starting at the texel location (x × texelSize.width, y × texelSize.height).

The quantization map texel size does not always match the size of the codec-specific coding blocks used during encoding. Furthermore, some video compression standards allow the size of the codec-specific coding blocks to vary across the encoded picture. In order to accommodate for such mismatches between the granularity at which quantization parameters are stored in quantization maps and the granularity at which they are applied to codec-specific coding blocks during encoding, the following mapping rules are applied to define the quantization map texel value corresponding to a given codec-specific coding block with a size (width,height) at the texel location (x,y) in the encode input picture:

  • If the size of the codec-specific coding block matches the used quantization map texel size, then the fetched quantization map value corresponding to the codec-specific coding block is the texel value at the texel location (x / texelSize.width, y / texelSize.height).

  • If the size of the codec-specific coding block is smaller than the used quantization map texel size, then the fetched quantization map value corresponding to the codec-specific coding block is the texel value at the texel location (⌊x / texelSize.width⌋, ⌊y / texelSize.height⌋).

  • If the size of the codec-specific coding block is larger than the used quantization map texel size, then the fetched quantization map value corresponding to the codec-specific coding block may be any value determined as the linear interpolation of the quantization map texel values in the subregion starting at texel location (x / texelSize.width,y / texelSize.height) with a size (⌈width / texelSize.width⌉, ⌈height / texelSize.height⌉).

The actual control parameters stored in the quantization map depend on its type. This specification supports the following types of quantization maps:

Quantization Delta Maps

Quantization delta maps contain values that directly affect the codec-specific quantization parameter values used to encode the corresponding block of the encode input picture.

Quantization delta maps can be used in conjunction with any rate control mode, including VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR.

Due to their codec-specific nature, they are described in more detail in the corresponding codec-specific section for video encode operations that support them. In particular:

This specification does not support quantization delta maps for any other video encode operation.

Emphasis Maps

Emphasis maps contain values that indirectly affect the codec-specific quantization parameter values used to encode the corresponding block of the encode input picture.

The texels of emphasis maps contain values that provide input to the encoder implementation about the relative importance (emphasis) of regions of the encoded pictures in order to enable the implementation’s rate control algorithm to allocate more bitrate budget for regions of the encoded picture with higher emphasis values than to those with lower emphasis values.

Emphasis maps can only be used when the current rate control mode configured for the video session is not VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DEFAULT_KHR or VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR.

As these emphasis values only control the otherwise implementation-specific behavior of the used rate control algorithm, this specification does not impose additional restrictions on implementations beyond the ones outlined in the corresponding codec-specific sections describing quantization behavior:

This specification does not support emphasis maps for any other video encode operation.

Emphasis maps always have single channel unsigned normalized integer formats and implementations are required to support the VK_FORMAT_R8_UNORM format for emphasis maps, as reported in VkVideoFormatPropertiesKHR::format, when the video encode profile supports VK_VIDEO_ENCODE_CAPABILITY_EMPHASIS_MAP_BIT_KHR.

Quantization Map Capabilities

When calling vkGetPhysicalDeviceVideoCapabilitiesKHR with pVideoProfile->videoCodecOperation specifying an encode operation, the VkVideoEncodeQuantizationMapCapabilitiesKHR structure can be included in the pNext chain of the VkVideoCapabilitiesKHR structure to retrieve capabilities specific to video encode quantization maps.

The VkVideoEncodeQuantizationMapCapabilitiesKHR structure is defined as:

// Provided by VK_KHR_video_encode_quantization_map
typedef struct VkVideoEncodeQuantizationMapCapabilitiesKHR {
    VkStructureType    sType;
    void*              pNext;
    VkExtent2D         maxQuantizationMapExtent;
} VkVideoEncodeQuantizationMapCapabilitiesKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • maxQuantizationMapExtent indicates the maximum supported width and height of quantization maps.

Valid Usage (Implicit)
  • VUID-VkVideoEncodeQuantizationMapCapabilitiesKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_QUANTIZATION_MAP_CAPABILITIES_KHR

When calling vkGetPhysicalDeviceVideoCapabilitiesKHR to query the capabilities of an H.264 encode profile, the VkVideoEncodeH264QuantizationMapCapabilitiesKHR structure can be included in the pNext chain of the VkVideoCapabilitiesKHR structure to retrieve additional video encode quantization map capabilities specific to H.264 encode profiles.

The VkVideoEncodeH264QuantizationMapCapabilitiesKHR structure is defined as:

// Provided by VK_KHR_video_encode_h264 with VK_KHR_video_encode_quantization_map
typedef struct VkVideoEncodeH264QuantizationMapCapabilitiesKHR {
    VkStructureType    sType;
    void*              pNext;
    int32_t            minQpDelta;
    int32_t            maxQpDelta;
} VkVideoEncodeH264QuantizationMapCapabilitiesKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • minQpDelta indicates the minimum QP delta value supported for H.264 QP delta maps.

  • maxQpDelta indicates the maximum QP delta value supported for H.264 QP delta maps.

Valid Usage (Implicit)
  • VUID-VkVideoEncodeH264QuantizationMapCapabilitiesKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_H264_QUANTIZATION_MAP_CAPABILITIES_KHR

When calling vkGetPhysicalDeviceVideoCapabilitiesKHR to query the capabilities of an H.265 encode profile, the VkVideoEncodeH265QuantizationMapCapabilitiesKHR structure can be included in the pNext chain of the VkVideoCapabilitiesKHR structure to retrieve additional video encode quantization map capabilities specific to H.265 encode profiles.

The VkVideoEncodeH265QuantizationMapCapabilitiesKHR structure is defined as:

// Provided by VK_KHR_video_encode_h265 with VK_KHR_video_encode_quantization_map
typedef struct VkVideoEncodeH265QuantizationMapCapabilitiesKHR {
    VkStructureType    sType;
    void*              pNext;
    int32_t            minQpDelta;
    int32_t            maxQpDelta;
} VkVideoEncodeH265QuantizationMapCapabilitiesKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • minQpDelta indicates the minimum QP delta value supported for H.265 QP delta maps.

  • maxQpDelta indicates the maximum QP delta value supported for H.265 QP delta maps.

Valid Usage (Implicit)
  • VUID-VkVideoEncodeH265QuantizationMapCapabilitiesKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_H265_QUANTIZATION_MAP_CAPABILITIES_KHR

When calling vkGetPhysicalDeviceVideoCapabilitiesKHR to query the capabilities of an AV1 encode profile, the VkVideoEncodeAV1QuantizationMapCapabilitiesKHR structure can be included in the pNext chain of the VkVideoCapabilitiesKHR structure to retrieve additional video encode quantization map capabilities specific to AV1 encode profiles.

The VkVideoEncodeAV1QuantizationMapCapabilitiesKHR structure is defined as:

// Provided by VK_KHR_video_encode_av1 with VK_KHR_video_encode_quantization_map
typedef struct VkVideoEncodeAV1QuantizationMapCapabilitiesKHR {
    VkStructureType    sType;
    void*              pNext;
    int32_t            minQIndexDelta;
    int32_t            maxQIndexDelta;
} VkVideoEncodeAV1QuantizationMapCapabilitiesKHR;
Valid Usage (Implicit)
  • VUID-VkVideoEncodeAV1QuantizationMapCapabilitiesKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_AV1_QUANTIZATION_MAP_CAPABILITIES_KHR

Quantization Map Format Properties

When calling vkGetPhysicalDeviceVideoFormatPropertiesKHR, the VkVideoFormatQuantizationMapPropertiesKHR structure can be included in the pNext chain of the VkVideoFormatPropertiesKHR structure to retrieve video format properties specific to video encode quantization maps.

The VkVideoFormatQuantizationMapPropertiesKHR structure is defined as:

// Provided by VK_KHR_video_encode_quantization_map
typedef struct VkVideoFormatQuantizationMapPropertiesKHR {
    VkStructureType    sType;
    void*              pNext;
    VkExtent2D         quantizationMapTexelSize;
} VkVideoFormatQuantizationMapPropertiesKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • quantizationMapTexelSize indicates the quantization map texel size of the video format, i.e. the number of pixels corresponding to each quantization map texel.

The values returned in this structure are only defined if the allowed image usage flags returned in VkVideoFormatPropertiesKHR::imageUsageFlags for this video format include VK_IMAGE_USAGE_VIDEO_ENCODE_QUANTIZATION_DELTA_MAP_BIT_KHR or VK_IMAGE_USAGE_VIDEO_ENCODE_EMPHASIS_MAP_BIT_KHR.

Implementations may support multiple quantization map texel sizes for a particular video format which is indicated by vkGetPhysicalDeviceVideoFormatPropertiesKHR returning multiple entries with different quantizationMapTexelSize values.

Valid Usage (Implicit)
  • VUID-VkVideoFormatQuantizationMapPropertiesKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_FORMAT_QUANTIZATION_MAP_PROPERTIES_KHR

When calling vkGetPhysicalDeviceVideoFormatPropertiesKHR, the VkVideoFormatH265QuantizationMapPropertiesKHR structure can be included in the pNext chain of the VkVideoFormatPropertiesKHR structure to retrieve video format properties specific to video encode quantization maps used with an H.265 encode profile.

The VkVideoFormatH265QuantizationMapPropertiesKHR structure is defined as:

// Provided by VK_KHR_video_encode_h265 with VK_KHR_video_encode_quantization_map
typedef struct VkVideoFormatH265QuantizationMapPropertiesKHR {
    VkStructureType                     sType;
    void*                               pNext;
    VkVideoEncodeH265CtbSizeFlagsKHR    compatibleCtbSizes;
} VkVideoFormatH265QuantizationMapPropertiesKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • compatibleCtbSizes is a bitmask of VkVideoEncodeH265CtbSizeFlagBitsKHR indicating the CTB sizes that quantization maps using this video format are compatible with.

    The value of compatibleCtbSizes does not limit the use of the specific quantization map format, but does limit the implementation in being able to encode pictures with CTB sizes not included in compatibleCtbSizes but otherwise supported by the used video profile, as indicated by VkVideoEncodeH265CapabilitiesKHR::ctbSizes. In particular, using smaller quantization map texel sizes may prevent implementations from encoding with larger CTB sizes which may have a negative impact on the efficiency of the encoder.

The values returned in this structure are only defined if the allowed image usage flags returned in VkVideoFormatPropertiesKHR::imageUsageFlags for this video format include VK_IMAGE_USAGE_VIDEO_ENCODE_QUANTIZATION_DELTA_MAP_BIT_KHR or VK_IMAGE_USAGE_VIDEO_ENCODE_EMPHASIS_MAP_BIT_KHR.

Valid Usage (Implicit)
  • VUID-VkVideoFormatH265QuantizationMapPropertiesKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_FORMAT_H265_QUANTIZATION_MAP_PROPERTIES_KHR

When calling vkGetPhysicalDeviceVideoFormatPropertiesKHR, the VkVideoFormatAV1QuantizationMapPropertiesKHR structure can be included in the pNext chain of the VkVideoFormatPropertiesKHR structure to retrieve video format properties specific to video encode quantization maps used with an AV1 encode profile.

The VkVideoFormatAV1QuantizationMapPropertiesKHR structure is defined as:

// Provided by VK_KHR_video_encode_av1 with VK_KHR_video_encode_quantization_map
typedef struct VkVideoFormatAV1QuantizationMapPropertiesKHR {
    VkStructureType                           sType;
    void*                                     pNext;
    VkVideoEncodeAV1SuperblockSizeFlagsKHR    compatibleSuperblockSizes;
} VkVideoFormatAV1QuantizationMapPropertiesKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • compatibleSuperblockSizes is a bitmask of VkVideoEncodeAV1SuperblockSizeFlagBitsKHR indicating the AV1 superblock sizes that quantization maps using this video format are compatible with.

    The value of compatibleSuperblockSizes does not limit the use of the specific quantization map format, but does limit the implementation in being able to encode pictures with superblock sizes not included in compatibleSuperblockSizes but otherwise supported by the used video profile, as indicated by VkVideoEncodeAV1CapabilitiesKHR::superblockSizes. In particular, using smaller quantization map texel sizes may prevent implementations from encoding with larger superblock sizes which may have a negative impact on the efficiency of the encoder.

The values returned in this structure are only defined if the allowed image usage flags returned in VkVideoFormatPropertiesKHR::imageUsageFlags for this video format include VK_IMAGE_USAGE_VIDEO_ENCODE_QUANTIZATION_DELTA_MAP_BIT_KHR or VK_IMAGE_USAGE_VIDEO_ENCODE_EMPHASIS_MAP_BIT_KHR.

Valid Usage (Implicit)
  • VUID-VkVideoFormatAV1QuantizationMapPropertiesKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_FORMAT_AV1_QUANTIZATION_MAP_PROPERTIES_KHR

Encoding with Quantization Maps

The VkVideoEncodeQuantizationMapInfoKHR structure can be included in the pNext chain of the VkVideoEncodeInfoKHR structure passed to the vkCmdEncodeVideoKHR command to specify the quantization map used by the issued video encode operations.

The VkVideoEncodeQuantizationMapInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_quantization_map
typedef struct VkVideoEncodeQuantizationMapInfoKHR {
    VkStructureType    sType;
    const void*        pNext;
    VkImageView        quantizationMap;
    VkExtent2D         quantizationMapExtent;
} VkVideoEncodeQuantizationMapInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • quantizationMap specifies the image view to use as the quantization map.

  • quantizationMapExtent specifies the extent of the image subregion of quantizationMap to use as the quantization map starting at offset (0,0).

Valid Usage
  • VUID-VkVideoEncodeQuantizationMapInfoKHR-quantizationMapExtent-10352
    quantizationMapExtent.width must be less than or equal to the width of quantizationMap

  • VUID-VkVideoEncodeQuantizationMapInfoKHR-quantizationMapExtent-10353
    quantizationMapExtent.height must be less than or equal to the height of quantizationMap

Valid Usage (Implicit)
  • VUID-VkVideoEncodeQuantizationMapInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_QUANTIZATION_MAP_INFO_KHR

  • VUID-VkVideoEncodeQuantizationMapInfoKHR-quantizationMap-parameter
    If quantizationMap is not VK_NULL_HANDLE, quantizationMap must be a valid VkImageView handle

H.264 Encode Operations

Video encode operations using an H.264 encode profile can be used to encode elementary video stream sequences compliant to the ITU-T H.264 Specification.

Refer to the Preamble for information on how the Khronos Intellectual Property Rights Policy relates to normative references to external materials not created by Khronos.

This process is performed according to the video encode operation steps with the codec-specific semantics defined in section 8 of the ITU-T H.264 Specification as follows:

If the parameters adhere to the syntactic and semantic requirements defined in the corresponding sections of the ITU-T H.264 Specification, as described above, and the DPB slots associated with the active reference pictures all refer to valid picture references, then the video encode operation will complete successfully. Otherwise, the video encode operation may complete unsuccessfully.

H.264 Encode Parameter Overrides

Implementations may override, unless otherwise specified, any of the H.264 encode parameters specified in the following Video Std structures:

  • StdVideoH264SequenceParameterSet

  • StdVideoH264PictureParameterSet

  • StdVideoEncodeH264PictureInfo

  • StdVideoEncodeH264SliceHeader

  • StdVideoEncodeH264ReferenceInfo

All such H.264 encode parameter overrides must fulfill the conditions defined in the Video Encode Parameter Overrides section.

In addition, implementations must not override any of the following H.264 encode parameters:

  • StdVideoEncodeH264PictureInfo::primary_pic_type

  • StdVideoEncodeH264SliceHeader::slice_type

In case of H.264 encode parameters stored in video session parameters objects, applications need to use the vkGetEncodedVideoSessionParametersKHR command to determine whether any implementation overrides happened. If the query indicates that implementation overrides were applied, then the application needs to retrieve and use the encoded H.264 parameter sets in the bitstream in order to be able to produce a compliant H.264 video bitstream using the H.264 encode parameters stored in the video session parameters object.

In case of any H.264 encode parameters stored in the encoded bitstream produced by video encode operations, if the implementation supports the VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_HAS_OVERRIDES_BIT_KHR video encode feedback query flag, the application can use such queries to retrieve feedback about whether any implementation overrides have been applied to those H.264 encode parameters.

H.264 Encode Bitstream Data Access

Each video encode operation writes one or more VCL NAL units comprising of slice headers and data of the encoded picture, in the format defined in sections 7.3.3 and 7.3.4, according to the semantics defined in sections 7.4.3 and 7.4.4 of the ITU-T H.264 Specification, respectively. The number of VCL NAL units written is specified by VkVideoEncodeH264PictureInfoKHR::naluSliceEntryCount.

In addition, if VkVideoEncodeH264PictureInfoKHR::generatePrefixNalu is VK_TRUE for the video encode operation, then an additional prefix NAL unit is written before each VCL NAL unit corresponding to individual slices in the format defined in section 7.3.2.12, according to the semantics defined in section 7.4.2.12 of the ITU-T H.264 Specification, respectively.

H.264 Encode Picture Data Access

Accesses to image data within a video picture resource happen at the granularity indicated by VkVideoCapabilitiesKHR::pictureAccessGranularity, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the used video profile. Accordingly, the complete image subregion of a encode input picture, reference picture, or reconstructed picture accessed by video coding operations using an H.264 encode profile is defined as the set of texels within the coordinate range:

([0,endX), [0,endY))

Where:

  • endX equals codedExtent.width rounded up to the nearest integer multiple of pictureAccessGranularity.width and clamped to the width of the image subresource referred to by the corresponding VkVideoPictureResourceInfoKHR structure;

  • endY equals codedExtent.height rounded up to the nearest integer multiple of pictureAccessGranularity.height and clamped to the height of the image subresource referred to by the corresponding VkVideoPictureResourceInfoKHR structure;

Where codedExtent is the member of the VkVideoPictureResourceInfoKHR structure corresponding to the picture.

In case of video encode operations using an H.264 encode profile, any access to a picture at the coordinates (x,y), as defined by the ITU-T H.264 Specification, is an access to the image subresource referred to by the corresponding VkVideoPictureResourceInfoKHR structure at the texel coordinates (x,y).

Implementations may choose not to access some or all texels within particular reference pictures available to a video encode operation (e.g. due to video encode parameter overrides restricting the effective set of used reference pictures, or if the encoding algorithm chooses not to use certain subregions of the reference picture data for sample prediction).

H.264 Frame, Picture, and Slice

H.264 pictures are partitioned into slices, as defined in section 6.3 of the ITU-T H.264 Specification.

For the purposes of this specification, the H.264 slices comprising a picture are referred to as the picture partitions of the picture.

Video encode operations using an H.264 encode profile can encode slices of different types, as defined in section 7.4.3 of the ITU-T H.264 Specification, by specifying the corresponding enumeration constant value in StdVideoEncodeH264SliceHeader::slice_type in the H.264 slice header parameters from the Video Std enumeration type StdVideoH264SliceType:

  • STD_VIDEO_H264_SLICE_TYPE_P indicates that the slice is a P slice as defined in section 3.109 of the ITU-T H.264 Specification.

  • STD_VIDEO_H264_SLICE_TYPE_B indicates that the slice is a B slice as defined in section 3.9 of the ITU-T H.264 Specification.

  • STD_VIDEO_H264_SLICE_TYPE_I indicates that the slice is an I slice as defined in section 3.66 of the ITU-T H.264 Specification.

Pictures constructed from such slices can be of different types, as defined in section 7.4.2.4 of the ITU-T H.264 Specification. Video encode operations using an H.264 encode profile can encode pictures of a specific type by specifying the corresponding enumeration constant value in StdVideoEncodeH264PictureInfo::primary_pic_type in the H.264 picture information from the Video Std enumeration type StdVideoH264PictureType:

  • STD_VIDEO_H264_PICTURE_TYPE_P indicates that the picture is a P picture. A frame consisting of a P picture is also referred to as a P frame.

  • STD_VIDEO_H264_PICTURE_TYPE_B indicates that the picture is a B picture. A frame consisting of a B picture is also referred to as a B frame.

  • STD_VIDEO_H264_PICTURE_TYPE_I indicates that the picture is an I picture. A frame consisting of an I picture is also referred to as an I frame.

  • STD_VIDEO_H264_PICTURE_TYPE_IDR indicates that the picture is a special type of I picture called an IDR picture as defined in section 3.69 of the ITU-T H.264 Specification. A frame consisting of an IDR picture is also referred to as an IDR frame.

H.264 Coding Blocks

H.264 encode supports a single type of coding block called a macroblock, as defined in section 3.84 of the ITU-T H.264 Specification.

H.264 Encode Profile

A video profile supporting H.264 video encode operations is specified by setting VkVideoProfileInfoKHR::videoCodecOperation to VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR and adding a VkVideoEncodeH264ProfileInfoKHR structure to the VkVideoProfileInfoKHR::pNext chain.

The VkVideoEncodeH264ProfileInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_h264
typedef struct VkVideoEncodeH264ProfileInfoKHR {
    VkStructureType           sType;
    const void*               pNext;
    StdVideoH264ProfileIdc    stdProfileIdc;
} VkVideoEncodeH264ProfileInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • stdProfileIdc is a StdVideoH264ProfileIdc value specifying the H.264 codec profile IDC, as defined in section A.2 of the ITU-T H.264 Specification.

Valid Usage (Implicit)
  • VUID-VkVideoEncodeH264ProfileInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_H264_PROFILE_INFO_KHR

H.264 Encode Capabilities

When calling vkGetPhysicalDeviceVideoCapabilitiesKHR to query the capabilities for an H.264 encode profile, the VkVideoCapabilitiesKHR::pNext chain must include a VkVideoEncodeH264CapabilitiesKHR structure that will be filled with the profile-specific capabilities.

The VkVideoEncodeH264CapabilitiesKHR structure is defined as:

// Provided by VK_KHR_video_encode_h264
typedef struct VkVideoEncodeH264CapabilitiesKHR {
    VkStructureType                        sType;
    void*                                  pNext;
    VkVideoEncodeH264CapabilityFlagsKHR    flags;
    StdVideoH264LevelIdc                   maxLevelIdc;
    uint32_t                               maxSliceCount;
    uint32_t                               maxPPictureL0ReferenceCount;
    uint32_t                               maxBPictureL0ReferenceCount;
    uint32_t                               maxL1ReferenceCount;
    uint32_t                               maxTemporalLayerCount;
    VkBool32                               expectDyadicTemporalLayerPattern;
    int32_t                                minQp;
    int32_t                                maxQp;
    VkBool32                               prefersGopRemainingFrames;
    VkBool32                               requiresGopRemainingFrames;
    VkVideoEncodeH264StdFlagsKHR           stdSyntaxFlags;
} VkVideoEncodeH264CapabilitiesKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • flags is a bitmask of VkVideoEncodeH264CapabilityFlagBitsKHR indicating supported H.264 encoding capabilities.

  • maxLevelIdc is a StdVideoH264LevelIdc value indicating the maximum H.264 level supported by the profile, where enum constant STD_VIDEO_H264_LEVEL_IDC_<major>_<minor> identifies H.264 level <major>.<minor> as defined in section A.3 of the ITU-T H.264 Specification.

  • maxSliceCount indicates the maximum number of slices that can be encoded for a single picture. Further restrictions may apply to the number of slices that can be encoded for a single picture depending on other capabilities and codec-specific rules.

  • maxPPictureL0ReferenceCount indicates the maximum number of reference pictures the implementation supports in the reference list L0 for P pictures.

    As implementations may override the reference lists, maxPPictureL0ReferenceCount does not limit the number of elements that the application can specify in the L0 reference list for P pictures. However, if maxPPictureL0ReferenceCount is zero, then the use of P pictures is not allowed.

  • maxBPictureL0ReferenceCount indicates the maximum number of reference pictures the implementation supports in the reference list L0 for B pictures.

  • maxL1ReferenceCount indicates the maximum number of reference pictures the implementation supports in the reference list L1 if encoding of B pictures is supported.

    As implementations may override the reference lists, maxBPictureL0ReferenceCount and maxL1ReferenceCount does not limit the number of elements that the application can specify in the L0 and L1 reference lists for B pictures. However, if maxBPictureL0ReferenceCount and maxL1ReferenceCount are both zero, then the use of B pictures is not allowed.

  • maxTemporalLayerCount indicates the maximum number of H.264 temporal layers supported by the implementation.

  • expectDyadicTemporalLayerPattern indicates that the implementation’s rate control algorithms expect the application to use a dyadic temporal layer pattern when encoding multiple temporal layers.

  • minQp indicates the minimum QP value supported.

  • maxQp indicates the maximum QP value supported.

  • prefersGopRemainingFrames indicates that the implementation’s rate control algorithm prefers the application to specify the number of frames of each type remaining in the current group of pictures when beginning a video coding scope.

  • requiresGopRemainingFrames indicates that the implementation’s rate control algorithm requires the application to specify the number of frames of each type remaining in the current group of pictures when beginning a video coding scope.

  • stdSyntaxFlags is a bitmask of VkVideoEncodeH264StdFlagBitsKHR indicating capabilities related to H.264 syntax elements.

Valid Usage (Implicit)
  • VUID-VkVideoEncodeH264CapabilitiesKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_H264_CAPABILITIES_KHR

Bits which may be set in VkVideoEncodeH264CapabilitiesKHR::flags, indicating the H.264 encoding capabilities supported, are:

// Provided by VK_KHR_video_encode_h264
typedef enum VkVideoEncodeH264CapabilityFlagBitsKHR {
    VK_VIDEO_ENCODE_H264_CAPABILITY_HRD_COMPLIANCE_BIT_KHR = 0x00000001,
    VK_VIDEO_ENCODE_H264_CAPABILITY_PREDICTION_WEIGHT_TABLE_GENERATED_BIT_KHR = 0x00000002,
    VK_VIDEO_ENCODE_H264_CAPABILITY_ROW_UNALIGNED_SLICE_BIT_KHR = 0x00000004,
    VK_VIDEO_ENCODE_H264_CAPABILITY_DIFFERENT_SLICE_TYPE_BIT_KHR = 0x00000008,
    VK_VIDEO_ENCODE_H264_CAPABILITY_B_FRAME_IN_L0_LIST_BIT_KHR = 0x00000010,
    VK_VIDEO_ENCODE_H264_CAPABILITY_B_FRAME_IN_L1_LIST_BIT_KHR = 0x00000020,
    VK_VIDEO_ENCODE_H264_CAPABILITY_PER_PICTURE_TYPE_MIN_MAX_QP_BIT_KHR = 0x00000040,
    VK_VIDEO_ENCODE_H264_CAPABILITY_PER_SLICE_CONSTANT_QP_BIT_KHR = 0x00000080,
    VK_VIDEO_ENCODE_H264_CAPABILITY_GENERATE_PREFIX_NALU_BIT_KHR = 0x00000100,
  // Provided by VK_KHR_video_encode_h264 with VK_KHR_video_encode_quantization_map
    VK_VIDEO_ENCODE_H264_CAPABILITY_MB_QP_DIFF_WRAPAROUND_BIT_KHR = 0x00000200,
} VkVideoEncodeH264CapabilityFlagBitsKHR;
  • VK_VIDEO_ENCODE_H264_CAPABILITY_HRD_COMPLIANCE_BIT_KHR specifies whether the implementation may be able to generate HRD compliant bitstreams if any of the nal_hrd_parameters_present_flag or vcl_hrd_parameters_present_flag members of StdVideoH264SpsVuiFlags are set to 1 in the active SPS.

  • VK_VIDEO_ENCODE_H264_CAPABILITY_PREDICTION_WEIGHT_TABLE_GENERATED_BIT_KHR specifies that if StdVideoH264PpsFlags::weighted_pred_flag is set to 1 or StdVideoH264PictureParameterSet::weighted_bipred_idc is set to STD_VIDEO_H264_WEIGHTED_BIPRED_IDC_EXPLICIT in the active PPS when encoding a P picture or B picture, respectively, then the implementation is able to internally decide syntax for pred_weight_table, as defined in section 7.4.3.2 of the ITU-T H.264 Specification, and the application is not required to provide a weight table in the H.264 slice header parameters.

  • VK_VIDEO_ENCODE_H264_CAPABILITY_ROW_UNALIGNED_SLICE_BIT_KHR specifies that each slice in a frame with multiple slices may begin or finish at any offset in a macroblock row. If not supported, all slices in the frame must begin at the start of a macroblock row (and hence each slice must finish at the end of a macroblock row).

  • VK_VIDEO_ENCODE_H264_CAPABILITY_DIFFERENT_SLICE_TYPE_BIT_KHR specifies that when a frame is encoded with multiple slices, the implementation allows encoding each slice with a different StdVideoEncodeH264SliceHeader::slice_type specified in the H.264 slice header parameters. If not supported, all slices of the frame must be encoded with the same slice_type which corresponds to the picture type of the frame.

  • VK_VIDEO_ENCODE_H264_CAPABILITY_B_FRAME_IN_L0_LIST_BIT_KHR specifies support for using a B frame as L0 reference, as specified in StdVideoEncodeH264ReferenceListsInfo::RefPicList0 in the H.264 picture information.

  • VK_VIDEO_ENCODE_H264_CAPABILITY_B_FRAME_IN_L1_LIST_BIT_KHR specifies support for using a B frame as L1 reference, as specified in StdVideoEncodeH264ReferenceListsInfo::RefPicList1 in the H.264 picture information.

  • VK_VIDEO_ENCODE_H264_CAPABILITY_PER_PICTURE_TYPE_MIN_MAX_QP_BIT_KHR specifies support for specifying different QP values in the members of VkVideoEncodeH264QpKHR.

  • VK_VIDEO_ENCODE_H264_CAPABILITY_PER_SLICE_CONSTANT_QP_BIT_KHR specifies support for specifying different constant QP values for each slice.

  • VK_VIDEO_ENCODE_H264_CAPABILITY_GENERATE_PREFIX_NALU_BIT_KHR specifies support for generating prefix NAL units by setting VkVideoEncodeH264PictureInfoKHR::generatePrefixNalu to VK_TRUE.

  • VK_VIDEO_ENCODE_H264_CAPABILITY_MB_QP_DIFF_WRAPAROUND_BIT_KHR indicates support for wraparound during the calculation of the QP values of subsequently encoded macroblocks, as defined in equation 7-37 of the ITU-T H.264 Specification. If not supported, equation 7-37 of the ITU-T H.264 Specification is effectively reduced to the following:

    QPY = QPY,PREV + mb_qp_delta

    The effect of this is that the maximum QP difference across subsequent macroblocks is limited to the [-(26 + QpBdOffsetY / 2), 25 + QpBdOffsetY / 2] range.

// Provided by VK_KHR_video_encode_h264
typedef VkFlags VkVideoEncodeH264CapabilityFlagsKHR;

VkVideoEncodeH264CapabilityFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoEncodeH264CapabilityFlagBitsKHR.

Bits which may be set in VkVideoEncodeH264CapabilitiesKHR::stdSyntaxFlags, indicating the capabilities related to the H.264 syntax elements, are:

// Provided by VK_KHR_video_encode_h264
typedef enum VkVideoEncodeH264StdFlagBitsKHR {
    VK_VIDEO_ENCODE_H264_STD_SEPARATE_COLOR_PLANE_FLAG_SET_BIT_KHR = 0x00000001,
    VK_VIDEO_ENCODE_H264_STD_QPPRIME_Y_ZERO_TRANSFORM_BYPASS_FLAG_SET_BIT_KHR = 0x00000002,
    VK_VIDEO_ENCODE_H264_STD_SCALING_MATRIX_PRESENT_FLAG_SET_BIT_KHR = 0x00000004,
    VK_VIDEO_ENCODE_H264_STD_CHROMA_QP_INDEX_OFFSET_BIT_KHR = 0x00000008,
    VK_VIDEO_ENCODE_H264_STD_SECOND_CHROMA_QP_INDEX_OFFSET_BIT_KHR = 0x00000010,
    VK_VIDEO_ENCODE_H264_STD_PIC_INIT_QP_MINUS26_BIT_KHR = 0x00000020,
    VK_VIDEO_ENCODE_H264_STD_WEIGHTED_PRED_FLAG_SET_BIT_KHR = 0x00000040,
    VK_VIDEO_ENCODE_H264_STD_WEIGHTED_BIPRED_IDC_EXPLICIT_BIT_KHR = 0x00000080,
    VK_VIDEO_ENCODE_H264_STD_WEIGHTED_BIPRED_IDC_IMPLICIT_BIT_KHR = 0x00000100,
    VK_VIDEO_ENCODE_H264_STD_TRANSFORM_8X8_MODE_FLAG_SET_BIT_KHR = 0x00000200,
    VK_VIDEO_ENCODE_H264_STD_DIRECT_SPATIAL_MV_PRED_FLAG_UNSET_BIT_KHR = 0x00000400,
    VK_VIDEO_ENCODE_H264_STD_ENTROPY_CODING_MODE_FLAG_UNSET_BIT_KHR = 0x00000800,
    VK_VIDEO_ENCODE_H264_STD_ENTROPY_CODING_MODE_FLAG_SET_BIT_KHR = 0x00001000,
    VK_VIDEO_ENCODE_H264_STD_DIRECT_8X8_INFERENCE_FLAG_UNSET_BIT_KHR = 0x00002000,
    VK_VIDEO_ENCODE_H264_STD_CONSTRAINED_INTRA_PRED_FLAG_SET_BIT_KHR = 0x00004000,
    VK_VIDEO_ENCODE_H264_STD_DEBLOCKING_FILTER_DISABLED_BIT_KHR = 0x00008000,
    VK_VIDEO_ENCODE_H264_STD_DEBLOCKING_FILTER_ENABLED_BIT_KHR = 0x00010000,
    VK_VIDEO_ENCODE_H264_STD_DEBLOCKING_FILTER_PARTIAL_BIT_KHR = 0x00020000,
    VK_VIDEO_ENCODE_H264_STD_SLICE_QP_DELTA_BIT_KHR = 0x00080000,
    VK_VIDEO_ENCODE_H264_STD_DIFFERENT_SLICE_QP_DELTA_BIT_KHR = 0x00100000,
} VkVideoEncodeH264StdFlagBitsKHR;
  • VK_VIDEO_ENCODE_H264_STD_SEPARATE_COLOR_PLANE_FLAG_SET_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoH264SpsFlags::separate_colour_plane_flag in the SPS when that value is 1.

  • VK_VIDEO_ENCODE_H264_STD_QPPRIME_Y_ZERO_TRANSFORM_BYPASS_FLAG_SET_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoH264SpsFlags::qpprime_y_zero_transform_bypass_flag in the SPS when that value is 1.

  • VK_VIDEO_ENCODE_H264_STD_SCALING_MATRIX_PRESENT_FLAG_SET_BIT_KHR specifies whether the implementation supports using the application-provided values for StdVideoH264SpsFlags::seq_scaling_matrix_present_flag in the SPS and StdVideoH264PpsFlags::pic_scaling_matrix_present_flag in the PPS when any of those values are 1.

  • VK_VIDEO_ENCODE_H264_STD_CHROMA_QP_INDEX_OFFSET_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoH264PictureParameterSet::chroma_qp_index_offset in the PPS when that value is non-zero.

  • VK_VIDEO_ENCODE_H264_STD_SECOND_CHROMA_QP_INDEX_OFFSET_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoH264PictureParameterSet::second_chroma_qp_index_offset in the PPS when that value is non-zero.

  • VK_VIDEO_ENCODE_H264_STD_PIC_INIT_QP_MINUS26_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoH264PictureParameterSet::pic_init_qp_minus26 in the PPS when that value is non-zero.

  • VK_VIDEO_ENCODE_H264_STD_WEIGHTED_PRED_FLAG_SET_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoH264PpsFlags::weighted_pred_flag in the PPS when that value is 1.

  • VK_VIDEO_ENCODE_H264_STD_WEIGHTED_BIPRED_IDC_EXPLICIT_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoH264PictureParameterSet::weighted_bipred_idc in the PPS when that value is STD_VIDEO_H264_WEIGHTED_BIPRED_IDC_EXPLICIT.

  • VK_VIDEO_ENCODE_H264_STD_WEIGHTED_BIPRED_IDC_IMPLICIT_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoH264PictureParameterSet::weighted_bipred_idc in the PPS when that value is STD_VIDEO_H264_WEIGHTED_BIPRED_IDC_IMPLICIT.

  • VK_VIDEO_ENCODE_H264_STD_TRANSFORM_8X8_MODE_FLAG_SET_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoH264PpsFlags::transform_8x8_mode_flag in the PPS when that value is 1.

  • VK_VIDEO_ENCODE_H264_STD_DIRECT_SPATIAL_MV_PRED_FLAG_UNSET_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoEncodeH264SliceHeaderFlags::direct_spatial_mv_pred_flag in the H.264 slice header parameters when that value is 0.

  • VK_VIDEO_ENCODE_H264_STD_ENTROPY_CODING_MODE_FLAG_UNSET_BIT_KHR specifies whether the implementation supports CAVLC entropy coding, as defined in section 9.2 of the ITU-T H.264 Specification, and thus supports using the application-provided value for StdVideoH264PpsFlags::entropy_coding_mode_flag in the PPS when that value is 0.

  • VK_VIDEO_ENCODE_H264_STD_ENTROPY_CODING_MODE_FLAG_SET_BIT_KHR specifies whether the implementation supports CABAC entropy coding, as defined in section 9.3 of the ITU-T H.264 Specification, and thus supports using the application-provided value for StdVideoH264PpsFlags::entropy_coding_mode_flag in the PPS when that value is 1.

  • VK_VIDEO_ENCODE_H264_STD_DIRECT_8X8_INFERENCE_FLAG_UNSET_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoH264SpsFlags::direct_8x8_inference_flag in the SPS when that value is 0.

  • VK_VIDEO_ENCODE_H264_STD_CONSTRAINED_INTRA_PRED_FLAG_SET_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoH264PpsFlags::constrained_intra_pred_flag in the PPS when that value is 1.

  • VK_VIDEO_ENCODE_H264_STD_DEBLOCKING_FILTER_DISABLED_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoEncodeH264SliceHeader::disable_deblocking_filter_idc in the H.264 slice header parameters when that value is STD_VIDEO_H264_DISABLE_DEBLOCKING_FILTER_IDC_DISABLED.

  • VK_VIDEO_ENCODE_H264_STD_DEBLOCKING_FILTER_ENABLED_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoEncodeH264SliceHeader::disable_deblocking_filter_idc in the H.264 slice header parameters when that value is STD_VIDEO_H264_DISABLE_DEBLOCKING_FILTER_IDC_ENABLED.

  • VK_VIDEO_ENCODE_H264_STD_DEBLOCKING_FILTER_PARTIAL_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoEncodeH264SliceHeader::disable_deblocking_filter_idc in the H.264 slice header parameters when that value is STD_VIDEO_H264_DISABLE_DEBLOCKING_FILTER_IDC_PARTIAL.

  • VK_VIDEO_ENCODE_H264_STD_SLICE_QP_DELTA_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoEncodeH264SliceHeader::slice_qp_delta in the H.264 slice header parameters when that value is identical across the slices of the encoded frame.

  • VK_VIDEO_ENCODE_H264_STD_DIFFERENT_SLICE_QP_DELTA_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoEncodeH264SliceHeader::slice_qp_delta in the H.264 slice header parameters when that value is different across the slices of the encoded frame.

These capability flags provide information to the application about specific H.264 syntax element values that the implementation supports without having to override them and do not otherwise restrict the values that the application can specify for any of the mentioned H.264 syntax elements.

// Provided by VK_KHR_video_encode_h264
typedef VkFlags VkVideoEncodeH264StdFlagsKHR;

VkVideoEncodeH264StdFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoEncodeH264StdFlagBitsKHR.

H.264 Encode Quality Level Properties

When calling vkGetPhysicalDeviceVideoEncodeQualityLevelPropertiesKHR with pVideoProfile->videoCodecOperation specified as VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR, the VkVideoEncodeH264QualityLevelPropertiesKHR structure must be included in the pNext chain of the VkVideoEncodeQualityLevelPropertiesKHR structure to retrieve additional video encode quality level properties specific to H.264 encoding.

The VkVideoEncodeH264QualityLevelPropertiesKHR structure is defined as:

// Provided by VK_KHR_video_encode_h264
typedef struct VkVideoEncodeH264QualityLevelPropertiesKHR {
    VkStructureType                         sType;
    void*                                   pNext;
    VkVideoEncodeH264RateControlFlagsKHR    preferredRateControlFlags;
    uint32_t                                preferredGopFrameCount;
    uint32_t                                preferredIdrPeriod;
    uint32_t                                preferredConsecutiveBFrameCount;
    uint32_t                                preferredTemporalLayerCount;
    VkVideoEncodeH264QpKHR                  preferredConstantQp;
    uint32_t                                preferredMaxL0ReferenceCount;
    uint32_t                                preferredMaxL1ReferenceCount;
    VkBool32                                preferredStdEntropyCodingModeFlag;
} VkVideoEncodeH264QualityLevelPropertiesKHR;
Valid Usage (Implicit)
  • VUID-VkVideoEncodeH264QualityLevelPropertiesKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_H264_QUALITY_LEVEL_PROPERTIES_KHR

H.264 Encode Session

Additional parameters can be specified when creating a video session with an H.264 encode profile by including an instance of the VkVideoEncodeH264SessionCreateInfoKHR structure in the pNext chain of VkVideoSessionCreateInfoKHR.

The VkVideoEncodeH264SessionCreateInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_h264
typedef struct VkVideoEncodeH264SessionCreateInfoKHR {
    VkStructureType         sType;
    const void*             pNext;
    VkBool32                useMaxLevelIdc;
    StdVideoH264LevelIdc    maxLevelIdc;
} VkVideoEncodeH264SessionCreateInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • useMaxLevelIdc indicates whether the value of maxLevelIdc should be used by the implementation. When it is VK_FALSE, the implementation ignores the value of maxLevelIdc and uses the value of VkVideoEncodeH264CapabilitiesKHR::maxLevelIdc, as reported by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile.

  • maxLevelIdc is a StdVideoH264LevelIdc value specifying the upper bound on the H.264 level for the video bitstreams produced by the created video session, where enum constant STD_VIDEO_H264_LEVEL_IDC_<major>_<minor> identifies H.264 level <major>.<minor> as defined in section A.3 of the ITU-T H.264 Specification.

Valid Usage (Implicit)
  • VUID-VkVideoEncodeH264SessionCreateInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_H264_SESSION_CREATE_INFO_KHR

H.264 Encode Parameter Sets

Video session parameters objects created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR can contain the following types of parameters:

H.264 Sequence Parameter Sets (SPS)

Represented by StdVideoH264SequenceParameterSet structures and interpreted as follows:

  • reserved1 and reserved2 are used only for padding purposes and are otherwise ignored;

  • seq_parameter_set_id is used as the key of the SPS entry;

  • level_idc is one of the enum constants STD_VIDEO_H264_LEVEL_IDC_<major>_<minor> identifying the H.264 level <major>.<minor> as defined in section A.3 of the ITU-T H.264 Specification;

  • if flags.seq_scaling_matrix_present_flag is set, then the StdVideoH264ScalingLists structure pointed to by pScalingLists is interpreted as follows:

    • scaling_list_present_mask is a bitmask where bit index i corresponds to seq_scaling_list_present_flag[i] as defined in section 7.4.2.1 of the ITU-T H.264 Specification;

    • use_default_scaling_matrix_mask is a bitmask where bit index i corresponds to UseDefaultScalingMatrix4x4Flag[i], when i < 6, or corresponds to UseDefaultScalingMatrix8x8Flag[i-6], otherwise, as defined in section 7.3.2.1 of the ITU-T H.264 Specification;

    • ScalingList4x4 and ScalingList8x8 correspond to the identically named syntax elements defined in section 7.3.2.1 of the ITU-T H.264 Specification;

  • if flags.vui_parameters_present_flag is set, then pSequenceParameterSetVui is a pointer to a StdVideoH264SequenceParameterSetVui structure that is interpreted as follows:

    • reserved1 is used only for padding purposes and is otherwise ignored;

    • flags.color_description_present_flag is interpreted as the value of colour_description_present_flag, as defined in section E.2.1 of the ITU-T H.264 Specification;

      The name of colour_description_present_flag was misspelled in the Video Std header.

    • if flags.nal_hrd_parameters_present_flag or flags.vcl_hrd_parameters_present_flag is set, then the StdVideoH264HrdParameters structure pointed to by pHrdParameters is interpreted as follows:

      • reserved1 is used only for padding purposes and is otherwise ignored;

      • all other members of StdVideoH264HrdParameters are interpreted as defined in section E.2.2 of the ITU-T H.264 Specification;

    • all other members of StdVideoH264SequenceParameterSetVui are interpreted as defined in section E.2.1 of the ITU-T H.264 Specification;

  • all other members of StdVideoH264SequenceParameterSet are interpreted as defined in section 7.4.2.1 of the ITU-T H.264 Specification.

H.264 Picture Parameter Sets (PPS)

Represented by StdVideoH264PictureParameterSet structures and interpreted as follows:

  • the pair constructed from seq_parameter_set_id and pic_parameter_set_id is used as the key of the PPS entry;

  • if flags.pic_scaling_matrix_present_flag is set, then the StdVideoH264ScalingLists structure pointed to by pScalingLists is interpreted as follows:

    • scaling_list_present_mask is a bitmask where bit index i corresponds to pic_scaling_list_present_flag[i] as defined in section 7.4.2.2 of the ITU-T H.264 Specification;

    • use_default_scaling_matrix_mask is a bitmask where bit index i corresponds to UseDefaultScalingMatrix4x4Flag[i], when i < 6, or corresponds to UseDefaultScalingMatrix8x8Flag[i-6], otherwise, as defined in section 7.3.2.2 of the ITU-T H.264 Specification;

    • ScalingList4x4 and ScalingList8x8 correspond to the identically named syntax elements defined in section 7.3.2.2 of the ITU-T H.264 Specification;

  • all other members of StdVideoH264PictureParameterSet are interpreted as defined in section 7.4.2.2 of the ITU-T H.264 Specification.

Implementations may override any of these parameters according to the semantics defined in the Video Encode Parameter Overrides section before storing the resulting H.264 parameter sets into the video session parameters object. Applications need to use the vkGetEncodedVideoSessionParametersKHR command to determine whether any implementation overrides happened and to retrieve the encoded H.264 parameter sets in order to be able to produce a compliant H.264 video bitstream.

Such H.264 parameter set overrides may also have cascading effects on the implementation overrides applied to the encoded bitstream produced by video encode operations. If the implementation supports the VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_HAS_OVERRIDES_BIT_KHR video encode feedback query flag, then the application can use such queries to retrieve feedback about whether any implementation overrides have been applied to the encoded bitstream.

When a video session parameters object is created with the codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR, the VkVideoSessionParametersCreateInfoKHR::pNext chain must include a VkVideoEncodeH264SessionParametersCreateInfoKHR structure specifying the capacity and initial contents of the object.

The VkVideoEncodeH264SessionParametersCreateInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_h264
typedef struct VkVideoEncodeH264SessionParametersCreateInfoKHR {
    VkStructureType                                        sType;
    const void*                                            pNext;
    uint32_t                                               maxStdSPSCount;
    uint32_t                                               maxStdPPSCount;
    const VkVideoEncodeH264SessionParametersAddInfoKHR*    pParametersAddInfo;
} VkVideoEncodeH264SessionParametersCreateInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • maxStdSPSCount is the maximum number of H.264 SPS entries the created VkVideoSessionParametersKHR can contain.

  • maxStdPPSCount is the maximum number of H.264 PPS entries the created VkVideoSessionParametersKHR can contain.

  • pParametersAddInfo is NULL or a pointer to a VkVideoEncodeH264SessionParametersAddInfoKHR structure specifying H.264 parameters to add upon object creation.

Valid Usage (Implicit)
  • VUID-VkVideoEncodeH264SessionParametersCreateInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_H264_SESSION_PARAMETERS_CREATE_INFO_KHR

  • VUID-VkVideoEncodeH264SessionParametersCreateInfoKHR-pParametersAddInfo-parameter
    If pParametersAddInfo is not NULL, pParametersAddInfo must be a valid pointer to a valid VkVideoEncodeH264SessionParametersAddInfoKHR structure

The VkVideoEncodeH264SessionParametersAddInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_h264
typedef struct VkVideoEncodeH264SessionParametersAddInfoKHR {
    VkStructureType                            sType;
    const void*                                pNext;
    uint32_t                                   stdSPSCount;
    const StdVideoH264SequenceParameterSet*    pStdSPSs;
    uint32_t                                   stdPPSCount;
    const StdVideoH264PictureParameterSet*     pStdPPSs;
} VkVideoEncodeH264SessionParametersAddInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • stdSPSCount is the number of elements in the pStdSPSs array.

  • pStdSPSs is a pointer to an array of StdVideoH264SequenceParameterSet structures describing the H.264 SPS entries to add.

  • stdPPSCount is the number of elements in the pStdPPSs array.

  • pStdPPSs is a pointer to an array of StdVideoH264PictureParameterSet structures describing the H.264 PPS entries to add.

This structure can be specified in the following places:

Valid Usage
  • VUID-VkVideoEncodeH264SessionParametersAddInfoKHR-None-04837
    The seq_parameter_set_id member of each StdVideoH264SequenceParameterSet structure specified in the elements of pStdSPSs must be unique within pStdSPSs

  • VUID-VkVideoEncodeH264SessionParametersAddInfoKHR-None-04838
    The pair constructed from the seq_parameter_set_id and pic_parameter_set_id members of each StdVideoH264PictureParameterSet structure specified in the elements of pStdPPSs must be unique within pStdPPSs

Valid Usage (Implicit)
  • VUID-VkVideoEncodeH264SessionParametersAddInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_H264_SESSION_PARAMETERS_ADD_INFO_KHR

  • VUID-VkVideoEncodeH264SessionParametersAddInfoKHR-pStdSPSs-parameter
    If stdSPSCount is not 0, and pStdSPSs is not NULL, pStdSPSs must be a valid pointer to an array of stdSPSCount StdVideoH264SequenceParameterSet values

  • VUID-VkVideoEncodeH264SessionParametersAddInfoKHR-pStdPPSs-parameter
    If stdPPSCount is not 0, and pStdPPSs is not NULL, pStdPPSs must be a valid pointer to an array of stdPPSCount StdVideoH264PictureParameterSet values

The VkVideoEncodeH264SessionParametersGetInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_h264
typedef struct VkVideoEncodeH264SessionParametersGetInfoKHR {
    VkStructureType    sType;
    const void*        pNext;
    VkBool32           writeStdSPS;
    VkBool32           writeStdPPS;
    uint32_t           stdSPSId;
    uint32_t           stdPPSId;
} VkVideoEncodeH264SessionParametersGetInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • writeStdSPS indicates whether the encoded H.264 sequence parameter set identified by stdSPSId is requested to be retrieved.

  • writeStdPPS indicates whether the encoded H.264 picture parameter set identified by the pair constructed from stdSPSId and stdPPSId is requested to be retrieved.

  • stdSPSId specifies the H.264 sequence parameter set ID used to identify the retrieved H.264 sequence and/or picture parameter set(s).

  • stdPPSId specifies the H.264 picture parameter set ID used to identify the retrieved H.264 picture parameter set when writeStdPPS is VK_TRUE.

When this structure is specified in the pNext chain of the VkVideoEncodeSessionParametersGetInfoKHR structure passed to vkGetEncodedVideoSessionParametersKHR, the command will write encoded parameter data to the output buffer in the following order:

  1. The H.264 sequence parameter set identified by stdSPSId, if writeStdSPS is VK_TRUE.

  2. The H.264 picture parameter set identified by the pair constructed from stdSPSId and stdPPSId, if writeStdPPS is VK_TRUE.

Valid Usage
  • VUID-VkVideoEncodeH264SessionParametersGetInfoKHR-writeStdSPS-08279
    At least one of writeStdSPS and writeStdPPS must be VK_TRUE

Valid Usage (Implicit)
  • VUID-VkVideoEncodeH264SessionParametersGetInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_H264_SESSION_PARAMETERS_GET_INFO_KHR

The VkVideoEncodeH264SessionParametersFeedbackInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_h264
typedef struct VkVideoEncodeH264SessionParametersFeedbackInfoKHR {
    VkStructureType    sType;
    void*              pNext;
    VkBool32           hasStdSPSOverrides;
    VkBool32           hasStdPPSOverrides;
} VkVideoEncodeH264SessionParametersFeedbackInfoKHR;
Valid Usage (Implicit)
  • VUID-VkVideoEncodeH264SessionParametersFeedbackInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_H264_SESSION_PARAMETERS_FEEDBACK_INFO_KHR

H.264 Encoding Parameters

The VkVideoEncodeH264PictureInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_h264
typedef struct VkVideoEncodeH264PictureInfoKHR {
    VkStructureType                             sType;
    const void*                                 pNext;
    uint32_t                                    naluSliceEntryCount;
    const VkVideoEncodeH264NaluSliceInfoKHR*    pNaluSliceEntries;
    const StdVideoEncodeH264PictureInfo*        pStdPictureInfo;
    VkBool32                                    generatePrefixNalu;
} VkVideoEncodeH264PictureInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • naluSliceEntryCount is the number of elements in pNaluSliceEntries.

  • pNaluSliceEntries is a pointer to an array of naluSliceEntryCount VkVideoEncodeH264NaluSliceInfoKHR structures specifying the parameters of the individual H.264 slices to encode for the input picture.

  • pStdPictureInfo is a pointer to a StdVideoEncodeH264PictureInfo structure specifying H.264 picture information.

  • generatePrefixNalu controls whether prefix NALUs are generated before slice NALUs into the target bitstream, as defined in sections 7.3.2.12 and 7.4.2.12 of the ITU-T H.264 Specification.

This structure is specified in the pNext chain of the VkVideoEncodeInfoKHR structure passed to vkCmdEncodeVideoKHR to specify the codec-specific picture information for an H.264 encode operation.

Encode Input Picture Information

When this structure is specified in the pNext chain of the VkVideoEncodeInfoKHR structure passed to vkCmdEncodeVideoKHR, the information related to the encode input picture is defined as follows:

Std Picture Information

The members of the StdVideoEncodeH264PictureInfo structure pointed to by pStdPictureInfo are interpreted as follows:

  • flags.reserved and reserved1 are used only for padding purposes and are otherwise ignored;

  • flags.IdrPicFlag as defined in section 7.4.1 of the ITU-T H.264 Specification;

  • flags.is_reference as defined in section 3.136 of the ITU-T H.264 Specification;

  • seq_parameter_set_id and pic_parameter_set_id are used to identify the active parameter sets, as described below;

  • primary_pic_type as defined in section 7.4.2 of the ITU-T H.264 Specification;

  • PicOrderCnt as defined in section 8.2 of the ITU-T H.264 Specification;

  • temporal_id as defined in section G.7.4.1.1 of the ITU-T H.264 Specification;

  • if pRefLists is not NULL, then it is a pointer to a StdVideoEncodeH264ReferenceListsInfo structure that is interpreted as follows:

    • flags.reserved is used only for padding purposes and is otherwise ignored;

    • ref_pic_list_modification_flag_l0 and ref_pic_list_modification_flag_l1 as defined in section 7.4.3.1 of the ITU-T H.264 Specification;

    • num_ref_idx_l0_active_minus1 and num_ref_idx_l1_active_minus1 as defined in section 7.4.3 of the ITU-T H.264 Specification;

    • RefPicList0 and RefPicList1 as defined in section 8.2.4 of the ITU-T H.264 Specification where each element of these arrays either identifies an active reference picture using its DPB slot index or contains the value STD_VIDEO_H264_NO_REFERENCE_PICTURE to indicate “no reference picture”;

    • if refList0ModOpCount is not zero, then pRefList0ModOperations is a pointer to an array of refList0ModOpCount number of StdVideoEncodeH264RefListModEntry structures specifying the modification parameters for the reference list L0 as defined in section 7.4.3.1 of the ITU-T H.264 Specification;

    • if refList1ModOpCount is not zero, then pRefList1ModOperations is a pointer to an array of refList1ModOpCount number of StdVideoEncodeH264RefListModEntry structures specifying the modification parameters for the reference list L1 as defined in section 7.4.3.1 of the ITU-T H.264 Specification;

    • if refPicMarkingOpCount is not zero, then refPicMarkingOperations is a pointer to an array of refPicMarkingOpCount number of StdVideoEncodeH264RefPicMarkingEntry structures specifying the reference picture marking parameters as defined in section 7.4.3.3 of the ITU-T H.264 Specification;

  • all other members are interpreted as defined in section 7.4.3 of the ITU-T H.264 Specification.

Reference picture setup is controlled by the value of StdVideoEncodeH264PictureInfo::flags.is_reference. If it is set and a reconstructed picture is specified, then the latter is used as the target of picture reconstruction to activate the DPB slot specified in pEncodeInfo->pSetupReferenceSlot→slotIndex. If StdVideoEncodeH264PictureInfo::flags.is_reference is not set, but a reconstructed picture is specified, then the corresponding picture reference associated with the DPB slot is invalidated, as described in the DPB Slot States section.

Active Parameter Sets

The members of the StdVideoEncodeH264PictureInfo structure pointed to by pStdPictureInfo are used to select the active parameter sets to use from the bound video session parameters object, as follows:

  • The active SPS is the SPS identified by the key specified in StdVideoEncodeH264PictureInfo::seq_parameter_set_id.

  • The active PPS is the PPS identified by the key specified by the pair constructed from StdVideoEncodeH264PictureInfo::seq_parameter_set_id and StdVideoEncodeH264PictureInfo::pic_parameter_set_id.

H.264 encoding uses explicit weighted sample prediction for a slice, as defined in section 8.4.2.3 of the ITU-T H.264 Specification, if any of the following conditions are true for the active PPS and the pStdSliceHeader member of the corresponding element of pNaluSliceEntries:

  • pStdSliceHeader->slice_type is STD_VIDEO_H264_SLICE_TYPE_P and weighted_pred_flag is enabled in the active PPS.

  • pStdSliceHeader->slice_type is STD_VIDEO_H264_SLICE_TYPE_B and weighted_bipred_idc in the active PPS equals STD_VIDEO_H264_WEIGHTED_BIPRED_IDC_EXPLICIT.

Valid Usage
Valid Usage (Implicit)
  • VUID-VkVideoEncodeH264PictureInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_H264_PICTURE_INFO_KHR

  • VUID-VkVideoEncodeH264PictureInfoKHR-pNaluSliceEntries-parameter
    pNaluSliceEntries must be a valid pointer to an array of naluSliceEntryCount valid VkVideoEncodeH264NaluSliceInfoKHR structures

  • VUID-VkVideoEncodeH264PictureInfoKHR-pStdPictureInfo-parameter
    pStdPictureInfo must be a valid pointer to a valid StdVideoEncodeH264PictureInfo value

  • VUID-VkVideoEncodeH264PictureInfoKHR-naluSliceEntryCount-arraylength
    naluSliceEntryCount must be greater than 0

The VkVideoEncodeH264NaluSliceInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_h264
typedef struct VkVideoEncodeH264NaluSliceInfoKHR {
    VkStructureType                         sType;
    const void*                             pNext;
    int32_t                                 constantQp;
    const StdVideoEncodeH264SliceHeader*    pStdSliceHeader;
} VkVideoEncodeH264NaluSliceInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • constantQp is the QP to use for the slice if the current rate control mode configured for the video session is VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR.

  • pStdSliceHeader is a pointer to a StdVideoEncodeH264SliceHeader structure specifying H.264 slice header parameters for the slice.

Std Slice Header Parameters

The members of the StdVideoEncodeH264SliceHeader structure pointed to by pStdSliceHeader are interpreted as follows:

  • flags.reserved and reserved1 are used only for padding purposes and are otherwise ignored;

  • if pWeightTable is not NULL, then it is a pointer to a StdVideoEncodeH264WeightTable that is interpreted as follows:

    • flags.reserved is used only for padding purposes and is otherwise ignored;

    • all other members of StdVideoEncodeH264WeightTable are interpreted as defined in section 7.4.3.2 of the ITU-T H.264 Specification;

  • all other members are interpreted as defined in section 7.4.3 of the ITU-T H.264 Specification.

Valid Usage (Implicit)
  • VUID-VkVideoEncodeH264NaluSliceInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_H264_NALU_SLICE_INFO_KHR

  • VUID-VkVideoEncodeH264NaluSliceInfoKHR-pNext-pNext
    pNext must be NULL

  • VUID-VkVideoEncodeH264NaluSliceInfoKHR-pStdSliceHeader-parameter
    pStdSliceHeader must be a valid pointer to a valid StdVideoEncodeH264SliceHeader value

The VkVideoEncodeH264DpbSlotInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_h264
typedef struct VkVideoEncodeH264DpbSlotInfoKHR {
    VkStructureType                           sType;
    const void*                               pNext;
    const StdVideoEncodeH264ReferenceInfo*    pStdReferenceInfo;
} VkVideoEncodeH264DpbSlotInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • pStdReferenceInfo is a pointer to a StdVideoEncodeH264ReferenceInfo structure specifying H.264 reference information.

This structure is specified in the pNext chain of VkVideoEncodeInfoKHR::pSetupReferenceSlot, if not NULL, and the pNext chain of the elements of VkVideoEncodeInfoKHR::pReferenceSlots to specify the codec-specific reference picture information for an H.264 encode operation.

Active Reference Picture Information

When this structure is specified in the pNext chain of the elements of VkVideoEncodeInfoKHR::pReferenceSlots, one element is added to the list of active reference pictures used by the video encode operation for each element of VkVideoEncodeInfoKHR::pReferenceSlots as follows:

Reconstructed Picture Information

When this structure is specified in the pNext chain of VkVideoEncodeInfoKHR::pSetupReferenceSlot, the information related to the reconstructed picture is defined as follows:

Std Reference Information

The members of the StdVideoEncodeH264ReferenceInfo structure pointed to by pStdReferenceInfo are interpreted as follows:

Valid Usage (Implicit)
  • VUID-VkVideoEncodeH264DpbSlotInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_H264_DPB_SLOT_INFO_KHR

  • VUID-VkVideoEncodeH264DpbSlotInfoKHR-pStdReferenceInfo-parameter
    pStdReferenceInfo must be a valid pointer to a valid StdVideoEncodeH264ReferenceInfo value

H.264 Encode Rate Control

Group of Pictures

In case of H.264 encoding it is common practice to follow a regular pattern of different picture types in display order when encoding subsequent frames. This pattern is referred to as the group of pictures (GOP).

A regular GOP is defined by the following parameters:

  • The number of frames in the GOP;

  • The number of consecutive B frames between I and/or P frames in display order.

GOPs are further classified as open and closed GOPs.

Frame types in an open GOP follow each other in display order according to the following algorithm:

  1. The first frame is always an I frame.

  2. This is followed by a number of consecutive B frames, as defined above.

  3. If the number of frames in the GOP is not reached yet, then the next frame is a P frame and the algorithm continues from step 2.

h26x open gop
Figure 1. H.264 open GOP

In case of a closed GOP, an IDR frame is used at a certain period.

h26x closed gop
Figure 2. H.264 closed GOP

It is also typical for H.264 encoding to use specific reference picture usage patterns across the frames of the GOP. The two most common reference patterns used are as follows:

Flat Reference Pattern
  • Each P frame uses the last non-B frame, in display order, as reference.

  • Each B frame uses the last non-B frame, in display order, as its forward reference, and uses the next non-B frame, in display order, as its backward reference.

h26x ref pattern flat
Figure 3. H.264 flat reference pattern
Dyadic Reference Pattern
  • Each P frame uses the last non-B frame, in display order, as reference.

  • The following algorithm is applied to the sequence of consecutive B frames between I and/or P frames in display order:

    1. The B frame in the middle of this sequence uses the frame preceding the sequence as its forward reference, and uses the frame following the sequence as its backward reference.

    2. The algorithm is executed recursively for the following frame sequences:

      • The B frames of the original sequence preceding the frame in the middle, if any.

      • The B frames of the original sequence following the frame in the middle, if any.

h26x ref pattern dyadic
Figure 4. H.264 dyadic reference pattern

The application can provide guidance to the implementation’s rate control algorithm about the structure of the GOP used by the application. Any such guidance about the GOP and its structure does not mandate that specific GOP structure to be used by the application, as the picture type of individual encoded pictures is still application-controlled, however, any deviation from the provided guidance may result in undesired rate control behavior including, but not limited, to the implementation not being able to conform to the expected average or target bitrates, or other rate control parameters specified by the application.

When an H.264 encode session is used to encode multiple temporal layers, it is also common practice to follow a regular pattern for the H.264 temporal ID for the encoded pictures in display order when encoding subsequent frames. This pattern is referred to as the temporal GOP. The most common temporal layer pattern used is as follows:

Dyadic Temporal Layer Pattern
  • The number of frames in the temporal GOP is 2n-1, where n is the number of temporal layers.

  • The ith frame in the temporal GOP uses temporal ID t, if and only if the index of the least significant bit set in i equals n-t-1, except for the first frame, which is the only frame in the temporal GOP using temporal ID zero.

  • The ith frame in the temporal GOP uses the rth frame as reference, where r is calculated from i by clearing the least significant bit set in it, except for the first frame in the temporal GOP, which uses the first frame of the previous temporal GOP, if any, as reference.

h26x layer pattern dyadic
Figure 5. H.264 dyadic temporal layer pattern

Multi-layer rate control and multi-layer coding are typically used for streaming cases where low latency is expected, hence B pictures with backward prediction are usually not used.

The VkVideoEncodeH264RateControlInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_h264
typedef struct VkVideoEncodeH264RateControlInfoKHR {
    VkStructureType                         sType;
    const void*                             pNext;
    VkVideoEncodeH264RateControlFlagsKHR    flags;
    uint32_t                                gopFrameCount;
    uint32_t                                idrPeriod;
    uint32_t                                consecutiveBFrameCount;
    uint32_t                                temporalLayerCount;
} VkVideoEncodeH264RateControlInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • flags is a bitmask of VkVideoEncodeH264RateControlFlagBitsKHR specifying H.264 rate control flags.

  • gopFrameCount is the number of frames within a group of pictures (GOP) intended to be used by the application. If it is 0, the rate control algorithm may assume an implementation-dependent GOP length. If it is UINT32_MAX, the GOP length is treated as infinite.

  • idrPeriod is the interval, in terms of number of frames, between two IDR frames (see IDR period). If it is 0, the rate control algorithm may assume an implementation-dependent IDR period. If it is UINT32_MAX, the IDR period is treated as infinite.

  • consecutiveBFrameCount is the number of consecutive B frames between I and/or P frames within the GOP.

  • temporalLayerCount specifies the number of H.264 temporal layers that the application intends to use.

When an instance of this structure is included in the pNext chain of the VkVideoCodingControlInfoKHR structure passed to the vkCmdControlVideoCodingKHR command, and VkVideoCodingControlInfoKHR::flags includes VK_VIDEO_CODING_CONTROL_ENCODE_RATE_CONTROL_BIT_KHR, the parameters in this structure are used as guidance for the implementation’s rate control algorithm (see Video Coding Control).

If flags includes VK_VIDEO_ENCODE_H264_RATE_CONTROL_ATTEMPT_HRD_COMPLIANCE_BIT_KHR, then the rate control state is reset to an initial state to meet HRD compliance requirements. Otherwise the new rate control state may be applied without a reset depending on the implementation and the specified rate control parameters.

It would be possible to infer the picture type to be used when encoding a frame, on the basis of the values provided for consecutiveBFrameCount, idrPeriod, and gopFrameCount, but this inferred picture type will not be used by implementations to override the picture type provided to the video encode operation.

Valid Usage
  • VUID-VkVideoEncodeH264RateControlInfoKHR-flags-08280
    If VkVideoEncodeH264CapabilitiesKHR::flags, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the used video profile, does not include VK_VIDEO_ENCODE_H264_CAPABILITY_HRD_COMPLIANCE_BIT_KHR, then flags must not contain VK_VIDEO_ENCODE_H264_RATE_CONTROL_ATTEMPT_HRD_COMPLIANCE_BIT_KHR

  • VUID-VkVideoEncodeH264RateControlInfoKHR-flags-08281
    If flags contains VK_VIDEO_ENCODE_H264_RATE_CONTROL_REFERENCE_PATTERN_FLAT_BIT_KHR or VK_VIDEO_ENCODE_H264_RATE_CONTROL_REFERENCE_PATTERN_DYADIC_BIT_KHR, then it must also contain VK_VIDEO_ENCODE_H264_RATE_CONTROL_REGULAR_GOP_BIT_KHR

  • VUID-VkVideoEncodeH264RateControlInfoKHR-flags-08282
    If flags contains VK_VIDEO_ENCODE_H264_RATE_CONTROL_REFERENCE_PATTERN_FLAT_BIT_KHR, then it must not also contain VK_VIDEO_ENCODE_H264_RATE_CONTROL_REFERENCE_PATTERN_DYADIC_BIT_KHR

  • VUID-VkVideoEncodeH264RateControlInfoKHR-flags-08283
    If flags contains VK_VIDEO_ENCODE_H264_RATE_CONTROL_REGULAR_GOP_BIT_KHR, then gopFrameCount must be greater than 0

  • VUID-VkVideoEncodeH264RateControlInfoKHR-idrPeriod-08284
    If idrPeriod is not 0, then it must be greater than or equal to gopFrameCount

  • VUID-VkVideoEncodeH264RateControlInfoKHR-consecutiveBFrameCount-08285
    If consecutiveBFrameCount is not 0, then it must be less than gopFrameCount

Valid Usage (Implicit)
  • VUID-VkVideoEncodeH264RateControlInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_H264_RATE_CONTROL_INFO_KHR

  • VUID-VkVideoEncodeH264RateControlInfoKHR-flags-parameter
    flags must be a valid combination of VkVideoEncodeH264RateControlFlagBitsKHR values

Bits which can be set in VkVideoEncodeH264RateControlInfoKHR::flags, specifying H.264 rate control flags, are:

// Provided by VK_KHR_video_encode_h264
typedef enum VkVideoEncodeH264RateControlFlagBitsKHR {
    VK_VIDEO_ENCODE_H264_RATE_CONTROL_ATTEMPT_HRD_COMPLIANCE_BIT_KHR = 0x00000001,
    VK_VIDEO_ENCODE_H264_RATE_CONTROL_REGULAR_GOP_BIT_KHR = 0x00000002,
    VK_VIDEO_ENCODE_H264_RATE_CONTROL_REFERENCE_PATTERN_FLAT_BIT_KHR = 0x00000004,
    VK_VIDEO_ENCODE_H264_RATE_CONTROL_REFERENCE_PATTERN_DYADIC_BIT_KHR = 0x00000008,
    VK_VIDEO_ENCODE_H264_RATE_CONTROL_TEMPORAL_LAYER_PATTERN_DYADIC_BIT_KHR = 0x00000010,
} VkVideoEncodeH264RateControlFlagBitsKHR;
  • VK_VIDEO_ENCODE_H264_RATE_CONTROL_ATTEMPT_HRD_COMPLIANCE_BIT_KHR specifies that rate control should attempt to produce an HRD compliant bitstream, as defined in annex C of the ITU-T H.264 Specification.

  • VK_VIDEO_ENCODE_H264_RATE_CONTROL_REGULAR_GOP_BIT_KHR specifies that the application intends to use a regular GOP structure according to the parameters specified in the gopFrameCount, idrPeriod, and consecutiveBFrameCount members of the VkVideoEncodeH264RateControlInfoKHR structure.

  • VK_VIDEO_ENCODE_H264_RATE_CONTROL_REFERENCE_PATTERN_FLAT_BIT_KHR specifies that the application intends to follow a flat reference pattern in the GOP.

  • VK_VIDEO_ENCODE_H264_RATE_CONTROL_REFERENCE_PATTERN_DYADIC_BIT_KHR specifies that the application intends to follow a dyadic reference pattern in the GOP.

  • VK_VIDEO_ENCODE_H264_RATE_CONTROL_TEMPORAL_LAYER_PATTERN_DYADIC_BIT_KHR specifies that the application intends to follow a dyadic temporal layer pattern.

// Provided by VK_KHR_video_encode_h264
typedef VkFlags VkVideoEncodeH264RateControlFlagsKHR;

VkVideoEncodeH264RateControlFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoEncodeH264RateControlFlagBitsKHR.

Rate Control Layers

The VkVideoEncodeH264RateControlLayerInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_h264
typedef struct VkVideoEncodeH264RateControlLayerInfoKHR {
    VkStructureType                  sType;
    const void*                      pNext;
    VkBool32                         useMinQp;
    VkVideoEncodeH264QpKHR           minQp;
    VkBool32                         useMaxQp;
    VkVideoEncodeH264QpKHR           maxQp;
    VkBool32                         useMaxFrameSize;
    VkVideoEncodeH264FrameSizeKHR    maxFrameSize;
} VkVideoEncodeH264RateControlLayerInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • useMinQp indicates whether the QP values determined by rate control will be clamped to the lower bounds on the QP values specified in minQp.

  • minQp specifies the lower bounds on the QP values, for each picture type, that the implementation’s rate control algorithm will use when useMinQp is VK_TRUE.

  • useMaxQp indicates whether the QP values determined by rate control will be clamped to the upper bounds on the QP values specified in maxQp.

  • maxQp specifies the upper bounds on the QP values, for each picture type, that the implementation’s rate control algorithm will use when useMaxQp is VK_TRUE.

  • useMaxFrameSize indicates whether the implementation’s rate control algorithm should use the values specified in maxFrameSize as the upper bounds on the encoded frame size for each picture type.

  • maxFrameSize specifies the upper bounds on the encoded frame size, for each picture type, when useMaxFrameSize is VK_TRUE.

When used, the values in minQp and maxQp guarantee that the effective QP values used by the implementation will respect those lower and upper bounds, respectively. However, limiting the range of QP values that the implementation is able to use will also limit the capabilities of the implementation’s rate control algorithm to comply to other constraints. In particular, the implementation may not be able to comply to the following:

  • The average and/or peak bitrate values to be used for the encoded bitstream specified in the averageBitrate and maxBitrate members of the VkVideoEncodeRateControlLayerInfoKHR structure.

  • The upper bounds on the encoded frame size, for each picture type, specified in the maxFrameSize member of VkVideoEncodeH264RateControlLayerInfoKHR.

In general, applications need to configure rate control parameters appropriately in order to be able to get the desired rate control behavior, as described in the Video Encode Rate Control section.

When an instance of this structure is included in the pNext chain of a VkVideoEncodeRateControlLayerInfoKHR structure specified in one of the elements of the pLayers array member of the VkVideoEncodeRateControlInfoKHR structure passed to the vkCmdControlVideoCodingKHR command, VkVideoCodingControlInfoKHR::flags includes VK_VIDEO_CODING_CONTROL_ENCODE_RATE_CONTROL_BIT_KHR, and the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR, it specifies the H.264-specific rate control parameters of the rate control layer corresponding to that element of pLayers.

Valid Usage
Valid Usage (Implicit)
  • VUID-VkVideoEncodeH264RateControlLayerInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_H264_RATE_CONTROL_LAYER_INFO_KHR

  • VUID-VkVideoEncodeH264RateControlLayerInfoKHR-minQp-parameter
    minQp must be a valid VkVideoEncodeH264QpKHR structure

  • VUID-VkVideoEncodeH264RateControlLayerInfoKHR-maxQp-parameter
    maxQp must be a valid VkVideoEncodeH264QpKHR structure

  • VUID-VkVideoEncodeH264RateControlLayerInfoKHR-maxFrameSize-parameter
    maxFrameSize must be a valid VkVideoEncodeH264FrameSizeKHR structure

The VkVideoEncodeH264QpKHR structure is defined as:

// Provided by VK_KHR_video_encode_h264
typedef struct VkVideoEncodeH264QpKHR {
    int32_t    qpI;
    int32_t    qpP;
    int32_t    qpB;
} VkVideoEncodeH264QpKHR;

The VkVideoEncodeH264FrameSizeKHR structure is defined as:

// Provided by VK_KHR_video_encode_h264
typedef struct VkVideoEncodeH264FrameSizeKHR {
    uint32_t    frameISize;
    uint32_t    framePSize;
    uint32_t    frameBSize;
} VkVideoEncodeH264FrameSizeKHR;
  • frameISize is the size in bytes to be used for I pictures.

  • framePSize is the size in bytes to be used for P pictures.

  • frameBSize is the size in bytes to be used for B pictures.

GOP Remaining Frames

Besides session level rate control configuration, the application can specify the number of frames per frame type remaining in the group of pictures (GOP).

The VkVideoEncodeH264GopRemainingFrameInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_h264
typedef struct VkVideoEncodeH264GopRemainingFrameInfoKHR {
    VkStructureType    sType;
    const void*        pNext;
    VkBool32           useGopRemainingFrames;
    uint32_t           gopRemainingI;
    uint32_t           gopRemainingP;
    uint32_t           gopRemainingB;
} VkVideoEncodeH264GopRemainingFrameInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • useGopRemainingFrames indicates whether the implementation’s rate control algorithm should use the values specified in gopRemainingI, gopRemainingP, and gopRemainingB. If useGopRemainingFrames is VK_FALSE, then the values of gopRemainingI, gopRemainingP, and gopRemainingB are ignored.

  • gopRemainingI specifies the number of I frames the implementation’s rate control algorithm should assume to be remaining in the GOP prior to executing the video encode operation.

  • gopRemainingP specifies the number of P frames the implementation’s rate control algorithm should assume to be remaining in the GOP prior to executing the video encode operation.

  • gopRemainingB specifies the number of B frames the implementation’s rate control algorithm should assume to be remaining in the GOP prior to executing the video encode operation.

Setting useGopRemainingFrames to VK_TRUE and including this structure in the pNext chain of VkVideoBeginCodingInfoKHR is only mandatory if the VkVideoEncodeH264CapabilitiesKHR::requiresGopRemainingFrames reported for the used video profile is VK_TRUE. However, implementations may use these remaining frame counts, when specified, even when it is not required. In particular, when the application does not use a regular GOP structure, these values may provide additional guidance for the implementation’s rate control algorithm.

The VkVideoEncodeH264CapabilitiesKHR::prefersGopRemainingFrames capability is also used to indicate that the implementation’s rate control algorithm may operate more accurately if the application specifies the remaining frame counts using this structure.

As with other rate control guidance values, if the effective order and number of frames encoded by the application are not in line with the remaining frame counts specified in this structure at any given point, then the behavior of the implementation’s rate control algorithm may deviate from the one expected by the application.

Valid Usage (Implicit)
  • VUID-VkVideoEncodeH264GopRemainingFrameInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_H264_GOP_REMAINING_FRAME_INFO_KHR

H.264 QP Delta Maps

Quantization delta maps used with an H.264 encode profile are referred to as QP delta maps and their texels contain integer values representing QP delta values that are applied in the process of determining the quantization parameters of the encoded picture.

Accordingly, H.264 QP delta maps always have single channel integer formats, as reported in VkVideoFormatPropertiesKHR::format.

When the rate control mode is VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR, the QP delta values are added to the per slice constant QP values that, in effect, enable the application to explicitly control the used QP values at the granularity of the used quantization map texel size.

For all other rate control modes, the QP delta values can be used to offset the QP values that the rate control algorithm would otherwise produce.

H.264 Encode Quantization

Performing H.264 encode operations involves the process of assigning QP values to individual H.264 macroblocks. This process depends on the used rate control mode, as well as other encode and rate control parameters, as described below:

  • If the configured rate control mode is VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DEFAULT_KHR, then the QP value is initialized by the implementation-specific default rate control algorithm.

  • If the configured rate control mode is VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR, then the QP value is initialized from the constant QP value specified for the H.264 slice the macroblock is part of.

  • If the configured rate control mode is not VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DEFAULT_KHR or VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR, then the QP value is initialized by the corresponding rate control algorithm.

    • If the video encode operation is issued with a quantization delta map, the QP delta value corresponding to the macroblock, as fetched from the quantization map, is added to the previously determined QP value. If the fetched QP delta value falls outside the supported QP delta value range reported in the minQpDelta and maxQpDelta members of VkVideoEncodeH264QuantizationMapCapabilitiesKHR, then the QP value used for the macroblock becomes undefined.

    • If the video encode operation is issued with an emphasis map, the rate control will adjust the QP value based on the emphasis value corresponding to the macroblock, as fetched from the quantization map, according to the following equation:

      QPnew = f(QPprev,e)

      Where QPnew is the resulting QP value, QPprev is the previously determined QP value, e is the emphasis value corresponding to the macroblock, and f is an implementation-defined function for which the following implication is true:

      e1 < e2 ⇒ f(QP,e1) ≥ f(QP,e2)

      This means that lower emphasis values will result in higher QP values, whereas higher emphasis values will result in lower QP values, but the function is not strictly decreasing with respect to the input emphasis value for a given input QP value.

    • If clamping to minimum QP values is enabled in the applied rate control layer, then the QP value is clamped to the corresponding minimum QP value.

    • If clamping to maximum QP values is enabled in the applied rate control layer, then the QP value is clamped to the corresponding maximum QP value.

  • If VK_VIDEO_ENCODE_H264_CAPABILITY_MB_QP_DIFF_WRAPAROUND_BIT_KHR is not supported, then the determined QP value is clamped in such a way that the mb_qp_delta value of the encoded macroblock complies to the modified version of equation 7-37 of the ITU-T H.264 Specification.

    The effect of this is that the maximum QP difference across subsequent macroblocks is limited to the [-(26 + QpBdOffsetY / 2), 25 + QpBdOffsetY / 2] range and only has an observable change in behavior when the video encode operation is issued with a QP delta map.

  • In all cases, the final QP value is clamped to the QP value range supported by the video profile, as reported in the minQp and maxQp members of VkVideoEncodeH264CapabilitiesKHR.

H.264 Encode Requirements

This section described the required H.264 encoding capabilities for physical devices that have at least one queue family that supports the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H264_BIT_KHR, as returned by vkGetPhysicalDeviceQueueFamilyProperties2 in VkQueueFamilyVideoPropertiesKHR::videoCodecOperations.

Table 7. Required Video Std Header Versions
Video Std Header Name Version

vulkan_video_codec_h264std_encode

1.0.0

Table 8. Required Video Capabilities
Video Capability Requirement Requirement Type1

VkVideoCapabilitiesKHR

flags

-

min

minBitstreamBufferOffsetAlignment

4096

max

minBitstreamBufferSizeAlignment

4096

max

pictureAccessGranularity

(64,64)

max

minCodedExtent

-

max

maxCodedExtent

-

min

maxDpbSlots

0

min

maxActiveReferencePictures

0

min

VkVideoEncodeCapabilitiesKHR

flags

-

min

rateControlModes

-

min

maxBitrate

64000

min

maxQualityLevels

1

min

encodeInputPictureGranularity

(64,64)

max

supportedEncodeFeedbackFlags

VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_BUFFER_OFFSET_BIT_KHR
VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_BYTES_WRITTEN_BIT_KHR

min

VkVideoEncodeH264CapabilitiesKHR

flags

-

min

maxLevelIdc

STD_VIDEO_H264_LEVEL_IDC_1_0

min

maxSliceCount

1

min

maxPPictureL0ReferenceCount

0

min

maxBPictureL0ReferenceCount

0

min

maxL1ReferenceCount

0

min

maxTemporalLayerCount

1

min

expectDyadicTemporalLayerPattern

-

implementation-dependent

minQp

-

max

maxQp

-

min

prefersGopRemainingFrames

-

implementation-dependent

requiresGopRemainingFrames

-

implementation-dependent

stdSyntaxFlags

-

min

VkVideoEncodeQuantizationMapCapabilitiesKHR

maxQuantizationMapExtent

- 2

min

VkVideoEncodeH264QuantizationMapCapabilitiesKHR

minQpDelta

- 3

max

maxQpDelta

- 3

min

1

The Requirement Type column specifies the requirement is either the minimum value all implementations must support, the maximum value all implementations must support, or the exact value all implementations must support. For bitmasks a minimum value is the least bits all implementations must set, but they may have additional bits set beyond this minimum.

2

If VkVideoCapabilitiesKHR::flags includes VK_VIDEO_ENCODE_CAPABILITY_QUANTIZATION_DELTA_MAP_BIT_KHR or VK_VIDEO_ENCODE_CAPABILITY_EMPHASIS_MAP_BIT_KHR, then the width and height members of maxQuantizationMapExtent must be greater than zero.

3

If VkVideoCapabilitiesKHR::flags includes VK_VIDEO_ENCODE_CAPABILITY_QUANTIZATION_DELTA_MAP_BIT_KHR, then maxQpDelta must be greater than minQpDelta.

H.265 Encode Operations

Video encode operations using an H.265 encode profile can be used to encode elementary video stream sequences compliant to the ITU-T H.265 Specification.

Refer to the Preamble for information on how the Khronos Intellectual Property Rights Policy relates to normative references to external materials not created by Khronos.

This process is performed according to the video encode operation steps with the codec-specific semantics defined in section 8 of the ITU-T H.265 Specification as follows:

If the parameters adhere to the syntactic and semantic requirements defined in the corresponding sections of the ITU-T H.265 Specification, as described above, and the DPB slots associated with the active reference pictures all refer to valid picture references, then the video encode operation will complete successfully. Otherwise, the video encode operation may complete unsuccessfully.

H.265 Encode Parameter Overrides

Implementations may override, unless otherwise specified, any of the H.265 encode parameters specified in the following Video Std structures:

  • StdVideoH265VideoParameterSet

  • StdVideoH265SequenceParameterSet

  • StdVideoH265PictureParameterSet

  • StdVideoEncodeH265PictureInfo

  • StdVideoEncodeH265SliceSegmentHeader

  • StdVideoEncodeH265ReferenceInfo

All such H.265 encode parameter overrides must fulfill the conditions defined in the Video Encode Parameter Overrides section.

In addition, implementations must not override any of the following H.265 encode parameters:

  • StdVideoEncodeH265PictureInfo::pic_type

  • StdVideoEncodeH265SliceSegmentHeader::slice_type

In case of a video session parameters object created with VK_VIDEO_SESSION_PARAMETERS_CREATE_QUANTIZATION_MAP_COMPATIBLE_BIT_KHR, the following H.265 SPS and PPS parameters may be overridden by the implementation according to the quantization map texel size the video session parameters object was created with:

  • StdVideoH265SequenceParameterSet::log2_min_luma_coding_block_size_minus3

  • StdVideoH265SequenceParameterSet::log2_diff_max_min_luma_coding_block_size

  • StdVideoH265SequenceParameterSet::log2_min_pcm_luma_coding_block_size_minus3

  • StdVideoH265SequenceParameterSet::log2_diff_max_min_pcm_luma_coding_block_size

  • StdVideoH265PictureParameterSet::diff_cu_qp_delta_depth

This may be necessary in order to limit the set of H.265 coding unit and coding tree unit sizes used during picture encoding to those that are supported by the implementation when using the specific quantization map texel size.

In case of H.265 encode parameters stored in video session parameters objects, applications need to use the vkGetEncodedVideoSessionParametersKHR command to determine whether any implementation overrides happened. If the query indicates that implementation overrides were applied, then the application needs to retrieve and use the encoded H.265 parameter sets in the bitstream in order to be able to produce a compliant H.265 video bitstream using the H.265 encode parameters stored in the video session parameters object.

In case of any H.265 encode parameters stored in the encoded bitstream produced by video encode operations, if the implementation supports the VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_HAS_OVERRIDES_BIT_KHR video encode feedback query flag, the application can use such queries to retrieve feedback about whether any implementation overrides have been applied to those H.265 encode parameters.

H.265 Encode Bitstream Data Access

Each video encode operation writes one or more VCL NAL units comprising of slice segment headers and data of the encoded picture, in the format defined in sections 7.3.6 and 7.3.8, according to the semantics defined in sections 7.4.7 and 7.4.9 of the ITU-T H.265 Specification, respectively. The number of VCL NAL units written is specified by VkVideoEncodeH265PictureInfoKHR::naluSliceSegmentEntryCount.

H.265 Encode Picture Data Access

Accesses to image data within a video picture resource happen at the granularity indicated by VkVideoCapabilitiesKHR::pictureAccessGranularity, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the used video profile. Accordingly, the complete image subregion of a encode input picture, reference picture, or reconstructed picture accessed by video coding operations using an H.265 encode profile is defined as the set of texels within the coordinate range:

([0,endX), [0,endY))

Where:

  • endX equals codedExtent.width rounded up to the nearest integer multiple of pictureAccessGranularity.width and clamped to the width of the image subresource referred to by the corresponding VkVideoPictureResourceInfoKHR structure;

  • endY equals codedExtent.height rounded up to the nearest integer multiple of pictureAccessGranularity.height and clamped to the height of the image subresource referred to by the corresponding VkVideoPictureResourceInfoKHR structure;

Where codedExtent is the member of the VkVideoPictureResourceInfoKHR structure corresponding to the picture.

In case of video encode operations using an H.265 encode profile, any access to a picture at the coordinates (x,y), as defined by the ITU-T H.265 Specification, is an access to the image subresource referred to by the corresponding VkVideoPictureResourceInfoKHR structure at the texel coordinates (x,y).

Implementations may choose not to access some or all texels within particular reference pictures available to a video encode operation (e.g. due to video encode parameter overrides restricting the effective set of used reference pictures, or if the encoding algorithm chooses not to use certain subregions of the reference picture data for sample prediction).

H.265 Frame, Picture, Slice Segments, and Tiles

H.265 pictures consist of one or more slices, slice segments, and tiles, as defined in section 6.3.1 of the ITU-T H.265 Specification.

For the purposes of this specification, the H.265 slice segments comprising a picture are referred to as the picture partitions of the picture.

Video encode operations using an H.265 encode profile can encode slice segments of different types, as defined in section 7.4.7.1 of the ITU-T H.265 Specification, by specifying the corresponding enumeration constant value in StdVideoEncodeH265SliceSegmentHeader::slice_type in the H.265 slice segment header parameters from the Video Std enumeration type StdVideoH265SliceType:

  • STD_VIDEO_H265_SLICE_TYPE_B indicates that the slice segment is part of a B slice as defined in section 3.12 of the ITU-T H.265 Specification.

  • STD_VIDEO_H265_SLICE_TYPE_P indicates that the slice segment is part of a P slice as defined in section 3.111 of the ITU-T H.265 Specification.

  • STD_VIDEO_H265_SLICE_TYPE_I indicates that the slice segment is part of an I slice as defined in section 3.74 of the ITU-T H.265 Specification.

Pictures constructed from such slice segments can be of different types, as defined in section 7.4.3.5 of the ITU-T H.265 Specification. Video encode operations using an H.265 encode profile can encode pictures of a specific type by specifying the corresponding enumeration constant value in StdVideoEncodeH265PictureInfo::pic_type in the H.265 picture information from the Video Std enumeration type StdVideoH265PictureType:

  • STD_VIDEO_H265_PICTURE_TYPE_P indicates that the picture is a P picture. A frame consisting of a P picture is also referred to as a P frame.

  • STD_VIDEO_H265_PICTURE_TYPE_B indicates that the picture is a B picture. A frame consisting of a B picture is also referred to as a B frame.

  • STD_VIDEO_H265_PICTURE_TYPE_I indicates that the picture is an I picture. A frame consisting of an I picture is also referred to as an I frame.

  • STD_VIDEO_H265_PICTURE_TYPE_IDR indicates that the picture is a special type of I picture called an IDR picture as defined in section 3.67 of the ITU-T H.265 Specification. A frame consisting of an IDR picture is also referred to as an IDR frame.

H.265 Coding Blocks

H.265 encode supports two types of coding blocks:

H.265 Encode Profile

A video profile supporting H.265 video encode operations is specified by setting VkVideoProfileInfoKHR::videoCodecOperation to VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR and adding a VkVideoEncodeH265ProfileInfoKHR structure to the VkVideoProfileInfoKHR::pNext chain.

The VkVideoEncodeH265ProfileInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_h265
typedef struct VkVideoEncodeH265ProfileInfoKHR {
    VkStructureType           sType;
    const void*               pNext;
    StdVideoH265ProfileIdc    stdProfileIdc;
} VkVideoEncodeH265ProfileInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • stdProfileIdc is a StdVideoH265ProfileIdc value specifying the H.265 codec profile IDC, as defined in section A.3 of the ITU-T H.265 Specification.

Valid Usage (Implicit)
  • VUID-VkVideoEncodeH265ProfileInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_H265_PROFILE_INFO_KHR

H.265 Encode Capabilities

When calling vkGetPhysicalDeviceVideoCapabilitiesKHR to query the capabilities for an H.265 encode profile, the VkVideoCapabilitiesKHR::pNext chain must include a VkVideoEncodeH265CapabilitiesKHR structure that will be filled with the profile-specific capabilities.

The VkVideoEncodeH265CapabilitiesKHR structure is defined as:

// Provided by VK_KHR_video_encode_h265
typedef struct VkVideoEncodeH265CapabilitiesKHR {
    VkStructureType                                sType;
    void*                                          pNext;
    VkVideoEncodeH265CapabilityFlagsKHR            flags;
    StdVideoH265LevelIdc                           maxLevelIdc;
    uint32_t                                       maxSliceSegmentCount;
    VkExtent2D                                     maxTiles;
    VkVideoEncodeH265CtbSizeFlagsKHR               ctbSizes;
    VkVideoEncodeH265TransformBlockSizeFlagsKHR    transformBlockSizes;
    uint32_t                                       maxPPictureL0ReferenceCount;
    uint32_t                                       maxBPictureL0ReferenceCount;
    uint32_t                                       maxL1ReferenceCount;
    uint32_t                                       maxSubLayerCount;
    VkBool32                                       expectDyadicTemporalSubLayerPattern;
    int32_t                                        minQp;
    int32_t                                        maxQp;
    VkBool32                                       prefersGopRemainingFrames;
    VkBool32                                       requiresGopRemainingFrames;
    VkVideoEncodeH265StdFlagsKHR                   stdSyntaxFlags;
} VkVideoEncodeH265CapabilitiesKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • flags is a bitmask of VkVideoEncodeH265CapabilityFlagBitsKHR indicating supported H.265 encoding capabilities.

  • maxLevelIdc is a StdVideoH265LevelIdc value indicating the maximum H.265 level supported by the profile, where enum constant STD_VIDEO_H265_LEVEL_IDC_<major>_<minor> identifies H.265 level <major>.<minor> as defined in section A.4 of the ITU-T H.265 Specification.

  • maxSliceSegmentCount indicates the maximum number of slice segments that can be encoded for a single picture. Further restrictions may apply to the number of slice segments that can be encoded for a single picture depending on other capabilities and codec-specific rules.

  • maxTiles indicates the maximum number of H.265 tile columns and rows, as defined in sections 3.175 and 3.176 of the ITU-T H.265 Specification that can be encoded for a single picture. Further restrictions may apply to the number of H.265 tiles that can be encoded for a single picture depending on other capabilities and codec-specific rules.

  • ctbSizes is a bitmask of VkVideoEncodeH265CtbSizeFlagBitsKHR describing the supported CTB sizes.

  • transformBlockSizes is a bitmask of VkVideoEncodeH265TransformBlockSizeFlagBitsKHR describing the supported transform block sizes.

  • maxPPictureL0ReferenceCount indicates the maximum number of reference pictures the implementation supports in the reference list L0 for P pictures.

    As implementations may override the reference lists, maxPPictureL0ReferenceCount does not limit the number of elements that the application can specify in the L0 reference list for P pictures. However, if maxPPictureL0ReferenceCount is zero, then the use of P pictures is not allowed. In case of H.265 encoding, pictures can be encoded using only forward prediction even if P pictures are not supported, as the ITU-T H.265 Specification supports generalized P & B frames (also known as low delay B frames) whereas B frames can refer to past frames through both the L0 and L1 reference lists.

  • maxBPictureL0ReferenceCount indicates the maximum number of reference pictures the implementation supports in the reference list L0 for B pictures.

  • maxL1ReferenceCount indicates the maximum number of reference pictures the implementation supports in the reference list L1 if encoding of B pictures is supported.

    As implementations may override the reference lists, maxBPictureL0ReferenceCount and maxL1ReferenceCount does not limit the number of elements that the application can specify in the L0 and L1 reference lists for B pictures. However, if maxBPictureL0ReferenceCount and maxL1ReferenceCount are both zero, then the use of B pictures is not allowed.

  • maxSubLayerCount indicates the maximum number of H.265 sub-layers supported by the implementation.

  • expectDyadicTemporalSubLayerPattern indicates that the implementation’s rate control algorithms expect the application to use a dyadic temporal sub-layer pattern when encoding multiple temporal sub-layers.

  • minQp indicates the minimum QP value supported.

  • maxQp indicates the maximum QP value supported.

  • prefersGopRemainingFrames indicates that the implementation’s rate control algorithm prefers the application to specify the number of frames of each type remaining in the current group of pictures when beginning a video coding scope.

  • requiresGopRemainingFrames indicates that the implementation’s rate control algorithm requires the application to specify the number of frames of each type remaining in the current group of pictures when beginning a video coding scope.

  • stdSyntaxFlags is a bitmask of VkVideoEncodeH265StdFlagBitsKHR indicating capabilities related to H.265 syntax elements.

Valid Usage (Implicit)
  • VUID-VkVideoEncodeH265CapabilitiesKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_H265_CAPABILITIES_KHR

Bits which may be set in VkVideoEncodeH265CapabilitiesKHR::flags, indicating the H.265 encoding capabilities supported, are:

// Provided by VK_KHR_video_encode_h265
typedef enum VkVideoEncodeH265CapabilityFlagBitsKHR {
    VK_VIDEO_ENCODE_H265_CAPABILITY_HRD_COMPLIANCE_BIT_KHR = 0x00000001,
    VK_VIDEO_ENCODE_H265_CAPABILITY_PREDICTION_WEIGHT_TABLE_GENERATED_BIT_KHR = 0x00000002,
    VK_VIDEO_ENCODE_H265_CAPABILITY_ROW_UNALIGNED_SLICE_SEGMENT_BIT_KHR = 0x00000004,
    VK_VIDEO_ENCODE_H265_CAPABILITY_DIFFERENT_SLICE_SEGMENT_TYPE_BIT_KHR = 0x00000008,
    VK_VIDEO_ENCODE_H265_CAPABILITY_B_FRAME_IN_L0_LIST_BIT_KHR = 0x00000010,
    VK_VIDEO_ENCODE_H265_CAPABILITY_B_FRAME_IN_L1_LIST_BIT_KHR = 0x00000020,
    VK_VIDEO_ENCODE_H265_CAPABILITY_PER_PICTURE_TYPE_MIN_MAX_QP_BIT_KHR = 0x00000040,
    VK_VIDEO_ENCODE_H265_CAPABILITY_PER_SLICE_SEGMENT_CONSTANT_QP_BIT_KHR = 0x00000080,
    VK_VIDEO_ENCODE_H265_CAPABILITY_MULTIPLE_TILES_PER_SLICE_SEGMENT_BIT_KHR = 0x00000100,
    VK_VIDEO_ENCODE_H265_CAPABILITY_MULTIPLE_SLICE_SEGMENTS_PER_TILE_BIT_KHR = 0x00000200,
  // Provided by VK_KHR_video_encode_h265 with VK_KHR_video_encode_quantization_map
    VK_VIDEO_ENCODE_H265_CAPABILITY_CU_QP_DIFF_WRAPAROUND_BIT_KHR = 0x00000400,
} VkVideoEncodeH265CapabilityFlagBitsKHR;
  • VK_VIDEO_ENCODE_H265_CAPABILITY_HRD_COMPLIANCE_BIT_KHR specifies whether the implementation may be able to generate HRD compliant bitstreams if any of the nal_hrd_parameters_present_flag, vcl_hrd_parameters_present_flag, or sub_pic_hrd_params_present_flag members of StdVideoH265HrdFlags are set to 1 in the HRD parameters of the active VPS or active SPS, or if StdVideoH265SpsVuiFlags::vui_hrd_parameters_present_flag is set to 1 in the active SPS.

  • VK_VIDEO_ENCODE_H265_CAPABILITY_PREDICTION_WEIGHT_TABLE_GENERATED_BIT_KHR specifies that if the weighted_pred_flag or the weighted_bipred_flag member of StdVideoH265PpsFlags is set to 1 in the active PPS when encoding a P picture or B picture, respectively, then the implementation is able to internally decide syntax for pred_weight_table, as defined in section 7.4.7.3 of the ITU-T H.265 Specification, and the application is not required to provide a weight table in the H.265 slice segment header parameters.

  • VK_VIDEO_ENCODE_H265_CAPABILITY_ROW_UNALIGNED_SLICE_SEGMENT_BIT_KHR specifies that each slice segment in a frame with a single or multiple tiles per slice may begin or finish at any offset in a CTB row. If not supported, all slice segments in such a frame must begin at the start of a CTB row (and hence each slice segment must finish at the end of a CTB row). Also indicates that each slice segment in a frame with multiple slices per tile may begin or finish at any offset within the enclosing tile’s CTB row. If not supported, slice segments in such a frame must begin at the start of the enclosing tile’s CTB row (and hence each slice segment must finish at the end of the enclosing tile’s CTB row).

  • VK_VIDEO_ENCODE_H265_CAPABILITY_DIFFERENT_SLICE_SEGMENT_TYPE_BIT_KHR specifies that when a frame is encoded with multiple slice segments, the implementation allows encoding each slice segment with a different StdVideoEncodeH265SliceSegmentHeader::slice_type specified in the H.265 slice segment header parameters. If not supported, all slice segments of the frame must be encoded with the same slice_type which corresponds to the picture type of the frame.

  • VK_VIDEO_ENCODE_H265_CAPABILITY_B_FRAME_IN_L0_LIST_BIT_KHR specifies support for using a B frame as L0 reference, as specified in StdVideoEncodeH265ReferenceListsInfo::RefPicList0 in the H.265 picture information.

  • VK_VIDEO_ENCODE_H265_CAPABILITY_B_FRAME_IN_L1_LIST_BIT_KHR specifies support for using a B frame as L1 reference, as specified in StdVideoEncodeH265ReferenceListsInfo::RefPicList1 in the H.265 picture information.

  • VK_VIDEO_ENCODE_H265_CAPABILITY_PER_PICTURE_TYPE_MIN_MAX_QP_BIT_KHR specifies support for specifying different QP values in the members of VkVideoEncodeH265QpKHR.

  • VK_VIDEO_ENCODE_H265_CAPABILITY_PER_SLICE_SEGMENT_CONSTANT_QP_BIT_KHR specifies support for specifying different constant QP values for each slice segment.

  • VK_VIDEO_ENCODE_H265_CAPABILITY_MULTIPLE_TILES_PER_SLICE_SEGMENT_BIT_KHR specifies whether encoding multiple tiles per slice segment, as defined in section 6.3.1 of the ITU-T H.265 Specification, is supported. If this capability flag is not present, then the implementation is only able to encode a single tile for each slice segment.

  • VK_VIDEO_ENCODE_H265_CAPABILITY_MULTIPLE_SLICE_SEGMENTS_PER_TILE_BIT_KHR specifies whether encoding multiple slice segments per tile, as defined in section 6.3.1 of the ITU-T H.265 Specification, is supported. If this capability flag is not present, then the implementation is only able to encode a single slice segment for each tile.

  • VK_VIDEO_ENCODE_H265_CAPABILITY_CU_QP_DIFF_WRAPAROUND_BIT_KHR indicates support for wraparound during the calculation of the QP values of subsequently encoded coding units, as defined in section 7.4.9.14 of the ITU-T H.265 Specification. If not supported, equation 8-283 of the ITU-T H.265 Specification is effectively reduced to the following:

    QpY = qPY_PRED + CuQpDeltaVal

    The effect of this is that the maximum QP difference across subsequent coding units is limited to the [-(26 + QpBdOffsetY / 2), 25 + QpBdOffsetY / 2] range.

// Provided by VK_KHR_video_encode_h265
typedef VkFlags VkVideoEncodeH265CapabilityFlagsKHR;

VkVideoEncodeH265CapabilityFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoEncodeH265CapabilityFlagBitsKHR.

Bits which may be set in VkVideoEncodeH265CapabilitiesKHR::stdSyntaxFlags, indicating the capabilities related to the H.265 syntax elements, are:

// Provided by VK_KHR_video_encode_h265
typedef enum VkVideoEncodeH265StdFlagBitsKHR {
    VK_VIDEO_ENCODE_H265_STD_SEPARATE_COLOR_PLANE_FLAG_SET_BIT_KHR = 0x00000001,
    VK_VIDEO_ENCODE_H265_STD_SAMPLE_ADAPTIVE_OFFSET_ENABLED_FLAG_SET_BIT_KHR = 0x00000002,
    VK_VIDEO_ENCODE_H265_STD_SCALING_LIST_DATA_PRESENT_FLAG_SET_BIT_KHR = 0x00000004,
    VK_VIDEO_ENCODE_H265_STD_PCM_ENABLED_FLAG_SET_BIT_KHR = 0x00000008,
    VK_VIDEO_ENCODE_H265_STD_SPS_TEMPORAL_MVP_ENABLED_FLAG_SET_BIT_KHR = 0x00000010,
    VK_VIDEO_ENCODE_H265_STD_INIT_QP_MINUS26_BIT_KHR = 0x00000020,
    VK_VIDEO_ENCODE_H265_STD_WEIGHTED_PRED_FLAG_SET_BIT_KHR = 0x00000040,
    VK_VIDEO_ENCODE_H265_STD_WEIGHTED_BIPRED_FLAG_SET_BIT_KHR = 0x00000080,
    VK_VIDEO_ENCODE_H265_STD_LOG2_PARALLEL_MERGE_LEVEL_MINUS2_BIT_KHR = 0x00000100,
    VK_VIDEO_ENCODE_H265_STD_SIGN_DATA_HIDING_ENABLED_FLAG_SET_BIT_KHR = 0x00000200,
    VK_VIDEO_ENCODE_H265_STD_TRANSFORM_SKIP_ENABLED_FLAG_SET_BIT_KHR = 0x00000400,
    VK_VIDEO_ENCODE_H265_STD_TRANSFORM_SKIP_ENABLED_FLAG_UNSET_BIT_KHR = 0x00000800,
    VK_VIDEO_ENCODE_H265_STD_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT_FLAG_SET_BIT_KHR = 0x00001000,
    VK_VIDEO_ENCODE_H265_STD_TRANSQUANT_BYPASS_ENABLED_FLAG_SET_BIT_KHR = 0x00002000,
    VK_VIDEO_ENCODE_H265_STD_CONSTRAINED_INTRA_PRED_FLAG_SET_BIT_KHR = 0x00004000,
    VK_VIDEO_ENCODE_H265_STD_ENTROPY_CODING_SYNC_ENABLED_FLAG_SET_BIT_KHR = 0x00008000,
    VK_VIDEO_ENCODE_H265_STD_DEBLOCKING_FILTER_OVERRIDE_ENABLED_FLAG_SET_BIT_KHR = 0x00010000,
    VK_VIDEO_ENCODE_H265_STD_DEPENDENT_SLICE_SEGMENTS_ENABLED_FLAG_SET_BIT_KHR = 0x00020000,
    VK_VIDEO_ENCODE_H265_STD_DEPENDENT_SLICE_SEGMENT_FLAG_SET_BIT_KHR = 0x00040000,
    VK_VIDEO_ENCODE_H265_STD_SLICE_QP_DELTA_BIT_KHR = 0x00080000,
    VK_VIDEO_ENCODE_H265_STD_DIFFERENT_SLICE_QP_DELTA_BIT_KHR = 0x00100000,
} VkVideoEncodeH265StdFlagBitsKHR;
  • VK_VIDEO_ENCODE_H265_STD_SEPARATE_COLOR_PLANE_FLAG_SET_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoH265SpsFlags::separate_colour_plane_flag in the SPS when that value is 1.

  • VK_VIDEO_ENCODE_H265_STD_SAMPLE_ADAPTIVE_OFFSET_ENABLED_FLAG_SET_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoH265SpsFlags::sample_adaptive_offset_enabled_flag in the SPS when that value is 1.

  • VK_VIDEO_ENCODE_H265_STD_SCALING_LIST_DATA_PRESENT_FLAG_SET_BIT_KHR specifies whether the implementation supports using the application-provided value for the scaling_list_enabled_flag and sps_scaling_list_data_present_flag members of StdVideoH265SpsFlags in the SPS, and the application-provided value for StdVideoH265PpsFlags::pps_scaling_list_data_present_flag in the PPS when those values are 1.

  • VK_VIDEO_ENCODE_H265_STD_PCM_ENABLED_FLAG_SET_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoH265SpsFlags::pcm_enable_flag in the SPS when that value is 1.

  • VK_VIDEO_ENCODE_H265_STD_SPS_TEMPORAL_MVP_ENABLED_FLAG_SET_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoH265SpsFlags::sps_temporal_mvp_enabled_flag in the SPS when that value is 1.

  • VK_VIDEO_ENCODE_H265_STD_INIT_QP_MINUS26_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoH265PictureParameterSet::init_qp_minus26 in the PPS when that value is non-zero.

  • VK_VIDEO_ENCODE_H265_STD_WEIGHTED_PRED_FLAG_SET_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoH265PpsFlags::weighted_pred_flag in the PPS when that value is 1.

  • VK_VIDEO_ENCODE_H265_STD_WEIGHTED_BIPRED_FLAG_SET_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoH265PpsFlags::weighted_bipred_flag in the PPS when that value is 1.

  • VK_VIDEO_ENCODE_H265_STD_LOG2_PARALLEL_MERGE_LEVEL_MINUS2_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoH265PictureParameterSet::log2_parallel_merge_level_minus2 in the PPS when that value is non-zero.

  • VK_VIDEO_ENCODE_H265_STD_SIGN_DATA_HIDING_ENABLED_FLAG_SET_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoH265PpsFlags::sign_data_hiding_enabled_flag in the PPS when that value is 1.

  • VK_VIDEO_ENCODE_H265_STD_TRANSFORM_SKIP_ENABLED_FLAG_SET_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoH265PpsFlags::transform_skip_enabled_flag in the PPS when that value is 1.

  • VK_VIDEO_ENCODE_H265_STD_TRANSFORM_SKIP_ENABLED_FLAG_UNSET_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoH265PpsFlags::transform_skip_enabled_flag in the PPS when that value is 0.

  • VK_VIDEO_ENCODE_H265_STD_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT_FLAG_SET_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoH265PpsFlags::pps_slice_chroma_qp_offsets_present_flag in the PPS when that value is 1.

  • VK_VIDEO_ENCODE_H265_STD_TRANSQUANT_BYPASS_ENABLED_FLAG_SET_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoH265PpsFlags::transquant_bypass_enabled_flag in the PPS when that value is 1.

  • VK_VIDEO_ENCODE_H265_STD_CONSTRAINED_INTRA_PRED_FLAG_SET_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoH265PpsFlags::constrained_intra_pred_flag in the PPS when that value is 1.

  • VK_VIDEO_ENCODE_H265_STD_ENTROPY_CODING_SYNC_ENABLED_FLAG_SET_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoH265PpsFlags::entropy_coding_sync_enabled_flag in the PPS when that value is 1.

  • VK_VIDEO_ENCODE_H265_STD_DEBLOCKING_FILTER_OVERRIDE_ENABLED_FLAG_SET_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoH265PpsFlags::deblocking_filter_override_enabled_flag in the PPS when that value is 1.

  • VK_VIDEO_ENCODE_H265_STD_DEPENDENT_SLICE_SEGMENTS_ENABLED_FLAG_SET_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoH265PpsFlags::dependent_slice_segments_enabled_flag in the PPS when that value is 1.

  • VK_VIDEO_ENCODE_H265_STD_DEPENDENT_SLICE_SEGMENT_FLAG_SET_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoEncodeH265SliceSegmentHeader::dependent_slice_segment_flag in the H.265 slice segment header parameters when that value is 1.

  • VK_VIDEO_ENCODE_H265_STD_SLICE_QP_DELTA_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoEncodeH265SliceSegmentHeader::slice_qp_delta in the H.265 slice segment header parameters when that value is identical across the slice segments of the encoded frame.

  • VK_VIDEO_ENCODE_H265_STD_DIFFERENT_SLICE_QP_DELTA_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoEncodeH265SliceSegmentHeader::slice_qp_delta in the H.265 slice segment header parameters when that value is different across the slice segments of the encoded frame.

These capability flags provide information to the application about specific H.265 syntax element values that the implementation supports without having to override them and do not otherwise restrict the values that the application can specify for any of the mentioned H.265 syntax elements.

// Provided by VK_KHR_video_encode_h265
typedef VkFlags VkVideoEncodeH265StdFlagsKHR;

VkVideoEncodeH265StdFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoEncodeH265StdFlagBitsKHR.

Bits which may be set in VkVideoEncodeH265CapabilitiesKHR::ctbSizes, indicating the CTB sizes supported by the implementation, are:

// Provided by VK_KHR_video_encode_h265
typedef enum VkVideoEncodeH265CtbSizeFlagBitsKHR {
    VK_VIDEO_ENCODE_H265_CTB_SIZE_16_BIT_KHR = 0x00000001,
    VK_VIDEO_ENCODE_H265_CTB_SIZE_32_BIT_KHR = 0x00000002,
    VK_VIDEO_ENCODE_H265_CTB_SIZE_64_BIT_KHR = 0x00000004,
} VkVideoEncodeH265CtbSizeFlagBitsKHR;
  • VK_VIDEO_ENCODE_H265_CTB_SIZE_16_BIT_KHR specifies that a CTB size of 16x16 is supported.

  • VK_VIDEO_ENCODE_H265_CTB_SIZE_32_BIT_KHR specifies that a CTB size of 32x32 is supported.

  • VK_VIDEO_ENCODE_H265_CTB_SIZE_64_BIT_KHR specifies that a CTB size of 64x64 is supported.

// Provided by VK_KHR_video_encode_h265
typedef VkFlags VkVideoEncodeH265CtbSizeFlagsKHR;

VkVideoEncodeH265CtbSizeFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoEncodeH265CtbSizeFlagBitsKHR.

Implementations must support at least one of VkVideoEncodeH265CtbSizeFlagBitsKHR.

Bits which may be set in VkVideoEncodeH265CapabilitiesKHR::transformBlockSizes, indicating the transform block sizes supported by the implementation, are:

// Provided by VK_KHR_video_encode_h265
typedef enum VkVideoEncodeH265TransformBlockSizeFlagBitsKHR {
    VK_VIDEO_ENCODE_H265_TRANSFORM_BLOCK_SIZE_4_BIT_KHR = 0x00000001,
    VK_VIDEO_ENCODE_H265_TRANSFORM_BLOCK_SIZE_8_BIT_KHR = 0x00000002,
    VK_VIDEO_ENCODE_H265_TRANSFORM_BLOCK_SIZE_16_BIT_KHR = 0x00000004,
    VK_VIDEO_ENCODE_H265_TRANSFORM_BLOCK_SIZE_32_BIT_KHR = 0x00000008,
} VkVideoEncodeH265TransformBlockSizeFlagBitsKHR;
  • VK_VIDEO_ENCODE_H265_TRANSFORM_BLOCK_SIZE_4_BIT_KHR specifies that a transform block size of 4x4 is supported.

  • VK_VIDEO_ENCODE_H265_TRANSFORM_BLOCK_SIZE_8_BIT_KHR specifies that a transform block size of 8x8 is supported.

  • VK_VIDEO_ENCODE_H265_TRANSFORM_BLOCK_SIZE_16_BIT_KHR specifies that a transform block size of 16x16 is supported.

  • VK_VIDEO_ENCODE_H265_TRANSFORM_BLOCK_SIZE_32_BIT_KHR specifies that a transform block size of 32x32 is supported.

// Provided by VK_KHR_video_encode_h265
typedef VkFlags VkVideoEncodeH265TransformBlockSizeFlagsKHR;

VkVideoEncodeH265TransformBlockSizeFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoEncodeH265TransformBlockSizeFlagBitsKHR.

Implementations must support at least one of VkVideoEncodeH265TransformBlockSizeFlagBitsKHR.

H.265 Encode Quality Level Properties

When calling vkGetPhysicalDeviceVideoEncodeQualityLevelPropertiesKHR with pVideoProfile->videoCodecOperation specified as VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, the VkVideoEncodeH265QualityLevelPropertiesKHR structure must be included in the pNext chain of the VkVideoEncodeQualityLevelPropertiesKHR structure to retrieve additional video encode quality level properties specific to H.265 encoding.

The VkVideoEncodeH265QualityLevelPropertiesKHR structure is defined as:

// Provided by VK_KHR_video_encode_h265
typedef struct VkVideoEncodeH265QualityLevelPropertiesKHR {
    VkStructureType                         sType;
    void*                                   pNext;
    VkVideoEncodeH265RateControlFlagsKHR    preferredRateControlFlags;
    uint32_t                                preferredGopFrameCount;
    uint32_t                                preferredIdrPeriod;
    uint32_t                                preferredConsecutiveBFrameCount;
    uint32_t                                preferredSubLayerCount;
    VkVideoEncodeH265QpKHR                  preferredConstantQp;
    uint32_t                                preferredMaxL0ReferenceCount;
    uint32_t                                preferredMaxL1ReferenceCount;
} VkVideoEncodeH265QualityLevelPropertiesKHR;
Valid Usage (Implicit)
  • VUID-VkVideoEncodeH265QualityLevelPropertiesKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_H265_QUALITY_LEVEL_PROPERTIES_KHR

H.265 Encode Session

Additional parameters can be specified when creating a video session with an H.265 encode profile by including an instance of the VkVideoEncodeH265SessionCreateInfoKHR structure in the pNext chain of VkVideoSessionCreateInfoKHR.

The VkVideoEncodeH265SessionCreateInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_h265
typedef struct VkVideoEncodeH265SessionCreateInfoKHR {
    VkStructureType         sType;
    const void*             pNext;
    VkBool32                useMaxLevelIdc;
    StdVideoH265LevelIdc    maxLevelIdc;
} VkVideoEncodeH265SessionCreateInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • useMaxLevelIdc indicates whether the value of maxLevelIdc should be used by the implementation. When it is VK_FALSE, the implementation ignores the value of maxLevelIdc and uses the value of VkVideoEncodeH265CapabilitiesKHR::maxLevelIdc, as reported by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile.

  • maxLevelIdc is a StdVideoH265LevelIdc value specifying the upper bound on the H.265 level for the video bitstreams produced by the created video session, where enum constant STD_VIDEO_H265_LEVEL_IDC_<major>_<minor> identifies H.265 level <major>.<minor> as defined in section A.4 of the ITU-T H.265 Specification.

Valid Usage (Implicit)
  • VUID-VkVideoEncodeH265SessionCreateInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_H265_SESSION_CREATE_INFO_KHR

H.265 Encode Parameter Sets

Video session parameters objects created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR can contain the following types of parameters:

H.265 Video Parameter Sets (VPS)

Represented by StdVideoH265VideoParameterSet structures and interpreted as follows:

  • reserved1, reserved2, and reserved3 are used only for padding purposes and are otherwise ignored;

  • vps_video_parameter_set_id is used as the key of the VPS entry;

  • the max_latency_increase_plus1, max_dec_pic_buffering_minus1, and max_num_reorder_pics members of the StdVideoH265DecPicBufMgr structure pointed to by pDecPicBufMgr correspond to vps_max_latency_increase_plus1, vps_max_dec_pic_buffering_minus1, and vps_max_num_reorder_pics, respectively, as defined in section 7.4.3.1 of the ITU-T H.265 Specification;

  • the StdVideoH265HrdParameters structure pointed to by pHrdParameters is interpreted as follows:

    • reserved is used only for padding purposes and is otherwise ignored;

    • flags.fixed_pic_rate_general_flag is a bitmask where bit index i corresponds to fixed_pic_rate_general_flag[i] as defined in section E.3.2 of the ITU-T H.265 Specification;

    • flags.fixed_pic_rate_within_cvs_flag is a bitmask where bit index i corresponds to fixed_pic_rate_within_cvs_flag[i] as defined in section E.3.2 of the ITU-T H.265 Specification;

    • flags.low_delay_hrd_flag is a bitmask where bit index i corresponds to low_delay_hrd_flag[i] as defined in section E.3.2 of the ITU-T H.265 Specification;

    • if flags.nal_hrd_parameters_present_flag is set, then pSubLayerHrdParametersNal is a pointer to an array of vps_max_sub_layers_minus1 + 1 number of StdVideoH265SubLayerHrdParameters structures where vps_max_sub_layers_minus1 is the corresponding member of the encompassing StdVideoH265VideoParameterSet structure and each element is interpreted as follows:

      • cbr_flag is a bitmask where bit index i corresponds to cbr_flag[i] as defined in section E.3.3 of the ITU-T H.265 Specification;

      • all other members of the StdVideoH265SubLayerHrdParameters structure are interpreted as defined in section E.3.3 of the ITU-T H.265 Specification;

    • if flags.vcl_hrd_parameters_present_flag is set, then pSubLayerHrdParametersVcl is a pointer to an array of vps_max_sub_layers_minus1 + 1 number of StdVideoH265SubLayerHrdParameters structures where vps_max_sub_layers_minus1 is the corresponding member of the encompassing StdVideoH265VideoParameterSet structure and each element is interpreted as follows:

      • cbr_flag is a bitmask where bit index i corresponds to cbr_flag[i] as defined in section E.3.3 of the ITU-T H.265 Specification;

      • all other members of the StdVideoH265SubLayerHrdParameters structure are interpreted as defined in section E.3.3 of the ITU-T H.265 Specification;

    • all other members of StdVideoH265HrdParameters are interpreted as defined in section E.3.2 of the ITU-T H.265 Specification;

  • the StdVideoH265ProfileTierLevel structure pointed to by pProfileTierLevel are interpreted as follows:

    • general_level_idc is one of the enum constants STD_VIDEO_H265_LEVEL_IDC_<major>_<minor> identifying the H.265 level <major>.<minor> as defined in section A.4 of the ITU-T H.265 Specification;

    • all other members of StdVideoH265ProfileTierLevel are interpreted as defined in section 7.4.4 of the ITU-T H.265 Specification;

  • all other members of StdVideoH265VideoParameterSet are interpreted as defined in section 7.4.3.1 of the ITU-T H.265 Specification.

H.265 Sequence Parameter Sets (SPS)

Represented by StdVideoH265SequenceParameterSet structures and interpreted as follows:

  • reserved1 and reserved2 are used only for padding purposes and are otherwise ignored;

  • the pair constructed from sps_video_parameter_set_id and sps_seq_parameter_set_id is used as the key of the SPS entry;

  • the StdVideoH265ProfileTierLevel structure pointed to by pProfileTierLevel are interpreted as follows:

    • general_level_idc is one of the enum constants STD_VIDEO_H265_LEVEL_IDC_<major>_<minor> identifying the H.265 level <major>.<minor> as defined in section A.4 of the ITU-T H.265 Specification;

    • all other members of StdVideoH265ProfileTierLevel are interpreted as defined in section 7.4.4 of the ITU-T H.265 Specification;

  • the max_latency_increase_plus1, max_dec_pic_buffering_minus1, and max_num_reorder_pics members of the StdVideoH265DecPicBufMgr structure pointed to by pDecPicBufMgr correspond to sps_max_latency_increase_plus1, sps_max_dec_pic_buffering_minus1, and sps_max_num_reorder_pics, respectively, as defined in section 7.4.3.2 of the ITU-T H.265 Specification;

  • if flags.sps_scaling_list_data_present_flag is set, then the StdVideoH265ScalingLists structure pointed to by pScalingLists is interpreted as follows:

    • ScalingList4x4, ScalingList8x8, ScalingList16x16, and ScalingList32x32 correspond to ScalingList[0], ScalingList[1], ScalingList[2], and ScalingList[3], respectively, as defined in section 7.3.4 of the ITU-T H.265 Specification;

    • ScalingListDCCoef16x16 and ScalingListDCCoef32x32 correspond to scaling_list_dc_coef_minus8[0] and scaling_list_dc_coef_minus8[1], respectively, as defined in section 7.3.4 of the ITU-T H.265 Specification;

  • pShortTermRefPicSet is a pointer to an array of num_short_term_ref_pic_sets number of StdVideoH265ShortTermRefPicSet structures where each element is interpreted as follows:

    • reserved1, reserved2, and reserved3 are used only for padding purposes and are otherwise ignored;

    • used_by_curr_pic_flag is a bitmask where bit index i corresponds to used_by_curr_pic_flag[i] as defined in section 7.4.8 of the ITU-T H.265 Specification;

    • use_delta_flag is a bitmask where bit index i corresponds to use_delta_flag[i] as defined in section 7.4.8 of the ITU-T H.265 Specification;

    • used_by_curr_pic_s0_flag is a bitmask where bit index i corresponds to used_by_curr_pic_s0_flag[i] as defined in section 7.4.8 of the ITU-T H.265 Specification;

    • used_by_curr_pic_s1_flag is a bitmask where bit index i corresponds to used_by_curr_pic_s1_flag[i] as defined in section 7.4.8 of the ITU-T H.265 Specification;

    • all other members of StdVideoH265ShortTermRefPicSet are interpreted as defined in section 7.4.8 of the ITU-T H.265 Specification;

  • if flags.long_term_ref_pics_present_flag is set then the StdVideoH265LongTermRefPicsSps structure pointed to by pLongTermRefPicsSps is interpreted as follows:

    • used_by_curr_pic_lt_sps_flag is a bitmask where bit index i corresponds to used_by_curr_pic_lt_sps_flag[i] as defined in section 7.4.3.2 of the ITU-T H.265 Specification;

    • all other members of StdVideoH265LongTermRefPicsSps are interpreted as defined in section 7.4.3.2 of the ITU-T H.265 Specification;

  • if flags.vui_parameters_present_flag is set, then the StdVideoH265SequenceParameterSetVui structure pointed to by pSequenceParameterSetVui is interpreted as follows:

    • reserved1, reserved2, and reserved3 are used only for padding purposes and are otherwise ignored;

    • the StdVideoH265HrdParameters structure pointed to by pHrdParameters is interpreted as follows:

      • flags.fixed_pic_rate_general_flag is a bitmask where bit index i corresponds to fixed_pic_rate_general_flag[i] as defined in section E.3.2 of the ITU-T H.265 Specification;

      • flags.fixed_pic_rate_within_cvs_flag is a bitmask where bit index i corresponds to fixed_pic_rate_within_cvs_flag[i] as defined in section E.3.2 of the ITU-T H.265 Specification;

      • flags.low_delay_hrd_flag is a bitmask where bit index i corresponds to low_delay_hrd_flag[i] as defined in section E.3.2 of the ITU-T H.265 Specification;

      • if flags.nal_hrd_parameters_present_flag is set, then pSubLayerHrdParametersNal is a pointer to an array of sps_max_sub_layers_minus1 + 1 number of StdVideoH265SubLayerHrdParameters structures where sps_max_sub_layers_minus1 is the corresponding member of the encompassing StdVideoH265SequenceParameterSet structure and each element is interpreted as follows:

        • cbr_flag is a bitmask where bit index i corresponds to cbr_flag[i] as defined in section E.3.3 of the ITU-T H.265 Specification;

        • all other members of the StdVideoH265SubLayerHrdParameters structure are interpreted as defined in section E.3.3 of the ITU-T H.265 Specification;

      • if flags.vcl_hrd_parameters_present_flag is set, then pSubLayerHrdParametersVcl is a pointer to an array of sps_max_sub_layers_minus1 + 1 number of StdVideoH265SubLayerHrdParameters structures where sps_max_sub_layers_minus1 is the corresponding member of the encompassing StdVideoH265SequenceParameterSet structure and each element is interpreted as follows:

        • cbr_flag is a bitmask where bit index i corresponds to cbr_flag[i] as defined in section E.3.3 of the ITU-T H.265 Specification;

        • all other members of the StdVideoH265SubLayerHrdParameters structure are interpreted as defined in section E.3.3 of the ITU-T H.265 Specification;

      • all other members of StdVideoH265HrdParameters are interpreted as defined in section E.3.2 of the ITU-T H.265 Specification;

    • all other members of pSequenceParameterSetVui are interpreted as defined in section E.3.1 of the ITU-T H.265 Specification;

  • if flags.sps_palette_predictor_initializer_present_flag is set, then the PredictorPaletteEntries member of the StdVideoH265PredictorPaletteEntries structure pointed to by pPredictorPaletteEntries is interpreted as defined in section 7.4.9.13 of the ITU-T H.265 Specification;

  • all other members of StdVideoH265SequenceParameterSet are interpreted as defined in section 7.4.3.1 of the ITU-T H.265 Specification.

H.265 Picture Parameter Sets (PPS)

Represented by StdVideoH265PictureParameterSet structures and interpreted as follows:

  • reserved1, reserved2, and reserved3 are used only for padding purposes and are otherwise ignored;

  • the triplet constructed from sps_video_parameter_set_id, pps_seq_parameter_set_id, and pps_pic_parameter_set_id is used as the key of the PPS entry;

  • if flags.pps_scaling_list_data_present_flag is set, then the StdVideoH265ScalingLists structure pointed to by pScalingLists is interpreted as follows:

    • ScalingList4x4, ScalingList8x8, ScalingList16x16, and ScalingList32x32 correspond to ScalingList[0], ScalingList[1], ScalingList[2], and ScalingList[3], respectively, as defined in section 7.3.4 of the ITU-T H.265 Specification;

    • ScalingListDCCoef16x16 and ScalingListDCCoef32x32 correspond to scaling_list_dc_coef_minus8[0] and scaling_list_dc_coef_minus8[1], respectively, as defined in section 7.3.4 of the ITU-T H.265 Specification;

  • if flags.pps_palette_predictor_initializer_present_flag is set, then the PredictorPaletteEntries member of the StdVideoH265PredictorPaletteEntries structure pointed to by pPredictorPaletteEntries is interpreted as defined in section 7.4.9.13 of the ITU-T H.265 Specification;

  • all other members of StdVideoH265PictureParameterSet are interpreted as defined in section 7.4.3.3 of the ITU-T H.265 Specification.

Implementations may override any of these parameters according to the semantics defined in the Video Encode Parameter Overrides section before storing the resulting H.265 parameter sets into the video session parameters object. Applications need to use the vkGetEncodedVideoSessionParametersKHR command to determine whether any implementation overrides happened and to retrieve the encoded H.265 parameter sets in order to be able to produce a compliant H.265 video bitstream.

Such H.265 parameter set overrides may also have cascading effects on the implementation overrides applied to the encoded bitstream produced by video encode operations. If the implementation supports the VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_HAS_OVERRIDES_BIT_KHR video encode feedback query flag, then the application can use such queries to retrieve feedback about whether any implementation overrides have been applied to the encoded bitstream.

When a video session parameters object is created with the codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, the VkVideoSessionParametersCreateInfoKHR::pNext chain must include a VkVideoEncodeH265SessionParametersCreateInfoKHR structure specifying the capacity and initial contents of the object.

The VkVideoEncodeH265SessionParametersCreateInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_h265
typedef struct VkVideoEncodeH265SessionParametersCreateInfoKHR {
    VkStructureType                                        sType;
    const void*                                            pNext;
    uint32_t                                               maxStdVPSCount;
    uint32_t                                               maxStdSPSCount;
    uint32_t                                               maxStdPPSCount;
    const VkVideoEncodeH265SessionParametersAddInfoKHR*    pParametersAddInfo;
} VkVideoEncodeH265SessionParametersCreateInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • maxStdVPSCount is the maximum number of H.265 VPS entries the created VkVideoSessionParametersKHR can contain.

  • maxStdSPSCount is the maximum number of H.265 SPS entries the created VkVideoSessionParametersKHR can contain.

  • maxStdPPSCount is the maximum number of H.265 PPS entries the created VkVideoSessionParametersKHR can contain.

  • pParametersAddInfo is NULL or a pointer to a VkVideoEncodeH265SessionParametersAddInfoKHR structure specifying H.265 parameters to add upon object creation.

Valid Usage (Implicit)
  • VUID-VkVideoEncodeH265SessionParametersCreateInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_H265_SESSION_PARAMETERS_CREATE_INFO_KHR

  • VUID-VkVideoEncodeH265SessionParametersCreateInfoKHR-pParametersAddInfo-parameter
    If pParametersAddInfo is not NULL, pParametersAddInfo must be a valid pointer to a valid VkVideoEncodeH265SessionParametersAddInfoKHR structure

The VkVideoEncodeH265SessionParametersAddInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_h265
typedef struct VkVideoEncodeH265SessionParametersAddInfoKHR {
    VkStructureType                            sType;
    const void*                                pNext;
    uint32_t                                   stdVPSCount;
    const StdVideoH265VideoParameterSet*       pStdVPSs;
    uint32_t                                   stdSPSCount;
    const StdVideoH265SequenceParameterSet*    pStdSPSs;
    uint32_t                                   stdPPSCount;
    const StdVideoH265PictureParameterSet*     pStdPPSs;
} VkVideoEncodeH265SessionParametersAddInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • stdVPSCount is the number of elements in the pStdVPSs array.

  • pStdVPSs is a pointer to an array of StdVideoH265VideoParameterSet structures describing the H.265 VPS entries to add.

  • stdSPSCount is the number of elements in the pStdSPSs array.

  • pStdSPSs is a pointer to an array of StdVideoH265SequenceParameterSet structures describing the H.265 SPS entries to add.

  • stdPPSCount is the number of elements in the pStdPPSs array.

  • pStdPPSs is a pointer to an array of StdVideoH265PictureParameterSet structures describing the H.265 PPS entries to add.

This structure can be specified in the following places:

Valid Usage (Implicit)
  • VUID-VkVideoEncodeH265SessionParametersAddInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_H265_SESSION_PARAMETERS_ADD_INFO_KHR

  • VUID-VkVideoEncodeH265SessionParametersAddInfoKHR-pStdVPSs-parameter
    If stdVPSCount is not 0, and pStdVPSs is not NULL, pStdVPSs must be a valid pointer to an array of stdVPSCount StdVideoH265VideoParameterSet values

  • VUID-VkVideoEncodeH265SessionParametersAddInfoKHR-pStdSPSs-parameter
    If stdSPSCount is not 0, and pStdSPSs is not NULL, pStdSPSs must be a valid pointer to an array of stdSPSCount StdVideoH265SequenceParameterSet values

  • VUID-VkVideoEncodeH265SessionParametersAddInfoKHR-pStdPPSs-parameter
    If stdPPSCount is not 0, and pStdPPSs is not NULL, pStdPPSs must be a valid pointer to an array of stdPPSCount StdVideoH265PictureParameterSet values

Valid Usage
  • VUID-VkVideoEncodeH265SessionParametersAddInfoKHR-None-06438
    The vps_video_parameter_set_id member of each StdVideoH265VideoParameterSet structure specified in the elements of pStdVPSs must be unique within pStdVPSs

  • VUID-VkVideoEncodeH265SessionParametersAddInfoKHR-None-06439
    The pair constructed from the sps_video_parameter_set_id and sps_seq_parameter_set_id members of each StdVideoH265SequenceParameterSet structure specified in the elements of pStdSPSs must be unique within pStdSPSs

  • VUID-VkVideoEncodeH265SessionParametersAddInfoKHR-None-06440
    The triplet constructed from the sps_video_parameter_set_id, pps_seq_parameter_set_id, and pps_pic_parameter_set_id members of each StdVideoH265PictureParameterSet structure specified in the elements of pStdPPSs must be unique within pStdPPSs

The VkVideoEncodeH265SessionParametersGetInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_h265
typedef struct VkVideoEncodeH265SessionParametersGetInfoKHR {
    VkStructureType    sType;
    const void*        pNext;
    VkBool32           writeStdVPS;
    VkBool32           writeStdSPS;
    VkBool32           writeStdPPS;
    uint32_t           stdVPSId;
    uint32_t           stdSPSId;
    uint32_t           stdPPSId;
} VkVideoEncodeH265SessionParametersGetInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • writeStdVPS indicates whether the encoded H.265 video parameter set identified by stdVPSId is requested to be retrieved.

  • writeStdSPS indicates whether the encoded H.265 sequence parameter set identified by the pair constructed from stdVPSId and stdSPSId is requested to be retrieved.

  • writeStdPPS indicates whether the encoded H.265 picture parameter set identified by the triplet constructed from stdVPSId, stdSPSId, and stdPPSId is requested to be retrieved.

  • stdVPSId specifies the H.265 video parameter set ID used to identify the retrieved H.265 video, sequence, and/or picture parameter set(s).

  • stdSPSId specifies the H.265 sequence parameter set ID used to identify the retrieved H.265 sequence and/or picture parameter set(s) when writeStdSPS and/or writeStdPPS is VK_TRUE.

  • stdPPSId specifies the H.265 picture parameter set ID used to identify the retrieved H.265 picture parameter set when writeStdPPS is VK_TRUE.

When this structure is specified in the pNext chain of the VkVideoEncodeSessionParametersGetInfoKHR structure passed to vkGetEncodedVideoSessionParametersKHR, the command will write encoded parameter data to the output buffer in the following order:

  1. The H.265 video parameter set identified by stdVPSId, if writeStdVPS is VK_TRUE.

  2. The H.265 sequence parameter set identified by the pair constructed from stdVPSId and stdSPSId, if writeStdSPS is VK_TRUE.

  3. The H.265 picture parameter set identified by the triplet constructed from stdVPSId, stdSPSId, and stdPPSId, if writeStdPPS is VK_TRUE.

Valid Usage
  • VUID-VkVideoEncodeH265SessionParametersGetInfoKHR-writeStdVPS-08290
    At least one of writeStdVPS, writeStdSPS, and writeStdPPS must be VK_TRUE

Valid Usage (Implicit)
  • VUID-VkVideoEncodeH265SessionParametersGetInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_H265_SESSION_PARAMETERS_GET_INFO_KHR

The VkVideoEncodeH265SessionParametersFeedbackInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_h265
typedef struct VkVideoEncodeH265SessionParametersFeedbackInfoKHR {
    VkStructureType    sType;
    void*              pNext;
    VkBool32           hasStdVPSOverrides;
    VkBool32           hasStdSPSOverrides;
    VkBool32           hasStdPPSOverrides;
} VkVideoEncodeH265SessionParametersFeedbackInfoKHR;
Valid Usage (Implicit)
  • VUID-VkVideoEncodeH265SessionParametersFeedbackInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_H265_SESSION_PARAMETERS_FEEDBACK_INFO_KHR

H.265 Encoding Parameters

The VkVideoEncodeH265PictureInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_h265
typedef struct VkVideoEncodeH265PictureInfoKHR {
    VkStructureType                                    sType;
    const void*                                        pNext;
    uint32_t                                           naluSliceSegmentEntryCount;
    const VkVideoEncodeH265NaluSliceSegmentInfoKHR*    pNaluSliceSegmentEntries;
    const StdVideoEncodeH265PictureInfo*               pStdPictureInfo;
} VkVideoEncodeH265PictureInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • naluSliceSegmentEntryCount is the number of elements in pNaluSliceSegmentEntries.

  • pNaluSliceSegmentEntries is a pointer to an array of naluSliceSegmentEntryCount VkVideoEncodeH265NaluSliceSegmentInfoKHR structures specifying the parameters of the individual H.265 slice segments to encode for the input picture.

  • pStdPictureInfo is a pointer to a StdVideoEncodeH265PictureInfo structure specifying H.265 picture information.

This structure is specified in the pNext chain of the VkVideoEncodeInfoKHR structure passed to vkCmdEncodeVideoKHR to specify the codec-specific picture information for an H.265 encode operation.

Encode Input Picture Information

When this structure is specified in the pNext chain of the VkVideoEncodeInfoKHR structure passed to vkCmdEncodeVideoKHR, the information related to the encode input picture is defined as follows:

Std Picture Information

The members of the StdVideoEncodeH265PictureInfo structure pointed to by pStdPictureInfo are interpreted as follows:

  • flags.reserved and reserved1 are used only for padding purposes and are otherwise ignored;

  • flags.is_reference as defined in section 3.132 of the ITU-T H.265 Specification;

  • flags.IrapPicFlag as defined in section 3.73 of the ITU-T H.265 Specification;

  • flags.used_for_long_term_reference is used to indicate whether the picture is marked as “used for long-term reference” as defined in section 8.3.2 of the ITU-T H.265 Specification;

  • flags.discardable_flag and cross_layer_bla_flag as defined in section F.7.4.7.1 of the ITU-T H.265 Specification;

  • pic_type as defined in section 7.4.3.5 of the ITU-T H.265 Specification;

  • sps_video_parameter_set_id, pps_seq_parameter_set_id, and pps_pic_parameter_set_id are used to identify the active parameter sets, as described below;

  • PicOrderCntVal as defined in section 8.3.1 of the ITU-T H.265 Specification;

  • TemporalId as defined in section 7.4.2.2 of the ITU-T H.265 Specification;

  • if pRefLists is not NULL, then it is a pointer to a StdVideoEncodeH265ReferenceListsInfo structure that is interpreted as follows:

    • flags.reserved is used only for padding purposes and is otherwise ignored;

    • ref_pic_list_modification_flag_l0 and ref_pic_list_modification_flag_l1 as defined in section 7.4.7.2 of the ITU-T H.265 Specification;

    • num_ref_idx_l0_active_minus1 and num_ref_idx_l1_active_minus1 as defined in section 7.4.7.1 of the ITU-T H.265 Specification;

    • RefPicList0 and RefPicList1 as defined in section 8.3.4 of the ITU-T H.265 Specification where each element of these arrays either identifies an active reference picture using its DPB slot index or contains the value STD_VIDEO_H265_NO_REFERENCE_PICTURE to indicate “no reference picture”;

    • list_entry_l0 and list_entry_l1 as defined in section 7.4.7.2 of the ITU-T H.265 Specification;

  • if flags.short_term_ref_pic_set_sps_flag is set, then the StdVideoH265ShortTermRefPicSet structure pointed to by pShortTermRefPicSet is interpreted as defined for the elements of the pShortTermRefPicSet array specified in H.265 sequence parameter sets.

  • if flags.long_term_ref_pics_present_flag is set in the active SPS, then the StdVideoEncodeH265LongTermRefPics structure pointed to by pLongTermRefPics is interpreted as follows:

    • used_by_curr_pic_lt_flag is a bitmask where bit index i corresponds to used_by_curr_pic_lt_flag[i] as defined in section 7.4.7.1 of the ITU-T H.265 Specification;

    • all other members of StdVideoEncodeH265LongTermRefPics are interpreted as defined in section 7.4.7.1 of the ITU-T H.265 Specification;

  • all other members are interpreted as defined in section 7.4.7.1 of the ITU-T H.265 Specification.

Reference picture setup is controlled by the value of StdVideoEncodeH265PictureInfo::flags.is_reference. If it is set and a reconstructed picture is specified, then the latter is used as the target of picture reconstruction to activate the DPB slot specified in pEncodeInfo->pSetupReferenceSlot→slotIndex. If StdVideoEncodeH265PictureInfo::flags.is_reference is not set, but a reconstructed picture is specified, then the corresponding picture reference associated with the DPB slot is invalidated, as described in the DPB Slot States section.

Active Parameter Sets

The members of the StdVideoEncodeH265PictureInfo structure pointed to by pStdPictureInfo are used to select the active parameter sets to use from the bound video session parameters object, as follows:

  • The active VPS is the VPS identified by the key specified in StdVideoEncodeH265PictureInfo::sps_video_parameter_set_id.

  • The active SPS is the SPS identified by the key specified by the pair constructed from StdVideoEncodeH265PictureInfo::sps_video_parameter_set_id and StdVideoEncodeH265PictureInfo::pps_seq_parameter_set_id.

  • The active PPS is the PPS identified by the key specified by the triplet constructed from StdVideoEncodeH265PictureInfo::sps_video_parameter_set_id, StdVideoEncodeH265PictureInfo::pps_seq_parameter_set_id, and StdVideoEncodeH265PictureInfo::pps_pic_parameter_set_id.

H.265 encoding uses explicit weighted sample prediction for a slice segment, as defined in section 8.5.3.3.4 of the ITU-T H.265 Specification, if any of the following conditions are true for the active PPS and the pStdSliceSegmentHeader member of the corresponding element of pNaluSliceSegmentEntries:

  • pStdSliceSegmentHeader->slice_type is STD_VIDEO_H265_SLICE_TYPE_P and weighted_pred_flag is enabled in the active PPS.

  • pStdSliceSegmentHeader->slice_type is STD_VIDEO_H265_SLICE_TYPE_B and weighted_bipred_flag is enabled in the active PPS.

The number of H.265 tiles, as defined in section 3.174 of the ITU-T H.265 Specification, is derived from the num_tile_columns_minus1 and num_tile_rows_minus1 members of the active PPS as follows:

(num_tile_columns_minus1 + 1) × (num_tile_rows_minus1 + 1)

Valid Usage
Valid Usage (Implicit)
  • VUID-VkVideoEncodeH265PictureInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_H265_PICTURE_INFO_KHR

  • VUID-VkVideoEncodeH265PictureInfoKHR-pNaluSliceSegmentEntries-parameter
    pNaluSliceSegmentEntries must be a valid pointer to an array of naluSliceSegmentEntryCount valid VkVideoEncodeH265NaluSliceSegmentInfoKHR structures

  • VUID-VkVideoEncodeH265PictureInfoKHR-pStdPictureInfo-parameter
    pStdPictureInfo must be a valid pointer to a valid StdVideoEncodeH265PictureInfo value

  • VUID-VkVideoEncodeH265PictureInfoKHR-naluSliceSegmentEntryCount-arraylength
    naluSliceSegmentEntryCount must be greater than 0

The VkVideoEncodeH265NaluSliceSegmentInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_h265
typedef struct VkVideoEncodeH265NaluSliceSegmentInfoKHR {
    VkStructureType                                sType;
    const void*                                    pNext;
    int32_t                                        constantQp;
    const StdVideoEncodeH265SliceSegmentHeader*    pStdSliceSegmentHeader;
} VkVideoEncodeH265NaluSliceSegmentInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • constantQp is the QP to use for the slice segment if the current rate control mode configured for the video session is VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR.

  • pStdSliceSegmentHeader is a pointer to a StdVideoEncodeH265SliceSegmentHeader structure specifying H.265 slice segment header parameters for the slice segment.

Std Slice Segment Header Parameters

The members of the StdVideoEncodeH265SliceSegmentHeader structure pointed to by pStdSliceSegmentHeader are interpreted as follows:

  • flags.reserved and reserved1 are used only for padding purposes and are otherwise ignored;

  • if pWeightTable is not NULL, then it is a pointer to a StdVideoEncodeH265WeightTable that is interpreted as follows:

    • flags.luma_weight_l0_flag, flags.chroma_weight_l0_flag, flags.luma_weight_l1_flag, and flags.chroma_weight_l1_flag are bitmasks where bit index i corresponds to luma_weight_l0_flag[i], chroma_weight_l0_flag[i], luma_weight_l1_flag[i], and chroma_weight_l1_flag[i], respectively, as defined in section 7.4.7.3 of the ITU-T H.265 Specification;

    • all other members of StdVideoEncodeH265WeightTable are interpreted as defined in section 7.4.7.3 of the ITU-T H.265 Specification;

  • all other members are interpreted as defined in section 7.4.7.1 of the ITU-T H.265 Specification.

Valid Usage (Implicit)
  • VUID-VkVideoEncodeH265NaluSliceSegmentInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_H265_NALU_SLICE_SEGMENT_INFO_KHR

  • VUID-VkVideoEncodeH265NaluSliceSegmentInfoKHR-pNext-pNext
    pNext must be NULL

  • VUID-VkVideoEncodeH265NaluSliceSegmentInfoKHR-pStdSliceSegmentHeader-parameter
    pStdSliceSegmentHeader must be a valid pointer to a valid StdVideoEncodeH265SliceSegmentHeader value

The VkVideoEncodeH265DpbSlotInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_h265
typedef struct VkVideoEncodeH265DpbSlotInfoKHR {
    VkStructureType                           sType;
    const void*                               pNext;
    const StdVideoEncodeH265ReferenceInfo*    pStdReferenceInfo;
} VkVideoEncodeH265DpbSlotInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • pStdReferenceInfo is a pointer to a StdVideoEncodeH265ReferenceInfo structure specifying H.265 reference information.

This structure is specified in the pNext chain of VkVideoEncodeInfoKHR::pSetupReferenceSlot, if not NULL, and the pNext chain of the elements of VkVideoEncodeInfoKHR::pReferenceSlots to specify the codec-specific reference picture information for an H.265 encode operation.

Active Reference Picture Information

When this structure is specified in the pNext chain of the elements of VkVideoEncodeInfoKHR::pReferenceSlots, one element is added to the list of active reference pictures used by the video encode operation for each element of VkVideoEncodeInfoKHR::pReferenceSlots as follows:

Reconstructed Picture Information

When this structure is specified in the pNext chain of VkVideoEncodeInfoKHR::pSetupReferenceSlot, the information related to the reconstructed picture is defined as follows:

Std Reference Information

The members of the StdVideoEncodeH265ReferenceInfo structure pointed to by pStdReferenceInfo are interpreted as follows:

  • flags.reserved is used only for padding purposes and is otherwise ignored;

  • flags.used_for_long_term_reference is used to indicate whether the picture is marked as “used for long-term reference” as defined in section 8.3.2 of the ITU-T H.265 Specification;

  • flags.unused_for_reference is used to indicate whether the picture is marked as “unused for reference” as defined in section 8.3.2 of the ITU-T H.265 Specification;

  • pic_type as defined in section 7.4.3.5 of the ITU-T H.265 Specification;

  • PicOrderCntVal as defined in section 8.3.1 of the ITU-T H.265 Specification;

  • TemporalId as defined in section 7.4.2.2 of the ITU-T H.265 Specification.

Valid Usage (Implicit)
  • VUID-VkVideoEncodeH265DpbSlotInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_H265_DPB_SLOT_INFO_KHR

  • VUID-VkVideoEncodeH265DpbSlotInfoKHR-pStdReferenceInfo-parameter
    pStdReferenceInfo must be a valid pointer to a valid StdVideoEncodeH265ReferenceInfo value

H.265 Encode Rate Control

Group of Pictures

In case of H.265 encoding it is common practice to follow a regular pattern of different picture types in display order when encoding subsequent frames. This pattern is referred to as the group of pictures (GOP).

A regular GOP is defined by the following parameters:

  • The number of frames in the GOP;

  • The number of consecutive B frames between I and/or P frames in display order.

GOPs are further classified as open and closed GOPs.

Frame types in an open GOP follow each other in display order according to the following algorithm:

  1. The first frame is always an I frame.

  2. This is followed by a number of consecutive B frames, as defined above.

  3. If the number of frames in the GOP is not reached yet, then the next frame is a P frame and the algorithm continues from step 2.

h26x open gop
Figure 6. H.265 open GOP

In case of a closed GOP, an IDR frame is used at a certain period.

h26x closed gop
Figure 7. H.265 closed GOP

It is also typical for H.265 encoding to use specific reference picture usage patterns across the frames of the GOP. The two most common reference patterns used are as follows:

Flat Reference Pattern
  • Each P frame uses the last non-B frame, in display order, as reference.

  • Each B frame uses the last non-B frame, in display order, as its forward reference, and uses the next non-B frame, in display order, as its backward reference.

h26x ref pattern flat
Figure 8. H.265 flat reference pattern
Dyadic Reference Pattern
  • Each P frame uses the last non-B frame, in display order, as reference.

  • The following algorithm is applied to the sequence of consecutive B frames between I and/or P frames in display order:

    1. The B frame in the middle of this sequence uses the frame preceding the sequence as its forward reference, and uses the frame following the sequence as its backward reference.

    2. The algorithm is executed recursively for the following frame sequences:

      • The B frames of the original sequence preceding the frame in the middle, if any.

      • The B frames of the original sequence following the frame in the middle, if any.

h26x ref pattern dyadic
Figure 9. H.265 dyadic reference pattern

The application can provide guidance to the implementation’s rate control algorithm about the structure of the GOP used by the application. Any such guidance about the GOP and its structure does not mandate that specific GOP structure to be used by the application, as the picture type of individual encoded pictures is still application-controlled, however, any deviation from the provided guidance may result in undesired rate control behavior including, but not limited, to the implementation not being able to conform to the expected average or target bitrates, or other rate control parameters specified by the application.

When an H.265 encode session is used to encode multiple temporal sub-layers, it is also common practice to follow a regular pattern for the H.265 temporal ID for the encoded pictures in display order when encoding subsequent frames. This pattern is referred to as the temporal GOP. The most common temporal layer pattern used is as follows:

Dyadic Temporal Sub-Layer Pattern
  • The number of frames in the temporal GOP is 2n-1, where n is the number of temporal sub-layers.

  • The ith frame in the temporal GOP uses temporal ID t, if and only if the index of the least significant bit set in i equals n-t-1, except for the first frame, which is the only frame in the temporal GOP using temporal ID zero.

  • The ith frame in the temporal GOP uses the rth frame as reference, where r is calculated from i by clearing the least significant bit set in it, except for the first frame in the temporal GOP, which uses the first frame of the previous temporal GOP, if any, as reference.

h26x layer pattern dyadic
Figure 10. H.265 dyadic temporal sub-layer pattern

Multi-layer rate control and multi-layer coding are typically used for streaming cases where low latency is expected, hence B pictures with backward prediction are usually not used.

The VkVideoEncodeH265RateControlInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_h265
typedef struct VkVideoEncodeH265RateControlInfoKHR {
    VkStructureType                         sType;
    const void*                             pNext;
    VkVideoEncodeH265RateControlFlagsKHR    flags;
    uint32_t                                gopFrameCount;
    uint32_t                                idrPeriod;
    uint32_t                                consecutiveBFrameCount;
    uint32_t                                subLayerCount;
} VkVideoEncodeH265RateControlInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • flags is a bitmask of VkVideoEncodeH265RateControlFlagBitsKHR specifying H.265 rate control flags.

  • gopFrameCount is the number of frames within a group of pictures (GOP) intended to be used by the application. If it is 0, the rate control algorithm may assume an implementation-dependent GOP length. If it is UINT32_MAX, the GOP length is treated as infinite.

  • idrPeriod is the interval, in terms of number of frames, between two IDR frames (see IDR period). If it is 0, the rate control algorithm may assume an implementation-dependent IDR period. If it is UINT32_MAX, the IDR period is treated as infinite.

  • consecutiveBFrameCount is the number of consecutive B frames between I and/or P frames within the GOP.

  • subLayerCount specifies the number of H.265 sub-layers that the application intends to use.

When an instance of this structure is included in the pNext chain of the VkVideoCodingControlInfoKHR structure passed to the vkCmdControlVideoCodingKHR command, and VkVideoCodingControlInfoKHR::flags includes VK_VIDEO_CODING_CONTROL_ENCODE_RATE_CONTROL_BIT_KHR, the parameters in this structure are used as guidance for the implementation’s rate control algorithm (see Video Coding Control).

If flags includes VK_VIDEO_ENCODE_H265_RATE_CONTROL_ATTEMPT_HRD_COMPLIANCE_BIT_KHR, then the rate control state is reset to an initial state to meet HRD compliance requirements. Otherwise the new rate control state may be applied without a reset depending on the implementation and the specified rate control parameters.

It would be possible to infer the picture type to be used when encoding a frame, on the basis of the values provided for consecutiveBFrameCount, idrPeriod, and gopFrameCount, but this inferred picture type will not be used by implementations to override the picture type provided to the video encode operation.

Valid Usage
  • VUID-VkVideoEncodeH265RateControlInfoKHR-flags-08291
    If VkVideoEncodeH265CapabilitiesKHR::flags, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the used video profile, does not include VK_VIDEO_ENCODE_H265_CAPABILITY_HRD_COMPLIANCE_BIT_KHR, then flags must not contain VK_VIDEO_ENCODE_H265_RATE_CONTROL_ATTEMPT_HRD_COMPLIANCE_BIT_KHR

  • VUID-VkVideoEncodeH265RateControlInfoKHR-flags-08292
    If flags contains VK_VIDEO_ENCODE_H265_RATE_CONTROL_REFERENCE_PATTERN_FLAT_BIT_KHR or VK_VIDEO_ENCODE_H265_RATE_CONTROL_REFERENCE_PATTERN_DYADIC_BIT_KHR, then it must also contain VK_VIDEO_ENCODE_H265_RATE_CONTROL_REGULAR_GOP_BIT_KHR

  • VUID-VkVideoEncodeH265RateControlInfoKHR-flags-08293
    If flags contains VK_VIDEO_ENCODE_H265_RATE_CONTROL_REFERENCE_PATTERN_FLAT_BIT_KHR, then it must not also contain VK_VIDEO_ENCODE_H265_RATE_CONTROL_REFERENCE_PATTERN_DYADIC_BIT_KHR

  • VUID-VkVideoEncodeH265RateControlInfoKHR-flags-08294
    If flags contains VK_VIDEO_ENCODE_H265_RATE_CONTROL_REGULAR_GOP_BIT_KHR, then gopFrameCount must be greater than 0

  • VUID-VkVideoEncodeH265RateControlInfoKHR-idrPeriod-08295
    If idrPeriod is not 0, then it must be greater than or equal to gopFrameCount

  • VUID-VkVideoEncodeH265RateControlInfoKHR-consecutiveBFrameCount-08296
    If consecutiveBFrameCount is not 0, then it must be less than gopFrameCount

Valid Usage (Implicit)
  • VUID-VkVideoEncodeH265RateControlInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_H265_RATE_CONTROL_INFO_KHR

  • VUID-VkVideoEncodeH265RateControlInfoKHR-flags-parameter
    flags must be a valid combination of VkVideoEncodeH265RateControlFlagBitsKHR values

Bits which can be set in VkVideoEncodeH265RateControlInfoKHR::flags, specifying H.265 rate control flags, are:

// Provided by VK_KHR_video_encode_h265
typedef enum VkVideoEncodeH265RateControlFlagBitsKHR {
    VK_VIDEO_ENCODE_H265_RATE_CONTROL_ATTEMPT_HRD_COMPLIANCE_BIT_KHR = 0x00000001,
    VK_VIDEO_ENCODE_H265_RATE_CONTROL_REGULAR_GOP_BIT_KHR = 0x00000002,
    VK_VIDEO_ENCODE_H265_RATE_CONTROL_REFERENCE_PATTERN_FLAT_BIT_KHR = 0x00000004,
    VK_VIDEO_ENCODE_H265_RATE_CONTROL_REFERENCE_PATTERN_DYADIC_BIT_KHR = 0x00000008,
    VK_VIDEO_ENCODE_H265_RATE_CONTROL_TEMPORAL_SUB_LAYER_PATTERN_DYADIC_BIT_KHR = 0x00000010,
} VkVideoEncodeH265RateControlFlagBitsKHR;
  • VK_VIDEO_ENCODE_H265_RATE_CONTROL_ATTEMPT_HRD_COMPLIANCE_BIT_KHR specifies that rate control should attempt to produce an HRD compliant bitstream, as defined in annex C of the ITU-T H.265 Specification.

  • VK_VIDEO_ENCODE_H265_RATE_CONTROL_REGULAR_GOP_BIT_KHR specifies that the application intends to use a regular GOP structure according to the parameters specified in the gopFrameCount, idrPeriod, and consecutiveBFrameCount members of the VkVideoEncodeH265RateControlInfoKHR structure.

  • VK_VIDEO_ENCODE_H265_RATE_CONTROL_REFERENCE_PATTERN_FLAT_BIT_KHR specifies that the application intends to follow a flat reference pattern in the GOP.

  • VK_VIDEO_ENCODE_H265_RATE_CONTROL_REFERENCE_PATTERN_DYADIC_BIT_KHR specifies that the application intends to follow a dyadic reference pattern in the GOP.

  • VK_VIDEO_ENCODE_H265_RATE_CONTROL_TEMPORAL_SUB_LAYER_PATTERN_DYADIC_BIT_KHR specifies that the application intends to follow a dyadic temporal sub-layer pattern.

// Provided by VK_KHR_video_encode_h265
typedef VkFlags VkVideoEncodeH265RateControlFlagsKHR;

VkVideoEncodeH265RateControlFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoEncodeH265RateControlFlagBitsKHR.

Rate Control Layers

The VkVideoEncodeH265RateControlLayerInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_h265
typedef struct VkVideoEncodeH265RateControlLayerInfoKHR {
    VkStructureType                  sType;
    const void*                      pNext;
    VkBool32                         useMinQp;
    VkVideoEncodeH265QpKHR           minQp;
    VkBool32                         useMaxQp;
    VkVideoEncodeH265QpKHR           maxQp;
    VkBool32                         useMaxFrameSize;
    VkVideoEncodeH265FrameSizeKHR    maxFrameSize;
} VkVideoEncodeH265RateControlLayerInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • useMinQp indicates whether the QP values determined by rate control will be clamped to the lower bounds on the QP values specified in minQp.

  • minQp specifies the lower bounds on the QP values, for each picture type, that the implementation’s rate control algorithm will use when useMinQp is VK_TRUE.

  • useMaxQp indicates whether the QP values determined by rate control will be clamped to the upper bounds on the QP values specified in maxQp.

  • maxQp specifies the upper bounds on the QP values, for each picture type, that the implementation’s rate control algorithm will use when useMaxQp is VK_TRUE.

  • useMaxFrameSize indicates whether the implementation’s rate control algorithm should use the values specified in maxFrameSize as the upper bounds on the encoded frame size for each picture type.

  • maxFrameSize specifies the upper bounds on the encoded frame size, for each picture type, when useMaxFrameSize is VK_TRUE.

When used, the values in minQp and maxQp guarantee that the effective QP values used by the implementation will respect those lower and upper bounds, respectively. However, limiting the range of QP values that the implementation is able to use will also limit the capabilities of the implementation’s rate control algorithm to comply to other constraints. In particular, the implementation may not be able to comply to the following:

  • The average and/or peak bitrate values to be used for the encoded bitstream specified in the averageBitrate and maxBitrate members of the VkVideoEncodeRateControlLayerInfoKHR structure.

  • The upper bounds on the encoded frame size, for each picture type, specified in the maxFrameSize member of VkVideoEncodeH265RateControlLayerInfoKHR.

In general, applications need to configure rate control parameters appropriately in order to be able to get the desired rate control behavior, as described in the Video Encode Rate Control section.

When an instance of this structure is included in the pNext chain of a VkVideoEncodeRateControlLayerInfoKHR structure specified in one of the elements of the pLayers array member of the VkVideoEncodeRateControlInfoKHR structure passed to the vkCmdControlVideoCodingKHR command, VkVideoCodingControlInfoKHR::flags includes VK_VIDEO_CODING_CONTROL_ENCODE_RATE_CONTROL_BIT_KHR, and the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, it specifies the H.265-specific rate control parameters of the rate control layer corresponding to that element of pLayers.

Valid Usage
Valid Usage (Implicit)
  • VUID-VkVideoEncodeH265RateControlLayerInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_H265_RATE_CONTROL_LAYER_INFO_KHR

  • VUID-VkVideoEncodeH265RateControlLayerInfoKHR-minQp-parameter
    minQp must be a valid VkVideoEncodeH265QpKHR structure

  • VUID-VkVideoEncodeH265RateControlLayerInfoKHR-maxQp-parameter
    maxQp must be a valid VkVideoEncodeH265QpKHR structure

  • VUID-VkVideoEncodeH265RateControlLayerInfoKHR-maxFrameSize-parameter
    maxFrameSize must be a valid VkVideoEncodeH265FrameSizeKHR structure

The VkVideoEncodeH265QpKHR structure is defined as:

// Provided by VK_KHR_video_encode_h265
typedef struct VkVideoEncodeH265QpKHR {
    int32_t    qpI;
    int32_t    qpP;
    int32_t    qpB;
} VkVideoEncodeH265QpKHR;

The VkVideoEncodeH265FrameSizeKHR structure is defined as:

// Provided by VK_KHR_video_encode_h265
typedef struct VkVideoEncodeH265FrameSizeKHR {
    uint32_t    frameISize;
    uint32_t    framePSize;
    uint32_t    frameBSize;
} VkVideoEncodeH265FrameSizeKHR;
  • frameISize is the size in bytes to be used for I frames.

  • framePSize is the size in bytes to be used for P frames.

  • frameBSize is the size in bytes to be used for B frames.

GOP Remaining Frames

Besides session level rate control configuration, the application can specify the number of frames per frame type remaining in the group of pictures (GOP).

The VkVideoEncodeH265GopRemainingFrameInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_h265
typedef struct VkVideoEncodeH265GopRemainingFrameInfoKHR {
    VkStructureType    sType;
    const void*        pNext;
    VkBool32           useGopRemainingFrames;
    uint32_t           gopRemainingI;
    uint32_t           gopRemainingP;
    uint32_t           gopRemainingB;
} VkVideoEncodeH265GopRemainingFrameInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • useGopRemainingFrames indicates whether the implementation’s rate control algorithm should use the values specified in gopRemainingI, gopRemainingP, and gopRemainingB. If useGopRemainingFrames is VK_FALSE, then the values of gopRemainingI, gopRemainingP, and gopRemainingB are ignored.

  • gopRemainingI specifies the number of I frames the implementation’s rate control algorithm should assume to be remaining in the GOP prior to executing the video encode operation.

  • gopRemainingP specifies the number of P frames the implementation’s rate control algorithm should assume to be remaining in the GOP prior to executing the video encode operation.

  • gopRemainingB specifies the number of B frames the implementation’s rate control algorithm should assume to be remaining in the GOP prior to executing the video encode operation.

Setting useGopRemainingFrames to VK_TRUE and including this structure in the pNext chain of VkVideoBeginCodingInfoKHR is only mandatory if the VkVideoEncodeH265CapabilitiesKHR::requiresGopRemainingFrames reported for the used video profile is VK_TRUE. However, implementations may use these remaining frame counts, when specified, even when it is not required. In particular, when the application does not use a regular GOP structure, these values may provide additional guidance for the implementation’s rate control algorithm.

The VkVideoEncodeH265CapabilitiesKHR::prefersGopRemainingFrames capability is also used to indicate that the implementation’s rate control algorithm may operate more accurately if the application specifies the remaining frame counts using this structure.

As with other rate control guidance values, if the effective order and number of frames encoded by the application are not in line with the remaining frame counts specified in this structure at any given point, then the behavior of the implementation’s rate control algorithm may deviate from the one expected by the application.

Valid Usage (Implicit)
  • VUID-VkVideoEncodeH265GopRemainingFrameInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_H265_GOP_REMAINING_FRAME_INFO_KHR

H.265 QP Delta Maps

Quantization delta maps used with an H.265 encode profile are referred to as QP delta maps and their texels contain integer values representing QP delta values that are applied in the process of determining the quantization parameters of the encoded picture.

Accordingly, H.265 QP delta maps always have single channel integer formats, as reported in VkVideoFormatPropertiesKHR::format.

When the rate control mode is VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR, the QP delta values are added to the per slice segment constant QP values that, in effect, enable the application to explicitly control the used QP values at the granularity of the used quantization map texel size.

For all other rate control modes, the QP delta values can be used to offset the QP values that the rate control algorithm would otherwise produce.

H.265 Encode Quantization

Performing H.265 encode operations involves the process of assigning QP values to individual H.265 coding units. This process depends on the used rate control mode, as well as other encode and rate control parameters, as described below:

  • If the configured rate control mode is VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DEFAULT_KHR, then the QP value is initialized by the implementation-specific default rate control algorithm.

  • If the configured rate control mode is VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR, then the QP value is initialized from the constant QP value specified for the H.265 slice segment the coding unit is part of.

  • If the configured rate control mode is not VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DEFAULT_KHR or VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR, then the QP value is initialized by the corresponding rate control algorithm.

    • If the video encode operation is issued with a quantization delta map, the QP delta value corresponding to the coding unit, as fetched from the quantization map, is added to the previously determined QP value. If the fetched QP delta value falls outside the supported QP delta value range reported in the minQpDelta and maxQpDelta members of VkVideoEncodeH265QuantizationMapCapabilitiesKHR, then the QP value used for the coding unit becomes undefined.

    • If the video encode operation is issued with an emphasis map, the rate control will adjust the QP value based on the emphasis value corresponding to the coding unit, as fetched from the quantization map, according to the following equation:

      QPnew = f(QPprev,e)

      Where QPnew is the resulting QP value, QPprev is the previously determined QP value, e is the emphasis value corresponding to the coding unit, and f is an implementation-defined function for which the following implication is true:

      e1 < e2 ⇒ f(QP,e1) ≥ f(QP,e2)

      This means that lower emphasis values will result in higher QP values, whereas higher emphasis values will result in lower QP values, but the function is not strictly decreasing with respect to the input emphasis value for a given input QP value.

    • If clamping to minimum QP values is enabled in the applied rate control layer, then the QP value is clamped to the corresponding minimum QP value.

    • If clamping to maximum QP values is enabled in the applied rate control layer, then the QP value is clamped to the corresponding maximum QP value.

  • If VK_VIDEO_ENCODE_H265_CAPABILITY_CU_QP_DIFF_WRAPAROUND_BIT_KHR is not supported, then the determined QP value is clamped in such a way that the CuQpDeltaVal value of the encoded coding unit complies to the modified version of equation 8-283 of the ITU-T H.265 Specification.

    The effect of this is that the maximum QP difference across subsequent coding units is limited to the [-(26 + QpBdOffsetY / 2), 25 + QpBdOffsetY / 2] range and only has an observable change in behavior when the video encode operation is issued with a QP delta map.

  • In all cases, the final QP value is clamped to the QP value range supported by the video profile, as reported in the minQp and maxQp members of VkVideoEncodeH265CapabilitiesKHR.

H.265 Encode Requirements

This section described the required H.265 encoding capabilities for physical devices that have at least one queue family that supports the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_H265_BIT_KHR, as returned by vkGetPhysicalDeviceQueueFamilyProperties2 in VkQueueFamilyVideoPropertiesKHR::videoCodecOperations.

Table 9. Required Video Std Header Versions
Video Std Header Name Version

vulkan_video_codec_h265std_encode

1.0.0

Table 10. Required Video Capabilities
Video Capability Requirement Requirement Type1

VkVideoCapabilitiesKHR

flags

-

min

minBitstreamBufferOffsetAlignment

4096

max

minBitstreamBufferSizeAlignment

4096

max

pictureAccessGranularity

(64,64)

max

minCodedExtent

-

max

maxCodedExtent

-

min

maxDpbSlots

0

min

maxActiveReferencePictures

0

min

VkVideoEncodeCapabilitiesKHR

flags

-

min

rateControlModes

-

min

maxBitrate

128000

min

maxQualityLevels

1

min

encodeInputPictureGranularity

(64,64)

max

supportedEncodeFeedbackFlags

VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_BUFFER_OFFSET_BIT_KHR
VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_BYTES_WRITTEN_BIT_KHR

min

VkVideoEncodeH265CapabilitiesKHR

flags

-

min

maxLevelIdc

STD_VIDEO_H265_LEVEL_IDC_1_0

min

maxSliceSegmentCount

1

min

maxTiles

(1,1)

min

ctbSizes

at least one bit set

implementation-dependent

transformBlockSizes

at least one bit set

implementation-dependent

maxPPictureL0ReferenceCount

0

min

maxBPictureL0ReferenceCount

0

min

maxL1ReferenceCount

0

min

maxSubLayerCount

1

min

expectDyadicTemporalSubLayerPattern

-

implementation-dependent

minQp

-

max

maxQp

-

min

prefersGopRemainingFrames

-

implementation-dependent

requiresGopRemainingFrames

-

implementation-dependent

stdSyntaxFlags

-

min

VkVideoEncodeQuantizationMapCapabilitiesKHR

maxQuantizationMapExtent

- 2

min

VkVideoEncodeH265QuantizationMapCapabilitiesKHR

minQpDelta

- 3

max

maxQpDelta

- 3

min

1

The Requirement Type column specifies the requirement is either the minimum value all implementations must support, the maximum value all implementations must support, or the exact value all implementations must support. For bitmasks a minimum value is the least bits all implementations must set, but they may have additional bits set beyond this minimum.

2

If VkVideoCapabilitiesKHR::flags includes VK_VIDEO_ENCODE_CAPABILITY_QUANTIZATION_DELTA_MAP_BIT_KHR or VK_VIDEO_ENCODE_CAPABILITY_EMPHASIS_MAP_BIT_KHR, then the width and height members of maxQuantizationMapExtent must be greater than zero.

3

If VkVideoCapabilitiesKHR::flags includes VK_VIDEO_ENCODE_CAPABILITY_QUANTIZATION_DELTA_MAP_BIT_KHR, then maxQpDelta must be greater than minQpDelta.

AV1 Encode Operations

Video encode operations using an AV1 encode profile can be used to encode elementary video stream sequences compliant with the AV1 Specification.

Refer to the Preamble for information on how the Khronos Intellectual Property Rights Policy relates to normative references to external materials not created by Khronos.

This process is performed according to the video encode operation steps with the codec-specific semantics defined in section 7 of the AV1 Specification:

If the parameters adhere to the syntactic and semantic requirements defined in the corresponding sections of the AV1 Specification, as described above, and the DPB slots associated with the active reference pictures all refer to valid picture references, then the video encode operation will complete successfully. Otherwise, the video encode operation may complete unsuccessfully.

AV1 Encode Parameter Overrides

Implementations may override, unless otherwise specified, any of the AV1 encode parameters specified in the following Video Std structures:

  • StdVideoAV1SequenceHeader

  • StdVideoEncodeAV1DecoderModelInfo

  • StdVideoEncodeAV1OperatingPointInfo

  • StdVideoEncodeAV1PictureInfo

  • StdVideoEncodeAV1ReferenceInfo

All such AV1 encode parameter overrides must fulfill the conditions defined in the Video Encode Parameter Overrides section.

In addition, implementations must not override any of the following AV1 encode parameters:

  • the following parameters specified in StdVideoAV1SequenceHeader:

    • flags.still_picture

    • flags.enable_order_hint

    • flags.frame_id_numbers_present_flag

    • flags.film_grain_params_present

    • flags.timing_info_present_flag

    • flags.initial_display_delay_present_flag

    • delta_frame_id_length_minus_2

    • additional_frame_id_length_minus_1

    • order_hint_bits_minus_1

  • the following parameters specified in the StdVideoAV1ColorConfig structure pointed to by StdVideoAV1SequenceHeader::pColorConfig:

    • flags.mono_chrome

    • flags.color_range

    • BitDepth

    • subsampling_x

    • subsampling_y

    • color_primaries

    • transfer_characteristics

    • matrix_coefficients

    • chroma_sample_position

  • the following parameters specified in the StdVideoAV1TimingInfo structure pointed to by StdVideoAV1SequenceHeader::pTimingInfo:

    • flags.equal_picture_interval

    • num_units_in_display_tick

    • time_scale

    • num_ticks_per_picture_minus_1

  • the parameters specified in StdVideoEncodeAV1DecoderModelInfo

  • the parameters specified in StdVideoEncodeAV1OperatingPointInfo

  • the following parameters specified in StdVideoEncodeAV1PictureInfo:

    • flags.show_frame

    • flags.showable_frame

    • frame_type

    • frame_presentation_time

    • current_frame_id

    • order_hint

    • refresh_frame_flags

    • render_width_minus_1

    • render_height_minus_1

    • ref_order_hint

    • ref_frame_idx

    • delta_frame_id_minus_1

  • the following parameters specified in the StdVideoEncodeAV1ExtensionHeader structure pointed to by StdVideoEncodeAV1PictureInfo::pExtensionHeader when VkVideoEncodeAV1PictureInfoKHR::generateObuExtensionHeader is set to VK_TRUE:

    • temporal_id

    • spatial_id

If VkVideoEncodeAV1PictureInfoKHR::primaryReferenceCdfOnly is set to VK_TRUE for a video encode operation, the implementation will not override StdVideoEncodeAV1PictureInfo::primary_ref_frame.

Implementations supporting the VK_VIDEO_ENCODE_AV1_STD_PRIMARY_REF_FRAME_BIT_KHR AV1 syntax element capability reported in VkVideoEncodeAV1CapabilitiesKHR::stdSyntaxFlag also guarantee the use of the application-specified value for StdVideoEncodeAV1PictureInfo::primary_ref_frame. While support for VK_VIDEO_ENCODE_AV1_STD_PRIMARY_REF_FRAME_BIT_KHR guarantees that the implementation will not override the application-specified primary_ref_frame, it does not mandate the implementation to use the reference picture indicated by primary_ref_frame for sample prediction as implementations can always decide to use only a subset of the application-specified reference pictures for sample prediction. This means that implementations supporting VK_VIDEO_ENCODE_AV1_STD_PRIMARY_REF_FRAME_BIT_KHR may end up using the reference picture indicated by primary_ref_frame only for CDF data reference even if the application did not set VkVideoEncodeAV1PictureInfoKHR::primaryReferenceCdfOnly to VK_TRUE.

If VkVideoEncodeAV1CapabilitiesKHR::codedPictureAlignment is not equal to {8,8} for the used video profile, implementations will override the coded picture’s resolution and parameters related to the width and height in the following manner:

  • Let w and h be the codedExtent.width and codedExtent.height of the VkVideoPictureResourceInfoKHR structure corresponding to the encode input picture, rounded up to the nearest integer multiple of 8.

  • Let aW and aH be w and h rounded up to the nearest integer multiple of codedPictureAlignment.width and codedPictureAlignment.height respectively.

  • If w equals aW, no override will occur. Otherwise the coded width will be aW.

  • If h equals aH, no override will occur. Otherwise the coded height will be aH.

The AV1 specification codes all resolutions to an 8x8 alignment, but supports unaligned resolutions through implicit cropping. Thus, if the original coded extent, aligned to 8x8, meets the implementation required alignment, no override needs to occur. Otherwise, the implementation cannot code the requested coded extent, so the final resolution in the bitstream is overridden to be aligned to the implementation required alignment.

For example, consider an implementation that is only able to output bitstreams that are 16x16 aligned (as indicated by VkVideoEncodeAV1CapabilitiesKHR::codedPictureAlignment). If an application requests the coded extent to be 1920x1080, the resulting bitstream will be 1920x1088. On the other hand, a request of 1920x1082 will result in no override, since the 8x8 alignment of this resolution (1920x1088) is 16x16 aligned.

In case of a video session parameters object created with VK_VIDEO_SESSION_PARAMETERS_CREATE_QUANTIZATION_MAP_COMPATIBLE_BIT_KHR, the following AV1 sequence header parameters may be overridden by the implementation according to the quantization map texel size the video session parameters object was created with:

  • StdVideoAV1SequenceHeader::flags.use_128x128_superblock

This may be necessary to change the AV1 superblock size used during encoding to be compatible with the used quantization map texel size.

In case of AV1 encode parameters stored in video session parameters objects, applications need to use the vkGetEncodedVideoSessionParametersKHR command to determine whether any implementation overrides happened. If the query indicates that implementation overrides were applied, then the application needs to retrieve and use the encoded AV1 sequence header in the bitstream in order to be able to produce a compliant AV1 video bitstream using the AV1 encode parameters stored in the video session parameters object.

In case of any AV1 encode parameters stored in the encoded bitstream produced by video encode operations, if the implementation supports the VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_HAS_OVERRIDES_BIT_KHR video encode feedback query flag, the application can use such queries to retrieve feedback about whether any implementation overrides have been applied to those AV1 encode parameters.

AV1 Encode Bitstream Data Access

Each video encode operation writes either:

  • A single OBU with obu_type equal to OBU_FRAME comprising of the frame header and tile data of the encoded picture, or

  • An OBU with obu_type equal to OBU_FRAME_HEADER encapsulating the frame header of the encoded picture, followed by one or more OBUs with obu_type equal to OBU_TILE_GROUP comprising of the tile data of the encoded picture.

In addition, if VkVideoEncodeAV1PictureInfoKHR::generateObuExtensionHeader is set to VK_TRUE for the video encode operation, then OBU extension headers are included in the generated bitstream as defined in sections 5.3.1, 5.3.2, and 5.3.3 of the AV1 Specification.

AV1 Encode Picture Data Access

Accesses to image data within a video picture resource happen at the granularity indicated by VkVideoCapabilitiesKHR::pictureAccessGranularity, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the used video profile. Accordingly, the complete image subregion of a encode input picture, reference picture, or reconstructed picture accessed by video coding operations using an AV1 encode profile is defined as the set of texels within the coordinate range:

([0,endX),[0,endY))

Where:

  • endX equals codedExtent.width rounded up to the nearest integer multiple of pictureAccessGranularity.width and clamped to the width of the image subresource referred to by the corresponding VkVideoPictureResourceInfoKHR structure;

  • endY equals codedExtent.height rounded up to the nearest integer multiple of pictureAccessGranularity.height and clamped to the height of the image subresource referred to by the corresponding VkVideoPictureResourceInfoKHR structure;

Where codedExtent is the member of the VkVideoPictureResourceInfoKHR structure corresponding to the picture.

In case of video encode operations using an AV1 encode profile, any access to a picture at the coordinates (x,y), as defined by the AV1 Specification, is an access to the image subresource referred to by the corresponding VkVideoPictureResourceInfoKHR structure at the texel coordinates (x,y).

Implementations may choose not to access some or all texels within particular reference pictures available to a video encode operation (e.g. due to video encode parameter overrides restricting the effective set of used reference pictures, or if the encoding algorithm chooses not to use certain subregions of the reference picture data for sample prediction).

AV1 Reference Names and Semantics

Individual reference frames used in the encoding process have different semantics, as defined in section 6.10.24 of the AV1 Specification. The AV1 semantics associated with a reference picture is indicated by the corresponding enumeration constant defined in the Video Std enumeration type StdVideoAV1ReferenceName:

  • STD_VIDEO_AV1_REFERENCE_NAME_INTRA_FRAME identifies the reference used for intra coding (INTRA_FRAME), as defined in sections 2 and 7.11.2 of the AV1 Specification.

  • All other enumeration constants refer to backward or forward references used for inter coding, as defined in sections 2 and 7.11.3 of the AV1 Specification:

    • STD_VIDEO_AV1_REFERENCE_NAME_LAST_FRAME identifies the LAST_FRAME reference

    • STD_VIDEO_AV1_REFERENCE_NAME_LAST2_FRAME identifies the LAST2_FRAME reference

    • STD_VIDEO_AV1_REFERENCE_NAME_LAST3_FRAME identifies the LAST3_FRAME reference

    • STD_VIDEO_AV1_REFERENCE_NAME_GOLDEN_FRAME identifies the GOLDEN_FRAME reference

    • STD_VIDEO_AV1_REFERENCE_NAME_BWDREF_FRAME identifies the BWDREF_FRAME reference

    • STD_VIDEO_AV1_REFERENCE_NAME_ALTREF2_FRAME identifies the ALTREF2_FRAME reference

    • STD_VIDEO_AV1_REFERENCE_NAME_ALTREF_FRAME identifies the ALTREF_FRAME reference

These enumeration constants are not directly used in any APIs but are used to indirectly index into certain Video Std and Vulkan API parameter arrays.

AV1 Prediction Modes

AV1 encoding supports multiple types of prediction modes, as described in section 6.10.24 of the AV1 Specification.

Possible AV1 encode prediction modes are as follows:

// Provided by VK_KHR_video_encode_av1
typedef enum VkVideoEncodeAV1PredictionModeKHR {
    VK_VIDEO_ENCODE_AV1_PREDICTION_MODE_INTRA_ONLY_KHR = 0,
    VK_VIDEO_ENCODE_AV1_PREDICTION_MODE_SINGLE_REFERENCE_KHR = 1,
    VK_VIDEO_ENCODE_AV1_PREDICTION_MODE_UNIDIRECTIONAL_COMPOUND_KHR = 2,
    VK_VIDEO_ENCODE_AV1_PREDICTION_MODE_BIDIRECTIONAL_COMPOUND_KHR = 3,
} VkVideoEncodeAV1PredictionModeKHR;
  • VK_VIDEO_ENCODE_AV1_PREDICTION_MODE_INTRA_ONLY_KHR specifies the use of intra-only prediction mode, used when encoding AV1 frames of type STD_VIDEO_AV1_FRAME_TYPE_KEY or STD_VIDEO_AV1_FRAME_TYPE_INTRA_ONLY.

  • VK_VIDEO_ENCODE_AV1_PREDICTION_MODE_SINGLE_REFERENCE_KHR specifies the use of single reference prediction mode, used when encoding AV1 frames of type STD_VIDEO_AV1_FRAME_TYPE_INTER or STD_VIDEO_AV1_FRAME_TYPE_SWITCH with reference_select, as defined in section 6.8.23 of the AV1 Specification, equal to 0. When using this prediction mode, the application must specify a reference picture for at least one AV1 reference name in VkVideoEncodeAV1PictureInfoKHR::referenceNameSlotIndices that is supported by the implementation, as reported in VkVideoEncodeAV1CapabilitiesKHR::singleReferenceNameMask.

  • VK_VIDEO_ENCODE_AV1_PREDICTION_MODE_UNIDIRECTIONAL_COMPOUND_KHR specifies the use of unidirectional compound prediction mode, used when encoding AV1 frames of type STD_VIDEO_AV1_FRAME_TYPE_INTER or STD_VIDEO_AV1_FRAME_TYPE_SWITCH with reference_select, as defined in section 6.8.23 of the AV1 Specification, equal to 1, and both reference names used for prediction are from the same reference frame group, as defined in section 6.10.24 of the AV1 Specification. When using this prediction mode, the application must specify a reference picture for at least two AV1 reference names in VkVideoEncodeAV1PictureInfoKHR::referenceNameSlotIndices that is supported by the implementation, as reported in VkVideoEncodeAV1CapabilitiesKHR::unidirectionalCompoundReferenceNameMask, where those two reference names are one of the allowed pairs of reference names, as defined in section 5.11.25 of the AV1 Specification, listed below:

    • LAST_FRAME and LAST2_FRAME,

    • LAST_FRAME and LAST3_FRAME,

    • LAST_FRAME and GOLDEN_FRAME, or

    • BWDREF_FRAME and ALTREF_FRAME.

  • VK_VIDEO_ENCODE_AV1_PREDICTION_MODE_BIDIRECTIONAL_COMPOUND_KHR specifies the use of bidirectional compound prediction mode, used when encoding AV1 frames of type STD_VIDEO_AV1_FRAME_TYPE_INTER or STD_VIDEO_AV1_FRAME_TYPE_SWITCH with reference_select, as defined in section 6.8.23 of the AV1 Specification, equal to 1, and the two reference names used for prediction are from different reference frame groups, as defined in section 6.10.24 of the AV1 Specification. When using this prediction mode, the application must specify a reference picture for at least one AV1 reference name from each reference frame group in VkVideoEncodeAV1PictureInfoKHR::referenceNameSlotIndices that is supported by the implementation, as reported in VkVideoEncodeAV1CapabilitiesKHR::bidirectionalCompoundReferenceNameMask.

The effective prediction mode used to encode individual AV1 mode info blocks may use simpler prediction modes than the one set by the application for the frame, as allowed by the AV1 Specification, in particular:

  • Frames encoded with single reference prediction mode may contain mode info blocks encoded with intra-only prediction mode.

  • Frames encoded with unidirectional compound prediction mode may contain mode info blocks encoded with intra-only or single reference prediction mode.

  • Frames encoded with bidirectional compound prediction mode may contain mode info blocks encoded with intra-only, single reference, or unidirectional compound prediction mode.

AV1 Coding Blocks

AV1 encode supports two types of coding blocks, as defined in section 2 of the AV1 Specification:

  • Superblock.

  • Mode info block.

AV1 Encode Profile

A video profile supporting AV1 video encode operations is specified by setting VkVideoProfileInfoKHR::videoCodecOperation to VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR and adding a VkVideoEncodeAV1ProfileInfoKHR structure to the VkVideoProfileInfoKHR::pNext chain.

The VkVideoEncodeAV1ProfileInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_av1
typedef struct VkVideoEncodeAV1ProfileInfoKHR {
    VkStructureType       sType;
    const void*           pNext;
    StdVideoAV1Profile    stdProfile;
} VkVideoEncodeAV1ProfileInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • stdProfile is a StdVideoAV1Profile value specifying the AV1 codec profile, as defined in section A.2 of the AV1 Specification.

Valid Usage (Implicit)
  • VUID-VkVideoEncodeAV1ProfileInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_AV1_PROFILE_INFO_KHR

AV1 Encode Capabilities

When calling vkGetPhysicalDeviceVideoCapabilitiesKHR to query the capabilities for an AV1 encode profile, the VkVideoCapabilitiesKHR::pNext chain must include a VkVideoEncodeAV1CapabilitiesKHR structure that will be filled with the profile-specific capabilities.

The VkVideoEncodeAV1CapabilitiesKHR structure is defined as:

// Provided by VK_KHR_video_encode_av1
typedef struct VkVideoEncodeAV1CapabilitiesKHR {
    VkStructureType                           sType;
    void*                                     pNext;
    VkVideoEncodeAV1CapabilityFlagsKHR        flags;
    StdVideoAV1Level                          maxLevel;
    VkExtent2D                                codedPictureAlignment;
    VkExtent2D                                maxTiles;
    VkExtent2D                                minTileSize;
    VkExtent2D                                maxTileSize;
    VkVideoEncodeAV1SuperblockSizeFlagsKHR    superblockSizes;
    uint32_t                                  maxSingleReferenceCount;
    uint32_t                                  singleReferenceNameMask;
    uint32_t                                  maxUnidirectionalCompoundReferenceCount;
    uint32_t                                  maxUnidirectionalCompoundGroup1ReferenceCount;
    uint32_t                                  unidirectionalCompoundReferenceNameMask;
    uint32_t                                  maxBidirectionalCompoundReferenceCount;
    uint32_t                                  maxBidirectionalCompoundGroup1ReferenceCount;
    uint32_t                                  maxBidirectionalCompoundGroup2ReferenceCount;
    uint32_t                                  bidirectionalCompoundReferenceNameMask;
    uint32_t                                  maxTemporalLayerCount;
    uint32_t                                  maxSpatialLayerCount;
    uint32_t                                  maxOperatingPoints;
    uint32_t                                  minQIndex;
    uint32_t                                  maxQIndex;
    VkBool32                                  prefersGopRemainingFrames;
    VkBool32                                  requiresGopRemainingFrames;
    VkVideoEncodeAV1StdFlagsKHR               stdSyntaxFlags;
} VkVideoEncodeAV1CapabilitiesKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • flags is a bitmask of VkVideoEncodeAV1CapabilityFlagBitsKHR indicating supported AV1 encoding capabilities.

  • maxLevel is a StdVideoAV1Level value indicating the maximum AV1 level supported by the profile, as defined in section A.3 of the AV1 Specification.

  • codedPictureAlignment indicates the alignment at which the implementation will code pictures. This capability does not impose any valid usage constraints on the application. However, depending on the codedExtent of the encode input picture resource, this capability may result in a change of the resolution of the encoded picture, as described in more detail below.

  • maxTiles indicates the maximum number of AV1 tile columns and rows the implementation supports.

  • minTileSize indicates the minimum extent of individual AV1 tiles the implementation supports.

  • maxTileSize indicates the maximum extent of individual AV1 tiles the implementation supports.

  • superblockSizes is a bitmask of VkVideoEncodeAV1SuperblockSizeFlagBitsKHR values indicating the supported AV1 superblock sizes.

  • maxSingleReferenceCount indicates the maximum number of reference pictures the implementation supports when using single reference prediction mode.

  • singleReferenceNameMask is a bitmask of supported AV1 reference names when using single reference prediction mode.

  • maxUnidirectionalCompoundReferenceCount indicates the maximum number of reference pictures the implementation supports when using unidirectional compound prediction mode.

  • maxUnidirectionalCompoundGroup1ReferenceCount indicates the maximum number of reference pictures the implementation supports when using unidirectional compound prediction mode from reference frame group 1, as defined in section 6.10.24 of the AV1 Specification.

  • unidirectionalCompoundReferenceNameMask is a bitmask of supported AV1 reference names when using unidirectional compound prediction mode.

  • maxBidirectionalCompoundReferenceCount indicates the maximum number of reference pictures the implementation supports when using bidirectional compound prediction mode.

  • maxBidirectionalCompoundGroup1ReferenceCount indicates the maximum number of reference pictures the implementation supports when using bidirectional compound prediction mode from reference frame group 1, as defined in section 6.10.24 of the AV1 Specification.

  • maxBidirectionalCompoundGroup2ReferenceCount indicates the maximum number of reference pictures the implementation supports when using bidirectional compound prediction mode from reference frame group 2, as defined in section 6.10.24 of the AV1 Specification.

  • bidirectionalCompoundReferenceNameMask is a bitmask of supported AV1 reference names when using bidirectional compound prediction mode.

  • maxTemporalLayerCount indicates the maximum number of AV1 temporal layers supported by the implementation.

  • maxSpatialLayerCount indicates the maximum number of AV1 spatial layers supported by the implementation.

  • maxOperatingPoints indicates the maximum number of AV1 operating points supported by the implementation.

  • minQIndex indicates the minimum quantizer index value supported.

  • maxQIndex indicates the maximum quantizer index value supported.

  • prefersGopRemainingFrames indicates that the implementation’s rate control algorithm prefers the application to specify the number of frames in each AV1 rate control group remaining in the current group of pictures when beginning a video coding scope.

  • requiresGopRemainingFrames indicates that the implementation’s rate control algorithm requires the application to specify the number of frames in each AV1 rate control group remaining in the current group of pictures when beginning a video coding scope.

  • stdSyntaxFlags is a bitmask of VkVideoEncodeAV1StdFlagBitsKHR indicating capabilities related to AV1 syntax elements.

singleReferenceNameMask, unidirectionalCompoundReferenceNameMask, and bidirectionalCompoundReferenceNameMask are encoded such that when bit index i is set, it indicates support for the AV1 reference name STD_VIDEO_AV1_REFERENCE_NAME_LAST_FRAME + i.

These masks indicate which elements of the referenceNameSlotIndices member of VkVideoEncodeAV1PictureInfoKHR are supported to be used by the implementation. It is important to note that both the bits of these masks and the elements of referenceNameSlotIndices are indexed such that the first value specifies the support bit and DPB slot index, respectively, for the AV1 reference name STD_VIDEO_AV1_REFERENCE_NAME_LAST_FRAME (i.e. there is no bit or element for STD_VIDEO_AV1_REFERENCE_NAME_INTRA_FRAME).

codedPictureAlignment provides information about implementation limitations to encode arbitrary resolutions. In particular, some implementations may not be able to generate bitstreams aligned to the requirements of the AV1 Specification (8x8). In such cases, the implementation may override the width and height of the bitstream, in order to produce a bitstream compliant to the AV1 Specification. If such an override occurs, the encoded resolution of the coded picture is enlargened, with the texel values used for the texel coordinates outside of the bounds of the codedExtent of the encode input picture resource being first governed by the rules regarding the encode input picture granularity. Any texel values outside of the region described by the encode input picture granularity are implementation-defined. Implementations should use well-defined values to minimize impact on the produced encoded content.

This capability does not impose additional application requirements. However, these overrides change the effective resolution of the bitstream and add padding pixels. Applications sensitive to such overrides can use this capability and the corresponding override behavior to compute the cropping needed to reproduce the original input of the encoding and transmit it in a side channel (i.e. by using cropping fields available in a container). Additionally, applications can explicitly consider this alignment in their coded extent, to avoid implementation-defined texel values being included in the encoded content.

Valid Usage (Implicit)
  • VUID-VkVideoEncodeAV1CapabilitiesKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_AV1_CAPABILITIES_KHR

Bits which may be set in VkVideoEncodeAV1CapabilitiesKHR::flags, indicating the AV1 encoding capabilities supported, are:

// Provided by VK_KHR_video_encode_av1
typedef enum VkVideoEncodeAV1CapabilityFlagBitsKHR {
    VK_VIDEO_ENCODE_AV1_CAPABILITY_PER_RATE_CONTROL_GROUP_MIN_MAX_Q_INDEX_BIT_KHR = 0x00000001,
    VK_VIDEO_ENCODE_AV1_CAPABILITY_GENERATE_OBU_EXTENSION_HEADER_BIT_KHR = 0x00000002,
    VK_VIDEO_ENCODE_AV1_CAPABILITY_PRIMARY_REFERENCE_CDF_ONLY_BIT_KHR = 0x00000004,
    VK_VIDEO_ENCODE_AV1_CAPABILITY_FRAME_SIZE_OVERRIDE_BIT_KHR = 0x00000008,
    VK_VIDEO_ENCODE_AV1_CAPABILITY_MOTION_VECTOR_SCALING_BIT_KHR = 0x00000010,
} VkVideoEncodeAV1CapabilityFlagBitsKHR;
  • VK_VIDEO_ENCODE_AV1_CAPABILITY_PER_RATE_CONTROL_GROUP_MIN_MAX_Q_INDEX_BIT_KHR indicates support for specifying different quantizer index values in the members of VkVideoEncodeAV1QIndexKHR.

  • VK_VIDEO_ENCODE_AV1_CAPABILITY_GENERATE_OBU_EXTENSION_HEADER_BIT_KHR indicates support for generating OBU extension headers, as defined in section 5.3.3 of the AV1 Specification.

  • VK_VIDEO_ENCODE_AV1_CAPABILITY_PRIMARY_REFERENCE_CDF_ONLY_BIT_KHR indicates support for using the primary reference frame indicated by the value of StdVideoEncodeAV1PictureInfo::primary_ref_frame in the AV1 picture information only for CDF data reference, as defined in section 6.8.2 of the AV1 Specification.

  • VK_VIDEO_ENCODE_AV1_CAPABILITY_FRAME_SIZE_OVERRIDE_BIT_KHR indicates support for encoding a picture with a frame size different from the maximum frame size defined in the active AV1 sequence header. If this capability is not supported, then frame_size_override_flag must not be set in the AV1 picture information of the encoded frame and the coded extent of the encode input picture must match the maximum coded extent allowed by the active AV1 sequence header, i.e. (max_frame_width_minus_1 + 1, max_frame_height_minus_1 + 1).

  • VK_VIDEO_ENCODE_AV1_CAPABILITY_MOTION_VECTOR_SCALING_BIT_KHR indicates support for motion vector scaling, as defined in section 7.11.3.3 of the AV1 Specification. If this capability is not supported, then the coded extent of all active reference pictures must match the coded extent of the encode input picture. This capability may only be supported by a video profile when VK_VIDEO_ENCODE_AV1_CAPABILITY_FRAME_SIZE_OVERRIDE_BIT_KHR is also supported.

// Provided by VK_KHR_video_encode_av1
typedef VkFlags VkVideoEncodeAV1CapabilityFlagsKHR;

VkVideoEncodeAV1CapabilityFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoEncodeAV1CapabilityFlagBitsKHR.

Bits which may be set in VkVideoEncodeAV1CapabilitiesKHR::stdSyntaxFlags, indicating the capabilities related to the AV1 syntax elements, are:

// Provided by VK_KHR_video_encode_av1
typedef enum VkVideoEncodeAV1StdFlagBitsKHR {
    VK_VIDEO_ENCODE_AV1_STD_UNIFORM_TILE_SPACING_FLAG_SET_BIT_KHR = 0x00000001,
    VK_VIDEO_ENCODE_AV1_STD_SKIP_MODE_PRESENT_UNSET_BIT_KHR = 0x00000002,
    VK_VIDEO_ENCODE_AV1_STD_PRIMARY_REF_FRAME_BIT_KHR = 0x00000004,
    VK_VIDEO_ENCODE_AV1_STD_DELTA_Q_BIT_KHR = 0x00000008,
} VkVideoEncodeAV1StdFlagBitsKHR;
  • VK_VIDEO_ENCODE_AV1_STD_UNIFORM_TILE_SPACING_FLAG_SET_BIT_KHR indicates whether the implementation supports using the application-provided value for StdVideoAV1TileInfoFlags::uniform_tile_spacing_flag in the AV1 tile parameters when that value is 1, indifferent of the coded extent of the encode input picture and the number of tile columns and rows requested in the TileCols and TileRows members of StdVideoAV1TileInfo.

  • VK_VIDEO_ENCODE_AV1_STD_SKIP_MODE_PRESENT_UNSET_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoEncodeAV1PictureInfoFlags::skip_mode_present when that value is 0.

  • VK_VIDEO_ENCODE_AV1_STD_PRIMARY_REF_FRAME_BIT_KHR specifies whether the implementation supports using the application-provided value for StdVideoEncodeAV1PictureInfo::primary_ref_frame.

  • VK_VIDEO_ENCODE_AV1_STD_DELTA_Q_BIT_KHR specifies whether the implementation supports using the application-provided values for the DeltaQYDc, DeltaQUDc, DeltaQUAc, DeltaQVDc, and DeltaQVAc members of StdVideoAV1Quantization.

These capability flags provide information to the application about specific AV1 syntax element values that the implementation supports without having to override them and do not otherwise restrict the values that the application can specify for any of the mentioned AV1 syntax elements.

// Provided by VK_KHR_video_encode_av1
typedef VkFlags VkVideoEncodeAV1StdFlagsKHR;

VkVideoEncodeAV1StdFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoEncodeAV1StdFlagBitsKHR.

Bits which may be set in VkVideoEncodeAV1CapabilitiesKHR::superblockSizes, indicating the superblock sizes supported by the implementation, are:

// Provided by VK_KHR_video_encode_av1
typedef enum VkVideoEncodeAV1SuperblockSizeFlagBitsKHR {
    VK_VIDEO_ENCODE_AV1_SUPERBLOCK_SIZE_64_BIT_KHR = 0x00000001,
    VK_VIDEO_ENCODE_AV1_SUPERBLOCK_SIZE_128_BIT_KHR = 0x00000002,
} VkVideoEncodeAV1SuperblockSizeFlagBitsKHR;
  • VK_VIDEO_ENCODE_AV1_SUPERBLOCK_SIZE_64_BIT_KHR specifies that a superblock size of 64x64 is supported.

  • VK_VIDEO_ENCODE_AV1_SUPERBLOCK_SIZE_128_BIT_KHR specifies that a superblock size of 128x128 is supported.

// Provided by VK_KHR_video_encode_av1
typedef VkFlags VkVideoEncodeAV1SuperblockSizeFlagsKHR;

VkVideoEncodeAV1SuperblockSizeFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoEncodeAV1SuperblockSizeFlagBitsKHR.

Implementations must support at least one of VkVideoEncodeAV1SuperblockSizeFlagBitsKHR.

AV1 Encode Quality Level Properties

When calling vkGetPhysicalDeviceVideoEncodeQualityLevelPropertiesKHR with pVideoProfile->videoCodecOperation specified as VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR, the VkVideoEncodeAV1QualityLevelPropertiesKHR structure must be included in the pNext chain of the VkVideoEncodeQualityLevelPropertiesKHR structure to retrieve additional video encode quality level properties specific to AV1 encoding.

The VkVideoEncodeAV1QualityLevelPropertiesKHR structure is defined as:

// Provided by VK_KHR_video_encode_av1
typedef struct VkVideoEncodeAV1QualityLevelPropertiesKHR {
    VkStructureType                        sType;
    void*                                  pNext;
    VkVideoEncodeAV1RateControlFlagsKHR    preferredRateControlFlags;
    uint32_t                               preferredGopFrameCount;
    uint32_t                               preferredKeyFramePeriod;
    uint32_t                               preferredConsecutiveBipredictiveFrameCount;
    uint32_t                               preferredTemporalLayerCount;
    VkVideoEncodeAV1QIndexKHR              preferredConstantQIndex;
    uint32_t                               preferredMaxSingleReferenceCount;
    uint32_t                               preferredSingleReferenceNameMask;
    uint32_t                               preferredMaxUnidirectionalCompoundReferenceCount;
    uint32_t                               preferredMaxUnidirectionalCompoundGroup1ReferenceCount;
    uint32_t                               preferredUnidirectionalCompoundReferenceNameMask;
    uint32_t                               preferredMaxBidirectionalCompoundReferenceCount;
    uint32_t                               preferredMaxBidirectionalCompoundGroup1ReferenceCount;
    uint32_t                               preferredMaxBidirectionalCompoundGroup2ReferenceCount;
    uint32_t                               preferredBidirectionalCompoundReferenceNameMask;
} VkVideoEncodeAV1QualityLevelPropertiesKHR;

preferredSingleReferenceNameMask, preferredUnidirectionalCompoundReferenceNameMask, and preferredBidirectionalCompoundReferenceNameMask are encoded such that when bit index i is set, it indicates preference for using the AV1 reference name STD_VIDEO_AV1_REFERENCE_NAME_LAST_FRAME + i.

Valid Usage (Implicit)
  • VUID-VkVideoEncodeAV1QualityLevelPropertiesKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_AV1_QUALITY_LEVEL_PROPERTIES_KHR

AV1 Encode Session

Additional parameters can be specified when creating a video session with an AV1 encode profile by including an instance of the VkVideoEncodeAV1SessionCreateInfoKHR structure in the pNext chain of VkVideoSessionCreateInfoKHR.

The VkVideoEncodeAV1SessionCreateInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_av1
typedef struct VkVideoEncodeAV1SessionCreateInfoKHR {
    VkStructureType     sType;
    const void*         pNext;
    VkBool32            useMaxLevel;
    StdVideoAV1Level    maxLevel;
} VkVideoEncodeAV1SessionCreateInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • useMaxLevel indicates whether the value of maxLevel should be used by the implementation. When it is set to VK_FALSE, the implementation ignores the value of maxLevel and uses the value of VkVideoEncodeAV1CapabilitiesKHR::maxLevel, as reported by vkGetPhysicalDeviceVideoCapabilitiesKHR for the video profile.

  • maxLevel is a StdVideoAV1Level value specifying the upper bound on the AV1 level for the video bitstreams produced by the created video session.

Valid Usage (Implicit)
  • VUID-VkVideoEncodeAV1SessionCreateInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_AV1_SESSION_CREATE_INFO_KHR

AV1 Encode Parameter Sets

Video session parameters objects created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR contain a single instance of the following parameter set:

AV1 Sequence Header

Represented by StdVideoAV1SequenceHeader structures and interpreted as follows:

  • flags.reserved and reserved1 are used only for padding purposes and are otherwise ignored;

  • the StdVideoAV1ColorConfig structure pointed to by pColorConfig is interpreted as follows:

    • flags.reserved and reserved1 are used only for padding purposes and are otherwise ignored;

    • all other members of StdVideoAV1ColorConfig are interpreted as defined in section 6.4.2 of the AV1 Specification;

  • if flags.timing_info_present_flag is set, then the StdVideoAV1TimingInfo structure pointed to by pTimingInfo is interpreted as follows:

    • flags.reserved is used only for padding purposes and is otherwise ignored;

    • all other members of StdVideoAV1TimingInfo are interpreted as defined in section 6.4.3 of the AV1 Specification;

  • all other members of StdVideoAV1SequenceHeader are interpreted as defined in section 6.4 of the AV1 Specification.

When StdVideoAV1SequenceHeader::flags.timing_info_present_flag is set, the AV1 sequence header can be amended with AV1 decoder model information, represented by a StdVideoEncodeAV1DecoderModelInfo structure and interpreted as follows:

  • reserved1 is used only for padding purposes and is otherwise ignored;

  • all other members of StdVideoEncodeAV1DecoderModelInfo are interpreted as defined in section 6.4.4 of the AV1 Specification.

When StdVideoAV1SequenceHeader::flags.reduced_still_picture_header is not set, the AV1 sequence header can be amended with AV1 operating point information, represented by an array of StdVideoEncodeAV1OperatingPointInfo structures and interpreted as follows:

  • flags.reserved is used only for padding purposes and is otherwise ignored;

  • all other members of StdVideoEncodeAV1OperatingPointInfo are interpreted as the corresponding element of the respective arrays defined in section 6.4 of the AV1 Specification.

Implementations may override any of these parameters according to the semantics defined in the Video Encode Parameter Overrides section before storing the resulting AV1 sequence header into the video session parameters object. Applications need to use the vkGetEncodedVideoSessionParametersKHR command to determine whether any implementation overrides happened and to retrieve the encoded AV1 sequence header in order to be able to produce a compliant AV1 video bitstream.

The encoded AV1 sequence header retrieved using the vkGetEncodedVideoSessionParametersKHR command is encoded as a single OBU with obu_type equal to OBU_SEQUENCE_HEADER, as defined in section 5.3 of the AV1 Specification.

Such AV1 sequence header overrides may also have cascading effects on the implementation overrides applied to the encoded bitstream produced by video encode operations. If the implementation supports the VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_HAS_OVERRIDES_BIT_KHR video encode feedback query flag, then the application can use such queries to retrieve feedback about whether any implementation overrides have been applied to the encoded bitstream.

When a video session parameters object is created with the codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR, the VkVideoSessionParametersCreateInfoKHR::pNext chain must include a VkVideoEncodeAV1SessionParametersCreateInfoKHR structure specifying the contents of the object.

The VkVideoEncodeAV1SessionParametersCreateInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_av1
typedef struct VkVideoEncodeAV1SessionParametersCreateInfoKHR {
    VkStructureType                               sType;
    const void*                                   pNext;
    const StdVideoAV1SequenceHeader*              pStdSequenceHeader;
    const StdVideoEncodeAV1DecoderModelInfo*      pStdDecoderModelInfo;
    uint32_t                                      stdOperatingPointCount;
    const StdVideoEncodeAV1OperatingPointInfo*    pStdOperatingPoints;
} VkVideoEncodeAV1SessionParametersCreateInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • pStdSequenceHeader is a pointer to a StdVideoAV1SequenceHeader structure describing parameters of the AV1 sequence header entry to store in the created object.

  • pStdDecoderModelInfo is NULL or a pointer to a StdVideoEncodeAV1DecoderModelInfo structure specifying the AV1 decoder model information to store in the created object.

  • stdOperatingPointCount is the number of elements in the pStdOperatingPoints array.

  • pStdOperatingPoints is NULL or a pointer to an array of stdOperatingPointCount number of StdVideoEncodeAV1OperatingPointInfo structures specifying the AV1 operating point information to store in the created object. Each element i specifies the parameter values corresponding to element i of the syntax elements defined in section 6.4 of the AV1 Specification.

Valid Usage
  • VUID-VkVideoEncodeAV1SessionParametersCreateInfoKHR-pStdSequenceHeader-10288
    pStdSequenceHeader->flags.film_grain_params_present must be zero

Valid Usage (Implicit)
  • VUID-VkVideoEncodeAV1SessionParametersCreateInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_AV1_SESSION_PARAMETERS_CREATE_INFO_KHR

  • VUID-VkVideoEncodeAV1SessionParametersCreateInfoKHR-pStdSequenceHeader-parameter
    pStdSequenceHeader must be a valid pointer to a valid StdVideoAV1SequenceHeader value

  • VUID-VkVideoEncodeAV1SessionParametersCreateInfoKHR-pStdDecoderModelInfo-parameter
    If pStdDecoderModelInfo is not NULL, pStdDecoderModelInfo must be a valid pointer to a valid StdVideoEncodeAV1DecoderModelInfo value

  • VUID-VkVideoEncodeAV1SessionParametersCreateInfoKHR-pStdOperatingPoints-parameter
    If stdOperatingPointCount is not 0, and pStdOperatingPoints is not NULL, pStdOperatingPoints must be a valid pointer to an array of stdOperatingPointCount StdVideoEncodeAV1OperatingPointInfo values

AV1 Encoding Parameters

The VkVideoEncodeAV1PictureInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_av1
typedef struct VkVideoEncodeAV1PictureInfoKHR {
    VkStructureType                        sType;
    const void*                            pNext;
    VkVideoEncodeAV1PredictionModeKHR      predictionMode;
    VkVideoEncodeAV1RateControlGroupKHR    rateControlGroup;
    uint32_t                               constantQIndex;
    const StdVideoEncodeAV1PictureInfo*    pStdPictureInfo;
    int32_t                                referenceNameSlotIndices[VK_MAX_VIDEO_AV1_REFERENCES_PER_FRAME_KHR];
    VkBool32                               primaryReferenceCdfOnly;
    VkBool32                               generateObuExtensionHeader;
} VkVideoEncodeAV1PictureInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • predictionMode specifies the AV1 prediction mode to use for the encoded frame.

  • rateControlGroup specifies the AV1 rate control group to use for the encoded frame when the current rate control mode is not VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR. Otherwise it is ignored.

  • constantQIndex is the quantizer index to use for the encoded frame if the current rate control mode configured for the video session is VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR.

  • pStdPictureInfo is a pointer to a StdVideoEncodeAV1PictureInfo structure specifying AV1 picture information.

  • referenceNameSlotIndices is an array of seven (VK_MAX_VIDEO_AV1_REFERENCES_PER_FRAME_KHR, which is equal to the Video Std definition STD_VIDEO_AV1_REFS_PER_FRAME) signed integer values specifying the index of the DPB slot or a negative integer value for each AV1 reference name used for inter coding. In particular, the DPB slot index for the AV1 reference name frame is specified in referenceNameSlotIndices[frame - STD_VIDEO_AV1_REFERENCE_NAME_LAST_FRAME].

  • primaryReferenceCdfOnly controls whether the primary reference frame indicated by the value of pStdPictureInfo->primary_ref_frame is used only for CDF data reference, as defined in sections 6.8.2 of the AV1 Specification. If set to VK_TRUE, then the primary reference frame’s picture data will not be used for sample prediction.

  • generateObuExtensionHeader controls whether OBU extension headers are generated into the target bitstream, as defined in sections 5.3.1, 5.3.2, and 5.3.3 of the AV1 Specification.

This structure is specified in the pNext chain of the VkVideoEncodeInfoKHR structure passed to vkCmdEncodeVideoKHR to specify the codec-specific picture information for an AV1 encode operation.

Encode Input Picture Information

When this structure is specified in the pNext chain of the VkVideoEncodeInfoKHR structure passed to vkCmdEncodeVideoKHR, the information related to the encode input picture is defined as follows:

Std Picture Information

The members of the StdVideoEncodeAV1PictureInfo structure pointed to by pStdPictureInfo are interpreted as follows:

  • flags.reserved and reserved1 are used only for padding purposes and are otherwise ignored;

  • pSegmentation must be NULL

    AV1 segmentation is currently not supported in video encode operations. Accordingly, the application needs to set flags.segmentation_enabled to 0 and pSegmentation to NULL.

  • pTileInfo is NULL or a pointer to a StdVideoAV1TileInfo structure specifying AV1 tile parameters;

  • the StdVideoAV1Quantization structure pointed to by pQuantization is interpreted as follows:

    • flags.reserved is used only for padding purposes and is otherwise ignored;

    • all other members of StdVideoAV1Quantization are interpreted as defined in section 6.8.11 of the AV1 Specification;

  • the StdVideoAV1LoopFilter structure pointed to by pLoopFilter is interpreted as follows:

    • flags.reserved is used only for padding purposes and is otherwise ignored;

    • update_ref_delta is a bitmask where bit index i is interpreted as the value of update_ref_delta corresponding to element i of loop_filter_ref_deltas as defined in section 6.8.10 of the AV1 Specification;

    • update_mode_delta is a bitmask where bit index i is interpreted as the value of update_mode_delta corresponding to element i of loop_filter_mode_deltas as defined in section 6.8.10 of the AV1 Specification;

    • all other members of StdVideoAV1LoopFilter are interpreted as defined in section 6.8.10 of the AV1 Specification;

  • if flags.enable_cdef is set in the active sequence header, then the members of the StdVideoAV1CDEF structure pointed to by pCDEF are interpreted as follows:

    • cdef_y_sec_strength and cdef_uv_sec_strength are the bitstream values of the corresponding syntax elements defined in section 5.9.19 of the AV1 Specification;

    • all other members of StdVideoAV1CDEF are interpreted as defined in section 6.10.14 of the AV1 Specification;

  • if flags.UsesLr is set in the active sequence header, then the StdVideoAV1LoopRestoration structure pointed to by pLoopRestoration is interpreted as follows:

    • LoopRestorationSize[plane] is interpreted as log2(size) - 5, where size is the value of LoopRestorationSize[plane] as defined in section 6.10.15 of the AV1 Specification;

    • all other members of StdVideoAV1LoopRestoration are defined as in section 6.10.15 of the AV1 Specification;

  • the members of the StdVideoAV1GlobalMotion structure provided in global_motion are interpreted as defined in section 7.10 of the AV1 Specification;

  • pExtensionHeader is NULL or a pointer to a StdVideoEncodeAV1ExtensionHeader structure whose temporal_id and spatial_id members specify the temporal and spatial layer ID of the reference frame, respectively (these IDs are encoded into the OBU extension header if VkVideoEncodeAV1PictureInfoKHR::generateObuExtensionHeader is set to VK_TRUE for the encode operation);

  • if flags.buffer_removal_time_present_flag is set, then pBufferRemovalTimes is a pointer to an array of N number of unsigned integer values specifying the elements of the buffer_removal_time array, as defined in section 6.8.2 of the AV1 Specification, where N is the number of operating points specified for the active sequence header through VkVideoEncodeAV1SessionParametersCreateInfoKHR::stdOperatingPointCount;

  • all other members are interpreted as defined in section 6.8 of the AV1 Specification.

Reference picture setup is controlled by the value of StdVideoEncodeAV1PictureInfo::refresh_frame_flags. If it is not zero and a reconstructed picture is specified, then the latter is used as the target of picture reconstruction to activate the DPB slot specified in pEncodeInfo->pSetupReferenceSlot→slotIndex. If StdVideoEncodeAV1PictureInfo::refresh_frame_flags is zero, but a reconstructed picture is specified, then the corresponding picture reference associated with the DPB slot is invalidated, as described in the DPB Slot States section.

Std Tile Parameters

Specifying AV1 tile parameters is optional. If StdVideoEncodeAV1PictureInfo::pTileInfo is NULL, then the implementation determines the values of AV1 tile parameters defined in section 6.8.14 of the AV1 Specification in an implementation-dependent manner. If StdVideoEncodeAV1PictureInfo::pTileInfo is not NULL, then the members of the StdVideoAV1TileInfo structure pointed to by StdVideoEncodeAV1PictureInfo::pTileInfo are interpreted as follows:

  • flags.reserved and reserved1 are used only for padding purposes and are otherwise ignored;

  • TileCols and TileRows specify the number of tile columns and tile rows as defined in section 6.8.14 of the AV1 Specification;

  • tile_size_bytes_minus_1 is ignored, as its value, as defined in section 6.8.14 of the AV1 Specification, is determined as the result of the encoding process;

  • pMiColStarts and pMiRowStarts are ignored, as the elements of the MiColStarts and MiRowStarts arrays defined in section 6.8.14 of the AV1 Specification are determined by the implementation based on the tile widths and heights determined by the implementation or specified through the pWidthInSbsMinus1 and pHeightInSbsMinus1 arrays, respectively;

  • pWidthInSbsMinus1 is NULL or a pointer to an array of TileCols number of unsigned integers that corresponds to width_in_sbs_minus_1 defined in section 6.8.14 of the AV1 Specification;

  • pHeightInSbsMinus1 is NULL or is a pointer to an array of TileRows number of unsigned integers that corresponds to height_in_sbs_minus_1 defined in section 6.8.14 of the AV1 Specification;

  • all other members of StdVideoAV1TileInfo are interpreted as defined in section 6.8.14 of the AV1 Specification.

If flags.uniform_tile_spacing_flag is set, then pWidthInSbsMinus1 and pHeightInSbsMinus1 are ignored.

If flags.uniform_tile_spacing_flag is not set and pWidthInSbsMinus1 is NULL, then the width of individual tile columns is determined in an implementation-dependent manner.

If flags.uniform_tile_spacing_flag is not set and pHeightInSbsMinus1 is NULL, then the height of individual tile rows is determined in an implementation-dependent manner.

In general, implementations are expected to respect the application-specified AV1 tile parameters. However, as implementations may have restrictions on the combination of tile column and row counts, and tile widths and heights with respect to the extent of the encoded frame beyond the restrictions specified in the AV1 Specification and this specification (through video profile capabilities), certain parameter combinations may require the implementation to override them in order to conform to such implementation-specific limitations.

Active Parameter Sets

The active sequence header is the AV1 sequence header stored in the bound video session parameters object.

Valid Usage
  • VUID-VkVideoEncodeAV1PictureInfoKHR-flags-10289
    If VkVideoEncodeAV1CapabilitiesKHR::flags, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the used video profile, does not include VK_VIDEO_ENCODE_AV1_CAPABILITY_PRIMARY_REFERENCE_CDF_ONLY_BIT_KHR, then primaryReferenceCdfOnly must be VK_FALSE

  • VUID-VkVideoEncodeAV1PictureInfoKHR-primaryReferenceCdfOnly-10290
    If primaryReferenceCdfOnly is set to VK_TRUE, then pStdPictureInfo->primary_ref_frame must be less than VK_MAX_VIDEO_AV1_REFERENCES_PER_FRAME_KHR

  • VUID-VkVideoEncodeAV1PictureInfoKHR-pStdPictureInfo-10291
    If pStdPictureInfo->primary_ref_frame is less than VK_MAX_VIDEO_AV1_REFERENCES_PER_FRAME_KHR, then referenceNameSlotIndices[pStdPictureInfo->primary_ref_frame] must not be negative

  • VUID-VkVideoEncodeAV1PictureInfoKHR-flags-10292
    If VkVideoEncodeAV1CapabilitiesKHR::flags, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the used video profile, does not include VK_VIDEO_ENCODE_AV1_CAPABILITY_GENERATE_OBU_EXTENSION_HEADER_BIT_KHR, then generateObuExtensionHeader must be VK_FALSE

  • VUID-VkVideoEncodeAV1PictureInfoKHR-generateObuExtensionHeader-10293
    If generateObuExtensionHeader is set to VK_TRUE, then pStdPictureInfo->pExtensionHeader must not be NULL

Valid Usage (Implicit)
  • VUID-VkVideoEncodeAV1PictureInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_AV1_PICTURE_INFO_KHR

  • VUID-VkVideoEncodeAV1PictureInfoKHR-predictionMode-parameter
    predictionMode must be a valid VkVideoEncodeAV1PredictionModeKHR value

  • VUID-VkVideoEncodeAV1PictureInfoKHR-rateControlGroup-parameter
    rateControlGroup must be a valid VkVideoEncodeAV1RateControlGroupKHR value

  • VUID-VkVideoEncodeAV1PictureInfoKHR-pStdPictureInfo-parameter
    pStdPictureInfo must be a valid pointer to a valid StdVideoEncodeAV1PictureInfo value

The VkVideoEncodeAV1DpbSlotInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_av1
typedef struct VkVideoEncodeAV1DpbSlotInfoKHR {
    VkStructureType                          sType;
    const void*                              pNext;
    const StdVideoEncodeAV1ReferenceInfo*    pStdReferenceInfo;
} VkVideoEncodeAV1DpbSlotInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • pStdReferenceInfo is a pointer to a StdVideoEncodeAV1ReferenceInfo structure specifying AV1 reference information.

This structure is specified in the pNext chain of VkVideoEncodeInfoKHR::pSetupReferenceSlot, if not NULL, and the pNext chain of the elements of VkVideoEncodeInfoKHR::pReferenceSlots to specify the codec-specific reference picture information for an AV1 encode operation.

Active Reference Picture Information

When this structure is specified in the pNext chain of the elements of VkVideoEncodeInfoKHR::pReferenceSlots, one element is added to the list of active reference pictures used by the video encode operation for each element of VkVideoEncodeInfoKHR::pReferenceSlots as follows:

Reconstructed Picture Information

When this structure is specified in the pNext chain of VkVideoEncodeInfoKHR::pSetupReferenceSlot, the information related to the reconstructed picture is defined as follows:

Std Reference Information

The members of the StdVideoEncodeAV1ReferenceInfo structure pointed to by pStdReferenceInfo are interpreted as follows:

  • flags.reserved and reserved1 are used only for padding purposes and are otherwise ignored;

  • flags.disable_frame_end_update_cdf is interpreted as defined in section 6.8.2 of the AV1 Specification;

  • flags.segmentation_enabled is interpreted as defined in section 6.8.13 of the AV1 Specification;

  • RefFrameId is interpreted as the element of the RefFrameId array defined in section 6.8.2 of the AV1 Specification corresponding to the reference frame;

  • frame_type is interpreted as defined in section 6.8.2 of the AV1 Specification;

  • OrderHint is interpreted as defined in section 6.8.2 of the AV1 Specification;

  • pExtensionHeader is NULL or a pointer to a StdVideoEncodeAV1ExtensionHeader structure whose temporal_id and spatial_id members specify the temporal and spatial layer ID of the reference frame, respectively.

Valid Usage (Implicit)
  • VUID-VkVideoEncodeAV1DpbSlotInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_AV1_DPB_SLOT_INFO_KHR

  • VUID-VkVideoEncodeAV1DpbSlotInfoKHR-pStdReferenceInfo-parameter
    pStdReferenceInfo must be a valid pointer to a valid StdVideoEncodeAV1ReferenceInfo value

AV1 Encode Rate Control

Group of Pictures

In case of AV1 encoding it is common practice to follow a regular pattern of frame types and prediction directions in display order when encoding subsequent frames. This pattern is referred to as the group of pictures (GOP).

The AV1 Specification, unlike some other video compression standards, does not restrict the direction in display order of the referenced frames based on the used frame type or AV1 prediction mode. Accordingly, this specification introduces the concept of rate control groups for which the application can specify separate rate control configuration parameters. When encoding a frame, the application specifies the rate control group the encoded frame belongs to through a VkVideoEncodeAV1RateControlGroupKHR value in VkVideoEncodeAV1PictureInfoKHR::rateControlGroup. This value is then used by the implementation’s rate control algorithm to determine which rate control configuration parameters apply to it.

Possible AV1 encode rate control groups are as follows:

// Provided by VK_KHR_video_encode_av1
typedef enum VkVideoEncodeAV1RateControlGroupKHR {
    VK_VIDEO_ENCODE_AV1_RATE_CONTROL_GROUP_INTRA_KHR = 0,
    VK_VIDEO_ENCODE_AV1_RATE_CONTROL_GROUP_PREDICTIVE_KHR = 1,
    VK_VIDEO_ENCODE_AV1_RATE_CONTROL_GROUP_BIPREDICTIVE_KHR = 2,
} VkVideoEncodeAV1RateControlGroupKHR;
  • VK_VIDEO_ENCODE_AV1_RATE_CONTROL_GROUP_INTRA_KHR should be specified when encoding AV1 frames that use intra-only prediction (e.g. when encoding AV1 frames of type STD_VIDEO_AV1_FRAME_TYPE_KEY or STD_VIDEO_AV1_FRAME_TYPE_INTRA_ONLY).

  • VK_VIDEO_ENCODE_AV1_RATE_CONTROL_GROUP_PREDICTIVE_KHR should be specified when encoding AV1 frames that only have forward references in display order.

  • VK_VIDEO_ENCODE_AV1_RATE_CONTROL_GROUP_BIPREDICTIVE_KHR should be specified when encoding AV1 frames that have backward references in display order.

While the application can specify any rate control group for any frame, indifferent of the frame type, prediction mode, or prediction direction, specifying a rate control group that does not reflect the prediction direction used by the encoded frame may result in unexpected behavior of the implementation’s rate control algorithm.

A regular GOP is defined by the following parameters:

  • The number of frames in the GOP;

  • The number of consecutive frames encoded with VK_VIDEO_ENCODE_AV1_RATE_CONTROL_GROUP_BIPREDICTIVE_KHR between frames encoded with other rate control groups in display order.

GOPs are further classified as open and closed GOPs.

Frame types in an open GOP follow each other in display order according to the following algorithm:

  1. The first frame is always a frame encoded with VK_VIDEO_ENCODE_AV1_RATE_CONTROL_GROUP_INTRA_KHR.

  2. This is followed by a number of consecutive frames encoded with VK_VIDEO_ENCODE_AV1_RATE_CONTROL_GROUP_BIPREDICTIVE_KHR.

  3. If the number of frames in the GOP is not reached yet, then the next frame is a frame encoded with VK_VIDEO_ENCODE_AV1_RATE_CONTROL_GROUP_PREDICTIVE_KHR and the algorithm continues from step 2.

av1 open gop
Figure 11. AV1 open GOP

In case of a closed GOP, a frame with the AV1 frame type STD_VIDEO_AV1_FRAME_TYPE_KEY is used at a certain period.

av1 closed gop
Figure 12. AV1 closed GOP

It is also typical for AV1 encoding to use specific reference picture usage patterns across the frames of the GOP. The two most common reference patterns used are as follows:

Flat Reference Pattern
  • Each frame encoded with VK_VIDEO_ENCODE_AV1_RATE_CONTROL_GROUP_PREDICTIVE_KHR refers to the last frame that was not encoded using VK_VIDEO_ENCODE_AV1_RATE_CONTROL_GROUP_BIPREDICTIVE_KHR, in display order, as its forward reference.

  • Each frame encoded with VK_VIDEO_ENCODE_AV1_RATE_CONTROL_GROUP_BIPREDICTIVE_KHR refers to the last frame that was not encoded with VK_VIDEO_ENCODE_AV1_RATE_CONTROL_GROUP_BIPREDICTIVE_KHR, in display order, as its forward reference, and refers to the next frame that was not encoded with VK_VIDEO_ENCODE_AV1_RATE_CONTROL_GROUP_BIPREDICTIVE_KHR, in display order, as its backward reference.

av1 ref pattern flat
Figure 13. AV1 flat reference pattern
Dyadic Reference Pattern
  • Each frame encoded with VK_VIDEO_ENCODE_AV1_RATE_CONTROL_GROUP_PREDICTIVE_KHR refers to the last frame that was not encoded with VK_VIDEO_ENCODE_AV1_RATE_CONTROL_GROUP_BIPREDICTIVE_KHR, in display order, as its forward reference.

  • The following algorithm is applied to the sequence of consecutive frames encoded with VK_VIDEO_ENCODE_AV1_RATE_CONTROL_GROUP_BIPREDICTIVE_KHR between frames using other rate control groups in display order:

    1. The frame in the middle of this sequence uses the frame preceding the sequence as its forward reference, and uses the frame following the sequence as its backward reference.

    2. The algorithm is executed recursively for the following frame sequences:

      • The frames of the original sequence preceding the frame in the middle, if any.

      • The frames of the original sequence following the frame in the middle, if any.

av1 ref pattern dyadic
Figure 14. AV1 dyadic reference pattern

The application can provide guidance to the implementation’s rate control algorithm about the structure of the GOP used by the application. Any such guidance about the GOP and its structure does not mandate that specific GOP structure to be used by the application, as the frame type and the selected rate control group is still application-controlled, however, any deviation from the provided guidance may result in undesired rate control behavior including, but not limited, to the implementation not being able to conform to the expected average or target bitrates, or other rate control parameters specified by the application.

When an AV1 encode session is used to encode multiple temporal layers, it is also common practice to follow a regular pattern for the AV1 temporal ID for the encoded frames in display order when encoding subsequent frames. This pattern is referred to as the temporal GOP. The most common temporal layer pattern used is as follows:

Dyadic Temporal Layer Pattern
  • The number of frames in the temporal GOP is 2n-1, where n is the number of temporal layers.

  • The ith frame in the temporal GOP uses temporal ID t, if and only if the index of the least significant bit set in i equals n-t-1, except for the first frame, which is the only frame in the temporal GOP using temporal ID zero.

  • The ith frame in the temporal GOP uses the rth frame as reference, where r is calculated from i by clearing the least significant bit set in it, except for the first frame in the temporal GOP, which uses the first frame of the previous temporal GOP, if any, as reference.

av1 layer pattern dyadic
Figure 15. AV1 dyadic temporal layer pattern

Multi-layer rate control and multi-layer coding are typically used for streaming cases where low latency is expected, hence frames usually do not use backward references in display order.

The VkVideoEncodeAV1RateControlInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_av1
typedef struct VkVideoEncodeAV1RateControlInfoKHR {
    VkStructureType                        sType;
    const void*                            pNext;
    VkVideoEncodeAV1RateControlFlagsKHR    flags;
    uint32_t                               gopFrameCount;
    uint32_t                               keyFramePeriod;
    uint32_t                               consecutiveBipredictiveFrameCount;
    uint32_t                               temporalLayerCount;
} VkVideoEncodeAV1RateControlInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • flags is a bitmask of VkVideoEncodeAV1RateControlFlagBitsKHR specifying AV1 rate control flags.

  • gopFrameCount is the number of frames within a group of pictures (GOP) intended to be used by the application. If it is set to 0, the rate control algorithm may assume an implementation-dependent GOP length. If it is set to UINT32_MAX, the GOP length is treated as infinite.

  • keyFramePeriod is the interval, in terms of number of frames, between two frames with the AV1 frame type STD_VIDEO_AV1_FRAME_TYPE_KEY (see key frame period). If it is set to 0, the rate control algorithm may assume an implementation-dependent key frame period. If it is set to UINT32_MAX, the key frame period is treated as infinite.

  • consecutiveBipredictiveFrameCount is the number of consecutive frames encoded with VK_VIDEO_ENCODE_AV1_RATE_CONTROL_GROUP_BIPREDICTIVE_KHR between frames encoded with other rate control groups within the GOP.

  • temporalLayerCount specifies the number of AV1 temporal layers that the application intends to use.

When an instance of this structure is included in the pNext chain of the VkVideoCodingControlInfoKHR structure passed to the vkCmdControlVideoCodingKHR command, and VkVideoCodingControlInfoKHR::flags includes VK_VIDEO_CODING_CONTROL_ENCODE_RATE_CONTROL_BIT_KHR, the parameters in this structure are used as guidance for the implementation’s rate control algorithm (see Video Coding Control).

Valid Usage
  • VUID-VkVideoEncodeAV1RateControlInfoKHR-flags-10294
    If flags contains VK_VIDEO_ENCODE_AV1_RATE_CONTROL_REFERENCE_PATTERN_FLAT_BIT_KHR or VK_VIDEO_ENCODE_AV1_RATE_CONTROL_REFERENCE_PATTERN_DYADIC_BIT_KHR, then it must also contain VK_VIDEO_ENCODE_AV1_RATE_CONTROL_REGULAR_GOP_BIT_KHR

  • VUID-VkVideoEncodeAV1RateControlInfoKHR-flags-10295
    If flags contains VK_VIDEO_ENCODE_AV1_RATE_CONTROL_REFERENCE_PATTERN_FLAT_BIT_KHR, then it must not also contain VK_VIDEO_ENCODE_AV1_RATE_CONTROL_REFERENCE_PATTERN_DYADIC_BIT_KHR

  • VUID-VkVideoEncodeAV1RateControlInfoKHR-flags-10296
    If flags contains VK_VIDEO_ENCODE_AV1_RATE_CONTROL_REGULAR_GOP_BIT_KHR, then gopFrameCount must be greater than 0

  • VUID-VkVideoEncodeAV1RateControlInfoKHR-keyFramePeriod-10297
    If keyFramePeriod is not 0, then it must be greater than or equal to gopFrameCount

  • VUID-VkVideoEncodeAV1RateControlInfoKHR-consecutiveBipredictiveFrameCount-10298
    If consecutiveBipredictiveFrameCount is not 0, then it must be less than gopFrameCount

  • VUID-VkVideoEncodeAV1RateControlInfoKHR-temporalLayerCount-10299
    temporalLayerCount must be less than or equal to VkVideoEncodeAV1CapabilitiesKHR::maxTemporalLayerCount, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the used video profile

Valid Usage (Implicit)
  • VUID-VkVideoEncodeAV1RateControlInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_AV1_RATE_CONTROL_INFO_KHR

  • VUID-VkVideoEncodeAV1RateControlInfoKHR-flags-parameter
    flags must be a valid combination of VkVideoEncodeAV1RateControlFlagBitsKHR values

Bits which can be set in VkVideoEncodeAV1RateControlInfoKHR::flags, specifying AV1 rate control flags, are:

// Provided by VK_KHR_video_encode_av1
typedef enum VkVideoEncodeAV1RateControlFlagBitsKHR {
    VK_VIDEO_ENCODE_AV1_RATE_CONTROL_REGULAR_GOP_BIT_KHR = 0x00000001,
    VK_VIDEO_ENCODE_AV1_RATE_CONTROL_TEMPORAL_LAYER_PATTERN_DYADIC_BIT_KHR = 0x00000002,
    VK_VIDEO_ENCODE_AV1_RATE_CONTROL_REFERENCE_PATTERN_FLAT_BIT_KHR = 0x00000004,
    VK_VIDEO_ENCODE_AV1_RATE_CONTROL_REFERENCE_PATTERN_DYADIC_BIT_KHR = 0x00000008,
} VkVideoEncodeAV1RateControlFlagBitsKHR;
  • VK_VIDEO_ENCODE_AV1_RATE_CONTROL_REGULAR_GOP_BIT_KHR specifies that the application intends to use a regular GOP structure according to the parameters specified in the gopFrameCount and keyFramePeriod members of the VkVideoEncodeAV1RateControlInfoKHR structure.

  • VK_VIDEO_ENCODE_AV1_RATE_CONTROL_TEMPORAL_LAYER_PATTERN_DYADIC_BIT_KHR specifies that the application intends to follow a dyadic temporal layer pattern.

  • VK_VIDEO_ENCODE_AV1_RATE_CONTROL_REFERENCE_PATTERN_FLAT_BIT_KHR specifies that the application intends to follow a flat reference pattern in the GOP.

  • VK_VIDEO_ENCODE_AV1_RATE_CONTROL_REFERENCE_PATTERN_DYADIC_BIT_KHR specifies that the application intends to follow a dyadic reference pattern in the GOP.

// Provided by VK_KHR_video_encode_av1
typedef VkFlags VkVideoEncodeAV1RateControlFlagsKHR;

VkVideoEncodeAV1RateControlFlagsKHR is a bitmask type for setting a mask of zero or more VkVideoEncodeAV1RateControlFlagBitsKHR.

Rate Control Layers

The VkVideoEncodeAV1RateControlLayerInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_av1
typedef struct VkVideoEncodeAV1RateControlLayerInfoKHR {
    VkStructureType                 sType;
    const void*                     pNext;
    VkBool32                        useMinQIndex;
    VkVideoEncodeAV1QIndexKHR       minQIndex;
    VkBool32                        useMaxQIndex;
    VkVideoEncodeAV1QIndexKHR       maxQIndex;
    VkBool32                        useMaxFrameSize;
    VkVideoEncodeAV1FrameSizeKHR    maxFrameSize;
} VkVideoEncodeAV1RateControlLayerInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • useMinQIndex indicates whether the quantizer index values determined by rate control will be clamped to the lower bounds on the quantizer index values specified in minQIndex.

  • minQIndex specifies the lower bounds on the quantizer index values, for each rate control group, that the implementation’s rate control algorithm will use when useMinQIndex is set to VK_TRUE.

  • useMaxQIndex indicates whether the quantizer index values determined by rate control will be clamped to the upper bounds on the quantizer index values specified in maxQIndex.

  • maxQIndex specifies the upper bounds on the quantizer index values, for each rate control group, that the implementation’s rate control algorithm will use when useMaxQIndex is set to VK_TRUE.

  • useMaxFrameSize indicates whether the implementation’s rate control algorithm should use the values specified in maxFrameSize as the upper bounds on the encoded frame size for each rate control group.

  • maxFrameSize specifies the upper bounds on the encoded frame size, for each rate control group, when useMaxFrameSize is set to VK_TRUE.

When used, the values in minQIndex and maxQIndex guarantee that the effective quantizer index values used by the implementation will respect those lower and upper bounds, respectively. However, limiting the range of quantizer index values that the implementation is able to use will also limit the capabilities of the implementation’s rate control algorithm to comply to other constraints. In particular, the implementation may not be able to comply to the following:

  • The average and/or peak bitrate values to be used for the encoded bitstream specified in the averageBitrate and maxBitrate members of the VkVideoEncodeRateControlLayerInfoKHR structure.

  • The upper bounds on the encoded frame size, for each rate control group, specified in the maxFrameSize member of VkVideoEncodeAV1RateControlLayerInfoKHR.

In general, applications need to configure rate control parameters appropriately in order to be able to get the desired rate control behavior, as described in the Video Encode Rate Control section.

When an instance of this structure is included in the pNext chain of a VkVideoEncodeRateControlLayerInfoKHR structure specified in one of the elements of the pLayers array member of the VkVideoEncodeRateControlInfoKHR structure passed to the vkCmdControlVideoCodingKHR command, VkVideoCodingControlInfoKHR::flags includes VK_VIDEO_CODING_CONTROL_ENCODE_RATE_CONTROL_BIT_KHR, and the bound video session was created with the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR, it specifies the AV1-specific rate control parameters of the rate control layer corresponding to that element of pLayers.

Valid Usage
  • VUID-VkVideoEncodeAV1RateControlLayerInfoKHR-useMinQIndex-10300
    If useMinQIndex is VK_TRUE, then the intraQIndex, predictiveQIndex, and bipredictiveQIndex members of minQIndex must all be between VkVideoEncodeAV1CapabilitiesKHR::minQIndex and VkVideoEncodeAV1CapabilitiesKHR::maxQIndex, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the used video profile

  • VUID-VkVideoEncodeAV1RateControlLayerInfoKHR-useMinQIndex-10301
    If useMinQIndex is VK_TRUE and VkVideoEncodeAV1CapabilitiesKHR::flags, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the used video profile, does not include VK_VIDEO_ENCODE_AV1_CAPABILITY_PER_RATE_CONTROL_GROUP_MIN_MAX_Q_INDEX_BIT_KHR, then the intraQIndex, predictiveQIndex, and bipredictiveQIndex members of minQIndex must all specify the same value

  • VUID-VkVideoEncodeAV1RateControlLayerInfoKHR-useMaxQIndex-10302
    If useMaxQIndex is VK_TRUE, then the intraQIndex, predictiveQIndex, and bipredictiveQIndex members of maxQIndex must all be between VkVideoEncodeAV1CapabilitiesKHR::minQIndex and VkVideoEncodeAV1CapabilitiesKHR::maxQIndex, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the used video profile

  • VUID-VkVideoEncodeAV1RateControlLayerInfoKHR-useMaxQIndex-10303
    If useMaxQIndex is VK_TRUE and VkVideoEncodeAV1CapabilitiesKHR::flags, as returned by vkGetPhysicalDeviceVideoCapabilitiesKHR for the used video profile, does not include VK_VIDEO_ENCODE_AV1_CAPABILITY_PER_RATE_CONTROL_GROUP_MIN_MAX_Q_INDEX_BIT_KHR, then the intraQIndex, predictiveQIndex, and bipredictiveQIndex members of maxQIndex must all specify the same value

  • VUID-VkVideoEncodeAV1RateControlLayerInfoKHR-useMinQIndex-10304
    If useMinQIndex and useMaxQIndex are both VK_TRUE, then the intraQIndex, predictiveQIndex, and bipredictiveQIndex members of minQIndex must all be less than or equal to the respective members of maxQIndex

Valid Usage (Implicit)
  • VUID-VkVideoEncodeAV1RateControlLayerInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_AV1_RATE_CONTROL_LAYER_INFO_KHR

  • VUID-VkVideoEncodeAV1RateControlLayerInfoKHR-minQIndex-parameter
    minQIndex must be a valid VkVideoEncodeAV1QIndexKHR structure

  • VUID-VkVideoEncodeAV1RateControlLayerInfoKHR-maxQIndex-parameter
    maxQIndex must be a valid VkVideoEncodeAV1QIndexKHR structure

  • VUID-VkVideoEncodeAV1RateControlLayerInfoKHR-maxFrameSize-parameter
    maxFrameSize must be a valid VkVideoEncodeAV1FrameSizeKHR structure

The VkVideoEncodeAV1QIndexKHR structure is defined as:

// Provided by VK_KHR_video_encode_av1
typedef struct VkVideoEncodeAV1QIndexKHR {
    uint32_t    intraQIndex;
    uint32_t    predictiveQIndex;
    uint32_t    bipredictiveQIndex;
} VkVideoEncodeAV1QIndexKHR;
  • intraQIndex is the quantizer index to be used for frames encoded with VK_VIDEO_ENCODE_AV1_RATE_CONTROL_GROUP_INTRA_KHR.

  • predictiveQIndex is the quantizer index to be used for frames encoded with VK_VIDEO_ENCODE_AV1_RATE_CONTROL_GROUP_PREDICTIVE_KHR.

  • bipredictiveQIndex is the quantizer index to be used for frames encoded with VK_VIDEO_ENCODE_AV1_RATE_CONTROL_GROUP_BIPREDICTIVE_KHR.

The VkVideoEncodeAV1FrameSizeKHR structure is defined as:

// Provided by VK_KHR_video_encode_av1
typedef struct VkVideoEncodeAV1FrameSizeKHR {
    uint32_t    intraFrameSize;
    uint32_t    predictiveFrameSize;
    uint32_t    bipredictiveFrameSize;
} VkVideoEncodeAV1FrameSizeKHR;
  • intraFrameSize is the size in bytes to be used for frames encoded with VK_VIDEO_ENCODE_AV1_RATE_CONTROL_GROUP_INTRA_KHR.

  • predictiveFrameSize is the size in bytes to be used for frames encoded with VK_VIDEO_ENCODE_AV1_RATE_CONTROL_GROUP_PREDICTIVE_KHR.

  • bipredictiveFrameSize is the size in bytes to be used for frames encoded with VK_VIDEO_ENCODE_AV1_RATE_CONTROL_GROUP_BIPREDICTIVE_KHR.

GOP Remaining Frames

Besides session level rate control configuration, the application can specify the number of frames per frame type remaining in the group of pictures (GOP).

The VkVideoEncodeAV1GopRemainingFrameInfoKHR structure is defined as:

// Provided by VK_KHR_video_encode_av1
typedef struct VkVideoEncodeAV1GopRemainingFrameInfoKHR {
    VkStructureType    sType;
    const void*        pNext;
    VkBool32           useGopRemainingFrames;
    uint32_t           gopRemainingIntra;
    uint32_t           gopRemainingPredictive;
    uint32_t           gopRemainingBipredictive;
} VkVideoEncodeAV1GopRemainingFrameInfoKHR;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • useGopRemainingFrames indicates whether the implementation’s rate control algorithm should use the values specified in gopRemainingIntra, gopRemainingPredictive, and gopRemainingBipredictive. If useGopRemainingFrames is VK_FALSE, then the values of gopRemainingIntra, gopRemainingPredictive, and gopRemainingBipredictive are ignored.

  • gopRemainingIntra specifies the number of frames encoded with VK_VIDEO_ENCODE_AV1_RATE_CONTROL_GROUP_INTRA_KHR the implementation’s rate control algorithm should assume to be remaining in the GOP prior to executing the next video encode operation.

  • gopRemainingPredictive specifies the number of frames encoded with VK_VIDEO_ENCODE_AV1_RATE_CONTROL_GROUP_PREDICTIVE_KHR the implementation’s rate control algorithm should assume to be remaining in the GOP prior to executing the next video encode operation.

  • gopRemainingBipredictive specifies the number of frames encoded with VK_VIDEO_ENCODE_AV1_RATE_CONTROL_GROUP_BIPREDICTIVE_KHR the implementation’s rate control algorithm should assume to be remaining in the GOP prior to executing the next video encode operation.

Setting useGopRemainingFrames to VK_TRUE and including this structure in the pNext chain of VkVideoBeginCodingInfoKHR is only mandatory if the VkVideoEncodeAV1CapabilitiesKHR::requiresGopRemainingFrames reported for the used video profile is VK_TRUE. However, implementations may use these remaining frame counts, when specified, even when it is not required. In particular, when the application does not use a regular GOP structure, these values may provide additional guidance for the implementation’s rate control algorithm.

The VkVideoEncodeAV1CapabilitiesKHR::prefersGopRemainingFrames capability is also used to indicate that the implementation’s rate control algorithm may operate more accurately if the application specifies the remaining frame counts using this structure.

As with other rate control guidance values, if the effective order and number of frames encoded by the application are not in line with the remaining frame counts specified in this structure at any given point, then the behavior of the implementation’s rate control algorithm may deviate from the one expected by the application.

Valid Usage (Implicit)
  • VUID-VkVideoEncodeAV1GopRemainingFrameInfoKHR-sType-sType
    sType must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_AV1_GOP_REMAINING_FRAME_INFO_KHR

AV1 Quantizer Index Delta Maps

Quantization delta maps used with an AV1 encode profile are referred to as quantizer index delta maps and their texels contain integer values representing quantizer index delta values that are applied in the process of determining the quantizer indices of the encoded picture.

Accordingly, AV1 quantizer index delta maps always have single channel integer formats, as reported in VkVideoFormatPropertiesKHR::format.

When the rate control mode is VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR, the quantizer index delta values are added to the constant quantizer index value that, in effect, enable the application to explicitly control the used quantizer index values at the granularity of the used quantization map texel size.

For all other rate control modes, the quantizer index delta values can be used to offset the quantizer index values that the rate control algorithm would otherwise produce.

AV1 Encode Quantization

Performing AV1 encode operations involves the process of assigning quantizer index values to individual AV1 mode info blocks. This process depends on the used rate control mode, as well as other encode and rate control parameters, as described below:

  • If the configured rate control mode is VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DEFAULT_KHR, then the quantizer index value is initialized by the implementation-specific default rate control algorithm.

    • If the video encode operation is issued with a quantization delta map, the quantizer index delta value corresponding to the mode info block, as fetched from the quantization map, is added to the previously determined quantizer index value. If the fetched quantizer index delta value falls outside the supported quantizer index delta value range reported in the minQIndexDelta and maxQIndexDelta members of VkVideoEncodeAV1QuantizationMapCapabilitiesKHR, then the quantizer index value used for the mode info block becomes undefined.

  • If the configured rate control mode is VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR, then the quantizer index value is initialized from the constant quantizer index value specified for the encoded frame.

    • If the video encode operation is issued with a quantization delta map, the quantizer index delta value corresponding to the mode info block, as fetched from the quantization map, is added to the previously determined quantizer index value. If the fetched quantizer index delta value falls outside the supported quantizer index delta value range reported in the minQIndexDelta and maxQIndexDelta members of VkVideoEncodeAV1QuantizationMapCapabilitiesKHR, then the quantizer index value used for the mode info block becomes undefined.

  • If the configured rate control mode is not VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DEFAULT_KHR or VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR, then the quantizer index value is initialized by the corresponding rate control algorithm.

    • If the video encode operation is issued with a quantization delta map, the quantizer index delta value corresponding to the mode info block, as fetched from the quantization map, is added to the previously determined quantizer index value. If the fetched quantizer index delta value falls outside the supported quantizer index delta value range reported in the minQIndexDelta and maxQIndexDelta members of VkVideoEncodeAV1QuantizationMapCapabilitiesKHR, then the quantizer index value used for the mode info block becomes undefined.

    • If the video encode operation is issued with an emphasis map, the rate control will adjust the quantizer index value based on the emphasis value corresponding to the mode info block, as fetched from the quantization map, according to the following equation:

      QIndexnew = f(QIndexprev,e)

      Where QIndexnew is the resulting quantizer index value, QIndexprev is the previously determined quantizer index value, e is the emphasis value corresponding to the macroblock, and f is an implementation-defined function for which the following implication is true:

      e1 < e2 ⇒ f(QIndex,e1) ≥ f(QIndex,e2)

      This means that lower emphasis values will result in higher quantizer index values, whereas higher emphasis values will result in lower quantizer index values, but the function is not strictly decreasing with respect to the input emphasis value for a given input quantizer index value.

    • If clamping to minimum quantizer index values is enabled in the applied rate control layer, then the quantizer index value is clamped to the corresponding minimum quantizer index value.

    • If clamping to maximum quantizer index values is enabled in the applied rate control layer, then the quantizer index value is clamped to the corresponding maximum quantizer index value.

  • In all cases, the final quantizer index value is clamped to the minimum and maximum quantizer index values supported by the video profile.

AV1 Encode Requirements

This section described the required AV1 encoding capabilities for physical devices that have at least one queue family that supports the video codec operation VK_VIDEO_CODEC_OPERATION_ENCODE_AV1_BIT_KHR, as returned by vkGetPhysicalDeviceQueueFamilyProperties2 in VkQueueFamilyVideoPropertiesKHR::videoCodecOperations.

Table 11. Required Video Std Header Versions
Video Std Header Name Version

vulkan_video_codec_av1std_encode

1.0.0

Table 12. Required Video Capabilities
Video Capability Requirement Requirement Type1

VkVideoCapabilitiesKHR

flags

-

min

minBitstreamBufferOffsetAlignment

4096

max

minBitstreamBufferSizeAlignment

4096

max

pictureAccessGranularity

(64,64)

max

minCodedExtent

-

max

maxCodedExtent

-

min

maxDpbSlots

0

min

maxActiveReferencePictures

0

min

VkVideoEncodeCapabilitiesKHR

flags

-

min

rateControlModes

-

min

maxBitrate

5529600

min

maxQualityLevels

1

min

encodeInputPictureGranularity

(64,64)

max

supportedEncodeFeedbackFlags

VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_BUFFER_OFFSET_BIT_KHR
VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_BYTES_WRITTEN_BIT_KHR

min

VkVideoEncodeAV1CapabilitiesKHR

flags

-

min

codedPictureAlignment

(8,8)

min

maxTiles

1

min

minTileSize

-

max

maxTileSize

-

min

maxLevel

STD_VIDEO_AV1_LEVEL_2_0

min

superblockSizes

at least one bit set

implementation-dependent

maxSingleReferenceCount

0

min

singleReferenceNameMask

- 2

min

maxUnidirectionalCompoundReferenceCount

0 3

min

maxUnidirectionalCompoundGroup1ReferenceCount

0 3,4

min

unidirectionalCompoundReferenceNameMask

- 2

min

maxBidirectionalCompoundReferenceCount

0 3

min

maxBidirectionalCompoundGroup1ReferenceCount

0 3,5

min

maxBidirectionalCompoundGroup2ReferenceCount

0 3,5

min

bidirectionalCompoundReferenceNameMask

- 2

min

maxTemporalLayerCount

1

min

maxSpatialLayerCount

1

min

maxOperatingPoints

0

min

minQIndex

-

max

maxQIndex

-

min

prefersGopRemainingFrames

-

implementation-dependent

requiresGopRemainingFrames

-

implementation-dependent

stdSyntaxFlags

-

min

VkVideoEncodeQuantizationMapCapabilitiesKHR

maxQuantizationMapExtent

- 6

min

VkVideoEncodeAV1QuantizationMapCapabilitiesKHR

minQIndexDelta

- 7

max

maxQIndexDelta

- 7

min

1

The Requirement Type column specifies the requirement is either the minimum value all implementations must support, the maximum value all implementations must support, or the exact value all implementations must support. For bitmasks a minimum value is the least bits all implementations must set, but they may have additional bits set beyond this minimum.

2

These masks must only have bits set in the least significant VK_MAX_VIDEO_AV1_REFERENCES_PER_FRAME_KHR bits (bit index i indicates support for the AV1 reference name STD_VIDEO_AV1_REFERENCE_NAME_LAST_FRAME + i when using the corresponding AV1 prediction mode), and must have at least as many bits set in any *ReferenceNameMask capability as the value of the corresponding max*ReferenceCount capability.

3

If greater than zero, it must be at least 2.

4

maxUnidirectionalCompoundGroup1ReferenceCount must be less than or equal to maxUnidirectionalCompoundReferenceCount

5

The sum of maxBidirectionalCompoundGroup1ReferenceCount and maxBidirectionalCompoundGroup2ReferenceCount must be greater than or equal to maxBidirectionalCompoundReferenceCount

6

If VkVideoCapabilitiesKHR::flags includes VK_VIDEO_ENCODE_CAPABILITY_QUANTIZATION_DELTA_MAP_BIT_KHR or VK_VIDEO_ENCODE_CAPABILITY_EMPHASIS_MAP_BIT_KHR, then the width and height members of maxQuantizationMapExtent must be greater than zero.

7

If VkVideoCapabilitiesKHR::flags includes VK_VIDEO_ENCODE_CAPABILITY_QUANTIZATION_DELTA_MAP_BIT_KHR, then maxQIndexDelta must be greater than minQIndexDelta.