VK_KHR_maintenance7
- 1. Problem Statement
- 2. Issue Details and Solution Space
- 2.1. Separate Depth/Stencil Access
- 2.2. Query Properties of Underlying Layered Implementations
- 2.3. Mixed Inline and Secondary Command Buffer Recording in Render Passes
- 2.4. Relax Count of Dynamic Uniform/Storage Buffers
- 2.5. Fragment Shading Rate Attachment Size Mismatch
- 2.6. Determine 32-Bit Query Overflow Behavior
- 3. Proposal
- 4. Issues
- 5. Further Functionality
This proposal details and addresses the issues solved by the VK_KHR_maintenance7
extension.
1. Problem Statement
Over time, a collection of minor features, none of which would warrant an entire extension of their own, requires the creation of a maintenance extension.
The following is a list of issues considered in this proposal:
-
Require no access overlap between depth and stencil aspects when rendering
-
Add a way to query information regarding the underlying devices in environments where the Vulkan implementation is provided through layered implementations.
-
Promote
VK_RENDERING_CONTENTS_INLINE_BIT_EXT
andVK_SUBPASS_CONTENTS_INLINE_AND_SECONDARY_COMMAND_BUFFERS_EXT
to KHR -
Add a limit to report the maximum total count of dynamic uniform buffers and dynamic storage buffers that can be included in a pipeline layout.
-
Add a method for determining whether an implementation wraps or saturates on 32bit timestamp query overflow
-
Require that FSR attachment access is consistent with other image accesses for robustness. Perhaps also make sure array access is robust - not just width/height.
1.1. Separate Depth/Stencil Access
Some implementations treat writes to a single aspect of mixed depth/stencil attachments as writes to both aspects, while some implementations treat these writes as single-aspect writes in isolation. It is important for applications to know which behavior is in use by an implementation.
1.2. Query Properties of Underlying Layered Implementations
When the Vulkan driver is provided by a layered implementation, it may be necessary to query the details of the underlying device.
For example, running on Mesa/Venus, driver ID is returned as VK_DRIVER_ID_MESA_VENUS
, but it may be necessary to know what the real driver under the hood is.
1.3. Mixed Inline and Secondary Command Buffer Recording in Render Passes
Vulkan 1.0 required that the contents of a render pass subpass are either entirely inlined or are provided in a single secondary command buffer. In some situations, it may be beneficial to be able to mix inline and secondary command buffers inside the same render pass subpass, or executed multiple secondary command buffers.
1.4. Relax Count of Dynamic Uniform/Storage Buffers
The maximum count of dynamic uniform buffers and dynamic storage buffers that can be included in a pipeline layout are reported separately. While some implementations treat dynamic offsets of uniform buffers and storage buffers the same way, reporting the total count along with maximum count of dynamic uniform and storage buffers could relax the limitation and expose device capabilities more accurately.
1.5. Fragment Shading Rate Attachment Size Mismatch
The DirectX 12 Variable Shading Rate feature allows applications to specify a shading rate image that is smaller than would be required to provide shading rates for all rendered texels. When a fragment is rendered outside of the area covered by the shading rate image, default values are returned, in line with the usual out of bounds values for images that it guarantees.
In Vulkan however, this is currently outright banned, making it difficult to guarantee portability for apps relying on this or for emulation layers.
2. Issue Details and Solution Space
2.1. Separate Depth/Stencil Access
A property that indicates whether single-aspect writes to a depth/stencil attachment will result in writes to both aspects.
2.2. Query Properties of Underlying Layered Implementations
A new set of structures are included, accessible through VkPhysicalDeviceLayeredApiPropertiesListKHR when chained to VkPhysicalDeviceProperties2.
2.3. Mixed Inline and Secondary Command Buffer Recording in Render Passes
Flags from the VK_EXT_nested_command_buffer
extension provide this functionality, and are promoted to this extension.
2.4. Relax Count of Dynamic Uniform/Storage Buffers
A limit that indicates the maximum total count of dynamic uniform buffers and storage buffers that can be included in a pipeline layout is added.
2.5. Fragment Shading Rate Attachment Size Mismatch
At the basic level, the first hurdle is allowing applications to specify fragment shading rate attachments that are too small for the render area, and giving it some sense of defined behavior. The second challenge is ensuring that implementations all return the expected values.
This could be achieved by using new language to describe the specifics of these lookups, but notably image reads with robust access already have the desired behavior. As such, this proposal retools fragment shading rate attachment reads as general image reads, and guarantees the values match DirectX 12’s guarantees when the VkPhysicalDeviceRobustness2FeaturesEXT::robustImageAccess2 feature is enabled.
2.6. Determine 32-Bit Query Overflow Behavior
Previously when the 32-bit unsigned integer query result overflows, the implementation may either
wrap or saturate. However, MESA Virtio-GPU Venus layered implementation needs determined behavior
to implement vkGetQueryPoolResults
based on vkCmdCopyQueryPoolResults
in an efficient way.
This is solved by requiring that for an unsigned integer query, the 32-bit result value must be
equal to the 32 least significant bits of the equivalent 64-bit result value.
3. Proposal
3.1. New features
The following features are exposed:
typedef struct VkPhysicalDeviceMaintenance7FeaturesKHR {
VkStructureType sType;
void* pNext;
VkBool32 maintenance7;
} VkPhysicalDeviceMaintenance7FeaturesKHR;
-
The
maintenance7
feature indicates support for theVK_KHR_maintenance7
extension.
3.2. New properties
The following properties are added by this extension:
typedef struct VkPhysicalDeviceMaintenance7PropertiesKHR {
VkStructureType sType;
void* pNext;
VkBool32 robustFragmentShadingRateAttachmentAccess;
VkBool32 separateDepthStencilAttachmentAccess;
uint32_t maxDescriptorSetTotalUniformBuffersDynamic;
uint32_t maxDescriptorSetTotalStorageBuffersDynamic;
uint32_t maxDescriptorSetTotalBuffersDynamic;
uint32_t maxDescriptorSetUpdateAfterBindTotalUniformBuffersDynamic;
uint32_t maxDescriptorSetUpdateAfterBindTotalStorageBuffersDynamic;
uint32_t maxDescriptorSetUpdateAfterBindTotalBuffersDynamic;
} VkPhysicalDeviceMaintenance7PropertiesKHR;
-
robustFragmentShadingRateAttachmentAccess
indicates whether a fragment shading rate attachment created with VkImageSubresourceRange::`baseMipLevel` equal to 0 can have a size that is too small to cover a specified render area. -
separateDepthStencilAttachmentAccess
indicates whether read-modify-write operations to a depth/stencil attachment are considered a write to the sibling stencil or depth attachment in an image which contains both depth and stencil aspects. -
maxDescriptorSetTotalUniformBuffersDynamic
indicates the maximum total count of dynamic uniform buffers that can be included in a pipeline layout. -
maxDescriptorSetTotalStorageBuffersDynamic
indicates the maximum total count of dynamic storage buffers that can be included in a pipeline layout. -
maxDescriptorSetTotalBuffersDynamic
indicates the maximum total count of dynamic uniform buffers and storage buffers that can be included in a pipeline layout. -
maxDescriptorSetUpdateAfterBindTotalUniformBuffersDynamic
is similar tomaxDescriptorSetUniformBuffersDynamic
but counts descriptors from descriptor sets created with or without theVK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT
bit set. -
maxDescriptorSetUpdateAfterBindTotalStorageBuffersDynamic
is similar tomaxDescriptorSetStorageBuffersDynamic
but counts descriptors from descriptor sets created with or without theVK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT
bit set. -
maxDescriptorSetUpdateAfterBindTotalBuffersDynamic
is similar tomaxDescriptorSetBuffersDynamic
but counts descriptors from descriptor sets created with or without theVK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT
bit set.
3.3. New flags
The VK_RENDERING_CONTENTS_INLINE_BIT_KHR
flag promoted from
the VK_EXT_nested_command_buffer
extension allows the render pass instance to be recorded
inline within the current command buffer.
Combined with the VK_RENDERING_CONTENTS_SECONDARY_COMMAND_BUFFERS_BIT
bit,
the contents of the render pass instance can be recorded both inline and in
secondary command buffers executed with vkCmdExecuteCommands
.
The VK_SUBPASS_CONTENTS_INLINE_AND_SECONDARY_COMMAND_BUFFERS_KHR
flag
promoted from the VK_EXT_nested_command_buffer
extension allows the contents
of a render pass subpass to be recorded both inline and in secondary command
buffers executed with vkCmdExecuteCommands
.
3.4. New structs
To query information regarding layered implementations, chain the following to vkGetPhysicalDeviceProperties2
.
typedef struct VkPhysicalDeviceLayeredApiPropertiesListKHR {
VkStructureType sType;
void* pNext;
uint32_t layeredApiCount;
VkPhysicalDeviceLayeredApiPropertiesKHR* pLayeredApis;
} VkPhysicalDeviceLayeredApiPropertiesListKHR;
Where:
typedef struct VkPhysicalDeviceLayeredApiPropertiesKHR {
VkStructureType sType;
void* pNext;
uint32_t vendorID;
uint32_t deviceID;
VkPhysicalDeviceLayeredApiKHR layeredAPI;
char deviceName[VK_MAX_PHYSICAL_DEVICE_NAME_SIZE];
} VkPhysicalDeviceLayeredApiPropertiesKHR;
In the above, vendorID
, deviceID
, and deviceName
are similar to members of the same name in VkPhysicalDeviceProperties
.
layeredAPI
is an enum that identifies the underlying API of the layered implementation, for example VK_PHYSICAL_DEVICE_LAYERED_API_D3D12_KHR
if the layer implements the D3D12 API.
In the presence of multiple layers, the contents of pLayeredApis[0]
corresponds to the bottom-most layer, with the following indices (if any) ordered by layer order.
This allows applications who are purely interested in the ultimate vendor ID or API that is executing the commands to avoid querying the layer count, always provide a layeredApiCount
of 1, and inspect only pLayeredApis[0]
.
To query API-specific details of the layered implementation, an API-specific struct can be chained to VkPhysicalDeviceLayeredApiPropertiesKHR
.
For layered Vulkan implementations (i.e. VK_PHYSICAL_DEVICE_LAYERED_API_VULKAN_KHR
) this extension introduces VkPhysicalDeviceLayeredApiVulkanPropertiesKHR
to be chained, with structs for other APIs potentially added in future extensions.
The implementation will fill in the chained struct that corresponds to the layered API, and leave structs for other APIs untouched.
This allows the application to chain structs for multiple APIs and retrieve all necessary information in a single query.
typedef struct VkPhysicalDeviceLayeredApiVulkanPropertiesKHR {
VkStructureType sType;
void* pNext;
VkPhysicalDeviceProperties2 properties;
} VkPhysicalDeviceLayeredApiVulkanPropertiesKHR;
In the above struct, the application may additionally chain VkPhysicalDeviceDriverProperties
and VkPhysicalDeviceIDProperties
to properties
to extract further information from the underlying Vulkan device.
properties.properties.limits
and properties.properties.sparseProperties
will however be 0-initialized and will not contain meaningful values.
For example, an application running through Mesa’s Venus, atop Mesa’s Dozen, atop the Nvidia proprietary D3D12 implementation would receive:
layers->pLayeredApis[0].layeredAPI = VK_PHYSICAL_DEVICE_LAYERED_API_D3D12_KHR;
// other fields
layers->pLayeredApis[1].layeredAPI = VK_PHYSICAL_DEVICE_LAYERED_API_VULKAN_KHR;
// other fields
// If driverProperties is a VkPhysicalDeviceDriverProperties chained to
// VkPhysicalDeviceLayeredApiVulkanPropertiesKHR::properties that is in turn
// chained to layers->pLayeredApis[1].pNext:
driverProperties->driverID = VK_DRIVER_ID_MESA_DOZEN;
// other fields
In the above example, the properties of the top layer (Mesa’s Venus) will be returned as usual in VkPhysicalDeviceProperties2
.
Note: if there are layers underneath a non-Vulkan implementation, they may not be visible in this query.
For example, if the application is running through Mesa’s Dozen, atop VKD3D-proton and so on, the query may return layered implementations only up to Mesa’s Dozen as other APIs may lack such a query.
The VkPhysicalDeviceLayeredApiKHR
enum is defined as:
typedef enum VkPhysicalDeviceLayeredApiKHR {
VK_PHYSICAL_DEVICE_LAYERED_API_VULKAN_KHR = 1,
VK_PHYSICAL_DEVICE_LAYERED_API_D3D12_KHR = 2,
VK_PHYSICAL_DEVICE_LAYERED_API_METAL_KHR = 3,
VK_PHYSICAL_DEVICE_LAYERED_API_OPENGL_KHR = 4,
VK_PHYSICAL_DEVICE_LAYERED_API_OPENGLES_KHR = 5,
} VkPhysicalDeviceLayeredApiKHR;
4. Issues
4.1. RESOLVED: When running on a layered implementation, how should the properties of an underlying layered Vulkan device be queries?
A dedicated struct VkPhysicalDeviceLayeredApiVulkanPropertiesKHR
should be used, with a VkPhysicalDeviceProperties2
member.
Additional information can be queried by chaining VkPhysicalDeviceDriverProperties
and VkPhysicalDeviceIDProperties
structs to that member.
Chaining VkPhysicalDeviceProperties2
, VkPhysicalDeviceDriverProperties
and VkPhysicalDeviceIDProperties
directly to VkPhysicalDeviceLayeredApiPropertiesKHR
can be confusing for a number of reasons.
In particular, VkPhysicalDeviceProperties2
is a "root" structure which can accept VkPhysicalDeviceDriverProperties
and VkPhysicalDeviceIDProperties
in its chain; allowing those structs to be chained to VkPhysicalDeviceLayeredApiPropertiesKHR
means that it would be valid for an application to create a chain such as VkPhysicalDeviceLayeredApiPropertiesKHR
→ VkPhysicalDeviceDriverProperties
→ VkPhysicalDeviceProperties2
which can be confusing.
A future extension could also provide functionality to query properties of another layered API, such as D3D.
This extension allows the API-specific structs to be chained to VkPhysicalDeviceLayeredApiPropertiesKHR
to facilitate querying all information at once, which means the pNext
chain of VkPhysicalDeviceLayeredApiPropertiesKHR
could include property structs for both Vulkan and D3D for example.
Allowing multiple structs per API, potentially interleaved would just add to the confusion.
By wrapping VkPhysicalDeviceProperties2
in a VkPhysicalDeviceLayeredApiVulkanPropertiesKHR
struct, the pNext
chain of VkPhysicalDeviceLayeredApiPropertiesKHR
would contain only one struct per API, and avoid confusion in drivers, applications and validation layers.