VK_EXT_zero_initialize_device_memory

VK_EXT_zero_initialize_device_memory adds more control over how memory is cleared on allocation, so that only the minimal work needed to meet application requirements is performed.

1. Problem Statement

By default, Vulkan provides no guarantees that device memory allocated through vkAllocateMemory is cleared to zero. This means that applications wanting resources to be zero-initialized must execute a command such as vkCmdFillBuffer or vkCmdClearColorImage on the device to ensure a deterministic result. This can be wasteful if the underlying platform either:

  • Already performs that zero clear anyway, due to e.g. security concerns.

  • Can perform the clear more efficiently inside the implementation, e.g. by clearing pages to zero in the background after device memory is freed.

This extension is also useful in API layering and porting efforts, where the zero-memory guarantees of the layered API may be stricter than Vulkan's. OS platforms also differ widely in their behavior here, which forces implementations to apply workarounds to paper over the differences. An extension that makes allocation behavior explicit should lead to a more robust ecosystem for Vulkan.

2. Solution Space

The solution is to provide a mechanism for applications to control this behavior at vkAllocateMemory time.

2.1. Image clears

Using zero clears to initialize images also has some special concerns. To be able to use an optimally tiled image, the application must transition it away from VK_IMAGE_LAYOUT_UNDEFINED, and that transition is allowed to clobber the image. It may be possible to guarantee that the accessed texels are all 0 after:

  • Clearing memory to zero (through either zero allocation or vkCmdFillBuffer).

  • Transitioning the image away from UNDEFINED.

  • Accessing texels.

One way to solve this would have been to reuse the PREINITIALIZED layout, but allowing it with OPTIMAL tiling would be quirky, since PREINITIALIZED is intended for host access aliasing and imported memory. The proposed solution instead introduces a new layout for use as initialLayout. It makes explicit that it functions like PREINITIALIZED, but only for zeroed memory, and that it works with OPTIMAL tiled images, unlike PREINITIALIZED itself.

3. Proposal

The extension adds a feature struct:

typedef struct VkPhysicalDeviceZeroInitializeDeviceMemoryFeaturesEXT
{
	VkStructureType sType;
	void *pNext;
	VkBool32 zeroInitializeDeviceMemory;
} VkPhysicalDeviceZeroInitializeDeviceMemoryFeaturesEXT;

and a new VkMemoryAllocateFlagBits flag:

VK_MEMORY_ALLOCATE_ZERO_INITIALIZE_BIT_EXT = 0x00000008

Setting this flag in the VkMemoryAllocateFlagsInfo chained to a VkMemoryAllocateInfo causes the allocated memory to be zero-initialized.
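As a rough illustration of how an application might act on this flag, the sketch below decides whether to request zero-initialization at allocation time or to fall back to an explicit device clear. It deliberately does not include the Vulkan headers so that it compiles on its own: SKETCH_ZERO_INITIALIZE_BIT only mirrors the value listed above, and SketchAllocPlan and sketch_plan_allocation are hypothetical names that are not part of the extension.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for VK_MEMORY_ALLOCATE_ZERO_INITIALIZE_BIT_EXT, using the value
 * listed in this proposal so the sketch compiles without the Vulkan headers. */
#define SKETCH_ZERO_INITIALIZE_BIT 0x00000008u

/* Hypothetical helper type: what to request at allocation time and what
 * work is still left for the application afterwards. */
typedef struct SketchAllocPlan {
    uint32_t allocate_flags;       /* value for VkMemoryAllocateFlagsInfo::flags */
    bool     needs_explicit_clear; /* true: still clear on the device, e.g. vkCmdFillBuffer */
} SketchAllocPlan;

/* feature_enabled: the zeroInitializeDeviceMemory feature is enabled on the device.
 * wants_zeroed:    the application relies on the memory starting out as zero.
 * other_flags:     unrelated allocate flags the application already uses. */
SketchAllocPlan sketch_plan_allocation(bool feature_enabled, bool wants_zeroed,
                                       uint32_t other_flags)
{
    /* The zero-initialize bit is decided here, not by the caller. */
    assert((other_flags & SKETCH_ZERO_INITIALIZE_BIT) == 0);

    SketchAllocPlan plan = { other_flags, false };
    if (wants_zeroed) {
        if (feature_enabled) {
            /* Let the implementation provide zeroed memory. */
            plan.allocate_flags |= SKETCH_ZERO_INITIALIZE_BIT;
        } else {
            /* No guarantee available: the application must clear explicitly. */
            plan.needs_explicit_clear = true;
        }
    }
    return plan;
}
```

In a real application the resulting allocate_flags value would be written to VkMemoryAllocateFlagsInfo::flags, and needs_explicit_clear would select the existing vkCmdFillBuffer or vkCmdClearColorImage path.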

To make zero-clearing images as efficient as possible when memory is known to be zero, there is a new image layout:

VK_IMAGE_LAYOUT_ZERO_INITIALIZED_EXT = 1000620000

After a transition away from VK_IMAGE_LAYOUT_ZERO_INITIALIZED_EXT:

  • the image data for block-compressed formats will be bitwise 0

  • the texels of images in all other formats will all have a value of 0.

If not all memory bound to the VkImage (including sparse binds) is bitwise 0, the result is undefined, i.e. the transition behaves just like a transition away from the UNDEFINED layout. VK_IMAGE_LAYOUT_ZERO_INITIALIZED_EXT can only be used for the full contents of the image, and only as the oldLayout of a transition away from the image's initialLayout.
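The rules above can be summarized as a small decision function. This is only a sketch of the stated rules, not validation-layer or driver code: the layout enumerators are local stand-ins that mirror VK_IMAGE_LAYOUT_UNDEFINED and the value given in this proposal, and SketchImage, SketchResult and sketch_transition_from_zero_initialized are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local stand-ins so the sketch compiles without the Vulkan headers. */
typedef enum SketchLayout {
    SKETCH_LAYOUT_UNDEFINED        = 0,          /* VK_IMAGE_LAYOUT_UNDEFINED */
    SKETCH_LAYOUT_ZERO_INITIALIZED = 1000620000  /* VK_IMAGE_LAYOUT_ZERO_INITIALIZED_EXT */
} SketchLayout;

/* Hypothetical bookkeeping an application or tool might keep per image. */
typedef struct SketchImage {
    SketchLayout initial_layout;        /* initialLayout used at image creation */
    bool         transitioned_before;   /* the image has already left its initial layout */
    bool         bound_memory_all_zero; /* all bound memory, including sparse binds, is bitwise 0 */
} SketchImage;

typedef enum SketchResult {
    SKETCH_INVALID_USAGE,      /* the layout may not be used as oldLayout here */
    SKETCH_CONTENTS_UNDEFINED, /* behaves like a transition away from UNDEFINED */
    SKETCH_CONTENTS_ZERO       /* texels are 0 (bitwise 0 for block-compressed data) */
} SketchResult;

/* Outcome of a transition whose oldLayout is the zero-initialized layout. */
SketchResult sketch_transition_from_zero_initialized(const SketchImage *image,
                                                     bool covers_full_image)
{
    assert(image != NULL);

    /* Only the transition away from initialLayout, over the full image, is allowed. */
    if (image->initial_layout != SKETCH_LAYOUT_ZERO_INITIALIZED ||
        image->transitioned_before || !covers_full_image) {
        return SKETCH_INVALID_USAGE;
    }
    /* The guarantee only holds if every bound byte really is zero. */
    if (!image->bound_memory_all_zero) {
        return SKETCH_CONTENTS_UNDEFINED;
    }
    return SKETCH_CONTENTS_ZERO;
}
```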

This layout is intended to help various use cases:

  • If an application desires a zero-cleared image after allocation, it would normally have to clear it after transitioning from UNDEFINED, since transitions from UNDEFINED clobber the image. This can be quite inefficient, especially if the image is not framebuffer compressed. Since the implementation knows the memory is all zero, it should have a fast path to make sure the texels are all zero.

  • If the implementation knows the memory is all zero, it may be able to omit metadata clears or similar work. Such clears are very cheap, but they are not completely free. On implementations where undefined metadata can cause stability problems, the implementation can treat a transition from this layout like a transition from UNDEFINED, as long as the zeroing guarantee is met.

  • Having an explicit layout aids debugging tools such as RenderDoc, since tools are allowed to explicitly clobber image data when transitioning away from UNDEFINED for debugging purposes.

It is important to note that filling memory with zero and then transitioning a fresh image away from VK_IMAGE_LAYOUT_ZERO_INITIALIZED_EXT is not intended as a "faster" way to obtain a zero-filled image when the memory was not already guaranteed to be zero.
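One way to read this guidance is as a choice of initialLayout made when the image is created. The sketch below is a minimal illustration under that reading, not a prescribed pattern: the layout constants are local stand-ins for VK_IMAGE_LAYOUT_UNDEFINED and the value listed in this proposal, and SketchImageInitPlan and sketch_plan_image_init are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins so the sketch compiles without the Vulkan headers. */
#define SKETCH_INITIAL_LAYOUT_UNDEFINED        0u          /* VK_IMAGE_LAYOUT_UNDEFINED */
#define SKETCH_INITIAL_LAYOUT_ZERO_INITIALIZED 1000620000u /* VK_IMAGE_LAYOUT_ZERO_INITIALIZED_EXT */

/* Hypothetical helper type describing how a fresh image gets its contents. */
typedef struct SketchImageInitPlan {
    uint32_t initial_layout;      /* value for VkImageCreateInfo::initialLayout */
    bool     needs_clear_command; /* true: clear after the first transition, e.g. vkCmdClearColorImage */
} SketchImageInitPlan;

/* wants_zero_image:       the application needs all texels to start at 0.
 * memory_guaranteed_zero: the bound memory was allocated as zero-initialized,
 *                         not merely filled with zeros by the application. */
SketchImageInitPlan sketch_plan_image_init(bool wants_zero_image,
                                           bool memory_guaranteed_zero)
{
    SketchImageInitPlan plan = { SKETCH_INITIAL_LAYOUT_UNDEFINED, false };
    if (wants_zero_image) {
        if (memory_guaranteed_zero) {
            /* The allocation already guarantees zeros: no clear is needed. */
            plan.initial_layout = SKETCH_INITIAL_LAYOUT_ZERO_INITIALIZED;
        } else {
            /* Do not fill the memory and pretend: clear the image normally. */
            plan.needs_clear_command = true;
        }
    }
    /* The zero-initialized layout and an explicit clear are never combined. */
    assert(!(plan.needs_clear_command &&
             plan.initial_layout == SKETCH_INITIAL_LAYOUT_ZERO_INITIALIZED));
    return plan;
}
```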

4. Issues

4.1. RESOLVED: Should exportable memory be supported?

Yes, exportable memory is supported just like non-exportable memory.