Storage Image and Texel Buffers

This chapter covers storage images and texel buffers in Vulkan, explaining their purpose, how to use them, and best practices.

Storage Images

A storage image is a descriptor type (VK_DESCRIPTOR_TYPE_STORAGE_IMAGE) that allows shaders to read from and write to an image without using a fixed-function graphics pipeline. This is particularly useful for compute shaders and advanced rendering techniques.

Creating a Storage Image

To create a storage image, you need to:

  1. Create a VkImage with the VK_IMAGE_USAGE_STORAGE_BIT flag

  2. Create a VkImageView for the image

  3. Create a descriptor set layout with a binding of type VK_DESCRIPTOR_TYPE_STORAGE_IMAGE

  4. Update the descriptor set with the image view

// Create the image with storage usage flag
VkImageCreateInfo imageInfo = {};
imageInfo.sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO;
imageInfo.imageType = VK_IMAGE_TYPE_2D;
imageInfo.format = VK_FORMAT_R32G32B32A32_SFLOAT; // Choose a format that supports storage operations
imageInfo.extent = {width, height, 1};
imageInfo.mipLevels = 1;
imageInfo.arrayLayers = 1;
imageInfo.samples = VK_SAMPLE_COUNT_1_BIT;
imageInfo.tiling = VK_IMAGE_TILING_OPTIMAL;
imageInfo.usage = VK_IMAGE_USAGE_STORAGE_BIT | VK_IMAGE_USAGE_TRANSFER_SRC_BIT;
imageInfo.sharingMode = VK_SHARING_MODE_EXCLUSIVE;
imageInfo.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;

// Create image view
VkImageViewCreateInfo viewInfo = {};
viewInfo.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO;
viewInfo.image = storageImage;
viewInfo.viewType = VK_IMAGE_VIEW_TYPE_2D;
viewInfo.format = VK_FORMAT_R32G32B32A32_SFLOAT;
viewInfo.subresourceRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
viewInfo.subresourceRange.baseMipLevel = 0;
viewInfo.subresourceRange.levelCount = 1;
viewInfo.subresourceRange.baseArrayLayer = 0;
viewInfo.subresourceRange.layerCount = 1;

Using Storage Images in Shaders

In GLSL, storage images are declared using the image type with a format qualifier. The imageLoad and imageStore functions are used to read from and write to the image.

// VK_FORMAT_R32G32B32A32_SFLOAT
layout(set = 0, binding = 0, rgba32f) uniform image2D storageImage;

void main() {
    ivec2 texelCoord = ivec2(gl_GlobalInvocationID.xy);

    // Read from the image
    vec4 value = imageLoad(storageImage, texelCoord);

    // Modify the value
    value = value * 2.0;

    // Write back to the image
    imageStore(storageImage, texelCoord, value);
}

In Slang, storage images are declared similarly to HLSL, using the RWTexture2D type. The Load and Store methods are used to read from and write to the image.

// VK_FORMAT_R32G32B32A32_SFLOAT
[[vk::binding(0, 0)]]
[[vk::image_format("rgba32f")]]
RWTexture2D<float4> storageImage;

[numthreads(8, 8, 1)]
void main(uint3 dispatchThreadID : SV_DispatchThreadID)
{
    int2 texelCoord = int2(dispatchThreadID.xy);

    // Read from the image
    float4 value = storageImage.Load(texelCoord);

    // Modify the value
    value = value * 2.0;

    // Write back to the image
    storageImage[texelCoord] = value;
}

The corresponding SPIR-V assembly:

OpDecorate %storageImage DescriptorSet 0
OpDecorate %storageImage Binding 0

%rgba32f      = OpTypeImage %float 2D 0 0 0 2 Rgba32f
%ptr          = OpTypePointer UniformConstant %rgba32f
%storageImage = OpVariable %ptr UniformConstant

Image Formats for Storage Images

Not all image formats support storage operations. The VK_FORMAT_FEATURE_STORAGE_IMAGE_BIT flag in VkFormatProperties indicates whether a format can be used for storage images.

VkFormatProperties formatProperties;
vkGetPhysicalDeviceFormatProperties(physicalDevice, format, &formatProperties);
if (!(formatProperties.optimalTilingFeatures & VK_FORMAT_FEATURE_STORAGE_IMAGE_BIT)) {
    // Format does not support storage image operations
}

Common formats that typically support storage operations include:

  • VK_FORMAT_R32G32B32A32_SFLOAT

  • VK_FORMAT_R32G32B32A32_UINT

  • VK_FORMAT_R32G32B32A32_SINT

  • VK_FORMAT_R8G8B8A8_UNORM

  • VK_FORMAT_R8G8B8A8_UINT

Synchronization with Storage Images

When using storage images, proper synchronization is crucial to avoid race conditions. Storage images typically use the VK_IMAGE_LAYOUT_GENERAL layout for both reading and writing.

VkImageMemoryBarrier barrier = {};
barrier.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER;
barrier.oldLayout = VK_IMAGE_LAYOUT_UNDEFINED;
barrier.newLayout = VK_IMAGE_LAYOUT_GENERAL;
barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barrier.image = storageImage;
barrier.subresourceRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
barrier.subresourceRange.baseMipLevel = 0;
barrier.subresourceRange.levelCount = 1;
barrier.subresourceRange.baseArrayLayer = 0;
barrier.subresourceRange.layerCount = 1;
barrier.srcAccessMask = 0;
barrier.dstAccessMask = VK_ACCESS_SHADER_WRITE_BIT;

vkCmdPipelineBarrier(
    commandBuffer,
    VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT,
    VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
    0,
    0, nullptr,
    0, nullptr,
    1, &barrier
);

When transitioning between compute shader writes and reads:

barrier.srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT;
barrier.dstAccessMask = VK_ACCESS_SHADER_READ_BIT;

vkCmdPipelineBarrier(
    commandBuffer,
    VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
    VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
    0,
    0, nullptr,
    0, nullptr,
    1, &barrier
);

Texel Buffers

Texel buffers are a way to access buffer data with texture-like operations in shaders. There are two types of texel buffers:

  1. Uniform Texel Buffers (VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER): Read-only access

  2. Storage Texel Buffers (VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER): Read-write access

Creating a Texel Buffer

To create a texel buffer, you need to:

  1. Create a VkBuffer with appropriate usage flags

  2. Create a VkBufferView for the buffer

  3. Create a descriptor set layout with a binding of the appropriate texel buffer type

  4. Update the descriptor set with the buffer view

// Create buffer
VkBufferCreateInfo bufferInfo = {};
bufferInfo.sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO;
bufferInfo.size = size;
bufferInfo.usage = VK_BUFFER_USAGE_UNIFORM_TEXEL_BUFFER_BIT; // or VK_BUFFER_USAGE_STORAGE_TEXEL_BUFFER_BIT
bufferInfo.sharingMode = VK_SHARING_MODE_EXCLUSIVE;

// Create buffer view
VkBufferViewCreateInfo viewInfo = {};
viewInfo.sType = VK_STRUCTURE_TYPE_BUFFER_VIEW_CREATE_INFO;
viewInfo.buffer = buffer;
viewInfo.format = VK_FORMAT_R32G32B32A32_SFLOAT; // Choose a format that supports texel buffer operations
viewInfo.offset = 0;
viewInfo.range = size;

VkBufferView bufferView;
vkCreateBufferView(device, &viewInfo, nullptr, &bufferView);

Using Uniform Texel Buffers in Shaders

In GLSL, uniform texel buffers are declared using the textureBuffer type. The texelFetch function is used to read from the buffer.

layout(set = 0, binding = 0) uniform textureBuffer uniformTexelBuffer;

void main() {
    // Read from the texel buffer
    vec4 value = texelFetch(uniformTexelBuffer, int(gl_GlobalInvocationID.x));

    // Use the value
    // ...
}

In Slang, uniform texel buffers are declared using the Buffer type. The Load method is used to read from the buffer.

[[vk::binding(0, 0)]]
Buffer<float4> uniformTexelBuffer;

[numthreads(64, 1, 1)]
void main(uint3 dispatchThreadID : SV_DispatchThreadID)
{
    // Read from the texel buffer
    float4 value = uniformTexelBuffer.Load(dispatchThreadID.x);

    // Use the value
    // ...
}

The corresponding SPIR-V assembly:

OpDecorate %uniformTexelBuffer DescriptorSet 0
OpDecorate %uniformTexelBuffer Binding 0

%texelBuffer        = OpTypeImage %float Buffer 0 0 0 1 Unknown
%ptr                = OpTypePointer UniformConstant %texelBuffer
%uniformTexelBuffer = OpVariable %ptr UniformConstant

Using Storage Texel Buffers in Shaders

In GLSL, storage texel buffers are declared using the imageBuffer type with a format qualifier. The imageLoad and imageStore functions are used to read from and write to the buffer.

// VK_FORMAT_R32G32B32A32_SFLOAT
layout(set = 0, binding = 0, rgba32f) uniform imageBuffer storageTexelBuffer;

void main() {
    int index = int(gl_GlobalInvocationID.x);

    // Read from the texel buffer
    vec4 value = imageLoad(storageTexelBuffer, index);

    // Modify the value
    value = value * 2.0;

    // Write back to the texel buffer
    imageStore(storageTexelBuffer, index, value);
}

In Slang, storage texel buffers are declared using the RWBuffer type. The Load method and array indexing are used to read from and write to the buffer.

// VK_FORMAT_R32G32B32A32_SFLOAT
[[vk::binding(0, 0)]]
[[vk::image_format("rgba32f")]]
RWBuffer<float4> storageTexelBuffer;

[numthreads(64, 1, 1)]
void main(uint3 dispatchThreadID : SV_DispatchThreadID)
{
    int index = int(dispatchThreadID.x);

    // Read from the texel buffer
    float4 value = storageTexelBuffer.Load(index);

    // Modify the value
    value = value * 2.0;

    // Write back to the texel buffer
    storageTexelBuffer[index] = value;
}

The corresponding SPIR-V assembly:

OpDecorate %storageTexelBuffer DescriptorSet 0
OpDecorate %storageTexelBuffer Binding 0

%rgba32f           = OpTypeImage %float Buffer 0 0 0 2 Rgba32f
%ptr               = OpTypePointer UniformConstant %rgba32f
%storageTexelBuffer = OpVariable %ptr UniformConstant

Using non-rgba Format for Texel Buffer

A common mistake when dealing with Texel Buffers is forgetting you are accessing a single texel at a time. This texel format can have 1 to 4 components (R8 vs RGBA8). Some shading languages, such as GLSL, require you to write all 4 components where the extra components are ignored.

// VK_FORMAT_R32_UINT
layout(set = 0, binding = 0, r32ui) uniform uimageBuffer storageTexelBuffer;

void main() {
    // Invalid in GLSL, need to use a uvec4
    uint a = 1;
    imageStore(storageTexelBuffer, 0, a);

    // Common mistake is to assume this will write all 4 values to 4 consecutive texels.
    // Only "1" is written and the other 3 components are discarded because the format only contains 1 component
    uvec4 b = uvec4(1, 2, 3, 4);
    imageStore(storageTexelBuffer, 0, b);
}

Formats for Texel Buffers

Not all formats support texel buffer operations. The VK_FORMAT_FEATURE_UNIFORM_TEXEL_BUFFER_BIT and VK_FORMAT_FEATURE_STORAGE_TEXEL_BUFFER_BIT flags in VkFormatProperties indicate whether a format can be used for uniform and storage texel buffers, respectively.

VkFormatProperties formatProperties;
vkGetPhysicalDeviceFormatProperties(physicalDevice, format, &formatProperties);
if (!(formatProperties.bufferFeatures & VK_FORMAT_FEATURE_UNIFORM_TEXEL_BUFFER_BIT)) {
    // Format does not support uniform texel buffer operations
}
if (!(formatProperties.bufferFeatures & VK_FORMAT_FEATURE_STORAGE_TEXEL_BUFFER_BIT)) {
    // Format does not support storage texel buffer operations
}

The code above is using the bufferFeatures member of VkFormatProperties to check for texel buffer support, as opposed to optimalTilingFeatures or linearTilingFeatures which are used for images.

Synchronization with Texel Buffers

When using storage texel buffers, proper synchronization is crucial to avoid race conditions.

VkBufferMemoryBarrier barrier = {};
barrier.sType = VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER;
barrier.srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT;
barrier.dstAccessMask = VK_ACCESS_SHADER_READ_BIT;
barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barrier.buffer = buffer;
barrier.offset = 0;
barrier.size = VK_WHOLE_SIZE;

vkCmdPipelineBarrier(
    commandBuffer,
    VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
    VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
    0,
    0, nullptr,
    1, &barrier,
    0, nullptr
);

Comparison with Other Buffer Types

Storage Image vs. Storage Buffer

While both storage images and storage buffers allow for read-write access in shaders, they have different use cases:

  • Storage Images: Ideal for 2D or 3D data that benefits from texture operations like filtering or addressing modes.

  • Storage Buffers: Better for arbitrary structured data or when you need to access data in a non-uniform pattern.

Texel Buffer vs. Storage Buffer

Texel buffers and storage buffers also have different strengths:

  • Texel Buffers: Provide texture-like access to buffer data, allowing for operations like filtering.

  • Storage Buffers: More flexible for general-purpose data storage and manipulation.

Considerations for Tile-Based Renderers

Many mobile GPUs and some desktop GPUs use a tile-based rendering architecture, which has important implications for how storage images and texel buffers should be used.

What is Tile-Based Rendering?

In tile-based rendering (TBR) or tile-based deferred rendering (TBDR), the GPU divides the framebuffer into small rectangular regions called tiles. Each tile is processed completely (all draw calls affecting that tile) before moving to the next tile. This approach:

  • Reduces memory bandwidth by keeping tile data in fast on-chip memory

  • Improves power efficiency, which is particularly important for mobile devices

  • Allows for efficient implementation of certain rendering techniques

Storage Images in Tile-Based Renderers

When using storage images with tile-based renderers, consider the following:

  1. Tile Memory Flushing: Writing to storage images may cause the GPU to flush tile memory to main memory, reducing the benefits of tile-based rendering.

    • This can significantly impact performance, especially if done frequently

    • Try to batch storage image operations to minimize tile memory flushes

  2. Transient Attachments: Some tile-based renderers support special "transient" attachments that exist only in tile memory.

    • These cannot be used as storage images since they don’t have backing memory

    • If you need to process render results, consider using input attachments instead where possible

  3. Pixel Local Storage Extensions: Some tile-based GPUs offer extensions like VK_EXT_shader_pixel_local_storage that provide more efficient alternatives to storage images for certain use cases.

    • These extensions allow shaders to access per-pixel data that stays in tile memory

    • Check for and use these extensions when available on tile-based hardware

  4. Render Pass Coherency: In tile-based renderers, data written to storage images during a render pass may not be visible to subsequent draw calls in the same render pass.

    • Use appropriate memory barriers or split your work into multiple render passes

    • Be aware that these barriers may be more expensive on tile-based renderers

Texel Buffers in Tile-Based Renderers

Texel buffers generally work similarly on tile-based and immediate mode renderers, but there are still some considerations:

  1. Cache Coherency: Tile-based renderers may have different caching behaviors for texel buffer access.

    • Ensure proper synchronization when writing to and reading from texel buffers

    • Be aware that cache flushes may be more expensive on tile-based architectures

  2. Memory Access Patterns: Tile-based renderers may be more sensitive to non-coherent memory access patterns.

    • Organize your data to maximize locality for the tile being processed

    • Consider the tile size when designing your algorithms

Performance Optimization for Tile-Based Renderers

  1. Minimize Framebuffer Resolves: Each time you need to access framebuffer contents as a storage image, the tile-based renderer must "resolve" (write) the tile memory to main memory.

    • Try to complete all operations that modify a particular image before reading from it

    • Consider using subpasses and input attachments instead of storage images for operations within a render pass

  2. Prefer Render Passes Over Compute for Image Processing: On tile-based renderers, operations within a render pass can often be more efficient than compute shaders using storage images.

    • Consider implementing image processing as fragment shaders in a render pass

    • Use multiple subpasses to keep intermediate results in tile memory

  3. Be Careful with Mixed Access Patterns: Mixing reads and writes to the same storage image can be particularly expensive on tile-based renderers.

    • Try to separate read and write phases

    • Consider double-buffering techniques to avoid read-after-write hazards

Format Compatibility Requirements

When using storage images and texel buffers, it’s crucial to understand the format compatibility rules, which differ slightly between these resource types. Mismatches between shader formats and resource formats can lead to undefined behavior and potential validation warnings.

Differences in Format Compatibility Rules

The format compatibility rules differ between storage images and texel buffers in subtle but important ways:

  1. Storage Images: The format specified in the shader (SPIR-V Image Format) must exactly match the format used when creating the VkImageView (Vulkan Format).

  2. Texel Buffers: The format compatibility is more relaxed. The shader can access the data as long as the format is compatible with the buffer view format according to the format compatibility classes.

This difference means that storage images have stricter format requirements than texel buffers, which can lead to confusion when working with both resource types.

SPIR-V Image Format and Vulkan Format Compatibility

The Vulkan Specification defines a table of Compatibility Between SPIR-V Image Formats and Vulkan Formats that shows the exact mapping between SPIR-V Image Formats and Vulkan Formats.

Storage Images Format Requirements

For storage images, the format specified in the shader must exactly match the format of the image view according to this table. There is no automatic format conversion or component swizzling.

// SPIR-V format Rgba8 (maps to VK_FORMAT_R8G8B8A8_UNORM)
layout(set = 0, binding = 0, rgba8) uniform image2D storageImage;

// The VkImageView must be created with VK_FORMAT_R8G8B8A8_UNORM
// Using VK_FORMAT_B8G8R8A8_UNORM would result in undefined behavior

Texel Buffers Format Requirements

For texel buffers, the format compatibility is determined by the format compatibility classes. Formats within the same compatibility class can be used interchangeably, with the shader performing the necessary conversions.

// For uniform texel buffers, the format is not specified in the shader
layout(set = 0, binding = 0) uniform textureBuffer uniformTexelBuffer;

// The VkBufferView can be created with any compatible format
// For example, both VK_FORMAT_R8G8B8A8_UNORM and VK_FORMAT_B8G8R8A8_UNORM would work

For storage texel buffers, a format is specified in the shader, but the compatibility rules are still more relaxed than for storage images:

// SPIR-V format Rgba8 (maps to VK_FORMAT_R8G8B8A8_UNORM)
layout(set = 0, binding = 0, rgba8) uniform imageBuffer storageTexelBuffer;

// The VkBufferView should ideally be created with VK_FORMAT_R8G8B8A8_UNORM
// But formats in the same compatibility class may work on some implementations

Component Swizzling

Component swizzling is another area where storage images and texel buffers differ:

  1. Storage Images: No automatic component swizzling occurs. The components are accessed exactly as they are stored in memory. If you need to swizzle components, (e.g., convert between RGBA and BGRA), you must do it manually in your shader code.

  2. Texel Buffers: Some implementations may perform automatic component swizzling based on the format compatibility class. However, this behavior is not guaranteed across all implementations, so it’s best practice to match formats exactly when possible.

  3. Image Views: For sampled images (not storage images), you can use the VkComponentMapping structure in VkImageViewCreateInfo to specify component swizzling. This is not applicable to storage images or texel buffers.

Common Format Mismatch Cases

Several types of format mismatches can occur, all of which result in undefined behavior:

  1. Component Size Mismatch: When the component size in the SPIR-V format differs from the Vulkan format.

    • Example: SPIR-V format Rgba32f (32-bit float components) with VK_FORMAT_R8G8B8A8_UNORM (8-bit components)

  2. Component Count Mismatch: When the number of components in the SPIR-V format differs from the Vulkan format.

    • More Components Written: SPIR-V format Rgba8 (4 components) with VK_FORMAT_R8_UNORM (1 component)

    • Less Components Written: SPIR-V format R8 (1 component) with VK_FORMAT_R8G8B8A8_UNORM (4 components)

  3. Numeric Format Mismatch: When the numeric format (normalized, float, int) in the SPIR-V format differs from the Vulkan format.

    • Example: SPIR-V format Rgba8 (UNORM) with VK_FORMAT_R8G8B8A8_SNORM (SNORM)

  4. Numeric Type Mismatch: When the numeric type (float, int, uint) in the SPIR-V format differs from the Vulkan format.

    • Example: SPIR-V format R8 (float) with VK_FORMAT_R8_SINT (signed int)

    • Example: SPIR-V format R8ui (unsigned int) with VK_FORMAT_R8_SINT (signed int)

  5. Channel Order Mismatch: When the channel order in the SPIR-V format differs from the Vulkan format.

    • Example: SPIR-V format Rgba8 (RGBA order) with VK_FORMAT_B8G8R8A8_UNORM (BGRA order)

    • This is particularly problematic for storage images, where no automatic swizzling occurs

How to Fix Format Mismatches

There are different approaches to fix format mismatches depending on the resource type:

For Storage Images

  • Match the Formats Exactly: Ensure that the VkImageView format exactly matches the SPIR-V Image Format as defined in the compatibility table.

    • For example, if your shader uses rgba8 (SPIR-V format Rgba8), create your VkImageView with VK_FORMAT_R8G8B8A8_UNORM.

    • If you need to work with a different format (e.g., VK_FORMAT_B8G8R8A8_UNORM), you’ll need to manually swizzle the components in your shader code:

// Manual swizzling for BGRA to RGBA conversion
vec4 value = imageLoad(storageImage, texelCoord);
vec4 swizzled = value.bgra; // Manually swizzle components
// Use swizzled value
  • Use the Unknown Format in SPIR-V: If you need flexibility in the formats you use, you can use the Unknown format in SPIR-V, which is compatible with any Vulkan format.

    • This requires enabling the shaderStorageImageWriteWithoutFormat feature.

    • In GLSL, this means omitting the format qualifier:

// No format specified, uses Unknown in SPIR-V
layout(set = 0, binding = 0) uniform image2D storageImage;
  • Note that when using the Unknown format, you’re responsible for ensuring that the data you read from or write to the image is compatible with the actual format of the image.

For Texel Buffers

  1. Match the Formats When Possible: While texel buffers have more relaxed format compatibility rules, it’s still best practice to match formats exactly when possible.

  2. Use Format Compatibility Classes: If you need to work with different formats, ensure they are in the same format compatibility class.

  3. Handle Component Swizzling in Shader: If you’re working with formats that have different component orders (e.g., RGBA vs. BGRA), handle the swizzling explicitly in your shader code to ensure consistent behavior across all implementations.

Important Considerations

  • When a format mismatch occurs with storage images, the entire image memory becomes undefined, not just the texels being written.

  • Even formats that are in the same compatibility class (e.g., VK_FORMAT_R8G8B8A8_UNORM and VK_FORMAT_B8G8R8A8_UNORM) must match exactly for storage images.

  • Texel buffers have more relaxed format compatibility rules, but it’s still best practice to match formats exactly when possible.

  • The validation warnings for format mismatches are intended to help developers identify potential issues, as these mismatches can lead to subtle bugs that might not be immediately clear.

  • Component swizzling must be handled manually for storage images, while some automatic swizzling may occur for texel buffers on some implementations.

Best Practices

Performance Considerations

  1. Format Selection: Choose formats that are natively supported by the hardware for better performance.

    • Prefer formats with native hardware support (check VkFormatProperties)

    • For storage images, 32-bit formats (R32_*) often have better performance than packed formats

    • Consider using single-channel formats when only one channel is needed to reduce memory bandwidth

  2. Memory Access Patterns: Try to ensure coalesced memory access patterns when reading from or writing to storage images and texel buffers.

    • Group memory accesses to adjacent locations to maximize cache efficiency

    • In compute shaders, align work group sizes with hardware warp/wavefront sizes

    • Consider the memory layout when accessing 2D images (row-major vs. column-major access)

    • For texel buffers, sequential access is generally faster than random access

  3. Synchronization: Use the minimal necessary synchronization to avoid performance penalties.

    • Use the most specific access flags and pipeline stages possible

    • Batch operations to reduce the number of barriers needed

    • Consider using VK_PIPELINE_STAGE_ALL_COMMANDS_BIT only when absolutely necessary

    • For compute workloads, try to design algorithms that minimize synchronization points

  4. Resource Reuse: Reuse storage images and texel buffers when possible to reduce memory allocation overhead.

    • Consider implementing a resource pool for frequently created/destroyed resources

    • Use double or triple buffering techniques for resources that are updated every frame

  5. Workload Balancing: Distribute work evenly across compute shader invocations.

    • Choose appropriate workgroup sizes based on your hardware (typically multiples of 32 or 64)

    • Avoid divergent execution paths within a workgroup

    • Consider tiled processing for large images to improve cache locality

Common Pitfalls

  1. Format Support: Not all formats support storage operations. Always check format features.

    • Use vkGetPhysicalDeviceFormatProperties to verify format support before creating resources

    • Some formats may support storage operations but with reduced performance

    • Be aware that format support can vary between different hardware vendors

  2. Memory Barriers: Missing or incorrect memory barriers can lead to race conditions and undefined behavior.

    • Always use appropriate memory barriers between writes and subsequent reads

    • Remember that barriers are needed even when operations are in the same shader

    • For compute shaders, use memoryBarrierImage() or memoryBarrierBuffer() in GLSL when appropriate

    • Be careful with multiple queue submissions that access the same resources

  3. Layout Transitions: Storage images typically use VK_IMAGE_LAYOUT_GENERAL, but transitioning to this layout is still required.

    • Always transition images to the correct layout before use

    • Be aware that VK_IMAGE_LAYOUT_GENERAL may be less efficient than specialized layouts

    • Consider using VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL if you only need read access

  4. Atomic Operations: Atomic operations on storage images and buffers can be expensive.

    • Minimize the use of atomic operations when possible

    • Consider alternative algorithms that don’t require atomics

    • Be aware that atomic performance varies significantly between hardware vendors

    • Group atomic operations to minimize memory contention

  5. Resource Limits: Be aware of device limits for storage images and texel buffers.

    • Check maxPerStageDescriptorStorageImages and related limits

    • Some devices may have restrictions on the number of storage resources that can be written to

    • Consider the impact on descriptor set layout when using many storage resources

  6. Validation Layers: Use validation layers during development to catch common errors.

    • Enable synchronization validation to detect barrier issues

    • Pay attention to warnings about format support and usage flags

    • Test on multiple hardware vendors if possible to catch implementation-specific issues

  7. Shader Compilation: Be aware of shader compilation implications.

    • Complex storage image and texel buffer operations may increase register pressure

    • Consider splitting complex shaders into multiple passes

    • Profile shader performance to identify bottlenecks

Example Use Cases

Image Processing with Storage Images

Storage images are ideal for image processing tasks like filters, blurs, and other post-processing effects.

Particle Systems with Storage Texel Buffers

Storage texel buffers can be used to store and update particle data in a compute shader, which can then be read by a vertex shader for rendering.

Lookup Tables with Uniform Texel Buffers

Uniform texel buffers are useful for implementing lookup tables that need to be accessed with texture-like operations.