Samples overview

Introduction

This readme lists all Vulkan samples currently available in this repository. They are grouped into multiple categories. Many samples come with a tutorial, which can be found in their respective folders.

Performance samples

The goal of these samples is to demonstrate how to use certain features and functions to achieve optimal performance. To visualize this, they also include real-time profiling information.

AFBC

AFBC (Arm Frame Buffer Compression) is a real-time lossless compression algorithm found in Arm Mali GPUs, designed to tackle the ever-growing demand for higher resolution graphics. This format is applied to the framebuffers that are to be written to the GPU. This technology can offer bandwidth reductions of up to 50%.

Command buffer usage

This sample demonstrates how to use and manage secondary command buffers, and how to record them concurrently. Implementing multi-threaded recording of draw calls can help reduce CPU frame time.

Constant data

The Vulkan API exposes a few different ways in which we can send uniform data into our shaders. There are enough methods that it raises the question "Which one is fastest?", and more often than not the answer is "It depends". The main issue for developers is that the fastest methods may differ between the various vendors, so often there is no "one size fits all" solution. This sample aims to highlight this issue, and help move the Vulkan ecosystem to a point where we are better equipped to solve this for developers. This is done by providing an interactive way to toggle between the different constant data methods that the Vulkan API exposes. The sample can then be run on a platform of the developer's choice to see the performance implications of each method.

Descriptor management

An application using Vulkan will have to implement a system to manage descriptor pools and sets. The most straightforward and flexible approach is to re-create them for each frame, but doing so might be very inefficient, especially on mobile platforms. The problem of descriptor management is intertwined with that of buffer management, that is, choosing how to pack data in VkBuffer objects. This sample will explore a few options to improve both descriptor and buffer management.

HPP Swapchain images

A transcoded version of the Performance sample Swapchain images that illustrates the usage of the C++ bindings of Vulkan provided by vulkan.hpp.

Image compression control

This sample shows how to use the extensions VK_EXT_image_compression_control and VK_EXT_image_compression_control_swapchain to select between different levels of image compression. The UI shows the impact compression has on image size and bandwidth, illustrating the benefits of fixed-rate (visually lossless) compression.

Layout transitions

Vulkan requires the application to manage image layouts, so that all render pass attachments are in the correct layout when the render pass begins. This is usually done using pipeline barriers or the initialLayout and finalLayout parameters of the render pass. If the rendering pipeline is complex, transitioning each image to its correct layout is not trivial, as it requires some sort of state tracking. If previous image contents are not needed, there is an easy way out: setting oldLayout/initialLayout to VK_IMAGE_LAYOUT_UNDEFINED. While this is functionally correct, it can have performance implications as it may prevent the GPU from performing some optimizations. This sample will cover an example of such optimizations and how to avoid the performance overhead from using sub-optimal layouts.

MSAA

Aliasing is the result of under-sampling a signal. In graphics this means computing the color of a pixel at a resolution that results in artifacts, commonly jaggies at model edges. Multisample anti-aliasing (MSAA) is an efficient technique that reduces pixel sampling error.
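An application enabling MSAA typically has to pick a sample count that the device supports. The following is a minimal sketch of that selection, assuming the supported counts arrive as a bitmask in which each bit's numeric value equals its sample count (as is the case for VkSampleCountFlagBits). The function name choose_sample_count is hypothetical and not taken from the sample.

```cpp
#include <cassert>
#include <cstdint>

// Pick the highest supported sample count that does not exceed `requested`.
// `supported_mask` is assumed to use one bit per count, where the bit's value
// equals the count (1, 2, 4, ... 64), mirroring VkSampleCountFlagBits.
// Hypothetical helper, not part of the sample's code.
uint32_t choose_sample_count(uint32_t supported_mask, uint32_t requested)
{
    assert(requested >= 1);
    for (uint32_t count = 64; count > 1; count /= 2)
    {
        if (count <= requested && (supported_mask & count) != 0)
        {
            return count;
        }
    }
    // A single sample per pixel (no MSAA) is the fallback.
    return 1;
}
```

For example, a device reporting the mask 0x07 (1, 2 and 4 samples) would yield 4 when 8 samples are requested.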

Multi-threaded recording with multiple render passes

Ideally you render all stages of your frame in a single render pass. However, in some cases different stages can’t be performed in the same render pass. This sample shows how multi-threading can help to boost performance when using multiple render passes to render a single frame.

Pipeline barriers

Vulkan gives the application significant control over memory access for resources. Pipeline barriers are particularly convenient for synchronizing memory accesses between render passes. Having barriers is required whenever there is a memory dependency - the application should not assume that render passes are executed in order. However, having too many or too strict barriers can affect the application’s performance. This sample will cover how to set up pipeline barriers efficiently, with a focus on pipeline stages.

Pipeline cache

Vulkan gives applications the ability to save the internal representation of a pipeline (graphics or compute) to enable recreating the same pipeline later. This sample will look in detail at the implementation and performance implications of pipeline creation, caching and management.

Render passes

Vulkan render-passes use attachments to describe input and output render targets. This sample shows how loading and storing attachments might affect performance on mobile. During the creation of a render-pass, you can specify various color attachments and a depth-stencil attachment. Each of those is described by a VkAttachmentDescription struct, which contains attributes to specify the load operation (loadOp) and the store operation (storeOp). This sample lets you choose between different combinations of these operations at runtime.

Specialization constants

Vulkan exposes a number of methods for setting values within shader code at run time, including UBOs and specialization constants. This sample compares these two methods and their performance impact.

Sub passes

Vulkan introduces the concept of subpasses to subdivide a single render pass into separate logical phases. The benefit of using subpasses over multiple render passes is that a GPU is able to perform various optimizations. Tile-based renderers, for example, can take advantage of tile memory, which, being on chip, is decisively faster than external memory, potentially saving a considerable amount of bandwidth.

Surface rotation

Mobile devices can be rotated, therefore the logical orientation of the application window and the physical orientation of the display may not match. Applications then need to be able to operate in two modes: portrait and landscape. The difference between these two modes can be simplified to just a change in resolution. However, some display subsystems always work on the "native" (or "physical") orientation of the display panel. Since the device has been rotated, to achieve the desired effect the application output must also rotate. In this sample we focus on the rotation step, and analyze the performance implications of implementing it correctly with Vulkan.
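One consequence of working in the display's native orientation is that the render target dimensions follow the panel rather than the window. The sketch below illustrates that single step, assuming that a 90 or 270 degree rotation swaps width and height. The Rotation enum, the Extent struct and native_extent are hypothetical names for illustration and do not mirror the sample's code.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical rotation of the device relative to the panel's native orientation.
enum class Rotation { Identity, Rotate90, Rotate180, Rotate270 };

// Hypothetical 2D size in pixels.
struct Extent
{
    uint32_t width;
    uint32_t height;
};

// Convert the logical (window) extent to the extent in the panel's native
// orientation: quarter-turn rotations swap the two dimensions.
Extent native_extent(Extent logical, Rotation rotation)
{
    if (rotation == Rotation::Rotate90 || rotation == Rotation::Rotate270)
    {
        std::swap(logical.width, logical.height);
    }
    return logical;
}
```

The geometry itself still has to be rotated to match, which is the step the sample analyzes.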

Swapchain images

Vulkan gives the application significant control over the number of swapchain images to be created. This sample analyzes the available options and their performance implications.
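Whatever image count an application prefers, it has to stay within the limits reported by the surface. The following is a minimal sketch of that clamping, assuming a struct that stands in for the minImageCount and maxImageCount fields of VkSurfaceCapabilitiesKHR, where a maximum of 0 means there is no upper limit. SurfaceImageLimits and choose_image_count are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the two VkSurfaceCapabilitiesKHR fields used here.
struct SurfaceImageLimits
{
    uint32_t min_image_count;
    uint32_t max_image_count; // 0 means the surface imposes no upper limit
};

// Clamp the requested number of swapchain images to what the surface allows.
uint32_t choose_image_count(const SurfaceImageLimits &limits, uint32_t requested)
{
    assert(limits.max_image_count == 0 || limits.max_image_count >= limits.min_image_count);

    uint32_t count = std::max(requested, limits.min_image_count);
    if (limits.max_image_count != 0)
    {
        count = std::min(count, limits.max_image_count);
    }
    return count;
}
```

Requesting three images (triple buffering) on a surface that only allows two would therefore fall back to two.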

Wait idle

This sample compares two methods for synchronizing between the CPU and GPU, WaitIdle and fences, demonstrating which one is the better option for avoiding stalls.

16-bit storage InputOutput

This sample compares the bandwidth consumption of FP32 varyings with that of FP16 varyings using VK_KHR_16bit_storage.

16-bit arithmetic

This sample compares arithmetic throughput for 32-bit arithmetic operations and 16-bit arithmetic. The sample also shows how to enable 16-bit storage for SSBOs and push constants.

Async compute

This sample demonstrates using multiple Vulkan queues to get better hardware utilization with compute post-processing workloads.

Basis Universal supercompressed GPU textures

This sample demonstrates how to use Basis Universal supercompressed GPU textures in a Vulkan application.

GPU Rendering and Multi-Draw Indirect

This sample demonstrates how to reduce CPU usage by offloading draw call generation and frustum culling to the GPU.

Texture compression comparison

This sample demonstrates how to use different types of compressed GPU textures in a Vulkan application, and shows the timing benefits of each.

API samples

The goal of these samples is to demonstrate how to use a given Vulkan feature at the API level with as little abstraction as possible.

Compute shader N-Body simulation

Compute shader example that uses two passes and shared compute shader memory for simulating an N-Body particle system.

Dynamic Uniform buffers

Dynamic uniform buffers are used for rendering multiple objects with separate matrices, which are stored in a single uniform buffer object and addressed dynamically.
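Addressing objects dynamically within one buffer means each object's data must start at an offset the device accepts. The sketch below shows the usual arithmetic, assuming the required alignment is a power of two (as the Vulkan specification requires for minUniformBufferOffsetAlignment). The helper names align_up and dynamic_offset are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Round `size` up to the next multiple of `alignment`.
// `alignment` must be a non-zero power of two.
constexpr std::size_t align_up(std::size_t size, std::size_t alignment)
{
    return (size + alignment - 1) & ~(alignment - 1);
}

// Byte offset of object `index` when every object occupies one aligned slot.
// Hypothetical helper, not part of the sample's code.
std::size_t dynamic_offset(std::size_t index, std::size_t object_size, std::size_t min_alignment)
{
    assert(min_alignment != 0 && (min_alignment & (min_alignment - 1)) == 0);
    return index * align_up(object_size, min_alignment);
}
```

With a 64-byte matrix per object and a required alignment of 256 bytes, each object occupies a 256-byte slot, so the fourth object (index 3) starts at byte 768.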

High dynamic range

Implements a high dynamic range rendering pipeline using 16/32-bit floating point precision for all calculations.

Hello Triangle

A self-contained (minimal use of framework) sample that illustrates the rendering of a triangle.

HPP Compute shader N-Body simulation

A transcoded version of the API sample Compute N-Body that illustrates the usage of the C++ bindings of Vulkan provided by vulkan.hpp.

HPP Dynamic Uniform Buffers

A transcoded version of the API sample Dynamic Uniform buffers that illustrates the usage of the C++ bindings of Vulkan provided by vulkan.hpp.

HPP High dynamic range

A transcoded version of the API sample High dynamic range that illustrates the usage of the C++ bindings of Vulkan provided by vulkan.hpp.

HPP Hello Triangle

A transcoded version of the API sample Hello Triangle that illustrates the usage of the C++ bindings of Vulkan provided by vulkan.hpp.

HPP HLSL shaders

A transcoded version of the API sample HLSL Shaders that illustrates the usage of the C++ bindings of Vulkan provided by vulkan.hpp.

HPP Instancing

A transcoded version of the API sample Instancing that illustrates the usage of the C++ bindings of Vulkan provided by vulkan.hpp.

HPP OIT Linked Lists

A transcoded version of the API sample OIT Linked Lists that illustrates the usage of the C++ bindings of Vulkan provided by vulkan.hpp.

HPP Separate image sampler

A transcoded version of the API sample Separate image sampler that illustrates the usage of the C++ bindings of Vulkan provided by vulkan.hpp.

HPP Terrain Tessellation

A transcoded version of the API sample Terrain Tessellation that illustrates the usage of the C++ bindings of Vulkan provided by vulkan.hpp.

HPP Texture Loading

A transcoded version of the API sample Texture loading that illustrates the usage of the C++ bindings of Vulkan provided by vulkan.hpp.

HPP Texture run-time mip-map generation

A transcoded version of the API sample Texture run-time mip-map generation that illustrates the usage of the C++ bindings of Vulkan provided by vulkan.hpp.

HPP Timestamp queries

A transcoded version of the API sample Timestamp queries that illustrates the usage of the C++ bindings of Vulkan provided by vulkan.hpp.

Instancing

Uses the instancing feature for rendering many instances of the same mesh from a single vertex buffer with variable parameters and textures.

Separate image sampler

Separate image and samplers, both in the application and the shaders. The sample demonstrates how to use different samplers for the same image without the need to recreate descriptors.

Terrain Tessellation

Uses a tessellation shader for rendering a terrain with dynamic level-of-detail and frustum culling.

Texture loading

Loading and rendering of a 2D texture map from a file.

Texture run-time mip-map generation

Generates a complete mip-chain for a texture at runtime instead of loading it from a file.
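The length of a complete mip chain follows directly from the base dimensions: each level halves the size until the largest dimension reaches one texel. The following is a minimal sketch of that calculation. The function names mip_level_count and mip_extent are hypothetical and are not taken from the sample.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Number of levels in a complete mip chain, i.e. floor(log2(max(w, h))) + 1.
uint32_t mip_level_count(uint32_t width, uint32_t height)
{
    assert(width > 0 && height > 0);
    uint32_t largest = std::max(width, height);
    uint32_t levels  = 1;
    while (largest > 1)
    {
        largest /= 2;
        ++levels;
    }
    return levels;
}

// Size of one dimension at a given mip level; it never drops below one texel.
uint32_t mip_extent(uint32_t base, uint32_t level)
{
    assert(level < 32);
    return std::max<uint32_t>(base >> level, 1);
}
```

A 1920x1080 texture, for instance, has 11 levels, and its height is already clamped to one texel at the last level while the width is still being halved.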

HLSL shaders

Converts High Level Shading Language (HLSL) shaders to Vulkan-compatible SPIR-V.

Timestamp queries

Using timestamp queries for profiling GPU workloads.

Swapchain recreation

A sample that implements best practices in handling swapchain recreation, for example due to window resizing or present mode changes.

Order-independent transparency with per-pixel ordered linked lists

A sample that implements an order-independent transparency algorithm using per-pixel ordered linked lists.

Order-independent transparency with depth peeling

A sample that implements order-independent transparency with depth peeling.

Extension samples

The goal of these samples is to demonstrate how to use a particular Vulkan extension at the API level with as little abstraction as possible.

Conservative Rasterization

Uses conservative rasterization to change the way fragments are generated. Enables overestimation to generate fragments for every pixel touched instead of only pixels that are fully covered.

Dynamic Rendering

Demonstrates how to use Dynamic Rendering. For discussion, read the blog post at https://www.khronos.org/blog/streamlining-render-passes

Push Descriptors

Push descriptors apply the push constants concept to descriptor sets. Instead of creating per-object descriptor sets, this example passes descriptors at command buffer recording time.

Debug Utilities

Extension: VK_EXT_debug_utils

Uses the debug utilities extension to name and group Vulkan objects (command buffers, images, etc.). This information makes debugging in tools like RenderDoc significantly easier.

Memory Budget

Uses the memory budget extension to monitor the allocated memory in the GPU and demonstrates how to use it.

Mesh Shader Culling

Extension: VK_EXT_mesh_shader

Uses the mesh shader extension to demonstrate how to do basic culling utilizing both a mesh and a task shader.

Basic ray queries

Render the Sponza scene using the ray query extension. Shows how to set up all data structures required for ray queries, including the bottom and top level acceleration structures for the geometry and a standard vertex/fragment shader pipeline. Shadows are cast dynamically by ray queries issued from the fragment shader.

Basic hardware accelerated ray tracing

Render a basic scene using the official cross-vendor ray tracing extension. Shows how to set up all data structures required for ray tracing, including the bottom and top level acceleration structures for the geometry, the shader binding table and the ray tracing pipelines with shader groups for ray generation, ray hits, and ray misses. After dispatching the rays, the final result is copied to the swapchain image.

Extended hardware accelerated ray tracing

Render Sponza with ambient occlusion. Place a vase in the center. Generate a particle fire that demonstrates TLAS (Top Level Acceleration Structure) animation for the same underlying geometry. Procedurally generate a transparent quad and deform the geometry of the quad in the BLAS (Bottom Level Acceleration Structure) to demonstrate how to animate with deforming geometry. Shows how to rebuild the acceleration structure and when to set it to fast rebuild vs fast traversal.

Mesh shading

Extension: VK_EXT_mesh_shader

Renders a triangle with the simplest possible mesh shader pipeline. There is no vertex shader; there are only a mesh shader and a fragment shader. The mesh shader creates the vertices for the triangle. The mesh shading pipeline includes the task and mesh shaders before going into the fragment shader. This replaces the standard vertex / geometry shader pipeline.

HPP Mesh shading

A transcoded version of the Extensions sample Mesh shading that illustrates the usage of the C++ bindings of Vulkan provided by vulkan.hpp.

OpenGL interoperability

Render a procedural image using OpenGL and incorporate that rendered content into a Vulkan scene. Demonstrates using the same backing memory for a texture in both OpenGL and Vulkan and how to synchronize the APIs using shared semaphores and barriers.

OpenCL interoperability

This sample shows how to do Vulkan and OpenCL interoperability using cross-vendor extensions in both APIs. The sample uses OpenCL to update an image that is then displayed in Vulkan. This is done by sharing the memory for that image across the two APIs. The sample also shares semaphores for cross-API synchronization.

OpenCL interoperability (Arm)

This sample demonstrates usage of OpenCL extensions available on Arm devices. Fill a procedural texture using OpenCL and display it using Vulkan. In this sample data sharing between APIs is achieved using Android Hardware Buffers.

Timeline semaphore

Demonstrates various use cases that are enabled by timeline semaphores. The sample implements "Game of Life" in an esoteric way, using out-of-order signal and wait, multiple waits on the same semaphore in different queues, and waiting on and signalling a semaphore on the host.

Buffer device address

Demonstrates how to use the buffer device address feature, which enables extreme flexibility in how buffer memory is accessed.

Synchronization2

Demonstrates the use of the reworked synchronization API introduced with VK_KHR_synchronization2. Based on the compute shading N-Body particle system, this sample uses the new extension to streamline the memory barriers used for the compute and graphics work submissions.

Descriptor indexing

Demonstrates how to use descriptor indexing to enable update-after-bind and non-dynamically uniform indexing of descriptors.

Fragment shading rate

Uses a special framebuffer attachment to control fragment shading rates for different framebuffer regions. This allows explicit control over the number of fragment shader invocations for each pixel covered by a fragment, which is useful, for example, for foveated rendering.

Fragment shading rate dynamic

Render a simple scene showing the basics of dynamic fragment shading rate. This sample shows low and high frequency textures over several cubes. It creates a sample rate map based upon this frequency every frame. Then it uses that dynamic sample rate map as a base for the next frame.

Ray tracing: reflection, shadow rays

Render a simple scene showing the basics of ray tracing, including reflection and shadow rays. The sample creates some geometries and creates a bottom-level acceleration structure for each, then makes instances of those, using different materials and placing them at different locations.

Portability

Demonstrate how to include non-conformant portable Vulkan implementations by using the portability extension to include those implementations in the device query. An example of a non-conformant portable Vulkan implementation is MoltenVK. Also demonstrate use of the beta extension, which allows querying which features of the full Vulkan spec are not currently supported by the non-conformant Vulkan implementation.

Graphics pipeline library

Uses the graphics pipeline library extensions to improve run-time pipeline creation. Instead of creating the whole pipeline at once, this sample makes use of that extension to pre-build shared pipeline parts such as vertex input state and fragment output state. These building blocks are then used to create pipelines at runtime, improving build times compared to traditional pipeline creation.

Conditional rendering

Demonstrate how to do conditional rendering, dynamically discarding rendering commands without having to update command buffers. This is done by sourcing conditional rendering blocks from a dedicated buffer that can be updated without having to touch command buffers.

Vertex input dynamic state

Demonstrate how to use vertex input bindings and attribute descriptions dynamically, which can reduce the number of pipeline objects that need to be created.

Extended dynamic state 2

Demonstrate how to use depth bias, primitive restart, rasterizer discard and patch control points dynamically, which can reduce the number of pipeline objects that need to be created.

Logic operations dynamic state

Demonstrate how to use logical operations dynamically, which can reduce the number of pipeline objects that need to be created and allows the type of logical operation to be changed without switching pipelines.

Patch control points

Demonstrate how to use patch control points dynamically, which can reduce the number of pipeline objects that need to be created.

Fragment shader barycentric

Demonstrate how to use the fragment shader barycentric feature, which allows accessing barycentric coordinates for each processed fragment.

Basic descriptor buffer

Demonstrate how to use the VK_EXT_descriptor_buffer extension to replace descriptor sets with resource descriptor buffers.

Color write enable

Demonstrate how to create multiple color blend attachments and then toggle them dynamically.

Geometry shader to mesh shader

Extension: VK_EXT_mesh_shader

Demonstrates how a mesh shader can be used to achieve the same results as a geometry shader. The sample loads a model from a file and visualizes its normals.

Shader object

Demonstrate how to use shader objects.

Dynamic blending

Demonstrate how to use the blending related functions available in the VK_EXT_extended_dynamic_state3 extension.

Dynamic line rasterization

Demonstrate methods for dynamically customizing the appearance of the rendered lines.

Shader Debug Printf

Demonstrates how to use Printf statements in a shader to output per-invocation values. This can help find issues with shaders in combination with graphics debugging tools.

Dynamic depth clipping and primitive clipping

Rendering using primitive clipping and depth clipping configured by dynamic pipeline state.

Tooling Samples

The goal of these samples is to demonstrate usage of tooling functions and libraries that are not directly part of the API.

Profiles Library

Use the Vulkan Profiles library to simplify instance and device setup. The library defines a common baseline of features, extensions, etc.

General Samples

The goal of these samples is to demonstrate different techniques or showcase complex scenarios that don’t necessarily fit any of the main categories.

Mobile NeRF

A Neural Radiance Field synthesizer sample, based on textured polygons.