Execution Graphs

Execution graphs provide a way for applications to dispatch multiple operations dynamically from a single initial command on the host. To achieve this, a new execution graph pipeline is provided that links together multiple shaders or pipelines, each of which describes one or more operations that can be dispatched within the execution graph. Each linked pipeline or shader describes an execution node within the graph, which can be dispatched dynamically from another shader within the same graph. This allows applications to describe much richer execution topologies at a finer granularity than would typically be possible with API commands alone.

Pipeline Creation

To create execution graph pipelines, call:

// Provided by VK_AMDX_shader_enqueue
VkResult vkCreateExecutionGraphPipelinesAMDX(
    VkDevice                                    device,
    VkPipelineCache                             pipelineCache,
    uint32_t                                    createInfoCount,
    const VkExecutionGraphPipelineCreateInfoAMDX* pCreateInfos,
    const VkAllocationCallbacks*                pAllocator,
    VkPipeline*                                 pPipelines);
  • device is the logical device that creates the execution graph pipelines.

  • pipelineCache is either VK_NULL_HANDLE, indicating that pipeline caching is disabled; or the handle of a valid pipeline cache object, in which case use of that cache is enabled for the duration of the command. The implementation must not access this object outside of the duration of this command.

  • createInfoCount is the length of the pCreateInfos and pPipelines arrays.

  • pCreateInfos is a pointer to an array of VkExecutionGraphPipelineCreateInfoAMDX structures.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

  • pPipelines is a pointer to an array of VkPipeline handles in which the resulting execution graph pipeline objects are returned.

Pipelines are created and returned as described for Multiple Pipeline Creation.

Valid Usage
Valid Usage (Implicit)
  • VUID-vkCreateExecutionGraphPipelinesAMDX-device-parameter
    device must be a valid VkDevice handle

  • VUID-vkCreateExecutionGraphPipelinesAMDX-pipelineCache-parameter
    If pipelineCache is not VK_NULL_HANDLE, pipelineCache must be a valid VkPipelineCache handle

  • VUID-vkCreateExecutionGraphPipelinesAMDX-pCreateInfos-parameter
    pCreateInfos must be a valid pointer to an array of createInfoCount valid VkExecutionGraphPipelineCreateInfoAMDX structures

  • VUID-vkCreateExecutionGraphPipelinesAMDX-pAllocator-parameter
    If pAllocator is not NULL, pAllocator must be a valid pointer to a valid VkAllocationCallbacks structure

  • VUID-vkCreateExecutionGraphPipelinesAMDX-pPipelines-parameter
    pPipelines must be a valid pointer to an array of createInfoCount VkPipeline handles

  • VUID-vkCreateExecutionGraphPipelinesAMDX-createInfoCount-arraylength
    createInfoCount must be greater than 0

  • VUID-vkCreateExecutionGraphPipelinesAMDX-pipelineCache-parent
    If pipelineCache is a valid handle, it must have been created, allocated, or retrieved from device

The VkExecutionGraphPipelineCreateInfoAMDX structure is defined as:

// Provided by VK_AMDX_shader_enqueue
typedef struct VkExecutionGraphPipelineCreateInfoAMDX {
    VkStructureType                           sType;
    const void*                               pNext;
    VkPipelineCreateFlags                     flags;
    uint32_t                                  stageCount;
    const VkPipelineShaderStageCreateInfo*    pStages;
    const VkPipelineLibraryCreateInfoKHR*     pLibraryInfo;
    VkPipelineLayout                          layout;
    VkPipeline                                basePipelineHandle;
    int32_t                                   basePipelineIndex;
} VkExecutionGraphPipelineCreateInfoAMDX;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • flags is a bitmask of VkPipelineCreateFlagBits specifying how the pipeline will be generated.

  • stageCount is the number of entries in the pStages array.

  • pStages is a pointer to an array of stageCount VkPipelineShaderStageCreateInfo structures describing the set of the shader stages to be included in the execution graph pipeline.

  • pLibraryInfo is a pointer to a VkPipelineLibraryCreateInfoKHR structure defining pipeline libraries to include.

  • layout is the description of binding locations used by both the pipeline and descriptor sets used with the pipeline. The implementation must not access this object outside of the duration of the command this structure is passed to.

  • basePipelineHandle is a pipeline to derive from.

  • basePipelineIndex is an index into the pCreateInfos parameter to use as a pipeline to derive from.

The parameters basePipelineHandle and basePipelineIndex are described in more detail in Pipeline Derivatives.

Each shader stage provided when creating an execution graph pipeline (including those in libraries) is associated with a name and an index, determined by the inclusion or omission of a VkPipelineShaderStageNodeCreateInfoAMDX structure in its pNext chain. For any graphics pipeline libraries, only the name and index of the vertex or mesh shader stage is linked directly to the graph as a node; other shader stages in the pipeline will be executed after those shader stages as normal. Task shaders cannot be included in a graphics pipeline used for a draw node.

In addition to the shader name and index, an internal “node index” is also generated for each node, which can be queried with vkGetExecutionGraphPipelineNodeIndexAMDX, and is used exclusively for initial dispatch of an execution graph.

Valid Usage
  • VUID-VkExecutionGraphPipelineCreateInfoAMDX-None-09497
    If the pNext chain does not include a VkPipelineCreateFlags2CreateInfo structure, flags must be a valid combination of VkPipelineCreateFlagBits values

  • VUID-VkExecutionGraphPipelineCreateInfoAMDX-flags-07984
    If flags contains the VK_PIPELINE_CREATE_DERIVATIVE_BIT flag, and basePipelineIndex is -1, basePipelineHandle must be a valid execution graph VkPipeline handle

  • VUID-VkExecutionGraphPipelineCreateInfoAMDX-flags-07985
    If flags contains the VK_PIPELINE_CREATE_DERIVATIVE_BIT flag, and basePipelineHandle is VK_NULL_HANDLE, basePipelineIndex must be a valid index into the calling command’s pCreateInfos parameter

  • VUID-VkExecutionGraphPipelineCreateInfoAMDX-flags-07986
    If flags contains the VK_PIPELINE_CREATE_DERIVATIVE_BIT flag, basePipelineIndex must be -1 or basePipelineHandle must be VK_NULL_HANDLE

  • VUID-VkExecutionGraphPipelineCreateInfoAMDX-layout-07987
    If a push constant block is declared in a shader, a push constant range in layout must match the shader stage

  • VUID-VkExecutionGraphPipelineCreateInfoAMDX-layout-10069
    If a push constant block is declared in a shader, the block must be contained inside the push constant range in layout that matches the stage

  • VUID-VkExecutionGraphPipelineCreateInfoAMDX-layout-07988
    If a resource variable is declared in a shader, the corresponding descriptor set in layout must match the shader stage

  • VUID-VkExecutionGraphPipelineCreateInfoAMDX-layout-07990
    If a resource variable is declared in a shader, and the descriptor type is not VK_DESCRIPTOR_TYPE_MUTABLE_EXT, the corresponding descriptor set in layout must match the descriptor type

  • VUID-VkExecutionGraphPipelineCreateInfoAMDX-layout-07991
    If a resource variable is declared in a shader as an array, the corresponding descriptor binding used to create layout must have a descriptorCount that is greater than or equal to the length of the array

  • VUID-VkExecutionGraphPipelineCreateInfoAMDX-None-10391
    If a resource variable is declared in a shader as an array of descriptors, then the descriptor type of that variable must not be VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK

  • VUID-VkExecutionGraphPipelineCreateInfoAMDX-flags-11798
    If the shader64BitIndexing feature is not enabled, flags must not contain VK_PIPELINE_CREATE_2_64_BIT_INDEXING_BIT_EXT
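
The three derivative rules above (07984, 07985, 07986) amount to "exactly one of basePipelineHandle and basePipelineIndex selects the parent". The following is a minimal sketch of that check, not part of the API: DerivativeRequest and derivative_request_is_valid are hypothetical names, and the struct reduces the create info to the fields the rules mention. Whether a handle really refers to an execution graph pipeline cannot be checked here and is represented by a boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields the derivative rules refer to. */
typedef struct DerivativeRequest {
    bool     derivative_bit;    /* VK_PIPELINE_CREATE_DERIVATIVE_BIT is set in flags */
    bool     has_base_handle;   /* basePipelineHandle is not VK_NULL_HANDLE */
    int32_t  base_index;        /* basePipelineIndex, -1 when unused */
    uint32_t create_info_count; /* createInfoCount of the calling command */
} DerivativeRequest;

/* Returns true when the request satisfies the three derivative rules. */
static bool derivative_request_is_valid(const DerivativeRequest *r)
{
    assert(r != NULL);
    if (!r->derivative_bit) {
        return true; /* the rules only apply when deriving */
    }
    bool uses_index = (r->base_index != -1);
    if (r->has_base_handle && uses_index) {
        return false; /* 07986: handle and index must not both be given */
    }
    if (!r->has_base_handle && !uses_index) {
        return false; /* 07984: an index of -1 requires a valid handle */
    }
    if (uses_index) {
        /* 07985: the index must refer to an entry of pCreateInfos */
        return r->base_index >= 0 &&
               (uint32_t)r->base_index < r->create_info_count;
    }
    return true;
}
```

A request with the derivative bit set, a base handle, and an index of -1 passes; supplying both a handle and an index, or neither, fails.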

Valid Usage (Implicit)
  • VUID-VkExecutionGraphPipelineCreateInfoAMDX-sType-sType
    sType must be VK_STRUCTURE_TYPE_EXECUTION_GRAPH_PIPELINE_CREATE_INFO_AMDX

  • VUID-VkExecutionGraphPipelineCreateInfoAMDX-pNext-pNext
    Each pNext member of any structure (including this one) in the pNext chain must be either NULL or a pointer to a valid instance of VkPipelineCompilerControlCreateInfoAMD or VkPipelineCreationFeedbackCreateInfo

  • VUID-VkExecutionGraphPipelineCreateInfoAMDX-sType-unique
    The sType value of each structure in the pNext chain must be unique

  • VUID-VkExecutionGraphPipelineCreateInfoAMDX-pStages-parameter
    If stageCount is not 0, and pStages is not NULL, pStages must be a valid pointer to an array of stageCount valid VkPipelineShaderStageCreateInfo structures

  • VUID-VkExecutionGraphPipelineCreateInfoAMDX-pLibraryInfo-parameter
    If pLibraryInfo is not NULL, pLibraryInfo must be a valid pointer to a valid VkPipelineLibraryCreateInfoKHR structure

  • VUID-VkExecutionGraphPipelineCreateInfoAMDX-layout-parameter
    layout must be a valid VkPipelineLayout handle

  • VUID-VkExecutionGraphPipelineCreateInfoAMDX-commonparent
    Both of basePipelineHandle, and layout that are valid handles of non-ignored parameters must have been created, allocated, or retrieved from the same VkDevice

VK_SHADER_INDEX_UNUSED_AMDX is a special shader index used to indicate that the created node does not override the index. In this case, the shader index is determined through other means. It is defined as:

#define VK_SHADER_INDEX_UNUSED_AMDX       (~0U)

The VkPipelineShaderStageNodeCreateInfoAMDX structure is defined as:

// Provided by VK_AMDX_shader_enqueue
typedef struct VkPipelineShaderStageNodeCreateInfoAMDX {
    VkStructureType      sType;
    const void*          pNext;
    const char*          pName;
    uint32_t             index;
} VkPipelineShaderStageNodeCreateInfoAMDX;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • pName is the shader name to use when creating a node in an execution graph. If pName is NULL, the name of the entry point specified in SPIR-V is used as the shader name.

  • index is the shader index to use when creating a node in an execution graph. If index is VK_SHADER_INDEX_UNUSED_AMDX then the original index is used, either as specified by the ShaderIndexAMDX execution mode, or 0 if that too is not specified.

When included in the pNext chain of a VkPipelineShaderStageCreateInfo structure, this structure specifies the shader name and shader index of a node when creating an execution graph pipeline. If this structure is omitted, the shader name is set to the name of the entry point in SPIR-V and the shader index is set to 0.

When dispatching a node from another shader, the name is fixed at pipeline creation, but the index can be set dynamically. By associating multiple shaders with the same name but different indexes, applications can dynamically select different nodes to execute. Applications must ensure each node has a unique name and index.

Shaders with the same name must be of the same type - e.g. a compute and graphics shader, or even two compute shaders where one is coalescing and the other is not, cannot share the same name.
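
As an illustration of how a node's shader name and index are resolved when the structure is present, and of the uniqueness requirement, consider the sketch below. It is a model only: NodeKey, resolve_node_key, and node_keys_are_unique are hypothetical helpers, SHADER_INDEX_UNUSED mirrors the value of VK_SHADER_INDEX_UNUSED_AMDX, and the ShaderIndexAMDX execution mode is represented by a flag and a value rather than read from SPIR-V.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SHADER_INDEX_UNUSED (~0U) /* same value as VK_SHADER_INDEX_UNUSED_AMDX */

/* Hypothetical (name, index) pair identifying a node. */
typedef struct NodeKey {
    const char *name;
    uint32_t    index;
} NodeKey;

/* Resolve the effective key of a node.
 * override_name / override_index model pName and index of the node create
 * info; entry_point is the SPIR-V entry point name; has_mode_index and
 * mode_index model an optional ShaderIndexAMDX execution mode. */
static NodeKey resolve_node_key(const char *override_name, uint32_t override_index,
                                const char *entry_point,
                                bool has_mode_index, uint32_t mode_index)
{
    assert(entry_point != NULL);
    NodeKey key;
    /* A NULL pName falls back to the entry point name. */
    key.name = (override_name != NULL) ? override_name : entry_point;
    if (override_index != SHADER_INDEX_UNUSED) {
        key.index = override_index;
    } else {
        /* Unused index: keep the execution mode value, or 0 if absent. */
        key.index = has_mode_index ? mode_index : 0u;
    }
    return key;
}

/* Returns true when no two nodes share both name and index. */
static bool node_keys_are_unique(const NodeKey *keys, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        for (size_t j = i + 1; j < count; ++j) {
            if (keys[i].index == keys[j].index &&
                strcmp(keys[i].name, keys[j].name) == 0) {
                return false;
            }
        }
    }
    return true;
}
```

Two shaders named "lighting" with indexes 0 and 1 form distinct nodes that a dispatching shader can choose between at run time; two shaders that resolve to the same name and index do not satisfy the uniqueness requirement.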

Valid Usage (Implicit)

To query the internal node index for a particular node in an execution graph, call:

// Provided by VK_AMDX_shader_enqueue
VkResult vkGetExecutionGraphPipelineNodeIndexAMDX(
    VkDevice                                    device,
    VkPipeline                                  executionGraph,
    const VkPipelineShaderStageNodeCreateInfoAMDX* pNodeInfo,
    uint32_t*                                   pNodeIndex);
  • device is the logical device that executionGraph was created on.

  • executionGraph is the execution graph pipeline to query the internal node index for.

  • pNodeInfo is a pointer to a VkPipelineShaderStageNodeCreateInfoAMDX structure identifying the name and index of the node to query.

  • pNodeIndex is the returned internal node index of the identified node.

Once this function returns, the contents of pNodeIndex contain the internal node index of the identified node.

Valid Usage
  • VUID-vkGetExecutionGraphPipelineNodeIndexAMDX-pNodeInfo-09140
    pNodeInfo->pName must not be NULL

  • VUID-vkGetExecutionGraphPipelineNodeIndexAMDX-pNodeInfo-09141
    pNodeInfo->index must not be VK_SHADER_INDEX_UNUSED_AMDX

  • VUID-vkGetExecutionGraphPipelineNodeIndexAMDX-executionGraph-09142
    There must be a node in executionGraph with a shader name and index equal to pNodeInfo->pName and pNodeInfo->index

Valid Usage (Implicit)
  • VUID-vkGetExecutionGraphPipelineNodeIndexAMDX-device-parameter
    device must be a valid VkDevice handle

  • VUID-vkGetExecutionGraphPipelineNodeIndexAMDX-executionGraph-parameter
    executionGraph must be a valid VkPipeline handle

  • VUID-vkGetExecutionGraphPipelineNodeIndexAMDX-pNodeInfo-parameter
    pNodeInfo must be a valid pointer to a valid VkPipelineShaderStageNodeCreateInfoAMDX structure

  • VUID-vkGetExecutionGraphPipelineNodeIndexAMDX-pNodeIndex-parameter
    pNodeIndex must be a valid pointer to a uint32_t value

  • VUID-vkGetExecutionGraphPipelineNodeIndexAMDX-executionGraph-parent
    executionGraph must have been created, allocated, or retrieved from device

Initializing Scratch Memory

Implementations may need scratch memory to manage dispatch queues or similar structures when executing an execution graph. This memory is explicitly managed by the application.

To query the scratch space required to dispatch an execution graph, call:

// Provided by VK_AMDX_shader_enqueue
VkResult vkGetExecutionGraphPipelineScratchSizeAMDX(
    VkDevice                                    device,
    VkPipeline                                  executionGraph,
    VkExecutionGraphPipelineScratchSizeAMDX*    pSizeInfo);
  • device is the logical device that executionGraph was created on.

  • executionGraph is the execution graph pipeline to query the scratch space for.

  • pSizeInfo is a pointer to a VkExecutionGraphPipelineScratchSizeAMDX structure that will contain the required scratch size.

After this function returns, information about the scratch space required will be returned in pSizeInfo.

Valid Usage (Implicit)
  • VUID-vkGetExecutionGraphPipelineScratchSizeAMDX-device-parameter
    device must be a valid VkDevice handle

  • VUID-vkGetExecutionGraphPipelineScratchSizeAMDX-executionGraph-parameter
    executionGraph must be a valid VkPipeline handle

  • VUID-vkGetExecutionGraphPipelineScratchSizeAMDX-pSizeInfo-parameter
    pSizeInfo must be a valid pointer to a VkExecutionGraphPipelineScratchSizeAMDX structure

  • VUID-vkGetExecutionGraphPipelineScratchSizeAMDX-executionGraph-parent
    executionGraph must have been created, allocated, or retrieved from device

The VkExecutionGraphPipelineScratchSizeAMDX structure is defined as:

// Provided by VK_AMDX_shader_enqueue
typedef struct VkExecutionGraphPipelineScratchSizeAMDX {
    VkStructureType    sType;
    void*              pNext;
    VkDeviceSize       minSize;
    VkDeviceSize       maxSize;
    VkDeviceSize       sizeGranularity;
} VkExecutionGraphPipelineScratchSizeAMDX;
  • sType is a VkStructureType value identifying this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • minSize indicates the minimum scratch space required for dispatching the queried execution graph.

  • maxSize indicates the maximum scratch space that can be used for dispatching the queried execution graph.

  • sizeGranularity indicates the granularity at which the scratch space can be increased from minSize.

Applications can use any amount of scratch memory greater than or equal to minSize for dispatching a graph; however, only sizes equal to minSize plus an integer multiple of sizeGranularity will be used. Larger sizes may result in higher performance, up to maxSize, which indicates the most memory that an implementation can use effectively.
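
The sizing rule can be worked through with a small helper. This is a sketch under stated assumptions rather than anything defined by the extension: ScratchSizeInfo and usable_scratch_size are hypothetical names, the struct carries only the three size members, and clamping at maxSize reflects the statement that memory beyond maxSize cannot be used effectively.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical copy of the three size members of the query result. */
typedef struct ScratchSizeInfo {
    uint64_t min_size;
    uint64_t max_size;
    uint64_t size_granularity;
} ScratchSizeInfo;

/* Given a memory budget, return the largest scratch size not exceeding the
 * budget that would actually be used, i.e. min_size plus a whole number of
 * granularity steps, capped at max_size. Returns 0 if the budget is smaller
 * than min_size (the graph cannot be dispatched with that budget). */
static uint64_t usable_scratch_size(const ScratchSizeInfo *info, uint64_t budget)
{
    assert(info != NULL);
    if (budget < info->min_size) {
        return 0;
    }
    uint64_t limit = budget;
    if (info->max_size >= info->min_size && limit > info->max_size) {
        limit = info->max_size; /* assumption: nothing is gained past max_size */
    }
    if (info->size_granularity == 0) {
        return info->min_size; /* no growth steps available */
    }
    uint64_t steps = (limit - info->min_size) / info->size_granularity;
    return info->min_size + steps * info->size_granularity;
}
```

For example, with minSize 1024, maxSize 4096, and sizeGranularity 256, a budget of 2000 bytes yields 1792 (1024 plus three steps of 256), and any budget of 4096 or more yields 4096.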

Valid Usage (Implicit)

To initialize scratch memory for a particular execution graph, call:

// Provided by VK_AMDX_shader_enqueue
void vkCmdInitializeGraphScratchMemoryAMDX(
    VkCommandBuffer                             commandBuffer,
    VkPipeline                                  executionGraph,
    VkDeviceAddress                             scratch,
    VkDeviceSize                                scratchSize);
  • commandBuffer is the command buffer into which the command will be recorded.

  • executionGraph is the execution graph pipeline to initialize the scratch memory for.

  • scratch is the address of scratch memory to be initialized.

  • scratchSize is a range in bytes of scratch memory to be initialized.

This command must be called before using scratch to dispatch the bound execution graph pipeline.

Execution of this command may modify any memory locations in the range [scratch,scratch + scratchSize). Accesses to this memory range are performed in the VK_PIPELINE_STAGE_2_COMPUTE_SHADER_BIT pipeline stage with the VK_ACCESS_2_SHADER_STORAGE_READ_BIT and VK_ACCESS_2_SHADER_STORAGE_WRITE_BIT access flags.

If any portion of scratch is modified by any command other than vkCmdDispatchGraphAMDX, vkCmdDispatchGraphIndirectAMDX, vkCmdDispatchGraphIndirectCountAMDX, or vkCmdInitializeGraphScratchMemoryAMDX with the same execution graph, it must be reinitialized for the execution graph again before dispatching against it.

Valid Usage
  • VUID-vkCmdInitializeGraphScratchMemoryAMDX-scratch-10185
    scratch must be the device address of an allocated memory range at least as large as scratchSize

  • VUID-vkCmdInitializeGraphScratchMemoryAMDX-scratchSize-10186
    scratchSize must be greater than or equal to VkExecutionGraphPipelineScratchSizeAMDX::minSize returned by vkGetExecutionGraphPipelineScratchSizeAMDX for the bound execution graph pipeline

  • VUID-vkCmdInitializeGraphScratchMemoryAMDX-scratch-09144
    scratch must be a multiple of 64
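
An application that sub-allocates scratch memory from a larger buffer might check these three rules on the host before recording the command. The sketch below assumes a hypothetical allocation described by alloc_base and alloc_size; align_scratch_address and scratch_range_is_valid are illustrative helpers, not API functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SCRATCH_ALIGNMENT 64u /* scratch must be a multiple of 64 */

/* Round an address up to the next multiple of 64. */
static uint64_t align_scratch_address(uint64_t address)
{
    assert(address <= UINT64_MAX - (SCRATCH_ALIGNMENT - 1u));
    return (address + (SCRATCH_ALIGNMENT - 1u)) &
           ~(uint64_t)(SCRATCH_ALIGNMENT - 1u);
}

/* Check a candidate scratch range against the rules listed above.
 * alloc_base / alloc_size describe the (hypothetical) allocation the range
 * is carved from; min_size is the queried minimum scratch size. */
static bool scratch_range_is_valid(uint64_t alloc_base, uint64_t alloc_size,
                                   uint64_t scratch, uint64_t scratch_size,
                                   uint64_t min_size)
{
    if (scratch % SCRATCH_ALIGNMENT != 0) {
        return false; /* 09144: address alignment */
    }
    if (scratch_size < min_size) {
        return false; /* 10186: at least the queried minimum */
    }
    if (scratch < alloc_base) {
        return false; /* range starts before the allocation */
    }
    uint64_t offset = scratch - alloc_base;
    if (offset > alloc_size) {
        return false; /* range starts past the allocation */
    }
    /* 10185: the remaining allocation must cover scratch_size bytes. */
    return scratch_size <= alloc_size - offset;
}
```

Aligning the address upward consumes up to 63 bytes of the allocation, so the allocation generally needs to be that much larger than the intended scratch size.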

Valid Usage (Implicit)
  • VUID-vkCmdInitializeGraphScratchMemoryAMDX-commandBuffer-parameter
    commandBuffer must be a valid VkCommandBuffer handle

  • VUID-vkCmdInitializeGraphScratchMemoryAMDX-executionGraph-parameter
    executionGraph must be a valid VkPipeline handle

  • VUID-vkCmdInitializeGraphScratchMemoryAMDX-scratch-parameter
    scratch must be a valid VkDeviceAddress value

  • VUID-vkCmdInitializeGraphScratchMemoryAMDX-commandBuffer-recording
    commandBuffer must be in the recording state

  • VUID-vkCmdInitializeGraphScratchMemoryAMDX-commandBuffer-cmdpool
    The VkCommandPool that commandBuffer was allocated from must support VK_QUEUE_COMPUTE_BIT, or VK_QUEUE_GRAPHICS_BIT operations

  • VUID-vkCmdInitializeGraphScratchMemoryAMDX-suspended
    This command must not be called between suspended render pass instances

  • VUID-vkCmdInitializeGraphScratchMemoryAMDX-videocoding
    This command must only be called outside of a video coding scope

  • VUID-vkCmdInitializeGraphScratchMemoryAMDX-bufferlevel
    commandBuffer must be a primary VkCommandBuffer

  • VUID-vkCmdInitializeGraphScratchMemoryAMDX-commonparent
    Both of commandBuffer, and executionGraph must have been created, allocated, or retrieved from the same VkDevice

Host Synchronization
  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
  • Command Buffer Levels: Primary

  • Render Pass Scope: Both

  • Video Coding Scope: Outside

  • Supported Queue Types: VK_QUEUE_COMPUTE_BIT, VK_QUEUE_GRAPHICS_BIT

  • Command Type: Action

Conditional Rendering

vkCmdInitializeGraphScratchMemoryAMDX is not affected by conditional rendering

Dispatching a Graph

Initial dispatch of an execution graph is done from the host in the same way as any other command, and can be used in a similar way to compute dispatch commands, with indirect variants available.

To record an execution graph dispatch, call:

// Provided by VK_AMDX_shader_enqueue
void vkCmdDispatchGraphAMDX(
    VkCommandBuffer                             commandBuffer,
    VkDeviceAddress                             scratch,
    VkDeviceSize                                scratchSize,
    const VkDispatchGraphCountInfoAMDX*         pCountInfo);
  • commandBuffer is the command buffer into which the command will be recorded.

  • scratch is the address of scratch memory to be used.

  • scratchSize is a range in bytes of scratch memory to be used.

  • pCountInfo is a host pointer to a VkDispatchGraphCountInfoAMDX structure defining the nodes which will be initially executed.

When this command is executed, the nodes specified in pCountInfo are executed. Nodes executed as part of this command are not implicitly synchronized in any way against each other once they are dispatched. There are no rasterization order guarantees between separately dispatched graphics nodes, though individual primitives within a single dispatch do adhere to rasterization order. Draw calls executed before or after the execution graph also execute relative to each graphics node with respect to rasterization order.

For this command, all device/host pointers in substructures are treated as host pointers and read only during host execution of this command. Once this command returns, no reference to the original pointers is retained.

Execution of this command may modify any memory locations in the range [scratch,scratch + scratchSize). Accesses to this memory range are performed in the VK_PIPELINE_STAGE_2_COMPUTE_SHADER_BIT pipeline stage with the VK_ACCESS_2_SHADER_STORAGE_READ_BIT and VK_ACCESS_2_SHADER_STORAGE_WRITE_BIT access flags.

This command captures command buffer state for mesh nodes similarly to draw commands.

Valid Usage
Valid Usage (Implicit)
  • VUID-vkCmdDispatchGraphAMDX-commandBuffer-parameter
    commandBuffer must be a valid VkCommandBuffer handle

  • VUID-vkCmdDispatchGraphAMDX-scratch-parameter
    scratch must be a valid VkDeviceAddress value

  • VUID-vkCmdDispatchGraphAMDX-pCountInfo-parameter
    pCountInfo must be a valid pointer to a valid VkDispatchGraphCountInfoAMDX structure

  • VUID-vkCmdDispatchGraphAMDX-commandBuffer-recording
    commandBuffer must be in the recording state

  • VUID-vkCmdDispatchGraphAMDX-commandBuffer-cmdpool
    The VkCommandPool that commandBuffer was allocated from must support VK_QUEUE_COMPUTE_BIT, or VK_QUEUE_GRAPHICS_BIT operations

  • VUID-vkCmdDispatchGraphAMDX-suspended
    This command must not be called between suspended render pass instances

  • VUID-vkCmdDispatchGraphAMDX-videocoding
    This command must only be called outside of a video coding scope

  • VUID-vkCmdDispatchGraphAMDX-bufferlevel
    commandBuffer must be a primary VkCommandBuffer

Host Synchronization
  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
  • Command Buffer Levels: Primary

  • Render Pass Scope: Both

  • Video Coding Scope: Outside

  • Supported Queue Types: VK_QUEUE_COMPUTE_BIT, VK_QUEUE_GRAPHICS_BIT

  • Command Type: Action

Conditional Rendering

vkCmdDispatchGraphAMDX is affected by conditional rendering

To record an execution graph dispatch with node and payload parameters read on device, call:

// Provided by VK_AMDX_shader_enqueue
void vkCmdDispatchGraphIndirectAMDX(
    VkCommandBuffer                             commandBuffer,
    VkDeviceAddress                             scratch,
    VkDeviceSize                                scratchSize,
    const VkDispatchGraphCountInfoAMDX*         pCountInfo);
  • commandBuffer is the command buffer into which the command will be recorded.

  • scratch is the address of scratch memory to be used.

  • scratchSize is a range in bytes of scratch memory to be used.

  • pCountInfo is a host pointer to a VkDispatchGraphCountInfoAMDX structure defining the nodes which will be initially executed.

When this command is executed, the nodes specified in pCountInfo are executed. Nodes executed as part of this command are not implicitly synchronized in any way against each other once they are dispatched. There are no rasterization order guarantees between separately dispatched graphics nodes, though individual primitives within a single dispatch do adhere to rasterization order. Draw calls executed before or after the execution graph also execute relative to each graphics node with respect to rasterization order.

For this command, all device/host pointers in substructures are treated as device pointers and read during device execution of this command. The allocation and contents of these pointers need only be valid during device execution. All of these addresses will be read in the VK_PIPELINE_STAGE_2_COMPUTE_SHADER_BIT pipeline stage with the VK_ACCESS_2_SHADER_STORAGE_READ_BIT access flag.

Execution of this command may modify any memory locations in the range [scratch,scratch + scratchSize). Accesses to this memory range are performed in the VK_PIPELINE_STAGE_2_COMPUTE_SHADER_BIT pipeline stage with the VK_ACCESS_2_SHADER_STORAGE_READ_BIT and VK_ACCESS_2_SHADER_STORAGE_WRITE_BIT access flags.

This command captures command buffer state for mesh nodes similarly to draw commands.

Valid Usage
Valid Usage (Implicit)
  • VUID-vkCmdDispatchGraphIndirectAMDX-commandBuffer-parameter
    commandBuffer must be a valid VkCommandBuffer handle

  • VUID-vkCmdDispatchGraphIndirectAMDX-scratch-parameter
    scratch must be a valid VkDeviceAddress value

  • VUID-vkCmdDispatchGraphIndirectAMDX-pCountInfo-parameter
    pCountInfo must be a valid pointer to a valid VkDispatchGraphCountInfoAMDX structure

  • VUID-vkCmdDispatchGraphIndirectAMDX-commandBuffer-recording
    commandBuffer must be in the recording state

  • VUID-vkCmdDispatchGraphIndirectAMDX-commandBuffer-cmdpool
    The VkCommandPool that commandBuffer was allocated from must support VK_QUEUE_COMPUTE_BIT, or VK_QUEUE_GRAPHICS_BIT operations

  • VUID-vkCmdDispatchGraphIndirectAMDX-suspended
    This command must not be called between suspended render pass instances

  • VUID-vkCmdDispatchGraphIndirectAMDX-videocoding
    This command must only be called outside of a video coding scope

  • VUID-vkCmdDispatchGraphIndirectAMDX-bufferlevel
    commandBuffer must be a primary VkCommandBuffer

Host Synchronization
  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
  • Command Buffer Levels: Primary

  • Render Pass Scope: Both

  • Video Coding Scope: Outside

  • Supported Queue Types: VK_QUEUE_COMPUTE_BIT, VK_QUEUE_GRAPHICS_BIT

  • Command Type: Action

Conditional Rendering

vkCmdDispatchGraphIndirectAMDX is affected by conditional rendering

To record an execution graph dispatch with all parameters read on device, call:

// Provided by VK_AMDX_shader_enqueue
void vkCmdDispatchGraphIndirectCountAMDX(
    VkCommandBuffer                             commandBuffer,
    VkDeviceAddress                             scratch,
    VkDeviceSize                                scratchSize,
    VkDeviceAddress                             countInfo);
  • commandBuffer is the command buffer into which the command will be recorded.

  • scratch is the address of scratch memory to be used.

  • scratchSize is a range in bytes of scratch memory to be used.

  • countInfo is a device address of a VkDispatchGraphCountInfoAMDX structure defining the nodes which will be initially executed.

When this command is executed, the nodes specified in countInfo are executed. Nodes executed as part of this command are not implicitly synchronized in any way against each other once they are dispatched.

For this command, all pointers in substructures are treated as device pointers and read during device execution of this command. The allocation and contents of these pointers need only be valid during device execution. All of these addresses will be read in the VK_PIPELINE_STAGE_2_COMPUTE_SHADER_BIT pipeline stage with the VK_ACCESS_2_SHADER_STORAGE_READ_BIT access flag.

Execution of this command may modify any memory locations in the range [scratch,scratch + scratchSize). Accesses to this memory range are performed in the VK_PIPELINE_STAGE_2_COMPUTE_SHADER_BIT pipeline stage with the VK_ACCESS_2_SHADER_STORAGE_READ_BIT and VK_ACCESS_2_SHADER_STORAGE_WRITE_BIT access flags.

Valid Usage
Valid Usage (Implicit)
  • VUID-vkCmdDispatchGraphIndirectCountAMDX-commandBuffer-parameter
    commandBuffer must be a valid VkCommandBuffer handle

  • VUID-vkCmdDispatchGraphIndirectCountAMDX-scratch-parameter
    scratch must be a valid VkDeviceAddress value

  • VUID-vkCmdDispatchGraphIndirectCountAMDX-countInfo-parameter
    countInfo must be a valid VkDeviceAddress value

  • VUID-vkCmdDispatchGraphIndirectCountAMDX-commandBuffer-recording
    commandBuffer must be in the recording state

  • VUID-vkCmdDispatchGraphIndirectCountAMDX-commandBuffer-cmdpool
    The VkCommandPool that commandBuffer was allocated from must support VK_QUEUE_COMPUTE_BIT, or VK_QUEUE_GRAPHICS_BIT operations

  • VUID-vkCmdDispatchGraphIndirectCountAMDX-suspended
    This command must not be called between suspended render pass instances

  • VUID-vkCmdDispatchGraphIndirectCountAMDX-videocoding
    This command must only be called outside of a video coding scope

  • VUID-vkCmdDispatchGraphIndirectCountAMDX-bufferlevel
    commandBuffer must be a primary VkCommandBuffer

Host Synchronization
  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
  • Command Buffer Levels: Primary

  • Render Pass Scope: Both

  • Video Coding Scope: Outside

  • Supported Queue Types: VK_QUEUE_COMPUTE_BIT, VK_QUEUE_GRAPHICS_BIT

  • Command Type: Action

Conditional Rendering

vkCmdDispatchGraphIndirectCountAMDX is affected by conditional rendering
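
The three dispatch commands differ mainly in where each level of the dispatch description is read from. The following sketch tabulates that as described above; the enum and function names are hypothetical, and infos and payloads refer to the address members of the count info and dispatch info structures.

```c
#include <assert.h>

typedef enum AddressSpace { ADDRESS_HOST, ADDRESS_DEVICE } AddressSpace;

typedef enum DispatchCommand {
    DISPATCH_GRAPH,                /* vkCmdDispatchGraphAMDX */
    DISPATCH_GRAPH_INDIRECT,       /* vkCmdDispatchGraphIndirectAMDX */
    DISPATCH_GRAPH_INDIRECT_COUNT  /* vkCmdDispatchGraphIndirectCountAMDX */
} DispatchCommand;

/* Where each level of the dispatch description is read from. */
typedef struct PointerInterpretation {
    AddressSpace count_info; /* the count info structure itself */
    AddressSpace infos;      /* array of per-node dispatch infos */
    AddressSpace payloads;   /* per-node payload arrays */
} PointerInterpretation;

static PointerInterpretation interpret_pointers(DispatchCommand cmd)
{
    PointerInterpretation p = { ADDRESS_HOST, ADDRESS_HOST, ADDRESS_HOST };
    switch (cmd) {
    case DISPATCH_GRAPH:
        /* Everything is read on the host while the command is recorded. */
        break;
    case DISPATCH_GRAPH_INDIRECT:
        /* Count info is a host pointer; substructure pointers are device. */
        p.infos = ADDRESS_DEVICE;
        p.payloads = ADDRESS_DEVICE;
        break;
    case DISPATCH_GRAPH_INDIRECT_COUNT:
        /* Count info is a device address; everything is read on device. */
        p.count_info = ADDRESS_DEVICE;
        p.infos = ADDRESS_DEVICE;
        p.payloads = ADDRESS_DEVICE;
        break;
    default:
        assert(0 && "unknown dispatch command");
        break;
    }
    return p;
}
```

A practical consequence is that host memory passed to vkCmdDispatchGraphAMDX can be released as soon as the command returns, while device memory referenced by the indirect variants has to remain valid until device execution completes.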

The VkDeviceOrHostAddressConstAMDX union is defined as:

// Provided by VK_AMDX_shader_enqueue
typedef union VkDeviceOrHostAddressConstAMDX {
    VkDeviceAddress    deviceAddress;
    const void*        hostAddress;
} VkDeviceOrHostAddressConstAMDX;
  • deviceAddress is a buffer device address as returned by the vkGetBufferDeviceAddressKHR command.

  • hostAddress is a const host memory address.

The VkDispatchGraphCountInfoAMDX structure is defined as:

// Provided by VK_AMDX_shader_enqueue
typedef struct VkDispatchGraphCountInfoAMDX {
    uint32_t                          count;
    VkDeviceOrHostAddressConstAMDX    infos;
    uint64_t                          stride;
} VkDispatchGraphCountInfoAMDX;

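  • count is the number of dispatches to perform.

  • infos is the device or host address of a flat array of VkDispatchGraphInfoAMDX structures.

  • stride is the byte stride between successive VkDispatchGraphInfoAMDX structures in infos.

Whether infos is consumed as a device or host pointer is defined by the command this structure is used in.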

The VkDispatchGraphInfoAMDX structure is defined as:

// Provided by VK_AMDX_shader_enqueue
typedef struct VkDispatchGraphInfoAMDX {
    uint32_t                          nodeIndex;
    uint32_t                          payloadCount;
    VkDeviceOrHostAddressConstAMDX    payloads;
    uint64_t                          payloadStride;
} VkDispatchGraphInfoAMDX;
  • nodeIndex is the index of a node in an execution graph to be dispatched.

  • payloadCount is the number of payloads to dispatch for the specified node.

  • payloads is a device or host address pointer to a flat array of payloads with size equal to the product of payloadCount and payloadStride.

  • payloadStride is the byte stride between successive payloads in payloads.

Whether payloads is consumed as a device or host pointer is defined by the command this structure is used in.
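
To make the strided layout concrete, the sketch below walks a host-side dispatch description. GraphCountInfo and GraphInfo are hypothetical host-only mirrors of the two structures (the address unions are replaced by plain host pointers), and it assumes that stride is the byte stride between successive entries of infos, by analogy with payloadStride. The helper names are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-only mirror of the per-node dispatch info. */
typedef struct GraphInfo {
    uint32_t    node_index;
    uint32_t    payload_count;
    const void *payloads;       /* host pointer to the first payload */
    uint64_t    payload_stride; /* bytes between successive payloads */
} GraphInfo;

/* Hypothetical host-only mirror of the count info. */
typedef struct GraphCountInfo {
    uint32_t    count;
    const void *infos;  /* host pointer to the first GraphInfo */
    uint64_t    stride; /* assumed: bytes between successive GraphInfo entries */
} GraphCountInfo;

/* Address of the i-th dispatch info, honoring the stride. */
static const GraphInfo *graph_info_at(const GraphCountInfo *ci, uint32_t i)
{
    assert(ci != NULL && i < ci->count);
    const unsigned char *base = (const unsigned char *)ci->infos;
    return (const GraphInfo *)(base + (size_t)i * (size_t)ci->stride);
}

/* Address of the i-th payload of one dispatch info. */
static const void *payload_at(const GraphInfo *info, uint32_t i)
{
    assert(info != NULL && i < info->payload_count);
    const unsigned char *base = (const unsigned char *)info->payloads;
    return base + (size_t)i * (size_t)info->payload_stride;
}

/* Total number of payloads across all initially dispatched nodes. */
static uint64_t total_payload_count(const GraphCountInfo *ci)
{
    assert(ci != NULL);
    uint64_t total = 0;
    for (uint32_t i = 0; i < ci->count; ++i) {
        total += graph_info_at(ci, i)->payload_count;
    }
    return total;
}
```

Using byte strides rather than array indexing lets the dispatch infos and payloads be embedded in larger application records, as long as each stride is at least the size of the element it separates and preserves its alignment.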

Valid Usage

Shader Enqueue

Compute shaders in an execution graph can use the OpInitializeNodePayloadsAMDX instruction to initialize nodes for dispatch. Any node payload initialized in this way will be enqueued for dispatch once the shader is done writing to the payload. As compilers may be conservative when making this determination, shaders can further call OpFinalizeNodePayloadsAMDX to guarantee that the payload is no longer being written.

The Node Name operand of the PayloadNodeNameAMDX decoration on a payload identifies the shader name of the node to be enqueued, and the Shader Index operand of OpInitializeNodePayloadsAMDX identifies the shader index. A node identified in this way is dispatched as described in the following sections.

Compute Nodes

Compute shaders added as nodes to an execution graph are executed differently based on the presence or absence of the StaticNumWorkgroupsAMDX or CoalescingAMDX execution modes.

Dispatching a compute shader node that does not declare either the StaticNumWorkgroupsAMDX or CoalescingAMDX execution mode will execute a number of workgroups in each dimension specified by the first 12 bytes of the payload, interpreted as a VkDispatchIndirectCommand. The same payload will be broadcast to each workgroup in the same dispatch. Additional values in the payload have no effect on execution.
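
As a host-side illustration of this payload layout, the sketch below reads the three workgroup counts from the start of a payload. WorkgroupCounts is a hypothetical struct with the same three 32-bit members (x, y, z) as VkDispatchIndirectCommand, and the helper names are illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical struct with the same layout as VkDispatchIndirectCommand. */
typedef struct WorkgroupCounts {
    uint32_t x;
    uint32_t y;
    uint32_t z;
} WorkgroupCounts;

/* Read the workgroup counts from the first 12 bytes of a payload. memcpy is
 * used so that the payload does not need any particular alignment. */
static WorkgroupCounts read_workgroup_counts(const void *payload, size_t payload_size)
{
    assert(payload != NULL && payload_size >= 12);
    const unsigned char *bytes = (const unsigned char *)payload;
    WorkgroupCounts counts;
    memcpy(&counts.x, bytes + 0, sizeof counts.x);
    memcpy(&counts.y, bytes + 4, sizeof counts.y);
    memcpy(&counts.z, bytes + 8, sizeof counts.z);
    return counts;
}

/* Number of workgroups launched for one payload. */
static uint64_t total_workgroups(WorkgroupCounts c)
{
    return (uint64_t)c.x * (uint64_t)c.y * (uint64_t)c.z;
}
```

A payload that begins with the values 4, 2, 1 therefore launches eight workgroups, each of which receives the same payload, including any application data that follows the first 12 bytes.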

Dispatching a compute shader node with the StaticNumWorkgroupsAMDX execution mode will execute workgroups in each dimension according to the x, y, and z size operands to the StaticNumWorkgroupsAMDX execution mode. The same payload will be broadcast to each workgroup in the same dispatch. Any values in the payload have no effect on execution.

Dispatching a compute shader node with the CoalescingAMDX execution mode will enqueue a single invocation for execution. Implementations may combine multiple such dispatches into the same workgroup, up to the size of the workgroup. The number of invocations coalesced into a given workgroup in this way can be queried via the CoalescedInputCountAMDX built-in. Any values in the payload have no effect on execution.
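
Because coalescing is optional, the number of workgroups launched for a set of coalescing dispatches is only bounded, not fixed. The sketch below computes those bounds, assuming that the workgroup size is expressed in invocations and that each dispatch contributes exactly one invocation; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Fewest workgroups possible: every workgroup is filled completely. */
static uint32_t min_coalesced_workgroups(uint32_t enqueued, uint32_t workgroup_size)
{
    assert(workgroup_size > 0);
    return enqueued / workgroup_size +
           ((enqueued % workgroup_size != 0) ? 1u : 0u);
}

/* Most workgroups possible: the implementation coalesces nothing, so each
 * enqueued invocation runs in its own workgroup. */
static uint32_t max_coalesced_workgroups(uint32_t enqueued)
{
    return enqueued;
}
```

With 100 enqueued invocations and a workgroup size of 64, an implementation may launch anywhere from 2 to 100 workgroups, which is why shaders rely on the CoalescedInputCountAMDX built-in rather than assuming a full workgroup.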

Mesh Nodes

Graphics pipelines added as nodes to an execution graph are executed in a manner similar to vkCmdDrawMeshTasksIndirectEXT, using the same payloads as compute shaders, but capturing some state from the command buffer.

When an execution graph dispatch is recorded into a command buffer, it captures the following dynamic state for use with draw nodes:

Other state is not captured, and graphics pipelines must not be created with other dynamic states when used as a library in an execution graph pipeline.