ML Extension Overview
The ML Emulation Layer for Vulkan® exposes ML functionality through two Vulkan® layers:
VK_LAYER_ML_Tensor_Emulation, which provides tensor objects, tensor views, tensor queries, and tensor-shader emulation throughVK_ARM_tensors.VK_LAYER_ML_Graph_Emulation, which provides graph execution throughVK_ARM_data_graphand extends that same graph path for TOSA and optical flow.
The tensor layer is the foundation. The graph layer builds on top of it and is the route used for generic data-graph execution, TOSA workloads, optical flow workloads, and Motion Engine operations.
API Flow
Direct tensor flow
Application
-> enable VK_LAYER_ML_Tensor_Emulation
-> use VK_ARM_tensors
-> create tensors and tensor views
-> submit compute shaders using SPV_ARM_tensors
Data graph flow
Application
-> enable VK_LAYER_ML_Graph_Emulation
-> enable VK_LAYER_ML_Tensor_Emulation
-> use VK_ARM_data_graph
-> graph layer uses VK_ARM_tensors tensor objects and tensor views
-> create and dispatch graph pipelines from SPV_ARM_graph
-> optional graph paths
-> VK_ARM_data_graph_instruction_set_tosa
-> TOSA.001000.1
-> VK_ARM_data_graph_optical_flow
-> optical-flow query paths and optical-flow pipeline nodes
-> Arm.MotionEngine.100
-> Motion Engine operations through the same data-graph path
In general, the execution model is as below:
VK_ARM_tensorsis the direct tensor API path.VK_ARM_data_graphis the graph execution path.TOSA, optical flow, and Motion Engine operations are all routed through the data-graph path rather than through separate top-level execution APIs.
Vulkan® Extension Map
Vulkan® extension |
Exposed by |
Relationship |
Detailed page |
|---|---|---|---|
|
|
Foundation for tensor objects, tensor views, and tensor-aware compute shaders. Also used by the graph layer. |
|
|
|
Core graph pipeline, graph session, and graph dispatch API. |
|
|
|
Add-on to |
|
|
|
Add-on to |
SPIR-V™ Map
SPIR-V™ extension or instruction set |
Consumed by |
Route through the Emulation Layer |
|---|---|---|
|
Tensor layer |
Direct tensor compute shaders. |
|
Graph layer |
Core graph pipeline representation. |
|
Graph layer |
Processed through |
|
Graph layer |
Processed through |
The shared graph processing path also handles reduced-precision tensor element
encodings used by the ML extensions, including BFloat16KHR, Float8E5M2EXT, and
Float8E4M3EXT.
Detailed Pages
Use the following child pages for the detailed API surface. For platform- specific constraints, see Known Limitations.