Data Graph

VK_ARM_data_graph is the main graph execution path in the Emulation Layer. TOSA, optical flow, and Motion Engine workloads are all routed through this same graph layer rather than through separate top-level execution APIs.

Relationship To Other ML Extensions

  • The graph layer requires VK_ARM_tensors.

  • The graph layer also requires VK_KHR_synchronization2.

  • Tensor objects and tensor views used by graph pipelines come from the tensor layer.

  • VK_ARM_data_graph_instruction_set_tosa extends the core graph path for TOSA.001000.1 workloads.

  • VK_ARM_data_graph_optical_flow extends the core graph path for optical-flow queries and optical-flow pipeline nodes.

  • Arm.MotionEngine.100 is handled by the graph SPIR-V™ pass and dispatched through the same graph pipeline creation and execution flow.
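
The dependency rules above can be checked before enabling the graph layer. The following is a minimal sketch, assuming the application already holds a list of available device extension names (for example, from an extension enumeration call). The helper names `has_extension` and `graph_layer_prereqs_met` are hypothetical and are not part of the Emulation Layer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Prerequisite extension names, taken from the list above. */
static const char *const kGraphLayerRequired[] = {
    "VK_ARM_tensors",
    "VK_KHR_synchronization2",
};

/* Returns true if `want` appears in the `available` name list. */
static bool has_extension(const char *const *available, size_t count,
                          const char *want)
{
    for (size_t i = 0; i < count; ++i) {
        if (strcmp(available[i], want) == 0) {
            return true;
        }
    }
    return false;
}

/* Hypothetical helper: true only if every graph-layer prerequisite is present. */
bool graph_layer_prereqs_met(const char *const *available, size_t count)
{
    size_t required = sizeof(kGraphLayerRequired) / sizeof(kGraphLayerRequired[0]);
    for (size_t i = 0; i < required; ++i) {
        if (!has_extension(available, count, kGraphLayerRequired[i])) {
            return false;
        }
    }
    return true;
}
```

An application would typically run a check like this once per physical device and fall back to another path when it returns false.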

Supported Data Graph API Surface

The graph layer currently exposes or intercepts the following graph-related entry points.

Discovery and capability query

  • vkGetPhysicalDeviceQueueFamilyProperties

  • vkGetPhysicalDeviceQueueFamilyProperties2

  • vkGetPhysicalDeviceQueueFamilyDataGraphPropertiesARM

  • vkGetPhysicalDeviceQueueFamilyDataGraphProcessingEnginePropertiesARM

  • vkGetPhysicalDeviceQueueFamilyDataGraphEngineOperationPropertiesARM

  • vkGetPhysicalDeviceQueueFamilyDataGraphOpticalFlowImageFormatsARM

  • vkGetPhysicalDeviceFeatures2

  • vkGetPhysicalDeviceFeatures2KHR

  • vkGetPhysicalDeviceToolPropertiesEXT

  • vkCreateDevice

Core graph pipeline and session API

  • vkCreateDataGraphPipelinesARM

  • vkDestroyPipeline

  • vkGetDataGraphPipelineAvailablePropertiesARM

  • vkGetDataGraphPipelinePropertiesARM

  • vkCreateDataGraphPipelineSessionARM

  • vkDestroyDataGraphPipelineSessionARM

  • vkGetDataGraphPipelineSessionBindPointRequirementsARM

  • vkGetDataGraphPipelineSessionMemoryRequirementsARM

  • vkBindDataGraphPipelineSessionMemoryARM

  • vkCmdDispatchDataGraphARM

Supporting Vulkan® hooks used by the graph layer

  • vkAllocateDescriptorSets

  • vkFreeDescriptorSets

  • vkUpdateDescriptorSets

  • vkCmdBindPipeline

  • vkCmdBindDescriptorSets

  • vkCreateTensorViewARM

  • vkDestroyTensorViewARM

  • vkCreateShaderModule

  • vkDestroyShaderModule

  • vkCmdPipelineBarrier2

  • vkSetDebugUtilsObjectNameEXT

Behavior Exposed By The Graph Layer

The graph layer currently reports the following behavior through standard query paths:

  • VkPhysicalDeviceDataGraphFeaturesARM.dataGraph is enabled.

  • VkPhysicalDeviceDataGraphFeaturesARM.dataGraphShaderModule is enabled.

  • VkPhysicalDeviceDataGraphFeaturesARM.dataGraphUpdateAfterBind is enabled when the underlying device supports uniform-buffer update-after-bind.

  • VkPhysicalDeviceDataGraphOpticalFlowFeaturesARM.dataGraphOpticalFlow is enabled.

  • Any queue family that already supports compute is also reported with VK_QUEUE_DATA_GRAPH_BIT_ARM.
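
The queue-family rule above can be modelled in a few lines. This is a sketch only: it does not include the Vulkan® headers, so `DEMO_QUEUE_DATA_GRAPH_BIT` is a stand-in for VK_QUEUE_DATA_GRAPH_BIT_ARM with an assumed numeric value, and both function names are hypothetical. Real code should use the header constants.

```c
#include <stdint.h>

/* VK_QUEUE_COMPUTE_BIT is 0x2 in the Vulkan headers. */
#define DEMO_QUEUE_COMPUTE_BIT 0x00000002u
/* Stand-in for VK_QUEUE_DATA_GRAPH_BIT_ARM. The value is an assumption
 * made for illustration; take the real one from the Vulkan headers. */
#define DEMO_QUEUE_DATA_GRAPH_BIT 0x00000400u

/* Models the reporting rule: compute-capable families gain the data-graph bit. */
uint32_t demo_reported_queue_flags(uint32_t driver_flags)
{
    if (driver_flags & DEMO_QUEUE_COMPUTE_BIT) {
        return driver_flags | DEMO_QUEUE_DATA_GRAPH_BIT;
    }
    return driver_flags;
}

/* Application side: index of the first family reporting the data-graph bit,
 * or -1 if no family reports it. */
int demo_find_data_graph_family(const uint32_t *family_flags, uint32_t count)
{
    for (uint32_t i = 0; i < count; ++i) {
        if (family_flags[i] & DEMO_QUEUE_DATA_GRAPH_BIT) {
            return (int)i;
        }
    }
    return -1;
}
```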

The queue-family data-graph property query currently reports two operation types:

  • TOSA.001000.1 as a SPIR-V™ extended instruction set operation.

  • OpticalFlow as a dedicated optical-flow operation.

For TOSA query paths, the layer reports a single profile, named "Emulation Layer", with conformant quality at TOSA level 8K.

TOSA Through Data Graph

VK_ARM_data_graph_instruction_set_tosa is not a separate execution API. It extends VK_ARM_data_graph so that graph pipelines can consume the TOSA.001000.1 SPIR-V™ extended instruction set.

The graph pass currently implements the following TOSA.001000.1 operations:

  • Elementwise and logical: ABS, ADD, BITWISE_AND, BITWISE_NOT, BITWISE_OR, BITWISE_XOR, EQUAL, GREATER, GREATER_EQUAL, INTDIV, LOGICAL_AND, LOGICAL_LEFT_SHIFT, LOGICAL_NOT, LOGICAL_OR, LOGICAL_RIGHT_SHIFT, LOGICAL_XOR, MAXIMUM, MINIMUM, MUL, NEGATE, POW, SUB.

  • Unary math and activations: CAST, CEIL, CLAMP, CLZ, COS, ERF, EXP, FLOOR, LOG, RECIPROCAL, RSQRT, SIGMOID, SIN, TANH.

  • Reductions and selection: ARGMAX, REDUCE_ALL, REDUCE_ANY, REDUCE_MAX, REDUCE_MIN, REDUCE_PRODUCT, REDUCE_SUM, SELECT.

  • Convolution, pooling, and matrix-style operations: AVG_POOL2D, CONV2D, CONV3D, DEPTHWISE_CONV2D, MATMUL, MAX_POOL2D, TRANSPOSE_CONV2D.

  • Data movement, layout, and other operations: ARITHMETIC_RIGHT_SHIFT, CONCAT, FFT2D, GATHER, PAD, RESCALE, RESHAPE, RESIZE, REVERSE, RFFT2D, SCATTER, SLICE, TABLE, TILE, TRANSPOSE.
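
For tooling that needs to pre-screen a model, the operation lists above can be flattened into a simple lookup. The sketch below transcribes the names from this section into one table. The function `demo_tosa_op_implemented` is hypothetical and is not an entry point of the layer, and the match is case-sensitive by assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Operation names transcribed from the lists above (order is not significant). */
static const char *const kDemoTosaOps[] = {
    "ABS", "ADD", "ARGMAX", "ARITHMETIC_RIGHT_SHIFT", "AVG_POOL2D",
    "BITWISE_AND", "BITWISE_NOT", "BITWISE_OR", "BITWISE_XOR",
    "CAST", "CEIL", "CLAMP", "CLZ", "CONCAT", "CONV2D", "CONV3D", "COS",
    "DEPTHWISE_CONV2D", "EQUAL", "ERF", "EXP", "FFT2D", "FLOOR",
    "GATHER", "GREATER", "GREATER_EQUAL", "INTDIV",
    "LOGICAL_AND", "LOGICAL_LEFT_SHIFT", "LOGICAL_NOT", "LOGICAL_OR",
    "LOGICAL_RIGHT_SHIFT", "LOGICAL_XOR", "LOG",
    "MATMUL", "MAXIMUM", "MAX_POOL2D", "MINIMUM", "MUL", "NEGATE",
    "PAD", "POW", "RECIPROCAL", "REDUCE_ALL", "REDUCE_ANY", "REDUCE_MAX",
    "REDUCE_MIN", "REDUCE_PRODUCT", "REDUCE_SUM", "RESCALE", "RESHAPE",
    "RESIZE", "REVERSE", "RFFT2D", "RSQRT",
    "SCATTER", "SELECT", "SIGMOID", "SIN", "SLICE", "SUB",
    "TABLE", "TANH", "TILE", "TRANSPOSE", "TRANSPOSE_CONV2D",
};

/* Hypothetical helper: true if `op` is one of the names listed above. */
bool demo_tosa_op_implemented(const char *op)
{
    size_t count = sizeof(kDemoTosaOps) / sizeof(kDemoTosaOps[0]);
    for (size_t i = 0; i < count; ++i) {
        if (strcmp(kDemoTosaOps[i], op) == 0) {
            return true;
        }
    }
    return false;
}
```

A linear scan is used because the table is small; a sorted table with a binary search would work equally well.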

Motion Engine Through Data Graph

Arm.MotionEngine.100 is also handled by the graph SPIR-V™ path rather than by a separate Vulkan® execution API.

The graph pass currently implements the following Motion Engine operations:

  • MIN_SAD

  • MIN_SAD_COST

  • RAW_SAD

Optical Flow Through Data Graph

VK_ARM_data_graph_optical_flow extends VK_ARM_data_graph rather than introducing a separate execution API. Optical-flow pipelines are still created, queried, and dispatched through the core data-graph pipeline and session API listed above.

Capability query

The optical-flow path currently reports and accepts:

  • Output and hint grid sizes 1x1, 2x2, 4x4, and 8x8.

  • Hint input support.

  • Cost output support.

  • Image sizes from 64x64 up to 8192x8192.

  • Input image formats: VK_FORMAT_R8G8B8_UNORM, VK_FORMAT_B8G8R8_UNORM, VK_FORMAT_R8G8B8A8_UNORM, VK_FORMAT_B8G8R8A8_UNORM, VK_FORMAT_B10G11R11_UFLOAT_PACK32, VK_FORMAT_R8_UNORM.

  • Flow-vector output format VK_FORMAT_R16G16_SFLOAT.

  • Cost output format VK_FORMAT_R16_UINT.
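
An application can pre-check its parameters against the limits above before creating an optical-flow pipeline. The following sketch assumes that "64x64 up to 8192x8192" means each dimension is bounded independently and inclusively, and that grid sizes are expressed as the edge length of a square grid. The function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Limits copied from the capability list above. */
#define DEMO_OF_MIN_IMAGE_DIM 64u
#define DEMO_OF_MAX_IMAGE_DIM 8192u

/* True for square grid edge lengths 1, 2, 4, and 8 (1x1 through 8x8). */
bool demo_of_grid_size_supported(uint32_t grid)
{
    return grid == 1u || grid == 2u || grid == 4u || grid == 8u;
}

/* True if both dimensions lie in the inclusive range 64..8192.
 * Treating the two dimensions independently is an assumption. */
bool demo_of_image_size_supported(uint32_t width, uint32_t height)
{
    return width >= DEMO_OF_MIN_IMAGE_DIM && width <= DEMO_OF_MAX_IMAGE_DIM &&
           height >= DEMO_OF_MIN_IMAGE_DIM && height <= DEMO_OF_MAX_IMAGE_DIM;
}
```

The authoritative limits are those returned by the capability queries at run time; hard-coded values like these are only useful for early rejection.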

Creation constraints

The current optical-flow implementation accepts:

  • VK_DATA_GRAPH_OPTICAL_FLOW_CREATE_ENABLE_HINT_BIT_ARM

  • VK_DATA_GRAPH_OPTICAL_FLOW_CREATE_ENABLE_COST_BIT_ARM

  • Performance levels SLOW, MEDIUM, FAST, and UNKNOWN

  • Session create flag VK_DATA_GRAPH_PIPELINE_SESSION_CREATE_OPTICAL_FLOW_CACHE_BIT_ARM

When hint input is enabled, the hint grid size must match the selected output grid size.
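
The hint rule can be expressed as a small validation step. This sketch uses placeholder bit values in place of the real VK_DATA_GRAPH_OPTICAL_FLOW_CREATE_* constants, and the struct and function names are hypothetical. Rejecting flags outside the two listed above is an assumption about how a caller might want to be strict, not documented layer behavior.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the create flags named above; the bit values are placeholders. */
#define DEMO_OF_CREATE_ENABLE_HINT_BIT 0x1u
#define DEMO_OF_CREATE_ENABLE_COST_BIT 0x2u

/* Hypothetical parameter bundle for an optical-flow pipeline. */
struct demo_of_create_params {
    uint32_t flags;       /* combination of the DEMO_OF_CREATE_* bits */
    uint32_t output_grid; /* edge length of the output grid, e.g. 4 for 4x4 */
    uint32_t hint_grid;   /* edge length of the hint grid; ignored without the hint bit */
};

/* Returns true if the flags are recognised and the hint grid rule holds. */
bool demo_of_create_params_valid(const struct demo_of_create_params *p)
{
    const uint32_t known = DEMO_OF_CREATE_ENABLE_HINT_BIT |
                           DEMO_OF_CREATE_ENABLE_COST_BIT;
    if (p->flags & ~known) {
        return false; /* assumption: treat unlisted flags as unsupported */
    }
    if ((p->flags & DEMO_OF_CREATE_ENABLE_HINT_BIT) &&
        p->hint_grid != p->output_grid) {
        return false; /* hint grid must match the output grid */
    }
    return true;
}
```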