Converting and deploying a PyTorch model tutorial

Note

For details on platform support, installation, and usage of ExecuTorch, please refer to the official documentation:

This tutorial describes how to convert and deploy a PyTorch model using the ML SDK for Vulkan®. It uses a sample PyTorch model containing a single MaxPool2D operation to demonstrate each step of the end-to-end workflow.

ExecuTorch can be installed via prebuilt wheels:

Note

The command below installs a development (nightly) wheel. Replace it with an official release once one is available.

pip install --upgrade --pre -f https://download.pytorch.org/whl/nightly/executorch/ "executorch==0.8.0.dev20250811"

Clone the ExecuTorch repository and install the required dependencies using its Arm setup script.

Note

The setup script requires a Git username and email to be configured. For example:

git config --global user.name "Your Name"
git config --global user.email "you@example.com"

git clone https://github.com/pytorch/executorch.git
./executorch/examples/arm/setup.sh --disable-ethos-u-deps

  1. Add the ML SDK Model Converter to PATH:

The ExecuTorch backend relies on the ML SDK Model Converter.

export PATH=/path/containing/model-converter/:$PATH
which model-converter

This should print out the path to the model-converter binary.
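The same check can be scripted before running the export. The following is a minimal sketch, assuming only that the binary is named model-converter as in the step above; the helper name find_tool is hypothetical.

```python
import shutil
from typing import Optional


def find_tool(name: str) -> Optional[str]:
    """Return the path of an executable found on PATH, or None if absent."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_tool("model-converter")
    if path is None:
        print("model-converter not found on PATH; update PATH as shown above")
    else:
        print(f"model-converter found at {path}")
```

Using shutil.which mirrors what the shell's which command does, so the script sees the same PATH lookup that the ExecuTorch backend would rely on when launched from the same environment.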

  2. Run the following Python script, saved as MaxPool2DModel.py, to create a PyTorch model for a single MaxPool2D operation.

#!/usr/bin/env python3
#
# SPDX-FileCopyrightText: Copyright 2024-2025 Arm Limited and/or its affiliates <open-source-office@arm.com>
# SPDX-License-Identifier: Apache-2.0
#
import numpy as np
import torch
import torch.nn as nn
from executorch.backends.arm.arm_backend import ArmCompileSpecBuilder
from executorch.backends.arm.vgf_partitioner import VgfPartitioner
from executorch.exir import EdgeCompileConfig
from executorch.exir import to_edge_transform_and_lower


# Define model
class MaxPoolModel(nn.Module):
    def __init__(self):
        super(MaxPoolModel, self).__init__()
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)

    def forward(self, x):
        x = self.pool(x)
        return x


# Generate test input
example_input = torch.randn(1, 3, 64, 64)
np.save("input-0.npy", example_input.numpy())

model = MaxPoolModel().eval()

# Save the VGF model
compile_spec = (
    ArmCompileSpecBuilder()
    .vgf_compile_spec()
    .dump_intermediate_artifacts_to(".")
    .build()
)
partitioner = VgfPartitioner(compile_spec)

exported_program = torch.export.export_for_training(model, (example_input,))

to_edge_transform_and_lower(
    exported_program,
    partitioner=[partitioner],
    compile_config=EdgeCompileConfig(
        _check_ir_validity=False,
    ),
)

python MaxPool2DModel.py

This generates a VGF file named ${NAME}.vgf in the current working directory, where ${NAME} is chosen by the tool. The script also saves a matching example input, input-0.npy, in the same directory for testing.
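It helps to know in advance what output shape the exported model should produce. The sketch below applies the output-size formula documented for torch.nn.MaxPool2d (assuming the defaults of padding 0, dilation 1, and ceil_mode off, which the script above does not override) to the 1x3x64x64 example input. The helper name pooled_size is hypothetical.

```python
def pooled_size(size: int, kernel: int, stride: int,
                padding: int = 0, dilation: int = 1) -> int:
    """Output length along one spatial dimension of a max pool (ceil_mode=False)."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


# Shape of example_input in the script above (N, C, H, W).
n, c, h, w = 1, 3, 64, 64

# kernel_size=2 and stride=2, as in MaxPoolModel.
out_shape = (n, c, pooled_size(h, 2, 2), pooled_size(w, 2, 2))
print(out_shape)  # (1, 3, 32, 32)
```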

3. Use the VGF Dump Tool to generate a Scenario Template. To run a scenario on the ML SDK Scenario Runner, you must have a scenario specification in the form of a JSON file. Use the VGF file that was generated in the previous step and pass it to the VGF Dump Tool:

vgf_dump --input ${NAME}.vgf --output scenario.json --scenario-template

Note

For more information about the VGF Library and the VGF Dump Tool, see: ML SDK VGF Library.

  4. The generated scenario.json file contains placeholder names for the input and output bindings of the scenario. Replace these placeholders with the actual input and output filenames used when running the scenario. In the scenario.json file generated in the preceding step:

    1. Replace the name TEMPLATE_PATH_TENSOR_INPUT_0 with the actual input file input-0.npy.

    2. Replace the name TEMPLATE_PATH_TENSOR_OUTPUT_0 with the actual output filename output-0.npy.

Note

For more information about the test description format, see: JSON Test Description Specification.
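The two replacements can also be scripted. The following is a minimal sketch, assuming the placeholders appear in scenario.json as the literal strings named above. The helpers fill_template and fill_scenario are hypothetical, and they perform a plain text substitution rather than interpreting the JSON structure.

```python
from pathlib import Path

# Placeholder -> filename, as listed in the steps above.
REPLACEMENTS = {
    "TEMPLATE_PATH_TENSOR_INPUT_0": "input-0.npy",
    "TEMPLATE_PATH_TENSOR_OUTPUT_0": "output-0.npy",
}


def fill_template(text: str, replacements: dict = REPLACEMENTS) -> str:
    """Substitute each placeholder; raise if one is missing from the text."""
    for placeholder, filename in replacements.items():
        if placeholder not in text:
            raise ValueError(f"placeholder {placeholder} not found")
        text = text.replace(placeholder, filename)
    return text


def fill_scenario(path: str = "scenario.json") -> None:
    """Rewrite the scenario file in place with the placeholders filled in."""
    scenario = Path(path)
    scenario.write_text(fill_template(scenario.read_text()))
```

Calling fill_scenario() from the directory that contains scenario.json applies both replacements. Raising on a missing placeholder guards against silently running the script twice or against a template whose placeholder names differ from the ones assumed here.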

  5. Run the ML SDK Scenario Runner on the ML Emulation Layer for Vulkan®:

scenario-runner --scenario scenario.json

The scenario writes its output to output-0.npy, the filename specified in scenario.json.
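One way to check the result is to compare it against a reference computed directly from the input. The following is a minimal sketch, assuming NumPy is installed (the export script already uses it), that output-0.npy holds a floating-point tensor in the same NCHW layout as the input, and that near-exact agreement is expected. The helper names maxpool2x2 and check, and the tolerance, are hypothetical choices.

```python
import numpy as np


def maxpool2x2(x: np.ndarray) -> np.ndarray:
    """Reference 2x2, stride-2 max pool over an NCHW array with even H and W."""
    n, c, h, w = x.shape
    # Split H and W into (blocks, 2) pairs, then take the max within each 2x2 block.
    return x.reshape(n, c, h // 2, 2, w // 2, 2).max(axis=(3, 5))


def check(input_path: str = "input-0.npy",
          output_path: str = "output-0.npy") -> bool:
    """Return True if the scenario output matches the NumPy reference."""
    expected = maxpool2x2(np.load(input_path))
    actual = np.load(output_path)
    return bool(actual.shape == expected.shape
                and np.allclose(actual, expected, atol=1e-6))
```

Calling check() in the working directory returns True when the shapes and values agree. If the runner stores the output in a different layout or data type, the comparison would need to be adapted accordingly.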

Note

For more information about building and running the ML SDK Scenario Runner, see: ML SDK Scenario Runner.

For more information about building and setting up the Emulation Layer, see: ML Emulation Layer for Vulkan®.