High-level (export, transformation, and compilation)

  • Export the model

    • Capture the PyTorch program as a graph using torch.export.
  • Compile the exported model to an ExecuTorch program

    • Given an exported model from step 1, convert it to an executable format called an ExecuTorch program that the runtime can use for inference.
    • Entry point for various optimizations (a sketch of the quantization and delegation hooks follows this list):
      • quantization
      • delegation, i.e. further compiling subgraphs down to specialized on-device hardware accelerators to improve latency.
      • memory planning, i.e. planning the locations of intermediate tensors to reduce the memory footprint.
  • Run the ExecuTorch program on a target device.

    • Feed inputs in and read outputs out; nothing runs eagerly, since the execution plan was already computed in steps 1 and 2 (see the runtime sketch after the end-to-end workflow below).
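
As a rough illustration of the quantization and delegation entry points above, the sketch below targets the XNNPACK backend. It is a minimal sketch, not the canonical flow: the quantizer import path and get_symmetric_quantization_config move between releases (newer ExecuTorch builds expose the quantizer under executorch.backends.xnnpack.quantizer), and TinyModel is just a placeholder module for the example.

import torch
from torch.export import export, export_for_training
from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e

# Assumed import paths; newer ExecuTorch releases move the quantizer under
# executorch.backends.xnnpack.quantizer.xnnpack_quantizer.
from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)
from executorch.backends.xnnpack.partition.xnnpack_partitioner import XnnpackPartitioner
from executorch.exir import to_edge


class TinyModel(torch.nn.Module):  # placeholder model for this sketch
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 5)

    def forward(self, x):
        return self.linear(x)


example_args = (torch.randn(3, 4),)

# Quantization: annotate the pre-autograd graph, calibrate, then convert.
pre_autograd = export_for_training(TinyModel(), example_args).module()
quantizer = XNNPACKQuantizer().set_global(get_symmetric_quantization_config())
prepared = prepare_pt2e(pre_autograd, quantizer)
prepared(*example_args)  # calibration run(s) with representative inputs
quantized = convert_pt2e(prepared)

# Delegation: hand matching subgraphs to the XNNPACK backend.
edge = to_edge(export(quantized, example_args))
edge = edge.to_backend(XnnpackPartitioner())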

Architectural Components

Program preparation

End-to-end workflow

import torch
from torch.export import export, export_for_training, ExportedProgram
import executorch.exir as exir
from executorch.exir import ExecutorchBackendConfig
 
 
class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.param = torch.nn.Parameter(torch.rand(3, 4))
        self.linear = torch.nn.Linear(4, 5)
 
    def forward(self, x):
        return self.linear(x + self.param).clamp(min=0.0, max=1.0)
 
 
example_args = (torch.randn(3, 4),)
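# Capture the model as a pre-autograd ATen graph; this is the form PT2E quantization expects.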
pre_autograd_aten_dialect = export_for_training(M(), example_args).module()
# Optionally do quantization:
# pre_autograd_aten_dialect = convert_pt2e(prepare_pt2e(pre_autograd_aten_dialect, CustomBackendQuantizer))
aten_dialect: ExportedProgram = export(pre_autograd_aten_dialect, example_args)
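# Lower the ATen dialect program to the Edge dialect.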
edge_program: exir.EdgeProgramManager = exir.to_edge(aten_dialect)
# Optionally do delegation:
# edge_program = edge_program.to_backend(CustomBackendPartitioner)
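# Compile the Edge dialect program into an ExecuTorch program.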
executorch_program: exir.ExecutorchProgramManager = edge_program.to_executorch(
    ExecutorchBackendConfig(
        passes=[],  # User-defined passes
    )
)
 
with open("model.pte", "wb") as file:
    file.write(executorch_program.buffer)
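
To cover step 3 (running the ExecuTorch program), the saved model.pte can be loaded and executed through the runtime. The sketch below assumes the ExecuTorch Python bindings are available in the build; the _load_for_executorch helper and its module path (executorch.extension.pybindings.portable_lib) depend on how the package was built, so treat the exact names as assumptions. On-device deployments would typically use the C++ runtime instead.

import torch

# Assumption: Python bindings built with the portable operator library.
from executorch.extension.pybindings.portable_lib import _load_for_executorch

# Load the compiled program; the execution plan and memory layout were
# fixed during compilation, so nothing runs eagerly here.
et_module = _load_for_executorch("model.pte")

# Run "forward" with an input matching the export-time shape (3, 4).
outputs = et_module.forward([torch.randn(3, 4)])
print(outputs[0])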