Invoking the runtime

https://pytorch.org/executorch/main/llm/getting-started.html
Prerequisite: export the model to a .pte file (torch.export() ⇒ Edge Compilation).
Create a file called main.cpp with the following contents:

// main.cpp
 
#include <cstdint>
 
#include "basic_sampler.h"
#include "basic_tokenizer.h"
 
#include <executorch/extension/module/module.h>
#include <executorch/extension/tensor/tensor.h>
#include <executorch/runtime/core/evalue.h>
#include <executorch/runtime/core/exec_aten/exec_aten.h>
#include <executorch/runtime/core/result.h>
 
using executorch::aten::ScalarType;
using executorch::aten::Tensor;
using executorch::extension::from_blob;
using executorch::extension::Module;
using executorch::runtime::EValue;
using executorch::runtime::Result;

The Module class handles loading the .pte file and preparing it for execution.
- It exposes a forward() method that expects its inputs as EValue-wrapped tensors.

  // Load the exported nanoGPT program, which was generated via the previous
  // steps.
  Module model("nanogpt.pte", Module::LoadMode::MmapUseMlockIgnoreErrors);

The ExecuTorch EValue class provides a wrapper around tensors and other ExecuTorch data types.
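
To make the "wrapper" idea concrete, here is a toy sketch in plain C++17 with no ExecuTorch dependency. It is not ExecuTorch's implementation: ToyTensor, ToyValue, and toy_forward are hypothetical names invented for this note, and std::variant is swapped in as the tagging mechanism. It only illustrates the general pattern of one value type that can hold a tensor or a scalar, with accessors that check the tag before unwrapping.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical stand-in for a tensor: a flat buffer of token ids.
struct ToyTensor {
  std::vector<std::int64_t> data;
};

// Hypothetical tagged wrapper, loosely inspired by the idea behind EValue.
// It is NOT ExecuTorch's implementation and shares none of its API.
class ToyValue {
 public:
  explicit ToyValue(ToyTensor t) : value_(std::move(t)) {}
  explicit ToyValue(std::int64_t i) : value_(i) {}
  explicit ToyValue(double d) : value_(d) {}

  bool is_tensor() const { return std::holds_alternative<ToyTensor>(value_); }
  bool is_int() const { return std::holds_alternative<std::int64_t>(value_); }
  bool is_double() const { return std::holds_alternative<double>(value_); }

  // Accessors check the tag first and throw on a mismatch.
  const ToyTensor& to_tensor() const {
    if (!is_tensor()) throw std::runtime_error("ToyValue does not hold a tensor");
    return std::get<ToyTensor>(value_);
  }
  std::int64_t to_int() const {
    if (!is_int()) throw std::runtime_error("ToyValue does not hold an int");
    return std::get<std::int64_t>(value_);
  }

 private:
  std::variant<ToyTensor, std::int64_t, double> value_;
};

// Hypothetical "forward": accepts wrapped inputs, requires the first one to
// be a tensor, and returns the tensor's element count as a wrapped int.
std::vector<ToyValue> toy_forward(const std::vector<ToyValue>& inputs) {
  if (inputs.empty()) throw std::runtime_error("expected at least one input");
  const ToyTensor& tokens = inputs.front().to_tensor();
  std::vector<ToyValue> outputs;
  outputs.emplace_back(static_cast<std::int64_t>(tokens.data.size()));
  return outputs;
}
```

Usage: build a std::vector<ToyValue>, emplace a ToyTensor into it, pass it to toy_forward, then check is_int() on the first output before calling to_int(). The real EValue and Module::forward have their own signatures; check the ExecuTorch headers listed above for those.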
