# Directory Structure

Below is the layout of the `examples/mediatek` directory, which includes the necessary files for the example applications:

```plaintext
examples/mediatek
├── aot_utils                      # Utils for AoT export
│   ├── llm_utils                  # Utils for LLM models
│   │   ├── preformatter_templates # Model specific prompt preformatter templates
│   │   ├── prompts                # Calibration prompts
│   │   └── tokenizers_            # Model tokenizer scripts
│   └── oss_utils                  # Utils for oss models
├── eval_utils                     # Utils for evaluating oss models
├── model_export_scripts           # Model specific export scripts
├── models                         # Model definitions
│   └── llm_models                 # LLM model definitions
│       └── weights                # LLM model weights location (offline). Ensure that config.json, the relevant tokenizer files and the .bin or .safetensors weight file(s) are placed here.
├── executor_runner                # Example C++ wrapper for the ExecuTorch runtime
├── pte                            # Generated .pte files location
├── shell_scripts                  # Shell scripts to quick-run model specific exports
├── CMakeLists.txt                 # CMake build configuration file for compiling examples
├── requirements.txt               # MTK and other required packages
├── mtk_build_examples.sh          # Script for building the MediaTek backend and the examples
└── README.md                      # Documentation for the examples (this file)
```

# Examples Build Instructions

## Environment Setup
- Follow the **Prerequisites** and **Setup** instructions in `backends/mediatek/scripts/README.md`.

## Build MediaTek Examples
1. Build the backend and the examples by executing the script:
```bash
./mtk_build_examples.sh
```

## LLaMa Example Instructions
##### Note: Verify that a localhost connection is available before running the AoT flow.
1. Exporting Models to `.pte`
- In the `examples/mediatek` directory, run:
```bash
source shell_scripts/export_llama.sh
```
- Defaults:
    - `model_name` = llama3
    - `num_chunks` = 4
    - `prompt_num_tokens` = 128
    - `cache_size` = 1024
    - `calibration_set_name` = None
- Argument Explanations/Options:
    - `model_name`: llama2/llama3
        - **Note: Currently only tested on Llama2 7B Chat and Llama3 8B Instruct.**
    - `num_chunks`: Number of chunks to split the model into. Each chunk contains the same number of decoder layers. Results in `num_chunks` `.pte` files being generated. Typical values are 1, 2 and 4.
    - `prompt_num_tokens`: Number of tokens (> 1) consumed in each forward pass during the prompt-processing stage.
    - `cache_size`: Cache size.
    - `calibration_set_name`: Name of the calibration dataset (with extension) located in the `aot_utils/llm_utils/prompts` directory. Example: `alpaca.txt`. If `"None"`, dummy data is used for calibration.
        - **Note: The export script has only been tested with `.txt` files.**

2. `.pte` files will be generated in `examples/mediatek/pte`
    - Users should expect `num_chunks*2` `.pte` files (half of them for prompt processing and half for token generation).
    - Generation `.pte` files have "`1t`" in their names.
    - Additionally, an embedding bin file will be generated in the weights folder where the `config.json` is located. [`examples/mediatek/models/llm_models/weights/<weights_folder>/embedding_<model_name>_fp32.bin`]
        - e.g. for `llama3-8B-instruct`, the embedding bin is generated in `examples/mediatek/models/llm_models/weights/llama3-8B-instruct/`
    - The AoT flow takes roughly 2.5 hours (and about 114 GB of RAM for `num_chunks=4`) to complete. (Results will vary by device/hardware configuration.)
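For reference, a concrete invocation might look like the sketch below. The positional-argument form and the `alpaca.txt` calibration set are assumptions based on the defaults and options listed above; consult `shell_scripts/export_llama.sh` for the exact interface.

```bash
# Hypothetical example: export Llama3 in 4 chunks, with a 128-token prompt pass,
# a cache size of 1024, and alpaca.txt as the calibration set.
# The positional-argument form is an assumption; check shell_scripts/export_llama.sh.
source shell_scripts/export_llama.sh llama3 4 128 1024 alpaca.txt
```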
### oss
1. Exporting Model to `.pte`
```bash
bash shell_scripts/export_oss.sh <model_name>
```
- Argument Options:
    - `model_name`: deeplabv3/edsr/inceptionv3/inceptionv4/mobilenetv2/mobilenetv3/resnet18/resnet50

# Runtime
## Environment Setup

To set up the build environment for the `mtk_executor_runner`:

1. Navigate to the `backends/mediatek/scripts` directory within the repository.
2. Follow the detailed build steps provided in that location.
3. Upon successful completion of the build steps, the `mtk_executor_runner` binary will be generated.

## Deploying and Running on the Device

### Pushing Files to the Device

Transfer the `.pte` model files and the `mtk_executor_runner` binary to your Android device using the following commands:

```bash
adb push mtk_executor_runner <remote_path>
adb push <model_name>.pte <remote_path>
```

Make sure to replace `<model_name>` with the actual name of your model file, and `<remote_path>` with the desired destination on the device.

##### Note: For oss models, please push the following additional files to your Android device:
```bash
adb push mtk_oss_executor_runner <remote_path>
adb push input_list.txt <remote_path>
for i in input*bin; do adb push "$i" <remote_path>; done;
```

### Executing the Model

Execute the model on your Android device by running:

```bash
adb shell "/data/local/tmp/mtk_executor_runner --model_path /data/local/tmp/<model_name>.pte --iteration <iterations>"
```

In the command above, replace `<model_name>` with the name of your model file and `<iterations>` with the desired number of iterations to run the model.

##### Note: For llama models, please use `mtk_llama_executor_runner`. Refer to `examples/mediatek/executor_runner/run_llama3_sample.sh` for reference.
##### Note: For oss models, please use `mtk_oss_executor_runner`.
```bash
adb shell "/data/local/tmp/mtk_oss_executor_runner --model_path /data/local/tmp/<model_name>.pte --input_list /data/local/tmp/input_list.txt --output_folder /data/local/tmp/output_<model_name>"
adb pull "/data/local/tmp/output_<model_name>" ./
```

### Check oss result on PC
```bash
python3 eval_utils/eval_oss_result.py --eval_type <eval_type> --target_f <target_folder> --output_f <output_folder>
```
For example:
```
python3 eval_utils/eval_oss_result.py --eval_type piq --target_f edsr --output_f output_edsr
```
- Argument Options:
    - `eval_type`: topk/piq/segmentation
    - `target_f`: folder containing the golden data files, named `golden_<data_idx>_0.bin`
    - `output_f`: folder containing the model output data files, named `output_<data_idx>_0.bin`
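Putting the runtime steps together, a hypothetical end-to-end run for the `edsr` oss model might look like the sketch below. The `/data/local/tmp` destination, the `edsr.pte` file name, and the presence of `input_list.txt` and `input*.bin` files produced by the export step are assumptions; adjust them to your setup.

```bash
# Hypothetical end-to-end oss run for edsr (paths and file names are assumptions).
adb push mtk_oss_executor_runner /data/local/tmp
adb push edsr.pte /data/local/tmp
adb push input_list.txt /data/local/tmp
for i in input*bin; do adb push "$i" /data/local/tmp; done

# Run the model on the device and collect the outputs.
adb shell "/data/local/tmp/mtk_oss_executor_runner --model_path /data/local/tmp/edsr.pte --input_list /data/local/tmp/input_list.txt --output_folder /data/local/tmp/output_edsr"
adb pull "/data/local/tmp/output_edsr" ./

# Evaluate the pulled outputs against the golden data on the host.
python3 eval_utils/eval_oss_result.py --eval_type piq --target_f edsr --output_f output_edsr
```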