# Using MindSpore Lite for Model Conversion

## Basic Concepts

- MindSpore Lite: a built-in AI inference engine of OpenHarmony that provides inference deployment for deep learning models.

- Neural Network Runtime (NNRt): a bridge that connects the upper-layer AI inference framework to the bottom-layer acceleration chip to implement cross-chip inference and computing of AI models.

- Common neural network models: network models commonly used for AI applications, including MindSpore, ONNX, TensorFlow, and CAFFE models.

- Offline models: network models obtained using the offline model conversion tool of the AI hardware vendor. The hardware vendor is responsible for parsing and inference of offline models.

## When to Use

The deployment process is as follows:
1. Use the MindSpore Lite model conversion tool to convert the original model (for example, ONNX or CAFFE) to a .ms model file. You can check the supported operators against the [MindSpore Lite Kit operator list](mindspore-lite-supported-operators.md) to ensure that the model conversion is successful.
2. Call APIs of the MindSpore Lite inference engine to perform [model inference](mindspore-lite-guidelines.md).

## Obtaining the Model Conversion Tool

You can obtain the MindSpore Lite model conversion tool in either of the following ways:

### Download

| Component | Hardware Platform| OS | URL | SHA-256 |
| ------------------------------------------------------- | -------- | ------------ | ------------------------------------------------------------ | ------------------------------------------------------------ |
| On-device inference and training benchmark tool, converter tool, and cropper tool| CPU | Linux-x86_64 | [mindspore-lite-2.3.0-linux-x64.tar.gz](https://ms-release.obs.cn-north-4.myhuaweicloud.com/2.3.0/MindSpore/lite/release/linux/x86_64/mindspore-lite-2.3.0-linux-x64.tar.gz) | 060d698a171b52c38b64c8d65927816daf4b81d8e2b5069718aeb91a9f8a154c |

### Source Code Building

> **NOTE**
>
> - The build option that supports PyTorch model conversion is disabled by default. Therefore, the downloaded installation package does not support PyTorch model conversion; a conversion tool with PyTorch support can be obtained only through source code building.
>
> - If the transpose and convolution operators are fused in the model, you need to obtain the conversion tool through source code building. Otherwise, a warning similar to the following may be displayed: node infer shape failed, node is Default/Conv2DFusion-xxx.
>
> - If the NPU backend is used for inference, you need to determine whether to [disable the fusion of clip operators](#disabling-the-fusion-of-specified-operators) and obtain the model conversion tool through source code building. Otherwise, an error similar to the following may be reported: BuildKirinNPUModel# Create full model kernel failed.

1. The environment requirements are as follows:

   - System environment: Linux x86_64 (Ubuntu 18.04.02 LTS recommended)
   - C++ build dependencies:
     - GCC >= 7.3.0
     - CMake >= 3.18.3
     - Git >= 2.28.0

2. Obtain the [MindSpore Lite source code](https://gitee.com/openharmony/third_party_mindspore).

   The complete source code of MindSpore Lite is available at `mindspore-src/source/`.

3. Start building.

   To obtain a conversion tool that supports PyTorch model conversion, run `export MSLITE_ENABLE_CONVERT_PYTORCH_MODEL=on && export LIB_TORCH_PATH="/home/user/libtorch"` before you start building.
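
   For clarity, the same two build-time variables can also be set on separate lines before running the build script. This is only a restatement of the command above; the libtorch path is the example path used in this section:

   ```bash
   # Enable PyTorch model conversion support in the converter build
   export MSLITE_ENABLE_CONVERT_PYTORCH_MODEL=on
   # Path to the decompressed libtorch package (example path from this section)
   export LIB_TORCH_PATH="/home/user/libtorch"
   ```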

   In addition, add the libtorch library path to `LD_LIBRARY_PATH` before conversion: `export LD_LIBRARY_PATH="/home/user/libtorch/lib:${LD_LIBRARY_PATH}"`. You can download the libtorch package of the CPU version and decompress it to `/home/user/libtorch`.

   ```bash
   cd mindspore-src/source/
   bash build.sh -I x86_64 -j 8
   ```

   After the building is complete, you can obtain the MindSpore Lite release package from `output/` in the root directory of the source code. The conversion tool is available at `tools/converter/converter/` after the package is decompressed.

## Configuring Environment Variables

After obtaining the model conversion tool, add the directory of the dynamic libraries required by the conversion tool to the `LD_LIBRARY_PATH` environment variable.

```bash
export LD_LIBRARY_PATH=${PACKAGE_ROOT_PATH}/tools/converter/lib:${LD_LIBRARY_PATH}
```

`${PACKAGE_ROOT_PATH}` indicates the path where the MindSpore Lite release package is decompressed.

## Parameter Description

The MindSpore Lite model conversion tool provides multiple parameters. You can use them as required. In addition, you can run `./converter_lite --help` to view the help information at any time.
The parameters are described in detail as follows.

| Name | Mandatory | Description | Value Range |
| :----------------: | ------------------- | ------------------------------------------------------------ | ------------------------------------------------ |
| --help | No | Displays all help information. | - |
| --fmk | Yes | Original format of the input model. This parameter can be set to **MSLITE** only when the MS model is converted to micro code.| MINDIR, CAFFE, TFLITE, TF, ONNX, PYTORCH, or MSLITE|
| --modelFile | Yes | Path of the input model. | - |
| --outputFile | Yes | Path of the output model. You do not need to add an extension because the extension `.ms` is automatically generated. | - |
| --weightFile | Yes for CAFFE model conversion| Path of the model weight file. | - |
| --configFile | No | 1) Path of the post-training quantization configuration file. 2) Path of the extended function configuration file.| - |
| --fp16 | No | Whether to store the weights of float32 data as float16 data during model serialization.<br>The default value is **off**.| on or off |
| --inputShape | No | Input dimensions of the model. Make sure that the sequence of the input dimensions is the same as that of the original model. The model structure can be further optimized for some specific models, but the dynamic shape feature will be unavailable for the converted model. Separate each input name and shape by a colon (:), and separate each pair of input name and shape by a semicolon (;). In addition, enclose them with double quotation marks (""). For example, set this parameter to **"inTensorName_1:1,32,32,4;inTensorName_2:1,64,64,4;"**.| - |
| --inputDataFormat | No | Input format of the exported model. This parameter is valid only for 4D input.<br>The default value is **NHWC**.| NHWC or NCHW |
| --inputDataType | No | Data type of the input tensor of the quantization model. This parameter is valid only when the quantization parameters (**scale** and **zero point**) are configured for the input tensor. The data type is the same as that of the input tensor of the original model by default.<br>The default value is **DEFAULT**.| FLOAT32, INT8, UINT8, or DEFAULT |
| --outputDataType | No | Data type of the output tensor of the quantization model. This parameter is valid only when the quantization parameters (**scale** and **zero point**) are configured for the output tensor. The data type is the same as that of the output tensor of the original model by default.<br>The default value is **DEFAULT**.| FLOAT32, INT8, UINT8, or DEFAULT |
| --outputDataFormat | No | Output format of the exported model. This parameter is valid only for 4D input. | NHWC or NCHW |

> **NOTE**
> - The parameter name and value are separated by an equal sign (=), and no space is allowed between them.
> - Generally, a CAFFE model has two files: the model structure `*.prototxt`, which corresponds to the `--modelFile` parameter, and the model weights `*.caffemodel`, which correspond to the `--weightFile` parameter.

## Example

The following conversion command uses the CAFFE model LeNet as an example.

```bash
./converter_lite --fmk=CAFFE --modelFile=lenet.prototxt --weightFile=lenet.caffemodel --outputFile=lenet
```

Because a CAFFE model is used in this example, two input files are specified: the model structure and the model weights. The other mandatory parameters, the fmk type and the output path, are also provided.
The command output is as follows:

```bash
CONVERT RESULT SUCCESS:0
```

This indicates that the CAFFE model is successfully converted to a MindSpore Lite model. A new file named **lenet.ms** is generated in the specified path.

## (Optional) Offline Model Conversion

If you need to reduce the model loading delay in your deployment scenario, you can use offline model-based inference as an alternative. Its working principle is as follows:

During inference, MindSpore Lite directly sends the offline model to the AI hardware connected to NNRt. This way, the model can be loaded without online graph construction, which greatly reduces the model loading delay. In addition, MindSpore Lite can provide additional hardware-specific information to assist the AI hardware in model inference.

### Constraints

- Offline model inference can be implemented only at the NNRt backend. The AI hardware must connect to NNRt and support offline model inference.
- The offline model conversion tool can be obtained only through source code building.
- During offline model conversion, `fmk` must be set to `THIRDPARTY`.
- The offline model comes as a black box and cannot be directly parsed by the conversion tool to obtain its input and output tensor information. Therefore, you need to manually configure the tensor information in the extended configuration file of the conversion tool.

### Description of the Extended Configuration File

An example of the extended configuration is as follows:
- `[third_party_model]` in the first line is a fixed keyword that indicates the section of the offline model configuration.
- The following lines describe the names, data types, shapes, and memory formats of the input and output tensors of the model. Each field occupies one line and is expressed in the key-value pair format. The sequence of fields is not limited.
- Among the fields, the data type and shape are mandatory, and the other parameters are optional.
- Extended parameters are also provided. They are used to encapsulate custom configuration of the offline model into the .ms file in the key-value pair format. The .ms file is passed to the AI hardware by NNRt during inference.

```text
[third_party_model]
input_names=in_0;in_1
input_dtypes=float32;float32
input_shapes=8,256,256;8,256,256,3
input_formats=NCHW;NCHW
output_names=out_0
output_dtypes=float32
output_shapes=8,64
output_formats=NCHW
extended_parameters=key_foo:value_foo;key_bar:value_bar
```

Field description:

- `input_names` (optional): model input names, in the string format. If multiple names are specified, use a semicolon (;) to separate them.
- `input_dtypes` (mandatory): model input data types, in the type format. If multiple data types are specified, use a semicolon (;) to separate them.
- `input_shapes` (mandatory): model input shapes, in the integer array format. If multiple input shapes are specified, use a semicolon (;) to separate them.
- `input_formats` (optional): model input memory formats, in the string format. If multiple formats are specified, use a semicolon (;) to separate them. The default value is NHWC.
- `output_names` (optional): model output names, in the string format. If multiple names are specified, use a semicolon (;) to separate them.
- `output_dtypes` (mandatory): model output data types, in the type format. If multiple data types are specified, use a semicolon (;) to separate them.
- `output_shapes` (mandatory): model output shapes, in the integer array format. If multiple output shapes are specified, use a semicolon (;) to separate them.
- `output_formats` (optional): model output memory formats, in the string format. If multiple formats are specified, use a semicolon (;) to separate them. The default value is NHWC.
- `extended_parameters` (optional): custom configuration of the inference hardware, in the key-value pair format. It is passed to the AI hardware through the NNRt backend during inference.

## Appendix

### Disabling the Fusion of Specified Operators

If you need to disable the fusion of specified operators, create a configuration file, for example, **converter.cfg**, and configure the file content as follows:

```ini
[registry]
# If disable_fusion is set to off, you can configure fusion_blacklists to disable the fusion of specified operators. If disable_fusion is set to on, operator fusion is disabled entirely, and fusion_blacklists does not take effect. The default value of disable_fusion is off.
disable_fusion=off
# To disable the fusion of multiple operators, separate the operators with commas (,).
fusion_blacklists=ConvActivationFusion,MatMulActivationFusion,clip_convert_activation_pass
```

When running the converter, set **configFile** to **converter.cfg**.
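
For example, to apply this configuration to the LeNet conversion shown earlier (the model file names are reused from that example, and **converter.cfg** is assumed to be in the current directory), the command might look as follows:

```bash
./converter_lite --fmk=CAFFE --modelFile=lenet.prototxt --weightFile=lenet.caffemodel --outputFile=lenet --configFile=converter.cfg
```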

The following lists the operators for which fusion can be disabled:

- AddConcatActivationFusion
- SqueezeFusion
- TransposeFusion
- ReshapeReshapeFusion
- ConvBiasaddFusion
- ConvBatchNormFusion
- ConvScaleFusion
- GroupNormFusion
- TfNormFusion
- OnnxLayerNormFusion
- OnnxLayerNormFusion2
- BatchMatMulFusion
- BatchNormToScaleFusion
- SigmoidMulFusion
- ActivationFusion
- ConvActivationFusion
- ConvTupleGetItemFusion
- ConvTupleActivationFusion
- TfliteLstmCellFusion
- TfLstmCellFusion
- TfBidirectionGruFusion
- TfGeLUFusion
- OnnxGeLUFusion
- TfliteRelPosMultiHeadAttentionFusion
- GLUFusion
- ConstFoldPass
- AffineFusion
- AffineActivationFusion
- ConvConvFusion
- ConvPadFusion
- MatMulAddFusion
- MatMulMulFusion
- TransposeMatMulFusion
- MulAddFusion
- ScaleActivationFusion
- ScaleScaleFusion
- FullConnectedFusion
- FullconnectedAddFusion
- TensorDotFusion
- MatMulActivationFusion
- clip_convert_activation_pass