# Using MindSpore Lite for Model Conversion

## Basic Concepts

- MindSpore Lite: a built-in AI inference engine of OpenHarmony that provides inference deployment for deep learning models.

- Neural Network Runtime (NNRt): a bridge that connects the upper-layer AI inference framework to the bottom-layer acceleration chip to implement cross-chip inference and computing of AI models.

- Common neural network models: network models commonly used for AI applications, including MindSpore, ONNX, TensorFlow, and CAFFE.

- Offline models: network models obtained using the offline model conversion tool of the AI hardware vendor. The hardware vendor is responsible for parsing and inference of offline models.

## When to Use

The deployment process is as follows:
1. Use the MindSpore Lite model conversion tool to convert the original model (for example, ONNX or CAFFE) to a .ms model file. You can check the supported operators against the [MindSpore Lite Kit operator list](mindspore-lite-supported-operators.md) to ensure that the model conversion is successful.
2. Call APIs of the MindSpore Lite inference engine to perform [model inference](mindspore-lite-guidelines.md).

## Obtaining the Model Conversion Tool

You can obtain the MindSpore Lite model conversion tool in either of the following ways:

### Download

| Component                                                   | Hardware Platform| OS    | URL                                                        | SHA-256                                                      |
| ------------------------------------------------------- | -------- | ------------ | ------------------------------------------------------------ | ------------------------------------------------------------ |
| On-device inference and training benchmark tool, converter tool, and cropper tool| CPU      | Linux-x86_64 | [mindspore-lite-2.3.0-linux-x64.tar.gz](https://ms-release.obs.cn-north-4.myhuaweicloud.com/2.3.0/MindSpore/lite/release/linux/x86_64/mindspore-lite-2.3.0-linux-x64.tar.gz) | 060d698a171b52c38b64c8d65927816daf4b81d8e2b5069718aeb91a9f8a154c |
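
You can verify the integrity of the downloaded package against the SHA-256 value in the table with the standard `sha256sum` utility, for example:

```bash
# Check the downloaded archive against the published SHA-256 checksum.
echo "060d698a171b52c38b64c8d65927816daf4b81d8e2b5069718aeb91a9f8a154c  mindspore-lite-2.3.0-linux-x64.tar.gz" | sha256sum -c -
# Expected output: mindspore-lite-2.3.0-linux-x64.tar.gz: OK
```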

### Source Code Building

> **NOTE**
>
> - The build option that supports PyTorch model conversion is disabled by default. Therefore, the downloaded installation package does not support PyTorch model conversion; a conversion tool with PyTorch support can be obtained only through source code building.
>
> - If the transpose and convolution operators are fused in the model, you need to obtain the model conversion tool through source code building. Otherwise, a warning similar to the following may be displayed: node infer shape failed, node is Default/Conv2DFusion-xxx.
>
> - If the NPU backend is used for inference, you need to determine whether to [disable the fusion of clip operators](#disabling-the-fusion-of-specified-operators) and obtain the model conversion tool through source code building. Otherwise, an error similar to the following may be reported: BuildKirinNPUModel# Create full model kernel failed.

1. The environment requirements are as follows:

   - System environment: Linux x86_64 (Ubuntu 18.04.02 LTS recommended)
   - C++ build dependencies:
     - GCC >= 7.3.0
     - CMake >= 3.18.3
     - Git >= 2.28.0

2. Obtain the [MindSpore Lite source code](https://gitee.com/openharmony/third_party_mindspore).
   The complete source code of MindSpore Lite is available at `mindspore-src/source/`.

3. Start building.

   To obtain a conversion tool that supports PyTorch model conversion, run `export MSLITE_ENABLE_CONVERT_PYTORCH_MODEL=on && export LIB_TORCH_PATH="/home/user/libtorch"` before building, and add the libtorch environment variable `export LD_LIBRARY_PATH="/home/user/libtorch/lib:${LD_LIBRARY_PATH}"` before running the conversion. You can download the libtorch package of the CPU version and decompress it to `/home/user/libtorch`. (A consolidated example is shown after this procedure.)

   ```bash
   cd mindspore-src/source/
   bash build.sh -I x86_64 -j 8
   ```

   After the building is complete, you can obtain the MindSpore Lite release package from `output/` in the root directory of the source code. The conversion tool is available at `tools/converter/converter/` after decompression.
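
For convenience, the following sketch consolidates the commands above for a build with PyTorch model conversion enabled. It assumes the CPU version of libtorch has already been decompressed to `/home/user/libtorch`.

```bash
# Enable PyTorch model conversion and point the build to the libtorch package.
export MSLITE_ENABLE_CONVERT_PYTORCH_MODEL=on
export LIB_TORCH_PATH="/home/user/libtorch"

# Build the x86_64 release package with 8 parallel jobs.
cd mindspore-src/source/
bash build.sh -I x86_64 -j 8

# Make the libtorch libraries visible before running the conversion later.
export LD_LIBRARY_PATH="/home/user/libtorch/lib:${LD_LIBRARY_PATH}"
```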

## Configuring Environment Variables

After obtaining the model conversion tool, you need to add the directory containing the dynamic libraries required by the conversion tool to the `LD_LIBRARY_PATH` environment variable.

```bash
export LD_LIBRARY_PATH=${PACKAGE_ROOT_PATH}/tools/converter/lib:${LD_LIBRARY_PATH}
```

`${PACKAGE_ROOT_PATH}` indicates the path where the MindSpore Lite release package is decompressed.
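
As a quick check that the environment is set up correctly, you can run the help command of the converter. The package path below is a placeholder; replace it with your actual decompression directory.

```bash
# PACKAGE_ROOT_PATH is a placeholder for the decompressed release package directory.
export PACKAGE_ROOT_PATH=/path/to/mindspore-lite-2.3.0-linux-x64
export LD_LIBRARY_PATH=${PACKAGE_ROOT_PATH}/tools/converter/lib:${LD_LIBRARY_PATH}

# If the dynamic libraries are found, the converter prints its help information.
cd ${PACKAGE_ROOT_PATH}/tools/converter/converter/
./converter_lite --help
```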

## Parameter Description

The MindSpore Lite model conversion tool provides multiple parameter settings. You can use them as required. In addition, you can run `./converter_lite --help` to obtain the help information in real time.
The following describes the parameters in detail.

|        Name       | Mandatory           | Description                                                    | Value Range                                        |
| :----------------: | ------------------- | ------------------------------------------------------------ | ------------------------------------------------ |
|       --help       | No                 | Displays all help information.                                          | -                                                |
|       --fmk        | Yes                 | Original format of the input model. This parameter can be set to **MSLITE** only when the MS model is converted to micro code.| MINDIR, CAFFE, TFLITE, TF, ONNX, PYTORCH, or MSLITE|
|    --modelFile     | Yes                 | Path of the input model.                                            | -                                                |
|    --outputFile    | Yes                 | Path of the output model. You do not need to add an extension because the extension `.ms` is automatically generated.           | -                                                |
|    --weightFile    | Yes for CAFFE model conversion| Path of the model weight file.                                    | -                                                |
|    --configFile    | No                 | 1) Path of the configuration file for post-training quantization. 2) Path of the extended function configuration file.| -                                                |
|       --fp16       | No                 | Whether to store the weights of float32 data as float16 data during model serialization.<br>The default value is **off**.| on or off                                         |
|    --inputShape    | No                 | Input dimensions of the model. Make sure that the sequence of the input dimensions is the same as that of the original model. The model structure can be further optimized for some specific models, but the dynamic shape feature will be unavailable for the converted model. Separate each input name and its shape with a colon (:), and separate pairs with a semicolon (;). In addition, enclose the whole value in double quotation marks (""). For example, set this parameter to **"inTensorName_1:1,32,32,4;inTensorName_2:1,64,64,4;"**.| -                                                |
| --inputDataFormat  | No                 | Input format of the exported model. This parameter is valid only for 4D input.<br>The default value is **NHWC**.| NHWC or NCHW                                      |
|  --inputDataType   | No                 | Data type of the input tensor of the quantization model. This parameter is valid only when the quantization parameters (**scale** and **zero point**) are configured for the input tensor. The data type is the same as that of the input tensor of the original model by default.<br>The default value is **DEFAULT**.| FLOAT32, INT8, UINT8, or DEFAULT                   |
|  --outputDataType  | No                 | Data type of the output tensor of the quantization model. This parameter is valid only when the quantization parameters (**scale** and **zero point**) are configured for the output tensor. The data type is the same as that of the output tensor of the original model by default.<br>The default value is **DEFAULT**.| FLOAT32, INT8, UINT8, or DEFAULT                   |
| --outputDataFormat | No                 | Output format of the exported model. This parameter is valid only for 4D input.                | NHWC or NCHW                                      |

> **NOTE**
> - The parameter name and value are separated by an equal sign (=), and no space is allowed between them.
> - Generally, a CAFFE model has two files: the model structure `*.prototxt`, which corresponds to the `--modelFile` parameter, and the model weight `*.caffemodel`, which corresponds to the `--weightFile` parameter.

## Example

The following conversion command uses the CAFFE model LeNet as an example.

```bash
./converter_lite --fmk=CAFFE --modelFile=lenet.prototxt --weightFile=lenet.caffemodel --outputFile=lenet
```

Because a CAFFE model is used, two input files are specified: the model structure and the model weight. The other mandatory parameters, the fmk type and the output path, are also set.
The command output is as follows:

```bash
CONVERT RESULT SUCCESS:0
```

This indicates that the CAFFE model is successfully converted to the MindSpore Lite model. A new file named **lenet.ms** is generated in the specified path.
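
As a further illustration of the optional parameters, the following hypothetical command converts an ONNX model while fixing the input shape and storing the weights as float16. The model file name and input tensor name are placeholders.

```bash
# Hypothetical ONNX conversion with a fixed input shape and float16 weight storage.
./converter_lite --fmk=ONNX --modelFile=model.onnx --outputFile=model --inputShape="input:1,3,224,224" --fp16=on
```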

## (Optional) Offline Model Conversion

If you want to reduce the loading delay to meet the requirements of the deployment scenario, you can use offline model-based inference as an alternative. It works as follows:

During inference, MindSpore Lite directly sends the offline model to the AI hardware connected to NNRt. This way, the model can be loaded without the need for online graph construction, greatly reducing the model loading delay. In addition, MindSpore Lite can provide additional hardware-specific information to assist the AI hardware in model inference.

### Constraints

- Offline model inference can be implemented only at the NNRt backend. The AI hardware needs to connect to NNRt and support offline model inference.
- The offline model conversion tool can be obtained only through source code building.
- During offline model conversion, `--fmk` must be set to `THIRDPARTY`.
- The offline model comes as a black box and cannot be directly parsed by the conversion tool to obtain its input and output tensor information. Therefore, you need to manually configure the tensor information in the extended configuration file of the conversion tool.

### Description of the Extended Configuration File

An example of the extended configuration file is shown below. Note the following:

- `[third_party_model]` in the first line is a fixed keyword that indicates the section of the offline model configuration.
- The following lines specify the name, data type, shape, and memory format of the input and output tensors of the model. Each field occupies a line and is expressed in the key-value pair format. The sequence of fields is not limited.
- Among the fields, the data type and shape are mandatory; the other fields are optional.
- Extended parameters are also provided. They are used to encapsulate custom configuration of the offline model into the .ms file in the key-value pair format. The configuration is passed to the AI hardware by NNRt during inference.

```text
[third_party_model]
input_names=in_0;in_1
input_dtypes=float32;float32
input_shapes=8,256,256;8,256,256,3
input_formats=NCHW;NCHW
output_names=out_0
output_dtypes=float32
output_shapes=8,64
output_formats=NCHW
extended_parameters=key_foo:value_foo;key_bar:value_bar
```

Field description:

- `input_names` (optional): model input name, which is in the string format. If multiple names are specified, use a semicolon (;) to separate them.
- `input_dtypes` (mandatory): model input data type, which is in the type format. If multiple data types are specified, use a semicolon (;) to separate them.
- `input_shapes` (mandatory): model input shape, which is in the integer array format. If multiple input shapes are specified, use a semicolon (;) to separate them.
- `input_formats` (optional): model input memory format, which is in the string format. If multiple formats are specified, use a semicolon (;) to separate them. The default value is NHWC.
- `output_names` (optional): model output name, which is in the string format. If multiple names are specified, use a semicolon (;) to separate them.
- `output_dtypes` (mandatory): model output data type, which is in the type format. If multiple data types are specified, use a semicolon (;) to separate them.
- `output_shapes` (mandatory): model output shape, which is in the integer array format. If multiple output shapes are specified, use a semicolon (;) to separate them.
- `output_formats` (optional): model output memory format, which is in the string format. If multiple formats are specified, use a semicolon (;) to separate them. The default value is NHWC.
- `extended_parameters` (optional): custom configuration of the inference hardware, which is in the key-value pair format. It is passed to the AI hardware through the NNRt backend during inference.
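
Putting the constraints together, an offline model conversion command could look like the following sketch. The offline model file name and the configuration file name are placeholders, and the extended configuration file is assumed to contain the fields described above.

```bash
# Hypothetical offline model conversion: the offline model is passed through as a black box,
# and the tensor information is supplied by the extended configuration file.
./converter_lite --fmk=THIRDPARTY --modelFile=offline_model.bin --outputFile=offline_model --configFile=third_party_model.cfg
```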

## Appendix

### Disabling the Fusion of Specified Operators

If you need to disable the fusion of specified operators, create a configuration file, for example, **converter.cfg**, and set the file content as follows:

```ini
[registry]
# If disable_fusion is set to off, you can configure fusion_blacklists to disable the fusion of specified operators.
# If disable_fusion is set to on, the fusion of operators is disabled and fusion_blacklists does not take effect.
# The default value of disable_fusion is off.
disable_fusion=off
# To disable the fusion of multiple operators, separate the operators with commas (,).
fusion_blacklists=ConvActivationFusion,MatMulActivationFusion,clip_convert_activation_pass
```

When running the converter, set the **--configFile** parameter to **converter.cfg**.
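
For example, reusing the CAFFE LeNet conversion from the earlier example, the configuration file is passed in as follows:

```bash
# Convert the model with the fusion blacklist applied through the extended configuration file.
./converter_lite --fmk=CAFFE --modelFile=lenet.prototxt --weightFile=lenet.caffemodel --outputFile=lenet --configFile=converter.cfg
```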

The following lists the operators for which fusion can be disabled:

- AddConcatActivationFusion
- SqueezeFusion
- TransposeFusion
- ReshapeReshapeFusion
- ConvBiasaddFusion
- ConvBatchNormFusion
- ConvScaleFusion
- GroupNormFusion
- TfNormFusion
- OnnxLayerNormFusion
- OnnxLayerNormFusion2
- BatchMatMulFusion
- BatchNormToScaleFusion
- SigmoidMulFusion
- ActivationFusion
- ConvActivationFusion
- ConvTupleGetItemFusion
- ConvTupleActivationFusion
- TfliteLstmCellFusion
- TfLstmCellFusion
- TfBidirectionGruFusion
- TfGeLUFusion
- OnnxGeLUFusion
- TfliteRelPosMultiHeadAttentionFusion
- GLUFusion
- ConstFoldPass
- AffineFusion
- AffineActivationFusion
- ConvConvFusion
- ConvPadFusion
- MatMulAddFusion
- MatMulMulFusion
- TransposeMatMulFusion
- MulAddFusion
- ScaleActivationFusion
- ScaleScaleFusion
- FullConnectedFusion
- FullconnectedAddFusion
- TensorDotFusion
- MatMulActivationFusion
- clip_convert_activation_pass