# Using MindSpore Lite for Offline Model Conversion and Inference

## Basic Concepts

- MindSpore Lite: a built-in AI inference engine of OpenHarmony that provides inference deployment for deep learning models.

- Neural Network Runtime (NNRt): a bridge that connects the upper-layer AI inference framework to the bottom-layer acceleration chip to implement cross-chip inference and computing of AI models.

- Offline model: a model obtained using the offline model conversion tool of the AI hardware vendor. The hardware vendor is responsible for parsing and inference of AI models.

## When to Use

The common process for MindSpore Lite AI model deployment is as follows:
- Use the MindSpore Lite model conversion tool to convert third-party models (such as ONNX and Caffe models) to `.ms` models.
- Call APIs of the MindSpore Lite inference engine to perform model inference. By specifying NNRt as the inference device, you can use the AI hardware in the system to accelerate inference.

When MindSpore Lite + NNRt inference is used, the computational graph is constructed dynamically during initialization, which adds a delay to model loading.

If you need to reduce the loading delay to meet the requirements of the deployment scenario, you can use offline model-based inference as an alternative. The procedure is as follows:
- Use the offline model conversion tool provided by the AI hardware vendor to compile an offline model in advance.
- Use the MindSpore Lite conversion tool to encapsulate the offline model, as a black box, into an `.ms` model.
- Pass the `.ms` model to MindSpore Lite for inference.

During inference, MindSpore Lite sends the offline model directly to the AI hardware connected to NNRt. The model can therefore be loaded without online graph construction, which greatly reduces the model loading delay. In addition, MindSpore Lite can provide additional hardware-specific information to assist the AI hardware in model inference.

The following sections describe the offline model conversion and inference process in detail.

## Constraints

- Offline model inference is supported only on the NNRt backend. The AI hardware must be connected to NNRt and must support offline model inference.

## Offline Model Conversion


### 1. Building the MindSpore Lite Release Package

Obtain the [MindSpore Lite source code](https://gitee.com/openharmony/third_party_mindspore). The source code is managed in "compressed package + patch" mode. Run the following commands to decompress the source code package and apply the patches:
```bash
cd mindspore
python3 build_helper.py --in_zip_path=./mindspore-v1.8.1.zip --patch_dir=./patches/ --out_src_path=./mindspore-src
```
If the commands succeed, the complete MindSpore Lite source code is generated in `mindspore-src/source/`.

Run the following commands to start the build:
```bash
cd mindspore-src/source/
bash build.sh -I x86_64 -j 8
```

After the build is complete, you can obtain the MindSpore Lite release package from the `output/` directory in the root directory of the source code.


### 2. Writing the Extended Configuration File of the Conversion Tool

The offline model is a black box, so the conversion tool cannot parse it to obtain its input and output tensor information. You therefore need to manually configure the tensor information in the extended configuration file of the conversion tool. Based on the extended configuration, the conversion tool can then generate the `.ms` model file that encapsulates the offline model.

An example of the extended configuration is as follows:
- `[third_party_model]` in the first line is a fixed keyword that marks the offline model configuration section.
- The subsequent lines specify the name, data type, shape, and memory format of the input and output tensors of the model. Each field occupies one line and is expressed as a key-value pair. The fields can appear in any order.
- Among the fields, the data type and shape are mandatory. The other fields are optional.
- Extended parameters are also provided. They are used to encapsulate custom configuration of the offline model into the `.ms` file in the key-value pair format. The `.ms` file is passed to the AI hardware by NNRt during inference.

```text
[third_party_model]
input_names=in_0;in_1
input_dtypes=float32;float32
input_shapes=8,256,256;8,256,256,3
input_formats=NCHW;NCHW
output_names=out_0
output_dtypes=float32
output_shapes=8,64
output_formats=NCHW
extended_parameters=key_foo:value_foo;key_bar:value_bar
```

The related fields are described as follows:

- `input_names` (optional): model input name, which is in the string format. If multiple names are specified, use a semicolon (`;`) to separate them.
- `input_dtypes` (mandatory): model input data type, which is in the type format. If multiple data types are specified, use a semicolon (`;`) to separate them.
- `input_shapes` (mandatory): model input shape, which is in the integer array format. If multiple input shapes are specified, use a semicolon (`;`) to separate them.
- `input_formats` (optional): model input memory format, which is in the string format. If multiple formats are specified, use a semicolon (`;`) to separate them. The default value is `NHWC`.
- `output_names` (optional): model output name, which is in the string format. If multiple names are specified, use a semicolon (`;`) to separate them.
- `output_dtypes` (mandatory): model output data type, which is in the type format. If multiple data types are specified, use a semicolon (`;`) to separate them.
- `output_shapes` (mandatory): model output shape, which is in the integer array format. If multiple output shapes are specified, use a semicolon (`;`) to separate them.
- `output_formats` (optional): model output memory format, which is in the string format. If multiple formats are specified, use a semicolon (`;`) to separate them. The default value is `NHWC`.
- `extended_parameters` (optional): custom configuration of the inference hardware, which is in the key-value pair format. It is passed to the AI hardware through the NNRt backend during inference.
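
Because the tensor information is typed by hand, it can be worth checking the file before running the conversion tool. The following is a minimal Python sketch of such a check, not part of MindSpore Lite. The helper names (`split_list`, `is_int`, `check_extended_config`) are hypothetical. The rule that every per-tensor field lists the same number of entries is an assumption inferred from the field descriptions above, and the conversion tool may apply different or stricter checks.

```python
import configparser

SECTION = "third_party_model"
MANDATORY = ("input_dtypes", "input_shapes", "output_dtypes", "output_shapes")


def split_list(value):
    """Split a semicolon-separated field into its non-empty entries."""
    return [item.strip() for item in value.split(";") if item.strip()]


def is_int(text):
    """Return True if text parses as an integer."""
    try:
        int(text)
        return True
    except ValueError:
        return False


def check_extended_config(text):
    """Return a list of problems found in the extended configuration text."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    try:
        parser.read_string(text)
    except configparser.Error as err:
        return [f"not a valid configuration: {err}"]
    if not parser.has_section(SECTION):
        return [f"missing [{SECTION}] section"]
    section = parser[SECTION]
    problems = [f"missing mandatory field: {key}"
                for key in MANDATORY if key not in section]
    for side in ("input", "output"):
        # Assumption: each per-tensor field that is present has one entry per tensor.
        counts = {}
        for suffix in ("names", "dtypes", "shapes", "formats"):
            key = f"{side}_{suffix}"
            if key in section:
                counts[key] = len(split_list(section[key]))
        if len(set(counts.values())) > 1:
            problems.append(f"{side} fields disagree on tensor count: {counts}")
        # Each shape is a comma-separated list of integers.
        for shape in split_list(section.get(f"{side}_shapes", "")):
            if not all(is_int(dim) for dim in shape.split(",")):
                problems.append(f"non-integer dimension in {side} shape: {shape}")
    return problems


EXAMPLE = """\
[third_party_model]
input_names=in_0;in_1
input_dtypes=float32;float32
input_shapes=8,256,256;8,256,256,3
input_formats=NCHW;NCHW
output_names=out_0
output_dtypes=float32
output_shapes=8,64
output_formats=NCHW
extended_parameters=key_foo:value_foo;key_bar:value_bar
"""

print(check_extended_config(EXAMPLE))  # an empty list means no problems were found
```

The sketch only checks the structure of the file. It does not validate data type names, memory format names, or the content of `extended_parameters`, because the accepted values depend on the conversion tool and the AI hardware.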

### 3. Converting an Offline Model

Decompress the MindSpore Lite release package obtained in step 1. Go to the directory where the conversion tool is located (that is, `tools/converter/converter/`), and run the following commands:

```bash
export LD_LIBRARY_PATH=${PWD}/../lib
./converter_lite --fmk=THIRDPARTY --modelFile=/path/to/your_model --configFile=/path/to/your_config --outputFile=/path/to/output_model
```
The offline model conversion is complete.

The related parameters are described as follows:
- `--fmk`: original format of the input model. `THIRDPARTY` indicates an offline model.
- `--modelFile`: path of the input model.
- `--configFile`: path of the extended configuration file. The file is used to configure offline model information.
- `--outputFile`: path of the output model. You do not need to add the file name extension. The `.ms` suffix is generated automatically.

> **NOTE**
>
> If `fmk` is set to `THIRDPARTY`, offline model conversion is performed. In this case, only the preceding four parameters and the extended configuration file take effect.
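
If the conversion is run from a script, for example to convert several offline models in a batch, the command above can be assembled programmatically. The following Python sketch shows one possible wrapper. The function names `build_converter_command` and `convert` are hypothetical, and the sketch assumes the release package layout described in this step, with `converter_lite` in the converter directory and the shared libraries in the sibling `lib` directory. It uses only the four parameters listed above.

```python
import os
import subprocess
from pathlib import Path


def build_converter_command(converter_dir, model_file, config_file, output_file):
    """Return (argv, env) for one offline model conversion run."""
    converter_dir = Path(converter_dir).resolve()
    argv = [
        str(converter_dir / "converter_lite"),
        "--fmk=THIRDPARTY",  # THIRDPARTY selects offline model conversion
        f"--modelFile={model_file}",
        f"--configFile={config_file}",
        f"--outputFile={output_file}",  # no extension; .ms is appended by the tool
    ]
    env = dict(os.environ)
    # Same effect as: export LD_LIBRARY_PATH=${PWD}/../lib
    env["LD_LIBRARY_PATH"] = str(converter_dir.parent / "lib")
    return argv, env


def convert(converter_dir, model_file, config_file, output_file):
    """Run the conversion and return the expected path of the .ms model."""
    argv, env = build_converter_command(
        converter_dir, model_file, config_file, output_file)
    subprocess.run(argv, env=env, check=True)  # raises CalledProcessError on failure
    return Path(f"{output_file}.ms")


# Show the command without running it.
argv, _ = build_converter_command(
    "tools/converter/converter",
    "/path/to/your_model", "/path/to/your_config", "/path/to/output_model")
print(" ".join(argv))
```

Building the argument list separately from running it makes the command easy to log or inspect before anything is executed.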

## Offline Model Inference

Offline model inference follows the same process as common MindSpore Lite model inference, except that only NNRt devices can be added to the inference context.

For details about the MindSpore Lite model inference process, see [Using MindSpore Lite for Model Inference](./mindspore-lite-guidelines.md).