# Using MindSpore Lite for Model Inference

## When to Use

MindSpore Lite is an AI engine that provides AI model inference for different hardware devices. It is used in a wide range of fields, such as image classification, object recognition, facial recognition, and character recognition.

This document describes the general development process for MindSpore Lite model inference.

## Basic Concepts

Before getting started, you need to understand the following basic concepts:

**Tensor**: a special data structure similar to an array or matrix. It is the basic data structure used in MindSpore Lite network operations.

**Float16 inference**: an inference mode in which the Float16 format is used. Float16, also called half-precision, uses 16 bits to represent a number.

## Available APIs

APIs involved in MindSpore Lite model inference are categorized into context APIs, model APIs, and tensor APIs.

### Context APIs

| API       | Description       |
| ------------------ | ----------------- |
|OH_AI_ContextHandle OH_AI_ContextCreate()|Creates a context object.|
|void OH_AI_ContextSetThreadNum(OH_AI_ContextHandle context, int32_t thread_num)|Sets the number of runtime threads.|
|void OH_AI_ContextSetThreadAffinityMode(OH_AI_ContextHandle context, int mode)|Sets the affinity mode for binding runtime threads to CPU cores. Cores are classified into large, medium, and small cores based on the CPU frequency; threads can be bound only to large or medium cores, not to small cores.|
|OH_AI_DeviceInfoHandle OH_AI_DeviceInfoCreate(OH_AI_DeviceType device_type)|Creates a runtime device information object.|
|void OH_AI_ContextDestroy(OH_AI_ContextHandle *context)|Destroys a context object.|
|void OH_AI_DeviceInfoSetEnableFP16(OH_AI_DeviceInfoHandle device_info, bool is_fp16)|Sets whether to enable Float16 inference. This function is available only for CPU and GPU devices.|
|void OH_AI_ContextAddDeviceInfo(OH_AI_ContextHandle context, OH_AI_DeviceInfoHandle device_info)|Adds a runtime device information object to the context.|

### Model APIs

| API       | Description       |
| ------------------ | ----------------- |
|OH_AI_ModelHandle OH_AI_ModelCreate()|Creates a model object.|
|OH_AI_Status OH_AI_ModelBuildFromFile(OH_AI_ModelHandle model, const char *model_path, OH_AI_ModelType model_type, const OH_AI_ContextHandle model_context)|Loads and builds a MindSpore model from a model file.|
|void OH_AI_ModelDestroy(OH_AI_ModelHandle *model)|Destroys a model object.|

### Tensor APIs

| API       | Description       |
| ------------------ | ----------------- |
|OH_AI_TensorHandleArray OH_AI_ModelGetInputs(const OH_AI_ModelHandle model)|Obtains the input tensor array structure of a model.|
|int64_t OH_AI_TensorGetElementNum(const OH_AI_TensorHandle tensor)|Obtains the number of tensor elements.|
|const char *OH_AI_TensorGetName(const OH_AI_TensorHandle tensor)|Obtains the name of a tensor.|
|OH_AI_DataType OH_AI_TensorGetDataType(const OH_AI_TensorHandle tensor)|Obtains the tensor data type.|
|void *OH_AI_TensorGetMutableData(const OH_AI_TensorHandle tensor)|Obtains the pointer to variable tensor data.|

## How to Develop

The following figure shows the development process for MindSpore Lite model inference.

**Figure 1** Development process for MindSpore Lite model inference
![how-to-use-mindspore-lite](figures/01.png)

The development process consists of the following main steps:

1. Prepare the required model.

    The required model can be downloaded directly or obtained using the model conversion tool.

     - If the downloaded model is in the `.ms` format, you can use it directly for inference. The following uses the **mobilenetv2.ms** model as an example.
     - If the downloaded model uses a third-party framework, such as TensorFlow, TensorFlow Lite, Caffe, or ONNX, you can use the [model conversion tool](https://www.mindspore.cn/lite/docs/en/r1.5/use/downloads.html#id1) to convert it to the `.ms` format, as shown in the conversion sketch below.

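    For reference, a conversion command might look like the following. This is a sketch that assumes the `converter_lite` tool from the MindSpore Lite release package and a hypothetical **mobilenetv2.tflite** source model; check the converter documentation for your MindSpore Lite version for the exact options.

    ```shell
    # Convert a TensorFlow Lite model to the .ms format (paths and file names are illustrative).
    ./converter_lite --fmk=TFLITE --modelFile=mobilenetv2.tflite --outputFile=mobilenetv2
    ```
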
2. Create a context, and set parameters such as the number of runtime threads and device type.

    ```c
    // Create a context, and set the number of runtime threads to 2 and the thread affinity mode to 1 (big cores first).
    OH_AI_ContextHandle context = OH_AI_ContextCreate();
    if (context == NULL) {
      printf("OH_AI_ContextCreate failed.\n");
      return OH_AI_STATUS_LITE_ERROR;
    }
    const int thread_num = 2;
    OH_AI_ContextSetThreadNum(context, thread_num);
    OH_AI_ContextSetThreadAffinityMode(context, 1);
    // Set the device type to CPU, and disable Float16 inference.
    OH_AI_DeviceInfoHandle cpu_device_info = OH_AI_DeviceInfoCreate(OH_AI_DEVICETYPE_CPU);
    if (cpu_device_info == NULL) {
      printf("OH_AI_DeviceInfoCreate failed.\n");
      OH_AI_ContextDestroy(&context);
      return OH_AI_STATUS_LITE_ERROR;
    }
    OH_AI_DeviceInfoSetEnableFP16(cpu_device_info, false);
    OH_AI_ContextAddDeviceInfo(context, cpu_device_info);
    ```

3. Create, load, and build the model.

    Call **OH_AI_ModelBuildFromFile** to load and build the model.

    In this example, the **argv[1]** parameter passed to **OH_AI_ModelBuildFromFile** is the model file path.

    ```c
    // Create a model.
    OH_AI_ModelHandle model = OH_AI_ModelCreate();
    if (model == NULL) {
      printf("OH_AI_ModelCreate failed.\n");
      OH_AI_ContextDestroy(&context);
      return OH_AI_STATUS_LITE_ERROR;
    }

    // Load and build the model. The model type is OH_AI_ModelTypeMindIR.
    int ret = OH_AI_ModelBuildFromFile(model, argv[1], OH_AI_ModelTypeMindIR, context);
    if (ret != OH_AI_STATUS_SUCCESS) {
      printf("OH_AI_ModelBuildFromFile failed, ret: %d.\n", ret);
      OH_AI_ModelDestroy(&model);
      return ret;
    }
    ```

4. Input data.

    Before executing model inference, you need to fill the input tensors with data. In this example, the input tensors are populated with random data through a helper function (a sketch is shown after the code).

    ```c
    // Obtain the input tensors.
    OH_AI_TensorHandleArray inputs = OH_AI_ModelGetInputs(model);
    if (inputs.handle_list == NULL) {
      printf("OH_AI_ModelGetInputs failed.\n");
      OH_AI_ModelDestroy(&model);
      return OH_AI_STATUS_LITE_ERROR;
    }
    // Use random data to populate the tensors.
    ret = GenerateInputDataWithRandom(inputs);
    if (ret != OH_AI_STATUS_SUCCESS) {
      printf("GenerateInputDataWithRandom failed, ret: %d.\n", ret);
      OH_AI_ModelDestroy(&model);
      return ret;
    }
    ```

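    The helper **GenerateInputDataWithRandom** used above is not a MindSpore Lite API; you need to provide it yourself. A minimal sketch, assuming every input tensor holds float32 data, might look like this:

    ```c
    #include <stdlib.h>

    // Fill each input tensor with random float values in [0, 1].
    // Assumes all inputs are float32 tensors; adapt the element type if your model differs.
    OH_AI_Status GenerateInputDataWithRandom(OH_AI_TensorHandleArray inputs) {
      for (size_t i = 0; i < inputs.handle_num; ++i) {
        float *data = (float *)OH_AI_TensorGetMutableData(inputs.handle_list[i]);
        if (data == NULL) {
          printf("OH_AI_TensorGetMutableData failed.\n");
          return OH_AI_STATUS_LITE_ERROR;
        }
        int64_t num = OH_AI_TensorGetElementNum(inputs.handle_list[i]);
        for (int64_t j = 0; j < num; ++j) {
          data[j] = (float)rand() / RAND_MAX;
        }
      }
      return OH_AI_STATUS_SUCCESS;
    }
    ```
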
5. Execute model inference.

    Call **OH_AI_ModelPredict** to perform model inference.

    ```c
    // Execute model inference.
    OH_AI_TensorHandleArray outputs;
    ret = OH_AI_ModelPredict(model, inputs, &outputs, NULL, NULL);
    if (ret != OH_AI_STATUS_SUCCESS) {
      printf("OH_AI_ModelPredict failed, ret: %d.\n", ret);
      OH_AI_ModelDestroy(&model);
      return ret;
    }
    ```

6. Obtain the output.

    After model inference is complete, you can obtain the inference result through the output tensors.

    ```c
    // Obtain the output tensors and print their information.
    for (size_t i = 0; i < outputs.handle_num; ++i) {
      OH_AI_TensorHandle tensor = outputs.handle_list[i];
      int64_t element_num = OH_AI_TensorGetElementNum(tensor);
      printf("Tensor name: %s, tensor size is %zu, elements num: %lld.\n", OH_AI_TensorGetName(tensor),
            OH_AI_TensorGetDataSize(tensor), (long long)element_num);
      const float *data = (const float *)OH_AI_TensorGetData(tensor);
      printf("output data is:\n");
      const int max_print_num = 50;
      for (int j = 0; j < element_num && j <= max_print_num; ++j) {
        printf("%f ", data[j]);
      }
      printf("\n");
    }
    ```

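    For a classification model such as **mobilenetv2.ms**, the output tensor holds one score per class, so the index of the largest value is the predicted class. A minimal sketch, assuming a single float32 output tensor, might look like this:

    ```c
    // Find the class index with the highest score (assumes one float32 output tensor).
    const float *scores = (const float *)OH_AI_TensorGetData(outputs.handle_list[0]);
    int64_t num = OH_AI_TensorGetElementNum(outputs.handle_list[0]);
    int64_t best = 0;
    for (int64_t j = 1; j < num; ++j) {
      if (scores[j] > scores[best]) {
        best = j;
      }
    }
    printf("predicted class index: %lld, score: %f\n", (long long)best, scores[best]);
    ```
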
7. Destroy the model.

    If the MindSpore Lite inference framework is no longer needed, destroy the created model.

    ```c
    // Destroy the model.
    OH_AI_ModelDestroy(&model);
    ```

## Verification

1. Write **CMakeLists.txt**.

    ```cmake
    cmake_minimum_required(VERSION 3.14)
    project(Demo)

    add_executable(demo main.c)

    target_link_libraries(
            demo
            mindspore-lite.huawei
            pthread
            dl
    )
    ```

    - To use ohos-sdk for cross compilation, set the native toolchain path for the CMake tool as follows: `-DCMAKE_TOOLCHAIN_FILE="/xxx/ohos-sdk/linux/native/build/cmake/ohos.toolchain.cmake"`.

    - The toolchain builds a 64-bit application by default. To build a 32-bit application, add the following configuration: `-DOHOS_ARCH="armeabi-v7a"`. A sample cross-compilation command is shown below.

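    Putting the two options together, a cross-compilation invocation might look like the following. This is a sketch; the SDK path is the placeholder used above, and the build directory layout is up to you.

    ```shell
    # Configure with the OHOS native toolchain; add -DOHOS_ARCH="armeabi-v7a" for a 32-bit build.
    cmake -DCMAKE_TOOLCHAIN_FILE="/xxx/ohos-sdk/linux/native/build/cmake/ohos.toolchain.cmake" ..
    make
    ```
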
2. Run the demo on the device.

    - Use hdc_std to connect to the device and push **demo** and **mobilenetv2.ms** to the same directory on the board.
    - Run the hdc_std shell command to access the device, go to the directory where **demo** is located, and run the following command:

    ```shell
    ./demo mobilenetv2.ms
    ```

    The inference is successful if the output is similar to the following:

    ```shell
    # ./demo mobilenetv2.ms
    Tensor name: Softmax-65, tensor size is 4004, elements num: 1001.
    output data is:
    0.000018 0.000012 0.000026 0.000194 0.000156 0.001501 0.000240 0.000825 0.000016 0.000006 0.000007 0.000004 0.000004 0.000004 0.000015 0.000099 0.000011 0.000013 0.000005 0.000023 0.000004 0.000008 0.000003 0.000003 0.000008 0.000014 0.000012 0.000006 0.000019 0.000006 0.000018 0.000024 0.000010 0.000002 0.000028 0.000372 0.000010 0.000017 0.000008 0.000004 0.000007 0.000010 0.000007 0.000012 0.000005 0.000015 0.000007 0.000040 0.000004 0.000085 0.000023
    ```