# Connecting NNRt to an AI Inference Framework

<!--Kit: Neural Network Runtime Kit-->
<!--Subsystem: AI-->
<!--Owner: @GbuzhidaoR-->
<!--Designer: @GbuzhidaoR-->
<!--Tester: @GbuzhidaoR-->
<!--Adviser: @ge-yafang-->

## When to Use

As a bridge between the AI inference engine and the acceleration chip, Neural Network Runtime (NNRt) provides simplified native APIs that allow the AI inference engine to perform end-to-end inference through the acceleration chip.

This topic uses the `Add` single-operator model shown in Figure 1 as an example to describe the NNRt development process. The `Add` operator has two inputs, one parameter, and one output. The `activation` parameter specifies the type of activation function applied in the `Add` operator.

**Figure 1** Add single-operator model<br>

## Preparing the Environment

### Environment Requirements

The environment requirements for NNRt are as follows:

- Development environment: Ubuntu 18.04 or later.
- Access device: a standard device whose built-in hardware accelerator driver has been connected to NNRt.

NNRt is exposed to external systems through native APIs. Therefore, you need to download the corresponding SDK and build NNRt applications with the native development kit. You can use DevEco Studio to set up the environment and compile the code.

### Environment Setup

1. Start the Ubuntu server.
2. Specify the native toolchain path for compilation.
3. Download the required SDK. On the DevEco Studio project page, choose **File** > **Settings...**, then search for the required SDK and download it to a local directory.

## Available APIs

The following table lists the common APIs used in NNRt development. For details, see [NeuralNetworkRuntime](../../reference/apis-neural-network-runtime-kit/capi-neuralnetworkruntime.md).

### Structs

| Name| Description|
| --------- | ---- |
| typedef struct OH_NNModel OH_NNModel | Model handle of NNRt. It is used to construct a model.|
| typedef struct OH_NNCompilation OH_NNCompilation | Compiler handle of NNRt. It is used to compile an AI model.|
| typedef struct OH_NNExecutor OH_NNExecutor | Executor handle of NNRt. It is used to perform inference computing on a specified device.|
| typedef struct NN_QuantParam NN_QuantParam | Quantization parameter handle, which is used to specify the quantization parameters of a tensor during model construction.|
| typedef struct NN_TensorDesc NN_TensorDesc | Tensor description handle, which is used to describe tensor attributes, such as the data format, data type, and shape.|
| typedef struct NN_Tensor NN_Tensor | Tensor handle, which is used to set the inference input and output tensors of the executor.|
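
These handles are always used in the same create-use-destroy order. The minimal sketch below (`HandleLifecycleSketch` is an illustrative name, and all intermediate steps are elided) shows how a model handle feeds a compilation handle, which in turn feeds an executor handle; the full procedure is covered step by step in the development procedure below.

```cpp
#include "neural_network_runtime/neural_network_runtime.h"

// Minimal handle lifecycle sketch: model -> compilation -> executor.
// Device selection, tensor creation, and error handling are covered in "How to Develop".
void HandleLifecycleSketch()
{
    OH_NNModel *model = OH_NNModel_Construct();            // Create a model instance.
    // ... add tensors and operations, then call OH_NNModel_Finish(model) ...

    OH_NNCompilation *compilation = OH_NNCompilation_Construct(model);  // Bind the model to a compiler.
    // ... set device, cache, and performance options, then call OH_NNCompilation_Build(compilation) ...
    OH_NNModel_Destroy(&model);                             // The model is no longer needed after building.

    OH_NNExecutor *executor = OH_NNExecutor_Construct(compilation);     // Create an executor for inference.
    OH_NNCompilation_Destroy(&compilation);                 // The compilation is no longer needed after this point.

    // ... create NN_Tensor inputs/outputs and call OH_NNExecutor_RunSync ...
    OH_NNExecutor_Destroy(&executor);                       // Release the executor when inference is done.
}
```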

### Model Construction APIs

| Name| Description|
| ------- | --- |
| OH_NNModel_Construct() | Creates a model instance of the OH_NNModel type.|
| OH_NN_ReturnCode OH_NNModel_AddTensorToModel(OH_NNModel *model, const NN_TensorDesc *tensorDesc) | Adds a tensor to a model instance.|
| OH_NN_ReturnCode OH_NNModel_SetTensorData(OH_NNModel *model, uint32_t index, const void *dataBuffer, size_t length) | Sets the tensor value.|
| OH_NN_ReturnCode OH_NNModel_AddOperation(OH_NNModel *model, OH_NN_OperationType op, const OH_NN_UInt32Array *paramIndices, const OH_NN_UInt32Array *inputIndices, const OH_NN_UInt32Array *outputIndices) | Adds an operator to a model instance.|
| OH_NN_ReturnCode OH_NNModel_SpecifyInputsAndOutputs(OH_NNModel *model, const OH_NN_UInt32Array *inputIndices, const OH_NN_UInt32Array *outputIndices) | Sets the index values of the input and output tensors of a model.|
| OH_NN_ReturnCode OH_NNModel_Finish(OH_NNModel *model) | Completes model composition.|
| void OH_NNModel_Destroy(OH_NNModel **model) | Destroys a model instance.|

### Model Compilation APIs

| Name| Description|
| ------- | --- |
| OH_NNCompilation *OH_NNCompilation_Construct(const OH_NNModel *model) | Creates an **OH_NNCompilation** instance based on the specified model instance.|
| OH_NNCompilation *OH_NNCompilation_ConstructWithOfflineModelFile(const char *modelPath) | Creates an **OH_NNCompilation** instance based on the specified offline model file path.|
| OH_NNCompilation *OH_NNCompilation_ConstructWithOfflineModelBuffer(const void *modelBuffer, size_t modelSize) | Creates an **OH_NNCompilation** instance based on the specified offline model buffer.|
| OH_NNCompilation *OH_NNCompilation_ConstructForCache() | Creates an empty model building instance for later recovery from the model cache.|
| OH_NN_ReturnCode OH_NNCompilation_ExportCacheToBuffer(OH_NNCompilation *compilation, const void *buffer, size_t length, size_t *modelSize) | Writes the model cache to the specified buffer.|
| OH_NN_ReturnCode OH_NNCompilation_ImportCacheFromBuffer(OH_NNCompilation *compilation, const void *buffer, size_t modelSize) | Reads the model cache from the specified buffer.|
| OH_NN_ReturnCode OH_NNCompilation_AddExtensionConfig(OH_NNCompilation *compilation, const char *configName, const void *configValue, const size_t configValueSize) | Adds extended configurations for custom device attributes. For details about the extended attribute names and values, see the documentation that comes with the device.|
| OH_NN_ReturnCode OH_NNCompilation_SetDevice(OH_NNCompilation *compilation, size_t deviceID) | Sets the device for model building and computing. The device ID can be obtained through the device management APIs.|
| OH_NN_ReturnCode OH_NNCompilation_SetCache(OH_NNCompilation *compilation, const char *cachePath, uint32_t version) | Sets the cache directory and version for model building.|
| OH_NN_ReturnCode OH_NNCompilation_SetPerformanceMode(OH_NNCompilation *compilation, OH_NN_PerformanceMode performanceMode) | Sets the performance mode for model computing.|
| OH_NN_ReturnCode OH_NNCompilation_SetPriority(OH_NNCompilation *compilation, OH_NN_Priority priority) | Sets the priority for model computing.|
| OH_NN_ReturnCode OH_NNCompilation_EnableFloat16(OH_NNCompilation *compilation, bool enableFloat16) | Enables float16 for computing.|
| OH_NN_ReturnCode OH_NNCompilation_Build(OH_NNCompilation *compilation) | Performs model building.|
| void OH_NNCompilation_Destroy(OH_NNCompilation **compilation) | Destroys a model building instance.|
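
As a rough sketch of how the cache-related APIs above fit together: once a compilation has been built with `OH_NNCompilation_SetCache`, a later process can skip the original model entirely and rebuild from the cache directory. The function name `BuildFromCache` and the `deviceID`, `cachePath`, and `version` parameters below are illustrative placeholders, and whether a cache is actually produced depends on the underlying device driver.

```cpp
#include "neural_network_runtime/neural_network_runtime.h"

// Sketch: recover a compilation from a previously written cache directory.
// Assumes a previous run already built the same model with the same device,
// cache path, and version, so that cache files exist under cachePath.
OH_NN_ReturnCode BuildFromCache(size_t deviceID, const char *cachePath, uint32_t version,
                                OH_NNCompilation **pCompilation)
{
    // Create an empty compilation instance; no OH_NNModel is needed here.
    OH_NNCompilation *compilation = OH_NNCompilation_ConstructForCache();
    if (compilation == nullptr) {
        return OH_NN_FAILED;
    }

    // The device, cache directory, and version must match the ones used when the cache was written.
    OH_NN_ReturnCode ret = OH_NNCompilation_SetDevice(compilation, deviceID);
    if (ret == OH_NN_SUCCESS) {
        ret = OH_NNCompilation_SetCache(compilation, cachePath, version);
    }
    if (ret == OH_NN_SUCCESS) {
        ret = OH_NNCompilation_Build(compilation);  // Loads the cached computing graph instead of recompiling.
    }

    if (ret != OH_NN_SUCCESS) {
        OH_NNCompilation_Destroy(&compilation);
        return ret;
    }
    *pCompilation = compilation;
    return OH_NN_SUCCESS;
}
```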

### Tensor Description APIs

| Name| Description|
| ------- | --- |
| NN_TensorDesc *OH_NNTensorDesc_Create() | Creates an **NN_TensorDesc** instance for creating an **NN_Tensor** instance at a later time.|
| OH_NN_ReturnCode OH_NNTensorDesc_SetName(NN_TensorDesc *tensorDesc, const char *name) | Sets the name of the **NN_TensorDesc** instance.|
| OH_NN_ReturnCode OH_NNTensorDesc_GetName(const NN_TensorDesc *tensorDesc, const char **name) | Obtains the name of the **NN_TensorDesc** instance.|
| OH_NN_ReturnCode OH_NNTensorDesc_SetDataType(NN_TensorDesc *tensorDesc, OH_NN_DataType dataType) | Sets the data type of the **NN_TensorDesc** instance.|
| OH_NN_ReturnCode OH_NNTensorDesc_GetDataType(const NN_TensorDesc *tensorDesc, OH_NN_DataType *dataType) | Obtains the data type of the **NN_TensorDesc** instance.|
| OH_NN_ReturnCode OH_NNTensorDesc_SetShape(NN_TensorDesc *tensorDesc, const int32_t *shape, size_t shapeLength) | Sets the shape of the **NN_TensorDesc** instance.|
| OH_NN_ReturnCode OH_NNTensorDesc_GetShape(const NN_TensorDesc *tensorDesc, int32_t **shape, size_t *shapeLength) | Obtains the shape of the **NN_TensorDesc** instance.|
| OH_NN_ReturnCode OH_NNTensorDesc_SetFormat(NN_TensorDesc *tensorDesc, OH_NN_Format format) | Sets the data format of the **NN_TensorDesc** instance.|
| OH_NN_ReturnCode OH_NNTensorDesc_GetFormat(const NN_TensorDesc *tensorDesc, OH_NN_Format *format) | Obtains the data format of the **NN_TensorDesc** instance.|
| OH_NN_ReturnCode OH_NNTensorDesc_GetElementCount(const NN_TensorDesc *tensorDesc, size_t *elementCount) | Obtains the number of elements in the **NN_TensorDesc** instance.|
| OH_NN_ReturnCode OH_NNTensorDesc_GetByteSize(const NN_TensorDesc *tensorDesc, size_t *byteSize) | Obtains the number of bytes occupied by the tensor data, calculated from the shape and data type of the **NN_TensorDesc** instance.|
| OH_NN_ReturnCode OH_NNTensorDesc_Destroy(NN_TensorDesc **tensorDesc) | Destroys an **NN_TensorDesc** instance.|

### Tensor APIs

| Name| Description|
| ------- | --- |
| NN_Tensor* OH_NNTensor_Create(size_t deviceID, NN_TensorDesc *tensorDesc) | Creates an **NN_Tensor** instance based on the specified tensor description. This API requests device shared memory.|
| NN_Tensor* OH_NNTensor_CreateWithSize(size_t deviceID, NN_TensorDesc *tensorDesc, size_t size) | Creates an **NN_Tensor** instance based on the specified memory size and tensor description. This API requests device shared memory.|
| NN_Tensor* OH_NNTensor_CreateWithFd(size_t deviceID, NN_TensorDesc *tensorDesc, int fd, size_t size, size_t offset) | Creates an **NN_Tensor** instance based on the specified file descriptor of the shared memory and the tensor description. This way, the device shared memory of another tensor can be reused.|
| NN_TensorDesc* OH_NNTensor_GetTensorDesc(const NN_Tensor *tensor) | Obtains the pointer to the **NN_TensorDesc** instance of a tensor to read tensor attributes, such as the data type and shape.|
| void* OH_NNTensor_GetDataBuffer(const NN_Tensor *tensor) | Obtains the memory address of the tensor data to read or write tensor data.|
| OH_NN_ReturnCode OH_NNTensor_GetFd(const NN_Tensor *tensor, int *fd) | Obtains the file descriptor of the shared memory where the tensor data is located. A file descriptor corresponds to one device shared memory block.|
| OH_NN_ReturnCode OH_NNTensor_GetSize(const NN_Tensor *tensor, size_t *size) | Obtains the size of the shared memory where the tensor data is located.|
| OH_NN_ReturnCode OH_NNTensor_GetOffset(const NN_Tensor *tensor, size_t *offset) | Obtains the offset of the tensor data in the shared memory. The size available to the tensor data is the size of the shared memory minus the offset.|
| OH_NN_ReturnCode OH_NNTensor_Destroy(NN_Tensor **tensor) | Destroys an **NN_Tensor** instance.|
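
The two tables above are typically used together: an `NN_TensorDesc` describes the tensor, and an `NN_Tensor` allocates matching device shared memory. The sketch below (`TensorSketch` and `deviceID` are illustrative names, and the shape and data type are arbitrary examples) shows one plausible combination; the same pattern appears in the inference step of the development procedure below.

```cpp
#include <cstring>
#include "neural_network_runtime/neural_network_runtime.h"

// Sketch: describe a float32 [1, 2, 2, 3] tensor, allocate it on a device,
// zero its shared memory, and release everything again.
void TensorSketch(size_t deviceID)
{
    NN_TensorDesc *desc = OH_NNTensorDesc_Create();
    if (desc == nullptr) {
        return;
    }

    int32_t shape[4] = {1, 2, 2, 3};
    OH_NNTensorDesc_SetShape(desc, shape, 4);
    OH_NNTensorDesc_SetDataType(desc, OH_NN_FLOAT32);
    OH_NNTensorDesc_SetFormat(desc, OH_NN_FORMAT_NONE);

    // OH_NNTensor_Create requests device shared memory sized from the description.
    NN_Tensor *tensor = OH_NNTensor_Create(deviceID, desc);
    if (tensor != nullptr) {
        size_t byteSize = 0;
        OH_NNTensorDesc_GetByteSize(desc, &byteSize);   // Bytes computed from shape and data type.
        void *data = OH_NNTensor_GetDataBuffer(tensor); // Writable shared-memory address.
        if (data != nullptr) {
            std::memset(data, 0, byteSize);
        }
        OH_NNTensor_Destroy(&tensor);
    }
    OH_NNTensorDesc_Destroy(&desc);
}
```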

### Inference APIs

| Name| Description|
| ------- | --- |
| OH_NNExecutor *OH_NNExecutor_Construct(OH_NNCompilation *compilation) | Creates an **OH_NNExecutor** instance.|
| OH_NN_ReturnCode OH_NNExecutor_GetOutputShape(OH_NNExecutor *executor, uint32_t outputIndex, int32_t **shape, uint32_t *shapeLength) | Obtains the dimension information about the output tensor. This API is applicable only if the output tensor has a dynamic shape.|
| OH_NN_ReturnCode OH_NNExecutor_GetInputCount(const OH_NNExecutor *executor, size_t *inputCount) | Obtains the number of input tensors.|
| OH_NN_ReturnCode OH_NNExecutor_GetOutputCount(const OH_NNExecutor *executor, size_t *outputCount) | Obtains the number of output tensors.|
| NN_TensorDesc* OH_NNExecutor_CreateInputTensorDesc(const OH_NNExecutor *executor, size_t index) | Creates an **NN_TensorDesc** instance for the input tensor with the specified index. The instance is used to read tensor attributes or create **NN_Tensor** instances.|
| NN_TensorDesc* OH_NNExecutor_CreateOutputTensorDesc(const OH_NNExecutor *executor, size_t index) | Creates an **NN_TensorDesc** instance for the output tensor with the specified index. The instance is used to read tensor attributes or create **NN_Tensor** instances.|
| OH_NN_ReturnCode OH_NNExecutor_GetInputDimRange(const OH_NNExecutor *executor, size_t index, size_t **minInputDims, size_t **maxInputDims, size_t *shapeLength) | Obtains the dimension range of the input tensor with the specified index. If the input tensor has a dynamic shape, the dimension range supported by the tensor may vary by device.|
| OH_NN_ReturnCode OH_NNExecutor_SetOnRunDone(OH_NNExecutor *executor, NN_OnRunDone onRunDone) | Sets the callback function invoked when asynchronous inference ends. For the definition of the callback function, see the *API Reference*.|
| OH_NN_ReturnCode OH_NNExecutor_SetOnServiceDied(OH_NNExecutor *executor, NN_OnServiceDied onServiceDied) | Sets the callback function invoked when the device driver service terminates unexpectedly during asynchronous inference. For the definition of the callback function, see the *API Reference*.|
| OH_NN_ReturnCode OH_NNExecutor_RunSync(OH_NNExecutor *executor, NN_Tensor *inputTensor[], size_t inputCount, NN_Tensor *outputTensor[], size_t outputCount) | Performs synchronous inference.|
| OH_NN_ReturnCode OH_NNExecutor_RunAsync(OH_NNExecutor *executor, NN_Tensor *inputTensor[], size_t inputCount, NN_Tensor *outputTensor[], size_t outputCount, int32_t timeout, void *userData) | Performs asynchronous inference.|
| void OH_NNExecutor_Destroy(OH_NNExecutor **executor) | Destroys an **OH_NNExecutor** instance.|
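
For models with dynamic input shapes, `OH_NNExecutor_GetInputDimRange` reports the dimension range each device supports. The sketch below (`PrintInputDimRanges` is an illustrative name; the `Add` example in this topic uses static shapes, so only the call pattern matters here) prints the range of every input; it assumes the returned arrays are owned by the executor, which should be confirmed in the *API Reference*.

```cpp
#include <cstdio>
#include "neural_network_runtime/neural_network_runtime.h"

// Sketch: print the supported dimension range of every input tensor of an executor.
OH_NN_ReturnCode PrintInputDimRanges(OH_NNExecutor *executor)
{
    size_t inputCount = 0;
    OH_NN_ReturnCode ret = OH_NNExecutor_GetInputCount(executor, &inputCount);
    if (ret != OH_NN_SUCCESS) {
        return ret;
    }

    for (size_t i = 0; i < inputCount; ++i) {
        size_t *minDims = nullptr;
        size_t *maxDims = nullptr;
        size_t shapeLength = 0;
        // Assumption: the returned arrays need not be freed by the caller; see the API Reference.
        ret = OH_NNExecutor_GetInputDimRange(executor, i, &minDims, &maxDims, &shapeLength);
        if (ret != OH_NN_SUCCESS) {
            return ret;
        }
        for (size_t j = 0; j < shapeLength; ++j) {
            printf("input %zu, dim %zu: [%zu, %zu]\n", i, j, minDims[j], maxDims[j]);
        }
    }
    return OH_NN_SUCCESS;
}
```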

### Device Management APIs

| Name| Description|
| ------- | --- |
| OH_NN_ReturnCode OH_NNDevice_GetAllDevicesID(const size_t **allDevicesID, uint32_t *deviceCount) | Obtains the IDs of all devices connected to NNRt.|
| OH_NN_ReturnCode OH_NNDevice_GetName(size_t deviceID, const char **name) | Obtains the name of the specified device.|
| OH_NN_ReturnCode OH_NNDevice_GetType(size_t deviceID, OH_NN_DeviceType *deviceType) | Obtains the type of the specified device.|

## How to Develop

The development process of NNRt consists of three phases: model construction, model compilation, and inference execution. The following uses the `Add` single-operator model as an example to describe how to call NNRt APIs during application development.

1. Create an application sample file.

    Create the source file of the NNRt application sample. Run the following commands in the project directory to create the `nnrt_example/` directory and create the `nnrt_example.cpp` source file in it:

    ```shell
    mkdir ~/nnrt_example && cd ~/nnrt_example
    touch nnrt_example.cpp
    ```

2. Import the NNRt module.

    Add the following code at the beginning of the `nnrt_example.cpp` file to import NNRt:

    ```cpp
    #include <iostream>
    #include <cstdarg>
    #include <vector>
    #include "neural_network_runtime/neural_network_runtime.h"
    ```

3. Define auxiliary functions for return-value checking, input data setting, and result printing.

    ```cpp
    // Returns retValue if realRet is not equal to expectRet.
    #define CHECKNEQ(realRet, expectRet, retValue, ...) \
        do { \
            if ((realRet) != (expectRet)) { \
                printf(__VA_ARGS__); \
                return (retValue); \
            } \
        } while (0)

    // Returns retValue if realRet is equal to expectRet.
    #define CHECKEQ(realRet, expectRet, retValue, ...) \
        do { \
            if ((realRet) == (expectRet)) { \
                printf(__VA_ARGS__); \
                return (retValue); \
            } \
        } while (0)

    // Set the input data for inference.
    OH_NN_ReturnCode SetInputData(NN_Tensor* inputTensor[], size_t inputSize)
    {
        OH_NN_DataType dataType(OH_NN_FLOAT32);
        OH_NN_ReturnCode ret{OH_NN_FAILED};
        size_t elementCount = 0;
        for (size_t i = 0; i < inputSize; ++i) {
            // Obtain the data memory of the tensor.
            auto data = OH_NNTensor_GetDataBuffer(inputTensor[i]);
            CHECKEQ(data, nullptr, OH_NN_FAILED, "Failed to get data buffer.");
            // Obtain the tensor description.
            auto desc = OH_NNTensor_GetTensorDesc(inputTensor[i]);
            CHECKEQ(desc, nullptr, OH_NN_FAILED, "Failed to get desc.");
            // Obtain the data type of the tensor.
            ret = OH_NNTensorDesc_GetDataType(desc, &dataType);
            CHECKNEQ(ret, OH_NN_SUCCESS, OH_NN_FAILED, "Failed to get data type.");
            // Obtain the number of elements in the tensor.
            ret = OH_NNTensorDesc_GetElementCount(desc, &elementCount);
            CHECKNEQ(ret, OH_NN_SUCCESS, OH_NN_FAILED, "Failed to get element count.");
            switch (dataType) {
                case OH_NN_FLOAT32: {
                    float* floatValue = reinterpret_cast<float*>(data);
                    for (size_t j = 0; j < elementCount; ++j) {
                        floatValue[j] = static_cast<float>(j);
                    }
                    break;
                }
                case OH_NN_INT32: {
                    int* intValue = reinterpret_cast<int*>(data);
                    for (size_t j = 0; j < elementCount; ++j) {
                        intValue[j] = static_cast<int>(j);
                    }
                    break;
                }
                default:
                    return OH_NN_FAILED;
            }
        }
        return OH_NN_SUCCESS;
    }

    OH_NN_ReturnCode Print(NN_Tensor* outputTensor[], size_t outputSize)
    {
        OH_NN_DataType dataType(OH_NN_FLOAT32);
        OH_NN_ReturnCode ret{OH_NN_FAILED};
        size_t elementCount = 0;
        for (size_t i = 0; i < outputSize; ++i) {
            auto data = OH_NNTensor_GetDataBuffer(outputTensor[i]);
            CHECKEQ(data, nullptr, OH_NN_FAILED, "Failed to get data buffer.");
            auto desc = OH_NNTensor_GetTensorDesc(outputTensor[i]);
            CHECKEQ(desc, nullptr, OH_NN_FAILED, "Failed to get desc.");
            ret = OH_NNTensorDesc_GetDataType(desc, &dataType);
            CHECKNEQ(ret, OH_NN_SUCCESS, OH_NN_FAILED, "Failed to get data type.");
            ret = OH_NNTensorDesc_GetElementCount(desc, &elementCount);
            CHECKNEQ(ret, OH_NN_SUCCESS, OH_NN_FAILED, "Failed to get element count.");
            switch (dataType) {
                case OH_NN_FLOAT32: {
                    float* floatValue = reinterpret_cast<float*>(data);
                    for (size_t j = 0; j < elementCount; ++j) {
                        std::cout << "Output index: " << j << ", value is: " << floatValue[j] << "." << std::endl;
                    }
                    break;
                }
                case OH_NN_INT32: {
                    int* intValue = reinterpret_cast<int*>(data);
                    for (size_t j = 0; j < elementCount; ++j) {
                        std::cout << "Output index: " << j << ", value is: " << intValue[j] << "." << std::endl;
                    }
                    break;
                }
                default:
                    return OH_NN_FAILED;
            }
        }

        return OH_NN_SUCCESS;
    }
    ```

4. Construct a model.

    Use the model construction APIs to construct a single `Add` operator model.

    ```cpp
    OH_NN_ReturnCode BuildModel(OH_NNModel** pmodel)
    {
        // Create a model instance and construct the model.
        OH_NNModel* model = OH_NNModel_Construct();
        CHECKEQ(model, nullptr, OH_NN_FAILED, "Create model failed.");

        // Add the first input tensor of the float32 type for the Add operator. The tensor shape is [1, 2, 2, 3].
        NN_TensorDesc* tensorDesc = OH_NNTensorDesc_Create();
        CHECKEQ(tensorDesc, nullptr, OH_NN_FAILED, "Create TensorDesc failed.");

        int32_t inputDims[4] = {1, 2, 2, 3};
        auto returnCode = OH_NNTensorDesc_SetShape(tensorDesc, inputDims, 4);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "Set TensorDesc shape failed.");

        returnCode = OH_NNTensorDesc_SetDataType(tensorDesc, OH_NN_FLOAT32);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "Set TensorDesc data type failed.");

        returnCode = OH_NNTensorDesc_SetFormat(tensorDesc, OH_NN_FORMAT_NONE);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "Set TensorDesc format failed.");

        returnCode = OH_NNModel_AddTensorToModel(model, tensorDesc);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "Add first TensorDesc to model failed.");

        returnCode = OH_NNModel_SetTensorType(model, 0, OH_NN_TENSOR);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "Set model tensor type failed.");

        // Add the second input tensor of the float32 type for the Add operator. The tensor shape is [1, 2, 2, 3].
        tensorDesc = OH_NNTensorDesc_Create();
        CHECKEQ(tensorDesc, nullptr, OH_NN_FAILED, "Create TensorDesc failed.");

        returnCode = OH_NNTensorDesc_SetShape(tensorDesc, inputDims, 4);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "Set TensorDesc shape failed.");

        returnCode = OH_NNTensorDesc_SetDataType(tensorDesc, OH_NN_FLOAT32);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "Set TensorDesc data type failed.");

        returnCode = OH_NNTensorDesc_SetFormat(tensorDesc, OH_NN_FORMAT_NONE);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "Set TensorDesc format failed.");

        returnCode = OH_NNModel_AddTensorToModel(model, tensorDesc);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "Add second TensorDesc to model failed.");

        returnCode = OH_NNModel_SetTensorType(model, 1, OH_NN_TENSOR);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "Set model tensor type failed.");

        // Add the parameter tensor of the int8 type for the Add operator. The parameter tensor is used to specify the type of the activation function.
        tensorDesc = OH_NNTensorDesc_Create();
        CHECKEQ(tensorDesc, nullptr, OH_NN_FAILED, "Create TensorDesc failed.");

        int32_t activationDims = 1;
        returnCode = OH_NNTensorDesc_SetShape(tensorDesc, &activationDims, 1);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "Set TensorDesc shape failed.");

        returnCode = OH_NNTensorDesc_SetDataType(tensorDesc, OH_NN_INT8);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "Set TensorDesc data type failed.");

        returnCode = OH_NNTensorDesc_SetFormat(tensorDesc, OH_NN_FORMAT_NONE);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "Set TensorDesc format failed.");

        returnCode = OH_NNModel_AddTensorToModel(model, tensorDesc);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "Add third TensorDesc to model failed.");

        returnCode = OH_NNModel_SetTensorType(model, 2, OH_NN_ADD_ACTIVATIONTYPE);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "Set model tensor type failed.");

        // Set the type of the activation function to OH_NN_FUSED_NONE, indicating that no activation function is added to the operator.
        int8_t activationValue = OH_NN_FUSED_NONE;
        returnCode = OH_NNModel_SetTensorData(model, 2, &activationValue, sizeof(int8_t));
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "Set model tensor data failed.");

        // Add the output tensor of the float32 type for the Add operator. The tensor shape is [1, 2, 2, 3].
        tensorDesc = OH_NNTensorDesc_Create();
        CHECKEQ(tensorDesc, nullptr, OH_NN_FAILED, "Create TensorDesc failed.");

        returnCode = OH_NNTensorDesc_SetShape(tensorDesc, inputDims, 4);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "Set TensorDesc shape failed.");

        returnCode = OH_NNTensorDesc_SetDataType(tensorDesc, OH_NN_FLOAT32);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "Set TensorDesc data type failed.");

        returnCode = OH_NNTensorDesc_SetFormat(tensorDesc, OH_NN_FORMAT_NONE);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "Set TensorDesc format failed.");

        returnCode = OH_NNModel_AddTensorToModel(model, tensorDesc);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "Add fourth TensorDesc to model failed.");

        returnCode = OH_NNModel_SetTensorType(model, 3, OH_NN_TENSOR);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "Set model tensor type failed.");

        // Specify the index values of the input tensors, parameter tensor, and output tensor for the Add operator.
        uint32_t inputIndicesValues[2] = {0, 1};
        uint32_t paramIndicesValues = 2;
        uint32_t outputIndicesValues = 3;
        OH_NN_UInt32Array paramIndices = {&paramIndicesValues, 1};
        OH_NN_UInt32Array inputIndices = {inputIndicesValues, 2};
        OH_NN_UInt32Array outputIndices = {&outputIndicesValues, 1};

        // Add the Add operator to the model instance.
        returnCode = OH_NNModel_AddOperation(model, OH_NN_OPS_ADD, &paramIndices, &inputIndices, &outputIndices);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "Add operation to model failed.");

        // Set the index values of the input tensors and output tensor for the model instance.
        returnCode = OH_NNModel_SpecifyInputsAndOutputs(model, &inputIndices, &outputIndices);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "Specify model inputs and outputs failed.");

        // Complete the model instance construction.
        returnCode = OH_NNModel_Finish(model);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "Build model failed.");

        // Return the model instance.
        *pmodel = model;
        return OH_NN_SUCCESS;
    }
    ```

5. Query the AI acceleration chips connected to NNRt.

    NNRt can connect to multiple AI acceleration chips through HDIs. Before model building, you need to query the AI acceleration chips connected to NNRt on the current device. Each AI acceleration chip has a unique ID, and in the compilation phase you specify the chip used for model compilation based on its ID. A sketch that also queries each device's name and type follows the code block below.
    ```cpp
    void GetAvailableDevices(std::vector<size_t>& availableDevice)
    {
        availableDevice.clear();

        // Obtain the available device IDs.
        const size_t* devices = nullptr;
        uint32_t deviceCount = 0;
        OH_NN_ReturnCode ret = OH_NNDevice_GetAllDevicesID(&devices, &deviceCount);
        if (ret != OH_NN_SUCCESS) {
            std::cout << "GetAllDevicesID failed, get no available device." << std::endl;
            return;
        }

        for (uint32_t i = 0; i < deviceCount; i++) {
            availableDevice.emplace_back(devices[i]);
        }
    }
    ```
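
    If you need to pick a particular accelerator rather than simply taking the first device, the device management APIs can also report each device's name and type. The following sketch (`PrintDeviceInfo` is an illustrative name) prints this information for every device; how you map names or types to a selection policy is up to your inference framework.

    ```cpp
    // Sketch: list every device connected to NNRt together with its name and type.
    void PrintDeviceInfo()
    {
        const size_t* devices = nullptr;
        uint32_t deviceCount = 0;
        OH_NN_ReturnCode ret = OH_NNDevice_GetAllDevicesID(&devices, &deviceCount);
        if (ret != OH_NN_SUCCESS) {
            std::cout << "GetAllDevicesID failed." << std::endl;
            return;
        }

        for (uint32_t i = 0; i < deviceCount; i++) {
            const char* name = nullptr;
            OH_NN_DeviceType type;
            if ((OH_NNDevice_GetName(devices[i], &name) == OH_NN_SUCCESS) &&
                (OH_NNDevice_GetType(devices[i], &type) == OH_NN_SUCCESS)) {
                // type is an OH_NN_DeviceType enum; see the API Reference for its values.
                std::cout << "Device " << devices[i] << ": name = " << name
                          << ", type = " << static_cast<int>(type) << std::endl;
            }
        }
    }
    ```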

6. Compile the model on the specified device.

    NNRt uses an abstract model expression to describe the topology of an AI model. Before inference can run on an AI acceleration chip, the build module provided by NNRt delivers the abstract model expression to the chip driver layer, where it is converted into a format that supports inference and computing.
    ```cpp
    OH_NN_ReturnCode CreateCompilation(OH_NNModel* model, const std::vector<size_t>& availableDevice,
                                       OH_NNCompilation** pCompilation)
    {
        // Create an OH_NNCompilation instance and pass the constructed model instance or the MindSpore Lite model instance to it.
        OH_NNCompilation* compilation = OH_NNCompilation_Construct(model);
        CHECKEQ(compilation, nullptr, OH_NN_FAILED, "OH_NNCompilation_Construct failed.");

        // Set compilation options, such as the compilation device, cache path, performance mode, computing priority, and whether to enable float16 low-precision computing.
        // Perform model compilation on the first available device.
        auto returnCode = OH_NNCompilation_SetDevice(compilation, availableDevice[0]);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "OH_NNCompilation_SetDevice failed.");

        // Cache the model compilation result in the /data/local/tmp directory, with the version number set to 1.
        returnCode = OH_NNCompilation_SetCache(compilation, "/data/local/tmp", 1);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "OH_NNCompilation_SetCache failed.");

        // Set the performance mode of the device.
        returnCode = OH_NNCompilation_SetPerformanceMode(compilation, OH_NN_PERFORMANCE_EXTREME);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "OH_NNCompilation_SetPerformanceMode failed.");

        // Set the inference priority.
        returnCode = OH_NNCompilation_SetPriority(compilation, OH_NN_PRIORITY_HIGH);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "OH_NNCompilation_SetPriority failed.");

        // Specify whether to enable FP16 computing.
        returnCode = OH_NNCompilation_EnableFloat16(compilation, false);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "OH_NNCompilation_EnableFloat16 failed.");

        // Perform model building.
        returnCode = OH_NNCompilation_Build(compilation);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "OH_NNCompilation_Build failed.");

        *pCompilation = compilation;
        return OH_NN_SUCCESS;
    }
    ```

7. Create an executor.

    After model building is complete, call the NNRt execution module to create an executor. In the inference phase, operations such as setting the model input, obtaining the model output, and triggering inference computing are performed through the executor.
    ```cpp
    OH_NNExecutor* CreateExecutor(OH_NNCompilation* compilation)
    {
        // Create an executor based on the specified OH_NNCompilation instance.
        OH_NNExecutor *executor = OH_NNExecutor_Construct(compilation);
        CHECKEQ(executor, nullptr, nullptr, "OH_NNExecutor_Construct failed.");
        return executor;
    }
    ```

8. Perform inference computing, and print the inference result.

    Pass the input data required for inference to the executor through the APIs provided by the execution module. This triggers one round of inference computing, whose result is then obtained and printed. An asynchronous variant is sketched after the synchronous example below.
    ```cpp
    OH_NN_ReturnCode Run(OH_NNExecutor* executor, const std::vector<size_t>& availableDevice)
    {
        // Obtain information about the input and output tensors from the executor.
        // Obtain the number of input tensors.
        size_t inputCount = 0;
        auto returnCode = OH_NNExecutor_GetInputCount(executor, &inputCount);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "OH_NNExecutor_GetInputCount failed.");
        std::vector<NN_TensorDesc*> inputTensorDescs;
        NN_TensorDesc* tensorDescTmp = nullptr;
        for (size_t i = 0; i < inputCount; ++i) {
            // Create the description of the input tensor.
            tensorDescTmp = OH_NNExecutor_CreateInputTensorDesc(executor, i);
            CHECKEQ(tensorDescTmp, nullptr, OH_NN_FAILED, "OH_NNExecutor_CreateInputTensorDesc failed.");
            inputTensorDescs.emplace_back(tensorDescTmp);
        }
        // Obtain the number of output tensors.
        size_t outputCount = 0;
        returnCode = OH_NNExecutor_GetOutputCount(executor, &outputCount);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "OH_NNExecutor_GetOutputCount failed.");
        std::vector<NN_TensorDesc*> outputTensorDescs;
        for (size_t i = 0; i < outputCount; ++i) {
            // Create the description of the output tensor.
            tensorDescTmp = OH_NNExecutor_CreateOutputTensorDesc(executor, i);
            CHECKEQ(tensorDescTmp, nullptr, OH_NN_FAILED, "OH_NNExecutor_CreateOutputTensorDesc failed.");
            outputTensorDescs.emplace_back(tensorDescTmp);
        }

        // Create input and output tensors.
        NN_Tensor* inputTensors[inputCount];
        NN_Tensor* tensor = nullptr;
        for (size_t i = 0; i < inputCount; ++i) {
            tensor = nullptr;
            tensor = OH_NNTensor_Create(availableDevice[0], inputTensorDescs[i]);
            CHECKEQ(tensor, nullptr, OH_NN_FAILED, "OH_NNTensor_Create failed.");
            inputTensors[i] = tensor;
        }
        NN_Tensor* outputTensors[outputCount];
        for (size_t i = 0; i < outputCount; ++i) {
            tensor = nullptr;
            tensor = OH_NNTensor_Create(availableDevice[0], outputTensorDescs[i]);
            CHECKEQ(tensor, nullptr, OH_NN_FAILED, "OH_NNTensor_Create failed.");
            outputTensors[i] = tensor;
        }

        // Set the data of the input tensors.
        returnCode = SetInputData(inputTensors, inputCount);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "SetInputData failed.");

        // Perform inference.
        returnCode = OH_NNExecutor_RunSync(executor, inputTensors, inputCount, outputTensors, outputCount);
        CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "OH_NNExecutor_RunSync failed.");

        // Print the data of the output tensors.
        Print(outputTensors, outputCount);

        // Release the input and output tensors and the tensor descriptions.
        for (size_t i = 0; i < inputCount; ++i) {
            returnCode = OH_NNTensor_Destroy(&inputTensors[i]);
            CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "OH_NNTensor_Destroy failed.");
            returnCode = OH_NNTensorDesc_Destroy(&inputTensorDescs[i]);
            CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "OH_NNTensorDesc_Destroy failed.");
        }
        for (size_t i = 0; i < outputCount; ++i) {
            returnCode = OH_NNTensor_Destroy(&outputTensors[i]);
            CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "OH_NNTensor_Destroy failed.");
            returnCode = OH_NNTensorDesc_Destroy(&outputTensorDescs[i]);
            CHECKNEQ(returnCode, OH_NN_SUCCESS, OH_NN_FAILED, "OH_NNTensorDesc_Destroy failed.");
        }

        return OH_NN_SUCCESS;
    }
    ```
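
    In addition to `OH_NNExecutor_RunSync`, the executor provides `OH_NNExecutor_RunAsync` together with the `OH_NNExecutor_SetOnRunDone` callback for asynchronous inference, as sketched below. `RunAsyncSketch` and `OnRunDone` are illustrative names, and the parameter list assumed for `OnRunDone` must be checked against the `NN_OnRunDone` definition in the *API Reference* before use.

    ```cpp
    // Sketch of asynchronous inference. The parameter list of OnRunDone below is an assumption;
    // the authoritative NN_OnRunDone definition is in the NeuralNetworkRuntime API Reference.
    void OnRunDone(void *userData, OH_NN_ReturnCode errCode, void *outputTensor[], int32_t outputCount)
    {
        std::cout << "Async run finished with code " << errCode
                  << ", output count " << outputCount << std::endl;
        // Read the output tensors here; userData is the pointer passed to OH_NNExecutor_RunAsync.
    }

    OH_NN_ReturnCode RunAsyncSketch(OH_NNExecutor* executor, NN_Tensor* inputTensors[], size_t inputCount,
                                    NN_Tensor* outputTensors[], size_t outputCount)
    {
        OH_NN_ReturnCode ret = OH_NNExecutor_SetOnRunDone(executor, OnRunDone);
        CHECKNEQ(ret, OH_NN_SUCCESS, OH_NN_FAILED, "OH_NNExecutor_SetOnRunDone failed.");

        // The timeout is given in milliseconds; see the API Reference for its exact semantics.
        ret = OH_NNExecutor_RunAsync(executor, inputTensors, inputCount, outputTensors, outputCount,
                                     1000, nullptr);
        CHECKNEQ(ret, OH_NN_SUCCESS, OH_NN_FAILED, "OH_NNExecutor_RunAsync failed.");
        return OH_NN_SUCCESS;
    }
    ```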

9. Build an end-to-end process from model construction to model compilation and execution.

    Steps 4 to 8 implement the model construction, compilation, and execution processes and encapsulate them into separate functions to facilitate modular development. The following sample code shows how to chain these functions into a complete NNRt workflow.
    ```cpp
    int main(int argc, char** argv)
    {
        OH_NNModel* model = nullptr;
        OH_NNCompilation* compilation = nullptr;
        OH_NNExecutor* executor = nullptr;
        std::vector<size_t> availableDevices;

        // Construct a model.
        OH_NN_ReturnCode ret = BuildModel(&model);
        if (ret != OH_NN_SUCCESS) {
            std::cout << "BuildModel failed." << std::endl;
            OH_NNModel_Destroy(&model);
            return -1;
        }

        // Obtain the available devices.
        GetAvailableDevices(availableDevices);
        if (availableDevices.empty()) {
            std::cout << "No available device." << std::endl;
            OH_NNModel_Destroy(&model);
            return -1;
        }

        // Build the model.
        ret = CreateCompilation(model, availableDevices, &compilation);
        if (ret != OH_NN_SUCCESS) {
            std::cout << "CreateCompilation failed." << std::endl;
            OH_NNModel_Destroy(&model);
            OH_NNCompilation_Destroy(&compilation);
            return -1;
        }

        // Destroy the model instance.
        OH_NNModel_Destroy(&model);

        // Create an inference executor for the model.
        executor = CreateExecutor(compilation);
        if (executor == nullptr) {
            std::cout << "CreateExecutor failed, no executor is created." << std::endl;
            OH_NNCompilation_Destroy(&compilation);
            return -1;
        }

        // Destroy the model building instance.
        OH_NNCompilation_Destroy(&compilation);

        // Use the created executor to perform inference.
        ret = Run(executor, availableDevices);
        if (ret != OH_NN_SUCCESS) {
            std::cout << "Run failed." << std::endl;
            OH_NNExecutor_Destroy(&executor);
            return -1;
        }

        // Destroy the executor instance.
        OH_NNExecutor_Destroy(&executor);

        return 0;
    }
    ```

## Verification

1. Prepare the compilation configuration file of the application sample.

    Create a `CMakeLists.txt` file and add compilation configurations for the application sample file `nnrt_example.cpp`. The following is a simple example of the `CMakeLists.txt` file:
    ```text
    cmake_minimum_required(VERSION 3.16)
    project(nnrt_example C CXX)

    add_executable(nnrt_example
        ./nnrt_example.cpp
    )

    target_link_libraries(nnrt_example
        neural_network_runtime
        neural_network_core
    )
    ```

2. Compile the application sample.

    Create a **build/** directory in the current directory, and compile `nnrt_example.cpp` in the **build/** directory to obtain the binary file `nnrt_example`:
    ```shell
    mkdir build && cd build
    cmake -DCMAKE_TOOLCHAIN_FILE={Path of the cross-compilation toolchain}/build/cmake/ohos.toolchain.cmake -DOHOS_ARCH=arm64-v8a -DOHOS_PLATFORM=OHOS -DOHOS_STL=c++_static ..
    make
    ```

3. Push the application sample to the device for execution.
    ```shell
    # Push the compiled nnrt_example binary to the device.
    hdc_std file send ./nnrt_example /data/local/tmp/.

    # Grant the required permissions to the executable file of the test case.
    hdc_std shell "chmod +x /data/local/tmp/nnrt_example"

    # Execute the test case.
    hdc_std shell "/data/local/tmp/nnrt_example"
    ```

    If the execution is normal, information similar to the following is displayed:
    ```text
    Output index: 0, value is: 0.000000.
    Output index: 1, value is: 2.000000.
    Output index: 2, value is: 4.000000.
    Output index: 3, value is: 6.000000.
    Output index: 4, value is: 8.000000.
    Output index: 5, value is: 10.000000.
    Output index: 6, value is: 12.000000.
    Output index: 7, value is: 14.000000.
    Output index: 8, value is: 16.000000.
    Output index: 9, value is: 18.000000.
    Output index: 10, value is: 20.000000.
    Output index: 11, value is: 22.000000.
    ```

4. (Optional) Check the model cache.

    If the HDI service connected to NNRt supports the model cache function, you can find the generated cache files in the `/data/local/tmp` directory after `nnrt_example` is executed successfully.

    > **NOTE**
    >
    > The IR graphs of the model need to be passed to the hardware driver layer, so that the HDI service can compile them into a computing graph dedicated to the hardware. Because this compilation is time-consuming, NNRt supports caching the computing graphs compiled by the HDI service to device storage. If the same model is compiled on the same acceleration chip again, you can specify the cache path so that NNRt directly loads the computing graphs from the cache files, reducing the compilation time.

    Check the cached files in the cache directory.
    ```shell
    ls /data/local/tmp
    ```

    The command output is as follows:
    ```text
    # 0.nncache 1.nncache 2.nncache cache_info.nncache
    ```

    If the cache is no longer used, manually delete the cache files.
    ```shell
    rm /data/local/tmp/*nncache
    ```