# Performance measurement

## Benchmark tools

TensorFlow Lite benchmark tools currently measure and calculate statistics for
the following important performance metrics:

*   Initialization time
*   Inference time of warmup state
*   Inference time of steady state
*   Memory usage during initialization time
*   Overall memory usage

The benchmark tools are available as benchmark apps for Android and iOS and as
native command-line binaries, and they all share the same core performance
measurement logic. Note that the available options and output formats are
slightly different due to differences in the runtime environments.

### Android benchmark app

There are two options for using the benchmark tool with Android. One is a
[native benchmark binary](#native-benchmark-binary) and the other is an Android
benchmark app, which is a better gauge of how the model would perform in an
app. Either way, the numbers from the benchmark tool will still differ slightly
from those measured when running inference with the model in the actual app.

This Android benchmark app has no UI. Install and run it by using the `adb`
command and retrieve results by using the `adb logcat` command.

#### Download or build the app

Download the nightly pre-built Android benchmark apps using the links below:

*   [android_aarch64](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/android_aarch64_benchmark_model.apk)
*   [android_arm](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/android_arm_benchmark_model.apk)

For Android benchmark apps that support
[TF ops](https://www.tensorflow.org/lite/guide/ops_select) via the
[Flex delegate](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/delegates/flex),
use the links below:

*   [android_aarch64](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/android_aarch64_benchmark_model_plus_flex.apk)
*   [android_arm](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/android_arm_benchmark_model_plus_flex.apk)

You can also build the app from source by following these
[instructions](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/benchmark/android).

Note: Building the app from source is required if you want to run the Android
benchmark apk on an x86 CPU or with the Hexagon delegate, or if your model
contains [select TF operators](../guide/ops_select) or
[custom operators](../guide/ops_custom).

#### Prepare benchmark

Before running the benchmark app, install the app and push the model file to
the device as follows:

```shell
adb install -r -d -g android_aarch64_benchmark_model.apk
adb push your_model.tflite /data/local/tmp
```

#### Run benchmark

```shell
adb shell am start -S \
  -n org.tensorflow.lite.benchmark/.BenchmarkModelActivity \
  --es args '"--graph=/data/local/tmp/your_model.tflite \
              --num_threads=4"'
```

`graph` is a required parameter.

*   `graph`: `string` \
    The path to the TFLite model file.

You can specify more optional parameters for running the benchmark.

*   `num_threads`: `int` (default=1) \
    The number of threads to use for running the TFLite interpreter.
*   `use_gpu`: `bool` (default=`false`) \
    Use the [GPU delegate](gpu).
*   `use_nnapi`: `bool` (default=`false`) \
    Use the [NNAPI delegate](nnapi).
*   `use_xnnpack`: `bool` (default=`false`) \
    Use the
    [XNNPACK delegate](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/delegates/xnnpack).
*   `use_hexagon`: `bool` (default=`false`) \
    Use the [Hexagon delegate](hexagon_delegate).
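For example, to benchmark the same model with the GPU delegate instead of
multi-threaded CPU execution, you could launch the activity with the
corresponding flag (a sketch reusing the paths from the example above):

```shell
adb shell am start -S \
  -n org.tensorflow.lite.benchmark/.BenchmarkModelActivity \
  --es args '"--graph=/data/local/tmp/your_model.tflite \
              --use_gpu=true"'
```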
Depending on the device you are using, some of these options may not be
available or may have no effect. Refer to
[parameters](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/benchmark#parameters)
for more performance parameters that you can run with the benchmark app.

View the results using the `logcat` command:

```shell
adb logcat | grep "Average inference"
```

The benchmark results are reported as:

```
... tflite : Average inference timings in us: Warmup: 91471, Init: 4108, Inference: 80660.1
```

### Native benchmark binary

The benchmark tool is also provided as a native binary, `benchmark_model`. You
can execute this tool from a shell command line on Linux, Mac, embedded devices
and Android devices.

#### Download or build the binary

Download the nightly pre-built native command-line binaries by following the
links below:

*   [linux_x86-64](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/linux_x86-64_benchmark_model)
*   [linux_aarch64](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/linux_aarch64_benchmark_model)
*   [linux_arm](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/linux_arm_benchmark_model)
*   [android_aarch64](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/android_aarch64_benchmark_model)
*   [android_arm](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/android_arm_benchmark_model)

For nightly pre-built binaries that support
[TF ops](https://www.tensorflow.org/lite/guide/ops_select) via the
[Flex delegate](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/delegates/flex),
use the links below:

*   [linux_x86-64](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/linux_x86-64_benchmark_model_plus_flex)
*   [linux_aarch64](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/linux_aarch64_benchmark_model_plus_flex)
*   [linux_arm](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/linux_arm_benchmark_model_plus_flex)
*   [android_aarch64](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/android_aarch64_benchmark_model_plus_flex)
*   [android_arm](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/android_arm_benchmark_model_plus_flex)

To benchmark with the
[TensorFlow Lite Hexagon delegate](https://www.tensorflow.org/lite/android/delegates/hexagon),
we have also pre-built the required `libhexagon_interface.so` files (see
[here](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/delegates/hexagon/README.md)
for details about this file). After downloading the file for the corresponding
platform from the links below, rename the file to `libhexagon_interface.so`.

*   [android_aarch64](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/android_aarch64_libhexagon_interface.so)
*   [android_arm](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/android_arm_libhexagon_interface.so)
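For example, a sketch of the end-to-end steps on an `aarch64` device. The
on-device library location and the `LD_LIBRARY_PATH` setting are assumptions,
and the Hexagon NN skeleton libraries required by the delegate must also be
present on the device, as described in the Hexagon delegate documentation:

```shell
# Rename the downloaded interface library (aarch64 assumed here).
mv android_aarch64_libhexagon_interface.so libhexagon_interface.so

# Push it next to the benchmark binary (location is an assumption).
adb push libhexagon_interface.so /data/local/tmp

# Run the benchmark with the Hexagon delegate enabled.
adb shell LD_LIBRARY_PATH=/data/local/tmp /data/local/tmp/benchmark_model \
  --graph=/data/local/tmp/your_model.tflite \
  --use_hexagon=true
```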
You can also build the native benchmark binary from
[source](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/benchmark)
on your computer.

```shell
bazel build -c opt //tensorflow/lite/tools/benchmark:benchmark_model
```

To build with the Android NDK toolchain, you need to set up the build
environment first by following this
[guide](../android/lite_build#set_up_build_environment_without_docker), or use
the docker image as described in this
[guide](../android/lite_build#set_up_build_environment_using_docker).

```shell
bazel build -c opt --config=android_arm64 \
  //tensorflow/lite/tools/benchmark:benchmark_model
```

Note: It is a valid approach to push and execute binaries directly on an
Android device for benchmarking, but it can result in subtle (but observable)
differences in performance relative to execution within an actual Android app.
In particular, Android's scheduler tailors behavior based on thread and process
priorities, which differ between a foreground Activity/Application and a
regular background binary executed via `adb shell ...`. This tailored behavior
is most evident when enabling multi-threaded CPU execution with TensorFlow
Lite. Therefore, the Android benchmark app is preferred for performance
measurement.

#### Run benchmark

To run benchmarks on your computer, execute the binary from the shell.

```shell
path/to/downloaded_or_built/benchmark_model \
  --graph=your_model.tflite \
  --num_threads=4
```

You can use the same set of
[parameters](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/benchmark#parameters)
as mentioned above with the native command-line binary.

#### Profiling model ops

The benchmark model binary also allows you to profile model ops and get the
execution times of each operator. To do this, pass the flag
`--enable_op_profiling=true` to `benchmark_model` during invocation. Details
are explained
[here](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/benchmark#profiling-model-operators).

### Native benchmark binary for multiple performance options in a single run

A convenient and simple C++ binary is also provided to
[benchmark multiple performance options](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/benchmark#benchmark-multiple-performance-options-in-a-single-run)
in a single run. This binary is built on top of the benchmark tool described
above, which can only benchmark a single performance option at a time. They
share the same build/install/run process, but the BUILD target name of this
binary is `benchmark_model_performance_options` and it takes some additional
parameters. An important parameter for this binary is:

`perf_options_list`: `string` (default='all') \
A comma-separated list of TFLite performance options to benchmark.
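For example, a minimal sketch of an invocation that benchmarks every available
performance option in one run (the `--graph` flag is shared with
`benchmark_model`, and `all` is the documented default):

```shell
path/to/downloaded_or_built/benchmark_model_performance_options \
  --graph=your_model.tflite \
  --perf_options_list=all
```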
You can get nightly pre-built binaries for this tool as listed below:

*   [linux_x86-64](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/linux_x86-64_benchmark_model_performance_options)
*   [linux_aarch64](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/linux_aarch64_benchmark_model_performance_options)
*   [linux_arm](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/linux_arm_benchmark_model_performance_options)
*   [android_aarch64](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/android_aarch64_benchmark_model_performance_options)
*   [android_arm](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/android_arm_benchmark_model_performance_options)

### iOS benchmark app

To run benchmarks on an iOS device, you need to build the app from
[source](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/benchmark/ios).
Put the TensorFlow Lite model file in the
[benchmark_data](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/benchmark/ios/TFLiteBenchmark/TFLiteBenchmark/benchmark_data)
directory of the source tree and modify the `benchmark_params.json` file. Those
files are packaged into the app, and the app reads data from the directory.
Visit the
[iOS benchmark app](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/benchmark/ios)
page for detailed instructions.

## Performance benchmarks for well known models

This section lists TensorFlow Lite performance benchmarks when running well
known models on some Android and iOS devices.

### Android performance benchmarks

These performance benchmark numbers were generated with the
[native benchmark binary](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/benchmark).

For Android benchmarks, the CPU affinity is set to use big cores on the device
to reduce variance (see
[details](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/benchmark#reducing-variance-between-runs-on-android)).

The commands below assume that models were downloaded and unzipped to the
`/data/local/tmp/tflite_models` directory. The benchmark binary is built using
[these instructions](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/benchmark#on-android)
and is assumed to be in the `/data/local/tmp` directory.

To run the benchmark:

```sh
adb shell /data/local/tmp/benchmark_model \
  --num_threads=4 \
  --graph=/data/local/tmp/tflite_models/${GRAPH} \
  --warmup_runs=1 \
  --num_runs=50
```

To run with the NNAPI delegate, set `--use_nnapi=true`. To run with the GPU
delegate, set `--use_gpu=true`.
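For example, the same run with the GPU delegate enabled:

```sh
adb shell /data/local/tmp/benchmark_model \
  --use_gpu=true \
  --graph=/data/local/tmp/tflite_models/${GRAPH} \
  --warmup_runs=1 \
  --num_runs=50
```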
The performance values below are measured on Android 10.

<table>
  <thead>
    <tr>
      <th>Model Name</th>
      <th>Device</th>
      <th>CPU, 4 threads</th>
      <th>GPU</th>
      <th>NNAPI</th>
    </tr>
  </thead>
  <tr>
    <td rowspan="2">
      <a href="https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_2018_08_02/mobilenet_v1_1.0_224.tgz">Mobilenet_1.0_224 (float)</a>
    </td>
    <td>Pixel 3</td>
    <td>23.9 ms</td>
    <td>6.45 ms</td>
    <td>13.8 ms</td>
  </tr>
  <tr>
    <td>Pixel 4</td>
    <td>14.0 ms</td>
    <td>9.0 ms</td>
    <td>14.8 ms</td>
  </tr>
  <tr>
    <td rowspan="2">
      <a href="https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_2018_08_02/mobilenet_v1_1.0_224_quant.tgz">Mobilenet_1.0_224 (quant)</a>
    </td>
    <td>Pixel 3</td>
    <td>13.4 ms</td>
    <td>---</td>
    <td>6.0 ms</td>
  </tr>
  <tr>
    <td>Pixel 4</td>
    <td>5.0 ms</td>
    <td>---</td>
    <td>3.2 ms</td>
  </tr>
  <tr>
    <td rowspan="2">
      <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/nasnet_mobile_2018_04_27.tgz">NASNet mobile</a>
    </td>
    <td>Pixel 3</td>
    <td>56 ms</td>
    <td>---</td>
    <td>102 ms</td>
  </tr>
  <tr>
    <td>Pixel 4</td>
    <td>34.5 ms</td>
    <td>---</td>
    <td>99.0 ms</td>
  </tr>
  <tr>
    <td rowspan="2">
      <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/squeezenet_2018_04_27.tgz">SqueezeNet</a>
    </td>
    <td>Pixel 3</td>
    <td>35.8 ms</td>
    <td>9.5 ms</td>
    <td>18.5 ms</td>
  </tr>
  <tr>
    <td>Pixel 4</td>
    <td>23.9 ms</td>
    <td>11.1 ms</td>
    <td>19.0 ms</td>
  </tr>
  <tr>
    <td rowspan="2">
      <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/inception_resnet_v2_2018_04_27.tgz">Inception_ResNet_V2</a>
    </td>
    <td>Pixel 3</td>
    <td>422 ms</td>
    <td>99.8 ms</td>
    <td>201 ms</td>
  </tr>
  <tr>
    <td>Pixel 4</td>
    <td>272.6 ms</td>
    <td>87.2 ms</td>
    <td>171.1 ms</td>
  </tr>
  <tr>
    <td rowspan="2">
      <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/inception_v4_2018_04_27.tgz">Inception_V4</a>
    </td>
    <td>Pixel 3</td>
    <td>486 ms</td>
    <td>93 ms</td>
    <td>292 ms</td>
  </tr>
  <tr>
    <td>Pixel 4</td>
    <td>324.1 ms</td>
    <td>97.6 ms</td>
    <td>186.9 ms</td>
  </tr>
</table>

### iOS performance benchmarks

These performance benchmark numbers were generated with the
[iOS benchmark app](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/benchmark/ios).

To run the iOS benchmarks, the benchmark app was modified to include the
appropriate model, and `benchmark_params.json` was modified to set
`num_threads` to 2. To use the GPU delegate, the `"use_gpu" : "1"` and
`"gpu_wait_type" : "aggressive"` options were also added to
`benchmark_params.json`.
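For illustration, a minimal sketch of the `benchmark_params.json` entries
described above. The key names are quoted from this page, but the exact value
formats are an assumption, and the actual file in the source tree contains
additional fields:

```json
{
  "num_threads" : "2",
  "use_gpu" : "1",
  "gpu_wait_type" : "aggressive"
}
```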
<table>
  <thead>
    <tr>
      <th>Model Name</th>
      <th>Device</th>
      <th>CPU, 2 threads</th>
      <th>GPU</th>
    </tr>
  </thead>
  <tr>
    <td>
      <a href="https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_2018_08_02/mobilenet_v1_1.0_224.tgz">Mobilenet_1.0_224 (float)</a>
    </td>
    <td>iPhone XS</td>
    <td>14.8 ms</td>
    <td>3.4 ms</td>
  </tr>
  <tr>
    <td>
      <a href="https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_2018_08_02/mobilenet_v1_1.0_224_quant.tgz">Mobilenet_1.0_224 (quant)</a>
    </td>
    <td>iPhone XS</td>
    <td>11 ms</td>
    <td>---</td>
  </tr>
  <tr>
    <td>
      <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/nasnet_mobile_2018_04_27.tgz">NASNet mobile</a>
    </td>
    <td>iPhone XS</td>
    <td>30.4 ms</td>
    <td>---</td>
  </tr>
  <tr>
    <td>
      <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/squeezenet_2018_04_27.tgz">SqueezeNet</a>
    </td>
    <td>iPhone XS</td>
    <td>21.1 ms</td>
    <td>15.5 ms</td>
  </tr>
  <tr>
    <td>
      <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/inception_resnet_v2_2018_04_27.tgz">Inception_ResNet_V2</a>
    </td>
    <td>iPhone XS</td>
    <td>261.1 ms</td>
    <td>45.7 ms</td>
  </tr>
  <tr>
    <td>
      <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/inception_v4_2018_04_27.tgz">Inception_V4</a>
    </td>
    <td>iPhone XS</td>
    <td>309 ms</td>
    <td>54.4 ms</td>
  </tr>
</table>

## Trace TensorFlow Lite internals

### Trace TensorFlow Lite internals in Android

Note: This feature is available from TensorFlow Lite v2.4.

Internal events from the TensorFlow Lite interpreter of an Android app can be
captured by
[Android tracing tools](https://developer.android.com/topic/performance/tracing).
They are the same events as those of the Android
[Trace](https://developer.android.com/reference/android/os/Trace) API, so the
captured events from Java/Kotlin code are seen together with TensorFlow Lite
internal events.

Some examples of events are:

*   Operator invocation
*   Graph modification by delegate
*   Tensor allocation

Among the different options for capturing traces, this guide covers the Android
Studio CPU Profiler and the System Tracing app. Refer to the
[Perfetto command-line tool](https://developer.android.com/studio/command-line/perfetto)
or the
[Systrace command-line tool](https://developer.android.com/topic/performance/tracing/command-line)
for other options.

#### Adding trace events in Java code

This is a code snippet from the
[Image Classification](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/android)
example app. The TensorFlow Lite interpreter runs in the
`recognizeImage/runInference` section. This step is optional, but it is useful
to help notice where the inference call is made.

```java
  Trace.beginSection("recognizeImage");
  ...
  // Runs the inference call.
  Trace.beginSection("runInference");
  tflite.run(inputImageBuffer.getBuffer(), outputProbabilityBuffer.getBuffer().rewind());
  Trace.endSection();
  ...
  Trace.endSection();
```
#### Enable TensorFlow Lite tracing

To enable TensorFlow Lite tracing, set the Android system property
`debug.tflite.trace` to 1 before starting the Android app.

```shell
adb shell setprop debug.tflite.trace 1
```

If this property has been set when the TensorFlow Lite interpreter is
initialized, key events (e.g., operator invocation) from the interpreter will
be traced.

After you have captured all the traces, disable tracing by setting the property
value to 0.

```shell
adb shell setprop debug.tflite.trace 0
```

#### Android Studio CPU Profiler

Capture traces with the
[Android Studio CPU Profiler](https://developer.android.com/studio/profile/cpu-profiler)
by following the steps below:

1.  Select **Run > Profile 'app'** from the top menus.

2.  Click anywhere in the CPU timeline when the Profiler window appears.

3.  Select 'Trace System Calls' among the CPU profiling modes.

    ![Select 'Trace System Calls'](images/as_select_profiling_mode.png)

4.  Press the 'Record' button.

5.  Press the 'Stop' button.

6.  Investigate the trace result.

    ![Android Studio trace](images/as_traces.png)

In this example, you can see the hierarchy of events in a thread, statistics
for each operator's time, and the data flow of the whole app among threads.

#### System Tracing app

Capture traces without Android Studio by following the steps detailed in the
[System Tracing app](https://developer.android.com/topic/performance/tracing/on-device)
guide.

In this example, the same TFLite events were captured and saved to the Perfetto
or Systrace format depending on the version of the Android device. The captured
trace files can be opened in the [Perfetto UI](https://ui.perfetto.dev/#!/).

![Perfetto trace](images/perfetto_traces.png)

### Trace TensorFlow Lite internals in iOS

Note: This feature is available from TensorFlow Lite v2.5.

Internal events from the TensorFlow Lite interpreter of an iOS app can be
captured by the
[Instruments](https://developer.apple.com/library/archive/documentation/ToolsLanguages/Conceptual/Xcode_Overview/MeasuringPerformance.html#//apple_ref/doc/uid/TP40010215-CH60-SW1)
tool included with Xcode. They are the iOS
[signpost](https://developer.apple.com/documentation/os/logging/recording_performance_data)
events, so the captured events from Swift/Objective-C code are seen together
with TensorFlow Lite internal events.

Some examples of events are:

*   Operator invocation
*   Graph modification by delegate
*   Tensor allocation

#### Enable TensorFlow Lite tracing

Set the environment variable `debug.tflite.trace` by following the steps below:

1.  Select **Product > Scheme > Edit Scheme...** from the top menus of Xcode.

2.  Click 'Profile' in the left pane.

3.  Deselect the 'Use the Run action's arguments and environment variables'
    checkbox.

4.  Add `debug.tflite.trace` under the 'Environment Variables' section.

    ![Set environment variable](images/xcode_profile_environment.png)

If you want to exclude TensorFlow Lite events when profiling the iOS app,
disable tracing by removing the environment variable.

#### Xcode Instruments

Capture traces by following the steps below:

1.  Select **Product > Profile** from the top menus of Xcode.

2.  Click **Logging** among the profiling templates when the Instruments tool
    launches.

3.  Press the 'Start' button.

4.  Press the 'Stop' button.

5.  Click 'os_signpost' to expand the OS Logging subsystem items.

6.  Click the 'org.tensorflow.lite' OS Logging subsystem.

7.  Investigate the trace result.
![iOS Instruments trace](images/instruments_traces.png)

In this example, you can see the hierarchy of events and the statistics for
each operator's time.

### Using the tracing data

The tracing data allows you to identify performance bottlenecks.

Here are some examples of insights that you can get from the profiler and
potential solutions to improve performance:

*   If the number of available CPU cores is smaller than the number of
    inference threads, then the CPU scheduling overhead can lead to subpar
    performance. You can reschedule other CPU-intensive tasks in your
    application to avoid overlapping with your model inference, or tweak the
    number of interpreter threads (see the sketch after this list).
*   If the operators are not fully delegated, then some parts of the model
    graph are executed on the CPU rather than on the expected hardware
    accelerator. You can substitute the unsupported operators with similar
    supported operators.
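As a sketch of the thread-count tuning mentioned above, you can sweep
`--num_threads` with the native benchmark binary and compare the reported
steady-state inference times, assuming the binary and model paths used in the
Android benchmark section earlier:

```shell
# Benchmark the model with several thread counts and compare the
# reported steady-state inference times.
for t in 1 2 4 8; do
  echo "num_threads=$t"
  adb shell /data/local/tmp/benchmark_model \
    --graph=/data/local/tmp/tflite_models/${GRAPH} \
    --num_threads=$t \
    --warmup_runs=1 \
    --num_runs=50
done
```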