# Performance measurement

## Benchmark tools

TensorFlow Lite benchmark tools currently measure and calculate statistics for
the following important performance metrics:

*   Initialization time
*   Inference time of warmup state
*   Inference time of steady state
*   Memory usage during initialization time
*   Overall memory usage

The benchmark tools are available as benchmark apps for Android and iOS and as
native command-line binaries, and they all share the same core performance
measurement logic. Note that the available options and output formats are
slightly different due to differences in the runtime environment.

### Android benchmark app

There are two options for using the benchmark tool with Android. One is a
[native benchmark binary](#native-benchmark-binary) and the other is an Android
benchmark app, which is a better gauge of how the model would perform in an
app. Either way, the numbers from the benchmark tool will still differ slightly
from those obtained when running inference with the model in the actual app.

This Android benchmark app has no UI. Install and run it by using the `adb`
command and retrieve results by using the `adb logcat` command.

#### Download or build the app

Download the nightly pre-built Android benchmark apps using the links below:

*   [android_aarch64](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/android_aarch64_benchmark_model.apk)
*   [android_arm](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/android_arm_benchmark_model.apk)

For Android benchmark apps that support [TF ops](https://www.tensorflow.org/lite/guide/ops_select)
via the [Flex delegate](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/delegates/flex),
use the links below:

*   [android_aarch64](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/android_aarch64_benchmark_model_plus_flex.apk)
*   [android_arm](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/android_arm_benchmark_model_plus_flex.apk)

You can also build the app from source by following these
[instructions](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/benchmark/android).

Note: You must build the app from source if you want to run the Android
benchmark APK on an x86 CPU or with the Hexagon delegate, or if your model
contains [select TF operators](../guide/ops_select) or
[custom operators](../guide/ops_custom).

#### Prepare benchmark

Before running the benchmark app, install the app and push the model file to the
device as follows:

```shell
adb install -r -d -g android_aarch64_benchmark_model.apk
adb push your_model.tflite /data/local/tmp
```

#### Run benchmark

```shell
adb shell am start -S \
  -n org.tensorflow.lite.benchmark/.BenchmarkModelActivity \
  --es args '"--graph=/data/local/tmp/your_model.tflite \
              --num_threads=4"'
```

`graph` is a required parameter.

*   `graph`: `string` \
    The path to the TFLite model file.

You can specify more optional parameters for running the benchmark.

*   `num_threads`: `int` (default=1) \
    The number of threads to use for running the TFLite interpreter.
*   `use_gpu`: `bool` (default=false) \
    Use the [GPU delegate](gpu).
*   `use_nnapi`: `bool` (default=false) \
    Use the [NNAPI delegate](nnapi).
*   `use_xnnpack`: `bool` (default=false) \
    Use the
    [XNNPACK delegate](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/delegates/xnnpack).
*   `use_hexagon`: `bool` (default=false) \
    Use the [Hexagon delegate](hexagon_delegate).

Depending on the device you are using, some of these options may not be
available or may have no effect. Refer to
[parameters](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/benchmark#parameters)
for more performance parameters that you can run with the benchmark app.

View the results using the `logcat` command:

```shell
adb logcat | grep "Average inference"
```

The benchmark results are reported as:

```
... tflite  : Average inference timings in us: Warmup: 91471, Init: 4108, Inference: 80660.1
```

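The timings are reported in microseconds. If you want the steady-state number in milliseconds, you can extract and convert it directly from a captured line; the following is a minimal POSIX-shell sketch that uses the sample line above.

```shell
# Extract the steady-state inference time from a captured logcat line and
# convert it from microseconds to milliseconds (sample line from above).
line='... tflite  : Average inference timings in us: Warmup: 91471, Init: 4108, Inference: 80660.1'
us=${line##*Inference: }
ms=$(awk -v us="$us" 'BEGIN { printf "%.1f", us / 1000 }')
echo "Average inference: $us us ($ms ms)"
```

In practice you would pipe `adb logcat` output into such a filter instead of hard-coding the line.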
### Native benchmark binary

The benchmark tool is also provided as a native binary, `benchmark_model`. You
can execute this tool from a shell command line on Linux, macOS, embedded
devices, and Android devices.

#### Download or build the binary

Download the nightly pre-built native command-line binaries by following the
links below:

*   [linux_x86-64](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/linux_x86-64_benchmark_model)
*   [linux_aarch64](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/linux_aarch64_benchmark_model)
*   [linux_arm](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/linux_arm_benchmark_model)
*   [android_aarch64](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/android_aarch64_benchmark_model)
*   [android_arm](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/android_arm_benchmark_model)

For nightly pre-built binaries that support [TF ops](https://www.tensorflow.org/lite/guide/ops_select)
via the [Flex delegate](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/delegates/flex),
use the links below:

*   [linux_x86-64](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/linux_x86-64_benchmark_model_plus_flex)
*   [linux_aarch64](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/linux_aarch64_benchmark_model_plus_flex)
*   [linux_arm](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/linux_arm_benchmark_model_plus_flex)
*   [android_aarch64](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/android_aarch64_benchmark_model_plus_flex)
*   [android_arm](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/android_arm_benchmark_model_plus_flex)

To benchmark with the [TensorFlow Lite Hexagon delegate](https://www.tensorflow.org/lite/android/delegates/hexagon),
the required `libhexagon_interface.so` files have also been pre-built (see [here](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/delegates/hexagon/README.md)
for details about this file). After downloading the file for your platform from
the links below, rename it to `libhexagon_interface.so`.

*   [android_aarch64](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/android_aarch64_libhexagon_interface.so)
*   [android_arm](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/android_arm_libhexagon_interface.so)

You can also build the native benchmark binary from
[source](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/benchmark)
on your computer.

```shell
bazel build -c opt //tensorflow/lite/tools/benchmark:benchmark_model
```

To build with the Android NDK toolchain, you first need to set up the build
environment by following this
[guide](../android/lite_build#set_up_build_environment_without_docker), or use
the Docker image as described in this
[guide](../android/lite_build#set_up_build_environment_using_docker).

```shell
bazel build -c opt --config=android_arm64 \
  //tensorflow/lite/tools/benchmark:benchmark_model
```

Note: It is a valid approach to push and execute binaries directly on an Android
device for benchmarking, but it can result in subtle (but observable)
differences in performance relative to execution within an actual Android app.
In particular, Android's scheduler tailors behavior based on thread and process
priorities, which differ between a foreground Activity/Application and a regular
background binary executed via `adb shell ...`. This tailored behavior is most
evident when enabling multi-threaded CPU execution with TensorFlow Lite.
Therefore, the Android benchmark app is preferred for performance measurement.

#### Run benchmark

To run benchmarks on your computer, execute the binary from the shell.

```shell
path/to/downloaded_or_built/benchmark_model \
  --graph=your_model.tflite \
  --num_threads=4
```

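When comparing configurations, a small wrapper loop keeps the runs consistent. This is only a sketch: `BENCHMARK_BIN` and `MODEL` are hypothetical placeholders, and `BENCHMARK_BIN` defaults to `echo` so the loop merely prints the commands until you point it at a real binary.

```shell
# Sweep several thread counts with the native binary. BENCHMARK_BIN and
# MODEL are placeholders; the `echo` default only previews the commands.
BENCHMARK_BIN=${BENCHMARK_BIN:-echo}
MODEL=${MODEL:-your_model.tflite}
for t in 1 2 4; do
  echo "== num_threads=$t =="
  $BENCHMARK_BIN --graph="$MODEL" --num_threads="$t"
done
```

Run it as `BENCHMARK_BIN=path/to/benchmark_model MODEL=your_model.tflite sh sweep.sh` (hypothetical file name) once you have the binary.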
You can use the same set of
[parameters](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/benchmark#parameters)
as mentioned above with the native command-line binary.

#### Profiling model ops

The benchmark model binary also allows you to profile model ops and get the
execution times of each operator. To do this, pass the flag
`--enable_op_profiling=true` to `benchmark_model` during invocation. Details are
explained
[here](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/benchmark#profiling-model-operators).

### Native benchmark binary for multiple performance options in a single run

A convenient and simple C++ binary is also provided to
[benchmark multiple performance options](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/benchmark#benchmark-multiple-performance-options-in-a-single-run)
in a single run. This binary is built on the benchmark tool described above,
which can only benchmark a single performance option at a time. They share the
same build/install/run process, but the BUILD target name of this binary is
`benchmark_model_performance_options` and it takes some additional parameters.
An important parameter for this binary is:

*   `perf_options_list`: `string` (default='all') \
    A comma-separated list of TFLite performance options to benchmark.

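For example, assuming `cpu` and `gpu` are among the supported option names, a run restricted to those two paths would look like the following. The leading `echo` only previews the invocation; drop it to actually execute the downloaded binary.

```shell
# Preview a restricted run (remove `echo` to execute for real). The option
# names cpu/gpu are assumed from the tool's documentation linked above.
echo ./benchmark_model_performance_options \
  --graph=your_model.tflite \
  --perf_options_list="cpu,gpu"
```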
You can get nightly pre-built binaries for this tool as listed below:

*   [linux_x86-64](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/linux_x86-64_benchmark_model_performance_options)
*   [linux_aarch64](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/linux_aarch64_benchmark_model_performance_options)
*   [linux_arm](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/linux_arm_benchmark_model_performance_options)
*   [android_aarch64](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/android_aarch64_benchmark_model_performance_options)
*   [android_arm](https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/android_arm_benchmark_model_performance_options)

### iOS benchmark app

To run benchmarks on an iOS device, you need to build the app from
[source](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/benchmark/ios).
Put the TensorFlow Lite model file in the
[benchmark_data](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/benchmark/ios/TFLiteBenchmark/TFLiteBenchmark/benchmark_data)
directory of the source tree and modify the `benchmark_params.json` file. Those
files are packaged into the app, and the app reads data from the directory.
Visit the
[iOS benchmark app](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/benchmark/ios)
for detailed instructions.

## Performance benchmarks for well-known models

This section lists TensorFlow Lite performance benchmarks for running
well-known models on some Android and iOS devices.

### Android performance benchmarks

These performance benchmark numbers were generated with the
[native benchmark binary](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/benchmark).

For Android benchmarks, the CPU affinity is set to use big cores on the device
to reduce variance (see
[details](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/benchmark#reducing-variance-between-runs-on-android)).

It assumes that models were downloaded and unzipped to the
`/data/local/tmp/tflite_models` directory. The benchmark binary is built using
[these instructions](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/benchmark#on-android)
and is assumed to be in the `/data/local/tmp` directory.

To run the benchmark:

```sh
adb shell /data/local/tmp/benchmark_model \
  --num_threads=4 \
  --graph=/data/local/tmp/tflite_models/${GRAPH} \
  --warmup_runs=1 \
  --num_runs=50
```

To run with the NNAPI delegate, set `--use_nnapi=true`. To run with the GPU
delegate, set `--use_gpu=true`.

The performance values below are measured on Android 10.

<table>
  <thead>
    <tr>
      <th>Model Name</th>
      <th>Device</th>
      <th>CPU, 4 threads</th>
      <th>GPU</th>
      <th>NNAPI</th>
    </tr>
  </thead>
  <tr>
    <td rowspan="2">
      <a href="https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_2018_08_02/mobilenet_v1_1.0_224.tgz">Mobilenet_1.0_224 (float)</a>
    </td>
    <td>Pixel 3</td>
    <td>23.9 ms</td>
    <td>6.45 ms</td>
    <td>13.8 ms</td>
  </tr>
  <tr>
    <td>Pixel 4</td>
    <td>14.0 ms</td>
    <td>9.0 ms</td>
    <td>14.8 ms</td>
  </tr>
  <tr>
    <td rowspan="2">
      <a href="https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_2018_08_02/mobilenet_v1_1.0_224_quant.tgz">Mobilenet_1.0_224 (quant)</a>
    </td>
    <td>Pixel 3</td>
    <td>13.4 ms</td>
    <td>---</td>
    <td>6.0 ms</td>
  </tr>
  <tr>
    <td>Pixel 4</td>
    <td>5.0 ms</td>
    <td>---</td>
    <td>3.2 ms</td>
  </tr>
  <tr>
    <td rowspan="2">
      <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/nasnet_mobile_2018_04_27.tgz">NASNet mobile</a>
    </td>
    <td>Pixel 3</td>
    <td>56 ms</td>
    <td>---</td>
    <td>102 ms</td>
  </tr>
  <tr>
    <td>Pixel 4</td>
    <td>34.5 ms</td>
    <td>---</td>
    <td>99.0 ms</td>
  </tr>
  <tr>
    <td rowspan="2">
      <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/squeezenet_2018_04_27.tgz">SqueezeNet</a>
    </td>
    <td>Pixel 3</td>
    <td>35.8 ms</td>
    <td>9.5 ms</td>
    <td>18.5 ms</td>
  </tr>
  <tr>
    <td>Pixel 4</td>
    <td>23.9 ms</td>
    <td>11.1 ms</td>
    <td>19.0 ms</td>
  </tr>
  <tr>
    <td rowspan="2">
      <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/inception_resnet_v2_2018_04_27.tgz">Inception_ResNet_V2</a>
    </td>
    <td>Pixel 3</td>
    <td>422 ms</td>
    <td>99.8 ms</td>
    <td>201 ms</td>
  </tr>
  <tr>
    <td>Pixel 4</td>
    <td>272.6 ms</td>
    <td>87.2 ms</td>
    <td>171.1 ms</td>
  </tr>
  <tr>
    <td rowspan="2">
      <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/inception_v4_2018_04_27.tgz">Inception_V4</a>
    </td>
    <td>Pixel 3</td>
    <td>486 ms</td>
    <td>93 ms</td>
    <td>292 ms</td>
  </tr>
  <tr>
    <td>Pixel 4</td>
    <td>324.1 ms</td>
    <td>97.6 ms</td>
    <td>186.9 ms</td>
  </tr>
</table>

### iOS performance benchmarks

These performance benchmark numbers were generated with the
[iOS benchmark app](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/benchmark/ios).

To run the iOS benchmarks, the benchmark app was modified to include the
appropriate model, and `benchmark_params.json` was modified to set
`num_threads` to 2. To use the GPU delegate, the `"use_gpu" : "1"` and
`"gpu_wait_type" : "aggressive"` options were also added to
`benchmark_params.json`.

<table>
  <thead>
    <tr>
      <th>Model Name</th>
      <th>Device</th>
      <th>CPU, 2 threads</th>
      <th>GPU</th>
    </tr>
  </thead>
  <tr>
    <td>
      <a href="https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_2018_08_02/mobilenet_v1_1.0_224.tgz">Mobilenet_1.0_224 (float)</a>
    </td>
    <td>iPhone XS</td>
    <td>14.8 ms</td>
    <td>3.4 ms</td>
  </tr>
  <tr>
    <td>
      <a href="https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_2018_08_02/mobilenet_v1_1.0_224_quant.tgz">Mobilenet_1.0_224 (quant)</a>
    </td>
    <td>iPhone XS</td>
    <td>11 ms</td>
    <td>---</td>
  </tr>
  <tr>
    <td>
      <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/nasnet_mobile_2018_04_27.tgz">NASNet mobile</a>
    </td>
    <td>iPhone XS</td>
    <td>30.4 ms</td>
    <td>---</td>
  </tr>
  <tr>
    <td>
      <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/squeezenet_2018_04_27.tgz">SqueezeNet</a>
    </td>
    <td>iPhone XS</td>
    <td>21.1 ms</td>
    <td>15.5 ms</td>
  </tr>
  <tr>
    <td>
      <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/inception_resnet_v2_2018_04_27.tgz">Inception_ResNet_V2</a>
    </td>
    <td>iPhone XS</td>
    <td>261.1 ms</td>
    <td>45.7 ms</td>
  </tr>
  <tr>
    <td>
      <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/inception_v4_2018_04_27.tgz">Inception_V4</a>
    </td>
    <td>iPhone XS</td>
    <td>309 ms</td>
    <td>54.4 ms</td>
  </tr>
</table>

## Trace TensorFlow Lite internals

### Trace TensorFlow Lite internals in Android

Note: This feature is available from TensorFlow Lite v2.4.

Internal events from the TensorFlow Lite interpreter of an Android app can be
captured by
[Android tracing tools](https://developer.android.com/topic/performance/tracing).
They are the same events as those of the Android
[Trace](https://developer.android.com/reference/android/os/Trace) API, so the
captured events from Java/Kotlin code are seen together with TensorFlow Lite
internal events.

Some examples of events are:

*   Operator invocation
*   Graph modification by delegate
*   Tensor allocation

Among the different options for capturing traces, this guide covers the Android
Studio CPU Profiler and the System Tracing app. Refer to the
[Perfetto command-line tool](https://developer.android.com/studio/command-line/perfetto)
or the
[Systrace command-line tool](https://developer.android.com/topic/performance/tracing/command-line)
for other options.

#### Adding trace events in Java code

This is a code snippet from the
[Image Classification](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/android)
example app. The TensorFlow Lite interpreter runs in the
`recognizeImage/runInference` section. This step is optional, but it is useful
to help identify where the inference call is made.

```java
  Trace.beginSection("recognizeImage");
  ...
  // Runs the inference call.
  Trace.beginSection("runInference");
  tflite.run(inputImageBuffer.getBuffer(), outputProbabilityBuffer.getBuffer().rewind());
  Trace.endSection();
  ...
  Trace.endSection();
```

#### Enable TensorFlow Lite tracing

To enable TensorFlow Lite tracing, set the Android system property
`debug.tflite.trace` to 1 before starting the Android app.

```shell
adb shell setprop debug.tflite.trace 1
```

If this property is set when the TensorFlow Lite interpreter is initialized,
key events (e.g., operator invocation) from the interpreter will be traced.

After you have captured all the traces, disable tracing by setting the property
value to 0.

```shell
adb shell setprop debug.tflite.trace 0
```

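The two `setprop` calls are easy to wrap into one capture session. In this sketch, `ADB` defaults to `echo adb` so the sequence can be previewed without a device attached (an illustration-only assumption); set `ADB=adb` to run it against a real device.

```shell
# Preview (or run) a tracing session around an app launch. ADB defaults to
# `echo adb`, which only prints the commands instead of executing them.
ADB=${ADB:-echo adb}
$ADB shell setprop debug.tflite.trace 1
# ... start the app and exercise inference while the property is set ...
$ADB shell setprop debug.tflite.trace 0
```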
#### Android Studio CPU Profiler

Capture traces with the
[Android Studio CPU Profiler](https://developer.android.com/studio/profile/cpu-profiler)
by following the steps below:

1.  Select **Run > Profile 'app'** from the top menus.

2.  Click anywhere in the CPU timeline when the Profiler window appears.

3.  Select 'Trace System Calls' among the CPU profiling modes.

    ![Select 'Trace System Calls'](images/as_select_profiling_mode.png)

4.  Press the 'Record' button.

5.  Press the 'Stop' button.

6.  Investigate the trace result.

    ![Android Studio trace](images/as_traces.png)

In this example, you can see the hierarchy of events in a thread, statistics
for each operator's time, and the data flow of the whole app among threads.

#### System Tracing app

Capture traces without Android Studio by following the steps detailed in the
[System Tracing app](https://developer.android.com/topic/performance/tracing/on-device)
guide.

In this example, the same TFLite events were captured and saved to the Perfetto
or Systrace format depending on the Android version of the device. The captured
trace files can be opened in the [Perfetto UI](https://ui.perfetto.dev/#!/).

![Perfetto trace](images/perfetto_traces.png)

### Trace TensorFlow Lite internals in iOS

Note: This feature is available from TensorFlow Lite v2.5.

Internal events from the TensorFlow Lite interpreter of an iOS app can be
captured by the
[Instruments](https://developer.apple.com/library/archive/documentation/ToolsLanguages/Conceptual/Xcode_Overview/MeasuringPerformance.html#//apple_ref/doc/uid/TP40010215-CH60-SW1)
tool included with Xcode. They are iOS
[signpost](https://developer.apple.com/documentation/os/logging/recording_performance_data)
events, so the captured events from Swift/Objective-C code are seen together
with TensorFlow Lite internal events.

Some examples of events are:

*   Operator invocation
*   Graph modification by delegate
*   Tensor allocation

#### Enable TensorFlow Lite tracing

Set the environment variable `debug.tflite.trace` by following the steps below:

1.  Select **Product > Scheme > Edit Scheme...** from the top menus of Xcode.

2.  Click 'Profile' in the left pane.

3.  Deselect the 'Use the Run action's arguments and environment variables'
    checkbox.

4.  Add `debug.tflite.trace` under the 'Environment Variables' section.

    ![Set environment variable](images/xcode_profile_environment.png)

If you want to exclude TensorFlow Lite events when profiling the iOS app,
disable tracing by removing the environment variable.

#### Xcode Instruments

Capture traces by following the steps below:

1.  Select **Product > Profile** from the top menus of Xcode.

2.  Click **Logging** among the profiling templates when the Instruments tool
    launches.

3.  Press the 'Start' button.

4.  Press the 'Stop' button.

5.  Click 'os_signpost' to expand the OS Logging subsystem items.

6.  Click the 'org.tensorflow.lite' OS Logging subsystem.

7.  Investigate the trace result.

    ![Xcode Instruments trace](images/xcode_traces.png)

In this example, you can see the hierarchy of events and statistics for each
operator's time.

### Using the tracing data

The tracing data allows you to identify performance bottlenecks.

Here are some examples of insights that you can get from the profiler and
potential solutions to improve performance:

*   If the number of available CPU cores is smaller than the number of inference
    threads, then the CPU scheduling overhead can lead to subpar performance.
    You can reschedule other CPU-intensive tasks in your application to avoid
    overlapping with your model inference, or tweak the number of interpreter
    threads.
*   If the operators are not fully delegated, then some parts of the model graph
    are executed on the CPU rather than on the expected hardware accelerator.
    You can substitute the unsupported operators with similar supported
    operators.

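The first point above can be checked mechanically by comparing the configured thread count with the cores available on the machine that runs inference. A minimal sketch (assumes `nproc` or `getconf` is available; `NUM_THREADS` stands in for your `--num_threads` value):

```shell
# Warn when the interpreter is configured with more threads than CPU cores.
NUM_THREADS=4                       # stand-in for your --num_threads value
CORES=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)
if [ "$NUM_THREADS" -gt "$CORES" ]; then
  echo "warning: $NUM_THREADS threads > $CORES cores; expect scheduling overhead"
else
  echo "ok: $NUM_THREADS threads fit within $CORES cores"
fi
```

On an Android target, run the same check via `adb shell` so the core count reflects the device rather than your host.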