# hiperf

<!--Kit: Performance Analysis Kit-->
<!--Subsystem: HiviewDFX-->
<!--Owner: @leiguangyu-->
<!--Designer: @Maplestroy-->
<!--Tester: @gcw_KuLfPSbe-->
<!--Adviser: @foryourself-->

hiperf is a command-line tool that integrates multiple performance analysis capabilities, enabling you to identify system bottlenecks, locate software hotspots, optimize code efficiency, and collect and analyze runtime performance data.

For most analysis tasks, a graphical frontend such as [DevEco Studio](https://developer.huawei.com/consumer/en/doc/harmonyos-guides/ide-insight-session-time) or [SmartPerf](https://gitee.com/openharmony/developtools_smartperf_host/blob/master/smartperf_host/ide/src/doc/md/quickstart_hiperf.md) is recommended. It collects the call stack of a function, shows the execution time of the function at each layer in the call stack, and displays the call chain information in a swimlane diagram. Use hiperf directly when you need to specify the event, sampling period, collection duration, or number of CPU cores. The **perf.data** file generated by hiperf can be opened in SmartPerf and displayed as a flame graph.

This topic describes how to use hiperf to perform performance analysis.

## Environment Setup

- The environment for OpenHarmony Device Connector (hdc) has been set up. For details, see [Environment Setup](hdc.md#environment-setup).

- The devices are properly connected and **hdc shell** is executed.

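The two prerequisites above can be checked from the host before **hdc shell** is opened. The following is a minimal sketch rather than part of hiperf or hdc: `require_tool` and `check_hiperf_env` are hypothetical helper names, and the sketch assumes that `hdc list targets` prints `[Empty]` when no device is connected.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the named tool is on PATH.
require_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: check that hdc exists and reports a connected device.
# Assumption: "hdc list targets" prints "[Empty]" when no device is attached.
check_hiperf_env() {
    require_tool hdc || { echo "hdc not found on PATH" >&2; return 1; }
    targets=$(hdc list targets)
    case "$targets" in
        ""|*"[Empty]"*) echo "no device connected" >&2; return 1 ;;
    esac
    echo "device(s): $targets"
}
```

Calling `check_hiperf_env` on the host returns a nonzero status if either prerequisite is missing, so it can guard a longer collection script.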

## Command Syntax

Run the **hiperf --help** command to list all hiperf commands, including **dump**, **list**, **record**, **report**, and **stat**.

```shell
$ hiperf --help
```


| Command| Description|
| -------- | -------- |
| --hilog | Records logs generated during program running to HiLog.|
| --logpath | Sets the save path of log files. You can set the output file path to **/data/local/tmp/** and customize the file name.|
| --logtag | Enables logs of a specified functionality.|
| --debug | Records **debug** logs.|
| --verbose | Records **verbose** logs.|
| --much | Records **much** logs.|
| --nodebug | Disables all logs.|
| --mixlog | Outputs logs to the CLI.|
| -h/--help | Displays the help information.|
| [dump](#dump)| Converts the performance data file (for example, **perf.data**) into a readable format.|
| [list](#list)| Displays the performance event types supported by the system.|
| [record](#record)| Collects performance data.|
| [report](#report)| Converts performance data into visualized data.|
| [stat](#stat)| Collects statistics on performance data.|

**Example**


```shell
$ hiperf --help
Usage: hiperf [options] command [args for command]
options:
        --debug                 show debug log, usage format: --debug [command] [args]
        --help                  show help
        --hilog                 use hilog not file to record log
        --logpath               log file name full path, usage format: --logpath [filepath] [command] [args]
        --logtag                enable log level for HILOG_TAG, usage format: --logtag <tag>[:level][,<tag>[:level]] [command] [args]
                                tag: Dump, Report, Record, Stat... level: D, V, M...
                                example: hiperf --verbose --logtag Record:D [command] [args]
        --mixlog                mix the log in output, usage format: --mixlog [command] [args]
        --much                  show extremely much debug log, usage format: --much [command] [args]
        --nodebug               disable debug log, usage format: --nodebug [command] [args]
        --verbose               show debug log, usage format: --verbose [command] [args]
        -h                      show help
command:
        dump:   Dump content of a perf data file, like perf.data
        help:   Show more help information for hiperf
        list:   List the supported event types.
        record: Collect performance sample information
        report: report sampling information from perf.data format file
        stat:   Collect performance counter information

See 'hiperf help [command]' for more information on a specific command.
```

## Common Commands


### Recording Performance Data Sampling


1. Sample the process **1234** for 10 seconds. Set the stack unwinding mode to **fp**, sampling frequency to **1000** times per second, event types to **hw-cpu-cycles** and **hw-instructions**, and save the sampling file to **/data/local/tmp/perf.data**.


```shell
$ hiperf record -p 1234 -s fp -f 1000 -d 10 -e hw-cpu-cycles,hw-instructions -o /data/local/tmp/perf.data
Profiling duration is 10.000 seconds.
Start Profiling...
Timeout exit (total 10335 ms)
Process and Saving data...
Hiperf is not running as root mode. Do not need load kernel syms
[ hiperf record: Captured 3.014 MB perf data. ]
[ Sample records: 1293, Non sample records: 855 ]
[ Sample lost: 0, Non sample lost: 0 ]
```


The collected data is saved as a **perf.data** file in binary format, which contains the sampling data, process information, symbol table, and function calls required for performance analysis. You can use the flame graph script to convert the sampling data into a flame graph to identify system performance bottlenecks, locate software hotspots, and optimize code efficiency.


2. Sample the application **com.example.insight_test_stage**. Set the sampling duration to **10s**, stack unwinding mode to **dwarf** (debug information table), sampling period to **1000**, event types to **hw-cpu-cycles** and **hw-instructions**, and use the default save path.


```shell
$ hiperf record --app com.example.insight_test_stage -d 10 -s dwarf --period 1000 -e hw-cpu-cycles,hw-instructions
Profiling duration is 10.000 seconds.
Start Profiling...
Timeout exit (total 10000 ms)
Process and Saving data...
Hiperf is not running as root mode. Do not need load kernel syms
[ hiperf record: Captured 0.296 MB perf data. ]
[ Sample records: 0, Non sample records: 2640 ]
[ Sample lost: 0, Non sample lost: 0 ]
```


The collected data is saved to the default path **/data/local/tmp/perf.data**.

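When the same sampling is repeated for different processes, the first command above can be generated from a few variables. The following sketch is illustrative only: `build_record_cmd` and `record_and_fetch` are hypothetical helper names, the option values are taken from the first example, and the sketch assumes it runs on the host, where `hdc file recv` copies the result from the device.

```shell
#!/bin/sh
# Hypothetical helper: print a "hiperf record" command line for one process.
# Arguments: <pid> <duration-seconds> <frequency> [output-path]
build_record_cmd() {
    pid=$1 duration=$2 freq=$3
    out=${4:-/data/local/tmp/perf.data}
    printf 'hiperf record -p %s -s fp -f %s -d %s -e hw-cpu-cycles,hw-instructions -o %s\n' \
        "$pid" "$freq" "$duration" "$out"
}

# Hypothetical helper: run the sampling on the device from the host, then
# copy the default output file back to the current directory.
record_and_fetch() {
    hdc shell "$(build_record_cmd "$1" "$2" "$3")" &&
        hdc file recv /data/local/tmp/perf.data ./perf.data
}
```

For example, `build_record_cmd 1234 10 1000` prints the command used in the first example, and `record_and_fetch 1234 10 1000` runs it and retrieves **perf.data**.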

### Collecting Performance Statistics


1. Count processes **1745** and **1910** for **10** seconds.


```
$ hiperf stat -d 10 -p 1745,1910
Profiling duration is 10.000 seconds.
Start Profiling...
Timeout exit (total 10000 ms)
                    count  name                           | comment                          | coverage
                  148,450  hw-branch-instructions         | 26.404 M/sec                     | (100%)
                   49,833  hw-branch-misses               | 33.568878 miss rate              | (100%)
                8,986,523  hw-cpu-cycles                  | 1.598409 GHz                     | (100%)
                1,283,596  hw-instructions                | 7.001053 cycles per instruction  | (100%)
                       63  sw-context-switches            | 11.206 K/sec                     | (100%)
                        0  sw-page-faults                 | 0.000 /sec                       | (100%)
                5,622,169  sw-task-clock                  | 0.000562 cpus used               | (100%)
```


2. Count processes **1745** and **1910** for **10** seconds, with event types set to **hw-cpu-cycles**, **hw-instructions**, and **sw-task-clock**, and a print interval of **3000** ms.


```
$ hiperf stat -d 10 -p 1745,1910 -e hw-cpu-cycles,hw-instructions,sw-task-clock -i 3000
Profiling duration is 10.000 seconds.
Start Profiling...
Report at 3000 ms (6999 ms left):
                    count  name                           | comment                          | coverage
                2,534,675  hw-cpu-cycles                  | 1.717114 GHz                     | (100%)
                  324,279  hw-instructions                | 7.816340 cycles per instruction  | (100%)
                1,476,125  sw-task-clock                  | 0.000492 cpus used               | (100%)
Report at 6000 ms (3999 ms left):
                    count  name                           | comment                          | coverage
                5,112,570  hw-cpu-cycles                  | 1.724259 GHz                     | (100%)
                  648,303  hw-instructions                | 7.886081 cycles per instruction  | (100%)
                2,965,083  sw-task-clock                  | 0.000494 cpus used               | (100%)
Report at 9000 ms (999 ms left):
                    count  name                           | comment                          | coverage
                7,870,422  hw-cpu-cycles                  | 1.724897 GHz                     | (100%)
                  994,407  hw-instructions                | 7.914689 cycles per instruction  | (100%)
                4,562,835  sw-task-clock                  | 0.000507 cpus used               | (100%)
Timeout exit (total 10000 ms)
```


3. Count process **1910** for **3** seconds, with event types set to **hw-cpu-cycles** and **hw-instructions**, and print detailed information.


```
$ hiperf stat -d 3 -p 1910 -e hw-cpu-cycles,hw-instructions --verbose
Profiling duration is 3.000 seconds.
Start Profiling...
Timeout exit (total 3000 ms)
hw-cpu-cycles id:1342(c-1:p1910) timeEnabled:133583 timeRunning:133583 value:255740
hw-cpu-cycles id:1343(c-1:p1988) timeEnabled:0 timeRunning:0 value:0
hw-cpu-cycles id:1344(c-1:p1989) timeEnabled:0 timeRunning:0 value:0
hw-cpu-cycles id:1345(c-1:p1990) timeEnabled:187833 timeRunning:187833 value:331425
...
hw-instructions id:1375(c-1:p1910) timeEnabled:133583 timeRunning:133583 value:36485
hw-instructions id:1376(c-1:p1988) timeEnabled:0 timeRunning:0 value:0
hw-instructions id:1377(c-1:p1989) timeEnabled:0 timeRunning:0 value:0
hw-instructions id:1378(c-1:p1990) timeEnabled:187833 timeRunning:187833 value:47816
...
                    count  name                           | comment                          | coverage
                  669,850  hw-cpu-cycles                  |                                  | (100%)
                   94,903  hw-instructions                | 7.058259 cycles per instruction  | (100%)
```

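The **cycles per instruction** value in the comment column is the **hw-cpu-cycles** count divided by the **hw-instructions** count. The following sketch recomputes it from saved **stat** output, which can be useful when only the raw counts are kept. `cycles_per_instruction` is a hypothetical helper name, and the sketch assumes the column layout shown in the sample outputs above (count first, event name second).

```shell
#!/bin/sh
# Hypothetical helper: read "hiperf stat" summary lines on stdin and print
# cycles per instruction (hw-cpu-cycles / hw-instructions) with six decimals.
# If several interval reports are present, the last one wins.
cycles_per_instruction() {
    awk '
        { gsub(/,/, "", $1) }                       # counts look like 8,986,523
        $2 == "hw-cpu-cycles"   { cycles = $1 + 0 }
        $2 == "hw-instructions" { instr  = $1 + 0 }
        END {
            if (instr > 0) printf "%.6f\n", cycles / instr
            else           print "n/a"
        }
    '
}
```

Fed with the output of the first example (8,986,523 cycles and 1,283,596 instructions), the sketch prints `7.001053`, which matches the value that hiperf reports in the comment column.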

## Debug-Type Applications


> **NOTE**
>
> The **hiperf record/stat -p [pid]** command should be used for applications signed by the debug certificate.
>
> Run the **hdc shell "bm dump -n bundlename | grep appProvisionType"** command to check whether the application specified in the command is a debug-type application. The expected output is **"appProvisionType": "debug"**.
>
> For example, run the following command to check the bundle name **com.example.myapplication**:
>
> ```shell
> hdc shell "bm dump -n com.example.myapplication | grep appProvisionType"
> ```
>
> If the application is a debug-type application, the following information is displayed:
>
> ```shell
> "appProvisionType": "debug",
> ```
>
> To build a debug-type application, you need to use a debug certificate for signature. For details about how to request and use the debug certificate, see [Requesting a Debug Certificate](https://developer.huawei.com/consumer/en/doc/app/agc-help-add-debugcert-0000001914263178).

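The check described in the note can be scripted so that a collection run stops early for a non-debug application. This is a minimal sketch: `is_debug_dump` and `is_debug_app` are hypothetical helper names, and the sketch assumes the **bm dump** output contains the `"appProvisionType": "debug"` line exactly as shown in the note.

```shell
#!/bin/sh
# Hypothetical helper: read "bm dump -n <bundle>" output on stdin and succeed
# only if it reports a debug provision type.
is_debug_dump() {
    grep -q '"appProvisionType": "debug"'
}

# Hypothetical wrapper: query the connected device for one bundle name.
is_debug_app() {
    hdc shell "bm dump -n $1" | is_debug_dump
}
```

For example, `is_debug_app com.example.myapplication || echo "not a debug-type application"` prints the message when the application is not signed with a debug certificate.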

## list

Displays the performance event types supported by the system, which can be used as parameters of the **-e** option in the **record** and **stat** commands.

**Parameters**

| Name| Description|
| -------- | -------- |
| -h/--help | Displays the help information.|
| hw | Lists the hardware events.<br>The following events are supported:<br>- hw-cpu-cycles<br>- hw-instructions<br>- hw-cache-references<br>- hw-cache-misses<br>- hw-branch-instructions<br>- hw-branch-misses<br>- hw-bus-cycles<br>- hw-stalled-cycles-frontend<br>- hw-stalled-cycles-backend |
| sw | Lists the software events.|
| tp | Lists the tracepoint events.|
| cache | Lists the hardware cache events.|
| raw | Lists original performance monitoring unit (PMU) events.|

**Example**

```
Usage: hiperf list [event type name]
```

Query the supported hardware event types.


```
$ hiperf list hw
event not support hw-ref-cpu-cycles

Supported events for hardware:
        hw-cpu-cycles
        hw-instructions
        hw-cache-references
        hw-cache-misses
        hw-branch-instructions
        hw-branch-misses
        hw-bus-cycles
        hw-stalled-cycles-frontend
        hw-stalled-cycles-backend
```

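Because the supported events differ between devices, it can be worth checking an event name against the **list** output before passing it to **-e**. The following sketch is one possible way to do that. `event_supported` is a hypothetical helper name, and the sketch assumes that supported events are printed one per line with leading whitespace, as in the sample output above.

```shell
#!/bin/sh
# Hypothetical helper: read "hiperf list <type>" output on stdin and succeed
# only if the given event name appears as a whole, indented entry.
# Lines such as "event not support hw-ref-cpu-cycles" do not match.
event_supported() {
    grep -Eq "^[[:space:]]+$1[[:space:]]*\$"
}
```

For example, `hdc shell "hiperf list hw" | event_supported hw-cpu-cycles` succeeds for the device in the sample output, while the same check for `hw-ref-cpu-cycles` fails.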

## record

Collects the performance data of a specified process or application, including the CPU cycles, number of instructions, and function calls, and saves the sampling data to a specified file (**/data/local/tmp/perf.data** by default).

**Parameters of the record command**

<!--RP1-->
| Parameter| Description|
| -------- | -------- |
| -h/--help | Displays the help information.|
| -c | Sets the ID of the CPU to collect its data.|
| --cpu-limit | Sets the maximum CPU usage during collection. The value ranges from 1 to 100. The default value is **25**.|
| -d | Sets the collection duration, in seconds. This parameter cannot be used together with **--control**.|
| -f | Sets the collection frequency. The default value is **4000** times per second. This parameter cannot be used together with **--period**.|
| --period | Sets the event collection period, that is, the number of events for each collection. This parameter cannot be used together with **-f**.|
| -e | Sets the event to collect. Multiple event types are supported; separate them with commas (,). You can run the **list** command to obtain the supported event types.|
| -g | Specifies the event groups to collect, which are separated by commas (,).|
| --no-inherit | Collects no subprocess data.|
| -p | Specifies the process ID to collect. Multiple process IDs are supported; separate them with commas (,). This parameter cannot be used together with **-a**.|
| -t | Specifies the thread ID to collect. Multiple thread IDs are supported; separate them with commas (,). This parameter cannot be used together with **-a**.|
| --exclude-tid | Specifies the thread ID not to collect. Multiple thread IDs are supported; separate them with commas (,). This parameter cannot be used together with **-a**.|
| --exclude-thread | Specifies the thread name not to collect. Multiple thread names are supported; separate them with commas (,). This parameter cannot be used together with **-a**.|
| --offcpu | Traces the time when a thread is out of CPU scheduling.|
| -j | Samples branch stacks. The following filters are supported: **any**, **any_call**, **any_ret**, **ind_call**, **ind_jmp**, **cond**, and **call**.|
| -s/--callstack | Sets the stack unwinding mode, which can be **fp** (frame pointer) or **dwarf** (debug information table). The default mode is **fp**.|
| --kernel-callchain | Collects kernel-mode stacks. This parameter must be used together with the **-s** parameter.|
| --callchain-useronly | Collects only user stacks.|
| --delay-unwind | Delays call stack unwinding until after recording when the stack mode is set to **dwarf**.|
| --disable-unwind | Disables call stack unwinding after recording when the stack mode is set to **dwarf**.|
| --disable-callstack-expand | Merges the call stacks using the cached thread stack when the stack mode is set to **dwarf**.|
| --enable-debuginfo-symbolic | Parses the symbols in the **.gnu_debugdata** section of the ELF file when **-s fp/dwarf** is set. By default, the symbols are not parsed.|
| --clockid | Sets the collection clock type, which can be **monotonic** or **monotonic_raw**. Some events support the **boottime**, **realtime**, and **clock_tai** clock types.|
| --symbol-dir | Sets the symbol table file path, which is used for symbolization during collection.|
| -m | Sets the number of mmap pages. Value range: 2 to 1024. The default value is **1024**.|
| --app | Sets the application names to collect. Use commas (,) to separate them. The application must already be running. If it has not started, the command waits up to 20s and then exits automatically. This parameter cannot be used together with **-a**.|
| --chkms | Sets the query interval, in milliseconds. The value ranges from 1 to 200. The default value is **10**.|
| --data-limit | Sets the limit of the output data size. When this limit is reached, the collection stops. By default, there is no limit.|
| -o | Sets the output file path. You can set the output file path to **/data/local/tmp/** and customize the file name.|
| -z | Outputs the data in a .gz file.|
| --restart | Collects performance metrics about application startup. If the process is not started within 30 seconds, the collection stops.|
| --verbose | Outputs a more detailed report.|
| --control [command] | Controls the collection operation. The following commands are supported: **prepare**/**start**/**pause**/**resume**/**output**/**stop**. This parameter cannot be used together with **-d**.|
| --dedup_stack | Deletes duplicate stacks from the record.|
| --cmdline-size | Sets the value of the **/sys/kernel/tracing/saved_cmdlines_size** node, in bytes. The value ranges from 512 to 4096.|
| --report | Collects the backtrace report.|
| --backtrack | Collects data in a previous period. This parameter must be used together with **--control prepare**.|
| --backtrack-sec | Collects the duration of previous data, in seconds. The value ranges from 5 to 30. The default value is **10**. This parameter must be used together with **--backtrack**.|
| --dumpoptions | Displays the collection parameter details.|
| -a | Collects the device performance data.|
| --exclude-hiperf | Excludes the performance data of the hiperf process. This parameter must be used together with **-a**.|
| --exclude-process | Specifies the process name not to collect. This parameter must be used together with **-a**.|
<!--RP1End-->

**Example**

```
Usage: hiperf record [options] [command [command-args]]
```

Sample the process 267 for 10 seconds and use **dwarf** to unwind the stack.

```
$ hiperf record -p 267 -d 10 -s dwarf
```

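The **--control** parameter replaces a fixed **-d** duration with explicit control commands. The following sketch prints one possible command sequence for a controlled recording. `control_sequence` is a hypothetical helper name. The sketch assumes that the collection options are passed together with **prepare** and that the later control commands take no further options, so check the output of `hiperf help record` on your device before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: print the command sequence for a controlled recording
# of one process.
# Assumption: options are given with "prepare"; "start" and "stop" take none.
control_sequence() {
    pid=$1
    printf 'hiperf record --control prepare -p %s -s dwarf\n' "$pid"
    printf 'hiperf record --control start\n'
    printf 'hiperf record --control stop\n'
}
```

Running `control_sequence 267` prints the three commands in order. Each one is intended to be run in **hdc shell** at the moment the corresponding phase should begin.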

## stat

Monitors the specified application and periodically prints the values of performance counters.

**Parameters of the stat command**

<!--RP2-->
| Parameter| Description|
| -------- | -------- |
| -h/--help | Displays the help information.|
| -c | Sets the ID of the CPU to collect its data.|
| -d | Sets the collection duration, in seconds. This parameter cannot be used together with **--control**.|
| -i | Sets the interval for printing **stat** information, in milliseconds.|
| -e | Specifies the events to collect. Multiple events are supported; use commas (,) to separate them.|
| -g | Specifies the event groups to collect, which are separated by commas (,). You can run the **list** command to obtain the supported event types.|
| --no-inherit | Collects no subprocess data.|
| -p | Specifies the process ID to collect. Multiple process IDs are supported; separate them with commas (,). This parameter cannot be used together with **-a**.|
| -t | Specifies the thread ID to collect. Multiple thread IDs are supported; separate them with commas (,). This parameter cannot be used together with **-a**.|
| --app | Sets the application names to collect. Use commas (,) to separate them. The application must already be running. If it has not started, the command waits up to 20s and then exits automatically. This parameter cannot be used together with **-a**.|
| --chkms | Sets the query interval, in milliseconds. The value ranges from 1 to 200. The default value is **10**.|
| --per-core | Prints the counts for each CPU core.|
| --per-thread | Prints the counts for each thread.|
| --restart | Collects performance metrics about application startup. If the process is not started within 30 seconds, the collection stops. This parameter must be used together with **--app**.|
| --verbose | Outputs detailed information.|
| --dumpoptions | Displays details about all options in the list.|
| --control [command] | Controls the collection operation. The commands include **prepare**, **start**, and **stop**. This parameter cannot be used together with **-d**.<br>**NOTE**: This parameter is supported since API version 20.|
| -o | Sets the output file path. You can set the output file path to **/data/local/tmp/** and customize the file name. This parameter must be used together with **--control prepare** and cannot be used with other **--control** commands.<br>**NOTE**: This parameter is supported since API version 20.|
| -a | Collects the device performance data.|

**Example**

```
Usage: hiperf stat [options] [command [command-args]]
```

Run the **stat** command to monitor the performance data of process **1745** running on CPU 0 for 3 seconds.

```
$ hiperf stat -p 1745 -d 3 -c 0
```


## dump

Converts performance data files in different formats (for example, **perf.data**) into plain text so that you can check the correctness of the original sampling data.

**Parameters of the dump command**

| Parameter| Description|
| -------- | -------- |
| -h/--help | Displays the help information.|
| --head | Outputs only the data header and attributes.|
| -d | Outputs only the data segment.|
| -f | Outputs only additional features.|
| --sympath | Specifies the path of the symbol table file.|
| -i | Specifies the path of the sampling file.|
| -o | Sets the output file path. You can set the output file path to **/data/local/tmp/** and customize the file name. If this parameter is not set, the data is output to the CLI.|
| --elf | Converts the ELF file to readable plain text.|
| --proto | Converts the .proto file to readable plain text.|
| --export | Splits the user stack data into multiple files.|

**Example**

```
Usage: hiperf dump [option] <filename>
```

Run the **dump** command to read the **/data/local/tmp/perf.data** file and export it to the **/data/local/tmp/perf.dump** file.

```
$ hiperf dump -i /data/local/tmp/perf.data -o /data/local/tmp/perf.dump
```


## report

Converts the sampling data (**perf.data**) to the specified format (such as JSON or ProtoBuf), groups samples belonging to the same process, thread, or function into individual sample entries, sorts these entries by event count, and displays them in a report.

**Parameters of the report command**

| Parameter| Description|
| -------- | -------- |
| -h/--help | Displays the help information.|
| --symbol-dir | Specifies the path of the symbol table file.|
| --limit-percent | Filters performance data whose share is at least the specified percentage (1 to 100). Only entries meeting this threshold are included in the report.|
| -s | Displays the stack mode.|
| --call-stack-limit-percent | Displays the stack content of a specified proportion. The value ranges from 1 to 100.|
| -i | Specifies the resource file path. The default value is **perf.data**.|
| -o | Sets the output file path. You can set the output file path to **/data/local/tmp/** and customize the file name. If this parameter is not set, the data is output to the CLI.|
| --proto | Outputs data in ProtoBuf format.|
| --json | Outputs data in JSON format.|
| --diff | Displays the differences between the source file and the converted file. This parameter cannot be used together with **--proto**, **--json**, or **-s**.|
| --branch | Displays the branches based on the function address.|
| --&lt;keys&gt; &lt;keyname1&gt;[,keyname2][,...] | Specifies the keywords, which can be **comms**, **pids**, **tids**, **dsos**, **funcs**, **from_dsos**, or **from_funcs**, for example, **--comms hiperf**.|
| --sort [key1],[key2],[...] | Sorts the data by keyword.|
| --hide_count | Hides values in the report.|
| --dumpoptions | Displays details about all options in the list.|

**Example**

```
Usage: hiperf report [option] <filename>
```

Extract the entries that have a significant impact on performance (share ≥ 1%) from the **perf.data** file and display them in a report.

```
$ hiperf report -i /data/local/tmp/perf.data --limit-percent 1
```

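Because **--limit-percent** accepts only values from 1 to 100, a script that builds the **report** command can reject out-of-range thresholds before calling hiperf. The following is a minimal sketch under that assumption. `build_report_cmd` is a hypothetical helper name and is not part of hiperf.

```shell
#!/bin/sh
# Hypothetical helper: print a "hiperf report" command with a validated
# --limit-percent threshold.
# Arguments: <input perf.data path> <percent, integer from 1 to 100>
build_report_cmd() {
    input=$1 percent=$2
    case "$percent" in
        ''|*[!0-9]*) echo "percent must be an integer" >&2; return 1 ;;
    esac
    if [ "$percent" -lt 1 ] || [ "$percent" -gt 100 ]; then
        echo "percent must be between 1 and 100" >&2
        return 1
    fi
    printf 'hiperf report -i %s --limit-percent %s\n' "$input" "$percent"
}
```

For example, `build_report_cmd /data/local/tmp/perf.data 1` prints the command used in the example above, while a threshold of `0` or `101` returns a nonzero status and prints an error message.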