# Scripts reference

[TOC]

## Record a profile

### app_profiler.py

`app_profiler.py` is used to record profiling data for Android applications and native executables.

```sh
# Record an Android application.
$ ./app_profiler.py -p simpleperf.example.cpp

# Record an Android application with Java code compiled into native instructions.
$ ./app_profiler.py -p simpleperf.example.cpp --compile_java_code

# Record the launch of an Activity of an Android application.
$ ./app_profiler.py -p simpleperf.example.cpp -a .SleepActivity

# Record a native process.
$ ./app_profiler.py -np surfaceflinger

# Record a native process given its pid.
$ ./app_profiler.py --pid 11324

# Record a command.
$ ./app_profiler.py -cmd \
    "dex2oat --dex-file=/data/local/tmp/app-debug.apk --oat-file=/data/local/tmp/a.oat"

# Record an Android application, and use -r to send custom options to the record command.
$ ./app_profiler.py -p simpleperf.example.cpp \
    -r "-e cpu-clock -g --duration 30"

# Record both on-CPU time and off-CPU time.
$ ./app_profiler.py -p simpleperf.example.cpp \
    -r "-e task-clock -g -f 1000 --duration 10 --trace-offcpu"

# Save profiling data in a custom file (like perf_custom.data) instead of perf.data.
$ ./app_profiler.py -p simpleperf.example.cpp -o perf_custom.data
```

### Profile from launch of an application

Sometimes we want to profile the launch time of an application. To support this, we added the
`--app` option to the record command. The `--app` option sets the package name of the Android
application to profile. If the app is not already running, the record command polls for the app
process in a loop with an interval of 1 ms. So to profile the launch of an application, we can
first start the record command with `--app`, then start the app. Below is an example.

```sh
$ ./run_simpleperf_on_device.py record --app simpleperf.example.cpp \
    -g --duration 1 -o /data/local/tmp/perf.data
# Start the app manually or using the `am` command.
```

To make this more convenient, `app_profiler.py` supports the `-a` option to start an Activity
after recording has started.

```sh
$ ./app_profiler.py -p simpleperf.example.cpp -a .MainActivity
```

### api_profiler.py

`api_profiler.py` is used to control recording in application code. It does preparation work
before recording, and collects profiling data files after recording.

[Here](./android_application_profiling.md#control-recording-in-application-code) are the details.

### run_simpleperf_without_usb_connection.py

`run_simpleperf_without_usb_connection.py` records profiling data while the USB cable isn't
connected. `api_profiler.py` may be more suitable, since it also doesn't need a USB cable when
recording. Below is an example.

```sh
$ ./run_simpleperf_without_usb_connection.py start -p simpleperf.example.cpp
# After the command finishes successfully, unplug the USB cable and run the
# SimpleperfExampleCpp app. After a few seconds, plug in the USB cable.
$ ./run_simpleperf_without_usb_connection.py stop
# It may take a while to stop recording. After that, the profiling data is collected in perf.data
# on the host.
```

### binary_cache_builder.py

The `binary_cache` directory holds the binaries needed by a profiling data file. The binaries are
expected to be unstripped, containing debug information and symbol tables.
The `binary_cache` directory is used by report scripts to read symbols of binaries. It is also used
by `report_html.py` to generate annotated source code and disassembly.

By default, `app_profiler.py` builds the `binary_cache` directory after recording. But we can also
build `binary_cache` for existing profiling data files using `binary_cache_builder.py`. This is
useful when you record profiling data using `simpleperf record` directly, to do system-wide
profiling or to record without the USB cable connected.

`binary_cache_builder.py` can either pull binaries from an Android device, or find binaries in
directories on the host (via `-lib`).

By default, `binary_cache_builder.py` only pulls binaries that are actually mentioned in samples.
For `perf.data` files that are not sample-based, like those containing ETM traces, the `--every`
command-line parameter can be used to pull every binary recorded with a build id in the
`perf.data`.

```sh
# Generate binary_cache for perf.data, by pulling binaries from the device.
$ ./binary_cache_builder.py

# Generate binary_cache, by pulling binaries from the device and finding binaries in
# SimpleperfExampleCpp.
$ ./binary_cache_builder.py -lib path_of_SimpleperfExampleCpp
```

### run_simpleperf_on_device.py

This script pushes the `simpleperf` executable to the device and runs a simpleperf command on the
device. It is more convenient than running adb commands manually.

## Viewing the profile

Scripts in this section are for viewing the profile or converting profile data into formats used by
external UIs. For recommended UIs, see [view_the_profile.md](view_the_profile.md).

### report.py

`report.py` is a wrapper of the `report` command on the host. It accepts all options of the
`report` command.

```sh
# Report the call graph.
$ ./report.py -g

# Report the call graph in a GUI window implemented by Python Tk.
$ ./report.py -g --gui
```

### report_html.py

`report_html.py` generates `report.html` based on the profiling data. The generated `report.html`
shows the profiling result without depending on other files, so it can be viewed in a local browser
or passed to other machines. Depending on which command-line options are used, the content of
`report.html` can include: chart statistics, a sample table, flamegraphs, annotated source code for
each function, and annotated disassembly for each function.

```sh
# Generate chart statistics, sample table and flamegraphs, based on perf.data.
$ ./report_html.py

# Add source code.
$ ./report_html.py --add_source_code --source_dirs path_of_SimpleperfExampleCpp

# Add disassembly.
$ ./report_html.py --add_disassembly

# Adding disassembly for all binaries can cost a lot of time. So we can choose to only add
# disassembly for selected binaries.
$ ./report_html.py --add_disassembly --binary_filter libgame.so

# Add disassembly and source code for binaries belonging to an app with package name
# com.example.myapp.
$ ./report_html.py --add_source_code --add_disassembly --binary_filter com.example.myapp

# report_html.py accepts more than one recording data file.
$ ./report_html.py -i perf1.data perf2.data
```

Below is an example of generating HTML profiling results for SimpleperfExampleCpp.

```sh
$ ./app_profiler.py -p simpleperf.example.cpp
$ ./report_html.py --add_source_code --source_dirs path_of_SimpleperfExampleCpp \
    --add_disassembly
```

After opening the generated [`report.html`](./report_html.html) in a browser, there are several
tabs:

The first tab is "Chart Statistics". You can click the pie chart to show the time consumed by each
process, thread, library and function.

The second tab is "Sample Table". It shows the time taken by each function. By clicking one row in
the table, we can jump to a new tab called "Function".

The third tab is "Flamegraph". It shows the graphs generated by [`inferno`](./inferno.md).

The fourth tab is "Function". It only appears when users click a row in the "Sample Table" tab.
It shows information about a function, including:

1. A flamegraph showing functions called by that function.
2. A flamegraph showing functions calling that function.
3. Annotated source code of that function. It only appears when there are source code files for
   that function.
4. Annotated disassembly of that function. It only appears when there are binaries containing that
   function.

### inferno

[`inferno`](./inferno.md) is a tool used to generate a flamegraph in an HTML file.

```sh
# Generate a flamegraph based on perf.data.
# On Windows, use inferno.bat instead of ./inferno.sh.
$ ./inferno.sh -sc --record_file perf.data

# Record a native program and generate a flamegraph.
$ ./inferno.sh -np surfaceflinger
```

### purgatorio

[`purgatorio`](../scripts/purgatorio/README.md) is a visualization tool to show samples in time
order.

### pprof_proto_generator.py

`pprof_proto_generator.py` converts a profiling data file into `pprof.proto`, a format used by
[pprof](https://github.com/google/pprof).

```sh
# Convert perf.data in the current directory to pprof.proto format.
$ ./pprof_proto_generator.py

# Show the report in PDF format.
$ pprof -pdf pprof.profile

# Show the report in HTML format. To show disassembly, add the --tools option like:
#   --tools=objdump:<ndk_path>/toolchains/llvm/prebuilt/linux-x86_64/aarch64-linux-android/bin
# To show annotated source or disassembly, select `top` in the view menu, click a function and
# select `source` or `disassemble` in the view menu.
$ pprof -http=:8080 pprof.profile
```

### gecko_profile_generator.py

`gecko_profile_generator.py` converts `perf.data` to [Gecko Profile
Format](https://github.com/firefox-devtools/profiler/blob/main/docs-developer/gecko-profile-format.md),
a format readable by both the [Perfetto UI](https://ui.perfetto.dev/) and the
[Firefox Profiler](https://profiler.firefox.com/).
[View the profile](view_the_profile.md) provides more information on both options.

Usage:

```sh
# Record a profile of your application.
$ ./app_profiler.py -p simpleperf.example.cpp

# Convert and gzip.
$ ./gecko_profile_generator.py -i perf.data | gzip > gecko-profile.json.gz
```

Then open `gecko-profile.json.gz` in https://ui.perfetto.dev/ or
https://profiler.firefox.com/.

### report_sample.py

`report_sample.py` converts a profiling data file into the `perf script` text format output by
`linux-perf-tool`.

This format can be imported into:

- [Perfetto](https://ui.perfetto.dev)
- [FlameGraph](https://github.com/brendangregg/FlameGraph)
- [Flamescope](https://github.com/Netflix/flamescope)
- [Firefox
  Profiler](https://github.com/firefox-devtools/profiler/blob/main/docs-user/guide-perf-profiling.md),
  but prefer using `gecko_profile_generator.py`.
- [Speedscope](https://github.com/jlfwong/speedscope/wiki/Importing-from-perf-(linux))

```sh
# Record a profile to perf.data.
$ ./app_profiler.py <args>

# Convert perf.data in the current directory to a format used by FlameGraph.
$ ./report_sample.py --symfs binary_cache >out.perf

$ git clone https://github.com/brendangregg/FlameGraph.git
$ FlameGraph/stackcollapse-perf.pl out.perf >out.folded
$ FlameGraph/flamegraph.pl out.folded >a.svg
```

### stackcollapse.py

`stackcollapse.py` converts a profiling data file (`perf.data`) to [Brendan Gregg's "Folded Stacks"
format](https://queue.acm.org/detail.cfm?id=2927301#:~:text=The%20folded%20stack%2Dtrace%20format,trace%2C%20followed%20by%20a%20semicolon).

Folded Stacks are lines of semicolon-delimited stack frames, root to leaf, followed by a count of
events sampled in that stack, e.g.:

```
BusyThread;__start_thread;__pthread_start(void*);java.lang.Thread.run 17889729
```

All similar stacks are aggregated, and sample timestamps are unused.

The Folded Stacks format is readable by:

- The [FlameGraph](https://github.com/brendangregg/FlameGraph) toolkit
- [Inferno](https://github.com/jonhoo/inferno) (Rust port of FlameGraph)
- [Speedscope](https://speedscope.app/)

Example:

```sh
# Record a profile to perf.data.
$ ./app_profiler.py <args>

# Convert to Folded Stacks format.
$ ./stackcollapse.py --kernel --jit | gzip > profile.folded.gz

# Visualise with FlameGraph, with Java stacks and nanosecond times.
$ git clone https://github.com/brendangregg/FlameGraph.git
$ gunzip -c profile.folded.gz \
    | FlameGraph/flamegraph.pl --color=java --countname=ns \
    > profile.svg
```

### report_etm.py

`report_etm.py` generates an instruction trace from profiles recorded with the `cs-etm` event.

Example use:

```sh
# Record a userspace-only trace of /bin/true.
$ ./app_profiler.py -r "-e cs-etm:u" -nb -cmd /bin/true
# Download binaries to use while decoding the trace.
$ ./binary_cache_builder.py --every
# Generate the instruction trace.
$ ./report_etm.py
```

### report_fuchsia.py

`report_fuchsia.py` generates a [Fuchsia Trace](https://fuchsia.dev/fuchsia-src/reference/tracing/trace-format)
from an ETM trace with timestamps, which can be viewed with [Perfetto](https://ui.perfetto.dev/) or
on [https://magic-trace.org/](https://magic-trace.org/). The trace shows how the stack changes as
time progresses. This is not always easy to do, and the script may fumble on traces where the stack
changes in unusual ways.

It can be used like this:

```sh
# Record a userspace-only trace of /bin/true with timestamps.
$ ./app_profiler.py -r "-e cs-etm:u --record-timestamp" -nb -cmd /bin/true
# Download binaries to use while decoding the trace.
$ ./binary_cache_builder.py --every
# Generate the Fuchsia trace.
$ ./report_fuchsia.py
```

Make sure that `--record-timestamp` is used when recording the trace on the device.
Without timestamps, `report_fuchsia.py` will generate an empty trace.

Note that 1 tick in the timestamps is assumed to equal 1 nanosecond. This is always true for cores
that implement Armv8.6 or later, or Armv9.1 or later, but is not guaranteed for other cores.

## simpleperf_report_lib.py

`simpleperf_report_lib.py` is a Python library used to parse profiling data files generated by the
record command. Internally, it uses libsimpleperf_report.so to do the work. Generally, for each
profiling data file, we create an instance of ReportLib and pass it the file path (via
SetRecordFile). Then we can read all samples through GetNextSample(). For each sample, we can read
its event info (via GetEventOfCurrentSample), symbol info (via GetSymbolOfCurrentSample) and call
chain info (via GetCallChainOfCurrentSample). We can also get some global information, like record
options (via GetRecordCmd), the arch of the device (via GetArch) and meta strings (via MetaInfo).

Examples of using `simpleperf_report_lib.py` are in `report_sample.py`, `report_html.py`,
`report_etm.py`, `pprof_proto_generator.py` and `inferno/inferno.py`.

## ipc.py

`ipc.py` captures the instructions per cycle (IPC) of the system during a specified duration.

Example:

```sh
./ipc.py
./ipc.py 2 20          # Set the interval to 2 secs and the total duration to 20 secs.
./ipc.py -p 284 -C 4   # Only profile PID 284 while running on core 4.
./ipc.py -c 'sleep 5'  # Only profile the command to run.
```

The results look like:

```
K_CYCLES   K_INSTR   IPC
36840      14138     0.38
70701      27743     0.39
104562     41350     0.40
138264     54916     0.40
```

## sample_filter.py

`sample_filter.py` generates sample filter files as documented in
[sample_filter.md](https://android.googlesource.com/platform/system/extras/+/refs/heads/main/simpleperf/doc/sample_filter.md).
A filter file can be passed in `--filter-file` when running report scripts.

For example, it can be used to split a large recording file into several report files.

```sh
$ sample_filter.py -i perf.data --split-time-range 2 -o sample_filter
$ gecko_profile_generator.py -i perf.data --filter-file sample_filter_part1 \
    | gzip >profile-part1.json.gz
$ gecko_profile_generator.py -i perf.data --filter-file sample_filter_part2 \
    | gzip >profile-part2.json.gz
```
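
The generated filter files are not specific to `gecko_profile_generator.py`. Since report scripts
accept `--filter-file`, the same time slices can be rendered with other viewers as well. Below is a
minimal sketch, assuming `report_html.py` handles `--filter-file` in the same way as the other
report scripts.

```sh
# A sketch: render only the first time slice as an HTML report, assuming report_html.py
# accepts --filter-file like the other report scripts do.
$ ./report_html.py -i perf.data --filter-file sample_filter_part1
```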