1# Trace-based metrics 2 3_The metrics subsystem is a part of the 4[trace processor](/docs/analysis/trace-processor.md) which uses traces to 5compute reproducible metrics. It can be used in a wide range of situations; 6examples include benchmarks, lab tests and on large corpuses of traces._ 7 8![Block diagram of metrics](/docs/images/metrics-summary.png) 9 10## Quickstart 11 12The [quickstart](/docs/quickstart/trace-analysis.md) provides a quick overview 13on how to compute trace-based metrics traces using trace processor. 14 15## Introduction 16 17### Motivation 18 19Performance metrics are useful to monitor for the health of a system and ensure 20that a system does not regress over time as new features are added. 21 22However, metrics retrieved directly from the system have a downside: if there is 23a regression, it is difficult to root-cause the issue. Often, the problem may 24not be reproducible or may rely on a particular setup. 25 26Trace-based metrics are one possible solution to this problem. Instead of 27collecting metrics directly on the system, a trace is collected and metrics are 28computed from the trace. If a regression in the metric is spotted, the developer 29can look directly at the trace to understand why the regression has occurred 30instead of having to reproduce the issue. 31 32### Metric subsystem 33 34The metric subsystem is a part of the 35[trace processor](/docs/analysis/trace-processor.md) which executes SQL queries 36against traces and produces a metric which summarizes some performance attribute 37(e.g. CPU, memory, startup latency etc.). 38 39For example, generating the Android CPU metrics on a trace is as simple as: 40 41```python 42> ./trace_processor --run-metrics android_cpu <trace> 43android_cpu { 44 process_info { 45 name: "/system/bin/init" 46 threads { 47 name: "init" 48 core { 49 id: 1 50 metrics { 51 mcycles: 1 52 runtime_ns: 570365 53 min_freq_khz: 1900800 54 max_freq_khz: 1900800 55 avg_freq_khz: 1902017 56 } 57 } 58 ... 59 } 60 ... 61 } 62 ... 63} 64``` 65 66### Case for upstreaming 67 68Authors are strongly encouraged to add all metrics derived on Perfetto traces to 69the Perfetto repo unless there is a clear usecase (e.g. confidentiality) why 70these metrics should not be publicly available. 71 72In return for upstreaming metrics, authors will have first class support for 73running metrics locally and the confidence that their metrics will remain stable 74as trace processor is developed. 75 76As well as scaling upwards while developing from running on a single trace 77locally to running on a large set of traces, the reverse is also very useful. 78When an anomaly is observed in the metrics of a lab benchmark, a representative 79trace can be downloaded and the same metric can be run locally in trace 80processor. 81 82Since the same code is running locally and remotely, developers can be confident 83in reproducing the issue and use the trace processor and/or the Perfetto UI to 84identify the problem. 85 86## Walkthrough: prototyping a metric 87 88TIP: To see how to add to add a new metric to trace processor, see the checklist 89[here](/docs/contributing/common-tasks.md#new-metric) 90 91This walkthrough will outline how to prototype a metric locally without needing 92to compile trace processor. This metric will compute the CPU time for every 93process in the trace and list the names of the top 5 processes (by CPU time) and 94the number of threads created by the process. 95 96NOTE: See this [GitHub gist][gist] to see how the code should look at the end of 97 the walkthrough. The prerequisites and Step 4 below give instructions on 98 how to get trace processor and run the metrics code. 99 100[gist]: https://gist.github.com/tilal6991/c221cf0cae17e298dfa82b118edf9080 101 102### Prerequisites 103 104As a setup step, create a folder to act as a scratch workspace; this folder will be referred to using the env variable `$WORKSPACE` in Step 4. 105 106The other requirement is trace processor. This can downloaded from [here](https://get.perfetto.dev/trace_processor) or can be built from source 107using the instructions [here](trace-processor.md). Whichever method is chosen, $TRACE_PROCESSOR env variable will be used to refer to the location of the binary in Step 4. 108 109### Step 1 110 111As all metrics in the metrics platform are defined using protos, the metric needs to be structured as a proto. For this metric, there needs to be some notion of a process name along with its CPU time and number of threads. 112 113Starting off, in a file named `top_five_processes.proto` in our workspace, create a basic proto message called ProcessInfo with those three fields: 114 115```protobuf 116message ProcessInfo { 117 optional string process_name = 1; 118 optional uint64 cpu_time_ms = 2; 119 optional uint32 num_threads = 3; 120} 121``` 122 123Next , create a wrapping message which will hold the repeated field containing the top 5 processes. 124 125```protobuf 126message TopProcesses { 127 repeated ProcessInfo process_info = 1; 128} 129``` 130 131Finally, define an extension to the root proto for all metrics (the [TraceMetrics](https://android.googlesource.com/platform/external/perfetto/+/HEAD/protos/perfetto/metrics/metrics.proto#39) proto). 132 133```protobuf 134extend TraceMetrics { 135 optional TopProcesses top_processes = 450; 136} 137``` 138 139Adding this extension field allows trace processor to link the newly defined 140metric to the `TraceMetrics` proto. 141 142_Notes:_ 143 144- The field ids 450-500 are reserved for local development so any of them can be used as the field id for the extension field. 145- The choice of field name here is important as the SQL file and the final table generated in SQL will be based on this name. 146 147Putting everything together, along with some boilerplate preamble gives: 148 149```protobuf 150syntax = "proto2"; 151 152package perfetto.protos; 153 154import "protos/perfetto/metrics/metrics.proto"; 155 156message ProcessInfo { 157 optional string process_name = 1; 158 optional int64 cpu_time_ms = 2; 159 optional uint32 num_threads = 3; 160} 161 162message TopProcesses { 163 repeated ProcessInfo process_info = 1; 164} 165 166extend TraceMetrics { 167 optional TopProcesses top_processes = 450; 168} 169``` 170 171### Step 2 172 173Next, write the SQL to generate the table of the top 5 processes ordered by the 174sum of the CPU time they ran for and the number of threads which were associated 175with the process. 176 177The following SQL should added to a file called `top_five_processes.sql` in the 178workspace: 179 180```sql 181CREATE VIEW top_five_processes_by_cpu 182SELECT 183 process.name as process_name, 184 CAST(SUM(sched.dur) / 1e6 as INT64) as cpu_time_ms, 185 COUNT(DISTINCT utid) as num_threads 186FROM sched 187INNER JOIN thread USING(utid) 188INNER JOIN process USING(upid) 189GROUP BY process.name 190ORDER BY cpu_time_ms DESC 191LIMIT 5; 192``` 193 194Let's break this query down: 195 1961. The first table used is the `sched` table. This contains all the scheduling 197 data available in the trace. Each scheduling "slice" is associated with a 198 thread which is uniquely identified in Perfetto traces using its `utid`. The 199 two pieces of information needed from the sched table are the `dur` - 200 short for duration, this is the amount of time the slice lasted - and the 201 `utid` which will be used to join with the thread table. 2022. The next table is the thread table. This gives us a lot of information which 203 is not particularly interesting (including its thread name) but it does give 204 us the `upid`. Similar to `utid`, `upid` is the unique identifier for a 205 process in a Perfetto trace. In this case, `upid` will refer to the process 206 which hosts the thread given by `utid`. 2073. The final table is the process table. This gives the name of the process 208 associated with the original sched slice. 2094. With the process, thread and duration for each sched slice, all the slices 210 for a single processes are collected and their durations summed to get the 211 CPU time (dividing by 1e6 as sched's duration is in nanoseconds) and 212 the number of distinct threads. 2135. Finally, we order by the cpu time and limit to the top 5 results. 214 215### Step 3 216 217Now that the result of the metric has been expressed as an SQL table, it needs 218to be converted to a proto. The metrics platform has built-in support for emitting 219protos using SQL functions; something which is used extensively in this step. 220 221Let's look at how it works for our table above. 222 223```sql 224CREATE VIEW top_processes_output AS 225SELECT TopProcesses( 226 'process_info', ( 227 SELECT RepeatedField( 228 ProcessInfo( 229 'process_name', process_name, 230 'cpu_time_ms', cpu_time_ms, 231 'num_threads', num_threads 232 ) 233 ) 234 FROM top_five_processes_by_cpu 235 ) 236); 237``` 238 239Breaking this down again: 240 2411. Starting from the inner-most SELECT statement, there is what looks like 242 a function call to the ProcessInfo function; in fact this is no coincidence. 243 For each proto that the metrics platform knows about, an SQL function is 244 generated with the same name as the proto. This function takes key value 245 pairs with the key as the name of the proto field to fill and the value being 246 the data to store in the field. The output is the proto created by writing 247 the fields described in the function. (\*) 248 249 In this case, this function is called once for each row in the 250 `top_five_processes_by_cpu` table. The output will be the fully filled 251 ProcessInfo proto. 252 253 The call to the `RepeatedField` function is the most interesting part and 254 also the most important. In technical terms, `RepeatedField` is an aggregate 255 function. Practically, this means that it takes a full table of values and 256 generates a single array which contains all the values passed to it. 257 258 Therefore, the output of this whole SELECT statement is an array of 5 259 ProcessInfo protos. 260 2612. Next is creation of the `TopProcesses` proto. By now, the syntax should 262 already feel somewhat familiar; the proto builder function is called to fill 263 in the `process_info` field with the array of protos from the inner function. 264 265 The output of this SELECT is a single `TopProcesses` proto containing the 266 ProcessInfos as a repeated field. 267 2683. Finally, the view is created. This view is specially named to allow the 269 metrics platform to query it to obtain the root proto for each metric 270 (in this case `TopProcesses`). See the note below as to the pattern behind 271 this view's name. 272 273(\*) _This is not strictly true. To type-check the protos, some metadata 274is returned about the type of the proto but this is unimportant for metric 275authors._ 276 277NOTE: It is important that the views be named {name of TraceMetrics extension 278 field}\_output. This is the pattern used and expected by the metrics 279 platform for all metrics. 280 281The final file should look like so: 282 283```sql 284CREATE VIEW top_five_processes_by_cpu AS 285SELECT 286 process.name as process_name, 287 CAST(SUM(sched.dur) / 1e6 as INT64) as cpu_time_ms, 288 COUNT(DISTINCT utid) as num_threads 289FROM sched 290INNER JOIN thread USING(utid) 291INNER JOIN process USING(upid) 292GROUP BY process.name 293ORDER BY cpu_time_ms DESC 294LIMIT 5; 295 296CREATE top_processes_output AS 297SELECT TopProcesses( 298 'process_info', ( 299 SELECT RepeatedField( 300 ProcessInfo( 301 'process_name', process_name, 302 'cpu_time_ms', cpu_time_ms, 303 'num_threads', num_threads 304 ) 305 ) 306 FROM top_five_processes_by_cpu 307 ) 308); 309``` 310 311NOTE: The name of the SQL file should be the same as the name of TraceMetrics 312 extension field. This is to allow the metrics platform to associated the 313 proto extension field with the SQL which needs to be run to generate it. 314 315### Step 4 316 317For this step, invoke trace processor shell to run the metrics (see the 318[Quickstart](/docs/quickstart/trace-analysis.md) for downloading instructions): 319 320```shell 321$TRACE_PROCESSOR --run-metrics $WORKSPACE/top_five_processes.sql $TRACE 2> /dev/null 322``` 323 324(For an example trace to test this on, see the Notes section below.) 325 326By passing the SQL file for the metric to be computed, trace processor uses the name of this file to find the proto and to figure out the name of the output table for the proto and the name of the extension field for `TraceMetrics`; this is the reason it was important to choose the names of these other objects carefully. 327 328_Notes:_ 329 330- If something doesn't work as intended, check that the workspace looks the same as the contents of this [GitHub gist](https://gist.github.com/tilal6991/c221cf0cae17e298dfa82b118edf9080). 331- A good example trace for this metric is the Android example trace used by the Perfetto UI found [here](https://storage.googleapis.com/perfetto-misc/example_android_trace_30s_1). 332- stderr is redirected to remove any noise from parsing the trace that trace processor generates. 333 334If everything went successfully, the following output should be visible (specifically this is the output for the Android example trace linked above): 335 336``` 337[perfetto.protos.top_five_processes] { 338 process_info { 339 process_name: "com.google.android.GoogleCamera" 340 cpu_time_ms: 15154 341 num_threads: 125 342 } 343 process_info { 344 process_name: "sugov:4" 345 cpu_time_ms: 6846 346 num_threads: 1 347 } 348 process_info { 349 process_name: "system_server" 350 cpu_time_ms: 6809 351 num_threads: 66 352 } 353 process_info { 354 process_name: "cds_ol_rx_threa" 355 cpu_time_ms: 6684 356 num_threads: 1 357 } 358 process_info { 359 process_name: "com.android.chrome" 360 cpu_time_ms: 5125 361 num_threads: 49 362 } 363} 364``` 365 366### Next steps 367 368* The [common tasks](/docs/contributing/common-tasks.md) page gives a list of 369 steps on how new metrics can be added to the trace processor. 370