Perfetto Tracing
================

Mesa has experimental support for `Perfetto <https://perfetto.dev>`__ for
GPU performance monitoring.  Perfetto supports multiple
`producers <https://perfetto.dev/docs/concepts/service-model>`__, each with
one or more data-sources.  Perfetto already provides various producers and
data-sources for things like:

- CPU scheduling events (``linux.ftrace``)
- CPU frequency scaling (``linux.ftrace``)
- System calls (``linux.ftrace``)
- Process memory utilization (``linux.process_stats``)

It also provides various domain-specific producers.

The Mesa Perfetto support adds additional producers, to allow for visualizing
GPU performance (frequency, utilization, performance counters, etc.) on the
same timeline, to better understand and tune/debug system-level performance:

- pps-producer: A systemwide daemon that can collect global performance
  counters.
- mesa: Per-process producer within Mesa to capture render-stage traces
  on the GPU timeline, track events, etc.

The exact supported features vary per driver:

.. list-table:: Supported data-sources
   :header-rows: 1

   * - Driver
     - PPS Counters
     - Render Stages
   * - Freedreno
     - ``gpu.counters.msm``
     - ``gpu.renderstages.msm``
   * - Turnip
     - ``gpu.counters.msm``
     -
   * - Intel
     - ``gpu.counters.i915``
     -
   * - Panfrost
     - ``gpu.counters.panfrost``
     -

Run
---

To capture a trace with Perfetto you need to take the following steps:

1. Build Perfetto from sources available at ``subprojects/perfetto`` following
   `this guide <https://perfetto.dev/docs/quickstart/linux-tracing>`__.

2. Create a `trace config <https://perfetto.dev/#/trace-config.md>`__, which is
   a text file in protobuf text format with extension ``.cfg``, or use one of
   the config files under the ``src/tool/pps/cfg`` directory. More examples of
   config files can be found in ``subprojects/perfetto/test/configs``.

3. Change directory to ``subprojects/perfetto`` and run a
   `convenience script <https://perfetto.dev/#/running.md>`__ to start the
   tracing service:

   .. code-block:: console

      cd subprojects/perfetto
      CONFIG=<path/to/gpu.cfg> OUT=out/linux_clang_release ./tools/tmux -n

4. Start other producers you may need, e.g. ``pps-producer``.

5. Start ``perfetto`` under the tmux session initiated in step 3.

6. Once tracing has finished, you can detach from tmux with :kbd:`Ctrl+b`,
   :kbd:`d`, and the convenience script should automatically copy the trace
   files into ``$HOME/Downloads``.

7. Go to `ui.perfetto.dev <https://ui.perfetto.dev>`__ and upload
   ``$HOME/Downloads/trace.protobuf`` by clicking on **Open trace file**.

8. Alternatively you can open the trace in `AGI <https://gpuinspector.dev/>`__
   (which despite the name can be used to view non-Android traces).

To be a bit more explicit, here is a listing of commands reproducing
the steps above:

.. code-block:: console

   # Configure Mesa with Perfetto
   mesa $ meson . build -Dperfetto=true -Dvulkan-drivers=intel,broadcom -Dgallium-drivers=
   # Build Mesa
   mesa $ ninja -C build

   # Within the Mesa repo, build Perfetto
   mesa $ cd subprojects/perfetto
   perfetto $ ./tools/install-build-deps
   perfetto $ ./tools/gn gen --args='is_debug=false' out/linux
   perfetto $ ./tools/ninja -C out/linux

   # Start Perfetto
   perfetto $ CONFIG=../../src/tool/pps/cfg/gpu.cfg OUT=out/linux/ ./tools/tmux -n

   # In parallel from the Mesa repo, start the PPS producer
   mesa $ ./build/src/tool/pps/pps-producer

   # Back in the Perfetto tmux, press enter to start the capture

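For reference, a trace config of the kind passed through ``CONFIG`` above
could look like the following minimal sketch. It uses the text format that
Perfetto reads for trace configs and enables the ``gpu.counters.i915`` data
source from the table above. The buffer size, the ``counter_period_ns``
sampling period, and the trace duration are illustrative assumptions rather
than recommended values; the config files under ``src/tool/pps/cfg`` remain
the authoritative starting point.

.. code-block:: text

   # Hypothetical minimal config; all values are illustrative assumptions.
   buffers {
       size_kb: 16384
       fill_policy: RING_BUFFER
   }

   data_sources {
       config {
           # Data source name taken from the driver table above
           name: "gpu.counters.i915"
           gpu_counter_config {
               # Assumed sampling period of 100 microseconds
               counter_period_ns: 100000
           }
       }
   }

   # Assumed capture length of 10 seconds
   duration_ms: 10000
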
Vulkan data sources
~~~~~~~~~~~~~~~~~~~

The Vulkan API gives the application control over recording of command
buffers as well as when they are submitted to the hardware. As a
consequence, we need to ensure command buffers are properly
instrumented for the Perfetto driver data sources prior to Perfetto
actually collecting traces.

This can be achieved by setting the ``GPU_TRACE_INSTRUMENT``
environment variable before starting a Vulkan application:

.. code-block:: console

   GPU_TRACE_INSTRUMENT=1 ./build/my_vulkan_app

Driver Specifics
~~~~~~~~~~~~~~~~

Below is driver-specific information and instructions for the PPS producer.

Freedreno / Turnip
^^^^^^^^^^^^^^^^^^

The Freedreno PPS driver needs root access to read system-wide
performance counters, so you can simply run it with sudo:

.. code-block:: console

   sudo ./build/src/tool/pps/pps-producer

Intel
^^^^^

The Intel PPS driver needs root access to read system-wide
`RenderBasic <https://software.intel.com/content/www/us/en/develop/documentation/vtune-help/top/reference/gpu-metrics-reference.html>`__
performance counters, so you can simply run it with sudo:

.. code-block:: console

   sudo ./build/src/tool/pps/pps-producer

Another option is to allow system-wide access without root permissions by running the following:

.. code-block:: console

   sudo sysctl dev.i915.perf_stream_paranoid=0

Alternatively, granting the ``CAP_PERFMON`` capability to the binary should work too.

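As a sketch of the ``CAP_PERFMON`` route, the capability could be granted
with ``setcap`` from libcap. This assumes a kernel and a libcap recent enough
to know ``CAP_PERFMON`` (the capability was introduced in Linux 5.8) and a
filesystem that supports file capabilities; it has not been checked against
every driver version. The capability is attached to the file, so it has to be
reapplied whenever the binary is rebuilt.

.. code-block:: console

   # Assumption: setcap and getcap from libcap are installed
   sudo setcap cap_perfmon+ep ./build/src/tool/pps/pps-producer
   # Check that the capability was applied
   getcap ./build/src/tool/pps/pps-producer
   # The producer can then be started without sudo
   ./build/src/tool/pps/pps-producer
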
A particular metric set can also be selected to capture a different
set of HW counters:

.. code-block:: console

   INTEL_PERFETTO_METRIC_SET=RasterizerAndPixelBackend ./build/src/tool/pps/pps-producer

Vulkan applications can also be instrumented to be Perfetto producers.
To enable this for a given application, set the environment variable as
follows:

.. code-block:: console

   PERFETTO_TRACE=1 my_vulkan_app

Panfrost
^^^^^^^^

The Panfrost PPS driver uses unstable ioctls that behave correctly on
kernel versions `5.4.23+ <https://lwn.net/Articles/813601/>`__ and
`5.5.7+ <https://lwn.net/Articles/813600/>`__.

To run the producer, follow these two steps:

1. Enable Panfrost unstable ioctls via the kernel module parameter:

   .. code-block:: console

      modprobe panfrost unstable_ioctls=1

   Alternatively you could add ``panfrost.unstable_ioctls=1`` to your kernel command line, or run ``echo 1 > /sys/module/panfrost/parameters/unstable_ioctls``.

2. Run the producer:

   .. code-block:: console

      ./build/src/tool/pps/pps-producer

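Before enabling the unstable ioctls it can help to check the running kernel
against the versions listed above. The following is a minimal sketch and not
part of Mesa: the helper name ``panfrost_kernel_ok`` is hypothetical, it
assumes a ``sort`` that supports ``-V`` (GNU coreutils or BusyBox), and it
only compares upstream version numbers, so distribution kernels with
backported fixes may be misjudged.

.. code-block:: shell

   # Hypothetical helper: succeeds if kernel version $1 is at least
   # 5.4.23 (for the 5.4.x series) or at least 5.5.7 (for anything else).
   panfrost_kernel_ok() {
       ver=${1%%-*}                      # drop suffixes such as "-generic"
       case "$ver" in
           5.4.*) min=5.4.23 ;;
           *)     min=5.5.7 ;;
       esac
       # sort -V orders version strings; $min must sort first (or be equal)
       lowest=$(printf '%s\n%s\n' "$min" "$ver" | sort -V | head -n 1)
       [ "$lowest" = "$min" ]
   }

   # Usage: panfrost_kernel_ok "$(uname -r)" && echo "kernel looks new enough"
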
Troubleshooting
---------------

Tmux
~~~~

If the convenience script ``tools/tmux`` keeps copying artifacts to your
``SSH_TARGET`` without starting the tmux session, make sure you have ``tmux``
installed on your system.

.. code-block:: console

   apt install tmux

Missing counter names
~~~~~~~~~~~~~~~~~~~~~

If the trace viewer shows a list of counters with a description like
``gpu_counter(#)`` instead of their proper names, the trace may have lost
data because the trace buffer filled up and wrapped around.

To prevent this loss of data you can tweak the trace config file in
one of the following ways:

- Increase the size of the buffer in use:

  .. code-block:: javascript

      buffers {
          size_kb: 2048,
          fill_policy: RING_BUFFER,
      }

- Periodically flush the trace buffer into the output file:

  .. code-block:: javascript

      write_into_file: true
      file_write_period_ms: 250

- Discard new traces when the buffer fills:

  .. code-block:: javascript

      buffers {
          size_kb: 2048,
          fill_policy: DISCARD,
      }

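The options above can be combined. As a hedged sketch, the fragment below
shows where they would sit in a trace config: ``buffers`` is a block of its
own, while ``write_into_file`` and ``file_write_period_ms`` are top-level
fields next to it. The 16 MiB buffer size is an illustrative assumption; a
suitable value depends on how many counters are sampled and how often.

.. code-block:: text

   # Illustrative values only; tune them for the workload being traced.
   buffers {
       size_kb: 16384
       fill_policy: RING_BUFFER
   }

   # Top-level fields, placed outside of the buffers block
   write_into_file: true
   file_write_period_ms: 250
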