Perfetto Tracing
================

Mesa has experimental support for `Perfetto <https://perfetto.dev>`__ for
GPU performance monitoring.  Perfetto supports multiple
`producers <https://perfetto.dev/docs/concepts/service-model>`__, each with
one or more data-sources.  Perfetto already provides various producers and
data-sources for things like:

- CPU scheduling events (``linux.ftrace``)
- CPU frequency scaling (``linux.ftrace``)
- System calls (``linux.ftrace``)
- Process memory utilization (``linux.process_stats``)

It also provides various domain-specific producers.

Mesa's Perfetto support adds further producers so that GPU performance
(frequency, utilization, performance counters, etc.) can be visualized on the
same timeline, making it easier to understand, tune, and debug system-level
performance:

- pps-producer: A system-wide daemon that can collect global performance
  counters.
- mesa: Per-process producer within Mesa to capture render-stage traces
  on the GPU timeline, track events on the CPU timeline, etc.

The exact supported features vary per driver:

.. list-table:: Supported data-sources
   :header-rows: 1

   * - Driver
     - PPS Counters
     - Render Stages
   * - Freedreno
     - ``gpu.counters.msm``
     - ``gpu.renderstages.msm``
   * - Turnip
     - ``gpu.counters.msm``
     - ``gpu.renderstages.msm``
   * - Intel
     - ``gpu.counters.i915``
     - ``gpu.renderstages.intel``
   * - Panfrost
     - ``gpu.counters.panfrost``
     -
   * - V3D
     - ``gpu.counters.v3d``
     -

Run
---

To capture a trace with Perfetto you need to take the following steps:

1. Build Perfetto from sources available at ``subprojects/perfetto`` following
   `this guide <https://perfetto.dev/docs/quickstart/linux-tracing>`__.

2. Create a `trace config <https://perfetto.dev/docs/concepts/config>`__, which is
   a text file in protobuf text format, conventionally with the extension
   ``.cfg``, or use one of the config files under the ``src/tool/pps/cfg``
   directory. More examples of config files can be found in
   ``subprojects/perfetto/test/configs``.

3. Change directory to ``subprojects/perfetto`` and run a
   `convenience script <https://perfetto.dev/docs/quickstart/linux-tracing#capturing-a-trace>`__
   to start the tracing service:

   .. code-block:: sh

      cd subprojects/perfetto
      CONFIG=<path/to/gpu.cfg> OUT=out/linux_clang_release ./tools/tmux -n

4. Start other producers you may need, e.g. ``pps-producer``.

5. Start ``perfetto`` under the tmux session initiated in step 3.

6. Once tracing has finished, you can detach from tmux with :kbd:`Ctrl+b`,
   :kbd:`d`, and the convenience script should automatically copy the trace
   files into ``$HOME/Downloads``.

7. Go to `ui.perfetto.dev <https://ui.perfetto.dev>`__ and upload
   ``$HOME/Downloads/trace.protobuf`` by clicking on **Open trace file**.

8. Alternatively you can open the trace in `AGI <https://gpuinspector.dev/>`__
   (which despite the name can be used to view non-Android traces).

To be more explicit, here is a listing of commands reproducing
the steps above:

.. code-block:: sh

   # Configure Mesa with perfetto
   mesa $ meson setup build -Dperfetto=true -Dvulkan-drivers=intel,broadcom -Dgallium-drivers=
   # Build mesa
   mesa $ meson compile -C build

   # Within the Mesa repo, build perfetto
   mesa $ cd subprojects/perfetto
   perfetto $ ./tools/install-build-deps
   perfetto $ ./tools/gn gen --args='is_debug=false' out/linux
   perfetto $ ./tools/ninja -C out/linux

   # Start perfetto
   perfetto $ CONFIG=../../src/tool/pps/cfg/gpu.cfg OUT=out/linux/ ./tools/tmux -n

   # In parallel from the Mesa repo, start the PPS producer
   mesa $ ./build/src/tool/pps/pps-producer

   # Back in the perfetto tmux, press enter to start the capture

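As a rough illustration of the trace config step, the following sketch writes
a minimal config for an Intel GPU to a scratch file.  The data-source names
are the ones listed in the table of supported data-sources; the output path,
buffer size, sampling period, and duration are illustrative assumptions, not
recommended values.  The configs under ``src/tool/pps/cfg`` remain the
reference.

.. code-block:: sh

   # Hypothetical scratch location for the example config.
   cfg="${TMPDIR:-/tmp}/mesa-gpu-example.cfg"

   # Write the config using Perfetto's text config syntax.
   cat > "$cfg" <<'EOF'
   buffers {
     size_kb: 16384
     fill_policy: RING_BUFFER
   }

   data_sources {
     config {
       name: "gpu.counters.i915"
       gpu_counter_config {
         counter_period_ns: 100000
       }
     }
   }

   data_sources {
     config {
       name: "gpu.renderstages.intel"
     }
   }

   duration_ms: 10000
   EOF

   echo "wrote $cfg"

The resulting file can then be passed to the convenience script through the
``CONFIG`` variable, in place of ``gpu.cfg``.
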
CPU Tracing
~~~~~~~~~~~

Mesa's CPU tracepoints (``MESA_TRACE_*``) use Perfetto track events when
Perfetto is enabled.  They use the ``mesa.default`` and ``mesa.slow``
categories.

Currently, only EGL and the following drivers have CPU tracepoints:

- Freedreno
- V3D
- VC4

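For these tracepoints to show up in a capture, the trace config has to
request the track-event categories.  A minimal sketch, assuming Perfetto's
standard ``track_event`` data source is the one that carries them; the
fragment path is hypothetical, and the fragment is meant to be merged into an
existing config:

.. code-block:: sh

   # Hypothetical scratch location for the config fragment.
   frag="${TMPDIR:-/tmp}/mesa-cpu-tracing-fragment.cfg"

   # Request both Mesa categories explicitly.
   cat > "$frag" <<'EOF'
   data_sources {
     config {
       name: "track_event"
       track_event_config {
         enabled_categories: "mesa.default"
         enabled_categories: "mesa.slow"
       }
     }
   }
   EOF

   echo "wrote $frag"
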
Vulkan data sources
~~~~~~~~~~~~~~~~~~~

The Vulkan API gives the application control over recording of command
buffers as well as when they are submitted to the hardware. As a
consequence, command buffers must be instrumented for the Perfetto
driver data sources before Perfetto actually starts collecting traces.

This can be achieved by setting the :envvar:`MESA_GPU_TRACES`
environment variable before starting a Vulkan application:

.. code-block:: sh

   MESA_GPU_TRACES=perfetto ./build/my_vulkan_app

Driver Specifics
~~~~~~~~~~~~~~~~

Below is driver-specific information and instructions for the PPS producer.

Freedreno / Turnip
^^^^^^^^^^^^^^^^^^

The Freedreno PPS driver needs root access to read system-wide
performance counters, so you can simply run it with sudo:

.. code-block:: sh

   sudo ./build/src/tool/pps/pps-producer

Intel
^^^^^

The Intel PPS driver needs root access to read system-wide
`RenderBasic <https://www.intel.com/content/www/us/en/docs/vtune-profiler/user-guide/2023-0/gpu-metrics-reference.html>`__
performance counters, so you can simply run it with sudo:

.. code-block:: sh

   sudo ./build/src/tool/pps/pps-producer

Another option, which allows system-wide data to be read without root
permissions, is to run the following:

.. code-block:: sh

   sudo sysctl dev.i915.perf_stream_paranoid=0

Alternatively, granting the ``CAP_PERFMON`` capability to the binary should
work too.

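The two non-root options can be sketched as shell helpers.  This is an
illustration under assumptions: the function names are made up for this
example, the sysctl is read through its usual ``/proc/sys`` path, and
``setcap`` (from libcap) needs a kernel and a libcap recent enough to know
``cap_perfmon``.

.. code-block:: sh

   # Print the current i915 perf paranoia setting, if the sysctl exists.
   i915_perf_paranoid() {
       f=/proc/sys/dev/i915/perf_stream_paranoid
       if [ -r "$f" ]; then
           echo "perf_stream_paranoid=$(cat "$f")"
       else
           echo "perf_stream_paranoid unavailable (i915 not loaded?)"
       fi
   }

   # Grant CAP_PERFMON to a binary so it can later be run without sudo.
   # Hypothetical helper; not invoked automatically.
   grant_perfmon() {
       sudo setcap cap_perfmon+ep "$1"
   }

   i915_perf_paranoid
   # Example: grant_perfmon ./build/src/tool/pps/pps-producer

The capability is attached to the file itself, so it typically has to be
reapplied after the producer is rebuilt.
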
A particular metric set can also be selected to capture a different
set of HW counters:

.. code-block:: sh

   INTEL_PERFETTO_METRIC_SET=RasterizerAndPixelBackend ./build/src/tool/pps/pps-producer

Vulkan applications can also be instrumented to be Perfetto producers.
To enable this for a given application, set the environment variable as
follows:

.. code-block:: sh

   PERFETTO_TRACE=1 my_vulkan_app

Panfrost
^^^^^^^^

The Panfrost PPS driver uses unstable ioctls that behave correctly on
kernel versions `5.4.23+ <https://lwn.net/Articles/813601/>`__ and
`5.5.7+ <https://lwn.net/Articles/813600/>`__.

To run the producer, follow these two steps:

1. Enable Panfrost unstable ioctls via the module parameter:

   .. code-block:: sh

      modprobe panfrost unstable_ioctls=1

   Alternatively, you could add ``panfrost.unstable_ioctls=1`` to your kernel
   command line, or run
   ``echo 1 > /sys/module/panfrost/parameters/unstable_ioctls`` as root.

2. Run the producer:

   .. code-block:: sh

      ./build/src/tool/pps/pps-producer

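Before starting the producer, it can help to confirm that the parameter took
effect.  A small sketch that reads back the sysfs path mentioned in step 1;
the function name is made up for this example:

.. code-block:: sh

   # Print the current value of the Panfrost unstable_ioctls parameter.
   panfrost_unstable_ioctls() {
       p=/sys/module/panfrost/parameters/unstable_ioctls
       if [ -r "$p" ]; then
           echo "unstable_ioctls=$(cat "$p")"
       else
           echo "unstable_ioctls unavailable (panfrost not loaded?)"
       fi
   }

   panfrost_unstable_ioctls
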
Troubleshooting
---------------

Tmux
~~~~

If the convenience script ``tools/tmux`` keeps copying artifacts to your
``SSH_TARGET`` without starting the tmux session, make sure ``tmux`` is
installed on your system, for example on Debian-based distributions:

.. code-block:: sh

   apt install tmux

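A quick way to check for ``tmux`` before running the convenience script is
sketched below.  The function name is made up for this example; it only
relies on the POSIX ``command -v`` builtin.

.. code-block:: sh

   # Report whether tmux is available on the PATH.
   check_tmux() {
       if command -v tmux >/dev/null 2>&1; then
           echo "tmux found: $(command -v tmux)"
       else
           echo "tmux not found; install it before running tools/tmux"
       fi
   }

   check_tmux
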
Missing counter names
~~~~~~~~~~~~~~~~~~~~~

If the trace viewer shows a list of counters with a description like
``gpu_counter(#)`` instead of their proper names, the trace buffer most
likely filled up and wrapped, losing data.

To prevent this loss of data you can tweak the trace config file in
three different ways:

- Increase the size of the buffer in use:

  .. code-block:: javascript

      buffers {
          size_kb: 2048,
          fill_policy: RING_BUFFER,
      }

- Periodically flush the trace buffer into the output file:

  .. code-block:: javascript

      write_into_file: true
      file_write_period_ms: 250

- Discard new traces when the buffer fills:

  .. code-block:: javascript

      buffers {
          size_kb: 2048,
          fill_policy: DISCARD,
      }

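The first two options can be combined in a single config.  A sketch with
illustrative values: the output path, buffer size, and duration are
assumptions, and the data-source name is the Intel counter source from the
table of supported data-sources.

.. code-block:: sh

   # Hypothetical scratch location for the example config.
   cfg="${TMPDIR:-/tmp}/mesa-gpu-longtrace.cfg"

   cat > "$cfg" <<'EOF'
   # Flush the buffer to the output file every 250 ms.
   write_into_file: true
   file_write_period_ms: 250

   # Larger ring buffer than in the snippets above.
   buffers {
     size_kb: 65536
     fill_policy: RING_BUFFER
   }

   data_sources {
     config {
       name: "gpu.counters.i915"
     }
   }

   duration_ms: 30000
   EOF

   echo "wrote $cfg"
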