Perfetto Tracing
================

Mesa has experimental support for `Perfetto <https://perfetto.dev>`__ for
GPU performance monitoring. Perfetto supports multiple
`producers <https://perfetto.dev/docs/concepts/service-model>`__, each with
one or more data-sources. Perfetto already provides various producers and
data-sources for things like:

- CPU scheduling events (``linux.ftrace``)
- CPU frequency scaling (``linux.ftrace``)
- System calls (``linux.ftrace``)
- Process memory utilization (``linux.process_stats``)

As well as various domain-specific producers.

The Mesa Perfetto support adds additional producers, to allow for visualizing
GPU performance (frequency, utilization, performance counters, etc.) on the
same timeline, to better understand and tune/debug system-level performance:

- pps-producer: A system-wide daemon that can collect global performance
  counters.
- mesa: Per-process producer within Mesa to capture render-stage traces
  on the GPU timeline, track events, etc.

The exact supported features vary per driver:

.. list-table:: Supported data-sources
   :header-rows: 1

   * - Driver
     - PPS Counters
     - Render Stages
   * - Freedreno
     - ``gpu.counters.msm``
     - ``gpu.renderstages.msm``
   * - Turnip
     - ``gpu.counters.msm``
     -
   * - Intel
     - ``gpu.counters.i915``
     -
   * - Panfrost
     - ``gpu.counters.panfrost``
     -

Run
---

To capture a trace with Perfetto you need to take the following steps:

1. Build Perfetto from the sources available at ``subprojects/perfetto``
   following
   `this guide <https://perfetto.dev/docs/quickstart/linux-tracing>`__.

2. Create a `trace config <https://perfetto.dev/#/trace-config.md>`__, which is
   a text file in protobuf text format with extension ``.cfg``, or use one of
   the config files under the ``src/tool/pps/cfg`` directory. More examples of
   config files can be found in ``subprojects/perfetto/test/configs``.

3. Change directory to ``subprojects/perfetto`` and run a
   `convenience script <https://perfetto.dev/#/running.md>`__ to start the
   tracing service:

   .. code-block:: console

      cd subprojects/perfetto
      CONFIG=<path/to/gpu.cfg> OUT=out/linux_clang_release ./tools/tmux -n

4. Start other producers you may need, e.g. ``pps-producer``.

5. Start ``perfetto`` under the tmux session initiated in step 3.

6. Once tracing has finished, you can detach from tmux with :kbd:`Ctrl+b`,
   :kbd:`d`, and the convenience script should automatically copy the trace
   files into ``$HOME/Downloads``.

7. Go to `ui.perfetto.dev <https://ui.perfetto.dev>`__ and upload
   ``$HOME/Downloads/trace.protobuf`` by clicking on **Open trace file**.

8. Alternatively you can open the trace in `AGI <https://gpuinspector.dev/>`__
   (which despite the name can be used to view non-Android traces).

To be a bit more explicit, here is a listing of commands reproducing
the steps above:

.. code-block:: console

   # Configure Mesa with perfetto
   mesa $ meson . build -Dperfetto=true -Dvulkan-drivers=intel,broadcom -Dgallium-drivers=
   # Build mesa
   mesa $ ninja -C build

   # Within the Mesa repo, build perfetto
   mesa $ cd subprojects/perfetto
   perfetto $ ./tools/install-build-deps
   perfetto $ ./tools/gn gen --args='is_debug=false' out/linux
   perfetto $ ./tools/ninja -C out/linux

   # Start perfetto
   perfetto $ CONFIG=../../src/tool/pps/cfg/gpu.cfg OUT=out/linux/ ./tools/tmux -n

   # In parallel from the Mesa repo, start the PPS producer
   mesa $ ./build/src/tool/pps/pps-producer

   # Back in the perfetto tmux, press enter to start the capture

Vulkan data sources
~~~~~~~~~~~~~~~~~~~

The Vulkan API gives the application control over recording of command
buffers as well as when they are submitted to the hardware. As a
consequence, we need to ensure command buffers are properly
instrumented for the Perfetto driver data sources before Perfetto
actually starts collecting traces.

This can be achieved by setting the ``GPU_TRACE_INSTRUMENT``
environment variable before starting a Vulkan application:

.. code-block:: console

   GPU_TRACE_INSTRUMENT=1 ./build/my_vulkan_app

Driver Specifics
~~~~~~~~~~~~~~~~

Below is driver-specific information and instructions for the PPS producer.

Freedreno / Turnip
^^^^^^^^^^^^^^^^^^

The Freedreno PPS driver needs root access to read system-wide
performance counters, so you can simply run it with sudo:

.. code-block:: console

   sudo ./build/src/tool/pps/pps-producer

Intel
^^^^^

The Intel PPS driver needs root access to read system-wide
`RenderBasic <https://software.intel.com/content/www/us/en/develop/documentation/vtune-help/top/reference/gpu-metrics-reference.html>`__
performance counters, so you can simply run it with sudo:

.. code-block:: console

   sudo ./build/src/tool/pps/pps-producer

Another option, which enables system-wide data access without root
permissions, is to run the following:

.. code-block:: console

   sudo sysctl dev.i915.perf_stream_paranoid=0

Alternatively, granting the ``CAP_PERFMON`` capability to the binary should
work too.

A particular metric set can also be selected to capture a different
set of HW counters:

.. code-block:: console

   INTEL_PERFETTO_METRIC_SET=RasterizerAndPixelBackend ./build/src/tool/pps/pps-producer

Vulkan applications can also be instrumented to be Perfetto producers.
To enable this for a given application, set the environment variable as
follows:

.. code-block:: console

   PERFETTO_TRACE=1 my_vulkan_app

Panfrost
^^^^^^^^

The Panfrost PPS driver uses unstable ioctls that behave correctly on
kernel versions `5.4.23+ <https://lwn.net/Articles/813601/>`__ and
`5.5.7+ <https://lwn.net/Articles/813600/>`__.

To run the producer, follow these two simple steps:

1. Enable Panfrost unstable ioctls via the kernel module parameter:

   .. code-block:: console

      modprobe panfrost unstable_ioctls=1

   Alternatively you could add ``panfrost.unstable_ioctls=1`` to your kernel
   command line, or
   ``echo 1 > /sys/module/panfrost/parameters/unstable_ioctls``.

2. Run the producer:

   .. code-block:: console

      ./build/src/tool/pps/pps-producer

Troubleshooting
---------------

Tmux
~~~~

If the convenience script ``tools/tmux`` keeps copying artifacts to your
``SSH_TARGET`` without starting the tmux session, make sure you have ``tmux``
installed on your system.

.. code-block:: console

   apt install tmux

Missing counter names
~~~~~~~~~~~~~~~~~~~~~

If the trace viewer shows a list of counters with a description like
``gpu_counter(#)`` instead of their proper names, the trace probably lost data
because the trace buffer filled up and wrapped.

To prevent this loss of data you can tweak the trace config file in one of
the following ways:

- Increase the size of the buffer in use:

  .. code-block:: javascript

     buffers {
       size_kb: 2048,
       fill_policy: RING_BUFFER,
     }

- Periodically flush the trace buffer into the output file:

  .. code-block:: javascript

     write_into_file: true
     file_write_period_ms: 250

- Discard new trace data when the buffer fills, instead of overwriting
  older data:

  .. code-block:: javascript

     buffers {
       size_kb: 2048,
       fill_policy: DISCARD,
     }

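
As an illustration, the first two options can be combined in a single trace
config. The following is only a sketch: the buffer size, the capture duration
and the choice of the ``gpu.counters.i915`` data source are assumptions to
adapt to your driver and workload, not values taken from the config files
shipped under ``src/tool/pps/cfg``.

.. code-block:: text

   buffers {
     # Assumed size, larger than the 2048 KB shown above to delay wrapping
     size_kb: 16384
     fill_policy: RING_BUFFER
   }

   data_sources {
     config {
       # Assumption: Intel counters; use the data-source name of your driver
       name: "gpu.counters.i915"
     }
   }

   # Assumed capture length
   duration_ms: 10000

   # Periodically flush the buffer to the output file
   write_into_file: true
   file_write_period_ms: 250

If counter names are still missing with such a config, shortening the flush
period or increasing ``size_kb`` further are the first things to try.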