Perfetto Tracing
================

Mesa has experimental support for `Perfetto <https://perfetto.dev>`__ for
GPU performance monitoring. Perfetto supports multiple
`producers <https://perfetto.dev/docs/concepts/service-model>`__, each with
one or more data-sources. Perfetto already provides various producers and
data-sources for things like:

- CPU scheduling events (``linux.ftrace``)
- CPU frequency scaling (``linux.ftrace``)
- System calls (``linux.ftrace``)
- Process memory utilization (``linux.process_stats``)

It also provides various domain-specific producers.

The Mesa Perfetto support adds additional producers, to allow for visualizing
GPU performance (frequency, utilization, performance counters, etc.) on the
same timeline, to better understand and tune/debug system-level performance:

- pps-producer: A systemwide daemon that can collect global performance
  counters.
- mesa: Per-process producer within Mesa to capture render-stage traces
  on the GPU timeline, track events on the CPU timeline, etc.

The exact supported features vary per driver:

.. list-table:: Supported data-sources
   :header-rows: 1

   * - Driver
     - PPS Counters
     - Render Stages
   * - Freedreno
     - ``gpu.counters.msm``
     - ``gpu.renderstages.msm``
   * - Turnip
     - ``gpu.counters.msm``
     - ``gpu.renderstages.msm``
   * - Intel
     - ``gpu.counters.i915``
     - ``gpu.renderstages.intel``
   * - Panfrost
     - ``gpu.counters.panfrost``
     -
   * - V3D
     - ``gpu.counters.v3d``
     -

Run
---

To capture a trace with Perfetto you need to take the following steps:

1. Build Perfetto from sources available at ``subprojects/perfetto`` following
   `this guide <https://perfetto.dev/docs/quickstart/linux-tracing>`__.

2. Create a `trace config <https://perfetto.dev/docs/concepts/config>`__, which is
   a text file in protobuf text format with extension ``.cfg``, or use one of
   the config files under the ``src/tool/pps/cfg`` directory. More examples of
   config files can be found in ``subprojects/perfetto/test/configs``.

3. Change directory to ``subprojects/perfetto`` and run a
   `convenience script <https://perfetto.dev/docs/quickstart/linux-tracing#capturing-a-trace>`__
   to start the tracing service:

   .. code-block:: sh

      cd subprojects/perfetto
      CONFIG=<path/to/gpu.cfg> OUT=out/linux_clang_release ./tools/tmux -n

4. Start other producers you may need, e.g. ``pps-producer``.

5. Start ``perfetto`` under the tmux session initiated in step 3.

6. Once tracing has finished, you can detach from tmux with :kbd:`Ctrl+b`,
   :kbd:`d`, and the convenience script should automatically copy the trace
   files into ``$HOME/Downloads``.

7. Go to `ui.perfetto.dev <https://ui.perfetto.dev>`__ and upload
   ``$HOME/Downloads/trace.protobuf`` by clicking on **Open trace file**.

8. Alternatively you can open the trace in `AGI <https://gpuinspector.dev/>`__
   (which despite the name can be used to view non-Android traces).

To be more explicit, here is a listing of commands reproducing
the steps above:

.. code-block:: sh

   # Configure Mesa with perfetto
   mesa $ meson setup build -Dperfetto=true -Dvulkan-drivers=intel,broadcom -Dgallium-drivers=
   # Build mesa
   mesa $ meson compile -C build

   # Within the Mesa repo, build perfetto
   mesa $ cd subprojects/perfetto
   perfetto $ ./tools/install-build-deps
   perfetto $ ./tools/gn gen --args='is_debug=false' out/linux
   perfetto $ ./tools/ninja -C out/linux

   # Start perfetto
   perfetto $ CONFIG=../../src/tool/pps/cfg/gpu.cfg OUT=out/linux/ ./tools/tmux -n

   # In parallel from the Mesa repo, start the PPS producer
   mesa $ ./build/src/tool/pps/pps-producer

   # Back in the perfetto tmux, press enter to start the capture

CPU Tracing
~~~~~~~~~~~

Mesa's CPU tracepoints (``MESA_TRACE_*``) use Perfetto track events when
Perfetto is enabled. They use the ``mesa.default`` and ``mesa.slow`` categories.

Currently, only EGL and the following drivers have CPU tracepoints:

- Freedreno
- V3D
- VC4

Vulkan data sources
~~~~~~~~~~~~~~~~~~~

The Vulkan API gives the application control over recording of command
buffers as well as when they are submitted to the hardware. As a
consequence, we need to ensure command buffers are properly
instrumented for the Perfetto driver data sources prior to Perfetto
actually collecting traces.

This can be achieved by setting the :envvar:`MESA_GPU_TRACES`
environment variable before starting a Vulkan application:

.. code-block:: sh

   MESA_GPU_TRACES=perfetto ./build/my_vulkan_app

Driver Specifics
~~~~~~~~~~~~~~~~

Below are driver-specific information and instructions for the PPS producer.

Freedreno / Turnip
^^^^^^^^^^^^^^^^^^

The Freedreno PPS driver needs root access to read system-wide
performance counters, so you can simply run it with sudo:

.. code-block:: sh

   sudo ./build/src/tool/pps/pps-producer

Intel
^^^^^

The Intel PPS driver needs root access to read system-wide
`RenderBasic <https://www.intel.com/content/www/us/en/docs/vtune-profiler/user-guide/2023-0/gpu-metrics-reference.html>`__
performance counters, so you can simply run it with sudo:

.. code-block:: sh

   sudo ./build/src/tool/pps/pps-producer

Another option, which enables access to system-wide data without root
permissions, is to run the following:

.. code-block:: sh

   sudo sysctl dev.i915.perf_stream_paranoid=0

Alternatively, granting the ``CAP_PERFMON`` capability to the binary should work too.

A particular metric set can also be selected to capture a different
set of HW counters:

.. code-block:: sh

   INTEL_PERFETTO_METRIC_SET=RasterizerAndPixelBackend ./build/src/tool/pps/pps-producer

Vulkan applications can also be instrumented to be Perfetto producers.
To enable this for a given application, set the environment variable as
follows:

.. code-block:: sh

   PERFETTO_TRACE=1 my_vulkan_app

Panfrost
^^^^^^^^

The Panfrost PPS driver uses unstable ioctls that behave correctly on
kernel versions `5.4.23+ <https://lwn.net/Articles/813601/>`__ and
`5.5.7+ <https://lwn.net/Articles/813600/>`__.

To run the producer, follow these two steps:

1. Enable Panfrost unstable ioctls via the kernel module parameter:

   .. code-block:: sh

      modprobe panfrost unstable_ioctls=1

   Alternatively you could add ``panfrost.unstable_ioctls=1`` to your kernel
   command line, or ``echo 1 > /sys/module/panfrost/parameters/unstable_ioctls``.

2. Run the producer:

   .. code-block:: sh

      ./build/pps-producer

Troubleshooting
---------------

Tmux
~~~~

If the convenience script ``tools/tmux`` keeps copying artifacts to your
``SSH_TARGET`` without starting the tmux session, make sure you have ``tmux``
installed on your system.

.. code-block:: sh

   apt install tmux

Missing counter names
~~~~~~~~~~~~~~~~~~~~~

If the trace viewer shows a list of counters with a description like
``gpu_counter(#)`` instead of their proper names, you may have lost data
because the trace buffer filled up and wrapped.

To prevent this loss of data you can tweak the trace config file in
one of the following ways:

- Increase the size of the buffer in use:

  .. code-block:: javascript

     buffers {
       size_kb: 2048,
       fill_policy: RING_BUFFER,
     }

- Periodically flush the trace buffer into the output file:

  .. code-block:: javascript

     write_into_file: true
     file_write_period_ms: 250

- Discard new traces when the buffer fills:

  .. code-block:: javascript

     buffers {
       size_kb: 2048,
       fill_policy: DISCARD,
     }