LLVMpipe
========

Introduction
------------

The Gallium LLVMpipe driver is a software rasterizer that uses LLVM to
do runtime code generation. Shaders, point/line/triangle rasterization
and vertex processing are implemented with LLVM IR which is translated
to x86, x86-64, or ppc64le machine code. Also, the driver is
multithreaded to take advantage of multiple CPU cores (up to 32 at this
time). It's the fastest software rasterizer for Mesa.

Requirements
------------

- For x86 or amd64 processors, 64-bit mode is recommended. Support for
  SSE2 is strongly encouraged. Support for SSE3 and SSE4.1 will yield
  the most efficient code. The fewer features the CPU has, the more
  likely it is that you will run into underperforming, buggy, or
  incomplete code.

  For ppc64le processors, use of the Altivec feature (the Vector
  Facility) is recommended if supported; use of the VSX feature (the
  Vector-Scalar Facility) is recommended if supported AND Mesa is built
  with LLVM version 4.0 or later.

  See ``/proc/cpuinfo`` to find out what your CPU supports.

- Unless otherwise stated, LLVM version 3.9 or later is required.

  For Linux, on a recent Debian-based distribution do:

  .. code-block:: sh

     aptitude install llvm-dev

  If you want development snapshot builds of LLVM for Debian and
  derived distributions like Ubuntu, you can use the APT repository at
  `apt.llvm.org <https://apt.llvm.org/>`__, which is maintained by
  Debian's LLVM maintainer.

  For an RPM-based distribution do:

  .. code-block:: sh

     yum install llvm-devel

  If you want development snapshot builds of LLVM for Fedora, you can
  use the Copr repository at `fedora-llvm-team/llvm-snapshots <https://copr.fedorainfracloud.org/coprs/g/fedora-llvm-team/llvm-snapshots/>`__,
  which is maintained by Red Hat's LLVM team.

  For Windows you will need to build LLVM from source with MSVC or
  MinGW (either natively or through cross compilers) and CMake, and set
  the ``LLVM`` environment variable to the directory you installed it
  to. LLVM will be statically linked, so when building on MSVC it needs
  to be built with the same CRT as Mesa, and you'll need to pass
  ``-DLLVM_USE_CRT_xxx=yyy`` as described below.

  +-----------------+----------------------------------------------------------------+
  | LLVM build-type | Mesa build-type                                                |
  |                 +--------------------------------+-------------------------------+
  |                 | debug,checked                  | release,profile               |
  +=================+================================+===============================+
  | Debug           | ``-DLLVM_USE_CRT_DEBUG=MTd``   | ``-DLLVM_USE_CRT_DEBUG=MT``   |
  +-----------------+--------------------------------+-------------------------------+
  | Release         | ``-DLLVM_USE_CRT_RELEASE=MTd`` | ``-DLLVM_USE_CRT_RELEASE=MT`` |
  +-----------------+--------------------------------+-------------------------------+

  You can build only the x86 target by passing
  ``-DLLVM_TARGETS_TO_BUILD=X86`` to CMake.

Building
--------

To build everything on Linux invoke meson as:

.. code-block:: sh

   mkdir build
   cd build
   meson -D glx=xlib -D gallium-drivers=swrast
   ninja


Using
-----

Environment variables
~~~~~~~~~~~~~~~~~~~~~

.. envvar:: LP_NATIVE_VECTOR_WIDTH

   Overrides the vector width, in bits, that LLVMpipe generates code
   for. This is useful because LLVMpipe is sometimes fastest with
   128-bit vectors while still using AVX instructions.

.. envvar:: GALLIUM_NOSSE

   Deprecated in favor of ``GALLIUM_OVERRIDE_CPU_CAPS``;
   use ``GALLIUM_OVERRIDE_CPU_CAPS=nosse`` instead.

.. envvar:: LP_FORCE_SSE2

   Deprecated in favor of ``GALLIUM_OVERRIDE_CPU_CAPS``;
   use ``GALLIUM_OVERRIDE_CPU_CAPS=sse2`` instead.

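
As a rough illustration of how these variables are typically used, the
sketch below sets each one for a single run of a program. ``APP`` is a
hypothetical placeholder for your own OpenGL application and is not part
of Mesa; it defaults to ``true`` only so that the snippet runs as
written. The value ``128`` is simply the vector width mentioned in the
description of ``LP_NATIVE_VECTOR_WIDTH``.

.. code-block:: sh

   # APP is a placeholder (an assumption): point it at your own OpenGL program.
   APP="${APP:-true}"

   # Ask LLVMpipe to generate code for 128-bit vectors, for this run only.
   LP_NATIVE_VECTOR_WIDTH=128 "$APP"

   # Replacement for the deprecated GALLIUM_NOSSE.
   GALLIUM_OVERRIDE_CPU_CAPS=nosse "$APP"

   # Replacement for the deprecated LP_FORCE_SSE2.
   GALLIUM_OVERRIDE_CPU_CAPS=sse2 "$APP"

Prefixing the command with the assignment affects only that one
process, which makes it easy to compare runs with different settings.
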

Linux
~~~~~

On Linux, building will create a drop-in alternative for ``libGL.so``
in

::

   build/foo/gallium/targets/libgl-xlib/libGL.so

or

::

   lib/gallium/libGL.so

To use it, set the ``LD_LIBRARY_PATH`` environment variable accordingly.

Windows
~~~~~~~

On Windows, building will create
``build/windows-x86-debug/gallium/targets/libgl-gdi/opengl32.dll``, which
is a drop-in alternative for the system's ``opengl32.dll`` and uses
the Mesa ICD, ``build/windows-x86-debug/gallium/targets/wgl/libgallium_wgl.dll``.
To use it, put both DLLs in the same directory as your application. It can also
be used by replacing the native ICD driver, but that is quite an advanced usage, so if
you need to ask, don't even try it.

There is, however, an easy way to replace the OpenGL software renderer
that comes with Microsoft Windows 7 (or later) with LLVMpipe (that is,
on systems without any OpenGL drivers):

- copy
  ``build/windows-x86-debug/gallium/targets/wgl/libgallium_wgl.dll`` to
  ``C:\Windows\SysWOW64\mesadrv.dll``

- load these registry settings:

  ::

     REGEDIT4

     ; https://technet.microsoft.com/en-us/library/cc749368.aspx
     ; https://www.msfn.org/board/topic/143241-portable-windows-7-build-from-winpe-30/page-5#entry942596
     [HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Windows NT\CurrentVersion\OpenGLDrivers\MSOGL]
     "DLL"="mesadrv.dll"
     "DriverVersion"=dword:00000001
     "Flags"=dword:00000001
     "Version"=dword:00000002

- Ditto for 64-bit drivers if you need them.


Profiling
---------

Linux perf integration
~~~~~~~~~~~~~~~~~~~~~~

On Linux, it is possible to have symbol resolution of JIT code with
`Linux perf <https://perf.wiki.kernel.org/>`__:

::

   perf record -g /my/application
   perf report

When run inside Linux perf, LLVMpipe will create a
``/tmp/perf-XXXXX.map`` file with a symbol address table. It also dumps
assembly code to ``/tmp/perf-XXXXX.map.asm``, which can be used by the
``bin/perf-annotate-jit.py`` script to produce disassembly of the
generated code annotated with the samples.

You can obtain a call graph via
`Gprof2Dot <https://github.com/jrfonseca/gprof2dot#linux-perf>`__.

Unit testing
------------

Building will also create several unit tests in
``build/linux-???-debug/gallium/drivers/llvmpipe``:

- ``lp_test_blend``: blending
- ``lp_test_conv``: SIMD vector conversion
- ``lp_test_format``: pixel unpacking/packing

Some of these tests can output results and benchmarks to a tab-separated
file for later analysis, e.g.:

::

   build/linux-x86_64-debug/gallium/drivers/llvmpipe/lp_test_blend -o blend.tsv

Development Notes
-----------------

- When looking at this code for the first time, start in ``lp_state_fs.c``,
  and then skim through the ``lp_bld_*`` functions called there, and
  the comments at the top of the ``lp_bld_*.c`` files.
- The driver-independent parts of the LLVM / Gallium code are found in
  ``src/gallium/auxiliary/gallivm/``. The filenames and function
  prefixes need to be renamed from ``lp_bld_`` to something else
  though.
- We use LLVM-C bindings for now. They are not documented, but follow
  the C++ interfaces very closely, and appear to be complete enough for
  code generation. See `this stand-alone
  example <https://npcontemplation.blogspot.com/2008/06/secret-of-llvm-c-bindings.html>`__.
  See the ``llvm-c/Core.h`` file for reference.

.. _recommended_reading:

Recommended Reading
-------------------

- Rasterization

  - `Triangle Scan Conversion using 2D Homogeneous
    Coordinates <https://redirect.cs.umbc.edu/~olano/papers/2dh-tri/>`__
  - `Rasterization on
    Larrabee <https://www.drdobbs.com/parallel/rasterization-on-larrabee/217200602>`__
  - `Rasterization using half-space
    functions <http://web.archive.org/web/20110820052005/http://www.devmaster.net/codespotlight/show.php?id=17>`__
  - `Advanced
    Rasterization <http://web.archive.org/web/20140514220546/http://devmaster.net/posts/6145/advanced-rasterization>`__
  - `Optimizing Software Occlusion
    Culling <https://fgiesen.wordpress.com/2013/02/17/optimizing-sw-occlusion-culling-index/>`__

- Texture sampling

  - `Perspective Texture
    Mapping <https://chrishecker.com/Miscellaneous_Technical_Articles#Perspective_Texture_Mapping>`__
  - `Texturing As In
    Unreal <https://www.flipcode.com/archives/Texturing_As_In_Unreal.shtml>`__
  - `Run-Time MIP-Map
    Filtering <http://web.archive.org/web/20220709145555/http://www.gamasutra.com/view/feature/3301/runtime_mipmap_filtering.php>`__
  - `Will "brilinear" filtering
    persist? <https://alt.3dcenter.org/artikel/2003/10-26_a_english.php>`__
  - `Trilinear
    filtering <http://ixbtlabs.com/articles2/gffx/nv40-rx800-3.html>`__
  - `Texture tiling and
    swizzling <https://fgiesen.wordpress.com/2011/01/17/texture-tiling-and-swizzling/>`__

- SIMD

  - `Whole-Function
    Vectorization <https://compilers.cs.uni-saarland.de/projects/wfv/#pubs>`__

- Optimization

  - `Optimizing Pixomatic For Modern x86
    Processors <https://www.drdobbs.com/optimizing-pixomatic-for-modern-x86-proc/184405807>`__
  - `Intel 64 and IA-32 Architectures Optimization Reference
    Manual <https://www.intel.com/content/www/us/en/content-details/779559/intel-64-and-ia-32-architectures-optimization-reference-manual.html>`__
  - `Software optimization
    resources <https://www.agner.org/optimize/>`__
  - `Intel Intrinsics
    Guide <https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html>`__

- LLVM

  - `LLVM Language Reference
    Manual <https://llvm.org/docs/LangRef.html>`__
  - `The secret of LLVM C
    bindings <https://npcontemplation.blogspot.com/2008/06/secret-of-llvm-c-bindings.html>`__

- General

  - `A trip through the Graphics
    Pipeline <https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/>`__
  - `WARP Architecture and
    Performance <https://learn.microsoft.com/en-us/windows/win32/direct3darticles/directx-warp#warp-architecture-and-performance>`__