LLVMpipe
========

Introduction
------------

The Gallium llvmpipe driver is a software rasterizer that uses LLVM to
do runtime code generation. Shaders, point/line/triangle rasterization
and vertex processing are implemented with LLVM IR which is translated
to x86, x86-64, or ppc64le machine code. Also, the driver is
multithreaded to take advantage of multiple CPU cores (up to 8 at this
time). It's the fastest software rasterizer for Mesa.

Requirements
------------

-  For x86 or amd64 processors, 64-bit mode is recommended. Support for
   SSE2 is strongly encouraged. Support for SSE3 and SSE4.1 will yield
   the most efficient code. The fewer features the CPU has the more
   likely it is that you will run into underperforming, buggy, or
   incomplete code.

   For ppc64le processors, use of the Altivec feature (the Vector
   Facility) is recommended if supported; use of the VSX feature (the
   Vector-Scalar Facility) is recommended if supported AND Mesa is built
   with LLVM version 4.0 or later.

   See ``/proc/cpuinfo`` to know what your CPU supports.

-  Unless otherwise stated, LLVM version 3.4 is recommended; 3.3 or
   later is required.

   For Linux, on a recent Debian based distribution do:

   .. code-block:: console

      aptitude install llvm-dev

   If you want development snapshot builds of LLVM for Debian and
   derived distributions like Ubuntu, you can use the APT repository at
   `apt.llvm.org <https://apt.llvm.org/>`__, which are maintained by
   Debian's LLVM maintainer.

   For a RPM-based distribution do:

   .. code-block:: console

      yum install llvm-devel

   For Windows you will need to build LLVM from source with MSVC or
   MINGW (either natively or through cross compilers) and CMake, and set
   the ``LLVM`` environment variable to the directory you installed it
   to.
   LLVM will be statically linked, so when building on MSVC it needs
   to be built with a matching CRT as Mesa, and you'll need to pass
   ``-DLLVM_USE_CRT_xxx=yyy`` as described below.

   +-----------------+----------------------------------------------------------------+
   | LLVM build-type | Mesa build-type                                                |
   |                 +--------------------------------+-------------------------------+
   |                 | debug,checked                  | release,profile               |
   +=================+================================+===============================+
   | Debug           | ``-DLLVM_USE_CRT_DEBUG=MTd``   | ``-DLLVM_USE_CRT_DEBUG=MT``   |
   +-----------------+--------------------------------+-------------------------------+
   | Release         | ``-DLLVM_USE_CRT_RELEASE=MTd`` | ``-DLLVM_USE_CRT_RELEASE=MT`` |
   +-----------------+--------------------------------+-------------------------------+

   You can build only the x86 target by passing
   ``-DLLVM_TARGETS_TO_BUILD=X86`` to cmake.

-  scons (optional)

Building
--------

To build everything on Linux invoke scons as:

.. code-block:: console

   scons build=debug libgl-xlib

Alternatively, you can build it with meson with:

.. code-block:: console

   mkdir build
   cd build
   meson -D glx=gallium-xlib -D gallium-drivers=swrast
   ninja

but the rest of these instructions assume that scons is used. For
Windows the procedure is similar except the target:

.. code-block:: console

   scons platform=windows build=debug libgl-gdi

Using
-----

Linux
~~~~~

On Linux, building will create a drop-in alternative for ``libGL.so``
into

::

   build/foo/gallium/targets/libgl-xlib/libGL.so

or

::

   lib/gallium/libGL.so

To use it set the ``LD_LIBRARY_PATH`` environment variable accordingly.

For performance evaluation pass ``build=release`` to scons, and use the
corresponding lib directory without the ``-debug`` suffix.
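
As a minimal sketch of setting ``LD_LIBRARY_PATH`` for the scons output
location, the snippet below prepends the ``libgl-xlib`` directory to the
loader search path. It assumes it is run from the top of the Mesa source
tree. ``BUILD_NAME`` and ``LIBGL_DIR`` are hypothetical helper variables,
and the default ``linux-x86_64-debug`` is only borrowed from the unit
testing example path, so substitute the directory your build actually
produced. The ``glxinfo`` check is optional and assumes that tool is
installed.

```shell
# Assumption: the current directory is the top of the Mesa source tree.
# BUILD_NAME stands in for the "foo" path component; the default below is
# a guess taken from the unit testing example and may differ on your system.
BUILD_NAME="${BUILD_NAME:-linux-x86_64-debug}"
LIBGL_DIR="$PWD/build/$BUILD_NAME/gallium/targets/libgl-xlib"

# Prepend the directory so the dynamic loader searches it before the
# system libGL.so; keep any existing entries after it.
export LD_LIBRARY_PATH="$LIBGL_DIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Optional check: if glxinfo is available, show which renderer is in use.
if command -v glxinfo >/dev/null 2>&1; then
    glxinfo | grep "OpenGL renderer" || true
fi
```

Applications started from the same shell afterwards should pick up the
freshly built library; unset or restore ``LD_LIBRARY_PATH`` to go back to
the system one.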

Windows
~~~~~~~

On Windows, building will create
``build/windows-x86-debug/gallium/targets/libgl-gdi/opengl32.dll`` which
is a drop-in alternative for the system's ``opengl32.dll``. To use it put
it in the same directory as your application. It can also be used by
replacing the native ICD driver, but it's quite an advanced usage, so if
you need to ask, don't even try it.

There is however an easy way to replace the OpenGL software renderer
that comes with Microsoft Windows 7 (or later) with llvmpipe (that is,
on systems without any OpenGL drivers):

-  copy
   ``build/windows-x86-debug/gallium/targets/libgl-gdi/opengl32.dll`` to
   ``C:\Windows\SysWOW64\mesadrv.dll``

-  load these registry settings:

   ::

      REGEDIT4

      ; https://technet.microsoft.com/en-us/library/cc749368.aspx
      ; https://www.msfn.org/board/topic/143241-portable-windows-7-build-from-winpe-30/page-5#entry942596
      [HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Windows NT\CurrentVersion\OpenGLDrivers\MSOGL]
      "DLL"="mesadrv.dll"
      "DriverVersion"=dword:00000001
      "Flags"=dword:00000001
      "Version"=dword:00000002

-  Ditto for 64-bit drivers if you need them.

Profiling
---------

To profile llvmpipe you should build as

::

   scons build=profile <same-as-before>

This will ensure that frame pointers are used both in C and JIT
functions, and that no tail call optimizations are done by gcc.

Linux perf integration
~~~~~~~~~~~~~~~~~~~~~~

On Linux, it is possible to have symbol resolution of JIT code with
`Linux perf <https://perf.wiki.kernel.org/>`__:

::

   perf record -g /my/application
   perf report

When run inside Linux perf, llvmpipe will create a
``/tmp/perf-XXXXX.map`` file with the symbol address table.
It also dumps assembly code to ``/tmp/perf-XXXXX.map.asm``, which can be
used by the ``bin/perf-annotate-jit.py`` script to produce disassembly of
the generated code annotated with the samples.

You can obtain a call graph via
`Gprof2Dot <https://github.com/jrfonseca/gprof2dot#linux-perf>`__.

Unit testing
------------

Building will also create several unit tests in
``build/linux-???-debug/gallium/drivers/llvmpipe``:

-  ``lp_test_blend``: blending
-  ``lp_test_conv``: SIMD vector conversion
-  ``lp_test_format``: pixel unpacking/packing

Some of these tests can output results and benchmarks to a tab-separated
file for later analysis, e.g.:

::

   build/linux-x86_64-debug/gallium/drivers/llvmpipe/lp_test_blend -o blend.tsv

Development Notes
-----------------

-  When looking at this code for the first time, start in
   ``lp_state_fs.c``, and then skim through the ``lp_bld_*`` functions
   called there, and the comments at the top of the ``lp_bld_*.c``
   files.
-  The driver-independent parts of the LLVM / Gallium code are found in
   ``src/gallium/auxiliary/gallivm/``. The filenames and function
   prefixes need to be renamed from ``lp_bld_`` to something else
   though.
-  We use LLVM-C bindings for now. They are not documented, but follow
   the C++ interfaces very closely, and appear to be complete enough for
   code generation. See `this stand-alone
   example <https://npcontemplation.blogspot.com/2008/06/secret-of-llvm-c-bindings.html>`__.
   See the ``llvm-c/Core.h`` file for reference.

.. _recommended_reading:

Recommended Reading
-------------------

-  Rasterization

   -  `Triangle Scan Conversion using 2D Homogeneous
      Coordinates <https://www.cs.unc.edu/~olano/papers/2dh-tri/>`__
   -  `Rasterization on
      Larrabee <http://www.drdobbs.com/parallel/rasterization-on-larrabee/217200602>`__
      (`DevMaster
      copy <http://devmaster.net/posts/2887/rasterization-on-larrabee>`__)
   -  `Rasterization using half-space
      functions <http://devmaster.net/posts/6133/rasterization-using-half-space-functions>`__
   -  `Advanced
      Rasterization <http://devmaster.net/posts/6145/advanced-rasterization>`__
   -  `Optimizing Software Occlusion
      Culling <https://fgiesen.wordpress.com/2013/02/17/optimizing-sw-occlusion-culling-index/>`__

-  Texture sampling

   -  `Perspective Texture
      Mapping <http://chrishecker.com/Miscellaneous_Technical_Articles#Perspective_Texture_Mapping>`__
   -  `Texturing As In
      Unreal <https://www.flipcode.com/archives/Texturing_As_In_Unreal.shtml>`__
   -  `Run-Time MIP-Map
      Filtering <http://www.gamasutra.com/view/feature/3301/runtime_mipmap_filtering.php>`__
   -  `Will "brilinear" filtering
      persist? <http://alt.3dcenter.org/artikel/2003/10-26_a_english.php>`__
   -  `Trilinear
      filtering <http://ixbtlabs.com/articles2/gffx/nv40-rx800-3.html>`__
   -  `Texture
      Swizzling <http://devmaster.net/posts/12785/texture-swizzling>`__

-  SIMD

   -  `Whole-Function
      Vectorization <http://www.cdl.uni-saarland.de/projects/wfv/#header4>`__

-  Optimization

   -  `Optimizing Pixomatic For Modern x86
      Processors <http://www.drdobbs.com/optimizing-pixomatic-for-modern-x86-proc/184405807>`__
   -  `Intel 64 and IA-32 Architectures Optimization Reference
      Manual <http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html>`__
   -  `Software optimization
      resources <http://www.agner.org/optimize/>`__
   -  `Intel Intrinsics
      Guide <https://software.intel.com/en-us/articles/intel-intrinsics-guide>`__

-  LLVM

   -  `LLVM Language Reference
      Manual <http://llvm.org/docs/LangRef.html>`__
   -  `The secret of LLVM C
      bindings <https://npcontemplation.blogspot.co.uk/2008/06/secret-of-llvm-c-bindings.html>`__

-  General

   -  `A trip through the Graphics
      Pipeline <https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/>`__
   -  `WARP Architecture and
      Performance <https://msdn.microsoft.com/en-us/library/gg615082.aspx#architecture>`__