LLVMpipe
========

Introduction
------------

The Gallium LLVMpipe driver is a software rasterizer that uses LLVM to
do runtime code generation. Shaders, point/line/triangle rasterization
and vertex processing are implemented with LLVM IR which is translated
to x86, x86-64, or ppc64le machine code. Also, the driver is
multithreaded to take advantage of multiple CPU cores (up to 32 at this
time). It's the fastest software rasterizer for Mesa.

Requirements
------------

-  For x86 or amd64 processors, 64-bit mode is recommended. Support for
   SSE2 is strongly encouraged. Support for SSE3 and SSE4.1 will yield
   the most efficient code. The fewer features the CPU has the more
   likely it is that you will run into underperforming, buggy, or
   incomplete code.

   For ppc64le processors, use of the Altivec feature (the Vector
   Facility) is recommended if supported; use of the VSX feature (the
   Vector-Scalar Facility) is recommended if supported AND Mesa is built
   with LLVM version 4.0 or later.

   See ``/proc/cpuinfo`` to find out what your CPU supports.

-  Unless otherwise stated, LLVM version 3.9 or later is required.

   For Linux, on a recent Debian-based distribution do:

   .. code-block:: sh

      aptitude install llvm-dev

   If you want development snapshot builds of LLVM for Debian and
   derived distributions like Ubuntu, you can use the APT repository at
   `apt.llvm.org <https://apt.llvm.org/>`__, which is maintained by
   Debian's LLVM maintainer.

   For an RPM-based distribution do:

   .. code-block:: sh

      yum install llvm-devel

   If you want development snapshot builds of LLVM for Fedora, you can
   use the Copr repository at `fedora-llvm-team/llvm-snapshots <https://copr.fedorainfracloud.org/coprs/g/fedora-llvm-team/llvm-snapshots/>`__,
   which is maintained by Red Hat's LLVM team.

   For Windows you will need to build LLVM from source with MSVC or
   MinGW (either natively or through cross compilers) and CMake, and set
   the ``LLVM`` environment variable to the directory you installed it
   to. LLVM will be statically linked, so when building with MSVC it
   must be built with the same CRT as Mesa, and you'll need to pass
   ``-DLLVM_USE_CRT_xxx=yyy`` as described below.


   +-----------------+----------------------------------------------------------------+
   | LLVM build-type | Mesa build-type                                                |
   |                 +--------------------------------+-------------------------------+
   |                 | debug,checked                  | release,profile               |
   +=================+================================+===============================+
   | Debug           | ``-DLLVM_USE_CRT_DEBUG=MTd``   | ``-DLLVM_USE_CRT_DEBUG=MT``   |
   +-----------------+--------------------------------+-------------------------------+
   | Release         | ``-DLLVM_USE_CRT_RELEASE=MTd`` | ``-DLLVM_USE_CRT_RELEASE=MT`` |
   +-----------------+--------------------------------+-------------------------------+

   You can build only the x86 target by passing
   ``-DLLVM_TARGETS_TO_BUILD=X86`` to CMake.

Building
--------

To build everything on Linux invoke meson as:

.. code-block:: sh

   mkdir build
   cd build
   meson -D glx=xlib -D gallium-drivers=swrast
   ninja

Building for Android
--------------------

Building for Android requires the additional step of building LLVM
for Android using the NDK. Before following the steps in
:doc:`Android's documentation <../android>` you must build a version
of LLVM that targets the NDK with all the required libraries for
llvmpipe, and then create a wrap file so that meson knows where to
find the LLVM libraries. It can be a bit tricky to get LLVM to build
properly using the Android NDK, so the script below can be
used as a reference to configure LLVM to build with the NDK for x86.
You need to set the ``ANDROID_NDK_ROOT``, ``ANDROID_SDK_VERSION`` and
``LLVM_INSTALL_PREFIX`` environment variables appropriately.

.. code-block:: sh

   #!/bin/bash

   set -e
   set -u

   # Early check for required env variables, relies on `set -u`
   : "$ANDROID_NDK_ROOT"
   : "$ANDROID_SDK_VERSION"
   : "$LLVM_INSTALL_PREFIX"

   cmake -GNinja -S llvm -B build/ \
     -DCMAKE_TOOLCHAIN_FILE=${ANDROID_NDK_ROOT}/build/cmake/android.toolchain.cmake \
     -DANDROID_ABI=x86_64 \
     -DANDROID_PLATFORM=android-${ANDROID_SDK_VERSION} \
     -DANDROID_NDK=${ANDROID_NDK_ROOT} \
     -DCMAKE_ANDROID_ARCH_ABI=x86_64 \
     -DCMAKE_ANDROID_NDK=${ANDROID_NDK_ROOT} \
     -DCMAKE_BUILD_TYPE=MinSizeRel \
     -DCMAKE_SYSTEM_NAME=Android \
     -DCMAKE_SYSTEM_VERSION=${ANDROID_SDK_VERSION} \
     -DCMAKE_INSTALL_PREFIX=${LLVM_INSTALL_PREFIX} \
     -DCMAKE_CXX_FLAGS="-march=x86-64 --target=x86_64-linux-android${ANDROID_SDK_VERSION} -fno-rtti" \
     -DLLVM_HOST_TRIPLE=x86_64-linux-android${ANDROID_SDK_VERSION} \
     -DLLVM_TARGETS_TO_BUILD=X86 \
     -DLLVM_BUILD_LLVM_DYLIB=OFF \
     -DLLVM_BUILD_TESTS=OFF \
     -DLLVM_BUILD_EXAMPLES=OFF \
     -DLLVM_BUILD_DOCS=OFF \
     -DLLVM_BUILD_TOOLS=OFF \
     -DLLVM_ENABLE_RTTI=OFF \
     -DLLVM_BUILD_INSTRUMENTED_COVERAGE=OFF \
     -DLLVM_NATIVE_TOOL_DIR=${ANDROID_NDK_ROOT}/toolchains/llvm/prebuilt/linux-x86_64/bin \
     -DLLVM_ENABLE_PIC=False

   ninja -C build/ install

You will also need to create a wrap file, so that meson is able
to find the LLVM libraries built with the NDK. The process for this
is described in the :doc:`meson documentation <../meson>`.

For example, after setting ``LLVM_INSTALL_PREFIX`` to the path where LLVM
was installed, the following script will create the
``subprojects/llvm/meson.build`` wrap file.

The list of libraries passed in ``dep_llvm`` below should match what was
produced by the LLVM build from above.

.. code-block:: sh

   #!/usr/bin/env bash

   set -exu

   # Early check for required env variables, relies on `set -u`
   : "$LLVM_INSTALL_PREFIX"

   if [ ! -d "$LLVM_INSTALL_PREFIX" ]; then
     echo "Cannot find an LLVM build in $LLVM_INSTALL_PREFIX" 1>&2
     exit 1
   fi

   mkdir -p subprojects/llvm

   cat << EOF > subprojects/llvm/meson.build
   project('llvm', ['cpp'])

   cpp = meson.get_compiler('cpp')

   _deps = []
   _search = join_paths('$LLVM_INSTALL_PREFIX', 'lib')

   foreach d: ['libLLVMAggressiveInstCombine', 'libLLVMAnalysis', 'libLLVMAsmParser', 'libLLVMAsmPrinter', 'libLLVMBinaryFormat', 'libLLVMBitReader', 'libLLVMBitstreamReader', 'libLLVMBitWriter', 'libLLVMCFGuard', 'libLLVMCFIVerify', 'libLLVMCodeGen', 'libLLVMCodeGenTypes', 'libLLVMCore', 'libLLVMCoroutines', 'libLLVMCoverage', 'libLLVMDebugInfoBTF', 'libLLVMDebugInfoCodeView', 'libLLVMDebuginfod', 'libLLVMDebugInfoDWARF', 'libLLVMDebugInfoGSYM', 'libLLVMDebugInfoLogicalView', 'libLLVMDebugInfoMSF', 'libLLVMDebugInfoPDB', 'libLLVMDemangle', 'libLLVMDiff', 'libLLVMDlltoolDriver', 'libLLVMDWARFLinker', 'libLLVMDWARFLinkerClassic', 'libLLVMDWARFLinkerParallel', 'libLLVMDWP', 'libLLVMExecutionEngine', 'libLLVMExegesis', 'libLLVMExegesisX86', 'libLLVMExtensions', 'libLLVMFileCheck', 'libLLVMFrontendDriver', 'libLLVMFrontendHLSL', 'libLLVMFrontendOffloading', 'libLLVMFrontendOpenACC', 'libLLVMFrontendOpenMP', 'libLLVMFuzzerCLI', 'libLLVMFuzzMutate', 'libLLVMGlobalISel', 'libLLVMHipStdPar', 'libLLVMInstCombine', 'libLLVMInstrumentation', 'libLLVMInterfaceStub', 'libLLVMInterpreter', 'libLLVMipo', 'libLLVMIRPrinter', 'libLLVMIRReader', 'libLLVMJITLink', 'libLLVMLibDriver', 'libLLVMLineEditor', 'libLLVMLinker', 'libLLVMLTO', 'libLLVMMC', 'libLLVMMCA', 'libLLVMMCDisassembler', 'libLLVMMCJIT', 'libLLVMMCParser', 'libLLVMMIRParser', 'libLLVMObjCARCOpts', 'libLLVMObjCopy', 'libLLVMObject', 'libLLVMObjectYAML', 'libLLVMOption', 'libLLVMOrcDebugging', 'libLLVMOrcJIT', 'libLLVMOrcShared', 'libLLVMOrcTargetProcess', 'libLLVMPasses', 'libLLVMProfileData', 'libLLVMRemarks', 'libLLVMRuntimeDyld', 'libLLVMScalarOpts', 'libLLVMSelectionDAG', 'libLLVMSupport', 'libLLVMSymbolize', 'libLLVMTableGen', 'libLLVMTableGenCommon', 'libLLVMTarget', 'libLLVMTargetParser', 'libLLVMTextAPI', 'libLLVMTextAPIBinaryReader', 'libLLVMTransformUtils', 'libLLVMVectorize', 'libLLVMWindowsDriver', 'libLLVMWindowsManifest', 'libLLVMX86AsmParser', 'libLLVMX86CodeGen', 'libLLVMX86Desc', 'libLLVMX86Disassembler', 'libLLVMX86Info', 'libLLVMX86TargetMCA', 'libLLVMXRay']
     _deps += cpp.find_library(d, dirs : _search)
   endforeach

   dep_llvm = declare_dependency(
     include_directories : include_directories('$LLVM_INSTALL_PREFIX/include'),
     dependencies : _deps,
     version : '$(sed -n -e 's/^#define LLVM_VERSION_STRING "\([^"]*\)".*/\1/p' "${LLVM_INSTALL_PREFIX}/include/llvm/Config/llvm-config.h" )',
   )

   has_rtti = false
   irbuilder_h = files('$LLVM_INSTALL_PREFIX/include/llvm/IR/IRBuilder.h')
   EOF

Afterwards you can continue following the instructions to build Mesa
on :doc:`Android <../android>` and follow the steps to add the driver
directly to an Android OS image.

Using
-----

Environment variables
~~~~~~~~~~~~~~~~~~~~~

.. envvar:: LP_NATIVE_VECTOR_WIDTH

   Overrides the native vector width, in bits. This is useful because
   LLVMpipe is sometimes fastest with 128-bit vectors while still
   using AVX instructions.

.. envvar:: GALLIUM_NOSSE

   Deprecated in favor of ``GALLIUM_OVERRIDE_CPU_CAPS``;
   use ``GALLIUM_OVERRIDE_CPU_CAPS=nosse`` instead.

.. envvar:: LP_FORCE_SSE2

   Deprecated in favor of ``GALLIUM_OVERRIDE_CPU_CAPS``;
   use ``GALLIUM_OVERRIDE_CPU_CAPS=sse2`` instead.
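
As a rough illustration of ``LP_NATIVE_VECTOR_WIDTH``, the sketch below
wraps a command so that the variable is set to 128 for that command
only. The helper name ``run_with_128bit_vectors`` and the ``glxgears``
program in the usage comment are assumptions chosen for illustration;
neither is part of Mesa.

.. code-block:: sh

   # Hypothetical helper (not part of Mesa): run the given command with
   # LP_NATIVE_VECTOR_WIDTH set to 128 in that command's environment.
   run_with_128bit_vectors() {
       LP_NATIVE_VECTOR_WIDTH=128 "$@"
   }

   # Example usage, assuming glxgears is installed:
   #   run_with_128bit_vectors glxgears

Setting the variable per command rather than exporting it keeps the
override from leaking into other programs started from the same shell.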

Linux
~~~~~

On Linux, building will create a drop-in alternative for ``libGL.so``
in

::

   build/foo/gallium/targets/libgl-xlib/libGL.so

or

::

   lib/gallium/libGL.so

To use it set the ``LD_LIBRARY_PATH`` environment variable accordingly.

Windows
~~~~~~~

On Windows, building will create
``build/windows-x86-debug/gallium/targets/libgl-gdi/opengl32.dll`` which
is a drop-in alternative for the system's ``opengl32.dll``, which will use
the Mesa ICD, ``build/windows-x86-debug/gallium/targets/wgl/libgallium_wgl.dll``.
To use it put both DLLs in the same directory as your application. It can also
be used by replacing the native ICD driver, but it's quite an advanced usage, so if
you need to ask, don't even try it.

There is however an easy way to replace the OpenGL software renderer
that comes with Microsoft Windows 7 (or later) with LLVMpipe (that is,
on systems without any OpenGL drivers):

-  copy
   ``build/windows-x86-debug/gallium/targets/wgl/libgallium_wgl.dll`` to
   ``C:\Windows\SysWOW64\mesadrv.dll``

-  load these registry settings:

   ::

      REGEDIT4

      ; https://technet.microsoft.com/en-us/library/cc749368.aspx
      ; https://www.msfn.org/board/topic/143241-portable-windows-7-build-from-winpe-30/page-5#entry942596
      [HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Windows NT\CurrentVersion\OpenGLDrivers\MSOGL]
      "DLL"="mesadrv.dll"
      "DriverVersion"=dword:00000001
      "Flags"=dword:00000001
      "Version"=dword:00000002

-  Do the same for the 64-bit drivers if you need them.

Profiling
---------

Linux perf integration
~~~~~~~~~~~~~~~~~~~~~~

On Linux, it is possible to have symbol resolution of JIT code with
`Linux perf <https://perfwiki.github.io/main/>`__:

::

   perf record -g /my/application
   perf report

When run inside Linux perf, LLVMpipe will create a
``/tmp/perf-XXXXX.map`` file with a symbol address table. It also dumps
assembly code to ``/tmp/perf-XXXXX.map.asm``, which can be used by the
``bin/perf-annotate-jit.py`` script to produce disassembly of the
generated code annotated with the samples.

You can obtain a call graph via
`Gprof2Dot <https://github.com/jrfonseca/gprof2dot#linux-perf>`__.

FlameGraph support
~~~~~~~~~~~~~~~~~~

Outside Linux, it is possible to generate a
`FlameGraph <https://github.com/brendangregg/FlameGraph>`__
with resolved JIT symbols.

Set the environment variable ``JIT_SYMBOL_MAP_DIR`` to a directory path,
and run your LLVMpipe program. Follow the FlameGraph instructions:
capture traces using a supported tool (for example DTrace),
and fold the stacks using the associated script
(``stackcollapse.pl`` for DTrace stacks).

LLVMpipe will create a ``jit-symbols-XXXXX.map`` file containing the symbol
address table inside the chosen directory. It will also dump the JIT
disassemblies to ``jit-symbols-XXXXX.map.asm``. Run your folded traces and
both output files through the ``bin/flamegraph_map_lp_jit.py`` script to map
addresses to JIT symbols, and annotate the disassembly with the sample counts.
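
The ``JIT_SYMBOL_MAP_DIR`` setup step described above might look like the
following sketch. The directory name under the temporary directory is an
assumption chosen for illustration, and the application is left as a
commented placeholder. Trace capture, stack folding and the invocation of
``bin/flamegraph_map_lp_jit.py`` depend on your tools and are not shown.

.. code-block:: sh

   # Assumed location for the JIT symbol maps; pick any directory you can
   # write to.
   JIT_SYMBOL_MAP_DIR="${TMPDIR:-/tmp}/lp-jit-symbols"
   export JIT_SYMBOL_MAP_DIR

   # Create the directory up front in case it does not exist yet.
   mkdir -p "$JIT_SYMBOL_MAP_DIR"

   # Run your LLVMpipe program here while capturing traces, for example:
   #   /my/application

   # List the directory; after a run, the map and disassembly files
   # should show up here.
   ls "$JIT_SYMBOL_MAP_DIR"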

Unit testing
------------

Building will also create several unit tests in
``build/linux-???-debug/gallium/drivers/llvmpipe``:

-  ``lp_test_blend``: blending
-  ``lp_test_conv``: SIMD vector conversion
-  ``lp_test_format``: pixel unpacking/packing

Some of these tests can output results and benchmarks to a tab-separated
file for later analysis, e.g.:

::

   build/linux-x86_64-debug/gallium/drivers/llvmpipe/lp_test_blend -o blend.tsv

Development Notes
-----------------

-  When looking at this code for the first time, start in ``lp_state_fs.c``,
   and then skim through the ``lp_bld_*`` functions called there, and
   the comments at the top of the ``lp_bld_*.c`` files.
-  The driver-independent parts of the LLVM / Gallium code are found in
   ``src/gallium/auxiliary/gallivm/``. The filenames and function
   prefixes need to be renamed from ``lp_bld_`` to something else
   though.
-  We use LLVM-C bindings for now. They are not documented, but follow
   the C++ interfaces very closely, and appear to be complete enough for
   code generation. See `this stand-alone
   example <https://npcontemplation.blogspot.com/2008/06/secret-of-llvm-c-bindings.html>`__.
   See the ``llvm-c/Core.h`` file for reference.

.. _recommended_reading:

Recommended Reading
-------------------

-  Rasterization

   -  `Triangle Scan Conversion using 2D Homogeneous
      Coordinates <https://userpages.cs.umbc.edu/olano/papers/2dh-tri/>`__
   -  `Rasterization on
      Larrabee <https://www.drdobbs.com/parallel/rasterization-on-larrabee/217200602>`__
   -  `Rasterization using half-space
      functions <http://web.archive.org/web/20110820052005/http://www.devmaster.net/codespotlight/show.php?id=17>`__
   -  `Advanced
      Rasterization <http://web.archive.org/web/20140514220546/http://devmaster.net/posts/6145/advanced-rasterization>`__
   -  `Optimizing Software Occlusion
      Culling <https://fgiesen.wordpress.com/2013/02/17/optimizing-sw-occlusion-culling-index/>`__

-  Texture sampling

   -  `Perspective Texture
      Mapping <https://chrishecker.com/Miscellaneous_Technical_Articles#Perspective_Texture_Mapping>`__
   -  `Texturing As In
      Unreal <https://www.flipcode.com/archives/Texturing_As_In_Unreal.shtml>`__
   -  `Run-Time MIP-Map
      Filtering <http://web.archive.org/web/20220709145555/http://www.gamasutra.com/view/feature/3301/runtime_mipmap_filtering.php>`__
   -  `Will "brilinear" filtering
      persist? <https://alt.3dcenter.org/artikel/2003/10-26_a_english.php>`__
   -  `Trilinear
      filtering <http://ixbtlabs.com/articles2/gffx/nv40-rx800-3.html>`__
   -  `Texture tiling and
      swizzling <https://fgiesen.wordpress.com/2011/01/17/texture-tiling-and-swizzling/>`__

-  SIMD

   -  `Whole-Function
      Vectorization <https://compilers.cs.uni-saarland.de/projects/wfv/#pubs>`__

-  Optimization

   -  `Optimizing Pixomatic For Modern x86
      Processors <https://www.drdobbs.com/optimizing-pixomatic-for-modern-x86-proc/184405807>`__
   -  `Intel 64 and IA-32 Architectures Optimization Reference
      Manual <https://www.intel.com/content/www/us/en/content-details/779559/intel-64-and-ia-32-architectures-optimization-reference-manual.html>`__
   -  `Software optimization
      resources <https://www.agner.org/optimize/>`__
   -  `Intel Intrinsics
      Guide <https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html>`__

-  LLVM

   -  `LLVM Language Reference
      Manual <https://llvm.org/docs/LangRef.html>`__
   -  `The secret of LLVM C
      bindings <https://npcontemplation.blogspot.com/2008/06/secret-of-llvm-c-bindings.html>`__

-  General

   -  `A trip through the Graphics
      Pipeline <https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/>`__
   -  `WARP Architecture and
      Performance <https://learn.microsoft.com/en-us/windows/win32/direct3darticles/directx-warp#warp-architecture-and-performance>`__