Subzero - Fast code generator for PNaCl bitcode
===============================================

Design
------

See the accompanying DESIGN.rst file for a more detailed technical overview of
Subzero.

Building
--------

Subzero is set up to be built within the Native Client tree. Follow the
`Developing PNaCl
<https://sites.google.com/a/chromium.org/dev/nativeclient/pnacl/developing-pnacl>`_
instructions, in particular the section on building PNaCl sources. This will
prepare the necessary external headers and libraries that Subzero needs.
Checking out the Native Client project also gets the pre-built clang and LLVM
tools in ``native_client/../third_party/llvm-build/Release+Asserts/bin`` which
are used for building Subzero.

The Subzero source is in ``native_client/toolchain_build/src/subzero``. From
within that directory, ``git checkout master && git pull`` to get the latest
version of Subzero source code.

The Makefile is designed to be used as part of the higher level LLVM build
system. To build manually, use the ``Makefile.standalone``. There are several
build configurations from the command line::

    make -f Makefile.standalone
    make -f Makefile.standalone DEBUG=1
    make -f Makefile.standalone NOASSERT=1
    make -f Makefile.standalone DEBUG=1 NOASSERT=1
    make -f Makefile.standalone MINIMAL=1
    make -f Makefile.standalone ASAN=1
    make -f Makefile.standalone TSAN=1

``DEBUG=1`` builds without optimizations and is good when running the translator
inside a debugger. ``NOASSERT=1`` disables assertions and is the preferred
configuration for performance testing the translator. ``MINIMAL=1`` attempts to
minimize the size of the translator by compiling out everything unnecessary.
``ASAN=1`` enables AddressSanitizer, and ``TSAN=1`` enables ThreadSanitizer.

The result of the ``make`` command is the target ``pnacl-sz`` in the current
directory.
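
When the standalone build is driven from a script, the configurations listed
above can be assembled programmatically. The following is a minimal sketch, not
part of Subzero: the function ``standalone_make_command`` and the constant
``KNOWN_OPTIONS`` are hypothetical names, and only the option names come from
this README. Of the combinations, only ``DEBUG=1 NOASSERT=1`` is documented
above; whether other combinations build cleanly is an assumption to verify.

```python
# Hypothetical helper, not part of Subzero: compose a Makefile.standalone
# invocation from the build options named in this README.
KNOWN_OPTIONS = ("DEBUG", "NOASSERT", "MINIMAL", "ASAN", "TSAN")


def standalone_make_command(*options):
    """Return the argv list for a Makefile.standalone build."""
    unknown = [opt for opt in options if opt not in KNOWN_OPTIONS]
    if unknown:
        raise ValueError("unknown build option(s): " + ", ".join(unknown))
    argv = ["make", "-f", "Makefile.standalone"]
    # Emit options in a fixed order so equivalent requests produce the
    # same command line regardless of argument order.
    argv += [f"{opt}=1" for opt in KNOWN_OPTIONS if opt in options]
    return argv


if __name__ == "__main__":
    # Print the command instead of running it; pass the list to
    # subprocess.run(argv, check=True) from the Subzero source directory
    # to perform the build.
    print(" ".join(standalone_make_command("NOASSERT", "DEBUG")))
```

Returning an argument list rather than a shell string avoids quoting issues if
the list is later handed to ``subprocess.run``.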

Building within LLVM trunk
--------------------------

Subzero can also be built from within a standard LLVM trunk checkout. Here is
an example of how it can be checked out and built::

    mkdir llvm-git
    cd llvm-git
    git clone http://llvm.org/git/llvm.git
    cd llvm/projects/
    git clone https://chromium.googlesource.com/native_client/pnacl-subzero
    cd ../..
    mkdir build
    cd build
    cmake -G Ninja ../llvm/
    ninja
    ./bin/pnacl-sz -version

This creates a default build of ``pnacl-sz``; currently any options such as
``DEBUG=1`` or ``MINIMAL=1`` have to be added manually.

``pnacl-sz``
------------

The ``pnacl-sz`` program parses a pexe or an LLVM bitcode file and translates it
into ICE (Subzero's intermediate representation). It then invokes the ICE
translate method to lower it to target-specific machine code, optionally dumping
the intermediate representation at various stages of the translation.

The program can be run as follows::

    ../pnacl-sz ./path/to/<file>.pexe
    ../pnacl-sz ./tests_lit/pnacl-sz_tests/<file>.ll

At this time, ``pnacl-sz`` accepts a number of arguments, including the
following:

  ``-help`` -- Show available arguments and possible values. (Note: this
  unfortunately also pulls in some LLVM-specific options that are reported but
  that Subzero doesn't use.)

  ``-notranslate`` -- Suppress the ICE translation phase, which is useful if
  ICE is missing some support.

  ``-target=<TARGET>`` -- Set the target architecture. The default is x8632.
  Future targets include x8664, arm32, and arm64.

  ``-filetype=obj|asm|iasm`` -- Select the output file type. ``obj`` is a
  native ELF file, ``asm`` is a textual assembly file, and ``iasm`` is a
  low-level textual assembly file demonstrating the integrated assembler.

  ``-O<LEVEL>`` -- Set the optimization level. Valid levels are ``2``, ``1``,
  ``0``, ``-1``, and ``m1``. Levels ``-1`` and ``m1`` are synonyms, and
  represent the minimum optimization and worst code quality, but fastest code
  generation.

  ``-verbose=<list>`` -- Set verbosity flags. This argument allows a
  comma-separated list of values. The default is ``none``, and the value
  ``inst,pred`` will roughly match the .ll bitcode file. Of particular use
  are ``all``, ``most``, and ``none``.

  ``-o <FILE>`` -- Set the assembly output file name. Default is stdout.

  ``-log <FILE>`` -- Set the file name for diagnostic output (whose level is
  controlled by ``-verbose``). Default is stdout.

  ``-timing`` -- Dump some pass timing information after translating the input
  file.

Running the test suite
----------------------

Subzero uses the LLVM ``lit`` testing tool for part of its test suite, which
lives in ``tests_lit``. To execute the test suite, first build Subzero, and then
run::

    make -f Makefile.standalone check-lit

There is also a suite of cross tests in the ``crosstest`` directory. A cross
test takes a test bitcode file implementing some unit tests, and translates it
twice, once with Subzero and once with LLVM's known-good ``llc`` translator.
The Subzero-translated symbols are specially mangled to avoid multiple
definition errors from the linker. Both translated versions are linked together
with a driver program that calls each version of each unit test with a variety
of interesting inputs and compares the results for equality.
The cross tests are currently invoked by running::

    make -f Makefile.standalone check-xtest

Similarly, there is a suite of unit tests::

    make -f Makefile.standalone check-unit

A convenient way to run the lit, cross, and unit tests is::

    make -f Makefile.standalone check

Assembling ``pnacl-sz`` output as needed
----------------------------------------

``pnacl-sz`` can now produce a native ELF binary using ``-filetype=obj``.

``pnacl-sz`` can also produce textual assembly code in a structure suitable for
input to ``llvm-mc``, using ``-filetype=asm`` or ``-filetype=iasm``. An object
file can then be produced using the command::

    llvm-mc -triple=i686 -filetype=obj -o=MyObj.o

Building a translated binary
----------------------------

There is a helper script, ``pydir/szbuild.py``, that translates a finalized pexe
into a fully linked executable. Run it with ``-help`` for extensive
documentation.

By default, ``szbuild.py`` builds an executable using only Subzero translation,
but it can also be used to produce hybrid Subzero/``llc`` binaries (``llc`` is
the name of the LLVM translator) for bisection-based debugging. In bisection
debugging mode, the pexe is translated using both Subzero and ``llc``, and the
resulting object files are combined into a single executable using symbol
weakening and other linker tricks to control which Subzero symbols and which
``llc`` symbols take precedence. This is controlled by the ``-include`` and
``-exclude`` arguments. These can be used to rapidly find a single function
that Subzero translates incorrectly, leading to incorrect output.

There is another helper script, ``pydir/szbuild_spec2k.py``, that runs
``szbuild.py`` on one or more components of the Spec2K suite. This assumes that
Spec2K is set up in the usual place in the Native Client tree, and the finalized
pexe files have been built.
(Note: for working with Spec2K and other pexes,
it's helpful to finalize the pexe using ``--no-strip-syms``, to preserve the
original function and global variable names.)

Status
------

Subzero currently fully supports the x86-32 architecture, for both native and
Native Client sandboxing modes. The x86-64 architecture is also supported in
native mode only, and only for the x32 flavor due to the fact that pointers and
32-bit integers are indistinguishable in PNaCl bitcode. Sandboxing support for
x86-64 is in progress. ARM and MIPS support is in progress. Two optimization
levels, ``-Om1`` and ``-O2``, are implemented.

The ``-Om1`` configuration is designed to be the simplest and fastest possible,
with a minimal set of passes and transformations.

* Simple Phi lowering before target lowering, by generating temporaries and
  adding assignments to the end of predecessor blocks.

* Simple register allocation limited to pre-colored or infinite-weight
  Variables.

The ``-O2`` configuration is designed to use all optimizations available and
produce the best code.

* Address mode inference to leverage the complex x86 addressing modes.

* Compare/branch fusing based on liveness/last-use analysis.

* Global, linear-scan register allocation.

* Advanced phi lowering after target lowering and global register allocation,
  via edge splitting, topological sorting of the parallel moves, and final local
  register allocation.

* Stack slot coalescing to reduce frame size.

* Branch optimization to reduce the number of branches to the following block.
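
The advanced phi lowering item above mentions ordering the parallel moves that
phi instructions imply on a control-flow edge. The following is a minimal
Python sketch of that general technique, not Subzero's implementation: the
function ``sequentialize``, the ``{dest: src}`` representation, and the scratch
location ``tmp`` are all assumptions made for illustration. It emits each move
only once no other pending move still reads its destination, and breaks any
remaining cycle by saving one value in the scratch location.

```python
def sequentialize(moves, temp="tmp"):
    """Order parallel moves so that no source is overwritten before it is read.

    `moves` maps each destination to its source. `temp` is a hypothetical
    scratch location assumed not to appear in `moves`. Returns a list of
    (dest, src) assignments to execute in order.
    """
    # Self-moves have no effect and would look like one-element cycles.
    pending = {d: s for d, s in moves.items() if d != s}
    ordered = []
    while pending:
        # A move is safe when no other pending move still reads its dest.
        ready = [d for d in pending if d not in pending.values()]
        if ready:
            for d in ready:
                ordered.append((d, pending.pop(d)))
            continue
        # Every remaining destination is also a source, so only cycles are
        # left. Break one by saving a destination's current value.
        d = next(iter(pending))
        ordered.append((temp, d))
        pending = {dst: (temp if src == d else src)
                   for dst, src in pending.items()}
    return ordered


if __name__ == "__main__":
    # A swap of a and b plus a copy of a into c.
    for dest, src in sequentialize({"a": "b", "b": "a", "c": "a"}):
        print(f"{dest} = {src}")
```

For a plain swap, ``sequentialize({"a": "b", "b": "a"})`` returns
``[("tmp", "a"), ("a", "b"), ("b", "tmp")]``. A real code generator has to
obtain the scratch location from somewhere, such as a free register or a stack
slot; how Subzero does so is not shown here.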