================================
Fuzzing LLVM libraries and tools
================================

.. contents::
   :local:
   :depth: 2

Introduction
============

The LLVM tree includes a number of fuzzers for various components. These are
built on top of :doc:`LibFuzzer <LibFuzzer>`. To build and run these fuzzers,
see :ref:`building-fuzzers`.


Available Fuzzers
=================

clang-fuzzer
------------

A |generic fuzzer| that tries to compile textual input as C++ code. Some of the
bugs this fuzzer has reported are `on bugzilla`__ and `on OSS Fuzz's
tracker`__.

__ https://llvm.org/pr23057
__ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+clang-fuzzer

clang-proto-fuzzer
------------------

A |protobuf fuzzer| that compiles valid C++ programs generated from a protobuf
class that describes a subset of the C++ language.

This fuzzer accepts clang command line options after
``-ignore_remaining_args=1``. For example, the following command will fuzz
clang with a higher optimization level:

.. code-block:: shell

   % bin/clang-proto-fuzzer <corpus-dir> -ignore_remaining_args=1 -O3

clang-format-fuzzer
-------------------

A |generic fuzzer| that runs clang-format_ on C++ text fragments. Some of the
bugs this fuzzer has reported are `on bugzilla`__
and `on OSS Fuzz's tracker`__.

.. _clang-format: https://clang.llvm.org/docs/ClangFormat.html
__ https://llvm.org/pr23052
__ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+clang-format-fuzzer

llvm-as-fuzzer
--------------

A |generic fuzzer| that tries to parse text as :doc:`LLVM assembly <LangRef>`.
Some of the bugs this fuzzer has reported are `on bugzilla`__.

__ https://llvm.org/pr24639

llvm-dwarfdump-fuzzer
---------------------

A |generic fuzzer| that interprets inputs as object files and runs
:doc:`llvm-dwarfdump <CommandGuide/llvm-dwarfdump>` on them. Some of the bugs
this fuzzer has reported are `on OSS Fuzz's tracker`__.

__ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+llvm-dwarfdump-fuzzer

llvm-demangle-fuzzer
--------------------

A |generic fuzzer| for the Itanium demangler used in various LLVM tools. We've
fuzzed ``__cxa_demangle`` to death, so why not fuzz LLVM's implementation of
the same function?

llvm-isel-fuzzer
----------------

A |LLVM IR fuzzer| aimed at finding bugs in instruction selection.

This fuzzer accepts flags after ``-ignore_remaining_args=1``. The flags match
those of :doc:`llc <CommandGuide/llc>` and the triple is required. For example,
the following command would fuzz AArch64 with :doc:`GlobalISel/index`:

.. code-block:: shell

   % bin/llvm-isel-fuzzer <corpus-dir> -ignore_remaining_args=1 -mtriple aarch64 -global-isel -O0

Some flags can also be specified in the binary name itself in order to support
OSS Fuzz, which has trouble with required arguments. To do this, you can copy
or move ``llvm-isel-fuzzer`` to ``llvm-isel-fuzzer--x-y-z``, separating options
from the binary name using "--". The valid options are architecture names
(``aarch64``, ``x86_64``), optimization levels (``O0``, ``O2``), or specific
keywords, like ``gisel`` for enabling global instruction selection. In this
mode, the same example could be run like so:

.. code-block:: shell

   % bin/llvm-isel-fuzzer--aarch64-O0-gisel <corpus-dir>

llvm-opt-fuzzer
---------------

A |LLVM IR fuzzer| aimed at finding bugs in optimization passes.

It receives an optimization pipeline and runs it on each fuzzer input.

The interface of this fuzzer closely mirrors that of ``llvm-isel-fuzzer``. Both
the ``mtriple`` and ``passes`` arguments are required. Passes are specified in
a format suitable for the new pass manager. You can find some documentation
about this format in the doxygen for ``PassBuilder::parsePassPipeline``.

.. code-block:: shell

   % bin/llvm-opt-fuzzer <corpus-dir> -ignore_remaining_args=1 -mtriple x86_64 -passes instcombine

As with ``llvm-isel-fuzzer``, arguments for some predefined configurations can
be embedded directly in the binary file name:

.. code-block:: shell

   % bin/llvm-opt-fuzzer--x86_64-instcombine <corpus-dir>

llvm-mc-assemble-fuzzer
-----------------------

A |generic fuzzer| that fuzzes the MC layer's assemblers by treating inputs as
target-specific assembly.

Note that this fuzzer has an unusual command line interface which is not fully
compatible with all of libFuzzer's features. Fuzzer arguments must be passed
after ``--fuzzer-args``, and any ``llc`` flags must use two dashes. For
example, to fuzz the AArch64 assembler you might use the following command:

.. code-block:: console

   llvm-mc-assemble-fuzzer --triple=aarch64-linux-gnu --fuzzer-args -max_len=4

This scheme will likely change in the future.

llvm-mc-disassemble-fuzzer
--------------------------

A |generic fuzzer| that fuzzes the MC layer's disassemblers by treating inputs
as assembled binary data.

Note that this fuzzer has an unusual command line interface which is not fully
compatible with all of libFuzzer's features. See the notes above about
``llvm-mc-assemble-fuzzer`` for details.


.. |generic fuzzer| replace:: :ref:`generic fuzzer <fuzzing-llvm-generic>`
.. |protobuf fuzzer|
   replace:: :ref:`libprotobuf-mutator based fuzzer <fuzzing-llvm-protobuf>`
.. |LLVM IR fuzzer|
   replace:: :ref:`structured LLVM IR fuzzer <fuzzing-llvm-ir>`


Mutators and Input Generators
=============================

The inputs for a fuzz target are generated via random mutations of a
:ref:`corpus <libfuzzer-corpus>`. There are a few options for the kinds of
mutations that a fuzzer in LLVM might want.

.. _fuzzing-llvm-generic:

Generic Random Fuzzing
----------------------

The most basic form of input mutation is to use the built-in mutators of
LibFuzzer. These simply treat the input corpus as a bag of bits and make random
mutations. This type of fuzzer is good for stressing the surface layers of a
program, and is good at testing things like lexers, parsers, or binary
protocols.

Some of the in-tree fuzzers that use this type of mutator are `clang-fuzzer`_,
`clang-format-fuzzer`_, `llvm-as-fuzzer`_, `llvm-dwarfdump-fuzzer`_,
`llvm-mc-assemble-fuzzer`_, and `llvm-mc-disassemble-fuzzer`_.

.. _fuzzing-llvm-protobuf:

Structured Fuzzing using ``libprotobuf-mutator``
------------------------------------------------

We can use libprotobuf-mutator_ in order to perform structured fuzzing and
stress deeper layers of programs. This works by defining a protobuf class that
translates arbitrary data into structurally interesting input. Specifically, we
use this to work with a subset of the C++ language and perform mutations that
produce valid C++ programs in order to exercise parts of clang that are more
interesting than parser error handling.

To build this kind of fuzzer you need `protobuf`_ and its dependencies
installed, and you need to specify some extra flags when configuring the build
with :doc:`CMake <CMake>`. For example, `clang-proto-fuzzer`_ can be enabled by
adding ``-DCLANG_ENABLE_PROTO_FUZZER=ON`` to the flags described in
:ref:`building-fuzzers`.

The only in-tree fuzzer that uses ``libprotobuf-mutator`` today is
`clang-proto-fuzzer`_.

.. _libprotobuf-mutator: https://github.com/google/libprotobuf-mutator
.. _protobuf: https://github.com/google/protobuf

.. _fuzzing-llvm-ir:

Structured Fuzzing of LLVM IR
-----------------------------

We also use a more direct form of structured fuzzing for fuzzers that take
:doc:`LLVM IR <LangRef>` as input. This is achieved through the ``FuzzMutate``
library, which was `discussed at EuroLLVM 2017`_.

The ``FuzzMutate`` library is used to structurally fuzz backends in
`llvm-isel-fuzzer`_.

.. _discussed at EuroLLVM 2017: https://www.youtube.com/watch?v=UBbQ_s6hNgg


Building and Running
====================

.. _building-fuzzers:

Configuring LLVM to Build Fuzzers
---------------------------------

Fuzzers will be built and linked to libFuzzer by default as long as you build
LLVM with sanitizer coverage enabled. You would typically also enable at least
one sanitizer to find bugs faster. The most common way to build the fuzzers is
by adding the following two flags to your CMake invocation:
``-DLLVM_USE_SANITIZER=Address -DLLVM_USE_SANITIZE_COVERAGE=On``.

.. note:: If you have ``compiler-rt`` checked out in an LLVM tree when building
          with sanitizers, you'll want to specify ``-DLLVM_BUILD_RUNTIME=Off``
          to avoid building the sanitizers themselves with sanitizers enabled.

.. note:: You may run into issues if you build with BFD ld, which is the
          default linker on many unix systems. These issues are being tracked
          in https://llvm.org/PR34636.

Continuously Running and Finding Bugs
-------------------------------------

There used to be a public buildbot running LLVM fuzzers continuously, and while
this did find issues, it didn't have a very good way to report problems in an
actionable way. Because of this, we're moving towards using `OSS Fuzz`_ more
instead.

You can browse the `LLVM project issue list`_ for the bugs found by
`LLVM on OSS Fuzz`_. These are also mailed to the `llvm-bugs mailing
list`_.

.. _OSS Fuzz: https://github.com/google/oss-fuzz
.. _LLVM project issue list:
   https://bugs.chromium.org/p/oss-fuzz/issues/list?q=Proj-llvm
.. _LLVM on OSS Fuzz:
   https://github.com/google/oss-fuzz/blob/master/projects/llvm
.. _llvm-bugs mailing list:
   http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs


Utilities for Writing Fuzzers
=============================

There are some utilities available for writing fuzzers in LLVM.

Some helpers for handling the command line interface are available in
``include/llvm/FuzzMutate/FuzzerCLI.h``, including functions to parse command
line options in a consistent way and to implement standalone main functions so
your fuzzer can be built and tested when not built against libFuzzer.

There is also some handling of the CMake config for fuzzers, where you should
use the ``add_llvm_fuzzer`` function to set up fuzzer targets. This function
works similarly to functions such as ``add_llvm_tool``, but it takes care of
linking to LibFuzzer when appropriate and can be passed the ``DUMMY_MAIN``
argument to enable standalone testing.
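
To make the idea of a standalone main concrete, here is a minimal sketch of a
fuzz target that can also be driven without libFuzzer. It deliberately does
not use ``FuzzerCLI.h``: the helper ``runOnFiles``, the toy check
``isBalanced``, and the ``STANDALONE_FUZZER_MAIN`` macro are hypothetical names
made up for this illustration. Only ``LLVMFuzzerTestOneInput`` is the real
libFuzzer entry point. A real LLVM fuzzer would call into the component under
test instead of the toy check.

.. code-block:: c++

   #include <cassert>
   #include <cstddef>
   #include <cstdint>
   #include <cstdio>
   #include <fstream>
   #include <iterator>
   #include <vector>

   // Toy "component under test" (hypothetical): reports whether the
   // parentheses in the input are balanced.
   bool isBalanced(const uint8_t *Data, size_t Size) {
     assert((Data != nullptr || Size == 0) && "non-empty input needs a buffer");
     size_t Depth = 0;
     for (size_t I = 0; I < Size; ++I) {
       if (Data[I] == '(') {
         ++Depth;
       } else if (Data[I] == ')') {
         if (Depth == 0)
           return false; // closing paren with nothing open
         --Depth;
       }
     }
     return Depth == 0;
   }

   // The entry point libFuzzer calls once per generated input. Crashes and
   // sanitizer reports signal bugs; the return value should be 0.
   extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
     (void)isBalanced(Data, Size);
     return 0;
   }

   // Hypothetical standalone driver: replays each file named on the command
   // line through the fuzz target, so the target can be exercised without
   // linking against libFuzzer.
   int runOnFiles(int ArgC, char *ArgV[]) {
     for (int I = 1; I < ArgC; ++I) {
       std::ifstream In(ArgV[I], std::ios::binary);
       if (!In) {
         std::fprintf(stderr, "cannot open %s\n", ArgV[I]);
         return 1;
       }
       std::vector<uint8_t> Buf((std::istreambuf_iterator<char>(In)),
                                std::istreambuf_iterator<char>());
       LLVMFuzzerTestOneInput(Buf.data(), Buf.size());
     }
     return 0;
   }

   #ifdef STANDALONE_FUZZER_MAIN // hypothetical macro for this sketch
   int main(int ArgC, char *ArgV[]) { return runOnFiles(ArgC, ArgV); }
   #endif

When the hypothetical macro is defined, the result is an ordinary executable
that replays saved inputs, which is handy for reproducing a crash under a
debugger. When it is left undefined, the same file can be linked against
libFuzzer, which supplies its own ``main``.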