# Fuzzing binary-only targets

AFL++, libFuzzer, and other fuzzers are great if you have the source code of the
target. This allows for very fast, coverage-guided fuzzing.

However, if only the binary program is available and there is no source code,
then standard `afl-fuzz -n` (non-instrumented mode) is not effective.

For fast, on-the-fly instrumentation of black-box binaries, AFL++ still offers
several options. The following describes how these binaries can be fuzzed with
AFL++.

## TL;DR:

FRIDA mode and QEMU mode in persistent mode are the fastest, provided that
persistent mode is possible and the stability is high enough.

Otherwise, try ZAFL, RetroWrite, or Dyninst. If these fail, too, then try
standard FRIDA/QEMU mode and set `AFL_ENTRYPOINT` to where you need it.

If your target is non-Linux, then use unicorn_mode.

## Fuzzing binary-only targets with AFL++

### QEMU mode

QEMU mode is the "native" solution to the problem. It is available in the
`./qemu_mode/` directory and, once compiled, it can be accessed with the
`afl-fuzz -Q` command line option. It is the easiest alternative to use and
even works for cross-platform binaries.

For Linux programs and their libraries, this is accomplished with a version of
QEMU running in the lesser-known "user space emulation" mode. QEMU is a project
separate from AFL++, but you can conveniently build the feature by doing:

```shell
cd qemu_mode
./build_qemu_support.sh
```

The following setup to use QEMU mode is recommended:

* run 1 `afl-fuzz -Q` instance with CMPLOG (`-c 0` + `AFL_COMPCOV_LEVEL=2`)
* run 1 `afl-fuzz -Q` instance with QASAN (`AFL_USE_QASAN=1`)
* run 1 `afl-fuzz -Q` instance with LAF (`AFL_PRELOAD=libcmpcov.so` +
  `AFL_COMPCOV_LEVEL=2`). Alternatively, you can use FRIDA mode: just switch
  `-Q` with `-O` and remove the LAF instance.

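The recommended setup above can be sketched as three command lines. This is a
minimal sketch, not a definitive recipe: the seed directory `in`, the output
directory `out`, the binary `./target`, and the instance names passed to
`-M`/`-S` are placeholders that do not come from the list above, and
`libcmpcov.so` may need a full path on your system. The helper only prints the
commands, so that each one can be started in its own terminal.

```shell
# Placeholders (assumptions): seed corpus, shared output directory, target binary.
IN=in
OUT=out
TARGET=./target

# Print one afl-fuzz command line per recommended QEMU mode instance.
qemu_campaign() {
  # Main instance with CMPLOG.
  printf '%s\n' "AFL_COMPCOV_LEVEL=2 afl-fuzz -Q -c 0 -M main -i $IN -o $OUT -- $TARGET @@"
  # Secondary instance with QASAN.
  printf '%s\n' "AFL_USE_QASAN=1 afl-fuzz -Q -S qasan -i $IN -o $OUT -- $TARGET @@"
  # Secondary instance with LAF; libcmpcov.so may need a full path.
  printf '%s\n' "AFL_PRELOAD=libcmpcov.so AFL_COMPCOV_LEVEL=2 afl-fuzz -Q -S laf -i $IN -o $OUT -- $TARGET @@"
}

qemu_campaign
```
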
Then run as many instances as you have cores left with either `-Q` mode or, even
better, a binary rewriter like Dyninst, RetroWrite, ZAFL, etc. The binary
rewriters all have their own advantages and caveats. ZAFL is the best, but it
cannot be used in a business/commercial context.

If a binary rewriter works for your target, then you can use afl-fuzz normally
and it will have twice the speed compared to QEMU mode (but slower than QEMU
persistent mode).

The speed decrease of QEMU mode is about 50%. However, various options exist
to increase the speed:
- using `AFL_ENTRYPOINT` to move the forkserver entry to a later basic block in
  the binary (+5-10% speed)
- using persistent mode, see
  [qemu_mode/README.persistent.md](../qemu_mode/README.persistent.md); this will
  result in a 150-300% overall speed increase - so 3-8x the original QEMU mode
  speed!
- using `AFL_CODE_START`/`AFL_CODE_END` to only instrument specific parts

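For the first two options, the address you hand to AFL++ has to be the address
inside the emulated process. The following is a hedged sketch of that
arithmetic. `AFL_QEMU_PERSISTENT_ADDR`, `AFL_QEMU_PERSISTENT_GPR`, and
`AFL_QEMU_PERSISTENT_CNT` are taken from the persistent mode README rather than
from this page, the load base `0x4000000000` assumes a 64-bit PIE target, and
the offset `0x1189` is a made-up value standing in for what a tool such as `nm`
reports for your function. The same arithmetic should apply to
`AFL_ENTRYPOINT`.

```shell
# Assumptions: 64-bit PIE target, QEMU user-mode load base 0x4000000000, and a
# made-up function offset (0x1189) standing in for what `nm ./target` reports.
QEMU_PIE_BASE=0x4000000000
FUNC_OFFSET=0x1189

# Add the load base to the offset and print the absolute address in hex.
persistent_addr() {
  printf '0x%x\n' "$(( $1 + $2 ))"
}

AFL_QEMU_PERSISTENT_ADDR=$(persistent_addr "$QEMU_PIE_BASE" "$FUNC_OFFSET")
export AFL_QEMU_PERSISTENT_ADDR
export AFL_QEMU_PERSISTENT_GPR=1     # restore general purpose registers per iteration
export AFL_QEMU_PERSISTENT_CNT=1000  # loop iterations before the process is restarted

echo "AFL_QEMU_PERSISTENT_ADDR=$AFL_QEMU_PERSISTENT_ADDR"
# Afterwards, start the fuzzer as usual, e.g.: afl-fuzz -Q -i in -o out -- ./target @@
```
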
For additional instructions and caveats, see
[qemu_mode/README.md](../qemu_mode/README.md). If possible, you should use the
persistent mode, see
[qemu_mode/README.persistent.md](../qemu_mode/README.persistent.md). The mode is
approximately 2-5x slower than compile-time instrumentation, and is less
conducive to parallelization.

Note that there is also honggfuzz:
[https://github.com/google/honggfuzz](https://github.com/google/honggfuzz), which
now has a QEMU mode, but its performance is just 1.5% ...

If you would like to code a customized fuzzer without much work, we highly
recommend checking out our sister project LibAFL, which supports QEMU, too:
[https://github.com/AFLplusplus/LibAFL](https://github.com/AFLplusplus/LibAFL)

### WINE+QEMU

Wine mode can run Win32 PE binaries with the QEMU instrumentation. It needs
Wine, python3, and the pefile python package installed.

It is included in AFL++.

For more information, see
[qemu_mode/README.wine.md](../qemu_mode/README.wine.md).

### FRIDA mode

In FRIDA mode, you can fuzz binary-only targets as easily as with QEMU mode.
FRIDA mode is most of the time slightly faster than QEMU mode. It is also
newer, lacks COMPCOV, and has the advantage that it works on macOS (both Intel
and M1).

To build FRIDA mode:

```shell
cd frida_mode
gmake
```

For additional instructions and caveats, see
[frida_mode/README.md](../frida_mode/README.md).

If possible, you should use the persistent mode, see
[instrumentation/README.persistent_mode.md](../instrumentation/README.persistent_mode.md).
The mode is approximately 2-5x slower than compile-time instrumentation, and is
less conducive to parallelization. But for binary-only fuzzing, it gives a huge
speed improvement if it is possible to use.

If you want to fuzz a binary-only library, then you can fuzz it with frida-gum
via frida_mode/. You will have to write a harness to call the target function in
the library; use afl-frida.c as a template.

You can also perform remote fuzzing with FRIDA, e.g., if you want to fuzz on
iPhone or Android devices. For this, you can use
[https://github.com/ttdennis/fpicker/](https://github.com/ttdennis/fpicker/) as
an intermediate that uses AFL++ for fuzzing.

If you would like to code a customized fuzzer without much work, we highly
recommend checking out our sister project LibAFL, which supports FRIDA, too:
[https://github.com/AFLplusplus/LibAFL](https://github.com/AFLplusplus/LibAFL).
Working examples already exist :-)

### Nyx mode

Nyx is a full system emulation fuzzing environment with snapshot support that is
built upon KVM and QEMU. It is only available on Linux and is currently
restricted to x86_64.

For binary-only fuzzing, a special 5.10 kernel is required.

See [nyx_mode/README.md](../nyx_mode/README.md).

### Unicorn

Unicorn is a fork of QEMU. The instrumentation is, therefore, very similar. In
contrast to QEMU, Unicorn does not offer full-system or even userland
emulation. The runtime environment and/or loaders have to be written from
scratch, if needed. On top of that, block chaining has been removed. This means
the speed boost introduced in the patched QEMU mode of AFL++ cannot be ported
over to Unicorn.

For non-Linux binaries, you can use AFL++'s unicorn_mode, which can emulate
anything you want - at the price of speed and user-written scripts.

To build unicorn_mode:

```shell
cd unicorn_mode
./build_unicorn_support.sh
```

For further information, check out
[unicorn_mode/README.md](../unicorn_mode/README.md).

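Once a harness exists, a run could look like the following. This is a sketch
under assumptions: `harness.py` is a hypothetical user-written Unicorn script
that takes the input file as its argument, `in` and `out` are placeholder
directories, and `-U` and `-m none` are the afl-fuzz options for Unicorn mode
and for lifting the memory limit. The helper only prints the command line.

```shell
# Print the afl-fuzz command line for a Unicorn harness.
# $1: harness script (hypothetical), $2: seed directory, $3: output directory.
unicorn_cmd() {
  printf '%s\n' "afl-fuzz -U -m none -i $2 -o $3 -- python3 $1 @@"
}

unicorn_cmd harness.py in out
```
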
### Shared libraries

If the goal is to fuzz a dynamic library, then there are two options available.
For both, you need to write a small harness that loads and calls the library.
Then you fuzz this with either FRIDA mode or QEMU mode and either use
`AFL_INST_LIBS=1` or `AFL_QEMU_INST_RANGES`/`AFL_FRIDA_INST_RANGES`.

Another, less precise and slower option is to fuzz it with utils/afl_untracer/
and use afl-untracer.c as a template. It is slower than FRIDA mode.

For more information, see
[utils/afl_untracer/README.md](../utils/afl_untracer/README.md).

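For the FRIDA/QEMU harness approach, the choice between `AFL_INST_LIBS=1` and
the `INST_RANGES` variables can be sketched as follows. The names `./harness`,
`libtarget.so`, `in`, and `out` are hypothetical, and passing a module name as
the range value is an assumption based on the mode READMEs, so check them for
the exact format. The helper only prints the command line.

```shell
# Print an afl-fuzz command line for fuzzing a library through a harness.
# $1: mode flag, -Q (QEMU mode) or -O (FRIDA mode)
# $2: module name to restrict instrumentation to, or empty to instrument
#     all libraries with AFL_INST_LIBS=1
lib_fuzz_cmd() {
  case "$1" in
    -Q) ranges_var=AFL_QEMU_INST_RANGES ;;
    -O) ranges_var=AFL_FRIDA_INST_RANGES ;;
    *)  echo "mode must be -Q or -O" >&2; return 1 ;;
  esac
  if [ -n "$2" ]; then
    env_part="$ranges_var=$2"
  else
    env_part="AFL_INST_LIBS=1"
  fi
  printf '%s\n' "$env_part afl-fuzz $1 -i in -o out -- ./harness @@"
}

lib_fuzz_cmd -Q ""
lib_fuzz_cmd -O libtarget.so
```

Restricting the range to one module keeps coverage feedback focused on the
library under test instead of on every library the harness happens to load.
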
### Coresight

Coresight is ARM's answer to Intel's PT. With AFL++ v3.15, there is a Coresight
tracer implementation available in `coresight_mode/` which is faster than QEMU
but cannot run in parallel. Currently, only one process can be traced; it
is WIP.

For more information, see
[coresight_mode/README.md](../coresight_mode/README.md).

## Binary rewriters

Binary rewriters are an alternative solution. They are faster than the solutions
native to AFL++ but don't always work.

### ZAFL

ZAFL is a static rewriting platform supporting x86-64 C/C++,
stripped/unstripped, and PIE/non-PIE binaries. Beyond conventional
instrumentation, ZAFL's API enables transformation passes (e.g., laf-Intel,
context sensitivity, InsTrim, etc.).

Its baseline instrumentation speed typically averages 90-95% of
afl-clang-fast's.

[https://git.zephyr-software.com/opensrc/zafl](https://git.zephyr-software.com/opensrc/zafl)

### RetroWrite

RetroWrite is a static binary rewriter that can be combined with AFL++. If you
have an x86_64 binary that still has its symbols (i.e., not a stripped binary),
is compiled with position independent code (PIC/PIE), and does not contain C++
exceptions, then the RetroWrite solution might be for you. It decompiles to ASM
files which can then be instrumented with afl-gcc.

Binaries that are statically instrumented for fuzzing using RetroWrite are close
in performance to compiler-instrumented binaries and outperform the QEMU-based
instrumentation.

[https://github.com/HexHive/retrowrite](https://github.com/HexHive/retrowrite)

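Most of the preconditions above can be checked from the output of `file`. The
helper below is a rough sketch, not part of RetroWrite: it takes the text that
`file` prints for a binary and matches it against the wording used by common
versions of `file` (an assumption; other versions may phrase it differently).
It cannot detect C++ exceptions, so that condition still has to be checked by
hand.

```shell
# Return 0 if the given `file` output describes an x86_64, PIE, unstripped
# binary. $1: output of `file ./target`.
retrowrite_candidate() {
  case "$1" in *x86-64*) ;; *) return 1 ;; esac
  case "$1" in *"not stripped"*) ;; *) return 1 ;; esac
  # Newer `file` versions say "pie executable", older ones "shared object".
  case "$1" in *"pie executable"*|*"shared object"*) ;; *) return 1 ;; esac
  return 0
}

# Made-up sample line; in practice use: retrowrite_candidate "$(file ./target)"
sample="target: ELF 64-bit LSB pie executable, x86-64, dynamically linked, not stripped"
if retrowrite_candidate "$sample"; then
  echo "worth trying RetroWrite"
else
  echo "preconditions not met"
fi
```
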
### Dyninst

Dyninst is a binary instrumentation framework similar to Pintool and DynamoRIO.
However, whereas Pintool and DynamoRIO work at runtime, Dyninst instruments the
target at load time and then lets it run - or saves the binary with the changes.
This is great for some things, e.g., fuzzing, and not so effective for others,
e.g., malware analysis.

So, what you can do with Dyninst is take every basic block and put AFL++'s
instrumentation code in there - and then save the binary. Afterwards, just fuzz
the newly saved target binary with afl-fuzz. Sounds great? It is. The issue,
though, is that inserting instructions changes addresses in the process space,
and keeping everything working afterwards is a non-trivial problem. Hence, more
often than not, binaries crash when they are run.

The speed decrease is about 15-35%, depending on the optimization options used
with afl-dyninst.

[https://github.com/vanhauser-thc/afl-dyninst](https://github.com/vanhauser-thc/afl-dyninst)

### Mcsema

Theoretically, you can also decompile to LLVM IR with mcsema, and then use
llvm_mode to instrument the binary. Good luck with that.

[https://github.com/lifting-bits/mcsema](https://github.com/lifting-bits/mcsema)

## Binary tracers

### Pintool & DynamoRIO

Pintool and DynamoRIO are dynamic instrumentation engines. They can be used for
getting basic block information at runtime. Pintool is only available for Intel
x32/x64 on Linux, macOS, and Windows, whereas DynamoRIO is additionally
available for ARM and AARCH64. DynamoRIO is also 10x faster than Pintool.

The big issue with DynamoRIO (and therefore Pintool, too) is speed. DynamoRIO
has a speed decrease of 98-99%, Pintool has a speed decrease of 99.5%.

Hence, DynamoRIO is the option to go for if everything else fails, and Pintool
only if DynamoRIO fails, too.

DynamoRIO solutions:
* [https://github.com/vanhauser-thc/afl-dynamorio](https://github.com/vanhauser-thc/afl-dynamorio)
* [https://github.com/mxmssh/drAFL](https://github.com/mxmssh/drAFL)
* [https://github.com/googleprojectzero/winafl/](https://github.com/googleprojectzero/winafl/)
  <= very good but Windows only

Pintool solutions:
* [https://github.com/vanhauser-thc/afl-pin](https://github.com/vanhauser-thc/afl-pin)
* [https://github.com/mothran/aflpin](https://github.com/mothran/aflpin)
* [https://github.com/spinpx/afl_pin_mode](https://github.com/spinpx/afl_pin_mode)
  <= only old Pintool version supported

### Intel PT

If you have a newer Intel CPU, you can make use of Intel's processor trace. The
big issue with Intel's PT is the small buffer size and the complex encoding of
the debug information collected through PT. This makes the decoding very CPU
intensive and hence slow. As a result, the overall speed decrease is about
70-90% (depending on the implementation and other factors).

There are two AFL Intel PT implementations:

1. [https://github.com/junxzm1990/afl-pt](https://github.com/junxzm1990/afl-pt)
    => This needs Ubuntu 14.04.05 without any updates and the 4.4 kernel.

2. [https://github.com/hunter-ht-2018/ptfuzzer](https://github.com/hunter-ht-2018/ptfuzzer)
    => This needs a 4.14 or 4.15 kernel. The "nopti" kernel boot option must be
    used. This one is faster than the other.

Note that there is also honggfuzz:
[https://github.com/google/honggfuzz](https://github.com/google/honggfuzz). But
its IPT performance is just 6%!

## Non-AFL++ solutions

There are many binary-only fuzzing frameworks. Some are great for CTFs but don't
work with large binaries, others are very slow but have good path discovery,
some are very hard to set up...

* Jackalope:
  [https://github.com/googleprojectzero/Jackalope](https://github.com/googleprojectzero/Jackalope)
* Manticore:
  [https://github.com/trailofbits/manticore](https://github.com/trailofbits/manticore)
* QSYM:
  [https://github.com/sslab-gatech/qsym](https://github.com/sslab-gatech/qsym)
* S2E: [https://github.com/S2E](https://github.com/S2E)
* TinyInst:
  [https://github.com/googleprojectzero/TinyInst](https://github.com/googleprojectzero/TinyInst)
  (Mac/Windows only)
* ... please send me any good ones that are missing

## Closing words

That's it! News, corrections, updates? Send an email to vh@thc.org.