---
layout: default
title: Integrating a Python project
parent: Setting up a new project
grand_parent: Getting started
nav_order: 3
permalink: /getting-started/new-project-guide/python-lang/
---

# Integrating a Python project
{: .no_toc}

- TOC
{:toc}
---

Integrating a Python project with OSS-Fuzz closely follows the general
[Setting up a new project]({{ site.baseurl }}/getting-started/new-project-guide/)
process. The Python-specific details are outlined below.

## Atheris

Python fuzzing in OSS-Fuzz depends on
[Atheris](https://github.com/google/atheris). Fuzzers depend on the
`atheris` package, and the required dependencies are pre-installed on the
OSS-Fuzz base Docker images.

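As a rough illustration of what such a fuzzer looks like, the sketch below
wires a `TestOneInput` function into Atheris. The target (the standard
library's `json` module) and the lazy `atheris` import are assumptions made so
that the example stays self-contained; a real harness would import the project
under test instead.

```python
import json
import sys


def TestOneInput(data: bytes) -> None:
    """Called by the fuzzing engine once per generated input."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return  # Not valid UTF-8, so not interesting for this target.
    try:
        json.loads(text)
    except (ValueError, RecursionError):
        # Expected rejections: malformed JSON (json.JSONDecodeError is a
        # ValueError subclass) or nesting deeper than the interpreter allows.
        return


def main() -> None:
    try:
        import atheris  # Assumed available, as on the OSS-Fuzz base images.
    except ImportError:
        print("atheris is not installed; TestOneInput can still be called directly.")
        return
    atheris.instrument_all()  # Add coverage instrumentation to loaded Python code.
    atheris.Setup(sys.argv, TestOneInput)
    atheris.Fuzz()  # Runs until a crash is found or the engine is stopped.


if __name__ == "__main__":
    main()
```

Any exception that escapes `TestOneInput` is reported as a crash, so the
harness catches only the failures the target is documented to raise.
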
## Project files

### Example project

We recommend viewing [ujson](https://github.com/google/oss-fuzz/tree/master/projects/ujson) as an
example of a simple Python fuzzing project, with both plain-Atheris and
Atheris + Hypothesis harnesses.

### project.yaml

The `language` attribute must be specified.

```yaml
language: python
```

The only supported fuzzing engine is libFuzzer (`libfuzzer`). The supported
sanitizers are AddressSanitizer (`address`) and
UndefinedBehaviorSanitizer (`undefined`). These must be explicitly specified.

```yaml
fuzzing_engines:
  - libfuzzer
sanitizers:
  - address
  - undefined
```

### Dockerfile

The Dockerfile should start with `FROM gcr.io/oss-fuzz-base/base-builder-python`.

Because most dependencies are already pre-installed on the images, no
significant changes are needed in the Dockerfile for Python fuzzing projects.
Simply clone the project, set a `WORKDIR`, copy any necessary files, and
install any project-specific dependencies here as you normally would.

### build.sh

For Python projects, `build.sh` needs more significant modifications than it
does for other projects. The following annotated example build script explains
why each step is necessary and when it can be omitted.

```sh
# Build and install project (using current CFLAGS, CXXFLAGS). This is required
# for projects with C extensions so that they're built with the proper flags.
pip3 install .

# Build fuzzers into $OUT. These could be detected in other ways.
for fuzzer in $(find $SRC -name '*_fuzzer.py'); do
  fuzzer_basename=$(basename -s .py $fuzzer)
  fuzzer_package=${fuzzer_basename}.pkg

  # To avoid issues with Python version conflicts, or changes in environment
  # over time on the OSS-Fuzz bots, we use pyinstaller to create a standalone
  # package. Though not necessarily required for reproducing issues, this is
  # required to keep fuzzers working properly in OSS-Fuzz.
  pyinstaller --distpath $OUT --onefile --name $fuzzer_package $fuzzer

  # Create execution wrapper. Atheris requires that certain libraries are
  # preloaded, so this is also done here to ensure compatibility and simplify
  # test case reproduction. Since this helper script is what OSS-Fuzz will
  # actually execute, it is also always required.
  # NOTE: If you are fuzzing Python-only code and do not have native C/C++
  # extensions, then remove the LD_PRELOAD line below, as preloading the
  # sanitizer library is not required and can lead to unexpected startup crashes.
  echo "#!/bin/sh
# LLVMFuzzerTestOneInput for fuzzer detection.
this_dir=\$(dirname \"\$0\")
LD_PRELOAD=\$this_dir/sanitizer_with_fuzzer.so \
ASAN_OPTIONS=\$ASAN_OPTIONS:symbolize=1:external_symbolizer_path=\$this_dir/llvm-symbolizer:detect_leaks=0 \
\$this_dir/$fuzzer_package \$@" > $OUT/$fuzzer_basename
  chmod +x $OUT/$fuzzer_basename
done
```

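The escaping in the `echo` command can make it hard to see what the wrapper
contains once the shell has processed it. The sketch below renders the same
text in Python, with and without the `LD_PRELOAD` entry. `render_wrapper` and
its `has_native_extensions` parameter are hypothetical names used only for
illustration; they are not part of OSS-Fuzz, and `build.sh` remains the place
where the wrapper is actually written.

```python
def render_wrapper(fuzzer_basename: str, has_native_extensions: bool = True) -> str:
    """Return the wrapper text that the echo command in build.sh would write."""
    fuzzer_package = f"{fuzzer_basename}.pkg"

    # Inside the double-quoted echo string, each backslash-newline is a line
    # continuation, so the environment settings and the command share one line.
    command = []
    if has_native_extensions:
        # Only needed when the project ships native C/C++ extensions.
        command.append("LD_PRELOAD=$this_dir/sanitizer_with_fuzzer.so")
    command.append(
        "ASAN_OPTIONS=$ASAN_OPTIONS:symbolize=1:"
        "external_symbolizer_path=$this_dir/llvm-symbolizer:detect_leaks=0"
    )
    command.append(f"$this_dir/{fuzzer_package} $@")

    lines = [
        "#!/bin/sh",
        "# LLVMFuzzerTestOneInput for fuzzer detection.",
        'this_dir=$(dirname "$0")',
        " ".join(command),
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Pure-Python project: the LD_PRELOAD entry is left out.
    print(render_wrapper("json_fuzzer", has_native_extensions=False))
```

Comparing the two outputs shows that the only difference for pure-Python
projects is the missing `LD_PRELOAD` assignment; the `ASAN_OPTIONS` settings
and the path to the packaged fuzzer stay the same.
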
## Hypothesis

[Hypothesis](https://hypothesis.readthedocs.io/), the Python library for
[property-based testing](https://hypothesis.works/articles/what-is-property-based-testing/),
makes it easy to generate complex inputs, whether in traditional test suites
or [by using test functions as fuzz harnesses](https://hypothesis.readthedocs.io/en/latest/details.html#use-with-external-fuzzers).

> Property based testing is the construction of tests such that, when these tests are fuzzed,
  failures in the test reveal problems with the system under test that could not have been
  revealed by direct fuzzing of that system.

We recommend using the [`hypothesis write`](https://hypothesis.readthedocs.io/en/latest/ghostwriter.html)
command to generate a starter fuzz harness. This "ghostwritten" code may be usable as-is,
or provide a useful template for writing more specific tests.

The Hypothesis documentation covers [the core "strategies"](https://hypothesis.readthedocs.io/en/latest/data.html)
for generating arbitrary data, [NumPy + pandas support](https://hypothesis.readthedocs.io/en/latest/numpy.html),
and [a variety of third-party extensions](https://hypothesis.readthedocs.io/en/latest/strategies.html)
supporting everything from protobufs and JSON schemas to networkx graphs, GeoJSON,
and valid Python source code.
Hypothesis' integrated test-case reduction also makes it trivial to report a canonical minimal
example for each distinct failure discovered while fuzzing: just run the test function!

To use Hypothesis in OSS-Fuzz, install it in your Dockerfile with:

```shell
RUN pip3 install hypothesis
```

See [the `ujson` structured fuzzer](https://github.com/google/oss-fuzz/blob/master/projects/ujson/hypothesis_structured_fuzzer.py)
for an example "polyglot" which can either be run with `pytest` as a standard test function,
or run with OSS-Fuzz as a fuzz harness.

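A minimal sketch of such a polyglot is shown below. It is not the `ujson`
harness itself: the target (the standard library's `json` module), the
`JSON_VALUES` strategy, and the test name are assumptions chosen to keep the
example self-contained. The `fuzz_one_input` hook on the test function is what
Hypothesis provides for use with external fuzzers.

```python
import json
import sys

from hypothesis import given, strategies as st

# Recursive strategy for JSON-compatible values. NaN is excluded because
# NaN != NaN would break the equality check below; infinity is excluded
# because it is not valid standard JSON.
JSON_VALUES = st.recursive(
    st.none()
    | st.booleans()
    | st.integers()
    | st.floats(allow_nan=False, allow_infinity=False)
    | st.text(),
    lambda children: st.lists(children) | st.dictionaries(st.text(), children),
    max_leaves=20,
)


@given(value=JSON_VALUES)
def test_json_round_trip(value):
    """Property: decoding an encoded value gives back the original value."""
    assert json.loads(json.dumps(value)) == value


def main() -> None:
    try:
        import atheris  # Assumed available, as on the OSS-Fuzz base images.
    except ImportError:
        print("atheris is not installed; run this file with pytest instead.")
        return
    atheris.instrument_all()
    # Hypothesis turns the raw bytes from the fuzzer into structured values.
    atheris.Setup(sys.argv, test_json_round_trip.hypothesis.fuzz_one_input)
    atheris.Fuzz()


if __name__ == "__main__":
    main()
```

Under `pytest`, `test_json_round_trip` is collected and run like any other
test; when the file is executed directly, the same function is driven by
Atheris instead.
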