# Contributing to `bindgen`

Hi! We'd love to have your contributions! If you want help or mentorship, reach
out to us in a GitHub issue, or stop by
[#rust on chat.mozilla.org](https://chat.mozilla.org/#/room/#rust:mozilla.org)
and introduce yourself.

<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->

- [Code of Conduct](#code-of-conduct)
- [Filing an Issue](#filing-an-issue)
- [Looking to Start Contributing to `bindgen`?](#looking-to-start-contributing-to-bindgen)
- [Building](#building)
- [Testing](#testing)
  - [Overview](#overview)
  - [Testing Bindings Generation](#testing-bindings-generation)
  - [Testing Generated Bindings](#testing-generated-bindings)
  - [Testing a Single Header's Bindings Generation and Compiling its Bindings](#testing-a-single-headers-bindings-generation-and-compiling-its-bindings)
  - [Authoring New Tests](#authoring-new-tests)
  - [Test Expectations and `libclang` Versions](#test-expectations-and-libclang-versions)
  - [Integration Tests](#integration-tests)
  - [Fuzzing `bindgen` with `csmith`](#fuzzing-bindgen-with-csmith)
  - [Property tests for `bindgen` with `quickchecking`](#property-tests-for-bindgen-with-quickchecking)
- [Code Overview](#code-overview)
  - [Implementing new options using `syn`](#implementing-new-options-using-syn)
- [Pull Requests and Code Reviews](#pull-requests-and-code-reviews)
- [Generating Graphviz Dot Files](#generating-graphviz-dot-files)
- [Debug Logging](#debug-logging)
- [Using `creduce` to Minimize Test Cases](#using-creduce-to-minimize-test-cases)
  - [Getting `creduce`](#getting-creduce)
  - [Isolating Your Test Case](#isolating-your-test-case)
  - [Writing a Predicate Script](#writing-a-predicate-script)
- [Cutting a new bindgen release](#cutting-a-new-bindgen-release)
  - [Updating the changelog](#updating-the-changelog)
  - [Bumping the version numbers.](#bumping-the-version-numbers)
  - [Merge to `main`](#merge-to-main)
  - [Publish and add a git tag for the right commit](#publish-and-add-a-git-tag-for-the-right-commit)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->

## Code of Conduct

We abide by the [Rust Code of Conduct][coc] and ask that you do as well.

[coc]: https://www.rust-lang.org/en-US/conduct.html

## Filing an Issue

Think you've found a bug? File an issue! To help us understand and reproduce the
issue, provide us with:

* A (preferably reduced) C/C++ header file that reproduces the issue
* The `bindgen` flags used to reproduce the issue with the header file
* The expected `bindgen` output
* The actual `bindgen` output
* The [debugging logs](#debug-logging) generated when running `bindgen` on this test case

## Looking to Start Contributing to `bindgen`?

* [Issues labeled "easy"](https://github.com/rust-lang/rust-bindgen/issues?q=is%3Aopen+is%3Aissue+label%3AE-easy)
* [Issues labeled "less easy"](https://github.com/rust-lang/rust-bindgen/issues?q=is%3Aopen+is%3Aissue+label%3AE-less-easy)
* [Issues labeled "help wanted"](https://github.com/rust-lang/rust-bindgen/labels/help%20wanted)
* Still can't find something to work on? [Drop a comment here](https://github.com/rust-lang/rust-bindgen/issues/747)

## Building

To build the `bindgen` library and the `bindgen` executable:

```
$ cargo build
```

If you have multiple versions of llvm installed, the build may not be able to
locate the latest version of libclang. In that case, you may want to either
uninstall the other versions of llvm, or specify the path of the desired
libclang explicitly:

```
$ export LIBCLANG_PATH=path/to/clang-9.0/lib
```

Additionally, you may want to build and test with the `testing_only_docs`
feature to ensure that you aren't forgetting to document types and functions.
CI will catch it if you forget, but the turnaround will be a lot slower ;)

```
$ cargo build --features testing_only_docs
```

## Testing

### Overview

Input C/C++ test headers reside in the `bindgen-tests/tests/headers` directory. Expected
output Rust bindings live in `bindgen-tests/tests/expectations/tests`. For example,
`bindgen-tests/tests/headers/my_header.h`'s expected generated Rust bindings would be
`bindgen-tests/tests/expectations/tests/my_header.rs`.

There are also some integration tests in the `./bindgen-integration` crate, which uses `bindgen` to
generate bindings to some C++ code, and then uses the bindings, asserting that
values are what we expect them to be, both on the Rust and C++ side.

The generated and expected bindings are run through `rustfmt` before they are
compared. Make sure you have `rustfmt` up to date:

```
$ rustup update nightly
$ rustup component add rustfmt --toolchain nightly
```

Note: running `cargo test` from the root directory of `bindgen`'s repository does not
automatically test the generated bindings or run the integration tests.
These steps must be performed manually when needed.

### Testing Bindings Generation

To regenerate bindings from the corpus of test headers in `bindgen-tests/tests/headers` and
compare them against the expected bindings in `bindgen-tests/tests/expectations/tests`, run:

```
$ cargo test
```

As long as you aren't making any changes to `bindgen`'s output, running this
should be sufficient to test your local modifications.
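
The comparison above pairs each test header with an expectation file using the naming convention from the Overview. As a minimal sketch of that convention only: the `expectation_path` helper below is hypothetical, is not part of `bindgen`'s test harness, and ignores the per-`libclang`-version subdirectories covered in a later section.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper (not part of `bindgen`'s test harness): maps a test
/// header to the expectation file implied by the naming convention.
fn expectation_path(header: &Path) -> Option<PathBuf> {
    // `file_stem` strips only the final extension: `my_header.h` -> `my_header`.
    let stem = header.file_stem()?.to_str()?;
    let mut path = PathBuf::from("bindgen-tests/tests/expectations/tests");
    path.push(format!("{}.rs", stem));
    Some(path)
}

fn main() {
    let header = Path::new("bindgen-tests/tests/headers/my_header.h");
    match expectation_path(header) {
        Some(expected) => println!("{} -> {}", header.display(), expected.display()),
        None => eprintln!("could not derive an expectation path"),
    }
}
```

Appending `.rs` to the stem, rather than swapping the extension on the joined path, keeps header names that contain extra dots intact.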

You may set the `BINDGEN_OVERWRITE_EXPECTED` environment variable to overwrite
the expected bindings with `bindgen`'s current output:

```
$ BINDGEN_OVERWRITE_EXPECTED=1 cargo test
```

If you set the `BINDGEN_TESTS_DIFFTOOL` environment variable, `cargo test` will
execute `$BINDGEN_TESTS_DIFFTOOL /path/of/expected/output /path/of/actual/output`
when the expected output differs from the actual output. You can use this to
hand check differences by setting it to e.g. `meld` (assuming you have meld
installed).

If you're not changing command line arguments, you may want to set
`BINDGEN_DISABLE_ROUNDTRIP_TEST` to skip the many tests that round-trip those
arguments.

### Testing Generated Bindings

If your local changes are introducing expected modifications in the
`bindgen-tests/tests/expectations/tests/*` bindings files, then you should test that the
generated bindings files still compile, and that their struct layout tests still
pass. Also, run the integration tests (see below).

You can do this with these commands:

```
$ cd bindgen-tests/tests/expectations
$ cargo test
```

### Testing a Single Header's Bindings Generation and Compiling its Bindings

Note: You will need to install [Graphviz](https://graphviz.org/) since that
is a dependency for running `test-one.sh`.

Sometimes it's useful to work with one test header from start (generating
bindings for it) to finish (compiling the bindings and running their layout
tests). This can be done with the `bindgen-tests/tests/test-one.sh` script. It supports fuzzy
searching for test headers. For example, to test
`bindgen-tests/tests/headers/what_is_going_on.hpp`, execute this command:

```
$ ./bindgen-tests/tests/test-one.sh going
```

Note that `test-one.sh` does not recompile `bindgen`, so if you change the code,
you'll need to rebuild it before running the script again.

### Authoring New Tests

To add a new test header to the suite, simply put it in the `bindgen-tests/tests/headers`
directory. Next, run `bindgen` to generate the initial expected output Rust
bindings. Put those in `bindgen-tests/tests/expectations/tests`.

If your new test requires certain flags to be passed to `bindgen`, you can
specify them at the top of the test header, with a comment like this:

`new_test_header.hpp`:

```c
// bindgen-flags: --enable-cxx-namespaces -- -std=c++14
```

Then verify the new Rust bindings compile and pass their layout tests:

```
$ cd bindgen-tests/tests/expectations
$ cargo test new_test_header
```

### Test Expectations and `libclang` Versions

If a test generates different bindings across different `libclang` versions (for
example, because we take advantage of better/newer APIs when possible), then you
can add multiple test expectations, one for each supported `libclang`
version. Instead of having a single `bindgen-tests/tests/expectations/tests/my_test.rs` file,
add each of:

* `bindgen-tests/tests/expectations/tests/libclang-9/my_test.rs`
* `bindgen-tests/tests/expectations/tests/libclang-5/my_test.rs`

If you need to update the test expectations for a test file that generates
different bindings for different `libclang` versions, you *don't* need to have
many versions of `libclang` installed locally. Just make a work-in-progress pull
request, and then when CI fails, it will log a diff of the
expectations. Use the diff to patch the appropriate expectation file locally and
then update your pull request.

Usually, `bindgen`'s test runner can infer which version of `libclang` you
have.
If for some reason it can't, you can force it to check the bindings against a
specific `libclang` version with a cargo feature:

```
$ cargo test --features testing_only_libclang_$VERSION
```

Where `$VERSION` is one of:

* `4`
* `3_9`
* `3_8`

depending on which version of `libclang` you have installed.

### Integration Tests

The `./bindgen-integration` crate uses `bindgen` to
generate bindings to some C++ code, and then uses the bindings, asserting that
values are what we expect them to be, both on the Rust and C++ side.

To run the integration tests, issue the following:

```
$ cd bindgen-integration
$ cargo test
```

### Fuzzing `bindgen` with `csmith`

We <3 finding hidden bugs and the people who help us find them! One way to help
uncover hidden bugs is by running `csmith` to generate random headers to test
`bindgen` against.

See [./csmith-fuzzing/README.md](./csmith-fuzzing/README.md) for details.

### Property tests for `bindgen` with `quickchecking`

The `tests/quickchecking` crate generates property tests for `bindgen`.
From the crate's directory you can run the tests with `cargo run`. For details
on additional configuration including how to preserve / inspect the generated
property tests, see
[./tests/quickchecking/README.md](./tests/quickchecking/README.md).

## Code Overview

`bindgen` takes C and C++ header files as input and generates corresponding Rust
`#[repr(C)]` type definitions and `extern` foreign function declarations.

First, we use `libclang` to parse the input headers. See `src/clang.rs` for our
Rust-y wrappers over the raw C `libclang` API that the `clang-sys` crate
exposes. We walk over `libclang`'s AST and construct our own internal
representation (IR). The `ir` module and submodules (`src/ir/*`) contain the IR
type definitions and `libclang` AST into IR parsing code.
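
To make the input/output contract at the top of this section concrete, here is a hand-written approximation of the kind of bindings involved. It is a sketch, not actual `bindgen` output: the `Point` struct and `point_sum` function come from a hypothetical C header invented for illustration, and real output differs in details such as derives and generated layout tests.

```rust
use std::mem::{align_of, size_of};
use std::os::raw::c_int;

// Hypothetical C header used for this illustration:
//
//     struct Point { int x; int y; };
//     int point_sum(const struct Point *p);

// `#[repr(C)]` makes the Rust struct use the C field order and padding rules.
#[repr(C)]
#[derive(Debug, Copy, Clone)]
pub struct Point {
    pub x: c_int,
    pub y: c_int,
}

// Foreign function declaration. Toolchains older than Rust 1.82 spell this
// block as plain `extern "C" { ... }`.
unsafe extern "C" {
    // Declaration only; it is never called here, so no C library is linked.
    pub fn point_sum(p: *const Point) -> c_int;
}

fn main() {
    // A hand-rolled version of a layout check: two `int`s, no padding.
    assert_eq!(size_of::<Point>(), 2 * size_of::<c_int>());
    assert_eq!(align_of::<Point>(), align_of::<c_int>());
    println!("Point layout matches the C struct");
}
```

The assertions in `main` are the same kind of size and alignment check that the struct layout tests mentioned in the Testing section perform on generated bindings.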

The umbrella IR type is the `Item`. It contains various nested `enum`s that let
us drill down and get more specific about the kind of construct that we're
looking at. Here is a summary of the IR types and their relationships:

* `Item` contains:
    * An `ItemId` to uniquely identify it.
    * An `ItemKind`, which is one of:
        * A `Module`, which is originally a C++ namespace and becomes a Rust
          module. It contains the set of `ItemId`s of `Item`s that are defined
          within it.
        * A `Type`, which contains:
            * A `Layout`, describing the type's size and alignment.
            * A `TypeKind`, which is one of:
                * Some integer type.
                * Some float type.
                * A `Pointer` to another type.
                * A function pointer type, with `ItemId`s of its parameter types
                  and return type.
                * An `Alias` to another type (`typedef` or `using X = ...`).
                * A fixed size `Array` of `n` elements of another type.
                * A `Comp` compound type, which is either a `struct`, `class`,
                  or `union`. This is potentially a template definition.
                * A `TemplateInstantiation` referencing some template definition
                  and a set of template argument types.
                * Etc...
        * A `Function`, which contains:
            * An ABI
            * A mangled name
            * A `FunctionKind`, which describes whether this function is a plain
              function, method, static method, constructor, destructor, etc.
            * The `ItemId` of its function pointer type.
        * A `Var` representing a static variable or `#define` constant, which
          contains:
            * Its type's `ItemId`
            * Optionally, a mangled name
            * Optionally, a value
    * An optional `clang::SourceLocation` that holds the first source code
      location where the `Item` was encountered.

The IR forms a graph of interconnected and inter-referencing types and
functions.
The `ir::traversal` module provides IR graph traversal
infrastructure: edge kind definitions (base member vs field type vs function
parameter, etc...), the `Trace` trait to enumerate an IR thing's outgoing edges,
and various traversal types.

After constructing the IR, we run a series of analyses on it. These analyses do
everything from allocating logical bitfields into physical units, to computing
which types can `#[derive(Debug)]`, to determining which implicit template
parameters a given type uses. The analyses are defined in
`src/ir/analysis/*`. They are implemented as fixed-point algorithms, using the
`ir::analysis::MonotoneFramework` trait.

The final phase is generating Rust source text from the analyzed IR, and it is
defined in `src/codegen/*`. We use the `quote` crate, which provides the
`quote! { ... }` macro for quasi-quoting Rust forms. Some options that affect the
generated Rust code are implemented using the [`syn`](https://docs.rs/syn) crate.

### Implementing new options using `syn`

If a new option can be implemented using the `syn` crate it should be added to
the `codegen::postprocessing` module by following these steps:

- Introduce a new field to `BindgenOptions` for the option.
- Write a free function inside `codegen::postprocessing` implementing the
  option. This function must have the same name as the `BindgenOptions` field.
- Add a new value to the `codegen::postprocessing::PASSES` for the option using
  the `pass!` macro.

## Pull Requests and Code Reviews

Ensure that each commit stands alone, and passes tests. This enables better
`git bisect`ing when needed. If your commits do not stand on their own, then rebase
them on top of the latest main and squash them into a single commit.

All pull requests undergo code review before merging. To request review, comment
`r? @github_username_of_reviewer`.
They will respond with `r+` to approve the
pull request, or may leave feedback and request changes to the pull request. Any
changes should be squashed into the original commit.

Unsure who to ask for review? Ask any of:

* `@emilio`
* `@fitzgen`

More resources:

* [Servo's GitHub Workflow](https://github.com/servo/servo/wiki/Github-workflow)
* [Beginner's Guide to Rebasing and Squashing](https://github.com/servo/servo/wiki/Beginner's-guide-to-rebasing-and-squashing)

## Generating Graphviz Dot Files

We can generate [Graphviz](http://graphviz.org/pdf/dotguide.pdf) dot files from
our internal representation of a C/C++ input header, and then you can create a
PNG or PDF from it with Graphviz's `dot` program. This is very useful when
debugging bindgen!

First, make sure you have Graphviz and `dot` installed:

```
$ brew install graphviz         # OS X
$ sudo dnf install graphviz     # Fedora
$ # Etc...
```

Then, use the `--emit-ir-graphviz` flag to generate a `dot` file from our IR:

```
$ cargo run -- example.hpp --emit-ir-graphviz output.dot
```

Finally, convert the `dot` file to an image:

```
$ dot -Tpng output.dot -o output.png
```

The final result will look something like this:

[![An example graphviz rendering of our IR](./example-graphviz-ir.png)](./example-graphviz-ir.png)

## Debug Logging

To help debug what `bindgen` is doing, you can set the environment variable
`RUST_LOG=bindgen` to get a bunch of debugging log spew.

```
$ RUST_LOG=bindgen ./target/debug/bindgen [flags...] ~/path/to/some/header.h
```

This logging can also be used when debugging failing tests:

```
$ RUST_LOG=bindgen cargo test
```

## Using `creduce` to Minimize Test Cases

If you find a test case that triggers an unexpected panic in `bindgen`, causes
`bindgen` to emit bindings that won't compile, defines structs with the wrong
size/alignment, or results in any other kind of incorrectness, then using
`creduce` can help reduce the test case to a minimal one that still exhibits
that same bad behavior.

***Reduced test cases are SUPER helpful when filing bug reports!***

### Getting `creduce`

Often, you can install `creduce` from your OS's package manager:

```
$ sudo apt install creduce
$ brew install creduce
$ # Etc...
```

Otherwise, follow [these instructions](https://github.com/csmith-project/creduce/blob/main/INSTALL.md) for building and/or installing `creduce`.

Running `creduce` requires two things:

1. Your isolated test case, and

2. A script to act as a predicate script describing whether the behavior you're
   trying to isolate occurred.

With those two things in hand, running `creduce` looks like this:

    $ creduce ./predicate.sh ./isolated-test-case.h

### Isolating Your Test Case

If you're using `bindgen` as a command line tool, pass the
`--dump-preprocessed-input` flag.

If you're using `bindgen` as a Rust library, invoke the
`bindgen::Builder::dump_preprocessed_input` method where you call
`bindgen::Builder::generate`.

Afterwards, there should be a `__bindgen.i` or `__bindgen.ii` file containing
the combined and preprocessed input headers, which is usable as an isolated,
standalone test case.

### Writing a Predicate Script

Writing a `predicate.sh` script for a `bindgen` test case is straightforward.
We already have a general purpose predicate script that you can use; you just
have to wrap and configure it.

```bash
#!/usr/bin/env bash

# Exit the script with a nonzero exit code if:
# * any individual command finishes with a nonzero exit code, or
# * we access any undefined variable.
set -eu

# Invoke the general purpose predicate script that comes in the
# `bindgen` repository.
#
# You'll need to replace `--whatever-flags` with things that are specific to the
# incorrectness you're trying to pin down. See below for details.
path/to/rust-bindgen/csmith-fuzzing/predicate.py \
    --whatever-flags \
    ./isolated-test-case.h
```

When hunting down a particular panic emanating from inside `bindgen`, you can
invoke `predicate.py` like this:

```bash
path/to/rust-bindgen/csmith-fuzzing/predicate.py \
    --expect-bindgen-fail \
    --bindgen-grep "thread main panicked at '<insert panic message here>'" \
    ./isolated-test-case.h
```

Alternatively, when hunting down a bad `#[derive(Eq)]` that is causing `rustc`
to fail to compile `bindgen`'s emitted bindings, you can invoke `predicate.py`
like this:

```bash
path/to/rust-bindgen/csmith-fuzzing/predicate.py \
    --bindings-grep NameOfTheStructThatIsErroneouslyDerivingEq \
    --expect-compile-fail \
    --rustc-grep 'error[E0277]: the trait bound `f64: std::cmp::Eq` is not satisfied' \
    ./isolated-test-case.h
```

Or, when minimizing a failing layout test in the compiled bindings, you can
invoke `predicate.py` like this:

```bash
path/to/rust-bindgen/csmith-fuzzing/predicate.py \
    --bindings-grep MyStruct \
    --expect-layout-tests-fail \
    --layout-tests-grep "thread 'bindgen_test_layout_MyStruct' panicked" \
    ./isolated-test-case.h
```

For details on all the flags that you can pass to `predicate.py`, run:

```
$ path/to/rust-bindgen/csmith-fuzzing/predicate.py --help
```

And you can always write your own, arbitrary predicate script if you prefer.
(Although, maybe we should add extra functionality to `predicate.py` -- file an
issue if you think so!)

`creduce` is *really* helpful and can cut hundreds of thousands of lines of test
case down to 5 lines.

Happy bug hunting and test case reducing!

[More information on using `creduce`.](https://embed.cs.utah.edu/creduce/using/)

## Cutting a new bindgen release

To cut a release, the following needs to happen:

### Updating the changelog

Update the CHANGELOG.md file with the changes from the last release. Something
like the following is a useful way to check what has landed:

```
$ git log --oneline v0.62.0..HEAD
```

It is also worth checking the [next-release tag](https://github.com/rust-lang/rust-bindgen/pulls?q=is%3Apr+label%3Anext-release).

Once that's done and the changelog is up-to-date, run `doctoc` on it.

If needed, install it locally by running:

```
$ npm install doctoc
$ ./node_modules/doctoc/doctoc.js CHANGELOG.md
```

### Bumping the version numbers.

Bump version numbers as needed. Run tests just to ensure everything is working
as expected.

### Merge to `main`

For regular releases, the changes above should end up in `main` before
publishing. For dot-releases of an old version (e.g., cherry-picking an
important fix) you can skip this.

### Publish and add a git tag for the right commit

Once you're on the right commit, do:

```
$ git tag -a v0.62.1 # With the right version of course
$ pushd bindgen && cargo publish && popd
$ pushd bindgen-cli && cargo publish && popd
$ git push --tags upstream # To publish the tag
```