CXX — safe FFI between Rust and C++
=========================================

[<img alt="github" src="https://img.shields.io/badge/github-dtolnay/cxx-8da0cb?style=for-the-badge&labelColor=555555&logo=github" height="20">](https://github.com/dtolnay/cxx)
[<img alt="crates.io" src="https://img.shields.io/crates/v/cxx.svg?style=for-the-badge&color=fc8d62&logo=rust" height="20">](https://crates.io/crates/cxx)
[<img alt="docs.rs" src="https://img.shields.io/badge/docs.rs-cxx-66c2a5?style=for-the-badge&labelColor=555555&logo=docs.rs" height="20">](https://docs.rs/cxx)
[<img alt="build status" src="https://img.shields.io/github/actions/workflow/status/dtolnay/cxx/ci.yml?branch=master&style=for-the-badge" height="20">](https://github.com/dtolnay/cxx/actions?query=branch%3Amaster)

This library provides a **safe** mechanism for calling C++ code from Rust and
Rust code from C++, not subject to the many ways that things can go wrong when
using bindgen or cbindgen to generate unsafe C-style bindings.

This doesn't change the fact that 100% of C++ code is unsafe. When auditing a
project, you would be on the hook for auditing all the unsafe Rust code and
*all* the C++ code. The core safety claim under this new model is that auditing
just the C++ side would be sufficient to catch all problems, i.e. the Rust side
can be 100% safe.

```toml
[dependencies]
cxx = "1.0"

[build-dependencies]
cxx-build = "1.0"
```

*Compiler support: requires rustc 1.48+ and C++11 or newer*<br>
*[Release notes](https://github.com/dtolnay/cxx/releases)*

<br>

## Guide

Please see **<https://cxx.rs>** for a tutorial, reference material, and example
code.

<br>

## Overview

The idea is that we define the signatures of both sides of our FFI boundary
embedded together in one Rust module (the next section shows an example).
From this, CXX receives a complete picture of the boundary to perform static
analyses against the types and function signatures to uphold both Rust's and
C++'s invariants and requirements.

If everything checks out statically, then CXX uses a pair of code generators to
emit the relevant `extern "C"` signatures on both sides together with any
necessary static assertions for later in the build process to verify
correctness. On the Rust side this code generator is simply an attribute
procedural macro. On the C++ side it can be a small Cargo build script if your
build is managed by Cargo, or for other build systems like Bazel or Buck we
provide a command line tool which generates the header and source file and
should be easy to integrate.

The resulting FFI bridge operates at zero or negligible overhead, i.e. no
copying, no serialization, no memory allocation, no runtime checks needed.

The FFI signatures are able to use native types from whichever side they please,
such as Rust's `String` or C++'s `std::string`, Rust's `Box` or C++'s
`std::unique_ptr`, Rust's `Vec` or C++'s `std::vector`, etc. in any combination.
CXX guarantees an ABI-compatible signature that both sides understand, based on
builtin bindings for key standard library types to expose an idiomatic API on
those types to the other language. For example when manipulating a C++ string
from Rust, its `len()` method becomes a call of the `size()` member function
defined by C++; when manipulating a Rust string from C++, its `size()` member
function calls Rust's `len()`.

<br>

## Example

In this example we are writing a Rust application that wishes to take advantage
of an existing C++ client for a large-file blobstore service. The blobstore
supports a `put` operation for a discontiguous buffer upload.
For example we might be uploading snapshots of a circular buffer which would
tend to consist of 2 chunks, or fragments of a file spread across memory for
some other reason.

A runnable version of this example is provided under the *demo* directory of
this repo. To try it out, run `cargo run` from that directory.

```rust
#[cxx::bridge]
mod ffi {
    // Any shared structs, whose fields will be visible to both languages.
    struct BlobMetadata {
        size: usize,
        tags: Vec<String>,
    }

    extern "Rust" {
        // Zero or more opaque types which both languages can pass around but
        // only Rust can see the fields.
        type MultiBuf;

        // Functions implemented in Rust.
        fn next_chunk(buf: &mut MultiBuf) -> &[u8];
    }

    unsafe extern "C++" {
        // One or more headers with the matching C++ declarations. Our code
        // generators don't read it but it gets #include'd and used in static
        // assertions to ensure our picture of the FFI boundary is accurate.
        include!("demo/include/blobstore.h");

        // Zero or more opaque types which both languages can pass around but
        // only C++ can see the fields.
        type BlobstoreClient;

        // Functions implemented in C++.
        fn new_blobstore_client() -> UniquePtr<BlobstoreClient>;
        fn put(&self, parts: &mut MultiBuf) -> u64;
        fn tag(&self, blobid: u64, tag: &str);
        fn metadata(&self, blobid: u64) -> BlobMetadata;
    }
}
```

Now we simply provide Rust definitions of all the things in the `extern "Rust"`
block and C++ definitions of all the things in the `extern "C++"` block, and get
to call back and forth safely.
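
As a rough illustration of the Rust half, the two `extern "Rust"` items could be
defined in the parent module along the following lines. This is a minimal
sketch, not necessarily what the demo does: the `chunks` and `pos` fields and
the `new` constructor are assumptions made for illustration, and the bridge
module and the C++ side are left out so that the snippet compiles on its own.

```rust
// Hypothetical Rust-side definitions for the items declared in `extern "Rust"`.
// The field layout and the constructor are illustrative assumptions.

/// A discontiguous buffer: a list of chunks plus a cursor into that list.
pub struct MultiBuf {
    chunks: Vec<Vec<u8>>,
    pos: usize,
}

impl MultiBuf {
    /// Hypothetical helper for building a buffer from pre-split chunks.
    pub fn new(chunks: Vec<Vec<u8>>) -> Self {
        MultiBuf { chunks, pos: 0 }
    }
}

/// Returns the next chunk, or an empty slice once every chunk has been
/// handed out.
pub fn next_chunk(buf: &mut MultiBuf) -> &[u8] {
    let next = buf.chunks.get(buf.pos);
    buf.pos += 1;
    next.map_or(&[], Vec::as_slice)
}
```

A consumer such as the C++ `put` could then call `next_chunk` in a loop and stop
at the first empty slice; that termination convention is part of this sketch's
assumptions.
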

Here are links to the complete set of source files involved in the demo:

- [demo/src/main.rs](demo/src/main.rs)
- [demo/build.rs](demo/build.rs)
- [demo/include/blobstore.h](demo/include/blobstore.h)
- [demo/src/blobstore.cc](demo/src/blobstore.cc)

To look at the code generated in both languages for the example by the CXX code
generators:

```console
   # run Rust code generator and print to stdout
   # (requires https://github.com/dtolnay/cargo-expand)
$ cargo expand --manifest-path demo/Cargo.toml

   # run C++ code generator and print to stdout
$ cargo run --manifest-path gen/cmd/Cargo.toml -- demo/src/main.rs
```

<br>

## Details

As seen in the example, the language of the FFI boundary involves 3 kinds of
items:

- **Shared structs** — their fields are made visible to both languages.
  The definition written within cxx::bridge is the single source of truth.

- **Opaque types** — their fields are secret from the other language.
  These cannot be passed across the FFI by value but only behind an indirection,
  such as a reference `&`, a Rust `Box`, or a `UniquePtr`. Can be a type alias
  for an arbitrarily complicated generic language-specific type depending on
  your use case.

- **Functions** — implemented in either language, callable from the other
  language.

Within the `extern "Rust"` part of the CXX bridge we list the types and
functions for which Rust is the source of truth. These all implicitly refer to
the `super` module, the parent module of the CXX bridge. You can think of the
two items listed in the example above as being like `use super::MultiBuf` and
`use super::next_chunk` except re-exported to C++. The parent module will either
contain the definitions directly for simple things, or contain the relevant
`use` statements to bring them into scope from elsewhere.
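
The `super` analogy can be pictured in plain Rust, with no CXX involved. The
sketch below uses a hypothetical `bridge_like` module that re-exports items from
its parent with `pub use super::...`. It only illustrates the name resolution
described above, not the code CXX generates; the module names `app` and
`bridge_like` and the fields of `MultiBuf` are assumptions for illustration.

```rust
// Plain-Rust analogy only; nothing here uses or reproduces CXX's generated code.
// `app` and `bridge_like` are hypothetical names.
pub mod app {
    // Parent module: the definitions live here (or are brought in with `use`).
    pub struct MultiBuf {
        pub chunks: Vec<Vec<u8>>,
        pub pos: usize,
    }

    pub fn next_chunk(buf: &mut MultiBuf) -> &[u8] {
        let next = buf.chunks.get(buf.pos);
        buf.pos += 1;
        next.map_or(&[], Vec::as_slice)
    }

    // Stand-in for the bridge module: it names the parent's items via `super`.
    pub mod bridge_like {
        pub use super::next_chunk;
        pub use super::MultiBuf;
    }
}
```

Code outside `app` can then reach the same type and function through
`app::bridge_like`, which is roughly the view the other language gets of the
items listed in `extern "Rust"`.
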

Within the `extern "C++"` part, we list types and functions for which C++ is the
source of truth, as well as the header(s) that declare those APIs. In the future
it's possible that this section could be generated bindgen-style from the
headers but for now we need the signatures written out; static assertions will
verify that they are accurate.

Your function implementations themselves, whether in C++ or Rust, *do not* need
to be defined as `extern "C"` ABI or no\_mangle. CXX will put in the right shims
where necessary to make it all work.

<br>

## Comparison vs bindgen and cbindgen

Notice that with CXX there is repetition of all the function signatures: they
are typed out once where the implementation is defined (in C++ or Rust) and
again inside the cxx::bridge module, though compile-time assertions guarantee
these are kept in sync. This is different from [bindgen] and [cbindgen] where
function signatures are typed by a human once and the tool consumes them in one
language and emits them in the other language.

[bindgen]: https://github.com/rust-lang/rust-bindgen
[cbindgen]: https://github.com/eqrion/cbindgen/

This is because CXX fills a somewhat different role. It is a lower level tool
than bindgen or cbindgen in a sense; you can think of it as being a replacement
for the concept of `extern "C"` signatures as we know them, rather than a
replacement for a bindgen. It would be reasonable to build a higher level
bindgen-like tool on top of CXX which consumes a C++ header and/or Rust module
(and/or IDL like Thrift) as source of truth and generates the cxx::bridge,
eliminating the repetition while leveraging the static analysis safety
guarantees of CXX.

But note in other ways CXX is higher level than the bindgens, with rich support
for common standard library types.
Frequently with bindgen when we are dealing
with an idiomatic C++ API we would end up manually wrapping that API in C-style
raw pointer functions, applying bindgen to get unsafe raw pointer Rust
functions, and replicating the API again to expose those idiomatically in Rust.
That's a much worse form of repetition because it is unsafe all the way through.

By using a CXX bridge as the shared understanding between the languages, rather
than `extern "C"` C-style signatures as the shared understanding, common FFI use
cases become expressible using 100% safe code.

It would also be reasonable to mix and match, using CXX bridge for the 95% of
your FFI that is straightforward and doing the remaining few oddball signatures
the old-fashioned way with bindgen and cbindgen, if for some reason CXX's static
restrictions get in the way. Please file an issue if you end up taking this
approach so that we know what ways it would be worthwhile to make the tool more
expressive.

<br>

## Cargo-based setup

For builds that are orchestrated by Cargo, you will use a build script that runs
CXX's C++ code generator and compiles the resulting C++ code along with any
other C++ code for your crate.

The canonical build script is as follows. The indicated line returns a
[`cc::Build`] instance (from the usual widely used `cc` crate) on which you can
set up any additional source files and compiler flags as normal.

[`cc::Build`]: https://docs.rs/cc/1.0/cc/struct.Build.html

```toml
# Cargo.toml

[build-dependencies]
cxx-build = "1.0"
```

```rust
// build.rs

fn main() {
    cxx_build::bridge("src/main.rs")  // returns a cc::Build
        .file("src/demo.cc")
        .flag_if_supported("-std=c++11")
        .compile("cxxbridge-demo");

    println!("cargo:rerun-if-changed=src/main.rs");
    println!("cargo:rerun-if-changed=src/demo.cc");
    println!("cargo:rerun-if-changed=include/demo.h");
}
```

<br>

## Non-Cargo setup

For use in non-Cargo builds like Bazel or Buck, CXX provides an alternate way of
invoking the C++ code generator as a standalone command line tool. The tool is
packaged as the `cxxbridge-cmd` crate on crates.io or can be built from the
*gen/cmd* directory of this repo.

```bash
$ cargo install cxxbridge-cmd

$ cxxbridge src/main.rs --header > path/to/mybridge.h
$ cxxbridge src/main.rs > path/to/mybridge.cc
```

<br>

## Safety

Be aware that the design of this library is intentionally restrictive and
opinionated! It isn't a goal to be powerful enough to handle arbitrary
signatures in either language. Instead this project is about carving out a
reasonably expressive set of functionality about which we can make useful safety
guarantees today and maybe extend over time. You may find that it takes some
practice to use CXX bridge effectively as it won't work in all the ways that you
are used to.

Some of the considerations that go into ensuring safety are:

- By design, our paired code generators work together to control both sides of
  the FFI boundary. Ordinarily in Rust writing your own `extern "C"` blocks is
  unsafe because the Rust compiler has no way to know whether the signatures
  you've written actually match the signatures implemented in the other
  language.
  With CXX we achieve that visibility and know what's on the other
  side.

- Our static analysis detects and prevents passing types by value that shouldn't
  be passed by value from C++ to Rust, for example because they may contain
  internal pointers that would be screwed up by Rust's move behavior.

- To many people's surprise, it is possible to have a struct in Rust and a
  struct in C++ with exactly the same layout / fields / alignment / everything,
  and still not the same ABI when passed by value. This is a longstanding
  bindgen bug that leads to segfaults in absolutely correct-looking code
  ([rust-lang/rust-bindgen#778]). CXX knows about this and can insert the
  necessary zero-cost workaround transparently where needed, so go ahead and
  pass your structs by value without worries. This is made possible by owning
  both sides of the boundary rather than just one.

- Template instantiations: for example in order to expose a UniquePtr\<T\> type
  in Rust backed by a real C++ unique\_ptr, we have a way of using a Rust trait
  to connect the behavior back to the template instantiations performed by the
  other language.

[rust-lang/rust-bindgen#778]: https://github.com/rust-lang/rust-bindgen/issues/778

<br>

## Builtin types

In addition to all the primitive types (i32 &lt;=&gt; int32_t), the following
common types may be used in the fields of shared structs and the arguments and
returns of functions.

<table>
<tr><th>name in Rust</th><th>name in C++</th><th>restrictions</th></tr>
<tr><td>String</td><td>rust::String</td><td></td></tr>
<tr><td>&amp;str</td><td>rust::Str</td><td></td></tr>
<tr><td>&amp;[T]</td><td>rust::Slice&lt;const T&gt;</td><td><sup><i>cannot hold opaque C++ type</i></sup></td></tr>
<tr><td>&amp;mut [T]</td><td>rust::Slice&lt;T&gt;</td><td><sup><i>cannot hold opaque C++ type</i></sup></td></tr>
<tr><td><a href="https://docs.rs/cxx/1.0/cxx/struct.CxxString.html">CxxString</a></td><td>std::string</td><td><sup><i>cannot be passed by value</i></sup></td></tr>
<tr><td>Box&lt;T&gt;</td><td>rust::Box&lt;T&gt;</td><td><sup><i>cannot hold opaque C++ type</i></sup></td></tr>
<tr><td><a href="https://docs.rs/cxx/1.0/cxx/struct.UniquePtr.html">UniquePtr&lt;T&gt;</a></td><td>std::unique_ptr&lt;T&gt;</td><td><sup><i>cannot hold opaque Rust type</i></sup></td></tr>
<tr><td><a href="https://docs.rs/cxx/1.0/cxx/struct.SharedPtr.html">SharedPtr&lt;T&gt;</a></td><td>std::shared_ptr&lt;T&gt;</td><td><sup><i>cannot hold opaque Rust type</i></sup></td></tr>
<tr><td>[T; N]</td><td>std::array&lt;T, N&gt;</td><td><sup><i>cannot hold opaque C++ type</i></sup></td></tr>
<tr><td>Vec&lt;T&gt;</td><td>rust::Vec&lt;T&gt;</td><td><sup><i>cannot hold opaque C++ type</i></sup></td></tr>
<tr><td><a href="https://docs.rs/cxx/1.0/cxx/struct.CxxVector.html">CxxVector&lt;T&gt;</a></td><td>std::vector&lt;T&gt;</td><td><sup><i>cannot be passed by value, cannot hold opaque Rust type</i></sup></td></tr>
<tr><td>*mut T, *const T</td><td>T*, const T*</td><td><sup><i>fn with a raw pointer argument must be declared unsafe to call</i></sup></td></tr>
<tr><td>fn(T, U) -&gt; V</td><td>rust::Fn&lt;V(T, U)&gt;</td><td><sup><i>only passing from Rust to C++ is implemented so far</i></sup></td></tr>
<tr><td>Result&lt;T&gt;</td><td>throw/catch</td><td><sup><i>allowed as return type only</i></sup></td></tr>
</table>

The C++ API of the `rust` namespace is defined by the *include/cxx.h* file in
this repo.
You will need to include this header in your C++ code when working
with those types.

The following types are intended to be supported "soon" but are just not
implemented yet. I don't expect any of these to be hard to make work but it's a
matter of designing a nice API for each in its non-native language.

<table>
<tr><th>name in Rust</th><th>name in C++</th></tr>
<tr><td>BTreeMap&lt;K, V&gt;</td><td><sup><i>tbd</i></sup></td></tr>
<tr><td>HashMap&lt;K, V&gt;</td><td><sup><i>tbd</i></sup></td></tr>
<tr><td>Arc&lt;T&gt;</td><td><sup><i>tbd</i></sup></td></tr>
<tr><td>Option&lt;T&gt;</td><td><sup><i>tbd</i></sup></td></tr>
<tr><td><sup><i>tbd</i></sup></td><td>std::map&lt;K, V&gt;</td></tr>
<tr><td><sup><i>tbd</i></sup></td><td>std::unordered_map&lt;K, V&gt;</td></tr>
</table>

<br>

## Remaining work

This is still early days for CXX; I am releasing it as a minimum viable product
to collect feedback on the direction and invite collaborators. Please check the
open issues.

Especially please report issues if you run into trouble building or linking any
of this stuff. I'm sure there are ways to make the build aspects friendlier or
more robust.

Finally, I know more about Rust library design than C++ library design so I
would appreciate help making the C++ APIs in this project more idiomatic where
anyone has suggestions.

<br>

#### License

<sup>
Licensed under either of <a href="LICENSE-APACHE">Apache License, Version
2.0</a> or <a href="LICENSE-MIT">MIT license</a> at your option.
</sup>

<br>

<sub>
Unless you explicitly state otherwise, any contribution intentionally submitted
for inclusion in this project by you, as defined in the Apache-2.0 license,
shall be dual licensed as above, without any additional terms or conditions.
</sub>