CXX — safe FFI between Rust and C++
=========================================

[<img alt="github" src="https://img.shields.io/badge/github-dtolnay/cxx-8da0cb?style=for-the-badge&labelColor=555555&logo=github" height="20">](https://github.com/dtolnay/cxx)
[<img alt="crates.io" src="https://img.shields.io/crates/v/cxx.svg?style=for-the-badge&color=fc8d62&logo=rust" height="20">](https://crates.io/crates/cxx)
[<img alt="docs.rs" src="https://img.shields.io/badge/docs.rs-cxx-66c2a5?style=for-the-badge&labelColor=555555&logoColor=white&logo=" height="20">](https://docs.rs/cxx)
[<img alt="build status" src="https://img.shields.io/github/workflow/status/dtolnay/cxx/CI/master?style=for-the-badge" height="20">](https://github.com/dtolnay/cxx/actions?query=branch%3Amaster)

This library provides a **safe** mechanism for calling C++ code from Rust and
Rust code from C++, not subject to the many ways that things can go wrong when
using bindgen or cbindgen to generate unsafe C-style bindings.

This doesn't change the fact that 100% of C++ code is unsafe. When auditing a
project, you would be on the hook for auditing all the unsafe Rust code and
*all* the C++ code. The core safety claim under this new model is that auditing
just the C++ side would be sufficient to catch all problems, i.e. the Rust side
can be 100% safe.

```toml
[dependencies]
cxx = "1.0"

[build-dependencies]
cxx-build = "1.0"
```

*Compiler support: requires rustc 1.48+ and C++11 or newer*<br>
*[Release notes](https://github.com/dtolnay/cxx/releases)*

<br>

## Guide

Please see **<https://cxx.rs>** for a tutorial, reference material, and example
code.

<br>

## Overview

The idea is that we define the signatures of both sides of our FFI boundary
embedded together in one Rust module (the next section shows an example). From
this, CXX receives a complete picture of the boundary to perform static analyses
against the types and function signatures to uphold both Rust's and C++'s
invariants and requirements.

If everything checks out statically, then CXX uses a pair of code generators to
emit the relevant `extern "C"` signatures on both sides together with any
necessary static assertions for later in the build process to verify
correctness. On the Rust side this code generator is simply an attribute
procedural macro. On the C++ side it can be a small Cargo build script if your
build is managed by Cargo, or for other build systems like Bazel or Buck we
provide a command line tool which generates the header and source file and
should be easy to integrate.

The resulting FFI bridge operates at zero or negligible overhead, i.e. no
copying, no serialization, no memory allocation, no runtime checks needed.

The FFI signatures are able to use native types from whichever side they please,
such as Rust's `String` or C++'s `std::string`, Rust's `Box` or C++'s
`std::unique_ptr`, Rust's `Vec` or C++'s `std::vector`, etc. in any combination.
CXX guarantees an ABI-compatible signature that both sides understand, based on
builtin bindings for key standard library types to expose an idiomatic API on
those types to the other language. For example when manipulating a C++ string
from Rust, its `len()` method becomes a call of the `size()` member function
defined by C++; when manipulating a Rust string from C++, its `size()` member
function calls Rust's `len()`.

<br>

## Example

In this example we are writing a Rust application that wishes to take advantage
of an existing C++ client for a large-file blobstore service. The blobstore
supports a `put` operation for a discontiguous buffer upload. For example we
might be uploading snapshots of a circular buffer which would tend to consist of
2 chunks, or fragments of a file spread across memory for some other reason.

A runnable version of this example is provided under the *demo* directory of
this repo. To try it out, run `cargo run` from that directory.

```rust
#[cxx::bridge]
mod ffi {
    // Any shared structs, whose fields will be visible to both languages.
    struct BlobMetadata {
        size: usize,
        tags: Vec<String>,
    }

    extern "Rust" {
        // Zero or more opaque types which both languages can pass around but
        // only Rust can see the fields.
        type MultiBuf;

        // Functions implemented in Rust.
        fn next_chunk(buf: &mut MultiBuf) -> &[u8];
    }

    unsafe extern "C++" {
        // One or more headers with the matching C++ declarations. Our code
        // generators don't read it but it gets #include'd and used in static
        // assertions to ensure our picture of the FFI boundary is accurate.
        include!("demo/include/blobstore.h");

        // Zero or more opaque types which both languages can pass around but
        // only C++ can see the fields.
        type BlobstoreClient;

        // Functions implemented in C++.
        fn new_blobstore_client() -> UniquePtr<BlobstoreClient>;
        fn put(&self, parts: &mut MultiBuf) -> u64;
        fn tag(&self, blobid: u64, tag: &str);
        fn metadata(&self, blobid: u64) -> BlobMetadata;
    }
}
```

Now we simply provide Rust definitions of all the things in the `extern "Rust"`
block and C++ definitions of all the things in the `extern "C++"` block, and get
to call back and forth safely.
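
As a rough illustration, the Rust half of that contract for the bridge above
might look like the following. This is a minimal sketch: the `chunks` and `pos`
fields and the `new` constructor are assumptions made for illustration, since
the bridge only fixes the type name `MultiBuf` and the signature of
`next_chunk`. The demo sources linked below remain the authoritative version.

```rust
// Opaque to C++: only Rust can see these fields.
// The field names are illustrative assumptions, not dictated by the bridge.
pub struct MultiBuf {
    chunks: Vec<Vec<u8>>,
    pos: usize,
}

impl MultiBuf {
    // Hypothetical convenience constructor, not part of the bridge.
    pub fn new(chunks: Vec<Vec<u8>>) -> Self {
        MultiBuf { chunks, pos: 0 }
    }
}

// Matches `fn next_chunk(buf: &mut MultiBuf) -> &[u8];` from the bridge.
// Returns the next chunk, or an empty slice once every chunk was handed out.
pub fn next_chunk(buf: &mut MultiBuf) -> &[u8] {
    let next = buf.chunks.get(buf.pos);
    buf.pos += 1;
    next.map_or(&[], Vec::as_slice)
}
```

Because these are ordinary Rust items with no FFI attributes, they can be
exercised from plain Rust code as well as through the bridge.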

Here are links to the complete set of source files involved in the demo:

- [demo/src/main.rs](demo/src/main.rs)
- [demo/build.rs](demo/build.rs)
- [demo/include/blobstore.h](demo/include/blobstore.h)
- [demo/src/blobstore.cc](demo/src/blobstore.cc)

To look at the code generated in both languages for the example by the CXX code
generators:

```console
  # run Rust code generator and print to stdout
  # (requires https://github.com/dtolnay/cargo-expand)
$ cargo expand --manifest-path demo/Cargo.toml

  # run C++ code generator and print to stdout
$ cargo run --manifest-path gen/cmd/Cargo.toml -- demo/src/main.rs
```

<br>

## Details

As seen in the example, the language of the FFI boundary involves 3 kinds of
items:

- **Shared structs** — their fields are made visible to both languages.
  The definition written within cxx::bridge is the single source of truth.

- **Opaque types** — their fields are secret from the other language.
  These cannot be passed across the FFI by value but only behind an indirection,
  such as a reference `&`, a Rust `Box`, or a `UniquePtr`. An opaque type can be
  a type alias for an arbitrarily complicated generic language-specific type
  depending on your use case.

- **Functions** — implemented in either language, callable from the other
  language.

Within the `extern "Rust"` part of the CXX bridge we list the types and
functions for which Rust is the source of truth. These all implicitly refer to
the `super` module, the parent module of the CXX bridge. You can think of the
two items listed in the example above as being like `use super::MultiBuf` and
`use super::next_chunk` except re-exported to C++. The parent module will either
contain the definitions directly for simple things, or contain the relevant
`use` statements to bring them into scope from elsewhere.

Within the `extern "C++"` part, we list types and functions for which C++ is the
source of truth, as well as the header(s) that declare those APIs. In the future
it's possible that this section could be generated bindgen-style from the
headers but for now we need the signatures written out; static assertions will
verify that they are accurate.

Your function implementations themselves, whether in C++ or Rust, *do not* need
to be defined as `extern "C"` ABI or no\_mangle. CXX will put in the right shims
where necessary to make it all work.
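
To make concrete what is being spared here, the following is a rough sketch of
the kind of C-ABI shim one would otherwise write by hand around an ordinary Rust
function. The names `checksum` and `demo_shim_checksum` are hypothetical, and
the code CXX actually generates looks different. The point is only that the
implementation stays a plain function while the ABI details live in a separate
wrapper.

```rust
// An ordinary Rust function: no `extern "C"`, no symbol-name attribute.
// (Hypothetical example function, not part of the CXX demo.)
pub fn checksum(bytes: &[u8]) -> u32 {
    bytes
        .iter()
        .fold(0u32, |acc, &b| acc.wrapping_add(u32::from(b)))
}

// Hand-written stand-in for a shim: it takes the C-style (pointer, length)
// pair apart and forwards to the idiomatic function above. A real shim would
// also pin a stable symbol name; that is omitted from this sketch.
//
// Safety: `ptr` must point to `len` readable bytes.
pub unsafe extern "C" fn demo_shim_checksum(ptr: *const u8, len: usize) -> u32 {
    let bytes = unsafe { std::slice::from_raw_parts(ptr, len) };
    checksum(bytes)
}
```

With a CXX bridge, only the first function is written by hand; the wrapper layer
is the generators' job.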

<br>

## Comparison vs bindgen and cbindgen

Notice that with CXX there is repetition of all the function signatures: they
are typed out once where the implementation is defined (in C++ or Rust) and
again inside the cxx::bridge module, though compile-time assertions guarantee
these are kept in sync. This is different from [bindgen] and [cbindgen] where
function signatures are typed by a human once and the tool consumes them in one
language and emits them in the other language.

[bindgen]: https://github.com/rust-lang/rust-bindgen
[cbindgen]: https://github.com/eqrion/cbindgen/

This is because CXX fills a somewhat different role. It is a lower level tool
than bindgen or cbindgen in a sense; you can think of it as being a replacement
for the concept of `extern "C"` signatures as we know them, rather than a
replacement for a bindgen. It would be reasonable to build a higher level
bindgen-like tool on top of CXX which consumes a C++ header and/or Rust module
(and/or IDL like Thrift) as source of truth and generates the cxx::bridge,
eliminating the repetition while leveraging the static analysis safety
guarantees of CXX.

But note in other ways CXX is higher level than the bindgens, with rich support
for common standard library types. Frequently with bindgen when we are dealing
with an idiomatic C++ API we would end up manually wrapping that API in C-style
raw pointer functions, applying bindgen to get unsafe raw pointer Rust
functions, and replicating the API again to expose those idiomatically in Rust.
That's a much worse form of repetition because it is unsafe all the way through.

By using a CXX bridge as the shared understanding between the languages, rather
than `extern "C"` C-style signatures as the shared understanding, common FFI use
cases become expressible using 100% safe code.

It would also be reasonable to mix and match, using CXX bridge for the 95% of
your FFI that is straightforward and doing the remaining few oddball signatures
the old fashioned way with bindgen and cbindgen, if for some reason CXX's static
restrictions get in the way. Please file an issue if you end up taking this
approach so that we know what ways it would be worthwhile to make the tool more
expressive.

<br>

## Cargo-based setup

For builds that are orchestrated by Cargo, you will use a build script that runs
CXX's C++ code generator and compiles the resulting C++ code along with any
other C++ code for your crate.

The canonical build script is as follows. The indicated line returns a
[`cc::Build`] instance (from the usual widely used `cc` crate) on which you can
set up any additional source files and compiler flags as normal.

[`cc::Build`]: https://docs.rs/cc/1.0/cc/struct.Build.html

```toml
# Cargo.toml

[build-dependencies]
cxx-build = "1.0"
```

```rust
// build.rs

fn main() {
    cxx_build::bridge("src/main.rs")  // returns a cc::Build
        .file("src/demo.cc")
        .flag_if_supported("-std=c++11")
        .compile("cxxbridge-demo");

    println!("cargo:rerun-if-changed=src/main.rs");
    println!("cargo:rerun-if-changed=src/demo.cc");
    println!("cargo:rerun-if-changed=include/demo.h");
}
```

<br>

## Non-Cargo setup

For use in non-Cargo builds like Bazel or Buck, CXX provides an alternate way of
invoking the C++ code generator as a standalone command line tool. The tool is
packaged as the `cxxbridge-cmd` crate on crates.io or can be built from the
*gen/cmd* directory of this repo.

```bash
$ cargo install cxxbridge-cmd

$ cxxbridge src/main.rs --header > path/to/mybridge.h
$ cxxbridge src/main.rs > path/to/mybridge.cc
```

<br>

## Safety

Be aware that the design of this library is intentionally restrictive and
opinionated! It isn't a goal to be powerful enough to handle arbitrary
signatures in either language. Instead this project is about carving out a
reasonably expressive set of functionality about which we can make useful safety
guarantees today and maybe extend over time. You may find that it takes some
practice to use CXX bridge effectively as it won't work in all the ways that you
are used to.

Some of the considerations that go into ensuring safety are:

- By design, our paired code generators work together to control both sides of
  the FFI boundary. Ordinarily in Rust writing your own `extern "C"` blocks is
  unsafe because the Rust compiler has no way to know whether the signatures
  you've written actually match the signatures implemented in the other
  language. With CXX we achieve that visibility and know what's on the other
  side.

- Our static analysis detects and prevents passing types by value that shouldn't
  be passed by value from C++ to Rust, for example because they may contain
  internal pointers that would be screwed up by Rust's move behavior.

- To many people's surprise, it is possible to have a struct in Rust and a
  struct in C++ with exactly the same layout / fields / alignment / everything,
  and still not the same ABI when passed by value. This is a longstanding
  bindgen bug that leads to segfaults in absolutely correct-looking code
  ([rust-lang/rust-bindgen#778]). CXX knows about this and can insert the
  necessary zero-cost workaround transparently where needed, so go ahead and
  pass your structs by value without worries. This is made possible by owning
  both sides of the boundary rather than just one.

- Template instantiations: for example in order to expose a UniquePtr\<T\> type
  in Rust backed by a real C++ unique\_ptr, we have a way of using a Rust trait
  to connect the behavior back to the template instantiations performed by the
  other language.

[rust-lang/rust-bindgen#778]: https://github.com/rust-lang/rust-bindgen/issues/778
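
The last point can be pictured with a small, self-contained sketch. It is not
CXX's actual trait or generated code: the names `Target`, `OwnedPtr` and `Blob`
are hypothetical, and `Box` stands in for the C++ allocation and deleter that a
real instantiation would call into. It only shows the shape of the idea, where
a generic Rust smart pointer reaches per-type behavior through a trait, with one
impl per instantiation.

```rust
/// Hypothetical trait: one impl per element type, naming the per-type
/// functions that a C++ template instantiation would otherwise provide.
pub trait Target: Sized {
    /// Stand-in for allocating the object on the other side.
    fn alloc(value: Self) -> *mut Self;

    /// Stand-in for the deleter.
    ///
    /// Safety: `ptr` must come from `alloc` and must not be freed twice.
    unsafe fn free(ptr: *mut Self);
}

/// Generic owning pointer whose behavior is routed through `Target`.
pub struct OwnedPtr<T: Target> {
    raw: *mut T,
}

impl<T: Target> OwnedPtr<T> {
    pub fn new(value: T) -> Self {
        OwnedPtr { raw: T::alloc(value) }
    }

    pub fn get(&self) -> &T {
        // SAFETY: `raw` came from `T::alloc` and is freed only in `drop`.
        unsafe { &*self.raw }
    }
}

impl<T: Target> Drop for OwnedPtr<T> {
    fn drop(&mut self) {
        // SAFETY: `raw` came from `T::alloc` and is freed exactly once, here.
        unsafe { T::free(self.raw) }
    }
}

/// Example element type; in this sketch `Box` plays the role of C++.
pub struct Blob(pub u64);

impl Target for Blob {
    fn alloc(value: Self) -> *mut Self {
        Box::into_raw(Box::new(value))
    }

    unsafe fn free(ptr: *mut Self) {
        drop(unsafe { Box::from_raw(ptr) });
    }
}
```

For instance, `OwnedPtr::new(Blob(7)).get().0` evaluates to `7`, and the
allocation is released through `Blob`'s own `free` when the pointer is dropped.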

<br>

## Builtin types

In addition to all the primitive types (i32 &lt;=&gt; int32_t), the following
common types may be used in the fields of shared structs and the arguments and
returns of functions.

<table>
<tr><th>name in Rust</th><th>name in C++</th><th>restrictions</th></tr>
<tr><td>String</td><td>rust::String</td><td></td></tr>
<tr><td>&amp;str</td><td>rust::Str</td><td></td></tr>
<tr><td>&amp;[T]</td><td>rust::Slice&lt;const T&gt;</td><td><sup><i>cannot hold opaque C++ type</i></sup></td></tr>
<tr><td>&amp;mut [T]</td><td>rust::Slice&lt;T&gt;</td><td><sup><i>cannot hold opaque C++ type</i></sup></td></tr>
<tr><td><a href="https://docs.rs/cxx/1.0/cxx/struct.CxxString.html">CxxString</a></td><td>std::string</td><td><sup><i>cannot be passed by value</i></sup></td></tr>
<tr><td>Box&lt;T&gt;</td><td>rust::Box&lt;T&gt;</td><td><sup><i>cannot hold opaque C++ type</i></sup></td></tr>
<tr><td><a href="https://docs.rs/cxx/1.0/cxx/struct.UniquePtr.html">UniquePtr&lt;T&gt;</a></td><td>std::unique_ptr&lt;T&gt;</td><td><sup><i>cannot hold opaque Rust type</i></sup></td></tr>
<tr><td><a href="https://docs.rs/cxx/1.0/cxx/struct.SharedPtr.html">SharedPtr&lt;T&gt;</a></td><td>std::shared_ptr&lt;T&gt;</td><td><sup><i>cannot hold opaque Rust type</i></sup></td></tr>
<tr><td>[T; N]</td><td>std::array&lt;T, N&gt;</td><td><sup><i>cannot hold opaque C++ type</i></sup></td></tr>
<tr><td>Vec&lt;T&gt;</td><td>rust::Vec&lt;T&gt;</td><td><sup><i>cannot hold opaque C++ type</i></sup></td></tr>
<tr><td><a href="https://docs.rs/cxx/1.0/cxx/struct.CxxVector.html">CxxVector&lt;T&gt;</a></td><td>std::vector&lt;T&gt;</td><td><sup><i>cannot be passed by value, cannot hold opaque Rust type</i></sup></td></tr>
<tr><td>*mut T, *const T</td><td>T*, const T*</td><td><sup><i>fn with a raw pointer argument must be declared unsafe to call</i></sup></td></tr>
<tr><td>fn(T, U) -&gt; V</td><td>rust::Fn&lt;V(T, U)&gt;</td><td><sup><i>only passing from Rust to C++ is implemented so far</i></sup></td></tr>
<tr><td>Result&lt;T&gt;</td><td>throw/catch</td><td><sup><i>allowed as return type only</i></sup></td></tr>
</table>

The C++ API of the `rust` namespace is defined by the *include/cxx.h* file in
this repo. You will need to include this header in your C++ code when working
with those types.

The following types are intended to be supported "soon" but are just not
implemented yet. I don't expect any of these to be hard to make work but it's a
matter of designing a nice API for each in its non-native language.

<table>
<tr><th>name in Rust</th><th>name in C++</th></tr>
<tr><td>BTreeMap&lt;K, V&gt;</td><td><sup><i>tbd</i></sup></td></tr>
<tr><td>HashMap&lt;K, V&gt;</td><td><sup><i>tbd</i></sup></td></tr>
<tr><td>Arc&lt;T&gt;</td><td><sup><i>tbd</i></sup></td></tr>
<tr><td>Option&lt;T&gt;</td><td><sup><i>tbd</i></sup></td></tr>
<tr><td><sup><i>tbd</i></sup></td><td>std::map&lt;K, V&gt;</td></tr>
<tr><td><sup><i>tbd</i></sup></td><td>std::unordered_map&lt;K, V&gt;</td></tr>
</table>

<br>

358
359## Remaining work
360
361This is still early days for CXX; I am releasing it as a minimum viable product
362to collect feedback on the direction and invite collaborators. Please check the
363open issues.
364
365Especially please report issues if you run into trouble building or linking any
366of this stuff. I'm sure there are ways to make the build aspects friendlier or
367more robust.
368
369Finally, I know more about Rust library design than C++ library design so I
370would appreciate help making the C++ APIs in this project more idiomatic where
371anyone has suggestions.
372
373<br>
374
375#### License
376
377<sup>
378Licensed under either of <a href="LICENSE-APACHE">Apache License, Version
3792.0</a> or <a href="LICENSE-MIT">MIT license</a> at your option.
380</sup>
381
382<br>
383
384<sub>
385Unless you explicitly state otherwise, any contribution intentionally submitted
386for inclusion in this project by you, as defined in the Apache-2.0 license,
387shall be dual licensed as above, without any additional terms or conditions.
388</sub>
389