{{#title Tutorial — Rust ♡ C++}}
# Tutorial: CXX blobstore client

This example walks through a Rust application that calls into a C++ client of a
blobstore service. In fact we'll see calls going in both directions: Rust to C++
as well as C++ to Rust. For your own use case it may be that you need just one
of these directions.

All of the code involved in the example is shown on this page, but it's also
provided in runnable form in the *demo* directory of the CXX repository. To try
it out directly, run `cargo run` from that directory.

This tutorial assumes you've read briefly about **shared structs**, **opaque
types**, and **functions** in the [*Core concepts*](concepts.md) page.

## Creating the project

We'll use Cargo, which is the build system commonly used by open source Rust
projects. (CXX works with other build systems too; refer to chapter 5.)

Create a blank Cargo project: `mkdir cxx-demo`; `cd cxx-demo`; `cargo init`.

Edit the Cargo.toml to add a dependency on the `cxx` crate:

```toml,hidelines
## Cargo.toml
# [package]
# name = "cxx-demo"
# version = "0.1.0"
# edition = "2018"

[dependencies]
cxx = "1.0"
```

We'll revisit this Cargo.toml later when we get to compiling some C++ code.

## Defining the language boundary

CXX relies on a description of the function signatures that will be exposed from
each language to the other. You provide this description using `extern` blocks
in a Rust module annotated with the `#[cxx::bridge]` attribute macro.

We'll open with just the following at the top of src/main.rs and walk through
each item in detail.

```rust,noplayground
// src/main.rs

#[cxx::bridge]
mod ffi {

}
#
# fn main() {}
```

The contents of this module will be everything that needs to be agreed upon by
both sides of the FFI boundary.

## Calling a C++ function from Rust

Let's obtain an instance of the C++ blobstore client, a class `BlobstoreClient`
defined in C++.

We'll treat `BlobstoreClient` as an *opaque type* in CXX's classification so
that Rust does not need to assume anything about its implementation, not even
its size or alignment. In general, a C++ type might have a move-constructor
which is incompatible with Rust's move semantics, or may hold internal
references which cannot be modeled by Rust's borrowing system. Though there are
alternatives, the easiest way to not care about any such thing on an FFI
boundary is to require no knowledge about a type by treating it as opaque.

Opaque types may only be manipulated behind an indirection such as a reference
`&`, a Rust `Box`, or a `UniquePtr` (Rust binding of `std::unique_ptr`). We'll
add a function through which C++ can return a
`std::unique_ptr<BlobstoreClient>` to Rust.

```rust,noplayground
// src/main.rs

#[cxx::bridge]
mod ffi {
    unsafe extern "C++" {
        include!("cxx-demo/include/blobstore.h");

        type BlobstoreClient;

        fn new_blobstore_client() -> UniquePtr<BlobstoreClient>;
    }
}

fn main() {
    let client = ffi::new_blobstore_client();
}
```

The nature of `unsafe` extern blocks is clarified in more detail in the
[*extern "C++"*](extern-c++.md) chapter. In brief: the programmer is **not**
promising that the signatures they have typed in are accurate; that would be
unreasonable. CXX performs static assertions that the signatures exactly match
what is declared in C++. Rather, the programmer is only on the hook for things
that C++'s semantics are not precise enough to capture, i.e. things that would
only be represented at most by comments in the C++ code. In this case, it's
whether `new_blobstore_client` is safe or unsafe to call. If that function said
something like "must be called at most once or we'll stomp yer memery", Rust
would instead want to expose it as `unsafe fn new_blobstore_client`, this time
inside a safe `extern "C++"` block because the programmer is no longer on the
hook for any safety claim about the signature.
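
To see what that choice means for callers, here is a minimal plain-Rust sketch with no C++ involved. `Client`, `new_client`, and `new_client_once` are hypothetical stand-ins invented for illustration, and the "at most once" rule is the made-up contract from the paragraph above; none of this is CXX API.

```rust
/// Hypothetical stand-in for an opaque client type; not part of the demo.
pub struct Client {
    pub label: &'static str,
}

/// Safe to call: there is no precondition a caller could violate.
pub fn new_client() -> Client {
    Client { label: "safe" }
}

/// # Safety
///
/// Hypothetical contract: must be called at most once per process.
/// The compiler cannot check this, so each caller has to vouch for it
/// by writing an `unsafe` block.
pub unsafe fn new_client_once() -> Client {
    Client { label: "call-once" }
}

fn main() {
    // No ceremony needed for the safe constructor.
    let a = new_client();

    // SAFETY: this is the only call to `new_client_once` in the program.
    let b = unsafe { new_client_once() };

    println!("{} {}", a.label, b.label);
}
```

In a CXX bridge the same split shows up as the difference between a plain `fn` inside an `unsafe extern "C++"` block, where the author of the block vouches once for every caller, and an `unsafe fn` inside a safe `extern "C++"` block, where each call site vouches for itself.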

If you build this file right now with `cargo build`, it won't build because we
haven't written a C++ implementation of `new_blobstore_client` nor instructed
Cargo about how to link it into the resulting binary. You'll see an error from
the linker like this:

```console
error: linking with `cc` failed: exit code: 1
  |
  = /bin/ld: target/debug/deps/cxx-demo-7cb7fddf3d67d880.rcgu.o: in function `cxx_demo::ffi::new_blobstore_client':
    src/main.rs:1: undefined reference to `cxxbridge1$new_blobstore_client'
    collect2: error: ld returned 1 exit status
```

## Adding in the C++ code

In CXX's integration with Cargo, all #include paths begin with a crate name by
default (when not explicitly selected otherwise by a crate; see
`CFG.include_prefix` in chapter 5). That's why we see
`include!("cxx-demo/include/blobstore.h")` above: we'll be putting the C++
header at relative path `include/blobstore.h` within the Rust crate. If your
crate is named something other than `cxx-demo` according to the `name` field in
Cargo.toml, you will need to use that name everywhere in place of `cxx-demo`
throughout this tutorial.

```cpp
// include/blobstore.h

#pragma once
#include <memory>

class BlobstoreClient {
public:
  BlobstoreClient();
};

std::unique_ptr<BlobstoreClient> new_blobstore_client();
```

```cpp
// src/blobstore.cc

#include "cxx-demo/include/blobstore.h"

BlobstoreClient::BlobstoreClient() {}

std::unique_ptr<BlobstoreClient> new_blobstore_client() {
  return std::unique_ptr<BlobstoreClient>(new BlobstoreClient());
}
```

Using `std::make_unique` would work too, as long as you pass `-std=c++14` to the
C++ compiler as described later on.

The placement in *include/* and *src/* is not significant; you can place C++
code anywhere else in the crate as long as you use the right paths throughout
the tutorial.

Be aware that *CXX does not look at any of these files.* You're free to put
arbitrary C++ code in here, #include your own libraries, etc. All we do is emit
static assertions against what you provide in the headers.

## Compiling the C++ code with Cargo

Cargo has a [build scripts] feature suitable for compiling non-Rust code.

We need to introduce a new build-time dependency on CXX's C++ code generator in
Cargo.toml:

```toml,hidelines
## Cargo.toml
# [package]
# name = "cxx-demo"
# version = "0.1.0"
# edition = "2018"

[dependencies]
cxx = "1.0"

[build-dependencies]
cxx-build = "1.0"
```

Then add a build.rs build script adjacent to Cargo.toml to run the cxx-build
code generator and C++ compiler. The relevant arguments are the path to the Rust
source file containing the cxx::bridge language boundary definition, and the
paths to any additional C++ source files to be compiled during the Rust crate's
build.

```rust,noplayground
// build.rs

fn main() {
    cxx_build::bridge("src/main.rs")
        .file("src/blobstore.cc")
        .compile("cxx-demo");
}
```

This build.rs would also be where you set up C++ compiler flags, for example if
you'd like to have access to `std::make_unique` from C++14. See the page on
***[Cargo-based builds](build/cargo.md)*** for more details about CXX's Cargo
integration.

```rust,noplayground
# // build.rs
#
# fn main() {
    cxx_build::bridge("src/main.rs")
        .file("src/blobstore.cc")
        .flag_if_supported("-std=c++14")
        .compile("cxx-demo");
# }
```

[build scripts]: https://doc.rust-lang.org/cargo/reference/build-scripts.html

The project should now build and run successfully, though not do anything useful
yet.

```console
cxx-demo$  cargo run
  Compiling cxx-demo v0.1.0
  Finished dev [unoptimized + debuginfo] target(s) in 0.34s
  Running `target/debug/cxx-demo`

cxx-demo$
```

## Calling a Rust function from C++

Our C++ blobstore supports a `put` operation for a discontiguous buffer upload.
For example we might be uploading snapshots of a circular buffer which would
tend to consist of 2 pieces, or fragments of a file spread across memory for
some other reason (like a rope data structure).

We'll express this by handing off an iterator over contiguous borrowed chunks.
This loosely resembles the API of the widely used `bytes` crate's `Buf` trait.
During a `put`, we'll make C++ call back into Rust to obtain contiguous chunks
of the upload (all with no copying or allocation on the language boundary). In
reality the C++ client might contain some sophisticated batching of chunks
and/or parallel uploading that all of this ties into.

```rust,noplayground
// src/main.rs

#[cxx::bridge]
mod ffi {
    extern "Rust" {
        type MultiBuf;

        fn next_chunk(buf: &mut MultiBuf) -> &[u8];
    }

    unsafe extern "C++" {
        include!("cxx-demo/include/blobstore.h");

        type BlobstoreClient;

        fn new_blobstore_client() -> UniquePtr<BlobstoreClient>;
        fn put(&self, parts: &mut MultiBuf) -> u64;
    }
}
#
# fn main() {
#     let client = ffi::new_blobstore_client();
# }
```

Any signature having a `self` parameter (the Rust name for C++'s `this`) is
considered a method / non-static member function. If there is only one `type` in
the surrounding extern block, it'll be a method of that type. If there is more
than one `type`, you can disambiguate which one a method belongs to by writing
`self: &BlobstoreClient` in the argument list.

As usual, now we need to provide Rust definitions of everything declared by the
`extern "Rust"` block and a C++ definition of the new signature declared by the
`extern "C++"` block.

```rust,noplayground
// src/main.rs
#
# #[cxx::bridge]
# mod ffi {
#     extern "Rust" {
#         type MultiBuf;
#
#         fn next_chunk(buf: &mut MultiBuf) -> &[u8];
#     }
#
#     unsafe extern "C++" {
#         include!("cxx-demo/include/blobstore.h");
#
#         type BlobstoreClient;
#
#         fn new_blobstore_client() -> UniquePtr<BlobstoreClient>;
#         fn put(&self, parts: &mut MultiBuf) -> u64;
#     }
# }

// An iterator over contiguous chunks of a discontiguous file object. Toy
// implementation uses a Vec<Vec<u8>> but in reality this might be iterating
// over some more complex Rust data structure like a rope, or maybe loading
// chunks lazily from somewhere.
pub struct MultiBuf {
    chunks: Vec<Vec<u8>>,
    pos: usize,
}

pub fn next_chunk(buf: &mut MultiBuf) -> &[u8] {
    let next = buf.chunks.get(buf.pos);
    buf.pos += 1;
    next.map_or(&[], Vec::as_slice)
}
#
# fn main() {
#     let client = ffi::new_blobstore_client();
# }
```

```cpp,hidelines
// include/blobstore.h

# #pragma once
# #include <memory>
#
struct MultiBuf;

class BlobstoreClient {
public:
  BlobstoreClient();
  uint64_t put(MultiBuf &buf) const;
};
#
#std::unique_ptr<BlobstoreClient> new_blobstore_client();
```

In blobstore.cc we're able to call the Rust `next_chunk` function, exposed to
C++ by a header `main.rs.h` generated by the CXX code generator. In CXX's Cargo
integration this generated header has a path containing the crate name, the
relative path of the Rust source file within the crate, and a `.rs.h` extension.

```cpp,hidelines
// src/blobstore.cc

##include "cxx-demo/include/blobstore.h"
##include "cxx-demo/src/main.rs.h"
##include <functional>
##include <string>
#
# BlobstoreClient::BlobstoreClient() {}
#
# std::unique_ptr<BlobstoreClient> new_blobstore_client() {
#   return std::make_unique<BlobstoreClient>();
# }

// Upload a new blob and return a blobid that serves as a handle to the blob.
uint64_t BlobstoreClient::put(MultiBuf &buf) const {
  // Traverse the caller's chunk iterator.
  std::string contents;
  while (true) {
    auto chunk = next_chunk(buf);
    if (chunk.size() == 0) {
      break;
    }
    contents.append(reinterpret_cast<const char *>(chunk.data()), chunk.size());
  }

  // Pretend we did something useful to persist the data.
  auto blobid = std::hash<std::string>{}(contents);
  return blobid;
}
```

This is now ready to use. :)
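
Before running it, the chunk protocol can be exercised in plain Rust. The sketch below repeats the `MultiBuf` and `next_chunk` definitions from above and adds a hypothetical helper, `drain_chunks`, that mirrors the loop in `BlobstoreClient::put`: keep calling `next_chunk` until it returns an empty slice. The helper exists only for illustration and is not part of the demo.

```rust
// Same toy definitions as in src/main.rs above.
pub struct MultiBuf {
    chunks: Vec<Vec<u8>>,
    pos: usize,
}

pub fn next_chunk(buf: &mut MultiBuf) -> &[u8] {
    let next = buf.chunks.get(buf.pos);
    buf.pos += 1;
    next.map_or(&[], Vec::as_slice)
}

// Hypothetical helper mirroring the C++ loop in `BlobstoreClient::put`:
// concatenate chunks until `next_chunk` signals the end with an empty slice.
pub fn drain_chunks(buf: &mut MultiBuf) -> Vec<u8> {
    let mut contents = Vec::new();
    loop {
        let chunk = next_chunk(buf);
        if chunk.is_empty() {
            break;
        }
        contents.extend_from_slice(chunk);
    }
    contents
}

fn main() {
    let chunks = vec![b"fearless".to_vec(), b"concurrency".to_vec()];
    let mut buf = MultiBuf { chunks, pos: 0 };
    let contents = drain_chunks(&mut buf);
    // prints "fearlessconcurrency"
    println!("{}", String::from_utf8_lossy(&contents));
}
```

One consequence of using an empty slice as the end marker: an empty `Vec` in the middle of `chunks` would end the traversal early, so a producer following this protocol should skip empty chunks.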

```rust,noplayground
// src/main.rs
#
# #[cxx::bridge]
# mod ffi {
#     extern "Rust" {
#         type MultiBuf;
#
#         fn next_chunk(buf: &mut MultiBuf) -> &[u8];
#     }
#
#     unsafe extern "C++" {
#         include!("cxx-demo/include/blobstore.h");
#
#         type BlobstoreClient;
#
#         fn new_blobstore_client() -> UniquePtr<BlobstoreClient>;
#         fn put(&self, parts: &mut MultiBuf) -> u64;
#     }
# }
#
# pub struct MultiBuf {
#     chunks: Vec<Vec<u8>>,
#     pos: usize,
# }
# pub fn next_chunk(buf: &mut MultiBuf) -> &[u8] {
#     let next = buf.chunks.get(buf.pos);
#     buf.pos += 1;
#     next.map_or(&[], Vec::as_slice)
# }

fn main() {
    let client = ffi::new_blobstore_client();

    // Upload a blob.
    let chunks = vec![b"fearless".to_vec(), b"concurrency".to_vec()];
    let mut buf = MultiBuf { chunks, pos: 0 };
    let blobid = client.put(&mut buf);
    println!("blobid = {}", blobid);
}
```

```console
cxx-demo$  cargo run
  Compiling cxx-demo v0.1.0
  Finished dev [unoptimized + debuginfo] target(s) in 0.41s
  Running `target/debug/cxx-demo`

blobid = 9851996977040795552
```

## Interlude: What gets generated?

For the curious, it's easy to look behind the scenes at what CXX has done to
make these function calls work. You shouldn't need to do this during normal
usage of CXX, but for the purpose of this tutorial it can be educative.

CXX comprises *two* code generators: a Rust one (which is the cxx::bridge
attribute procedural macro) and a C++ one.

### Rust generated code

It's easiest to view the output of the procedural macro by installing
[cargo-expand]. Then run `cargo expand ::ffi` to macro-expand the `mod ffi`
module.

[cargo-expand]: https://github.com/dtolnay/cargo-expand

```console
cxx-demo$  cargo install cargo-expand
cxx-demo$  cargo expand ::ffi
```

You'll see some deeply unpleasant code involving `#[repr(C)]`, `#[link_name]`,
and `#[export_name]`.

### C++ generated code

For debugging convenience, `cxx_build` links all generated C++ code into Cargo's
target directory under *target/cxxbridge/*.

```console
cxx-demo$  exa -T target/cxxbridge/
target/cxxbridge
├── cxx-demo
│  └── src
│     ├── main.rs.cc -> ../../../debug/build/cxx-demo-11c6f678ce5c3437/out/cxxbridge/sources/cxx-demo/src/main.rs.cc
│     └── main.rs.h -> ../../../debug/build/cxx-demo-11c6f678ce5c3437/out/cxxbridge/include/cxx-demo/src/main.rs.h
└── rust
   └── cxx.h -> ~/.cargo/registry/src/github.com-1ecc6299db9ec823/cxx-1.0.0/include/cxx.h
```

In those files you'll see declarations or templates of any CXX Rust types
present in your language boundary (like `rust::Slice<T>` for `&[T]`) and
`extern "C"` signatures corresponding to your extern functions.

If it fits your workflow better, the CXX C++ code generator is also available as
a standalone executable which outputs generated code to stdout.

```console
cxx-demo$  cargo install cxxbridge-cmd
cxx-demo$  cxxbridge src/main.rs
```

## Shared data structures

So far the calls in both directions above only used **opaque types**, not
**shared structs**.

Shared structs are data structures whose complete definition is visible to both
languages, making it possible to pass them by value across the language
boundary. Shared structs translate to a C++ aggregate-initialization compatible
struct exactly matching the layout of the Rust one.

As the last step of this demo, we'll use a shared struct `BlobMetadata` to pass
metadata about blobs between our Rust application and C++ blobstore client.

```rust,noplayground
// src/main.rs

#[cxx::bridge]
mod ffi {
    struct BlobMetadata {
        size: usize,
        tags: Vec<String>,
    }

    extern "Rust" {
        // ...
#         type MultiBuf;
#
#         fn next_chunk(buf: &mut MultiBuf) -> &[u8];
    }

    unsafe extern "C++" {
        // ...
#         include!("cxx-demo/include/blobstore.h");
#
#         type BlobstoreClient;
#
#         fn new_blobstore_client() -> UniquePtr<BlobstoreClient>;
#         fn put(&self, parts: &mut MultiBuf) -> u64;
        fn tag(&self, blobid: u64, tag: &str);
        fn metadata(&self, blobid: u64) -> BlobMetadata;
    }
}
#
# pub struct MultiBuf {
#     chunks: Vec<Vec<u8>>,
#     pos: usize,
# }
# pub fn next_chunk(buf: &mut MultiBuf) -> &[u8] {
#     let next = buf.chunks.get(buf.pos);
#     buf.pos += 1;
#     next.map_or(&[], Vec::as_slice)
# }

fn main() {
    let client = ffi::new_blobstore_client();

    // Upload a blob.
    let chunks = vec![b"fearless".to_vec(), b"concurrency".to_vec()];
    let mut buf = MultiBuf { chunks, pos: 0 };
    let blobid = client.put(&mut buf);
    println!("blobid = {}", blobid);

    // Add a tag.
    client.tag(blobid, "rust");

    // Read back the tags.
    let metadata = client.metadata(blobid);
    println!("tags = {:?}", metadata.tags);
}
```

```cpp,hidelines
// include/blobstore.h

##pragma once
##include "rust/cxx.h"
# #include <memory>

struct MultiBuf;
struct BlobMetadata;

class BlobstoreClient {
public:
  BlobstoreClient();
  uint64_t put(MultiBuf &buf) const;
  void tag(uint64_t blobid, rust::Str tag) const;
  BlobMetadata metadata(uint64_t blobid) const;

private:
  class impl;
  std::shared_ptr<impl> impl;
};
#
# std::unique_ptr<BlobstoreClient> new_blobstore_client();
```

```cpp,hidelines
// src/blobstore.cc

##include "cxx-demo/include/blobstore.h"
##include "cxx-demo/src/main.rs.h"
##include <algorithm>
##include <functional>
##include <set>
##include <string>
##include <unordered_map>

// Toy implementation of an in-memory blobstore.
//
// In reality the implementation of BlobstoreClient could be a large
// complex C++ library.
class BlobstoreClient::impl {
  friend BlobstoreClient;
  using Blob = struct {
    std::string data;
    std::set<std::string> tags;
  };
  std::unordered_map<uint64_t, Blob> blobs;
};

BlobstoreClient::BlobstoreClient() : impl(new class BlobstoreClient::impl) {}
#
# // Upload a new blob and return a blobid that serves as a handle to the blob.
# uint64_t BlobstoreClient::put(MultiBuf &buf) const {
#   // Traverse the caller's chunk iterator.
#   std::string contents;
#   while (true) {
#     auto chunk = next_chunk(buf);
#     if (chunk.size() == 0) {
#       break;
#     }
#     contents.append(reinterpret_cast<const char *>(chunk.data()), chunk.size());
#   }
#
#   // Insert into map and provide caller the handle.
#   auto blobid = std::hash<std::string>{}(contents);
#   impl->blobs[blobid] = {std::move(contents), {}};
#   return blobid;
# }

// Add tag to an existing blob.
void BlobstoreClient::tag(uint64_t blobid, rust::Str tag) const {
  impl->blobs[blobid].tags.emplace(tag);
}

// Retrieve metadata about a blob.
BlobMetadata BlobstoreClient::metadata(uint64_t blobid) const {
  BlobMetadata metadata{};
  auto blob = impl->blobs.find(blobid);
  if (blob != impl->blobs.end()) {
    metadata.size = blob->second.data.size();
    std::for_each(blob->second.tags.cbegin(), blob->second.tags.cend(),
                  [&](auto &t) { metadata.tags.emplace_back(t); });
  }
  return metadata;
}
#
# std::unique_ptr<BlobstoreClient> new_blobstore_client() {
#   return std::make_unique<BlobstoreClient>();
# }
```

```console
cxx-demo$  cargo run
  Running `target/debug/cxx-demo`

blobid = 9851996977040795552
tags = ["rust"]
```

*You've now seen all the code involved in the tutorial. It's available all
together in runnable form in the* demo *directory of the CXX repository. You can
run it directly without stepping through the steps above by running `cargo run`
from that directory.*

# Takeaways

The key contribution of CXX is it gives you Rust–C++ interop in which *all* of
the Rust side of the code you write *really* looks like you are just writing
normal Rust, and the C++ side *really* looks like you are just writing normal
C++.

You've seen in this tutorial that none of the code involved feels like C or like
the usual perilous "FFI glue" prone to leaks or memory safety flaws.

An expressive system of opaque types, shared types, and key standard library
type bindings enables API design on the language boundary that captures the
proper ownership and borrowing contracts of the interface.

CXX plays to the strengths of the Rust type system *and* C++ type system *and*
the programmer's intuitions. An individual working on the C++ side without a
Rust background, or the Rust side without a C++ background, will be able to
apply all their usual intuitions and best practices about development in their
language to maintain a correct FFI.