# Using C structures with Flexible Array Members

Since time immemorial, C programmers have been using what was called "the struct
hack". This is a technique for packing a fixed-size structure and a
variable-sized tail within the same memory allocation. Typically this looks
like:

```c
struct MyRecord {
    time_t timestamp;
    unsigned seq;
    size_t len;
    char payload[0];
};
```

Because this is so useful, it was standardized in C99 as "flexible array
members", using almost identical syntax:
```c
struct MyRecord {
    time_t timestamp;
    unsigned seq;
    size_t len;
    char payload[]; // NOTE: empty []
};
```

Bindgen supports these structures in two different ways.

## `__IncompleteArrayField`

By default, bindgen will generate the corresponding Rust structure:
```rust,ignore
#[repr(C)]
struct MyRecord {
    pub timestamp: time_t,
    pub seq: ::std::os::raw::c_uint,
    pub len: usize,
    pub payload: __IncompleteArrayField<::std::os::raw::c_char>,
}
```

The `__IncompleteArrayField` type is zero-sized, so this structure represents
the prefix without any trailing data. In order to access that data, it provides
the `as_slice` unsafe method:
```rust,ignore
    // SAFETY: there are at least `len` bytes allocated and initialized after `myrecord`
    let payload = unsafe { myrecord.payload.as_slice(myrecord.len) };
```
There's also `as_mut_slice`, which returns a mutable slice in the same way.

These are `unsafe` simply because it's up to you to provide the right length (in
elements of whatever type `payload` is) as there's no way for Rust or Bindgen to
know. In this example, the length is a very straightforward `len` field in the
structure, but it could be encoded in any number of ways within the structure,
or come from somewhere else entirely.

One big caveat with this technique is that `std::mem::size_of` (or
`size_of_val`) will *only* include the size of the prefix structure. If you're
working out how much storage the whole structure is using, you'll need to add
the size of the suffix yourself.
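
The size caveat can be checked with a small stand-alone sketch. The types below are hand-written approximations of what bindgen emits, not its actual output: the `timestamp` field is dropped, the payload is `u8` rather than `c_char`, and `total_size` and `round_trip` are hypothetical helpers introduced only for illustration.

```rust
use std::marker::PhantomData;
use std::mem;
use std::ptr;

// Simplified, hand-written stand-in for the zero-sized helper type.
#[allow(non_camel_case_types)]
#[repr(C)]
pub struct __IncompleteArrayField<T>(PhantomData<T>, [T; 0]);

// Same shape as the generated struct; `u8` payload and the missing
// `timestamp` field are simplifications made for this sketch.
#[repr(C)]
pub struct MyRecord {
    pub seq: u32,
    pub len: usize,
    pub payload: __IncompleteArrayField<u8>,
}

/// Bytes needed for the prefix plus `len` payload elements.
/// `size_of::<MyRecord>()` alone only covers the prefix.
pub fn total_size(len: usize) -> usize {
    mem::size_of::<MyRecord>() + len * mem::size_of::<u8>()
}

/// Builds a record in a word-aligned buffer and reads the payload back.
pub fn round_trip(data: &[u8]) -> Vec<u8> {
    let total = total_size(data.len());
    // Backing the record with u64 words keeps its start aligned for `MyRecord`.
    assert!(mem::align_of::<MyRecord>() <= mem::align_of::<u64>());
    let mut buf = vec![0u64; (total + 7) / 8];
    let rec = buf.as_mut_ptr() as *mut MyRecord;
    // SAFETY: `buf` holds at least `total` zeroed bytes, is aligned for
    // `MyRecord`, and every access below stays inside it.
    unsafe {
        (*rec).seq = 1;
        (*rec).len = data.len();
        // The payload starts at the address of the zero-sized field.
        let tail = ptr::addr_of_mut!((*rec).payload) as *mut u8;
        ptr::copy_nonoverlapping(data.as_ptr(), tail, data.len());
        std::slice::from_raw_parts(tail as *const u8, (*rec).len).to_vec()
    }
}
```

With these definitions, `total_size(5)` is five bytes larger than `std::mem::size_of::<MyRecord>()`, which is the same `sizeof(struct) + len` arithmetic a C caller would do.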

## Using Dynamically Sized Types

If you invoke bindgen with the `--flexarray-dst` option, it will generate
something close to, but not quite like, this:

```rust,ignore
#[repr(C)]
struct MyRecord {
    pub timestamp: time_t,
    pub seq: ::std::os::raw::c_uint,
    pub len: usize,
    pub payload: [::std::os::raw::c_char],
}
```
Rust has a kind of type that is an almost exact analog of these Flexible Array
Member types: the Dynamically Sized Type ("DST").

This looks almost identical to a normal Rust structure, except that you'll note
the type of the `payload` field is a bare slice `[...]` rather than the usual
reference to a slice `&[...]`.

That `payload: [c_char]` is telling Rust that it can't directly know the total
size of this structure - the `payload` field takes an amount of space that's
determined at runtime. This means you can't directly use values of this type,
only references: `&MyRecord`.

In practice, this is very awkward. So instead, bindgen generates:
```rust,ignore
#[repr(C)]
struct MyRecord<FAM: ?Sized = [::std::os::raw::c_char; 0]> {
    pub timestamp: time_t,
    pub seq: ::std::os::raw::c_uint,
    pub len: usize,
    pub payload: FAM,
}
```

That is:
1. a type parameter `FAM` which represents the type of the `payload` field,
2. it's `?Sized`, meaning it can be unsized (i.e., a DST)
3. it has the default type of `[c_char; 0]` - that is, a zero-sized array of characters

This means that referencing plain `MyRecord` will be exactly like `MyRecord`
with `__IncompleteArrayField`: it is a fixed-sized structure which you can
manipulate like a normal Rust value.

But how do you get to the DST part?

Bindgen will also implement a set of helper methods for this:

```rust,ignore
// Static sized variant
impl MyRecord<[::std::os::raw::c_char; 0]> {
    pub unsafe fn flex_ref(&self, len: usize) -> &MyRecord<[::std::os::raw::c_char]> { ... }
    pub unsafe fn flex_mut_ref(&mut self, len: usize) -> &mut MyRecord<[::std::os::raw::c_char]> { ... }
    // And some raw pointer variants
}
```
These will take a sized `MyRecord<[c_char; 0]>` and a length in elements, and
return a reference to a DST `MyRecord<[c_char]>` where the `payload` field is a
fully usable slice of `len` characters.

The magic here is that the reference is a fat pointer, which not only encodes
the address, but also the dynamic size of the final field, just like a reference
to a slice is. This means that you get full bounds checked access to the
`payload` field like any other Rust slice.

It also means that doing `mem::size_of_val(myrecord)` will return the *complete*
size of this structure, including the suffix.
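
The fat-pointer behavior can be observed on stable Rust without the generated helpers. This is a minimal sketch under stated assumptions: `MyRecord` here is a hand-written struct with the same shape as the generated one (with a `u8` payload and no `timestamp`), and the DST reference is obtained through an ordinary unsizing coercion from a sized record rather than through bindgen's `flex_ref`. `full_size` and `demo` are hypothetical names. Note that the size reported for the DST includes any trailing padding, so it can be slightly larger than prefix plus payload.

```rust
use std::mem;

// Hand-written struct shaped like the `--flexarray-dst` output.
#[repr(C)]
pub struct MyRecord<FAM: ?Sized = [u8; 0]> {
    pub seq: u32,
    pub len: usize,
    pub payload: FAM,
}

/// Size in bytes of the whole record (prefix, tail and trailing padding),
/// computed from the length stored in the fat reference.
pub fn full_size(rec: &MyRecord<[u8]>) -> usize {
    mem::size_of_val(rec)
}

/// Returns (prefix size, full DST size, payload length).
pub fn demo() -> (usize, usize, usize) {
    // A sized record with a 4-element tail.
    let sized = MyRecord { seq: 7, len: 4, payload: [1u8, 2, 3, 4] };
    // Unsizing coercion: the reference now also carries the tail length.
    let dst: &MyRecord<[u8]> = &sized;
    // `dst.payload` is an ordinary bounds-checked slice.
    (mem::size_of::<MyRecord>(), full_size(dst), dst.payload.len())
}
```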

You can go the other way:
```rust,ignore
// Dynamic sized variant
impl MyRecord<[::std::os::raw::c_char]> {
    pub fn fixed(&self) -> (&MyRecord<[::std::os::raw::c_char; 0]>, usize) { ... }
    pub fn fixed_mut(&mut self) -> (&mut MyRecord<[::std::os::raw::c_char; 0]>, usize) { ... }
    pub fn layout(len: usize) -> std::alloc::Layout { ... }
}
```
`fixed` and `fixed_mut` take the DST variant of the structure and return the
sized variant, along with the number of elements that follow it. These are all
completely safe because all the information needed is part of the fat `&self`
reference.

The `layout` function takes a length and returns the `Layout` - that is, size
and alignment, so that you can allocate memory for the structure (for example,
using `malloc` so you can pass it to a C function).
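
To make the size-and-alignment idea concrete, here is a sketch of how such a layout can be computed with the stable `std::alloc::Layout` API. It is not bindgen's implementation: `record_layout` and `allocation_is_aligned` are hypothetical helpers, `MyRecord` is a hand-written struct with a `u8` payload and no `timestamp`, and the memory comes from Rust's global allocator rather than `malloc`.

```rust
use std::alloc::{alloc_zeroed, dealloc, handle_alloc_error, Layout, LayoutError};

// Hand-written struct shaped like the `--flexarray-dst` output.
#[repr(C)]
pub struct MyRecord<FAM: ?Sized = [u8; 0]> {
    pub seq: u32,
    pub len: usize,
    pub payload: FAM,
}

/// Layout of a record followed by `len` payload elements.
pub fn record_layout(len: usize) -> Result<Layout, LayoutError> {
    // Layout of the fixed-size prefix (the default `[u8; 0]` tail).
    let prefix = Layout::new::<MyRecord>();
    // Layout of the variable-sized tail.
    let tail = Layout::array::<u8>(len)?;
    // Place the tail after the prefix, then round up to the alignment.
    let (layout, _tail_offset) = prefix.extend(tail)?;
    Ok(layout.pad_to_align())
}

/// Allocates and frees one record, reporting whether it was aligned.
pub fn allocation_is_aligned(len: usize) -> bool {
    let layout = record_layout(len).expect("record size overflows");
    // SAFETY: the layout is never zero-sized because the prefix has fields,
    // and the pointer is freed with the same layout it was allocated with.
    unsafe {
        let ptr = alloc_zeroed(layout);
        if ptr.is_null() {
            handle_alloc_error(layout);
        }
        let aligned = (ptr as usize) % layout.align() == 0;
        dealloc(ptr, layout);
        aligned
    }
}
```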

Unfortunately the language features needed to support these methods are still unstable:
- [ptr_metadata](https://doc.rust-lang.org/beta/unstable-book/library-features/ptr-metadata.html),
  which enables all the fixed<->DST conversions, and
- [layout_for_ptr](https://doc.rust-lang.org/beta/unstable-book/library-features/layout-for-ptr.html),
  which allows the `layout` method

As a result, if you don't specify `--rust-target nightly` you'll just get the
bare type definitions, but no real way to use them. It's often convenient to add
the
```bash
--raw-line '#![feature(ptr_metadata,layout_for_ptr)]'
```
option if you're generating Rust as a stand-alone crate. Otherwise you'll need
to add the feature line to your containing crate.