# Binary Values

The library implements several [binary formats](binary_formats/index.md) that encode JSON in an efficient way. Most of these formats support binary values; that is, values whose semantics are defined outside the library and which the library only stores as a sequence of bytes.

JSON itself does not have a binary value. As such, binary values are an extension that this library implements to store values received by a binary format. Binary values are never created by the JSON parser, and are only part of a serialized JSON text if they have been created manually or via a binary format.

## API for binary values

```plantuml
class json::binary_t {
    -- setters --
    +void set_subtype(std::uint8_t subtype)
    +void clear_subtype()
    -- getters --
    +std::uint8_t subtype() const
    +bool has_subtype() const
}

"std::vector<uint8_t>" <|-- json::binary_t
```

By default, binary values are stored as `std::vector<std::uint8_t>`. This type can be changed by providing a template parameter to the `basic_json` type. To store binary subtypes, the storage type is extended and exposed as `json::binary_t`:

```cpp
auto binary = json::binary_t({0xCA, 0xFE, 0xBA, 0xBE});
auto binary_with_subtype = json::binary_t({0xCA, 0xFE, 0xBA, 0xBE}, 42);
```

There are several convenience functions to check and set the subtype:

```cpp
binary.has_subtype();                   // returns false
binary_with_subtype.has_subtype();      // returns true

binary_with_subtype.clear_subtype();
binary_with_subtype.has_subtype();      // returns false

binary_with_subtype.set_subtype(42);
binary.set_subtype(23);

binary.subtype();                       // returns 23
```

Because `json::binary_t` derives from `std::vector<std::uint8_t>`, all of the vector's member functions are available:

```cpp
binary.size();  // returns 4
binary[1];      // returns 0xFE
```
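
The relationship between the byte storage and the subtype can be pictured with a standalone type. The following is a minimal sketch, not the library's implementation: `byte_container` and its members `m_subtype` and `m_has_subtype` are hypothetical names that only mirror the interface listed above.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for json::binary_t (illustration only):
// a byte vector extended with an optional subtype.
class byte_container : public std::vector<std::uint8_t>
{
  public:
    using container_type = std::vector<std::uint8_t>;

    byte_container() = default;

    // bytes without a subtype
    byte_container(container_type bytes)
        : container_type(std::move(bytes)) {}

    // bytes with a subtype
    byte_container(container_type bytes, std::uint8_t subtype)
        : container_type(std::move(bytes)), m_subtype(subtype), m_has_subtype(true) {}

    void set_subtype(std::uint8_t subtype)
    {
        m_subtype = subtype;
        m_has_subtype = true;
    }

    void clear_subtype()
    {
        m_subtype = 0;
        m_has_subtype = false;
    }

    std::uint8_t subtype() const { return m_subtype; }
    bool has_subtype() const { return m_has_subtype; }

  private:
    std::uint8_t m_subtype = 0;   // only meaningful if m_has_subtype is true
    bool m_has_subtype = false;
};
```

Because the sketch inherits publicly from the vector, calls such as `size()` and `operator[]` work on it unchanged, which is the property the paragraph above relies on.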

JSON values can be constructed from `json::binary_t`:

```cpp
json j = binary;
```

Binary values are primitive values just like numbers or strings:

```cpp
j.is_binary();    // returns true
j.is_primitive(); // returns true
```

Given a binary JSON value, the `binary_t` can be accessed by reference via `get_binary()`:

```cpp
j.get_binary().has_subtype();  // returns true
j.get_binary().size();         // returns 4
```

For convenience, binary JSON values can be constructed via `json::binary`:

```cpp
auto j2 = json::binary({0xCA, 0xFE, 0xBA, 0xBE}, 23);
auto j3 = json::binary({0xCA, 0xFE, 0xBA, 0xBE});

j2 == j;                        // returns true
j3.get_binary().has_subtype();  // returns false
```

## Serialization

How a binary value is serialized depends on the target format.

### JSON

JSON does not have a binary type, and this library does not introduce a new type as this would break conformance. Instead, binary values are serialized as an object with two keys: `bytes` holds an array of integers, and `subtype` is an integer or `null`.

??? example

    Code:

    ```cpp
    // create a binary value of subtype 42
    json j;
    j["binary"] = json::binary({0xCA, 0xFE, 0xBA, 0xBE}, 42);

    // serialize to standard output
    std::cout << j.dump(2) << std::endl;
    ```

    Output:

    ```json
    {
      "binary": {
        "bytes": [202, 254, 186, 190],
        "subtype": 42
      }
    }
    ```

!!! warning "No roundtrip for binary values"

    The JSON parser will not parse the objects generated by binary values back to binary values. This is by design to remain standards compliant. Serializing binary values to JSON is only implemented for debugging purposes.
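
To make the mapping to `bytes` and `subtype` concrete, the sketch below renders a byte sequence and an optional subtype as a compact JSON object of that shape. `binary_to_json_text` is a hypothetical helper written for illustration; it does not call into the library and is not how the library produces its output.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (illustration only): render bytes and an optional
// subtype as a compact JSON object with the keys "bytes" and "subtype".
std::string binary_to_json_text(const std::vector<std::uint8_t>& bytes,
                                bool has_subtype = false,
                                std::uint8_t subtype = 0)
{
    std::string out = "{\"bytes\":[";
    for (std::size_t i = 0; i < bytes.size(); ++i)
    {
        if (i != 0)
        {
            out += ',';
        }
        out += std::to_string(bytes[i]);  // each byte becomes a decimal integer
    }
    out += "],\"subtype\":";
    out += has_subtype ? std::to_string(subtype) : std::string("null");
    out += '}';
    return out;
}
```

Since the result is an ordinary JSON object, a parser has no way to tell it apart from an object that a user created with the same two keys, which is why the roundtrip described in the warning is not attempted.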

### BSON

[BSON](binary_formats/bson.md) supports binary values and subtypes. If a subtype is given, it is written as an unsigned 8-bit integer. If no subtype is given, the generic binary subtype 0x00 is used.

??? example

    Code:

    ```cpp
    // create a binary value of subtype 42
    json j;
    j["binary"] = json::binary({0xCA, 0xFE, 0xBA, 0xBE}, 42);

    // convert to BSON
    auto v = json::to_bson(j);
    ```

    `v` is a `std::vector<std::uint8_t>` with the following 22 elements:

    ```c
    0x16 0x00 0x00 0x00                         // number of bytes in the document
        0x05                                    // binary value
            0x62 0x69 0x6E 0x61 0x72 0x79 0x00  // key "binary" + null byte
            0x04 0x00 0x00 0x00                 // number of bytes
            0x2A                                // subtype
            0xCA 0xFE 0xBA 0xBE                 // content
    0x00                                        // end of the document
    ```

    Note that the serialization preserves the subtype, and deserializing `v` would yield the following value:

    ```json
    {
      "binary": {
        "bytes": [202, 254, 186, 190],
        "subtype": 42
      }
    }
    ```
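
The byte layout in the example can be reproduced by hand. The following standalone sketch assembles a BSON document with a single binary element; `append_int32_le` and `bson_single_binary` are hypothetical helpers, not library functions, and the sketch assumes the payload and document sizes fit into 32 bits.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// append a 32-bit value in little-endian byte order, as BSON requires
void append_int32_le(std::vector<std::uint8_t>& out, std::uint32_t value)
{
    for (int shift = 0; shift < 32; shift += 8)
    {
        out.push_back(static_cast<std::uint8_t>((value >> shift) & 0xFF));
    }
}

// Hypothetical helper (illustration only): a BSON document that holds one
// binary element; the subtype defaults to the generic binary subtype 0x00.
std::vector<std::uint8_t> bson_single_binary(const std::string& key,
                                             const std::vector<std::uint8_t>& bytes,
                                             std::uint8_t subtype = 0x00)
{
    // element: type 0x05, key as C string, int32 payload size, subtype, payload
    std::vector<std::uint8_t> element;
    element.push_back(0x05);
    element.insert(element.end(), key.begin(), key.end());
    element.push_back(0x00);
    append_int32_le(element, static_cast<std::uint32_t>(bytes.size()));
    element.push_back(subtype);
    element.insert(element.end(), bytes.begin(), bytes.end());

    // document: int32 total size (counting itself and the terminator),
    // followed by the element and a terminating 0x00
    std::vector<std::uint8_t> document;
    append_int32_le(document, static_cast<std::uint32_t>(4 + element.size() + 1));
    document.insert(document.end(), element.begin(), element.end());
    document.push_back(0x00);
    return document;
}
```

Calling `bson_single_binary("binary", {0xCA, 0xFE, 0xBA, 0xBE}, 42)` yields the 22 bytes listed in the example; omitting the last argument changes only the subtype byte to 0x00.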

### CBOR

[CBOR](binary_formats/cbor.md) supports binary values, but no subtypes. Subtypes are therefore serialized as tags. Every binary value is serialized as a byte string, and the library chooses the smallest representation that fits the length of the byte array.

??? example

    Code:

    ```cpp
    // create a binary value of subtype 42
    json j;
    j["binary"] = json::binary({0xCA, 0xFE, 0xBA, 0xBE}, 42);

    // convert to CBOR
    auto v = json::to_cbor(j);
    ```

    `v` is a `std::vector<std::uint8_t>` with the following 15 elements:

    ```c
    0xA1                                   // map(1)
        0x66                               // text(6)
            0x62 0x69 0x6E 0x61 0x72 0x79  // "binary"
        0xD8 0x2A                          // tag(42)
        0x44                               // bytes(4)
            0xCA 0xFE 0xBA 0xBE            // content
    ```

    Note that the subtype is serialized as a tag. However, parsing tagged values yields a parse error unless `json::cbor_tag_handler_t::ignore` is passed to `json::from_cbor`. With `ignore`, the tag is skipped, so deserializing `v` yields a binary value without a subtype:

    ```json
    {
      "binary": {
        "bytes": [202, 254, 186, 190],
        "subtype": null
      }
    }
    ```
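
The choice of the smallest byte string header can be sketched without the library. The code below is a minimal illustration based on the CBOR encoding rules; `append_be` and `cbor_byte_string` are hypothetical helpers. It assumes an 8-bit subtype that is always written in the one-byte tag form (`0xD8`) shown in the example, and a payload shorter than 4 GiB.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// append the lowest `width` bytes of `value` in big-endian order
void append_be(std::vector<std::uint8_t>& out, std::uint64_t value, int width)
{
    for (int i = width - 1; i >= 0; --i)
    {
        out.push_back(static_cast<std::uint8_t>((value >> (8 * i)) & 0xFF));
    }
}

// Hypothetical helper (illustration only): a CBOR byte string,
// optionally preceded by a tag that carries the subtype.
std::vector<std::uint8_t> cbor_byte_string(const std::vector<std::uint8_t>& bytes,
                                           bool has_subtype = false,
                                           std::uint8_t subtype = 0)
{
    std::vector<std::uint8_t> out;
    if (has_subtype)
    {
        out.push_back(0xD8);  // tag, tag number follows in one byte
        out.push_back(subtype);
    }

    const std::size_t n = bytes.size();
    if (n <= 0x17)
    {
        out.push_back(static_cast<std::uint8_t>(0x40 + n));  // length inside the header byte
    }
    else if (n <= 0xFF)
    {
        out.push_back(0x58);  // 1-byte length follows
        append_be(out, n, 1);
    }
    else if (n <= 0xFFFF)
    {
        out.push_back(0x59);  // 2-byte length follows
        append_be(out, n, 2);
    }
    else
    {
        out.push_back(0x5A);  // 4-byte length follows (assumes n < 2^32)
        append_be(out, n, 4);
    }

    out.insert(out.end(), bytes.begin(), bytes.end());
    return out;
}
```

For the four bytes of the example and subtype 42, the sketch produces `0xD8 0x2A 0x44` followed by the content, which matches the last three lines of the listing above.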

### MessagePack

[MessagePack](binary_formats/messagepack.md) supports binary values and subtypes. If a subtype is given, the ext family is used. The library will choose the smallest representation among fixext1, fixext2, fixext4, fixext8, fixext16, ext8, ext16, and ext32. The subtype is then added as a signed 8-bit integer.

If no subtype is given, the bin family (bin8, bin16, bin32) is used.

??? example

    Code:

    ```cpp
    // create a binary value of subtype 42
    json j;
    j["binary"] = json::binary({0xCA, 0xFE, 0xBA, 0xBE}, 42);

    // convert to MessagePack
    auto v = json::to_msgpack(j);
    ```

    `v` is a `std::vector<std::uint8_t>` with the following 14 elements:

    ```c
    0x81                                   // fixmap1
        0xA6                               // fixstr6
            0x62 0x69 0x6E 0x61 0x72 0x79  // "binary"
        0xD6                               // fixext4
            0x2A                           // subtype
            0xCA 0xFE 0xBA 0xBE            // content
    ```

    Note that the serialization preserves the subtype, and deserializing `v` would yield the following value:

    ```json
    {
      "binary": {
        "bytes": [202, 254, 186, 190],
        "subtype": 42
      }
    }
    ```
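
The selection between the ext and bin families can be written down as a small decision procedure. The sketch below follows the MessagePack format markers; `append_be` and `msgpack_binary` are hypothetical helpers rather than library functions, and the code assumes a payload shorter than 4 GiB.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// append the lowest `width` bytes of `value` in big-endian order
void append_be(std::vector<std::uint8_t>& out, std::uint64_t value, int width)
{
    for (int i = width - 1; i >= 0; --i)
    {
        out.push_back(static_cast<std::uint8_t>((value >> (8 * i)) & 0xFF));
    }
}

// Hypothetical helper (illustration only): encode bytes with the ext family
// if a subtype is given, and with the bin family otherwise.
std::vector<std::uint8_t> msgpack_binary(const std::vector<std::uint8_t>& bytes,
                                         bool has_subtype = false,
                                         std::int8_t subtype = 0)
{
    std::vector<std::uint8_t> out;
    const std::size_t n = bytes.size();

    if (has_subtype)
    {
        switch (n)
        {
            case 1:  out.push_back(0xD4); break;  // fixext1
            case 2:  out.push_back(0xD5); break;  // fixext2
            case 4:  out.push_back(0xD6); break;  // fixext4
            case 8:  out.push_back(0xD7); break;  // fixext8
            case 16: out.push_back(0xD8); break;  // fixext16
            default:
                if (n <= 0xFF)        { out.push_back(0xC7); append_be(out, n, 1); }  // ext8
                else if (n <= 0xFFFF) { out.push_back(0xC8); append_be(out, n, 2); }  // ext16
                else                  { out.push_back(0xC9); append_be(out, n, 4); }  // ext32
                break;
        }
        out.push_back(static_cast<std::uint8_t>(subtype));  // type byte follows the header
    }
    else
    {
        if (n <= 0xFF)        { out.push_back(0xC4); append_be(out, n, 1); }  // bin8
        else if (n <= 0xFFFF) { out.push_back(0xC5); append_be(out, n, 2); }  // bin16
        else                  { out.push_back(0xC6); append_be(out, n, 4); }  // bin32
    }

    out.insert(out.end(), bytes.begin(), bytes.end());
    return out;
}
```

With the four example bytes and subtype 42, the sketch emits the fixext4 marker `0xD6`, the subtype, and the content. Without a subtype, the same bytes are written as bin8 with an explicit length byte.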

### UBJSON

[UBJSON](binary_formats/ubjson.md) supports neither binary values nor subtypes, and proposes to serialize binary values as an array of uint8 values. This translation is implemented by the library.

??? example

    Code:

    ```cpp
    // create a binary value of subtype 42 (will be ignored in UBJSON)
    json j;
    j["binary"] = json::binary({0xCA, 0xFE, 0xBA, 0xBE}, 42);

    // convert to UBJSON
    auto v = json::to_ubjson(j);
    ```

    `v` is a `std::vector<std::uint8_t>` with the following 20 elements:

    ```c
    0x7B                                             // '{'
        0x69 0x06                                    // i 6 (length of the key)
        0x62 0x69 0x6E 0x61 0x72 0x79                // "binary"
        0x5B                                         // '['
            0x55 0xCA 0x55 0xFE 0x55 0xBA 0x55 0xBE  // content (each byte prefixed with 'U')
        0x5D                                         // ']'
    0x7D                                             // '}'
    ```

    The following code uses the type and size optimization for UBJSON:

    ```cpp
    // convert to UBJSON using the size and type optimization
    auto v = json::to_ubjson(j, true, true);
    ```

    The resulting vector has 23 elements; the optimization is not effective for examples with few values:

    ```c
    0x7B                                // '{'
        0x24                            // '$' type of the object elements
        0x5B                            // '[' array
        0x23 0x69 0x01                  // '#' i 1 number of object elements
        0x69 0x06                       // i 6 (length of the key)
        0x62 0x69 0x6E 0x61 0x72 0x79   // "binary"
            0x24 0x55                   // '$' 'U' type of the array elements: unsigned integers
            0x23 0x69 0x04              // '#' i 4 number of array elements
            0xCA 0xFE 0xBA 0xBE         // content
    ```

    Note that the subtype (42) is **not** serialized and that UBJSON has **no binary type**, so deserializing `v` would yield the following value:

    ```json
    {
      "binary": [202, 254, 186, 190]
    }
    ```
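
The two array encodings from the example can be compared with a standalone sketch. `append_be` and `ubjson_uint8_array` are hypothetical helpers, not library functions. A single `optimized` flag stands in for both the size and the type optimization, and the sketch encodes a top-level array, so the opening `[` is always written (inside the typed object of the example, that marker is implied by the object's element type).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// append the lowest `width` bytes of `value` in big-endian order
void append_be(std::vector<std::uint8_t>& out, std::uint64_t value, int width)
{
    for (int i = width - 1; i >= 0; --i)
    {
        out.push_back(static_cast<std::uint8_t>((value >> (8 * i)) & 0xFF));
    }
}

// Hypothetical helper (illustration only): encode bytes as a UBJSON array
// of uint8 values; a subtype has no representation and is not a parameter.
std::vector<std::uint8_t> ubjson_uint8_array(const std::vector<std::uint8_t>& bytes,
                                             bool optimized = false)
{
    std::vector<std::uint8_t> out;
    out.push_back(0x5B);  // '['

    if (!optimized)
    {
        for (const std::uint8_t byte : bytes)
        {
            out.push_back(0x55);  // 'U' prefix for every element
            out.push_back(byte);
        }
        out.push_back(0x5D);  // ']'
        return out;
    }

    out.push_back(0x24);  // '$' element type follows
    out.push_back(0x55);  // 'U'
    out.push_back(0x23);  // '#' element count follows

    const std::size_t n = bytes.size();
    if (n <= 0x7F)        { out.push_back(0x69); append_be(out, n, 1); }  // 'i' int8
    else if (n <= 0xFF)   { out.push_back(0x55); append_be(out, n, 1); }  // 'U' uint8
    else if (n <= 0x7FFF) { out.push_back(0x49); append_be(out, n, 2); }  // 'I' int16
    else                  { out.push_back(0x6C); append_be(out, n, 4); }  // 'l' int32 (assumes n fits)

    // counted containers carry no per-element prefix and no closing ']'
    out.insert(out.end(), bytes.begin(), bytes.end());
    return out;
}
```

Both encodings take 10 bytes for the four example bytes, which illustrates why the optimization does not pay off for short arrays. The fixed header costs 6 bytes, whereas the plain form costs 2 bytes per element plus 2 for the brackets.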