# Binary Values

The library implements several [binary formats](binary_formats/index.md) that encode JSON in an efficient way. Most of these formats support binary values; that is, values whose semantics are defined outside the library and which only define a sequence of bytes to be stored.

JSON itself does not have a binary value. As such, binary values are an extension that this library implements to store values received by a binary format. Binary values are never created by the JSON parser, and are only part of a serialized JSON text if they have been created manually or via a binary format.

## API for binary values

```plantuml
class json::binary_t {
    -- setters --
    +void set_subtype(std::uint64_t subtype)
    +void clear_subtype()
    -- getters --
    +std::uint64_t subtype() const
    +bool has_subtype() const
}

"std::vector<uint8_t>" <|-- json::binary_t
```

By default, binary values are stored as `std::vector<std::uint8_t>`. This type can be changed by providing a template parameter to the `basic_json` type. To store binary subtypes, the storage type is extended and exposed as `json::binary_t`:

```cpp
auto binary = json::binary_t({0xCA, 0xFE, 0xBA, 0xBE});
auto binary_with_subtype = json::binary_t({0xCA, 0xFE, 0xBA, 0xBE}, 42);
```

There are several convenience functions to check and set the subtype:

```cpp
binary.has_subtype();                   // returns false
binary_with_subtype.has_subtype();      // returns true

binary_with_subtype.clear_subtype();
binary_with_subtype.has_subtype();      // returns false

binary_with_subtype.set_subtype(42);
binary.set_subtype(23);

binary.subtype();                       // returns 23
```

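To make the relationship in the diagram more concrete, the following is a simplified, self-contained model of a byte container with an optional subtype. `binary_sketch` and `binary_sketch_demo` are hypothetical names, and the code only illustrates the behavior described above; it is not the library's actual implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for json::binary_t: a byte vector plus an optional subtype.
class binary_sketch : public std::vector<std::uint8_t>
{
  public:
    using container_type = std::vector<std::uint8_t>;

    binary_sketch() = default;

    // bytes without a subtype
    binary_sketch(container_type bytes)
        : container_type(std::move(bytes)) {}

    // bytes with a subtype
    binary_sketch(container_type bytes, std::uint64_t subtype)
        : container_type(std::move(bytes)), m_subtype(subtype), m_has_subtype(true) {}

    void set_subtype(std::uint64_t subtype)
    {
        m_subtype = subtype;
        m_has_subtype = true;
    }

    void clear_subtype()
    {
        m_subtype = 0;
        m_has_subtype = false;
    }

    bool has_subtype() const { return m_has_subtype; }

    // mirrors the documented behavior: std::uint64_t(-1) signals "no subtype"
    std::uint64_t subtype() const
    {
        return m_has_subtype ? m_subtype : static_cast<std::uint64_t>(-1);
    }

    // two values are equal only if the bytes and the subtype state both match
    bool operator==(const binary_sketch& rhs) const
    {
        return static_cast<const container_type&>(*this) == static_cast<const container_type&>(rhs)
               && m_has_subtype == rhs.m_has_subtype
               && m_subtype == rhs.m_subtype;
    }

  private:
    std::uint64_t m_subtype = 0;
    bool m_has_subtype = false;
};

// Replays the calls from the snippets above against the sketch.
void binary_sketch_demo()
{
    binary_sketch binary({0xCA, 0xFE, 0xBA, 0xBE});
    binary_sketch binary_with_subtype({0xCA, 0xFE, 0xBA, 0xBE}, 42);

    assert(!binary.has_subtype());
    assert(binary_with_subtype.has_subtype());

    binary_with_subtype.clear_subtype();
    assert(!binary_with_subtype.has_subtype());

    binary.set_subtype(23);
    assert(binary.subtype() == 23);
    assert(binary.size() == 4);  // inherited from std::vector
}
```

Deriving from the container keeps every `std::vector` member function available, while the extra flag distinguishes "no subtype" from any concrete subtype value.
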
Because `json::binary_t` derives from `std::vector<std::uint8_t>`, all of its member functions are available:

```cpp
binary.size();  // returns 4
binary[1];      // returns 0xFE
```

JSON values can be constructed from `json::binary_t`:

```cpp
json j = binary;
```

Binary values are primitive values just like numbers or strings:

```cpp
j.is_binary();    // returns true
j.is_primitive(); // returns true
```

Given a binary JSON value, the `binary_t` can be accessed by reference via `get_binary()`:

```cpp
j.get_binary().has_subtype();  // returns true
j.get_binary().size();         // returns 4
```

For convenience, binary JSON values can be constructed via `json::binary`:

```cpp
auto j2 = json::binary({0xCA, 0xFE, 0xBA, 0xBE}, 23);
auto j3 = json::binary({0xCA, 0xFE, 0xBA, 0xBE});

j2 == j;                        // returns true
j3.get_binary().has_subtype();  // returns false
j3.get_binary().subtype();      // returns std::uint64_t(-1) as j3 has no subtype
```

## Serialization

Binary values are serialized differently depending on the format.

### JSON

JSON does not have a binary type, and this library does not introduce a new type as this would break conformance. Instead, binary values are serialized as an object with two keys: `bytes` holds an array of integers, and `subtype` is an integer or `null`.

??? example

    Code:

    ```cpp
    // create a binary value of subtype 42
    json j;
    j["binary"] = json::binary({0xCA, 0xFE, 0xBA, 0xBE}, 42);

    // serialize to standard output
    std::cout << j.dump(2) << std::endl;
    ```

    Output:

    ```json
    {
      "binary": {
        "bytes": [202, 254, 186, 190],
        "subtype": 42
      }
    }
    ```

!!! warning "No roundtrip for binary values"

    The JSON parser will not parse the objects generated by binary values back to binary values. This is by design to remain standards compliant. Serializing binary values to JSON is only implemented for debugging purposes.

### BSON

[BSON](binary_formats/bson.md) supports binary values and subtypes. If a subtype is given, it is used and added as an unsigned 8-bit integer. If no subtype is given, the generic binary subtype 0x00 is used.

??? example

    Code:

    ```cpp
    // create a binary value of subtype 42
    json j;
    j["binary"] = json::binary({0xCA, 0xFE, 0xBA, 0xBE}, 42);

    // convert to BSON
    auto v = json::to_bson(j);
    ```

    `v` is a `std::vector<std::uint8_t>` with the following 22 elements:

    ```c
    0x16 0x00 0x00 0x00                         // number of bytes in the document
        0x05                                    // binary value
            0x62 0x69 0x6E 0x61 0x72 0x79 0x00  // key "binary" + null byte
            0x04 0x00 0x00 0x00                 // number of bytes
            0x2a                                // subtype
            0xCA 0xFE 0xBA 0xBE                 // content
    0x00                                        // end of the document
    ```

    Note that the serialization preserves the subtype, and deserializing `v` would yield the following value:

    ```json
    {
      "binary": {
        "bytes": [202, 254, 186, 190],
        "subtype": 42
      }
    }
    ```

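The 22 bytes in the BSON example can be reproduced by hand. The following sketch builds a BSON document with a single binary element using only the standard library. `bson_binary_document` and `append_int32_le` are hypothetical helper names, and the code illustrates the byte layout only; it is not the library's encoder.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Append a 32-bit value in little-endian byte order, as BSON requires.
void append_int32_le(std::vector<std::uint8_t>& out, std::uint32_t value)
{
    for (int shift = 0; shift < 32; shift += 8)
    {
        out.push_back(static_cast<std::uint8_t>((value >> shift) & 0xFF));
    }
}

// Hypothetical helper: encode a document that holds one binary element.
// The default subtype 0x00 is BSON's generic binary subtype.
std::vector<std::uint8_t> bson_binary_document(const std::string& key,
                                               const std::vector<std::uint8_t>& bytes,
                                               std::uint8_t subtype = 0x00)
{
    // BSON keys are null-terminated, so they must not contain a null byte
    assert(key.find('\0') == std::string::npos);

    // size field + type byte + key + null byte + length field + subtype + content + terminator
    const std::uint32_t total =
        static_cast<std::uint32_t>(4 + 1 + key.size() + 1 + 4 + 1 + bytes.size() + 1);

    std::vector<std::uint8_t> out;
    append_int32_le(out, total);                                     // number of bytes in the document
    out.push_back(0x05);                                             // element type: binary
    out.insert(out.end(), key.begin(), key.end());                   // key
    out.push_back(0x00);                                             // null byte ending the key
    append_int32_le(out, static_cast<std::uint32_t>(bytes.size()));  // number of content bytes
    out.push_back(subtype);                                          // subtype
    out.insert(out.end(), bytes.begin(), bytes.end());               // content
    out.push_back(0x00);                                             // end of the document
    return out;
}
```

Calling `bson_binary_document("binary", {0xCA, 0xFE, 0xBA, 0xBE}, 42)` yields the same byte sequence as the listing in the example.
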
### CBOR

[CBOR](binary_formats/cbor.md) supports binary values, but no subtypes. Subtypes will be serialized as tags. Any binary value will be serialized as a byte string. The library will choose the smallest representation based on the length of the byte array.

??? example

    Code:

    ```cpp
    // create a binary value of subtype 42
    json j;
    j["binary"] = json::binary({0xCA, 0xFE, 0xBA, 0xBE}, 42);

    // convert to CBOR
    auto v = json::to_cbor(j);
    ```

    `v` is a `std::vector<std::uint8_t>` with the following 15 elements:

    ```c
    0xA1                                   // map(1)
        0x66                               // text(6)
            0x62 0x69 0x6E 0x61 0x72 0x79  // "binary"
        0xD8 0x2A                          // tag(42)
        0x44                               // bytes(4)
            0xCA 0xFE 0xBA 0xBE            // content
    ```

    Note that the subtype is serialized as a tag. However, parsing tagged values yields a parse error unless `json::cbor_tag_handler_t::ignore` or `json::cbor_tag_handler_t::store` is passed to `json::from_cbor`. With `json::cbor_tag_handler_t::ignore`, the tag is dropped, and deserializing `v` would yield the following value:

    ```json
    {
      "binary": {
        "bytes": [202, 254, 186, 190],
        "subtype": null
      }
    }
    ```

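The tag and byte string headers in the CBOR example follow the general CBOR head encoding: the major type occupies the top three bits of the first byte, and the argument is stored in the shortest form that fits. The sketch below encodes a tagged byte string that way. `append_cbor_head` and `cbor_tagged_bytes` are hypothetical names, and the code follows the generic rules of the CBOR specification rather than the library's source, so the library may pick a different (still valid) encoding for some tag values.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append a CBOR head: major type in the top 3 bits, argument in the shortest form.
void append_cbor_head(std::vector<std::uint8_t>& out, unsigned major_type, std::uint64_t value)
{
    assert(major_type < 8);
    const auto base = static_cast<std::uint8_t>(major_type << 5);

    std::uint8_t info = 0;   // additional information (low 5 bits)
    int extra_bytes = 0;     // number of argument bytes following the first byte
    if (value < 24)               { info = static_cast<std::uint8_t>(value); }
    else if (value <= 0xFF)       { info = 24; extra_bytes = 1; }
    else if (value <= 0xFFFF)     { info = 25; extra_bytes = 2; }
    else if (value <= 0xFFFFFFFF) { info = 26; extra_bytes = 4; }
    else                          { info = 27; extra_bytes = 8; }

    out.push_back(static_cast<std::uint8_t>(base | info));
    for (int i = extra_bytes - 1; i >= 0; --i)  // big-endian argument
    {
        out.push_back(static_cast<std::uint8_t>((value >> (8 * i)) & 0xFF));
    }
}

// Hypothetical helper: a byte string (major type 2) wrapped in a tag (major type 6).
std::vector<std::uint8_t> cbor_tagged_bytes(const std::vector<std::uint8_t>& bytes, std::uint64_t tag)
{
    std::vector<std::uint8_t> out;
    append_cbor_head(out, 6, tag);
    append_cbor_head(out, 2, bytes.size());
    out.insert(out.end(), bytes.begin(), bytes.end());
    return out;
}
```

For the four example bytes and tag 42, the result is `0xD8 0x2A 0x44 0xCA 0xFE 0xBA 0xBE`, which matches the tail of the listing in the example.
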
### MessagePack

[MessagePack](binary_formats/messagepack.md) supports binary values and subtypes. If a subtype is given, the ext family is used. The library will choose the smallest representation among fixext1, fixext2, fixext4, fixext8, ext8, ext16, and ext32. The subtype is then added as a signed 8-bit integer.

If no subtype is given, the bin family (bin8, bin16, bin32) is used.

??? example

    Code:

    ```cpp
    // create a binary value of subtype 42
    json j;
    j["binary"] = json::binary({0xCA, 0xFE, 0xBA, 0xBE}, 42);

    // convert to MessagePack
    auto v = json::to_msgpack(j);
    ```

    `v` is a `std::vector<std::uint8_t>` with the following 14 elements:

    ```c
    0x81                                   // fixmap1
        0xA6                               // fixstr6
            0x62 0x69 0x6E 0x61 0x72 0x79  // "binary"
        0xD6                               // fixext4
            0x2A                           // subtype
            0xCA 0xFE 0xBA 0xBE            // content
    ```

    Note that the serialization preserves the subtype, and deserializing `v` would yield the following value:

    ```json
    {
      "binary": {
        "bytes": [202, 254, 186, 190],
        "subtype": 42
      }
    }
    ```

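How the format byte depends on the payload length and on the presence of a subtype can be sketched with the standard library alone. `msgpack_ext`, `msgpack_bin`, and `append_be` are hypothetical helper names. The format codes are taken from the MessagePack specification, which also defines fixext16 (`0xD8`), so the sketch includes it. This is an illustration, not the library's encoder.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Append the `count` low-order bytes of `value` in big-endian order.
void append_be(std::vector<std::uint8_t>& out, std::uint32_t value, int count)
{
    for (int i = count - 1; i >= 0; --i)
    {
        out.push_back(static_cast<std::uint8_t>((value >> (8 * i)) & 0xFF));
    }
}

// With a subtype: ext family (format byte, optional length, subtype, content).
std::vector<std::uint8_t> msgpack_ext(const std::vector<std::uint8_t>& bytes, std::int8_t subtype)
{
    const std::size_t n = bytes.size();
    assert(n <= 0xFFFFFFFFu);
    const auto len = static_cast<std::uint32_t>(n);

    std::vector<std::uint8_t> out;
    switch (n)
    {
        case 1:  out.push_back(0xD4); break;  // fixext1
        case 2:  out.push_back(0xD5); break;  // fixext2
        case 4:  out.push_back(0xD6); break;  // fixext4
        case 8:  out.push_back(0xD7); break;  // fixext8
        case 16: out.push_back(0xD8); break;  // fixext16
        default:
            if (n <= 0xFF)        { out.push_back(0xC7); append_be(out, len, 1); }  // ext8
            else if (n <= 0xFFFF) { out.push_back(0xC8); append_be(out, len, 2); }  // ext16
            else                  { out.push_back(0xC9); append_be(out, len, 4); }  // ext32
    }
    out.push_back(static_cast<std::uint8_t>(subtype));
    out.insert(out.end(), bytes.begin(), bytes.end());
    return out;
}

// Without a subtype: bin family (format byte, length, content).
std::vector<std::uint8_t> msgpack_bin(const std::vector<std::uint8_t>& bytes)
{
    const std::size_t n = bytes.size();
    assert(n <= 0xFFFFFFFFu);
    const auto len = static_cast<std::uint32_t>(n);

    std::vector<std::uint8_t> out;
    if (n <= 0xFF)        { out.push_back(0xC4); append_be(out, len, 1); }  // bin8
    else if (n <= 0xFFFF) { out.push_back(0xC5); append_be(out, len, 2); }  // bin16
    else                  { out.push_back(0xC6); append_be(out, len, 4); }  // bin32
    out.insert(out.end(), bytes.begin(), bytes.end());
    return out;
}
```

For the four example bytes, `msgpack_ext(bytes, 42)` starts with `0xD6 0x2A` as in the listing in the example, whereas `msgpack_bin(bytes)` starts with `0xC4 0x04`.
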
### UBJSON

[UBJSON](binary_formats/ubjson.md) supports neither binary values nor subtypes, and proposes serializing binary values as an array of uint8 values. This translation is implemented by the library.

??? example

    Code:

    ```cpp
    // create a binary value of subtype 42 (will be ignored in UBJSON)
    json j;
    j["binary"] = json::binary({0xCA, 0xFE, 0xBA, 0xBE}, 42);

    // convert to UBJSON
    auto v = json::to_ubjson(j);
    ```

    `v` is a `std::vector<std::uint8_t>` with the following 20 elements:

    ```c
    0x7B                                             // '{'
        0x69 0x06                                    // i 6 (length of the key)
        0x62 0x69 0x6E 0x61 0x72 0x79                // "binary"
        0x5B                                         // '['
            0x55 0xCA 0x55 0xFE 0x55 0xBA 0x55 0xBE  // content (each byte prefixed with 'U')
        0x5D                                         // ']'
    0x7D                                             // '}'
    ```

    The following code uses the type and size optimization for UBJSON:

    ```cpp
    // convert to UBJSON using the size and type optimization
    auto v = json::to_ubjson(j, true, true);
    ```

    The resulting vector has 23 elements; the optimization is not effective for examples with few values:

    ```c
    0x7B                                // '{'
        0x24                            // '$' type of the object elements
        0x5B                            // '[' array
        0x23 0x69 0x01                  // '#' i 1 number of object elements
        0x69 0x06                       // i 6 (length of the key)
        0x62 0x69 0x6E 0x61 0x72 0x79   // "binary"
            0x24 0x55                   // '$' 'U' type of the array elements: unsigned integers
            0x23 0x69 0x04              // '#' i 4 number of array elements
            0xCA 0xFE 0xBA 0xBE         // content
    ```

    Note that the subtype (42) is **not** serialized and that UBJSON has **no binary type**; deserializing `v` would yield the following value:

    ```json
    {
      "binary": [202, 254, 186, 190]
    }
    ```

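The two array encodings used in the UBJSON example can be compared in isolation. The sketch below encodes a standalone uint8 array in the plain form and in the type-and-size optimized form. `ubjson_uint8_array` and `ubjson_uint8_array_optimized` are hypothetical names, and the optimized variant is deliberately limited to counts that fit the int8 marker `i`. It illustrates the byte layout only and is not the library's encoder.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Plain form: '[' followed by one 'U'-prefixed element per byte, closed by ']'.
std::vector<std::uint8_t> ubjson_uint8_array(const std::vector<std::uint8_t>& bytes)
{
    std::vector<std::uint8_t> out;
    out.push_back('[');
    for (const std::uint8_t b : bytes)
    {
        out.push_back('U');
        out.push_back(b);
    }
    out.push_back(']');
    return out;
}

// Optimized form: '[' '$' 'U' '#' 'i' <count> followed by the raw bytes, without ']'.
// For brevity, the sketch only handles counts that fit the int8 marker 'i'.
std::vector<std::uint8_t> ubjson_uint8_array_optimized(const std::vector<std::uint8_t>& bytes)
{
    assert(bytes.size() <= 127);
    std::vector<std::uint8_t> out = {'[', '$', 'U', '#', 'i',
                                     static_cast<std::uint8_t>(bytes.size())};
    out.insert(out.end(), bytes.begin(), bytes.end());
    return out;
}
```

The plain form needs `2n + 2` bytes for `n` values and the optimized form `n + 6`, so for the four example bytes both take 10 bytes, and the optimized form only becomes smaller for longer arrays.
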