# Binary Values

The library implements several [binary formats](binary_formats/index.md) that encode JSON in an efficient way. Most of these formats support binary values; that is, values whose semantics are defined outside the library and that only define a sequence of bytes to be stored.

JSON itself does not have a binary value. As such, binary values are an extension that this library implements to store values received by a binary format. Binary values are never created by the JSON parser, and are only part of a serialized JSON text if they have been created manually or via a binary format.

## API for binary values

```plantuml
class json::binary_t {
    -- setters --
    +void set_subtype(std::uint8_t subtype)
    +void clear_subtype()
    -- getters --
    +std::uint8_t subtype() const
    +bool has_subtype() const
}

"std::vector<uint8_t>" <|-- json::binary_t
```

By default, binary values are stored as `std::vector<std::uint8_t>`. This type can be changed by providing a template parameter to the `basic_json` type.
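
The class diagram above amounts to a byte container that also carries an optional subtype. As a rough standalone illustration, and explicitly not the library's implementation, such a type could be modelled as follows. The name `tagged_bytes` and its members are hypothetical; the template parameter only mirrors the idea that the underlying storage type is exchangeable.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical illustration only: a container extended with an optional subtype.
// Container defaults to a byte vector but may be any container type.
template <typename Container = std::vector<std::uint8_t>>
class tagged_bytes : public Container
{
  public:
    tagged_bytes() = default;

    // bytes without a subtype
    explicit tagged_bytes(Container bytes) : Container(std::move(bytes)) {}

    // bytes with a subtype
    tagged_bytes(Container bytes, std::uint8_t subtype)
        : Container(std::move(bytes)), m_subtype(subtype), m_has_subtype(true) {}

    // setters
    void set_subtype(std::uint8_t subtype)
    {
        m_subtype = subtype;
        m_has_subtype = true;
    }

    void clear_subtype()
    {
        m_subtype = 0;
        m_has_subtype = false;
    }

    // getters
    std::uint8_t subtype() const { return m_subtype; }
    bool has_subtype() const { return m_has_subtype; }

  private:
    std::uint8_t m_subtype = 0;
    bool m_has_subtype = false;
};
```

Because the sketch inherits publicly from the container, members such as `size()` and `operator[]` remain available on a `tagged_bytes<>` object.
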
To store binary subtypes, the storage type is extended and exposed as `json::binary_t`:

```cpp
auto binary = json::binary_t({0xCA, 0xFE, 0xBA, 0xBE});
auto binary_with_subtype = json::binary_t({0xCA, 0xFE, 0xBA, 0xBE}, 42);
```

There are several convenience functions to check and set the subtype:

```cpp
binary.has_subtype();                   // returns false
binary_with_subtype.has_subtype();      // returns true

binary_with_subtype.clear_subtype();
binary_with_subtype.has_subtype();      // returns false

binary_with_subtype.set_subtype(42);
binary.set_subtype(23);

binary.subtype();                       // returns 23
```

As `json::binary_t` derives from `std::vector<std::uint8_t>`, all of the vector's member functions are available:

```cpp
binary.size();  // returns 4
binary[1];      // returns 0xFE
```

JSON values can be constructed from `json::binary_t`:

```cpp
json j = binary;
```

Binary values are primitive values just like numbers or strings:

```cpp
j.is_binary();    // returns true
j.is_primitive(); // returns true
```

Given a binary JSON value, the `binary_t` can be accessed by reference via `get_binary()`:

```cpp
j.get_binary().has_subtype();  // returns true
j.get_binary().size();         // returns 4
```

For convenience, binary JSON values can be constructed via `json::binary`:

```cpp
auto j2 = json::binary({0xCA, 0xFE, 0xBA, 0xBE}, 23);
auto j3 = json::binary({0xCA, 0xFE, 0xBA, 0xBE});

j2 == j;                        // returns true
j3.get_binary().has_subtype();  // returns false
```

## Serialization

Binary values are serialized differently depending on the format.

### JSON

JSON does not have a binary type, and this library does not introduce a new type as this would break conformance. Instead, binary values are serialized as an object with two keys: `bytes` holds an array of integers, and `subtype` is an integer or `null`.

??? example

    Code:

    ```cpp
    // create a binary value of subtype 42
    json j;
    j["binary"] = json::binary({0xCA, 0xFE, 0xBA, 0xBE}, 42);

    // serialize to standard output
    std::cout << j.dump(2) << std::endl;
    ```

    Output:

    ```json
    {
      "binary": {
        "bytes": [202, 254, 186, 190],
        "subtype": 42
      }
    }
    ```

!!! warning "No roundtrip for binary values"

    The JSON parser will not parse the objects generated by binary values back to binary values. This is by design to remain standards compliant. Serializing binary values to JSON is only implemented for debugging purposes.

### BSON

[BSON](binary_formats/bson.md) supports binary values and subtypes. If a subtype is given, it is used and added as an unsigned 8-bit integer. If no subtype is given, the generic binary subtype 0x00 is used.

??? example

    Code:

    ```cpp
    // create a binary value of subtype 42
    json j;
    j["binary"] = json::binary({0xCA, 0xFE, 0xBA, 0xBE}, 42);

    // convert to BSON
    auto v = json::to_bson(j);
    ```

    `v` is a `std::vector<std::uint8_t>` with the following 22 elements:

    ```c
    0x16 0x00 0x00 0x00                         // number of bytes in the document
        0x05                                    // binary value
            0x62 0x69 0x6E 0x61 0x72 0x79 0x00  // key "binary" + null byte
            0x04 0x00 0x00 0x00                 // number of bytes
            0x2a                                // subtype
            0xCA 0xFE 0xBA 0xBE                 // content
    0x00                                        // end of the document
    ```

    Note that the serialization preserves the subtype, and deserializing `v` would yield the following value:

    ```json
    {
      "binary": {
        "bytes": [202, 254, 186, 190],
        "subtype": 42
      }
    }
    ```

### CBOR

[CBOR](binary_formats/cbor.md) supports binary values, but no subtypes. Subtypes are serialized as tags. Any binary value is serialized as a byte string. The library chooses the smallest representation based on the length of the byte array.

??? example

    Code:

    ```cpp
    // create a binary value of subtype 42
    json j;
    j["binary"] = json::binary({0xCA, 0xFE, 0xBA, 0xBE}, 42);

    // convert to CBOR
    auto v = json::to_cbor(j);
    ```

    `v` is a `std::vector<std::uint8_t>` with the following 15 elements:

    ```c
    0xA1                                   // map(1)
        0x66                               // text(6)
            0x62 0x69 0x6E 0x61 0x72 0x79  // "binary"
        0xD8 0x2A                          // tag(42)
        0x44                               // bytes(4)
            0xCA 0xFE 0xBA 0xBE            // content
    ```

    Note that the subtype is serialized as a tag. However, parsing tagged values yields a parse error unless `json::cbor_tag_handler_t::ignore` is passed to `json::from_cbor`. If the tag is ignored, deserializing `v` would yield the following value:

    ```json
    {
      "binary": {
        "bytes": [202, 254, 186, 190],
        "subtype": null
      }
    }
    ```

### MessagePack

[MessagePack](binary_formats/messagepack.md) supports binary values and subtypes. If a subtype is given, the ext family is used. The library will choose the smallest representation among fixext1, fixext2, fixext4, fixext8, ext8, ext16, and ext32. The subtype is then added as a signed 8-bit integer.

If no subtype is given, the bin family (bin8, bin16, bin32) is used.

??? example

    Code:

    ```cpp
    // create a binary value of subtype 42
    json j;
    j["binary"] = json::binary({0xCA, 0xFE, 0xBA, 0xBE}, 42);

    // convert to MessagePack
    auto v = json::to_msgpack(j);
    ```

    `v` is a `std::vector<std::uint8_t>` with the following 14 elements:

    ```c
    0x81                                   // fixmap1
        0xA6                               // fixstr6
            0x62 0x69 0x6E 0x61 0x72 0x79  // "binary"
        0xD6                               // fixext4
            0x2A                           // subtype
            0xCA 0xFE 0xBA 0xBE            // content
    ```

    Note that the serialization preserves the subtype, and deserializing `v` would yield the following value:

    ```json
    {
      "binary": {
        "bytes": [202, 254, 186, 190],
        "subtype": 42
      }
    }
    ```

### UBJSON

[UBJSON](binary_formats/ubjson.md) supports neither binary values nor subtypes, and proposes to serialize binary values as an array of uint8 values. This translation is implemented by the library.

??? example

    Code:

    ```cpp
    // create a binary value of subtype 42 (will be ignored in UBJSON)
    json j;
    j["binary"] = json::binary({0xCA, 0xFE, 0xBA, 0xBE}, 42);

    // convert to UBJSON
    auto v = json::to_ubjson(j);
    ```

    `v` is a `std::vector<std::uint8_t>` with the following 20 elements:

    ```c
    0x7B                                             // '{'
        0x69 0x06                                    // i 6 (length of the key)
        0x62 0x69 0x6E 0x61 0x72 0x79                // "binary"
        0x5B                                         // '['
            0x55 0xCA 0x55 0xFE 0x55 0xBA 0x55 0xBE  // content (each byte prefixed with 'U')
        0x5D                                         // ']'
    0x7D                                             // '}'
    ```

    The following code uses the type and size optimization for UBJSON:

    ```cpp
    // convert to UBJSON using the size and type optimization
    auto v = json::to_ubjson(j, true, true);
    ```

    The resulting vector has 23 elements; the optimization is not effective for examples with few values:

    ```c
    0x7B                                // '{'
        0x24                            // '$' type of the object elements
        0x5B                            // '[' array
        0x23 0x69 0x01                  // '#' i 1 number of object elements
        0x69 0x06                       // i 6 (length of the key)
        0x62 0x69 0x6E 0x61 0x72 0x79   // "binary"
            0x24 0x55                   // '$' 'U' type of the array elements: unsigned integers
            0x23 0x69 0x04              // '#' i 4 number of array elements
            0xCA 0xFE 0xBA 0xBE         // content
    ```

    Note that the subtype (42) is **not** serialized and that UBJSON has **no binary type**; deserializing `v` would yield the following value:

    ```json
    {
      "binary": [202, 254, 186, 190]
    }
    ```
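
The remark that the UBJSON optimization does not pay off for small payloads can be checked with a little arithmetic. The following sketch tallies the byte counts of the two layouts listed above for a single-key object holding `n` bytes. The function names `plain_size` and `optimized_size` are hypothetical, and the sketch assumes that the key length and `n` stay below 128 so that every length fits into an `i` (int8) marker, as in the listings.

```cpp
#include <cstddef>

// Plain layout:
// '{' + ('i' len) + key + '[' + n * ('U' byte) + ']' + '}'
constexpr std::size_t plain_size(std::size_t key_len, std::size_t n)
{
    return 1 + 2 + key_len + 1 + 2 * n + 1 + 1;
}

// Layout with size and type optimization:
// '{' + '$' + '[' + ('#' 'i' 1) + ('i' len) + key
//     + ('$' 'U') + ('#' 'i' n) + n raw bytes
constexpr std::size_t optimized_size(std::size_t key_len, std::size_t n)
{
    return 1 + 1 + 1 + 3 + 2 + key_len + 2 + 3 + n;
}

// Key "binary" (6 characters) with the 4-byte payload from the example.
static_assert(plain_size(6, 4) == 20, "matches the plain listing");
static_assert(optimized_size(6, 4) == 23, "matches the optimized listing");

// Both layouts have the same size at 7 bytes; beyond that, the optimized one is smaller.
static_assert(optimized_size(6, 7) == plain_size(6, 7), "break-even at 7 bytes");
static_assert(optimized_size(6, 16) < plain_size(6, 16), "smaller for longer payloads");
```

Under these assumptions, the optimized layout costs 7 bytes of fixed overhead and saves one byte per payload byte, so it only becomes smaller once the payload exceeds 7 bytes.
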