# Binary Values

The library implements several [binary formats](binary_formats/index.md) that encode JSON in an efficient way. Most of these formats support binary values; that is, values whose semantics are defined outside the library and which only define a sequence of bytes to be stored.

JSON itself does not have a binary value. As such, binary values are an extension that this library implements to store values received by a binary format. Binary values are never created by the JSON parser, and are only part of a serialized JSON text if they have been created manually or via a binary format.

## API for binary values

```plantuml
class json::binary_t {
    -- setters --
    +void set_subtype(std::uint64_t subtype)
    +void clear_subtype()
    -- getters --
    +std::uint64_t subtype() const
    +bool has_subtype() const
}

"std::vector<uint8_t>" <|-- json::binary_t
```

By default, binary values are stored as `std::vector<std::uint8_t>`. This type can be changed by providing a template parameter to the `basic_json` type.
To store binary subtypes, the storage type is extended and exposed as `json::binary_t`:

```cpp
auto binary = json::binary_t({0xCA, 0xFE, 0xBA, 0xBE});
auto binary_with_subtype = json::binary_t({0xCA, 0xFE, 0xBA, 0xBE}, 42);
```

There are several convenience functions to check and set the subtype:

```cpp
binary.has_subtype();                   // returns false
binary_with_subtype.has_subtype();      // returns true

binary_with_subtype.clear_subtype();
binary_with_subtype.has_subtype();      // returns false

binary_with_subtype.set_subtype(42);
binary.set_subtype(23);

binary.subtype();                       // returns 23
```

As `json::binary_t` subclasses `std::vector<std::uint8_t>`, all of the vector's member functions are available:

```cpp
binary.size();  // returns 4
binary[1];      // returns 0xFE
```

JSON values can be constructed from `json::binary_t`:

```cpp
json j = binary;
```

Binary values are primitive values just like numbers or strings:

```cpp
j.is_binary();    // returns true
j.is_primitive(); // returns true
```

Given a binary JSON value, the `binary_t` can be accessed by reference via `get_binary()`:

```cpp
j.get_binary().has_subtype();  // returns true
j.get_binary().size();         // returns 4
```

For convenience, binary JSON values can be constructed via `json::binary`:

```cpp
auto j2 = json::binary({0xCA, 0xFE, 0xBA, 0xBE}, 23);
auto j3 = json::binary({0xCA, 0xFE, 0xBA, 0xBE});

j2 == j;                        // returns true
j3.get_binary().has_subtype();  // returns false
j3.get_binary().subtype();      // returns std::uint64_t(-1) as j3 has no subtype
```

## Serialization

Binary values are serialized differently depending on the format.

### JSON

JSON does not have a binary type, and this library does not introduce a new type as this would break conformance.
Instead, binary values are serialized as an object with two keys: `bytes` holds an array of integers, and `subtype` is an integer or `null`.

??? example

    Code:

    ```cpp
    // create a binary value of subtype 42
    json j;
    j["binary"] = json::binary({0xCA, 0xFE, 0xBA, 0xBE}, 42);

    // serialize to standard output
    std::cout << j.dump(2) << std::endl;
    ```

    Output:

    ```json
    {
      "binary": {
        "bytes": [202, 254, 186, 190],
        "subtype": 42
      }
    }
    ```

!!! warning "No roundtrip for binary values"

    The JSON parser will not parse the objects generated by binary values back to binary values. This is by design to remain standards compliant. Serializing binary values to JSON is only implemented for debugging purposes.

### BSON

[BSON](binary_formats/bson.md) supports binary values and subtypes. If a subtype is given, it is used and added as an unsigned 8-bit integer. If no subtype is given, the generic binary subtype 0x00 is used.

??? example

    Code:

    ```cpp
    // create a binary value of subtype 42
    json j;
    j["binary"] = json::binary({0xCA, 0xFE, 0xBA, 0xBE}, 42);

    // convert to BSON
    auto v = json::to_bson(j);
    ```

    `v` is a `std::vector<std::uint8_t>` with the following 22 elements:

    ```c
    0x16 0x00 0x00 0x00                         // number of bytes in the document
        0x05                                    // binary value
            0x62 0x69 0x6E 0x61 0x72 0x79 0x00  // key "binary" + null byte
            0x04 0x00 0x00 0x00                 // number of bytes
            0x2a                                // subtype
            0xCA 0xFE 0xBA 0xBE                 // content
    0x00                                        // end of the document
    ```

    Note that the serialization preserves the subtype, and deserializing `v` would yield the following value:

    ```json
    {
      "binary": {
        "bytes": [202, 254, 186, 190],
        "subtype": 42
      }
    }
    ```

### CBOR

[CBOR](binary_formats/cbor.md) supports binary values, but no subtypes. Subtypes will be serialized as tags.
Any binary value will be serialized as a byte string. The library will choose the smallest representation based on the length of the byte array.

??? example

    Code:

    ```cpp
    // create a binary value of subtype 42
    json j;
    j["binary"] = json::binary({0xCA, 0xFE, 0xBA, 0xBE}, 42);

    // convert to CBOR
    auto v = json::to_cbor(j);
    ```

    `v` is a `std::vector<std::uint8_t>` with the following 15 elements:

    ```c
    0xA1                                   // map(1)
        0x66                               // text(6)
            0x62 0x69 0x6E 0x61 0x72 0x79  // "binary"
        0xD8 0x2A                          // tag(42)
        0x44                               // bytes(4)
            0xCA 0xFE 0xBA 0xBE            // content
    ```

    Note that the subtype is serialized as a tag. However, parsing tagged values yields a parse error unless `json::cbor_tag_handler_t::ignore` or `json::cbor_tag_handler_t::store` is passed to `json::from_cbor`. With `ignore`, the tag is dropped and deserializing `v` yields the following value:

    ```json
    {
      "binary": {
        "bytes": [202, 254, 186, 190],
        "subtype": null
      }
    }
    ```

### MessagePack

[MessagePack](binary_formats/messagepack.md) supports binary values and subtypes. If a subtype is given, the ext family is used. The library will choose the smallest representation among fixext1, fixext2, fixext4, fixext8, ext8, ext16, and ext32. The subtype is then added as a signed 8-bit integer.

If no subtype is given, the bin family (bin8, bin16, bin32) is used.

??? example

    Code:

    ```cpp
    // create a binary value of subtype 42
    json j;
    j["binary"] = json::binary({0xCA, 0xFE, 0xBA, 0xBE}, 42);

    // convert to MessagePack
    auto v = json::to_msgpack(j);
    ```

    `v` is a `std::vector<std::uint8_t>` with the following 14 elements:

    ```c
    0x81                                   // fixmap1
        0xA6                               // fixstr6
            0x62 0x69 0x6E 0x61 0x72 0x79  // "binary"
        0xD6                               // fixext4
            0x2A                           // subtype
            0xCA 0xFE 0xBA 0xBE            // content
    ```

    Note that the serialization preserves the subtype, and deserializing `v` would yield the following value:

    ```json
    {
      "binary": {
        "bytes": [202, 254, 186, 190],
        "subtype": 42
      }
    }
    ```

### UBJSON

[UBJSON](binary_formats/ubjson.md) supports neither binary values nor subtypes, and proposes to serialize binary values as an array of uint8 values. This translation is implemented by the library.

??? example

    Code:

    ```cpp
    // create a binary value of subtype 42 (will be ignored in UBJSON)
    json j;
    j["binary"] = json::binary({0xCA, 0xFE, 0xBA, 0xBE}, 42);

    // convert to UBJSON
    auto v = json::to_ubjson(j);
    ```

    `v` is a `std::vector<std::uint8_t>` with the following 20 elements:

    ```c
    0x7B                                             // '{'
        0x69 0x06                                    // i 6 (length of the key)
        0x62 0x69 0x6E 0x61 0x72 0x79                // "binary"
        0x5B                                         // '['
            0x55 0xCA 0x55 0xFE 0x55 0xBA 0x55 0xBE  // content (each byte prefixed with 'U')
        0x5D                                         // ']'
    0x7D                                             // '}'
    ```

    The following code uses the type and size optimization for UBJSON:

    ```cpp
    // convert to UBJSON using the size and type optimization
    auto v = json::to_ubjson(j, true, true);
    ```

    The resulting vector has 23 elements; the optimization is not effective for examples with few values:

    ```c
    0x7B                                // '{'
        0x24                            // '$' type of the object elements
        0x5B                            // '[' array
        0x23 0x69 0x01                  // '#' i 1 number of object elements
        0x69 0x06                       // i 6 (length of the key)
        0x62 0x69 0x6E 0x61 0x72 0x79   // "binary"
            0x24 0x55                   // '$' 'U' type of the array elements: unsigned integers
            0x23 0x69 0x04              // '#' i 4 number of array elements
            0xCA 0xFE 0xBA 0xBE         // content
    ```

    Note that the subtype (42) is **not** serialized and that UBJSON has **no binary type**, and deserializing `v` would yield the following value:

    ```json
    {
      "binary": [202, 254, 186, 190]
    }
    ```