1# Random Access Compression: RAC 2 3Status: Draft (as of March 2019). There is no compatibility guarantee yet. 4 5 6## Overview 7 8The goal of the RAC file format is to compress a source file (the "decompressed 9file") such that, starting from the compressed file, it is possible to 10reconstruct the half-open byte range `[di .. dj)` of the decompressed file 11without always having to first decompress all of `[0 .. di)`. 12 13Conceptually, the decompressed file is partitioned into non-overlapping chunks. 14Each compressed chunk can be decompressed independently (although possibly 15sharing additional context, such as a LZ77 prefix dictionary). A RAC file also 16contains a hierarchical index of those chunks. 17 18RAC is a container format, and while it supports common compression codecs like 19Zlib, Brotli and Zstandard, it is not tied to any particular compression codec. 20 21Non-goals for version 1 include: 22 23 - Filesystem metadata like file names, file sizes and modification times. 24 - Multiple source files. 25 26There is the capability (see reserved `TTag`s, below) but no promise to address 27these in a future RAC version. There might not be a need to, as other designs 28such as EROFS ([Extendable Read-Only File 29System](https://lkml.org/lkml/2018/5/31/306)) already exist. 30 31Non-goals in general include: 32 33 - Encryption. 34 - Streaming decompression. 35 36 37### Related Work 38 39In general, RAC differs from most other archive or compression formats in a 40number of ways. This doesn't mean that RAC is necessarily better or worse than 41these other designs, just that different designs make different trade-offs. For 42example, supporting shared dictionaries means giving up streaming (one pass) 43decoding. 44 45 - RAC files must start with a magic sequence, but RAC still supports 46 append-only modifications. 47 - RAC seek points are identified by numbers (i.e. `DOffset`s), not strings 48 (i.e. file names). Some other formats' seek points are positioned only at 49 source file boundaries. They cannot seek to a relative offset (in what RAC 50 calls `DSpace`) within a source file. 51 - RAC supports [shared compression 52 dictionaries](http://fastcompression.blogspot.com/2018/02/when-to-use-dictionary-compression.html) 53 embedded within an archive. 54 55These points apply to established archive formats like 56[Tar](https://en.wikipedia.org/wiki/Tar_%28computing%29) and 57[Zip](https://en.wikipedia.org/wiki/Zip_%28file_format%29), and newer archive 58formats like [Śiva](https://blog.sourced.tech/post/siva/). 59 60[Riegeli](https://github.com/google/riegeli) is a sequence-of-records format. A 61RAC file arguably contains what Riegeli calls records (which are not files, as 62they don't have names) and both formats support numerical seek points, and 63multiple compression codecs, but Riegeli seeks to a point in what RAC calls 64`CSpace`, whereas RAC seeks to a point in `DSpace`. 65 66[XFLATE](https://github.com/dsnet/compress/blob/master/doc/xflate-format.pdf) 67supports random access like RAC, but is tied to the DEFLATE family of 68compression codecs. 69 70[XZ](https://tukaani.org/xz/format.html) supports random access like RAC, but 71as one of its explicit goals is to support streaming decoding, it does not 72support shared dictionaries. Also, unlike RAC, finding the record that contains 73any given `DOffset` requires a linear scan of all previous records, even if 74those records don't need decompressing. 75 76[QCOW](https://github.com/qemu/qemu/blob/master/docs/interop/qcow2.txt) 77supports random access like RAC, but compression does not seem to be the 78foremost concern. That specification mentions compression but does not give any 79particular codec. A [separate 80document](https://people.gnome.org/~markmc/qcow-image-format.html) suggests 81that the codec is Zlib (with no other option), and that it does not support 82shared dictionaries. 83 84The 85[`examples/zran.c`](https://github.com/madler/zlib/blob/master/examples/zran.c) 86program in the zlib-the-library source code repository builds an in-memory 87index, not a persistent (disk-backed) one, and building the index involves 88first decompressing the whole file. It also requires storing (in memory) the 89entire state of the decompressor, including 32 KiB of history, per seek point. 90For coarse grained seek points (e.g. once every 1 MiB), that overhead can be 91acceptable, but for fine grained seek points (e.g. once every 16 KiB), that 92overhead is prohibitive. 93 94RAC differs from the [LevelDB 95Table](https://github.com/google/leveldb/blob/master/doc/table_format.md) 96format, even if the LevelDB Table string keys are re-purposed to encode both 97numerical `DOffset` seek points and identifiers for shared compression 98dictionaries. A LevelDB Table index is flat, unlike RAC's hierarchical index. 99In both cases, a maliciously written index could contain multiple entries for 100the same key, and different decoders could therefore produce different output 101for the same source file - a potential security vulnerability. For LevelDB 102Tables, verifying an index key's uniqueness requires scanning every key in the 103file, which can be relatively expensive, and is therefore not done in practice. 104In comparison, RAC's "Branch Node Validation" process (see below) ensures that 105exactly one `Leaf Node` contains any given byte offset `di` in the decompressed 106file, and the RAC index's hierarchical nature places a scalable upper bound on 107the scanning cost of verifying that, based on that portion of the index tree 108visited for any given request `[di .. dj)`. 109 110 111### Glossary 112 113 - `CBias` is the delta added to a `CPointer` to produce a `COffset`. 114 - `DBias` is the delta added to a `DPointer` to produce a `DOffset`. 115 - `CFile` is the compressed file. 116 - `DFile` is the decompressed file. 117 - `CFileSize` is the size of the `CFile`. 118 - `DFileSize` is the size of the `DFile`. 119 - `COffset` is a byte offset in `CSpace`. 120 - `DOffset` is a byte offset in `DSpace`. 121 - `CPointer` is a relative `COffset`, prior to bias-correction. 122 - `DPointer` is a relative `DOffset`, prior to bias-correction. 123 - `CRange` is a `Range` in `CSpace`. 124 - `DRange` is a `Range` in `DSpace`. 125 - `CSpace` means that byte offsets refer to the `CFile`. 126 - `DSpace` means that byte offsets refer to the `DFile`. 127 128`Range` is a pair of byte offsets `[i .. j)`, in either `CSpace` or `DSpace`. 129It is half-open, containing every byte offset `x` such that `(i <= x)` and `(x 130< j)`. It is invalid to have `(i > j)`. The size of a `Range` equals `(j - i)`. 131 132All bytes are 8 bits and unless explicitly specified otherwise, all fixed-size 133unsigned integers (e.g. `uint32_t`, `uint64_t`) are encoded little-endian. 134Within those unsigned integers, bit 0 is the least significant bit and e.g. bit 13531 is the most significant bit of a `uint32_t`. 136 137The maximum supported `CFileSize` and the maximum supported `DFileSize` are the 138same number: `0x0000_FFFF_FFFF_FFFF`, which is `((1 << 48) - 1)`. 139 140 141## File Structure 142 143A RAC file (the `CFile`) must be at least 4 bytes long, and start with the 3 144byte `Magic` (see below), so that no valid RAC file can also be e.g. a valid 145JPEG file. The fourth byte is examined in the process described by the "Root 146Node at the CFile Start" section, below. 147 148The `CFile` contains a tree of `Node`s. Each `Node` is either a `Branch Node` 149(pointing to between 1 and 255 child `Node`s) or a `Leaf Node`. There must be 150at least one `Branch Node`, called the `Root Node`. Parsing a `CFile` requires 151knowing the `CFileSize` in order to identify the `Root Node`, which is either 152at the start or the end of the `CFile`. 153 154Each `Node` has a `DRange`. An empty `DRange` means that the `Node` contains 155metadata or other decompression context such as a shared dictionary. 156 157Each `Leaf Node` also has 3 `CRange`s (`Primary`, `Secondary` and `Tertiary`), 158any or all of which may be empty. The contents of the `CFile`, within those 159`CRange`s, are decompressed according to the `Codec` (see below) to reconstruct 160that part of the `DFile` within the `Leaf Node`'s `DRange`. 161 162 163## Branch Nodes 164 165A `Branch Node`'s encoding in the `CFile` has a variable byte size, between 32 166and 4096 inclusive, depending on its number of children. Specifically, it 167occupies `((Arity * 16) + 16)` bytes, grouped into 8 byte segments (but not 168necessarily 8 byte aligned), starting at a `COffset` called its `Branch 169COffset`: 170 171 +-+-+-+-+-+-+-+-+ 172 |0|1|2|3|4|5|6|7| 173 +-+-+-+-+-+-+-+-+ 174 |Magic|A|Che|0|T| Magic, Arity, Checksum, Reserved (0), TTag[0] 175 | DPtr[1] |0|T| DPtr[1], Reserved (0), TTag[1] 176 | DPtr[2] |0|T| DPtr[2], Reserved (0), TTag[2] 177 | ... |0|.| ..., Reserved (0), ... 178 | DPtr[A-2] |0|T| DPtr[Arity-2], Reserved (0), TTag[Arity-2] 179 | DPtr[A-1] |0|T| DPtr[Arity-1], Reserved (0), TTag[Arity-1] 180 | DPtr[A] |0|C| DPtr[Arity] a.k.a. DPtrMax, Reserved (0), Codec 181 | CPtr[0] |L|S| CPtr[0], CLen[0], STag[0] 182 | CPtr[1] |L|S| CPtr[1], CLen[1], STag[1] 183 | CPtr[2] |L|S| CPtr[2], CLen[2], STag[2] 184 | ... |L|.| ..., ..., ... 185 | CPtr[A-2] |L|S| CPtr[Arity-2], CLen[Arity-2], STag[Arity-2] 186 | CPtr[A-1] |L|S| CPtr[Arity-1], CLen[Arity-2], STag[Arity-1] 187 | CPtr[A] |V|A| CPtr[Arity] a.k.a. CPtrMax, Version, Arity 188 +-+-+-+-+-+-+-+-+ 189 190For the `(XPtr | Other6 | Other7)` 8 byte fields, the `XPtr` occupies the low 19148 bits (as a little-endian `uint64_t`) and the `Other` fields occupy the high 19216 bits. 193 194The `CPtr` and `DPtr` values are what is explicitly written in the `CFile`'s 195bytes. These are added to a `Branch Node`'s implicit `Branch CBias` and `Branch 196DBias` values to give the implicit `COff` and `DOff` values: `COff[i]` and 197`DOff[i]` are defined to be `(Branch_CBias + CPtr[i])` and `(Branch_DBias + 198DPtr[i])`. 199 200`CPtrMax` is another name for `CPtr[Arity]`, and `COffMax` is defined to be 201`(Branch_CBias + CPtrMax)`. Likewise for `DPtrMax` and `DOffMax`. 202 203The `DPtr[0]` value is implicit, and always equals zero, so that `DOff[0]` 204always equals the `Branch DBias`. 205 206 - For the `Root Node`, the `DPtrMax` also sets the `DFileSize`. The `Branch 207 CBias` and `Branch DBias` are both zero. The `Branch COffset` is determined 208 by the "Root Node" section below. 209 - For a child `Branch Node`, the `Branch COffset`, `Branch CBias` and `Branch 210 DBias` are given by the parent `Branch Node`. See the "Search Within a 211 Branch Node" section below. 212 213 214### Magic 215 216`Magic` is the three bytes `"\x72\xC3\x63"`, which is invalid UTF-8 but is 217`"rÃc"` in ISO 8859-1. The tilde isn't particularly meaningful, other than 218`"rÃc"` being a nonsensical word (with nonsensical capitalization) that is 219unlikely to appear in other files. 220 221Every `Branch Node` must start with these `Magic` bytes, not just a `Branch 222Node` positioned at the start of the `CFile`. 223 224 225### Arity 226 227`Arity` is the `Branch Node`'s number of children. Zero is invalid. 228 229The `Arity` byte is given twice: the fourth byte and the final byte of the 230`Branch Node`. The two values must match. 231 232The repetition lets a RAC reader determine the size of the `Branch Node` data 233(as the size depends on the `Arity`), given either its start or its end offset 234in `CSpace`. For almost all `Branch Node`s, we will know its start offset (its 235`Branch COffset`), but for a `Root Node` at the end of a `CFile`, we will only 236know its end offset. 237 238 239### Checksum 240 241`Checksum` is a checksum of the `Branch Node`'s bytes. It is not a checksum of 242the `CFile` or `DFile` contents pointed to by a `Branch Node`. Content 243checksums are a `Codec`-specific consideration. 244 245The little-endian `uint16_t` `Checksum` value is the low 16 bits XOR'ed with 246the high 16 bits of the `uint32_t` CRC-32 IEEE checksum of the `((Arity * 16) + 24710)` bytes immediately after the `Checksum`. The 4 bytes immediately before the 248`Checksum` are not considered: the `Magic` bytes have only one valid value and 249the `Arity` byte near the start is replicated by the `Arity` byte at the end. 250 251 252### Reserved (0) 253 254The `Reserved (0)` bytes must have the value `0x00`. 255 256 257### COffs and DOffs, STags and TTags 258 259For every `a` in the half-open range `[0 .. Arity)`, the `a`'th child `Node` 260has two tags, `STag[a]` and `TTag[a]`, and a `DRange` of `[DOff[a] .. 261DOff[a+1])`. The `DOff` values must be non-decreasing: see the "Branch Node 262Validation" section below. 263 264A `TTag[a]` of `0xFE` means that child is a `Branch Node`. A `TTag[a]` in the 265half-open range `[0xC0 .. 0xFE)` is reserved. Otherwise, the child is a `Leaf 266Node`. 267 268A child `Branch Node`'s `SubBranch COffset` is defined to be `COff[a]`. Its 269`SubBranch DBias` and `SubBranch DOffMax` are defined to be `DOff[a]` and 270`DOff[a+1]. 271 272 - When `(STag[a] < Arity)`, it is a `CBiasing Branch Node`. The `SubBranch 273 CBias` is defined to be `(Branch_CBias + CPtr[STag[a]])`. This expression 274 is equivalent to `COff[STag[a]]`. 275 - When `(STag[a] >= Arity)`, it is a `CNeutral Branch Node`. The `SubBranch 276 CBias` is defined to be `(Branch_CBias)`. 277 278A child `Leaf Node`'s `STag[a]` and `TTag[a]` values are also called its `Leaf 279STag` and `Leaf TTag`. It also has: 280 281 - A `Primary CRange`, equal to `MakeCRange(a)`. 282 - A `Secondary CRange`, equal to `MakeCRange(STag[a])`. 283 - A `Tertiary CRange`, equal to `MakeCRange(TTag[a])`. 284 285The `MakeCRange(i)` function defines a `CRange`. If `(i >= Arity)` then that 286`CRange` is the empty range `[COffMax .. COffMax)`. Otherwise, the lower bound 287is `COff[i]` and the upper bound is: 288 289 - `COffMax` when `CLen[i]` is zero. 290 - The minimum of `COffMax` and `(COff[i] + (CLen[i] * 1024))` when `CLen[i]` 291 is non-zero. 292 293In other words, the `COffMax` value clamps the `CRange` upper bound. The `CLen` 294value, if non-zero, combines with the `COff` value to apply another clamp. The 295`CLen` is given in units of 1024 bytes, but the `(COff[i] + (CLen[i] * 1024))` 296value is not necessarily quantized to 1024 byte boundaries. 297 298Note that, since `Arity` is at most 255, an `STag[a]` of `0xFF` always results 299in a `CNeutral Branch Node` or an empty `Secondary CRange`. Likewise, a 300`TTag[a]` of `0xFF` always results in an empty `Tertiary CRange`. 301 302 303### COffMax 304 305`COffMax` is an inclusive upper bound on every `COff` in a `Branch Node` and in 306its descendent `Branch Node`s. A child `Branch Node` must not have a larger 307`COffMax` than the parent `Branch Node`'s `COffMax`, and the `Root Node`'s 308`COffMax` must equal the `CFileSize`. See the "Branch Node Validation" section 309below. 310 311A RAC file can therefore be incrementally modified, if the RAC writer only 312appends new `CFile` bytes and does not re-write existing `CFile` bytes, so that 313the `CFileSize` increases. Even if the old (smaller) RAC file's `Root Node` was 314at the `CFile` start, the new (larger) `CFileSize` means that those starting 315bytes are an obsolete `Root Node` (but still a valid `Branch Node`). The new 316`Root Node` is therefore located at the end of the new RAC file. 317 318Concatenating RAC files (concatenating in `DSpace`) involves concatenating the 319RAC files in `CSpace` and then appending a new `Root Node` with `CBiasing 320Branch Node`s pointing to each source RAC file's `Root Node`. 321 322 323### Version 324 325`Version` must have the value `0x01`, indicating version 1 of the RAC format. 326 327 328### Codec 329 330`Codec`s define specializations of RAC, such as "RAC + Zlib" or "RAC + Brotli". 331It is valid for a "RAC + Zstandard" only decoder to reject a "RAC + Brotli" 332file, even if it is a valid RAC file. Recall that RAC is just a container, and 333not tied to any particular compression codec. For the `Codec` byte in a `Branch 334Node`: 335 336 - `0x00` is reserved. 337 - `0x01` means "RAC + Zlib". 338 - `0x02` means "RAC + Brotli". 339 - `0x04` means "RAC + ZStandard". 340 - Any other value less than 0x08 means that all of this `Branch Node`'s 341 children must be `Branch Node`s and not `Leaf Node`s and that no child's 342 `Codec` byte can have a bit set that is not set in this `Codec` byte. 343 - All other values, 0x08 or greater, are reserved. 344 345 346### Branch Node Validation 347 348The first time that a RAC reader visits any particular `Branch Node`, it must 349check that the `Magic` matches, the two `Arity` values match and are non-zero, 350the computed checksum matches the listed `Checksum` and that the RAC reader 351accepts the `Version` and the `Codec`. 352 353It must also check that all of its `DOff` values are sorted: `(DOff[a] <= 354DOff[a+1])` for every `a` in the half-open range `[0 .. Arity)`. By induction, 355this means that all of its `DOff` values do not exceed `DOffMax`, and again by 356induction, therefore do not exceed `DFileSize`. 357 358It must also check that all of its `COff` values do not exceed `COffMax` (and 359again by induction, therefore do not exceed `CFileSize`). Other than that, 360`COff` values do not have to be sorted: successive `Node`s (in `DSpace`) can be 361out of order (in `CSpace`), allowing for incrementally modified RAC files. 362 363For the `Root Node`, its `COffMax` must equal the `CFileSize`. Recall that 364parsing a `CFile` requires knowing the `CFileSize`, and also that a `Root 365Node`'s `Branch CBias` is zero, so its `COffMax` equals its `CPtrMax`. 366 367For a child `Branch Node`, its `Codec` bits must be a subset of its parent's 368`Codec` bits, its `COffMax` must be less than or equal to its parent's 369`COffMax`, and its `DOffMax` must equal its parent's `SubBranch DOffMax`. The 370`DOffMax` condition is equivalent to checking that the parent and child agree 371on the child's size in `DSpace`. The parent states that it is its `(DPtr[a+1] - 372DPtr[a])` and the child states that it is its `DPtrMax`. 373 374One conservative way to check `Branch Node`s' validity on first visit is to 375check them on every visit, as validating any particular `Branch Node` is 376idempotent, but other ways are acceptable. 377 378 379## Root Node 380 381The `Root Node` might be at the start of the `CFile`, as this might optimize 382alignment of `Branch Node`s and of `CRange`s. All `Branch Node`s' sizes are 383multiples of 16 bytes, and a maximal `Branch Node` is exactly 4096 bytes. 384 385The `Root Node` might be at the end of the `CFile`, as this allows one-pass 386(streaming) encoding of a RAC file. It also allows appending to, concatenating 387or incrementally modifying existing RAC files relatively cheaply. 388 389To find the `Root Node`, first look for a valid `Root Node` at the `CFile` 390start. If and only if that fails, look at the `CFile` end. If that also fails, 391it is not a valid RAC file. 392 393 394### Root Node at the CFile Start 395 396The fourth byte of the `CFile` gives the `Arity`, assuming the `Root Node` is 397at the `CFile` start. If it is zero, fail over to the `CFile` end. A RAC writer 398that does not want to place the `Root Node` at the `CFile` start should 399intentionally write a zero to the fourth byte. 400 401Otherwise, that `Arity` defines the size in bytes of any starting `Root Node`: 402`((Arity * 16) + 16)`. If the `CFileSize` is less than this size, fail over to 403the `CFile` end. 404 405If those starting bytes form a valid `Root Node` (as per the "Branch Node 406Validation" section), including having a `CPtrMax` equal to the `CFileSize`, 407then we have indeed found our `Root Node`: its `Branch COffset` is zero. 408Otherwise, fail over to the `CFile` end. 409 410 411### Root Node at the CFile End 412 413If there is no valid `Root Node` at the `CFile` start then the last byte of the 414`CFile` gives the `Root Node`'s `Arity`. This does not necessarily need to 415match the fourth byte of the `CFile`. 416 417As before, that `Arity` defines the size in bytes of any ending `Root Node`: 418`((Arity * 16) + 16)`. If the `CFileSize` is less than this size, it is not a 419valid RAC file. 420 421If those ending bytes form a valid `Root Node` (as per the "Branch Node 422Validation" section), including having a `CPtrMax` equal to the `CFileSize`, 423then we have indeed found our `Root Node`: its `Branch COffset` is the 424`CFileSize` minus the size implied by the `Arity`. Otherwise, it is not a valid 425RAC file. 426 427 428## DRange Reconstruction 429 430To reconstruct the `DRange` `[di .. dj)`, if that `DRange` is empty then the 431request is trivially satisfied. 432 433Otherwise, if `(dj > DFileSize)` then reject the request. 434 435Otherwise, start at the `Root Node` and continue to the next section to find 436the `Leaf Node` containing the `DOffset` `di`. 437 438 439### Search Within a Branch Node 440 441Load (and validate) the `Branch Node` given its `Branch COffset`, `Branch 442CBias` and `Branch DBias`. 443 444Find the largest child index `a` such that `(a < Arity)` and `(DOff[a] <= di)` 445and `(DOff[a+1] > di)`, then examine `TTag[a]` to see if the child is a `Leaf 446Node`. If so, skip to the next section. 447 448For a `Branch Node` child, let `CRemaining` equal this `Branch Node`'s (the 449parents') `COffMax` minus the `SubBranch COffset`. It invalid for `CRemaining` 450to be less than 4, or to be less than the child's size implied by the child's 451`Arity` byte at a `COffset` equal to `(SubBranch_COffset + 3)`. 452 453The `SubBranch COffset`, `SubBranch CBias` and `SubBranch DBias` from the 454parent `Branch Node` become the `Branch COffset`, `Branch CBias` and `Branch 455DBias` of the child `Branch Node`. In order to rule out infinite loops, at 456least one of these two conditions must hold: 457 458 - The child's `Branch COffset` is less than the parent's `Branch COffset`. 459 - The child's `DPtrMax` is less than the parent's `DPtrMax`. 460 461It is valid for one of those conditions to hold between a parent-child pair and 462the other condition to hold between the next parent-child pair. 463 464Repeat this "Search Within a Branch Node" section with the child `Branch Node`. 465 466 467### Decompressing a Leaf Node 468 469If a `Leaf Node`'s `DRange` is empty, decompression is a no-op and skip the 470rest of this section. 471 472Otherwise, decompression combines the `Primary CRange`, `Secondary CRange` and 473`Tertiary CRange` slices of the `CFile`, and the `Leaf STag` and `Leaf TTag` 474values, in a `Codec`-specific way to produce decompressed data. 475 476There are two general principles, although specific `Codec`s can overrule: 477 478 - The `Codec` may produce fewer bytes than the `DRange` size. In that case, 479 the remaining bytes (in `DSpace`) are set to NUL (`memset` to zero). 480 - The `Codec` may consume fewer bytes than each `CRange` size, and the 481 compressed data typically contains an explicit "end of data" marker. In 482 that case, the remaining bytes (in `CSpace`) are ignored. Padding allows 483 `COff` values to optionally be aligned. 484 485It is invalid to produce more bytes than the `DRange` size or to consume more 486bytes than each `CRange` size. 487 488 489### Continuing the Reconstruction 490 491If decompressing that `Leaf Node` did not fully reconstruct `[di .. dj)`, 492advance through the `Node` tree in depth first search order, decompressing 493every `Leaf Node` along the way, until we have gone up to or beyond `dj`. 494 495During that traversal, `Node`s with an empty `DRange` should be skipped, even 496if they are `Branch Node`s. 497 498 499# Specific Codecs 500 501 502## RAC + Zlib 503 504The `CFile` data in the `Leaf Node`'s `Primary CRange` is decompressed as Zlib 505(RFC 1950), possibly referencing a LZ77 prefix dictionary (what the RFC calls a 506"preset dictionary"). 507 508If a `Leaf Node`'s `Secondary CRange` is empty then there is no dictionary. 509Otherwise, the `Secondary CRange` must be at least 6 bytes long: 510 511 - 2 byte little-endian `uint16_t` `Dictionary Length`. 512 - `Dictionary Length` bytes `Dictionary`. 513 - 4 byte big-endian `uint32_t` `Dictionary Checksum`. 514 - Padding (ignored). 515 516The `Dictionary Checksum` is Zlib's Adler32 checksum over the `Dictionary`'s 517bytes (excluding the initial 2 bytes for the `Dictionary Length`). The checksum 518is stored big-endian, like Zlib's other checksums, and its 4 byte value must 519match the `DICTID` (in RFC 1950 terminology) given in the `Primary CRange`'s 520Zlib-formatted data. 521 522The `Leaf TTag` must be `0xFF`. All other `Leaf TTag` values (below `0xC0`) are 523reserved. The empty `Tertiary CRange` is ignored. The `Leaf STag` value is also 524ignored, other than deriving the `Secondary CRange`. 525 526 527## RAC + Brotli 528 529TODO. 530 531 532## RAC + Zstandard 533 534TODO. 535 536 537# Examples 538 539TODO (zlib). 540 541 542# Acknowledgements 543 544I (Nigel Tao) thank Robert Obryk, Sanjay Ghemawat and Sean Klein for their 545review. 546 547 548--- 549 550Updated on April 2019. 551