• Home
  • Line#
  • Scopes#
  • Navigate#
  • Raw
  • Download
1# Random Access Compression: RAC
2
3Status: Draft (as of March 2019). There is no compatibility guarantee yet.
4
5
6## Overview
7
8The goal of the RAC file format is to compress a source file (the "decompressed
9file") such that, starting from the compressed file, it is possible to
10reconstruct the half-open byte range `[di .. dj)` of the decompressed file
11without always having to first decompress all of `[0 .. di)`.
12
13Conceptually, the decompressed file is partitioned into non-overlapping chunks.
14Each compressed chunk can be decompressed independently (although possibly
15sharing additional context, such as a LZ77 prefix dictionary). A RAC file also
16contains a hierarchical index of those chunks.
17
18RAC is a container format, and while it supports common compression codecs like
19Zlib, Brotli and Zstandard, it is not tied to any particular compression codec.
20
21Non-goals for version 1 include:
22
23  - Filesystem metadata like file names, file sizes and modification times.
24  - Multiple source files.
25
26There is the capability (see reserved `TTag`s, below) but no promise to address
27these in a future RAC version. There might not be a need to, as other designs
28such as EROFS ([Extendable Read-Only File
29System](https://lkml.org/lkml/2018/5/31/306)) already exist.
30
31Non-goals in general include:
32
33  - Encryption.
34  - Streaming decompression.
35
36
37### Related Work
38
39In general, RAC differs from most other archive or compression formats in a
40number of ways. This doesn't mean that RAC is necessarily better or worse than
41these other designs, just that different designs make different trade-offs. For
42example, supporting shared dictionaries means giving up streaming (one pass)
43decoding.
44
45  - RAC files must start with a magic sequence, but RAC still supports
46    append-only modifications.
47  - RAC seek points are identified by numbers (i.e. `DOffset`s), not strings
48    (i.e. file names). Some other formats' seek points are positioned only at
49    source file boundaries. They cannot seek to a relative offset (in what RAC
50    calls `DSpace`) within a source file.
51  - RAC supports [shared compression
52    dictionaries](http://fastcompression.blogspot.com/2018/02/when-to-use-dictionary-compression.html)
53    embedded within an archive.
54
55These points apply to established archive formats like
56[Tar](https://en.wikipedia.org/wiki/Tar_%28computing%29) and
57[Zip](https://en.wikipedia.org/wiki/Zip_%28file_format%29), and newer archive
58formats like [Śiva](https://blog.sourced.tech/post/siva/).
59
60[Riegeli](https://github.com/google/riegeli) is a sequence-of-records format. A
61RAC file arguably contains what Riegeli calls records (which are not files, as
62they don't have names) and both formats support numerical seek points, and
63multiple compression codecs, but Riegeli seeks to a point in what RAC calls
64`CSpace`, whereas RAC seeks to a point in `DSpace`.
65
66[XFLATE](https://github.com/dsnet/compress/blob/master/doc/xflate-format.pdf)
67supports random access like RAC, but is tied to the DEFLATE family of
68compression codecs.
69
70[XZ](https://tukaani.org/xz/format.html) supports random access like RAC, but
71as one of its explicit goals is to support streaming decoding, it does not
72support shared dictionaries. Also, unlike RAC, finding the record that contains
73any given `DOffset` requires a linear scan of all previous records, even if
74those records don't need decompressing.
75
76[QCOW](https://github.com/qemu/qemu/blob/master/docs/interop/qcow2.txt)
77supports random access like RAC, but compression does not seem to be the
78foremost concern. That specification mentions compression but does not give any
79particular codec. A [separate
80document](https://people.gnome.org/~markmc/qcow-image-format.html) suggests
81that the codec is Zlib (with no other option), and that it does not support
82shared dictionaries.
83
84The
85[`examples/zran.c`](https://github.com/madler/zlib/blob/master/examples/zran.c)
86program in the zlib-the-library source code repository builds an in-memory
87index, not a persistent (disk-backed) one, and building the index involves
88first decompressing the whole file. It also requires storing (in memory) the
89entire state of the decompressor, including 32 KiB of history, per seek point.
90For coarse grained seek points (e.g. once every 1 MiB), that overhead can be
91acceptable, but for fine grained seek points (e.g. once every 16 KiB), that
92overhead is prohibitive.
93
94RAC differs from the [LevelDB
95Table](https://github.com/google/leveldb/blob/master/doc/table_format.md)
96format, even if the LevelDB Table string keys are re-purposed to encode both
97numerical `DOffset` seek points and identifiers for shared compression
98dictionaries. A LevelDB Table index is flat, unlike RAC's hierarchical index.
99In both cases, a maliciously written index could contain multiple entries for
100the same key, and different decoders could therefore produce different output
101for the same source file - a potential security vulnerability. For LevelDB
102Tables, verifying an index key's uniqueness requires scanning every key in the
103file, which can be relatively expensive, and is therefore not done in practice.
104In comparison, RAC's "Branch Node Validation" process (see below) ensures that
105exactly one `Leaf Node` contains any given byte offset `di` in the decompressed
106file, and the RAC index's hierarchical nature places a scalable upper bound on
107the scanning cost of verifying that, based on that portion of the index tree
108visited for any given request `[di .. dj)`.
109
110
111### Glossary
112
113  - `CBias` is the delta added to a `CPointer` to produce a `COffset`.
114  - `DBias` is the delta added to a `DPointer` to produce a `DOffset`.
115  - `CFile` is the compressed file.
116  - `DFile` is the decompressed file.
117  - `CFileSize` is the size of the `CFile`.
118  - `DFileSize` is the size of the `DFile`.
119  - `COffset` is a byte offset in `CSpace`.
120  - `DOffset` is a byte offset in `DSpace`.
121  - `CPointer` is a relative `COffset`, prior to bias-correction.
122  - `DPointer` is a relative `DOffset`, prior to bias-correction.
123  - `CRange` is a `Range` in `CSpace`.
124  - `DRange` is a `Range` in `DSpace`.
125  - `CSpace` means that byte offsets refer to the `CFile`.
126  - `DSpace` means that byte offsets refer to the `DFile`.
127
128`Range` is a pair of byte offsets `[i .. j)`, in either `CSpace` or `DSpace`.
129It is half-open, containing every byte offset `x` such that `(i <= x)` and `(x
130< j)`. It is invalid to have `(i > j)`. The size of a `Range` equals `(j - i)`.
131
132All bytes are 8 bits and unless explicitly specified otherwise, all fixed-size
133unsigned integers (e.g. `uint32_t`, `uint64_t`) are encoded little-endian.
134Within those unsigned integers, bit 0 is the least significant bit and e.g. bit
13531 is the most significant bit of a `uint32_t`.
136
137The maximum supported `CFileSize` and the maximum supported `DFileSize` are the
138same number: `0x0000_FFFF_FFFF_FFFF`, which is `((1 << 48) - 1)`.
139
140
141## File Structure
142
143A RAC file (the `CFile`) must be at least 4 bytes long, and start with the 3
144byte `Magic` (see below), so that no valid RAC file can also be e.g. a valid
145JPEG file. The fourth byte is examined in the process described by the "Root
146Node at the CFile Start" section, below.
147
148The `CFile` contains a tree of `Node`s. Each `Node` is either a `Branch Node`
149(pointing to between 1 and 255 child `Node`s) or a `Leaf Node`. There must be
150at least one `Branch Node`, called the `Root Node`. Parsing a `CFile` requires
151knowing the `CFileSize` in order to identify the `Root Node`, which is either
152at the start or the end of the `CFile`.
153
154Each `Node` has a `DRange`. An empty `DRange` means that the `Node` contains
155metadata or other decompression context such as a shared dictionary.
156
157Each `Leaf Node` also has 3 `CRange`s (`Primary`, `Secondary` and `Tertiary`),
158any or all of which may be empty. The contents of the `CFile`, within those
159`CRange`s, are decompressed according to the `Codec` (see below) to reconstruct
160that part of the `DFile` within the `Leaf Node`'s `DRange`.
161
162
163## Branch Nodes
164
165A `Branch Node`'s encoding in the `CFile` has a variable byte size, between 32
166and 4096 inclusive, depending on its number of children. Specifically, it
167occupies `((Arity * 16) + 16)` bytes, grouped into 8 byte segments (but not
168necessarily 8 byte aligned), starting at a `COffset` called its `Branch
169COffset`:
170
171    +-+-+-+-+-+-+-+-+
172    |0|1|2|3|4|5|6|7|
173    +-+-+-+-+-+-+-+-+
174    |Magic|A|Che|0|T|  Magic, Arity, Checksum,     Reserved (0),  TTag[0]
175    | DPtr[1]   |0|T|  DPtr[1],                    Reserved (0),  TTag[1]
176    | DPtr[2]   |0|T|  DPtr[2],                    Reserved (0),  TTag[2]
177    | ...       |0|.|  ...,                        Reserved (0),  ...
178    | DPtr[A-2] |0|T|  DPtr[Arity-2],              Reserved (0),  TTag[Arity-2]
179    | DPtr[A-1] |0|T|  DPtr[Arity-1],              Reserved (0),  TTag[Arity-1]
180    | DPtr[A]   |0|C|  DPtr[Arity] a.k.a. DPtrMax, Reserved (0),  Codec
181    | CPtr[0]   |L|S|  CPtr[0],                    CLen[0],       STag[0]
182    | CPtr[1]   |L|S|  CPtr[1],                    CLen[1],       STag[1]
183    | CPtr[2]   |L|S|  CPtr[2],                    CLen[2],       STag[2]
184    | ...       |L|.|  ...,                        ...,           ...
185    | CPtr[A-2] |L|S|  CPtr[Arity-2],              CLen[Arity-2], STag[Arity-2]
186    | CPtr[A-1] |L|S|  CPtr[Arity-1],              CLen[Arity-2], STag[Arity-1]
187    | CPtr[A]   |V|A|  CPtr[Arity] a.k.a. CPtrMax, Version,       Arity
188    +-+-+-+-+-+-+-+-+
189
190For the `(XPtr | Other6 | Other7)` 8 byte fields, the `XPtr` occupies the low
19148 bits (as a little-endian `uint64_t`) and the `Other` fields occupy the high
19216 bits.
193
194The `CPtr` and `DPtr` values are what is explicitly written in the `CFile`'s
195bytes. These are added to a `Branch Node`'s implicit `Branch CBias` and `Branch
196DBias` values to give the implicit `COff` and `DOff` values: `COff[i]` and
197`DOff[i]` are defined to be `(Branch_CBias + CPtr[i])` and `(Branch_DBias +
198DPtr[i])`.
199
200`CPtrMax` is another name for `CPtr[Arity]`, and `COffMax` is defined to be
201`(Branch_CBias + CPtrMax)`. Likewise for `DPtrMax` and `DOffMax`.
202
203The `DPtr[0]` value is implicit, and always equals zero, so that `DOff[0]`
204always equals the `Branch DBias`.
205
206  - For the `Root Node`, the `DPtrMax` also sets the `DFileSize`. The `Branch
207    CBias` and `Branch DBias` are both zero. The `Branch COffset` is determined
208    by the "Root Node" section below.
209  - For a child `Branch Node`, the `Branch COffset`, `Branch CBias` and `Branch
210    DBias` are given by the parent `Branch Node`. See the "Search Within a
211    Branch Node" section below.
212
213
214### Magic
215
216`Magic` is the three bytes `"\x72\xC3\x63"`, which is invalid UTF-8 but is
217`"rÃc"` in ISO 8859-1. The tilde isn't particularly meaningful, other than
218`"rÃc"` being a nonsensical word (with nonsensical capitalization) that is
219unlikely to appear in other files.
220
221Every `Branch Node` must start with these `Magic` bytes, not just a `Branch
222Node` positioned at the start of the `CFile`.
223
224
225### Arity
226
227`Arity` is the `Branch Node`'s number of children. Zero is invalid.
228
229The `Arity` byte is given twice: the fourth byte and the final byte of the
230`Branch Node`. The two values must match.
231
232The repetition lets a RAC reader determine the size of the `Branch Node` data
233(as the size depends on the `Arity`), given either its start or its end offset
234in `CSpace`. For almost all `Branch Node`s, we will know its start offset (its
235`Branch COffset`), but for a `Root Node` at the end of a `CFile`, we will only
236know its end offset.
237
238
239### Checksum
240
241`Checksum` is a checksum of the `Branch Node`'s bytes. It is not a checksum of
242the `CFile` or `DFile` contents pointed to by a `Branch Node`. Content
243checksums are a `Codec`-specific consideration.
244
245The little-endian `uint16_t` `Checksum` value is the low 16 bits XOR'ed with
246the high 16 bits of the `uint32_t` CRC-32 IEEE checksum of the `((Arity * 16) +
24710)` bytes immediately after the `Checksum`. The 4 bytes immediately before the
248`Checksum` are not considered: the `Magic` bytes have only one valid value and
249the `Arity` byte near the start is replicated by the `Arity` byte at the end.
250
251
252### Reserved (0)
253
254The `Reserved (0)` bytes must have the value `0x00`.
255
256
257### COffs and DOffs, STags and TTags
258
259For every `a` in the half-open range `[0 .. Arity)`, the `a`'th child `Node`
260has two tags, `STag[a]` and `TTag[a]`, and a `DRange` of `[DOff[a] ..
261DOff[a+1])`. The `DOff` values must be non-decreasing: see the "Branch Node
262Validation" section below.
263
264A `TTag[a]` of `0xFE` means that child is a `Branch Node`. A `TTag[a]` in the
265half-open range `[0xC0 .. 0xFE)` is reserved. Otherwise, the child is a `Leaf
266Node`.
267
268A child `Branch Node`'s `SubBranch COffset` is defined to be `COff[a]`. Its
269`SubBranch DBias` and `SubBranch DOffMax` are defined to be `DOff[a]` and
270`DOff[a+1].
271
272  - When `(STag[a] < Arity)`, it is a `CBiasing Branch Node`. The `SubBranch
273    CBias` is defined to be `(Branch_CBias + CPtr[STag[a]])`. This expression
274    is equivalent to `COff[STag[a]]`.
275  - When `(STag[a] >= Arity)`, it is a `CNeutral Branch Node`. The `SubBranch
276    CBias` is defined to be `(Branch_CBias)`.
277
278A child `Leaf Node`'s `STag[a]` and `TTag[a]` values are also called its `Leaf
279STag` and `Leaf TTag`. It also has:
280
281  - A `Primary CRange`, equal to `MakeCRange(a)`.
282  - A `Secondary CRange`, equal to `MakeCRange(STag[a])`.
283  - A `Tertiary CRange`, equal to `MakeCRange(TTag[a])`.
284
285The `MakeCRange(i)` function defines a `CRange`. If `(i >= Arity)` then that
286`CRange` is the empty range `[COffMax .. COffMax)`. Otherwise, the lower bound
287is `COff[i]` and the upper bound is:
288
289  - `COffMax` when `CLen[i]` is zero.
290  - The minimum of `COffMax` and `(COff[i] + (CLen[i] * 1024))` when `CLen[i]`
291    is non-zero.
292
293In other words, the `COffMax` value clamps the `CRange` upper bound. The `CLen`
294value, if non-zero, combines with the `COff` value to apply another clamp. The
295`CLen` is given in units of 1024 bytes, but the `(COff[i] + (CLen[i] * 1024))`
296value is not necessarily quantized to 1024 byte boundaries.
297
298Note that, since `Arity` is at most 255, an `STag[a]` of `0xFF` always results
299in a `CNeutral Branch Node` or an empty `Secondary CRange`. Likewise, a
300`TTag[a]` of `0xFF` always results in an empty `Tertiary CRange`.
301
302
303### COffMax
304
305`COffMax` is an inclusive upper bound on every `COff` in a `Branch Node` and in
306its descendent `Branch Node`s. A child `Branch Node` must not have a larger
307`COffMax` than the parent `Branch Node`'s `COffMax`, and the `Root Node`'s
308`COffMax` must equal the `CFileSize`. See the "Branch Node Validation" section
309below.
310
311A RAC file can therefore be incrementally modified, if the RAC writer only
312appends new `CFile` bytes and does not re-write existing `CFile` bytes, so that
313the `CFileSize` increases. Even if the old (smaller) RAC file's `Root Node` was
314at the `CFile` start, the new (larger) `CFileSize` means that those starting
315bytes are an obsolete `Root Node` (but still a valid `Branch Node`). The new
316`Root Node` is therefore located at the end of the new RAC file.
317
318Concatenating RAC files (concatenating in `DSpace`) involves concatenating the
319RAC files in `CSpace` and then appending a new `Root Node` with `CBiasing
320Branch Node`s pointing to each source RAC file's `Root Node`.
321
322
323### Version
324
325`Version` must have the value `0x01`, indicating version 1 of the RAC format.
326
327
328### Codec
329
330`Codec`s define specializations of RAC, such as "RAC + Zlib" or "RAC + Brotli".
331It is valid for a "RAC + Zstandard" only decoder to reject a "RAC + Brotli"
332file, even if it is a valid RAC file. Recall that RAC is just a container, and
333not tied to any particular compression codec. For the `Codec` byte in a `Branch
334Node`:
335
336  - `0x00` is reserved.
337  - `0x01` means "RAC + Zlib".
338  - `0x02` means "RAC + Brotli".
339  - `0x04` means "RAC + ZStandard".
340  - Any other value less than 0x08 means that all of this `Branch Node`'s
341    children must be `Branch Node`s and not `Leaf Node`s and that no child's
342    `Codec` byte can have a bit set that is not set in this `Codec` byte.
343  - All other values, 0x08 or greater, are reserved.
344
345
346### Branch Node Validation
347
348The first time that a RAC reader visits any particular `Branch Node`, it must
349check that the `Magic` matches, the two `Arity` values match and are non-zero,
350the computed checksum matches the listed `Checksum` and that the RAC reader
351accepts the `Version` and the `Codec`.
352
353It must also check that all of its `DOff` values are sorted: `(DOff[a] <=
354DOff[a+1])` for every `a` in the half-open range `[0 .. Arity)`. By induction,
355this means that all of its `DOff` values do not exceed `DOffMax`, and again by
356induction, therefore do not exceed `DFileSize`.
357
358It must also check that all of its `COff` values do not exceed `COffMax` (and
359again by induction, therefore do not exceed `CFileSize`). Other than that,
360`COff` values do not have to be sorted: successive `Node`s (in `DSpace`) can be
361out of order (in `CSpace`), allowing for incrementally modified RAC files.
362
363For the `Root Node`, its `COffMax` must equal the `CFileSize`. Recall that
364parsing a `CFile` requires knowing the `CFileSize`, and also that a `Root
365Node`'s `Branch CBias` is zero, so its `COffMax` equals its `CPtrMax`.
366
367For a child `Branch Node`, its `Codec` bits must be a subset of its parent's
368`Codec` bits, its `COffMax` must be less than or equal to its parent's
369`COffMax`, and its `DOffMax` must equal its parent's `SubBranch DOffMax`. The
370`DOffMax` condition is equivalent to checking that the parent and child agree
371on the child's size in `DSpace`. The parent states that it is its `(DPtr[a+1] -
372DPtr[a])` and the child states that it is its `DPtrMax`.
373
374One conservative way to check `Branch Node`s' validity on first visit is to
375check them on every visit, as validating any particular `Branch Node` is
376idempotent, but other ways are acceptable.
377
378
379## Root Node
380
381The `Root Node` might be at the start of the `CFile`, as this might optimize
382alignment of `Branch Node`s and of `CRange`s. All `Branch Node`s' sizes are
383multiples of 16 bytes, and a maximal `Branch Node` is exactly 4096 bytes.
384
385The `Root Node` might be at the end of the `CFile`, as this allows one-pass
386(streaming) encoding of a RAC file. It also allows appending to, concatenating
387or incrementally modifying existing RAC files relatively cheaply.
388
389To find the `Root Node`, first look for a valid `Root Node` at the `CFile`
390start. If and only if that fails, look at the `CFile` end. If that also fails,
391it is not a valid RAC file.
392
393
394### Root Node at the CFile Start
395
396The fourth byte of the `CFile` gives the `Arity`, assuming the `Root Node` is
397at the `CFile` start. If it is zero, fail over to the `CFile` end. A RAC writer
398that does not want to place the `Root Node` at the `CFile` start should
399intentionally write a zero to the fourth byte.
400
401Otherwise, that `Arity` defines the size in bytes of any starting `Root Node`:
402`((Arity * 16) + 16)`. If the `CFileSize` is less than this size, fail over to
403the `CFile` end.
404
405If those starting bytes form a valid `Root Node` (as per the "Branch Node
406Validation" section), including having a `CPtrMax` equal to the `CFileSize`,
407then we have indeed found our `Root Node`: its `Branch COffset` is zero.
408Otherwise, fail over to the `CFile` end.
409
410
411### Root Node at the CFile End
412
413If there is no valid `Root Node` at the `CFile` start then the last byte of the
414`CFile` gives the `Root Node`'s `Arity`. This does not necessarily need to
415match the fourth byte of the `CFile`.
416
417As before, that `Arity` defines the size in bytes of any ending `Root Node`:
418`((Arity * 16) + 16)`. If the `CFileSize` is less than this size, it is not a
419valid RAC file.
420
421If those ending bytes form a valid `Root Node` (as per the "Branch Node
422Validation" section), including having a `CPtrMax` equal to the `CFileSize`,
423then we have indeed found our `Root Node`: its `Branch COffset` is the
424`CFileSize` minus the size implied by the `Arity`. Otherwise, it is not a valid
425RAC file.
426
427
428## DRange Reconstruction
429
430To reconstruct the `DRange` `[di .. dj)`, if that `DRange` is empty then the
431request is trivially satisfied.
432
433Otherwise, if `(dj > DFileSize)` then reject the request.
434
435Otherwise, start at the `Root Node` and continue to the next section to find
436the `Leaf Node` containing the `DOffset` `di`.
437
438
439### Search Within a Branch Node
440
441Load (and validate) the `Branch Node` given its `Branch COffset`, `Branch
442CBias` and `Branch DBias`.
443
444Find the largest child index `a` such that `(a < Arity)` and `(DOff[a] <= di)`
445and `(DOff[a+1] > di)`, then examine `TTag[a]` to see if the child is a `Leaf
446Node`. If so, skip to the next section.
447
448For a `Branch Node` child, let `CRemaining` equal this `Branch Node`'s (the
449parents') `COffMax` minus the `SubBranch COffset`. It invalid for `CRemaining`
450to be less than 4, or to be less than the child's size implied by the child's
451`Arity` byte at a `COffset` equal to `(SubBranch_COffset + 3)`.
452
453The `SubBranch COffset`, `SubBranch CBias` and `SubBranch DBias` from the
454parent `Branch Node` become the `Branch COffset`, `Branch CBias` and `Branch
455DBias` of the child `Branch Node`. In order to rule out infinite loops, at
456least one of these two conditions must hold:
457
458  - The child's `Branch COffset` is less than the parent's `Branch COffset`.
459  - The child's `DPtrMax` is less than the parent's `DPtrMax`.
460
461It is valid for one of those conditions to hold between a parent-child pair and
462the other condition to hold between the next parent-child pair.
463
464Repeat this "Search Within a Branch Node" section with the child `Branch Node`.
465
466
467### Decompressing a Leaf Node
468
469If a `Leaf Node`'s `DRange` is empty, decompression is a no-op and skip the
470rest of this section.
471
472Otherwise, decompression combines the `Primary CRange`, `Secondary CRange` and
473`Tertiary CRange` slices of the `CFile`, and the `Leaf STag` and `Leaf TTag`
474values, in a `Codec`-specific way to produce decompressed data.
475
476There are two general principles, although specific `Codec`s can overrule:
477
478  - The `Codec` may produce fewer bytes than the `DRange` size. In that case,
479    the remaining bytes (in `DSpace`) are set to NUL (`memset` to zero).
480  - The `Codec` may consume fewer bytes than each `CRange` size, and the
481    compressed data typically contains an explicit "end of data" marker. In
482    that case, the remaining bytes (in `CSpace`) are ignored. Padding allows
483    `COff` values to optionally be aligned.
484
485It is invalid to produce more bytes than the `DRange` size or to consume more
486bytes than each `CRange` size.
487
488
489### Continuing the Reconstruction
490
491If decompressing that `Leaf Node` did not fully reconstruct `[di .. dj)`,
492advance through the `Node` tree in depth first search order, decompressing
493every `Leaf Node` along the way, until we have gone up to or beyond `dj`.
494
495During that traversal, `Node`s with an empty `DRange` should be skipped, even
496if they are `Branch Node`s.
497
498
499# Specific Codecs
500
501
502## RAC + Zlib
503
504The `CFile` data in the `Leaf Node`'s `Primary CRange` is decompressed as Zlib
505(RFC 1950), possibly referencing a LZ77 prefix dictionary (what the RFC calls a
506"preset dictionary").
507
508If a `Leaf Node`'s `Secondary CRange` is empty then there is no dictionary.
509Otherwise, the `Secondary CRange` must be at least 6 bytes long:
510
511  - 2 byte little-endian `uint16_t` `Dictionary Length`.
512  - `Dictionary Length` bytes `Dictionary`.
513  - 4 byte big-endian `uint32_t` `Dictionary Checksum`.
514  - Padding (ignored).
515
516The `Dictionary Checksum` is Zlib's Adler32 checksum over the `Dictionary`'s
517bytes (excluding the initial 2 bytes for the `Dictionary Length`). The checksum
518is stored big-endian, like Zlib's other checksums, and its 4 byte value must
519match the `DICTID` (in RFC 1950 terminology) given in the `Primary CRange`'s
520Zlib-formatted data.
521
522The `Leaf TTag` must be `0xFF`. All other `Leaf TTag` values (below `0xC0`) are
523reserved. The empty `Tertiary CRange` is ignored. The `Leaf STag` value is also
524ignored, other than deriving the `Secondary CRange`.
525
526
527## RAC + Brotli
528
529TODO.
530
531
532## RAC + Zstandard
533
534TODO.
535
536
537# Examples
538
539TODO (zlib).
540
541
542# Acknowledgements
543
544I (Nigel Tao) thank Robert Obryk, Sanjay Ghemawat and Sean Klein for their
545review.
546
547
548---
549
550Updated on April 2019.
551