Decompressor Permissiveness to Invalid Data
===========================================

This document describes the behavior of the reference decompressor in cases
where it accepts formally invalid data instead of reporting an error.

While the reference decompressor *must* decode any compliant frame following
the specification, its ability to detect erroneous data is on a best-effort
basis: the decoder may accept input data that is formally invalid
when doing so poses no risk to the decoder and detecting it would cost too
much in complexity or speed.

In practice, the vast majority of invalid data is detected, if only because
many corruption events are dangerous for the decoder process (such as
requesting an out-of-bounds memory access) and many more are easy to check.

This document lists a few known cases where invalid data was formerly accepted
by the decoder, and what has changed since.


Truncated Huffman states
------------------------

**Last affected version**: v1.5.6

**Produced by the reference compressor**: No

**Example Frame**: `28b5 2ffd 0000 5500 0072 8001 0420 7e1f 02aa 00`

When using FSE-compressed Huffman weights, the compressed weight bitstream
can contain fewer bits than necessary to decode the initial states.

The reference decompressor up to v1.5.6 decodes truncated or missing
initial states as zero, which can result in a valid Huffman tree if only
the second state is truncated.

In newer versions, truncated initial states are reported as a corruption
error by the decoder.

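As an illustration of the kind of check involved, the following C sketch counts
the payload bits of a backward-read bitstream and tests whether both initial
states fit. It assumes, per the format specification, that the two interleaved
FSE states each consume `Accuracy_Log` bits at initialization and that the
highest set bit of the last byte is an end marker rather than payload. The
function names are hypothetical and are not part of the reference decoder's
API; `src` is assumed to point at the bitstream only, after the FSE table
description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of payload bits in a backward-read bitstream.
 * The highest set bit of the last byte is treated as an end marker and is
 * not counted. Returns -1 if the stream is empty or its last byte is zero
 * (no marker found). */
static long bitstream_payload_bits(const uint8_t *src, size_t size)
{
    unsigned last;
    unsigned marker_pos = 0;

    assert(src != NULL || size == 0);
    if (size == 0 || src[size - 1] == 0) return -1;

    last = src[size - 1];
    while (last >>= 1) marker_pos++;   /* position of the highest set bit */

    return (long)(8 * (size - 1)) + (long)marker_pos;
}

/* Hypothetical check: returns 1 if the bitstream holds enough bits to
 * initialize both FSE states (accuracy_log bits each), 0 otherwise. */
static int has_both_initial_states(const uint8_t *src, size_t size,
                                   unsigned accuracy_log)
{
    long const bits = bitstream_payload_bits(src, size);
    return bits >= 0 && bits >= 2L * (long)accuracy_log;
}
```

Under these assumptions, a 2-byte bitstream `3F 01` with `Accuracy_Log = 6`
provides only 8 payload bits for the 12 required, so the first state can be
read but the second one is truncated.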

Offset == 0
-----------

**Last affected version**: v1.5.5

**Produced by the reference compressor**: No

**Example Frame**: `28b5 2ffd 0000 4500 0008 0002 002f 430b ae`

If a sequence is decoded with `literals_length = 0` and `offset_value = 3`
while `Repeated_Offset_1 = 1`, the computed offset will be `0`, which is
invalid.

The reference decompressor up to v1.5.5 processes this case as if the computed
offset were `1`, including inserting `1` into the repeated offset list.
This prevents the output buffer from remaining uninitialized, which closes a
potential attack vector for untrusted input.
However, in the rare case where this scenario results from a
transmission or storage error, the decoder relies on the checksum to detect
the error.

In newer versions, this case is always detected and reported as a corruption error.

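The two behaviors can be sketched in C as follows. This is a simplified
illustration, not the reference decoder's code: the function name, the status
codes, and the `legacy` flag are hypothetical, and the repeat-offset rules are
written from the format specification (with `literals_length = 0`, repeat
codes are shifted by one and code `3` means `Repeated_Offset_1 - 1`).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical status codes, used only in this sketch. */
typedef enum { OFFSET_OK = 0, OFFSET_CORRUPTED = 1 } offset_status;

/* Resolve a repeat-offset code (offset_value in 1..3) into an actual offset.
 * rep[0..2] hold Repeated_Offset_1..3 and are updated on success.
 * legacy != 0 mimics the permissive handling (offset 0 treated as 1);
 * legacy == 0 reports the case as corruption and leaves rep[] untouched. */
static offset_status resolve_repeat_offset(uint32_t rep[3],
                                           uint32_t offset_value,
                                           uint32_t literals_length,
                                           int legacy,
                                           uint32_t *offset)
{
    uint32_t idx;
    uint32_t off;

    assert(offset_value >= 1 && offset_value <= 3);

    /* With literals_length == 0, repeat codes are shifted by one. */
    idx = offset_value - 1 + (uint32_t)(literals_length == 0);   /* 0..3 */

    if (idx == 0) {            /* Repeated_Offset_1, list unchanged */
        *offset = rep[0];
        return OFFSET_OK;
    }

    off = (idx == 3) ? rep[0] - 1 : rep[idx];
    if (off == 0) {            /* only reachable when rep[0] == 1 and idx == 3 */
        if (!legacy) return OFFSET_CORRUPTED;
        off = 1;               /* permissive handling: proceed as if offset were 1 */
    }

    /* Move the selected offset to the front of the repeat list. */
    if (idx != 1) rep[2] = rep[1];
    rep[1] = rep[0];
    rep[0] = off;

    *offset = off;
    return OFFSET_OK;
}
```

Starting from a repeat list of `{1, 4, 8}`, calling this sketch with
`offset_value = 3` and `literals_length = 0` returns `OFFSET_CORRUPTED` in
strict mode, while in legacy mode it yields offset `1` and the list becomes
`{1, 1, 4}`.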

Non-zero reserved bits
----------------------

**Last affected version**: v1.5.5

**Produced by the reference compressor**: No

The Sequences section of each block has a header, and one of its elements is a
byte that describes the compression mode of each symbol type.
This byte contains 2 reserved bits, which must be set to zero.

The reference decompressor up to v1.5.5 simply ignores these 2 bits.
This behavior has no consequence for the rest of the frame decoding process.

In newer versions, the 2 reserved bits are actively checked for value zero,
and the decoder reports a corruption error if they are not.

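A minimal C sketch of both behaviors is shown here. It assumes the bit layout
given in the format specification (bits 7-6 `Literals_Lengths_Mode`, bits 5-4
`Offsets_Mode`, bits 3-2 `Match_Lengths_Mode`, bits 1-0 reserved). The struct,
the function name, and the `strict` flag are hypothetical and exist only for
this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the three 2-bit compression modes. */
typedef struct {
    unsigned literals_lengths_mode;   /* bits 7-6 */
    unsigned offsets_mode;            /* bits 5-4 */
    unsigned match_lengths_mode;      /* bits 3-2 */
} symbol_modes;

/* Parse the symbol compression modes byte.
 * strict != 0: reject non-zero reserved bits (behavior of newer versions).
 * strict == 0: ignore the reserved bits (permissive behavior).
 * Returns 0 on success, -1 if the byte is rejected. */
static int parse_symbol_modes(uint8_t byte, int strict, symbol_modes *out)
{
    assert(out != NULL);

    if (strict && (byte & 0x03) != 0) return -1;   /* reserved bits set */

    out->literals_lengths_mode = (byte >> 6) & 0x03;
    out->offsets_mode          = (byte >> 4) & 0x03;
    out->match_lengths_mode    = (byte >> 2) & 0x03;
    return 0;
}
```

In both modes the three decoded mode values are identical, which is why
ignoring the reserved bits has no effect on the rest of the decoding process.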