# LZ4 Streaming API Example : Line by Line Text Compression
by *Takayuki Matsuoka*

`blockStreaming_lineByLine.c` is an LZ4 Streaming API example that implements line-by-line incremental (de)compression.

Please note the following restrictions :

 - First, read "LZ4 Streaming API Basics".
 - This is a relatively advanced application example.
 - The output file is not compatible with lz4frame and is platform dependent.


## What's the point of this example ?

 - Line-by-line incremental (de)compression.
 - Handles huge files with a small amount of memory.
 - Generally achieves a better compression ratio than the Block API.
 - Non-uniform block size.


## How the compression works

First of all, allocate a "Ring Buffer" for input and an LZ4 compressed data buffer for output.

```
(1)
    Ring Buffer

    +--------+
    | Line#1 |
    +---+----+
        |
        v
     {Out#1}


(2)
    Prefix Mode Dependency
          +----+
          |    |
          v    |
    +--------+-+------+
    | Line#1 | Line#2 |
    +--------+---+----+
                 |
                 v
              {Out#2}


(3)
          Prefix   Prefix
          +----+   +----+
          |    |   |    |
          v    |   v    |
    +--------+-+------+-+------+
    | Line#1 | Line#2 | Line#3 |
    +--------+--------+---+----+
                          |
                          v
                       {Out#3}


(4)
                        External Dictionary Mode
                +----+   +----+
                |    |   |    |
                v    |   v    |
    ------+--------+-+------+-+--------+
          |  ....  | Line#X | Line#X+1 |
    ------+--------+--------+-----+----+
                            ^     |
                            |     v
                            |  {Out#X+1}
                            |
                          Reset


(5)
                                    Prefix
                                    +-----+
                                    |     |
                                    v     |
    ------+--------+--------+----------+--+-------+
          |  ....  | Line#X | Line#X+1 | Line#X+2 |
    ------+--------+--------+----------+-----+----+
                            ^                |
                            |                v
                            |            {Out#X+2}
                            |
                          Reset
```

Next (see (1)), read the first line into the ring buffer and compress it with `LZ4_compress_continue()` (newer LZ4 releases deprecate this function in favor of `LZ4_compress_fast_continue()`).
The first time, LZ4 has no previous data to depend on,
so it simply compresses the line without dependencies and writes the compressed line {Out#1} to the LZ4 compressed data buffer.
After that, write {Out#1} to the file and advance the ring buffer offset.

Do the same for the second line (see (2)).
This time, however, LZ4 can use a dependency on Line#1 to improve the compression ratio.
This dependency is called "Prefix mode".

Eventually, we reach the end of the ring buffer at Line#X (see (4)).
At this point, we must reset the ring buffer offset.
After the reset, Line#X+1 is no longer adjacent to the previous line in memory, but LZ4 still keeps its memory of the earlier data.
This is called "External Dictionary Mode".
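The offset handling in step (4) can be sketched in plain C. This is a minimal sketch, not the code from `blockStreaming_lineByLine.c`: the names `RING_BYTES`, `LINE_MAX_BYTES`, and `ring_advance()` are hypothetical, the sizes are arbitrary, and it assumes the offset is reset as soon as a maximum-length line might no longer fit in the remaining space.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes, chosen for illustration only. */
#define LINE_MAX_BYTES ((size_t)256)                   /* longest accepted line */
#define RING_BYTES     ((size_t)1024 + LINE_MAX_BYTES) /* ring buffer capacity  */

/*
 * Advance the ring buffer offset past a line of `line_bytes` bytes that
 * was just stored at `offset`. Returns the offset for the next line:
 * 0 when a maximum-length line might not fit in the remaining space,
 * otherwise the position right after the line just stored.
 */
static size_t ring_advance(size_t offset, size_t line_bytes)
{
    assert(line_bytes <= LINE_MAX_BYTES);
    assert(offset + line_bytes <= RING_BYTES);

    offset += line_bytes;
    if (offset >= RING_BYTES - LINE_MAX_BYTES) {
        offset = 0; /* wrap: the next line starts at the beginning */
    }
    return offset;
}
```

Starting from offset 0, each call returns where the next line should be stored. A return value of 0 after a non-empty line corresponds to the "Reset" shown in the diagram.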

At Line#X+2 (see (5)), LZ4 finally forgets almost all of its earlier memory, but it still retains Line#X+1.
This is the same situation as Line#2.

Continue this procedure until the end of the text file.
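The walk-through above can be modeled by tracking where the previous line ended in the ring buffer. This is a simplified model for illustration only: it assumes a line counts as "prefix" when it starts exactly where the previous line ended and as "external dictionary" otherwise. It is not LZ4's internal logic, and `dep_mode`, `dep_tracker`, and `dep_next()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* How a line relates to the data that is still remembered. */
typedef enum {
    MODE_NO_DEPENDENCY, /* very first line: nothing to refer to           */
    MODE_PREFIX,        /* line starts right where the previous one ended */
    MODE_EXT_DICT       /* line is not adjacent to the previous one       */
} dep_mode;

/* Hypothetical tracker: remembers where the previous line ended. */
typedef struct {
    int    has_prev; /* 0 until the first line has been processed      */
    size_t prev_end; /* ring buffer offset just past the previous line */
} dep_tracker;

/* Classify the line stored at [start, start + bytes) and remember it. */
static dep_mode dep_next(dep_tracker *t, size_t start, size_t bytes)
{
    dep_mode mode;
    assert(t != NULL);

    if (!t->has_prev) {
        mode = MODE_NO_DEPENDENCY;
    } else if (start == t->prev_end) {
        mode = MODE_PREFIX;
    } else {
        mode = MODE_EXT_DICT;
    }

    t->has_prev = 1;
    t->prev_end = start + bytes;
    return mode;
}
```

Feeding it three adjacent lines, then a line at offset 0 after a reset, then one more adjacent line yields no dependency, prefix, prefix, external dictionary, prefix, which matches diagrams (1) to (5).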


## How the decompression works

Decompression mirrors the compression steps.

 - Read a compressed line from the file into a buffer.
 - Decompress it into the ring buffer.
 - Write the decompressed plain-text line to the output file.
 - Advance the ring buffer offset. If the offset exceeds the end of the ring buffer, reset it.

Continue this procedure until the end of the compressed file.
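Reading a compressed line back requires knowing where it ends. The sketch below assumes, purely for illustration, that every compressed line is stored as a 16-bit length in the machine's native byte order followed by the payload. A layout like this is one way an output file becomes platform dependent. `write_block()` and `read_block()` are hypothetical helper names, and the LZ4 calls themselves are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Write one block: native-endian 16-bit length, then the payload.
 * Returns 0 on success, -1 on a write error. */
static int write_block(FILE *fp, const void *data, uint16_t bytes)
{
    assert(fp != NULL && data != NULL);
    if (fwrite(&bytes, sizeof bytes, 1, fp) != 1) return -1;
    if (bytes > 0 && fwrite(data, 1, bytes, fp) != (size_t)bytes) return -1;
    return 0;
}

/* Read one block into `buf` (capacity `cap` bytes).
 * Returns the payload size, 0 when no further block is available
 * (end of file or a zero-length block), or -1 on a short read or when
 * the block does not fit in `buf`. */
static long read_block(FILE *fp, void *buf, size_t cap)
{
    uint16_t bytes;
    assert(fp != NULL && buf != NULL);
    if (fread(&bytes, sizeof bytes, 1, fp) != 1) return 0;
    if ((size_t)bytes > cap) return -1;
    if (fread(buf, 1, bytes, fp) != (size_t)bytes) return -1;
    return (long)bytes;
}
```

A decompression loop would call `read_block()` until it returns 0, decompress each payload into the ring buffer, and then advance the offset as described in the list above.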