• Home
  • Line#
  • Scopes#
  • Navigate#
  • Raw
  • Download
1<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
2
3<html>
4<head>
5<title>What's In A GIF - LZW Image Data</title>
6<style type="text/css">
7.byte {font-family: Courier, fixed;
8	padding: .2em}
9.gif_header {background-color: #f9E89D}
10.gif_screen {background-color: #C8DBD9}
11.gif_color {background-color: #E1E1E1}
12.gif_graphic {background-color: #F9EB9D}
13.gif_imgdesc {background-color: #C2D1DC}
14.gif_imgdata {background-color: #D0C4C4}
15.gif_trailer {background-color: #f9E89D}
16.gif_ext {background-color: #D0CFAE}
17#global_color_size {margin-left: auto; margin-right:auto; border:1px solid black;}
18#global_color_size th {border-bottom: 1px solid #666666}
19#global_color_size td {text-align:center;}
20.code_table {margin-left: auto; margin-right:auto; border:1px solid black;}
21.code_table th {text-align: left; border-bottom: 1px solid #666666}
22.alg_steps {margin: 0 auto; border: 1px solid black}
23.alg_steps th, .alg_steps td {border: 1px solid black}
24.alg_steps .index {padding: 0 .3em}
25.alg_steps .processed {color: #CCC}
26.alg_steps .buffer {background: #C8DBD9 url(highlight_green.gif) repeat-x center left;
27	border-top: 1px solid #AAA2A2; border-bottom: 1px solid #AAA2A2;}
28.alg_steps .current {background: #D0C4C4 url(highlight_purple.gif) repeat-x center left;
29	border-top: 1px solid #98A5A4; border-bottom: 1px solid #98A5A4;}
30</style>
31</head>
32<body>
33<table width='100%' cellpadding='0' summary='Canned page header' bgcolor="#ddd">
34<tr>
35<td><h2>What's In A GIF</h2></td>
36<td align="center"><img src="../giflib-logo.gif"></td>
37<td align="right">(LZW image data)</td>
38</tr>
39</table>
40
41<div id="body">
42<div style="text-align:center; margin-top: 10px; padding-top: 10px; border-top: #cecece 1px solid">
43<p><a href="index.html">Back to the GIF index page.</a></p>
44</div>
45
46<p>Now let's look at exactly how we go about storing an image in a GIF
47file. The GIF format is a raster format, meaning it stores image data
48by remembering the color of every pixel in the image. More
49specifically, GIF files remember the index of the color in a color
50table for each pixel. To make that clearer, let's review the
51sample image we used in the <a href="bits_and_bytes.html">first
52section</a>.</p>
53
54<table style="margin-left: auto; margin-right:auto;">
55<tr>
56<td style="text-align:center; vertical-align: top; padding: 5px;width:30%">
57<h3>Actual Size</h3>
58<img src="sample_1.gif" alt="sample gif, actual size" title="Actual Size"
59width="10" height="10" style="padding: 20px" /><br/>(10x10)</td>
60<td style="text-align:center; vertical-align: top; padding: 5px;; width:40%">
61<h3>Enlarged</h3>
62<img src="sample_1_enlarged.gif" alt="sample gif, enlarged"
63title="Enlarged" width="100" height="100" /><br/>(100x100)</td>
64<td style="vertical-align: top; padding: 5px; width:30%">
65<h3>Color Table</h3>
66<table>
67<tr><th>Index</th><th>Color</th></tr>
68<tr><td>0</td><td><span style="color:#FFFFFF; background: #000000; font-weight: bold">White</span></td></tr>
69<tr><td>1</td><td><span style="color:#FF0000; font-weight: bold">Red</span></td></tr>
70<tr><td>2</td><td><span style="color:#0000FF; font-weight: bold">Blue</span></td></tr>
71<tr><td>3</td><td><span style="font-weight: bold">Black</span></td></tr>
72</table>
73</td>
74</tr></table>
75
76<p>The color table came from the global color table block. The colors
77are listed in the order which they appear in the file. The first color
78is given an index of zero. When we send the codes, we always start at
79the top left of the image and work our way right. When we get to the
80end of the line, the very next code is the one that starts the next
81line. (The decoder will &quot;wrap&quot; the image based on the image
82dimensions.) We could encode our sample image in the following
83way:</p>
84
85<blockquote><p>1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 1,
861, 1, 1, 1, 2, 2, 2, 2, 2, 1, 1, 1, 0, 0, 0, 0, 2, 2, 2, 1, 1, 1,
870, 0, 0, 0, 2, 2, 2, ...</p></blockquote>
88
89<p>The above listing shows the sequence required to render the first five
90lines of the image. We could continue with this method until we've
91specified the color for every pixel; however, this can result in a
92rather large file. Luckily for us, the GIF format allows us to take
93advantage of repetition in our output and to compress our data.</p>
94
95<p>Much of the following information came from John Barkaus's tutorial
96<cite>LZW and GIF Explained</cite>, which seems to have fallen off the
97web.  I've tried to provide more detailed samples as well
98as illustrations to make the process even clearer</p>
99
100<h2><a name="lzw_compression">LZW Compression</a></h2>
101
102<p>The compression method GIF use is a variant of LZW
103(Lempel-Ziv-Welch) compression. To start using this method, we need a
104<strong>code table</strong>. This code table will allow us to use
105special codes to indicate a sequence of colors rather than just one at
106a time.  The first thing we do is to <em>initialize the code
107table</em>.  We start by adding a code for each of the colors in the
108color table. This would be a local color table if one was provided, or
109the global color table. (I will be starting all codes with
110&quot;#&quot; to distinguish them from color indexes.)</p>
111
112<table class="code_table">
113<tr><th>Code</th><th>Color(s)</th></tr>
114<tr><td>#0</td><td>0</td></tr>
115<tr><td>#1</td><td>1</td></tr>
116<tr><td>#2</td><td>2</td></tr>
117<tr><td>#3</td><td>3</td></tr>
118<tr><td>#4</td><td>Clear Code</td></tr>
119<tr><td>#5</td><td>End Of Information Code</td></tr>
120</table>
121
122<p>I added a code for each of the colors in the global color table of
123our sample image. I also snuck in two special control codes.  (These
124special codes are only used in the GIF version of LZW, not in standard
125LZW compression.) Our code table is now considered initialized.</p>
126
127<p>Let me now explain what those special codes are for. The first new code
128is the <em>clear code</em> (CC). Whenever you come across the clear code
129in the image data, it's your cue to reinitialize the code table. (I'll
130explain why you might need to do this in a bit.) The second new code
131is the <em>end of information code</em> (EOI). When you come across
132this code, this means you've reached the end of the image. Here I've placed
133the special codes right after the color codes, but actually the value of
134the special codes depends on the value of the LZW minimum code size
135from the image data block. If the LZW minimum code size is the same as
136the color table size, then special codes immediatly follow the colors; however
137it is possible to specify a larger LWZ minimum code size which may leave
138a gap in the codes where no colors are assigned. This can be
139summarizaed in the <a name="color_table_size">following table</a>.</p>
140
141<div style="text-align:center">
142<table id="global_color_size">
143<tr><th>LWZ Min Code<br/>Size</th><th>Color<br/>Codes</th><th>Clear<br/>Code</th><th>EOI<br/>Code</th></tr>
144<tr><td>2</td><td>#0-#3</td><td>#4</td><td>#5</td></tr>
145<tr><td>3</td><td>#0-#7</td><td>#8</td><td>#9</td></tr>
146<tr><td>4</td><td>#0-#15</td><td>#16</td><td>#17</td></tr>
147<tr><td>5</td><td>#0-#31</td><td>#32</td><td>#33</td></tr>
148<tr><td>6</td><td>#0-#63</td><td>#64</td><td>#65</td></tr>
149<tr><td>7</td><td>#0-#127</td><td>#128</td><td>#129</td></tr>
150<tr><td>8</td><td>#0-#255</td><td>#256</td><td>#257</td></tr>
151</table>
152</div>
153
154<p>Before we proceed, let me define two more terms. First the <strong>index
155stream</strong> will be the list of indexes of the color for each of
156the pixels. This is the input we will be compressing. The <strong>code
157stream</strong> will be the list of codes we generate as output. The
158<strong>index buffer</strong> will be the list of color indexes
159we care &quot;currently looking at.&quot; The index buffer will contain a list
160of one or more color indexes. Now we can step though the LZW
161compression algorithm. First, I'll just list the steps. After that
162I'll walk through the steps with our specific example.</p>
163
164<ul>
165<li>Initialize code table</li>
166<li>Always start by sending a clear code to the code stream.</li>
167<li>Read first index from index stream. This value is now the value
168for the index buffer</li>
169<li>&lt;LOOP POINT&gt;</li>
170<li>Get the next index from the index stream to the index buffer. We will
171call this index, K</li>
172<li>Is index buffer + K in our code table?</li>
173<li>Yes:
174	<ul>
175	<li>add K to the end of the index buffer</li>
176	<li>if there are more indexes, return to LOOP POINT</li>
177	</ul>
178</li>
179<li>No:
180	<ul>
181	<li>Add a row for index buffer + K into our code table with
182	the next smallest code</li>
183	<li>Output the code for just the index buffer to our code steam</li>
184	<li>Index buffer is set to K</li>
185	<li>K is set to nothing</li>
186	<li>if there are more indexes, return to LOOP POINT</li>
187	</ul>
188</li>
189<li>Output code for contents of index buffer</li>
190<li>Output end-of-information code</li>
191</ul>
192
193<p>Seems simple enough, right? It really isn't all that bad. Let's
194walk though our sample image to show you how this works. (The steps I
195will be describing are summarized in the following table. Numbers
196highlighted in green are in the index buffer; numbers in purple are
197the current K value.)  We have already initialized our code table. We
198start by doing two things: we output our clear code (#4) to the code
199stream, and we read the first color index from the index stream, 1,
200into our index buffer [Step 0].</p>
201
202<p>Now we enter the main loop of the algorithm. We read the next index
203in the index stream, 1, into K [Step 1]. Next we see if we have a
204record for the index buffer plus K in the code stream. In this case we
205looking for 1,1. Currently our code table only contains single colors
206so this value is not in there. Now we will actually add a new row to
207our code table that does contain this value.  The next available code
208is #6, we will let #6 be 1,1. Note that we do not actually send this
209code to the code stream, instead we send just the code for the
210value(s) in the index buffer. The index buffer is just 1 and the code
211for 1 is #1. This is the code we output. We now reset the index buffer
212to just the value in K and K becomes nothing. [Step 2].</p>
213
214<p>We continue by reading the next index into K. [Step 3]. Now K is 1 and the
215index buffer is 1. Again we look to see if there is a value in our code
216table for the buffer plus K (1,1) and this time there is. (In fact we just
217added it.) Therefore we add K to the end of the index buffer and clear out
218K. Now our index buffer is 1,1. [Step 4].</p>
219
220<p>The next index in the index stream is yet another 1. This is our
221new K [Step 5].  Now the index buffer plus K is 1,1,1 which we do not
222have a code for in our code table. As we did before, we define a new
223code and add it to the code table. The next code would be #7; thus #7
224= 1, 1, 1. Now we kick out the code for just the values in the index
225buffer (#6 = 1,1) to the code stream and set the index buffer to be
226K. [Step 6].</p>
227
228<table class="alg_steps" cellspacing="0">
229<tbody>
230<tr>
231	<th>Step</th>
232	<th>Action</th>
233	<th>Index Stream</th>
234	<th>New Code Table Row</th>
235	<th>Code Stream</th>
236</tr>
237<tr>
238	<td>0</td>
239	<td>Init</td>
240	<td><span class="processed"></span>
241        <span class="buffer"><span class="index">1</span></span>
242        <span class="index">1</span>
243        <span class="index">1</span>
244        <span class="index">1</span>
245        <span class="index">1</span>
246        <span class="index">2</span>
247        <span class="index">2</span>
248        <span class="index">2</span>
249        <span class="index">2</span>
250        <span class="index">2</span>
251        <span class="index">1</span>
252        <span class="index">1</span>
253        <span class="index">1</span>
254        <span class="index">1</span>...</td>
255	<td>&nbsp;</td>
256	<td>#4</td>
257</tr>
258
259<tr>
260	<td>1</td>
261	<td>Read</td>
262	<td><span class="processed"></span>
263        <span class="buffer"><span class="index">1</span></span>
264        <span class="current"><span class="index">1</span></span>
265        <span class="index">1</span>
266        <span class="index">1</span>
267        <span class="index">1</span>
268        <span class="index">2</span>
269        <span class="index">2</span>
270        <span class="index">2</span>
271        <span class="index">2</span>
272        <span class="index">2</span>
273        <span class="index">1</span>
274        <span class="index">1</span>
275        <span class="index">1</span>
276        <span class="index">1</span>...</td>
277	<td>&nbsp;</td>
278	<td>#4</td>
279</tr>
280
281<tr>
282	<td>2</td>
283	<td>Not Found</td>
284	<td><span class="processed"><span class="index">1</span></span>
285        <span class="buffer"><span class="index">1</span></span>
286        <span class="index">1</span>
287        <span class="index">1</span>
288        <span class="index">1</span>
289        <span class="index">2</span>
290        <span class="index">2</span>
291        <span class="index">2</span>
292        <span class="index">2</span>
293        <span class="index">2</span>
294        <span class="index">1</span>
295        <span class="index">1</span>
296        <span class="index">1</span>
297        <span class="index">1</span>...</td>
298	<td>#6 - 1, 1</td>
299	<td>#4 #1</td>
300</tr>
301
302<tr>
303	<td>3</td>
304	<td>Read</td>
305	<td><span class="processed"><span class="index">1</span></span>
306        <span class="buffer"><span class="index">1</span></span>
307        <span class="current"><span class="index">1</span></span>
308        <span class="index">1</span>
309        <span class="index">1</span>
310        <span class="index">2</span>
311        <span class="index">2</span>
312        <span class="index">2</span>
313        <span class="index">2</span>
314        <span class="index">2</span>
315        <span class="index">1</span>
316        <span class="index">1</span>
317        <span class="index">1</span>
318        <span class="index">1</span>...</td>
319	<td>&nbsp;</td>
320	<td>#4 #1</td>
321</tr>
322
323<tr>
324	<td>4</td>
325	<td>Found</td>
326	<td><span class="processed"><span class="index">1</span></span>
327        <span class="buffer"><span class="index">1</span>
328        <span class="index">1</span></span>
329        <span class="index">1</span>
330        <span class="index">1</span>
331        <span class="index">2</span>
332        <span class="index">2</span>
333        <span class="index">2</span>
334        <span class="index">2</span>
335        <span class="index">2</span>
336        <span class="index">1</span>
337        <span class="index">1</span>
338        <span class="index">1</span>
339        <span class="index">1</span>...</td>
340	<td>&nbsp;</td>
341	<td>#4 #1</td>
342</tr>
343
344<tr>
345	<td>5</td>
346	<td>Read</td>
347	<td><span class="processed"><span class="index">1</span></span>
348        <span class="buffer"><span class="index">1</span>
349        <span class="index">1</span></span>
350        <span class="current"><span class="index">1</span></span>
351        <span class="index">1</span>
352        <span class="index">2</span>
353        <span class="index">2</span>
354        <span class="index">2</span>
355        <span class="index">2</span>
356        <span class="index">2</span>
357        <span class="index">1</span>
358        <span class="index">1</span>
359        <span class="index">1</span>
360        <span class="index">1</span>...</td>
361	<td>&nbsp;</td>
362	<td>#4 #1</td>
363</tr>
364
365<tr>
366	<td>6</td>
367	<td>Not Found</td>
368	<td><span class="processed"><span class="index">1</span>
369        <span class="index">1</span>
370        <span class="index">1</span></span>
371        <span class="buffer"><span class="index">1</span></span>
372        <span class="index">1</span>
373        <span class="index">2</span>
374        <span class="index">2</span>
375        <span class="index">2</span>
376        <span class="index">2</span>
377        <span class="index">2</span>
378        <span class="index">1</span>
379        <span class="index">1</span>
380        <span class="index">1</span>
381        <span class="index">1</span>...</td>
382	<td>#7 - 1, 1, 1</td>
383	<td>#4 #1 #6</td>
384</tr>
385</tbody>
386<tbody id="compress_more">
387
388<tr>
389	<td>7</td>
390	<td>Read</td>
391	<td><span class="processed"><span class="index">1</span>
392        <span class="index">1</span>
393        <span class="index">1</span></span>
394        <span class="buffer"><span class="index">1</span></span>
395        <span class="current"><span class="index">1</span></span>
396        <span class="index">2</span>
397        <span class="index">2</span>
398        <span class="index">2</span>
399        <span class="index">2</span>
400        <span class="index">2</span>
401        <span class="index">1</span>
402        <span class="index">1</span>
403        <span class="index">1</span>
404        <span class="index">1</span>...</td>
405	<td>&nbsp;</td>
406	<td>#4 #1 #6</td>
407</tr>
408
409<tr>
410	<td>8</td>
411	<td>Found</td>
412	<td><span class="processed"><span class="index">1</span>
413        <span class="index">1</span>
414        <span class="index">1</span></span>
415        <span class="buffer"><span class="index">1</span>
416        <span class="index">1</span></span>
417        <span class="index">2</span>
418        <span class="index">2</span>
419        <span class="index">2</span>
420        <span class="index">2</span>
421        <span class="index">2</span>
422        <span class="index">1</span>
423        <span class="index">1</span>
424        <span class="index">1</span>
425        <span class="index">1</span>...</td>
426	<td>&nbsp;</td>
427	<td>#4 #1 #6</td>
428</tr>
429
430<tr>
431	<td>9</td>
432	<td>Read</td>
433	<td><span class="processed"><span class="index">1</span>
434        <span class="index">1</span>
435        <span class="index">1</span></span>
436        <span class="buffer"><span class="index">1</span>
437        <span class="index">1</span></span>
438        <span class="current"><span class="index">2</span></span>
439        <span class="index">2</span>
440        <span class="index">2</span>
441        <span class="index">2</span>
442        <span class="index">2</span>
443        <span class="index">1</span>
444        <span class="index">1</span>
445        <span class="index">1</span>
446        <span class="index">1</span>...</td>
447	<td>&nbsp;</td>
448	<td>#4 #1 #6</td>
449</tr>
450
451<tr>
452	<td>10</td>
453	<td>Not Found</td>
454	<td><span class="processed"><span class="index">1</span>
455        <span class="index">1</span>
456        <span class="index">1</span>
457        <span class="index">1</span>
458        <span class="index">1</span></span>
459        <span class="buffer"><span class="index">2</span></span>
460        <span class="index">2</span>
461        <span class="index">2</span>
462        <span class="index">2</span>
463        <span class="index">2</span>
464        <span class="index">1</span>
465        <span class="index">1</span>
466        <span class="index">1</span>
467        <span class="index">1</span>...</td>
468	<td>#8 - 1, 1, 2</td>
469	<td>#4 #1 #6 #6</td>
470</tr>
471
472<tr>
473	<td>11</td>
474	<td>Read</td>
475	<td><span class="processed"><span class="index">1</span>
476        <span class="index">1</span>
477        <span class="index">1</span>
478        <span class="index">1</span>
479        <span class="index">1</span></span>
480        <span class="buffer"><span class="index">2</span></span>
481        <span class="current"><span class="index">2</span></span>
482        <span class="index">2</span>
483        <span class="index">2</span>
484        <span class="index">2</span>
485        <span class="index">1</span>
486        <span class="index">1</span>
487        <span class="index">1</span>
488        <span class="index">1</span>...</td>
489	<td>&nbsp;</td>
490	<td>#4 #1 #6 #6</td>
491</tr>
492
493<tr>
494	<td>12</td>
495	<td>Not Found </td>
496	<td><span class="processed"><span class="index">1</span>
497        <span class="index">1</span>
498        <span class="index">1</span>
499        <span class="index">1</span>
500        <span class="index">1</span>
501        <span class="index">2</span></span>
502        <span class="buffer"><span class="index">2</span></span>
503        <span class="index">2</span>
504        <span class="index">2</span>
505        <span class="index">2</span>
506        <span class="index">1</span>
507        <span class="index">1</span>
508        <span class="index">1</span>
509        <span class="index">1</span>...</td>
510	<td>#9 - 2, 2</td>
511	<td>#4 #1 #6 #6 #2</td>
512</tr>
513
514<tr>
515	<td>13</td>
516	<td>Read</td>
517	<td><span class="processed"><span class="index">1</span>
518        <span class="index">1</span>
519        <span class="index">1</span>
520        <span class="index">1</span>
521        <span class="index">1</span>
522        <span class="index">2</span></span>
523        <span class="buffer"><span class="index">2</span></span>
524        <span class="current"><span class="index">2</span></span>
525        <span class="index">2</span>
526        <span class="index">2</span>
527        <span class="index">1</span>
528        <span class="index">1</span>
529        <span class="index">1</span>
530        <span class="index">1</span>...</td>
531	<td>&nbsp;</td>
532	<td>#4 #1 #6 #6 #2</td>
533</tr>
534
535<tr>
536	<td>14</td>
537	<td>Found </td>
538	<td><span class="processed"><span class="index">1</span>
539        <span class="index">1</span>
540        <span class="index">1</span>
541        <span class="index">1</span>
542        <span class="index">1</span>
543        <span class="index">2</span></span>
544        <span class="buffer"><span class="index">2</span>
545        <span class="index">2</span></span>
546        <span class="index">2</span>
547        <span class="index">2</span>
548        <span class="index">1</span>
549        <span class="index">1</span>
550        <span class="index">1</span>
551        <span class="index">1</span>...</td>
552	<td>&nbsp;</td>
553	<td>#4 #1 #6 #6 #2</td>
554</tr>
555
556<tr>
557	<td>15</td>
558	<td>Read</td>
559	<td><span class="processed"><span class="index">1</span>
560        <span class="index">1</span>
561        <span class="index">1</span>
562        <span class="index">1</span>
563        <span class="index">1</span>
564        <span class="index">2</span></span>
565        <span class="buffer"><span class="index">2</span>
566        <span class="index">2</span></span>
567        <span class="current"><span class="index">2</span></span>
568        <span class="index">2</span>
569        <span class="index">1</span>
570        <span class="index">1</span>
571        <span class="index">1</span>
572        <span class="index">1</span>...</td>
573	<td>&nbsp;</td>
574	<td>#4 #1 #6 #6 #2</td>
575</tr>
576
577<tr>
578	<td>16</td>
579	<td>Not Found</td>
580	<td><span class="processed"><span class="index">1</span>
581        <span class="index">1</span>
582        <span class="index">1</span>
583        <span class="index">1</span>
584        <span class="index">1</span>
585        <span class="index">2</span>
586        <span class="index">2</span>
587        <span class="index">2</span></span>
588        <span class="buffer"><span class="index">2</span></span>
589        <span class="index">2</span>
590        <span class="index">1</span>
591        <span class="index">1</span>
592        <span class="index">1</span>
593        <span class="index">1</span>...</td>
594	<td>#10 - 2, 2, 2</td>
595	<td>#4 #1 #6 #6 #2 #9</td>
596</tr>
597
598<tr>
599	<td>17</td>
600	<td>Read</td>
601	<td><span class="processed"><span class="index">1</span>
602        <span class="index">1</span>
603        <span class="index">1</span>
604        <span class="index">1</span>
605        <span class="index">1</span>
606        <span class="index">2</span>
607        <span class="index">2</span>
608        <span class="index">2</span></span>
609        <span class="buffer"><span class="index">2</span></span>
610        <span class="current"><span class="index">2</span></span>
611        <span class="index">1</span>
612        <span class="index">1</span>
613        <span class="index">1</span>
614        <span class="index">1</span>...</td>
615	<td>&nbsp;</td>
616	<td>#4 #1 #6 #6 #2 #9</td>
617</tr>
618
619<tr>
620	<td>18</td>
621	<td>Found</td>
622	<td><span class="processed"><span class="index">1</span>
623        <span class="index">1</span>
624        <span class="index">1</span>
625        <span class="index">1</span>
626        <span class="index">1</span>
627        <span class="index">2</span>
628        <span class="index">2</span>
629        <span class="index">2</span></span>
630        <span class="buffer"><span class="index">2</span>
631        <span class="index">2</span></span>
632        <span class="index">1</span>
633        <span class="index">1</span>
634        <span class="index">1</span>
635        <span class="index">1</span>...</td>
636	<td>&nbsp;</td>
637	<td>#4 #1 #6 #6 #2 #9</td>
638</tr>
639
640<tr>
641	<td>19</td>
642	<td>Read</td>
643	<td><span class="processed"><span class="index">1</span>
644        <span class="index">1</span>
645        <span class="index">1</span>
646        <span class="index">1</span>
647        <span class="index">1</span>
648        <span class="index">2</span>
649        <span class="index">2</span>
650        <span class="index">2</span></span>
651        <span class="buffer"><span class="index">2</span>
652        <span class="index">2</span></span>
653        <span class="current"><span class="index">1</span></span>
654        <span class="index">1</span>
655        <span class="index">1</span>
656        <span class="index">1</span>...</td>
657	<td>&nbsp;</td>
658	<td>#4 #1 #6 #6 #2 #9</td>
659</tr>
660
661<tr>
662	<td>20</td>
663	<td>Not Found</td>
664	<td><span class="processed"><span class="index">1</span>
665        <span class="index">1</span>
666        <span class="index">1</span>
667        <span class="index">1</span>
668        <span class="index">1</span>
669        <span class="index">2</span>
670        <span class="index">2</span>
671        <span class="index">2</span>
672        <span class="index">2</span>
673        <span class="index">2</span></span>
674        <span class="buffer"><span class="index">1</span></span>
675        <span class="index">1</span>
676        <span class="index">1</span>
677        <span class="index">1</span>...</td>
678	<td>#11 - 2, 2, 1</td>
679	<td>#4 #1 #6 #6 #2 #9 #9</td>
680</tr>
681
682<tr>
683	<td>21</td>
684	<td>Read</td>
685	<td><span class="processed"><span class="index">1</span>
686        <span class="index">1</span>
687        <span class="index">1</span>
688        <span class="index">1</span>
689        <span class="index">1</span>
690        <span class="index">2</span>
691        <span class="index">2</span>
692        <span class="index">2</span>
693        <span class="index">2</span>
694        <span class="index">2</span></span>
695        <span class="buffer"><span class="index">1</span></span>
696        <span class="current"><span class="index">1</span></span>
697        <span class="index">1</span>
698        <span class="index">1</span>...</td>
699	<td>&nbsp;</td>
700	<td>#4 #1 #6 #6 #2 #9 #9</td>
701</tr>
702
703<tr>
704	<td>22</td>
705	<td>Found</td>
706	<td><span class="processed"><span class="index">1</span>
707        <span class="index">1</span>
708        <span class="index">1</span>
709        <span class="index">1</span>
710        <span class="index">1</span>
711        <span class="index">2</span>
712        <span class="index">2</span>
713        <span class="index">2</span>
714        <span class="index">2</span>
715        <span class="index">2</span></span>
716        <span class="buffer"><span class="index">1</span>
717        <span class="index">1</span></span>
718        <span class="index">1</span>
719        <span class="index">1</span>...</td>
720	<td>&nbsp;</td>
721	<td>#4 #1 #6 #6 #2 #9 #9</td>
722</tr>
723
724<tr>
725	<td>23</td>
726	<td>Read</td>
727	<td><span class="processed"><span class="index">1</span>
728        <span class="index">1</span>
729        <span class="index">1</span>
730        <span class="index">1</span>
731        <span class="index">1</span>
732        <span class="index">2</span>
733        <span class="index">2</span>
734        <span class="index">2</span>
735        <span class="index">2</span>
736        <span class="index">2</span></span>
737        <span class="buffer"><span class="index">1</span>
738        <span class="index">1</span></span>
739        <span class="current"><span class="index">1</span></span>
740        <span class="index">1</span>...</td>
741	<td>&nbsp;</td>
742	<td>#4 #1 #6 #6 #2 #9 #9</td>
743</tr>
744
745<tr>
746	<td>24</td>
747	<td>Found</td>
748	<td><span class="processed"><span class="index">1</span>
749        <span class="index">1</span>
750        <span class="index">1</span>
751        <span class="index">1</span>
752        <span class="index">1</span>
753        <span class="index">2</span>
754        <span class="index">2</span>
755        <span class="index">2</span>
756        <span class="index">2</span>
757        <span class="index">2</span></span>
758        <span class="buffer"><span class="index">1</span>
759        <span class="index">1</span>
760        <span class="index">1</span></span>
761        <span class="index">1</span>...</td>
762	<td>&nbsp;</td>
763	<td>#4 #1 #6 #6 #2 #9 #9</td>
764</tr>
765
766<tr>
767	<td>25</td>
768	<td>Read</td>
769	<td><span class="processed"><span class="index">1</span>
770        <span class="index">1</span>
771        <span class="index">1</span>
772        <span class="index">1</span>
773        <span class="index">1</span>
774        <span class="index">2</span>
775        <span class="index">2</span>
776        <span class="index">2</span>
777        <span class="index">2</span>
778        <span class="index">2</span></span>
779        <span class="buffer"><span class="index">1</span>
780        <span class="index">1</span>
781        <span class="index">1</span></span>
782        <span class="current"><span class="index">1</span></span>...</td>
783	<td>&nbsp;</td>
784	<td>#4 #1 #6 #6 #2 #9 #9</td>
785</tr>
786
787<tr>
788	<td>26</td>
789	<td>Not Found</td>
790	<td><span class="processed"><span class="index">1</span>
791        <span class="index">1</span>
792        <span class="index">1</span>
793        <span class="index">1</span>
794        <span class="index">1</span>
795        <span class="index">2</span>
796        <span class="index">2</span>
797        <span class="index">2</span>
798        <span class="index">2</span>
799        <span class="index">2</span>
800        <span class="index">1</span>
801        <span class="index">1</span>
802        <span class="index">1</span></span>
803        <span class="buffer"><span class="index">1</span></span>...</td>
804	<td>#12 - 1, 1, 1, 1</td>
805	<td>#4 #1 #6 #6 #2 #9 #9 #7</td>
806</tr>
807</tbody>
808</table>
809
810<p>I've included a few more steps to help you see the pattern. You
811keep going until you run out of indexes in the index stream. When
812there is nothing new to read, you simply write out the code for
813whatever values you may have in your index buffer.  Finally you should
814send the end-of-information code to the code stream. In this example,
815that code is #5. (View the <a href="lzw_image_data_code_table.html">
816complete code table</a>.)</p>
817
818<p>As you can see we dynamically built many new codes for our code
819table as we compressed the data. For large files this can turn into a
820large number of codes. It turns out that the GIF format specifies a
821maximum code of #4095 (this happens to be the largest 12-bit
822number). If you want to use a new code, you have to clear out all of
823your old codes. You do this by sending the clear code (which for our
824sample was the #4). This tells the decoder that you are reinitializing
825your code table and it should too. Then you start building your own
826codes again starting just after the value for your end-of-information
827code (in our sample, we would start again at #6).</p>
828
829<p>The final code stream would look like this:</p>
830
831<blockquote><p>#4 #1 #6 #6 #2 #9 #9 #7 #8 #10 #2 #12 #1 #14 #15 #6 #0 #21 #0 #10 #7 #22 #23
832#18 #26 #7 #10 #29 #13 #24 #12 #18 #16 #36 #12 #5</p></blockquote>
833
834<p>This is only 36 codes versus the 100 that would be required without compression.</p>
835
836<h2><a name="lzw_decompression">LZW Decompression</a></h2>
837
838<p> At some point we will need to turn this code stream back into
839a picture. To do this, we only need to know the values in the stream
840and the size of the color table that was used. That's it. You remember that
841big code table we built during compression? We actually have enough information
842in the code stream itself to be able to rebuild it.</p>
843
844<p>Again, i'll list the algorithm and then we will walk though an example. Let
845me define a few terms i will be using. CODE will be current code we're working
846with. CODE-1 will be the code just before CODE in the code stream. {CODE}
847will be the value for CODE in the code table. For example, using the code
848table we created during compression, if CODE=#7 then {CODE}=1,1,1.
849In the same way, {CODE-1} would be the value in the code table for the
850code that came before CODE. Looking at step 26 from the compression,
851if CODE=#7, then {CODE-1} would be {#9}, not {#6}, which was 2,2.</p>
852
853<ul>
854<li>Initialize code table</li>
855<li>let CODE be the first code in the code stream</li>
856<li>output {CODE} to index stream</li>
857<li>&lt;LOOP POINT&gt;</li>
858<li>let CODE be the next code in the code stream</li>
859<li>is CODE in the code table?</li>
860<li>Yes:
861	<ul>
862	<li>output {CODE} to index stream</li>
863	<li>let K be the first index in {CODE}</li>
864	<li>add {CODE-1}+K to the code table</li>
865	</ul>
866</li>
867<li>No:
868	<ul>
869	<li>let K be the first index of {CODE-1}</li>
870	<li>output {CODE-1}+K to index stream</li>
871	<li>add {CODE-1}+K to code table</li>
872	</ul>
873</li>
874<li>return to LOOP POINT</li>
875</ul>
876
877<p>Let's start reading though the code stream we've created to show how to
878turn it back into a list of color indexes.  The first value in the code
879stream should be a clear code. This means we should initialize our code
880table. To do this we must know how many colors  are in our color table.
881(This information comes from the first byte in the image data block in
882the file. More on this later.) Again we will set up codes #0-#3 to be each
883of the four colors and add in the clear code (#4)
884and end of information code (#5).</p>
885
886<p>The next step is to read the first color code. In the following table you
887will see the values of CODE highlighted in purple, and the values for
888CODE-1 highlighted in green. Our first CODE value is #1. We then output
889{#1}, or simply 1,  to the index stream [Step 0].</p>
890
891<p>Now we enter the main loop of the algorithm. The next code gets assigned
892to CODE which now makes that value #6. Next we check to see if this value
893is in our code table. At this time, it is not. This means we must find the
894first index in the value of {CODE-1} and call this K. Thus K = first index of
895{CODE-1} = first index of {#1} = 1. Now we output {CODE-1} + K to the index
896stream and add this value to our code table. The means we output 1,1 and
897give this value a code of #6 [Step 1].</p>
898
899<table class="alg_steps" cellspacing="0">
900<tr>
901	<th>Step</th>
902	<th>Action</th>
903	<th>Code Stream</th>
904	<th>New Code Table Row</th>
905	<th>Index Stream</th>
906</tr>
907<tr>
908	<td>0</td>
909	<td>Init</td>
910	<td><span class="processed">#4</span> <span class="current">#1</span> #6 #6 #2 #9 #9 #7 ...</td>
911	<td>&nbsp;</td>
912	<td>1</td>
913</tr>
914<tr>
915	<td>1</td>
916	<td>Not Found</td>
917	<td><span class="processed">#4</span> <span class="buffer">#1</span> <span class="current">#6</span> #6 #2 #9 #9 #7 ...</td>
918	<td>#6 - 1, 1</td>
919	<td>1, 1, 1</td>
920</tr>
921<tr>
922	<td>2</td>
923	<td>Found</td>
924	<td><span class="processed">#4 #1</span> <span class="buffer">#6</span> <span class="current">#6</span> #2 #9 #9 #7 ...</td>
925	<td>#7 - 1, 1, 1</td>
926	<td>1, 1, 1, 1, 1</td>
927</tr>
928<tr>
929	<td>3</td>
930	<td>Found</td>
931	<td><span class="processed">#4 #1 #6</span> <span class="buffer">#6</span> <span class="current">#2</span> #9 #9 #7 ...</td>
932	<td>#8 - 1, 1, 2</td>
933	<td>1, 1, 1, 1, 1, 2</td>
934</tr>
935<tr>
936	<td>4</td>
937	<td>Not Found</td>
938	<td><span class="processed">#4 #1 #6 #6</span> <span class="buffer">#2</span> <span class="current">#9</span> #9 #7 ...</td>
939	<td>#9 - 2, 2</td>
940	<td>1, 1, 1, 1, 1, 2, 2, 2</td>
941</tr>
942<tr>
943	<td>5</td>
944	<td>Found</td>
945	<td><span class="processed">#4 #1 #6 #6 #2</span> <span class="buffer">#9</span> <span class="current">#9</span> #7 ...</td>
946	<td>#10 - 2, 2, 2</td>
947	<td>1, 1, 1, 1, 1, 2, 2, 2, 2, 2</td>
948</tr>
949<tr>
950	<td>6</td>
951	<td>Found</td>
952	<td><span class="processed">#4 #1 #6 #6 #2 #9</span> <span class="buffer">#9</span> <span class="current">#7</span> ...</td>
953	<td>#11 - 2, 2, 1</td>
954	<td>1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 1, 1, 1</td>
955</tr>
956</table>
957
958<p>We start the loop again by reading the next code. CODE now would be
959#6 and this time we do have a record for this code in our code
960table. Thus we dump {#6} to the index stream which would be 1,1.
961Now we take the first index in {#6} and call that K. Here, {#6} has
962two indexes, the first of which is 1; thus K = 1. Before moving
963on, we add {CODE-1}+K to the code table. This #7 is now 1, 1, 1 [Step 2].</p>
964
965<p>I've included a few more steps so you can see the algorithm in action. While
966the explanation may sound complicated, you can see it's actually quite simple.
967You'll also notice that you end up building the exact same
968<a href="lzw_image_data_code_table.html">code table</a>
969as the one that was created during compression. This is the reason that
970LZW is so great; we can just share the codes and not the table.</p>
971
972<h2><a name="lzw_bytes">Saving the Code Stream as Bytes</a></h2>
973
974<p>I've shown you how to go back and forth between index and code stream, but
975haven't told you what to do with them. The index stream is used to specify the
976color of each of the pixel of your image and really only shows up on screen.
977It is the code stream that is actually saved in the GIF files on your computer
978or transmitted over the internet. In order to save these code streams, we must
979turn them into bytes. The first thought might be to store each of the codes
980as its own byte; however this would limit the max code to just #255 and
981result in a lot of wasted bits for the small codes. To solve these problems,
982the GIF file format actually uses flexible <em>code sizes</em>.</p>
983
984<p>Flexible code sizes allow for further compression by limiting the bits
985needed to save the code stream as bytes. The <em>code size</em> is the number
986of bits it takes to store the value of the code. When we talk about bits,
987we're referring to the 1's and 0's that make up a byte. The codes are
988converted to their binary values to come up with the bits. To specify
989the code for #4, you would look at this binary equivalent, which is 100,
990and see that you would need three bits to store this value. The largest code
991value in our sample code stream is #36 (binary: 100100) which would
992take 6 bits to encode. Note that the number of bits i've just given is
993the minimum number. You can make the number take up more bits by adding
994zeros to the front.</p>
995
996<p style="text-align:center"><img src="image_data_block.gif" alt="GIF image data block layout" style="border: 1px solid black" /></p>
997
998<p>We need a way to know what size each of the codes are. Recall that the
999image data block begins with a single byte value called the
1000<em>LZW minimum code size</em>. The GIF format allows sizes as small
1001as 2 bits and as large as 12 bits. This minimum code size value is typically
1002the number of bits/pixel of the image. So if you have 32 colors in your image,
1003you will need 5 bits/pixel (for numbers 0-31 because 31 in binary is 11111).
1004Thus, this will most likely be one more than the bit value for the size of the
1005color table you are using. (Even if you only have two colors, the minimum
1006code size most be at least 2.) Refer to the <a href="#color_table_size">
1007code table above</a> to remind yourself how that works.</p>
1008
1009<p>Here's the funny thing: the value for minimum code size isn't
1010actually the smallest code size that's used in the encoding
1011process. Because the minimum code size tells you how many bits are
1012needed just for the different colors of the image, you still have to
1013account for the two special codes that we always add to the code
1014table. Therefore the actual smallest code size that will be used is
1015one more than the value specified in the &quot;minimum&quot; code size
1016byte. I'll call this new value the <em>first code size</em>.</p>
1017
1018<p>We now know how many bytes the first code will be. This size will probably
1019work for the next few codes as well, but recall that the GIF format
1020allows for flexible code sizes. As larger code values get added to your
1021code table, you will soon realize that you need larger code sizes if you
1022were to output those values. When you are encoding the data, you increase
1023your code size as soon as your write out the code equal to
10242^(current code size)-1. If you are decoding from codes to indexes,
1025you need to increase your code size as soon as you add the code value that
1026is equal to 2^(current code size)-1 to your code table. That is, the next
1027time you grab the next section of bits, you grab one more.</p>
1028
1029<p>Note that the largest code size allowed is 12 bits. If you get to this
1030limit, the next code you encounter should be the <em>clear code</em> which
1031would tell you to reinitialize the code table. You then go back to using
1032the first code size and grow again when necessary.</p>
1033
1034<p>Jumping back to our sample image, we see that we have a minimum code
1035size value of 2 which means out first code size will be 3 bits long.
1036Out first three codes, #1 #6 and #6, would be coded as 001 110 and 110.
1037If you see at Step 6 of the encoding, we added a code of #7 to our code
1038table. This is our clue to increase our code size because 7 is equal to
10392^3-1 (where 3 is our current code size). Thus, the next code we
1040write out, #2, will use the new code size of 4 and therefore look
1041like 0010. In the decoding process, we again would increase our code
1042size when we read the code for #7 and would read the next 4, rather than
1043the next 3 bits, to get the next code. In the sample table above this
1044occurs in Step 2.</p>
1045
1046<p>Finally we must turn all these bit values into bytes. The lowest bit of the
1047code bit value gets placed in the lowest available bit of the byte. After
1048you've filled up the 8 bits in the byte, you take any left over bits and
1049start a new byte. Take a look at the following illustration to see
1050how that works with the codes from our sample image.</p>
1051
1052<p style="text-align:center"><img src="lzw_encoding_codes.gif"
1053alt="Encoding LZW Codes" style="border: 1px solid black" / WIDTH="500"
1054HEIGHT="220"></p>
1055
1056<p>You can see in the first byte that was returned (<span
1057class="byte">8C</span>) that the lowest three bits (because that was
1058our first code size) contain 110 which is the binary value of 4 so
1059that would be the clear code we started with, #4. In the three bits to
1060the left, you see 001 which out or first data code of #1. You can also
1061see when we switched into code sizes of 4 bits in the second byte
1062(<span class="byte">2D</span>).</p>
1063
1064<p>When you run out of codes but have filled less than 8 bits of the
1065byte, you should just fill the remaining bits with zeros. Recall that
1066the image data must be broken up onto <a
1067href="bits_and_bytes.html#image_data_block">data sub-blocks</a>.  Each
1068of the data sub-blocks begins with a byte that specifies how many
1069bytes of data. The value will be between 1 and 255. After you read
1070those bytes, the next byte indicates again how many bytes of data
1071follow. You stop when you encounter a subblock that has a lenght of
1072zero. That tells you when you've reached the end of the image data. In
1073our sample the image the byte just after the LZW code size is <span
1074class="byte">16</span> which indicates that 22 bytes of data
1075follow. After we reach those, we see the next byte is <span
1076class="byte">00</span> which means we are all done.</p>
1077
1078<p>Return codes from bytes the basically just the same process in
1079reverse.  A sample illustration of the process follows which shows how
1080you would extract codes if the first code size were 5 bits.</p>
1081
1082<p style="text-align:center"><img src="lzw_decoding_bytes.gif" alt="Decoding LZW Bytes" style="border: 1px solid black" / WIDTH="500" HEIGHT="220"></p>
1083
1084<h2>Next: Animation and Transparency</h2>
1085
1086<p>That is pretty much everything you need to know to read or generate
1087a basic image file. One of the reasons the GIF becames such a popular
1088format was because it also allowed for &quot;fancier&quot; features. These
1089features include animation and transparency. Next we'll look
1090at how those work.</p>
1091
1092<p><a href="animation_and_transparency.html">Continue...</a></p>
1093</div>
1094
1095<div style="text-align:center; margin-top: 10px; padding-top: 10px; border-top: #cecece 1px solid">
1096<a href="../index.html">Back to GIFLIB documentation</a>
1097</div>
1098
1099</body>
1100
1101</html>
1102