1<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> 2 3<html> 4<head> 5<title>What's In A GIF - LZW Image Data</title> 6<style type="text/css"> 7.byte {font-family: Courier, fixed; 8 padding: .2em} 9.gif_header {background-color: #f9E89D} 10.gif_screen {background-color: #C8DBD9} 11.gif_color {background-color: #E1E1E1} 12.gif_graphic {background-color: #F9EB9D} 13.gif_imgdesc {background-color: #C2D1DC} 14.gif_imgdata {background-color: #D0C4C4} 15.gif_trailer {background-color: #f9E89D} 16.gif_ext {background-color: #D0CFAE} 17#global_color_size {margin-left: auto; margin-right:auto; border:1px solid black;} 18#global_color_size th {border-bottom: 1px solid #666666} 19#global_color_size td {text-align:center;} 20.code_table {margin-left: auto; margin-right:auto; border:1px solid black;} 21.code_table th {text-align: left; border-bottom: 1px solid #666666} 22.alg_steps {margin: 0 auto; border: 1px solid black} 23.alg_steps th, .alg_steps td {border: 1px solid black} 24.alg_steps .index {padding: 0 .3em} 25.alg_steps .processed {color: #CCC} 26.alg_steps .buffer {background: #C8DBD9 url(highlight_green.gif) repeat-x center left; 27 border-top: 1px solid #AAA2A2; border-bottom: 1px solid #AAA2A2;} 28.alg_steps .current {background: #D0C4C4 url(highlight_purple.gif) repeat-x center left; 29 border-top: 1px solid #98A5A4; border-bottom: 1px solid #98A5A4;} 30</style> 31</head> 32<body> 33<table width='100%' cellpadding='0' summary='Canned page header' bgcolor="#ddd"> 34<tr> 35<td><h2>What's In A GIF</h2></td> 36<td align="center"><img src="../giflib-logo.gif"></td> 37<td align="right">(LZW image data)</td> 38</tr> 39</table> 40 41<div id="body"> 42<div style="text-align:center; margin-top: 10px; padding-top: 10px; border-top: #cecece 1px solid"> 43<p><a href="index.html">Back to the GIF index page.</a></p> 44</div> 45 46<p>Now let's look at exactly how we go about storing an image in a GIF 47file. The GIF format is a raster format, meaning it stores image data 48by remembering the color of every pixel in the image. More 49specifically, GIF files remember the index of the color in a color 50table for each pixel. To make that clearer, let's review the 51sample image we used in the <a href="bits_and_bytes.html">first 52section</a>.</p> 53 54<table style="margin-left: auto; margin-right:auto;"> 55<tr> 56<td style="text-align:center; vertical-align: top; padding: 5px;width:30%"> 57<h3>Actual Size</h3> 58<img src="sample_1.gif" alt="sample gif, actual size" title="Actual Size" 59width="10" height="10" style="padding: 20px" /><br/>(10x10)</td> 60<td style="text-align:center; vertical-align: top; padding: 5px;; width:40%"> 61<h3>Enlarged</h3> 62<img src="sample_1_enlarged.gif" alt="sample gif, enlarged" 63title="Enlarged" width="100" height="100" /><br/>(100x100)</td> 64<td style="vertical-align: top; padding: 5px; width:30%"> 65<h3>Color Table</h3> 66<table> 67<tr><th>Index</th><th>Color</th></tr> 68<tr><td>0</td><td><span style="color:#FFFFFF; background: #000000; font-weight: bold">White</span></td></tr> 69<tr><td>1</td><td><span style="color:#FF0000; font-weight: bold">Red</span></td></tr> 70<tr><td>2</td><td><span style="color:#0000FF; font-weight: bold">Blue</span></td></tr> 71<tr><td>3</td><td><span style="font-weight: bold">Black</span></td></tr> 72</table> 73</td> 74</tr></table> 75 76<p>The color table came from the global color table block. The colors 77are listed in the order which they appear in the file. The first color 78is given an index of zero. When we send the codes, we always start at 79the top left of the image and work our way right. When we get to the 80end of the line, the very next code is the one that starts the next 81line. (The decoder will "wrap" the image based on the image 82dimensions.) We could encode our sample image in the following 83way:</p> 84 85<blockquote><p>1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 1, 861, 1, 1, 1, 2, 2, 2, 2, 2, 1, 1, 1, 0, 0, 0, 0, 2, 2, 2, 1, 1, 1, 870, 0, 0, 0, 2, 2, 2, ...</p></blockquote> 88 89<p>The above listing shows the sequence required to render the first five 90lines of the image. We could continue with this method until we've 91specified the color for every pixel; however, this can result in a 92rather large file. Luckily for us, the GIF format allows us to take 93advantage of repetition in our output and to compress our data.</p> 94 95<p>Much of the following information came from John Barkaus's tutorial 96<cite>LZW and GIF Explained</cite>, which seems to have fallen off the 97web. I've tried to provide more detailed samples as well 98as illustrations to make the process even clearer</p> 99 100<h2><a name="lzw_compression">LZW Compression</a></h2> 101 102<p>The compression method GIF use is a variant of LZW 103(Lempel-Ziv-Welch) compression. To start using this method, we need a 104<strong>code table</strong>. This code table will allow us to use 105special codes to indicate a sequence of colors rather than just one at 106a time. The first thing we do is to <em>initialize the code 107table</em>. We start by adding a code for each of the colors in the 108color table. This would be a local color table if one was provided, or 109the global color table. (I will be starting all codes with 110"#" to distinguish them from color indexes.)</p> 111 112<table class="code_table"> 113<tr><th>Code</th><th>Color(s)</th></tr> 114<tr><td>#0</td><td>0</td></tr> 115<tr><td>#1</td><td>1</td></tr> 116<tr><td>#2</td><td>2</td></tr> 117<tr><td>#3</td><td>3</td></tr> 118<tr><td>#4</td><td>Clear Code</td></tr> 119<tr><td>#5</td><td>End Of Information Code</td></tr> 120</table> 121 122<p>I added a code for each of the colors in the global color table of 123our sample image. I also snuck in two special control codes. (These 124special codes are only used in the GIF version of LZW, not in standard 125LZW compression.) Our code table is now considered initialized.</p> 126 127<p>Let me now explain what those special codes are for. The first new code 128is the <em>clear code</em> (CC). Whenever you come across the clear code 129in the image data, it's your cue to reinitialize the code table. (I'll 130explain why you might need to do this in a bit.) The second new code 131is the <em>end of information code</em> (EOI). When you come across 132this code, this means you've reached the end of the image. Here I've placed 133the special codes right after the color codes, but actually the value of 134the special codes depends on the value of the LZW minimum code size 135from the image data block. If the LZW minimum code size is the same as 136the color table size, then special codes immediatly follow the colors; however 137it is possible to specify a larger LWZ minimum code size which may leave 138a gap in the codes where no colors are assigned. This can be 139summarizaed in the <a name="color_table_size">following table</a>.</p> 140 141<div style="text-align:center"> 142<table id="global_color_size"> 143<tr><th>LWZ Min Code<br/>Size</th><th>Color<br/>Codes</th><th>Clear<br/>Code</th><th>EOI<br/>Code</th></tr> 144<tr><td>2</td><td>#0-#3</td><td>#4</td><td>#5</td></tr> 145<tr><td>3</td><td>#0-#7</td><td>#8</td><td>#9</td></tr> 146<tr><td>4</td><td>#0-#15</td><td>#16</td><td>#17</td></tr> 147<tr><td>5</td><td>#0-#31</td><td>#32</td><td>#33</td></tr> 148<tr><td>6</td><td>#0-#63</td><td>#64</td><td>#65</td></tr> 149<tr><td>7</td><td>#0-#127</td><td>#128</td><td>#129</td></tr> 150<tr><td>8</td><td>#0-#255</td><td>#256</td><td>#257</td></tr> 151</table> 152</div> 153 154<p>Before we proceed, let me define two more terms. First the <strong>index 155stream</strong> will be the list of indexes of the color for each of 156the pixels. This is the input we will be compressing. The <strong>code 157stream</strong> will be the list of codes we generate as output. The 158<strong>index buffer</strong> will be the list of color indexes 159we care "currently looking at." The index buffer will contain a list 160of one or more color indexes. Now we can step though the LZW 161compression algorithm. First, I'll just list the steps. After that 162I'll walk through the steps with our specific example.</p> 163 164<ul> 165<li>Initialize code table</li> 166<li>Always start by sending a clear code to the code stream.</li> 167<li>Read first index from index stream. This value is now the value 168for the index buffer</li> 169<li><LOOP POINT></li> 170<li>Get the next index from the index stream to the index buffer. We will 171call this index, K</li> 172<li>Is index buffer + K in our code table?</li> 173<li>Yes: 174 <ul> 175 <li>add K to the end of the index buffer</li> 176 <li>if there are more indexes, return to LOOP POINT</li> 177 </ul> 178</li> 179<li>No: 180 <ul> 181 <li>Add a row for index buffer + K into our code table with 182 the next smallest code</li> 183 <li>Output the code for just the index buffer to our code steam</li> 184 <li>Index buffer is set to K</li> 185 <li>K is set to nothing</li> 186 <li>if there are more indexes, return to LOOP POINT</li> 187 </ul> 188</li> 189<li>Output code for contents of index buffer</li> 190<li>Output end-of-information code</li> 191</ul> 192 193<p>Seems simple enough, right? It really isn't all that bad. Let's 194walk though our sample image to show you how this works. (The steps I 195will be describing are summarized in the following table. Numbers 196highlighted in green are in the index buffer; numbers in purple are 197the current K value.) We have already initialized our code table. We 198start by doing two things: we output our clear code (#4) to the code 199stream, and we read the first color index from the index stream, 1, 200into our index buffer [Step 0].</p> 201 202<p>Now we enter the main loop of the algorithm. We read the next index 203in the index stream, 1, into K [Step 1]. Next we see if we have a 204record for the index buffer plus K in the code stream. In this case we 205looking for 1,1. Currently our code table only contains single colors 206so this value is not in there. Now we will actually add a new row to 207our code table that does contain this value. The next available code 208is #6, we will let #6 be 1,1. Note that we do not actually send this 209code to the code stream, instead we send just the code for the 210value(s) in the index buffer. The index buffer is just 1 and the code 211for 1 is #1. This is the code we output. We now reset the index buffer 212to just the value in K and K becomes nothing. [Step 2].</p> 213 214<p>We continue by reading the next index into K. [Step 3]. Now K is 1 and the 215index buffer is 1. Again we look to see if there is a value in our code 216table for the buffer plus K (1,1) and this time there is. (In fact we just 217added it.) Therefore we add K to the end of the index buffer and clear out 218K. Now our index buffer is 1,1. [Step 4].</p> 219 220<p>The next index in the index stream is yet another 1. This is our 221new K [Step 5]. Now the index buffer plus K is 1,1,1 which we do not 222have a code for in our code table. As we did before, we define a new 223code and add it to the code table. The next code would be #7; thus #7 224= 1, 1, 1. Now we kick out the code for just the values in the index 225buffer (#6 = 1,1) to the code stream and set the index buffer to be 226K. [Step 6].</p> 227 228<table class="alg_steps" cellspacing="0"> 229<tbody> 230<tr> 231 <th>Step</th> 232 <th>Action</th> 233 <th>Index Stream</th> 234 <th>New Code Table Row</th> 235 <th>Code Stream</th> 236</tr> 237<tr> 238 <td>0</td> 239 <td>Init</td> 240 <td><span class="processed"></span> 241 <span class="buffer"><span class="index">1</span></span> 242 <span class="index">1</span> 243 <span class="index">1</span> 244 <span class="index">1</span> 245 <span class="index">1</span> 246 <span class="index">2</span> 247 <span class="index">2</span> 248 <span class="index">2</span> 249 <span class="index">2</span> 250 <span class="index">2</span> 251 <span class="index">1</span> 252 <span class="index">1</span> 253 <span class="index">1</span> 254 <span class="index">1</span>...</td> 255 <td> </td> 256 <td>#4</td> 257</tr> 258 259<tr> 260 <td>1</td> 261 <td>Read</td> 262 <td><span class="processed"></span> 263 <span class="buffer"><span class="index">1</span></span> 264 <span class="current"><span class="index">1</span></span> 265 <span class="index">1</span> 266 <span class="index">1</span> 267 <span class="index">1</span> 268 <span class="index">2</span> 269 <span class="index">2</span> 270 <span class="index">2</span> 271 <span class="index">2</span> 272 <span class="index">2</span> 273 <span class="index">1</span> 274 <span class="index">1</span> 275 <span class="index">1</span> 276 <span class="index">1</span>...</td> 277 <td> </td> 278 <td>#4</td> 279</tr> 280 281<tr> 282 <td>2</td> 283 <td>Not Found</td> 284 <td><span class="processed"><span class="index">1</span></span> 285 <span class="buffer"><span class="index">1</span></span> 286 <span class="index">1</span> 287 <span class="index">1</span> 288 <span class="index">1</span> 289 <span class="index">2</span> 290 <span class="index">2</span> 291 <span class="index">2</span> 292 <span class="index">2</span> 293 <span class="index">2</span> 294 <span class="index">1</span> 295 <span class="index">1</span> 296 <span class="index">1</span> 297 <span class="index">1</span>...</td> 298 <td>#6 - 1, 1</td> 299 <td>#4 #1</td> 300</tr> 301 302<tr> 303 <td>3</td> 304 <td>Read</td> 305 <td><span class="processed"><span class="index">1</span></span> 306 <span class="buffer"><span class="index">1</span></span> 307 <span class="current"><span class="index">1</span></span> 308 <span class="index">1</span> 309 <span class="index">1</span> 310 <span class="index">2</span> 311 <span class="index">2</span> 312 <span class="index">2</span> 313 <span class="index">2</span> 314 <span class="index">2</span> 315 <span class="index">1</span> 316 <span class="index">1</span> 317 <span class="index">1</span> 318 <span class="index">1</span>...</td> 319 <td> </td> 320 <td>#4 #1</td> 321</tr> 322 323<tr> 324 <td>4</td> 325 <td>Found</td> 326 <td><span class="processed"><span class="index">1</span></span> 327 <span class="buffer"><span class="index">1</span> 328 <span class="index">1</span></span> 329 <span class="index">1</span> 330 <span class="index">1</span> 331 <span class="index">2</span> 332 <span class="index">2</span> 333 <span class="index">2</span> 334 <span class="index">2</span> 335 <span class="index">2</span> 336 <span class="index">1</span> 337 <span class="index">1</span> 338 <span class="index">1</span> 339 <span class="index">1</span>...</td> 340 <td> </td> 341 <td>#4 #1</td> 342</tr> 343 344<tr> 345 <td>5</td> 346 <td>Read</td> 347 <td><span class="processed"><span class="index">1</span></span> 348 <span class="buffer"><span class="index">1</span> 349 <span class="index">1</span></span> 350 <span class="current"><span class="index">1</span></span> 351 <span class="index">1</span> 352 <span class="index">2</span> 353 <span class="index">2</span> 354 <span class="index">2</span> 355 <span class="index">2</span> 356 <span class="index">2</span> 357 <span class="index">1</span> 358 <span class="index">1</span> 359 <span class="index">1</span> 360 <span class="index">1</span>...</td> 361 <td> </td> 362 <td>#4 #1</td> 363</tr> 364 365<tr> 366 <td>6</td> 367 <td>Not Found</td> 368 <td><span class="processed"><span class="index">1</span> 369 <span class="index">1</span> 370 <span class="index">1</span></span> 371 <span class="buffer"><span class="index">1</span></span> 372 <span class="index">1</span> 373 <span class="index">2</span> 374 <span class="index">2</span> 375 <span class="index">2</span> 376 <span class="index">2</span> 377 <span class="index">2</span> 378 <span class="index">1</span> 379 <span class="index">1</span> 380 <span class="index">1</span> 381 <span class="index">1</span>...</td> 382 <td>#7 - 1, 1, 1</td> 383 <td>#4 #1 #6</td> 384</tr> 385</tbody> 386<tbody id="compress_more"> 387 388<tr> 389 <td>7</td> 390 <td>Read</td> 391 <td><span class="processed"><span class="index">1</span> 392 <span class="index">1</span> 393 <span class="index">1</span></span> 394 <span class="buffer"><span class="index">1</span></span> 395 <span class="current"><span class="index">1</span></span> 396 <span class="index">2</span> 397 <span class="index">2</span> 398 <span class="index">2</span> 399 <span class="index">2</span> 400 <span class="index">2</span> 401 <span class="index">1</span> 402 <span class="index">1</span> 403 <span class="index">1</span> 404 <span class="index">1</span>...</td> 405 <td> </td> 406 <td>#4 #1 #6</td> 407</tr> 408 409<tr> 410 <td>8</td> 411 <td>Found</td> 412 <td><span class="processed"><span class="index">1</span> 413 <span class="index">1</span> 414 <span class="index">1</span></span> 415 <span class="buffer"><span class="index">1</span> 416 <span class="index">1</span></span> 417 <span class="index">2</span> 418 <span class="index">2</span> 419 <span class="index">2</span> 420 <span class="index">2</span> 421 <span class="index">2</span> 422 <span class="index">1</span> 423 <span class="index">1</span> 424 <span class="index">1</span> 425 <span class="index">1</span>...</td> 426 <td> </td> 427 <td>#4 #1 #6</td> 428</tr> 429 430<tr> 431 <td>9</td> 432 <td>Read</td> 433 <td><span class="processed"><span class="index">1</span> 434 <span class="index">1</span> 435 <span class="index">1</span></span> 436 <span class="buffer"><span class="index">1</span> 437 <span class="index">1</span></span> 438 <span class="current"><span class="index">2</span></span> 439 <span class="index">2</span> 440 <span class="index">2</span> 441 <span class="index">2</span> 442 <span class="index">2</span> 443 <span class="index">1</span> 444 <span class="index">1</span> 445 <span class="index">1</span> 446 <span class="index">1</span>...</td> 447 <td> </td> 448 <td>#4 #1 #6</td> 449</tr> 450 451<tr> 452 <td>10</td> 453 <td>Not Found</td> 454 <td><span class="processed"><span class="index">1</span> 455 <span class="index">1</span> 456 <span class="index">1</span> 457 <span class="index">1</span> 458 <span class="index">1</span></span> 459 <span class="buffer"><span class="index">2</span></span> 460 <span class="index">2</span> 461 <span class="index">2</span> 462 <span class="index">2</span> 463 <span class="index">2</span> 464 <span class="index">1</span> 465 <span class="index">1</span> 466 <span class="index">1</span> 467 <span class="index">1</span>...</td> 468 <td>#8 - 1, 1, 2</td> 469 <td>#4 #1 #6 #6</td> 470</tr> 471 472<tr> 473 <td>11</td> 474 <td>Read</td> 475 <td><span class="processed"><span class="index">1</span> 476 <span class="index">1</span> 477 <span class="index">1</span> 478 <span class="index">1</span> 479 <span class="index">1</span></span> 480 <span class="buffer"><span class="index">2</span></span> 481 <span class="current"><span class="index">2</span></span> 482 <span class="index">2</span> 483 <span class="index">2</span> 484 <span class="index">2</span> 485 <span class="index">1</span> 486 <span class="index">1</span> 487 <span class="index">1</span> 488 <span class="index">1</span>...</td> 489 <td> </td> 490 <td>#4 #1 #6 #6</td> 491</tr> 492 493<tr> 494 <td>12</td> 495 <td>Not Found </td> 496 <td><span class="processed"><span class="index">1</span> 497 <span class="index">1</span> 498 <span class="index">1</span> 499 <span class="index">1</span> 500 <span class="index">1</span> 501 <span class="index">2</span></span> 502 <span class="buffer"><span class="index">2</span></span> 503 <span class="index">2</span> 504 <span class="index">2</span> 505 <span class="index">2</span> 506 <span class="index">1</span> 507 <span class="index">1</span> 508 <span class="index">1</span> 509 <span class="index">1</span>...</td> 510 <td>#9 - 2, 2</td> 511 <td>#4 #1 #6 #6 #2</td> 512</tr> 513 514<tr> 515 <td>13</td> 516 <td>Read</td> 517 <td><span class="processed"><span class="index">1</span> 518 <span class="index">1</span> 519 <span class="index">1</span> 520 <span class="index">1</span> 521 <span class="index">1</span> 522 <span class="index">2</span></span> 523 <span class="buffer"><span class="index">2</span></span> 524 <span class="current"><span class="index">2</span></span> 525 <span class="index">2</span> 526 <span class="index">2</span> 527 <span class="index">1</span> 528 <span class="index">1</span> 529 <span class="index">1</span> 530 <span class="index">1</span>...</td> 531 <td> </td> 532 <td>#4 #1 #6 #6 #2</td> 533</tr> 534 535<tr> 536 <td>14</td> 537 <td>Found </td> 538 <td><span class="processed"><span class="index">1</span> 539 <span class="index">1</span> 540 <span class="index">1</span> 541 <span class="index">1</span> 542 <span class="index">1</span> 543 <span class="index">2</span></span> 544 <span class="buffer"><span class="index">2</span> 545 <span class="index">2</span></span> 546 <span class="index">2</span> 547 <span class="index">2</span> 548 <span class="index">1</span> 549 <span class="index">1</span> 550 <span class="index">1</span> 551 <span class="index">1</span>...</td> 552 <td> </td> 553 <td>#4 #1 #6 #6 #2</td> 554</tr> 555 556<tr> 557 <td>15</td> 558 <td>Read</td> 559 <td><span class="processed"><span class="index">1</span> 560 <span class="index">1</span> 561 <span class="index">1</span> 562 <span class="index">1</span> 563 <span class="index">1</span> 564 <span class="index">2</span></span> 565 <span class="buffer"><span class="index">2</span> 566 <span class="index">2</span></span> 567 <span class="current"><span class="index">2</span></span> 568 <span class="index">2</span> 569 <span class="index">1</span> 570 <span class="index">1</span> 571 <span class="index">1</span> 572 <span class="index">1</span>...</td> 573 <td> </td> 574 <td>#4 #1 #6 #6 #2</td> 575</tr> 576 577<tr> 578 <td>16</td> 579 <td>Not Found</td> 580 <td><span class="processed"><span class="index">1</span> 581 <span class="index">1</span> 582 <span class="index">1</span> 583 <span class="index">1</span> 584 <span class="index">1</span> 585 <span class="index">2</span> 586 <span class="index">2</span> 587 <span class="index">2</span></span> 588 <span class="buffer"><span class="index">2</span></span> 589 <span class="index">2</span> 590 <span class="index">1</span> 591 <span class="index">1</span> 592 <span class="index">1</span> 593 <span class="index">1</span>...</td> 594 <td>#10 - 2, 2, 2</td> 595 <td>#4 #1 #6 #6 #2 #9</td> 596</tr> 597 598<tr> 599 <td>17</td> 600 <td>Read</td> 601 <td><span class="processed"><span class="index">1</span> 602 <span class="index">1</span> 603 <span class="index">1</span> 604 <span class="index">1</span> 605 <span class="index">1</span> 606 <span class="index">2</span> 607 <span class="index">2</span> 608 <span class="index">2</span></span> 609 <span class="buffer"><span class="index">2</span></span> 610 <span class="current"><span class="index">2</span></span> 611 <span class="index">1</span> 612 <span class="index">1</span> 613 <span class="index">1</span> 614 <span class="index">1</span>...</td> 615 <td> </td> 616 <td>#4 #1 #6 #6 #2 #9</td> 617</tr> 618 619<tr> 620 <td>18</td> 621 <td>Found</td> 622 <td><span class="processed"><span class="index">1</span> 623 <span class="index">1</span> 624 <span class="index">1</span> 625 <span class="index">1</span> 626 <span class="index">1</span> 627 <span class="index">2</span> 628 <span class="index">2</span> 629 <span class="index">2</span></span> 630 <span class="buffer"><span class="index">2</span> 631 <span class="index">2</span></span> 632 <span class="index">1</span> 633 <span class="index">1</span> 634 <span class="index">1</span> 635 <span class="index">1</span>...</td> 636 <td> </td> 637 <td>#4 #1 #6 #6 #2 #9</td> 638</tr> 639 640<tr> 641 <td>19</td> 642 <td>Read</td> 643 <td><span class="processed"><span class="index">1</span> 644 <span class="index">1</span> 645 <span class="index">1</span> 646 <span class="index">1</span> 647 <span class="index">1</span> 648 <span class="index">2</span> 649 <span class="index">2</span> 650 <span class="index">2</span></span> 651 <span class="buffer"><span class="index">2</span> 652 <span class="index">2</span></span> 653 <span class="current"><span class="index">1</span></span> 654 <span class="index">1</span> 655 <span class="index">1</span> 656 <span class="index">1</span>...</td> 657 <td> </td> 658 <td>#4 #1 #6 #6 #2 #9</td> 659</tr> 660 661<tr> 662 <td>20</td> 663 <td>Not Found</td> 664 <td><span class="processed"><span class="index">1</span> 665 <span class="index">1</span> 666 <span class="index">1</span> 667 <span class="index">1</span> 668 <span class="index">1</span> 669 <span class="index">2</span> 670 <span class="index">2</span> 671 <span class="index">2</span> 672 <span class="index">2</span> 673 <span class="index">2</span></span> 674 <span class="buffer"><span class="index">1</span></span> 675 <span class="index">1</span> 676 <span class="index">1</span> 677 <span class="index">1</span>...</td> 678 <td>#11 - 2, 2, 1</td> 679 <td>#4 #1 #6 #6 #2 #9 #9</td> 680</tr> 681 682<tr> 683 <td>21</td> 684 <td>Read</td> 685 <td><span class="processed"><span class="index">1</span> 686 <span class="index">1</span> 687 <span class="index">1</span> 688 <span class="index">1</span> 689 <span class="index">1</span> 690 <span class="index">2</span> 691 <span class="index">2</span> 692 <span class="index">2</span> 693 <span class="index">2</span> 694 <span class="index">2</span></span> 695 <span class="buffer"><span class="index">1</span></span> 696 <span class="current"><span class="index">1</span></span> 697 <span class="index">1</span> 698 <span class="index">1</span>...</td> 699 <td> </td> 700 <td>#4 #1 #6 #6 #2 #9 #9</td> 701</tr> 702 703<tr> 704 <td>22</td> 705 <td>Found</td> 706 <td><span class="processed"><span class="index">1</span> 707 <span class="index">1</span> 708 <span class="index">1</span> 709 <span class="index">1</span> 710 <span class="index">1</span> 711 <span class="index">2</span> 712 <span class="index">2</span> 713 <span class="index">2</span> 714 <span class="index">2</span> 715 <span class="index">2</span></span> 716 <span class="buffer"><span class="index">1</span> 717 <span class="index">1</span></span> 718 <span class="index">1</span> 719 <span class="index">1</span>...</td> 720 <td> </td> 721 <td>#4 #1 #6 #6 #2 #9 #9</td> 722</tr> 723 724<tr> 725 <td>23</td> 726 <td>Read</td> 727 <td><span class="processed"><span class="index">1</span> 728 <span class="index">1</span> 729 <span class="index">1</span> 730 <span class="index">1</span> 731 <span class="index">1</span> 732 <span class="index">2</span> 733 <span class="index">2</span> 734 <span class="index">2</span> 735 <span class="index">2</span> 736 <span class="index">2</span></span> 737 <span class="buffer"><span class="index">1</span> 738 <span class="index">1</span></span> 739 <span class="current"><span class="index">1</span></span> 740 <span class="index">1</span>...</td> 741 <td> </td> 742 <td>#4 #1 #6 #6 #2 #9 #9</td> 743</tr> 744 745<tr> 746 <td>24</td> 747 <td>Found</td> 748 <td><span class="processed"><span class="index">1</span> 749 <span class="index">1</span> 750 <span class="index">1</span> 751 <span class="index">1</span> 752 <span class="index">1</span> 753 <span class="index">2</span> 754 <span class="index">2</span> 755 <span class="index">2</span> 756 <span class="index">2</span> 757 <span class="index">2</span></span> 758 <span class="buffer"><span class="index">1</span> 759 <span class="index">1</span> 760 <span class="index">1</span></span> 761 <span class="index">1</span>...</td> 762 <td> </td> 763 <td>#4 #1 #6 #6 #2 #9 #9</td> 764</tr> 765 766<tr> 767 <td>25</td> 768 <td>Read</td> 769 <td><span class="processed"><span class="index">1</span> 770 <span class="index">1</span> 771 <span class="index">1</span> 772 <span class="index">1</span> 773 <span class="index">1</span> 774 <span class="index">2</span> 775 <span class="index">2</span> 776 <span class="index">2</span> 777 <span class="index">2</span> 778 <span class="index">2</span></span> 779 <span class="buffer"><span class="index">1</span> 780 <span class="index">1</span> 781 <span class="index">1</span></span> 782 <span class="current"><span class="index">1</span></span>...</td> 783 <td> </td> 784 <td>#4 #1 #6 #6 #2 #9 #9</td> 785</tr> 786 787<tr> 788 <td>26</td> 789 <td>Not Found</td> 790 <td><span class="processed"><span class="index">1</span> 791 <span class="index">1</span> 792 <span class="index">1</span> 793 <span class="index">1</span> 794 <span class="index">1</span> 795 <span class="index">2</span> 796 <span class="index">2</span> 797 <span class="index">2</span> 798 <span class="index">2</span> 799 <span class="index">2</span> 800 <span class="index">1</span> 801 <span class="index">1</span> 802 <span class="index">1</span></span> 803 <span class="buffer"><span class="index">1</span></span>...</td> 804 <td>#12 - 1, 1, 1, 1</td> 805 <td>#4 #1 #6 #6 #2 #9 #9 #7</td> 806</tr> 807</tbody> 808</table> 809 810<p>I've included a few more steps to help you see the pattern. You 811keep going until you run out of indexes in the index stream. When 812there is nothing new to read, you simply write out the code for 813whatever values you may have in your index buffer. Finally you should 814send the end-of-information code to the code stream. In this example, 815that code is #5. (View the <a href="lzw_image_data_code_table.html"> 816complete code table</a>.)</p> 817 818<p>As you can see we dynamically built many new codes for our code 819table as we compressed the data. For large files this can turn into a 820large number of codes. It turns out that the GIF format specifies a 821maximum code of #4095 (this happens to be the largest 12-bit 822number). If you want to use a new code, you have to clear out all of 823your old codes. You do this by sending the clear code (which for our 824sample was the #4). This tells the decoder that you are reinitializing 825your code table and it should too. Then you start building your own 826codes again starting just after the value for your end-of-information 827code (in our sample, we would start again at #6).</p> 828 829<p>The final code stream would look like this:</p> 830 831<blockquote><p>#4 #1 #6 #6 #2 #9 #9 #7 #8 #10 #2 #12 #1 #14 #15 #6 #0 #21 #0 #10 #7 #22 #23 832#18 #26 #7 #10 #29 #13 #24 #12 #18 #16 #36 #12 #5</p></blockquote> 833 834<p>This is only 36 codes versus the 100 that would be required without compression.</p> 835 836<h2><a name="lzw_decompression">LZW Decompression</a></h2> 837 838<p> At some point we will need to turn this code stream back into 839a picture. To do this, we only need to know the values in the stream 840and the size of the color table that was used. That's it. You remember that 841big code table we built during compression? We actually have enough information 842in the code stream itself to be able to rebuild it.</p> 843 844<p>Again, i'll list the algorithm and then we will walk though an example. Let 845me define a few terms i will be using. CODE will be current code we're working 846with. CODE-1 will be the code just before CODE in the code stream. {CODE} 847will be the value for CODE in the code table. For example, using the code 848table we created during compression, if CODE=#7 then {CODE}=1,1,1. 849In the same way, {CODE-1} would be the value in the code table for the 850code that came before CODE. Looking at step 26 from the compression, 851if CODE=#7, then {CODE-1} would be {#9}, not {#6}, which was 2,2.</p> 852 853<ul> 854<li>Initialize code table</li> 855<li>let CODE be the first code in the code stream</li> 856<li>output {CODE} to index stream</li> 857<li><LOOP POINT></li> 858<li>let CODE be the next code in the code stream</li> 859<li>is CODE in the code table?</li> 860<li>Yes: 861 <ul> 862 <li>output {CODE} to index stream</li> 863 <li>let K be the first index in {CODE}</li> 864 <li>add {CODE-1}+K to the code table</li> 865 </ul> 866</li> 867<li>No: 868 <ul> 869 <li>let K be the first index of {CODE-1}</li> 870 <li>output {CODE-1}+K to index stream</li> 871 <li>add {CODE-1}+K to code table</li> 872 </ul> 873</li> 874<li>return to LOOP POINT</li> 875</ul> 876 877<p>Let's start reading though the code stream we've created to show how to 878turn it back into a list of color indexes. The first value in the code 879stream should be a clear code. This means we should initialize our code 880table. To do this we must know how many colors are in our color table. 881(This information comes from the first byte in the image data block in 882the file. More on this later.) Again we will set up codes #0-#3 to be each 883of the four colors and add in the clear code (#4) 884and end of information code (#5).</p> 885 886<p>The next step is to read the first color code. In the following table you 887will see the values of CODE highlighted in purple, and the values for 888CODE-1 highlighted in green. Our first CODE value is #1. We then output 889{#1}, or simply 1, to the index stream [Step 0].</p> 890 891<p>Now we enter the main loop of the algorithm. The next code gets assigned 892to CODE which now makes that value #6. Next we check to see if this value 893is in our code table. At this time, it is not. This means we must find the 894first index in the value of {CODE-1} and call this K. Thus K = first index of 895{CODE-1} = first index of {#1} = 1. Now we output {CODE-1} + K to the index 896stream and add this value to our code table. The means we output 1,1 and 897give this value a code of #6 [Step 1].</p> 898 899<table class="alg_steps" cellspacing="0"> 900<tr> 901 <th>Step</th> 902 <th>Action</th> 903 <th>Code Stream</th> 904 <th>New Code Table Row</th> 905 <th>Index Stream</th> 906</tr> 907<tr> 908 <td>0</td> 909 <td>Init</td> 910 <td><span class="processed">#4</span> <span class="current">#1</span> #6 #6 #2 #9 #9 #7 ...</td> 911 <td> </td> 912 <td>1</td> 913</tr> 914<tr> 915 <td>1</td> 916 <td>Not Found</td> 917 <td><span class="processed">#4</span> <span class="buffer">#1</span> <span class="current">#6</span> #6 #2 #9 #9 #7 ...</td> 918 <td>#6 - 1, 1</td> 919 <td>1, 1, 1</td> 920</tr> 921<tr> 922 <td>2</td> 923 <td>Found</td> 924 <td><span class="processed">#4 #1</span> <span class="buffer">#6</span> <span class="current">#6</span> #2 #9 #9 #7 ...</td> 925 <td>#7 - 1, 1, 1</td> 926 <td>1, 1, 1, 1, 1</td> 927</tr> 928<tr> 929 <td>3</td> 930 <td>Found</td> 931 <td><span class="processed">#4 #1 #6</span> <span class="buffer">#6</span> <span class="current">#2</span> #9 #9 #7 ...</td> 932 <td>#8 - 1, 1, 2</td> 933 <td>1, 1, 1, 1, 1, 2</td> 934</tr> 935<tr> 936 <td>4</td> 937 <td>Not Found</td> 938 <td><span class="processed">#4 #1 #6 #6</span> <span class="buffer">#2</span> <span class="current">#9</span> #9 #7 ...</td> 939 <td>#9 - 2, 2</td> 940 <td>1, 1, 1, 1, 1, 2, 2, 2</td> 941</tr> 942<tr> 943 <td>5</td> 944 <td>Found</td> 945 <td><span class="processed">#4 #1 #6 #6 #2</span> <span class="buffer">#9</span> <span class="current">#9</span> #7 ...</td> 946 <td>#10 - 2, 2, 2</td> 947 <td>1, 1, 1, 1, 1, 2, 2, 2, 2, 2</td> 948</tr> 949<tr> 950 <td>6</td> 951 <td>Found</td> 952 <td><span class="processed">#4 #1 #6 #6 #2 #9</span> <span class="buffer">#9</span> <span class="current">#7</span> ...</td> 953 <td>#11 - 2, 2, 1</td> 954 <td>1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 1, 1, 1</td> 955</tr> 956</table> 957 958<p>We start the loop again by reading the next code. CODE now would be 959#6 and this time we do have a record for this code in our code 960table. Thus we dump {#6} to the index stream which would be 1,1. 961Now we take the first index in {#6} and call that K. Here, {#6} has 962two indexes, the first of which is 1; thus K = 1. Before moving 963on, we add {CODE-1}+K to the code table. This #7 is now 1, 1, 1 [Step 2].</p> 964 965<p>I've included a few more steps so you can see the algorithm in action. While 966the explanation may sound complicated, you can see it's actually quite simple. 967You'll also notice that you end up building the exact same 968<a href="lzw_image_data_code_table.html">code table</a> 969as the one that was created during compression. This is the reason that 970LZW is so great; we can just share the codes and not the table.</p> 971 972<h2><a name="lzw_bytes">Saving the Code Stream as Bytes</a></h2> 973 974<p>I've shown you how to go back and forth between index and code stream, but 975haven't told you what to do with them. The index stream is used to specify the 976color of each of the pixel of your image and really only shows up on screen. 977It is the code stream that is actually saved in the GIF files on your computer 978or transmitted over the internet. In order to save these code streams, we must 979turn them into bytes. The first thought might be to store each of the codes 980as its own byte; however this would limit the max code to just #255 and 981result in a lot of wasted bits for the small codes. To solve these problems, 982the GIF file format actually uses flexible <em>code sizes</em>.</p> 983 984<p>Flexible code sizes allow for further compression by limiting the bits 985needed to save the code stream as bytes. The <em>code size</em> is the number 986of bits it takes to store the value of the code. When we talk about bits, 987we're referring to the 1's and 0's that make up a byte. The codes are 988converted to their binary values to come up with the bits. To specify 989the code for #4, you would look at this binary equivalent, which is 100, 990and see that you would need three bits to store this value. The largest code 991value in our sample code stream is #36 (binary: 100100) which would 992take 6 bits to encode. Note that the number of bits i've just given is 993the minimum number. You can make the number take up more bits by adding 994zeros to the front.</p> 995 996<p style="text-align:center"><img src="image_data_block.gif" alt="GIF image data block layout" style="border: 1px solid black" /></p> 997 998<p>We need a way to know what size each of the codes are. Recall that the 999image data block begins with a single byte value called the 1000<em>LZW minimum code size</em>. The GIF format allows sizes as small 1001as 2 bits and as large as 12 bits. This minimum code size value is typically 1002the number of bits/pixel of the image. So if you have 32 colors in your image, 1003you will need 5 bits/pixel (for numbers 0-31 because 31 in binary is 11111). 1004Thus, this will most likely be one more than the bit value for the size of the 1005color table you are using. (Even if you only have two colors, the minimum 1006code size most be at least 2.) Refer to the <a href="#color_table_size"> 1007code table above</a> to remind yourself how that works.</p> 1008 1009<p>Here's the funny thing: the value for minimum code size isn't 1010actually the smallest code size that's used in the encoding 1011process. Because the minimum code size tells you how many bits are 1012needed just for the different colors of the image, you still have to 1013account for the two special codes that we always add to the code 1014table. Therefore the actual smallest code size that will be used is 1015one more than the value specified in the "minimum" code size 1016byte. I'll call this new value the <em>first code size</em>.</p> 1017 1018<p>We now know how many bytes the first code will be. This size will probably 1019work for the next few codes as well, but recall that the GIF format 1020allows for flexible code sizes. As larger code values get added to your 1021code table, you will soon realize that you need larger code sizes if you 1022were to output those values. When you are encoding the data, you increase 1023your code size as soon as your write out the code equal to 10242^(current code size)-1. If you are decoding from codes to indexes, 1025you need to increase your code size as soon as you add the code value that 1026is equal to 2^(current code size)-1 to your code table. That is, the next 1027time you grab the next section of bits, you grab one more.</p> 1028 1029<p>Note that the largest code size allowed is 12 bits. If you get to this 1030limit, the next code you encounter should be the <em>clear code</em> which 1031would tell you to reinitialize the code table. You then go back to using 1032the first code size and grow again when necessary.</p> 1033 1034<p>Jumping back to our sample image, we see that we have a minimum code 1035size value of 2 which means out first code size will be 3 bits long. 1036Out first three codes, #1 #6 and #6, would be coded as 001 110 and 110. 1037If you see at Step 6 of the encoding, we added a code of #7 to our code 1038table. This is our clue to increase our code size because 7 is equal to 10392^3-1 (where 3 is our current code size). Thus, the next code we 1040write out, #2, will use the new code size of 4 and therefore look 1041like 0010. In the decoding process, we again would increase our code 1042size when we read the code for #7 and would read the next 4, rather than 1043the next 3 bits, to get the next code. In the sample table above this 1044occurs in Step 2.</p> 1045 1046<p>Finally we must turn all these bit values into bytes. The lowest bit of the 1047code bit value gets placed in the lowest available bit of the byte. After 1048you've filled up the 8 bits in the byte, you take any left over bits and 1049start a new byte. Take a look at the following illustration to see 1050how that works with the codes from our sample image.</p> 1051 1052<p style="text-align:center"><img src="lzw_encoding_codes.gif" 1053alt="Encoding LZW Codes" style="border: 1px solid black" / WIDTH="500" 1054HEIGHT="220"></p> 1055 1056<p>You can see in the first byte that was returned (<span 1057class="byte">8C</span>) that the lowest three bits (because that was 1058our first code size) contain 110 which is the binary value of 4 so 1059that would be the clear code we started with, #4. In the three bits to 1060the left, you see 001 which out or first data code of #1. You can also 1061see when we switched into code sizes of 4 bits in the second byte 1062(<span class="byte">2D</span>).</p> 1063 1064<p>When you run out of codes but have filled less than 8 bits of the 1065byte, you should just fill the remaining bits with zeros. Recall that 1066the image data must be broken up onto <a 1067href="bits_and_bytes.html#image_data_block">data sub-blocks</a>. Each 1068of the data sub-blocks begins with a byte that specifies how many 1069bytes of data. The value will be between 1 and 255. After you read 1070those bytes, the next byte indicates again how many bytes of data 1071follow. You stop when you encounter a subblock that has a lenght of 1072zero. That tells you when you've reached the end of the image data. In 1073our sample the image the byte just after the LZW code size is <span 1074class="byte">16</span> which indicates that 22 bytes of data 1075follow. After we reach those, we see the next byte is <span 1076class="byte">00</span> which means we are all done.</p> 1077 1078<p>Return codes from bytes the basically just the same process in 1079reverse. A sample illustration of the process follows which shows how 1080you would extract codes if the first code size were 5 bits.</p> 1081 1082<p style="text-align:center"><img src="lzw_decoding_bytes.gif" alt="Decoding LZW Bytes" style="border: 1px solid black" / WIDTH="500" HEIGHT="220"></p> 1083 1084<h2>Next: Animation and Transparency</h2> 1085 1086<p>That is pretty much everything you need to know to read or generate 1087a basic image file. One of the reasons the GIF becames such a popular 1088format was because it also allowed for "fancier" features. These 1089features include animation and transparency. Next we'll look 1090at how those work.</p> 1091 1092<p><a href="animation_and_transparency.html">Continue...</a></p> 1093</div> 1094 1095<div style="text-align:center; margin-top: 10px; padding-top: 10px; border-top: #cecece 1px solid"> 1096<a href="../index.html">Back to GIFLIB documentation</a> 1097</div> 1098 1099</body> 1100 1101</html> 1102