1 /* stb_image - v2.08 - public domain image loader - http://nothings.org/stb_image.h
2 no warranty implied; use at your own risk
3
4 Do this:
5 #define STB_IMAGE_IMPLEMENTATION
6 before you include this file in *one* C or C++ file to create the implementation.
7
8 // i.e. it should look like this:
9 #include ...
10 #include ...
11 #include ...
12 #define STB_IMAGE_IMPLEMENTATION
13 #include "stb_image.h"
14
15 You can #define STBI_ASSERT(x) before the #include to avoid using assert.h.
16 And #define STBI_MALLOC, STBI_REALLOC, and STBI_FREE to avoid using malloc,realloc,free
17
18
19 QUICK NOTES:
20 Primarily of interest to game developers and other people who can
21 avoid problematic images and only need the trivial interface
22
23 JPEG baseline & progressive (12 bpc/arithmetic not supported, same as stock IJG lib)
24 PNG 1/2/4/8-bit-per-channel (16 bpc not supported)
25
26 TGA (not sure what subset, if a subset)
27 BMP non-1bpp, non-RLE
28 PSD (composited view only, no extra channels, 8/16 bit-per-channel)
29
30 GIF (*comp always reports as 4-channel)
31 HDR (radiance rgbE format)
32 PIC (Softimage PIC)
33 PNM (PPM and PGM binary only)
34
35 Animated GIF still needs a proper API, but here's one way to do it:
36 http://gist.github.com/urraka/685d9a6340b26b830d49
37
38 - decode from memory or through FILE (define STBI_NO_STDIO to remove code)
39 - decode from arbitrary I/O callbacks
40 - SIMD acceleration on x86/x64 (SSE2) and ARM (NEON)
41
42 Full documentation under "DOCUMENTATION" below.
43
44
45 Revision 2.00 release notes:
46
47 - Progressive JPEG is now supported.
48
49 - PPM and PGM binary formats are now supported, thanks to Ken Miller.
50
51 - x86 platforms now make use of SSE2 SIMD instructions for
52 JPEG decoding, and ARM platforms can use NEON SIMD if requested.
53 This work was done by Fabian "ryg" Giesen. SSE2 is used by
54 default, but NEON must be enabled explicitly; see docs.
55
56 With other JPEG optimizations included in this version, we see
57 2x speedup on a JPEG on an x86 machine, and a 1.5x speedup
58 on a JPEG on an ARM machine, relative to previous versions of this
59 library. The same results will not obtain for all JPGs and for all
60 x86/ARM machines. (Note that progressive JPEGs are significantly
61 slower to decode than regular JPEGs.) This doesn't mean that this
62 is the fastest JPEG decoder in the land; rather, it brings it
63 closer to parity with standard libraries. If you want the fastest
64 decode, look elsewhere. (See "Philosophy" section of docs below.)
65
66 See final bullet items below for more info on SIMD.
67
68 - Added STBI_MALLOC, STBI_REALLOC, and STBI_FREE macros for replacing
69 the memory allocator. Unlike other STBI libraries, these macros don't
70 support a context parameter, so if you need to pass a context in to
71 the allocator, you'll have to store it in a global or a thread-local
72 variable.
73
74 - Split existing STBI_NO_HDR flag into two flags, STBI_NO_HDR and
75 STBI_NO_LINEAR.
76 STBI_NO_HDR: suppress implementation of .hdr reader format
77 STBI_NO_LINEAR: suppress high-dynamic-range light-linear float API
78
79 - You can suppress implementation of any of the decoders to reduce
80 your code footprint by #defining one or more of the following
81 symbols before creating the implementation.
82
83 STBI_NO_JPEG
84 STBI_NO_PNG
85 STBI_NO_BMP
86 STBI_NO_PSD
87 STBI_NO_TGA
88 STBI_NO_GIF
89 STBI_NO_HDR
90 STBI_NO_PIC
91 STBI_NO_PNM (.ppm and .pgm)
92
93 - You can request *only* certain decoders and suppress all other ones
94 (this will be more forward-compatible, as addition of new decoders
95 doesn't require you to disable them explicitly):
96
97 STBI_ONLY_JPEG
98 STBI_ONLY_PNG
99 STBI_ONLY_BMP
100 STBI_ONLY_PSD
101 STBI_ONLY_TGA
102 STBI_ONLY_GIF
103 STBI_ONLY_HDR
104 STBI_ONLY_PIC
105 STBI_ONLY_PNM (.ppm and .pgm)
106
107 Note that you can define multiples of these, and you will get all
108 of them ("only x" and "only y" is interpreted to mean "only x&y").
109
110 - If you use STBI_NO_PNG (or _ONLY_ without PNG), and you still
111 want the zlib decoder to be available, #define STBI_SUPPORT_ZLIB
112
113 - Compilation of all SIMD code can be suppressed with
114 #define STBI_NO_SIMD
115 It should not be necessary to disable SIMD unless you have issues
116 compiling (e.g. using an x86 compiler which doesn't support SSE
117 intrinsics or that doesn't support the method used to detect
118 SSE2 support at run-time), and even those can be reported as
119 bugs so I can refine the built-in compile-time checking to be
120 smarter.
121
122 - The old STBI_SIMD system which allowed installing a user-defined
123 IDCT etc. has been removed. If you need this, don't upgrade. My
124 assumption is that almost nobody was doing this, and those who
125 were will find the built-in SIMD more satisfactory anyway.
126
127 - RGB values computed for JPEG images are slightly different from
128 previous versions of stb_image. (This is due to using less
129 integer precision in SIMD.) The C code has been adjusted so
130 that the same RGB values will be computed regardless of whether
131 SIMD support is available, so your app should always produce
132 consistent results. But these results are slightly different from
133 previous versions. (Specifically, about 3% of available YCbCr values
134 will compute different RGB results from pre-1.49 versions by +-1;
135 most of the deviating values are one smaller in the G channel.)
136
137 - If you must produce consistent results with previous versions of
138 stb_image, #define STBI_JPEG_OLD and you will get the same results
139 you used to; however, you will not get the SIMD speedups for
140 the YCbCr-to-RGB conversion step (although you should still see
141 significant JPEG speedup from the other changes).
142
143 Please note that STBI_JPEG_OLD is a temporary feature; it will be
144 removed in future versions of the library. It is only intended for
145 near-term back-compatibility use.
146
147
148 Latest revision history:
149 2.08 (2015-09-13) fix to 2.07 cleanup, reading RGB PSD as RGBA
150 2.07 (2015-09-13) partial animated GIF support
151 limited 16-bit PSD support
152 minor bugs, code cleanup, and compiler warnings
153 2.06 (2015-04-19) fix bug where PSD returns wrong '*comp' value
154 2.05 (2015-04-19) fix bug in progressive JPEG handling, fix warning
155 2.04 (2015-04-15) try to re-enable SIMD on MinGW 64-bit
156 2.03 (2015-04-12) additional corruption checking
157 stbi_set_flip_vertically_on_load
158 fix NEON support; fix mingw support
159 2.02 (2015-01-19) fix incorrect assert, fix warning
160 2.01 (2015-01-17) fix various warnings
161 2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
162 2.00 (2014-12-25) optimize JPEG, including x86 SSE2 & ARM NEON SIMD
163 progressive JPEG
164 PGM/PPM support
165 STBI_MALLOC,STBI_REALLOC,STBI_FREE
166 STBI_NO_*, STBI_ONLY_*
167 GIF bugfix
168 1.48 (2014-12-14) fix incorrectly-named assert()
169 1.47 (2014-12-14) 1/2/4-bit PNG support (both grayscale and paletted)
170 optimize PNG
171 fix bug in interlaced PNG with user-specified channel count
172
173 See end of file for full revision history.
174
175
176 ============================ Contributors =========================
177
178 Image formats Bug fixes & warning fixes
179 Sean Barrett (jpeg, png, bmp) Marc LeBlanc
180 Nicolas Schulz (hdr, psd) Christpher Lloyd
181 Jonathan Dummer (tga) Dave Moore
182 Jean-Marc Lienher (gif) Won Chun
183 Tom Seddon (pic) the Horde3D community
184 Thatcher Ulrich (psd) Janez Zemva
185 Ken Miller (pgm, ppm) Jonathan Blow
186 urraka@github (animated gif) Laurent Gomila
187 Aruelien Pocheville
188 Ryamond Barbiero
189 David Woo
190 Extensions, features Martin Golini
191 Jetro Lauha (stbi_info) Roy Eltham
192 Martin "SpartanJ" Golini (stbi_info) Luke Graham
193 James "moose2000" Brown (iPhone PNG) Thomas Ruf
194 Ben "Disch" Wenger (io callbacks) John Bartholomew
195 Omar Cornut (1/2/4-bit PNG) Ken Hamada
196 Nicolas Guillemot (vertical flip) Cort Stratton
197 Richard Mitton (16-bit PSD) Blazej Dariusz Roszkowski
198 Thibault Reuille
199 Paul Du Bois
200 Guillaume George
201 Jerry Jansson
202 Hayaki Saito
203 Johan Duparc
204 Ronny Chevalier
205 Optimizations & bugfixes Michal Cichon
206 Fabian "ryg" Giesen Tero Hanninen
207 Arseny Kapoulkine Sergio Gonzalez
208 Cass Everitt
209 Engin Manap
210 If your name should be here but Martins Mozeiko
211 isn't, let Sean know. Joseph Thomson
212 Phil Jordan
213 Nathan Reed
214 Michaelangel007@github
215 Nick Verigakis
216
217 LICENSE
218
219 This software is in the public domain. Where that dedication is not
220 recognized, you are granted a perpetual, irrevocable license to copy,
221 distribute, and modify this file as you see fit.
222
223 */
224
225 #ifndef STBI_INCLUDE_STB_IMAGE_H
226 #define STBI_INCLUDE_STB_IMAGE_H
227
228 // DOCUMENTATION
229 //
230 // Limitations:
231 // - no 16-bit-per-channel PNG
232 // - no 12-bit-per-channel JPEG
233 // - no JPEGs with arithmetic coding
234 // - no 1-bit BMP
235 // - GIF always returns *comp=4
236 //
237 // Basic usage (see HDR discussion below for HDR usage):
238 // int x,y,n;
239 // unsigned char *data = stbi_load(filename, &x, &y, &n, 0);
240 // // ... process data if not NULL ...
241 // // ... x = width, y = height, n = # 8-bit components per pixel ...
242 // // ... replace '0' with '1'..'4' to force that many components per pixel
243 // // ... but 'n' will always be the number that it would have been if you said 0
244 // stbi_image_free(data)
245 //
246 // Standard parameters:
247 // int *x -- outputs image width in pixels
248 // int *y -- outputs image height in pixels
249 // int *comp -- outputs # of image components in image file
250 // int req_comp -- if non-zero, # of image components requested in result
251 //
252 // The return value from an image loader is an 'unsigned char *' which points
253 // to the pixel data, or NULL on an allocation failure or if the image is
254 // corrupt or invalid. The pixel data consists of *y scanlines of *x pixels,
255 // with each pixel consisting of N interleaved 8-bit components; the first
256 // pixel pointed to is top-left-most in the image. There is no padding between
257 // image scanlines or between pixels, regardless of format. The number of
258 // components N is 'req_comp' if req_comp is non-zero, or *comp otherwise.
259 // If req_comp is non-zero, *comp has the number of components that _would_
260 // have been output otherwise. E.g. if you set req_comp to 4, you will always
261 // get RGBA output, but you can check *comp to see if it's trivially opaque
262 // because e.g. there were only 3 channels in the source image.
263 //
264 // An output image with N components has the following components interleaved
265 // in this order in each pixel:
266 //
267 // N=#comp components
268 // 1 grey
269 // 2 grey, alpha
270 // 3 red, green, blue
271 // 4 red, green, blue, alpha
272 //
273 // If image loading fails for any reason, the return value will be NULL,
274 // and *x, *y, *comp will be unchanged. The function stbi_failure_reason()
275 // can be queried for an extremely brief, end-user unfriendly explanation
276 // of why the load failed. Define STBI_NO_FAILURE_STRINGS to avoid
277 // compiling these strings at all, and STBI_FAILURE_USERMSG to get slightly
278 // more user-friendly ones.
279 //
280 // Paletted PNG, BMP, GIF, and PIC images are automatically depalettized.
281 //
282 // ===========================================================================
283 //
284 // Philosophy
285 //
286 // stb libraries are designed with the following priorities:
287 //
288 // 1. easy to use
289 // 2. easy to maintain
290 // 3. good performance
291 //
292 // Sometimes I let "good performance" creep up in priority over "easy to maintain",
293 // and for best performance I may provide less-easy-to-use APIs that give higher
294 // performance, in addition to the easy to use ones. Nevertheless, it's important
295 // to keep in mind that from the standpoint of you, a client of this library,
296 // all you care about is #1 and #3, and stb libraries do not emphasize #3 above all.
297 //
298 // Some secondary priorities arise directly from the first two, some of which
299 // make more explicit reasons why performance can't be emphasized.
300 //
301 // - Portable ("ease of use")
302 // - Small footprint ("easy to maintain")
303 // - No dependencies ("ease of use")
304 //
305 // ===========================================================================
306 //
307 // I/O callbacks
308 //
309 // I/O callbacks allow you to read from arbitrary sources, like packaged
310 // files or some other source. Data read from callbacks are processed
311 // through a small internal buffer (currently 128 bytes) to try to reduce
312 // overhead.
313 //
314 // The three functions you must define are "read" (reads some bytes of data),
315 // "skip" (skips some bytes of data), "eof" (reports if the stream is at the end).
316 //
317 // ===========================================================================
318 //
319 // SIMD support
320 //
321 // The JPEG decoder will try to automatically use SIMD kernels on x86 when
322 // supported by the compiler. For ARM Neon support, you must explicitly
323 // request it.
324 //
325 // (The old do-it-yourself SIMD API is no longer supported in the current
326 // code.)
327 //
328 // On x86, SSE2 will automatically be used when available based on a run-time
329 // test; if not, the generic C versions are used as a fall-back. On ARM targets,
330 // the typical path is to have separate builds for NEON and non-NEON devices
331 // (at least this is true for iOS and Android). Therefore, the NEON support is
332 // toggled by a build flag: define STBI_NEON to get NEON loops.
333 //
334 // The output of the JPEG decoder is slightly different from versions where
335 // SIMD support was introduced (that is, for versions before 1.49). The
336 // difference is only +-1 in the 8-bit RGB channels, and only on a small
337 // fraction of pixels. You can force the pre-1.49 behavior by defining
338 // STBI_JPEG_OLD, but this will disable some of the SIMD decoding path
339 // and hence cost some performance.
340 //
341 // If for some reason you do not want to use any of SIMD code, or if
342 // you have issues compiling it, you can disable it entirely by
343 // defining STBI_NO_SIMD.
344 //
345 // ===========================================================================
346 //
347 // HDR image support (disable by defining STBI_NO_HDR)
348 //
349 // stb_image now supports loading HDR images in general, and currently
350 // the Radiance .HDR file format, although the support is provided
351 // generically. You can still load any file through the existing interface;
352 // if you attempt to load an HDR file, it will be automatically remapped to
353 // LDR, assuming gamma 2.2 and an arbitrary scale factor defaulting to 1;
354 // both of these constants can be reconfigured through this interface:
355 //
356 // stbi_hdr_to_ldr_gamma(2.2f);
357 // stbi_hdr_to_ldr_scale(1.0f);
358 //
359 // (note, do not use _inverse_ constants; stbi_image will invert them
360 // appropriately).
361 //
362 // Additionally, there is a new, parallel interface for loading files as
363 // (linear) floats to preserve the full dynamic range:
364 //
365 // float *data = stbi_loadf(filename, &x, &y, &n, 0);
366 //
367 // If you load LDR images through this interface, those images will
368 // be promoted to floating point values, run through the inverse of
369 // constants corresponding to the above:
370 //
371 // stbi_ldr_to_hdr_scale(1.0f);
372 // stbi_ldr_to_hdr_gamma(2.2f);
373 //
374 // Finally, given a filename (or an open file or memory block--see header
375 // file for details) containing image data, you can query for the "most
376 // appropriate" interface to use (that is, whether the image is HDR or
377 // not), using:
378 //
379 // stbi_is_hdr(char *filename);
380 //
381 // ===========================================================================
382 //
383 // iPhone PNG support:
384 //
385 // By default we convert iphone-formatted PNGs back to RGB, even though
386 // they are internally encoded differently. You can disable this conversion
387 // by by calling stbi_convert_iphone_png_to_rgb(0), in which case
388 // you will always just get the native iphone "format" through (which
389 // is BGR stored in RGB).
390 //
391 // Call stbi_set_unpremultiply_on_load(1) as well to force a divide per
392 // pixel to remove any premultiplied alpha *only* if the image file explicitly
393 // says there's premultiplied data (currently only happens in iPhone images,
394 // and only if iPhone convert-to-rgb processing is on).
395 //
396
397
398 #ifndef STBI_NO_STDIO
399 #include <stdio.h>
400 #endif // STBI_NO_STDIO
401
402 #define STBI_VERSION 1
403
404 enum
405 {
406 STBI_default = 0, // only used for req_comp
407
408 STBI_grey = 1,
409 STBI_grey_alpha = 2,
410 STBI_rgb = 3,
411 STBI_rgb_alpha = 4
412 };
413
414 typedef unsigned char stbi_uc;
415
416 #ifdef __cplusplus
417 extern "C" {
418 #endif
419
420 #ifdef STB_IMAGE_STATIC
421 #define STBIDEF static
422 #else
423 #define STBIDEF extern
424 #endif
425
426 //////////////////////////////////////////////////////////////////////////////
427 //
428 // PRIMARY API - works on images of any type
429 //
430
431 //
432 // load image by filename, open file, or memory buffer
433 //
434
435 typedef struct
436 {
437 int (*read) (void *user,char *data,int size); // fill 'data' with 'size' bytes. return number of bytes actually read
438 void (*skip) (void *user,int n); // skip the next 'n' bytes, or 'unget' the last -n bytes if negative
439 int (*eof) (void *user); // returns nonzero if we are at end of file/data
440 } stbi_io_callbacks;
441
442 STBIDEF stbi_uc *stbi_load (char const *filename, int *x, int *y, int *comp, int req_comp);
443 STBIDEF stbi_uc *stbi_load_from_memory (stbi_uc const *buffer, int len , int *x, int *y, int *comp, int req_comp);
444 STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk , void *user, int *x, int *y, int *comp, int req_comp);
445
446 #ifndef STBI_NO_STDIO
447 STBIDEF stbi_uc *stbi_load_from_file (FILE *f, int *x, int *y, int *comp, int req_comp);
448 // for stbi_load_from_file, file pointer is left pointing immediately after image
449 #endif
450
451 #ifndef STBI_NO_LINEAR
452 STBIDEF float *stbi_loadf (char const *filename, int *x, int *y, int *comp, int req_comp);
453 STBIDEF float *stbi_loadf_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
454 STBIDEF float *stbi_loadf_from_callbacks (stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp);
455
456 #ifndef STBI_NO_STDIO
457 STBIDEF float *stbi_loadf_from_file (FILE *f, int *x, int *y, int *comp, int req_comp);
458 #endif
459 #endif
460
461 #ifndef STBI_NO_HDR
462 STBIDEF void stbi_hdr_to_ldr_gamma(float gamma);
463 STBIDEF void stbi_hdr_to_ldr_scale(float scale);
464 #endif
465
466 #ifndef STBI_NO_LINEAR
467 STBIDEF void stbi_ldr_to_hdr_gamma(float gamma);
468 STBIDEF void stbi_ldr_to_hdr_scale(float scale);
469 #endif // STBI_NO_HDR
470
471 // stbi_is_hdr is always defined, but always returns false if STBI_NO_HDR
472 STBIDEF int stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user);
473 STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len);
474 #ifndef STBI_NO_STDIO
475 STBIDEF int stbi_is_hdr (char const *filename);
476 STBIDEF int stbi_is_hdr_from_file(FILE *f);
477 #endif // STBI_NO_STDIO
478
479
480 // get a VERY brief reason for failure
481 // NOT THREADSAFE
482 STBIDEF const char *stbi_failure_reason (void);
483
484 // free the loaded image -- this is just free()
485 STBIDEF void stbi_image_free (void *retval_from_stbi_load);
486
487 // get image dimensions & components without fully decoding
488 STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp);
489 STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp);
490
491 #ifndef STBI_NO_STDIO
492 STBIDEF int stbi_info (char const *filename, int *x, int *y, int *comp);
493 STBIDEF int stbi_info_from_file (FILE *f, int *x, int *y, int *comp);
494
495 #endif
496
497
498
499 // for image formats that explicitly notate that they have premultiplied alpha,
500 // we just return the colors as stored in the file. set this flag to force
501 // unpremultiplication. results are undefined if the unpremultiply overflow.
502 STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply);
503
504 // indicate whether we should process iphone images back to canonical format,
505 // or just pass them through "as-is"
506 STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert);
507
508 // flip the image vertically, so the first pixel in the output array is the bottom left
509 STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip);
510
511 // ZLIB client - used by PNG, available for other purposes
512
513 STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen);
514 STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header);
515 STBIDEF char *stbi_zlib_decode_malloc(const char *buffer, int len, int *outlen);
516 STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
517
518 STBIDEF char *stbi_zlib_decode_noheader_malloc(const char *buffer, int len, int *outlen);
519 STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
520
521
522 #ifdef __cplusplus
523 }
524 #endif
525
526 //
527 //
528 //// end header file /////////////////////////////////////////////////////
529 #endif // STBI_INCLUDE_STB_IMAGE_H
530
531 #ifdef STB_IMAGE_IMPLEMENTATION
532
533 #if defined(STBI_ONLY_JPEG) || defined(STBI_ONLY_PNG) || defined(STBI_ONLY_BMP) \
534 || defined(STBI_ONLY_TGA) || defined(STBI_ONLY_GIF) || defined(STBI_ONLY_PSD) \
535 || defined(STBI_ONLY_HDR) || defined(STBI_ONLY_PIC) || defined(STBI_ONLY_PNM) \
536 || defined(STBI_ONLY_ZLIB)
537 #ifndef STBI_ONLY_JPEG
538 #define STBI_NO_JPEG
539 #endif
540 #ifndef STBI_ONLY_PNG
541 #define STBI_NO_PNG
542 #endif
543 #ifndef STBI_ONLY_BMP
544 #define STBI_NO_BMP
545 #endif
546 #ifndef STBI_ONLY_PSD
547 #define STBI_NO_PSD
548 #endif
549 #ifndef STBI_ONLY_TGA
550 #define STBI_NO_TGA
551 #endif
552 #ifndef STBI_ONLY_GIF
553 #define STBI_NO_GIF
554 #endif
555 #ifndef STBI_ONLY_HDR
556 #define STBI_NO_HDR
557 #endif
558 #ifndef STBI_ONLY_PIC
559 #define STBI_NO_PIC
560 #endif
561 #ifndef STBI_ONLY_PNM
562 #define STBI_NO_PNM
563 #endif
564 #endif
565
566 #if defined(STBI_NO_PNG) && !defined(STBI_SUPPORT_ZLIB) && !defined(STBI_NO_ZLIB)
567 #define STBI_NO_ZLIB
568 #endif
569
570
571 #include <stdarg.h>
572 #include <stddef.h> // ptrdiff_t on osx
573 #include <stdlib.h>
574 #include <string.h>
575 #include <limits.h>
576
577 #if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR)
578 #include <math.h> // ldexp
579 #endif
580
581 #ifndef STBI_NO_STDIO
582 #include <stdio.h>
583 #endif
584
585 #ifndef STBI_ASSERT
586 #include <assert.h>
587 #define STBI_ASSERT(x) assert(x)
588 #endif
589
590
591 #ifndef _MSC_VER
592 #ifdef __cplusplus
593 #define stbi_inline inline
594 #else
595 #define stbi_inline
596 #endif
597 #else
598 #define stbi_inline __forceinline
599 #endif
600
601
602 #ifdef _MSC_VER
603 typedef unsigned short stbi__uint16;
604 typedef signed short stbi__int16;
605 typedef unsigned int stbi__uint32;
606 typedef signed int stbi__int32;
607 #else
608 #include <stdint.h>
609 typedef uint16_t stbi__uint16;
610 typedef int16_t stbi__int16;
611 typedef uint32_t stbi__uint32;
612 typedef int32_t stbi__int32;
613 #endif
614
615 // should produce compiler error if size is wrong
616 typedef unsigned char validate_uint32[sizeof(stbi__uint32)==4 ? 1 : -1];
617
618 #ifdef _MSC_VER
619 #define STBI_NOTUSED(v) (void)(v)
620 #else
621 #define STBI_NOTUSED(v) (void)sizeof(v)
622 #endif
623
624 #ifdef _MSC_VER
625 #define STBI_HAS_LROTL
626 #endif
627
628 #ifdef STBI_HAS_LROTL
629 #define stbi_lrot(x,y) _lrotl(x,y)
630 #else
631 #define stbi_lrot(x,y) (((x) << (y)) | ((x) >> (32 - (y))))
632 #endif
633
634 #if defined(STBI_MALLOC) && defined(STBI_FREE) && defined(STBI_REALLOC)
635 // ok
636 #elif !defined(STBI_MALLOC) && !defined(STBI_FREE) && !defined(STBI_REALLOC)
637 // ok
638 #else
639 #error "Must define all or none of STBI_MALLOC, STBI_FREE, and STBI_REALLOC."
640 #endif
641
642 #ifndef STBI_MALLOC
643 #define STBI_MALLOC(sz) malloc(sz)
644 #define STBI_REALLOC(p,sz) realloc(p,sz)
645 #define STBI_FREE(p) free(p)
646 #endif
647
648 // x86/x64 detection
649 #if defined(__x86_64__) || defined(_M_X64)
650 #define STBI__X64_TARGET
651 #elif defined(__i386) || defined(_M_IX86)
652 #define STBI__X86_TARGET
653 #endif
654
655 #if defined(__GNUC__) && (defined(STBI__X86_TARGET) || defined(STBI__X64_TARGET)) && !defined(__SSE2__) && !defined(STBI_NO_SIMD)
656 // NOTE: not clear do we actually need this for the 64-bit path?
657 // gcc doesn't support sse2 intrinsics unless you compile with -msse2,
658 // (but compiling with -msse2 allows the compiler to use SSE2 everywhere;
659 // this is just broken and gcc are jerks for not fixing it properly
660 // http://www.virtualdub.org/blog/pivot/entry.php?id=363 )
661 #define STBI_NO_SIMD
662 #endif
663
664 #if defined(__MINGW32__) && defined(STBI__X86_TARGET) && !defined(STBI_MINGW_ENABLE_SSE2) && !defined(STBI_NO_SIMD)
665 // Note that __MINGW32__ doesn't actually mean 32-bit, so we have to avoid STBI__X64_TARGET
666 //
667 // 32-bit MinGW wants ESP to be 16-byte aligned, but this is not in the
668 // Windows ABI and VC++ as well as Windows DLLs don't maintain that invariant.
669 // As a result, enabling SSE2 on 32-bit MinGW is dangerous when not
670 // simultaneously enabling "-mstackrealign".
671 //
672 // See https://github.com/nothings/stb/issues/81 for more information.
673 //
674 // So default to no SSE2 on 32-bit MinGW. If you've read this far and added
675 // -mstackrealign to your build settings, feel free to #define STBI_MINGW_ENABLE_SSE2.
676 #define STBI_NO_SIMD
677 #endif
678
679 #if !defined(STBI_NO_SIMD) && defined(STBI__X86_TARGET)
680 #define STBI_SSE2
681 #include <emmintrin.h>
682
683 #ifdef _MSC_VER
684
685 #if _MSC_VER >= 1400 // not VC6
686 #include <intrin.h> // __cpuid
stbi__cpuid3(void)687 static int stbi__cpuid3(void)
688 {
689 int info[4];
690 __cpuid(info,1);
691 return info[3];
692 }
693 #else
stbi__cpuid3(void)694 static int stbi__cpuid3(void)
695 {
696 int res;
697 __asm {
698 mov eax,1
699 cpuid
700 mov res,edx
701 }
702 return res;
703 }
704 #endif
705
706 #define STBI_SIMD_ALIGN(type, name) __declspec(align(16)) type name
707
stbi__sse2_available()708 static int stbi__sse2_available()
709 {
710 int info3 = stbi__cpuid3();
711 return ((info3 >> 26) & 1) != 0;
712 }
713 #else // assume GCC-style if not VC++
714 #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
715
stbi__sse2_available()716 static int stbi__sse2_available()
717 {
718 #if defined(__GNUC__) && (__GNUC__ * 100 + __GNUC_MINOR__) >= 408 // GCC 4.8 or later
719 // GCC 4.8+ has a nice way to do this
720 return __builtin_cpu_supports("sse2");
721 #else
722 // portable way to do this, preferably without using GCC inline ASM?
723 // just bail for now.
724 return 0;
725 #endif
726 }
727 #endif
728 #endif
729
730 // ARM NEON
731 #if defined(STBI_NO_SIMD) && defined(STBI_NEON)
732 #undef STBI_NEON
733 #endif
734
735 #ifdef STBI_NEON
736 #include <arm_neon.h>
737 // assume GCC or Clang on ARM targets
738 #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
739 #endif
740
741 #ifndef STBI_SIMD_ALIGN
742 #define STBI_SIMD_ALIGN(type, name) type name
743 #endif
744
745 ///////////////////////////////////////////////
746 //
747 // stbi__context struct and start_xxx functions
748
749 // stbi__context structure is our basic context used by all images, so it
750 // contains all the IO context, plus some basic image information
751 typedef struct
752 {
753 stbi__uint32 img_x, img_y;
754 int img_n, img_out_n;
755
756 stbi_io_callbacks io;
757 void *io_user_data;
758
759 int read_from_callbacks;
760 int buflen;
761 stbi_uc buffer_start[128];
762
763 stbi_uc *img_buffer, *img_buffer_end;
764 stbi_uc *img_buffer_original, *img_buffer_original_end;
765 } stbi__context;
766
767
768 static void stbi__refill_buffer(stbi__context *s);
769
770 // initialize a memory-decode context
stbi__start_mem(stbi__context * s,stbi_uc const * buffer,int len)771 static void stbi__start_mem(stbi__context *s, stbi_uc const *buffer, int len)
772 {
773 s->io.read = NULL;
774 s->read_from_callbacks = 0;
775 s->img_buffer = s->img_buffer_original = (stbi_uc *) buffer;
776 s->img_buffer_end = s->img_buffer_original_end = (stbi_uc *) buffer+len;
777 }
778
779 // initialize a callback-based context
stbi__start_callbacks(stbi__context * s,stbi_io_callbacks * c,void * user)780 static void stbi__start_callbacks(stbi__context *s, stbi_io_callbacks *c, void *user)
781 {
782 s->io = *c;
783 s->io_user_data = user;
784 s->buflen = sizeof(s->buffer_start);
785 s->read_from_callbacks = 1;
786 s->img_buffer_original = s->buffer_start;
787 stbi__refill_buffer(s);
788 s->img_buffer_original_end = s->img_buffer_end;
789 }
790
791 #ifndef STBI_NO_STDIO
792
stbi__stdio_read(void * user,char * data,int size)793 static int stbi__stdio_read(void *user, char *data, int size)
794 {
795 return (int) fread(data,1,size,(FILE*) user);
796 }
797
stbi__stdio_skip(void * user,int n)798 static void stbi__stdio_skip(void *user, int n)
799 {
800 fseek((FILE*) user, n, SEEK_CUR);
801 }
802
stbi__stdio_eof(void * user)803 static int stbi__stdio_eof(void *user)
804 {
805 return feof((FILE*) user);
806 }
807
808 static stbi_io_callbacks stbi__stdio_callbacks =
809 {
810 stbi__stdio_read,
811 stbi__stdio_skip,
812 stbi__stdio_eof,
813 };
814
stbi__start_file(stbi__context * s,FILE * f)815 static void stbi__start_file(stbi__context *s, FILE *f)
816 {
817 stbi__start_callbacks(s, &stbi__stdio_callbacks, (void *) f);
818 }
819
820 //static void stop_file(stbi__context *s) { }
821
822 #endif // !STBI_NO_STDIO
823
stbi__rewind(stbi__context * s)824 static void stbi__rewind(stbi__context *s)
825 {
826 // conceptually rewind SHOULD rewind to the beginning of the stream,
827 // but we just rewind to the beginning of the initial buffer, because
828 // we only use it after doing 'test', which only ever looks at at most 92 bytes
829 s->img_buffer = s->img_buffer_original;
830 s->img_buffer_end = s->img_buffer_original_end;
831 }
832
833 #ifndef STBI_NO_JPEG
834 static int stbi__jpeg_test(stbi__context *s);
835 static stbi_uc *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
836 static int stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp);
837 #endif
838
839 #ifndef STBI_NO_PNG
840 static int stbi__png_test(stbi__context *s);
841 static stbi_uc *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
842 static int stbi__png_info(stbi__context *s, int *x, int *y, int *comp);
843 #endif
844
845 #ifndef STBI_NO_BMP
846 static int stbi__bmp_test(stbi__context *s);
847 static stbi_uc *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
848 static int stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp);
849 #endif
850
851 #ifndef STBI_NO_TGA
852 static int stbi__tga_test(stbi__context *s);
853 static stbi_uc *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
854 static int stbi__tga_info(stbi__context *s, int *x, int *y, int *comp);
855 #endif
856
857 #ifndef STBI_NO_PSD
858 static int stbi__psd_test(stbi__context *s);
859 static stbi_uc *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
860 static int stbi__psd_info(stbi__context *s, int *x, int *y, int *comp);
861 #endif
862
863 #ifndef STBI_NO_HDR
864 static int stbi__hdr_test(stbi__context *s);
865 static float *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
866 static int stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp);
867 #endif
868
869 #ifndef STBI_NO_PIC
870 static int stbi__pic_test(stbi__context *s);
871 static stbi_uc *stbi__pic_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
872 static int stbi__pic_info(stbi__context *s, int *x, int *y, int *comp);
873 #endif
874
875 #ifndef STBI_NO_GIF
876 static int stbi__gif_test(stbi__context *s);
877 static stbi_uc *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
878 static int stbi__gif_info(stbi__context *s, int *x, int *y, int *comp);
879 #endif
880
881 #ifndef STBI_NO_PNM
882 static int stbi__pnm_test(stbi__context *s);
883 static stbi_uc *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
884 static int stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp);
885 #endif
886
887 // this is not threadsafe
888 static const char *stbi__g_failure_reason;
889
stbi_failure_reason(void)890 STBIDEF const char *stbi_failure_reason(void)
891 {
892 return stbi__g_failure_reason;
893 }
894
stbi__err(const char * str)895 static int stbi__err(const char *str)
896 {
897 stbi__g_failure_reason = str;
898 return 0;
899 }
900
stbi__malloc(size_t size)901 static void *stbi__malloc(size_t size)
902 {
903 return STBI_MALLOC(size);
904 }
905
906 // stbi__err - error
907 // stbi__errpf - error returning pointer to float
908 // stbi__errpuc - error returning pointer to unsigned char
909
910 #ifdef STBI_NO_FAILURE_STRINGS
911 #define stbi__err(x,y) 0
912 #elif defined(STBI_FAILURE_USERMSG)
913 #define stbi__err(x,y) stbi__err(y)
914 #else
915 #define stbi__err(x,y) stbi__err(x)
916 #endif
917
918 #define stbi__errpf(x,y) ((float *)(size_t) (stbi__err(x,y)?NULL:NULL))
919 #define stbi__errpuc(x,y) ((unsigned char *)(size_t) (stbi__err(x,y)?NULL:NULL))
920
stbi_image_free(void * retval_from_stbi_load)921 STBIDEF void stbi_image_free(void *retval_from_stbi_load)
922 {
923 STBI_FREE(retval_from_stbi_load);
924 }
925
926 #ifndef STBI_NO_LINEAR
927 static float *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp);
928 #endif
929
930 #ifndef STBI_NO_HDR
931 static stbi_uc *stbi__hdr_to_ldr(float *data, int x, int y, int comp);
932 #endif
933
934 static int stbi__vertically_flip_on_load = 0;
935
stbi_set_flip_vertically_on_load(int flag_true_if_should_flip)936 STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip)
937 {
938 stbi__vertically_flip_on_load = flag_true_if_should_flip;
939 }
940
stbi__load_main(stbi__context * s,int * x,int * y,int * comp,int req_comp)941 static unsigned char *stbi__load_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
942 {
943 #ifndef STBI_NO_JPEG
944 if (stbi__jpeg_test(s)) return stbi__jpeg_load(s,x,y,comp,req_comp);
945 #endif
946 #ifndef STBI_NO_PNG
947 if (stbi__png_test(s)) return stbi__png_load(s,x,y,comp,req_comp);
948 #endif
949 #ifndef STBI_NO_BMP
950 if (stbi__bmp_test(s)) return stbi__bmp_load(s,x,y,comp,req_comp);
951 #endif
952 #ifndef STBI_NO_GIF
953 if (stbi__gif_test(s)) return stbi__gif_load(s,x,y,comp,req_comp);
954 #endif
955 #ifndef STBI_NO_PSD
956 if (stbi__psd_test(s)) return stbi__psd_load(s,x,y,comp,req_comp);
957 #endif
958 #ifndef STBI_NO_PIC
959 if (stbi__pic_test(s)) return stbi__pic_load(s,x,y,comp,req_comp);
960 #endif
961 #ifndef STBI_NO_PNM
962 if (stbi__pnm_test(s)) return stbi__pnm_load(s,x,y,comp,req_comp);
963 #endif
964
965 #ifndef STBI_NO_HDR
966 if (stbi__hdr_test(s)) {
967 float *hdr = stbi__hdr_load(s, x,y,comp,req_comp);
968 if (hdr == NULL) {
969 return NULL;
970 }
971 return stbi__hdr_to_ldr(hdr, *x, *y, req_comp ? req_comp : *comp);
972 }
973 #endif
974
975 #ifndef STBI_NO_TGA
976 // test tga last because it's a crappy test!
977 if (stbi__tga_test(s))
978 return stbi__tga_load(s,x,y,comp,req_comp);
979 #endif
980
981 return stbi__errpuc("unknown image type", "Image not of any known type, or corrupt");
982 }
983
stbi__load_flip(stbi__context * s,int * x,int * y,int * comp,int req_comp)984 static unsigned char *stbi__load_flip(stbi__context *s, int *x, int *y, int *comp, int req_comp)
985 {
986 unsigned char *result = stbi__load_main(s, x, y, comp, req_comp);
987
988 if (stbi__vertically_flip_on_load && result != NULL) {
989 int w = *x, h = *y;
990 int depth = req_comp ? req_comp : *comp;
991 int row,col,z;
992 stbi_uc temp;
993
994 // @OPTIMIZE: use a bigger temp buffer and memcpy multiple pixels at once
995 for (row = 0; row < (h>>1); row++) {
996 for (col = 0; col < w; col++) {
997 for (z = 0; z < depth; z++) {
998 temp = result[(row * w + col) * depth + z];
999 result[(row * w + col) * depth + z] = result[((h - row - 1) * w + col) * depth + z];
1000 result[((h - row - 1) * w + col) * depth + z] = temp;
1001 }
1002 }
1003 }
1004 }
1005
1006 return result;
1007 }
1008
1009 #ifndef STBI_NO_HDR
stbi__float_postprocess(float * result,int * x,int * y,int * comp,int req_comp)1010 static void stbi__float_postprocess(float *result, int *x, int *y, int *comp, int req_comp)
1011 {
1012 if (stbi__vertically_flip_on_load && result != NULL) {
1013 int w = *x, h = *y;
1014 int depth = req_comp ? req_comp : *comp;
1015 int row,col,z;
1016 float temp;
1017
1018 // @OPTIMIZE: use a bigger temp buffer and memcpy multiple pixels at once
1019 for (row = 0; row < (h>>1); row++) {
1020 for (col = 0; col < w; col++) {
1021 for (z = 0; z < depth; z++) {
1022 temp = result[(row * w + col) * depth + z];
1023 result[(row * w + col) * depth + z] = result[((h - row - 1) * w + col) * depth + z];
1024 result[((h - row - 1) * w + col) * depth + z] = temp;
1025 }
1026 }
1027 }
1028 }
1029 }
1030 #endif
1031
1032 #ifndef STBI_NO_STDIO
1033
stbi__fopen(char const * filename,char const * mode)1034 static FILE *stbi__fopen(char const *filename, char const *mode)
1035 {
1036 FILE *f;
1037 #if defined(_MSC_VER) && _MSC_VER >= 1400
1038 if (0 != fopen_s(&f, filename, mode))
1039 f=0;
1040 #else
1041 f = fopen(filename, mode);
1042 #endif
1043 return f;
1044 }
1045
1046
stbi_load(char const * filename,int * x,int * y,int * comp,int req_comp)1047 STBIDEF stbi_uc *stbi_load(char const *filename, int *x, int *y, int *comp, int req_comp)
1048 {
1049 FILE *f = stbi__fopen(filename, "rb");
1050 unsigned char *result;
1051 if (!f) return stbi__errpuc("can't fopen", "Unable to open file");
1052 result = stbi_load_from_file(f,x,y,comp,req_comp);
1053 fclose(f);
1054 return result;
1055 }
1056
stbi_load_from_file(FILE * f,int * x,int * y,int * comp,int req_comp)1057 STBIDEF stbi_uc *stbi_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
1058 {
1059 unsigned char *result;
1060 stbi__context s;
1061 stbi__start_file(&s,f);
1062 result = stbi__load_flip(&s,x,y,comp,req_comp);
1063 if (result) {
1064 // need to 'unget' all the characters in the IO buffer
1065 fseek(f, - (int) (s.img_buffer_end - s.img_buffer), SEEK_CUR);
1066 }
1067 return result;
1068 }
1069 #endif //!STBI_NO_STDIO
1070
stbi_load_from_memory(stbi_uc const * buffer,int len,int * x,int * y,int * comp,int req_comp)1071 STBIDEF stbi_uc *stbi_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
1072 {
1073 stbi__context s;
1074 stbi__start_mem(&s,buffer,len);
1075 return stbi__load_flip(&s,x,y,comp,req_comp);
1076 }
1077
stbi_load_from_callbacks(stbi_io_callbacks const * clbk,void * user,int * x,int * y,int * comp,int req_comp)1078 STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
1079 {
1080 stbi__context s;
1081 stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
1082 return stbi__load_flip(&s,x,y,comp,req_comp);
1083 }
1084
1085 #ifndef STBI_NO_LINEAR
stbi__loadf_main(stbi__context * s,int * x,int * y,int * comp,int req_comp)1086 static float *stbi__loadf_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
1087 {
1088 unsigned char *data;
1089 #ifndef STBI_NO_HDR
1090 if (stbi__hdr_test(s)) {
1091 float *hdr_data = stbi__hdr_load(s,x,y,comp,req_comp);
1092 if (hdr_data)
1093 stbi__float_postprocess(hdr_data,x,y,comp,req_comp);
1094 return hdr_data;
1095 }
1096 #endif
1097 data = stbi__load_flip(s, x, y, comp, req_comp);
1098 if (data)
1099 return stbi__ldr_to_hdr(data, *x, *y, req_comp ? req_comp : *comp);
1100 return stbi__errpf("unknown image type", "Image not of any known type, or corrupt");
1101 }
1102
stbi_loadf_from_memory(stbi_uc const * buffer,int len,int * x,int * y,int * comp,int req_comp)1103 STBIDEF float *stbi_loadf_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
1104 {
1105 stbi__context s;
1106 stbi__start_mem(&s,buffer,len);
1107 return stbi__loadf_main(&s,x,y,comp,req_comp);
1108 }
1109
stbi_loadf_from_callbacks(stbi_io_callbacks const * clbk,void * user,int * x,int * y,int * comp,int req_comp)1110 STBIDEF float *stbi_loadf_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
1111 {
1112 stbi__context s;
1113 stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
1114 return stbi__loadf_main(&s,x,y,comp,req_comp);
1115 }
1116
1117 #ifndef STBI_NO_STDIO
stbi_loadf(char const * filename,int * x,int * y,int * comp,int req_comp)1118 STBIDEF float *stbi_loadf(char const *filename, int *x, int *y, int *comp, int req_comp)
1119 {
1120 float *result;
1121 FILE *f = stbi__fopen(filename, "rb");
1122 if (!f) return stbi__errpf("can't fopen", "Unable to open file");
1123 result = stbi_loadf_from_file(f,x,y,comp,req_comp);
1124 fclose(f);
1125 return result;
1126 }
1127
stbi_loadf_from_file(FILE * f,int * x,int * y,int * comp,int req_comp)1128 STBIDEF float *stbi_loadf_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
1129 {
1130 stbi__context s;
1131 stbi__start_file(&s,f);
1132 return stbi__loadf_main(&s,x,y,comp,req_comp);
1133 }
1134 #endif // !STBI_NO_STDIO
1135
1136 #endif // !STBI_NO_LINEAR
1137
1138 // these is-hdr-or-not is defined independent of whether STBI_NO_LINEAR is
1139 // defined, for API simplicity; if STBI_NO_LINEAR is defined, it always
1140 // reports false!
1141
stbi_is_hdr_from_memory(stbi_uc const * buffer,int len)1142 STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len)
1143 {
1144 #ifndef STBI_NO_HDR
1145 stbi__context s;
1146 stbi__start_mem(&s,buffer,len);
1147 return stbi__hdr_test(&s);
1148 #else
1149 STBI_NOTUSED(buffer);
1150 STBI_NOTUSED(len);
1151 return 0;
1152 #endif
1153 }
1154
1155 #ifndef STBI_NO_STDIO
stbi_is_hdr(char const * filename)1156 STBIDEF int stbi_is_hdr (char const *filename)
1157 {
1158 FILE *f = stbi__fopen(filename, "rb");
1159 int result=0;
1160 if (f) {
1161 result = stbi_is_hdr_from_file(f);
1162 fclose(f);
1163 }
1164 return result;
1165 }
1166
stbi_is_hdr_from_file(FILE * f)1167 STBIDEF int stbi_is_hdr_from_file(FILE *f)
1168 {
1169 #ifndef STBI_NO_HDR
1170 stbi__context s;
1171 stbi__start_file(&s,f);
1172 return stbi__hdr_test(&s);
1173 #else
1174 STBI_NOTUSED(f);
1175 return 0;
1176 #endif
1177 }
1178 #endif // !STBI_NO_STDIO
1179
stbi_is_hdr_from_callbacks(stbi_io_callbacks const * clbk,void * user)1180 STBIDEF int stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user)
1181 {
1182 #ifndef STBI_NO_HDR
1183 stbi__context s;
1184 stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
1185 return stbi__hdr_test(&s);
1186 #else
1187 STBI_NOTUSED(clbk);
1188 STBI_NOTUSED(user);
1189 return 0;
1190 #endif
1191 }
1192
1193 static float stbi__h2l_gamma_i=1.0f/2.2f, stbi__h2l_scale_i=1.0f;
1194 static float stbi__l2h_gamma=2.2f, stbi__l2h_scale=1.0f;
1195
1196 #ifndef STBI_NO_LINEAR
stbi_ldr_to_hdr_gamma(float gamma)1197 STBIDEF void stbi_ldr_to_hdr_gamma(float gamma) { stbi__l2h_gamma = gamma; }
stbi_ldr_to_hdr_scale(float scale)1198 STBIDEF void stbi_ldr_to_hdr_scale(float scale) { stbi__l2h_scale = scale; }
1199 #endif
1200
stbi_hdr_to_ldr_gamma(float gamma)1201 STBIDEF void stbi_hdr_to_ldr_gamma(float gamma) { stbi__h2l_gamma_i = 1/gamma; }
stbi_hdr_to_ldr_scale(float scale)1202 STBIDEF void stbi_hdr_to_ldr_scale(float scale) { stbi__h2l_scale_i = 1/scale; }
1203
1204
1205 //////////////////////////////////////////////////////////////////////////////
1206 //
1207 // Common code used by all image loaders
1208 //
1209
1210 enum
1211 {
1212 STBI__SCAN_load=0,
1213 STBI__SCAN_type,
1214 STBI__SCAN_header
1215 };
1216
stbi__refill_buffer(stbi__context * s)1217 static void stbi__refill_buffer(stbi__context *s)
1218 {
1219 int n = (s->io.read)(s->io_user_data,(char*)s->buffer_start,s->buflen);
1220 if (n == 0) {
1221 // at end of file, treat same as if from memory, but need to handle case
1222 // where s->img_buffer isn't pointing to safe memory, e.g. 0-byte file
1223 s->read_from_callbacks = 0;
1224 s->img_buffer = s->buffer_start;
1225 s->img_buffer_end = s->buffer_start+1;
1226 *s->img_buffer = 0;
1227 } else {
1228 s->img_buffer = s->buffer_start;
1229 s->img_buffer_end = s->buffer_start + n;
1230 }
1231 }
1232
stbi__get8(stbi__context * s)1233 stbi_inline static stbi_uc stbi__get8(stbi__context *s)
1234 {
1235 if (s->img_buffer < s->img_buffer_end)
1236 return *s->img_buffer++;
1237 if (s->read_from_callbacks) {
1238 stbi__refill_buffer(s);
1239 return *s->img_buffer++;
1240 }
1241 return 0;
1242 }
1243
stbi__at_eof(stbi__context * s)1244 stbi_inline static int stbi__at_eof(stbi__context *s)
1245 {
1246 if (s->io.read) {
1247 if (!(s->io.eof)(s->io_user_data)) return 0;
1248 // if feof() is true, check if buffer = end
1249 // special case: we've only got the special 0 character at the end
1250 if (s->read_from_callbacks == 0) return 1;
1251 }
1252
1253 return s->img_buffer >= s->img_buffer_end;
1254 }
1255
stbi__skip(stbi__context * s,int n)1256 static void stbi__skip(stbi__context *s, int n)
1257 {
1258 if (n < 0) {
1259 s->img_buffer = s->img_buffer_end;
1260 return;
1261 }
1262 if (s->io.read) {
1263 int blen = (int) (s->img_buffer_end - s->img_buffer);
1264 if (blen < n) {
1265 s->img_buffer = s->img_buffer_end;
1266 (s->io.skip)(s->io_user_data, n - blen);
1267 return;
1268 }
1269 }
1270 s->img_buffer += n;
1271 }
1272
stbi__getn(stbi__context * s,stbi_uc * buffer,int n)1273 static int stbi__getn(stbi__context *s, stbi_uc *buffer, int n)
1274 {
1275 if (s->io.read) {
1276 int blen = (int) (s->img_buffer_end - s->img_buffer);
1277 if (blen < n) {
1278 int res, count;
1279
1280 memcpy(buffer, s->img_buffer, blen);
1281
1282 count = (s->io.read)(s->io_user_data, (char*) buffer + blen, n - blen);
1283 res = (count == (n-blen));
1284 s->img_buffer = s->img_buffer_end;
1285 return res;
1286 }
1287 }
1288
1289 if (s->img_buffer+n <= s->img_buffer_end) {
1290 memcpy(buffer, s->img_buffer, n);
1291 s->img_buffer += n;
1292 return 1;
1293 } else
1294 return 0;
1295 }
1296
stbi__get16be(stbi__context * s)1297 static int stbi__get16be(stbi__context *s)
1298 {
1299 int z = stbi__get8(s);
1300 return (z << 8) + stbi__get8(s);
1301 }
1302
stbi__get32be(stbi__context * s)1303 static stbi__uint32 stbi__get32be(stbi__context *s)
1304 {
1305 stbi__uint32 z = stbi__get16be(s);
1306 return (z << 16) + stbi__get16be(s);
1307 }
1308
1309 #if defined(STBI_NO_BMP) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF)
1310 // nothing
1311 #else
stbi__get16le(stbi__context * s)1312 static int stbi__get16le(stbi__context *s)
1313 {
1314 int z = stbi__get8(s);
1315 return z + (stbi__get8(s) << 8);
1316 }
1317 #endif
1318
1319 #ifndef STBI_NO_BMP
stbi__get32le(stbi__context * s)1320 static stbi__uint32 stbi__get32le(stbi__context *s)
1321 {
1322 stbi__uint32 z = stbi__get16le(s);
1323 return z + (stbi__get16le(s) << 16);
1324 }
1325 #endif
1326
1327 #define STBI__BYTECAST(x) ((stbi_uc) ((x) & 255)) // truncate int to byte without warnings
1328
1329
1330 //////////////////////////////////////////////////////////////////////////////
1331 //
1332 // generic converter from built-in img_n to req_comp
1333 // individual types do this automatically as much as possible (e.g. jpeg
1334 // does all cases internally since it needs to colorspace convert anyway,
1335 // and it never has alpha, so very few cases ). png can automatically
1336 // interleave an alpha=255 channel, but falls back to this for other cases
1337 //
1338 // assume data buffer is malloced, so malloc a new one and free that one
1339 // only failure mode is malloc failing
1340
stbi__compute_y(int r,int g,int b)1341 static stbi_uc stbi__compute_y(int r, int g, int b)
1342 {
1343 return (stbi_uc) (((r*77) + (g*150) + (29*b)) >> 8);
1344 }
1345
stbi__convert_format(unsigned char * data,int img_n,int req_comp,unsigned int x,unsigned int y)1346 static unsigned char *stbi__convert_format(unsigned char *data, int img_n, int req_comp, unsigned int x, unsigned int y)
1347 {
1348 int i,j;
1349 unsigned char *good;
1350
1351 if (req_comp == img_n) return data;
1352 STBI_ASSERT(req_comp >= 1 && req_comp <= 4);
1353
1354 if (x == 0 || y == 0 || req_comp <= 0 || (req_comp > INT_MAX / x / y))
1355 return stbi__errpuc("Integer OverFlow", "x or y is bad");
1356
1357 good = (unsigned char *) stbi__malloc(req_comp * x * y);
1358 if (good == NULL) {
1359 STBI_FREE(data);
1360 return stbi__errpuc("outofmem", "Out of memory");
1361 }
1362
1363 for (j=0; j < (int) y; ++j) {
1364 unsigned char *src = data + j * x * img_n ;
1365 unsigned char *dest = good + j * x * req_comp;
1366
1367 #define COMBO(a,b) ((a)*8+(b))
1368 #define CASE(a,b) case COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b)
1369 // convert source image with img_n components to one with req_comp components;
1370 // avoid switch per pixel, so use switch per scanline and massive macros
1371 switch (COMBO(img_n, req_comp)) {
1372 CASE(1,2) dest[0]=src[0], dest[1]=255; break;
1373 CASE(1,3) dest[0]=dest[1]=dest[2]=src[0]; break;
1374 CASE(1,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=255; break;
1375 CASE(2,1) dest[0]=src[0]; break;
1376 CASE(2,3) dest[0]=dest[1]=dest[2]=src[0]; break;
1377 CASE(2,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=src[1]; break;
1378 CASE(3,4) dest[0]=src[0],dest[1]=src[1],dest[2]=src[2],dest[3]=255; break;
1379 CASE(3,1) dest[0]=stbi__compute_y(src[0],src[1],src[2]); break;
1380 CASE(3,2) dest[0]=stbi__compute_y(src[0],src[1],src[2]), dest[1] = 255; break;
1381 CASE(4,1) dest[0]=stbi__compute_y(src[0],src[1],src[2]); break;
1382 CASE(4,2) dest[0]=stbi__compute_y(src[0],src[1],src[2]), dest[1] = src[3]; break;
1383 CASE(4,3) dest[0]=src[0],dest[1]=src[1],dest[2]=src[2]; break;
1384 default: STBI_ASSERT(0);
1385 }
1386 #undef CASE
1387 }
1388
1389 STBI_FREE(data);
1390 return good;
1391 }
1392
1393 #ifndef STBI_NO_LINEAR
stbi__ldr_to_hdr(stbi_uc * data,int x,int y,int comp)1394 static float *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp)
1395 {
1396 int i,k,n;
1397
1398 if (x <= 0 || y <= 0 || comp <= 0 ||
1399 (sizeof(float) > INT_MAX / x / y / comp))
1400 return stbi__errpf("Integer OverFlow", "x , y or comp is too large");
1401
1402 float *output = (float *) stbi__malloc(x * y * comp * sizeof(float));
1403 if (output == NULL) { STBI_FREE(data); return stbi__errpf("outofmem", "Out of memory"); }
1404 // compute number of non-alpha components
1405 if (comp & 1) n = comp; else n = comp-1;
1406 for (i=0; i < x*y; ++i) {
1407 for (k=0; k < n; ++k) {
1408 output[i*comp + k] = (float) (pow(data[i*comp+k]/255.0f, stbi__l2h_gamma) * stbi__l2h_scale);
1409 }
1410 if (k < comp) output[i*comp + k] = data[i*comp+k]/255.0f;
1411 }
1412 STBI_FREE(data);
1413 return output;
1414 }
1415 #endif
1416
1417 #ifndef STBI_NO_HDR
1418 #define stbi__float2int(x) ((int) (x))
stbi__hdr_to_ldr(float * data,int x,int y,int comp)1419 static stbi_uc *stbi__hdr_to_ldr(float *data, int x, int y, int comp)
1420 {
1421 int i,k,n;
1422
1423 if (x <= 0 || y <= 0 || comp <= 0 ||
1424 (comp > INT_MAX / x / y))
1425 return stbi__errpuc("Integer OverFlow", "x or y is too large");
1426
1427 stbi_uc *output = (stbi_uc *) stbi__malloc(x * y * comp);
1428 if (output == NULL) { STBI_FREE(data); return stbi__errpuc("outofmem", "Out of memory"); }
1429 // compute number of non-alpha components
1430 if (comp & 1) n = comp; else n = comp-1;
1431 for (i=0; i < x*y; ++i) {
1432 for (k=0; k < n; ++k) {
1433 float z = (float) pow(data[i*comp+k]*stbi__h2l_scale_i, stbi__h2l_gamma_i) * 255 + 0.5f;
1434 if (z < 0) z = 0;
1435 if (z > 255) z = 255;
1436 output[i*comp + k] = (stbi_uc) stbi__float2int(z);
1437 }
1438 if (k < comp) {
1439 float z = data[i*comp+k] * 255 + 0.5f;
1440 if (z < 0) z = 0;
1441 if (z > 255) z = 255;
1442 output[i*comp + k] = (stbi_uc) stbi__float2int(z);
1443 }
1444 }
1445 STBI_FREE(data);
1446 return output;
1447 }
1448 #endif
1449
1450 //////////////////////////////////////////////////////////////////////////////
1451 //
1452 // "baseline" JPEG/JFIF decoder
1453 //
1454 // simple implementation
1455 // - doesn't support delayed output of y-dimension
1456 // - simple interface (only one output format: 8-bit interleaved RGB)
1457 // - doesn't try to recover corrupt jpegs
1458 // - doesn't allow partial loading, loading multiple at once
1459 // - still fast on x86 (copying globals into locals doesn't help x86)
1460 // - allocates lots of intermediate memory (full size of all components)
1461 // - non-interleaved case requires this anyway
1462 // - allows good upsampling (see next)
1463 // high-quality
1464 // - upsampled channels are bilinearly interpolated, even across blocks
1465 // - quality integer IDCT derived from IJG's 'slow'
1466 // performance
1467 // - fast huffman; reasonable integer IDCT
1468 // - some SIMD kernels for common paths on targets with SSE2/NEON
1469 // - uses a lot of intermediate memory, could cache poorly
1470
1471 #ifndef STBI_NO_JPEG
1472
1473 // huffman decoding acceleration
1474 #define FAST_BITS 9 // larger handles more cases; smaller stomps less cache
1475
1476 typedef struct
1477 {
1478 stbi_uc fast[1 << FAST_BITS];
1479 // weirdly, repacking this into AoS is a 10% speed loss, instead of a win
1480 stbi__uint16 code[256];
1481 stbi_uc values[256];
1482 stbi_uc size[257];
1483 unsigned int maxcode[18];
1484 int delta[17]; // old 'firstsymbol' - old 'firstcode'
1485 } stbi__huffman;
1486
1487 typedef struct
1488 {
1489 stbi__context *s;
1490 stbi__huffman huff_dc[4];
1491 stbi__huffman huff_ac[4];
1492 stbi_uc dequant[4][64];
1493 stbi__int16 fast_ac[4][1 << FAST_BITS];
1494
1495 // sizes for components, interleaved MCUs
1496 int img_h_max, img_v_max;
1497 int img_mcu_x, img_mcu_y;
1498 int img_mcu_w, img_mcu_h;
1499
1500 // definition of jpeg image component
1501 struct
1502 {
1503 int id;
1504 int h,v;
1505 int tq;
1506 int hd,ha;
1507 int dc_pred;
1508
1509 int x,y,w2,h2;
1510 stbi_uc *data;
1511 void *raw_data, *raw_coeff;
1512 stbi_uc *linebuf;
1513 short *coeff; // progressive only
1514 int coeff_w, coeff_h; // number of 8x8 coefficient blocks
1515 } img_comp[4];
1516
1517 stbi__uint32 code_buffer; // jpeg entropy-coded buffer
1518 int code_bits; // number of valid bits
1519 unsigned char marker; // marker seen while filling entropy buffer
1520 int nomore; // flag if we saw a marker so must stop
1521
1522 int progressive;
1523 int spec_start;
1524 int spec_end;
1525 int succ_high;
1526 int succ_low;
1527 int eob_run;
1528
1529 int scan_n, order[4];
1530 int restart_interval, todo;
1531
1532 // kernels
1533 void (*idct_block_kernel)(stbi_uc *out, int out_stride, short data[64]);
1534 void (*YCbCr_to_RGB_kernel)(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step);
1535 stbi_uc *(*resample_row_hv_2_kernel)(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs);
1536 } stbi__jpeg;
1537
stbi__build_huffman(stbi__huffman * h,int * count)1538 static int stbi__build_huffman(stbi__huffman *h, int *count)
1539 {
1540 int i,j,k=0,code;
1541 // build size list for each symbol (from JPEG spec)
1542 for (i=0; i < 16; ++i)
1543 for (j=0; j < count[i]; ++j)
1544 h->size[k++] = (stbi_uc) (i+1);
1545 h->size[k] = 0;
1546
1547 // compute actual symbols (from jpeg spec)
1548 code = 0;
1549 k = 0;
1550 for(j=1; j <= 16; ++j) {
1551 // compute delta to add to code to compute symbol id
1552 h->delta[j] = k - code;
1553 if (h->size[k] == j) {
1554 while (h->size[k] == j)
1555 h->code[k++] = (stbi__uint16) (code++);
1556 if (code-1 >= (1 << j)) return stbi__err("bad code lengths","Corrupt JPEG");
1557 }
1558 // compute largest code + 1 for this size, preshifted as needed later
1559 h->maxcode[j] = code << (16-j);
1560 code <<= 1;
1561 }
1562 h->maxcode[j] = 0xffffffff;
1563
1564 // build non-spec acceleration table; 255 is flag for not-accelerated
1565 memset(h->fast, 255, 1 << FAST_BITS);
1566 for (i=0; i < k; ++i) {
1567 int s = h->size[i];
1568 if (s <= FAST_BITS) {
1569 int c = h->code[i] << (FAST_BITS-s);
1570 int m = 1 << (FAST_BITS-s);
1571 for (j=0; j < m; ++j) {
1572 h->fast[c+j] = (stbi_uc) i;
1573 }
1574 }
1575 }
1576 return 1;
1577 }
1578
1579 // build a table that decodes both magnitude and value of small ACs in
1580 // one go.
stbi__build_fast_ac(stbi__int16 * fast_ac,stbi__huffman * h)1581 static void stbi__build_fast_ac(stbi__int16 *fast_ac, stbi__huffman *h)
1582 {
1583 int i;
1584 for (i=0; i < (1 << FAST_BITS); ++i) {
1585 stbi_uc fast = h->fast[i];
1586 fast_ac[i] = 0;
1587 if (fast < 255) {
1588 int rs = h->values[fast];
1589 int run = (rs >> 4) & 15;
1590 int magbits = rs & 15;
1591 int len = h->size[fast];
1592
1593 if (magbits && len + magbits <= FAST_BITS) {
1594 // magnitude code followed by receive_extend code
1595 int k = ((i << len) & ((1 << FAST_BITS) - 1)) >> (FAST_BITS - magbits);
1596 int m = 1 << (magbits - 1);
1597 if (k < m) k += (-1 << magbits) + 1;
1598 // if the result is small enough, we can fit it in fast_ac table
1599 if (k >= -128 && k <= 127)
1600 fast_ac[i] = (stbi__int16) ((k << 8) + (run << 4) + (len + magbits));
1601 }
1602 }
1603 }
1604 }
1605
stbi__grow_buffer_unsafe(stbi__jpeg * j)1606 static void stbi__grow_buffer_unsafe(stbi__jpeg *j)
1607 {
1608 do {
1609 int b = j->nomore ? 0 : stbi__get8(j->s);
1610 if (b == 0xff) {
1611 int c = stbi__get8(j->s);
1612 if (c != 0) {
1613 j->marker = (unsigned char) c;
1614 j->nomore = 1;
1615 return;
1616 }
1617 }
1618 j->code_buffer |= b << (24 - j->code_bits);
1619 j->code_bits += 8;
1620 } while (j->code_bits <= 24);
1621 }
1622
1623 // (1 << n) - 1
1624 static stbi__uint32 stbi__bmask[17]={0,1,3,7,15,31,63,127,255,511,1023,2047,4095,8191,16383,32767,65535};
1625
1626 // decode a jpeg huffman value from the bitstream
stbi__jpeg_huff_decode(stbi__jpeg * j,stbi__huffman * h)1627 stbi_inline static int stbi__jpeg_huff_decode(stbi__jpeg *j, stbi__huffman *h)
1628 {
1629 unsigned int temp;
1630 int c,k;
1631
1632 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1633
1634 // look at the top FAST_BITS and determine what symbol ID it is,
1635 // if the code is <= FAST_BITS
1636 c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
1637 k = h->fast[c];
1638 if (k < 255) {
1639 int s = h->size[k];
1640 if (s > j->code_bits)
1641 return -1;
1642 j->code_buffer <<= s;
1643 j->code_bits -= s;
1644 return h->values[k];
1645 }
1646
1647 // naive test is to shift the code_buffer down so k bits are
1648 // valid, then test against maxcode. To speed this up, we've
1649 // preshifted maxcode left so that it has (16-k) 0s at the
1650 // end; in other words, regardless of the number of bits, it
1651 // wants to be compared against something shifted to have 16;
1652 // that way we don't need to shift inside the loop.
1653 temp = j->code_buffer >> 16;
1654 for (k=FAST_BITS+1 ; ; ++k)
1655 if (temp < h->maxcode[k])
1656 break;
1657 if (k == 17) {
1658 // error! code not found
1659 j->code_bits -= 16;
1660 return -1;
1661 }
1662
1663 if (k > j->code_bits)
1664 return -1;
1665
1666 // convert the huffman code to the symbol id
1667 c = ((j->code_buffer >> (32 - k)) & stbi__bmask[k]) + h->delta[k];
1668 STBI_ASSERT((((j->code_buffer) >> (32 - h->size[c])) & stbi__bmask[h->size[c]]) == h->code[c]);
1669
1670 // convert the id to a symbol
1671 j->code_bits -= k;
1672 j->code_buffer <<= k;
1673 return h->values[c];
1674 }
1675
1676 // bias[n] = (-1<<n) + 1
1677 static int const stbi__jbias[16] = {0,-1,-3,-7,-15,-31,-63,-127,-255,-511,-1023,-2047,-4095,-8191,-16383,-32767};
1678
1679 // combined JPEG 'receive' and JPEG 'extend', since baseline
1680 // always extends everything it receives.
stbi__extend_receive(stbi__jpeg * j,int n)1681 stbi_inline static int stbi__extend_receive(stbi__jpeg *j, int n)
1682 {
1683 unsigned int k;
1684 int sgn;
1685 if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
1686
1687 sgn = (stbi__int32)j->code_buffer >> 31; // sign bit is always in MSB
1688 k = stbi_lrot(j->code_buffer, n);
1689 STBI_ASSERT(n >= 0 && n < (int) (sizeof(stbi__bmask)/sizeof(*stbi__bmask)));
1690 j->code_buffer = k & ~stbi__bmask[n];
1691 k &= stbi__bmask[n];
1692 j->code_bits -= n;
1693 return k + (stbi__jbias[n] & ~sgn);
1694 }
1695
1696 // get some unsigned bits
stbi__jpeg_get_bits(stbi__jpeg * j,int n)1697 stbi_inline static int stbi__jpeg_get_bits(stbi__jpeg *j, int n)
1698 {
1699 unsigned int k;
1700 if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
1701 k = stbi_lrot(j->code_buffer, n);
1702 j->code_buffer = k & ~stbi__bmask[n];
1703 k &= stbi__bmask[n];
1704 j->code_bits -= n;
1705 return k;
1706 }
1707
stbi__jpeg_get_bit(stbi__jpeg * j)1708 stbi_inline static int stbi__jpeg_get_bit(stbi__jpeg *j)
1709 {
1710 unsigned int k;
1711 if (j->code_bits < 1) stbi__grow_buffer_unsafe(j);
1712 k = j->code_buffer;
1713 j->code_buffer <<= 1;
1714 --j->code_bits;
1715 return k & 0x80000000;
1716 }
1717
1718 // given a value that's at position X in the zigzag stream,
1719 // where does it appear in the 8x8 matrix coded as row-major?
1720 static stbi_uc stbi__jpeg_dezigzag[64+15] =
1721 {
1722 0, 1, 8, 16, 9, 2, 3, 10,
1723 17, 24, 32, 25, 18, 11, 4, 5,
1724 12, 19, 26, 33, 40, 48, 41, 34,
1725 27, 20, 13, 6, 7, 14, 21, 28,
1726 35, 42, 49, 56, 57, 50, 43, 36,
1727 29, 22, 15, 23, 30, 37, 44, 51,
1728 58, 59, 52, 45, 38, 31, 39, 46,
1729 53, 60, 61, 54, 47, 55, 62, 63,
1730 // let corrupt input sample past end
1731 63, 63, 63, 63, 63, 63, 63, 63,
1732 63, 63, 63, 63, 63, 63, 63
1733 };
1734
1735 // decode one 64-entry block--
stbi__jpeg_decode_block(stbi__jpeg * j,short data[64],stbi__huffman * hdc,stbi__huffman * hac,stbi__int16 * fac,int b,stbi_uc * dequant)1736 static int stbi__jpeg_decode_block(stbi__jpeg *j, short data[64], stbi__huffman *hdc, stbi__huffman *hac, stbi__int16 *fac, int b, stbi_uc *dequant)
1737 {
1738 int diff,dc,k;
1739 int t;
1740
1741 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1742 t = stbi__jpeg_huff_decode(j, hdc);
1743 if (t < 0) return stbi__err("bad huffman code","Corrupt JPEG");
1744
1745 // 0 all the ac values now so we can do it 32-bits at a time
1746 memset(data,0,64*sizeof(data[0]));
1747
1748 diff = t ? stbi__extend_receive(j, t) : 0;
1749 dc = j->img_comp[b].dc_pred + diff;
1750 j->img_comp[b].dc_pred = dc;
1751 data[0] = (short) (dc * dequant[0]);
1752
1753 // decode AC components, see JPEG spec
1754 k = 1;
1755 do {
1756 unsigned int zig;
1757 int c,r,s;
1758 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1759 c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
1760 r = fac[c];
1761 if (r) { // fast-AC path
1762 k += (r >> 4) & 15; // run
1763 s = r & 15; // combined length
1764 j->code_buffer <<= s;
1765 j->code_bits -= s;
1766 // decode into unzigzag'd location
1767 zig = stbi__jpeg_dezigzag[k++];
1768 data[zig] = (short) ((r >> 8) * dequant[zig]);
1769 } else {
1770 int rs = stbi__jpeg_huff_decode(j, hac);
1771 if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
1772 s = rs & 15;
1773 r = rs >> 4;
1774 if (s == 0) {
1775 if (rs != 0xf0) break; // end block
1776 k += 16;
1777 } else {
1778 k += r;
1779 // decode into unzigzag'd location
1780 zig = stbi__jpeg_dezigzag[k++];
1781 data[zig] = (short) (stbi__extend_receive(j,s) * dequant[zig]);
1782 }
1783 }
1784 } while (k < 64);
1785 return 1;
1786 }
1787
stbi__jpeg_decode_block_prog_dc(stbi__jpeg * j,short data[64],stbi__huffman * hdc,int b)1788 static int stbi__jpeg_decode_block_prog_dc(stbi__jpeg *j, short data[64], stbi__huffman *hdc, int b)
1789 {
1790 int diff,dc;
1791 int t;
1792 if (j->spec_end != 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
1793
1794 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1795
1796 if (j->succ_high == 0) {
1797 // first scan for DC coefficient, must be first
1798 memset(data,0,64*sizeof(data[0])); // 0 all the ac values now
1799 t = stbi__jpeg_huff_decode(j, hdc);
1800 diff = t ? stbi__extend_receive(j, t) : 0;
1801
1802 dc = j->img_comp[b].dc_pred + diff;
1803 j->img_comp[b].dc_pred = dc;
1804 data[0] = (short) (dc << j->succ_low);
1805 } else {
1806 // refinement scan for DC coefficient
1807 if (stbi__jpeg_get_bit(j))
1808 data[0] += (short) (1 << j->succ_low);
1809 }
1810 return 1;
1811 }
1812
1813 // @OPTIMIZE: store non-zigzagged during the decode passes,
1814 // and only de-zigzag when dequantizing
stbi__jpeg_decode_block_prog_ac(stbi__jpeg * j,short data[64],stbi__huffman * hac,stbi__int16 * fac)1815 static int stbi__jpeg_decode_block_prog_ac(stbi__jpeg *j, short data[64], stbi__huffman *hac, stbi__int16 *fac)
1816 {
1817 int k;
1818 if (j->spec_start == 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
1819
1820 if (j->succ_high == 0) {
1821 int shift = j->succ_low;
1822
1823 if (j->eob_run) {
1824 --j->eob_run;
1825 return 1;
1826 }
1827
1828 k = j->spec_start;
1829 do {
1830 unsigned int zig;
1831 int c,r,s;
1832 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1833 c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
1834 r = fac[c];
1835 if (r) { // fast-AC path
1836 k += (r >> 4) & 15; // run
1837 s = r & 15; // combined length
1838 j->code_buffer <<= s;
1839 j->code_bits -= s;
1840 zig = stbi__jpeg_dezigzag[k++];
1841 data[zig] = (short) ((r >> 8) << shift);
1842 } else {
1843 int rs = stbi__jpeg_huff_decode(j, hac);
1844 if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
1845 s = rs & 15;
1846 r = rs >> 4;
1847 if (s == 0) {
1848 if (r < 15) {
1849 j->eob_run = (1 << r);
1850 if (r)
1851 j->eob_run += stbi__jpeg_get_bits(j, r);
1852 --j->eob_run;
1853 break;
1854 }
1855 k += 16;
1856 } else {
1857 k += r;
1858 zig = stbi__jpeg_dezigzag[k++];
1859 data[zig] = (short) (stbi__extend_receive(j,s) << shift);
1860 }
1861 }
1862 } while (k <= j->spec_end);
1863 } else {
1864 // refinement scan for these AC coefficients
1865
1866 short bit = (short) (1 << j->succ_low);
1867
1868 if (j->eob_run) {
1869 --j->eob_run;
1870 for (k = j->spec_start; k <= j->spec_end; ++k) {
1871 short *p = &data[stbi__jpeg_dezigzag[k]];
1872 if (*p != 0)
1873 if (stbi__jpeg_get_bit(j))
1874 if ((*p & bit)==0) {
1875 if (*p > 0)
1876 *p += bit;
1877 else
1878 *p -= bit;
1879 }
1880 }
1881 } else {
1882 k = j->spec_start;
1883 do {
1884 int r,s;
1885 int rs = stbi__jpeg_huff_decode(j, hac); // @OPTIMIZE see if we can use the fast path here, advance-by-r is so slow, eh
1886 if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
1887 s = rs & 15;
1888 r = rs >> 4;
1889 if (s == 0) {
1890 if (r < 15) {
1891 j->eob_run = (1 << r) - 1;
1892 if (r)
1893 j->eob_run += stbi__jpeg_get_bits(j, r);
1894 r = 64; // force end of block
1895 } else {
1896 // r=15 s=0 should write 16 0s, so we just do
1897 // a run of 15 0s and then write s (which is 0),
1898 // so we don't have to do anything special here
1899 }
1900 } else {
1901 if (s != 1) return stbi__err("bad huffman code", "Corrupt JPEG");
1902 // sign bit
1903 if (stbi__jpeg_get_bit(j))
1904 s = bit;
1905 else
1906 s = -bit;
1907 }
1908
1909 // advance by r
1910 while (k <= j->spec_end) {
1911 short *p = &data[stbi__jpeg_dezigzag[k++]];
1912 if (*p != 0) {
1913 if (stbi__jpeg_get_bit(j))
1914 if ((*p & bit)==0) {
1915 if (*p > 0)
1916 *p += bit;
1917 else
1918 *p -= bit;
1919 }
1920 } else {
1921 if (r == 0) {
1922 *p = (short) s;
1923 break;
1924 }
1925 --r;
1926 }
1927 }
1928 } while (k <= j->spec_end);
1929 }
1930 }
1931 return 1;
1932 }
1933
1934 // take a -128..127 value and stbi__clamp it and convert to 0..255
stbi__clamp(int x)1935 stbi_inline static stbi_uc stbi__clamp(int x)
1936 {
1937 // trick to use a single test to catch both cases
1938 if ((unsigned int) x > 255) {
1939 if (x < 0) return 0;
1940 if (x > 255) return 255;
1941 }
1942 return (stbi_uc) x;
1943 }
1944
1945 #define stbi__f2f(x) ((int) (((x) * 4096 + 0.5)))
1946 #define stbi__fsh(x) ((x) << 12)
1947
1948 // derived from jidctint -- DCT_ISLOW
1949 #define STBI__IDCT_1D(s0,s1,s2,s3,s4,s5,s6,s7) \
1950 int t0,t1,t2,t3,p1,p2,p3,p4,p5,x0,x1,x2,x3; \
1951 p2 = s2; \
1952 p3 = s6; \
1953 p1 = (p2+p3) * stbi__f2f(0.5411961f); \
1954 t2 = p1 + p3*stbi__f2f(-1.847759065f); \
1955 t3 = p1 + p2*stbi__f2f( 0.765366865f); \
1956 p2 = s0; \
1957 p3 = s4; \
1958 t0 = stbi__fsh(p2+p3); \
1959 t1 = stbi__fsh(p2-p3); \
1960 x0 = t0+t3; \
1961 x3 = t0-t3; \
1962 x1 = t1+t2; \
1963 x2 = t1-t2; \
1964 t0 = s7; \
1965 t1 = s5; \
1966 t2 = s3; \
1967 t3 = s1; \
1968 p3 = t0+t2; \
1969 p4 = t1+t3; \
1970 p1 = t0+t3; \
1971 p2 = t1+t2; \
1972 p5 = (p3+p4)*stbi__f2f( 1.175875602f); \
1973 t0 = t0*stbi__f2f( 0.298631336f); \
1974 t1 = t1*stbi__f2f( 2.053119869f); \
1975 t2 = t2*stbi__f2f( 3.072711026f); \
1976 t3 = t3*stbi__f2f( 1.501321110f); \
1977 p1 = p5 + p1*stbi__f2f(-0.899976223f); \
1978 p2 = p5 + p2*stbi__f2f(-2.562915447f); \
1979 p3 = p3*stbi__f2f(-1.961570560f); \
1980 p4 = p4*stbi__f2f(-0.390180644f); \
1981 t3 += p1+p4; \
1982 t2 += p2+p3; \
1983 t1 += p2+p4; \
1984 t0 += p1+p3;
1985
stbi__idct_block(stbi_uc * out,int out_stride,short data[64])1986 static void stbi__idct_block(stbi_uc *out, int out_stride, short data[64])
1987 {
1988 int i,val[64],*v=val;
1989 stbi_uc *o;
1990 short *d = data;
1991
1992 // columns
1993 for (i=0; i < 8; ++i,++d, ++v) {
1994 // if all zeroes, shortcut -- this avoids dequantizing 0s and IDCTing
1995 if (d[ 8]==0 && d[16]==0 && d[24]==0 && d[32]==0
1996 && d[40]==0 && d[48]==0 && d[56]==0) {
1997 // no shortcut 0 seconds
1998 // (1|2|3|4|5|6|7)==0 0 seconds
1999 // all separate -0.047 seconds
2000 // 1 && 2|3 && 4|5 && 6|7: -0.047 seconds
2001 int dcterm = d[0] << 2;
2002 v[0] = v[8] = v[16] = v[24] = v[32] = v[40] = v[48] = v[56] = dcterm;
2003 } else {
2004 STBI__IDCT_1D(d[ 0],d[ 8],d[16],d[24],d[32],d[40],d[48],d[56])
2005 // constants scaled things up by 1<<12; let's bring them back
2006 // down, but keep 2 extra bits of precision
2007 x0 += 512; x1 += 512; x2 += 512; x3 += 512;
2008 v[ 0] = (x0+t3) >> 10;
2009 v[56] = (x0-t3) >> 10;
2010 v[ 8] = (x1+t2) >> 10;
2011 v[48] = (x1-t2) >> 10;
2012 v[16] = (x2+t1) >> 10;
2013 v[40] = (x2-t1) >> 10;
2014 v[24] = (x3+t0) >> 10;
2015 v[32] = (x3-t0) >> 10;
2016 }
2017 }
2018
2019 for (i=0, v=val, o=out; i < 8; ++i,v+=8,o+=out_stride) {
2020 // no fast case since the first 1D IDCT spread components out
2021 STBI__IDCT_1D(v[0],v[1],v[2],v[3],v[4],v[5],v[6],v[7])
2022 // constants scaled things up by 1<<12, plus we had 1<<2 from first
2023 // loop, plus horizontal and vertical each scale by sqrt(8) so together
2024 // we've got an extra 1<<3, so 1<<17 total we need to remove.
2025 // so we want to round that, which means adding 0.5 * 1<<17,
2026 // aka 65536. Also, we'll end up with -128 to 127 that we want
2027 // to encode as 0..255 by adding 128, so we'll add that before the shift
2028 x0 += 65536 + (128<<17);
2029 x1 += 65536 + (128<<17);
2030 x2 += 65536 + (128<<17);
2031 x3 += 65536 + (128<<17);
2032 // tried computing the shifts into temps, or'ing the temps to see
2033 // if any were out of range, but that was slower
2034 o[0] = stbi__clamp((x0+t3) >> 17);
2035 o[7] = stbi__clamp((x0-t3) >> 17);
2036 o[1] = stbi__clamp((x1+t2) >> 17);
2037 o[6] = stbi__clamp((x1-t2) >> 17);
2038 o[2] = stbi__clamp((x2+t1) >> 17);
2039 o[5] = stbi__clamp((x2-t1) >> 17);
2040 o[3] = stbi__clamp((x3+t0) >> 17);
2041 o[4] = stbi__clamp((x3-t0) >> 17);
2042 }
2043 }
2044
2045 #ifdef STBI_SSE2
2046 // sse2 integer IDCT. not the fastest possible implementation but it
2047 // produces bit-identical results to the generic C version so it's
2048 // fully "transparent".
stbi__idct_simd(stbi_uc * out,int out_stride,short data[64])2049 static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
2050 {
2051 // This is constructed to match our regular (generic) integer IDCT exactly.
2052 __m128i row0, row1, row2, row3, row4, row5, row6, row7;
2053 __m128i tmp;
2054
2055 // dot product constant: even elems=x, odd elems=y
2056 #define dct_const(x,y) _mm_setr_epi16((x),(y),(x),(y),(x),(y),(x),(y))
2057
2058 // out(0) = c0[even]*x + c0[odd]*y (c0, x, y 16-bit, out 32-bit)
2059 // out(1) = c1[even]*x + c1[odd]*y
2060 #define dct_rot(out0,out1, x,y,c0,c1) \
2061 __m128i c0##lo = _mm_unpacklo_epi16((x),(y)); \
2062 __m128i c0##hi = _mm_unpackhi_epi16((x),(y)); \
2063 __m128i out0##_l = _mm_madd_epi16(c0##lo, c0); \
2064 __m128i out0##_h = _mm_madd_epi16(c0##hi, c0); \
2065 __m128i out1##_l = _mm_madd_epi16(c0##lo, c1); \
2066 __m128i out1##_h = _mm_madd_epi16(c0##hi, c1)
2067
2068 // out = in << 12 (in 16-bit, out 32-bit)
2069 #define dct_widen(out, in) \
2070 __m128i out##_l = _mm_srai_epi32(_mm_unpacklo_epi16(_mm_setzero_si128(), (in)), 4); \
2071 __m128i out##_h = _mm_srai_epi32(_mm_unpackhi_epi16(_mm_setzero_si128(), (in)), 4)
2072
2073 // wide add
2074 #define dct_wadd(out, a, b) \
2075 __m128i out##_l = _mm_add_epi32(a##_l, b##_l); \
2076 __m128i out##_h = _mm_add_epi32(a##_h, b##_h)
2077
2078 // wide sub
2079 #define dct_wsub(out, a, b) \
2080 __m128i out##_l = _mm_sub_epi32(a##_l, b##_l); \
2081 __m128i out##_h = _mm_sub_epi32(a##_h, b##_h)
2082
2083 // butterfly a/b, add bias, then shift by "s" and pack
2084 #define dct_bfly32o(out0, out1, a,b,bias,s) \
2085 { \
2086 __m128i abiased_l = _mm_add_epi32(a##_l, bias); \
2087 __m128i abiased_h = _mm_add_epi32(a##_h, bias); \
2088 dct_wadd(sum, abiased, b); \
2089 dct_wsub(dif, abiased, b); \
2090 out0 = _mm_packs_epi32(_mm_srai_epi32(sum_l, s), _mm_srai_epi32(sum_h, s)); \
2091 out1 = _mm_packs_epi32(_mm_srai_epi32(dif_l, s), _mm_srai_epi32(dif_h, s)); \
2092 }
2093
2094 // 8-bit interleave step (for transposes)
2095 #define dct_interleave8(a, b) \
2096 tmp = a; \
2097 a = _mm_unpacklo_epi8(a, b); \
2098 b = _mm_unpackhi_epi8(tmp, b)
2099
2100 // 16-bit interleave step (for transposes)
2101 #define dct_interleave16(a, b) \
2102 tmp = a; \
2103 a = _mm_unpacklo_epi16(a, b); \
2104 b = _mm_unpackhi_epi16(tmp, b)
2105
2106 #define dct_pass(bias,shift) \
2107 { \
2108 /* even part */ \
2109 dct_rot(t2e,t3e, row2,row6, rot0_0,rot0_1); \
2110 __m128i sum04 = _mm_add_epi16(row0, row4); \
2111 __m128i dif04 = _mm_sub_epi16(row0, row4); \
2112 dct_widen(t0e, sum04); \
2113 dct_widen(t1e, dif04); \
2114 dct_wadd(x0, t0e, t3e); \
2115 dct_wsub(x3, t0e, t3e); \
2116 dct_wadd(x1, t1e, t2e); \
2117 dct_wsub(x2, t1e, t2e); \
2118 /* odd part */ \
2119 dct_rot(y0o,y2o, row7,row3, rot2_0,rot2_1); \
2120 dct_rot(y1o,y3o, row5,row1, rot3_0,rot3_1); \
2121 __m128i sum17 = _mm_add_epi16(row1, row7); \
2122 __m128i sum35 = _mm_add_epi16(row3, row5); \
2123 dct_rot(y4o,y5o, sum17,sum35, rot1_0,rot1_1); \
2124 dct_wadd(x4, y0o, y4o); \
2125 dct_wadd(x5, y1o, y5o); \
2126 dct_wadd(x6, y2o, y5o); \
2127 dct_wadd(x7, y3o, y4o); \
2128 dct_bfly32o(row0,row7, x0,x7,bias,shift); \
2129 dct_bfly32o(row1,row6, x1,x6,bias,shift); \
2130 dct_bfly32o(row2,row5, x2,x5,bias,shift); \
2131 dct_bfly32o(row3,row4, x3,x4,bias,shift); \
2132 }
2133
2134 __m128i rot0_0 = dct_const(stbi__f2f(0.5411961f), stbi__f2f(0.5411961f) + stbi__f2f(-1.847759065f));
2135 __m128i rot0_1 = dct_const(stbi__f2f(0.5411961f) + stbi__f2f( 0.765366865f), stbi__f2f(0.5411961f));
2136 __m128i rot1_0 = dct_const(stbi__f2f(1.175875602f) + stbi__f2f(-0.899976223f), stbi__f2f(1.175875602f));
2137 __m128i rot1_1 = dct_const(stbi__f2f(1.175875602f), stbi__f2f(1.175875602f) + stbi__f2f(-2.562915447f));
2138 __m128i rot2_0 = dct_const(stbi__f2f(-1.961570560f) + stbi__f2f( 0.298631336f), stbi__f2f(-1.961570560f));
2139 __m128i rot2_1 = dct_const(stbi__f2f(-1.961570560f), stbi__f2f(-1.961570560f) + stbi__f2f( 3.072711026f));
2140 __m128i rot3_0 = dct_const(stbi__f2f(-0.390180644f) + stbi__f2f( 2.053119869f), stbi__f2f(-0.390180644f));
2141 __m128i rot3_1 = dct_const(stbi__f2f(-0.390180644f), stbi__f2f(-0.390180644f) + stbi__f2f( 1.501321110f));
2142
2143 // rounding biases in column/row passes, see stbi__idct_block for explanation.
2144 __m128i bias_0 = _mm_set1_epi32(512);
2145 __m128i bias_1 = _mm_set1_epi32(65536 + (128<<17));
2146
2147 // load
2148 row0 = _mm_load_si128((const __m128i *) (data + 0*8));
2149 row1 = _mm_load_si128((const __m128i *) (data + 1*8));
2150 row2 = _mm_load_si128((const __m128i *) (data + 2*8));
2151 row3 = _mm_load_si128((const __m128i *) (data + 3*8));
2152 row4 = _mm_load_si128((const __m128i *) (data + 4*8));
2153 row5 = _mm_load_si128((const __m128i *) (data + 5*8));
2154 row6 = _mm_load_si128((const __m128i *) (data + 6*8));
2155 row7 = _mm_load_si128((const __m128i *) (data + 7*8));
2156
2157 // column pass
2158 dct_pass(bias_0, 10);
2159
2160 {
2161 // 16bit 8x8 transpose pass 1
2162 dct_interleave16(row0, row4);
2163 dct_interleave16(row1, row5);
2164 dct_interleave16(row2, row6);
2165 dct_interleave16(row3, row7);
2166
2167 // transpose pass 2
2168 dct_interleave16(row0, row2);
2169 dct_interleave16(row1, row3);
2170 dct_interleave16(row4, row6);
2171 dct_interleave16(row5, row7);
2172
2173 // transpose pass 3
2174 dct_interleave16(row0, row1);
2175 dct_interleave16(row2, row3);
2176 dct_interleave16(row4, row5);
2177 dct_interleave16(row6, row7);
2178 }
2179
2180 // row pass
2181 dct_pass(bias_1, 17);
2182
2183 {
2184 // pack
2185 __m128i p0 = _mm_packus_epi16(row0, row1); // a0a1a2a3...a7b0b1b2b3...b7
2186 __m128i p1 = _mm_packus_epi16(row2, row3);
2187 __m128i p2 = _mm_packus_epi16(row4, row5);
2188 __m128i p3 = _mm_packus_epi16(row6, row7);
2189
2190 // 8bit 8x8 transpose pass 1
2191 dct_interleave8(p0, p2); // a0e0a1e1...
2192 dct_interleave8(p1, p3); // c0g0c1g1...
2193
2194 // transpose pass 2
2195 dct_interleave8(p0, p1); // a0c0e0g0...
2196 dct_interleave8(p2, p3); // b0d0f0h0...
2197
2198 // transpose pass 3
2199 dct_interleave8(p0, p2); // a0b0c0d0...
2200 dct_interleave8(p1, p3); // a4b4c4d4...
2201
2202 // store
2203 _mm_storel_epi64((__m128i *) out, p0); out += out_stride;
2204 _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p0, 0x4e)); out += out_stride;
2205 _mm_storel_epi64((__m128i *) out, p2); out += out_stride;
2206 _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p2, 0x4e)); out += out_stride;
2207 _mm_storel_epi64((__m128i *) out, p1); out += out_stride;
2208 _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p1, 0x4e)); out += out_stride;
2209 _mm_storel_epi64((__m128i *) out, p3); out += out_stride;
2210 _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p3, 0x4e));
2211 }
2212
2213 #undef dct_const
2214 #undef dct_rot
2215 #undef dct_widen
2216 #undef dct_wadd
2217 #undef dct_wsub
2218 #undef dct_bfly32o
2219 #undef dct_interleave8
2220 #undef dct_interleave16
2221 #undef dct_pass
2222 }
2223
2224 #endif // STBI_SSE2
2225
2226 #ifdef STBI_NEON
2227
2228 // NEON integer IDCT. should produce bit-identical
2229 // results to the generic C version.
stbi__idct_simd(stbi_uc * out,int out_stride,short data[64])2230 static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
2231 {
2232 int16x8_t row0, row1, row2, row3, row4, row5, row6, row7;
2233
2234 int16x4_t rot0_0 = vdup_n_s16(stbi__f2f(0.5411961f));
2235 int16x4_t rot0_1 = vdup_n_s16(stbi__f2f(-1.847759065f));
2236 int16x4_t rot0_2 = vdup_n_s16(stbi__f2f( 0.765366865f));
2237 int16x4_t rot1_0 = vdup_n_s16(stbi__f2f( 1.175875602f));
2238 int16x4_t rot1_1 = vdup_n_s16(stbi__f2f(-0.899976223f));
2239 int16x4_t rot1_2 = vdup_n_s16(stbi__f2f(-2.562915447f));
2240 int16x4_t rot2_0 = vdup_n_s16(stbi__f2f(-1.961570560f));
2241 int16x4_t rot2_1 = vdup_n_s16(stbi__f2f(-0.390180644f));
2242 int16x4_t rot3_0 = vdup_n_s16(stbi__f2f( 0.298631336f));
2243 int16x4_t rot3_1 = vdup_n_s16(stbi__f2f( 2.053119869f));
2244 int16x4_t rot3_2 = vdup_n_s16(stbi__f2f( 3.072711026f));
2245 int16x4_t rot3_3 = vdup_n_s16(stbi__f2f( 1.501321110f));
2246
2247 #define dct_long_mul(out, inq, coeff) \
2248 int32x4_t out##_l = vmull_s16(vget_low_s16(inq), coeff); \
2249 int32x4_t out##_h = vmull_s16(vget_high_s16(inq), coeff)
2250
2251 #define dct_long_mac(out, acc, inq, coeff) \
2252 int32x4_t out##_l = vmlal_s16(acc##_l, vget_low_s16(inq), coeff); \
2253 int32x4_t out##_h = vmlal_s16(acc##_h, vget_high_s16(inq), coeff)
2254
2255 #define dct_widen(out, inq) \
2256 int32x4_t out##_l = vshll_n_s16(vget_low_s16(inq), 12); \
2257 int32x4_t out##_h = vshll_n_s16(vget_high_s16(inq), 12)
2258
2259 // wide add
2260 #define dct_wadd(out, a, b) \
2261 int32x4_t out##_l = vaddq_s32(a##_l, b##_l); \
2262 int32x4_t out##_h = vaddq_s32(a##_h, b##_h)
2263
2264 // wide sub
2265 #define dct_wsub(out, a, b) \
2266 int32x4_t out##_l = vsubq_s32(a##_l, b##_l); \
2267 int32x4_t out##_h = vsubq_s32(a##_h, b##_h)
2268
2269 // butterfly a/b, then shift using "shiftop" by "s" and pack
2270 #define dct_bfly32o(out0,out1, a,b,shiftop,s) \
2271 { \
2272 dct_wadd(sum, a, b); \
2273 dct_wsub(dif, a, b); \
2274 out0 = vcombine_s16(shiftop(sum_l, s), shiftop(sum_h, s)); \
2275 out1 = vcombine_s16(shiftop(dif_l, s), shiftop(dif_h, s)); \
2276 }
2277
2278 #define dct_pass(shiftop, shift) \
2279 { \
2280 /* even part */ \
2281 int16x8_t sum26 = vaddq_s16(row2, row6); \
2282 dct_long_mul(p1e, sum26, rot0_0); \
2283 dct_long_mac(t2e, p1e, row6, rot0_1); \
2284 dct_long_mac(t3e, p1e, row2, rot0_2); \
2285 int16x8_t sum04 = vaddq_s16(row0, row4); \
2286 int16x8_t dif04 = vsubq_s16(row0, row4); \
2287 dct_widen(t0e, sum04); \
2288 dct_widen(t1e, dif04); \
2289 dct_wadd(x0, t0e, t3e); \
2290 dct_wsub(x3, t0e, t3e); \
2291 dct_wadd(x1, t1e, t2e); \
2292 dct_wsub(x2, t1e, t2e); \
2293 /* odd part */ \
2294 int16x8_t sum15 = vaddq_s16(row1, row5); \
2295 int16x8_t sum17 = vaddq_s16(row1, row7); \
2296 int16x8_t sum35 = vaddq_s16(row3, row5); \
2297 int16x8_t sum37 = vaddq_s16(row3, row7); \
2298 int16x8_t sumodd = vaddq_s16(sum17, sum35); \
2299 dct_long_mul(p5o, sumodd, rot1_0); \
2300 dct_long_mac(p1o, p5o, sum17, rot1_1); \
2301 dct_long_mac(p2o, p5o, sum35, rot1_2); \
2302 dct_long_mul(p3o, sum37, rot2_0); \
2303 dct_long_mul(p4o, sum15, rot2_1); \
2304 dct_wadd(sump13o, p1o, p3o); \
2305 dct_wadd(sump24o, p2o, p4o); \
2306 dct_wadd(sump23o, p2o, p3o); \
2307 dct_wadd(sump14o, p1o, p4o); \
2308 dct_long_mac(x4, sump13o, row7, rot3_0); \
2309 dct_long_mac(x5, sump24o, row5, rot3_1); \
2310 dct_long_mac(x6, sump23o, row3, rot3_2); \
2311 dct_long_mac(x7, sump14o, row1, rot3_3); \
2312 dct_bfly32o(row0,row7, x0,x7,shiftop,shift); \
2313 dct_bfly32o(row1,row6, x1,x6,shiftop,shift); \
2314 dct_bfly32o(row2,row5, x2,x5,shiftop,shift); \
2315 dct_bfly32o(row3,row4, x3,x4,shiftop,shift); \
2316 }
2317
2318 // load
2319 row0 = vld1q_s16(data + 0*8);
2320 row1 = vld1q_s16(data + 1*8);
2321 row2 = vld1q_s16(data + 2*8);
2322 row3 = vld1q_s16(data + 3*8);
2323 row4 = vld1q_s16(data + 4*8);
2324 row5 = vld1q_s16(data + 5*8);
2325 row6 = vld1q_s16(data + 6*8);
2326 row7 = vld1q_s16(data + 7*8);
2327
2328 // add DC bias
2329 row0 = vaddq_s16(row0, vsetq_lane_s16(1024, vdupq_n_s16(0), 0));
2330
2331 // column pass
2332 dct_pass(vrshrn_n_s32, 10);
2333
2334 // 16bit 8x8 transpose
2335 {
2336 // these three map to a single VTRN.16, VTRN.32, and VSWP, respectively.
2337 // whether compilers actually get this is another story, sadly.
2338 #define dct_trn16(x, y) { int16x8x2_t t = vtrnq_s16(x, y); x = t.val[0]; y = t.val[1]; }
2339 #define dct_trn32(x, y) { int32x4x2_t t = vtrnq_s32(vreinterpretq_s32_s16(x), vreinterpretq_s32_s16(y)); x = vreinterpretq_s16_s32(t.val[0]); y = vreinterpretq_s16_s32(t.val[1]); }
2340 #define dct_trn64(x, y) { int16x8_t x0 = x; int16x8_t y0 = y; x = vcombine_s16(vget_low_s16(x0), vget_low_s16(y0)); y = vcombine_s16(vget_high_s16(x0), vget_high_s16(y0)); }
2341
2342 // pass 1
2343 dct_trn16(row0, row1); // a0b0a2b2a4b4a6b6
2344 dct_trn16(row2, row3);
2345 dct_trn16(row4, row5);
2346 dct_trn16(row6, row7);
2347
2348 // pass 2
2349 dct_trn32(row0, row2); // a0b0c0d0a4b4c4d4
2350 dct_trn32(row1, row3);
2351 dct_trn32(row4, row6);
2352 dct_trn32(row5, row7);
2353
2354 // pass 3
2355 dct_trn64(row0, row4); // a0b0c0d0e0f0g0h0
2356 dct_trn64(row1, row5);
2357 dct_trn64(row2, row6);
2358 dct_trn64(row3, row7);
2359
2360 #undef dct_trn16
2361 #undef dct_trn32
2362 #undef dct_trn64
2363 }
2364
2365 // row pass
2366 // vrshrn_n_s32 only supports shifts up to 16, we need
2367 // 17. so do a non-rounding shift of 16 first then follow
2368 // up with a rounding shift by 1.
2369 dct_pass(vshrn_n_s32, 16);
2370
2371 {
2372 // pack and round
2373 uint8x8_t p0 = vqrshrun_n_s16(row0, 1);
2374 uint8x8_t p1 = vqrshrun_n_s16(row1, 1);
2375 uint8x8_t p2 = vqrshrun_n_s16(row2, 1);
2376 uint8x8_t p3 = vqrshrun_n_s16(row3, 1);
2377 uint8x8_t p4 = vqrshrun_n_s16(row4, 1);
2378 uint8x8_t p5 = vqrshrun_n_s16(row5, 1);
2379 uint8x8_t p6 = vqrshrun_n_s16(row6, 1);
2380 uint8x8_t p7 = vqrshrun_n_s16(row7, 1);
2381
2382 // again, these can translate into one instruction, but often don't.
2383 #define dct_trn8_8(x, y) { uint8x8x2_t t = vtrn_u8(x, y); x = t.val[0]; y = t.val[1]; }
2384 #define dct_trn8_16(x, y) { uint16x4x2_t t = vtrn_u16(vreinterpret_u16_u8(x), vreinterpret_u16_u8(y)); x = vreinterpret_u8_u16(t.val[0]); y = vreinterpret_u8_u16(t.val[1]); }
2385 #define dct_trn8_32(x, y) { uint32x2x2_t t = vtrn_u32(vreinterpret_u32_u8(x), vreinterpret_u32_u8(y)); x = vreinterpret_u8_u32(t.val[0]); y = vreinterpret_u8_u32(t.val[1]); }
2386
2387 // sadly can't use interleaved stores here since we only write
2388 // 8 bytes to each scan line!
2389
2390 // 8x8 8-bit transpose pass 1
2391 dct_trn8_8(p0, p1);
2392 dct_trn8_8(p2, p3);
2393 dct_trn8_8(p4, p5);
2394 dct_trn8_8(p6, p7);
2395
2396 // pass 2
2397 dct_trn8_16(p0, p2);
2398 dct_trn8_16(p1, p3);
2399 dct_trn8_16(p4, p6);
2400 dct_trn8_16(p5, p7);
2401
2402 // pass 3
2403 dct_trn8_32(p0, p4);
2404 dct_trn8_32(p1, p5);
2405 dct_trn8_32(p2, p6);
2406 dct_trn8_32(p3, p7);
2407
2408 // store
2409 vst1_u8(out, p0); out += out_stride;
2410 vst1_u8(out, p1); out += out_stride;
2411 vst1_u8(out, p2); out += out_stride;
2412 vst1_u8(out, p3); out += out_stride;
2413 vst1_u8(out, p4); out += out_stride;
2414 vst1_u8(out, p5); out += out_stride;
2415 vst1_u8(out, p6); out += out_stride;
2416 vst1_u8(out, p7);
2417
2418 #undef dct_trn8_8
2419 #undef dct_trn8_16
2420 #undef dct_trn8_32
2421 }
2422
2423 #undef dct_long_mul
2424 #undef dct_long_mac
2425 #undef dct_widen
2426 #undef dct_wadd
2427 #undef dct_wsub
2428 #undef dct_bfly32o
2429 #undef dct_pass
2430 }
2431
2432 #endif // STBI_NEON
2433
2434 #define STBI__MARKER_none 0xff
2435 // if there's a pending marker from the entropy stream, return that
2436 // otherwise, fetch from the stream and get a marker. if there's no
2437 // marker, return 0xff, which is never a valid marker value
stbi__get_marker(stbi__jpeg * j)2438 static stbi_uc stbi__get_marker(stbi__jpeg *j)
2439 {
2440 stbi_uc x;
2441 if (j->marker != STBI__MARKER_none) { x = j->marker; j->marker = STBI__MARKER_none; return x; }
2442 x = stbi__get8(j->s);
2443 if (x != 0xff) return STBI__MARKER_none;
2444 while (x == 0xff)
2445 x = stbi__get8(j->s);
2446 return x;
2447 }
2448
2449 // in each scan, we'll have scan_n components, and the order
2450 // of the components is specified by order[]
2451 #define STBI__RESTART(x) ((x) >= 0xd0 && (x) <= 0xd7)
2452
2453 // after a restart interval, stbi__jpeg_reset the entropy decoder and
2454 // the dc prediction
stbi__jpeg_reset(stbi__jpeg * j)2455 static void stbi__jpeg_reset(stbi__jpeg *j)
2456 {
2457 j->code_bits = 0;
2458 j->code_buffer = 0;
2459 j->nomore = 0;
2460 j->img_comp[0].dc_pred = j->img_comp[1].dc_pred = j->img_comp[2].dc_pred = 0;
2461 j->marker = STBI__MARKER_none;
2462 j->todo = j->restart_interval ? j->restart_interval : 0x7fffffff;
2463 j->eob_run = 0;
2464 // no more than 1<<31 MCUs if no restart_interal? that's plenty safe,
2465 // since we don't even allow 1<<30 pixels
2466 }
2467
stbi__parse_entropy_coded_data(stbi__jpeg * z)2468 static int stbi__parse_entropy_coded_data(stbi__jpeg *z)
2469 {
2470 stbi__jpeg_reset(z);
2471 if (!z->progressive) {
2472 if (z->scan_n == 1) {
2473 int i,j;
2474 STBI_SIMD_ALIGN(short, data[64]);
2475 int n = z->order[0];
2476 // non-interleaved data, we just need to process one block at a time,
2477 // in trivial scanline order
2478 // number of blocks to do just depends on how many actual "pixels" this
2479 // component has, independent of interleaved MCU blocking and such
2480 int w = (z->img_comp[n].x+7) >> 3;
2481 int h = (z->img_comp[n].y+7) >> 3;
2482 for (j=0; j < h; ++j) {
2483 for (i=0; i < w; ++i) {
2484 int ha = z->img_comp[n].ha;
2485 if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
2486 z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
2487 // every data block is an MCU, so countdown the restart interval
2488 if (--z->todo <= 0) {
2489 if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2490 // if it's NOT a restart, then just bail, so we get corrupt data
2491 // rather than no data
2492 if (!STBI__RESTART(z->marker)) return 1;
2493 stbi__jpeg_reset(z);
2494 }
2495 }
2496 }
2497 return 1;
2498 } else { // interleaved
2499 int i,j,k,x,y;
2500 STBI_SIMD_ALIGN(short, data[64]);
2501 for (j=0; j < z->img_mcu_y; ++j) {
2502 for (i=0; i < z->img_mcu_x; ++i) {
2503 // scan an interleaved mcu... process scan_n components in order
2504 for (k=0; k < z->scan_n; ++k) {
2505 int n = z->order[k];
2506 // scan out an mcu's worth of this component; that's just determined
2507 // by the basic H and V specified for the component
2508 for (y=0; y < z->img_comp[n].v; ++y) {
2509 for (x=0; x < z->img_comp[n].h; ++x) {
2510 int x2 = (i*z->img_comp[n].h + x)*8;
2511 int y2 = (j*z->img_comp[n].v + y)*8;
2512 int ha = z->img_comp[n].ha;
2513 if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
2514 z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*y2+x2, z->img_comp[n].w2, data);
2515 }
2516 }
2517 }
2518 // after all interleaved components, that's an interleaved MCU,
2519 // so now count down the restart interval
2520 if (--z->todo <= 0) {
2521 if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2522 if (!STBI__RESTART(z->marker)) return 1;
2523 stbi__jpeg_reset(z);
2524 }
2525 }
2526 }
2527 return 1;
2528 }
2529 } else {
2530 if (z->scan_n == 1) {
2531 int i,j;
2532 int n = z->order[0];
2533 // non-interleaved data, we just need to process one block at a time,
2534 // in trivial scanline order
2535 // number of blocks to do just depends on how many actual "pixels" this
2536 // component has, independent of interleaved MCU blocking and such
2537 int w = (z->img_comp[n].x+7) >> 3;
2538 int h = (z->img_comp[n].y+7) >> 3;
2539 for (j=0; j < h; ++j) {
2540 for (i=0; i < w; ++i) {
2541 short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
2542 if (z->spec_start == 0) {
2543 if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
2544 return 0;
2545 } else {
2546 int ha = z->img_comp[n].ha;
2547 if (!stbi__jpeg_decode_block_prog_ac(z, data, &z->huff_ac[ha], z->fast_ac[ha]))
2548 return 0;
2549 }
2550 // every data block is an MCU, so countdown the restart interval
2551 if (--z->todo <= 0) {
2552 if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2553 if (!STBI__RESTART(z->marker)) return 1;
2554 stbi__jpeg_reset(z);
2555 }
2556 }
2557 }
2558 return 1;
2559 } else { // interleaved
2560 int i,j,k,x,y;
2561 for (j=0; j < z->img_mcu_y; ++j) {
2562 for (i=0; i < z->img_mcu_x; ++i) {
2563 // scan an interleaved mcu... process scan_n components in order
2564 for (k=0; k < z->scan_n; ++k) {
2565 int n = z->order[k];
2566 // scan out an mcu's worth of this component; that's just determined
2567 // by the basic H and V specified for the component
2568 for (y=0; y < z->img_comp[n].v; ++y) {
2569 for (x=0; x < z->img_comp[n].h; ++x) {
2570 int x2 = (i*z->img_comp[n].h + x);
2571 int y2 = (j*z->img_comp[n].v + y);
2572 short *data = z->img_comp[n].coeff + 64 * (x2 + y2 * z->img_comp[n].coeff_w);
2573 if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
2574 return 0;
2575 }
2576 }
2577 }
2578 // after all interleaved components, that's an interleaved MCU,
2579 // so now count down the restart interval
2580 if (--z->todo <= 0) {
2581 if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2582 if (!STBI__RESTART(z->marker)) return 1;
2583 stbi__jpeg_reset(z);
2584 }
2585 }
2586 }
2587 return 1;
2588 }
2589 }
2590 }
2591
stbi__jpeg_dequantize(short * data,stbi_uc * dequant)2592 static void stbi__jpeg_dequantize(short *data, stbi_uc *dequant)
2593 {
2594 int i;
2595 for (i=0; i < 64; ++i)
2596 data[i] *= dequant[i];
2597 }
2598
stbi__jpeg_finish(stbi__jpeg * z)2599 static void stbi__jpeg_finish(stbi__jpeg *z)
2600 {
2601 if (z->progressive) {
2602 // dequantize and idct the data
2603 int i,j,n;
2604 for (n=0; n < z->s->img_n; ++n) {
2605 int w = (z->img_comp[n].x+7) >> 3;
2606 int h = (z->img_comp[n].y+7) >> 3;
2607 for (j=0; j < h; ++j) {
2608 for (i=0; i < w; ++i) {
2609 short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
2610 stbi__jpeg_dequantize(data, z->dequant[z->img_comp[n].tq]);
2611 z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
2612 }
2613 }
2614 }
2615 }
2616 }
2617
stbi__process_marker(stbi__jpeg * z,int m)2618 static int stbi__process_marker(stbi__jpeg *z, int m)
2619 {
2620 int L;
2621 switch (m) {
2622 case STBI__MARKER_none: // no marker found
2623 return stbi__err("expected marker","Corrupt JPEG");
2624
2625 case 0xDD: // DRI - specify restart interval
2626 if (stbi__get16be(z->s) != 4) return stbi__err("bad DRI len","Corrupt JPEG");
2627 z->restart_interval = stbi__get16be(z->s);
2628 return 1;
2629
2630 case 0xDB: // DQT - define quantization table
2631 L = stbi__get16be(z->s)-2;
2632 while (L > 0) {
2633 int q = stbi__get8(z->s);
2634 int p = q >> 4;
2635 int t = q & 15,i;
2636 if (p != 0) return stbi__err("bad DQT type","Corrupt JPEG");
2637 if (t > 3) return stbi__err("bad DQT table","Corrupt JPEG");
2638 for (i=0; i < 64; ++i)
2639 z->dequant[t][stbi__jpeg_dezigzag[i]] = stbi__get8(z->s);
2640 L -= 65;
2641 }
2642 return L==0;
2643
2644 case 0xC4: // DHT - define huffman table
2645 L = stbi__get16be(z->s)-2;
2646 while (L > 0) {
2647 stbi_uc *v;
2648 int sizes[16],i,n=0;
2649 int q = stbi__get8(z->s);
2650 int tc = q >> 4;
2651 int th = q & 15;
2652 if (tc > 1 || th > 3) return stbi__err("bad DHT header","Corrupt JPEG");
2653 for (i=0; i < 16; ++i) {
2654 sizes[i] = stbi__get8(z->s);
2655 n += sizes[i];
2656 }
2657 L -= 17;
2658 if (tc == 0) {
2659 if (!stbi__build_huffman(z->huff_dc+th, sizes)) return 0;
2660 v = z->huff_dc[th].values;
2661 } else {
2662 if (!stbi__build_huffman(z->huff_ac+th, sizes)) return 0;
2663 v = z->huff_ac[th].values;
2664 }
2665 for (i=0; i < n; ++i)
2666 v[i] = stbi__get8(z->s);
2667 if (tc != 0)
2668 stbi__build_fast_ac(z->fast_ac[th], z->huff_ac + th);
2669 L -= n;
2670 }
2671 return L==0;
2672 }
2673 // check for comment block or APP blocks
2674 if ((m >= 0xE0 && m <= 0xEF) || m == 0xFE) {
2675 stbi__skip(z->s, stbi__get16be(z->s)-2);
2676 return 1;
2677 }
2678 return 0;
2679 }
2680
2681 // after we see SOS
stbi__process_scan_header(stbi__jpeg * z)2682 static int stbi__process_scan_header(stbi__jpeg *z)
2683 {
2684 int i;
2685 int Ls = stbi__get16be(z->s);
2686 z->scan_n = stbi__get8(z->s);
2687 if (z->scan_n < 1 || z->scan_n > 4 || z->scan_n > (int) z->s->img_n) return stbi__err("bad SOS component count","Corrupt JPEG");
2688 if (Ls != 6+2*z->scan_n) return stbi__err("bad SOS len","Corrupt JPEG");
2689 for (i=0; i < z->scan_n; ++i) {
2690 int id = stbi__get8(z->s), which;
2691 int q = stbi__get8(z->s);
2692 for (which = 0; which < z->s->img_n; ++which)
2693 if (z->img_comp[which].id == id)
2694 break;
2695 if (which == z->s->img_n) return 0; // no match
2696 z->img_comp[which].hd = q >> 4; if (z->img_comp[which].hd > 3) return stbi__err("bad DC huff","Corrupt JPEG");
2697 z->img_comp[which].ha = q & 15; if (z->img_comp[which].ha > 3) return stbi__err("bad AC huff","Corrupt JPEG");
2698 z->order[i] = which;
2699 }
2700
2701 {
2702 int aa;
2703 z->spec_start = stbi__get8(z->s);
2704 z->spec_end = stbi__get8(z->s); // should be 63, but might be 0
2705 aa = stbi__get8(z->s);
2706 z->succ_high = (aa >> 4);
2707 z->succ_low = (aa & 15);
2708 if (z->progressive) {
2709 if (z->spec_start > 63 || z->spec_end > 63 || z->spec_start > z->spec_end || z->succ_high > 13 || z->succ_low > 13)
2710 return stbi__err("bad SOS", "Corrupt JPEG");
2711 } else {
2712 if (z->spec_start != 0) return stbi__err("bad SOS","Corrupt JPEG");
2713 if (z->succ_high != 0 || z->succ_low != 0) return stbi__err("bad SOS","Corrupt JPEG");
2714 z->spec_end = 63;
2715 }
2716 }
2717
2718 return 1;
2719 }
2720
stbi__process_frame_header(stbi__jpeg * z,int scan)2721 static int stbi__process_frame_header(stbi__jpeg *z, int scan)
2722 {
2723 stbi__context *s = z->s;
2724 int Lf,p,i,q, h_max=1,v_max=1,c;
2725 Lf = stbi__get16be(s); if (Lf < 11) return stbi__err("bad SOF len","Corrupt JPEG"); // JPEG
2726 p = stbi__get8(s); if (p != 8) return stbi__err("only 8-bit","JPEG format not supported: 8-bit only"); // JPEG baseline
2727 s->img_y = stbi__get16be(s); if (s->img_y == 0) return stbi__err("no header height", "JPEG format not supported: delayed height"); // Legal, but we don't handle it--but neither does IJG
2728 s->img_x = stbi__get16be(s); if (s->img_x == 0) return stbi__err("0 width","Corrupt JPEG"); // JPEG requires
2729 c = stbi__get8(s);
2730 if (c != 3 && c != 1) return stbi__err("bad component count","Corrupt JPEG"); // JFIF requires
2731 s->img_n = c;
2732 for (i=0; i < c; ++i) {
2733 z->img_comp[i].data = NULL;
2734 z->img_comp[i].linebuf = NULL;
2735 }
2736
2737 if (Lf != 8+3*s->img_n) return stbi__err("bad SOF len","Corrupt JPEG");
2738
2739 for (i=0; i < s->img_n; ++i) {
2740 z->img_comp[i].id = stbi__get8(s);
2741 if (z->img_comp[i].id != i+1) // JFIF requires
2742 if (z->img_comp[i].id != i) // some version of jpegtran outputs non-JFIF-compliant files!
2743 return stbi__err("bad component ID","Corrupt JPEG");
2744 q = stbi__get8(s);
2745 z->img_comp[i].h = (q >> 4); if (!z->img_comp[i].h || z->img_comp[i].h > 4) return stbi__err("bad H","Corrupt JPEG");
2746 z->img_comp[i].v = q & 15; if (!z->img_comp[i].v || z->img_comp[i].v > 4) return stbi__err("bad V","Corrupt JPEG");
2747 z->img_comp[i].tq = stbi__get8(s); if (z->img_comp[i].tq > 3) return stbi__err("bad TQ","Corrupt JPEG");
2748 }
2749
2750 if (scan != STBI__SCAN_load) return 1;
2751
2752 if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");
2753
2754 for (i=0; i < s->img_n; ++i) {
2755 if (z->img_comp[i].h > h_max) h_max = z->img_comp[i].h;
2756 if (z->img_comp[i].v > v_max) v_max = z->img_comp[i].v;
2757 }
2758
2759 // compute interleaved mcu info
2760 z->img_h_max = h_max;
2761 z->img_v_max = v_max;
2762 z->img_mcu_w = h_max * 8;
2763 z->img_mcu_h = v_max * 8;
2764 z->img_mcu_x = (s->img_x + z->img_mcu_w-1) / z->img_mcu_w;
2765 z->img_mcu_y = (s->img_y + z->img_mcu_h-1) / z->img_mcu_h;
2766
2767 for (i=0; i < s->img_n; ++i) {
2768 // number of effective pixels (e.g. for non-interleaved MCU)
2769 z->img_comp[i].x = (s->img_x * z->img_comp[i].h + h_max-1) / h_max;
2770 z->img_comp[i].y = (s->img_y * z->img_comp[i].v + v_max-1) / v_max;
2771 // to simplify generation, we'll allocate enough memory to decode
2772 // the bogus oversized data from using interleaved MCUs and their
2773 // big blocks (e.g. a 16x16 iMCU on an image of width 33); we won't
2774 // discard the extra data until colorspace conversion
2775 z->img_comp[i].w2 = z->img_mcu_x * z->img_comp[i].h * 8;
2776 z->img_comp[i].h2 = z->img_mcu_y * z->img_comp[i].v * 8;
2777 if (z->img_comp[i].w2 <= 0 || z->img_comp[i].h2 <= 0 ||
2778 (z->img_comp[i].w2 > (INT_MAX - 15) / z->img_comp[i].h2))
2779 return stbi__err("Integer Overflow", "w2 or h2 incorrect");
2780 z->img_comp[i].raw_data = stbi__malloc(z->img_comp[i].w2 * z->img_comp[i].h2+15);
2781
2782 if (z->img_comp[i].raw_data == NULL) {
2783 for(--i; i >= 0; --i) {
2784 STBI_FREE(z->img_comp[i].raw_data);
2785 z->img_comp[i].raw_data = NULL;
2786 }
2787 return stbi__err("outofmem", "Out of memory");
2788 }
2789 // align blocks for idct using mmx/sse
2790 z->img_comp[i].data = (stbi_uc*) (((size_t) z->img_comp[i].raw_data + 15) & ~15);
2791 z->img_comp[i].linebuf = NULL;
2792 if (z->progressive) {
2793 z->img_comp[i].coeff_w = (z->img_comp[i].w2 + 7) >> 3;
2794 z->img_comp[i].coeff_h = (z->img_comp[i].h2 + 7) >> 3;
2795 z->img_comp[i].raw_coeff = STBI_MALLOC(z->img_comp[i].coeff_w * z->img_comp[i].coeff_h * 64 * sizeof(short) + 15);
2796 z->img_comp[i].coeff = (short*) (((size_t) z->img_comp[i].raw_coeff + 15) & ~15);
2797 } else {
2798 z->img_comp[i].coeff = 0;
2799 z->img_comp[i].raw_coeff = 0;
2800 }
2801 }
2802
2803 return 1;
2804 }
2805
2806 // use comparisons since in some cases we handle more than one case (e.g. SOF)
2807 #define stbi__DNL(x) ((x) == 0xdc)
2808 #define stbi__SOI(x) ((x) == 0xd8)
2809 #define stbi__EOI(x) ((x) == 0xd9)
2810 #define stbi__SOF(x) ((x) == 0xc0 || (x) == 0xc1 || (x) == 0xc2)
2811 #define stbi__SOS(x) ((x) == 0xda)
2812
2813 #define stbi__SOF_progressive(x) ((x) == 0xc2)
2814
stbi__decode_jpeg_header(stbi__jpeg * z,int scan)2815 static int stbi__decode_jpeg_header(stbi__jpeg *z, int scan)
2816 {
2817 int m;
2818 z->marker = STBI__MARKER_none; // initialize cached marker to empty
2819 m = stbi__get_marker(z);
2820 if (!stbi__SOI(m)) return stbi__err("no SOI","Corrupt JPEG");
2821 if (scan == STBI__SCAN_type) return 1;
2822 m = stbi__get_marker(z);
2823 while (!stbi__SOF(m)) {
2824 if (!stbi__process_marker(z,m)) return 0;
2825 m = stbi__get_marker(z);
2826 while (m == STBI__MARKER_none) {
2827 // some files have extra padding after their blocks, so ok, we'll scan
2828 if (stbi__at_eof(z->s)) return stbi__err("no SOF", "Corrupt JPEG");
2829 m = stbi__get_marker(z);
2830 }
2831 }
2832 z->progressive = stbi__SOF_progressive(m);
2833 if (!stbi__process_frame_header(z, scan)) return 0;
2834 return 1;
2835 }
2836
2837 // decode image to YCbCr format
stbi__decode_jpeg_image(stbi__jpeg * j)2838 static int stbi__decode_jpeg_image(stbi__jpeg *j)
2839 {
2840 int m;
2841 for (m = 0; m < 4; m++) {
2842 j->img_comp[m].raw_data = NULL;
2843 j->img_comp[m].raw_coeff = NULL;
2844 }
2845 j->restart_interval = 0;
2846 if (!stbi__decode_jpeg_header(j, STBI__SCAN_load)) return 0;
2847 m = stbi__get_marker(j);
2848 while (!stbi__EOI(m)) {
2849 if (stbi__SOS(m)) {
2850 if (!stbi__process_scan_header(j)) return 0;
2851 if (!stbi__parse_entropy_coded_data(j)) return 0;
2852 if (j->marker == STBI__MARKER_none ) {
2853 // handle 0s at the end of image data from IP Kamera 9060
2854 while (!stbi__at_eof(j->s)) {
2855 int x = stbi__get8(j->s);
2856 if (x == 255) {
2857 j->marker = stbi__get8(j->s);
2858 break;
2859 } else if (x != 0) {
2860 return stbi__err("junk before marker", "Corrupt JPEG");
2861 }
2862 }
2863 // if we reach eof without hitting a marker, stbi__get_marker() below will fail and we'll eventually return 0
2864 }
2865 } else {
2866 if (!stbi__process_marker(j, m)) return 0;
2867 }
2868 m = stbi__get_marker(j);
2869 }
2870 if (j->progressive)
2871 stbi__jpeg_finish(j);
2872 return 1;
2873 }
2874
2875 // static jfif-centered resampling (across block boundaries)
2876
2877 typedef stbi_uc *(*resample_row_func)(stbi_uc *out, stbi_uc *in0, stbi_uc *in1,
2878 int w, int hs);
2879
2880 #define stbi__div4(x) ((stbi_uc) ((x) >> 2))
2881
resample_row_1(stbi_uc * out,stbi_uc * in_near,stbi_uc * in_far,int w,int hs)2882 static stbi_uc *resample_row_1(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2883 {
2884 STBI_NOTUSED(out);
2885 STBI_NOTUSED(in_far);
2886 STBI_NOTUSED(w);
2887 STBI_NOTUSED(hs);
2888 return in_near;
2889 }
2890
stbi__resample_row_v_2(stbi_uc * out,stbi_uc * in_near,stbi_uc * in_far,int w,int hs)2891 static stbi_uc* stbi__resample_row_v_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2892 {
2893 // need to generate two samples vertically for every one in input
2894 int i;
2895 STBI_NOTUSED(hs);
2896 for (i=0; i < w; ++i)
2897 out[i] = stbi__div4(3*in_near[i] + in_far[i] + 2);
2898 return out;
2899 }
2900
stbi__resample_row_h_2(stbi_uc * out,stbi_uc * in_near,stbi_uc * in_far,int w,int hs)2901 static stbi_uc* stbi__resample_row_h_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2902 {
2903 // need to generate two samples horizontally for every one in input
2904 int i;
2905 stbi_uc *input = in_near;
2906
2907 if (w == 1) {
2908 // if only one sample, can't do any interpolation
2909 out[0] = out[1] = input[0];
2910 return out;
2911 }
2912
2913 out[0] = input[0];
2914 out[1] = stbi__div4(input[0]*3 + input[1] + 2);
2915 for (i=1; i < w-1; ++i) {
2916 int n = 3*input[i]+2;
2917 out[i*2+0] = stbi__div4(n+input[i-1]);
2918 out[i*2+1] = stbi__div4(n+input[i+1]);
2919 }
2920 out[i*2+0] = stbi__div4(input[w-2]*3 + input[w-1] + 2);
2921 out[i*2+1] = input[w-1];
2922
2923 STBI_NOTUSED(in_far);
2924 STBI_NOTUSED(hs);
2925
2926 return out;
2927 }
2928
2929 #define stbi__div16(x) ((stbi_uc) ((x) >> 4))
2930
stbi__resample_row_hv_2(stbi_uc * out,stbi_uc * in_near,stbi_uc * in_far,int w,int hs)2931 static stbi_uc *stbi__resample_row_hv_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2932 {
2933 // need to generate 2x2 samples for every one in input
2934 int i,t0,t1;
2935 if (w == 1) {
2936 out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
2937 return out;
2938 }
2939
2940 t1 = 3*in_near[0] + in_far[0];
2941 out[0] = stbi__div4(t1+2);
2942 for (i=1; i < w; ++i) {
2943 t0 = t1;
2944 t1 = 3*in_near[i]+in_far[i];
2945 out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
2946 out[i*2 ] = stbi__div16(3*t1 + t0 + 8);
2947 }
2948 out[w*2-1] = stbi__div4(t1+2);
2949
2950 STBI_NOTUSED(hs);
2951
2952 return out;
2953 }
2954
2955 #if defined(STBI_SSE2) || defined(STBI_NEON)
stbi__resample_row_hv_2_simd(stbi_uc * out,stbi_uc * in_near,stbi_uc * in_far,int w,int hs)2956 static stbi_uc *stbi__resample_row_hv_2_simd(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2957 {
2958 // need to generate 2x2 samples for every one in input
2959 int i=0,t0,t1;
2960
2961 if (w == 1) {
2962 out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
2963 return out;
2964 }
2965
2966 t1 = 3*in_near[0] + in_far[0];
2967 // process groups of 8 pixels for as long as we can.
2968 // note we can't handle the last pixel in a row in this loop
2969 // because we need to handle the filter boundary conditions.
2970 for (; i < ((w-1) & ~7); i += 8) {
2971 #if defined(STBI_SSE2)
2972 // load and perform the vertical filtering pass
2973 // this uses 3*x + y = 4*x + (y - x)
2974 __m128i zero = _mm_setzero_si128();
2975 __m128i farb = _mm_loadl_epi64((__m128i *) (in_far + i));
2976 __m128i nearb = _mm_loadl_epi64((__m128i *) (in_near + i));
2977 __m128i farw = _mm_unpacklo_epi8(farb, zero);
2978 __m128i nearw = _mm_unpacklo_epi8(nearb, zero);
2979 __m128i diff = _mm_sub_epi16(farw, nearw);
2980 __m128i nears = _mm_slli_epi16(nearw, 2);
2981 __m128i curr = _mm_add_epi16(nears, diff); // current row
2982
2983 // horizontal filter works the same based on shifted vers of current
2984 // row. "prev" is current row shifted right by 1 pixel; we need to
2985 // insert the previous pixel value (from t1).
2986 // "next" is current row shifted left by 1 pixel, with first pixel
2987 // of next block of 8 pixels added in.
2988 __m128i prv0 = _mm_slli_si128(curr, 2);
2989 __m128i nxt0 = _mm_srli_si128(curr, 2);
2990 __m128i prev = _mm_insert_epi16(prv0, t1, 0);
2991 __m128i next = _mm_insert_epi16(nxt0, 3*in_near[i+8] + in_far[i+8], 7);
2992
2993 // horizontal filter, polyphase implementation since it's convenient:
2994 // even pixels = 3*cur + prev = cur*4 + (prev - cur)
2995 // odd pixels = 3*cur + next = cur*4 + (next - cur)
2996 // note the shared term.
2997 __m128i bias = _mm_set1_epi16(8);
2998 __m128i curs = _mm_slli_epi16(curr, 2);
2999 __m128i prvd = _mm_sub_epi16(prev, curr);
3000 __m128i nxtd = _mm_sub_epi16(next, curr);
3001 __m128i curb = _mm_add_epi16(curs, bias);
3002 __m128i even = _mm_add_epi16(prvd, curb);
3003 __m128i odd = _mm_add_epi16(nxtd, curb);
3004
3005 // interleave even and odd pixels, then undo scaling.
3006 __m128i int0 = _mm_unpacklo_epi16(even, odd);
3007 __m128i int1 = _mm_unpackhi_epi16(even, odd);
3008 __m128i de0 = _mm_srli_epi16(int0, 4);
3009 __m128i de1 = _mm_srli_epi16(int1, 4);
3010
3011 // pack and write output
3012 __m128i outv = _mm_packus_epi16(de0, de1);
3013 _mm_storeu_si128((__m128i *) (out + i*2), outv);
3014 #elif defined(STBI_NEON)
3015 // load and perform the vertical filtering pass
3016 // this uses 3*x + y = 4*x + (y - x)
3017 uint8x8_t farb = vld1_u8(in_far + i);
3018 uint8x8_t nearb = vld1_u8(in_near + i);
3019 int16x8_t diff = vreinterpretq_s16_u16(vsubl_u8(farb, nearb));
3020 int16x8_t nears = vreinterpretq_s16_u16(vshll_n_u8(nearb, 2));
3021 int16x8_t curr = vaddq_s16(nears, diff); // current row
3022
3023 // horizontal filter works the same based on shifted vers of current
3024 // row. "prev" is current row shifted right by 1 pixel; we need to
3025 // insert the previous pixel value (from t1).
3026 // "next" is current row shifted left by 1 pixel, with first pixel
3027 // of next block of 8 pixels added in.
3028 int16x8_t prv0 = vextq_s16(curr, curr, 7);
3029 int16x8_t nxt0 = vextq_s16(curr, curr, 1);
3030 int16x8_t prev = vsetq_lane_s16(t1, prv0, 0);
3031 int16x8_t next = vsetq_lane_s16(3*in_near[i+8] + in_far[i+8], nxt0, 7);
3032
3033 // horizontal filter, polyphase implementation since it's convenient:
3034 // even pixels = 3*cur + prev = cur*4 + (prev - cur)
3035 // odd pixels = 3*cur + next = cur*4 + (next - cur)
3036 // note the shared term.
3037 int16x8_t curs = vshlq_n_s16(curr, 2);
3038 int16x8_t prvd = vsubq_s16(prev, curr);
3039 int16x8_t nxtd = vsubq_s16(next, curr);
3040 int16x8_t even = vaddq_s16(curs, prvd);
3041 int16x8_t odd = vaddq_s16(curs, nxtd);
3042
3043 // undo scaling and round, then store with even/odd phases interleaved
3044 uint8x8x2_t o;
3045 o.val[0] = vqrshrun_n_s16(even, 4);
3046 o.val[1] = vqrshrun_n_s16(odd, 4);
3047 vst2_u8(out + i*2, o);
3048 #endif
3049
3050 // "previous" value for next iter
3051 t1 = 3*in_near[i+7] + in_far[i+7];
3052 }
3053
3054 t0 = t1;
3055 t1 = 3*in_near[i] + in_far[i];
3056 out[i*2] = stbi__div16(3*t1 + t0 + 8);
3057
3058 for (++i; i < w; ++i) {
3059 t0 = t1;
3060 t1 = 3*in_near[i]+in_far[i];
3061 out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
3062 out[i*2 ] = stbi__div16(3*t1 + t0 + 8);
3063 }
3064 out[w*2-1] = stbi__div4(t1+2);
3065
3066 STBI_NOTUSED(hs);
3067
3068 return out;
3069 }
3070 #endif
3071
stbi__resample_row_generic(stbi_uc * out,stbi_uc * in_near,stbi_uc * in_far,int w,int hs)3072 static stbi_uc *stbi__resample_row_generic(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
3073 {
3074 // resample with nearest-neighbor
3075 int i,j;
3076 STBI_NOTUSED(in_far);
3077 for (i=0; i < w; ++i)
3078 for (j=0; j < hs; ++j)
3079 out[i*hs+j] = in_near[i];
3080 return out;
3081 }
3082
3083 #ifdef STBI_JPEG_OLD
3084 // this is the same YCbCr-to-RGB calculation that stb_image has used
3085 // historically before the algorithm changes in 1.49
3086 #define float2fixed(x) ((int) ((x) * 65536 + 0.5))
stbi__YCbCr_to_RGB_row(stbi_uc * out,const stbi_uc * y,const stbi_uc * pcb,const stbi_uc * pcr,int count,int step)3087 static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
3088 {
3089 int i;
3090 for (i=0; i < count; ++i) {
3091 int y_fixed = (y[i] << 16) + 32768; // rounding
3092 int r,g,b;
3093 int cr = pcr[i] - 128;
3094 int cb = pcb[i] - 128;
3095 r = y_fixed + cr*float2fixed(1.40200f);
3096 g = y_fixed - cr*float2fixed(0.71414f) - cb*float2fixed(0.34414f);
3097 b = y_fixed + cb*float2fixed(1.77200f);
3098 r >>= 16;
3099 g >>= 16;
3100 b >>= 16;
3101 if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3102 if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3103 if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3104 out[0] = (stbi_uc)r;
3105 out[1] = (stbi_uc)g;
3106 out[2] = (stbi_uc)b;
3107 out[3] = 255;
3108 out += step;
3109 }
3110 }
3111 #else
3112 // this is a reduced-precision calculation of YCbCr-to-RGB introduced
3113 // to make sure the code produces the same results in both SIMD and scalar
3114 #define float2fixed(x) (((int) ((x) * 4096.0f + 0.5f)) << 8)
stbi__YCbCr_to_RGB_row(stbi_uc * out,const stbi_uc * y,const stbi_uc * pcb,const stbi_uc * pcr,int count,int step)3115 static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
3116 {
3117 int i;
3118 for (i=0; i < count; ++i) {
3119 int y_fixed = (y[i] << 20) + (1<<19); // rounding
3120 int r,g,b;
3121 int cr = pcr[i] - 128;
3122 int cb = pcb[i] - 128;
3123 r = y_fixed + cr* float2fixed(1.40200f);
3124 g = y_fixed + (cr*-float2fixed(0.71414f)) + ((cb*-float2fixed(0.34414f)) & 0xffff0000);
3125 b = y_fixed + cb* float2fixed(1.77200f);
3126 r >>= 20;
3127 g >>= 20;
3128 b >>= 20;
3129 if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3130 if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3131 if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3132 out[0] = (stbi_uc)r;
3133 out[1] = (stbi_uc)g;
3134 out[2] = (stbi_uc)b;
3135 out[3] = 255;
3136 out += step;
3137 }
3138 }
3139 #endif
3140
3141 #if defined(STBI_SSE2) || defined(STBI_NEON)
stbi__YCbCr_to_RGB_simd(stbi_uc * out,stbi_uc const * y,stbi_uc const * pcb,stbi_uc const * pcr,int count,int step)3142 static void stbi__YCbCr_to_RGB_simd(stbi_uc *out, stbi_uc const *y, stbi_uc const *pcb, stbi_uc const *pcr, int count, int step)
3143 {
3144 int i = 0;
3145
3146 #ifdef STBI_SSE2
3147 // step == 3 is pretty ugly on the final interleave, and i'm not convinced
3148 // it's useful in practice (you wouldn't use it for textures, for example).
3149 // so just accelerate step == 4 case.
3150 if (step == 4) {
3151 // this is a fairly straightforward implementation and not super-optimized.
3152 __m128i signflip = _mm_set1_epi8(-0x80);
3153 __m128i cr_const0 = _mm_set1_epi16( (short) ( 1.40200f*4096.0f+0.5f));
3154 __m128i cr_const1 = _mm_set1_epi16( - (short) ( 0.71414f*4096.0f+0.5f));
3155 __m128i cb_const0 = _mm_set1_epi16( - (short) ( 0.34414f*4096.0f+0.5f));
3156 __m128i cb_const1 = _mm_set1_epi16( (short) ( 1.77200f*4096.0f+0.5f));
3157 __m128i y_bias = _mm_set1_epi8((char) (unsigned char) 128);
3158 __m128i xw = _mm_set1_epi16(255); // alpha channel
3159
3160 for (; i+7 < count; i += 8) {
3161 // load
3162 __m128i y_bytes = _mm_loadl_epi64((__m128i *) (y+i));
3163 __m128i cr_bytes = _mm_loadl_epi64((__m128i *) (pcr+i));
3164 __m128i cb_bytes = _mm_loadl_epi64((__m128i *) (pcb+i));
3165 __m128i cr_biased = _mm_xor_si128(cr_bytes, signflip); // -128
3166 __m128i cb_biased = _mm_xor_si128(cb_bytes, signflip); // -128
3167
3168 // unpack to short (and left-shift cr, cb by 8)
3169 __m128i yw = _mm_unpacklo_epi8(y_bias, y_bytes);
3170 __m128i crw = _mm_unpacklo_epi8(_mm_setzero_si128(), cr_biased);
3171 __m128i cbw = _mm_unpacklo_epi8(_mm_setzero_si128(), cb_biased);
3172
3173 // color transform
3174 __m128i yws = _mm_srli_epi16(yw, 4);
3175 __m128i cr0 = _mm_mulhi_epi16(cr_const0, crw);
3176 __m128i cb0 = _mm_mulhi_epi16(cb_const0, cbw);
3177 __m128i cb1 = _mm_mulhi_epi16(cbw, cb_const1);
3178 __m128i cr1 = _mm_mulhi_epi16(crw, cr_const1);
3179 __m128i rws = _mm_add_epi16(cr0, yws);
3180 __m128i gwt = _mm_add_epi16(cb0, yws);
3181 __m128i bws = _mm_add_epi16(yws, cb1);
3182 __m128i gws = _mm_add_epi16(gwt, cr1);
3183
3184 // descale
3185 __m128i rw = _mm_srai_epi16(rws, 4);
3186 __m128i bw = _mm_srai_epi16(bws, 4);
3187 __m128i gw = _mm_srai_epi16(gws, 4);
3188
3189 // back to byte, set up for transpose
3190 __m128i brb = _mm_packus_epi16(rw, bw);
3191 __m128i gxb = _mm_packus_epi16(gw, xw);
3192
3193 // transpose to interleave channels
3194 __m128i t0 = _mm_unpacklo_epi8(brb, gxb);
3195 __m128i t1 = _mm_unpackhi_epi8(brb, gxb);
3196 __m128i o0 = _mm_unpacklo_epi16(t0, t1);
3197 __m128i o1 = _mm_unpackhi_epi16(t0, t1);
3198
3199 // store
3200 _mm_storeu_si128((__m128i *) (out + 0), o0);
3201 _mm_storeu_si128((__m128i *) (out + 16), o1);
3202 out += 32;
3203 }
3204 }
3205 #endif
3206
3207 #ifdef STBI_NEON
3208 // in this version, step=3 support would be easy to add. but is there demand?
3209 if (step == 4) {
3210 // this is a fairly straightforward implementation and not super-optimized.
3211 uint8x8_t signflip = vdup_n_u8(0x80);
3212 int16x8_t cr_const0 = vdupq_n_s16( (short) ( 1.40200f*4096.0f+0.5f));
3213 int16x8_t cr_const1 = vdupq_n_s16( - (short) ( 0.71414f*4096.0f+0.5f));
3214 int16x8_t cb_const0 = vdupq_n_s16( - (short) ( 0.34414f*4096.0f+0.5f));
3215 int16x8_t cb_const1 = vdupq_n_s16( (short) ( 1.77200f*4096.0f+0.5f));
3216
3217 for (; i+7 < count; i += 8) {
3218 // load
3219 uint8x8_t y_bytes = vld1_u8(y + i);
3220 uint8x8_t cr_bytes = vld1_u8(pcr + i);
3221 uint8x8_t cb_bytes = vld1_u8(pcb + i);
3222 int8x8_t cr_biased = vreinterpret_s8_u8(vsub_u8(cr_bytes, signflip));
3223 int8x8_t cb_biased = vreinterpret_s8_u8(vsub_u8(cb_bytes, signflip));
3224
3225 // expand to s16
3226 int16x8_t yws = vreinterpretq_s16_u16(vshll_n_u8(y_bytes, 4));
3227 int16x8_t crw = vshll_n_s8(cr_biased, 7);
3228 int16x8_t cbw = vshll_n_s8(cb_biased, 7);
3229
3230 // color transform
3231 int16x8_t cr0 = vqdmulhq_s16(crw, cr_const0);
3232 int16x8_t cb0 = vqdmulhq_s16(cbw, cb_const0);
3233 int16x8_t cr1 = vqdmulhq_s16(crw, cr_const1);
3234 int16x8_t cb1 = vqdmulhq_s16(cbw, cb_const1);
3235 int16x8_t rws = vaddq_s16(yws, cr0);
3236 int16x8_t gws = vaddq_s16(vaddq_s16(yws, cb0), cr1);
3237 int16x8_t bws = vaddq_s16(yws, cb1);
3238
3239 // undo scaling, round, convert to byte
3240 uint8x8x4_t o;
3241 o.val[0] = vqrshrun_n_s16(rws, 4);
3242 o.val[1] = vqrshrun_n_s16(gws, 4);
3243 o.val[2] = vqrshrun_n_s16(bws, 4);
3244 o.val[3] = vdup_n_u8(255);
3245
3246 // store, interleaving r/g/b/a
3247 vst4_u8(out, o);
3248 out += 8*4;
3249 }
3250 }
3251 #endif
3252
3253 for (; i < count; ++i) {
3254 int y_fixed = (y[i] << 20) + (1<<19); // rounding
3255 int r,g,b;
3256 int cr = pcr[i] - 128;
3257 int cb = pcb[i] - 128;
3258 r = y_fixed + cr* float2fixed(1.40200f);
3259 g = y_fixed + cr*-float2fixed(0.71414f) + ((cb*-float2fixed(0.34414f)) & 0xffff0000);
3260 b = y_fixed + cb* float2fixed(1.77200f);
3261 r >>= 20;
3262 g >>= 20;
3263 b >>= 20;
3264 if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3265 if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3266 if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3267 out[0] = (stbi_uc)r;
3268 out[1] = (stbi_uc)g;
3269 out[2] = (stbi_uc)b;
3270 out[3] = 255;
3271 out += step;
3272 }
3273 }
3274 #endif
3275
3276 // set up the kernels
stbi__setup_jpeg(stbi__jpeg * j)3277 static void stbi__setup_jpeg(stbi__jpeg *j)
3278 {
3279 j->idct_block_kernel = stbi__idct_block;
3280 j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_row;
3281 j->resample_row_hv_2_kernel = stbi__resample_row_hv_2;
3282
3283 #ifdef STBI_SSE2
3284 if (stbi__sse2_available()) {
3285 j->idct_block_kernel = stbi__idct_simd;
3286 #ifndef STBI_JPEG_OLD
3287 j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
3288 #endif
3289 j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
3290 }
3291 #endif
3292
3293 #ifdef STBI_NEON
3294 j->idct_block_kernel = stbi__idct_simd;
3295 #ifndef STBI_JPEG_OLD
3296 j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
3297 #endif
3298 j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
3299 #endif
3300 }
3301
3302 // clean up the temporary component buffers
stbi__cleanup_jpeg(stbi__jpeg * j)3303 static void stbi__cleanup_jpeg(stbi__jpeg *j)
3304 {
3305 int i;
3306 for (i=0; i < j->s->img_n; ++i) {
3307 if (j->img_comp[i].raw_data) {
3308 STBI_FREE(j->img_comp[i].raw_data);
3309 j->img_comp[i].raw_data = NULL;
3310 j->img_comp[i].data = NULL;
3311 }
3312 if (j->img_comp[i].raw_coeff) {
3313 STBI_FREE(j->img_comp[i].raw_coeff);
3314 j->img_comp[i].raw_coeff = 0;
3315 j->img_comp[i].coeff = 0;
3316 }
3317 if (j->img_comp[i].linebuf) {
3318 STBI_FREE(j->img_comp[i].linebuf);
3319 j->img_comp[i].linebuf = NULL;
3320 }
3321 }
3322 }
3323
3324 typedef struct
3325 {
3326 resample_row_func resample;
3327 stbi_uc *line0,*line1;
3328 int hs,vs; // expansion factor in each axis
3329 int w_lores; // horizontal pixels pre-expansion
3330 int ystep; // how far through vertical expansion we are
3331 int ypos; // which pre-expansion row we're on
3332 } stbi__resample;
3333
load_jpeg_image(stbi__jpeg * z,int * out_x,int * out_y,int * comp,int req_comp)3334 static stbi_uc *load_jpeg_image(stbi__jpeg *z, int *out_x, int *out_y, int *comp, int req_comp)
3335 {
3336 int n, decode_n;
3337 z->s->img_n = 0; // make stbi__cleanup_jpeg safe
3338
3339 // validate req_comp
3340 if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
3341
3342 // load a jpeg image from whichever source, but leave in YCbCr format
3343 if (!stbi__decode_jpeg_image(z)) { stbi__cleanup_jpeg(z); return NULL; }
3344
3345 // determine actual number of components to generate
3346 n = req_comp ? req_comp : z->s->img_n;
3347
3348 if (z->s->img_n == 3 && n < 3)
3349 decode_n = 1;
3350 else
3351 decode_n = z->s->img_n;
3352
3353 // resample and color-convert
3354 {
3355 int k;
3356 unsigned int i,j;
3357 stbi_uc *output;
3358 stbi_uc *coutput[4];
3359
3360 stbi__resample res_comp[4];
3361
3362 for (k=0; k < decode_n; ++k) {
3363 stbi__resample *r = &res_comp[k];
3364
3365 // allocate line buffer big enough for upsampling off the edges
3366 // with upsample factor of 4
3367 if (z->s->img_x > (INT_MAX - 3))
3368 return stbi__errpuc("Integer Overflow", "z->s->img_x incorrect");
3369 z->img_comp[k].linebuf = (stbi_uc *) stbi__malloc(z->s->img_x + 3);
3370 if (!z->img_comp[k].linebuf) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
3371
3372 r->hs = z->img_h_max / z->img_comp[k].h;
3373 r->vs = z->img_v_max / z->img_comp[k].v;
3374 r->ystep = r->vs >> 1;
3375 r->w_lores = (z->s->img_x + r->hs-1) / r->hs;
3376 r->ypos = 0;
3377 r->line0 = r->line1 = z->img_comp[k].data;
3378
3379 if (r->hs == 1 && r->vs == 1) r->resample = resample_row_1;
3380 else if (r->hs == 1 && r->vs == 2) r->resample = stbi__resample_row_v_2;
3381 else if (r->hs == 2 && r->vs == 1) r->resample = stbi__resample_row_h_2;
3382 else if (r->hs == 2 && r->vs == 2) r->resample = z->resample_row_hv_2_kernel;
3383 else r->resample = stbi__resample_row_generic;
3384 }
3385
3386 // can't error after this so, this is safe
3387 if(n <= 0 || z->s->img_x <= 0 || z->s->img_y <= 0 ||
3388 (z->s->img_y > (INT_MAX - 1) / z->s->img_x / n))
3389 return stbi__errpuc("Integer Overflow", "z->s->img_x or z->s->img_y incorrect");
3390 output = (stbi_uc *) stbi__malloc(n * z->s->img_x * z->s->img_y + 1);
3391 if (!output) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
3392
3393 // now go ahead and resample
3394 for (j=0; j < z->s->img_y; ++j) {
3395 stbi_uc *out = output + n * z->s->img_x * j;
3396 for (k=0; k < decode_n; ++k) {
3397 stbi__resample *r = &res_comp[k];
3398 int y_bot = r->ystep >= (r->vs >> 1);
3399 coutput[k] = r->resample(z->img_comp[k].linebuf,
3400 y_bot ? r->line1 : r->line0,
3401 y_bot ? r->line0 : r->line1,
3402 r->w_lores, r->hs);
3403 if (++r->ystep >= r->vs) {
3404 r->ystep = 0;
3405 r->line0 = r->line1;
3406 if (++r->ypos < z->img_comp[k].y)
3407 r->line1 += z->img_comp[k].w2;
3408 }
3409 }
3410 if (n >= 3) {
3411 stbi_uc *y = coutput[0];
3412 if (z->s->img_n == 3) {
3413 z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
3414 } else
3415 for (i=0; i < z->s->img_x; ++i) {
3416 out[0] = out[1] = out[2] = y[i];
3417 out[3] = 255; // not used if n==3
3418 out += n;
3419 }
3420 } else {
3421 stbi_uc *y = coutput[0];
3422 if (n == 1)
3423 for (i=0; i < z->s->img_x; ++i) out[i] = y[i];
3424 else
3425 for (i=0; i < z->s->img_x; ++i) *out++ = y[i], *out++ = 255;
3426 }
3427 }
3428 stbi__cleanup_jpeg(z);
3429 *out_x = z->s->img_x;
3430 *out_y = z->s->img_y;
3431 if (comp) *comp = z->s->img_n; // report original components, not output
3432 return output;
3433 }
3434 }
3435
stbi__jpeg_load(stbi__context * s,int * x,int * y,int * comp,int req_comp)3436 static unsigned char *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
3437 {
3438 stbi__jpeg j;
3439 j.s = s;
3440 stbi__setup_jpeg(&j);
3441 return load_jpeg_image(&j, x,y,comp,req_comp);
3442 }
3443
stbi__jpeg_test(stbi__context * s)3444 static int stbi__jpeg_test(stbi__context *s)
3445 {
3446 int r;
3447 stbi__jpeg j;
3448 j.s = s;
3449 stbi__setup_jpeg(&j);
3450 r = stbi__decode_jpeg_header(&j, STBI__SCAN_type);
3451 stbi__rewind(s);
3452 return r;
3453 }
3454
stbi__jpeg_info_raw(stbi__jpeg * j,int * x,int * y,int * comp)3455 static int stbi__jpeg_info_raw(stbi__jpeg *j, int *x, int *y, int *comp)
3456 {
3457 if (!stbi__decode_jpeg_header(j, STBI__SCAN_header)) {
3458 stbi__rewind( j->s );
3459 return 0;
3460 }
3461 if (x) *x = j->s->img_x;
3462 if (y) *y = j->s->img_y;
3463 if (comp) *comp = j->s->img_n;
3464 return 1;
3465 }
3466
stbi__jpeg_info(stbi__context * s,int * x,int * y,int * comp)3467 static int stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp)
3468 {
3469 stbi__jpeg j;
3470 j.s = s;
3471 return stbi__jpeg_info_raw(&j, x, y, comp);
3472 }
3473 #endif
3474
3475 // public domain zlib decode v0.2 Sean Barrett 2006-11-18
3476 // simple implementation
3477 // - all input must be provided in an upfront buffer
3478 // - all output is written to a single output buffer (can malloc/realloc)
3479 // performance
3480 // - fast huffman
3481
3482 #ifndef STBI_NO_ZLIB
3483
3484 // fast-way is faster to check than jpeg huffman, but slow way is slower
3485 #define STBI__ZFAST_BITS 9 // accelerate all cases in default tables
3486 #define STBI__ZFAST_MASK ((1 << STBI__ZFAST_BITS) - 1)
3487
3488 // zlib-style huffman encoding
3489 // (jpegs packs from left, zlib from right, so can't share code)
3490 typedef struct
3491 {
3492 stbi__uint16 fast[1 << STBI__ZFAST_BITS];
3493 stbi__uint16 firstcode[16];
3494 int maxcode[17];
3495 stbi__uint16 firstsymbol[16];
3496 stbi_uc size[288];
3497 stbi__uint16 value[288];
3498 } stbi__zhuffman;
3499
stbi__bitreverse16(int n)3500 stbi_inline static int stbi__bitreverse16(int n)
3501 {
3502 n = ((n & 0xAAAA) >> 1) | ((n & 0x5555) << 1);
3503 n = ((n & 0xCCCC) >> 2) | ((n & 0x3333) << 2);
3504 n = ((n & 0xF0F0) >> 4) | ((n & 0x0F0F) << 4);
3505 n = ((n & 0xFF00) >> 8) | ((n & 0x00FF) << 8);
3506 return n;
3507 }
3508
stbi__bit_reverse(int v,int bits)3509 stbi_inline static int stbi__bit_reverse(int v, int bits)
3510 {
3511 STBI_ASSERT(bits <= 16);
3512 // to bit reverse n bits, reverse 16 and shift
3513 // e.g. 11 bits, bit reverse and shift away 5
3514 return stbi__bitreverse16(v) >> (16-bits);
3515 }
3516
stbi__zbuild_huffman(stbi__zhuffman * z,stbi_uc * sizelist,int num)3517 static int stbi__zbuild_huffman(stbi__zhuffman *z, stbi_uc *sizelist, int num)
3518 {
3519 int i,k=0;
3520 int code, next_code[16], sizes[17];
3521
3522 // DEFLATE spec for generating codes
3523 memset(sizes, 0, sizeof(sizes));
3524 memset(z->fast, 0, sizeof(z->fast));
3525 for (i=0; i < num; ++i)
3526 ++sizes[sizelist[i]];
3527 sizes[0] = 0;
3528 for (i=1; i < 16; ++i)
3529 if (sizes[i] > (1 << i))
3530 return stbi__err("bad sizes", "Corrupt PNG");
3531 code = 0;
3532 for (i=1; i < 16; ++i) {
3533 next_code[i] = code;
3534 z->firstcode[i] = (stbi__uint16) code;
3535 z->firstsymbol[i] = (stbi__uint16) k;
3536 code = (code + sizes[i]);
3537 if (sizes[i])
3538 if (code-1 >= (1 << i)) return stbi__err("bad codelengths","Corrupt PNG");
3539 z->maxcode[i] = code << (16-i); // preshift for inner loop
3540 code <<= 1;
3541 k += sizes[i];
3542 }
3543 z->maxcode[16] = 0x10000; // sentinel
3544 for (i=0; i < num; ++i) {
3545 int s = sizelist[i];
3546 if (s) {
3547 int c = next_code[s] - z->firstcode[s] + z->firstsymbol[s];
3548 stbi__uint16 fastv = (stbi__uint16) ((s << 9) | i);
3549 z->size [c] = (stbi_uc ) s;
3550 z->value[c] = (stbi__uint16) i;
3551 if (s <= STBI__ZFAST_BITS) {
3552 int j = stbi__bit_reverse(next_code[s],s);
3553 while (j < (1 << STBI__ZFAST_BITS)) {
3554 z->fast[j] = fastv;
3555 j += (1 << s);
3556 }
3557 }
3558 ++next_code[s];
3559 }
3560 }
3561 return 1;
3562 }
3563
3564 // zlib-from-memory implementation for PNG reading
3565 // because PNG allows splitting the zlib stream arbitrarily,
3566 // and it's annoying structurally to have PNG call ZLIB call PNG,
3567 // we require PNG read all the IDATs and combine them into a single
3568 // memory buffer
3569
3570 typedef struct
3571 {
3572 stbi_uc *zbuffer, *zbuffer_end;
3573 int num_bits;
3574 stbi__uint32 code_buffer;
3575
3576 char *zout;
3577 char *zout_start;
3578 char *zout_end;
3579 int z_expandable;
3580
3581 stbi__zhuffman z_length, z_distance;
3582 } stbi__zbuf;
3583
stbi__zget8(stbi__zbuf * z)3584 stbi_inline static stbi_uc stbi__zget8(stbi__zbuf *z)
3585 {
3586 if (z->zbuffer >= z->zbuffer_end) return 0;
3587 return *z->zbuffer++;
3588 }
3589
stbi__fill_bits(stbi__zbuf * z)3590 static void stbi__fill_bits(stbi__zbuf *z)
3591 {
3592 do {
3593 STBI_ASSERT(z->code_buffer < (1U << z->num_bits));
3594 z->code_buffer |= (unsigned int) stbi__zget8(z) << z->num_bits;
3595 z->num_bits += 8;
3596 } while (z->num_bits <= 24);
3597 }
3598
stbi__zreceive(stbi__zbuf * z,int n)3599 stbi_inline static unsigned int stbi__zreceive(stbi__zbuf *z, int n)
3600 {
3601 unsigned int k;
3602 if (z->num_bits < n) stbi__fill_bits(z);
3603 k = z->code_buffer & ((1 << n) - 1);
3604 z->code_buffer >>= n;
3605 z->num_bits -= n;
3606 return k;
3607 }
3608
stbi__zhuffman_decode_slowpath(stbi__zbuf * a,stbi__zhuffman * z)3609 static int stbi__zhuffman_decode_slowpath(stbi__zbuf *a, stbi__zhuffman *z)
3610 {
3611 int b,s,k;
3612 // not resolved by fast table, so compute it the slow way
3613 // use jpeg approach, which requires MSbits at top
3614 k = stbi__bit_reverse(a->code_buffer, 16);
3615 for (s=STBI__ZFAST_BITS+1; ; ++s)
3616 if (k < z->maxcode[s])
3617 break;
3618 if (s == 16) return -1; // invalid code!
3619 // code size is s, so:
3620 b = (k >> (16-s)) - z->firstcode[s] + z->firstsymbol[s];
3621 STBI_ASSERT(z->size[b] == s);
3622 a->code_buffer >>= s;
3623 a->num_bits -= s;
3624 return z->value[b];
3625 }
3626
stbi__zhuffman_decode(stbi__zbuf * a,stbi__zhuffman * z)3627 stbi_inline static int stbi__zhuffman_decode(stbi__zbuf *a, stbi__zhuffman *z)
3628 {
3629 int b,s;
3630 if (a->num_bits < 16) stbi__fill_bits(a);
3631 b = z->fast[a->code_buffer & STBI__ZFAST_MASK];
3632 if (b) {
3633 s = b >> 9;
3634 a->code_buffer >>= s;
3635 a->num_bits -= s;
3636 return b & 511;
3637 }
3638 return stbi__zhuffman_decode_slowpath(a, z);
3639 }
3640
stbi__zexpand(stbi__zbuf * z,char * zout,int n)3641 static int stbi__zexpand(stbi__zbuf *z, char *zout, int n) // need to make room for n bytes
3642 {
3643 char *q;
3644 int cur, limit;
3645 z->zout = zout;
3646 if (!z->z_expandable) return stbi__err("output buffer limit","Corrupt PNG");
3647 cur = (int) (z->zout - z->zout_start);
3648 limit = (int) (z->zout_end - z->zout_start);
3649 while (cur + n > limit)
3650 limit *= 2;
3651 q = (char *) STBI_REALLOC(z->zout_start, limit);
3652 if (q == NULL) return stbi__err("outofmem", "Out of memory");
3653 z->zout_start = q;
3654 z->zout = q + cur;
3655 z->zout_end = q + limit;
3656 return 1;
3657 }
3658
3659 static int stbi__zlength_base[31] = {
3660 3,4,5,6,7,8,9,10,11,13,
3661 15,17,19,23,27,31,35,43,51,59,
3662 67,83,99,115,131,163,195,227,258,0,0 };
3663
3664 static int stbi__zlength_extra[31]=
3665 { 0,0,0,0,0,0,0,0,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,0,0,0 };
3666
3667 static int stbi__zdist_base[32] = { 1,2,3,4,5,7,9,13,17,25,33,49,65,97,129,193,
3668 257,385,513,769,1025,1537,2049,3073,4097,6145,8193,12289,16385,24577,0,0};
3669
3670 static int stbi__zdist_extra[32] =
3671 { 0,0,0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13};
3672
stbi__parse_huffman_block(stbi__zbuf * a)3673 static int stbi__parse_huffman_block(stbi__zbuf *a)
3674 {
3675 char *zout = a->zout;
3676 for(;;) {
3677 int z = stbi__zhuffman_decode(a, &a->z_length);
3678 if (z < 256) {
3679 if (z < 0) return stbi__err("bad huffman code","Corrupt PNG"); // error in huffman codes
3680 if (zout >= a->zout_end) {
3681 if (!stbi__zexpand(a, zout, 1)) return 0;
3682 zout = a->zout;
3683 }
3684 *zout++ = (char) z;
3685 } else {
3686 stbi_uc *p;
3687 int len,dist;
3688 if (z == 256) {
3689 a->zout = zout;
3690 return 1;
3691 }
3692 z -= 257;
3693 len = stbi__zlength_base[z];
3694 if (stbi__zlength_extra[z]) len += stbi__zreceive(a, stbi__zlength_extra[z]);
3695 z = stbi__zhuffman_decode(a, &a->z_distance);
3696 if (z < 0) return stbi__err("bad huffman code","Corrupt PNG");
3697 dist = stbi__zdist_base[z];
3698 if (stbi__zdist_extra[z]) dist += stbi__zreceive(a, stbi__zdist_extra[z]);
3699 if (zout - a->zout_start < dist) return stbi__err("bad dist","Corrupt PNG");
3700 if (zout + len > a->zout_end) {
3701 if (!stbi__zexpand(a, zout, len)) return 0;
3702 zout = a->zout;
3703 }
3704 p = (stbi_uc *) (zout - dist);
3705 if (dist == 1) { // run of one byte; common in images.
3706 stbi_uc v = *p;
3707 if (len) { do *zout++ = v; while (--len); }
3708 } else {
3709 if (len) { do *zout++ = *p++; while (--len); }
3710 }
3711 }
3712 }
3713 }
3714
stbi__compute_huffman_codes(stbi__zbuf * a)3715 static int stbi__compute_huffman_codes(stbi__zbuf *a)
3716 {
3717 static stbi_uc length_dezigzag[19] = { 16,17,18,0,8,7,9,6,10,5,11,4,12,3,13,2,14,1,15 };
3718 stbi__zhuffman z_codelength;
3719 stbi_uc lencodes[286+32+137];//padding for maximum single op
3720 stbi_uc codelength_sizes[19];
3721 int i,n;
3722
3723 int hlit = stbi__zreceive(a,5) + 257;
3724 int hdist = stbi__zreceive(a,5) + 1;
3725 int hclen = stbi__zreceive(a,4) + 4;
3726
3727 memset(codelength_sizes, 0, sizeof(codelength_sizes));
3728 for (i=0; i < hclen; ++i) {
3729 int s = stbi__zreceive(a,3);
3730 codelength_sizes[length_dezigzag[i]] = (stbi_uc) s;
3731 }
3732 if (!stbi__zbuild_huffman(&z_codelength, codelength_sizes, 19)) return 0;
3733
3734 n = 0;
3735 while (n < hlit + hdist) {
3736 int c = stbi__zhuffman_decode(a, &z_codelength);
3737 if (c < 0 || c >= 19) return stbi__err("bad codelengths", "Corrupt PNG");
3738 if (c < 16)
3739 lencodes[n++] = (stbi_uc) c;
3740 else if (c == 16) {
3741 c = stbi__zreceive(a,2)+3;
3742 memset(lencodes+n, lencodes[n-1], c);
3743 n += c;
3744 } else if (c == 17) {
3745 c = stbi__zreceive(a,3)+3;
3746 memset(lencodes+n, 0, c);
3747 n += c;
3748 } else {
3749 STBI_ASSERT(c == 18);
3750 c = stbi__zreceive(a,7)+11;
3751 memset(lencodes+n, 0, c);
3752 n += c;
3753 }
3754 }
3755 if (n != hlit+hdist) return stbi__err("bad codelengths","Corrupt PNG");
3756 if (!stbi__zbuild_huffman(&a->z_length, lencodes, hlit)) return 0;
3757 if (!stbi__zbuild_huffman(&a->z_distance, lencodes+hlit, hdist)) return 0;
3758 return 1;
3759 }
3760
stbi__parse_uncomperssed_block(stbi__zbuf * a)3761 static int stbi__parse_uncomperssed_block(stbi__zbuf *a)
3762 {
3763 stbi_uc header[4];
3764 int len,nlen,k;
3765 if (a->num_bits & 7)
3766 stbi__zreceive(a, a->num_bits & 7); // discard
3767 // drain the bit-packed data into header
3768 k = 0;
3769 while (a->num_bits > 0) {
3770 header[k++] = (stbi_uc) (a->code_buffer & 255); // suppress MSVC run-time check
3771 a->code_buffer >>= 8;
3772 a->num_bits -= 8;
3773 }
3774 STBI_ASSERT(a->num_bits == 0);
3775 // now fill header the normal way
3776 while (k < 4)
3777 header[k++] = stbi__zget8(a);
3778 len = header[1] * 256 + header[0];
3779 nlen = header[3] * 256 + header[2];
3780 if (nlen != (len ^ 0xffff)) return stbi__err("zlib corrupt","Corrupt PNG");
3781 if (a->zbuffer + len > a->zbuffer_end) return stbi__err("read past buffer","Corrupt PNG");
3782 if (a->zout + len > a->zout_end)
3783 if (!stbi__zexpand(a, a->zout, len)) return 0;
3784 memcpy(a->zout, a->zbuffer, len);
3785 a->zbuffer += len;
3786 a->zout += len;
3787 return 1;
3788 }
3789
stbi__parse_zlib_header(stbi__zbuf * a)3790 static int stbi__parse_zlib_header(stbi__zbuf *a)
3791 {
3792 int cmf = stbi__zget8(a);
3793 int cm = cmf & 15;
3794 /* int cinfo = cmf >> 4; */
3795 int flg = stbi__zget8(a);
3796 if ((cmf*256+flg) % 31 != 0) return stbi__err("bad zlib header","Corrupt PNG"); // zlib spec
3797 if (flg & 32) return stbi__err("no preset dict","Corrupt PNG"); // preset dictionary not allowed in png
3798 if (cm != 8) return stbi__err("bad compression","Corrupt PNG"); // DEFLATE required for png
3799 // window = 1 << (8 + cinfo)... but who cares, we fully buffer output
3800 return 1;
3801 }
3802
3803 // @TODO: should statically initialize these for optimal thread safety
3804 static stbi_uc stbi__zdefault_length[288], stbi__zdefault_distance[32];
stbi__init_zdefaults(void)3805 static void stbi__init_zdefaults(void)
3806 {
3807 int i; // use <= to match clearly with spec
3808 for (i=0; i <= 143; ++i) stbi__zdefault_length[i] = 8;
3809 for ( ; i <= 255; ++i) stbi__zdefault_length[i] = 9;
3810 for ( ; i <= 279; ++i) stbi__zdefault_length[i] = 7;
3811 for ( ; i <= 287; ++i) stbi__zdefault_length[i] = 8;
3812
3813 for (i=0; i <= 31; ++i) stbi__zdefault_distance[i] = 5;
3814 }
3815
stbi__parse_zlib(stbi__zbuf * a,int parse_header)3816 static int stbi__parse_zlib(stbi__zbuf *a, int parse_header)
3817 {
3818 int final, type;
3819 if (parse_header)
3820 if (!stbi__parse_zlib_header(a)) return 0;
3821 a->num_bits = 0;
3822 a->code_buffer = 0;
3823 do {
3824 final = stbi__zreceive(a,1);
3825 type = stbi__zreceive(a,2);
3826 if (type == 0) {
3827 if (!stbi__parse_uncomperssed_block(a)) return 0;
3828 } else if (type == 3) {
3829 return 0;
3830 } else {
3831 if (type == 1) {
3832 // use fixed code lengths
3833 if (!stbi__zdefault_distance[31]) stbi__init_zdefaults();
3834 if (!stbi__zbuild_huffman(&a->z_length , stbi__zdefault_length , 288)) return 0;
3835 if (!stbi__zbuild_huffman(&a->z_distance, stbi__zdefault_distance, 32)) return 0;
3836 } else {
3837 if (!stbi__compute_huffman_codes(a)) return 0;
3838 }
3839 if (!stbi__parse_huffman_block(a)) return 0;
3840 }
3841 } while (!final);
3842 return 1;
3843 }
3844
stbi__do_zlib(stbi__zbuf * a,char * obuf,int olen,int exp,int parse_header)3845 static int stbi__do_zlib(stbi__zbuf *a, char *obuf, int olen, int exp, int parse_header)
3846 {
3847 a->zout_start = obuf;
3848 a->zout = obuf;
3849 a->zout_end = obuf + olen;
3850 a->z_expandable = exp;
3851
3852 return stbi__parse_zlib(a, parse_header);
3853 }
3854
stbi_zlib_decode_malloc_guesssize(const char * buffer,int len,int initial_size,int * outlen)3855 STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen)
3856 {
3857 stbi__zbuf a;
3858 char *p = (char *) stbi__malloc(initial_size);
3859 if (p == NULL) return NULL;
3860 a.zbuffer = (stbi_uc *) buffer;
3861 a.zbuffer_end = (stbi_uc *) buffer + len;
3862 if (stbi__do_zlib(&a, p, initial_size, 1, 1)) {
3863 if (outlen) *outlen = (int) (a.zout - a.zout_start);
3864 return a.zout_start;
3865 } else {
3866 STBI_FREE(a.zout_start);
3867 return NULL;
3868 }
3869 }
3870
stbi_zlib_decode_malloc(char const * buffer,int len,int * outlen)3871 STBIDEF char *stbi_zlib_decode_malloc(char const *buffer, int len, int *outlen)
3872 {
3873 return stbi_zlib_decode_malloc_guesssize(buffer, len, 16384, outlen);
3874 }
3875
stbi_zlib_decode_malloc_guesssize_headerflag(const char * buffer,int len,int initial_size,int * outlen,int parse_header)3876 STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header)
3877 {
3878 stbi__zbuf a;
3879 char *p = (char *) stbi__malloc(initial_size);
3880 if (p == NULL) return NULL;
3881 a.zbuffer = (stbi_uc *) buffer;
3882 a.zbuffer_end = (stbi_uc *) buffer + len;
3883 if (stbi__do_zlib(&a, p, initial_size, 1, parse_header)) {
3884 if (outlen) *outlen = (int) (a.zout - a.zout_start);
3885 return a.zout_start;
3886 } else {
3887 STBI_FREE(a.zout_start);
3888 return NULL;
3889 }
3890 }
3891
stbi_zlib_decode_buffer(char * obuffer,int olen,char const * ibuffer,int ilen)3892 STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, char const *ibuffer, int ilen)
3893 {
3894 stbi__zbuf a;
3895 a.zbuffer = (stbi_uc *) ibuffer;
3896 a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
3897 if (stbi__do_zlib(&a, obuffer, olen, 0, 1))
3898 return (int) (a.zout - a.zout_start);
3899 else
3900 return -1;
3901 }
3902
stbi_zlib_decode_noheader_malloc(char const * buffer,int len,int * outlen)3903 STBIDEF char *stbi_zlib_decode_noheader_malloc(char const *buffer, int len, int *outlen)
3904 {
3905 stbi__zbuf a;
3906 char *p = (char *) stbi__malloc(16384);
3907 if (p == NULL) return NULL;
3908 a.zbuffer = (stbi_uc *) buffer;
3909 a.zbuffer_end = (stbi_uc *) buffer+len;
3910 if (stbi__do_zlib(&a, p, 16384, 1, 0)) {
3911 if (outlen) *outlen = (int) (a.zout - a.zout_start);
3912 return a.zout_start;
3913 } else {
3914 STBI_FREE(a.zout_start);
3915 return NULL;
3916 }
3917 }
3918
stbi_zlib_decode_noheader_buffer(char * obuffer,int olen,const char * ibuffer,int ilen)3919 STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen)
3920 {
3921 stbi__zbuf a;
3922 a.zbuffer = (stbi_uc *) ibuffer;
3923 a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
3924 if (stbi__do_zlib(&a, obuffer, olen, 0, 0))
3925 return (int) (a.zout - a.zout_start);
3926 else
3927 return -1;
3928 }
3929 #endif
3930
3931 // public domain "baseline" PNG decoder v0.10 Sean Barrett 2006-11-18
3932 // simple implementation
3933 // - only 8-bit samples
3934 // - no CRC checking
3935 // - allocates lots of intermediate memory
3936 // - avoids problem of streaming data between subsystems
3937 // - avoids explicit window management
3938 // performance
3939 // - uses stb_zlib, a PD zlib implementation with fast huffman decoding
3940
3941 #ifndef STBI_NO_PNG
3942 typedef struct
3943 {
3944 stbi__uint32 length;
3945 stbi__uint32 type;
3946 } stbi__pngchunk;
3947
stbi__get_chunk_header(stbi__context * s)3948 static stbi__pngchunk stbi__get_chunk_header(stbi__context *s)
3949 {
3950 stbi__pngchunk c;
3951 c.length = stbi__get32be(s);
3952 c.type = stbi__get32be(s);
3953 return c;
3954 }
3955
stbi__check_png_header(stbi__context * s)3956 static int stbi__check_png_header(stbi__context *s)
3957 {
3958 static stbi_uc png_sig[8] = { 137,80,78,71,13,10,26,10 };
3959 int i;
3960 for (i=0; i < 8; ++i)
3961 if (stbi__get8(s) != png_sig[i]) return stbi__err("bad png sig","Not a PNG");
3962 return 1;
3963 }
3964
3965 typedef struct
3966 {
3967 stbi__context *s;
3968 stbi_uc *idata, *expanded, *out;
3969 } stbi__png;
3970
3971
3972 enum {
3973 STBI__F_none=0,
3974 STBI__F_sub=1,
3975 STBI__F_up=2,
3976 STBI__F_avg=3,
3977 STBI__F_paeth=4,
3978 // synthetic filters used for first scanline to avoid needing a dummy row of 0s
3979 STBI__F_avg_first,
3980 STBI__F_paeth_first
3981 };
3982
3983 static stbi_uc first_row_filter[5] =
3984 {
3985 STBI__F_none,
3986 STBI__F_sub,
3987 STBI__F_none,
3988 STBI__F_avg_first,
3989 STBI__F_paeth_first
3990 };
3991
stbi__paeth(int a,int b,int c)3992 static int stbi__paeth(int a, int b, int c)
3993 {
3994 int p = a + b - c;
3995 int pa = abs(p-a);
3996 int pb = abs(p-b);
3997 int pc = abs(p-c);
3998 if (pa <= pb && pa <= pc) return a;
3999 if (pb <= pc) return b;
4000 return c;
4001 }
4002
4003 static stbi_uc stbi__depth_scale_table[9] = { 0, 0xff, 0x55, 0, 0x11, 0,0,0, 0x01 };
4004
4005 // create the png data from post-deflated data
stbi__create_png_image_raw(stbi__png * a,stbi_uc * raw,stbi__uint32 raw_len,int out_n,stbi__uint32 x,stbi__uint32 y,int depth,int color)4006 static int stbi__create_png_image_raw(stbi__png *a, stbi_uc *raw, stbi__uint32 raw_len, int out_n, stbi__uint32 x, stbi__uint32 y, int depth, int color)
4007 {
4008 stbi__context *s = a->s;
4009 stbi__uint32 i,j,stride = x*out_n;
4010 stbi__uint32 img_len, img_width_bytes;
4011 int k;
4012 int img_n = s->img_n; // copy it into a local for later
4013
4014 STBI_ASSERT(out_n == s->img_n || out_n == s->img_n+1);
4015 if (x == 0 || y == 0 || out_n <= 0 || (out_n > (INT_MAX / x / y)))
4016 return stbi__err("Integer Overflow", "x or y incorrect");
4017 a->out = (stbi_uc *) stbi__malloc(x * y * out_n); // extra bytes to write off the end into
4018 if (!a->out) return stbi__err("outofmem", "Out of memory");
4019
4020 img_width_bytes = (((img_n * x * depth) + 7) >> 3);
4021 img_len = (img_width_bytes + 1) * y;
4022 if (s->img_x == x && s->img_y == y) {
4023 if (raw_len != img_len) return stbi__err("not enough pixels","Corrupt PNG");
4024 } else { // interlaced:
4025 if (raw_len < img_len) return stbi__err("not enough pixels","Corrupt PNG");
4026 }
4027
4028 for (j=0; j < y; ++j) {
4029 stbi_uc *cur = a->out + stride*j;
4030 stbi_uc *prior = cur - stride;
4031 int filter = *raw++;
4032 int filter_bytes = img_n;
4033 int width = x;
4034 if (filter > 4)
4035 return stbi__err("invalid filter","Corrupt PNG");
4036
4037 if (depth < 8) {
4038 STBI_ASSERT(img_width_bytes <= x);
4039 cur += x*out_n - img_width_bytes; // store output to the rightmost img_len bytes, so we can decode in place
4040 filter_bytes = 1;
4041 width = img_width_bytes;
4042 }
4043
4044 // if first row, use special filter that doesn't sample previous row
4045 if (j == 0) filter = first_row_filter[filter];
4046
4047 // handle first byte explicitly
4048 for (k=0; k < filter_bytes; ++k) {
4049 switch (filter) {
4050 case STBI__F_none : cur[k] = raw[k]; break;
4051 case STBI__F_sub : cur[k] = raw[k]; break;
4052 case STBI__F_up : cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
4053 case STBI__F_avg : cur[k] = STBI__BYTECAST(raw[k] + (prior[k]>>1)); break;
4054 case STBI__F_paeth : cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(0,prior[k],0)); break;
4055 case STBI__F_avg_first : cur[k] = raw[k]; break;
4056 case STBI__F_paeth_first: cur[k] = raw[k]; break;
4057 }
4058 }
4059
4060 if (depth == 8) {
4061 if (img_n != out_n)
4062 cur[img_n] = 255; // first pixel
4063 raw += img_n;
4064 cur += out_n;
4065 prior += out_n;
4066 } else {
4067 raw += 1;
4068 cur += 1;
4069 prior += 1;
4070 }
4071
4072 // this is a little gross, so that we don't switch per-pixel or per-component
4073 if (depth < 8 || img_n == out_n) {
4074 int nk = (width - 1)*img_n;
4075 #define CASE(f) \
4076 case f: \
4077 for (k=0; k < nk; ++k)
4078 switch (filter) {
4079 // "none" filter turns into a memcpy here; make that explicit.
4080 case STBI__F_none: memcpy(cur, raw, nk); break;
4081 CASE(STBI__F_sub) cur[k] = STBI__BYTECAST(raw[k] + cur[k-filter_bytes]); break;
4082 CASE(STBI__F_up) cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
4083 CASE(STBI__F_avg) cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-filter_bytes])>>1)); break;
4084 CASE(STBI__F_paeth) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],prior[k],prior[k-filter_bytes])); break;
4085 CASE(STBI__F_avg_first) cur[k] = STBI__BYTECAST(raw[k] + (cur[k-filter_bytes] >> 1)); break;
4086 CASE(STBI__F_paeth_first) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],0,0)); break;
4087 }
4088 #undef CASE
4089 raw += nk;
4090 } else {
4091 STBI_ASSERT(img_n+1 == out_n);
4092 #define CASE(f) \
4093 case f: \
4094 for (i=x-1; i >= 1; --i, cur[img_n]=255,raw+=img_n,cur+=out_n,prior+=out_n) \
4095 for (k=0; k < img_n; ++k)
4096 switch (filter) {
4097 CASE(STBI__F_none) cur[k] = raw[k]; break;
4098 CASE(STBI__F_sub) cur[k] = STBI__BYTECAST(raw[k] + cur[k-out_n]); break;
4099 CASE(STBI__F_up) cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
4100 CASE(STBI__F_avg) cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-out_n])>>1)); break;
4101 CASE(STBI__F_paeth) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-out_n],prior[k],prior[k-out_n])); break;
4102 CASE(STBI__F_avg_first) cur[k] = STBI__BYTECAST(raw[k] + (cur[k-out_n] >> 1)); break;
4103 CASE(STBI__F_paeth_first) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-out_n],0,0)); break;
4104 }
4105 #undef CASE
4106 }
4107 }
4108
4109 // we make a separate pass to expand bits to pixels; for performance,
4110 // this could run two scanlines behind the above code, so it won't
4111 // intefere with filtering but will still be in the cache.
4112 if (depth < 8) {
4113 for (j=0; j < y; ++j) {
4114 stbi_uc *cur = a->out + stride*j;
4115 stbi_uc *in = a->out + stride*j + x*out_n - img_width_bytes;
4116 // unpack 1/2/4-bit into a 8-bit buffer. allows us to keep the common 8-bit path optimal at minimal cost for 1/2/4-bit
4117 // png guarante byte alignment, if width is not multiple of 8/4/2 we'll decode dummy trailing data that will be skipped in the later loop
4118 stbi_uc scale = (color == 0) ? stbi__depth_scale_table[depth] : 1; // scale grayscale values to 0..255 range
4119
4120 // note that the final byte might overshoot and write more data than desired.
4121 // we can allocate enough data that this never writes out of memory, but it
4122 // could also overwrite the next scanline. can it overwrite non-empty data
4123 // on the next scanline? yes, consider 1-pixel-wide scanlines with 1-bit-per-pixel.
4124 // so we need to explicitly clamp the final ones
4125
4126 if (depth == 4) {
4127 for (k=x*img_n; k >= 2; k-=2, ++in) {
4128 *cur++ = scale * ((*in >> 4) );
4129 *cur++ = scale * ((*in ) & 0x0f);
4130 }
4131 if (k > 0) *cur++ = scale * ((*in >> 4) );
4132 } else if (depth == 2) {
4133 for (k=x*img_n; k >= 4; k-=4, ++in) {
4134 *cur++ = scale * ((*in >> 6) );
4135 *cur++ = scale * ((*in >> 4) & 0x03);
4136 *cur++ = scale * ((*in >> 2) & 0x03);
4137 *cur++ = scale * ((*in ) & 0x03);
4138 }
4139 if (k > 0) *cur++ = scale * ((*in >> 6) );
4140 if (k > 1) *cur++ = scale * ((*in >> 4) & 0x03);
4141 if (k > 2) *cur++ = scale * ((*in >> 2) & 0x03);
4142 } else if (depth == 1) {
4143 for (k=x*img_n; k >= 8; k-=8, ++in) {
4144 *cur++ = scale * ((*in >> 7) );
4145 *cur++ = scale * ((*in >> 6) & 0x01);
4146 *cur++ = scale * ((*in >> 5) & 0x01);
4147 *cur++ = scale * ((*in >> 4) & 0x01);
4148 *cur++ = scale * ((*in >> 3) & 0x01);
4149 *cur++ = scale * ((*in >> 2) & 0x01);
4150 *cur++ = scale * ((*in >> 1) & 0x01);
4151 *cur++ = scale * ((*in ) & 0x01);
4152 }
4153 if (k > 0) *cur++ = scale * ((*in >> 7) );
4154 if (k > 1) *cur++ = scale * ((*in >> 6) & 0x01);
4155 if (k > 2) *cur++ = scale * ((*in >> 5) & 0x01);
4156 if (k > 3) *cur++ = scale * ((*in >> 4) & 0x01);
4157 if (k > 4) *cur++ = scale * ((*in >> 3) & 0x01);
4158 if (k > 5) *cur++ = scale * ((*in >> 2) & 0x01);
4159 if (k > 6) *cur++ = scale * ((*in >> 1) & 0x01);
4160 }
4161 if (img_n != out_n) {
4162 int q;
4163 // insert alpha = 255
4164 cur = a->out + stride*j;
4165 if (img_n == 1) {
4166 for (q=x-1; q >= 0; --q) {
4167 cur[q*2+1] = 255;
4168 cur[q*2+0] = cur[q];
4169 }
4170 } else {
4171 STBI_ASSERT(img_n == 3);
4172 for (q=x-1; q >= 0; --q) {
4173 cur[q*4+3] = 255;
4174 cur[q*4+2] = cur[q*3+2];
4175 cur[q*4+1] = cur[q*3+1];
4176 cur[q*4+0] = cur[q*3+0];
4177 }
4178 }
4179 }
4180 }
4181 }
4182
4183 return 1;
4184 }
4185
stbi__create_png_image(stbi__png * a,stbi_uc * image_data,stbi__uint32 image_data_len,int out_n,int depth,int color,int interlaced)4186 static int stbi__create_png_image(stbi__png *a, stbi_uc *image_data, stbi__uint32 image_data_len, int out_n, int depth, int color, int interlaced)
4187 {
4188 stbi_uc *final;
4189 int p;
4190 if (!interlaced)
4191 return stbi__create_png_image_raw(a, image_data, image_data_len, out_n, a->s->img_x, a->s->img_y, depth, color);
4192
4193 // de-interlacing
4194 if (a->s->img_x == 0 || a->s->img_y == 0 || out_n <= 0
4195 || (out_n > (INT_MAX / a->s->img_x / a->s->img_y)))
4196 return stbi__err("Integer Overflow", "x or y incorrect");
4197
4198 final = (stbi_uc *) stbi__malloc(a->s->img_x * a->s->img_y * out_n);
4199 if (final == NULL) return stbi__err("outofmem", "Out of memory");
4200 for (p=0; p < 7; ++p) {
4201 int xorig[] = { 0,4,0,2,0,1,0 };
4202 int yorig[] = { 0,0,4,0,2,0,1 };
4203 int xspc[] = { 8,8,4,4,2,2,1 };
4204 int yspc[] = { 8,8,8,4,4,2,2 };
4205 int i,j,x,y;
4206 // pass1_x[4] = 0, pass1_x[5] = 1, pass1_x[12] = 1
4207 x = (a->s->img_x - xorig[p] + xspc[p]-1) / xspc[p];
4208 y = (a->s->img_y - yorig[p] + yspc[p]-1) / yspc[p];
4209 if (x && y) {
4210 stbi__uint32 img_len = ((((a->s->img_n * x * depth) + 7) >> 3) + 1) * y;
4211 if (!stbi__create_png_image_raw(a, image_data, image_data_len, out_n, x, y, depth, color)) {
4212 STBI_FREE(final);
4213 return 0;
4214 }
4215 for (j=0; j < y; ++j) {
4216 for (i=0; i < x; ++i) {
4217 int out_y = j*yspc[p]+yorig[p];
4218 int out_x = i*xspc[p]+xorig[p];
4219 memcpy(final + out_y*a->s->img_x*out_n + out_x*out_n,
4220 a->out + (j*x+i)*out_n, out_n);
4221 }
4222 }
4223 STBI_FREE(a->out);
4224 image_data += img_len;
4225 image_data_len -= img_len;
4226 }
4227 }
4228 a->out = final;
4229
4230 return 1;
4231 }
4232
stbi__compute_transparency(stbi__png * z,stbi_uc tc[3],int out_n)4233 static int stbi__compute_transparency(stbi__png *z, stbi_uc tc[3], int out_n)
4234 {
4235 stbi__context *s = z->s;
4236 stbi__uint32 i, pixel_count = s->img_x * s->img_y;
4237 stbi_uc *p = z->out;
4238
4239 // compute color-based transparency, assuming we've
4240 // already got 255 as the alpha value in the output
4241 STBI_ASSERT(out_n == 2 || out_n == 4);
4242
4243 if (out_n == 2) {
4244 for (i=0; i < pixel_count; ++i) {
4245 p[1] = (p[0] == tc[0] ? 0 : 255);
4246 p += 2;
4247 }
4248 } else {
4249 for (i=0; i < pixel_count; ++i) {
4250 if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
4251 p[3] = 0;
4252 p += 4;
4253 }
4254 }
4255 return 1;
4256 }
4257
stbi__expand_png_palette(stbi__png * a,stbi_uc * palette,int len,int pal_img_n)4258 static int stbi__expand_png_palette(stbi__png *a, stbi_uc *palette, int len, int pal_img_n)
4259 {
4260 stbi__uint32 i, pixel_count = a->s->img_x * a->s->img_y;
4261 stbi_uc *p, *temp_out, *orig = a->out;
4262
4263 if(a->s->img_x == 0 || a->s->img_y == 0 || pal_img_n > (INT_MAX / a->s->img_x / a->s->img_y))
4264 return stbi__err("Integer Overflow", "x or y incorrect");
4265 p = (stbi_uc *) stbi__malloc(pixel_count * pal_img_n);
4266 if (p == NULL) return stbi__err("outofmem", "Out of memory");
4267
4268 // between here and free(out) below, exitting would leak
4269 temp_out = p;
4270
4271 if (pal_img_n == 3) {
4272 for (i=0; i < pixel_count; ++i) {
4273 int n = orig[i]*4;
4274 p[0] = palette[n ];
4275 p[1] = palette[n+1];
4276 p[2] = palette[n+2];
4277 p += 3;
4278 }
4279 } else {
4280 for (i=0; i < pixel_count; ++i) {
4281 int n = orig[i]*4;
4282 p[0] = palette[n ];
4283 p[1] = palette[n+1];
4284 p[2] = palette[n+2];
4285 p[3] = palette[n+3];
4286 p += 4;
4287 }
4288 }
4289 STBI_FREE(a->out);
4290 a->out = temp_out;
4291
4292 STBI_NOTUSED(len);
4293
4294 return 1;
4295 }
4296
4297 static int stbi__unpremultiply_on_load = 0;
4298 static int stbi__de_iphone_flag = 0;
4299
stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply)4300 STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply)
4301 {
4302 stbi__unpremultiply_on_load = flag_true_if_should_unpremultiply;
4303 }
4304
stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert)4305 STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert)
4306 {
4307 stbi__de_iphone_flag = flag_true_if_should_convert;
4308 }
4309
stbi__de_iphone(stbi__png * z)4310 static void stbi__de_iphone(stbi__png *z)
4311 {
4312 stbi__context *s = z->s;
4313 stbi__uint32 i, pixel_count = s->img_x * s->img_y;
4314 stbi_uc *p = z->out;
4315
4316 if (s->img_out_n == 3) { // convert bgr to rgb
4317 for (i=0; i < pixel_count; ++i) {
4318 stbi_uc t = p[0];
4319 p[0] = p[2];
4320 p[2] = t;
4321 p += 3;
4322 }
4323 } else {
4324 STBI_ASSERT(s->img_out_n == 4);
4325 if (stbi__unpremultiply_on_load) {
4326 // convert bgr to rgb and unpremultiply
4327 for (i=0; i < pixel_count; ++i) {
4328 stbi_uc a = p[3];
4329 stbi_uc t = p[0];
4330 if (a) {
4331 p[0] = p[2] * 255 / a;
4332 p[1] = p[1] * 255 / a;
4333 p[2] = t * 255 / a;
4334 } else {
4335 p[0] = p[2];
4336 p[2] = t;
4337 }
4338 p += 4;
4339 }
4340 } else {
4341 // convert bgr to rgb
4342 for (i=0; i < pixel_count; ++i) {
4343 stbi_uc t = p[0];
4344 p[0] = p[2];
4345 p[2] = t;
4346 p += 4;
4347 }
4348 }
4349 }
4350 }
4351
4352 #define STBI__PNG_TYPE(a,b,c,d) (((a) << 24) + ((b) << 16) + ((c) << 8) + (d))
4353
stbi__parse_png_file(stbi__png * z,int scan,int req_comp)4354 static int stbi__parse_png_file(stbi__png *z, int scan, int req_comp)
4355 {
4356 stbi_uc palette[1024], pal_img_n=0;
4357 stbi_uc has_trans=0, tc[3];
4358 stbi__uint32 ioff=0, idata_limit=0, i, pal_len=0;
4359 int first=1,k,interlace=0, color=0, depth=0, is_iphone=0;
4360 stbi__context *s = z->s;
4361
4362 z->expanded = NULL;
4363 z->idata = NULL;
4364 z->out = NULL;
4365
4366 if (!stbi__check_png_header(s)) return 0;
4367
4368 if (scan == STBI__SCAN_type) return 1;
4369
4370 for (;;) {
4371 stbi__pngchunk c = stbi__get_chunk_header(s);
4372 switch (c.type) {
4373 case STBI__PNG_TYPE('C','g','B','I'):
4374 is_iphone = 1;
4375 stbi__skip(s, c.length);
4376 break;
4377 case STBI__PNG_TYPE('I','H','D','R'): {
4378 int comp,filter;
4379 if (!first) return stbi__err("multiple IHDR","Corrupt PNG");
4380 first = 0;
4381 if (c.length != 13) return stbi__err("bad IHDR len","Corrupt PNG");
4382 s->img_x = stbi__get32be(s); if (s->img_x > (1 << 24)) return stbi__err("too large","Very large image (corrupt?)");
4383 s->img_y = stbi__get32be(s); if (s->img_y > (1 << 24)) return stbi__err("too large","Very large image (corrupt?)");
4384 depth = stbi__get8(s); if (depth != 1 && depth != 2 && depth != 4 && depth != 8) return stbi__err("1/2/4/8-bit only","PNG not supported: 1/2/4/8-bit only");
4385 color = stbi__get8(s); if (color > 6) return stbi__err("bad ctype","Corrupt PNG");
4386 if (color == 3) pal_img_n = 3; else if (color & 1) return stbi__err("bad ctype","Corrupt PNG");
4387 comp = stbi__get8(s); if (comp) return stbi__err("bad comp method","Corrupt PNG");
4388 filter= stbi__get8(s); if (filter) return stbi__err("bad filter method","Corrupt PNG");
4389 interlace = stbi__get8(s); if (interlace>1) return stbi__err("bad interlace method","Corrupt PNG");
4390 if (!s->img_x || !s->img_y) return stbi__err("0-pixel image","Corrupt PNG");
4391 if (!pal_img_n) {
4392 s->img_n = (color & 2 ? 3 : 1) + (color & 4 ? 1 : 0);
4393 if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");
4394 if (scan == STBI__SCAN_header) return 1;
4395 } else {
4396 // if paletted, then pal_n is our final components, and
4397 // img_n is # components to decompress/filter.
4398 s->img_n = 1;
4399 if ((1 << 30) / s->img_x / 4 < s->img_y) return stbi__err("too large","Corrupt PNG");
4400 // if SCAN_header, have to scan to see if we have a tRNS
4401 }
4402 break;
4403 }
4404
4405 case STBI__PNG_TYPE('P','L','T','E'): {
4406 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4407 if (c.length > 256*3) return stbi__err("invalid PLTE","Corrupt PNG");
4408 pal_len = c.length / 3;
4409 if (pal_len * 3 != c.length) return stbi__err("invalid PLTE","Corrupt PNG");
4410 for (i=0; i < pal_len; ++i) {
4411 palette[i*4+0] = stbi__get8(s);
4412 palette[i*4+1] = stbi__get8(s);
4413 palette[i*4+2] = stbi__get8(s);
4414 palette[i*4+3] = 255;
4415 }
4416 break;
4417 }
4418
4419 case STBI__PNG_TYPE('t','R','N','S'): {
4420 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4421 if (z->idata) return stbi__err("tRNS after IDAT","Corrupt PNG");
4422 if (pal_img_n) {
4423 if (scan == STBI__SCAN_header) { s->img_n = 4; return 1; }
4424 if (pal_len == 0) return stbi__err("tRNS before PLTE","Corrupt PNG");
4425 if (c.length > pal_len) return stbi__err("bad tRNS len","Corrupt PNG");
4426 pal_img_n = 4;
4427 for (i=0; i < c.length; ++i)
4428 palette[i*4+3] = stbi__get8(s);
4429 } else {
4430 if (!(s->img_n & 1)) return stbi__err("tRNS with alpha","Corrupt PNG");
4431 if (c.length != (stbi__uint32) s->img_n*2) return stbi__err("bad tRNS len","Corrupt PNG");
4432 has_trans = 1;
4433 for (k=0; k < s->img_n; ++k)
4434 tc[k] = (stbi_uc) (stbi__get16be(s) & 255) * stbi__depth_scale_table[depth]; // non 8-bit images will be larger
4435 }
4436 break;
4437 }
4438
4439 case STBI__PNG_TYPE('I','D','A','T'): {
4440 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4441 if (pal_img_n && !pal_len) return stbi__err("no PLTE","Corrupt PNG");
4442 if (scan == STBI__SCAN_header) { s->img_n = pal_img_n; return 1; }
4443 if ((int)(ioff + c.length) < (int)ioff) return 0;
4444 if (ioff + c.length > idata_limit) {
4445 stbi_uc *p;
4446 if (idata_limit == 0) idata_limit = c.length > 4096 ? c.length : 4096;
4447 while (ioff + c.length > idata_limit)
4448 idata_limit *= 2;
4449 p = (stbi_uc *) STBI_REALLOC(z->idata, idata_limit); if (p == NULL) return stbi__err("outofmem", "Out of memory");
4450 z->idata = p;
4451 }
4452 if (!stbi__getn(s, z->idata+ioff,c.length)) return stbi__err("outofdata","Corrupt PNG");
4453 ioff += c.length;
4454 break;
4455 }
4456
4457 case STBI__PNG_TYPE('I','E','N','D'): {
4458 stbi__uint32 raw_len, bpl;
4459 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4460 if (scan != STBI__SCAN_load) return 1;
4461 if (z->idata == NULL) return stbi__err("no IDAT","Corrupt PNG");
4462 if (depth > (INT_MAX - 7) / s->img_x)
4463 return stbi__err("Bad x","Bad x");
4464 // initial guess for decoded data size to avoid unnecessary reallocs
4465 bpl = (s->img_x * depth + 7) / 8; // bytes per line, per component
4466 if (bpl > (INT_MAX - s->img_y) / s->img_n / s->img_y)
4467 return stbi__err("Integer Overflow","y incorrect");
4468 raw_len = bpl * s->img_y * s->img_n /* pixels */ + s->img_y /* filter mode per row */;
4469 z->expanded = (stbi_uc *) stbi_zlib_decode_malloc_guesssize_headerflag((char *) z->idata, ioff, raw_len, (int *) &raw_len, !is_iphone);
4470 if (z->expanded == NULL) return 0; // zlib should set error
4471 STBI_FREE(z->idata); z->idata = NULL;
4472 if ((req_comp == s->img_n+1 && req_comp != 3 && !pal_img_n) || has_trans)
4473 s->img_out_n = s->img_n+1;
4474 else
4475 s->img_out_n = s->img_n;
4476 if (!stbi__create_png_image(z, z->expanded, raw_len, s->img_out_n, depth, color, interlace)) return 0;
4477 if (has_trans)
4478 if (!stbi__compute_transparency(z, tc, s->img_out_n)) return 0;
4479 if (is_iphone && stbi__de_iphone_flag && s->img_out_n > 2)
4480 stbi__de_iphone(z);
4481 if (pal_img_n) {
4482 // pal_img_n == 3 or 4
4483 s->img_n = pal_img_n; // record the actual colors we had
4484 s->img_out_n = pal_img_n;
4485 if (req_comp >= 3) s->img_out_n = req_comp;
4486 if (!stbi__expand_png_palette(z, palette, pal_len, s->img_out_n))
4487 return 0;
4488 }
4489 STBI_FREE(z->expanded); z->expanded = NULL;
4490 return 1;
4491 }
4492
4493 default:
4494 // if critical, fail
4495 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4496 if ((c.type & (1 << 29)) == 0) {
4497 #ifndef STBI_NO_FAILURE_STRINGS
4498 // not threadsafe
4499 static char invalid_chunk[] = "XXXX PNG chunk not known";
4500 invalid_chunk[0] = STBI__BYTECAST(c.type >> 24);
4501 invalid_chunk[1] = STBI__BYTECAST(c.type >> 16);
4502 invalid_chunk[2] = STBI__BYTECAST(c.type >> 8);
4503 invalid_chunk[3] = STBI__BYTECAST(c.type >> 0);
4504 #endif
4505 return stbi__err(invalid_chunk, "PNG not supported: unknown PNG chunk type");
4506 }
4507 stbi__skip(s, c.length);
4508 break;
4509 }
4510 // end of PNG chunk, read and skip CRC
4511 stbi__get32be(s);
4512 }
4513 }
4514
stbi__do_png(stbi__png * p,int * x,int * y,int * n,int req_comp)4515 static unsigned char *stbi__do_png(stbi__png *p, int *x, int *y, int *n, int req_comp)
4516 {
4517 unsigned char *result=NULL;
4518 if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
4519 if (stbi__parse_png_file(p, STBI__SCAN_load, req_comp)) {
4520 result = p->out;
4521 p->out = NULL;
4522 if (req_comp && req_comp != p->s->img_out_n) {
4523 result = stbi__convert_format(result, p->s->img_out_n, req_comp, p->s->img_x, p->s->img_y);
4524 p->s->img_out_n = req_comp;
4525 if (result == NULL) return result;
4526 }
4527 *x = p->s->img_x;
4528 *y = p->s->img_y;
4529 if (n) *n = p->s->img_out_n;
4530 }
4531 STBI_FREE(p->out); p->out = NULL;
4532 STBI_FREE(p->expanded); p->expanded = NULL;
4533 STBI_FREE(p->idata); p->idata = NULL;
4534
4535 return result;
4536 }
4537
stbi__png_load(stbi__context * s,int * x,int * y,int * comp,int req_comp)4538 static unsigned char *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
4539 {
4540 stbi__png p;
4541 p.s = s;
4542 return stbi__do_png(&p, x,y,comp,req_comp);
4543 }
4544
stbi__png_test(stbi__context * s)4545 static int stbi__png_test(stbi__context *s)
4546 {
4547 int r;
4548 r = stbi__check_png_header(s);
4549 stbi__rewind(s);
4550 return r;
4551 }
4552
stbi__png_info_raw(stbi__png * p,int * x,int * y,int * comp)4553 static int stbi__png_info_raw(stbi__png *p, int *x, int *y, int *comp)
4554 {
4555 if (!stbi__parse_png_file(p, STBI__SCAN_header, 0)) {
4556 stbi__rewind( p->s );
4557 return 0;
4558 }
4559 if (x) *x = p->s->img_x;
4560 if (y) *y = p->s->img_y;
4561 if (comp) *comp = p->s->img_n;
4562 return 1;
4563 }
4564
stbi__png_info(stbi__context * s,int * x,int * y,int * comp)4565 static int stbi__png_info(stbi__context *s, int *x, int *y, int *comp)
4566 {
4567 stbi__png p;
4568 p.s = s;
4569 return stbi__png_info_raw(&p, x, y, comp);
4570 }
4571 #endif
4572
4573 // Microsoft/Windows BMP image
4574
4575 #ifndef STBI_NO_BMP
stbi__bmp_test_raw(stbi__context * s)4576 static int stbi__bmp_test_raw(stbi__context *s)
4577 {
4578 int r;
4579 int sz;
4580 if (stbi__get8(s) != 'B') return 0;
4581 if (stbi__get8(s) != 'M') return 0;
4582 stbi__get32le(s); // discard filesize
4583 stbi__get16le(s); // discard reserved
4584 stbi__get16le(s); // discard reserved
4585 stbi__get32le(s); // discard data offset
4586 sz = stbi__get32le(s);
4587 r = (sz == 12 || sz == 40 || sz == 56 || sz == 108 || sz == 124);
4588 return r;
4589 }
4590
stbi__bmp_test(stbi__context * s)4591 static int stbi__bmp_test(stbi__context *s)
4592 {
4593 int r = stbi__bmp_test_raw(s);
4594 stbi__rewind(s);
4595 return r;
4596 }
4597
4598
4599 // returns 0..31 for the highest set bit
stbi__high_bit(unsigned int z)4600 static int stbi__high_bit(unsigned int z)
4601 {
4602 int n=0;
4603 if (z == 0) return -1;
4604 if (z >= 0x10000) n += 16, z >>= 16;
4605 if (z >= 0x00100) n += 8, z >>= 8;
4606 if (z >= 0x00010) n += 4, z >>= 4;
4607 if (z >= 0x00004) n += 2, z >>= 2;
4608 if (z >= 0x00002) n += 1, z >>= 1;
4609 return n;
4610 }
4611
stbi__bitcount(unsigned int a)4612 static int stbi__bitcount(unsigned int a)
4613 {
4614 a = (a & 0x55555555) + ((a >> 1) & 0x55555555); // max 2
4615 a = (a & 0x33333333) + ((a >> 2) & 0x33333333); // max 4
4616 a = (a + (a >> 4)) & 0x0f0f0f0f; // max 8 per 4, now 8 bits
4617 a = (a + (a >> 8)); // max 16 per 8 bits
4618 a = (a + (a >> 16)); // max 32 per 8 bits
4619 return a & 0xff;
4620 }
4621
stbi__shiftsigned(int v,int shift,int bits)4622 static int stbi__shiftsigned(int v, int shift, int bits)
4623 {
4624 int result;
4625 int z=0;
4626
4627 if (shift < 0) v <<= -shift;
4628 else v >>= shift;
4629 result = v;
4630
4631 z = bits;
4632 while (z < 8) {
4633 result += v >> z;
4634 z += bits;
4635 }
4636 return result;
4637 }
4638
stbi__bmp_load(stbi__context * s,int * x,int * y,int * comp,int req_comp)4639 static stbi_uc *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
4640 {
4641 stbi_uc *out;
4642 unsigned int mr=0,mg=0,mb=0,ma=0, all_a=255;
4643 stbi_uc pal[256][4];
4644 int psize=0,i,j,compress=0,width;
4645 int bpp, flip_vertically, pad, target, offset, hsz;
4646 if (stbi__get8(s) != 'B' || stbi__get8(s) != 'M') return stbi__errpuc("not BMP", "Corrupt BMP");
4647 stbi__get32le(s); // discard filesize
4648 stbi__get16le(s); // discard reserved
4649 stbi__get16le(s); // discard reserved
4650 offset = stbi__get32le(s);
4651 hsz = stbi__get32le(s);
4652 if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108 && hsz != 124) return stbi__errpuc("unknown BMP", "BMP type not supported: unknown");
4653 if (hsz == 12) {
4654 s->img_x = stbi__get16le(s);
4655 s->img_y = stbi__get16le(s);
4656 } else {
4657 s->img_x = stbi__get32le(s);
4658 s->img_y = stbi__get32le(s);
4659 }
4660 if (stbi__get16le(s) != 1) return stbi__errpuc("bad BMP", "bad BMP");
4661 bpp = stbi__get16le(s);
4662 if (bpp == 1) return stbi__errpuc("monochrome", "BMP type not supported: 1-bit");
4663 flip_vertically = ((int) s->img_y) > 0;
4664 s->img_y = abs((int) s->img_y);
4665 if (hsz == 12) {
4666 if (bpp < 24)
4667 psize = (offset - 14 - 24) / 3;
4668 } else {
4669 compress = stbi__get32le(s);
4670 if (compress == 1 || compress == 2) return stbi__errpuc("BMP RLE", "BMP type not supported: RLE");
4671 stbi__get32le(s); // discard sizeof
4672 stbi__get32le(s); // discard hres
4673 stbi__get32le(s); // discard vres
4674 stbi__get32le(s); // discard colorsused
4675 stbi__get32le(s); // discard max important
4676 if (hsz == 40 || hsz == 56) {
4677 if (hsz == 56) {
4678 stbi__get32le(s);
4679 stbi__get32le(s);
4680 stbi__get32le(s);
4681 stbi__get32le(s);
4682 }
4683 if (bpp == 16 || bpp == 32) {
4684 mr = mg = mb = 0;
4685 if (compress == 0) {
4686 if (bpp == 32) {
4687 mr = 0xffu << 16;
4688 mg = 0xffu << 8;
4689 mb = 0xffu << 0;
4690 ma = 0xffu << 24;
4691 all_a = 0; // if all_a is 0 at end, then we loaded alpha channel but it was all 0
4692 } else {
4693 mr = 31u << 10;
4694 mg = 31u << 5;
4695 mb = 31u << 0;
4696 }
4697 } else if (compress == 3) {
4698 mr = stbi__get32le(s);
4699 mg = stbi__get32le(s);
4700 mb = stbi__get32le(s);
4701 // not documented, but generated by photoshop and handled by mspaint
4702 if (mr == mg && mg == mb) {
4703 // ?!?!?
4704 return stbi__errpuc("bad BMP", "bad BMP");
4705 }
4706 } else
4707 return stbi__errpuc("bad BMP", "bad BMP");
4708 }
4709 } else {
4710 STBI_ASSERT(hsz == 108 || hsz == 124);
4711 mr = stbi__get32le(s);
4712 mg = stbi__get32le(s);
4713 mb = stbi__get32le(s);
4714 ma = stbi__get32le(s);
4715 stbi__get32le(s); // discard color space
4716 for (i=0; i < 12; ++i)
4717 stbi__get32le(s); // discard color space parameters
4718 if (hsz == 124) {
4719 stbi__get32le(s); // discard rendering intent
4720 stbi__get32le(s); // discard offset of profile data
4721 stbi__get32le(s); // discard size of profile data
4722 stbi__get32le(s); // discard reserved
4723 }
4724 }
4725 if (bpp < 16)
4726 psize = (offset - 14 - hsz) >> 2;
4727 }
4728 s->img_n = ma ? 4 : 3;
4729 if (req_comp && req_comp >= 3) // we can directly decode 3 or 4
4730 target = req_comp;
4731 else
4732 target = s->img_n; // if they want monochrome, we'll post-convert
4733 if (s->img_x == 0 || s->img_y == 0 || target <= 0 || target > (INT_MAX / s->img_x / s->img_y))
4734 return stbi__errpuc("Integer Overflow", "x or y incorrect");
4735 out = (stbi_uc *) stbi__malloc(target * s->img_x * s->img_y);
4736 if (!out) return stbi__errpuc("outofmem", "Out of memory");
4737 if (bpp < 16) {
4738 int z=0;
4739 if (psize == 0 || psize > 256) { STBI_FREE(out); return stbi__errpuc("invalid", "Corrupt BMP"); }
4740 for (i=0; i < psize; ++i) {
4741 pal[i][2] = stbi__get8(s);
4742 pal[i][1] = stbi__get8(s);
4743 pal[i][0] = stbi__get8(s);
4744 if (hsz != 12) stbi__get8(s);
4745 pal[i][3] = 255;
4746 }
4747 stbi__skip(s, offset - 14 - hsz - psize * (hsz == 12 ? 3 : 4));
4748 if (bpp == 4) width = (s->img_x + 1) >> 1;
4749 else if (bpp == 8) width = s->img_x;
4750 else { STBI_FREE(out); return stbi__errpuc("bad bpp", "Corrupt BMP"); }
4751 pad = (-width)&3;
4752 for (j=0; j < (int) s->img_y; ++j) {
4753 for (i=0; i < (int) s->img_x; i += 2) {
4754 int v=stbi__get8(s),v2=0;
4755 if (bpp == 4) {
4756 v2 = v & 15;
4757 v >>= 4;
4758 }
4759 out[z++] = pal[v][0];
4760 out[z++] = pal[v][1];
4761 out[z++] = pal[v][2];
4762 if (target == 4) out[z++] = 255;
4763 if (i+1 == (int) s->img_x) break;
4764 v = (bpp == 8) ? stbi__get8(s) : v2;
4765 out[z++] = pal[v][0];
4766 out[z++] = pal[v][1];
4767 out[z++] = pal[v][2];
4768 if (target == 4) out[z++] = 255;
4769 }
4770 stbi__skip(s, pad);
4771 }
4772 } else {
4773 int rshift=0,gshift=0,bshift=0,ashift=0,rcount=0,gcount=0,bcount=0,acount=0;
4774 int z = 0;
4775 int easy=0;
4776 stbi__skip(s, offset - 14 - hsz);
4777 if (bpp == 24) width = 3 * s->img_x;
4778 else if (bpp == 16) width = 2*s->img_x;
4779 else /* bpp = 32 and pad = 0 */ width=0;
4780 pad = (-width) & 3;
4781 if (bpp == 24) {
4782 easy = 1;
4783 } else if (bpp == 32) {
4784 if (mb == 0xff && mg == 0xff00 && mr == 0x00ff0000 && ma == 0xff000000)
4785 easy = 2;
4786 }
4787 if (!easy) {
4788 if (!mr || !mg || !mb) { STBI_FREE(out); return stbi__errpuc("bad masks", "Corrupt BMP"); }
4789 // right shift amt to put high bit in position #7
4790 rshift = stbi__high_bit(mr)-7; rcount = stbi__bitcount(mr);
4791 gshift = stbi__high_bit(mg)-7; gcount = stbi__bitcount(mg);
4792 bshift = stbi__high_bit(mb)-7; bcount = stbi__bitcount(mb);
4793 ashift = stbi__high_bit(ma)-7; acount = stbi__bitcount(ma);
4794 }
4795 for (j=0; j < (int) s->img_y; ++j) {
4796 if (easy) {
4797 for (i=0; i < (int) s->img_x; ++i) {
4798 unsigned char a;
4799 out[z+2] = stbi__get8(s);
4800 out[z+1] = stbi__get8(s);
4801 out[z+0] = stbi__get8(s);
4802 z += 3;
4803 a = (easy == 2 ? stbi__get8(s) : 255);
4804 all_a |= a;
4805 if (target == 4) out[z++] = a;
4806 }
4807 } else {
4808 for (i=0; i < (int) s->img_x; ++i) {
4809 stbi__uint32 v = (bpp == 16 ? (stbi__uint32) stbi__get16le(s) : stbi__get32le(s));
4810 int a;
4811 out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mr, rshift, rcount));
4812 out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mg, gshift, gcount));
4813 out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mb, bshift, bcount));
4814 a = (ma ? stbi__shiftsigned(v & ma, ashift, acount) : 255);
4815 all_a |= a;
4816 if (target == 4) out[z++] = STBI__BYTECAST(a);
4817 }
4818 }
4819 stbi__skip(s, pad);
4820 }
4821 }
4822
4823 // if alpha channel is all 0s, replace with all 255s
4824 if (target == 4 && all_a == 0)
4825 for (i=4*s->img_x*s->img_y-1; i >= 0; i -= 4)
4826 out[i] = 255;
4827
4828 if (flip_vertically) {
4829 stbi_uc t;
4830 for (j=0; j < (int) s->img_y>>1; ++j) {
4831 stbi_uc *p1 = out + j *s->img_x*target;
4832 stbi_uc *p2 = out + (s->img_y-1-j)*s->img_x*target;
4833 for (i=0; i < (int) s->img_x*target; ++i) {
4834 t = p1[i], p1[i] = p2[i], p2[i] = t;
4835 }
4836 }
4837 }
4838
4839 if (req_comp && req_comp != target) {
4840 out = stbi__convert_format(out, target, req_comp, s->img_x, s->img_y);
4841 if (out == NULL) return out; // stbi__convert_format frees input on failure
4842 }
4843
4844 *x = s->img_x;
4845 *y = s->img_y;
4846 if (comp) *comp = s->img_n;
4847 return out;
4848 }
4849 #endif
4850
4851 // Targa Truevision - TGA
4852 // by Jonathan Dummer
4853 #ifndef STBI_NO_TGA
stbi__tga_info(stbi__context * s,int * x,int * y,int * comp)4854 static int stbi__tga_info(stbi__context *s, int *x, int *y, int *comp)
4855 {
4856 int tga_w, tga_h, tga_comp;
4857 int sz;
4858 stbi__get8(s); // discard Offset
4859 sz = stbi__get8(s); // color type
4860 if( sz > 1 ) {
4861 stbi__rewind(s);
4862 return 0; // only RGB or indexed allowed
4863 }
4864 sz = stbi__get8(s); // image type
4865 // only RGB or grey allowed, +/- RLE
4866 if ((sz != 1) && (sz != 2) && (sz != 3) && (sz != 9) && (sz != 10) && (sz != 11)) return 0;
4867 stbi__skip(s,9);
4868 tga_w = stbi__get16le(s);
4869 if( tga_w < 1 ) {
4870 stbi__rewind(s);
4871 return 0; // test width
4872 }
4873 tga_h = stbi__get16le(s);
4874 if( tga_h < 1 ) {
4875 stbi__rewind(s);
4876 return 0; // test height
4877 }
4878 sz = stbi__get8(s); // bits per pixel
4879 // only RGB or RGBA or grey allowed
4880 if ((sz != 8) && (sz != 16) && (sz != 24) && (sz != 32)) {
4881 stbi__rewind(s);
4882 return 0;
4883 }
4884 tga_comp = sz;
4885 if (x) *x = tga_w;
4886 if (y) *y = tga_h;
4887 if (comp) *comp = tga_comp / 8;
4888 return 1; // seems to have passed everything
4889 }
4890
stbi__tga_test(stbi__context * s)4891 static int stbi__tga_test(stbi__context *s)
4892 {
4893 int res;
4894 int sz;
4895 stbi__get8(s); // discard Offset
4896 sz = stbi__get8(s); // color type
4897 if ( sz > 1 ) return 0; // only RGB or indexed allowed
4898 sz = stbi__get8(s); // image type
4899 if ( (sz != 1) && (sz != 2) && (sz != 3) && (sz != 9) && (sz != 10) && (sz != 11) ) return 0; // only RGB or grey allowed, +/- RLE
4900 stbi__get16be(s); // discard palette start
4901 stbi__get16be(s); // discard palette length
4902 stbi__get8(s); // discard bits per palette color entry
4903 stbi__get16be(s); // discard x origin
4904 stbi__get16be(s); // discard y origin
4905 if ( stbi__get16be(s) < 1 ) return 0; // test width
4906 if ( stbi__get16be(s) < 1 ) return 0; // test height
4907 sz = stbi__get8(s); // bits per pixel
4908 if ( (sz != 8) && (sz != 16) && (sz != 24) && (sz != 32) )
4909 res = 0;
4910 else
4911 res = 1;
4912 stbi__rewind(s);
4913 return res;
4914 }
4915
stbi__tga_load(stbi__context * s,int * x,int * y,int * comp,int req_comp)4916 static stbi_uc *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
4917 {
4918 // read in the TGA header stuff
4919 int tga_offset = stbi__get8(s);
4920 int tga_indexed = stbi__get8(s);
4921 int tga_image_type = stbi__get8(s);
4922 int tga_is_RLE = 0;
4923 int tga_palette_start = stbi__get16le(s);
4924 int tga_palette_len = stbi__get16le(s);
4925 int tga_palette_bits = stbi__get8(s);
4926 int tga_x_origin = stbi__get16le(s);
4927 int tga_y_origin = stbi__get16le(s);
4928 int tga_width = stbi__get16le(s);
4929 int tga_height = stbi__get16le(s);
4930 int tga_bits_per_pixel = stbi__get8(s);
4931 int tga_comp = tga_bits_per_pixel / 8;
4932 int tga_inverted = stbi__get8(s);
4933 // image data
4934 unsigned char *tga_data;
4935 unsigned char *tga_palette = NULL;
4936 int i, j;
4937 unsigned char raw_data[4];
4938 int RLE_count = 0;
4939 int RLE_repeating = 0;
4940 int read_next_pixel = 1;
4941
4942 // do a tiny bit of precessing
4943 if ( tga_image_type >= 8 )
4944 {
4945 tga_image_type -= 8;
4946 tga_is_RLE = 1;
4947 }
4948 /* int tga_alpha_bits = tga_inverted & 15; */
4949 tga_inverted = 1 - ((tga_inverted >> 5) & 1);
4950
4951 // error check
4952 if ( //(tga_indexed) ||
4953 (tga_width < 1) || (tga_height < 1) ||
4954 (tga_image_type < 1) || (tga_image_type > 3) ||
4955 ((tga_bits_per_pixel != 8) && (tga_bits_per_pixel != 16) &&
4956 (tga_bits_per_pixel != 24) && (tga_bits_per_pixel != 32))
4957 )
4958 {
4959 return NULL; // we don't report this as a bad TGA because we don't even know if it's TGA
4960 }
4961
4962 // If I'm paletted, then I'll use the number of bits from the palette
4963 if ( tga_indexed )
4964 {
4965 tga_comp = tga_palette_bits / 8;
4966 }
4967
4968 // tga info
4969 *x = tga_width;
4970 *y = tga_height;
4971 if (comp) *comp = tga_comp;
4972
4973 if(tga_width <= 0 || tga_height <= 0 || tga_comp <= 0 ||
4974 (tga_comp > INT_MAX / tga_width / tga_height))
4975 return stbi__errpuc("Integer Overflow", "TGA image width or height is too large");
4976
4977 tga_data = (unsigned char*)stbi__malloc( (size_t)tga_width * tga_height * tga_comp );
4978 if (!tga_data) return stbi__errpuc("outofmem", "Out of memory");
4979
4980 // skip to the data's starting position (offset usually = 0)
4981 stbi__skip(s, tga_offset );
4982
4983 if ( !tga_indexed && !tga_is_RLE) {
4984 for (i=0; i < tga_height; ++i) {
4985 int row = tga_inverted ? tga_height -i - 1 : i;
4986 stbi_uc *tga_row = tga_data + row*tga_width*tga_comp;
4987 stbi__getn(s, tga_row, tga_width * tga_comp);
4988 }
4989 } else {
4990 // do I need to load a palette?
4991 if ( tga_indexed)
4992 {
4993 // any data to skip? (offset usually = 0)
4994 stbi__skip(s, tga_palette_start );
4995 // load the palette
4996 tga_palette = (unsigned char*)stbi__malloc( tga_palette_len * tga_palette_bits / 8 );
4997 if (!tga_palette) {
4998 STBI_FREE(tga_data);
4999 return stbi__errpuc("outofmem", "Out of memory");
5000 }
5001 if (!stbi__getn(s, tga_palette, tga_palette_len * tga_palette_bits / 8 )) {
5002 STBI_FREE(tga_data);
5003 STBI_FREE(tga_palette);
5004 return stbi__errpuc("bad palette", "Corrupt TGA");
5005 }
5006 }
5007 // load the data
5008 for (i=0; i < tga_width * tga_height; ++i)
5009 {
5010 // if I'm in RLE mode, do I need to get a RLE stbi__pngchunk?
5011 if ( tga_is_RLE )
5012 {
5013 if ( RLE_count == 0 )
5014 {
5015 // yep, get the next byte as a RLE command
5016 int RLE_cmd = stbi__get8(s);
5017 RLE_count = 1 + (RLE_cmd & 127);
5018 RLE_repeating = RLE_cmd >> 7;
5019 read_next_pixel = 1;
5020 } else if ( !RLE_repeating )
5021 {
5022 read_next_pixel = 1;
5023 }
5024 } else
5025 {
5026 read_next_pixel = 1;
5027 }
5028 // OK, if I need to read a pixel, do it now
5029 if ( read_next_pixel )
5030 {
5031 // load however much data we did have
5032 if ( tga_indexed )
5033 {
5034 // read in 1 byte, then perform the lookup
5035 int pal_idx = stbi__get8(s);
5036 if ( pal_idx >= tga_palette_len )
5037 {
5038 // invalid index
5039 pal_idx = 0;
5040 }
5041 pal_idx *= tga_bits_per_pixel / 8;
5042 for (j = 0; j*8 < tga_bits_per_pixel; ++j)
5043 {
5044 raw_data[j] = tga_palette[pal_idx+j];
5045 }
5046 } else
5047 {
5048 // read in the data raw
5049 for (j = 0; j*8 < tga_bits_per_pixel; ++j)
5050 {
5051 raw_data[j] = stbi__get8(s);
5052 }
5053 }
5054 // clear the reading flag for the next pixel
5055 read_next_pixel = 0;
5056 } // end of reading a pixel
5057
5058 // copy data
5059 for (j = 0; j < tga_comp; ++j)
5060 tga_data[i*tga_comp+j] = raw_data[j];
5061
5062 // in case we're in RLE mode, keep counting down
5063 --RLE_count;
5064 }
5065 // do I need to invert the image?
5066 if ( tga_inverted )
5067 {
5068 for (j = 0; j*2 < tga_height; ++j)
5069 {
5070 int index1 = j * tga_width * tga_comp;
5071 int index2 = (tga_height - 1 - j) * tga_width * tga_comp;
5072 for (i = tga_width * tga_comp; i > 0; --i)
5073 {
5074 unsigned char temp = tga_data[index1];
5075 tga_data[index1] = tga_data[index2];
5076 tga_data[index2] = temp;
5077 ++index1;
5078 ++index2;
5079 }
5080 }
5081 }
5082 // clear my palette, if I had one
5083 if ( tga_palette != NULL )
5084 {
5085 STBI_FREE( tga_palette );
5086 }
5087 }
5088
5089 // swap RGB
5090 if (tga_comp >= 3)
5091 {
5092 unsigned char* tga_pixel = tga_data;
5093 for (i=0; i < tga_width * tga_height; ++i)
5094 {
5095 unsigned char temp = tga_pixel[0];
5096 tga_pixel[0] = tga_pixel[2];
5097 tga_pixel[2] = temp;
5098 tga_pixel += tga_comp;
5099 }
5100 }
5101
5102 // convert to target component count
5103 if (req_comp && req_comp != tga_comp)
5104 tga_data = stbi__convert_format(tga_data, tga_comp, req_comp, tga_width, tga_height);
5105
5106 // the things I do to get rid of an error message, and yet keep
5107 // Microsoft's C compilers happy... [8^(
5108 tga_palette_start = tga_palette_len = tga_palette_bits =
5109 tga_x_origin = tga_y_origin = 0;
5110 // OK, done
5111 return tga_data;
5112 }
5113 #endif
5114
5115 // *************************************************************************************************
5116 // Photoshop PSD loader -- PD by Thatcher Ulrich, integration by Nicolas Schulz, tweaked by STB
5117
5118 #ifndef STBI_NO_PSD
stbi__psd_test(stbi__context * s)5119 static int stbi__psd_test(stbi__context *s)
5120 {
5121 int r = (stbi__get32be(s) == 0x38425053);
5122 stbi__rewind(s);
5123 return r;
5124 }
5125
stbi__psd_load(stbi__context * s,int * x,int * y,int * comp,int req_comp)5126 static stbi_uc *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
5127 {
5128 int pixelCount;
5129 int channelCount, compression;
5130 int channel, i, count, len;
5131 int bitdepth;
5132 int w,h;
5133 stbi_uc *out;
5134
5135 // Check identifier
5136 if (stbi__get32be(s) != 0x38425053) // "8BPS"
5137 return stbi__errpuc("not PSD", "Corrupt PSD image");
5138
5139 // Check file type version.
5140 if (stbi__get16be(s) != 1)
5141 return stbi__errpuc("wrong version", "Unsupported version of PSD image");
5142
5143 // Skip 6 reserved bytes.
5144 stbi__skip(s, 6 );
5145
5146 // Read the number of channels (R, G, B, A, etc).
5147 channelCount = stbi__get16be(s);
5148 if (channelCount < 0 || channelCount > 16)
5149 return stbi__errpuc("wrong channel count", "Unsupported number of channels in PSD image");
5150
5151 // Read the rows and columns of the image.
5152 h = stbi__get32be(s);
5153 w = stbi__get32be(s);
5154
5155 // Make sure the depth is 8 bits.
5156 bitdepth = stbi__get16be(s);
5157 if (bitdepth != 8 && bitdepth != 16)
5158 return stbi__errpuc("unsupported bit depth", "PSD bit depth is not 8 or 16 bit");
5159
5160 // Make sure the color mode is RGB.
5161 // Valid options are:
5162 // 0: Bitmap
5163 // 1: Grayscale
5164 // 2: Indexed color
5165 // 3: RGB color
5166 // 4: CMYK color
5167 // 7: Multichannel
5168 // 8: Duotone
5169 // 9: Lab color
5170 if (stbi__get16be(s) != 3)
5171 return stbi__errpuc("wrong color format", "PSD is not in RGB color format");
5172
5173 // Skip the Mode Data. (It's the palette for indexed color; other info for other modes.)
5174 stbi__skip(s,stbi__get32be(s) );
5175
5176 // Skip the image resources. (resolution, pen tool paths, etc)
5177 stbi__skip(s, stbi__get32be(s) );
5178
5179 // Skip the reserved data.
5180 stbi__skip(s, stbi__get32be(s) );
5181
5182 // Find out if the data is compressed.
5183 // Known values:
5184 // 0: no compression
5185 // 1: RLE compressed
5186 compression = stbi__get16be(s);
5187 if (compression > 1)
5188 return stbi__errpuc("bad compression", "PSD has an unknown compression format");
5189
5190 // Create the destination image.
5191 if (w <= 0 || h <= 0 ||
5192 (4 > (INT_MAX / w / h)))
5193 return stbi__errpuc("Integer Overflow", "w or h incorrect");
5194 out = (stbi_uc *) stbi__malloc(4 * w*h);
5195 if (!out) return stbi__errpuc("outofmem", "Out of memory");
5196 pixelCount = w*h;
5197
5198 // Initialize the data to zero.
5199 //memset( out, 0, pixelCount * 4 );
5200
5201 // Finally, the image data.
5202 if (compression) {
5203 // RLE as used by .PSD and .TIFF
5204 // Loop until you get the number of unpacked bytes you are expecting:
5205 // Read the next source byte into n.
5206 // If n is between 0 and 127 inclusive, copy the next n+1 bytes literally.
5207 // Else if n is between -127 and -1 inclusive, copy the next byte -n+1 times.
5208 // Else if n is 128, noop.
5209 // Endloop
5210
5211 // The RLE-compressed data is preceeded by a 2-byte data count for each row in the data,
5212 // which we're going to just skip.
5213 stbi__skip(s, h * channelCount * 2 );
5214
5215 // Read the RLE data by channel.
5216 for (channel = 0; channel < 4; channel++) {
5217 stbi_uc *p;
5218
5219 p = out+channel;
5220 if (channel >= channelCount) {
5221 // Fill this channel with default data.
5222 for (i = 0; i < pixelCount; i++, p += 4)
5223 *p = (channel == 3 ? 255 : 0);
5224 } else {
5225 // Read the RLE data.
5226 count = 0;
5227 while (count < pixelCount) {
5228 len = stbi__get8(s);
5229 if (len == 128) {
5230 // No-op.
5231 } else if (len < 128) {
5232 // Copy next len+1 bytes literally.
5233 len++;
5234 if (len >= pixelCount - count) {
5235 STBI_FREE(out);
5236 return stbi__errpuc("corruptfile", "Corrupt PSD file");
5237 }
5238 count += len;
5239 while (len) {
5240 *p = stbi__get8(s);
5241 p += 4;
5242 len--;
5243 }
5244 } else if (len > 128) {
5245 stbi_uc val;
5246 // Next -len+1 bytes in the dest are replicated from next source byte.
5247 // (Interpret len as a negative 8-bit int.)
5248 len ^= 0x0FF;
5249 len += 2;
5250 val = stbi__get8(s);
5251 if (len >= pixelCount - count) {
5252 STBI_FREE(out);
5253 return stbi__errpuc("corruptfile", "Corrupt PSD file");
5254 }
5255 count += len;
5256 while (len) {
5257 *p = val;
5258 p += 4;
5259 len--;
5260 }
5261 }
5262 }
5263 }
5264 }
5265
5266 } else {
5267 // We're at the raw image data. It's each channel in order (Red, Green, Blue, Alpha, ...)
5268 // where each channel consists of an 8-bit value for each pixel in the image.
5269
5270 // Read the data by channel.
5271 for (channel = 0; channel < 4; channel++) {
5272 stbi_uc *p;
5273
5274 p = out + channel;
5275 if (channel >= channelCount) {
5276 // Fill this channel with default data.
5277 stbi_uc val = channel == 3 ? 255 : 0;
5278 for (i = 0; i < pixelCount; i++, p += 4)
5279 *p = val;
5280 } else {
5281 // Read the data.
5282 if (bitdepth == 16) {
5283 for (i = 0; i < pixelCount; i++, p += 4)
5284 *p = (stbi_uc) (stbi__get16be(s) >> 8);
5285 } else {
5286 for (i = 0; i < pixelCount; i++, p += 4)
5287 *p = stbi__get8(s);
5288 }
5289 }
5290 }
5291 }
5292
5293 if (req_comp && req_comp != 4) {
5294 out = stbi__convert_format(out, 4, req_comp, w, h);
5295 if (out == NULL) return out; // stbi__convert_format frees input on failure
5296 }
5297
5298 if (comp) *comp = 4;
5299 *y = h;
5300 *x = w;
5301
5302 return out;
5303 }
5304 #endif
5305
5306 // *************************************************************************************************
5307 // Softimage PIC loader
5308 // by Tom Seddon
5309 //
5310 // See http://softimage.wiki.softimage.com/index.php/INFO:_PIC_file_format
5311 // See http://ozviz.wasp.uwa.edu.au/~pbourke/dataformats/softimagepic/
5312
5313 #ifndef STBI_NO_PIC
stbi__pic_is4(stbi__context * s,const char * str)5314 static int stbi__pic_is4(stbi__context *s,const char *str)
5315 {
5316 int i;
5317 for (i=0; i<4; ++i)
5318 if (stbi__get8(s) != (stbi_uc)str[i])
5319 return 0;
5320
5321 return 1;
5322 }
5323
stbi__pic_test_core(stbi__context * s)5324 static int stbi__pic_test_core(stbi__context *s)
5325 {
5326 int i;
5327
5328 if (!stbi__pic_is4(s,"\x53\x80\xF6\x34"))
5329 return 0;
5330
5331 for(i=0;i<84;++i)
5332 stbi__get8(s);
5333
5334 if (!stbi__pic_is4(s,"PICT"))
5335 return 0;
5336
5337 return 1;
5338 }
5339
5340 typedef struct
5341 {
5342 stbi_uc size,type,channel;
5343 } stbi__pic_packet;
5344
stbi__readval(stbi__context * s,int channel,stbi_uc * dest)5345 static stbi_uc *stbi__readval(stbi__context *s, int channel, stbi_uc *dest)
5346 {
5347 int mask=0x80, i;
5348
5349 for (i=0; i<4; ++i, mask>>=1) {
5350 if (channel & mask) {
5351 if (stbi__at_eof(s)) return stbi__errpuc("bad file","PIC file too short");
5352 dest[i]=stbi__get8(s);
5353 }
5354 }
5355
5356 return dest;
5357 }
5358
stbi__copyval(int channel,stbi_uc * dest,const stbi_uc * src)5359 static void stbi__copyval(int channel,stbi_uc *dest,const stbi_uc *src)
5360 {
5361 int mask=0x80,i;
5362
5363 for (i=0;i<4; ++i, mask>>=1)
5364 if (channel&mask)
5365 dest[i]=src[i];
5366 }
5367
stbi__pic_load_core(stbi__context * s,int width,int height,int * comp,stbi_uc * result)5368 static stbi_uc *stbi__pic_load_core(stbi__context *s,int width,int height,int *comp, stbi_uc *result)
5369 {
5370 int act_comp=0,num_packets=0,y,chained;
5371 stbi__pic_packet packets[10];
5372
5373 // this will (should...) cater for even some bizarre stuff like having data
5374 // for the same channel in multiple packets.
5375 do {
5376 stbi__pic_packet *packet;
5377
5378 if (num_packets==sizeof(packets)/sizeof(packets[0]))
5379 return stbi__errpuc("bad format","too many packets");
5380
5381 packet = &packets[num_packets++];
5382
5383 chained = stbi__get8(s);
5384 packet->size = stbi__get8(s);
5385 packet->type = stbi__get8(s);
5386 packet->channel = stbi__get8(s);
5387
5388 act_comp |= packet->channel;
5389
5390 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (reading packets)");
5391 if (packet->size != 8) return stbi__errpuc("bad format","packet isn't 8bpp");
5392 } while (chained);
5393
5394 *comp = (act_comp & 0x10 ? 4 : 3); // has alpha channel?
5395
5396 for(y=0; y<height; ++y) {
5397 int packet_idx;
5398
5399 for(packet_idx=0; packet_idx < num_packets; ++packet_idx) {
5400 stbi__pic_packet *packet = &packets[packet_idx];
5401 stbi_uc *dest = result+y*width*4;
5402
5403 switch (packet->type) {
5404 default:
5405 return stbi__errpuc("bad format","packet has bad compression type");
5406
5407 case 0: {//uncompressed
5408 int x;
5409
5410 for(x=0;x<width;++x, dest+=4)
5411 if (!stbi__readval(s,packet->channel,dest))
5412 return 0;
5413 break;
5414 }
5415
5416 case 1://Pure RLE
5417 {
5418 int left=width, i;
5419
5420 while (left>0) {
5421 stbi_uc count,value[4];
5422
5423 count=stbi__get8(s);
5424 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (pure read count)");
5425
5426 if (count > left)
5427 count = (stbi_uc) left;
5428
5429 if (!stbi__readval(s,packet->channel,value)) return 0;
5430
5431 for(i=0; i<count; ++i,dest+=4)
5432 stbi__copyval(packet->channel,dest,value);
5433 left -= count;
5434 }
5435 }
5436 break;
5437
5438 case 2: {//Mixed RLE
5439 int left=width;
5440 while (left>0) {
5441 int count = stbi__get8(s), i;
5442 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (mixed read count)");
5443
5444 if (count >= 128) { // Repeated
5445 stbi_uc value[4];
5446
5447 if (count==128)
5448 count = stbi__get16be(s);
5449 else
5450 count -= 127;
5451 if (count > left)
5452 return stbi__errpuc("bad file","scanline overrun");
5453
5454 if (!stbi__readval(s,packet->channel,value))
5455 return 0;
5456
5457 for(i=0;i<count;++i, dest += 4)
5458 stbi__copyval(packet->channel,dest,value);
5459 } else { // Raw
5460 ++count;
5461 if (count>left) return stbi__errpuc("bad file","scanline overrun");
5462
5463 for(i=0;i<count;++i, dest+=4)
5464 if (!stbi__readval(s,packet->channel,dest))
5465 return 0;
5466 }
5467 left-=count;
5468 }
5469 break;
5470 }
5471 }
5472 }
5473 }
5474
5475 return result;
5476 }
5477
stbi__pic_load(stbi__context * s,int * px,int * py,int * comp,int req_comp)5478 static stbi_uc *stbi__pic_load(stbi__context *s,int *px,int *py,int *comp,int req_comp)
5479 {
5480 stbi_uc *result;
5481 int i, x,y;
5482
5483 for (i=0; i<92; ++i)
5484 stbi__get8(s);
5485
5486 x = stbi__get16be(s);
5487 y = stbi__get16be(s);
5488 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (pic header)");
5489 if ((1 << 28) / x < y) return stbi__errpuc("too large", "Image too large to decode");
5490
5491 stbi__get32be(s); //skip `ratio'
5492 stbi__get16be(s); //skip `fields'
5493 stbi__get16be(s); //skip `pad'
5494
5495 if (x <= 0 || y <= 0 ||
5496 (4 > (INT_MAX / x / y)))
5497 return stbi__errpuc("Integer Overflow", "x or y incorrect");
5498 // intermediate buffer is RGBA
5499 result = (stbi_uc *) stbi__malloc(x*y*4);
5500 if(result == NULL) return stbi__errpuc("outofmem", "Out of memory");
5501 memset(result, 0xff, x*y*4);
5502
5503 if (!stbi__pic_load_core(s,x,y,comp, result)) {
5504 STBI_FREE(result);
5505 result=0;
5506 }
5507 *px = x;
5508 *py = y;
5509 if (req_comp == 0) req_comp = *comp;
5510 result=stbi__convert_format(result,4,req_comp,x,y);
5511
5512 return result;
5513 }
5514
stbi__pic_test(stbi__context * s)5515 static int stbi__pic_test(stbi__context *s)
5516 {
5517 int r = stbi__pic_test_core(s);
5518 stbi__rewind(s);
5519 return r;
5520 }
5521 #endif
5522
5523 // *************************************************************************************************
5524 // GIF loader -- public domain by Jean-Marc Lienher -- simplified/shrunk by stb
5525
5526 #ifndef STBI_NO_GIF
5527 typedef struct
5528 {
5529 stbi__int16 prefix;
5530 stbi_uc first;
5531 stbi_uc suffix;
5532 } stbi__gif_lzw;
5533
5534 typedef struct
5535 {
5536 int w,h;
5537 stbi_uc *out, *old_out; // output buffer (always 4 components)
5538 int flags, bgindex, ratio, transparent, eflags, delay;
5539 stbi_uc pal[256][4];
5540 stbi_uc lpal[256][4];
5541 stbi__gif_lzw codes[4096];
5542 stbi_uc *color_table;
5543 int parse, step;
5544 int lflags;
5545 int start_x, start_y;
5546 int max_x, max_y;
5547 int cur_x, cur_y;
5548 int line_size;
5549 } stbi__gif;
5550
stbi__gif_test_raw(stbi__context * s)5551 static int stbi__gif_test_raw(stbi__context *s)
5552 {
5553 int sz;
5554 if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8') return 0;
5555 sz = stbi__get8(s);
5556 if (sz != '9' && sz != '7') return 0;
5557 if (stbi__get8(s) != 'a') return 0;
5558 return 1;
5559 }
5560
stbi__gif_test(stbi__context * s)5561 static int stbi__gif_test(stbi__context *s)
5562 {
5563 int r = stbi__gif_test_raw(s);
5564 stbi__rewind(s);
5565 return r;
5566 }
5567
stbi__gif_parse_colortable(stbi__context * s,stbi_uc pal[256][4],int num_entries,int transp)5568 static void stbi__gif_parse_colortable(stbi__context *s, stbi_uc pal[256][4], int num_entries, int transp)
5569 {
5570 int i;
5571 for (i=0; i < num_entries; ++i) {
5572 pal[i][2] = stbi__get8(s);
5573 pal[i][1] = stbi__get8(s);
5574 pal[i][0] = stbi__get8(s);
5575 pal[i][3] = transp == i ? 0 : 255;
5576 }
5577 }
5578
stbi__gif_header(stbi__context * s,stbi__gif * g,int * comp,int is_info)5579 static int stbi__gif_header(stbi__context *s, stbi__gif *g, int *comp, int is_info)
5580 {
5581 stbi_uc version;
5582 if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8')
5583 return stbi__err("not GIF", "Corrupt GIF");
5584
5585 version = stbi__get8(s);
5586 if (version != '7' && version != '9') return stbi__err("not GIF", "Corrupt GIF");
5587 if (stbi__get8(s) != 'a') return stbi__err("not GIF", "Corrupt GIF");
5588
5589 stbi__g_failure_reason = "";
5590 g->w = stbi__get16le(s);
5591 g->h = stbi__get16le(s);
5592 g->flags = stbi__get8(s);
5593 g->bgindex = stbi__get8(s);
5594 g->ratio = stbi__get8(s);
5595 g->transparent = -1;
5596
5597 if (comp != 0) *comp = 4; // can't actually tell whether it's 3 or 4 until we parse the comments
5598
5599 if (is_info) return 1;
5600
5601 if (g->flags & 0x80)
5602 stbi__gif_parse_colortable(s,g->pal, 2 << (g->flags & 7), -1);
5603
5604 return 1;
5605 }
5606
stbi__gif_info_raw(stbi__context * s,int * x,int * y,int * comp)5607 static int stbi__gif_info_raw(stbi__context *s, int *x, int *y, int *comp)
5608 {
5609 stbi__gif g;
5610 if (!stbi__gif_header(s, &g, comp, 1)) {
5611 stbi__rewind( s );
5612 return 0;
5613 }
5614 if (x) *x = g.w;
5615 if (y) *y = g.h;
5616 return 1;
5617 }
5618
stbi__out_gif_code(stbi__gif * g,stbi__uint16 code)5619 static void stbi__out_gif_code(stbi__gif *g, stbi__uint16 code)
5620 {
5621 stbi_uc *p, *c;
5622
5623 // recurse to decode the prefixes, since the linked-list is backwards,
5624 // and working backwards through an interleaved image would be nasty
5625 if (g->codes[code].prefix >= 0)
5626 stbi__out_gif_code(g, g->codes[code].prefix);
5627
5628 if (g->cur_y >= g->max_y) return;
5629
5630 p = &g->out[g->cur_x + g->cur_y];
5631 c = &g->color_table[g->codes[code].suffix * 4];
5632
5633 if (c[3] >= 128) {
5634 p[0] = c[2];
5635 p[1] = c[1];
5636 p[2] = c[0];
5637 p[3] = c[3];
5638 }
5639 g->cur_x += 4;
5640
5641 if (g->cur_x >= g->max_x) {
5642 g->cur_x = g->start_x;
5643 g->cur_y += g->step;
5644
5645 while (g->cur_y >= g->max_y && g->parse > 0) {
5646 g->step = (1 << g->parse) * g->line_size;
5647 g->cur_y = g->start_y + (g->step >> 1);
5648 --g->parse;
5649 }
5650 }
5651 }
5652
stbi__process_gif_raster(stbi__context * s,stbi__gif * g)5653 static stbi_uc *stbi__process_gif_raster(stbi__context *s, stbi__gif *g)
5654 {
5655 stbi_uc lzw_cs;
5656 stbi__int32 len, init_code;
5657 stbi__uint32 first;
5658 stbi__int32 codesize, codemask, avail, oldcode, bits, valid_bits, clear;
5659 stbi__gif_lzw *p;
5660
5661 lzw_cs = stbi__get8(s);
5662 if (lzw_cs > 12) return NULL;
5663 clear = 1 << lzw_cs;
5664 first = 1;
5665 codesize = lzw_cs + 1;
5666 codemask = (1 << codesize) - 1;
5667 bits = 0;
5668 valid_bits = 0;
5669 for (init_code = 0; init_code < clear; init_code++) {
5670 g->codes[init_code].prefix = -1;
5671 g->codes[init_code].first = (stbi_uc) init_code;
5672 g->codes[init_code].suffix = (stbi_uc) init_code;
5673 }
5674
5675 // support no starting clear code
5676 avail = clear+2;
5677 oldcode = -1;
5678
5679 len = 0;
5680 for(;;) {
5681 if (valid_bits < codesize) {
5682 if (len == 0) {
5683 len = stbi__get8(s); // start new block
5684 if (len == 0)
5685 return g->out;
5686 }
5687 --len;
5688 bits |= (stbi__int32) stbi__get8(s) << valid_bits;
5689 valid_bits += 8;
5690 } else {
5691 stbi__int32 code = bits & codemask;
5692 bits >>= codesize;
5693 valid_bits -= codesize;
5694 // @OPTIMIZE: is there some way we can accelerate the non-clear path?
5695 if (code == clear) { // clear code
5696 codesize = lzw_cs + 1;
5697 codemask = (1 << codesize) - 1;
5698 avail = clear + 2;
5699 oldcode = -1;
5700 first = 0;
5701 } else if (code == clear + 1) { // end of stream code
5702 stbi__skip(s, len);
5703 while ((len = stbi__get8(s)) > 0)
5704 stbi__skip(s,len);
5705 return g->out;
5706 } else if (code <= avail) {
5707 if (first) return stbi__errpuc("no clear code", "Corrupt GIF");
5708
5709 if (oldcode >= 0) {
5710 p = &g->codes[avail++];
5711 if (avail > 4096) return stbi__errpuc("too many codes", "Corrupt GIF");
5712 p->prefix = (stbi__int16) oldcode;
5713 p->first = g->codes[oldcode].first;
5714 p->suffix = (code == avail) ? p->first : g->codes[code].first;
5715 } else if (code == avail)
5716 return stbi__errpuc("illegal code in raster", "Corrupt GIF");
5717
5718 stbi__out_gif_code(g, (stbi__uint16) code);
5719
5720 if ((avail & codemask) == 0 && avail <= 0x0FFF) {
5721 codesize++;
5722 codemask = (1 << codesize) - 1;
5723 }
5724
5725 oldcode = code;
5726 } else {
5727 return stbi__errpuc("illegal code in raster", "Corrupt GIF");
5728 }
5729 }
5730 }
5731 }
5732
stbi__fill_gif_background(stbi__gif * g,int x0,int y0,int x1,int y1)5733 static void stbi__fill_gif_background(stbi__gif *g, int x0, int y0, int x1, int y1)
5734 {
5735 int x, y;
5736 stbi_uc *c = g->pal[g->bgindex];
5737 for (y = y0; y < y1; y += 4 * g->w) {
5738 for (x = x0; x < x1; x += 4) {
5739 stbi_uc *p = &g->out[y + x];
5740 p[0] = c[2];
5741 p[1] = c[1];
5742 p[2] = c[0];
5743 p[3] = 0;
5744 }
5745 }
5746 }
5747
5748 // this function is designed to support animated gifs, although stb_image doesn't support it
stbi__gif_load_next(stbi__context * s,stbi__gif * g,int * comp,int req_comp)5749 static stbi_uc *stbi__gif_load_next(stbi__context *s, stbi__gif *g, int *comp, int req_comp)
5750 {
5751 int i;
5752 stbi_uc *prev_out = 0;
5753
5754 if (g->out == 0 && !stbi__gif_header(s, g, comp,0))
5755 return 0; // stbi__g_failure_reason set by stbi__gif_header
5756
5757 if(g->w <= 0 || g->h <= 0 ||
5758 (4 > (INT_MAX / g->w / g->h)))
5759 return stbi__errpuc("Integer Overflow", "width or height too big");
5760
5761 prev_out = g->out;
5762 g->out = (stbi_uc *) stbi__malloc(4 * g->w * g->h);
5763 if (g->out == 0) return stbi__errpuc("outofmem", "Out of memory");
5764
5765 switch ((g->eflags & 0x1C) >> 2) {
5766 case 0: // unspecified (also always used on 1st frame)
5767 stbi__fill_gif_background(g, 0, 0, 4 * g->w, 4 * g->w * g->h);
5768 break;
5769 case 1: // do not dispose
5770 if (prev_out) memcpy(g->out, prev_out, 4 * g->w * g->h);
5771 g->old_out = prev_out;
5772 break;
5773 case 2: // dispose to background
5774 if (prev_out) memcpy(g->out, prev_out, 4 * g->w * g->h);
5775 stbi__fill_gif_background(g, g->start_x, g->start_y, g->max_x, g->max_y);
5776 break;
5777 case 3: // dispose to previous
5778 if (g->old_out) {
5779 for (i = g->start_y; i < g->max_y; i += 4 * g->w)
5780 memcpy(&g->out[i + g->start_x], &g->old_out[i + g->start_x], g->max_x - g->start_x);
5781 }
5782 break;
5783 }
5784
5785 for (;;) {
5786 switch (stbi__get8(s)) {
5787 case 0x2C: /* Image Descriptor */
5788 {
5789 int prev_trans = -1;
5790 stbi__int32 x, y, w, h;
5791 stbi_uc *o;
5792
5793 x = stbi__get16le(s);
5794 y = stbi__get16le(s);
5795 w = stbi__get16le(s);
5796 h = stbi__get16le(s);
5797 if (((x + w) > (g->w)) || ((y + h) > (g->h)))
5798 return stbi__errpuc("bad Image Descriptor", "Corrupt GIF");
5799
5800 g->line_size = g->w * 4;
5801 g->start_x = x * 4;
5802 g->start_y = y * g->line_size;
5803 g->max_x = g->start_x + w * 4;
5804 g->max_y = g->start_y + h * g->line_size;
5805 g->cur_x = g->start_x;
5806 g->cur_y = g->start_y;
5807
5808 g->lflags = stbi__get8(s);
5809
5810 if (g->lflags & 0x40) {
5811 g->step = 8 * g->line_size; // first interlaced spacing
5812 g->parse = 3;
5813 } else {
5814 g->step = g->line_size;
5815 g->parse = 0;
5816 }
5817
5818 if (g->lflags & 0x80) {
5819 stbi__gif_parse_colortable(s,g->lpal, 2 << (g->lflags & 7), g->eflags & 0x01 ? g->transparent : -1);
5820 g->color_table = (stbi_uc *) g->lpal;
5821 } else if (g->flags & 0x80) {
5822 if (g->transparent >= 0 && (g->eflags & 0x01)) {
5823 prev_trans = g->pal[g->transparent][3];
5824 g->pal[g->transparent][3] = 0;
5825 }
5826 g->color_table = (stbi_uc *) g->pal;
5827 } else
5828 return stbi__errpuc("missing color table", "Corrupt GIF");
5829
5830 o = stbi__process_gif_raster(s, g);
5831 if (o == NULL) return NULL;
5832
5833 if (prev_trans != -1)
5834 g->pal[g->transparent][3] = (stbi_uc) prev_trans;
5835
5836 return o;
5837 }
5838
5839 case 0x21: // Comment Extension.
5840 {
5841 int len;
5842 if (stbi__get8(s) == 0xF9) { // Graphic Control Extension.
5843 len = stbi__get8(s);
5844 if (len == 4) {
5845 g->eflags = stbi__get8(s);
5846 g->delay = stbi__get16le(s);
5847 g->transparent = stbi__get8(s);
5848 } else {
5849 stbi__skip(s, len);
5850 break;
5851 }
5852 }
5853 while ((len = stbi__get8(s)) != 0)
5854 stbi__skip(s, len);
5855 break;
5856 }
5857
5858 case 0x3B: // gif stream termination code
5859 return (stbi_uc *) s; // using '1' causes warning on some compilers
5860
5861 default:
5862 return stbi__errpuc("unknown code", "Corrupt GIF");
5863 }
5864 }
5865
5866 STBI_NOTUSED(req_comp);
5867 }
5868
stbi__gif_load(stbi__context * s,int * x,int * y,int * comp,int req_comp)5869 static stbi_uc *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
5870 {
5871 stbi_uc *u = 0;
5872 stbi__gif g;
5873 memset(&g, 0, sizeof(g));
5874
5875 u = stbi__gif_load_next(s, &g, comp, req_comp);
5876 if (u == (stbi_uc *) s) u = 0; // end of animated gif marker
5877 if (u) {
5878 *x = g.w;
5879 *y = g.h;
5880 if (req_comp && req_comp != 4)
5881 u = stbi__convert_format(u, 4, req_comp, g.w, g.h);
5882 }
5883 else if (g.out)
5884 STBI_FREE(g.out);
5885
5886 return u;
5887 }
5888
stbi__gif_info(stbi__context * s,int * x,int * y,int * comp)5889 static int stbi__gif_info(stbi__context *s, int *x, int *y, int *comp)
5890 {
5891 return stbi__gif_info_raw(s,x,y,comp);
5892 }
5893 #endif
5894
5895 // *************************************************************************************************
5896 // Radiance RGBE HDR loader
5897 // originally by Nicolas Schulz
5898 #ifndef STBI_NO_HDR
stbi__hdr_test_core(stbi__context * s)5899 static int stbi__hdr_test_core(stbi__context *s)
5900 {
5901 const char *signature = "#?RADIANCE\n";
5902 int i;
5903 for (i=0; signature[i]; ++i)
5904 if (stbi__get8(s) != signature[i])
5905 return 0;
5906 return 1;
5907 }
5908
stbi__hdr_test(stbi__context * s)5909 static int stbi__hdr_test(stbi__context* s)
5910 {
5911 int r = stbi__hdr_test_core(s);
5912 stbi__rewind(s);
5913 return r;
5914 }
5915
5916 #define STBI__HDR_BUFLEN 1024
stbi__hdr_gettoken(stbi__context * z,char * buffer)5917 static char *stbi__hdr_gettoken(stbi__context *z, char *buffer)
5918 {
5919 int len=0;
5920 char c = '\0';
5921
5922 c = (char) stbi__get8(z);
5923
5924 while (!stbi__at_eof(z) && c != '\n') {
5925 buffer[len++] = c;
5926 if (len == STBI__HDR_BUFLEN-1) {
5927 // flush to end of line
5928 while (!stbi__at_eof(z) && stbi__get8(z) != '\n')
5929 ;
5930 break;
5931 }
5932 c = (char) stbi__get8(z);
5933 }
5934
5935 buffer[len] = 0;
5936 return buffer;
5937 }
5938
stbi__hdr_convert(float * output,stbi_uc * input,int req_comp)5939 static void stbi__hdr_convert(float *output, stbi_uc *input, int req_comp)
5940 {
5941 if ( input[3] != 0 ) {
5942 float f1;
5943 // Exponent
5944 f1 = (float) ldexp(1.0f, input[3] - (int)(128 + 8));
5945 if (req_comp <= 2)
5946 output[0] = (input[0] + input[1] + input[2]) * f1 / 3;
5947 else {
5948 output[0] = input[0] * f1;
5949 output[1] = input[1] * f1;
5950 output[2] = input[2] * f1;
5951 }
5952 if (req_comp == 2) output[1] = 1;
5953 if (req_comp == 4) output[3] = 1;
5954 } else {
5955 switch (req_comp) {
5956 case 4: output[3] = 1; /* fallthrough */
5957 case 3: output[0] = output[1] = output[2] = 0;
5958 break;
5959 case 2: output[1] = 1; /* fallthrough */
5960 case 1: output[0] = 0;
5961 break;
5962 }
5963 }
5964 }
5965
stbi__hdr_load(stbi__context * s,int * x,int * y,int * comp,int req_comp)5966 static float *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
5967 {
5968 char buffer[STBI__HDR_BUFLEN];
5969 char *token;
5970 int valid = 0;
5971 int width, height;
5972 stbi_uc *scanline;
5973 float *hdr_data;
5974 int len;
5975 unsigned char count, value;
5976 int i, j, k, c1,c2, z;
5977
5978
5979 // Check identifier
5980 if (strcmp(stbi__hdr_gettoken(s,buffer), "#?RADIANCE") != 0)
5981 return stbi__errpf("not HDR", "Corrupt HDR image");
5982
5983 // Parse header
5984 for(;;) {
5985 token = stbi__hdr_gettoken(s,buffer);
5986 if (token[0] == 0) break;
5987 if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
5988 }
5989
5990 if (!valid) return stbi__errpf("unsupported format", "Unsupported HDR format");
5991
5992 // Parse width and height
5993 // can't use sscanf() if we're not using stdio!
5994 token = stbi__hdr_gettoken(s,buffer);
5995 if (strncmp(token, "-Y ", 3)) return stbi__errpf("unsupported data layout", "Unsupported HDR format");
5996 token += 3;
5997 height = (int) strtol(token, &token, 10);
5998 while (*token == ' ') ++token;
5999 if (strncmp(token, "+X ", 3)) return stbi__errpf("unsupported data layout", "Unsupported HDR format");
6000 token += 3;
6001 width = (int) strtol(token, NULL, 10);
6002
6003 *x = width;
6004 *y = height;
6005
6006 if (comp) *comp = 3;
6007 if (req_comp == 0) req_comp = 3;
6008
6009 if (height <= 0 || width <= 0 || req_comp <= 0 ||
6010 (sizeof(float) > (INT_MAX / req_comp / height / width)))
6011 return stbi__errpf("Integer Overflow", "w or h incorrect");
6012 // Read data
6013 hdr_data = (float *) stbi__malloc(height * width * req_comp * sizeof(float));
6014 if (hdr_data == NULL) return stbi__errpf("outofmem", "Out of memory");
6015
6016 // Load image data
6017 // image data is stored as some number of sca
6018 if ( width < 8 || width >= 32768) {
6019 // Read flat data
6020 for (j=0; j < height; ++j) {
6021 for (i=0; i < width; ++i) {
6022 stbi_uc rgbe[4];
6023 main_decode_loop:
6024 stbi__getn(s, rgbe, 4);
6025 stbi__hdr_convert(hdr_data + j * width * req_comp + i * req_comp, rgbe, req_comp);
6026 }
6027 }
6028 } else {
6029 // Read RLE-encoded data
6030 scanline = NULL;
6031
6032 for (j = 0; j < height; ++j) {
6033 c1 = stbi__get8(s);
6034 c2 = stbi__get8(s);
6035 len = stbi__get8(s);
6036 if (c1 != 2 || c2 != 2 || (len & 0x80)) {
6037 // not run-length encoded, so we have to actually use THIS data as a decoded
6038 // pixel (note this can't be a valid pixel--one of RGB must be >= 128)
6039 stbi_uc rgbe[4];
6040 rgbe[0] = (stbi_uc) c1;
6041 rgbe[1] = (stbi_uc) c2;
6042 rgbe[2] = (stbi_uc) len;
6043 rgbe[3] = (stbi_uc) stbi__get8(s);
6044 stbi__hdr_convert(hdr_data, rgbe, req_comp);
6045 i = 1;
6046 j = 0;
6047 STBI_FREE(scanline);
6048 goto main_decode_loop; // yes, this makes no sense
6049 }
6050 len <<= 8;
6051 len |= stbi__get8(s);
6052 if (len != width) {
6053 STBI_FREE(hdr_data);
6054 STBI_FREE(scanline);
6055 return stbi__errpf("invalid decoded scanline length", "corrupt HDR");
6056 }
6057 if (scanline == NULL) scanline = (stbi_uc *) stbi__malloc(width * 4);
6058
6059 for (k = 0; k < 4; ++k) {
6060 i = 0;
6061 while (i < width) {
6062 count = stbi__get8(s);
6063 if (count > 128) {
6064 // Run
6065 value = stbi__get8(s);
6066 count -= 128;
6067 if (count >= width - i) {
6068 STBI_FREE(hdr_data);
6069 STBI_FREE(scanline);
6070 return stbi__errpf("invalid buffer size", "corrupt HDR");
6071 }
6072 for (z = 0; z < count; ++z)
6073 scanline[i++ * 4 + k] = value;
6074 } else {
6075 if (count >= width - i) {
6076 STBI_FREE(hdr_data);
6077 STBI_FREE(scanline);
6078 return stbi__errpf("invalid buffer size", "corrupt HDR");
6079 }
6080 // Dump
6081 for (z = 0; z < count; ++z)
6082 scanline[i++ * 4 + k] = stbi__get8(s);
6083 }
6084 }
6085 }
6086 for (i=0; i < width; ++i)
6087 stbi__hdr_convert(hdr_data+(j*width + i)*req_comp, scanline + i*4, req_comp);
6088 }
6089 STBI_FREE(scanline);
6090 }
6091
6092 return hdr_data;
6093 }
6094
stbi__hdr_info(stbi__context * s,int * x,int * y,int * comp)6095 static int stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp)
6096 {
6097 char buffer[STBI__HDR_BUFLEN];
6098 char *token;
6099 int valid = 0;
6100
6101 if (strcmp(stbi__hdr_gettoken(s,buffer), "#?RADIANCE") != 0) {
6102 stbi__rewind( s );
6103 return 0;
6104 }
6105
6106 for(;;) {
6107 token = stbi__hdr_gettoken(s,buffer);
6108 if (token[0] == 0) break;
6109 if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
6110 }
6111
6112 if (!valid) {
6113 stbi__rewind( s );
6114 return 0;
6115 }
6116 token = stbi__hdr_gettoken(s,buffer);
6117 if (strncmp(token, "-Y ", 3)) {
6118 stbi__rewind( s );
6119 return 0;
6120 }
6121 token += 3;
6122 *y = (int) strtol(token, &token, 10);
6123 while (*token == ' ') ++token;
6124 if (strncmp(token, "+X ", 3)) {
6125 stbi__rewind( s );
6126 return 0;
6127 }
6128 token += 3;
6129 *x = (int) strtol(token, NULL, 10);
6130 *comp = 3;
6131 return 1;
6132 }
6133 #endif // STBI_NO_HDR
6134
6135 #ifndef STBI_NO_BMP
stbi__bmp_info(stbi__context * s,int * x,int * y,int * comp)6136 static int stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp)
6137 {
6138 int hsz;
6139 if (stbi__get8(s) != 'B' || stbi__get8(s) != 'M') {
6140 stbi__rewind( s );
6141 return 0;
6142 }
6143 stbi__skip(s,12);
6144 hsz = stbi__get32le(s);
6145 if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108 && hsz != 124) {
6146 stbi__rewind( s );
6147 return 0;
6148 }
6149 if (hsz == 12) {
6150 *x = stbi__get16le(s);
6151 *y = stbi__get16le(s);
6152 } else {
6153 *x = stbi__get32le(s);
6154 *y = stbi__get32le(s);
6155 }
6156 if (stbi__get16le(s) != 1) {
6157 stbi__rewind( s );
6158 return 0;
6159 }
6160 *comp = stbi__get16le(s) / 8;
6161 return 1;
6162 }
6163 #endif
6164
6165 #ifndef STBI_NO_PSD
stbi__psd_info(stbi__context * s,int * x,int * y,int * comp)6166 static int stbi__psd_info(stbi__context *s, int *x, int *y, int *comp)
6167 {
6168 int channelCount;
6169 if (stbi__get32be(s) != 0x38425053) {
6170 stbi__rewind( s );
6171 return 0;
6172 }
6173 if (stbi__get16be(s) != 1) {
6174 stbi__rewind( s );
6175 return 0;
6176 }
6177 stbi__skip(s, 6);
6178 channelCount = stbi__get16be(s);
6179 if (channelCount < 0 || channelCount > 16) {
6180 stbi__rewind( s );
6181 return 0;
6182 }
6183 *y = stbi__get32be(s);
6184 *x = stbi__get32be(s);
6185 if (stbi__get16be(s) != 8) {
6186 stbi__rewind( s );
6187 return 0;
6188 }
6189 if (stbi__get16be(s) != 3) {
6190 stbi__rewind( s );
6191 return 0;
6192 }
6193 *comp = 4;
6194 return 1;
6195 }
6196 #endif
6197
6198 #ifndef STBI_NO_PIC
stbi__pic_info(stbi__context * s,int * x,int * y,int * comp)6199 static int stbi__pic_info(stbi__context *s, int *x, int *y, int *comp)
6200 {
6201 int act_comp=0,num_packets=0,chained;
6202 stbi__pic_packet packets[10];
6203
6204 if (!stbi__pic_is4(s,"\x53\x80\xF6\x34")) {
6205 stbi__rewind(s);
6206 return 0;
6207 }
6208
6209 stbi__skip(s, 88);
6210
6211 *x = stbi__get16be(s);
6212 *y = stbi__get16be(s);
6213 if (stbi__at_eof(s)) {
6214 stbi__rewind( s);
6215 return 0;
6216 }
6217 if ( (*x) != 0 && (1 << 28) / (*x) < (*y)) {
6218 stbi__rewind( s );
6219 return 0;
6220 }
6221
6222 stbi__skip(s, 8);
6223
6224 do {
6225 stbi__pic_packet *packet;
6226
6227 if (num_packets==sizeof(packets)/sizeof(packets[0]))
6228 return 0;
6229
6230 packet = &packets[num_packets++];
6231 chained = stbi__get8(s);
6232 packet->size = stbi__get8(s);
6233 packet->type = stbi__get8(s);
6234 packet->channel = stbi__get8(s);
6235 act_comp |= packet->channel;
6236
6237 if (stbi__at_eof(s)) {
6238 stbi__rewind( s );
6239 return 0;
6240 }
6241 if (packet->size != 8) {
6242 stbi__rewind( s );
6243 return 0;
6244 }
6245 } while (chained);
6246
6247 *comp = (act_comp & 0x10 ? 4 : 3);
6248
6249 return 1;
6250 }
6251 #endif
6252
6253 // *************************************************************************************************
6254 // Portable Gray Map and Portable Pixel Map loader
6255 // by Ken Miller
6256 //
6257 // PGM: http://netpbm.sourceforge.net/doc/pgm.html
6258 // PPM: http://netpbm.sourceforge.net/doc/ppm.html
6259 //
6260 // Known limitations:
6261 // Does not support comments in the header section
6262 // Does not support ASCII image data (formats P2 and P3)
6263 // Does not support 16-bit-per-channel
6264
6265 #ifndef STBI_NO_PNM
6266
stbi__pnm_test(stbi__context * s)6267 static int stbi__pnm_test(stbi__context *s)
6268 {
6269 char p, t;
6270 p = (char) stbi__get8(s);
6271 t = (char) stbi__get8(s);
6272 if (p != 'P' || (t != '5' && t != '6')) {
6273 stbi__rewind( s );
6274 return 0;
6275 }
6276 return 1;
6277 }
6278
stbi__pnm_load(stbi__context * s,int * x,int * y,int * comp,int req_comp)6279 static stbi_uc *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
6280 {
6281 stbi_uc *out;
6282 if (!stbi__pnm_info(s, (int *)&s->img_x, (int *)&s->img_y, (int *)&s->img_n))
6283 return 0;
6284
6285 *x = s->img_x;
6286 *y = s->img_y;
6287
6288 if (*x <= 0 || *y <= 0)
6289 return stbi__errpuc("Integer overflow", "img_x or img_y incorrect");
6290
6291 *comp = s->img_n;
6292 if (s->img_x == 0 || s->img_y == 0 || s->img_n <= 0
6293 || (s->img_n > (INT_MAX / s->img_x / s->img_y)))
6294 return stbi__errpuc("Integer Overflow", "x or y incorrect");
6295 out = (stbi_uc *) stbi__malloc(s->img_n * s->img_x * s->img_y);
6296 if (!out) return stbi__errpuc("outofmem", "Out of memory");
6297 stbi__getn(s, out, s->img_n * s->img_x * s->img_y);
6298
6299 if (req_comp && req_comp != s->img_n) {
6300 out = stbi__convert_format(out, s->img_n, req_comp, s->img_x, s->img_y);
6301 if (out == NULL) return out; // stbi__convert_format frees input on failure
6302 }
6303 return out;
6304 }
6305
stbi__pnm_isspace(char c)6306 static int stbi__pnm_isspace(char c)
6307 {
6308 return c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r';
6309 }
6310
stbi__pnm_skip_whitespace(stbi__context * s,char * c)6311 static void stbi__pnm_skip_whitespace(stbi__context *s, char *c)
6312 {
6313 while (!stbi__at_eof(s) && stbi__pnm_isspace(*c))
6314 *c = (char) stbi__get8(s);
6315 }
6316
stbi__pnm_isdigit(char c)6317 static int stbi__pnm_isdigit(char c)
6318 {
6319 return c >= '0' && c <= '9';
6320 }
6321
stbi__pnm_getinteger(stbi__context * s,char * c)6322 static int stbi__pnm_getinteger(stbi__context *s, char *c)
6323 {
6324 int value = 0;
6325
6326 while (!stbi__at_eof(s) && stbi__pnm_isdigit(*c)) {
6327 value = value*10 + (*c - '0');
6328 *c = (char) stbi__get8(s);
6329 }
6330
6331 return value;
6332 }
6333
stbi__pnm_info(stbi__context * s,int * x,int * y,int * comp)6334 static int stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp)
6335 {
6336 int maxv;
6337 char c, p, t;
6338
6339 stbi__rewind( s );
6340
6341 // Get identifier
6342 p = (char) stbi__get8(s);
6343 t = (char) stbi__get8(s);
6344 if (p != 'P' || (t != '5' && t != '6')) {
6345 stbi__rewind( s );
6346 return 0;
6347 }
6348
6349 *comp = (t == '6') ? 3 : 1; // '5' is 1-component .pgm; '6' is 3-component .ppm
6350
6351 c = (char) stbi__get8(s);
6352 stbi__pnm_skip_whitespace(s, &c);
6353
6354 *x = stbi__pnm_getinteger(s, &c); // read width
6355 stbi__pnm_skip_whitespace(s, &c);
6356
6357 *y = stbi__pnm_getinteger(s, &c); // read height
6358 stbi__pnm_skip_whitespace(s, &c);
6359
6360 maxv = stbi__pnm_getinteger(s, &c); // read max value
6361
6362 if (maxv > 255)
6363 return stbi__err("max value > 255", "PPM image not 8-bit");
6364 else
6365 return 1;
6366 }
6367 #endif
6368
stbi__info_main(stbi__context * s,int * x,int * y,int * comp)6369 static int stbi__info_main(stbi__context *s, int *x, int *y, int *comp)
6370 {
6371 #ifndef STBI_NO_JPEG
6372 if (stbi__jpeg_info(s, x, y, comp)) return 1;
6373 #endif
6374
6375 #ifndef STBI_NO_PNG
6376 if (stbi__png_info(s, x, y, comp)) return 1;
6377 #endif
6378
6379 #ifndef STBI_NO_GIF
6380 if (stbi__gif_info(s, x, y, comp)) return 1;
6381 #endif
6382
6383 #ifndef STBI_NO_BMP
6384 if (stbi__bmp_info(s, x, y, comp)) return 1;
6385 #endif
6386
6387 #ifndef STBI_NO_PSD
6388 if (stbi__psd_info(s, x, y, comp)) return 1;
6389 #endif
6390
6391 #ifndef STBI_NO_PIC
6392 if (stbi__pic_info(s, x, y, comp)) return 1;
6393 #endif
6394
6395 #ifndef STBI_NO_PNM
6396 if (stbi__pnm_info(s, x, y, comp)) return 1;
6397 #endif
6398
6399 #ifndef STBI_NO_HDR
6400 if (stbi__hdr_info(s, x, y, comp)) return 1;
6401 #endif
6402
6403 // test tga last because it's a crappy test!
6404 #ifndef STBI_NO_TGA
6405 if (stbi__tga_info(s, x, y, comp))
6406 return 1;
6407 #endif
6408 return stbi__err("unknown image type", "Image not of any known type, or corrupt");
6409 }
6410
6411 #ifndef STBI_NO_STDIO
stbi_info(char const * filename,int * x,int * y,int * comp)6412 STBIDEF int stbi_info(char const *filename, int *x, int *y, int *comp)
6413 {
6414 FILE *f = stbi__fopen(filename, "rb");
6415 int result;
6416 if (!f) return stbi__err("can't fopen", "Unable to open file");
6417 result = stbi_info_from_file(f, x, y, comp);
6418 fclose(f);
6419 return result;
6420 }
6421
stbi_info_from_file(FILE * f,int * x,int * y,int * comp)6422 STBIDEF int stbi_info_from_file(FILE *f, int *x, int *y, int *comp)
6423 {
6424 int r;
6425 stbi__context s;
6426 long pos = ftell(f);
6427 stbi__start_file(&s, f);
6428 r = stbi__info_main(&s,x,y,comp);
6429 fseek(f,pos,SEEK_SET);
6430 return r;
6431 }
6432 #endif // !STBI_NO_STDIO
6433
stbi_info_from_memory(stbi_uc const * buffer,int len,int * x,int * y,int * comp)6434 STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp)
6435 {
6436 stbi__context s;
6437 stbi__start_mem(&s,buffer,len);
6438 return stbi__info_main(&s,x,y,comp);
6439 }
6440
stbi_info_from_callbacks(stbi_io_callbacks const * c,void * user,int * x,int * y,int * comp)6441 STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *c, void *user, int *x, int *y, int *comp)
6442 {
6443 stbi__context s;
6444 stbi__start_callbacks(&s, (stbi_io_callbacks *) c, user);
6445 return stbi__info_main(&s,x,y,comp);
6446 }
6447
6448 #endif // STB_IMAGE_IMPLEMENTATION
6449
6450 /*
6451 revision history:
6452 2.08 (2015-09-13) fix to 2.07 cleanup, reading RGB PSD as RGBA
6453 2.07 (2015-09-13) fix compiler warnings
6454 partial animated GIF support
6455 limited 16-bit PSD support
6456 #ifdef unused functions
6457 bug with < 92 byte PIC,PNM,HDR,TGA
6458 2.06 (2015-04-19) fix bug where PSD returns wrong '*comp' value
6459 2.05 (2015-04-19) fix bug in progressive JPEG handling, fix warning
6460 2.04 (2015-04-15) try to re-enable SIMD on MinGW 64-bit
6461 2.03 (2015-04-12) extra corruption checking (mmozeiko)
6462 stbi_set_flip_vertically_on_load (nguillemot)
6463 fix NEON support; fix mingw support
6464 2.02 (2015-01-19) fix incorrect assert, fix warning
6465 2.01 (2015-01-17) fix various warnings; suppress SIMD on gcc 32-bit without -msse2
6466 2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
6467 2.00 (2014-12-25) optimize JPG, including x86 SSE2 & NEON SIMD (ryg)
6468 progressive JPEG (stb)
6469 PGM/PPM support (Ken Miller)
6470 STBI_MALLOC,STBI_REALLOC,STBI_FREE
6471 GIF bugfix -- seemingly never worked
6472 STBI_NO_*, STBI_ONLY_*
6473 1.48 (2014-12-14) fix incorrectly-named assert()
6474 1.47 (2014-12-14) 1/2/4-bit PNG support, both direct and paletted (Omar Cornut & stb)
6475 optimize PNG (ryg)
6476 fix bug in interlaced PNG with user-specified channel count (stb)
6477 1.46 (2014-08-26)
6478 fix broken tRNS chunk (colorkey-style transparency) in non-paletted PNG
6479 1.45 (2014-08-16)
6480 fix MSVC-ARM internal compiler error by wrapping malloc
6481 1.44 (2014-08-07)
6482 various warning fixes from Ronny Chevalier
6483 1.43 (2014-07-15)
6484 fix MSVC-only compiler problem in code changed in 1.42
6485 1.42 (2014-07-09)
6486 don't define _CRT_SECURE_NO_WARNINGS (affects user code)
6487 fixes to stbi__cleanup_jpeg path
6488 added STBI_ASSERT to avoid requiring assert.h
6489 1.41 (2014-06-25)
6490 fix search&replace from 1.36 that messed up comments/error messages
6491 1.40 (2014-06-22)
6492 fix gcc struct-initialization warning
6493 1.39 (2014-06-15)
6494 fix to TGA optimization when req_comp != number of components in TGA;
6495 fix to GIF loading because BMP wasn't rewinding (whoops, no GIFs in my test suite)
6496 add support for BMP version 5 (more ignored fields)
6497 1.38 (2014-06-06)
6498 suppress MSVC warnings on integer casts truncating values
6499 fix accidental rename of 'skip' field of I/O
6500 1.37 (2014-06-04)
6501 remove duplicate typedef
6502 1.36 (2014-06-03)
6503 convert to header file single-file library
6504 if de-iphone isn't set, load iphone images color-swapped instead of returning NULL
6505 1.35 (2014-05-27)
6506 various warnings
6507 fix broken STBI_SIMD path
6508 fix bug where stbi_load_from_file no longer left file pointer in correct place
6509 fix broken non-easy path for 32-bit BMP (possibly never used)
6510 TGA optimization by Arseny Kapoulkine
6511 1.34 (unknown)
6512 use STBI_NOTUSED in stbi__resample_row_generic(), fix one more leak in tga failure case
6513 1.33 (2011-07-14)
6514 make stbi_is_hdr work in STBI_NO_HDR (as specified), minor compiler-friendly improvements
6515 1.32 (2011-07-13)
6516 support for "info" function for all supported filetypes (SpartanJ)
6517 1.31 (2011-06-20)
6518 a few more leak fixes, bug in PNG handling (SpartanJ)
6519 1.30 (2011-06-11)
6520 added ability to load files via callbacks to accomidate custom input streams (Ben Wenger)
6521 removed deprecated format-specific test/load functions
6522 removed support for installable file formats (stbi_loader) -- would have been broken for IO callbacks anyway
6523 error cases in bmp and tga give messages and don't leak (Raymond Barbiero, grisha)
6524 fix inefficiency in decoding 32-bit BMP (David Woo)
6525 1.29 (2010-08-16)
6526 various warning fixes from Aurelien Pocheville
6527 1.28 (2010-08-01)
6528 fix bug in GIF palette transparency (SpartanJ)
6529 1.27 (2010-08-01)
6530 cast-to-stbi_uc to fix warnings
6531 1.26 (2010-07-24)
6532 fix bug in file buffering for PNG reported by SpartanJ
6533 1.25 (2010-07-17)
6534 refix trans_data warning (Won Chun)
6535 1.24 (2010-07-12)
6536 perf improvements reading from files on platforms with lock-heavy fgetc()
6537 minor perf improvements for jpeg
6538 deprecated type-specific functions so we'll get feedback if they're needed
6539 attempt to fix trans_data warning (Won Chun)
6540 1.23 fixed bug in iPhone support
6541 1.22 (2010-07-10)
6542 removed image *writing* support
6543 stbi_info support from Jetro Lauha
6544 GIF support from Jean-Marc Lienher
6545 iPhone PNG-extensions from James Brown
6546 warning-fixes from Nicolas Schulz and Janez Zemva (i.stbi__err. Janez (U+017D)emva)
6547 1.21 fix use of 'stbi_uc' in header (reported by jon blow)
6548 1.20 added support for Softimage PIC, by Tom Seddon
6549 1.19 bug in interlaced PNG corruption check (found by ryg)
6550 1.18 (2008-08-02)
6551 fix a threading bug (local mutable static)
6552 1.17 support interlaced PNG
6553 1.16 major bugfix - stbi__convert_format converted one too many pixels
6554 1.15 initialize some fields for thread safety
6555 1.14 fix threadsafe conversion bug
6556 header-file-only version (#define STBI_HEADER_FILE_ONLY before including)
6557 1.13 threadsafe
6558 1.12 const qualifiers in the API
6559 1.11 Support installable IDCT, colorspace conversion routines
6560 1.10 Fixes for 64-bit (don't use "unsigned long")
6561 optimized upsampling by Fabian "ryg" Giesen
6562 1.09 Fix format-conversion for PSD code (bad global variables!)
6563 1.08 Thatcher Ulrich's PSD code integrated by Nicolas Schulz
6564 1.07 attempt to fix C++ warning/errors again
6565 1.06 attempt to fix C++ warning/errors again
6566 1.05 fix TGA loading to return correct *comp and use good luminance calc
6567 1.04 default float alpha is 1, not 255; use 'void *' for stbi_image_free
6568 1.03 bugfixes to STBI_NO_STDIO, STBI_NO_HDR
6569 1.02 support for (subset of) HDR files, float interface for preferred access to them
6570 1.01 fix bug: possible bug in handling right-side up bmps... not sure
6571 fix bug: the stbi__bmp_load() and stbi__tga_load() functions didn't work at all
6572 1.00 interface to zlib that skips zlib header
6573 0.99 correct handling of alpha in palette
6574 0.98 TGA loader by lonesock; dynamically add loaders (untested)
6575 0.97 jpeg errors on too large a file; also catch another malloc failure
6576 0.96 fix detection of invalid v value - particleman@mollyrocket forum
6577 0.95 during header scan, seek to markers in case of padding
6578 0.94 STBI_NO_STDIO to disable stdio usage; rename all #defines the same
6579 0.93 handle jpegtran output; verbose errors
6580 0.92 read 4,8,16,24,32-bit BMP files of several formats
6581 0.91 output 24-bit Windows 3.0 BMP files
6582 0.90 fix a few more warnings; bump version number to approach 1.0
6583 0.61 bugfixes due to Marc LeBlanc, Christopher Lloyd
6584 0.60 fix compiling as c++
6585 0.59 fix warnings: merge Dave Moore's -Wall fixes
6586 0.58 fix bug: zlib uncompressed mode len/nlen was wrong endian
6587 0.57 fix bug: jpg last huffman symbol before marker was >9 bits but less than 16 available
6588 0.56 fix bug: zlib uncompressed mode len vs. nlen
6589 0.55 fix bug: restart_interval not initialized to 0
6590 0.54 allow NULL for 'int *comp'
6591 0.53 fix bug in png 3->4; speedup png decoding
6592 0.52 png handles req_comp=3,4 directly; minor cleanup; jpeg comments
6593 0.51 obey req_comp requests, 1-component jpegs return as 1-component,
6594 on 'test' only check type, not whether we support this variant
6595 0.50 (2006-11-19)
6596 first released version
6597 */
6598