1.5.1
=====

### Significant changes relative to 1.5.0:

1. Previously, the undocumented `JSIMD_FORCE*` environment variables could be
used to force-enable a particular SIMD instruction set if multiple instruction
sets were available on a particular platform. On x86 platforms, where CPU
feature detection is bulletproof and multiple SIMD instruction sets are
available, it makes sense for those environment variables to allow forcing the
use of an instruction set only if that instruction set is available. However,
since the ARM implementations of libjpeg-turbo can only use one SIMD
instruction set, and since their feature detection code is less bulletproof
(parsing /proc/cpuinfo), it makes sense for the `JSIMD_FORCENEON` environment
variable to bypass the feature detection code and really force the use of NEON
instructions. A new environment variable (`JSIMD_FORCEDSPR2`) was introduced
in the MIPS implementation for the same reasons, and the existing
`JSIMD_FORCENONE` environment variable was extended to that implementation.
These environment variables provide a workaround for those attempting to test
ARM and MIPS builds of libjpeg-turbo in QEMU, which passes through
/proc/cpuinfo from the host system.

2. libjpeg-turbo previously assumed that AltiVec instructions were always
available on PowerPC platforms, which led to "illegal instruction" errors when
running on PowerPC chips that lack AltiVec support (such as the older 7xx/G3
and newer e5500 series.) libjpeg-turbo now examines /proc/cpuinfo on
Linux/Android systems and enables AltiVec instructions only if the CPU supports
them. It also now provides two environment variables, `JSIMD_FORCEALTIVEC` and
`JSIMD_FORCENONE`, to force-enable and force-disable AltiVec instructions in
environments where /proc/cpuinfo is an unreliable means of CPU feature
detection (such as when running in QEMU.)
On OS X, libjpeg-turbo continues to assume that AltiVec support is always
available, which means that libjpeg-turbo cannot be used with G3 Macs unless
you set the environment variable `JSIMD_FORCENONE` to `1`.

3. Fixed an issue whereby 64-bit ARM (AArch64) builds of libjpeg-turbo would
crash when built with recent releases of the Clang/LLVM compiler. This was
caused by an ABI conformance issue in some of libjpeg-turbo's 64-bit NEON SIMD
routines. Those routines were incorrectly using 64-bit instructions to
transfer a 32-bit JDIMENSION argument, whereas the ABI allows the upper
(unused) 32 bits of a 32-bit argument's register to be undefined. The new
Clang/LLVM optimizer uses load combining to transfer multiple adjacent 32-bit
structure members into a single 64-bit register, and this exposed the ABI
conformance issue.

4. Fancy upsampling is now supported when decompressing JPEG images that use
4:4:0 (h1v2) chroma subsampling. These images are generated when losslessly
rotating or transposing JPEG images that use 4:2:2 (h2v1) chroma subsampling.
The h1v2 fancy upsampling algorithm is not currently SIMD-accelerated.

5. If merged upsampling isn't SIMD-accelerated but YCbCr-to-RGB conversion is,
then libjpeg-turbo will now disable merged upsampling when decompressing YCbCr
JPEG images into RGB or extended RGB output images. This significantly speeds
up the decompression of 4:2:0 and 4:2:2 JPEGs on ARM platforms if fancy
upsampling is not used (for example, if the `-nosmooth` option to djpeg is
specified.)

6. The TurboJPEG API will now decompress 4:2:2 and 4:4:0 JPEG images with
2x2 luminance sampling factors and 2x1 or 1x2 chrominance sampling factors.
This is a non-standard way of specifying 2x subsampling (normally 4:2:2 JPEGs
have 2x1 luminance and 1x1 chrominance sampling factors, and 4:4:0 JPEGs have
1x2 luminance and 1x1 chrominance sampling factors), but the JPEG specification
and the libjpeg API both allow it.

7. Fixed an unsigned integer overflow in the libjpeg memory manager, detected
by the Clang undefined behavior sanitizer, that could be triggered by
attempting to decompress a specially-crafted malformed JPEG image. This issue
affected only 32-bit code and did not pose a security threat, but removing the
warning makes it easier to detect actual security issues, should they arise in
the future.

8. Fixed additional negative left shifts and other issues reported by the GCC
and Clang undefined behavior sanitizers when attempting to decompress
specially-crafted malformed JPEG images. None of these issues posed a security
threat, but removing the warnings makes it easier to detect actual security
issues, should they arise in the future.

9. Fixed an out-of-bounds array reference, introduced by 1.4.90[2] (partial
image decompression) and detected by the Clang undefined behavior sanitizer,
that could be triggered by a specially-crafted malformed JPEG image with more
than four components. Because the out-of-bounds reference was still within the
same structure, it was not known to pose a security threat, but removing the
warning makes it easier to detect actual security issues, should they arise in
the future.

10. Fixed another ABI conformance issue in the 64-bit ARM (AArch64) NEON SIMD
code. Some of the routines were incorrectly reading and storing data below the
stack pointer, which caused segfaults in certain applications under specific
circumstances.


1.5.0
=====

### Significant changes relative to 1.5 beta1:

1. Fixed an issue whereby a malformed motion-JPEG frame could cause the "fast
path" of libjpeg-turbo's Huffman decoder to read from uninitialized memory.

2. Added libjpeg-turbo version and build information to the global string table
of the libjpeg and TurboJPEG API libraries. This is a common practice in other
infrastructure libraries, such as OpenSSL and libpng, because it makes it easy
to examine an application binary and determine which version of the library the
application was linked against.

3. Fixed a couple of issues in the PPM reader that would cause buffer overruns
in cjpeg if one of the values in a binary PPM/PGM input file exceeded the
maximum value defined in the file's header. libjpeg-turbo 1.4.2 already
included a similar fix for ASCII PPM/PGM files. Note that these issues were
not security bugs, since they were confined to the cjpeg program and did not
affect any of the libjpeg-turbo libraries.

4. Fixed an issue whereby attempting to decompress a JPEG file with a corrupt
header using the `tjDecompressToYUV2()` function would cause the function to
abort without returning an error and, under certain circumstances, corrupt the
stack. This only occurred if `tjDecompressToYUV2()` was called prior to
calling `tjDecompressHeader3()`, or if the return value from
`tjDecompressHeader3()` was ignored (both cases represent incorrect usage of
the TurboJPEG API.)

5. Fixed an issue in the ARM 32-bit SIMD-accelerated Huffman encoder that
prevented the code from assembling properly with clang.

6. The `jpeg_stdio_src()`, `jpeg_mem_src()`, `jpeg_stdio_dest()`, and
`jpeg_mem_dest()` functions in the libjpeg API will now throw an error if a
source/destination manager has already been assigned to the compress or
decompress object by a different function or by the calling program.
This prevents these functions from attempting to reuse a source/destination
manager structure that was allocated elsewhere, because there is no way to
ensure that it would be big enough to accommodate the new source/destination
manager.


1.4.90 (1.5 beta1)
==================

### Significant changes relative to 1.4.2:

1. Added full SIMD acceleration for PowerPC platforms using AltiVec VMX
(128-bit SIMD) instructions. Although the performance of libjpeg-turbo on
PowerPC was already good, due to the increased number of registers available
to the compiler vs. x86, it was still possible to speed up compression by about
3-4x and decompression by about 2-2.5x (relative to libjpeg v6b) through the
use of AltiVec instructions.

2. Added two new libjpeg API functions (`jpeg_skip_scanlines()` and
`jpeg_crop_scanline()`) that can be used to partially decode a JPEG image. See
[libjpeg.txt](libjpeg.txt) for more details.

3. The TJCompressor and TJDecompressor classes in the TurboJPEG Java API now
implement the Closeable interface, so those classes can be used with a
try-with-resources statement.

4. The TurboJPEG Java classes now throw unchecked idiomatic exceptions
(IllegalArgumentException, IllegalStateException) for unrecoverable errors
caused by incorrect API usage, and those classes throw a new checked exception
type (TJException) for errors that are passed through from the C library.

5. Source buffers for the TurboJPEG C API functions, as well as the
`jpeg_mem_src()` function in the libjpeg API, are now declared as const
pointers. This facilitates passing read-only buffers to those functions and
assures the caller that the source buffer will not be modified. This should
not create any backward API or ABI incompatibilities with prior libjpeg-turbo
releases.

6. The MIPS DSPr2 SIMD code can now be compiled to support either FR=0 or FR=1
FPUs.

7. Fixed additional negative left shifts and other issues reported by the GCC
and Clang undefined behavior sanitizers. Most of these issues affected only
32-bit code, and none of them was known to pose a security threat, but removing
the warnings makes it easier to detect actual security issues, should they
arise in the future.

8. Removed the unnecessary `.arch` directive from the ARM64 NEON SIMD code.
This directive was preventing the code from assembling using the clang
integrated assembler.

9. Fixed a regression caused by 1.4.1[6] that prevented 32-bit and 64-bit
libjpeg-turbo RPMs from being installed simultaneously on recent Red Hat/Fedora
distributions. This was due to the addition of a macro in jconfig.h that
allows the Huffman codec to determine the word size at compile time. Since
that macro differs between 32-bit and 64-bit builds, this caused a conflict
between the i386 and x86_64 RPMs (any differing files, other than executables,
are not allowed when 32-bit and 64-bit RPMs are installed simultaneously.)
Since the macro is used only internally, it has been moved into jconfigint.h.

10. The x86-64 SIMD code can now be disabled at run time by setting the
`JSIMD_FORCENONE` environment variable to `1` (the other SIMD implementations
already had this capability.)

11. Added a new command-line argument to TJBench (`-nowrite`) that prevents the
benchmark from outputting any images. This removes any potential operating
system overhead that might be caused by lazy writes to disk and thus improves
the consistency of the performance measurements.

12. Added SIMD acceleration for Huffman encoding on SSE2-capable x86 and x86-64
platforms. This speeds up the compression of full-color JPEGs by about 10-15%
on average (relative to libjpeg-turbo 1.4.x) when using modern Intel and AMD
CPUs.
Additionally, this works around an issue in the clang optimizer that
prevents it (as of this writing) from achieving the same performance as GCC
when compiling the C version of the Huffman encoder
(<https://llvm.org/bugs/show_bug.cgi?id=16035>). For the purposes of
benchmarking or regression testing, SIMD-accelerated Huffman encoding can be
disabled by setting the `JSIMD_NOHUFFENC` environment variable to `1`.

13. Added ARM 64-bit (ARMv8) NEON SIMD implementations of the commonly-used
compression algorithms (including the slow integer forward DCT and h2v2 & h2v1
downsampling algorithms, which are not accelerated in the 32-bit NEON
implementation.) This speeds up the compression of full-color JPEGs by about
75% on average on a Cavium ThunderX processor and by about 2-2.5x on average on
Cortex-A53 and Cortex-A57 cores.

14. Added SIMD acceleration for Huffman encoding on NEON-capable ARM 32-bit
and 64-bit platforms.

    For 32-bit code, this speeds up the compression of full-color JPEGs by
about 30% on average on a typical iOS device (iPhone 4S, Cortex-A9) and by
about 6-7% on average on a typical Android device (Nexus 5X, Cortex-A53 and
Cortex-A57), relative to libjpeg-turbo 1.4.x. Note that the larger speedup
under iOS is due to the fact that iOS builds use LLVM, which does not optimize
the C Huffman encoder as well as GCC does.

    For 64-bit code, NEON-accelerated Huffman encoding speeds up the
compression of full-color JPEGs by about 40% on average on a typical iOS device
(iPhone 5S, Apple A7) and by about 7-8% on average on a typical Android device
(Nexus 5X, Cortex-A53 and Cortex-A57), in addition to the speedup described in
[13] above.

    For the purposes of benchmarking or regression testing, SIMD-accelerated
Huffman encoding can be disabled by setting the `JSIMD_NOHUFFENC` environment
variable to `1`.

15. pkg-config (.pc) scripts are now included for both the libjpeg and
TurboJPEG API libraries on Un*x systems. Note that if a project's build system
relies on these scripts, then it will not be possible to build that project
with libjpeg or with a prior version of libjpeg-turbo.

16. Optimized the ARM 64-bit (ARMv8) NEON SIMD decompression routines to
improve performance on CPUs with in-order pipelines. This speeds up the
decompression of full-color JPEGs by nearly 2x on average on a Cavium ThunderX
processor and by about 15% on average on a Cortex-A53 core.

17. Fixed an issue in the accelerated Huffman decoder that could have caused
the decoder to read past the end of the input buffer when a malformed,
specially-crafted JPEG image was being decompressed. In prior versions of
libjpeg-turbo, the accelerated Huffman decoder was invoked (in most cases) only
if there were > 128 bytes of data in the input buffer. However, it is possible
to construct a JPEG image in which a single Huffman block is over 430 bytes
long, so this version of libjpeg-turbo activates the accelerated Huffman
decoder only if there are > 512 bytes of data in the input buffer.

18. Fixed a memory leak in tjunittest encountered when running the program
with the `-yuv` option.


1.4.2
=====

### Significant changes relative to 1.4.1:

1. Fixed an issue whereby cjpeg would segfault if a Windows bitmap with a
negative width or height was used as an input image (Windows bitmaps can have
a negative height if they are stored in top-down order, but such files are
rare and not supported by libjpeg-turbo.)

2. Fixed an issue whereby, under certain circumstances, libjpeg-turbo would
incorrectly encode certain JPEG images when quality=100 and the fast integer
forward DCT were used. This was known to cause `make test` to fail when the
library was built with `-march=haswell` on x86 systems.

3. Fixed an issue whereby libjpeg-turbo would crash when built with the latest
& greatest development version of the Clang/LLVM compiler. This was caused by
an x86-64 ABI conformance issue in some of libjpeg-turbo's 64-bit SSE2 SIMD
routines. Those routines were incorrectly using a 64-bit `mov` instruction to
transfer a 32-bit JDIMENSION argument, whereas the x86-64 ABI allows the upper
(unused) 32 bits of a 32-bit argument's register to be undefined. The new
Clang/LLVM optimizer uses load combining to transfer multiple adjacent 32-bit
structure members into a single 64-bit register, and this exposed the ABI
conformance issue.

4. Fixed a bug in the MIPS DSPr2 4:2:0 "plain" (non-fancy and non-merged)
upsampling routine that caused a buffer overflow (and subsequent segfault) when
decompressing a 4:2:0 JPEG image whose scaled output width was less than 16
pixels. The "plain" upsampling routines are normally only used when
decompressing a non-YCbCr JPEG image, but they are also used when decompressing
a JPEG image whose scaled output height is 1.

5. Fixed various negative left shifts and other issues reported by the GCC and
Clang undefined behavior sanitizers. None of these was known to pose a
security threat, but removing the warnings makes it easier to detect actual
security issues, should they arise in the future.


1.4.1
=====

### Significant changes relative to 1.4.0:

1. tjbench now properly handles CMYK/YCCK JPEG files. Passing an argument of
`-cmyk` (instead of, for instance, `-rgb`) will cause tjbench to internally
convert the source bitmap to CMYK prior to compression, to generate YCCK JPEG
files, and to internally convert the decompressed CMYK pixels back to RGB after
decompression (the latter is done automatically if a CMYK or YCCK JPEG is
passed to tjbench as a source image.) The CMYK<->RGB conversion operation is
not benchmarked.
NOTE: The quick & dirty CMYK<->RGB conversions that tjbench
uses are suitable for testing only. Proper conversion between CMYK and RGB
requires a color management system.

2. `make test` now performs additional bitwise regression tests using tjbench,
mainly for the purpose of testing compression from/decompression to a subregion
of a larger image buffer.

3. `make test` no longer tests the regression of the floating point DCT/IDCT
by default, since the results of those tests can vary if the algorithms in
question are not implemented using SIMD instructions on a particular platform.
See the comments in [Makefile.am](Makefile.am) for information on how to
re-enable the tests and to specify an expected result for them based on the
particulars of your platform.

4. The NULL color conversion routines have been significantly optimized,
which speeds up the compression of RGB and CMYK JPEGs by 5-20% when using
64-bit code and 0-3% when using 32-bit code, and the decompression of those
images by 10-30% when using 64-bit code and 3-12% when using 32-bit code.

5. Fixed an "illegal instruction" error that occurred when djpeg from a
SIMD-enabled libjpeg-turbo MIPS build was executed with the `-nosmooth` option
on a MIPS machine that lacked DSPr2 support. The MIPS SIMD routines for h2v1
and h2v2 merged upsampling were not properly checking for the existence of
DSPr2.

6. Performance has been improved significantly on 64-bit non-Linux and
non-Windows platforms (generally 10-20% faster compression and 5-10% faster
decompression.) Due to an oversight, the 64-bit version of the accelerated
Huffman codec was not being compiled in when libjpeg-turbo was built on
platforms other than Windows or Linux. Oops.

7. Fixed an extremely rare bug in the Huffman encoder that caused 64-bit
builds of libjpeg-turbo to incorrectly encode a few specific test images when
quality=98, an optimized Huffman table, and the slow integer forward DCT were
used.

8. The Windows (CMake) build system now supports building only static or only
shared libraries. This is accomplished by adding either `-DENABLE_STATIC=0` or
`-DENABLE_SHARED=0` to the CMake command line.

9. TurboJPEG API functions will now return an error code if a warning is
triggered in the underlying libjpeg API. For instance, if a JPEG file is
corrupt, the TurboJPEG decompression functions will attempt to decompress
as much of the image as possible, but those functions will now return -1 to
indicate that the decompression was not entirely successful.

10. Fixed a bug in the MIPS DSPr2 4:2:2 fancy upsampling routine that caused a
buffer overflow (and subsequent segfault) when decompressing a 4:2:2 JPEG image
in which the right-most MCU was 5 or 6 pixels wide.


1.4.0
=====

### Significant changes relative to 1.4 beta1:

1. Fixed a build issue on OS X PowerPC platforms (md5cmp failed to build
because OS X does not provide the `le32toh()` and `htole32()` functions.)

2. The non-SIMD RGB565 color conversion code did not work correctly on big
endian machines. This has been fixed.

3. Fixed an issue in `tjPlaneSizeYUV()` whereby it would erroneously return 1
instead of -1 if `componentID` was > 0 and `subsamp` was `TJSAMP_GRAY`.

4. Fixed an issue in `tjBufSizeYUV2()` whereby it would erroneously return 0
instead of -1 if `width` was < 1.

5. The Huffman encoder now uses `clz` and `bsr` instructions for bit counting
on ARM64 platforms (see 1.4 beta1[5].)

6. The `close()` method in the TJCompressor and TJDecompressor Java classes is
now idempotent.
Previously, that method would call the native `tjDestroy()`
function even if the TurboJPEG instance had already been destroyed. This
caused an exception to be thrown during finalization, if the `close()` method
had already been called. The exception was caught, but it was still an
expensive operation.

7. The TurboJPEG API previously generated an error (`Could not determine
subsampling type for JPEG image`) when attempting to decompress grayscale JPEG
images that were compressed with a sampling factor other than 1 (for instance,
with `cjpeg -grayscale -sample 2x2`). Subsampling technically has no meaning
with grayscale JPEGs, and thus the horizontal and vertical sampling factors
for such images are ignored by the decompressor. However, the TurboJPEG API
was being too rigid and was expecting the sampling factors to be equal to 1
before it treated the image as a grayscale JPEG.

8. cjpeg, djpeg, and jpegtran now accept an argument of `-version`, which will
print the library version and exit.

9. Referring to 1.4 beta1[15], another extremely rare circumstance was
discovered under which the Huffman encoder's local buffer can be overrun
when a buffered destination manager is being used and an
extremely-high-frequency block (basically junk image data) is being encoded.
Even though the Huffman local buffer was increased from 128 bytes to 136 bytes
to address the previous issue, the new issue caused even the larger buffer to
be overrun. Further analysis reveals that, in the absolute worst case (such as
setting alternating AC coefficients to 32767 and -32768 in the JPEG scanning
order), the Huffman encoder can produce encoded blocks that approach double the
size of the unencoded blocks. Thus, the Huffman local buffer was increased to
256 bytes, which should prevent any such issue from re-occurring in the future.

10. The new `tjPlaneSizeYUV()`, `tjPlaneWidth()`, and `tjPlaneHeight()`
functions were not actually usable on any platform except OS X and Windows,
because those functions were not included in the libturbojpeg mapfile. This
has been fixed.

11. Restored the `JPP()`, `JMETHOD()`, and `FAR` macros in the libjpeg-turbo
header files. The `JPP()` and `JMETHOD()` macros were originally implemented
in libjpeg as a way of supporting non-ANSI compilers that lacked support for
prototype parameters. libjpeg-turbo has never supported such compilers, but
some software packages still use the macros to define their own prototypes.
Similarly, libjpeg-turbo has never supported MS-DOS and other platforms that
have far symbols, but some software packages still use the `FAR` macro. A
pretty good argument can be made that this is a bad practice on the part of the
software in question, but since this affects more than one package, it's just
easier to fix it here.

12. Fixed issues that were preventing the ARM 64-bit SIMD code from compiling
for iOS, and included an ARMv8 architecture in all of the binaries installed by
the "official" libjpeg-turbo SDK for OS X.


1.3.90 (1.4 beta1)
==================

### Significant changes relative to 1.3.1:

1. New features in the TurboJPEG API:

     - YUV planar images can now be generated with an arbitrary line padding
(previously only 4-byte padding, which was compatible with X Video, was
supported.)
     - The decompress-to-YUV function has been extended to support image
scaling.
     - JPEG images can now be compressed from YUV planar source images.
     - YUV planar images can now be decoded into RGB or grayscale images.
     - 4:1:1 subsampling is now supported. This is mainly included for
compatibility, since 4:1:1 is not fully accelerated in libjpeg-turbo and has no
significant advantages relative to 4:2:0.
     - CMYK images are now supported.
This feature allows CMYK source images
to be compressed to YCCK JPEGs and YCCK or CMYK JPEGs to be decompressed to
CMYK destination images. Conversion between CMYK/YCCK and RGB or YUV images is
not supported. Such conversion requires a color management system and is thus
out of scope for a codec library.
     - The handling of YUV images in the Java API has been significantly
refactored and should now be much more intuitive.
     - The Java API now supports encoding a YUV image from an arbitrary
position in a large image buffer.
     - All of the YUV functions now have a corresponding function that operates
on separate image planes instead of a unified image buffer. This allows for
compressing/decoding from or decompressing/encoding to a subregion of a larger
YUV image. It also allows for handling YUV formats that swap the order of the
U and V planes.

2. Added SIMD acceleration for DSPr2-capable MIPS platforms. This speeds up
the compression of full-color JPEGs by 70-80% on such platforms and
decompression by 25-35%.

3. If an application attempts to decompress a Huffman-coded JPEG image whose
header does not contain Huffman tables, libjpeg-turbo will now insert the
default Huffman tables. In order to save space, many motion JPEG video frames
are encoded without the default Huffman tables, so these frames can now be
successfully decompressed by libjpeg-turbo without additional work on the part
of the application. An application can still override the Huffman tables, for
instance to re-use tables from a previous frame of the same video.

4. The Mac packaging system now uses pkgbuild and productbuild rather than
PackageMaker (which is obsolete and no longer supported.) This means that
OS X 10.6 "Snow Leopard" or later must be used when packaging libjpeg-turbo,
although the packages produced can be installed on OS X 10.5 "Leopard" or
later. OS X 10.4 "Tiger" is no longer supported.

5. The Huffman encoder now uses `clz` and `bsr` instructions for bit counting
on ARM platforms rather than a lookup table. This reduces the memory footprint
by 64k, which may be important for some mobile applications. Out of four
Android devices that were tested, two demonstrated a small overall performance
loss (~3-4% on average) with ARMv6 code and a small gain (also ~3-4%) with
ARMv7 code when enabling this new feature, but the other two devices
demonstrated a significant overall performance gain with both ARMv6 and ARMv7
code (~10-20%) when enabling the feature. Actual mileage may vary.

6. Worked around an issue with Visual C++ 2010 and later that caused incorrect
pixels to be generated when decompressing a JPEG image to a 256-color bitmap,
if compiler optimization was enabled when libjpeg-turbo was built. This caused
the regression tests to fail when doing a release build under Visual C++ 2010
and later.

7. Improved the accuracy and performance of the non-SIMD implementation of the
floating point inverse DCT (using code borrowed from libjpeg v8a and later.)
The accuracy of this implementation now matches the accuracy of the SSE/SSE2
implementation. Note, however, that the floating point DCT/IDCT algorithms are
mainly a legacy feature. They generally do not produce significantly better
accuracy than the slow integer DCT/IDCT algorithms, and they are quite a bit
slower.

8. Added a new output colorspace (`JCS_RGB565`) to the libjpeg API that allows
for decompressing JPEG images into RGB565 (16-bit) pixels. If dithering is not
used, then this code path is SIMD-accelerated on ARM platforms.

9. Numerous obsolete features, such as support for non-ANSI compilers and
support for the MS-DOS memory model, were removed from the libjpeg code,
greatly improving its readability and making it easier to maintain and extend.

10. Fixed a segfault that occurred when calling `output_message()` with
`msg_code` set to `JMSG_COPYRIGHT`.

11. Fixed an issue whereby wrjpgcom was allowing comments longer than 65k
characters to be passed on the command line, which was causing it to generate
incorrect JPEG files.

12. Fixed a bug in the build system that was causing the Windows version of
wrjpgcom to be built using the rdjpgcom source code.

13. Restored 12-bit-per-component JPEG support. A 12-bit version of
libjpeg-turbo can now be built by passing an argument of `--with-12bit` to
configure (Unix) or `-DWITH_12BIT=1` to cmake (Windows.) 12-bit JPEG support
is included only for convenience. Enabling this feature disables all of the
performance features in libjpeg-turbo, as well as arithmetic coding and the
TurboJPEG API. The resulting library still contains the other libjpeg-turbo
features (such as the colorspace extensions), but in general, it performs no
faster than libjpeg v6b.

14. Added ARM 64-bit SIMD acceleration for the YCC-to-RGB color conversion
and IDCT algorithms (both are used during JPEG decompression.) For unknown
reasons (probably related to clang), this code cannot currently be compiled for
iOS.

15. Fixed an extremely rare bug that could cause the Huffman encoder's local
buffer to overrun when a very high-frequency MCU is compressed using quality
100 and no subsampling, and when the JPEG output buffer is being dynamically
resized by the destination manager. This issue was so rare that, even with a
test program specifically designed to make the bug occur (by injecting random
high-frequency YUV data into the compressor), it was reproducible only once in
about every 25 million iterations.

16. Fixed an oversight in the TurboJPEG C wrapper: if any of the JPEG
compression functions was called repeatedly with the same
automatically-allocated destination buffer, then TurboJPEG would erroneously
assume that the `jpegSize` parameter was equal to the size of the buffer, when
in fact that parameter was probably equal to the size of the most recently
compressed JPEG image. If the size of the previous JPEG image was not as large
as the current JPEG image, then TurboJPEG would unnecessarily reallocate the
destination buffer.


1.3.1
=====

### Significant changes relative to 1.3.0:

1. On Un*x systems, `make install` now installs the libjpeg-turbo libraries
into /opt/libjpeg-turbo/lib32 by default on any 32-bit system, not just x86,
and into /opt/libjpeg-turbo/lib64 by default on any 64-bit system, not just
x86-64. You can override this by overriding either the `prefix` or `libdir`
configure variables.

2. The Windows installer now places a copy of the TurboJPEG DLLs in the same
directory as the rest of the libjpeg-turbo binaries. This was mainly done
to support TurboVNC 1.3, which bundles the DLLs in its Windows installation.
When using a 32-bit version of CMake on 64-bit Windows, it is impossible to
access the c:\WINDOWS\system32 directory, which made it impossible for the
TurboVNC build scripts to bundle the 64-bit TurboJPEG DLL.

3. Fixed a bug whereby attempting to encode a progressive JPEG with arithmetic
entropy coding (by passing arguments of `-progressive -arithmetic` to cjpeg or
jpegtran, for instance) would result in an error, `Requested feature was
omitted at compile time`.

4. Fixed a couple of issues whereby malformed JPEG images would cause
libjpeg-turbo to use uninitialized memory during decompression.

5. Fixed an error (`Buffer passed to JPEG library is too small`) that occurred
when calling the TurboJPEG YUV encoding function with a very small (< 5x5)
source image, and added a unit test to check for this error.

6. The Java classes should now build properly under Visual Studio 2010 and
later.

7. Fixed an issue that prevented SRPMs generated using the in-tree packaging
tools from being rebuilt on certain newer Linux distributions.

8. Numerous minor fixes to eliminate compilation and build/packaging system
warnings, fix cosmetic issues, improve documentation clarity, and other general
source cleanup.


1.3.0
=====

### Significant changes relative to 1.3 beta1:

1. `make test` now works properly on FreeBSD, and it no longer requires the
md5sum executable to be present on other Un*x platforms.

2. Overhauled the packaging system:

     - To avoid conflict with vendor-supplied libjpeg-turbo packages, the
official RPMs and DEBs for libjpeg-turbo have been renamed to
"libjpeg-turbo-official".
     - The TurboJPEG libraries are now located under /opt/libjpeg-turbo in the
official Linux and Mac packages, to avoid conflict with vendor-supplied
packages and also to streamline the packaging system.
     - Release packages are now created with the directory structure defined
by the configure variables `prefix`, `bindir`, `libdir`, etc. (Un\*x) or by the
`CMAKE_INSTALL_PREFIX` variable (Windows.) The exception is that the docs are
always located under the system default documentation directory on Un\*x and
Mac systems, and on Windows, the TurboJPEG DLL is always located in the Windows
system directory.
     - To avoid confusion, official libjpeg-turbo packages on Linux/Unix
platforms (except for Mac) will always install the 32-bit libraries in
/opt/libjpeg-turbo/lib32 and the 64-bit libraries in /opt/libjpeg-turbo/lib64.
618 - Fixed an issue whereby, in some cases, the libjpeg-turbo executables on 619Un*x systems were not properly linking with the shared libraries installed by 620the same package. 621 - Fixed an issue whereby building the "installer" target on Windows when 622`WITH_JAVA=1` would fail if the TurboJPEG JAR had not been previously built. 623 - Building the "install" target on Windows now installs files into the 624same places that the installer does. 625 6263. Fixed a Huffman encoder bug that prevented I/O suspension from working 627properly. 628 629 6301.2.90 (1.3 beta1) 631================== 632 633### Significant changes relative to 1.2.1: 634 6351. Added support for additional scaling factors (3/8, 5/8, 3/4, 7/8, 9/8, 5/4, 63611/8, 3/2, 13/8, 7/4, 15/8, and 2) when decompressing. Note that the IDCT will 637not be SIMD-accelerated when using any of these new scaling factors. 638 6392. The TurboJPEG dynamic library is now versioned. It was not strictly 640necessary to do so, because TurboJPEG uses versioned symbols, and if a function 641changes in an ABI-incompatible way, that function is renamed and a legacy 642function is provided to maintain backward compatibility. However, certain 643Linux distro maintainers have a policy against accepting any library that isn't 644versioned. 645 6463. Extended the TurboJPEG Java API so that it can be used to compress a JPEG 647image from and decompress a JPEG image to an arbitrary position in a large 648image buffer. 649 6504. The `tjDecompressToYUV()` function now supports the `TJFLAG_FASTDCT` flag. 651 6525. The 32-bit supplementary package for amd64 Debian systems now provides 653symlinks in /usr/lib/i386-linux-gnu for the TurboJPEG libraries in /usr/lib32. 654This allows those libraries to be used on MultiArch-compatible systems (such as 655Ubuntu 11 and later) without setting the linker path. 656 6576. 
The TurboJPEG Java wrapper should now find the JNI library on Mac systems 658without having to pass `-Djava.library.path=/usr/lib` to java. 659 6607. TJBench has been ported to Java to provide a convenient way of validating 661the performance of the TurboJPEG Java API. It can be run with 662`java -cp turbojpeg.jar TJBench`. 663 6648. cjpeg can now be used to generate JPEG files with the RGB colorspace 665(feature ported from jpeg-8d.) 666 6679. The width and height in the `-crop` argument passed to jpegtran can now be 668suffixed with `f` to indicate that, when the upper left corner of the cropping 669region is automatically moved to the nearest iMCU boundary, the bottom right 670corner should be moved by the same amount. In other words, this feature causes 671jpegtran to strictly honor the specified width/height rather than the specified 672bottom right corner (feature ported from jpeg-8d.) 673 67410. JPEG files using the RGB colorspace can now be decompressed into grayscale 675images (feature ported from jpeg-8d.) 676 67711. Fixed a regression caused by 1.2.1[7] whereby the build would fail with 678multiple "Mismatch in operand sizes" errors when attempting to build the x86 679SIMD code with NASM 0.98. 680 68112. The in-memory source/destination managers (`jpeg_mem_src()` and 682`jpeg_mem_dest()`) are now included by default when building libjpeg-turbo with 683libjpeg v6b or v7 emulation, so that programs can take advantage of these 684functions without requiring the use of the backward-incompatible libjpeg v8 685ABI. The "age number" of the libjpeg-turbo library on Un*x systems has been 686incremented by 1 to reflect this. You can disable this feature with a 687configure/CMake switch in order to retain strict API/ABI compatibility with the 688libjpeg v6b or v7 API/ABI (or with previous versions of libjpeg-turbo.) See 689[README.md](README.md) for more details. 690 69113. 
Added ARMv7s architecture to libjpeg.a and libturbojpeg.a in the official 692libjpeg-turbo binary package for OS X, so that those libraries can be used to 693build applications that leverage the faster CPUs in the iPhone 5 and iPad 4. 694 695 6961.2.1 697===== 698 699### Significant changes relative to 1.2.0: 700 7011. Creating or decoding a JPEG file that uses the RGB colorspace should now 702properly work when the input or output colorspace is one of the libjpeg-turbo 703colorspace extensions. 704 7052. When libjpeg-turbo was built without SIMD support and merged (non-fancy) 706upsampling was used along with an alpha-enabled colorspace during 707decompression, the unused byte of the decompressed pixels was not being set to 7080xFF. This has been fixed. TJUnitTest has also been extended to test for the 709correct behavior of the colorspace extensions when merged upsampling is used. 710 7113. Fixed a bug whereby the libjpeg-turbo SSE2 SIMD code would not preserve the 712upper 64 bits of xmm6 and xmm7 on Win64 platforms, which violated the Win64 713calling conventions. 714 7154. Fixed a regression caused by 1.2.0[6] whereby decompressing corrupt JPEG 716images (specifically, images in which the component count was erroneously set 717to a large value) would cause libjpeg-turbo to segfault. 718 7195. Worked around a severe performance issue with "Bobcat" (AMD Embedded APU) 720processors. The `MASKMOVDQU` instruction, which was used by the libjpeg-turbo 721SSE2 SIMD code, is apparently implemented in microcode on AMD processors, and 722it is painfully slow on Bobcat processors in particular. Eliminating the use 723of this instruction improved performance by an order of magnitude on Bobcat 724processors and by a small amount (typically 5%) on AMD desktop processors. 725 7266. Added SIMD acceleration for performing 4:2:2 upsampling on NEON-capable ARM 727platforms. This speeds up the decompression of 4:2:2 JPEGs by 20-25% on such 728platforms. 729 7307. 
Fixed a regression caused by 1.2.0[2] whereby, on Linux/x86 platforms 731running the 32-bit SSE2 SIMD code in libjpeg-turbo, decompressing a 4:2:0 or 7324:2:2 JPEG image into a 32-bit (RGBX, BGRX, etc.) buffer without using fancy 733upsampling would produce several incorrect columns of pixels at the right-hand 734side of the output image if each row in the output image was not evenly 735divisible by 16 bytes. 736 7378. Fixed an issue whereby attempting to build the SIMD extensions with Xcode 7384.3 on OS X platforms would cause NASM to return numerous errors of the form 739"'%define' expects a macro identifier". 740 7419. Added flags to the TurboJPEG API that allow the caller to force the use of 742either the fast or the accurate DCT/IDCT algorithms in the underlying codec. 743 744 7451.2.0 746===== 747 748### Significant changes relative to 1.2 beta1: 749 7501. Fixed build issue with YASM on Unix systems (the libjpeg-turbo build system 751was not adding the current directory to the assembler include path, so YASM 752was not able to find jsimdcfg.inc.) 753 7542. Fixed out-of-bounds read in SSE2 SIMD code that occurred when decompressing 755a JPEG image to a bitmap buffer whose size was not a multiple of 16 bytes. 756This was more of an annoyance than an actual bug, since it did not cause any 757actual run-time problems, but the issue showed up when running libjpeg-turbo in 758valgrind. See <http://crbug.com/72399> for more information. 759 7603. Added a compile-time macro (`LIBJPEG_TURBO_VERSION`) that can be used to 761check the version of libjpeg-turbo against which an application was compiled. 762 7634. Added new RGBA/BGRA/ABGR/ARGB colorspace extension constants (libjpeg API) 764and pixel formats (TurboJPEG API), which allow applications to specify that, 765when decompressing to a 4-component RGB buffer, the unused byte should be set 766to 0xFF so that it can be interpreted as an opaque alpha channel. 767 7685. 
Fixed regression issue whereby DevIL failed to build against libjpeg-turbo 769because libjpeg-turbo's distributed version of jconfig.h contained an `INLINE` 770macro, which conflicted with a similar macro in DevIL. This macro is used only 771internally when building libjpeg-turbo, so it was moved into config.h. 772 7736. libjpeg-turbo will now correctly decompress erroneous CMYK/YCCK JPEGs whose 774K component is assigned a component ID of 1 instead of 4. Although these files 775are in violation of the spec, other JPEG implementations handle them 776correctly. 777 7787. Added ARMv6 and ARMv7 architectures to libjpeg.a and libturbojpeg.a in 779the official libjpeg-turbo binary package for OS X, so that those libraries can 780be used to build both OS X and iOS applications. 781 782 7831.1.90 (1.2 beta1) 784================== 785 786### Significant changes relative to 1.1.1: 787 7881. Added a Java wrapper for the TurboJPEG API. See [java/README](java/README) 789for more details. 790 7912. The TurboJPEG API can now be used to scale down images during 792decompression. 793 7943. Added SIMD routines for RGB-to-grayscale color conversion, which 795significantly improves the performance of grayscale JPEG compression from an 796RGB source image. 797 7984. Improved the performance of the C color conversion routines, which are used 799on platforms for which SIMD acceleration is not available. 800 8015. Added a function to the TurboJPEG API that performs lossless transforms. 802This function is implemented using the same back end as jpegtran, but it 803performs transcoding entirely in memory and allows multiple transforms and/or 804crop operations to be batched together, so the source coefficients only need to 805be read once. This is useful when generating image tiles from a single source 806JPEG. 807 8086. Added tests for the new TurboJPEG scaled decompression and lossless 809transform features to tjbench (the TurboJPEG benchmark, formerly called 810"jpgtest".) 811 8127. 
Added support for 4:4:0 (transposed 4:2:2) subsampling in TurboJPEG, which 813was necessary in order for it to read 4:2:2 JPEG files that had been losslessly 814transposed or rotated 90 degrees. 815 8168. All legacy VirtualGL code has been re-factored, and this has allowed 817libjpeg-turbo, in its entirety, to be re-licensed under a BSD-style license. 818 8199. libjpeg-turbo can now be built with YASM. 820 82110. Added SIMD acceleration for ARM Linux and iOS platforms that support 822NEON instructions. 823 82411. Refactored the TurboJPEG C API and documented it using Doxygen. The 825TurboJPEG 1.2 API uses pixel formats to define the size and component order of 826the uncompressed source/destination images, and it includes a more efficient 827version of `TJBUFSIZE()` that computes a worst-case JPEG size based on the 828level of chrominance subsampling. The refactored implementation of the 829TurboJPEG API now uses the libjpeg memory source and destination managers, 830which allows the TurboJPEG compressor to grow the JPEG buffer as necessary. 831 83212. Eliminated errors in the output of jpegtran on Windows that occurred when 833the application was invoked using I/O redirection 834(`jpegtran <input.jpg >output.jpg`.) 835 83613. The inclusion of libjpeg v7 and v8 emulation as well as arithmetic coding 837support in libjpeg-turbo v1.1.0 introduced several new error constants in 838jerror.h, and these were mistakenly enabled for all emulation modes, causing 839the error enum in libjpeg-turbo to sometimes have different values than the 840same enum in libjpeg. This represents an ABI incompatibility, and it caused 841problems with rare applications that took specific action based on a particular 842error value. The fix was to include the new error constants conditionally 843based on whether libjpeg v7 or v8 emulation was enabled. 844 84514. 
Fixed an issue whereby Windows applications that used libjpeg-turbo would 846fail to compile if the Windows system headers were included before jpeglib.h. 847This issue was caused by a conflict in the definition of the INT32 type. 848 84915. Fixed 32-bit supplementary package for amd64 Debian systems, which was 850broken by enhancements to the packaging system in 1.1. 851 85216. When decompressing a JPEG image using an output colorspace of 853`JCS_EXT_RGBX`, `JCS_EXT_BGRX`, `JCS_EXT_XBGR`, or `JCS_EXT_XRGB`, 854libjpeg-turbo will now set the unused byte to 0xFF, which allows applications 855to interpret that byte as an alpha channel (0xFF = opaque). 856 857 8581.1.1 859===== 860 861### Significant changes relative to 1.1.0: 862 8631. Fixed a 1-pixel error in row 0, column 21 of the luminance plane generated 864by `tjEncodeYUV()`. 865 8662. libjpeg-turbo's accelerated Huffman decoder previously ignored unexpected 867markers found in the middle of the JPEG data stream during decompression. It 868will now hand off decoding of a particular block to the unaccelerated Huffman 869decoder if an unexpected marker is found, so that the unaccelerated Huffman 870decoder can generate an appropriate warning. 871 8723. Older versions of MinGW64 prefixed symbol names with underscores by 873default, which differed from the behavior of 64-bit Visual C++. MinGW64 1.0 874has adopted the behavior of 64-bit Visual C++ as the default, so to accommodate 875this, the libjpeg-turbo SIMD function names are no longer prefixed with an 876underscore when building with MinGW64. This means that, when building 877libjpeg-turbo with older versions of MinGW64, you will now have to add 878`-fno-leading-underscore` to the `CFLAGS`. 879 8804. Fixed a regression bug in the NSIS script that caused the Windows installer 881build to fail when using the Visual Studio IDE. 882 8835. 
Fixed a bug in `jpeg_read_coefficients()` whereby it would not initialize 884`cinfo->image_width` and `cinfo->image_height` if libjpeg v7 or v8 emulation 885was enabled. This specifically caused the jpegoptim program to fail if it was 886linked against a version of libjpeg-turbo that was built with libjpeg v7 or v8 887emulation. 888 8896. Eliminated excessive I/O overhead that occurred when reading BMP files in 890cjpeg. 891 8927. Eliminated errors in the output of cjpeg on Windows that occurred when the 893application was invoked using I/O redirection (`cjpeg <inputfile >output.jpg`.) 894 895 8961.1.0 897===== 898 899### Significant changes relative to 1.1 beta1: 900 9011. The algorithm used by the SIMD quantization function cannot produce correct 902results when the JPEG quality is >= 98 and the fast integer forward DCT is 903used. Thus, the non-SIMD quantization function is now used for those cases, 904and libjpeg-turbo should now produce identical output to libjpeg v6b in all 905cases. 906 9072. Despite the above, the fast integer forward DCT still degrades somewhat for 908JPEG qualities greater than 95, so the TurboJPEG wrapper will now automatically 909use the slow integer forward DCT when generating JPEG images of quality 96 or 910greater. This reduces compression performance by as much as 15% for these 911high-quality images but is necessary to ensure that the images are perceptually 912lossless. It also ensures that the library can avoid the performance pitfall 913created by [1]. 914 9153. Ported jpgtest.cxx to pure C to avoid the need for a C++ compiler. 916 9174. Fixed visual artifacts in grayscale JPEG compression caused by a typo in 918the RGB-to-luminance lookup tables. 919 9205. The Windows distribution packages now include the libjpeg run-time programs 921(cjpeg, etc.) 922 9236. All packages now include jpgtest. 924 9257. The TurboJPEG dynamic library now uses versioned symbols. 926 9278. 
Added two new TurboJPEG API functions, `tjEncodeYUV()` and 928`tjDecompressToYUV()`, to replace the somewhat hackish `TJ_YUV` flag. 929 930 9311.0.90 (1.1 beta1) 932================== 933 934### Significant changes relative to 1.0.1: 935 9361. Added emulation of the libjpeg v7 and v8 APIs and ABIs. See 937[README.md](README.md) for more details. This feature was sponsored by 938CamTrace SAS. 939 9402. Created a new CMake-based build system for the Visual C++ and MinGW builds. 941 9423. Grayscale bitmaps can now be compressed from/decompressed to using the 943TurboJPEG API. 944 9454. jpgtest can now be used to test decompression performance with existing 946JPEG images. 947 9485. If the default install prefix (/opt/libjpeg-turbo) is used, then 949`make install` now creates /opt/libjpeg-turbo/lib32 and 950/opt/libjpeg-turbo/lib64 sym links to duplicate the behavior of the binary 951packages. 952 9536. All symbols in the libjpeg-turbo dynamic library are now versioned, even 954when the library is built with libjpeg v6b emulation. 955 9567. Added arithmetic encoding and decoding support (can be disabled with 957configure or CMake options) 958 9598. Added a `TJ_YUV` flag to the TurboJPEG API, which causes both the compressor 960and decompressor to output planar YUV images. 961 9629. Added an extended version of `tjDecompressHeader()` to the TurboJPEG API, 963which allows the caller to determine the type of subsampling used in a JPEG 964image. 965 96610. Added further protections against invalid Huffman codes. 967 968 9691.0.1 970===== 971 972### Significant changes relative to 1.0.0: 973 9741. The Huffman decoder will now handle erroneous Huffman codes (for instance, 975from a corrupt JPEG image.) Previously, these would cause libjpeg-turbo to 976crash under certain circumstances. 977 9782. Fixed typo in SIMD dispatch routines that was causing 4:2:2 upsampling to 979be used instead of 4:2:0 when decompressing JPEG images using SSE2 code. 980 9813. 
The configure script will now automatically determine whether the 982`INCOMPLETE_TYPES_BROKEN` macro should be defined. 983 984 9851.0.0 986===== 987 988### Significant changes relative to 0.0.93: 989 9901. 2983700: Further FreeBSD build tweaks (no longer necessary to specify 991`--host` when configuring on a 64-bit system) 992 9932. Created symlinks in the Unix/Linux packages so that the TurboJPEG 994include file can always be found in /opt/libjpeg-turbo/include, the 32-bit 995static libraries can always be found in /opt/libjpeg-turbo/lib32, and the 99664-bit static libraries can always be found in /opt/libjpeg-turbo/lib64. 997 9983. The Unix/Linux distribution packages now include the libjpeg run-time 999programs (cjpeg, etc.) and man pages. 1000 10014. Created a 32-bit supplementary package for amd64 Debian systems, which 1002contains just the 32-bit libjpeg-turbo libraries. 1003 10045. Moved the libraries from */lib32 to */lib in the i386 Debian package. 1005 10066. Include distribution package for Cygwin 1007 10087. No longer necessary to specify `--without-simd` on non-x86 architectures, 1009and unit tests now work on those architectures. 1010 1011 10120.0.93 1013====== 1014 1015### Significant changes since 0.0.91: 1016 10171. 2982659: Fixed x86-64 build on FreeBSD systems 1018 10192. 2988188: Added support for Windows 64-bit systems 1020 1021 10220.0.91 1023====== 1024 1025### Significant changes relative to 0.0.90: 1026 10271. Added documentation to .deb packages 1028 10292. 2968313: Fixed data corruption issues when decompressing large JPEG images 1030and/or using buffered I/O with the libjpeg-turbo decompressor 1031 1032 10330.0.90 1034====== 1035 1036Initial release 1037