Zstandard wrapper for zlib
================================

The main objective of creating a zstd wrapper for [zlib](http://zlib.net/) is to allow a quick and smooth transition to zstd for projects already using zlib.

#### Required files

To build the zstd wrapper for zlib, the following files are required:
- zlib.h
- a static or dynamic zlib library
- zlibWrapper/zstd_zlibwrapper.h
- zlibWrapper/zstd_zlibwrapper.c
- zlibWrapper/gz*.c files (gzclose.c, gzlib.c, gzread.c, gzwrite.c)
- zlibWrapper/gz*.h files (gzcompatibility.h, gzguts.h)
- a static or dynamic zstd library

The first two files are required by all projects using zlib and are not included with the zstd distribution.
The remaining files are supplied with the zstd distribution.


#### Embedding the zstd wrapper within your project

Let's assume that your project that uses zlib is compiled with:
```gcc project.o -lz```

To compile the zstd wrapper with your project, do the following:
- change all occurrences of `#include "zlib.h"` to `#include "zstd_zlibwrapper.h"`
- compile your project with `zstd_zlibwrapper.c`, `gz*.c` and a static or dynamic zstd library

The linking should be changed to:
```gcc project.o zstd_zlibwrapper.o gz*.c -lz -lzstd```


#### Enabling zstd compression within your project

After embedding the zstd wrapper within your project, zstd compression is turned off by default.
Your project should work as before with zlib. There are two options to enable zstd compression:
- compile with `-DZWRAP_USE_ZSTD=1` (or use `#define ZWRAP_USE_ZSTD 1` before `#include "zstd_zlibwrapper.h"`)
- call the `void ZWRAP_useZSTDcompression(int turn_on)` function (declared in `zstd_zlibwrapper.h`)

During decompression, zlib and zstd streams are automatically detected and decompressed using the appropriate library.
This behavior can be changed using `ZWRAP_setDecompressionType(ZWRAP_FORCE_ZLIB)`, which makes zlib decompression slightly faster.


#### Example
We took the file `test/example.c` from [the zlib library distribution](http://zlib.net/) and copied it to [zlibWrapper/examples/example.c](examples/example.c).
After compilation and execution it shows the following results:
```
zlib version 1.2.8 = 0x1280, compile flags = 0x65
uncompress(): hello, hello!
gzread(): hello, hello!
gzgets() after gzseek:  hello!
inflate(): hello, hello!
large_inflate(): OK
after inflateSync(): hello, hello!
inflate with dictionary: hello, hello!
```
Then we changed `#include "zlib.h"` to `#include "zstd_zlibwrapper.h"`, compiled the [example.c](examples/example.c) file
with `-DZWRAP_USE_ZSTD=1` and linked with the additional `zstd_zlibwrapper.o gz*.c -lzstd`.
We had to disable the `test_flush` and `test_sync` functions, which use currently unsupported features.
After running, it shows the following results:
```
zlib version 1.2.8 = 0x1280, compile flags = 0x65
uncompress(): hello, hello!
gzread(): hello, hello!
gzgets() after gzseek:  hello!
inflate(): hello, hello!
large_inflate(): OK
inflate with dictionary: hello, hello!
```
The script used for compilation can be found at [zlibWrapper/Makefile](Makefile).


#### The measurement of performance of Zstandard wrapper for zlib

The zstd distribution contains a tool called `zwrapbench` which can measure the speed and ratio of zlib, zstd, and the wrapper.
The benchmark is conducted using the given filenames, or synthetic data if no filenames are provided.
The files are read into memory and processed independently.
This makes the benchmark more precise, as it eliminates I/O overhead.
Multiple filenames can be supplied as separate parameters or through wildcards, and names of directories can be used as parameters with the `-r` option.
One can select compression levels starting from `-b` and ending with `-e`. The `-i` parameter selects the minimal time used for each of the tested levels.
With the `-B` option, bigger files can be divided into smaller, independently compressed blocks.
The benchmark tool can be compiled with `make zwrapbench` using [zlibWrapper/Makefile](Makefile).


#### Improving speed of streaming compression

During streaming compression, the compressor never knows how much data it will have to compress.
Zstandard compression can be improved by providing the size of the source data to the compressor. By default, the streaming compressor assumes that the data is bigger than 256 KB, which can hurt compression speed on smaller data.
The zstd wrapper provides the `ZWRAP_setPledgedSrcSize()` function, which changes the pledged source size for a given compression stream.
The function changes zstd compression parameters, which may improve compression speed and/or ratio.
It should be called just after `deflateInit()` or `deflateReset()` and before `deflate()` or `deflateSetDictionary()`. The function is only helpful when data is compressed in blocks. It makes no difference when `deflateInit()` or `deflateReset()` is immediately followed by `deflate(strm, Z_FINISH)`, as this case is automatically detected.


#### Reusing contexts

The ordinary zlib compression of two files/streams allocates two contexts:
- for the 1st file calls `deflateInit`, `deflate`, `...`, `deflate`, `deflateEnd`
- for the 2nd file calls `deflateInit`, `deflate`, `...`, `deflate`, `deflateEnd`

The speed of compression can be improved by reusing a single context with the following steps:
- initialize the context with `deflateInit`
- for the 1st file call `deflate`, `...`, `deflate`
- for the 2nd file call `deflateReset`, `deflate`, `...`, `deflate`
- free the context with `deflateEnd`

To check the difference, we ran experiments using `zwrapbench -ri6b6` with zstd and zlib compression (both at level 6).
The input data was the decompressed git repository downloaded from https://github.com/git/git/archive/master.zip, which contains 2979 files.
The table below shows that reusing contexts has a minor influence on zlib but gives an improvement for zstd.
In our example (the last 2 lines) it gives 4% better compression speed and 5% better decompression speed.

| Compression type                                  | Compression | Decompression | Compressed size | Ratio |
| ------------------------------------------------- | ----------- | ------------- | --------------- | ----- |
| zlib 1.2.8                                        | 30.51 MB/s  | 219.3 MB/s    | 6819783         | 3.459 |
| zlib 1.2.8 not reusing a context                  | 30.22 MB/s  | 218.1 MB/s    | 6819783         | 3.459 |
| zlib 1.2.8 with zlibWrapper and reusing a context | 30.40 MB/s  | 218.9 MB/s    | 6819783         | 3.459 |
| zlib 1.2.8 with zlibWrapper not reusing a context | 30.28 MB/s  | 218.1 MB/s    | 6819783         | 3.459 |
| zstd 1.1.0 using ZSTD_CCtx                        | 68.35 MB/s  | 430.9 MB/s    | 6868521         | 3.435 |
| zstd 1.1.0 using ZSTD_CStream                     | 66.63 MB/s  | 422.3 MB/s    | 6868521         | 3.435 |
| zstd 1.1.0 with zlibWrapper and reusing a context | 54.01 MB/s  | 403.2 MB/s    | 6763482         | 3.488 |
| zstd 1.1.0 with zlibWrapper not reusing a context | 51.59 MB/s  | 383.7 MB/s    | 6763482         | 3.488 |


#### Compatibility issues
After enabling zstd compression, not all native zlib functions are supported. When an unsupported function is called, it puts an error message into `strm->msg` and returns `Z_STREAM_ERROR`.

Supported methods:
- deflateInit
- deflate (with exception of Z_FULL_FLUSH, Z_BLOCK, and Z_TREES)
- deflateSetDictionary
- deflateEnd
- deflateReset
- deflateBound
- inflateInit
- inflate
- inflateSetDictionary
- inflateReset
- inflateReset2
- compress
- compress2
- compressBound
- uncompress
- gzip file access functions

Ignored methods (they do nothing):
- deflateParams

Unsupported methods:
- deflateCopy
- deflateTune
- deflatePending
- deflatePrime
- deflateSetHeader
- inflateGetDictionary
- inflateCopy
- inflateSync
- inflatePrime
- inflateMark
- inflateGetHeader
- inflateBackInit
- inflateBack
- inflateBackEnd