
:mod:`zlib` --- Compression compatible with :program:`gzip`
===========================================================

.. module:: zlib
   :synopsis: Low-level interface to compression and decompression routines compatible with gzip.


For applications that require data compression, the functions in this module
allow compression and decompression, using the zlib library. The zlib library
has its own home page at http://www.zlib.net. There are known
incompatibilities between the Python module and versions of the zlib library
earlier than 1.1.3; 1.1.3 has a security vulnerability, so we recommend using
1.1.4 or later.

zlib's functions have many options and often need to be used in a particular
order. This documentation doesn't attempt to cover all of the permutations;
consult the zlib manual at http://www.zlib.net/manual.html for authoritative
information.

For reading and writing ``.gz`` files see the :mod:`gzip` module.

The available exception and functions in this module are:


.. exception:: error

   Exception raised on compression and decompression errors.


.. function:: adler32(data[, value])

   Computes an Adler-32 checksum of *data*. (An Adler-32 checksum is almost as
   reliable as a CRC32 but can be computed much more quickly.) If *value* is
   present, it is used as the starting value of the checksum; otherwise, a fixed
   default value is used. This allows computing a running checksum over the
   concatenation of several inputs. The algorithm is not cryptographically
   strong, and should not be used for authentication or digital signatures. Since
   the algorithm is designed for use as a checksum algorithm, it is not suitable
   for use as a general hash algorithm.

   This function always returns an integer object.

.. note::
   To generate the same numeric value across all Python versions and
   platforms use ``adler32(data) & 0xffffffff``.
   If you are only using
   the checksum in packed binary format this is not necessary, as the
   return value is the correct 32-bit binary representation
   regardless of sign.

.. versionchanged:: 2.6
   The return value is in the range [-2**31, 2**31-1]
   regardless of platform. In older versions the value was
   signed on some platforms and unsigned on others.

.. versionchanged:: 3.0
   The return value is unsigned and in the range [0, 2**32-1]
   regardless of platform.


.. function:: compress(string[, level])

   Compresses the data in *string*, returning a string containing compressed data.
   *level* is an integer from ``0`` to ``9`` controlling the level of compression;
   ``1`` is fastest and produces the least compression, ``9`` is slowest and
   produces the most. ``0`` is no compression. The default value is ``6``.
   Raises the :exc:`error` exception if any error occurs.


.. function:: compressobj([level[, method[, wbits[, memlevel[, strategy]]]]])

   Returns a compression object, to be used for compressing data streams that won't
   fit into memory at once. *level* is an integer from
   ``0`` to ``9`` or ``-1``, controlling
   the level of compression; ``1`` is fastest and produces the least compression,
   ``9`` is slowest and produces the most. ``0`` is no compression. The default
   value is ``-1`` (``Z_DEFAULT_COMPRESSION``), which represents a default
   compromise between speed and compression (currently equivalent to level 6).

   *method* is the compression algorithm. Currently, the only supported value is
   ``DEFLATED``.

   The *wbits* argument controls the size of the history buffer (or the
   "window size") used when compressing data, and whether a header and
   trailer are included in the output. It can take several ranges of values.
   The default is 15.

   * +9 to +15: The base-two logarithm of the window size, which
     therefore ranges between 512 and 32768.
     Larger values produce
     better compression at the expense of greater memory usage. The
     resulting output will include a zlib-specific header and trailer.

   * −9 to −15: Uses the absolute value of *wbits* as the
     window size logarithm, while producing a raw output stream with no
     header or trailing checksum.

   * +25 to +31 = 16 + (9 to 15): Uses the low 4 bits of the value as the
     window size logarithm, while including a basic :program:`gzip` header
     and trailing checksum in the output.

   *memlevel* controls the amount of memory used for internal compression state.
   Valid values range from ``1`` to ``9``. Higher values use more memory,
   but are faster and produce smaller output. The default is 8.

   *strategy* is used to tune the compression algorithm. Possible values are
   ``Z_DEFAULT_STRATEGY``, ``Z_FILTERED``, and ``Z_HUFFMAN_ONLY``. The default
   is ``Z_DEFAULT_STRATEGY``.


.. function:: crc32(data[, value])

   .. index::
      single: Cyclic Redundancy Check
      single: checksum; Cyclic Redundancy Check

   Computes a CRC (Cyclic Redundancy Check) checksum of *data*. If *value* is
   present, it is used as the starting value of the checksum; otherwise, a fixed
   default value is used. This allows computing a running checksum over the
   concatenation of several inputs. The algorithm is not cryptographically
   strong, and should not be used for authentication or digital signatures. Since
   the algorithm is designed for use as a checksum algorithm, it is not suitable
   for use as a general hash algorithm.

   This function always returns an integer object.

.. note::
   To generate the same numeric value across all Python versions and
   platforms use ``crc32(data) & 0xffffffff``. If you are only using
   the checksum in packed binary format this is not necessary, as the
   return value is the correct 32-bit binary representation
   regardless of sign.

.. versionchanged:: 2.6
   The return value is in the range [-2**31, 2**31-1]
   regardless of platform. In older versions the value would be
   signed on some platforms and unsigned on others.

.. versionchanged:: 3.0
   The return value is unsigned and in the range [0, 2**32-1]
   regardless of platform.


.. function:: decompress(string[, wbits[, bufsize]])

   Decompresses the data in *string*, returning a string containing the
   uncompressed data. The *wbits* parameter depends on
   the format of *string*, and is discussed further below.
   If *bufsize* is given, it is used as the initial size of the output
   buffer. Raises the :exc:`error` exception if any error occurs.

   .. _decompress-wbits:

   The *wbits* parameter controls the size of the history buffer
   (or "window size"), and what header and trailer format is expected.
   It is similar to the parameter for :func:`compressobj`, but accepts
   more ranges of values:

   * +8 to +15: The base-two logarithm of the window size. The input
     must include a zlib header and trailer.

   * 0: Automatically determine the window size from the zlib header.
     Only supported since zlib 1.2.3.5.

   * −8 to −15: Uses the absolute value of *wbits* as the window size
     logarithm. The input must be a raw stream with no header or trailer.

   * +24 to +31 = 16 + (8 to 15): Uses the low 4 bits of the value as
     the window size logarithm. The input must include a gzip header and
     trailer.

   * +40 to +47 = 32 + (8 to 15): Uses the low 4 bits of the value as
     the window size logarithm, and automatically accepts either
     the zlib or gzip format.

   When decompressing a stream, the window size must not be smaller
   than the size originally used to compress the stream; using a too-small
   value may result in an :exc:`error` exception.
   The default *wbits* value
   is 15, which corresponds to the largest window size and requires a zlib
   header and trailer to be included.

   *bufsize* is the initial size of the buffer used to hold decompressed data. If
   more space is required, the buffer size will be increased as needed, so you
   don't have to get this value exactly right; tuning it will only save a few calls
   to :c:func:`malloc`. The default size is 16384.


.. function:: decompressobj([wbits])

   Returns a decompression object, to be used for decompressing data streams that
   won't fit into memory at once.

   The *wbits* parameter controls the size of the history buffer (or the
   "window size"), and what header and trailer format is expected. It has
   the same meaning as `described for decompress() <#decompress-wbits>`__.

Compression objects support the following methods:


.. method:: Compress.compress(string)

   Compress *string*, returning a string containing compressed data for at least
   part of the data in *string*. This data should be concatenated to the output
   produced by any preceding calls to the :meth:`compress` method. Some input may
   be kept in internal buffers for later processing.


.. method:: Compress.flush([mode])

   All pending input is processed, and a string containing the remaining compressed
   output is returned. *mode* can be selected from the constants
   :const:`Z_SYNC_FLUSH`, :const:`Z_FULL_FLUSH`, or :const:`Z_FINISH`,
   defaulting to :const:`Z_FINISH`. :const:`Z_SYNC_FLUSH` and
   :const:`Z_FULL_FLUSH` allow compressing further strings of data, while
   :const:`Z_FINISH` finishes the compressed stream and prevents compressing any
   more data. After calling :meth:`flush` with *mode* set to :const:`Z_FINISH`,
   the :meth:`compress` method cannot be called again; the only realistic action is
   to delete the object.


.. method:: Compress.copy()

   Returns a copy of the compression object. This can be used to efficiently
   compress a set of data that share a common initial prefix.

   .. versionadded:: 2.5

Decompression objects support the following methods, and two attributes:


.. attribute:: Decompress.unused_data

   A string which contains any bytes past the end of the compressed data. That is,
   this remains ``""`` until the last byte that contains compression data is
   available. If the whole string turned out to contain compressed data, this is
   ``""``, the empty string.

   The only way to determine where a string of compressed data ends is by actually
   decompressing it. This means that when compressed data is part of a
   larger file, you can only find the end of it by reading data and feeding it
   followed by some non-empty string into a decompression object's
   :meth:`decompress` method until the :attr:`unused_data` attribute is no longer
   the empty string.


.. attribute:: Decompress.unconsumed_tail

   A string that contains any data that was not consumed by the last
   :meth:`decompress` call because it exceeded the limit for the uncompressed data
   buffer. This data has not yet been seen by the zlib machinery, so you must feed
   it (possibly with further data concatenated to it) back to a subsequent
   :meth:`decompress` method call in order to get correct output.


.. method:: Decompress.decompress(string[, max_length])

   Decompress *string*, returning a string containing the uncompressed data
   corresponding to at least part of the data in *string*. This data should be
   concatenated to the output produced by any preceding calls to the
   :meth:`decompress` method. Some of the input data may be preserved in internal
   buffers for later processing.
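
   The incremental pattern described above can be sketched as follows. This is
   a minimal illustration rather than a reference implementation: the helper
   name ``decompress_chunks`` and the 64-byte slice size are assumptions made
   for the example, and byte literals (``b"..."``) are used so that the sketch
   runs on Python 2.6+ as well as Python 3.

   .. code-block:: python

      import zlib

      def decompress_chunks(chunks):
          """Decompress an iterable of compressed chunks (hypothetical helper)."""
          decompressor = zlib.decompressobj()
          pieces = []
          for chunk in chunks:
              # Each call returns whatever output is available so far;
              # the result may be empty if zlib needs more input first.
              pieces.append(decompressor.decompress(chunk))
          # flush() returns any output still held in internal buffers.
          pieces.append(decompressor.flush())
          return b"".join(pieces)

      original = b"spam and eggs " * 1000
      compressed = zlib.compress(original)
      # Feed the compressed data in small, arbitrary slices.
      chunks = [compressed[i:i + 64] for i in range(0, len(compressed), 64)]
      restored = decompress_chunks(chunks)

   The pieces are joined once at the end instead of being concatenated inside
   the loop, which avoids repeatedly copying a growing string.
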

   If the optional parameter *max_length* is non-zero then the return value will be
   no longer than *max_length*. This may mean that not all of the compressed input
   can be processed, and any unconsumed data will be stored in the attribute
   :attr:`unconsumed_tail`. This string must be passed to a subsequent call to
   :meth:`decompress` if decompression is to continue. If *max_length* is not
   supplied then the whole input is decompressed, and :attr:`unconsumed_tail` is an
   empty string.


.. method:: Decompress.flush([length])

   All pending input is processed, and a string containing the remaining
   uncompressed output is returned. After calling :meth:`flush`, the
   :meth:`decompress` method cannot be called again; the only realistic action is
   to delete the object.

   The optional parameter *length* sets the initial size of the output buffer.


.. method:: Decompress.copy()

   Returns a copy of the decompression object. This can be used to save the state
   of the decompressor midway through the data stream in order to speed up random
   seeks into the stream at a future point.

   .. versionadded:: 2.5


.. seealso::

   Module :mod:`gzip`
      Reading and writing :program:`gzip`\ -format files.

   http://www.zlib.net
      The zlib library home page.

   http://www.zlib.net/manual.html
      The zlib manual explains the semantics and usage of the library's many
      functions.

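
As an illustration of the *max_length* and :attr:`unconsumed_tail` behaviour
described for :meth:`Decompress.decompress` above, the following sketch
decompresses data while capping the size of each piece returned by
:meth:`decompress`. The generator name ``bounded_decompress`` and the
1024-byte limit are assumptions chosen for the example and are not part of
the module; treat it as one possible arrangement, not the canonical one.

.. code-block:: python

   import zlib

   def bounded_decompress(data, max_piece=1024):
       """Yield decompressed pieces, limiting each decompress() call
       to max_piece bytes of output (hypothetical helper)."""
       decompressor = zlib.decompressobj()
       pending = data
       while pending:
           piece = decompressor.decompress(pending, max_piece)
           if piece:
               yield piece
           # Input that was not consumed because the output limit was
           # reached must be passed to the next decompress() call.
           pending = decompressor.unconsumed_tail
       # flush() is not limited by max_piece; it returns whatever output
       # is still buffered once all input has been handed over.
       tail = decompressor.flush()
       if tail:
           yield tail

   original = b"abc" * 100000
   compressed = zlib.compress(original)
   pieces = list(bounded_decompress(compressed, 1024))

Joining ``pieces`` reproduces ``original``. In a real program each piece would
typically be written out or processed as it is produced, so that only a small
amount of uncompressed data is held in memory at any time.
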