:mod:`gzip` --- Support for :program:`gzip` files
=================================================

.. module:: gzip
   :synopsis: Interfaces for gzip compression and decompression using file objects.

**Source code:** :source:`Lib/gzip.py`

--------------

This module provides a simple interface to compress and decompress files just
like the GNU programs :program:`gzip` and :program:`gunzip` would.

The data compression is provided by the :mod:`zlib` module.

The :mod:`gzip` module provides the :class:`GzipFile` class, as well as the
:func:`.open`, :func:`compress` and :func:`decompress` convenience functions.
The :class:`GzipFile` class reads and writes :program:`gzip`\ -format files,
automatically compressing or decompressing the data so that it looks like an
ordinary :term:`file object`.

Note that additional file formats which can be decompressed by the
:program:`gzip` and :program:`gunzip` programs, such as those produced by
:program:`compress` and :program:`pack`, are not supported by this module.

The module defines the following items:


.. function:: open(filename, mode='rb', compresslevel=9, encoding=None, errors=None, newline=None)

   Open a gzip-compressed file in binary or text mode, returning a :term:`file
   object`.

   The *filename* argument can be an actual filename (a :class:`str` or
   :class:`bytes` object), or an existing file object to read from or write to.

   The *mode* argument can be any of ``'r'``, ``'rb'``, ``'a'``, ``'ab'``,
   ``'w'``, ``'wb'``, ``'x'`` or ``'xb'`` for binary mode, or ``'rt'``,
   ``'at'``, ``'wt'``, or ``'xt'`` for text mode. The default is ``'rb'``.

   The *compresslevel* argument is an integer from 0 to 9, as for the
   :class:`GzipFile` constructor.

   For binary mode, this function is equivalent to the :class:`GzipFile`
   constructor: ``GzipFile(filename, mode, compresslevel)``. In this case, the
   *encoding*, *errors* and *newline* arguments must not be provided.

   For text mode, a :class:`GzipFile` object is created, and wrapped in an
   :class:`io.TextIOWrapper` instance with the specified encoding, error
   handling behavior, and line ending(s).

   .. versionchanged:: 3.3
      Added support for *filename* being a file object, support for text mode,
      and the *encoding*, *errors* and *newline* arguments.

   .. versionchanged:: 3.4
      Added support for the ``'x'``, ``'xb'`` and ``'xt'`` modes.

   .. versionchanged:: 3.6
      Accepts a :term:`path-like object`.
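
The difference between the text and binary modes of :func:`.open` can be
illustrated with a minimal sketch. The scratch path and file name are
hypothetical, and ``newline="\n"`` is passed on writing only so that the
stored bytes are the same on every platform.

```python
import gzip
import os
import tempfile

# Hypothetical scratch location; any writable path works.
path = os.path.join(tempfile.mkdtemp(), "notes.txt.gz")

# 'wt' wraps the GzipFile in an io.TextIOWrapper, so str objects are written.
with gzip.open(path, "wt", encoding="utf-8", newline="\n") as f:
    f.write("first line\nsecond line\n")

# 'rt' decodes the decompressed bytes back to str.
with gzip.open(path, "rt", encoding="utf-8") as f:
    lines = f.readlines()

# Binary mode returns the raw decompressed bytes instead.
with gzip.open(path, "rb") as f:
    raw = f.read()

# The text-only arguments are rejected in binary mode.
try:
    gzip.open(path, "rb", encoding="utf-8")
except ValueError:
    binary_rejects_encoding = True
else:
    binary_rejects_encoding = False
```

Text mode is a convenience layer: the file on disk is the same gzip stream
either way, and only the type handed to the caller differs.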

.. exception:: BadGzipFile

   An exception raised for invalid gzip files.  It inherits from :exc:`OSError`.
   :exc:`EOFError` and :exc:`zlib.error` can also be raised for invalid gzip
   files.

   .. versionadded:: 3.8
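
Because invalid input can surface as any of the three exceptions named above,
callers may want to catch them together. The following is a sketch only:
``try_decompress`` is a hypothetical helper, not part of the module, and which
exception a given malformed input triggers should be treated as an
implementation detail.

```python
import gzip
import zlib


def try_decompress(data):
    """Return (payload, None) on success or (None, error) for invalid input."""
    try:
        return gzip.decompress(data), None
    except (gzip.BadGzipFile, EOFError, zlib.error) as exc:
        return None, exc


good = gzip.compress(b"payload")

# Data that does not start with the gzip magic number.
_, bad_magic_error = try_decompress(b"plain text, not gzip")

# A valid stream with part of its trailer cut off.
_, truncated_error = try_decompress(good[:-4])

# A well-formed stream decompresses normally.
payload, no_error = try_decompress(good)
```

Since :exc:`BadGzipFile` is a subclass of :exc:`OSError`, existing code that
already handles :exc:`OSError` around file access will also catch it.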

.. class:: GzipFile(filename=None, mode=None, compresslevel=9, fileobj=None, mtime=None)

   Constructor for the :class:`GzipFile` class, which simulates most of the
   methods of a :term:`file object`, with the exception of the :meth:`truncate`
   method.  At least one of *fileobj* and *filename* must be given a non-trivial
   value.

   The new class instance is based on *fileobj*, which can be a regular file, an
   :class:`io.BytesIO` object, or any other object which simulates a file.  It
   defaults to ``None``, in which case *filename* is opened to provide a file
   object.

   When *fileobj* is not ``None``, the *filename* argument is used only for
   inclusion in the :program:`gzip` file header, which may include the original
   filename of the uncompressed file.  It defaults to the filename of *fileobj*, if
   discernible; otherwise, it defaults to the empty string, and in this case the
   original filename is not included in the header.

   The *mode* argument can be any of ``'r'``, ``'rb'``, ``'a'``, ``'ab'``, ``'w'``,
   ``'wb'``, ``'x'``, or ``'xb'``, depending on whether the file will be read or
   written.  The default is the mode of *fileobj* if discernible; otherwise, the
   default is ``'rb'``.  In future Python releases the mode of *fileobj* will
   not be used.  It is better to always specify *mode* for writing.

   Note that the file is always opened in binary mode. To open a compressed file
   in text mode, use :func:`.open` (or wrap your :class:`GzipFile` with an
   :class:`io.TextIOWrapper`).

   The *compresslevel* argument is an integer from ``0`` to ``9`` controlling
   the level of compression; ``1`` is fastest and produces the least
   compression, and ``9`` is slowest and produces the most compression. ``0``
   is no compression. The default is ``9``.

   The *mtime* argument is an optional numeric timestamp to be written to
   the last modification time field in the stream when compressing.  It
   should only be provided in compression mode.  If omitted or ``None``, the
   current time is used.  See the :attr:`mtime` attribute for more details.

   Calling a :class:`GzipFile` object's :meth:`close` method does not close
   *fileobj*, since you might wish to append more material after the compressed
   data.  This also allows you to pass an :class:`io.BytesIO` object opened for
   writing as *fileobj*, and retrieve the resulting memory buffer using the
   :class:`io.BytesIO` object's :meth:`~io.BytesIO.getvalue` method.

   :class:`GzipFile` supports the :class:`io.BufferedIOBase` interface,
   including iteration and the :keyword:`with` statement.  Only the
   :meth:`truncate` method isn't implemented.

   :class:`GzipFile` also provides the following method and attribute:

   .. method:: peek(n)

      Read *n* uncompressed bytes without advancing the file position.
      At most one read on the compressed stream is done to satisfy
      the call.  The number of bytes returned may be more or less than
      requested.

      .. note:: While calling :meth:`peek` does not change the file position of
         the :class:`GzipFile`, it may change the position of the underlying
         file object (e.g. if the :class:`GzipFile` was constructed with the
         *fileobj* parameter).

      .. versionadded:: 3.2

   .. attribute:: mtime

      When decompressing, the value of the last modification time field in
      the most recently read header may be read from this attribute, as an
      integer.  The initial value before reading any headers is ``None``.

      All :program:`gzip` compressed streams are required to contain this
      timestamp field.  Some programs, such as :program:`gunzip`\ , make use
      of the timestamp.  The format is the same as the return value of
      :func:`time.time` and the :attr:`~os.stat_result.st_mtime` attribute of
      the object returned by :func:`os.stat`.

   .. versionchanged:: 3.1
      Support for the :keyword:`with` statement was added, along with the
      *mtime* constructor argument and :attr:`mtime` attribute.

   .. versionchanged:: 3.2
      Support for zero-padded and unseekable files was added.

   .. versionchanged:: 3.3
      The :meth:`io.BufferedIOBase.read1` method is now implemented.

   .. versionchanged:: 3.4
      Added support for the ``'x'`` and ``'xb'`` modes.

   .. versionchanged:: 3.5
      Added support for writing arbitrary
      :term:`bytes-like objects <bytes-like object>`.
      The :meth:`~io.BufferedIOBase.read` method now accepts an argument of
      ``None``.

   .. versionchanged:: 3.6
      Accepts a :term:`path-like object`.

   .. deprecated:: 3.9
      Opening :class:`GzipFile` for writing without specifying the *mode*
      argument is deprecated.
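
Several of the points above (an in-memory *fileobj*, the fact that
:meth:`close` leaves *fileobj* open, and the :attr:`mtime` attribute) can be
seen together in a short sketch. The payload and the choice of ``mtime=0`` are
arbitrary illustration values.

```python
import gzip
import io

buffer = io.BytesIO()

# Specify the mode explicitly when writing. BytesIO has no name, so no
# original filename is stored in the header.
with gzip.GzipFile(fileobj=buffer, mode="wb", mtime=0) as gz:
    gz.write(b"hello gzip")

# Closing the GzipFile did not close the BytesIO, so the compressed
# bytes can still be retrieved.
compressed = buffer.getvalue()

with gzip.GzipFile(fileobj=io.BytesIO(compressed), mode="rb") as gz:
    mtime_before = gz.mtime   # None until a header has been read
    head = gz.peek(5)         # may hold more or fewer than 5 bytes
    data = gz.read()          # peek() did not advance the position
    mtime_after = gz.mtime    # timestamp taken from the header (0 here)
```

Passing a fixed *mtime* is optional; it is used here so that the value read
back from the header is predictable.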


.. function:: compress(data, compresslevel=9, *, mtime=None)

   Compress the *data*, returning a :class:`bytes` object containing
   the compressed data.  *compresslevel* and *mtime* have the same meaning as in
   the :class:`GzipFile` constructor above.

   .. versionadded:: 3.2
   .. versionchanged:: 3.8
      Added the *mtime* parameter for reproducible output.

.. function:: decompress(data)

   Decompress the *data*, returning a :class:`bytes` object containing the
   uncompressed data.

   .. versionadded:: 3.2
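
As a sketch of the "reproducible output" use case, the example below
compresses the same data twice with a fixed *mtime* and checks the round trip
through :func:`decompress`. The payload and the compression levels are
arbitrary illustration values.

```python
import gzip

payload = b"Lots of content here" * 100

# With a fixed mtime, the header no longer depends on the current time,
# so repeated calls in the same environment give identical bytes.
first = gzip.compress(payload, compresslevel=6, mtime=0)
second = gzip.compress(payload, compresslevel=6, mtime=0)

# Level 0 stores the data without compressing it.
stored = gzip.compress(payload, compresslevel=0, mtime=0)

# decompress() restores the original bytes.
restored = gzip.decompress(first)
```

Output is only expected to be byte-for-byte identical within one Python and
zlib build; different builds may produce different, equally valid, streams.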


.. _gzip-usage-examples:

Examples of usage
-----------------

Example of how to read a compressed file::

   import gzip
   with gzip.open('/home/joe/file.txt.gz', 'rb') as f:
       file_content = f.read()

Example of how to create a compressed GZIP file::

   import gzip
   content = b"Lots of content here"
   with gzip.open('/home/joe/file.txt.gz', 'wb') as f:
       f.write(content)

Example of how to GZIP compress an existing file::

   import gzip
   import shutil
   with open('/home/joe/file.txt', 'rb') as f_in:
       with gzip.open('/home/joe/file.txt.gz', 'wb') as f_out:
           shutil.copyfileobj(f_in, f_out)

Example of how to GZIP compress a binary string::

   import gzip
   s_in = b"Lots of content here"
   s_out = gzip.compress(s_in)

.. seealso::

   Module :mod:`zlib`
      The basic data compression module needed to support the :program:`gzip` file
      format.


.. program:: gzip

Command Line Interface
----------------------

The :mod:`gzip` module provides a simple command line interface to compress or
decompress files.

The :mod:`gzip` module keeps the input file(s) after it has processed them.

.. versionchanged:: 3.8

   Added a new command line interface with a usage message.
   By default, the CLI uses compression level 6.

Command line options
^^^^^^^^^^^^^^^^^^^^

.. cmdoption:: file

   If *file* is not specified, read from :attr:`sys.stdin`.

.. cmdoption:: --fast

   Indicates the fastest compression method (less compression).

.. cmdoption:: --best

   Indicates the slowest compression method (best compression).

.. cmdoption:: -d, --decompress

   Decompress the given file.

.. cmdoption:: -h, --help

   Show the help message.
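
The command line interface can also be driven from a script. The sketch below
runs the equivalent of ``python -m gzip --best report.txt`` through
:mod:`subprocess`. The file name is hypothetical, and the ``.gz`` suffix of
the output file is an assumption based on how current CPython names it, not
something stated above.

```python
import gzip
import os
import subprocess
import sys
import tempfile

workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "report.txt")  # hypothetical file name
original = b"command line example\n" * 50
with open(source, "wb") as f:
    f.write(original)

# Equivalent to running: python -m gzip --best report.txt
subprocess.run([sys.executable, "-m", "gzip", "--best", source], check=True)

# Assumption: the compressed copy is written next to the input as <name>.gz.
archive = source + ".gz"
with gzip.open(archive, "rb") as f:
    roundtrip = f.read()

# The input file is kept, unlike the default behavior of GNU gzip.
input_kept = os.path.exists(source)
```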