:mod:`tarfile` --- Read and write tar archive files
===================================================

.. module:: tarfile
   :synopsis: Read and write tar-format archive files.

.. moduleauthor:: Lars Gustäbel <lars@gustaebel.de>
.. sectionauthor:: Lars Gustäbel <lars@gustaebel.de>

**Source code:** :source:`Lib/tarfile.py`

--------------

The :mod:`tarfile` module makes it possible to read and write tar
archives, including those using gzip, bz2 and lzma compression.
Use the :mod:`zipfile` module to read or write :file:`.zip` files, or the
higher-level functions in :ref:`shutil <archiving-operations>`.

Some facts and figures:

* reads and writes :mod:`gzip`, :mod:`bz2` and :mod:`lzma` compressed archives
  if the respective modules are available.

* read/write support for the POSIX.1-1988 (ustar) format.

* read/write support for the GNU tar format including *longname* and *longlink*
  extensions, read-only support for all variants of the *sparse* extension
  including restoration of sparse files.

* read/write support for the POSIX.1-2001 (pax) format.

* handles directories, regular files, hardlinks, symbolic links, fifos,
  character devices and block devices and is able to acquire and restore file
  information like timestamp, access permissions and owner.

.. versionchanged:: 3.3
   Added support for :mod:`lzma` compression.


.. function:: open(name=None, mode='r', fileobj=None, bufsize=10240, **kwargs)

   Return a :class:`TarFile` object for the pathname *name*. For detailed
   information on :class:`TarFile` objects and the keyword arguments that are
   allowed, see :ref:`tarfile-objects`.

   *mode* must be a string of the form ``'filemode[:compression]'``; it defaults
   to ``'r'``.
   Here is a full list of mode combinations:

   +------------------+---------------------------------------------+
   | mode             | action                                      |
   +==================+=============================================+
   | ``'r' or 'r:*'`` | Open for reading with transparent           |
   |                  | compression (recommended).                  |
   +------------------+---------------------------------------------+
   | ``'r:'``         | Open for reading exclusively without        |
   |                  | compression.                                |
   +------------------+---------------------------------------------+
   | ``'r:gz'``       | Open for reading with gzip compression.     |
   +------------------+---------------------------------------------+
   | ``'r:bz2'``      | Open for reading with bzip2 compression.    |
   +------------------+---------------------------------------------+
   | ``'r:xz'``       | Open for reading with lzma compression.     |
   +------------------+---------------------------------------------+
   | ``'x'`` or       | Create a tarfile exclusively without        |
   | ``'x:'``         | compression.                                |
   |                  | Raise a :exc:`FileExistsError` exception    |
   |                  | if it already exists.                       |
   +------------------+---------------------------------------------+
   | ``'x:gz'``       | Create a tarfile with gzip compression.     |
   |                  | Raise a :exc:`FileExistsError` exception    |
   |                  | if it already exists.                       |
   +------------------+---------------------------------------------+
   | ``'x:bz2'``      | Create a tarfile with bzip2 compression.    |
   |                  | Raise a :exc:`FileExistsError` exception    |
   |                  | if it already exists.                       |
   +------------------+---------------------------------------------+
   | ``'x:xz'``       | Create a tarfile with lzma compression.     |
   |                  | Raise a :exc:`FileExistsError` exception    |
   |                  | if it already exists.                       |
   +------------------+---------------------------------------------+
   | ``'a' or 'a:'``  | Open for appending with no compression. The |
   |                  | file is created if it does not exist.       |
   +------------------+---------------------------------------------+
   | ``'w' or 'w:'``  | Open for uncompressed writing.              |
   +------------------+---------------------------------------------+
   | ``'w:gz'``       | Open for gzip compressed writing.           |
   +------------------+---------------------------------------------+
   | ``'w:bz2'``      | Open for bzip2 compressed writing.          |
   +------------------+---------------------------------------------+
   | ``'w:xz'``       | Open for lzma compressed writing.           |
   +------------------+---------------------------------------------+

   Note that ``'a:gz'``, ``'a:bz2'`` or ``'a:xz'`` is not possible. If *mode*
   is not suitable to open a certain (compressed) file for reading,
   :exc:`ReadError` is raised. Use *mode* ``'r'`` to avoid this. If a
   compression method is not supported, :exc:`CompressionError` is raised.

   If *fileobj* is specified, it is used as an alternative to a :term:`file object`
   opened in binary mode for *name*. It is supposed to be at position 0.

   For modes ``'w:gz'``, ``'r:gz'``, ``'w:bz2'``, ``'r:bz2'``, ``'x:gz'``,
   ``'x:bz2'``, :func:`tarfile.open` accepts the keyword argument
   *compresslevel* (default ``9``) to specify the compression level of the file.

   For modes ``'w:xz'`` and ``'x:xz'``, :func:`tarfile.open` accepts the
   keyword argument *preset* to specify the compression level of the file.

   For special purposes, there is a second format for *mode*:
   ``'filemode|[compression]'``. :func:`tarfile.open` will return a :class:`TarFile`
   object that processes its data as a stream of blocks. No random seeking will
   be done on the file. If given, *fileobj* may be any object that has a
   :meth:`read` or :meth:`write` method (depending on the *mode*). *bufsize*
   specifies the blocksize and defaults to ``20 * 512`` bytes. Use this variant
   in combination with e.g. ``sys.stdin``, a socket :term:`file object` or a tape
   device.
   However, such a :class:`TarFile` object is limited in that it does
   not allow random access, see :ref:`tar-examples`. The currently
   possible modes:

   +-------------+--------------------------------------------+
   | Mode        | Action                                     |
   +=============+============================================+
   | ``'r|*'``   | Open a *stream* of tar blocks for reading  |
   |             | with transparent compression.              |
   +-------------+--------------------------------------------+
   | ``'r|'``    | Open a *stream* of uncompressed tar blocks |
   |             | for reading.                               |
   +-------------+--------------------------------------------+
   | ``'r|gz'``  | Open a gzip compressed *stream* for        |
   |             | reading.                                   |
   +-------------+--------------------------------------------+
   | ``'r|bz2'`` | Open a bzip2 compressed *stream* for       |
   |             | reading.                                   |
   +-------------+--------------------------------------------+
   | ``'r|xz'``  | Open an lzma compressed *stream* for       |
   |             | reading.                                   |
   +-------------+--------------------------------------------+
   | ``'w|'``    | Open an uncompressed *stream* for writing. |
   +-------------+--------------------------------------------+
   | ``'w|gz'``  | Open a gzip compressed *stream* for        |
   |             | writing.                                   |
   +-------------+--------------------------------------------+
   | ``'w|bz2'`` | Open a bzip2 compressed *stream* for       |
   |             | writing.                                   |
   +-------------+--------------------------------------------+
   | ``'w|xz'``  | Open an lzma compressed *stream* for       |
   |             | writing.                                   |
   +-------------+--------------------------------------------+

   .. versionchanged:: 3.5
      The ``'x'`` (exclusive creation) mode was added.

   .. versionchanged:: 3.6
      The *name* parameter accepts a :term:`path-like object`.


.. class:: TarFile
   :noindex:

   Class for reading and writing tar archives. Do not use this class directly:
   use :func:`tarfile.open` instead. See :ref:`tarfile-objects`.


.. function:: is_tarfile(name)

   Return :const:`True` if *name* is a tar archive file that the :mod:`tarfile`
   module can read. *name* may be a :class:`str`, file, or file-like object.

   .. versionchanged:: 3.9
      Support for file and file-like objects.


The :mod:`tarfile` module defines the following exceptions:


.. exception:: TarError

   Base class for all :mod:`tarfile` exceptions.


.. exception:: ReadError

   Is raised when a tar archive is opened that either cannot be handled by the
   :mod:`tarfile` module or is somehow invalid.


.. exception:: CompressionError

   Is raised when a compression method is not supported or when the data cannot be
   decoded properly.


.. exception:: StreamError

   Is raised for the limitations that are typical for stream-like :class:`TarFile`
   objects.


.. exception:: ExtractError

   Is raised for *non-fatal* errors when using :meth:`TarFile.extract`, but only if
   :attr:`TarFile.errorlevel`\ ``== 2``.


.. exception:: HeaderError

   Is raised by :meth:`TarInfo.frombuf` if the buffer it gets is invalid.


The following constants are available at the module level:

.. data:: ENCODING

   The default character encoding: ``'utf-8'`` on Windows, the value returned by
   :func:`sys.getfilesystemencoding` otherwise.


Each of the following constants defines a tar archive format that the
:mod:`tarfile` module is able to create. See section :ref:`tar-formats` for
details.


.. data:: USTAR_FORMAT

   POSIX.1-1988 (ustar) format.


.. data:: GNU_FORMAT

   GNU tar format.


.. data:: PAX_FORMAT

   POSIX.1-2001 (pax) format.


.. data:: DEFAULT_FORMAT

   The default format for creating archives. This is currently :const:`PAX_FORMAT`.

   .. versionchanged:: 3.8
      The default format for new archives was changed to
      :const:`PAX_FORMAT` from :const:`GNU_FORMAT`.


.. seealso::

   Module :mod:`zipfile`
      Documentation of the :mod:`zipfile` standard module.

   :ref:`archiving-operations`
      Documentation of the higher-level archiving facilities provided by the
      standard :mod:`shutil` module.

   `GNU tar manual, Basic Tar Format <https://www.gnu.org/software/tar/manual/html_node/Standard.html>`_
      Documentation for tar archive files, including GNU tar extensions.


.. _tarfile-objects:

TarFile Objects
---------------

The :class:`TarFile` object provides an interface to a tar archive. A tar
archive is a sequence of blocks. An archive member (a stored file) is made up of
a header block followed by data blocks. It is possible to store a file in a tar
archive several times. Each archive member is represented by a :class:`TarInfo`
object, see :ref:`tarinfo-objects` for details.

A :class:`TarFile` object can be used as a context manager in a :keyword:`with`
statement. It will automatically be closed when the block is completed. Please
note that in the event of an exception an archive opened for writing will not
be finalized; only the internally used file object will be closed. See the
:ref:`tar-examples` section for a use case.

.. versionadded:: 3.2
   Added support for the context management protocol.

.. class:: TarFile(name=None, mode='r', fileobj=None, format=DEFAULT_FORMAT, tarinfo=TarInfo, dereference=False, ignore_zeros=False, encoding=ENCODING, errors='surrogateescape', pax_headers=None, debug=0, errorlevel=0)

   All following arguments are optional and can be accessed as instance attributes
   as well.

   *name* is the pathname of the archive. *name* may be a :term:`path-like object`.
   It can be omitted if *fileobj* is given.
   In this case, the file object's :attr:`name` attribute is used if it exists.

   *mode* is either ``'r'`` to read from an existing archive, ``'a'`` to append
   data to an existing file, ``'w'`` to create a new file overwriting an existing
   one, or ``'x'`` to create a new file only if it does not already exist.

   If *fileobj* is given, it is used for reading or writing data. If it can be
   determined, *mode* is overridden by *fileobj*'s mode. *fileobj* will be used
   from position 0.

   .. note::

      *fileobj* is not closed when :class:`TarFile` is closed.

   *format* controls the archive format for writing. It must be one of the constants
   :const:`USTAR_FORMAT`, :const:`GNU_FORMAT` or :const:`PAX_FORMAT` that are
   defined at module level. When reading, the format is detected automatically, even
   if different formats are present in a single archive.

   The *tarinfo* argument can be used to replace the default :class:`TarInfo` class
   with a different one.

   If *dereference* is :const:`False`, add symbolic and hard links to the archive. If it
   is :const:`True`, add the content of the target files to the archive. This has no
   effect on systems that do not support symbolic links.

   If *ignore_zeros* is :const:`False`, treat an empty block as the end of the archive.
   If it is :const:`True`, skip empty (and invalid) blocks and try to get as many members
   as possible. This is only useful for reading concatenated or damaged archives.

   *debug* can be set from ``0`` (no debug messages) up to ``3`` (all debug
   messages). The messages are written to ``sys.stderr``.

   If *errorlevel* is ``0``, all errors are ignored when using :meth:`TarFile.extract`.
   Nevertheless, they appear as error messages in the debug output when debugging
   is enabled. If ``1``, all *fatal* errors are raised as :exc:`OSError`
   exceptions.
   If ``2``, all *non-fatal* errors are raised as :exc:`TarError`
   exceptions as well.

   The *encoding* and *errors* arguments define the character encoding to be
   used for reading or writing the archive and how conversion errors are going
   to be handled. The default settings will work for most users.
   See section :ref:`tar-unicode` for in-depth information.

   The *pax_headers* argument is an optional dictionary of strings which
   will be added as a pax global header if *format* is :const:`PAX_FORMAT`.

   .. versionchanged:: 3.2
      Use ``'surrogateescape'`` as the default for the *errors* argument.

   .. versionchanged:: 3.5
      The ``'x'`` (exclusive creation) mode was added.

   .. versionchanged:: 3.6
      The *name* parameter accepts a :term:`path-like object`.


.. classmethod:: TarFile.open(...)

   Alternative constructor. The :func:`tarfile.open` function is actually a
   shortcut to this classmethod.


.. method:: TarFile.getmember(name)

   Return a :class:`TarInfo` object for member *name*. If *name* cannot be found
   in the archive, :exc:`KeyError` is raised.

   .. note::

      If a member occurs more than once in the archive, its last occurrence is assumed
      to be the most up-to-date version.


.. method:: TarFile.getmembers()

   Return the members of the archive as a list of :class:`TarInfo` objects. The
   list has the same order as the members in the archive.


.. method:: TarFile.getnames()

   Return the members as a list of their names. It has the same order as the list
   returned by :meth:`getmembers`.


.. method:: TarFile.list(verbose=True, *, members=None)

   Print a table of contents to ``sys.stdout``. If *verbose* is :const:`False`,
   only the names of the members are printed. If it is :const:`True`, output
   similar to that of :program:`ls -l` is produced.
   If optional *members* is
   given, it must be a subset of the list returned by :meth:`getmembers`.

   .. versionchanged:: 3.5
      Added the *members* parameter.


.. method:: TarFile.next()

   Return the next member of the archive as a :class:`TarInfo` object, when
   :class:`TarFile` is opened for reading. Return :const:`None` if there is no more
   available.


.. method:: TarFile.extractall(path=".", members=None, *, numeric_owner=False)

   Extract all members from the archive to the current working directory or
   directory *path*. If optional *members* is given, it must be a subset of the
   list returned by :meth:`getmembers`. Directory information like owner,
   modification time and permissions are set after all members have been extracted.
   This is done to work around two problems: A directory's modification time is
   reset each time a file is created in it. And, if a directory's permissions do
   not allow writing, extracting files to it will fail.

   If *numeric_owner* is :const:`True`, the uid and gid numbers from the tarfile
   are used to set the owner/group for the extracted files. Otherwise, the named
   values from the tarfile are used.

   .. warning::

      Never extract archives from untrusted sources without prior inspection.
      It is possible that files are created outside of *path*, e.g. members
      that have absolute filenames starting with ``"/"`` or filenames with two
      dots ``".."``.

   .. versionchanged:: 3.5
      Added the *numeric_owner* parameter.

   .. versionchanged:: 3.6
      The *path* parameter accepts a :term:`path-like object`.


.. method:: TarFile.extract(member, path="", set_attrs=True, *, numeric_owner=False)

   Extract a member from the archive to the current working directory, using its
   full name. Its file information is extracted as accurately as possible. *member*
   may be a filename or a :class:`TarInfo` object.
   You can specify a different
   directory using *path*. *path* may be a :term:`path-like object`.
   File attributes (owner, mtime, mode) are set unless *set_attrs* is false.

   If *numeric_owner* is :const:`True`, the uid and gid numbers from the tarfile
   are used to set the owner/group for the extracted files. Otherwise, the named
   values from the tarfile are used.

   .. note::

      The :meth:`extract` method does not take care of several extraction issues.
      In most cases you should consider using the :meth:`extractall` method.

   .. warning::

      See the warning for :meth:`extractall`.

   .. versionchanged:: 3.2
      Added the *set_attrs* parameter.

   .. versionchanged:: 3.5
      Added the *numeric_owner* parameter.

   .. versionchanged:: 3.6
      The *path* parameter accepts a :term:`path-like object`.


.. method:: TarFile.extractfile(member)

   Extract a member from the archive as a file object. *member* may be
   a filename or a :class:`TarInfo` object. If *member* is a regular file or
   a link, an :class:`io.BufferedReader` object is returned. For all other
   existing members, :const:`None` is returned. If *member* does not appear
   in the archive, :exc:`KeyError` is raised.

   .. versionchanged:: 3.3
      Return an :class:`io.BufferedReader` object.


.. method:: TarFile.add(name, arcname=None, recursive=True, *, filter=None)

   Add the file *name* to the archive. *name* may be any type of file
   (directory, fifo, symbolic link, etc.). If given, *arcname* specifies an
   alternative name for the file in the archive. Directories are added
   recursively by default. This can be avoided by setting *recursive* to
   :const:`False`. Recursion adds entries in sorted order.
   If *filter* is given, it
   should be a function that takes a :class:`TarInfo` object argument and
   returns the changed :class:`TarInfo` object.
   If it instead returns
   :const:`None` the :class:`TarInfo` object will be excluded from the
   archive. See :ref:`tar-examples` for an example.

   .. versionchanged:: 3.2
      Added the *filter* parameter.

   .. versionchanged:: 3.7
      Recursion adds entries in sorted order.


.. method:: TarFile.addfile(tarinfo, fileobj=None)

   Add the :class:`TarInfo` object *tarinfo* to the archive. If *fileobj* is given,
   it should be a :term:`binary file`, and
   ``tarinfo.size`` bytes are read from it and added to the archive. You can
   create :class:`TarInfo` objects directly, or by using :meth:`gettarinfo`.


.. method:: TarFile.gettarinfo(name=None, arcname=None, fileobj=None)

   Create a :class:`TarInfo` object from the result of :func:`os.stat` or
   equivalent on an existing file. The file is either named by *name*, or
   specified as a :term:`file object` *fileobj* with a file descriptor.
   *name* may be a :term:`path-like object`. If
   given, *arcname* specifies an alternative name for the file in the
   archive, otherwise, the name is taken from *fileobj*’s
   :attr:`~io.FileIO.name` attribute, or the *name* argument. The name
   should be a text string.

   You can modify
   some of the :class:`TarInfo`’s attributes before you add it using :meth:`addfile`.
   If the file object is not an ordinary file object positioned at the
   beginning of the file, attributes such as :attr:`~TarInfo.size` may need
   modifying. This is the case for objects such as :class:`~gzip.GzipFile`.
   The :attr:`~TarInfo.name` may also be modified, in which case *arcname*
   could be a dummy string.

   .. versionchanged:: 3.6
      The *name* parameter accepts a :term:`path-like object`.


.. method:: TarFile.close()

   Close the :class:`TarFile`. In write mode, two finishing zero blocks are
   appended to the archive.


.. attribute:: TarFile.pax_headers

   A dictionary containing key-value pairs of pax global headers.



.. _tarinfo-objects:

TarInfo Objects
---------------

A :class:`TarInfo` object represents one member in a :class:`TarFile`. Aside
from storing all required attributes of a file (like file type, size, time,
permissions, owner etc.), it provides some useful methods to determine its type.
It does *not* contain the file's data itself.

:class:`TarInfo` objects are returned by :class:`TarFile`'s methods
:meth:`getmember`, :meth:`getmembers` and :meth:`gettarinfo`.


.. class:: TarInfo(name="")

   Create a :class:`TarInfo` object.


.. classmethod:: TarInfo.frombuf(buf, encoding, errors)

   Create and return a :class:`TarInfo` object from string buffer *buf*.

   Raises :exc:`HeaderError` if the buffer is invalid.


.. classmethod:: TarInfo.fromtarfile(tarfile)

   Read the next member from the :class:`TarFile` object *tarfile* and return it as
   a :class:`TarInfo` object.


.. method:: TarInfo.tobuf(format=DEFAULT_FORMAT, encoding=ENCODING, errors='surrogateescape')

   Create a string buffer from a :class:`TarInfo` object. For information on the
   arguments see the constructor of the :class:`TarFile` class.

   .. versionchanged:: 3.2
      Use ``'surrogateescape'`` as the default for the *errors* argument.


A ``TarInfo`` object has the following public data attributes:


.. attribute:: TarInfo.name

   Name of the archive member.


.. attribute:: TarInfo.size

   Size in bytes.


.. attribute:: TarInfo.mtime

   Time of last modification.


.. attribute:: TarInfo.mode

   Permission bits.


.. attribute:: TarInfo.type

   File type.
   *type* is usually one of these constants: :const:`REGTYPE`,
   :const:`AREGTYPE`, :const:`LNKTYPE`, :const:`SYMTYPE`, :const:`DIRTYPE`,
   :const:`FIFOTYPE`, :const:`CONTTYPE`, :const:`CHRTYPE`, :const:`BLKTYPE`,
   :const:`GNUTYPE_SPARSE`. To determine the type of a :class:`TarInfo` object
   more conveniently, use the ``is*()`` methods below.


.. attribute:: TarInfo.linkname

   Name of the target file, which is only present in :class:`TarInfo` objects
   of type :const:`LNKTYPE` and :const:`SYMTYPE`.


.. attribute:: TarInfo.uid

   User ID of the user who originally stored this member.


.. attribute:: TarInfo.gid

   Group ID of the user who originally stored this member.


.. attribute:: TarInfo.uname

   User name.


.. attribute:: TarInfo.gname

   Group name.


.. attribute:: TarInfo.pax_headers

   A dictionary containing key-value pairs of an associated pax extended header.


A :class:`TarInfo` object also provides some convenient query methods:


.. method:: TarInfo.isfile()

   Return :const:`True` if the :class:`TarInfo` object is a regular file.


.. method:: TarInfo.isreg()

   Same as :meth:`isfile`.


.. method:: TarInfo.isdir()

   Return :const:`True` if it is a directory.


.. method:: TarInfo.issym()

   Return :const:`True` if it is a symbolic link.


.. method:: TarInfo.islnk()

   Return :const:`True` if it is a hard link.


.. method:: TarInfo.ischr()

   Return :const:`True` if it is a character device.


.. method:: TarInfo.isblk()

   Return :const:`True` if it is a block device.


.. method:: TarInfo.isfifo()

   Return :const:`True` if it is a FIFO.


.. method:: TarInfo.isdev()

   Return :const:`True` if it is one of character device, block device or FIFO.


.. _tarfile-commandline:
.. program:: tarfile

Command-Line Interface
----------------------

.. versionadded:: 3.4

The :mod:`tarfile` module provides a simple command-line interface to interact
with tar archives.

If you want to create a new tar archive, specify its name after the :option:`-c`
option and then list the filename(s) that should be included:

.. code-block:: shell-session

    $ python -m tarfile -c monty.tar spam.txt eggs.txt

Passing a directory is also acceptable:

.. code-block:: shell-session

    $ python -m tarfile -c monty.tar life-of-brian_1979/

If you want to extract a tar archive into the current directory, use
the :option:`-e` option:

.. code-block:: shell-session

    $ python -m tarfile -e monty.tar

You can also extract a tar archive into a different directory by passing the
directory's name:

.. code-block:: shell-session

    $ python -m tarfile -e monty.tar other-dir/

For a list of the files in a tar archive, use the :option:`-l` option:

.. code-block:: shell-session

    $ python -m tarfile -l monty.tar


Command-line options
~~~~~~~~~~~~~~~~~~~~

.. cmdoption:: -l <tarfile>
               --list <tarfile>

   List files in a tarfile.

.. cmdoption:: -c <tarfile> <source1> ... <sourceN>
               --create <tarfile> <source1> ... <sourceN>

   Create tarfile from source files.

.. cmdoption:: -e <tarfile> [<output_dir>]
               --extract <tarfile> [<output_dir>]

   Extract tarfile into the current directory if *output_dir* is not specified.

.. cmdoption:: -t <tarfile>
               --test <tarfile>

   Test whether the tarfile is valid or not.

.. cmdoption:: -v, --verbose

   Verbose output.

.. _tar-examples:

Examples
--------

How to extract an entire tar archive to the current working directory::

   import tarfile
   tar = tarfile.open("sample.tar.gz")
   tar.extractall()
   tar.close()

How to extract a subset of a tar archive with :meth:`TarFile.extractall` using
a generator function instead of a list::

   import os
   import tarfile

   def py_files(members):
       # Yield only the members whose name ends in ".py".
       for tarinfo in members:
           if os.path.splitext(tarinfo.name)[1] == ".py":
               yield tarinfo

   tar = tarfile.open("sample.tar.gz")
   tar.extractall(members=py_files(tar))
   tar.close()

How to create an uncompressed tar archive from a list of filenames::

   import tarfile
   tar = tarfile.open("sample.tar", "w")
   for name in ["foo", "bar", "quux"]:
       tar.add(name)
   tar.close()

The same example using the :keyword:`with` statement::

   import tarfile
   with tarfile.open("sample.tar", "w") as tar:
       for name in ["foo", "bar", "quux"]:
           tar.add(name)

How to read a gzip compressed tar archive and display some member information::

   import tarfile
   tar = tarfile.open("sample.tar.gz", "r:gz")
   for tarinfo in tar:
       print(tarinfo.name, "is", tarinfo.size, "bytes in size and is ", end="")
       if tarinfo.isreg():
           print("a regular file.")
       elif tarinfo.isdir():
           print("a directory.")
       else:
           print("something else.")
   tar.close()

How to create an archive and reset the user information using the *filter*
parameter in :meth:`TarFile.add`::

   import tarfile
   def reset(tarinfo):
       # Replace the owner information with root before the member is stored.
       tarinfo.uid = tarinfo.gid = 0
       tarinfo.uname = tarinfo.gname = "root"
       return tarinfo
   tar = tarfile.open("sample.tar.gz", "w:gz")
   tar.add("foo", filter=reset)
   tar.close()


.. _tar-formats:

Supported tar formats
---------------------

There are three tar formats that can be created with the :mod:`tarfile` module:

* The POSIX.1-1988 ustar format (:const:`USTAR_FORMAT`). It supports filenames
  up to a length of at best 256 characters and linknames up to 100 characters.
  The maximum file size is 8 GiB. This is an old and limited but widely
  supported format.

* The GNU tar format (:const:`GNU_FORMAT`). It supports long filenames and
  linknames, files bigger than 8 GiB and sparse files. It is the de facto
  standard on GNU/Linux systems. :mod:`tarfile` fully supports the GNU tar
  extensions for long names, sparse file support is read-only.

* The POSIX.1-2001 pax format (:const:`PAX_FORMAT`). It is the most flexible
  format with virtually no limits. It supports long filenames and linknames, large
  files and stores pathnames in a portable way. Modern tar implementations,
  including GNU tar, bsdtar/libarchive and star, fully support extended *pax*
  features; some old or unmaintained libraries may not, but should treat
  *pax* archives as if they were in the universally-supported *ustar* format.
  It is the current default format for new archives.

  It extends the existing *ustar* format with extra headers for information
  that cannot be stored otherwise. There are two flavours of pax headers:
  Extended headers only affect the subsequent file header, global
  headers are valid for the complete archive and affect all following files.
  All the data in a pax header is encoded in *UTF-8* for portability reasons.

There are some more variants of the tar format which can be read, but not
created:

* The ancient V7 format. This is the first tar format from Unix Seventh Edition,
  storing only regular files and directories. Names must not be longer than 100
  characters, there is no user/group name information.
  Some archives have
  miscalculated header checksums in case of fields with non-ASCII characters.

* The SunOS tar extended format. This format is a variant of the POSIX.1-2001
  pax format, but is not compatible.

.. _tar-unicode:

Unicode issues
--------------

The tar format was originally conceived to make backups on tape drives with the
main focus on preserving file system information. Nowadays tar archives are
commonly used for file distribution and exchanging archives over networks. One
problem of the original format (which is the basis of all other formats) is
that there is no concept of supporting different character encodings. For
example, an ordinary tar archive created on a *UTF-8* system cannot be read
correctly on a *Latin-1* system if it contains non-*ASCII* characters. Textual
metadata (like filenames, linknames, user/group names) will appear damaged.
Unfortunately, there is no way to autodetect the encoding of an archive. The
pax format was designed to solve this problem. It stores non-ASCII metadata
using the universal character encoding *UTF-8*.

The details of character conversion in :mod:`tarfile` are controlled by the
*encoding* and *errors* keyword arguments of the :class:`TarFile` class.

*encoding* defines the character encoding to use for the metadata in the
archive. The default value is :func:`sys.getfilesystemencoding` or ``'ascii'``
as a fallback. Depending on whether the archive is read or written, the
metadata must be either decoded or encoded. If *encoding* is not set
appropriately, this conversion may fail.

The *errors* argument defines how characters are treated that cannot be
converted. Possible values are listed in section :ref:`error-handlers`.
The default scheme is ``'surrogateescape'`` which Python also uses for its
file system calls, see :ref:`os-filenames`.
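
The effect of a mismatched *encoding* can be reproduced entirely in memory.
The following is a minimal sketch, assuming a :const:`GNU_FORMAT` archive, in
which member names are stored as raw bytes in the chosen encoding.
``roundtrip_name`` is a hypothetical helper written for this illustration and
is not part of :mod:`tarfile`.

```python
import io
import tarfile


def roundtrip_name(name, write_encoding, read_encoding):
    """Write one empty member called *name*, then read its name back."""
    buf = io.BytesIO()
    # In GNU_FORMAT the member name is encoded with *write_encoding*.
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT,
                      encoding=write_encoding) as tar:
        tar.addfile(tarfile.TarInfo(name))
    buf.seek(0)
    # The reader has to be told the encoding; it cannot be detected.
    with tarfile.open(fileobj=buf, mode="r", encoding=read_encoding) as tar:
        return tar.getnames()[0]


original = "h\u00e4agen.txt"  # contains the non-ASCII character U+00E4
same = roundtrip_name(original, "utf-8", "utf-8")
damaged = roundtrip_name(original, "utf-8", "latin-1")
print(same == original)     # True
print(damaged == original)  # False
```

Reading with the encoding that was used for writing returns the original name,
while reading the same bytes as *Latin-1* yields a damaged name.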

For :const:`PAX_FORMAT` archives (the default), *encoding* is generally not needed
because all the metadata is stored using *UTF-8*. *encoding* is only used in
the rare cases when binary pax headers are decoded or when strings with
surrogate characters are stored.
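
As an illustration of the pax behaviour described above, the sketch below
stores a non-ASCII member name in a :const:`PAX_FORMAT` archive and reads it
back while deliberately passing ``encoding="ascii"``. ``pax_roundtrip`` is a
hypothetical helper, and the in-memory buffer is only an assumption made to
keep the example self-contained.

```python
import io
import tarfile


def pax_roundtrip(name, read_encoding):
    """Store *name* in a pax archive and read it back with *read_encoding*."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
        tar.addfile(tarfile.TarInfo(name))
    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r", encoding=read_encoding) as tar:
        member = tar.getmembers()[0]
        # Copy the extended header so it outlives the closed archive.
        return member.name, dict(member.pax_headers)


original = "h\u00e4agen.txt"
name, headers = pax_roundtrip(original, "ascii")
print(name == original)  # True
print(headers)
```

The name survives because it travels in a pax extended header encoded as
*UTF-8*, independently of the *encoding* argument.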