.. _importsystem:

*****************
The import system
*****************

.. index:: single: import machinery

Python code in one :term:`module` gains access to the code in another module
by the process of :term:`importing` it.  The :keyword:`import` statement is
the most common way of invoking the import machinery, but it is not the only
way.  Functions such as :func:`importlib.import_module` and built-in
:func:`__import__` can also be used to invoke the import machinery.

The :keyword:`import` statement combines two operations; it searches for the
named module, then it binds the results of that search to a name in the local
scope.  The search operation of the :keyword:`!import` statement is defined as
a call to the :func:`__import__` function, with the appropriate arguments.
The return value of :func:`__import__` is used to perform the name
binding operation of the :keyword:`!import` statement.  See the
:keyword:`!import` statement for the exact details of that name binding
operation.

A direct call to :func:`__import__` performs only the module search and, if
found, the module creation operation.  While certain side-effects may occur,
such as the importing of parent packages, and the updating of various caches
(including :data:`sys.modules`), only the :keyword:`import` statement performs
a name binding operation.
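
As a rough illustration of the split between searching and binding, the
following sketch calls both entry points directly.  It uses only the standard
library's ``email`` package; the variable names ``top`` and ``leaf`` are
illustrative and not part of any API.

```python
import importlib
import sys

# __import__ performs the search and, for a dotted name, returns the
# top-level package: the object that ``import email.mime.text`` would
# bind to the name ``email``.
top = __import__("email.mime.text")

# importlib.import_module returns the module that was actually named.
leaf = importlib.import_module("email.mime.text")

print(top.__name__)   # email
print(leaf.__name__)  # email.mime.text

# Both results are the objects cached in sys.modules as a side effect of
# the search; the assignments above are the only name bindings performed.
print(top is sys.modules["email"])             # True
print(leaf is sys.modules["email.mime.text"])  # True
```

Returning the top-level package is what lets the ``import`` statement bind
``email`` while still making ``email.mime.text`` reachable through attribute
access.
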

When an :keyword:`import` statement is executed, the standard builtin
:func:`__import__` function is called. Other mechanisms for invoking the
import system (such as :func:`importlib.import_module`) may choose to bypass
:func:`__import__` and use their own solutions to implement import semantics.

When a module is first imported, Python searches for the module and, if it is
found, creates and initializes a module object [#fnmo]_.  If the named module
cannot be found, a :exc:`ModuleNotFoundError` is raised.  Python implements various
strategies to search for the named module when the import machinery is
invoked.  These strategies can be modified and extended by using various hooks
described in the sections below.

.. versionchanged:: 3.3
   The import system has been updated to fully implement the second phase
   of :pep:`302`. There is no longer any implicit import machinery; the full
   import system is exposed through :data:`sys.meta_path`. In addition,
   native namespace package support has been implemented (see :pep:`420`).


:mod:`importlib`
================

The :mod:`importlib` module provides a rich API for interacting with the
import system.  For example, :func:`importlib.import_module` provides a
recommended, simpler API than built-in :func:`__import__` for invoking the
import machinery.  Refer to the :mod:`importlib` library documentation for
additional detail.



Packages
========

.. index::
    single: package

Python has only one type of module object, and all modules are of this type,
regardless of whether the module is implemented in Python, C, or something
else.  To help organize modules and provide a naming hierarchy, Python has a
concept of :term:`packages <package>`.

You can think of packages as the directories on a file system and modules as
files within directories, but don't take this analogy too literally since
packages and modules need not originate from the file system.  For the
purposes of this documentation, we'll use this convenient analogy of
directories and files.  Like file system directories, packages are organized
hierarchically, and packages may themselves contain subpackages, as well as
regular modules.

It's important to keep in mind that all packages are modules, but not all
modules are packages.  Or put another way, packages are just a special kind of
module.  Specifically, any module that contains a ``__path__`` attribute is
considered a package.

All modules have a name.  Subpackage names are separated from their parent
package name by dots, akin to Python's standard attribute access syntax.  Thus
you might have a module called :mod:`sys` and a package called :mod:`email`,
which in turn has a subpackage called :mod:`email.mime` and a module within
that subpackage called :mod:`email.mime.text`.
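
The ``__path__`` rule can be checked directly on the modules named above.
The helper ``is_package`` below is a hypothetical name introduced only for
this sketch.

```python
import email
import email.mime.text
import sys


def is_package(module):
    """Return True if the module object has a __path__ attribute."""
    return hasattr(module, "__path__")


print(is_package(sys))              # False: a plain module
print(is_package(email))            # True: a package
print(is_package(email.mime))       # True: a subpackage
print(is_package(email.mime.text))  # False: a module inside the subpackage
```
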


Regular packages
----------------

.. index::
    pair: package; regular

Python defines two types of packages, :term:`regular packages <regular
package>` and :term:`namespace packages <namespace package>`.  Regular
packages are traditional packages as they existed in Python 3.2 and earlier.
A regular package is typically implemented as a directory containing an
``__init__.py`` file.  When a regular package is imported, this
``__init__.py`` file is implicitly executed, and the objects it defines are
bound to names in the package's namespace.  The ``__init__.py`` file can
contain the same Python code that any other module can contain, and Python
will add some additional attributes to the module when it is imported.

For example, the following file system layout defines a top level ``parent``
package with three subpackages::

    parent/
        __init__.py
        one/
            __init__.py
        two/
            __init__.py
        three/
            __init__.py

Importing ``parent.one`` will implicitly execute ``parent/__init__.py`` and
``parent/one/__init__.py``.  Subsequent imports of ``parent.two`` or
``parent.three`` will execute ``parent/two/__init__.py`` and
``parent/three/__init__.py`` respectively.
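
The execution order described above can be observed with a throwaway package
tree.  This is a minimal sketch: the package name ``demo_parent`` and the
helper module ``demo_import_log`` are made up for the example, and each
generated ``__init__.py`` merely records its own ``__name__``.

```python
import importlib
import pathlib
import shutil
import sys
import tempfile
import types

# Hypothetical helper module that the generated __init__.py files append to.
log = types.ModuleType("demo_import_log")
log.events = []
sys.modules["demo_import_log"] = log

INIT_SOURCE = "import demo_import_log\ndemo_import_log.events.append(__name__)\n"

root = pathlib.Path(tempfile.mkdtemp())
for package in ("demo_parent", "demo_parent/one", "demo_parent/two"):
    directory = root / package
    directory.mkdir()
    (directory / "__init__.py").write_text(INIT_SOURCE)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

importlib.import_module("demo_parent.one")
print(log.events)  # ['demo_parent', 'demo_parent.one']

# The parent package is already cached, so only the new subpackage runs.
importlib.import_module("demo_parent.two")
print(log.events)  # ['demo_parent', 'demo_parent.one', 'demo_parent.two']

# Tidy up the temporary search path entry and files.
sys.path.remove(str(root))
shutil.rmtree(root, ignore_errors=True)
```
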


Namespace packages
------------------

.. index::
    pair: package; namespace
    pair: package; portion

A namespace package is a composite of various :term:`portions <portion>`,
where each portion contributes a subpackage to the parent package.  Portions
may reside in different locations on the file system.  Portions may also be
found in zip files, on the network, or anywhere else that Python searches
during import.  Namespace packages may or may not correspond directly to
objects on the file system; they may be virtual modules that have no concrete
representation.

Namespace packages do not use an ordinary list for their ``__path__``
attribute. They instead use a custom iterable type which will automatically
perform a new search for package portions on the next import attempt within
that package if the path of their parent package (or :data:`sys.path` for a
top level package) changes.

With namespace packages, there is no ``parent/__init__.py`` file.  In fact,
there may be multiple ``parent`` directories found during import search, where
each one is provided by a different portion.  Thus ``parent/one`` may not be
physically located next to ``parent/two``.  In this case, Python will create a
namespace package for the top-level ``parent`` package whenever it or one of
its subpackages is imported.

See also :pep:`420` for the namespace package specification.
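
A small sketch of two portions contributing to one namespace package follows.
The directory names ``portion_a`` and ``portion_b`` and the package name
``demo_ns`` are assumptions made for the example; neither portion contains a
``demo_ns/__init__.py`` file.

```python
import importlib
import pathlib
import shutil
import sys
import tempfile

root = pathlib.Path(tempfile.mkdtemp())

# Two independent portions, each providing its own "demo_ns" directory.
added_entries = []
for portion, sub in (("portion_a", "one"), ("portion_b", "two")):
    package_dir = root / portion / "demo_ns" / sub
    package_dir.mkdir(parents=True)
    (package_dir / "__init__.py").write_text(f"NAME = {sub!r}\n")
    sys.path.append(str(root / portion))
    added_entries.append(str(root / portion))

importlib.invalidate_caches()

one = importlib.import_module("demo_ns.one")
two = importlib.import_module("demo_ns.two")
namespace = sys.modules["demo_ns"]

print(one.NAME, two.NAME)             # one two
print(len(list(namespace.__path__)))  # 2, one entry per portion

# Tidy up the temporary search path entries and files.
for entry in added_entries:
    sys.path.remove(entry)
shutil.rmtree(root, ignore_errors=True)
```
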


Searching
=========

To begin the search, Python needs the :term:`fully qualified <qualified name>`
name of the module (or package, but for the purposes of this discussion, the
difference is immaterial) being imported.  This name may come from various
arguments to the :keyword:`import` statement, or from the parameters to the
:func:`importlib.import_module` or :func:`__import__` functions.

This name will be used in various phases of the import search, and it may be
the dotted path to a submodule, e.g. ``foo.bar.baz``.  In this case, Python
first tries to import ``foo``, then ``foo.bar``, and finally ``foo.bar.baz``.
If any of the intermediate imports fail, a :exc:`ModuleNotFoundError` is raised.


The module cache
----------------

.. index::
    single: sys.modules

The first place checked during import search is :data:`sys.modules`.  This
mapping serves as a cache of all modules that have been previously imported,
including the intermediate paths.  So if ``foo.bar.baz`` was previously
imported, :data:`sys.modules` will contain entries for ``foo``, ``foo.bar``,
and ``foo.bar.baz``.  Each key will have as its value the corresponding module
object.

During import, the module name is looked up in :data:`sys.modules` and if
present, the associated value is the module satisfying the import, and the
process completes.  However, if the value is ``None``, then a
:exc:`ModuleNotFoundError` is raised.  If the module name is missing, Python will
continue searching for the module.

:data:`sys.modules` is writable.  Deleting a key may not destroy the
associated module (as other modules may hold references to it),
but it will invalidate the cache entry for the named module, causing
Python to search anew for the named module upon its next
import. The key can also be assigned to ``None``, forcing the next import
of the module to result in a :exc:`ModuleNotFoundError`.

Beware, though: if you keep a reference to the module object,
invalidate its cache entry in :data:`sys.modules`, and then re-import the
named module, the two module objects will *not* be the same. By contrast,
:func:`importlib.reload` will reuse the *same* module object, and simply
reinitialise the module contents by rerunning the module's code.
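
The cache behaviours described above can be reproduced with a temporary
module.  In this sketch the module name ``demo_cached`` is invented for the
example, so that no real module's cache entry is disturbed.

```python
import importlib
import pathlib
import shutil
import sys
import tempfile

root = pathlib.Path(tempfile.mkdtemp())
(root / "demo_cached.py").write_text("VALUE = 1\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

first = importlib.import_module("demo_cached")

# A repeated import is satisfied from sys.modules and yields the same object.
same_again = importlib.import_module("demo_cached") is first
print(same_again)  # True

# Deleting the cache entry forces a fresh search; the new module object is
# distinct from the one still referenced by ``first``.
del sys.modules["demo_cached"]
second = importlib.import_module("demo_cached")
print(second is first)  # False

# importlib.reload reruns the code inside the same module object.
reloaded = importlib.reload(second)
print(reloaded is second)  # True

# A None entry makes the next import fail without searching.
sys.modules["demo_cached"] = None
blocked = None
try:
    importlib.import_module("demo_cached")
except ModuleNotFoundError as exc:
    blocked = exc
print(type(blocked).__name__)  # ModuleNotFoundError

# Tidy up the cache entry, the search path entry, and the files.
del sys.modules["demo_cached"]
sys.path.remove(str(root))
shutil.rmtree(root, ignore_errors=True)
```
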


Finders and loaders
-------------------

.. index::
    single: finder
    single: loader
    single: module spec

If the named module is not found in :data:`sys.modules`, then Python's import
protocol is invoked to find and load the module.  This protocol consists of
two conceptual objects, :term:`finders <finder>` and :term:`loaders <loader>`.
A finder's job is to determine whether it can find the named module using
whatever strategy it knows about. Objects that implement both of these
interfaces are referred to as :term:`importers <importer>`; they return
themselves when they find that they can load the requested module.

Python includes a number of default finders and importers.  The first one
knows how to locate built-in modules, and the second knows how to locate
frozen modules.  A third default finder searches an :term:`import path`
for modules.  The :term:`import path` is a list of locations that may
name file system paths or zip files.  It can also be extended to search
for any locatable resource, such as those identified by URLs.

The import machinery is extensible, so new finders can be added to extend the
range and scope of module searching.

Finders do not actually load modules.  If they can find the named module, they
return a :dfn:`module spec`, an encapsulation of the module's import-related
information, which the import machinery then uses when loading the module.

The following sections describe the protocol for finders and loaders in more
detail, including how you can create and register new ones to extend the
import machinery.

.. versionchanged:: 3.4
   In previous versions of Python, finders returned :term:`loaders <loader>`
   directly, whereas now they return module specs which *contain* loaders.
   Loaders are still used during import but have fewer responsibilities.

Import hooks
------------

.. index::
   single: import hooks
   single: meta hooks
   single: path hooks
   pair: hooks; import
   pair: hooks; meta
   pair: hooks; path

The import machinery is designed to be extensible; the primary mechanism for
this is the *import hook*.  There are two types of import hooks: *meta
hooks* and *import path hooks*.

Meta hooks are called at the start of import processing, before any other
import processing has occurred, other than :data:`sys.modules` cache look up.
This allows meta hooks to override :data:`sys.path` processing, frozen
modules, or even built-in modules.  Meta hooks are registered by adding new
finder objects to :data:`sys.meta_path`, as described below.

Import path hooks are called as part of :data:`sys.path` (or
``package.__path__``) processing, at the point where their associated path
item is encountered.  Import path hooks are registered by adding new callables
to :data:`sys.path_hooks` as described below.


The meta path
-------------

.. index::
    single: sys.meta_path
    pair: finder; find_spec

When the named module is not found in :data:`sys.modules`, Python next
searches :data:`sys.meta_path`, which contains a list of meta path finder
objects.  These finders are queried in order to see if they know how to handle
the named module.  Meta path finders must implement a method called
:meth:`~importlib.abc.MetaPathFinder.find_spec()` which takes three arguments:
a name, an import path, and (optionally) a target module.  The meta path
finder can use any strategy it wants to determine whether it can handle
the named module or not.

If the meta path finder knows how to handle the named module, it returns a
spec object.  If it cannot handle the named module, it returns ``None``.  If
:data:`sys.meta_path` processing reaches the end of its list without returning
a spec, then a :exc:`ModuleNotFoundError` is raised.  Any other exceptions
raised are simply propagated up, aborting the import process.

The :meth:`~importlib.abc.MetaPathFinder.find_spec()` method of meta path
finders is called with two or three arguments.  The first is the fully
qualified name of the module being imported, for example ``foo.bar.baz``.
The second argument is the path entries to use for the module search.  For
top-level modules, the second argument is ``None``, but for submodules or
subpackages, the second argument is the value of the parent package's
``__path__`` attribute. If the appropriate ``__path__`` attribute cannot
be accessed, a :exc:`ModuleNotFoundError` is raised.  The third argument
is an existing module object that will be the target of loading later.
The import system passes in a target module only during reload.

The meta path may be traversed multiple times for a single import request.
For example, assuming none of the modules involved has already been cached,
importing ``foo.bar.baz`` will first perform a top level import, calling
``mpf.find_spec("foo", None, None)`` on each meta path finder (``mpf``). After
``foo`` has been imported, ``foo.bar`` will be imported by traversing the
meta path a second time, calling
``mpf.find_spec("foo.bar", foo.__path__, None)``. Once ``foo.bar`` has been
imported, the final traversal will call
``mpf.find_spec("foo.bar.baz", foo.bar.__path__, None)``.

Some meta path finders only support top level imports. These importers will
always return ``None`` when anything other than ``None`` is passed as the
second argument.

Python's default :data:`sys.meta_path` has three meta path finders, one that
knows how to import built-in modules, one that knows how to import frozen
modules, and one that knows how to import modules from an :term:`import path`
(i.e. the :term:`path based finder`).
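
To make the ``find_spec()`` protocol concrete, here is a minimal sketch of a
custom meta path finder that serves modules from an in-memory dictionary.
The class name ``InMemoryFinder``, the ``SOURCES`` mapping, and the module
name ``demo_virtual`` are all hypothetical; the object acts as its own
loader, so it is an importer in the sense described earlier.

```python
import importlib
import importlib.abc
import importlib.util
import sys

# Hypothetical in-memory "source files", keyed by fully qualified module name.
SOURCES = {"demo_virtual": "GREETING = 'hello from memory'\n"}


class InMemoryFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """A meta path finder that is also its own loader."""

    def find_spec(self, fullname, path, target=None):
        if fullname not in SOURCES:
            return None  # let the next finder on sys.meta_path try
        return importlib.util.spec_from_loader(fullname, self)

    def create_module(self, spec):
        return None  # ask the import machinery for a default module object

    def exec_module(self, module):
        exec(SOURCES[module.__name__], module.__dict__)


finder = InMemoryFinder()
sys.meta_path.append(finder)
try:
    demo = importlib.import_module("demo_virtual")
finally:
    sys.meta_path.remove(finder)

print(demo.GREETING)                   # hello from memory
print(demo.__spec__.loader is finder)  # True
```

Appending the finder, rather than inserting it at the front, means the
default finders are consulted first; a finder placed at index 0 could instead
override them.
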

.. versionchanged:: 3.4
   The :meth:`~importlib.abc.MetaPathFinder.find_spec` method of meta path
   finders replaced :meth:`~importlib.abc.MetaPathFinder.find_module`, which
   is now deprecated.  While it will continue to work without change, the
   import machinery will try it only if the finder does not implement
   ``find_spec()``.


Loading
=======

If and when a module spec is found, the import machinery will use it (and
the loader it contains) when loading the module.  Here is an approximation
of what happens during the loading portion of import::

    module = None
    if spec.loader is not None and hasattr(spec.loader, 'create_module'):
        # It is assumed 'exec_module' will also be defined on the loader.
        module = spec.loader.create_module(spec)
    if module is None:
        module = ModuleType(spec.name)
    # The import-related module attributes get set here:
    _init_module_attrs(spec, module)

    if spec.loader is None:
        if spec.submodule_search_locations is not None:
            # namespace package
            sys.modules[spec.name] = module
        else:
            # unsupported
            raise ImportError
    elif not hasattr(spec.loader, 'exec_module'):
        module = spec.loader.load_module(spec.name)
        # Set __loader__ and __package__ if missing.
    else:
        sys.modules[spec.name] = module
        try:
            spec.loader.exec_module(module)
        except BaseException:
            try:
                del sys.modules[spec.name]
            except KeyError:
                pass
            raise
    return sys.modules[spec.name]

Note the following details:

 * If there is an existing module object with the given name in
   :data:`sys.modules`, import will have already returned it.

 * The module will exist in :data:`sys.modules` before the loader
   executes the module code.  This is crucial because the module code may
   (directly or indirectly) import itself; adding it to :data:`sys.modules`
   beforehand prevents unbounded recursion in the worst case and multiple
   loading in the best.

 * If loading fails, the failing module -- and only the failing module --
   gets removed from :data:`sys.modules`.  Any module already in the
   :data:`sys.modules` cache, and any module that was successfully loaded
   as a side-effect, must remain in the cache.  This contrasts with
   reloading where even the failing module is left in :data:`sys.modules`.

 * After the module is created but before execution, the import machinery
   sets the import-related module attributes (``_init_module_attrs`` in
   the pseudo-code example above), as summarized in a
   :ref:`later section <import-mod-attrs>`.

 * Module execution is the key moment of loading in which the module's
   namespace gets populated.  Execution is entirely delegated to the
   loader, which gets to decide what gets populated and how.

 * The module created during loading and passed to ``exec_module()`` may
   not be the one returned at the end of import [#fnlo]_.
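
The same sequence can be driven by hand with public :mod:`importlib.util`
helpers.  The following is only a sketch of the find, create, cache, and
execute steps for a single source file; the module name ``demo_manual`` and
its contents are invented for the example.

```python
import importlib.util
import pathlib
import shutil
import sys
import tempfile

root = pathlib.Path(tempfile.mkdtemp())
source = root / "demo_manual.py"
source.write_text("ANSWER = 6 * 7\n")

# Find: build a spec directly for a known file location.
spec = importlib.util.spec_from_file_location("demo_manual", source)

# Create: module_from_spec gives the loader a chance to create the module
# and then sets the import-related attributes on it.
module = importlib.util.module_from_spec(spec)

# Cache first, then execute; drop the cache entry again if execution fails.
sys.modules[spec.name] = module
try:
    spec.loader.exec_module(module)
except BaseException:
    sys.modules.pop(spec.name, None)
    raise

print(module.ANSWER)            # 42
print(module.__spec__ is spec)  # True

shutil.rmtree(root, ignore_errors=True)
```
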

.. versionchanged:: 3.4
   The import system has taken over the boilerplate responsibilities of
   loaders.  These were previously performed by the
   :meth:`importlib.abc.Loader.load_module` method.

Loaders
-------

Module loaders provide the critical function of loading: module execution.
The import machinery calls the :meth:`importlib.abc.Loader.exec_module`
method with a single argument, the module object to execute.  Any value
returned from :meth:`~importlib.abc.Loader.exec_module` is ignored.

Loaders must satisfy the following requirements:

 * If the module is a Python module (as opposed to a built-in module or a
   dynamically loaded extension), the loader should execute the module's code
   in the module's global name space (``module.__dict__``).

 * If the loader cannot execute the module, it should raise an
   :exc:`ImportError`, although any other exception raised during
   :meth:`~importlib.abc.Loader.exec_module` will be propagated.

In many cases, the finder and loader can be the same object; in such cases the
:meth:`~importlib.abc.MetaPathFinder.find_spec` method would just return a
spec with the loader set to ``self``.

Module loaders may opt in to creating the module object during loading
by implementing a :meth:`~importlib.abc.Loader.create_module` method.
It takes one argument, the module spec, and returns the new module object
to use during loading.  ``create_module()`` does not need to set any attributes
on the module object.  If the method returns ``None``, the
import machinery will create the new module itself.

.. versionadded:: 3.4
   The :meth:`~importlib.abc.Loader.create_module` method of loaders.

.. versionchanged:: 3.4
   The :meth:`~importlib.abc.Loader.load_module` method was replaced by
   :meth:`~importlib.abc.Loader.exec_module` and the import
   machinery assumed all the boilerplate responsibilities of loading.

   For compatibility with existing loaders, the import machinery will use
   the ``load_module()`` method of loaders if it exists and the loader does
   not also implement ``exec_module()``.  However, ``load_module()`` has been
   deprecated and loaders should implement ``exec_module()`` instead.

   The ``load_module()`` method must implement all the boilerplate loading
   functionality described above in addition to executing the module.  All
   the same constraints apply, with some additional clarification:

    * If there is an existing module object with the given name in
      :data:`sys.modules`, the loader must use that existing module.
      (Otherwise, :func:`importlib.reload` will not work correctly.)  If the
      named module does not exist in :data:`sys.modules`, the loader
      must create a new module object and add it to :data:`sys.modules`.

    * The module *must* exist in :data:`sys.modules` before the loader
      executes the module code, to prevent unbounded recursion or multiple
      loading.

    * If loading fails, the loader must remove any modules it has inserted
      into :data:`sys.modules`, but it must remove **only** the failing
      module(s), and only if the loader itself has loaded the module(s)
      explicitly.

.. versionchanged:: 3.5
   A :exc:`DeprecationWarning` is raised when ``exec_module()`` is defined but
   ``create_module()`` is not.

.. versionchanged:: 3.6
   An :exc:`ImportError` is raised when ``exec_module()`` is defined but
   ``create_module()`` is not.

Submodules
----------

When a submodule is loaded using any mechanism (e.g. ``importlib`` APIs, the
``import`` or ``import-from`` statements, or built-in ``__import__()``) a
binding is placed in the parent module's namespace to the submodule object.
For example, if package ``spam`` has a submodule ``foo``, after importing
``spam.foo``, ``spam`` will have an attribute ``foo`` which is bound to the
submodule.  Let's say you have the following directory structure::

    spam/
        __init__.py
        foo.py
        bar.py

and ``spam/__init__.py`` has the following lines in it::

    from .foo import Foo
    from .bar import Bar

then executing the following puts name bindings for ``foo`` and ``bar`` in the
``spam`` module::

    >>> import spam
    >>> spam.foo
    <module 'spam.foo' from '/tmp/imports/spam/foo.py'>
    >>> spam.bar
    <module 'spam.bar' from '/tmp/imports/spam/bar.py'>

Given Python's familiar name binding rules this might seem surprising, but
it's actually a fundamental feature of the import system.  The invariant
is that if you have ``sys.modules['spam']`` and
``sys.modules['spam.foo']`` (as you would after the above import), the latter
must appear as the ``foo`` attribute of the former.
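
The invariant can be checked with a throwaway package.  In this sketch the
package is named ``demo_spam`` (a stand-in for ``spam``, chosen to avoid
clashing with anything installed) and its ``__init__.py`` is left empty so
that the binding is visibly created by the submodule import itself.

```python
import importlib
import pathlib
import shutil
import sys
import tempfile

root = pathlib.Path(tempfile.mkdtemp())
package = root / "demo_spam"
package.mkdir()
(package / "__init__.py").write_text("")
(package / "foo.py").write_text("class Foo:\n    pass\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

spam = importlib.import_module("demo_spam")
had_foo_before = hasattr(spam, "foo")
print(had_foo_before)  # False: the submodule has not been loaded yet

# Loading the submodule by any mechanism binds it on the parent package.
importlib.import_module("demo_spam.foo")
print(spam.foo is sys.modules["demo_spam.foo"])  # True

# Tidy up the temporary search path entry and files.
sys.path.remove(str(root))
shutil.rmtree(root, ignore_errors=True)
```
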

Module spec
-----------

The import machinery uses a variety of information about each module
during import, especially before loading.  Most of the information is
common to all modules.  The purpose of a module's spec is to encapsulate
this import-related information on a per-module basis.

Using a spec during import allows state to be transferred between import
system components, e.g. between the finder that creates the module spec
and the loader that executes it.  Most importantly, it allows the
import machinery to perform the boilerplate operations of loading,
whereas without a module spec the loader had that responsibility.

The module's spec is exposed as the ``__spec__`` attribute on a module object.
See :class:`~importlib.machinery.ModuleSpec` for details on the contents of
the module spec.

.. versionadded:: 3.4

.. _import-mod-attrs:

Import-related module attributes
--------------------------------

The import machinery fills in these attributes on each module object
during loading, based on the module's spec, before the loader executes
the module.

.. attribute:: __name__

   The ``__name__`` attribute must be set to the fully-qualified name of
   the module.  This name is used to uniquely identify the module in
   the import system.

.. attribute:: __loader__

   The ``__loader__`` attribute must be set to the loader object that
   the import machinery used when loading the module.  This is mostly
   for introspection, but can be used for additional loader-specific
   functionality, for example getting data associated with a loader.

.. attribute:: __package__

   The module's ``__package__`` attribute must be set.  Its value must
   be a string, but it can be the same value as its ``__name__``.  When
   the module is a package, its ``__package__`` value should be set to
   its ``__name__``.  When the module is not a package, ``__package__``
   should be set to the empty string for top-level modules, or for
   submodules, to the parent package's name.  See :pep:`366` for further
   details.

   This attribute is used instead of ``__name__`` to calculate explicit
   relative imports for main modules, as defined in :pep:`366`. It is
   expected to have the same value as ``__spec__.parent``.

   .. versionchanged:: 3.6
      The value of ``__package__`` is expected to be the same as
      ``__spec__.parent``.

.. attribute:: __spec__

   The ``__spec__`` attribute must be set to the module spec that was
   used when importing the module. Setting ``__spec__``
   appropriately applies equally to :ref:`modules initialized during
   interpreter startup <programs>`.  The one exception is ``__main__``,
   where ``__spec__`` is :ref:`set to None in some cases <main_spec>`.

   When ``__package__`` is not defined, ``__spec__.parent`` is used as
   a fallback.

   .. versionadded:: 3.4

   .. versionchanged:: 3.6
      ``__spec__.parent`` is used as a fallback when ``__package__`` is
      not defined.

.. attribute:: __path__

   If the module is a package (either regular or namespace), the module
   object's ``__path__`` attribute must be set.  The value must be
   iterable, but may be empty if ``__path__`` has no further significance.
   If ``__path__`` is not empty, it must produce strings when iterated
   over. More details on the semantics of ``__path__`` are given
   :ref:`below <package-path-rules>`.

   Non-package modules should not have a ``__path__`` attribute.

.. attribute:: __file__
.. attribute:: __cached__

   ``__file__`` is optional. If set, this attribute's value must be a
   string.  The import system may opt to leave ``__file__`` unset if it
   has no semantic meaning (e.g. a module loaded from a database).

   If ``__file__`` is set, it may also be appropriate to set the
   ``__cached__`` attribute which is the path to any compiled version of
   the code (e.g. byte-compiled file). The file does not need to exist
   to set this attribute; the path can simply point to where the
   compiled file would exist (see :pep:`3147`).

   It is also appropriate to set ``__cached__`` when ``__file__`` is not
   set.  However, that scenario is quite atypical.  Ultimately, the
   loader is what makes use of ``__file__`` and/or ``__cached__``.  So
   if a loader can load from a cached module but otherwise does not load
   from a file, that atypical scenario may be appropriate.
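
Several of these attributes can be inspected on a standard library module.
The sketch below looks at ``email.mime.text`` and its parent package, and
assumes an ordinary installation in which that module is loaded from a
source file.  It reads most values through ``__spec__``, since the
module-level attributes are expected to mirror the spec.

```python
import email.mime as package
import email.mime.text as module

spec = module.__spec__

print(module.__name__)  # email.mime.text
print(spec.name)        # email.mime.text
print(spec.parent)      # email.mime

# A non-package module has no __path__; its parent package does, and the
# value matches the search locations recorded on the package's spec.
print(hasattr(module, "__path__"))  # False
print(list(package.__path__) == list(package.__spec__.submodule_search_locations))  # True

# For a package, the spec's parent is the package's own name.
print(package.__spec__.parent)  # email.mime

# When the module comes from a file, __file__ mirrors the spec's origin.
print(module.__file__ == spec.origin)  # True
```
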

.. _package-path-rules:

module.__path__
---------------

By definition, if a module has a ``__path__`` attribute, it is a package.

A package's ``__path__`` attribute is used during imports of its subpackages.
Within the import machinery, it functions much the same as :data:`sys.path`,
i.e. providing a list of locations to search for modules during import.
However, ``__path__`` is typically much more constrained than
:data:`sys.path`.

``__path__`` must be an iterable of strings, but it may be empty.
The same rules used for :data:`sys.path` also apply to a package's
``__path__``, and :data:`sys.path_hooks` (described below) are
consulted when traversing a package's ``__path__``.

A package's ``__init__.py`` file may set or alter the package's ``__path__``
attribute, and this was typically the way namespace packages were implemented
prior to :pep:`420`.  With the adoption of :pep:`420`, namespace packages no
longer need to supply ``__init__.py`` files containing only ``__path__``
manipulation code; the import machinery automatically sets ``__path__``
correctly for the namespace package.

Module reprs
------------

By default, all modules have a usable repr; however, depending on the
attributes set above, and in the module's spec, you can more explicitly
control the repr of module objects.

If the module has a spec (``__spec__``), the import machinery will try
to generate a repr from it.  If that fails or there is no spec, the import
system will craft a default repr using whatever information is available
on the module.  It will try to use the ``module.__name__``,
``module.__file__``, and ``module.__loader__`` as input into the repr,
with defaults for whatever information is missing.

Here are the exact rules used:

 * If the module has a ``__spec__`` attribute, the information in the spec
   is used to generate the repr.  The "name", "loader", "origin", and
   "has_location" attributes are consulted.

 * If the module has a ``__file__`` attribute, this is used as part of the
   module's repr.

 * If the module has no ``__file__`` but does have a ``__loader__`` that is not
   ``None``, then the loader's repr is used as part of the module's repr.

 * Otherwise, just use the module's ``__name__`` in the repr.

.. versionchanged:: 3.4
   Use of :meth:`loader.module_repr() <importlib.abc.Loader.module_repr>`
   has been deprecated and the module spec is now used by the import
   machinery to generate a module repr.

   For backward compatibility with Python 3.3, the module repr will be
   generated by calling the loader's
   :meth:`~importlib.abc.Loader.module_repr` method, if defined, before
   trying either approach described above.  However, the method is deprecated.

677.. _pyc-invalidation:
678
679Cached bytecode invalidation
680----------------------------
681
682Before Python loads cached bytecode from ``.pyc`` file, it checks whether the
683cache is up-to-date with the source ``.py`` file. By default, Python does this
684by storing the source's last-modified timestamp and size in the cache file when
685writing it. At runtime, the import system then validates the cache file by
686checking the stored metadata in the cache file against at source's
687metadata.
688
689Python also supports "hash-based" cache files, which store a hash of the source
690file's contents rather than its metadata. There are two variants of hash-based
691``.pyc`` files: checked and unchecked. For checked hash-based ``.pyc`` files,
692Python validates the cache file by hashing the source file and comparing the
693resulting hash with the hash in the cache file. If a checked hash-based cache
694file is found to be invalid, Python regenerates it and writes a new checked
695hash-based cache file. For unchecked hash-based ``.pyc`` files, Python simply
696assumes the cache file is valid if it exists. Hash-based ``.pyc`` files
697validation behavior may be overridden with the :option:`--check-hash-based-pycs`
698flag.
699
700.. versionchanged:: 3.7
701   Added hash-based ``.pyc`` files. Previously, Python only supported
702   timestamp-based invalidation of bytecode caches.
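
The following sketch, which assumes a throwaway source file named
``demo.py``, writes a checked hash-based ``.pyc`` with
:func:`py_compile.compile` and then reads the flags field from the file
header.  The header layout used here (a 4-byte magic number followed by a
4-byte little-endian flags field) is the one described in :pep:`552`; treat
the bit masks as an illustration rather than a stable API.

.. code-block:: python

   import importlib.util
   import os
   import py_compile
   import tempfile

   with tempfile.TemporaryDirectory() as tmp:
       # "demo.py" is a placeholder module written only for this example.
       source = os.path.join(tmp, "demo.py")
       with open(source, "w", encoding="utf-8") as f:
           f.write("ANSWER = 42\n")

       # Ask for a checked hash-based cache file instead of the default.
       cached = py_compile.compile(
           source,
           cfile=os.path.join(tmp, "demo.pyc"),
           invalidation_mode=py_compile.PycInvalidationMode.CHECKED_HASH,
           doraise=True,
       )

       with open(cached, "rb") as f:
           header = f.read(16)

   magic = header[:4]
   flags = int.from_bytes(header[4:8], "little")
   is_hash_based = bool(flags & 0b01)   # lowest bit: hash-based pyc
   check_source = bool(flags & 0b10)    # second bit: validate against source

   print(magic == importlib.util.MAGIC_NUMBER, is_hash_based, check_source)

Passing ``PycInvalidationMode.UNCHECKED_HASH`` instead would produce a cache
file that is assumed valid whenever it exists.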


The Path Based Finder
=====================

.. index::
    single: path based finder

As mentioned previously, Python comes with several default meta path finders.
One of these, called the :term:`path based finder`
(:class:`~importlib.machinery.PathFinder`), searches an :term:`import path`,
which contains a list of :term:`path entries <path entry>`.  Each path
entry names a location to search for modules.

The path based finder itself doesn't know how to import anything. Instead, it
traverses the individual path entries, associating each of them with a
path entry finder that knows how to handle that particular kind of path.

The default set of path entry finders implement all the semantics for finding
modules on the file system, handling special file types such as Python source
code (``.py`` files), Python byte code (``.pyc`` files) and
shared libraries (e.g. ``.so`` files). When supported by the :mod:`zipimport`
module in the standard library, the default path entry finders also handle
loading all of these file types (other than shared libraries) from zipfiles.

Path entries need not be limited to file system locations.  They can refer to
URLs, database queries, or any other location that can be specified as a
string.

The path based finder provides additional hooks and protocols so that you
can extend and customize the types of searchable path entries.  For example,
if you wanted to support path entries as network URLs, you could write a hook
that implements HTTP semantics to find modules on the web.  This hook (a
callable) would return a :term:`path entry finder` supporting the protocol
described below, which would then be used to get a loader for the module from
the web.

A word of warning: this section and the previous both use the term *finder*,
distinguishing between them by using the terms :term:`meta path finder` and
:term:`path entry finder`.  These two types of finders are very similar,
support similar protocols, and function in similar ways during the import
process, but it's important to keep in mind that they are subtly different.
In particular, meta path finders operate at the beginning of the import
process, as keyed off the :data:`sys.meta_path` traversal.

By contrast, path entry finders are in a sense an implementation detail
of the path based finder, and in fact, if the path based finder were to be
removed from :data:`sys.meta_path`, none of the path entry finder semantics
would be invoked.


Path entry finders
------------------

.. index::
    single: sys.path
    single: sys.path_hooks
    single: sys.path_importer_cache
    single: PYTHONPATH

The :term:`path based finder` is responsible for finding and loading
Python modules and packages whose location is specified with a string
:term:`path entry`.  Most path entries name locations in the file system,
but they need not be limited to this.

As a meta path finder, the :term:`path based finder` implements the
:meth:`~importlib.abc.MetaPathFinder.find_spec` protocol previously
described; however, it exposes additional hooks that can be used to
customize how modules are found and loaded from the :term:`import path`.

Three variables are used by the :term:`path based finder`: :data:`sys.path`,
:data:`sys.path_hooks` and :data:`sys.path_importer_cache`.  The ``__path__``
attributes on package objects are also used.  These provide additional ways
that the import machinery can be customized.

:data:`sys.path` contains a list of strings providing search locations for
modules and packages.  It is initialized from the :envvar:`PYTHONPATH`
environment variable and various other installation- and
implementation-specific defaults.  Entries in :data:`sys.path` can name
directories on the file system, zip files, and potentially other "locations"
(see the :mod:`site` module) that should be searched for modules, such as
URLs, or database queries.  Only strings and bytes should be present on
:data:`sys.path`; all other data types are ignored.  The encoding of bytes
entries is determined by the individual :term:`path entry finders <path entry
finder>`.

The :term:`path based finder` is a :term:`meta path finder`, so the import
machinery begins the :term:`import path` search by calling the path
based finder's :meth:`~importlib.machinery.PathFinder.find_spec` method as
described previously.  When the ``path`` argument to
:meth:`~importlib.machinery.PathFinder.find_spec` is given, it will be a
list of string paths to traverse - typically a package's ``__path__``
attribute for an import within that package.  If the ``path`` argument is
``None``, this indicates a top level import and :data:`sys.path` is used.
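
The two calling conventions can be tried directly.  This is a small sketch
that assumes the standard library's :mod:`json` package is installed on the
file system in the usual way; ``no_such_module_for_demo`` is a made-up name
that is expected not to exist.

.. code-block:: python

   import json
   from importlib.machinery import PathFinder

   # path omitted (None): a top level import, so sys.path is searched.
   top_level = PathFinder.find_spec("json")

   # path given: search only the package's __path__ for a submodule.
   submodule = PathFinder.find_spec("json.decoder", json.__path__)

   # Nothing on the search path provides this name, so None is returned.
   missing = PathFinder.find_spec("no_such_module_for_demo")

   print(top_level.origin)
   print(submodule.origin)
   print(missing)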

The path based finder iterates over every entry in the search path, and
for each of these, looks for an appropriate :term:`path entry finder`
(:class:`~importlib.abc.PathEntryFinder`) for the
path entry.  Because this can be an expensive operation (e.g. there may be
``stat()`` call overheads for this search), the path based finder maintains
a cache mapping path entries to path entry finders.  This cache is maintained
in :data:`sys.path_importer_cache` (despite the name, this cache actually
stores finder objects rather than being limited to :term:`importer` objects).
In this way, the expensive search for a particular :term:`path entry`
location's :term:`path entry finder` need only be done once.  User code is
free to remove cache entries from :data:`sys.path_importer_cache` forcing
the path based finder to perform the path entry search again [#fnpic]_.
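
The caching behaviour can be observed with a temporary directory.  In this
sketch the module name ``cache_demo_mod`` is invented for the example, and
the finder type that ends up in the cache is whatever the default path hooks
produce for a directory.

.. code-block:: python

   import importlib
   import os
   import sys
   import tempfile

   with tempfile.TemporaryDirectory() as tmp:
       # Create a tiny module inside a fresh directory.
       path = os.path.join(tmp, "cache_demo_mod.py")
       with open(path, "w", encoding="utf-8") as f:
           f.write("VALUE = 1\n")

       sys.path.insert(0, tmp)
       try:
           module = importlib.import_module("cache_demo_mod")

           # The finder chosen for this path entry is now cached.
           finder = sys.path_importer_cache.get(tmp)
           print(type(finder).__name__)

           # Removing the entry forces a fresh path hook search next time.
           removed = sys.path_importer_cache.pop(tmp, None)
       finally:
           # Undo the changes made for the demonstration.
           sys.path.remove(tmp)
           sys.modules.pop("cache_demo_mod", None)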

If the path entry is not present in the cache, the path based finder iterates
over every callable in :data:`sys.path_hooks`.  Each of the :term:`path entry
hooks <path entry hook>` in this list is called with a single argument, the
path entry to be searched.  This callable may either return a :term:`path
entry finder` that can handle the path entry, or it may raise
:exc:`ImportError`.  An :exc:`ImportError` is used by the path based finder to
signal that the hook cannot find a :term:`path entry finder`
for that :term:`path entry`.  The
exception is ignored and :term:`import path` iteration continues.  The hook
should expect either a string or bytes object; the encoding of bytes objects
is up to the hook (e.g. it may be a file system encoding, UTF-8, or something
else), and if the hook cannot decode the argument, it should raise
:exc:`ImportError`.

If :data:`sys.path_hooks` iteration ends with no :term:`path entry finder`
being returned, then the path based finder's
:meth:`~importlib.machinery.PathFinder.find_spec` method will store ``None``
in :data:`sys.path_importer_cache` (to indicate that there is no finder for
this path entry) and return ``None``, indicating that this
:term:`meta path finder` could not find the module.
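
A path entry hook can be sketched as follows.  The ``demo-mem:`` prefix, the
``NullEntryFinder`` class and the ``demo_hook`` function are all hypothetical
names chosen for this illustration; the finder deliberately never finds
anything, so that only the hook dispatch and the caching are shown.

.. code-block:: python

   import importlib.abc
   import sys
   from importlib.machinery import PathFinder

   PREFIX = "demo-mem:"  # hypothetical marker for entries this hook handles


   class NullEntryFinder(importlib.abc.PathEntryFinder):
       """Placeholder finder that never finds a module (illustration only)."""

       def __init__(self, entry):
           self.entry = entry

       def find_spec(self, fullname, target=None):
           return None


   def demo_hook(entry):
       # This sketch accepts only str entries carrying the made-up prefix;
       # anything else is declined by raising ImportError.
       if not isinstance(entry, str) or not entry.startswith(PREFIX):
           raise ImportError(f"demo_hook cannot handle {entry!r}")
       return NullEntryFinder(entry)


   entry = PREFIX + "example"
   sys.path_hooks.append(demo_hook)
   try:
       # The other hooks decline this entry, so demo_hook supplies the finder.
       spec = PathFinder.find_spec("hook_demo_missing", [entry])
       cached = sys.path_importer_cache.get(entry)
   finally:
       # Restore the global state touched by the demonstration.
       sys.path_hooks.remove(demo_hook)
       sys.path_importer_cache.pop(entry, None)

   print(spec, type(cached).__name__)

Appending the hook, rather than inserting it at the front, lets the default
hooks see each path entry first.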

If a :term:`path entry finder` *is* returned by one of the :term:`path entry
hook` callables on :data:`sys.path_hooks`, then the following protocol is used
to ask the finder for a module spec, which is then used when loading the
module.

The current working directory -- denoted by an empty string -- is handled
slightly differently from other entries on :data:`sys.path`. First, if the
current working directory is found to not exist, no value is stored in
:data:`sys.path_importer_cache`. Second, the value for the current working
directory is looked up fresh for each module lookup. Third, the path used for
:data:`sys.path_importer_cache` and returned by
:meth:`importlib.machinery.PathFinder.find_spec` will be the actual current
working directory and not the empty string.

Path entry finder protocol
--------------------------

In order to support imports of modules and initialized packages and also to
contribute portions to namespace packages, path entry finders must implement
the :meth:`~importlib.abc.PathEntryFinder.find_spec` method.

:meth:`~importlib.abc.PathEntryFinder.find_spec` takes two arguments: the
fully qualified name of the module being imported, and the (optional) target
module.  ``find_spec()`` returns a fully populated spec for the module.
This spec will always have "loader" set (with one exception).

To indicate to the import machinery that the spec represents a namespace
:term:`portion`, the path entry finder sets "loader" on the spec to
``None`` and "submodule_search_locations" to a list containing the
portion.
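
A minimal sketch of the protocol, under the assumption that modules are kept
as source text in a dictionary, might look like this.  ``StringLoader``,
``DictEntryFinder``, the entry name ``demo-entry`` and the module names are
all hypothetical and exist only for the illustration.

.. code-block:: python

   import importlib.abc
   import importlib.machinery
   import importlib.util


   class StringLoader(importlib.abc.Loader):
       """Hypothetical loader that executes source text held in memory."""

       def __init__(self, source):
           self.source = source

       def create_module(self, spec):
           return None  # fall back to the default module creation

       def exec_module(self, module):
           exec(self.source, module.__dict__)


   class DictEntryFinder(importlib.abc.PathEntryFinder):
       """Hypothetical finder serving modules from a dict of sources."""

       def __init__(self, entry, sources, namespaces=()):
           self.entry = entry
           self.sources = dict(sources)
           self.namespaces = set(namespaces)

       def find_spec(self, fullname, target=None):
           if fullname in self.sources:
               # Regular module: the spec carries a loader.
               loader = StringLoader(self.sources[fullname])
               return importlib.util.spec_from_loader(
                   fullname, loader, origin=self.entry
               )
           if fullname in self.namespaces:
               # Namespace portion: loader is None and the portion is listed.
               spec = importlib.machinery.ModuleSpec(fullname, None)
               spec.submodule_search_locations = [f"{self.entry}/{fullname}"]
               return spec
           return None  # this path entry does not provide the module


   finder = DictEntryFinder(
       "demo-entry", {"greeting": "TEXT = 'hello'\n"}, {"plugins"}
   )

   module_spec = finder.find_spec("greeting")
   module = importlib.util.module_from_spec(module_spec)
   module_spec.loader.exec_module(module)

   namespace_spec = finder.find_spec("plugins")
   print(module.TEXT, namespace_spec.loader)

In normal use the finder would be returned from a path entry hook rather than
called directly; it is driven by hand here to keep the example short.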

.. versionchanged:: 3.4
   :meth:`~importlib.abc.PathEntryFinder.find_spec` replaced
   :meth:`~importlib.abc.PathEntryFinder.find_loader` and
   :meth:`~importlib.abc.PathEntryFinder.find_module`, both of which
   are now deprecated, but will be used if ``find_spec()`` is not defined.

   Older path entry finders may implement one of these two deprecated methods
   instead of ``find_spec()``.  The methods are still respected for the
   sake of backward compatibility.  However, if ``find_spec()`` is
   implemented on the path entry finder, the legacy methods are ignored.

   :meth:`~importlib.abc.PathEntryFinder.find_loader` takes one argument, the
   fully qualified name of the module being imported.  ``find_loader()``
   returns a 2-tuple where the first item is the loader and the second item
   is a namespace :term:`portion`.  When the first item (i.e. the loader) is
   ``None``, this means that while the path entry finder does not have a
   loader for the named module, it knows that the path entry contributes to
   a namespace portion for the named module.  This will almost always be the
   case where Python is asked to import a namespace package that has no
   physical presence on the file system.  When a path entry finder returns
   ``None`` for the loader, the second item of the 2-tuple return value must
   be a sequence, although it can be empty.

   If ``find_loader()`` returns a non-``None`` loader value, the portion is
   ignored and the loader is returned from the path based finder, terminating
   the search through the path entries.

   For backwards compatibility with other implementations of the import
   protocol, many path entry finders also support the same,
   traditional ``find_module()`` method that meta path finders support.
   However, path entry finder ``find_module()`` methods are never called
   with a ``path`` argument (they are expected to record the appropriate
   path information from the initial call to the path hook).

   The ``find_module()`` method on path entry finders is deprecated,
   as it does not allow the path entry finder to contribute portions to
   namespace packages.  If both ``find_loader()`` and ``find_module()``
   exist on a path entry finder, the import system will always call
   ``find_loader()`` in preference to ``find_module()``.


Replacing the standard import system
====================================

The most reliable mechanism for replacing the entire import system is to
delete the default contents of :data:`sys.meta_path`, replacing them
entirely with a custom meta path hook.

If it is acceptable to only alter the behaviour of import statements
without affecting other APIs that access the import system, then replacing
the builtin :func:`__import__` function may be sufficient. This technique
may also be employed at the module level to only alter the behaviour of
import statements within that module.
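
As a sketch of the second approach, the following wraps the builtin
:func:`__import__` to record requested names and then restores it.  The
names ``recording_import`` and ``seen_names`` are invented for the example,
and the observation that :func:`importlib.import_module` bypasses the wrapper
for an already imported module reflects CPython's behaviour rather than a
language guarantee.

.. code-block:: python

   import builtins
   import importlib

   original_import = builtins.__import__
   seen_names = []


   def recording_import(name, globals=None, locals=None, fromlist=(), level=0):
       # Record the requested name, then defer to the real implementation.
       seen_names.append(name)
       return original_import(name, globals, locals, fromlist, level)


   builtins.__import__ = recording_import
   try:
       import json  # the import statement calls builtins.__import__
       count_after_statement = len(seen_names)

       # json is already in sys.modules, so nothing further is imported
       # here, and CPython's import_module() does not call __import__.
       importlib.import_module("json")
       count_after_function = len(seen_names)
   finally:
       # Always restore the original function.
       builtins.__import__ = original_import

   print(seen_names[0], count_after_statement == count_after_function)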

To selectively prevent import of some modules from a hook early on the
meta path (rather than disabling the standard import system entirely),
it is sufficient to raise :exc:`ModuleNotFoundError` directly from
:meth:`~importlib.abc.MetaPathFinder.find_spec` instead of returning
``None``. The latter indicates that the meta path search should continue,
while raising an exception terminates it immediately.
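
A blocking finder along these lines could be sketched as below.
``BlockingFinder`` and ``blocked_demo_module`` are hypothetical names used
only for this illustration.  Note that a module already present in
:data:`sys.modules` is returned from that cache before the meta path is
consulted, so such a finder only affects modules that have not been imported
yet.

.. code-block:: python

   import importlib.abc
   import sys


   class BlockingFinder(importlib.abc.MetaPathFinder):
       """Hypothetical finder that refuses to import selected names."""

       def __init__(self, blocked):
           self.blocked = frozenset(blocked)

       def find_spec(self, fullname, path, target=None):
           if fullname in self.blocked:
               # Raising ends the meta path search immediately.
               raise ModuleNotFoundError(
                   f"import of {fullname!r} is blocked", name=fullname
               )
           return None  # let later meta path finders handle everything else


   finder = BlockingFinder({"blocked_demo_module"})
   sys.meta_path.insert(0, finder)  # early on the meta path
   blocked_error = None
   try:
       import blocked_demo_module
   except ModuleNotFoundError as exc:
       blocked_error = exc
   finally:
       sys.meta_path.remove(finder)

   print(blocked_error)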


Special considerations for __main__
===================================

The :mod:`__main__` module is a special case relative to Python's import
system.  As noted :ref:`elsewhere <programs>`, the ``__main__`` module
is directly initialized at interpreter startup, much like :mod:`sys` and
:mod:`builtins`.  However, unlike those two, it doesn't strictly
qualify as a built-in module.  This is because the manner in which
``__main__`` is initialized depends on the flags and other options with
which the interpreter is invoked.

.. _main_spec:

__main__.__spec__
-----------------

Depending on how :mod:`__main__` is initialized, ``__main__.__spec__``
gets set appropriately or to ``None``.

When Python is started with the :option:`-m` option, ``__spec__`` is set
to the module spec of the corresponding module or package. ``__spec__`` is
also populated when the ``__main__`` module is loaded as part of executing a
directory, zipfile or other :data:`sys.path` entry.

In :ref:`the remaining cases <using-on-interface-options>`,
``__main__.__spec__`` is set to ``None``, as the code used to populate the
:mod:`__main__` does not correspond directly with an importable module:

- interactive prompt
- :option:`-c` option
- running from stdin
- running directly from a source or bytecode file

Note that ``__main__.__spec__`` is always ``None`` in the last case,
*even if* the file could technically be imported directly as a module
instead. Use the :option:`-m` switch if valid module metadata is desired
in :mod:`__main__`.
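
A program can inspect this attribute at run time.  The helper name
``describe_main`` below is invented for the illustration, and the result
depends entirely on how the interpreter was started.

.. code-block:: python

   import __main__


   def describe_main():
       """Report whether __main__ carries module metadata."""
       spec = getattr(__main__, "__spec__", None)
       if spec is None:
           # Interactive prompt, -c, stdin, or a file given by path.
           return "__main__ has no module spec"
       # Started with -m, or from a directory, zipfile or other path entry.
       return f"__main__ was populated from module {spec.name!r}"


   print(describe_main())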

Note also that even when ``__main__`` corresponds with an importable module
and ``__main__.__spec__`` is set accordingly, they're still considered
*distinct* modules. This is due to the fact that blocks guarded by
``if __name__ == "__main__":`` checks only execute when the module is used
to populate the ``__main__`` namespace, and not during normal import.


Open issues
===========

XXX It would be really nice to have a diagram.

XXX * (import_machinery.rst) how about a section devoted just to the
attributes of modules and packages, perhaps expanding upon or supplanting the
related entries in the data model reference page?

XXX runpy, pkgutil, et al in the library manual should all get "See Also"
links at the top pointing to the new import system section.

XXX Add more explanation regarding the different ways in which
``__main__`` is initialized?

XXX Add more info on ``__main__`` quirks/pitfalls (i.e. copy from
:pep:`395`).


References
==========

The import machinery has evolved considerably since Python's early days.  The
original `specification for packages
<https://www.python.org/doc/essays/packages/>`_ is still available to read,
although some details have changed since the writing of that document.

The original specification for :data:`sys.meta_path` was :pep:`302`, with
subsequent extension in :pep:`420`.

:pep:`420` introduced :term:`namespace packages <namespace package>` for
Python 3.3.  :pep:`420` also introduced the :meth:`find_loader` protocol as an
alternative to :meth:`find_module`.

:pep:`366` describes the addition of the ``__package__`` attribute for
explicit relative imports in main modules.

:pep:`328` introduced absolute and explicit relative imports and initially
proposed ``__name__`` for semantics :pep:`366` would eventually specify for
``__package__``.

:pep:`338` defines executing modules as scripts.

:pep:`451` adds the encapsulation of per-module import state in spec
objects.  It also off-loads most of the boilerplate responsibilities of
loaders back onto the import machinery.  These changes allow the
deprecation of several APIs in the import system and the addition of new
methods to finders and loaders.

.. rubric:: Footnotes

.. [#fnmo] See :class:`types.ModuleType`.

.. [#fnlo] The importlib implementation avoids using the return value
   directly. Instead, it gets the module object by looking the module name up
   in :data:`sys.modules`.  The indirect effect of this is that an imported
   module may replace itself in :data:`sys.modules`.  This is
   implementation-specific behavior that is not guaranteed to work in other
   Python implementations.

.. [#fnpic] In legacy code, it is possible to find instances of
   :class:`imp.NullImporter` in the :data:`sys.path_importer_cache`.  It
   is recommended that code be changed to use ``None`` instead.  See
   :ref:`portingpythoncode` for more details.