:mod:`modulegraph.modulegraph` --- Find modules used by a script
================================================================

.. module:: modulegraph.modulegraph
   :synopsis: Find modules used by a script

This module defines :class:`ModuleGraph`, which is used to find
the dependencies of scripts using bytecode analysis.

A number of APIs in this module refer to filesystem paths. Those paths can refer to
files inside zipfiles (for example when there are zipped egg files on :data:`sys.path`).
Filenames referring to entries in a zipfile are not marked in any way: if ``"somepath.zip"``
refers to a zipfile, then ``"somepath.zip/embedded/file"`` is used to refer to
``embedded/file`` inside the zipfile.

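The zipfile path convention can be illustrated with the standard library alone.
The sketch below is not how modulegraph resolves such paths; the helper
``split_zip_path`` is a hypothetical name, and it simply assumes that the first
existing prefix of the path decides whether the rest is an archive member.

```python
import os
import tempfile
import zipfile


def split_zip_path(path):
    """Split *path* into (archive, member) if a leading part is a zipfile.

    Returns (path, None) when no prefix of *path* is an existing zipfile.
    """
    head = path
    member_parts = []
    # Strip trailing components until an existing filesystem entry is found.
    while head and not os.path.exists(head):
        head, tail = os.path.split(head)
        if not tail:
            break
        member_parts.insert(0, tail)
    if member_parts and os.path.isfile(head) and zipfile.is_zipfile(head):
        # Entries inside a zipfile always use forward slashes.
        return head, "/".join(member_parts)
    return path, None


with tempfile.TemporaryDirectory() as tmp:
    archive = os.path.join(tmp, "somepath.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("embedded/file", "data")
    found = split_zip_path(os.path.join(archive, "embedded", "file"))
    plain = split_zip_path(archive)
```

Here ``found`` holds the archive path and the member name ``embedded/file``,
while ``plain`` reports no member because the path names the archive itself.
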
The actual graph
----------------

.. class:: ModuleGraph([path[, excludes[, replace_paths[, implies[, graph[, debug]]]]]])

   Create a new ModuleGraph object. Use the :meth:`run_script` method to add scripts
   and their dependencies to the graph.

   :param path: Python search path to use, defaults to :data:`sys.path`
   :param excludes: Iterable with module names that should not be included as a dependency
   :param replace_paths: List of pathname rewrites ``(old, new)``. When this argument is
     supplied the ``co_filename`` attributes of code objects get rewritten before scanning
     them for dependencies.
   :param implies: Implied module dependencies, a mapping from a module name to the list
     of modules it depends on. Use this to tell modulegraph about dependencies that cannot
     be found by code inspection (such as imports from C code or using the :func:`__import__`
     function).
   :param graph: A precreated :class:`Graph <altgraph.Graph.Graph>` object to use; the
     default is to create a new one.
   :param debug: The :class:`ObjectGraph <altgraph.ObjectGraph.ObjectGraph>` debug level.


.. method:: run_script(pathname[, caller])

   Create and return a node by path (not module name). The *pathname* should
   refer to a Python source file and will be scanned for dependencies.

   The optional argument *caller* is the node that calls this script,
   and is used to add a reference in the graph.

.. method:: import_hook(name[, caller[, fromlist[, level[, attr]]]])

   Import a module and analyse its dependencies.

   :arg name:     The module name
   :arg caller:   The node that caused the import to happen
   :arg fromlist: The list of names to import; this is an empty list for
      ``import name`` and a list of names for ``from name import a, b, c``.
   :arg level:    The import level. The value should be ``-1`` for classical Python 2
     imports, ``0`` for absolute imports and a positive number for relative imports
     (where the value is the number of leading dots in the imported name).
   :arg attr:     Attributes for the graph edge.


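The *level* values accepted by :meth:`import_hook` mirror Python's own import
semantics. The sketch below is not part of modulegraph: it uses the standard
library function ``importlib.util.resolve_name`` to show how a name and a
non-negative level map onto an absolute module name. The helper
``absolute_name`` and the package name ``a.b`` are made up for illustration, and
the Python 2 value ``-1`` is deliberately rejected.

```python
from importlib.util import resolve_name


def absolute_name(name, level, package):
    """Resolve *name* imported with *level* leading dots from inside *package*."""
    if level < 0:
        raise ValueError("implicit relative imports (level -1) are Python 2 only")
    # level == 0 leaves the name untouched; each extra dot beyond the first
    # climbs one package higher before appending the name.
    return resolve_name("." * level + name, package)


absolute = absolute_name("os.path", 0, "a.b")  # import os.path
sibling = absolute_name("c", 1, "a.b")         # from .c import ...
uncle = absolute_name("c", 2, "a.b")           # from ..c import ...
```

With these inputs the three calls resolve to ``os.path``, ``a.b.c`` and ``a.c``.
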
.. method:: implyNodeReference(node, other, edgeData=None)

   Explicitly mark that *node* depends on *other*. *other* is either
   a :class:`node <Node>` or the name of a module that will be
   searched for as if it were an absolute import.


.. method:: createReference(fromnode, tonode[, edge_data])

   Create a reference from *fromnode* to *tonode*, with optional edge data.

   The default for *edge_data* is ``"direct"``.

.. method:: getReferences(fromnode)

   Yield all nodes that *fromnode* refers to. That is, all modules imported
   by *fromnode*.

   Node :data:`None` is the root of the graph, and refers to all nodes that were
   explicitly imported by :meth:`run_script` or :meth:`import_hook`, unless you use
   an explicit parent with those methods.

   .. versionadded:: 0.11

.. method:: getReferers(tonode, collapse_missing_modules=True)

   Yield all nodes that refer to *tonode*. That is, all modules that import
   *tonode*.

   If *collapse_missing_modules* is false this includes references from
   :class:`MissingModule` nodes, otherwise :class:`MissingModule` nodes
   are replaced by the "real" nodes that reference this missing node.

   .. versionadded:: 0.12

.. method:: foldReferences(pkgnode)

   Hide all submodule nodes for package *pkgnode* and add incoming and outgoing
   edges to *pkgnode* based on the edges from the submodule nodes.

   This can be used to simplify a module graph: after folding 'email' all
   references to modules in the 'email' package are references to the package.

   .. versionadded:: 0.11

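The effect of folding can be shown on a plain set of ``(from, to)`` edges. This
is only a sketch of the idea described for :meth:`foldReferences`, not the
method's implementation: the function ``fold_package`` and the sample edges are
hypothetical, and module names stand in for graph nodes.

```python
def fold_package(edges, package):
    """Redirect every edge touching a submodule of *package* to the package."""
    prefix = package + "."

    def fold(name):
        # Submodules collapse onto the package; everything else is unchanged.
        return package if name.startswith(prefix) else name

    folded = set()
    for source, target in edges:
        source, target = fold(source), fold(target)
        if source != target:  # drop edges that became package-internal
            folded.add((source, target))
    return folded


edges = {
    ("app", "email.mime.text"),
    ("email.mime.text", "email.charset"),
    ("email.charset", "codecs"),
    ("app", "json"),
}
folded = fold_package(edges, "email")
```

After folding, the graph only mentions ``email`` itself: the edge between the
two submodules disappears and the remaining edges start or end at the package.
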
.. method:: findNode(name)

   Find a node by identifier.  If a node by that identifier exists, it will be returned.

   If a lazy node exists by that identifier with no dependencies (excluded), it will be
   instantiated and returned.

   If a lazy node exists by that identifier with dependencies, it and its
   dependencies will be instantiated and scanned for additional dependencies.



.. method:: create_xref([out])

   Write an HTML file to the *out* stream (defaulting to :data:`sys.stdout`).

   The HTML file contains a textual description of the dependency graph.



.. method:: graphreport([fileobj[, flatpackages]])

   .. todo:: To be documented



.. method:: report()

   Print a report to stdout, listing the found modules with their
   paths, as well as modules that are missing, or seem to be missing.


Mostly internal methods
.......................

The methods in this section should be considered methods for subclassing at best.
Please let us know if you need these methods in your code, as they are on track to be
made private before the 1.0 release.

.. warning:: The methods in this section will be refactored in a future release;
   the current architecture makes it unnecessarily hard to write proper tests.

.. method:: determine_parent(caller)

   Returns the node of the package root for *caller*. If *caller* is a package
   this is the node itself; if *caller* is a module in a package this is the
   node for the package; otherwise *caller* is not part of a package and
   the result is :data:`None`.

.. method:: find_head_package(parent, name[, level])

   .. todo:: To be documented


.. method:: load_tail(mod, tail)

   This method is called to load the rest of a dotted name after loading the root
   of a package. This will import all intermediate modules as well (using
   :meth:`import_module`), and returns the module :class:`node <Node>` for the
   requested node.

   .. note:: When *tail* is empty this will just return *mod*.

   :arg mod:   A start module (instance of :class:`Node`)
   :arg tail:  The rest of a dotted name, can be empty
   :raise ImportError: When the requested module (or one of its parents) cannot be found
   :returns: the requested module



.. method:: ensure_fromlist(m, fromlist)

   Yield all submodules that would be imported when importing *fromlist*
   from *m* (using ``from m import fromlist...``).

   *m* must be a package and not a regular module.

.. method:: find_all_submodules(m)

   Yield the filenames for submodules in the same package as *m*.



.. method:: import_module(partname, fqname, parent)

   Perform import of the module with basename *partname* (``path``) and
   full name *fqname* (``os.path``). Import is performed by *parent*.

   This will create a reference from the parent node to the
   module node and will load the module node when it is not already
   loaded.



.. method:: load_module(fqname, fp, pathname, (suffix, mode, type))

   Load the module named *fqname* from the given *pathname*. The
   argument *fp* is either :data:`None`, or a stream from which the
   code for the Python module can be loaded (either byte-code or
   the source code). The *(suffix, mode, type)* tuple holds the
   suffix of the source file, the open mode for the file and the
   type of module.

   Creates a node of the right class and processes the dependencies
   of the :class:`node <Node>` by scanning the byte-code for the node.

   Returns the resulting :class:`node <Node>`.



.. method:: scan_code(code, m)

   Scan the *code* object for module *m* and update the dependencies of
   *m* using the import statements found in the code.

   This will automatically scan the code for nested functions, generator
   expressions and list comprehensions as well.



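The kind of bytecode scan described for :meth:`scan_code` can be approximated
with the standard :mod:`dis` module. This is a simplified sketch, not
modulegraph's scanner: the function ``imported_module_names`` is hypothetical,
it only collects the names seen by ``IMPORT_NAME`` instructions, and it ignores
import levels and ``from`` lists.

```python
import dis
import types


def imported_module_names(code):
    """Collect module names imported anywhere in *code*, including nested code."""
    names = set()
    for instruction in dis.get_instructions(code):
        if instruction.opname == "IMPORT_NAME":
            names.add(instruction.argval)
    # Functions, generator expressions and similar constructs are stored as
    # nested code objects in co_consts, so recurse into those as well.
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            names.update(imported_module_names(const))
    return names


source = (
    "import os\n"
    "from collections import abc\n"
    "def helper():\n"
    "    import json\n"
    "    return [x for x in json.loads('[]')]\n"
)
# compile() only builds the code object; nothing is imported or executed.
found = imported_module_names(compile(source, "<example>", "exec"))
```

The result contains ``os``, ``collections`` and ``json``; the last one is only
found because the nested code object for ``helper`` is scanned too.
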
.. method:: load_package(fqname, pathname)

   Load a package directory.



.. method:: find_module(name, path[, parent])

   Locates a module named *name* that is not yet part of the
   graph. This method will raise :exc:`ImportError` when
   the module cannot be found or when it is already part
   of the graph. The *name* cannot be a dotted name.

   The *path* is the search path used, or :data:`None` to
   use the default path.

   When the *parent* is specified *name* refers to a
   subpackage of *parent*, and *path* should be the
   search path of the parent.

   Returns the result of the global function
   :func:`find_module <modulegraph.modulegraph.find_module>`.


.. method:: itergraphreport([name[, flatpackages]])

   .. todo:: To be documented



.. method:: replace_paths_in_code(co)

   Replace the filenames in code object *co* using the *replace_paths* value that
   was passed to the constructor. Returns the rewritten code object.



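A filename rewrite of the kind described for :meth:`replace_paths_in_code` can
be sketched as follows. This is an illustration under assumptions, not the
method's code: ``rewrite_code_paths`` is a hypothetical helper, it treats each
``(old, new)`` pair as a simple prefix replacement, and it relies on
``code.replace()``, which needs Python 3.8 or later.

```python
import types


def rewrite_code_paths(co, replace_paths):
    """Return a copy of *co* with co_filename rewritten, recursing into co_consts."""
    filename = co.co_filename
    for old, new in replace_paths:
        if filename.startswith(old):
            filename = new + filename[len(old):]
            break  # the first matching rewrite wins in this sketch
    # Nested functions and classes carry their own co_filename.
    consts = tuple(
        rewrite_code_paths(const, replace_paths)
        if isinstance(const, types.CodeType) else const
        for const in co.co_consts
    )
    return co.replace(co_filename=filename, co_consts=consts)


original = compile("def f():\n    return 1\n", "/build/tmp/pkg/mod.py", "exec")
rewritten = rewrite_code_paths(original, [("/build/tmp", "/usr/lib")])
nested = [c for c in rewritten.co_consts if isinstance(c, types.CodeType)]
```

Code objects are immutable, so the sketch returns new objects and leaves
``original`` untouched.
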
.. method:: calc_setuptools_nspackages()

   Returns a mapping from package name to a list of paths where that package
   can be found in ``--single-version-externally-managed`` form.

   This method is used to be able to find those packages: these use
   a magic ``.pth`` file to ensure that the package is added to :data:`sys.path`,
   as they do not contain an ``__init__.py`` file.

   Packages in this form are used by system packages and the "pip"
   installer.


Graph nodes
-----------

The :class:`ModuleGraph` contains nodes that represent the various types of modules.

.. class:: Alias(value)

   This is a subclass of string that is used to mark module aliases.



.. class:: Node(identifier)

   Base class for nodes, which provides the common functionality.

   Nodes can be used as mappings for storing arbitrary data in the node.

   Nodes are compared by comparing their *identifier*.

.. data:: debug

   Debug level (integer)

.. data:: graphident

   The node identifier, this is the value of the *identifier* argument
   to the constructor.

.. data:: identifier

   The node identifier, this is the value of the *identifier* argument
   to the constructor.

.. data:: filename

   The filename associated with this node.

.. data:: packagepath

   The value of ``__path__`` for this node.

.. data:: code

   The :class:`code object <types.CodeType>` associated with this node.

.. data:: globalnames

   The set of global names that are assigned to in this module. This
   includes those names imported through star imports
   (``from module import *``) of Python modules.

.. data:: starimports

   The set of star imports this module did that could not be resolved,
   i.e. a star import from a non-Python module.


.. method:: __contains__(name)

   Return whether there is a value associated with *name*.

   This method is usually accessed as ``name in aNode``.

.. method:: __setitem__(name, value)

   Set the value of *name* to *value*.

   This method is usually accessed as ``aNode[name] = value``.

.. method:: __getitem__(name)

   Returns the value of *name*, raises :exc:`KeyError` when
   it cannot be found.

   This method is usually accessed as ``value = aNode[name]``.

.. method:: get(name[, default])

   Returns the value of *name*, or the default value when it
   cannot be found. The *default* is :data:`None` when not specified.

.. method:: infoTuple()

   Returns a tuple with information used in the :func:`repr`
   output for the node. Subclasses can add additional information
   to the result.


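The mapping protocol and identifier-based comparison documented for
:class:`Node` can be pictured with a small stand-in class. ``SketchNode`` is
hypothetical and is not modulegraph's implementation; in particular the private
``_data`` dictionary and the ``__hash__`` method are assumptions made so that
the example is self-contained.

```python
class SketchNode:
    """Hypothetical stand-in that mimics the documented Node mapping API."""

    def __init__(self, identifier):
        self.graphident = identifier
        self.identifier = identifier
        self._data = {}  # arbitrary per-node data (assumed storage)

    def __contains__(self, name):
        return name in self._data

    def __getitem__(self, name):
        return self._data[name]  # raises KeyError when *name* is missing

    def __setitem__(self, name, value):
        self._data[name] = value

    def get(self, name, default=None):
        return self._data.get(name, default)

    def __eq__(self, other):
        # Nodes compare by identifier, not by the data stored in them.
        if not isinstance(other, SketchNode):
            return NotImplemented
        return self.identifier == other.identifier

    def __hash__(self):
        return hash(self.identifier)


node = SketchNode("os.path")
node["size"] = 3
```

Two ``SketchNode`` objects with the same identifier compare equal even when
they store different data, which matches the comparison rule stated for
:class:`Node`.
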
.. class:: AliasNode(name, node)

   A node that represents an alias from a name to another node.

   The value of attribute *graphident* for this node will be the
   value of *name*, the other :class:`Node` attributes are
   references to those attributes in *node*.

.. class:: BadModule(identifier)

   Base class for nodes that should be ignored for some reason.

.. class:: ExcludedModule(identifier)

   A module that is explicitly excluded.

.. class:: MissingModule(identifier)

   A module that is imported but cannot be located.



.. class:: Script(filename)

   A Python script.

   .. data:: filename

      The filename for the script

.. class:: BaseModule(name[, filename[, path]])

   The base class for actual modules. The *name* is
   the possibly dotted module name, *filename* is the
   filesystem path to the module and *path* is the
   value of ``__path__`` for the module.

.. data:: graphident

   The name of the module

.. data:: filename

   The filesystem path to the module.

.. data:: path

   The value of ``__path__`` for this module.

.. class:: BuiltinModule(name)

   A built-in module (one in :data:`sys.builtin_module_names`).

.. class:: SourceModule(name)

   A module for which the Python source code is available.

.. class:: InvalidSourceModule(name)

   A module for which the Python source code is available, but where
   that source code cannot be compiled (due to syntax errors).

   This is a subclass of :class:`SourceModule`.

   .. versionadded:: 0.12

.. class:: CompiledModule(name)

   A module for which only byte-code is available.

.. class:: Package(name)

   Represents a Python package.

.. class:: NamespacePackage(name)

   Represents a Python namespace package.

   This is a subclass of :class:`Package`.

.. class:: Extension(name)

   A native extension.


.. warning:: A number of other node types are defined in the module. Those node types aren't
   used by modulegraph and will be removed in a future version.


Edge data
---------

The edges in a module graph by default contain information about the edge, represented
by an instance of :class:`DependencyInfo`.

.. class:: DependencyInfo(conditional, function, tryexcept, fromlist)

   This class is a :func:`namedtuple <collections.namedtuple>` for representing
   the information on a dependency between two modules.

   All attributes can be used to deduce if a dependency is essential or not, and
   are particularly useful when reporting on missing modules (dependencies on
   :class:`MissingModule`).

   .. data:: fromlist

      A boolean that is true iff the target of the edge is named in the "import"
      list of a "from" import ("from package import module").

      When the target module is imported multiple times this attribute is false
      unless all imports are in the "import" list of a "from" import.

   .. data:: function

      A boolean that is true iff the import is done inside a function definition,
      and is false for imports in module scope (or class scope for classes that
      aren't defined in a function).

   .. data:: tryexcept

      A boolean that is true iff the import is done in the "try" or "except"
      block of a try statement (but not in the "else" block).

   .. data:: conditional

      A boolean that is true iff the import is done in either block of an "if"
      statement.

   When the target of the edge is imported multiple times the :data:`function`,
   :data:`tryexcept` and :data:`conditional` attributes of all imports are
   merged: when there is an import where all these attributes are false the
   attributes are false, otherwise each attribute is set to true if it is
   true for at least one of the imports.

   For example, when a module is imported both in a try-except statement and
   furthermore is imported in a function (in two separate statements),
   both :data:`tryexcept` and :data:`function` will be true.  But if there
   is a third unconditional toplevel import for that module as well, all
   three attributes are false.

   .. warning::

      All attributes but :data:`fromlist` will be false when the source of
      a dependency is scanned from a byte-compiled module instead of a Python
      source file. The :data:`fromlist` attribute will still be set correctly.

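The merge rule for repeated imports can be written down in a few lines. The
sketch below follows the rule as stated in the :class:`DependencyInfo`
description rather than modulegraph's source: ``EdgeInfo`` is a hypothetical
stand-in namedtuple with the same field names, and ``merge_edge_info`` is a
made-up helper.

```python
from collections import namedtuple

# Stand-in with the documented field names; not the library's own class.
EdgeInfo = namedtuple("EdgeInfo", "conditional function tryexcept fromlist")


def merge_edge_info(infos):
    """Merge the flags of several imports of the same target module."""
    infos = list(infos)
    # fromlist stays true only if every import names the target in a from-list.
    fromlist = all(info.fromlist for info in infos)
    # One plain toplevel import makes the dependency unconditional.
    if any(not (i.conditional or i.function or i.tryexcept) for i in infos):
        return EdgeInfo(False, False, False, fromlist)
    return EdgeInfo(
        conditional=any(i.conditional for i in infos),
        function=any(i.function for i in infos),
        tryexcept=any(i.tryexcept for i in infos),
        fromlist=fromlist,
    )


in_try = EdgeInfo(conditional=False, function=False, tryexcept=True, fromlist=False)
in_function = EdgeInfo(conditional=False, function=True, tryexcept=False, fromlist=False)
toplevel = EdgeInfo(conditional=False, function=False, tryexcept=False, fromlist=False)

two_imports = merge_edge_info([in_try, in_function])
three_imports = merge_edge_info([in_try, in_function, toplevel])
```

``two_imports`` has both ``tryexcept`` and ``function`` set, while adding the
unconditional toplevel import in ``three_imports`` clears all three flags.
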
Utility functions
-----------------

.. function:: find_module(name[, path])

   A version of :func:`imp.find_module` that works with zipped packages (and other
   :pep:`302` importers).

.. function:: moduleInfoForPath(path)

   Return the module name, readmode and type for the file at *path*, or
   :data:`None` if it doesn't seem to be a valid module (based on its name).

.. function:: addPackagePath(packagename, path)

   Add *path* to the value of ``__path__`` for the package named *packagename*.

.. function:: replacePackage(oldname, newname)

   Rename *oldname* to *newname* when it is found by the module finder. This
   is used as a workaround for the hack that the ``_xmlplus`` package uses
   to inject itself in the ``xml`` namespace.