:mod:`xml.etree.ElementTree` --- The ElementTree XML API
========================================================
3
4.. module:: xml.etree.ElementTree
5   :synopsis: Implementation of the ElementTree API.
6
7.. moduleauthor:: Fredrik Lundh <fredrik@pythonware.com>
8
9**Source code:** :source:`Lib/xml/etree/ElementTree.py`
10
11--------------
12
13The :mod:`xml.etree.ElementTree` module implements a simple and efficient API
14for parsing and creating XML data.
15
16.. versionchanged:: 3.3
17   This module will use a fast implementation whenever available.
18   The :mod:`xml.etree.cElementTree` module is deprecated.
19
20
21.. warning::
22
23   The :mod:`xml.etree.ElementTree` module is not secure against
24   maliciously constructed data.  If you need to parse untrusted or
25   unauthenticated data see :ref:`xml-vulnerabilities`.
26
27Tutorial
28--------
29
30This is a short tutorial for using :mod:`xml.etree.ElementTree` (``ET`` in
31short).  The goal is to demonstrate some of the building blocks and basic
32concepts of the module.
33
34XML tree and elements
35^^^^^^^^^^^^^^^^^^^^^
36
37XML is an inherently hierarchical data format, and the most natural way to
38represent it is with a tree.  ``ET`` has two classes for this purpose -
39:class:`ElementTree` represents the whole XML document as a tree, and
40:class:`Element` represents a single node in this tree.  Interactions with
41the whole document (reading and writing to/from files) are usually done
42on the :class:`ElementTree` level.  Interactions with a single XML element
43and its sub-elements are done on the :class:`Element` level.
44
45.. _elementtree-parsing-xml:
46
47Parsing XML
48^^^^^^^^^^^
49
50We'll be using the following XML document as the sample data for this section:
51
52.. code-block:: xml
53
54   <?xml version="1.0"?>
55   <data>
56       <country name="Liechtenstein">
57           <rank>1</rank>
58           <year>2008</year>
59           <gdppc>141100</gdppc>
60           <neighbor name="Austria" direction="E"/>
61           <neighbor name="Switzerland" direction="W"/>
62       </country>
63       <country name="Singapore">
64           <rank>4</rank>
65           <year>2011</year>
66           <gdppc>59900</gdppc>
67           <neighbor name="Malaysia" direction="N"/>
68       </country>
69       <country name="Panama">
70           <rank>68</rank>
71           <year>2011</year>
72           <gdppc>13600</gdppc>
73           <neighbor name="Costa Rica" direction="W"/>
74           <neighbor name="Colombia" direction="E"/>
75       </country>
76   </data>
77
78We can import this data by reading from a file::
79
80   import xml.etree.ElementTree as ET
81   tree = ET.parse('country_data.xml')
82   root = tree.getroot()
83
84Or directly from a string::
85
86   root = ET.fromstring(country_data_as_string)
87
88:func:`fromstring` parses XML from a string directly into an :class:`Element`,
89which is the root element of the parsed tree.  Other parsing functions may
90create an :class:`ElementTree`.  Check the documentation to be sure.
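
As a minimal sketch of that difference, the snippet below parses the same
document once with :func:`parse` and once with :func:`fromstring`.  The sample
string, the in-memory file object standing in for ``country_data.xml``, and the
variable names are illustrative assumptions.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample; any well-formed XML string works here.
xml_text = "<data><country name='Panama'><rank>68</rank></country></data>"

# parse() accepts a filename or file object and returns an ElementTree.
tree = ET.parse(io.StringIO(xml_text))
root_from_file = tree.getroot()

# fromstring() returns the root Element directly, with no ElementTree wrapper.
root_from_string = ET.fromstring(xml_text)

print(type(tree).__name__, type(root_from_string).__name__)
print(root_from_file.tag == root_from_string.tag)
```

If an :class:`ElementTree` is needed later (for example to call
:meth:`ElementTree.write`), an :class:`Element` can be wrapped with
``ET.ElementTree(root_from_string)``.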
91
92As an :class:`Element`, ``root`` has a tag and a dictionary of attributes::
93
94   >>> root.tag
95   'data'
96   >>> root.attrib
97   {}
98
It also has child nodes over which we can iterate::
100
101   >>> for child in root:
102   ...     print(child.tag, child.attrib)
103   ...
104   country {'name': 'Liechtenstein'}
105   country {'name': 'Singapore'}
106   country {'name': 'Panama'}
107
108Children are nested, and we can access specific child nodes by index::
109
110   >>> root[0][1].text
111   '2008'
112
113
114.. note::
115
116   Not all elements of the XML input will end up as elements of the
117   parsed tree. Currently, this module skips over any XML comments,
118   processing instructions, and document type declarations in the
119   input. Nevertheless, trees built using this module's API rather
120   than parsing from XML text can have comments and processing
121   instructions in them; they will be included when generating XML
122   output. A document type declaration may be accessed by passing a
123   custom :class:`TreeBuilder` instance to the :class:`XMLParser`
124   constructor.
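
The following sketch illustrates the first half of the note: a comment in
parsed input is dropped, while a comment inserted through the API is kept when
serializing.  The sample strings and variable names are assumptions made for
illustration.

```python
import xml.etree.ElementTree as ET

# The comment in the input does not become a node of the parsed tree.
parsed = ET.fromstring("<root><!-- dropped on parse --><item/></root>")
parsed_tags = [child.tag for child in parsed]

# A comment added through the API is part of the tree and is serialized.
built = ET.Element("root")
built.append(ET.Comment(" kept on output "))
ET.SubElement(built, "item")
serialized = ET.tostring(built, encoding="unicode")

print(parsed_tags)
print(serialized)
```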
125
126
127.. _elementtree-pull-parsing:
128
129Pull API for non-blocking parsing
130^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
131
132Most parsing functions provided by this module require the whole document
133to be read at once before returning any result.  It is possible to use an
134:class:`XMLParser` and feed data into it incrementally, but it is a push API that
135calls methods on a callback target, which is too low-level and inconvenient for
136most needs.  Sometimes what the user really wants is to be able to parse XML
137incrementally, without blocking operations, while enjoying the convenience of
138fully constructed :class:`Element` objects.
139
140The most powerful tool for doing this is :class:`XMLPullParser`.  It does not
141require a blocking read to obtain the XML data, and is instead fed with data
142incrementally with :meth:`XMLPullParser.feed` calls.  To get the parsed XML
143elements, call :meth:`XMLPullParser.read_events`.  Here is an example::
144
   >>> parser = ET.XMLPullParser(['start', 'end'])
   >>> parser.feed('<mytag>sometext')
   >>> list(parser.read_events())
   [('start', <Element 'mytag' at 0x7fa66db2be58>)]
   >>> parser.feed(' more text</mytag>')
   >>> for event, elem in parser.read_events():
   ...     print(event)
   ...     print(elem.tag, 'text=', elem.text)
   ...
   end
   mytag text= sometext more text
155
156The obvious use case is applications that operate in a non-blocking fashion
157where the XML data is being received from a socket or read incrementally from
158some storage device.  In such cases, blocking reads are unacceptable.
159
160Because it's so flexible, :class:`XMLPullParser` can be inconvenient to use for
161simpler use-cases.  If you don't mind your application blocking on reading XML
162data but would still like to have incremental parsing capabilities, take a look
163at :func:`iterparse`.  It can be useful when you're reading a large XML document
164and don't want to hold it wholly in memory.
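
A minimal sketch of that pattern follows.  It assumes a small in-memory
document in place of a large file, and it clears each ``country`` element once
it has been processed so that its children do not accumulate in memory.  The
data and the ``ranks`` dictionary are illustrative.

```python
import io
import xml.etree.ElementTree as ET

# Stand-in for a large file; iterparse() also accepts a filename.
source = io.BytesIO(
    b"<data>"
    b"<country name='Liechtenstein'><rank>1</rank></country>"
    b"<country name='Singapore'><rank>4</rank></country>"
    b"</data>"
)

ranks = {}
for event, elem in ET.iterparse(source, events=("end",)):
    # On an "end" event the element and its children are fully populated.
    if elem.tag == "country":
        ranks[elem.get("name")] = int(elem.find("rank").text)
        elem.clear()  # drop children, text and attributes already handled

print(ranks)
```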
165
166Finding interesting elements
167^^^^^^^^^^^^^^^^^^^^^^^^^^^^
168
169:class:`Element` has some useful methods that help iterate recursively over all
170the sub-tree below it (its children, their children, and so on).  For example,
171:meth:`Element.iter`::
172
173   >>> for neighbor in root.iter('neighbor'):
174   ...     print(neighbor.attrib)
175   ...
176   {'name': 'Austria', 'direction': 'E'}
177   {'name': 'Switzerland', 'direction': 'W'}
178   {'name': 'Malaysia', 'direction': 'N'}
179   {'name': 'Costa Rica', 'direction': 'W'}
180   {'name': 'Colombia', 'direction': 'E'}
181
182:meth:`Element.findall` finds only elements with a tag which are direct
183children of the current element.  :meth:`Element.find` finds the *first* child
184with a particular tag, and :attr:`Element.text` accesses the element's text
185content.  :meth:`Element.get` accesses the element's attributes::
186
187   >>> for country in root.findall('country'):
188   ...     rank = country.find('rank').text
189   ...     name = country.get('name')
190   ...     print(name, rank)
191   ...
192   Liechtenstein 1
193   Singapore 4
194   Panama 68
195
196More sophisticated specification of which elements to look for is possible by
197using :ref:`XPath <elementtree-xpath>`.
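
The difference between these methods is easy to miss, so here is a small
self-contained sketch.  The sample document is a shortened, assumed variant of
the country data above.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<data>"
    "<country name='Singapore'><rank>4</rank>"
    "<neighbor name='Malaysia'/></country>"
    "</data>"
)

# findall() only looks at direct children, so no 'neighbor' is found here.
direct = root.findall("neighbor")

# iter() walks the whole subtree below the element.
nested = [n.get("name") for n in root.iter("neighbor")]

# find() returns None when nothing matches, so check before using .text.
missing = root.find("capital")
rank = root.find("country/rank")

print(direct, nested, missing, rank.text)
```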
198
199Modifying an XML File
200^^^^^^^^^^^^^^^^^^^^^
201
202:class:`ElementTree` provides a simple way to build XML documents and write them to files.
203The :meth:`ElementTree.write` method serves this purpose.
204
205Once created, an :class:`Element` object may be manipulated by directly changing
206its fields (such as :attr:`Element.text`), adding and modifying attributes
207(:meth:`Element.set` method), as well as adding new children (for example
208with :meth:`Element.append`).
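
As a quick sketch of these three kinds of change on a single element (the
``continent`` attribute and the sample values are assumptions for
illustration):

```python
import xml.etree.ElementTree as ET

country = ET.fromstring("<country name='Panama'><rank>68</rank></country>")

# Change a field directly.
country.find("rank").text = "69"

# Add (or overwrite) an attribute.
country.set("continent", "America")

# Build a new child and attach it.
year = ET.Element("year")
year.text = "2011"
country.append(year)

result = ET.tostring(country, encoding="unicode")
print(result)
```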
209
210Let's say we want to add one to each country's rank, and add an ``updated``
211attribute to the rank element::
212
213   >>> for rank in root.iter('rank'):
214   ...     new_rank = int(rank.text) + 1
215   ...     rank.text = str(new_rank)
216   ...     rank.set('updated', 'yes')
217   ...
218   >>> tree.write('output.xml')
219
220Our XML now looks like this:
221
222.. code-block:: xml
223
224   <?xml version="1.0"?>
225   <data>
226       <country name="Liechtenstein">
227           <rank updated="yes">2</rank>
228           <year>2008</year>
229           <gdppc>141100</gdppc>
230           <neighbor name="Austria" direction="E"/>
231           <neighbor name="Switzerland" direction="W"/>
232       </country>
233       <country name="Singapore">
234           <rank updated="yes">5</rank>
235           <year>2011</year>
236           <gdppc>59900</gdppc>
237           <neighbor name="Malaysia" direction="N"/>
238       </country>
239       <country name="Panama">
240           <rank updated="yes">69</rank>
241           <year>2011</year>
242           <gdppc>13600</gdppc>
243           <neighbor name="Costa Rica" direction="W"/>
244           <neighbor name="Colombia" direction="E"/>
245       </country>
246   </data>
247
248We can remove elements using :meth:`Element.remove`.  Let's say we want to
249remove all countries with a rank higher than 50::
250
251   >>> for country in root.findall('country'):
252   ...     rank = int(country.find('rank').text)
253   ...     if rank > 50:
254   ...         root.remove(country)
255   ...
256   >>> tree.write('output.xml')
257
258Our XML now looks like this:
259
260.. code-block:: xml
261
262   <?xml version="1.0"?>
263   <data>
264       <country name="Liechtenstein">
265           <rank updated="yes">2</rank>
266           <year>2008</year>
267           <gdppc>141100</gdppc>
268           <neighbor name="Austria" direction="E"/>
269           <neighbor name="Switzerland" direction="W"/>
270       </country>
271       <country name="Singapore">
272           <rank updated="yes">5</rank>
273           <year>2011</year>
274           <gdppc>59900</gdppc>
275           <neighbor name="Malaysia" direction="N"/>
276       </country>
277   </data>
278
279Building XML documents
280^^^^^^^^^^^^^^^^^^^^^^
281
282The :func:`SubElement` function also provides a convenient way to create new
283sub-elements for a given element::
284
285   >>> a = ET.Element('a')
286   >>> b = ET.SubElement(a, 'b')
287   >>> c = ET.SubElement(a, 'c')
288   >>> d = ET.SubElement(c, 'd')
289   >>> ET.dump(a)
290   <a><b /><c><d /></c></a>
291
292Parsing XML with Namespaces
293^^^^^^^^^^^^^^^^^^^^^^^^^^^
294
295If the XML input has `namespaces
296<https://en.wikipedia.org/wiki/XML_namespace>`__, tags and attributes
297with prefixes in the form ``prefix:sometag`` get expanded to
298``{uri}sometag`` where the *prefix* is replaced by the full *URI*.
299Also, if there is a `default namespace
300<https://www.w3.org/TR/xml-names/#defaulting>`__,
301that full URI gets prepended to all of the non-prefixed tags.
302
303Here is an XML example that incorporates two namespaces, one with the
304prefix "fictional" and the other serving as the default namespace:
305
306.. code-block:: xml
307
308    <?xml version="1.0"?>
309    <actors xmlns:fictional="http://characters.example.com"
310            xmlns="http://people.example.com">
311        <actor>
312            <name>John Cleese</name>
313            <fictional:character>Lancelot</fictional:character>
314            <fictional:character>Archie Leach</fictional:character>
315        </actor>
316        <actor>
317            <name>Eric Idle</name>
318            <fictional:character>Sir Robin</fictional:character>
319            <fictional:character>Gunther</fictional:character>
320            <fictional:character>Commander Clement</fictional:character>
321        </actor>
322    </actors>
323
One way to search and explore this XML example is to manually add the
URI to every tag or attribute in the XPath of a
:meth:`~Element.find` or :meth:`~Element.findall`::

    import xml.etree.ElementTree as ET

    # xml_text is assumed to hold the XML document shown above.
    root = ET.fromstring(xml_text)
    for actor in root.findall('{http://people.example.com}actor'):
        name = actor.find('{http://people.example.com}name')
        print(name.text)
        for char in actor.findall('{http://characters.example.com}character'):
            print(' |-->', char.text)
334
335A better way to search the namespaced XML example is to create a
336dictionary with your own prefixes and use those in the search functions::
337
338    ns = {'real_person': 'http://people.example.com',
339          'role': 'http://characters.example.com'}
340
341    for actor in root.findall('real_person:actor', ns):
342        name = actor.find('real_person:name', ns)
343        print(name.text)
344        for char in actor.findall('role:character', ns):
345            print(' |-->', char.text)
346
347These two approaches both output::
348
349    John Cleese
350     |--> Lancelot
351     |--> Archie Leach
352    Eric Idle
353     |--> Sir Robin
354     |--> Gunther
355     |--> Commander Clement
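
To see the expansion itself, the sketch below parses a shortened, assumed
version of the document above and inspects the resulting tags.  It also shows
that a search without a namespace finds nothing.

```python
import xml.etree.ElementTree as ET

xml_text = (
    '<actors xmlns:fictional="http://characters.example.com" '
    'xmlns="http://people.example.com">'
    "<actor><name>John Cleese</name>"
    "<fictional:character>Lancelot</fictional:character></actor>"
    "</actors>"
)
root = ET.fromstring(xml_text)

# Prefixed and default-namespace tags are both stored as '{uri}tag'.
tags = [child.tag for child in root[0]]

# An unqualified search finds nothing because every tag carries its URI.
unqualified = root.findall("actor")

print(root.tag)
print(tags)
print(unqualified)
```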
356
357
358Additional resources
359^^^^^^^^^^^^^^^^^^^^
360
361See http://effbot.org/zone/element-index.htm for tutorials and links to other
362docs.
363
364
365.. _elementtree-xpath:
366
367XPath support
368-------------
369
370This module provides limited support for
371`XPath expressions <https://www.w3.org/TR/xpath>`_ for locating elements in a
372tree.  The goal is to support a small subset of the abbreviated syntax; a full
373XPath engine is outside the scope of the module.
374
375Example
376^^^^^^^
377
378Here's an example that demonstrates some of the XPath capabilities of the
379module.  We'll be using the ``countrydata`` XML document from the
380:ref:`Parsing XML <elementtree-parsing-xml>` section::
381
382   import xml.etree.ElementTree as ET
383
384   root = ET.fromstring(countrydata)
385
386   # Top-level elements
387   root.findall(".")
388
389   # All 'neighbor' grand-children of 'country' children of the top-level
390   # elements
391   root.findall("./country/neighbor")
392
393   # Nodes with name='Singapore' that have a 'year' child
394   root.findall(".//year/..[@name='Singapore']")
395
396   # 'year' nodes that are children of nodes with name='Singapore'
397   root.findall(".//*[@name='Singapore']/year")
398
399   # All 'neighbor' nodes that are the second child of their parent
400   root.findall(".//neighbor[2]")
401
402For XML with namespaces, use the usual qualified ``{namespace}tag`` notation::
403
404   # All dublin-core "title" tags in the document
405   root.findall(".//{http://purl.org/dc/elements/1.1/}title")
406
407
408Supported XPath syntax
409^^^^^^^^^^^^^^^^^^^^^^
410
411.. tabularcolumns:: |l|L|
412
413+-----------------------+------------------------------------------------------+
414| Syntax                | Meaning                                              |
415+=======================+======================================================+
416| ``tag``               | Selects all child elements with the given tag.       |
417|                       | For example, ``spam`` selects all child elements     |
418|                       | named ``spam``, and ``spam/egg`` selects all         |
419|                       | grandchildren named ``egg`` in all children named    |
420|                       | ``spam``.  ``{namespace}*`` selects all tags in the  |
421|                       | given namespace, ``{*}spam`` selects tags named      |
422|                       | ``spam`` in any (or no) namespace, and ``{}*``       |
423|                       | only selects tags that are not in a namespace.       |
424|                       |                                                      |
425|                       | .. versionchanged:: 3.8                              |
426|                       |    Support for star-wildcards was added.             |
427+-----------------------+------------------------------------------------------+
428| ``*``                 | Selects all child elements, including comments and   |
429|                       | processing instructions.  For example, ``*/egg``     |
430|                       | selects all grandchildren named ``egg``.             |
431+-----------------------+------------------------------------------------------+
432| ``.``                 | Selects the current node.  This is mostly useful     |
433|                       | at the beginning of the path, to indicate that it's  |
434|                       | a relative path.                                     |
435+-----------------------+------------------------------------------------------+
436| ``//``                | Selects all subelements, on all levels beneath the   |
437|                       | current  element.  For example, ``.//egg`` selects   |
438|                       | all ``egg`` elements in the entire tree.             |
439+-----------------------+------------------------------------------------------+
440| ``..``                | Selects the parent element.  Returns ``None`` if the |
441|                       | path attempts to reach the ancestors of the start    |
442|                       | element (the element ``find`` was called on).        |
443+-----------------------+------------------------------------------------------+
444| ``[@attrib]``         | Selects all elements that have the given attribute.  |
445+-----------------------+------------------------------------------------------+
446| ``[@attrib='value']`` | Selects all elements for which the given attribute   |
447|                       | has the given value.  The value cannot contain       |
448|                       | quotes.                                              |
449+-----------------------+------------------------------------------------------+
450| ``[tag]``             | Selects all elements that have a child named         |
451|                       | ``tag``.  Only immediate children are supported.     |
452+-----------------------+------------------------------------------------------+
453| ``[.='text']``        | Selects all elements whose complete text content,    |
454|                       | including descendants, equals the given ``text``.    |
455|                       |                                                      |
456|                       | .. versionadded:: 3.7                                |
457+-----------------------+------------------------------------------------------+
458| ``[tag='text']``      | Selects all elements that have a child named         |
459|                       | ``tag`` whose complete text content, including       |
460|                       | descendants, equals the given ``text``.              |
461+-----------------------+------------------------------------------------------+
462| ``[position]``        | Selects all elements that are located at the given   |
463|                       | position.  The position can be either an integer     |
464|                       | (1 is the first position), the expression ``last()`` |
465|                       | (for the last position), or a position relative to   |
466|                       | the last position (e.g. ``last()-1``).               |
467+-----------------------+------------------------------------------------------+
468
469Predicates (expressions within square brackets) must be preceded by a tag
470name, an asterisk, or another predicate.  ``position`` predicates must be
471preceded by a tag name.
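
The sketch below exercises a few of the predicates from the table on a
shortened, assumed version of the country data.  Each predicate follows a tag
name, as required.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<data>"
    "<country name='Liechtenstein'><rank>1</rank>"
    "<neighbor name='Austria'/><neighbor name='Switzerland'/></country>"
    "<country name='Singapore'><rank>4</rank>"
    "<neighbor name='Malaysia'/></country>"
    "</data>"
)

# [@attrib='value']: match on an attribute value.
by_attr = [c.get("name") for c in root.findall("country[@name='Singapore']")]

# [tag='text']: match on the text of a child element.
by_child_text = [c.get("name") for c in root.findall("country[rank='1']")]

# [position]: second 'neighbor' child of each parent (1-based).
second_neighbors = [n.get("name") for n in root.findall(".//neighbor[2]")]

# [last()]: the last 'country' child of the root.
last_country = root.find("country[last()]").get("name")

print(by_attr, by_child_text, second_neighbors, last_country)
```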
472
473Reference
474---------
475
476.. _elementtree-functions:
477
478Functions
479^^^^^^^^^
480
481.. function:: canonicalize(xml_data=None, *, out=None, from_file=None, **options)
482
483   `C14N 2.0 <https://www.w3.org/TR/xml-c14n2/>`_ transformation function.
484
   Canonicalization is a way to normalise XML output in a way that allows
   byte-by-byte comparisons and digital signatures.  It reduces the freedom
   that XML serializers have and instead generates a more constrained XML
   representation.  The main restrictions regard the placement of namespace
   declarations, the ordering of attributes, and ignorable whitespace.
490
491   This function takes an XML data string (*xml_data*) or a file path or
492   file-like object (*from_file*) as input, converts it to the canonical
493   form, and writes it out using the *out* file(-like) object, if provided,
494   or returns it as a text string if not.  The output file receives text,
495   not bytes.  It should therefore be opened in text mode with ``utf-8``
496   encoding.
497
498   Typical uses::
499
500      xml_data = "<root>...</root>"
501      print(canonicalize(xml_data))
502
503      with open("c14n_output.xml", mode='w', encoding='utf-8') as out_file:
504          canonicalize(xml_data, out=out_file)
505
506      with open("c14n_output.xml", mode='w', encoding='utf-8') as out_file:
507          canonicalize(from_file="inputfile.xml", out=out_file)
508
509   The configuration *options* are as follows:
510
511   - *with_comments*: set to true to include comments (default: false)
512   - *strip_text*: set to true to strip whitespace before and after text content
513                   (default: false)
514   - *rewrite_prefixes*: set to true to replace namespace prefixes by "n{number}"
515                         (default: false)
516   - *qname_aware_tags*: a set of qname aware tag names in which prefixes
517                         should be replaced in text content (default: empty)
518   - *qname_aware_attrs*: a set of qname aware attribute names in which prefixes
519                          should be replaced in text content (default: empty)
520   - *exclude_attrs*: a set of attribute names that should not be serialised
521   - *exclude_tags*: a set of tag names that should not be serialised
522
   In the option list above, "a set" refers to any collection or iterable of
   strings; no ordering is expected.
525
526   .. versionadded:: 3.8
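
As a small sketch of why canonicalization helps with comparisons, the two
inputs below differ only in attribute order, quoting, and the way the empty
element is written, yet they produce the same canonical text.  The sample
strings are assumptions for illustration.

```python
from xml.etree.ElementTree import canonicalize

# Same content, different serialization details.
first = canonicalize('<root b="2" a="1"><item/></root>')
second = canonicalize("<root a='1'   b='2'><item></item></root>")
same = first == second

# strip_text=True removes whitespace around text content.
stripped = canonicalize("<root>  <item> text </item>  </root>", strip_text=True)

print(same)
print(stripped)
```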
527
528
529.. function:: Comment(text=None)
530
531   Comment element factory.  This factory function creates a special element
532   that will be serialized as an XML comment by the standard serializer.  The
533   comment string can be either a bytestring or a Unicode string.  *text* is a
534   string containing the comment string.  Returns an element instance
535   representing a comment.
536
   Note that :class:`XMLParser` skips over comments in the input
   instead of creating comment objects for them. An :class:`ElementTree` will
   only contain comment nodes if they have been inserted into
   the tree using one of the :class:`Element` methods.
541
542.. function:: dump(elem)
543
   Writes an element tree or element structure to ``sys.stdout``.  This function
   should be used for debugging only.
546
547   The exact output format is implementation dependent.  In this version, it's
548   written as an ordinary XML file.
549
550   *elem* is an element tree or an individual element.
551
552   .. versionchanged:: 3.8
553      The :func:`dump` function now preserves the attribute order specified
554      by the user.
555
556
557.. function:: fromstring(text, parser=None)
558
559   Parses an XML section from a string constant.  Same as :func:`XML`.  *text*
560   is a string containing XML data.  *parser* is an optional parser instance.
561   If not given, the standard :class:`XMLParser` parser is used.
562   Returns an :class:`Element` instance.
563
564
565.. function:: fromstringlist(sequence, parser=None)
566
567   Parses an XML document from a sequence of string fragments.  *sequence* is a
568   list or other sequence containing XML data fragments.  *parser* is an
569   optional parser instance.  If not given, the standard :class:`XMLParser`
570   parser is used.  Returns an :class:`Element` instance.
571
572   .. versionadded:: 3.2
573
574
575.. function:: iselement(element)
576
577   Check if an object appears to be a valid element object.  *element* is an
578   element instance.  Return ``True`` if this is an element object.
579
580
581.. function:: iterparse(source, events=None, parser=None)
582
583   Parses an XML section into an element tree incrementally, and reports what's
584   going on to the user.  *source* is a filename or :term:`file object`
585   containing XML data.  *events* is a sequence of events to report back.  The
586   supported events are the strings ``"start"``, ``"end"``, ``"comment"``,
587   ``"pi"``, ``"start-ns"`` and ``"end-ns"``
588   (the "ns" events are used to get detailed namespace
589   information).  If *events* is omitted, only ``"end"`` events are reported.
590   *parser* is an optional parser instance.  If not given, the standard
591   :class:`XMLParser` parser is used.  *parser* must be a subclass of
592   :class:`XMLParser` and can only use the default :class:`TreeBuilder` as a
593   target.  Returns an :term:`iterator` providing ``(event, elem)`` pairs.
594
595   Note that while :func:`iterparse` builds the tree incrementally, it issues
596   blocking reads on *source* (or the file it names).  As such, it's unsuitable
597   for applications where blocking reads can't be made.  For fully non-blocking
598   parsing, see :class:`XMLPullParser`.
599
600   .. note::
601
602      :func:`iterparse` only guarantees that it has seen the ">" character of a
603      starting tag when it emits a "start" event, so the attributes are defined,
604      but the contents of the text and tail attributes are undefined at that
605      point.  The same applies to the element children; they may or may not be
606      present.
607
608      If you need a fully populated element, look for "end" events instead.
609
610   .. deprecated:: 3.4
611      The *parser* argument.
612
613   .. versionchanged:: 3.8
614      The ``comment`` and ``pi`` events were added.
615
616
617.. function:: parse(source, parser=None)
618
619   Parses an XML section into an element tree.  *source* is a filename or file
620   object containing XML data.  *parser* is an optional parser instance.  If
621   not given, the standard :class:`XMLParser` parser is used.  Returns an
622   :class:`ElementTree` instance.
623
624
625.. function:: ProcessingInstruction(target, text=None)
626
627   PI element factory.  This factory function creates a special element that
628   will be serialized as an XML processing instruction.  *target* is a string
629   containing the PI target.  *text* is a string containing the PI contents, if
630   given.  Returns an element instance, representing a processing instruction.
631
   Note that :class:`XMLParser` skips over processing instructions
   in the input instead of creating PI objects for them. An
   :class:`ElementTree` will only contain processing instruction nodes if
   they have been inserted into the tree using one of the
   :class:`Element` methods.
637
638.. function:: register_namespace(prefix, uri)
639
640   Registers a namespace prefix.  The registry is global, and any existing
641   mapping for either the given prefix or the namespace URI will be removed.
642   *prefix* is a namespace prefix.  *uri* is a namespace uri.  Tags and
643   attributes in this namespace will be serialized with the given prefix, if at
644   all possible.
645
646   .. versionadded:: 3.2
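
A minimal sketch of the effect on serialization follows.  The prefix ``ex`` and
the URI ``http://example.com/ns`` are hypothetical, and because the registry is
global the registration affects all later serialization in the process.

```python
import xml.etree.ElementTree as ET

# Hypothetical prefix and URI, chosen only for this example.
ET.register_namespace("ex", "http://example.com/ns")

elem = ET.Element("{http://example.com/ns}title")
elem.text = "Example"

# The registered prefix is used instead of an auto-generated one.
output = ET.tostring(elem, encoding="unicode")
print(output)
```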
647
648
649.. function:: SubElement(parent, tag, attrib={}, **extra)
650
651   Subelement factory.  This function creates an element instance, and appends
652   it to an existing element.
653
654   The element name, attribute names, and attribute values can be either
655   bytestrings or Unicode strings.  *parent* is the parent element.  *tag* is
656   the subelement name.  *attrib* is an optional dictionary, containing element
657   attributes.  *extra* contains additional attributes, given as keyword
658   arguments.  Returns an element instance.
659
660
661.. function:: tostring(element, encoding="us-ascii", method="xml", *, \
662                       xml_declaration=None, default_namespace=None, \
663                       short_empty_elements=True)
664
665   Generates a string representation of an XML element, including all
666   subelements.  *element* is an :class:`Element` instance.  *encoding* [1]_ is
667   the output encoding (default is US-ASCII).  Use ``encoding="unicode"`` to
668   generate a Unicode string (otherwise, a bytestring is generated).  *method*
669   is either ``"xml"``, ``"html"`` or ``"text"`` (default is ``"xml"``).
   *xml_declaration*, *default_namespace* and *short_empty_elements* have the same
   meaning as in :meth:`ElementTree.write`. Returns an (optionally) encoded string
   containing the XML data.
673
674   .. versionadded:: 3.4
675      The *short_empty_elements* parameter.
676
677   .. versionadded:: 3.8
678      The *xml_declaration* and *default_namespace* parameters.
679
680   .. versionchanged:: 3.8
681      The :func:`tostring` function now preserves the attribute order
682      specified by the user.
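
The sketch below contrasts the default bytes output with
``encoding="unicode"``, using an assumed element that contains a non-ASCII
character.  It also checks the relationship to :func:`tostringlist`.

```python
import xml.etree.ElementTree as ET

elem = ET.Element("greeting")
elem.text = "h\u00e9llo"

# Default encoding is US-ASCII: the result is bytes, and the non-ASCII
# character is written as a character reference.
as_bytes = ET.tostring(elem)

# encoding="unicode" returns a str and keeps the character as-is.
as_text = ET.tostring(elem, encoding="unicode")

# tostringlist() returns the same data split into fragments.
chunks = ET.tostringlist(elem)

print(as_bytes)
print(as_text)
```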
683
684
685.. function:: tostringlist(element, encoding="us-ascii", method="xml", *, \
686                           xml_declaration=None, default_namespace=None, \
687                           short_empty_elements=True)
688
689   Generates a string representation of an XML element, including all
690   subelements.  *element* is an :class:`Element` instance.  *encoding* [1]_ is
691   the output encoding (default is US-ASCII).  Use ``encoding="unicode"`` to
692   generate a Unicode string (otherwise, a bytestring is generated).  *method*
693   is either ``"xml"``, ``"html"`` or ``"text"`` (default is ``"xml"``).
   *xml_declaration*, *default_namespace* and *short_empty_elements* have the same
   meaning as in :meth:`ElementTree.write`. Returns a list of (optionally) encoded
   strings containing the XML data. It does not guarantee any specific sequence,
   except that ``b"".join(tostringlist(element)) == tostring(element)``.
698
699   .. versionadded:: 3.2
700
701   .. versionadded:: 3.4
702      The *short_empty_elements* parameter.
703
704   .. versionadded:: 3.8
705      The *xml_declaration* and *default_namespace* parameters.
706
707   .. versionchanged:: 3.8
708      The :func:`tostringlist` function now preserves the attribute order
709      specified by the user.
710
711
712.. function:: XML(text, parser=None)
713
714   Parses an XML section from a string constant.  This function can be used to
715   embed "XML literals" in Python code.  *text* is a string containing XML
716   data.  *parser* is an optional parser instance.  If not given, the standard
717   :class:`XMLParser` parser is used.  Returns an :class:`Element` instance.
718
719
720.. function:: XMLID(text, parser=None)
721
   Parses an XML section from a string constant, and also returns a dictionary
   which maps from element IDs to elements.  *text* is a string containing XML
   data.  *parser* is an optional parser instance.  If not given, the standard
   :class:`XMLParser` parser is used.  Returns a tuple containing an
   :class:`Element` instance and a dictionary.
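
A short sketch of the returned pair, assuming a small document whose elements
carry ``id`` attributes:

```python
import xml.etree.ElementTree as ET

root, ids = ET.XMLID(
    '<doc><item id="a">first</item><item id="b">second</item></doc>'
)

# The dictionary maps each id value to the element that carries it.
print(sorted(ids))
print(ids["b"].text)
```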
727
728
729.. _elementtree-xinclude:
730
731XInclude support
732----------------
733
734This module provides limited support for
735`XInclude directives <https://www.w3.org/TR/xinclude/>`_, via the :mod:`xml.etree.ElementInclude` helper module.  This module can be used to insert subtrees and text strings into element trees, based on information in the tree.
736
737Example
738^^^^^^^
739
740Here's an example that demonstrates use of the XInclude module. To include an XML document in the current document, use the ``{http://www.w3.org/2001/XInclude}include`` element and set the **parse** attribute to ``"xml"``, and use the **href** attribute to specify the document to include.
741
.. code-block:: xml

    <?xml version="1.0"?>
    <document xmlns:xi="http://www.w3.org/2001/XInclude">
      <xi:include href="source.xml" parse="xml" />
    </document>
748
749By default, the **href** attribute is treated as a file name. You can use custom loaders to override this behaviour. Also note that the standard helper does not support XPointer syntax.
750
To process this file, load it as usual, and pass the root element to the
:mod:`xml.etree.ElementInclude` module:

.. code-block:: python

   from xml.etree import ElementTree, ElementInclude

   tree = ElementTree.parse("document.xml")
   root = tree.getroot()

   # Expands the XInclude directives in place; root itself is modified.
   ElementInclude.include(root)
761
762The ElementInclude module replaces the ``{http://www.w3.org/2001/XInclude}include`` element with the root element from the **source.xml** document. The result might look something like this:
763
.. code-block:: xml

    <document xmlns:xi="http://www.w3.org/2001/XInclude">
      <para>This is a paragraph.</para>
    </document>
769
If the **parse** attribute is omitted, it defaults to ``"xml"``.  The **href**
attribute is required.
771
To include a text document, use the
``{http://www.w3.org/2001/XInclude}include`` element, and set the **parse**
attribute to ``"text"``:

.. code-block:: xml

    <?xml version="1.0"?>
    <document xmlns:xi="http://www.w3.org/2001/XInclude">
      Copyright (c) <xi:include href="year.txt" parse="text" />.
    </document>
780
781The result might look something like:
782
.. code-block:: xml

    <document xmlns:xi="http://www.w3.org/2001/XInclude">
      Copyright (c) 2003.
    </document>
788
789Reference
790---------
791
792.. _elementinclude-functions:
793
794Functions
795^^^^^^^^^
796
.. function:: xml.etree.ElementInclude.default_loader(href, parse, encoding=None)

   Default loader.  This default loader reads an included resource from disk.
   *href* is a URL.  *parse* is the parse mode, either ``"xml"`` or
   ``"text"``.  *encoding* is an optional text encoding.  If not given, the
   encoding is ``utf-8``.  Returns the expanded resource.  If the parse mode is
   ``"xml"``, this is an ElementTree instance.  If the parse mode is
   ``"text"``, this is a Unicode string.  If the loader fails, it can return
   ``None`` or raise an exception.
805
806
.. function:: xml.etree.ElementInclude.include(elem, loader=None)

   This function expands XInclude directives in place.  *elem* is the root
   element.  *loader* is an optional resource loader.  If omitted, it defaults
   to :func:`default_loader`.  If given, it should be a callable that
   implements the same interface as :func:`default_loader`: it returns the
   expanded resource, which is an ElementTree instance if the parse mode is
   ``"xml"`` and a Unicode string if the parse mode is ``"text"``.  If the
   loader fails, it can return ``None`` or raise an exception.
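
   The loader interface can be sketched with a custom loader that serves
   resources from memory instead of from disk.  The ``RESOURCES`` dictionary,
   the ``dict_loader`` name and the ``<note>`` element are assumptions made
   for this illustration only; the loader returns an :class:`Element` for
   ``"xml"`` and a string for ``"text"``:

   .. code-block:: python

      from xml.etree import ElementTree as ET
      from xml.etree import ElementInclude

      # Hypothetical in-memory "files" used instead of reading from disk.
      RESOURCES = {
          "source.xml": "<para>This is a paragraph.</para>",
          "year.txt": "2003",
      }

      def dict_loader(href, parse, encoding=None):
          """Loader with the same call signature as default_loader."""
          data = RESOURCES.get(href)
          if data is None:
              return None                 # signals a failed load
          if parse == "xml":
              return ET.fromstring(data)  # parsed element for XML includes
          return data                     # plain string for text includes

      root = ET.fromstring(
          '<document xmlns:xi="http://www.w3.org/2001/XInclude">'
          '<xi:include href="source.xml" parse="xml" />'
          '<note>Copyright (c) <xi:include href="year.txt" parse="text" />.</note>'
          '</document>'
      )

      # The tree is expanded in place.
      ElementInclude.include(root, loader=dict_loader)
      print(ET.tostring(root, encoding="unicode"))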
816
817
818.. _elementtree-element-objects:
819
820Element Objects
821^^^^^^^^^^^^^^^
822
823.. class:: Element(tag, attrib={}, **extra)
824
825   Element class.  This class defines the Element interface, and provides a
826   reference implementation of this interface.
827
828   The element name, attribute names, and attribute values can be either
829   bytestrings or Unicode strings.  *tag* is the element name.  *attrib* is
830   an optional dictionary, containing element attributes.  *extra* contains
831   additional attributes, given as keyword arguments.
832
833
834   .. attribute:: tag
835
836      A string identifying what kind of data this element represents (the
837      element type, in other words).
838
839
840   .. attribute:: text
841                  tail
842
843      These attributes can be used to hold additional data associated with
844      the element.  Their values are usually strings but may be any
845      application-specific object.  If the element is created from
846      an XML file, the *text* attribute holds either the text between
847      the element's start tag and its first child or end tag, or ``None``, and
848      the *tail* attribute holds either the text between the element's
849      end tag and the next tag, or ``None``.  For the XML data
850
      .. code-block:: xml

         <a><b>1<c>2<d/>3</c></b>4</a>
854
855      the *a* element has ``None`` for both *text* and *tail* attributes,
856      the *b* element has *text* ``"1"`` and *tail* ``"4"``,
857      the *c* element has *text* ``"2"`` and *tail* ``None``,
858      and the *d* element has *text* ``None`` and *tail* ``"3"``.
859
860      To collect the inner text of an element, see :meth:`itertext`, for
861      example ``"".join(element.itertext())``.
862
863      Applications may store arbitrary objects in these attributes.
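
      The values described above can be checked with a short sketch (the
      variable names are chosen for this example only):

      .. code-block:: python

         import xml.etree.ElementTree as ET

         a = ET.fromstring("<a><b>1<c>2<d/>3</c></b>4</a>")
         b = a.find("b")
         c = b.find("c")
         d = c.find("d")

         # Print the text and tail of every element in the sample.
         for elem in (a, b, c, d):
             print(elem.tag, repr(elem.text), repr(elem.tail))

         # itertext() yields text and tail values in document order.
         inner = "".join(a.itertext())
         print(inner)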
864
865
866   .. attribute:: attrib
867
868      A dictionary containing the element's attributes.  Note that while the
869      *attrib* value is always a real mutable Python dictionary, an ElementTree
870      implementation may choose to use another internal representation, and
871      create the dictionary only if someone asks for it.  To take advantage of
872      such implementations, use the dictionary methods below whenever possible.
873
874   The following dictionary-like methods work on the element attributes.
875
876
877   .. method:: clear()
878
879      Resets an element.  This function removes all subelements, clears all
880      attributes, and sets the text and tail attributes to ``None``.
881
882
883   .. method:: get(key, default=None)
884
885      Gets the element attribute named *key*.
886
887      Returns the attribute value, or *default* if the attribute was not found.
888
889
890   .. method:: items()
891
892      Returns the element attributes as a sequence of (name, value) pairs.  The
893      attributes are returned in an arbitrary order.
894
895
896   .. method:: keys()
897
      Returns the element's attribute names as a list.  The names are returned
      in an arbitrary order.
900
901
902   .. method:: set(key, value)
903
904      Set the attribute *key* on the element to *value*.
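
      A small sketch of the dictionary-like attribute methods working
      together; the element and attribute names are invented for the example:

      .. code-block:: python

         import xml.etree.ElementTree as ET

         link = ET.Element("a", {"href": "http://example.org/"})
         link.set("target", "blank")        # add or overwrite an attribute

         print(link.get("href"))            # value of an existing attribute
         print(link.get("rel", "none"))     # missing attribute gives the default
         print(sorted(link.keys()))         # attribute names
         print(sorted(link.items()))        # (name, value) pairs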
905
906   The following methods work on the element's children (subelements).
907
908
909   .. method:: append(subelement)
910
911      Adds the element *subelement* to the end of this element's internal list
912      of subelements.  Raises :exc:`TypeError` if *subelement* is not an
913      :class:`Element`.
914
915
916   .. method:: extend(subelements)
917
918      Appends *subelements* from a sequence object with zero or more elements.
919      Raises :exc:`TypeError` if a subelement is not an :class:`Element`.
920
921      .. versionadded:: 3.2
922
923
924   .. method:: find(match, namespaces=None)
925
926      Finds the first subelement matching *match*.  *match* may be a tag name
927      or a :ref:`path <elementtree-xpath>`.  Returns an element instance
928      or ``None``.  *namespaces* is an optional mapping from namespace prefix
929      to full name.  Pass ``''`` as prefix to move all unprefixed tag names
930      in the expression into the given namespace.
931
932
933   .. method:: findall(match, namespaces=None)
934
935      Finds all matching subelements, by tag name or
936      :ref:`path <elementtree-xpath>`.  Returns a list containing all matching
937      elements in document order.  *namespaces* is an optional mapping from
938      namespace prefix to full name.  Pass ``''`` as prefix to move all
939      unprefixed tag names in the expression into the given namespace.
940
941
942   .. method:: findtext(match, default=None, namespaces=None)
943
944      Finds text for the first subelement matching *match*.  *match* may be
945      a tag name or a :ref:`path <elementtree-xpath>`.  Returns the text content
946      of the first matching element, or *default* if no element was found.
947      Note that if the matching element has no text content an empty string
948      is returned. *namespaces* is an optional mapping from namespace prefix
949      to full name.  Pass ``''`` as prefix to move all unprefixed tag names
950      in the expression into the given namespace.
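
      The three lookup methods can be compared side by side in a sketch like
      the following.  The feed markup, the ``ex`` namespace URI and the prefix
      names in ``ns`` are assumptions made for the illustration:

      .. code-block:: python

         import xml.etree.ElementTree as ET

         root = ET.fromstring(
             '<feed xmlns="http://www.w3.org/2005/Atom" '
             'xmlns:ex="http://example.com/ns">'
             '<entry><title>First</title><ex:rank>1</ex:rank></entry>'
             '<entry><title/></entry>'
             '</feed>'
         )
         # Prefixes used in the path expressions; they need not match the
         # prefixes used in the document itself.
         ns = {"atom": "http://www.w3.org/2005/Atom",
               "ex": "http://example.com/ns"}

         first = root.find("atom:entry", ns)        # first match or None
         entries = root.findall("atom:entry", ns)   # list of all matches

         # An element without text content yields an empty string.
         titles = [e.findtext("atom:title", namespaces=ns) for e in entries]

         # No matching element at all yields the default.
         missing = first.findtext("atom:summary", default="n/a", namespaces=ns)
         rank = first.findtext("ex:rank", namespaces=ns)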
951
952
953   .. method:: getchildren()
954
955      .. deprecated-removed:: 3.2 3.9
956         Use ``list(elem)`` or iteration.
957
958
959   .. method:: getiterator(tag=None)
960
961      .. deprecated-removed:: 3.2 3.9
962         Use method :meth:`Element.iter` instead.
963
964
965   .. method:: insert(index, subelement)
966
967      Inserts *subelement* at the given position in this element.  Raises
968      :exc:`TypeError` if *subelement* is not an :class:`Element`.
969
970
971   .. method:: iter(tag=None)
972
973      Creates a tree :term:`iterator` with the current element as the root.
974      The iterator iterates over this element and all elements below it, in
975      document (depth first) order.  If *tag* is not ``None`` or ``'*'``, only
976      elements whose tag equals *tag* are returned from the iterator.  If the
977      tree structure is modified during iteration, the result is undefined.
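
      As a short sketch with invented sample markup, iterating with and
      without a *tag* filter looks like this:

      .. code-block:: python

         import xml.etree.ElementTree as ET

         root = ET.fromstring("<a><b><c/></b><d><c/></d></a>")

         # Without a tag, the root itself and all descendants are visited.
         all_tags = [elem.tag for elem in root.iter()]

         # With a tag, only matching elements are produced.
         only_c = [elem.tag for elem in root.iter("c")]

         print(all_tags)
         print(only_c)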
978
979      .. versionadded:: 3.2
980
981
982   .. method:: iterfind(match, namespaces=None)
983
984      Finds all matching subelements, by tag name or
985      :ref:`path <elementtree-xpath>`.  Returns an iterable yielding all
986      matching elements in document order. *namespaces* is an optional mapping
987      from namespace prefix to full name.
988
989
990      .. versionadded:: 3.2
991
992
993   .. method:: itertext()
994
995      Creates a text iterator.  The iterator loops over this element and all
996      subelements, in document order, and returns all inner text.
997
998      .. versionadded:: 3.2
999
1000
1001   .. method:: makeelement(tag, attrib)
1002
      Creates a new element object of the same type as this element.  Do not
      call this method; use the :func:`SubElement` factory function instead.
1005
1006
1007   .. method:: remove(subelement)
1008
1009      Removes *subelement* from the element.  Unlike the find\* methods this
1010      method compares elements based on the instance identity, not on tag value
1011      or contents.
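
      The identity-based comparison can be seen in a sketch such as the one
      below (sample markup and names are illustrative).  An element that
      merely looks the same as a child is not removed; the call raises
      :exc:`ValueError` instead:

      .. code-block:: python

         import xml.etree.ElementTree as ET

         root = ET.fromstring("<list><item>a</item><item>a</item></list>")
         first, second = list(root)

         # Same tag and text, but a different object: not a child of root.
         lookalike = ET.Element("item")
         lookalike.text = "a"
         try:
             root.remove(lookalike)
         except ValueError:
             print("lookalike is not a child of root")

         root.remove(first)      # removes exactly this instance
         remaining = list(root)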
1012
1013   :class:`Element` objects also support the following sequence type methods
1014   for working with subelements: :meth:`~object.__delitem__`,
1015   :meth:`~object.__getitem__`, :meth:`~object.__setitem__`,
1016   :meth:`~object.__len__`.
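
   A brief sketch of these sequence operations on an element's children,
   using invented sample markup:

   .. code-block:: python

      import xml.etree.ElementTree as ET

      root = ET.fromstring("<r><a/><b/><c/></r>")

      print(len(root))                    # number of direct children
      print(root[0].tag, root[-1].tag)    # indexing
      print([e.tag for e in root[1:]])    # slicing gives a list of elements

      root[1] = ET.Element("x")           # replace the second child
      del root[0]                         # delete the first child
      tags = [e.tag for e in root]
      print(tags)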
1017
   Caution: Elements with no subelements will test as ``False``.  This behavior
   will change in future versions.  Use an explicit ``len(elem)`` or ``elem is
   None`` test instead. ::

     element = root.find('foo')

     if not element:  # careful!
         print("element not found, or element has no subelements")

     if element is None:
         print("element not found")
1029
1030   Prior to Python 3.8, the serialisation order of the XML attributes of
1031   elements was artificially made predictable by sorting the attributes by
1032   their name. Based on the now guaranteed ordering of dicts, this arbitrary
1033   reordering was removed in Python 3.8 to preserve the order in which
1034   attributes were originally parsed or created by user code.
1035
1036   In general, user code should try not to depend on a specific ordering of
1037   attributes, given that the `XML Information Set
1038   <https://www.w3.org/TR/xml-infoset/>`_ explicitly excludes the attribute
1039   order from conveying information. Code should be prepared to deal with
1040   any ordering on input. In cases where deterministic XML output is required,
1041   e.g. for cryptographic signing or test data sets, canonical serialisation
1042   is available with the :func:`canonicalize` function.
1043
1044   In cases where canonical output is not applicable but a specific attribute
1045   order is still desirable on output, code should aim for creating the
1046   attributes directly in the desired order, to avoid perceptual mismatches
1047   for readers of the code. In cases where this is difficult to achieve, a
1048   recipe like the following can be applied prior to serialisation to enforce
1049   an order independently from the Element creation::
1050
     def reorder_attributes(root):
         for el in root.iter():
             attrib = el.attrib
             if len(attrib) > 1:
                 # adjust attribute order, e.g. by sorting
                 attribs = sorted(attrib.items())
                 attrib.clear()
                 attrib.update(attribs)
1059
1060
1061.. _elementtree-elementtree-objects:
1062
1063ElementTree Objects
1064^^^^^^^^^^^^^^^^^^^
1065
1066
1067.. class:: ElementTree(element=None, file=None)
1068
1069   ElementTree wrapper class.  This class represents an entire element
1070   hierarchy, and adds some extra support for serialization to and from
1071   standard XML.
1072
1073   *element* is the root element.  The tree is initialized with the contents
1074   of the XML *file* if given.
1075
1076
1077   .. method:: _setroot(element)
1078
1079      Replaces the root element for this tree.  This discards the current
1080      contents of the tree, and replaces it with the given element.  Use with
1081      care.  *element* is an element instance.
1082
1083
1084   .. method:: find(match, namespaces=None)
1085
1086      Same as :meth:`Element.find`, starting at the root of the tree.
1087
1088
1089   .. method:: findall(match, namespaces=None)
1090
1091      Same as :meth:`Element.findall`, starting at the root of the tree.
1092
1093
1094   .. method:: findtext(match, default=None, namespaces=None)
1095
1096      Same as :meth:`Element.findtext`, starting at the root of the tree.
1097
1098
1099   .. method:: getiterator(tag=None)
1100
1101      .. deprecated-removed:: 3.2 3.9
1102         Use method :meth:`ElementTree.iter` instead.
1103
1104
1105   .. method:: getroot()
1106
1107      Returns the root element for this tree.
1108
1109
1110   .. method:: iter(tag=None)
1111
      Creates and returns a tree iterator for the root element.  The iterator
      loops over all elements in this tree, in document order.  *tag* is the tag
      to look for (default is to return all elements).
1115
1116
1117   .. method:: iterfind(match, namespaces=None)
1118
1119      Same as :meth:`Element.iterfind`, starting at the root of the tree.
1120
1121      .. versionadded:: 3.2
1122
1123
1124   .. method:: parse(source, parser=None)
1125
1126      Loads an external XML section into this element tree.  *source* is a file
1127      name or :term:`file object`.  *parser* is an optional parser instance.
1128      If not given, the standard :class:`XMLParser` parser is used.  Returns the
1129      section root element.
1130
1131
1132   .. method:: write(file, encoding="us-ascii", xml_declaration=None, \
1133                     default_namespace=None, method="xml", *, \
1134                     short_empty_elements=True)
1135
1136      Writes the element tree to a file, as XML.  *file* is a file name, or a
1137      :term:`file object` opened for writing.  *encoding* [1]_ is the output
1138      encoding (default is US-ASCII).
1139      *xml_declaration* controls if an XML declaration should be added to the
1140      file.  Use ``False`` for never, ``True`` for always, ``None``
1141      for only if not US-ASCII or UTF-8 or Unicode (default is ``None``).
1142      *default_namespace* sets the default XML namespace (for "xmlns").
1143      *method* is either ``"xml"``, ``"html"`` or ``"text"`` (default is
1144      ``"xml"``).
1145      The keyword-only *short_empty_elements* parameter controls the formatting
1146      of elements that contain no content.  If ``True`` (the default), they are
1147      emitted as a single self-closed tag, otherwise they are emitted as a pair
1148      of start/end tags.
1149
1150      The output is either a string (:class:`str`) or binary (:class:`bytes`).
1151      This is controlled by the *encoding* argument.  If *encoding* is
1152      ``"unicode"``, the output is a string; otherwise, it's binary.  Note that
1153      this may conflict with the type of *file* if it's an open
1154      :term:`file object`; make sure you do not try to write a string to a
1155      binary stream and vice versa.
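
      The pairing of *encoding* and stream type can be sketched as follows,
      using in-memory streams in place of real files (the sample tree is
      invented for the illustration):

      .. code-block:: python

         import io
         import xml.etree.ElementTree as ET

         tree = ET.ElementTree(ET.fromstring("<root><empty/></root>"))

         # encoding="unicode" produces str, so the target is a text stream.
         text_out = io.StringIO()
         tree.write(text_out, encoding="unicode")

         # Any other encoding produces bytes, so the target is a binary stream.
         bytes_out = io.BytesIO()
         tree.write(bytes_out, encoding="utf-8", xml_declaration=True,
                    short_empty_elements=False)

         print(text_out.getvalue())
         print(bytes_out.getvalue())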
1156
1157      .. versionadded:: 3.4
1158         The *short_empty_elements* parameter.
1159
1160      .. versionchanged:: 3.8
1161         The :meth:`write` method now preserves the attribute order specified
1162         by the user.
1163
1164
This is the XML file that is going to be manipulated::

    <html>
        <head>
            <title>Example page</title>
        </head>
        <body>
            <p>Moved to <a href="http://example.org/">example.org</a>
            or <a href="http://example.com/">example.com</a>.</p>
        </body>
    </html>

Example of changing the attribute "target" of every link in the first
paragraph::

    >>> from xml.etree.ElementTree import ElementTree
    >>> tree = ElementTree()
    >>> tree.parse("index.xhtml")
    <Element 'html' at 0xb77e6fac>
    >>> p = tree.find("body/p")     # Finds first occurrence of tag p in body
    >>> p
    <Element 'p' at 0xb77ec26c>
    >>> links = list(p.iter("a"))   # Returns list of all links
    >>> links
    [<Element 'a' at 0xb77ec2ac>, <Element 'a' at 0xb77ec1cc>]
    >>> for i in links:             # Iterates through all found links
    ...     i.attrib["target"] = "blank"
    >>> tree.write("output.xhtml")
1192
1193.. _elementtree-qname-objects:
1194
1195QName Objects
1196^^^^^^^^^^^^^
1197
1198
1199.. class:: QName(text_or_uri, tag=None)
1200
1201   QName wrapper.  This can be used to wrap a QName attribute value, in order
1202   to get proper namespace handling on output.  *text_or_uri* is a string
1203   containing the QName value, in the form {uri}local, or, if the tag argument
1204   is given, the URI part of a QName.  If *tag* is given, the first argument is
1205   interpreted as a URI, and this argument is interpreted as a local name.
1206   :class:`QName` instances are opaque.
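
   A minimal sketch of both constructor forms, and of a :class:`QName` used as
   an attribute value so that a prefix is generated on output.  The namespace
   URI and the element and attribute names are hypothetical:

   .. code-block:: python

      import xml.etree.ElementTree as ET

      NS = "http://example.com/ns"          # hypothetical namespace URI

      # Both forms describe the same qualified name.
      q1 = ET.QName("{%s}item" % NS)
      q2 = ET.QName(NS, "item")
      print(q1.text)

      # Wrapping the attribute value lets the serializer emit it as
      # prefix:local and declare the namespace.
      elem = ET.Element("ref")
      elem.set("target", q2)
      xml_text = ET.tostring(elem, encoding="unicode")
      print(xml_text)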
1207
1208
1209
1210.. _elementtree-treebuilder-objects:
1211
1212TreeBuilder Objects
1213^^^^^^^^^^^^^^^^^^^
1214
1215
1216.. class:: TreeBuilder(element_factory=None, *, comment_factory=None, \
1217                       pi_factory=None, insert_comments=False, insert_pis=False)
1218
1219   Generic element structure builder.  This builder converts a sequence of
1220   start, data, end, comment and pi method calls to a well-formed element
1221   structure.  You can use this class to build an element structure using
1222   a custom XML parser, or a parser for some other XML-like format.
1223
1224   *element_factory*, when given, must be a callable accepting two positional
1225   arguments: a tag and a dict of attributes.  It is expected to return a new
1226   element instance.
1227
1228   The *comment_factory* and *pi_factory* functions, when given, should behave
1229   like the :func:`Comment` and :func:`ProcessingInstruction` functions to
1230   create comments and processing instructions.  When not given, the default
1231   factories will be used.  When *insert_comments* and/or *insert_pis* is true,
1232   comments/pis will be inserted into the tree if they appear within the root
1233   element (but not outside of it).
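
   As a sketch of the call sequence described above, the builder can be driven
   by hand, without any parser.  The tag names and attribute values are
   invented for the example:

   .. code-block:: python

      import xml.etree.ElementTree as ET

      builder = ET.TreeBuilder()

      builder.start("root", {"version": "1"})   # open <root version="1">
      builder.start("item", {})                 # open <item>
      builder.data("hello")                     # text inside <item>
      builder.end("item")                       # close </item>
      builder.end("root")                       # close </root>

      root = builder.close()                    # the top-level Element
      xml_text = ET.tostring(root, encoding="unicode")
      print(xml_text)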
1234
1235   .. method:: close()
1236
1237      Flushes the builder buffers, and returns the toplevel document
1238      element.  Returns an :class:`Element` instance.
1239
1240
1241   .. method:: data(data)
1242
1243      Adds text to the current element.  *data* is a string.  This should be
1244      either a bytestring, or a Unicode string.
1245
1246
1247   .. method:: end(tag)
1248
1249      Closes the current element.  *tag* is the element name.  Returns the
1250      closed element.
1251
1252
1253   .. method:: start(tag, attrs)
1254
1255      Opens a new element.  *tag* is the element name.  *attrs* is a dictionary
1256      containing element attributes.  Returns the opened element.
1257
1258
1259   .. method:: comment(text)
1260
1261      Creates a comment with the given *text*.  If ``insert_comments`` is true,
1262      this will also add it to the tree.
1263
1264      .. versionadded:: 3.8
1265
1266
1267   .. method:: pi(target, text)
1268
      Creates a processing instruction with the given *target* name and
      *text*.  If ``insert_pis`` is true, this will also add it to the tree.
1271
1272      .. versionadded:: 3.8
1273
1274
1275   In addition, a custom :class:`TreeBuilder` object can provide the
1276   following methods:
1277
1278   .. method:: doctype(name, pubid, system)
1279
1280      Handles a doctype declaration.  *name* is the doctype name.  *pubid* is
1281      the public identifier.  *system* is the system identifier.  This method
1282      does not exist on the default :class:`TreeBuilder` class.
1283
1284      .. versionadded:: 3.2
1285
1286   .. method:: start_ns(prefix, uri)
1287
1288      Is called whenever the parser encounters a new namespace declaration,
1289      before the ``start()`` callback for the opening element that defines it.
1290      *prefix* is ``''`` for the default namespace and the declared
1291      namespace prefix name otherwise.  *uri* is the namespace URI.
1292
1293      .. versionadded:: 3.8
1294
1295   .. method:: end_ns(prefix)
1296
1297      Is called after the ``end()`` callback of an element that declared
1298      a namespace prefix mapping, with the name of the *prefix* that went
1299      out of scope.
1300
1301      .. versionadded:: 3.8
1302
1303
1304.. class:: C14NWriterTarget(write, *, \
1305             with_comments=False, strip_text=False, rewrite_prefixes=False, \
1306             qname_aware_tags=None, qname_aware_attrs=None, \
1307             exclude_attrs=None, exclude_tags=None)
1308
1309   A `C14N 2.0 <https://www.w3.org/TR/xml-c14n2/>`_ writer.  Arguments are the
1310   same as for the :func:`canonicalize` function.  This class does not build a
1311   tree but translates the callback events directly into a serialised form
1312   using the *write* function.
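
   One way to use the class, sketched here under the assumption that a list's
   ``append`` method is an acceptable *write* callable, is to pass it as the
   target of an :class:`XMLParser`.  The sample document is invented:

   .. code-block:: python

      import xml.etree.ElementTree as ET

      xml_data = '<root b="2" a="1"><child/></root>'

      chunks = []                                   # collects serialised pieces
      target = ET.C14NWriterTarget(chunks.append)   # any callable taking a str
      parser = ET.XMLParser(target=target)
      parser.feed(xml_data)
      parser.close()

      canonical = "".join(chunks)
      print(canonical)

   The joined result should match what :func:`canonicalize` returns for the
   same input.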
1313
1314   .. versionadded:: 3.8
1315
1316
1317.. _elementtree-xmlparser-objects:
1318
1319XMLParser Objects
1320^^^^^^^^^^^^^^^^^
1321
1322
1323.. class:: XMLParser(*, target=None, encoding=None)
1324
1325   This class is the low-level building block of the module.  It uses
1326   :mod:`xml.parsers.expat` for efficient, event-based parsing of XML.  It can
1327   be fed XML data incrementally with the :meth:`feed` method, and parsing
1328   events are translated to a push API - by invoking callbacks on the *target*
1329   object.  If *target* is omitted, the standard :class:`TreeBuilder` is used.
1330   If *encoding* [1]_ is given, the value overrides the
1331   encoding specified in the XML file.
1332
   .. versionchanged:: 3.8
      Parameters are now :ref:`keyword-only <keyword-only_parameter>`.
      The *html* argument is no longer supported.
1336
1337
1338   .. method:: close()
1339
1340      Finishes feeding data to the parser.  Returns the result of calling the
1341      ``close()`` method of the *target* passed during construction; by default,
1342      this is the toplevel document element.
1343
1344
1345   .. method:: feed(data)
1346
1347      Feeds data to the parser.  *data* is encoded data.
1348
1349   :meth:`XMLParser.feed` calls *target*\'s ``start(tag, attrs_dict)`` method
1350   for each opening tag, its ``end(tag)`` method for each closing tag, and data
1351   is processed by method ``data(data)``.  For further supported callback
1352   methods, see the :class:`TreeBuilder` class.  :meth:`XMLParser.close` calls
1353   *target*\'s method ``close()``. :class:`XMLParser` can be used not only for
1354   building a tree structure. This is an example of counting the maximum depth
1355   of an XML file::
1356
    >>> from xml.etree.ElementTree import XMLParser
    >>> class MaxDepth:                     # The target object of the parser
    ...     maxDepth = 0
    ...     depth = 0
    ...     def start(self, tag, attrib):   # Called for each opening tag.
    ...         self.depth += 1
    ...         if self.depth > self.maxDepth:
    ...             self.maxDepth = self.depth
    ...     def end(self, tag):             # Called for each closing tag.
    ...         self.depth -= 1
    ...     def data(self, data):
    ...         pass            # We do not need to do anything with data.
    ...     def close(self):    # Called when all data has been parsed.
    ...         return self.maxDepth
    ...
    >>> target = MaxDepth()
    >>> parser = XMLParser(target=target)
    >>> exampleXml = """
    ... <a>
    ...   <b>
    ...   </b>
    ...   <b>
    ...     <c>
    ...       <d>
    ...       </d>
    ...     </c>
    ...   </b>
    ... </a>"""
    >>> parser.feed(exampleXml)
    >>> parser.close()
    4
1388
1389
1390.. _elementtree-xmlpullparser-objects:
1391
1392XMLPullParser Objects
1393^^^^^^^^^^^^^^^^^^^^^
1394
1395.. class:: XMLPullParser(events=None)
1396
1397   A pull parser suitable for non-blocking applications.  Its input-side API is
1398   similar to that of :class:`XMLParser`, but instead of pushing calls to a
1399   callback target, :class:`XMLPullParser` collects an internal list of parsing
1400   events and lets the user read from it. *events* is a sequence of events to
1401   report back.  The supported events are the strings ``"start"``, ``"end"``,
1402   ``"comment"``, ``"pi"``, ``"start-ns"`` and ``"end-ns"`` (the "ns" events
1403   are used to get detailed namespace information).  If *events* is omitted,
1404   only ``"end"`` events are reported.
1405
1406   .. method:: feed(data)
1407
1408      Feed the given bytes data to the parser.
1409
1410   .. method:: close()
1411
1412      Signal the parser that the data stream is terminated. Unlike
1413      :meth:`XMLParser.close`, this method always returns :const:`None`.
1414      Any events not yet retrieved when the parser is closed can still be
1415      read with :meth:`read_events`.
1416
1417   .. method:: read_events()
1418
      Return an iterator over the events which have been encountered in the
      data fed to the parser.  The iterator yields ``(event, elem)`` pairs,
      where *event* is a string representing the type of event (e.g.
      ``"end"``) and *elem* is the encountered :class:`Element` object, or
      other context value as follows.
1424
1425      * ``start``, ``end``: the current Element.
1426      * ``comment``, ``pi``: the current comment / processing instruction
1427      * ``start-ns``: a tuple ``(prefix, uri)`` naming the declared namespace
1428        mapping.
1429      * ``end-ns``: :const:`None` (this may change in a future version)
1430
1431      Events provided in a previous call to :meth:`read_events` will not be
1432      yielded again.  Events are consumed from the internal queue only when
1433      they are retrieved from the iterator, so multiple readers iterating in
1434      parallel over iterators obtained from :meth:`read_events` will have
1435      unpredictable results.
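
      The feed-then-read pattern can be sketched as below.  The chunk
      boundaries and tag names are invented; which events become available
      after each individual :meth:`feed` call may vary, so the sketch also
      drains the queue once more after :meth:`close`:

      .. code-block:: python

         import xml.etree.ElementTree as ET

         parser = ET.XMLPullParser(events=("start", "end"))
         seen = []

         # Data may arrive in arbitrary pieces, even split inside a tag.
         for chunk in ("<root><item>a</it", "em><item>b</item></root>"):
             parser.feed(chunk)
             for event, elem in parser.read_events():
                 seen.append((event, elem.tag))

         parser.close()
         # Events still queued at close time remain readable.
         seen.extend((event, elem.tag) for event, elem in parser.read_events())
         print(seen)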
1436
1437   .. note::
1438
1439      :class:`XMLPullParser` only guarantees that it has seen the ">"
1440      character of a starting tag when it emits a "start" event, so the
1441      attributes are defined, but the contents of the text and tail attributes
1442      are undefined at that point.  The same applies to the element children;
1443      they may or may not be present.
1444
1445      If you need a fully populated element, look for "end" events instead.
1446
1447   .. versionadded:: 3.4
1448
1449   .. versionchanged:: 3.8
1450      The ``comment`` and ``pi`` events were added.
1451
1452
1453Exceptions
1454^^^^^^^^^^
1455
1456.. class:: ParseError
1457
1458   XML parse error, raised by the various parsing methods in this module when
1459   parsing fails.  The string representation of an instance of this exception
1460   will contain a user-friendly error message.  In addition, it will have
1461   the following attributes available:
1462
1463   .. attribute:: code
1464
1465      A numeric error code from the expat parser. See the documentation of
1466      :mod:`xml.parsers.expat` for the list of error codes and their meanings.
1467
1468   .. attribute:: position
1469
1470      A tuple of *line*, *column* numbers, specifying where the error occurred.
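
      A short sketch of inspecting both attributes after a failed parse.  The
      malformed input is invented, and ``ErrorString`` from
      :mod:`xml.parsers.expat` is used here to turn the numeric code into a
      message:

      .. code-block:: python

         import xml.etree.ElementTree as ET
         from xml.parsers import expat

         try:
             ET.fromstring("<a><b></a>")       # mismatched closing tag
         except ET.ParseError as err:
             code = err.code
             line, column = err.position
             message = expat.ErrorString(code)
             print(f"line {line}, column {column}: {message} (code {code})")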
1471
1472.. rubric:: Footnotes
1473
1474.. [1] The encoding string included in XML output should conform to the
1475   appropriate standards.  For example, "UTF-8" is valid, but "UTF8" is
1476   not.  See https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
1477   and https://www.iana.org/assignments/character-sets/character-sets.xhtml.
1478