:mod:`xml.etree.ElementTree` --- The ElementTree XML API
========================================================

.. module:: xml.etree.ElementTree
   :synopsis: Implementation of the ElementTree API.
.. moduleauthor:: Fredrik Lundh <fredrik@pythonware.com>


.. versionadded:: 2.5

**Source code:** :source:`Lib/xml/etree/ElementTree.py`

--------------

The :class:`Element` type is a flexible container object, designed to store
hierarchical data structures in memory.  The type can be described as a cross
between a list and a dictionary.


.. warning::

   The :mod:`xml.etree.ElementTree` module is not secure against
   maliciously constructed data.  If you need to parse untrusted or
   unauthenticated data, see :ref:`xml-vulnerabilities`.


Each element has a number of properties associated with it:

* a tag which is a string identifying what kind of data this element represents
  (the element type, in other words).

* a number of attributes, stored in a Python dictionary.

* a text string.

* an optional tail string.

* a number of child elements, stored in a Python sequence.

To create an element instance, use the :class:`Element` constructor or the
:func:`SubElement` factory function.

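As a minimal sketch of the properties listed above, an element can be built by
hand and inspected.  The element names and values here are made up for
illustration only:

```python
import xml.etree.ElementTree as ET

# Hypothetical data: a "country" element with one attribute and one child.
country = ET.Element("country", {"name": "Liechtenstein"})  # tag + attributes
country.text = "\n    "               # text before the first child
rank = ET.SubElement(country, "rank")  # creates a child and appends it
rank.text = "1"                        # text inside <rank>...</rank>
rank.tail = "\n"                       # text after </rank>, before the next tag

# Children behave like a sequence: len() and indexing both work.
summary = (country.tag, country.attrib, len(country), country[0].text)
```

Here ``summary`` collects the tag, the attribute dictionary, the number of
children, and the text of the first child.
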
The :class:`ElementTree` class can be used to wrap an element structure, and
convert it from and to XML.

A C implementation of this API is available as :mod:`xml.etree.cElementTree`.

See http://effbot.org/zone/element-index.htm for tutorials and links to other
docs.  Fredrik Lundh's page is also the location of the development version of
:mod:`xml.etree.ElementTree`.

.. versionchanged:: 2.7
   The ElementTree API is updated to 1.3.  For more information, see
   `Introducing ElementTree 1.3
   <http://effbot.org/zone/elementtree-13-intro.htm>`_.

Tutorial
--------

This is a short tutorial for using :mod:`xml.etree.ElementTree` (``ET`` in
short).  The goal is to demonstrate some of the building blocks and basic
concepts of the module.

XML tree and elements
^^^^^^^^^^^^^^^^^^^^^

XML is an inherently hierarchical data format, and the most natural way to
represent it is with a tree.  ``ET`` has two classes for this purpose:
:class:`ElementTree` represents the whole XML document as a tree, and
:class:`Element` represents a single node in this tree.  Interactions with
the whole document (reading and writing to/from files) are usually done
on the :class:`ElementTree` level.  Interactions with a single XML element
and its sub-elements are done on the :class:`Element` level.

.. _elementtree-parsing-xml:

Parsing XML
^^^^^^^^^^^

We'll be using the following XML document as the sample data for this section:

.. code-block:: xml

   <?xml version="1.0"?>
   <data>
       <country name="Liechtenstein">
           <rank>1</rank>
           <year>2008</year>
           <gdppc>141100</gdppc>
           <neighbor name="Austria" direction="E"/>
           <neighbor name="Switzerland" direction="W"/>
       </country>
       <country name="Singapore">
           <rank>4</rank>
           <year>2011</year>
           <gdppc>59900</gdppc>
           <neighbor name="Malaysia" direction="N"/>
       </country>
       <country name="Panama">
           <rank>68</rank>
           <year>2011</year>
           <gdppc>13600</gdppc>
           <neighbor name="Costa Rica" direction="W"/>
           <neighbor name="Colombia" direction="E"/>
       </country>
   </data>

We have a number of ways to import the data.  Reading the file from disk::

   import xml.etree.ElementTree as ET
   tree = ET.parse('country_data.xml')
   root = tree.getroot()

Reading the data from a string::

   root = ET.fromstring(country_data_as_string)

:func:`fromstring` parses XML from a string directly into an :class:`Element`,
which is the root element of the parsed tree.  Other parsing functions may
create an :class:`ElementTree`.  Check the documentation to be sure.

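The difference in return types can be checked directly.  The sketch below
uses a shortened, made-up document and an in-memory ``io.BytesIO`` object
instead of a file on disk, purely so that it is self-contained:

```python
import io
import xml.etree.ElementTree as ET

# Shortened stand-in for the sample document (an assumption for brevity).
xml_text = "<data><country name='Panama'/></data>"

# fromstring() returns the root Element directly.
root_from_string = ET.fromstring(xml_text)

# parse() accepts a file name or file object and returns an ElementTree;
# the root Element is obtained with getroot().
tree = ET.parse(io.BytesIO(xml_text.encode("ascii")))
root_from_tree = tree.getroot()
```

Both ``root_from_string`` and ``root_from_tree`` are :class:`Element`
instances with the tag ``data``; only ``tree`` is an :class:`ElementTree`.
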
As an :class:`Element`, ``root`` has a tag and a dictionary of attributes::

   >>> root.tag
   'data'
   >>> root.attrib
   {}

It also has child nodes over which we can iterate::

   >>> for child in root:
   ...     print child.tag, child.attrib
   ...
   country {'name': 'Liechtenstein'}
   country {'name': 'Singapore'}
   country {'name': 'Panama'}

Children are nested, and we can access specific child nodes by index::

   >>> root[0][1].text
   '2008'

Finding interesting elements
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

:class:`Element` has some useful methods that help iterate recursively over all
the sub-tree below it (its children, their children, and so on).  For example,
:meth:`Element.iter`::

   >>> for neighbor in root.iter('neighbor'):
   ...     print neighbor.attrib
   ...
   {'name': 'Austria', 'direction': 'E'}
   {'name': 'Switzerland', 'direction': 'W'}
   {'name': 'Malaysia', 'direction': 'N'}
   {'name': 'Costa Rica', 'direction': 'W'}
   {'name': 'Colombia', 'direction': 'E'}

:meth:`Element.findall` finds only elements with a tag which are direct
children of the current element.  :meth:`Element.find` finds the *first* child
with a particular tag, and :attr:`Element.text` accesses the element's text
content.  :meth:`Element.get` accesses the element's attributes::

   >>> for country in root.findall('country'):
   ...     rank = country.find('rank').text
   ...     name = country.get('name')
   ...     print name, rank
   ...
   Liechtenstein 1
   Singapore 4
   Panama 68

More sophisticated specification of which elements to look for is possible by
using :ref:`XPath <elementtree-xpath>`.

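The distinction between the recursive :meth:`Element.iter` and the
direct-children-only :meth:`Element.findall` is easy to miss.  The following
sketch, which uses a cut-down version of the sample data, illustrates it
along with the ``None`` result of an unsuccessful :meth:`Element.find`:

```python
import xml.etree.ElementTree as ET

# Cut-down sample document (an assumption made to keep the sketch short).
root = ET.fromstring(
    "<data>"
    "<country name='Singapore'>"
    "<rank>4</rank>"
    "<neighbor name='Malaysia'/>"
    "</country>"
    "</data>"
)

# 'neighbor' is a grandchild of root, so a plain tag search finds nothing.
direct = root.findall("neighbor")

# iter() walks the whole sub-tree and does find it.
nested = [n.get("name") for n in root.iter("neighbor")]

# find() returns None when nothing matches; findtext() returns the text
# of the first match.
missing = root.find("capital")
rank_text = root.find("country").findtext("rank")
```
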
Modifying an XML File
^^^^^^^^^^^^^^^^^^^^^

:class:`ElementTree` provides a simple way to build XML documents and write them to files.
The :meth:`ElementTree.write` method serves this purpose.

Once created, an :class:`Element` object may be manipulated by directly changing
its fields (such as :attr:`Element.text`), adding and modifying attributes
(:meth:`Element.set` method), as well as adding new children (for example
with :meth:`Element.append`).

Let's say we want to add one to each country's rank, and add an ``updated``
attribute to the rank element::

   >>> for rank in root.iter('rank'):
   ...     new_rank = int(rank.text) + 1
   ...     rank.text = str(new_rank)
   ...     rank.set('updated', 'yes')
   ...
   >>> tree.write('output.xml')

Our XML now looks like this:

.. code-block:: xml

   <?xml version="1.0"?>
   <data>
       <country name="Liechtenstein">
           <rank updated="yes">2</rank>
           <year>2008</year>
           <gdppc>141100</gdppc>
           <neighbor name="Austria" direction="E"/>
           <neighbor name="Switzerland" direction="W"/>
       </country>
       <country name="Singapore">
           <rank updated="yes">5</rank>
           <year>2011</year>
           <gdppc>59900</gdppc>
           <neighbor name="Malaysia" direction="N"/>
       </country>
       <country name="Panama">
           <rank updated="yes">69</rank>
           <year>2011</year>
           <gdppc>13600</gdppc>
           <neighbor name="Costa Rica" direction="W"/>
           <neighbor name="Colombia" direction="E"/>
       </country>
   </data>

We can remove elements using :meth:`Element.remove`.  Let's say we want to
remove all countries with a rank higher than 50::

   >>> for country in root.findall('country'):
   ...     rank = int(country.find('rank').text)
   ...     if rank > 50:
   ...         root.remove(country)
   ...
   >>> tree.write('output.xml')

Our XML now looks like this:

.. code-block:: xml

   <?xml version="1.0"?>
   <data>
       <country name="Liechtenstein">
           <rank updated="yes">2</rank>
           <year>2008</year>
           <gdppc>141100</gdppc>
           <neighbor name="Austria" direction="E"/>
           <neighbor name="Switzerland" direction="W"/>
       </country>
       <country name="Singapore">
           <rank updated="yes">5</rank>
           <year>2011</year>
           <gdppc>59900</gdppc>
           <neighbor name="Malaysia" direction="N"/>
       </country>
   </data>

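The removal loop above iterates over the list returned by
:meth:`Element.findall` rather than over ``root`` itself, so removing a child
does not disturb the iteration.  A self-contained sketch of the same pattern,
with made-up rank values:

```python
import xml.etree.ElementTree as ET

# Hypothetical data: three countries, two of them ranked above 50.
root = ET.fromstring(
    "<data>"
    "<country><rank>2</rank></country>"
    "<country><rank>69</rank></country>"
    "<country><rank>70</rank></country>"
    "</data>"
)

# findall() returns a list, so the children of root can be removed
# while looping over that list.
for country in root.findall("country"):
    if int(country.find("rank").text) > 50:
        root.remove(country)

remaining = [c.find("rank").text for c in root]
```
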
Building XML documents
^^^^^^^^^^^^^^^^^^^^^^

The :func:`SubElement` function also provides a convenient way to create new
sub-elements for a given element::

   >>> a = ET.Element('a')
   >>> b = ET.SubElement(a, 'b')
   >>> c = ET.SubElement(a, 'c')
   >>> d = ET.SubElement(c, 'd')
   >>> ET.dump(a)
   <a><b /><c><d /></c></a>

Parsing XML with Namespaces
^^^^^^^^^^^^^^^^^^^^^^^^^^^

If the XML input has `namespaces
<https://en.wikipedia.org/wiki/XML_namespace>`__, tags and attributes
with prefixes in the form ``prefix:sometag`` get expanded to
``{uri}sometag`` where the *prefix* is replaced by the full *URI*.
Also, if there is a `default namespace
<https://www.w3.org/TR/xml-names/#defaulting>`__,
that full URI gets prepended to all of the non-prefixed tags.

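The expansion can be observed by printing the tags of a parsed document.
This sketch uses a one-actor document with the same two example namespace
URIs as the rest of this section:

```python
import xml.etree.ElementTree as ET

xml_text = (
    '<actors xmlns="http://people.example.com" '
    'xmlns:fictional="http://characters.example.com">'
    '<actor><fictional:character>Lancelot</fictional:character></actor>'
    '</actors>'
)
root = ET.fromstring(xml_text)

# Every tag is stored in its expanded {uri}localname form; the prefixes
# used in the source text are not kept on the elements.
tags = [elem.tag for elem in root.iter()]
```

The non-prefixed ``actors`` and ``actor`` tags pick up the default namespace
URI, while ``fictional:character`` picks up the URI bound to its prefix.
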
Here is an XML example that incorporates two namespaces, one with the
prefix "fictional" and the other serving as the default namespace:

.. code-block:: xml

    <?xml version="1.0"?>
    <actors xmlns:fictional="http://characters.example.com"
            xmlns="http://people.example.com">
        <actor>
            <name>John Cleese</name>
            <fictional:character>Lancelot</fictional:character>
            <fictional:character>Archie Leach</fictional:character>
        </actor>
        <actor>
            <name>Eric Idle</name>
            <fictional:character>Sir Robin</fictional:character>
            <fictional:character>Gunther</fictional:character>
            <fictional:character>Commander Clement</fictional:character>
        </actor>
    </actors>

One way to search and explore this XML example is to manually add the
URI to every tag or attribute in the xpath of a
:meth:`~Element.find` or :meth:`~Element.findall`::

    root = ET.fromstring(xml_text)
    for actor in root.findall('{http://people.example.com}actor'):
        name = actor.find('{http://people.example.com}name')
        print name.text
        for char in actor.findall('{http://characters.example.com}character'):
            print ' |-->', char.text


A better way to search the namespaced XML example is to create a
dictionary with your own prefixes and use those in the search functions::

    ns = {'real_person': 'http://people.example.com',
          'role': 'http://characters.example.com'}

    for actor in root.findall('real_person:actor', ns):
        name = actor.find('real_person:name', ns)
        print name.text
        for char in actor.findall('role:character', ns):
            print ' |-->', char.text

These two approaches both output::

    John Cleese
     |--> Lancelot
     |--> Archie Leach
    Eric Idle
     |--> Sir Robin
     |--> Gunther
     |--> Commander Clement


Additional resources
^^^^^^^^^^^^^^^^^^^^

See http://effbot.org/zone/element-index.htm for tutorials and links to other
docs.

.. _elementtree-xpath:

XPath support
-------------

This module provides limited support for
`XPath expressions <https://www.w3.org/TR/xpath>`_ for locating elements in a
tree.  The goal is to support a small subset of the abbreviated syntax; a full
XPath engine is outside the scope of the module.

Example
^^^^^^^

Here's an example that demonstrates some of the XPath capabilities of the
module.  We'll be using the ``countrydata`` XML document from the
:ref:`Parsing XML <elementtree-parsing-xml>` section::

   import xml.etree.ElementTree as ET

   root = ET.fromstring(countrydata)

   # Top-level elements
   root.findall(".")

   # All 'neighbor' grand-children of 'country' children of the top-level
   # elements
   root.findall("./country/neighbor")

   # Nodes with name='Singapore' that have a 'year' child
   root.findall(".//year/..[@name='Singapore']")

   # 'year' nodes that are children of nodes with name='Singapore'
   root.findall(".//*[@name='Singapore']/year")

   # All 'neighbor' nodes that are the second child of their parent
   root.findall(".//neighbor[2]")

Supported XPath syntax
^^^^^^^^^^^^^^^^^^^^^^

.. tabularcolumns:: |l|L|

+-----------------------+------------------------------------------------------+
| Syntax                | Meaning                                              |
+=======================+======================================================+
| ``tag``               | Selects all child elements with the given tag.       |
|                       | For example, ``spam`` selects all child elements     |
|                       | named ``spam``, and ``spam/egg`` selects all         |
|                       | grandchildren named ``egg`` in all children named    |
|                       | ``spam``.                                            |
+-----------------------+------------------------------------------------------+
| ``*``                 | Selects all child elements.  For example, ``*/egg``  |
|                       | selects all grandchildren named ``egg``.             |
+-----------------------+------------------------------------------------------+
| ``.``                 | Selects the current node.  This is mostly useful     |
|                       | at the beginning of the path, to indicate that it's  |
|                       | a relative path.                                     |
+-----------------------+------------------------------------------------------+
| ``//``                | Selects all subelements, on all levels beneath the   |
|                       | current element.  For example, ``.//egg`` selects    |
|                       | all ``egg`` elements in the entire tree.             |
+-----------------------+------------------------------------------------------+
| ``..``                | Selects the parent element.                          |
+-----------------------+------------------------------------------------------+
| ``[@attrib]``         | Selects all elements that have the given attribute.  |
+-----------------------+------------------------------------------------------+
| ``[@attrib='value']`` | Selects all elements for which the given attribute   |
|                       | has the given value.  The value cannot contain       |
|                       | quotes.                                              |
+-----------------------+------------------------------------------------------+
| ``[tag]``             | Selects all elements that have a child named         |
|                       | ``tag``.  Only immediate children are supported.     |
+-----------------------+------------------------------------------------------+
| ``[tag='text']``      | Selects all elements that have a child named         |
|                       | ``tag`` whose complete text content, including       |
|                       | descendants, equals the given ``text``.              |
+-----------------------+------------------------------------------------------+
| ``[position]``        | Selects all elements that are located at the given   |
|                       | position.  The position can be either an integer     |
|                       | (1 is the first position), the expression ``last()`` |
|                       | (for the last position), or a position relative to   |
|                       | the last position (e.g. ``last()-1``).               |
+-----------------------+------------------------------------------------------+

Predicates (expressions within square brackets) must be preceded by a tag
name, an asterisk, or another predicate.  ``position`` predicates must be
preceded by a tag name.

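A few of the predicates from the table can be tried on a reduced version of
the country data.  This is only a sketch: the document below is trimmed down
for illustration, and the variable names are arbitrary:

```python
import xml.etree.ElementTree as ET

# Trimmed-down country data (an assumption made for brevity).
root = ET.fromstring(
    "<data>"
    "<country name='Liechtenstein'><rank>1</rank>"
    "<neighbor name='Austria'/><neighbor name='Switzerland'/></country>"
    "<country name='Singapore'><rank>4</rank>"
    "<neighbor name='Malaysia'/></country>"
    "</data>"
)

def names(path):
    """Return the 'name' attribute of every element matching path."""
    return [elem.get("name") for elem in root.findall(path)]

with_attr = names("country[@name]")               # [@attrib]
ranked_four = names("country[rank='4']")          # [tag='text']
last_neighbors = names("country/neighbor[last()]")  # [position], last()
second_neighbors = names(".//neighbor[2]")        # [position], integer
```

Position predicates are evaluated per parent, so ``neighbor[last()]`` yields
one element for each country, while ``neighbor[2]`` only matches in countries
that have at least two ``neighbor`` children.
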
Reference
---------

.. _elementtree-functions:

Functions
^^^^^^^^^


.. function:: Comment(text=None)

   Comment element factory.  This factory function creates a special element
   that will be serialized as an XML comment by the standard serializer.  The
   comment string can be either a bytestring or a Unicode string.  *text* is a
   string containing the comment string.  Returns an element instance
   representing a comment.


.. function:: dump(elem)

   Writes an element tree or element structure to ``sys.stdout``.  This
   function should be used for debugging only.

   The exact output format is implementation dependent.  In this version, it's
   written as an ordinary XML file.

   *elem* is an element tree or an individual element.


.. function:: fromstring(text)

   Parses an XML section from a string constant.  Same as :func:`XML`.  *text*
   is a string containing XML data.  Returns an :class:`Element` instance.


.. function:: fromstringlist(sequence, parser=None)

   Parses an XML document from a sequence of string fragments.  *sequence* is a
   list or other sequence containing XML data fragments.  *parser* is an
   optional parser instance.  If not given, the standard :class:`XMLParser`
   parser is used.  Returns an :class:`Element` instance.

   .. versionadded:: 2.7


.. function:: iselement(element)

   Checks if an object appears to be a valid element object.  *element* is an
   element instance.  Returns a true value if this is an element object.


.. function:: iterparse(source, events=None, parser=None)

   Parses an XML section into an element tree incrementally, and reports what's
   going on to the user.  *source* is a filename or file object containing XML
   data.  *events* is a list of events to report back.  If omitted, only "end"
   events are reported.  *parser* is an optional parser instance.  If not
   given, the standard :class:`XMLParser` parser is used.  *parser* is not
   supported by ``cElementTree``.  Returns an :term:`iterator` providing
   ``(event, elem)`` pairs.

   .. note::

      :func:`iterparse` only guarantees that it has seen the ">"
      character of a starting tag when it emits a "start" event, so the
      attributes are defined, but the contents of the text and tail attributes
      are undefined at that point.  The same applies to the element children;
      they may or may not be present.

      If you need a fully populated element, look for "end" events instead.


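As a rough sketch of how :func:`iterparse` is typically used, the loop below
acts only on "end" events, where the element is fully populated, and clears
each element once it has been processed.  The input document and the use of
``io.BytesIO`` in place of a real file are assumptions made to keep the
example self-contained:

```python
import io
import xml.etree.ElementTree as ET

# Made-up input; in practice source would usually be a file name.
xml_bytes = b"<data><country name='Panama'/><country name='Singapore'/></data>"

names = []
events_seen = []
for event, elem in ET.iterparse(io.BytesIO(xml_bytes), events=("start", "end")):
    events_seen.append(event)
    # Only rely on element contents at "end" events.
    if event == "end" and elem.tag == "country":
        names.append(elem.get("name"))
        elem.clear()  # drop the element's contents once they are no longer needed
```
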
.. function:: parse(source, parser=None)

   Parses an XML section into an element tree.  *source* is a filename or file
   object containing XML data.  *parser* is an optional parser instance.  If
   not given, the standard :class:`XMLParser` parser is used.  Returns an
   :class:`ElementTree` instance.


.. function:: ProcessingInstruction(target, text=None)

   PI element factory.  This factory function creates a special element that
   will be serialized as an XML processing instruction.  *target* is a string
   containing the PI target.  *text* is a string containing the PI contents, if
   given.  Returns an element instance, representing a processing instruction.


.. function:: register_namespace(prefix, uri)

   Registers a namespace prefix.  The registry is global, and any existing
   mapping for either the given prefix or the namespace URI will be removed.
   *prefix* is a namespace prefix.  *uri* is a namespace uri.  Tags and
   attributes in this namespace will be serialized with the given prefix, if at
   all possible.

   .. versionadded:: 2.7


.. function:: SubElement(parent, tag, attrib={}, **extra)

   Subelement factory.  This function creates an element instance, and appends
   it to an existing element.

   The element name, attribute names, and attribute values can be either
   bytestrings or Unicode strings.  *parent* is the parent element.  *tag* is
   the subelement name.  *attrib* is an optional dictionary, containing element
   attributes.  *extra* contains additional attributes, given as keyword
   arguments.  Returns an element instance.


.. function:: tostring(element, encoding="us-ascii", method="xml")

   Generates a string representation of an XML element, including all
   subelements.  *element* is an :class:`Element` instance.  *encoding* [1]_ is
   the output encoding (default is US-ASCII).  *method* is either ``"xml"``,
   ``"html"`` or ``"text"`` (default is ``"xml"``).  Returns an encoded string
   containing the XML data.


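A small sketch of :func:`tostring` with the default ``"xml"`` method and with
the ``"text"`` method, followed by a round trip through :func:`fromstring`.
The element names are arbitrary; the results are compared against byte
string literals because the function returns encoded data:

```python
import xml.etree.ElementTree as ET

a = ET.Element("a")
ET.SubElement(a, "b").text = "x"

as_xml = ET.tostring(a)                  # default encoding and method
as_text = ET.tostring(a, method="text")  # text content only, no markup

# The serialized form can be parsed back into an equivalent element.
round_trip = ET.fromstring(as_xml)
```
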
.. function:: tostringlist(element, encoding="us-ascii", method="xml")

   Generates a string representation of an XML element, including all
   subelements.  *element* is an :class:`Element` instance.  *encoding* [1]_ is
   the output encoding (default is US-ASCII).  *method* is either ``"xml"``,
   ``"html"`` or ``"text"`` (default is ``"xml"``).  Returns a list of encoded
   strings containing the XML data.  It does not guarantee any specific
   sequence, except that ``"".join(tostringlist(element)) ==
   tostring(element)``.

   .. versionadded:: 2.7


.. function:: XML(text, parser=None)

   Parses an XML section from a string constant.  This function can be used to
   embed "XML literals" in Python code.  *text* is a string containing XML
   data.  *parser* is an optional parser instance.  If not given, the standard
   :class:`XMLParser` parser is used.  Returns an :class:`Element` instance.


.. function:: XMLID(text, parser=None)

   Parses an XML section from a string constant, and also returns a dictionary
   which maps from element IDs to elements.  *text* is a string containing XML
   data.  *parser* is an optional parser instance.  If not given, the standard
   :class:`XMLParser` parser is used.  Returns a tuple containing an
   :class:`Element` instance and a dictionary.


.. _elementtree-element-objects:

Element Objects
^^^^^^^^^^^^^^^

.. class:: Element(tag, attrib={}, **extra)

   Element class.  This class defines the Element interface, and provides a
   reference implementation of this interface.

   The element name, attribute names, and attribute values can be either
   bytestrings or Unicode strings.  *tag* is the element name.  *attrib* is
   an optional dictionary, containing element attributes.  *extra* contains
   additional attributes, given as keyword arguments.


   .. attribute:: tag

      A string identifying what kind of data this element represents (the
      element type, in other words).


   .. attribute:: text
                  tail

      These attributes can be used to hold additional data associated with
      the element.  Their values are usually strings but may be any
      application-specific object.  If the element is created from
      an XML file, the *text* attribute holds either the text between
      the element's start tag and its first child or end tag, or ``None``, and
      the *tail* attribute holds either the text between the element's
      end tag and the next tag, or ``None``.  For the XML data

      .. code-block:: xml

         <a><b>1<c>2<d/>3</c></b>4</a>

      the *a* element has ``None`` for both *text* and *tail* attributes,
      the *b* element has *text* ``"1"`` and *tail* ``"4"``,
      the *c* element has *text* ``"2"`` and *tail* ``None``,
      and the *d* element has *text* ``None`` and *tail* ``"3"``.

      To collect the inner text of an element, see :meth:`itertext`, for
      example ``"".join(element.itertext())``.

      Applications may store arbitrary objects in these attributes.


   .. attribute:: attrib

      A dictionary containing the element's attributes.  Note that while the
      *attrib* value is always a real mutable Python dictionary, an ElementTree
      implementation may choose to use another internal representation, and
      create the dictionary only if someone asks for it.  To take advantage of
      such implementations, use the dictionary methods below whenever possible.

   The following dictionary-like methods work on the element attributes.


   .. method:: clear()

      Resets an element.  This function removes all subelements, clears all
      attributes, and sets the text and tail attributes to ``None``.


   .. method:: get(key, default=None)

      Gets the element attribute named *key*.

      Returns the attribute value, or *default* if the attribute was not found.


   .. method:: items()

      Returns the element attributes as a sequence of (name, value) pairs.  The
      attributes are returned in an arbitrary order.


   .. method:: keys()

      Returns the element's attribute names as a list.  The names are returned
      in an arbitrary order.


   .. method:: set(key, value)

      Set the attribute *key* on the element to *value*.

   The following methods work on the element's children (subelements).


   .. method:: append(subelement)

      Adds the element *subelement* to the end of this element's internal list
      of subelements.


   .. method:: extend(subelements)

      Appends *subelements* from a sequence object with zero or more elements.
      Raises :exc:`AssertionError` if a subelement is not a valid object.

      .. versionadded:: 2.7


   .. method:: find(match)

      Finds the first subelement matching *match*.  *match* may be a tag name
      or path.  Returns an element instance or ``None``.


   .. method:: findall(match)

      Finds all matching subelements, by tag name or path.  Returns a list
      containing all matching elements in document order.


   .. method:: findtext(match, default=None)

      Finds text for the first subelement matching *match*.  *match* may be
      a tag name or path.  Returns the text content of the first matching
      element, or *default* if no element was found.  Note that if the matching
      element has no text content an empty string is returned.


   .. method:: getchildren()

      .. deprecated:: 2.7
         Use ``list(elem)`` or iteration.


   .. method:: getiterator(tag=None)

      .. deprecated:: 2.7
         Use method :meth:`Element.iter` instead.


   .. method:: insert(index, element)

      Inserts a subelement at the given position in this element.


   .. method:: iter(tag=None)

      Creates a tree :term:`iterator` with the current element as the root.
      The iterator iterates over this element and all elements below it, in
      document (depth first) order.  If *tag* is not ``None`` or ``'*'``, only
      elements whose tag equals *tag* are returned from the iterator.  If the
      tree structure is modified during iteration, the result is undefined.

      .. versionadded:: 2.7


   .. method:: iterfind(match)

      Finds all matching subelements, by tag name or path.  Returns an iterable
      yielding all matching elements in document order.

      .. versionadded:: 2.7


   .. method:: itertext()

      Creates a text iterator.  The iterator loops over this element and all
      subelements, in document order, and returns all inner text.

      .. versionadded:: 2.7


   .. method:: makeelement(tag, attrib)

      Creates a new element object of the same type as this element.  Do not
      call this method; use the :func:`SubElement` factory function instead.


   .. method:: remove(subelement)

      Removes *subelement* from the element.  Unlike the find\* methods this
      method compares elements based on the instance identity, not on tag value
      or contents.

   :class:`Element` objects also support the following sequence type methods
   for working with subelements: :meth:`~object.__delitem__`,
   :meth:`~object.__getitem__`, :meth:`~object.__setitem__`,
   :meth:`~object.__len__`.

   Caution: Elements with no subelements will test as ``False``.  This behavior
   will change in future versions.  Use an explicit ``len(elem)`` or ``elem is
   None`` test instead. ::

     element = root.find('foo')

     if not element:  # careful!
         print "element not found, or element has no subelements"

     if element is None:
         print "element not found"


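The *text* and *tail* values described for the ``<a><b>1<c>2<d/>3</c></b>4</a>``
example above, and the explicit ``is None`` and ``len()`` tests recommended in
the caution, can be checked with a short sketch.  The variable names are
arbitrary, and ``"nope"`` is simply a tag that does not occur in the data:

```python
import xml.etree.ElementTree as ET

a = ET.fromstring("<a><b>1<c>2<d/>3</c></b>4</a>")
b = a.find("b")
c = b.find("c")
d = c.find("d")

# Map each tag to its (text, tail) pair.
text_and_tail = {elem.tag: (elem.text, elem.tail) for elem in (a, b, c, d)}

# itertext() yields text and tail strings in document order.
inner_text = "".join(a.itertext())

# Distinguish "not found" from "found, but without children" explicitly
# instead of relying on the truth value of an element.
is_missing = a.find("nope") is None
d_has_children = len(d) > 0
```
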
.. _elementtree-elementtree-objects:

ElementTree Objects
^^^^^^^^^^^^^^^^^^^


.. class:: ElementTree(element=None, file=None)

   ElementTree wrapper class.  This class represents an entire element
   hierarchy, and adds some extra support for serialization to and from
   standard XML.

   *element* is the root element.  The tree is initialized with the contents
   of the XML *file* if given.


   .. method:: _setroot(element)

      Replaces the root element for this tree.  This discards the current
      contents of the tree, and replaces it with the given element.  Use with
      care.  *element* is an element instance.


   .. method:: find(match)

      Same as :meth:`Element.find`, starting at the root of the tree.


   .. method:: findall(match)

      Same as :meth:`Element.findall`, starting at the root of the tree.


   .. method:: findtext(match, default=None)

      Same as :meth:`Element.findtext`, starting at the root of the tree.


   .. method:: getiterator(tag=None)

      .. deprecated:: 2.7
         Use method :meth:`ElementTree.iter` instead.


   .. method:: getroot()

      Returns the root element for this tree.


   .. method:: iter(tag=None)

      Creates and returns a tree iterator for the root element.  The iterator
      loops over all elements in this tree, in document order.  *tag* is the tag
      to look for (default is to return all elements).


   .. method:: iterfind(match)

      Finds all matching subelements, by tag name or path.  Same as
      ``getroot().iterfind(match)``.  Returns an iterable yielding all matching
      elements in document order.

      .. versionadded:: 2.7


   .. method:: parse(source, parser=None)

      Loads an external XML section into this element tree.  *source* is a file
      name or file object.  *parser* is an optional parser instance.  If not
      given, the standard :class:`XMLParser` parser is used.  Returns the
      section root element.


   .. method:: write(file, encoding="us-ascii", xml_declaration=None, \
                     default_namespace=None, method="xml")

      Writes the element tree to a file, as XML.  *file* is a file name, or a
      file object opened for writing.  *encoding* [1]_ is the output encoding
      (default is US-ASCII).  *xml_declaration* controls whether an XML
      declaration is added to the file.  Use ``False`` for never, ``True`` for
      always, and ``None`` for only if the encoding is not US-ASCII or UTF-8
      (default is ``None``).  *default_namespace* sets the default XML
      namespace (for "xmlns").  *method* is either ``"xml"``, ``"html"`` or
      ``"text"`` (default is ``"xml"``).
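
As a minimal sketch of how *encoding* and *xml_declaration* interact, the
snippet below serializes a small tree into in-memory buffers instead of files.
The element names and the use of ``io.BytesIO`` are assumptions made for
illustration only; any file object opened for binary writing should behave the
same way.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical two-element tree, used only for illustration.
root = ET.Element("greeting")
ET.SubElement(root, "to").text = "world"
tree = ET.ElementTree(root)

# Default settings: US-ASCII output, so no XML declaration is written.
plain = io.BytesIO()
tree.write(plain)

# Ask for a declaration explicitly and select UTF-8 as the output encoding.
declared = io.BytesIO()
tree.write(declared, encoding="utf-8", xml_declaration=True)

plain_bytes = plain.getvalue()
declared_bytes = declared.getvalue()
```

With these arguments, ``plain_bytes`` should hold only the serialized elements,
while ``declared_bytes`` should start with an ``<?xml ...?>`` declaration.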

This is the XML file that is going to be manipulated::

    <html>
        <head>
            <title>Example page</title>
        </head>
        <body>
            <p>Moved to <a href="http://example.org/">example.org</a>
            or <a href="http://example.com/">example.com</a>.</p>
        </body>
    </html>

Example of changing the attribute "target" of every link in the first
paragraph::

    >>> from xml.etree.ElementTree import ElementTree
    >>> tree = ElementTree()
    >>> tree.parse("index.xhtml")
    <Element 'html' at 0xb77e6fac>
    >>> p = tree.find("body/p")     # Finds first occurrence of tag p in body
    >>> p
    <Element 'p' at 0xb77ec26c>
    >>> links = list(p.iter("a"))   # Returns list of all links
    >>> links
    [<Element 'a' at 0xb77ec2ac>, <Element 'a' at 0xb77ec1cc>]
    >>> for i in links:             # Iterates through all found links
    ...     i.attrib["target"] = "blank"
    ...
    >>> tree.write("output.xhtml")
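
The query methods of :class:`ElementTree` can be tried on the same document
without touching the file system.  The following sketch parses the XML from a
string (the ``XHTML`` constant is an assumption standing in for
``index.xhtml``) and uses :meth:`findtext`, :meth:`findall`, :meth:`iterfind`
and :meth:`iter`, all starting at the root of the tree.

```python
import xml.etree.ElementTree as ET

# Same document as above, held in a string so that no file is needed.
XHTML = """<html>
    <head>
        <title>Example page</title>
    </head>
    <body>
        <p>Moved to <a href="http://example.org/">example.org</a>
        or <a href="http://example.com/">example.com</a>.</p>
    </body>
</html>"""

tree = ET.ElementTree(ET.fromstring(XHTML))

# findtext() returns the text of the first match, starting at the root.
title = tree.findtext("head/title")

# findall() returns a list; iterfind() yields the same elements lazily.
hrefs = [a.get("href") for a in tree.findall("body/p/a")]
lazy_hrefs = [a.get("href") for a in tree.iterfind("body/p/a")]

# iter() walks the whole tree in document order, beginning with the root.
tags = [elem.tag for elem in tree.iter()]
```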

.. _elementtree-qname-objects:

QName Objects
^^^^^^^^^^^^^


.. class:: QName(text_or_uri, tag=None)

   QName wrapper.  This can be used to wrap a QName attribute value, in order
   to get proper namespace handling on output.  *text_or_uri* is a string
   containing the QName value, in the form ``{uri}local``, or, if the *tag*
   argument is given, the URI part of a QName.  If *tag* is given, the first
   argument is interpreted as a URI, and *tag* is interpreted as a local name.
   :class:`QName` instances are opaque.
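
A short sketch of both constructor forms, and of a :class:`QName` used as an
attribute value, is given below.  The namespace URI and the element and
attribute names are hypothetical; the generated prefix (``ns0`` here) is chosen
by the serializer.

```python
import xml.etree.ElementTree as ET

NS = "http://example.org/ns"   # hypothetical namespace URI

# Two equivalent spellings of the same qualified name.
combined = ET.QName("{%s}item" % NS)
split = ET.QName(NS, "item")

# Wrapping an attribute value in QName lets the serializer rewrite it
# with a namespace prefix instead of emitting the raw {uri}local text.
root = ET.Element("{%s}root" % NS)
root.set("kind", split)
xml_bytes = ET.tostring(root)
```

Without the wrapper, the attribute value would be written out literally as
``{http://example.org/ns}item``.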


.. _elementtree-treebuilder-objects:

TreeBuilder Objects
^^^^^^^^^^^^^^^^^^^


.. class:: TreeBuilder(element_factory=None)

   Generic element structure builder.  This builder converts a sequence of
   start, data, and end method calls to a well-formed element structure.  You
   can use this class to build an element structure using a custom XML parser,
   or a parser for some other XML-like format.  The *element_factory* is called
   to create new :class:`Element` instances when given.


   .. method:: close()

      Flushes the builder buffers, and returns the toplevel document
      element.  Returns an :class:`Element` instance.


   .. method:: data(data)

      Adds text to the current element.  *data* is a string.  This should be
      either a bytestring, or a Unicode string.


   .. method:: end(tag)

      Closes the current element.  *tag* is the element name.  Returns the
      closed element.


   .. method:: start(tag, attrs)

      Opens a new element.  *tag* is the element name.  *attrs* is a dictionary
      containing element attributes.  Returns the opened element.


   In addition, a custom :class:`TreeBuilder` object can provide the
   following method:

   .. method:: doctype(name, pubid, system)

      Handles a doctype declaration.  *name* is the doctype name.  *pubid* is
      the public identifier.  *system* is the system identifier.  This method
      does not exist on the default :class:`TreeBuilder` class.

      .. versionadded:: 2.7
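
The call sequence that :class:`TreeBuilder` expects can be sketched by driving
it by hand, the way a custom parser would.  The tag names, attribute and text
below are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

builder = ET.TreeBuilder()

# Each start() call must be balanced by an end() call with the same tag.
builder.start("root", {})
builder.start("child", {"id": "1"})
builder.data("hello")          # becomes the text of the open element
builder.end("child")
builder.end("root")

# close() returns the top-level element that was built.
root = builder.close()
child = root[0]
```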


.. _elementtree-xmlparser-objects:

XMLParser Objects
^^^^^^^^^^^^^^^^^


.. class:: XMLParser(html=0, target=None, encoding=None)

   :class:`Element` structure builder for XML source data, based on the expat
   parser.  *html* is a flag for predefined HTML entities; it is not supported
   by the current implementation.  *target* is the target object.  If omitted,
   the builder uses an instance of the standard :class:`TreeBuilder` class.
   *encoding* [1]_ is optional.  If given, the value overrides the encoding
   specified in the XML file.


   .. method:: close()

      Finishes feeding data to the parser.  Returns an element structure.


   .. method:: doctype(name, pubid, system)

      .. deprecated:: 2.7
         Define the :meth:`TreeBuilder.doctype` method on a custom TreeBuilder
         target.


   .. method:: feed(data)

      Feeds data to the parser.  *data* is encoded data.

:meth:`XMLParser.feed` calls *target*\'s :meth:`start` method
for each opening tag and its :meth:`end` method for each closing tag;
character data is passed to its :meth:`data` method.  :meth:`XMLParser.close`
calls *target*\'s :meth:`close` method.
:class:`XMLParser` is not limited to building tree structures.
The following example computes the maximum depth of an XML document::

    >>> from xml.etree.ElementTree import XMLParser
    >>> class MaxDepth:                     # The target object of the parser
    ...     maxDepth = 0
    ...     depth = 0
    ...     def start(self, tag, attrib):   # Called for each opening tag.
    ...         self.depth += 1
    ...         if self.depth > self.maxDepth:
    ...             self.maxDepth = self.depth
    ...     def end(self, tag):             # Called for each closing tag.
    ...         self.depth -= 1
    ...     def data(self, data):
    ...         pass            # We do not need to do anything with data.
    ...     def close(self):    # Called when all data has been parsed.
    ...         return self.maxDepth
    ...
    >>> target = MaxDepth()
    >>> parser = XMLParser(target=target)
    >>> exampleXml = """
    ... <a>
    ...   <b>
    ...   </b>
    ...   <b>
    ...     <c>
    ...       <d>
    ...       </d>
    ...     </c>
    ...   </b>
    ... </a>"""
    >>> parser.feed(exampleXml)
    >>> parser.close()
    4
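
Because :meth:`XMLParser.feed` may be called repeatedly, the input does not
have to arrive in one piece.  The sketch below feeds a document in arbitrary
chunks to a hypothetical ``TagCollector`` target that records tag names as
elements open; the class name and the chunk boundaries are assumptions chosen
for illustration.

```python
import xml.etree.ElementTree as ET

class TagCollector(object):
    """Hypothetical target that records tag names in opening order."""

    def __init__(self):
        self.tags = []

    def start(self, tag, attrib):   # Called for each opening tag.
        self.tags.append(tag)

    def end(self, tag):             # Called for each closing tag.
        pass

    def data(self, data):           # Character data is ignored here.
        pass

    def close(self):                # Value returned by parser.close().
        return self.tags

parser = ET.XMLParser(target=TagCollector())

# The chunk boundaries are arbitrary; a tag may be split across chunks.
for chunk in ("<a><b", "/><c>te", "xt</c></a>"):
    parser.feed(chunk)

tags = parser.close()
```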


.. rubric:: Footnotes

.. [1] The encoding string included in XML output should conform to the
   appropriate standards.  For example, "UTF-8" is valid, but "UTF8" is
   not.  See https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
   and https://www.iana.org/assignments/character-sets/character-sets.xhtml.
