:mod:`xml.etree.ElementTree` --- The ElementTree XML API
========================================================

.. module:: xml.etree.ElementTree
   :synopsis: Implementation of the ElementTree API.
.. moduleauthor:: Fredrik Lundh <fredrik@pythonware.com>


.. versionadded:: 2.5

**Source code:** :source:`Lib/xml/etree/ElementTree.py`

--------------

The :class:`Element` type is a flexible container object, designed to store
hierarchical data structures in memory. The type can be described as a cross
between a list and a dictionary.


.. warning::

   The :mod:`xml.etree.ElementTree` module is not secure against
   maliciously constructed data. If you need to parse untrusted or
   unauthenticated data see :ref:`xml-vulnerabilities`.


Each element has a number of properties associated with it:

* a tag which is a string identifying what kind of data this element represents
  (the element type, in other words).

* a number of attributes, stored in a Python dictionary.

* a text string.

* an optional tail string.

* a number of child elements, stored in a Python sequence.

To create an element instance, use the :class:`Element` constructor or the
:func:`SubElement` factory function.

The :class:`ElementTree` class can be used to wrap an element structure, and
convert it from and to XML.

A C implementation of this API is available as :mod:`xml.etree.cElementTree`.

See http://effbot.org/zone/element-index.htm for tutorials and links to other
docs. Fredrik Lundh's page is also the location of the development version of
xml.etree.ElementTree.

.. versionchanged:: 2.7
   The ElementTree API is updated to 1.3. For more information, see
   `Introducing ElementTree 1.3
   <http://effbot.org/zone/elementtree-13-intro.htm>`_.

Tutorial
--------

This is a short tutorial for using :mod:`xml.etree.ElementTree` (``ET`` for
short).
The goal is to demonstrate some of the building blocks and basic
concepts of the module.

XML tree and elements
^^^^^^^^^^^^^^^^^^^^^

XML is an inherently hierarchical data format, and the most natural way to
represent it is with a tree. ``ET`` has two classes for this purpose:
:class:`ElementTree` represents the whole XML document as a tree, and
:class:`Element` represents a single node in this tree. Interactions with
the whole document (reading and writing to/from files) are usually done
on the :class:`ElementTree` level. Interactions with a single XML element
and its sub-elements are done on the :class:`Element` level.

.. _elementtree-parsing-xml:

Parsing XML
^^^^^^^^^^^

We'll be using the following XML document as the sample data for this section:

.. code-block:: xml

   <?xml version="1.0"?>
   <data>
       <country name="Liechtenstein">
           <rank>1</rank>
           <year>2008</year>
           <gdppc>141100</gdppc>
           <neighbor name="Austria" direction="E"/>
           <neighbor name="Switzerland" direction="W"/>
       </country>
       <country name="Singapore">
           <rank>4</rank>
           <year>2011</year>
           <gdppc>59900</gdppc>
           <neighbor name="Malaysia" direction="N"/>
       </country>
       <country name="Panama">
           <rank>68</rank>
           <year>2011</year>
           <gdppc>13600</gdppc>
           <neighbor name="Costa Rica" direction="W"/>
           <neighbor name="Colombia" direction="E"/>
       </country>
   </data>

We have a number of ways to import the data. Reading the file from disk::

   import xml.etree.ElementTree as ET
   tree = ET.parse('country_data.xml')
   root = tree.getroot()

Reading the data from a string::

   root = ET.fromstring(country_data_as_string)

:func:`fromstring` parses XML from a string directly into an :class:`Element`,
which is the root element of the parsed tree. Other parsing functions may
create an :class:`ElementTree` instead; :func:`parse`, for example, does.
Check the documentation to be sure.
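
The difference between the two entry points can be sketched as follows. This is
a minimal illustration only: the shortened XML snippet and the ``io.BytesIO``
buffer standing in for a file on disk are assumptions, not part of the sample
data above.

.. code-block:: python

   import io
   import xml.etree.ElementTree as ET

   # Hypothetical, shortened stand-in for the sample document.
   xml_bytes = b'<data><country name="Panama"><rank>68</rank></country></data>'

   # fromstring() returns the root Element directly.
   root_from_string = ET.fromstring(xml_bytes)

   # parse() accepts a filename or file object and returns an ElementTree;
   # getroot() then gives access to the root Element.
   tree = ET.parse(io.BytesIO(xml_bytes))
   root_from_file = tree.getroot()

   print(root_from_string.tag)
   print(isinstance(tree, ET.ElementTree))
   print(root_from_file[0].get('name'))

Both routes end with an :class:`Element` for the ``data`` root; only
:func:`parse` goes through an intermediate :class:`ElementTree` object.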

As an :class:`Element`, ``root`` has a tag and a dictionary of attributes::

   >>> root.tag
   'data'
   >>> root.attrib
   {}

It also has child nodes over which we can iterate::

   >>> for child in root:
   ...     print child.tag, child.attrib
   ...
   country {'name': 'Liechtenstein'}
   country {'name': 'Singapore'}
   country {'name': 'Panama'}

Children are nested, and we can access specific child nodes by index::

   >>> root[0][1].text
   '2008'

Finding interesting elements
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

:class:`Element` has some useful methods that help iterate recursively over all
the sub-tree below it (its children, their children, and so on). For example,
:meth:`Element.iter`::

   >>> for neighbor in root.iter('neighbor'):
   ...     print neighbor.attrib
   ...
   {'name': 'Austria', 'direction': 'E'}
   {'name': 'Switzerland', 'direction': 'W'}
   {'name': 'Malaysia', 'direction': 'N'}
   {'name': 'Costa Rica', 'direction': 'W'}
   {'name': 'Colombia', 'direction': 'E'}

:meth:`Element.findall` finds only elements with a tag which are direct
children of the current element. :meth:`Element.find` finds the *first* child
with a particular tag, and :attr:`Element.text` accesses the element's text
content. :meth:`Element.get` accesses the element's attributes::

   >>> for country in root.findall('country'):
   ...     rank = country.find('rank').text
   ...     name = country.get('name')
   ...     print name, rank
   ...
   Liechtenstein 1
   Singapore 4
   Panama 68

More sophisticated specification of which elements to look for is possible by
using :ref:`XPath <elementtree-xpath>`.

Modifying an XML File
^^^^^^^^^^^^^^^^^^^^^

:class:`ElementTree` provides a simple way to build XML documents and write them to files.
The :meth:`ElementTree.write` method serves this purpose.
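
As a rough sketch of that workflow, the following builds a tiny document in
memory and writes it out. The element names and the ``io.BytesIO`` buffer used
in place of a real file are illustrative assumptions.

.. code-block:: python

   import io
   import xml.etree.ElementTree as ET

   # Hypothetical two-element document built in memory.
   root = ET.Element('data')
   country = ET.SubElement(root, 'country', name='Panama')
   country.text = 'example'

   # Wrap the root element so that it can be serialized as a document.
   tree = ET.ElementTree(root)

   # write() accepts a filename or a file object opened for writing;
   # a BytesIO buffer stands in for a real file here.
   buf = io.BytesIO()
   tree.write(buf)
   print(buf.getvalue())

Passing a filename such as ``'output.xml'`` instead of the buffer writes the
same bytes to disk.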

Once created, an :class:`Element` object may be manipulated by directly changing
its fields (such as :attr:`Element.text`), adding and modifying attributes
(:meth:`Element.set` method), as well as adding new children (for example
with :meth:`Element.append`).

Let's say we want to add one to each country's rank, and add an ``updated``
attribute to the rank element::

   >>> for rank in root.iter('rank'):
   ...     new_rank = int(rank.text) + 1
   ...     rank.text = str(new_rank)
   ...     rank.set('updated', 'yes')
   ...
   >>> tree.write('output.xml')

Our XML now looks like this:

.. code-block:: xml

   <?xml version="1.0"?>
   <data>
       <country name="Liechtenstein">
           <rank updated="yes">2</rank>
           <year>2008</year>
           <gdppc>141100</gdppc>
           <neighbor name="Austria" direction="E"/>
           <neighbor name="Switzerland" direction="W"/>
       </country>
       <country name="Singapore">
           <rank updated="yes">5</rank>
           <year>2011</year>
           <gdppc>59900</gdppc>
           <neighbor name="Malaysia" direction="N"/>
       </country>
       <country name="Panama">
           <rank updated="yes">69</rank>
           <year>2011</year>
           <gdppc>13600</gdppc>
           <neighbor name="Costa Rica" direction="W"/>
           <neighbor name="Colombia" direction="E"/>
       </country>
   </data>

We can remove elements using :meth:`Element.remove`. Let's say we want to
remove all countries with a rank higher than 50::

   >>> for country in root.findall('country'):
   ...     rank = int(country.find('rank').text)
   ...     if rank > 50:
   ...         root.remove(country)
   ...
   >>> tree.write('output.xml')

Our XML now looks like this:

.. code-block:: xml

   <?xml version="1.0"?>
   <data>
       <country name="Liechtenstein">
           <rank updated="yes">2</rank>
           <year>2008</year>
           <gdppc>141100</gdppc>
           <neighbor name="Austria" direction="E"/>
           <neighbor name="Switzerland" direction="W"/>
       </country>
       <country name="Singapore">
           <rank updated="yes">5</rank>
           <year>2011</year>
           <gdppc>59900</gdppc>
           <neighbor name="Malaysia" direction="N"/>
       </country>
   </data>

Building XML documents
^^^^^^^^^^^^^^^^^^^^^^

The :func:`SubElement` function also provides a convenient way to create new
sub-elements for a given element::

   >>> a = ET.Element('a')
   >>> b = ET.SubElement(a, 'b')
   >>> c = ET.SubElement(a, 'c')
   >>> d = ET.SubElement(c, 'd')
   >>> ET.dump(a)
   <a><b /><c><d /></c></a>

Parsing XML with Namespaces
^^^^^^^^^^^^^^^^^^^^^^^^^^^

If the XML input has `namespaces
<https://en.wikipedia.org/wiki/XML_namespace>`__, tags and attributes
with prefixes in the form ``prefix:sometag`` get expanded to
``{uri}sometag`` where the *prefix* is replaced by the full *URI*.
Also, if there is a `default namespace
<https://www.w3.org/TR/xml-names/#defaulting>`__,
that full URI gets prepended to all of the non-prefixed tags.

Here is an XML example that incorporates two namespaces, one with the
prefix "fictional" and the other serving as the default namespace:

.. code-block:: xml

   <?xml version="1.0"?>
   <actors xmlns:fictional="http://characters.example.com"
           xmlns="http://people.example.com">
       <actor>
           <name>John Cleese</name>
           <fictional:character>Lancelot</fictional:character>
           <fictional:character>Archie Leach</fictional:character>
       </actor>
       <actor>
           <name>Eric Idle</name>
           <fictional:character>Sir Robin</fictional:character>
           <fictional:character>Gunther</fictional:character>
           <fictional:character>Commander Clement</fictional:character>
       </actor>
   </actors>

One way to search and explore this XML example is to manually add the
URI to every tag or attribute in the XPath of a
:meth:`~Element.find` or :meth:`~Element.findall`::

   root = ET.fromstring(xml_text)
   for actor in root.findall('{http://people.example.com}actor'):
       name = actor.find('{http://people.example.com}name')
       print name.text
       for char in actor.findall('{http://characters.example.com}character'):
           print ' |-->', char.text


A better way to search the namespaced XML example is to create a
dictionary with your own prefixes and use those in the search functions::

   ns = {'real_person': 'http://people.example.com',
         'role': 'http://characters.example.com'}

   for actor in root.findall('real_person:actor', ns):
       name = actor.find('real_person:name', ns)
       print name.text
       for char in actor.findall('role:character', ns):
           print ' |-->', char.text

These two approaches both output::

   John Cleese
    |--> Lancelot
    |--> Archie Leach
   Eric Idle
    |--> Sir Robin
    |--> Gunther
    |--> Commander Clement


Additional resources
^^^^^^^^^^^^^^^^^^^^

See http://effbot.org/zone/element-index.htm for tutorials and links to other
docs.

.. _elementtree-xpath:

XPath support
-------------

This module provides limited support for
`XPath expressions <https://www.w3.org/TR/xpath>`_ for locating elements in a
tree. The goal is to support a small subset of the abbreviated syntax; a full
XPath engine is outside the scope of the module.

Example
^^^^^^^

Here's an example that demonstrates some of the XPath capabilities of the
module. We'll be using the ``countrydata`` XML document from the
:ref:`Parsing XML <elementtree-parsing-xml>` section::

   import xml.etree.ElementTree as ET

   root = ET.fromstring(countrydata)

   # Top-level elements
   root.findall(".")

   # All 'neighbor' grand-children of 'country' children of the top-level
   # elements
   root.findall("./country/neighbor")

   # Nodes with name='Singapore' that have a 'year' child
   root.findall(".//year/..[@name='Singapore']")

   # 'year' nodes that are children of nodes with name='Singapore'
   root.findall(".//*[@name='Singapore']/year")

   # All 'neighbor' nodes that are the second child of their parent
   root.findall(".//neighbor[2]")

Supported XPath syntax
^^^^^^^^^^^^^^^^^^^^^^

.. tabularcolumns:: |l|L|

+-----------------------+------------------------------------------------------+
| Syntax                | Meaning                                              |
+=======================+======================================================+
| ``tag``               | Selects all child elements with the given tag.       |
|                       | For example, ``spam`` selects all child elements     |
|                       | named ``spam``, and ``spam/egg`` selects all         |
|                       | grandchildren named ``egg`` in all children named    |
|                       | ``spam``.                                            |
+-----------------------+------------------------------------------------------+
| ``*``                 | Selects all child elements. For example, ``*/egg``   |
|                       | selects all grandchildren named ``egg``.             |
+-----------------------+------------------------------------------------------+
| ``.``                 | Selects the current node. This is mostly useful      |
|                       | at the beginning of the path, to indicate that it's  |
|                       | a relative path.                                     |
+-----------------------+------------------------------------------------------+
| ``//``                | Selects all subelements, on all levels beneath the   |
|                       | current element. For example, ``.//egg`` selects     |
|                       | all ``egg`` elements in the entire tree.             |
+-----------------------+------------------------------------------------------+
| ``..``                | Selects the parent element.                          |
+-----------------------+------------------------------------------------------+
| ``[@attrib]``         | Selects all elements that have the given attribute.  |
+-----------------------+------------------------------------------------------+
| ``[@attrib='value']`` | Selects all elements for which the given attribute   |
|                       | has the given value. The value cannot contain        |
|                       | quotes.                                              |
+-----------------------+------------------------------------------------------+
| ``[tag]``             | Selects all elements that have a child named         |
|                       | ``tag``. Only immediate children are supported.      |
+-----------------------+------------------------------------------------------+
| ``[tag='text']``      | Selects all elements that have a child named         |
|                       | ``tag`` whose complete text content, including       |
|                       | descendants, equals the given ``text``.              |
+-----------------------+------------------------------------------------------+
| ``[position]``        | Selects all elements that are located at the given   |
|                       | position. The position can be either an integer      |
|                       | (1 is the first position), the expression ``last()`` |
|                       | (for the last position), or a position relative to   |
|                       | the last position (e.g. ``last()-1``).               |
+-----------------------+------------------------------------------------------+

Predicates (expressions within square brackets) must be preceded by a tag
name, an asterisk, or another predicate. ``position`` predicates must be
preceded by a tag name.

Reference
---------

.. _elementtree-functions:

Functions
^^^^^^^^^


.. function:: Comment(text=None)

   Comment element factory. This factory function creates a special element
   that will be serialized as an XML comment by the standard serializer. The
   comment string can be either a bytestring or a Unicode string. *text* is a
   string containing the comment string. Returns an element instance
   representing a comment.


.. function:: dump(elem)

   Writes an element tree or element structure to ``sys.stdout``. This function
   should be used for debugging only.

   The exact output format is implementation dependent. In this version, it's
   written as an ordinary XML file.

   *elem* is an element tree or an individual element.


.. function:: fromstring(text)

   Parses an XML section from a string constant. Same as :func:`XML`. *text*
   is a string containing XML data. Returns an :class:`Element` instance.


.. function:: fromstringlist(sequence, parser=None)

   Parses an XML document from a sequence of string fragments. *sequence* is a
   list or other sequence containing XML data fragments. *parser* is an
   optional parser instance. If not given, the standard :class:`XMLParser`
   parser is used. Returns an :class:`Element` instance.

   .. versionadded:: 2.7


.. function:: iselement(element)

   Checks if an object appears to be a valid element object. *element* is an
   element instance. Returns a true value if this is an element object.


.. function:: iterparse(source, events=None, parser=None)

   Parses an XML section into an element tree incrementally, and reports what's
   going on to the user. *source* is a filename or file object containing XML
   data. *events* is a list of events to report back. If omitted, only "end"
   events are reported. *parser* is an optional parser instance. If not
   given, the standard :class:`XMLParser` parser is used. *parser* is not
   supported by ``cElementTree``. Returns an :term:`iterator` providing
   ``(event, elem)`` pairs.

   .. note::

      :func:`iterparse` only guarantees that it has seen the ">"
      character of a starting tag when it emits a "start" event, so the
      attributes are defined, but the contents of the text and tail attributes
      are undefined at that point. The same applies to the element children;
      they may or may not be present.

      If you need a fully populated element, look for "end" events instead.


.. function:: parse(source, parser=None)

   Parses an XML section into an element tree. *source* is a filename or file
   object containing XML data. *parser* is an optional parser instance. If
   not given, the standard :class:`XMLParser` parser is used. Returns an
   :class:`ElementTree` instance.


.. function:: ProcessingInstruction(target, text=None)

   PI element factory. This factory function creates a special element that
   will be serialized as an XML processing instruction. *target* is a string
   containing the PI target. *text* is a string containing the PI contents, if
   given. Returns an element instance, representing a processing instruction.


.. function:: register_namespace(prefix, uri)

   Registers a namespace prefix. The registry is global, and any existing
   mapping for either the given prefix or the namespace URI will be removed.
   *prefix* is a namespace prefix. *uri* is a namespace URI.
   Tags and attributes in this namespace will be serialized with the given
   prefix, if at all possible.

   .. versionadded:: 2.7


.. function:: SubElement(parent, tag, attrib={}, **extra)

   Subelement factory. This function creates an element instance, and appends
   it to an existing element.

   The element name, attribute names, and attribute values can be either
   bytestrings or Unicode strings. *parent* is the parent element. *tag* is
   the subelement name. *attrib* is an optional dictionary, containing element
   attributes. *extra* contains additional attributes, given as keyword
   arguments. Returns an element instance.


.. function:: tostring(element, encoding="us-ascii", method="xml")

   Generates a string representation of an XML element, including all
   subelements. *element* is an :class:`Element` instance. *encoding* [1]_ is
   the output encoding (default is US-ASCII). *method* is either ``"xml"``,
   ``"html"`` or ``"text"`` (default is ``"xml"``). Returns an encoded string
   containing the XML data.


.. function:: tostringlist(element, encoding="us-ascii", method="xml")

   Generates a string representation of an XML element, including all
   subelements. *element* is an :class:`Element` instance. *encoding* [1]_ is
   the output encoding (default is US-ASCII). *method* is either ``"xml"``,
   ``"html"`` or ``"text"`` (default is ``"xml"``). Returns a list of encoded
   strings containing the XML data. It does not guarantee any specific
   sequence, except that
   ``"".join(tostringlist(element)) == tostring(element)``.

   .. versionadded:: 2.7


.. function:: XML(text, parser=None)

   Parses an XML section from a string constant. This function can be used to
   embed "XML literals" in Python code. *text* is a string containing XML
   data. *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used.
   Returns an :class:`Element` instance.


.. function:: XMLID(text, parser=None)

   Parses an XML section from a string constant, and also returns a dictionary
   which maps element IDs to elements. *text* is a string containing XML
   data. *parser* is an optional parser instance. If not given, the standard
   :class:`XMLParser` parser is used. Returns a tuple containing an
   :class:`Element` instance and a dictionary.


.. _elementtree-element-objects:

Element Objects
^^^^^^^^^^^^^^^

.. class:: Element(tag, attrib={}, **extra)

   Element class. This class defines the Element interface, and provides a
   reference implementation of this interface.

   The element name, attribute names, and attribute values can be either
   bytestrings or Unicode strings. *tag* is the element name. *attrib* is
   an optional dictionary, containing element attributes. *extra* contains
   additional attributes, given as keyword arguments.


   .. attribute:: tag

      A string identifying what kind of data this element represents (the
      element type, in other words).


   .. attribute:: text
                  tail

      These attributes can be used to hold additional data associated with
      the element. Their values are usually strings but may be any
      application-specific object. If the element is created from
      an XML file, the *text* attribute holds either the text between
      the element's start tag and its first child or end tag, or ``None``, and
      the *tail* attribute holds either the text between the element's
      end tag and the next tag, or ``None``. For the XML data

      .. code-block:: xml

         <a><b>1<c>2<d/>3</c></b>4</a>

      the *a* element has ``None`` for both *text* and *tail* attributes,
      the *b* element has *text* ``"1"`` and *tail* ``"4"``,
      the *c* element has *text* ``"2"`` and *tail* ``None``,
      and the *d* element has *text* ``None`` and *tail* ``"3"``.
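
      The values listed above can be checked with a short sketch that parses
      the same snippet. The variable names are illustrative only, and each
      printed pair is ``(text, tail)``.

      .. code-block:: python

         import xml.etree.ElementTree as ET

         a = ET.fromstring('<a><b>1<c>2<d/>3</c></b>4</a>')
         b = a[0]  # first child of a
         c = b[0]  # first child of b
         d = c[0]  # first child of c

         # Print the (text, tail) pair of every element in the snippet.
         for elem in (a, b, c, d):
             print((elem.tag, elem.text, elem.tail))

      Note that the ``"4"`` following ``</b>`` is stored as the *tail* of
      *b*, not as part of the *text* of *a*.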

      To collect the inner text of an element, see :meth:`itertext`, for
      example ``"".join(element.itertext())``.

      Applications may store arbitrary objects in these attributes.


   .. attribute:: attrib

      A dictionary containing the element's attributes. Note that while the
      *attrib* value is always a real mutable Python dictionary, an ElementTree
      implementation may choose to use another internal representation, and
      create the dictionary only if someone asks for it. To take advantage of
      such implementations, use the dictionary methods below whenever possible.

   The following dictionary-like methods work on the element attributes.


   .. method:: clear()

      Resets an element. This function removes all subelements, clears all
      attributes, and sets the text and tail attributes to ``None``.


   .. method:: get(key, default=None)

      Gets the element attribute named *key*.

      Returns the attribute value, or *default* if the attribute was not found.


   .. method:: items()

      Returns the element attributes as a sequence of (name, value) pairs. The
      attributes are returned in an arbitrary order.


   .. method:: keys()

      Returns the element's attribute names as a list. The names are returned
      in an arbitrary order.


   .. method:: set(key, value)

      Set the attribute *key* on the element to *value*.

   The following methods work on the element's children (subelements).


   .. method:: append(subelement)

      Adds the element *subelement* to the end of this element's internal list
      of subelements.


   .. method:: extend(subelements)

      Appends *subelements* from a sequence object with zero or more elements.
      Raises :exc:`AssertionError` if a subelement is not a valid object.

      .. versionadded:: 2.7


   .. method:: find(match)

      Finds the first subelement matching *match*.
      *match* may be a tag name
      or path. Returns an element instance or ``None``.


   .. method:: findall(match)

      Finds all matching subelements, by tag name or path. Returns a list
      containing all matching elements in document order.


   .. method:: findtext(match, default=None)

      Finds text for the first subelement matching *match*. *match* may be
      a tag name or path. Returns the text content of the first matching
      element, or *default* if no element was found. Note that if the matching
      element has no text content an empty string is returned.


   .. method:: getchildren()

      .. deprecated:: 2.7
         Use ``list(elem)`` or iteration.


   .. method:: getiterator(tag=None)

      .. deprecated:: 2.7
         Use method :meth:`Element.iter` instead.


   .. method:: insert(index, element)

      Inserts a subelement at the given position in this element.


   .. method:: iter(tag=None)

      Creates a tree :term:`iterator` with the current element as the root.
      The iterator iterates over this element and all elements below it, in
      document (depth first) order. If *tag* is not ``None`` or ``'*'``, only
      elements whose tag equals *tag* are returned from the iterator. If the
      tree structure is modified during iteration, the result is undefined.

      .. versionadded:: 2.7


   .. method:: iterfind(match)

      Finds all matching subelements, by tag name or path. Returns an iterable
      yielding all matching elements in document order.

      .. versionadded:: 2.7


   .. method:: itertext()

      Creates a text iterator. The iterator loops over this element and all
      subelements, in document order, and returns all inner text.

      .. versionadded:: 2.7


   .. method:: makeelement(tag, attrib)

      Creates a new element object of the same type as this element. Do not
      call this method; use the :func:`SubElement` factory function instead.


   .. method:: remove(subelement)

      Removes *subelement* from the element. Unlike the find\* methods this
      method compares elements based on the instance identity, not on tag value
      or contents.

   :class:`Element` objects also support the following sequence type methods
   for working with subelements: :meth:`~object.__delitem__`,
   :meth:`~object.__getitem__`, :meth:`~object.__setitem__`,
   :meth:`~object.__len__`.

   Caution: Elements with no subelements will test as ``False``. This behavior
   will change in future versions. Use a specific ``len(elem)`` or ``elem is
   None`` test instead. ::

     element = root.find('foo')

     if not element:  # careful!
         print "element not found, or element has no subelements"

     if element is None:
         print "element not found"


.. _elementtree-elementtree-objects:

ElementTree Objects
^^^^^^^^^^^^^^^^^^^


.. class:: ElementTree(element=None, file=None)

   ElementTree wrapper class. This class represents an entire element
   hierarchy, and adds some extra support for serialization to and from
   standard XML.

   *element* is the root element. The tree is initialized with the contents
   of the XML *file* if given.


   .. method:: _setroot(element)

      Replaces the root element for this tree. This discards the current
      contents of the tree, and replaces it with the given element. Use with
      care. *element* is an element instance.


   .. method:: find(match)

      Same as :meth:`Element.find`, starting at the root of the tree.


   .. method:: findall(match)

      Same as :meth:`Element.findall`, starting at the root of the tree.


   .. method:: findtext(match, default=None)

      Same as :meth:`Element.findtext`, starting at the root of the tree.


   .. method:: getiterator(tag=None)

      .. deprecated:: 2.7
         Use method :meth:`ElementTree.iter` instead.


   .. method:: getroot()

      Returns the root element for this tree.


   .. method:: iter(tag=None)

      Creates and returns a tree iterator for the root element. The iterator
      loops over all elements in this tree, in document order. *tag* is the tag
      to look for (default is to return all elements).


   .. method:: iterfind(match)

      Finds all matching subelements, by tag name or path. Same as
      ``getroot().iterfind(match)``. Returns an iterable yielding all matching
      elements in document order.

      .. versionadded:: 2.7


   .. method:: parse(source, parser=None)

      Loads an external XML section into this element tree. *source* is a file
      name or file object. *parser* is an optional parser instance. If not
      given, the standard :class:`XMLParser` parser is used. Returns the
      section root element.


   .. method:: write(file, encoding="us-ascii", xml_declaration=None, \
                     default_namespace=None, method="xml")

      Writes the element tree to a file, as XML. *file* is a file name, or a
      file object opened for writing. *encoding* [1]_ is the output encoding
      (default is US-ASCII). *xml_declaration* controls if an XML declaration
      should be added to the file. Use ``False`` for never, ``True`` for always, ``None``
      for only if not US-ASCII or UTF-8 (default is ``None``). *default_namespace*
      sets the default XML namespace (for "xmlns"). *method* is either
      ``"xml"``, ``"html"`` or ``"text"`` (default is ``"xml"``).
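
How the *encoding* and *xml_declaration* arguments of
:meth:`ElementTree.write` interact can be sketched as follows. This is a
minimal illustration that assumes a small hypothetical document and uses
in-memory ``io.BytesIO`` buffers instead of files on disk. The exact text of
the declaration line is not shown, since it may differ between versions.

.. code-block:: python

   import io
   import xml.etree.ElementTree as ET

   # Hypothetical one-item document used only for this illustration.
   tree = ET.ElementTree(ET.fromstring('<doc><item>1</item></doc>'))

   # Default arguments: US-ASCII output, and no XML declaration because
   # xml_declaration=None only adds one for other encodings.
   plain = io.BytesIO()
   tree.write(plain)

   # Explicit UTF-8 output with the XML declaration forced on.
   declared = io.BytesIO()
   tree.write(declared, encoding='UTF-8', xml_declaration=True)

   print(plain.getvalue())
   print(declared.getvalue())

The second buffer holds the same element data as the first, preceded by an
``<?xml ...?>`` declaration line.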

This is the XML file that is going to be manipulated::

    <html>
        <head>
            <title>Example page</title>
        </head>
        <body>
            <p>Moved to <a href="http://example.org/">example.org</a>
            or <a href="http://example.com/">example.com</a>.</p>
        </body>
    </html>

Example of changing the attribute "target" of every link in the first paragraph::

    >>> from xml.etree.ElementTree import ElementTree
    >>> tree = ElementTree()
    >>> tree.parse("index.xhtml")
    <Element 'html' at 0xb77e6fac>
    >>> p = tree.find("body/p")     # Finds first occurrence of tag p in body
    >>> p
    <Element 'p' at 0xb77ec26c>
    >>> links = list(p.iter("a"))   # Returns list of all links
    >>> links
    [<Element 'a' at 0xb77ec2ac>, <Element 'a' at 0xb77ec1cc>]
    >>> for i in links:             # Iterates through all found links
    ...     i.attrib["target"] = "blank"
    ...
    >>> tree.write("output.xhtml")

.. _elementtree-qname-objects:

QName Objects
^^^^^^^^^^^^^


.. class:: QName(text_or_uri, tag=None)

   QName wrapper. This can be used to wrap a QName attribute value, in order
   to get proper namespace handling on output. *text_or_uri* is a string
   containing the QName value, in the form ``{uri}local``, or, if the *tag*
   argument is given, the URI part of a QName. If *tag* is given, the first
   argument is interpreted as a URI, and this argument is interpreted as a
   local name. :class:`QName` instances are opaque.


.. _elementtree-treebuilder-objects:

TreeBuilder Objects
^^^^^^^^^^^^^^^^^^^


.. class:: TreeBuilder(element_factory=None)

   Generic element structure builder. This builder converts a sequence of
   start, data, and end method calls to a well-formed element structure. You
   can use this class to build an element structure using a custom XML parser,
   or a parser for some other XML-like format.
   The *element_factory* is called
   to create new :class:`Element` instances when given.


   .. method:: close()

      Flushes the builder buffers, and returns the toplevel document
      element. Returns an :class:`Element` instance.


   .. method:: data(data)

      Adds text to the current element. *data* is a string. This should be
      either a bytestring, or a Unicode string.


   .. method:: end(tag)

      Closes the current element. *tag* is the element name. Returns the
      closed element.


   .. method:: start(tag, attrs)

      Opens a new element. *tag* is the element name. *attrs* is a dictionary
      containing element attributes. Returns the opened element.


   In addition, a custom :class:`TreeBuilder` object can provide the
   following method:

   .. method:: doctype(name, pubid, system)

      Handles a doctype declaration. *name* is the doctype name. *pubid* is
      the public identifier. *system* is the system identifier. This method
      does not exist on the default :class:`TreeBuilder` class.

      .. versionadded:: 2.7


.. _elementtree-xmlparser-objects:

XMLParser Objects
^^^^^^^^^^^^^^^^^


.. class:: XMLParser(html=0, target=None, encoding=None)

   :class:`Element` structure builder for XML source data, based on the expat
   parser. *html* are predefined HTML entities. This flag is not supported by
   the current implementation. *target* is the target object. If omitted, the
   builder uses an instance of the standard TreeBuilder class. *encoding* [1]_
   is optional. If given, the value overrides the encoding specified in the
   XML file.


   .. method:: close()

      Finishes feeding data to the parser. Returns an element structure.


   .. method:: doctype(name, pubid, system)

      .. deprecated:: 2.7
         Define the :meth:`TreeBuilder.doctype` method on a custom TreeBuilder
         target.


   .. method:: feed(data)

      Feeds data to the parser. *data* is encoded data.

:meth:`XMLParser.feed` calls *target*\'s :meth:`start` method
for each opening tag, its :meth:`end` method for each closing tag,
and data is processed by method :meth:`data`. :meth:`XMLParser.close`
calls *target*\'s method :meth:`close`.
:class:`XMLParser` can be used not only for building a tree structure.
This is an example of counting the maximum depth of an XML file::

    >>> from xml.etree.ElementTree import XMLParser
    >>> class MaxDepth:                     # The target object of the parser
    ...     maxDepth = 0
    ...     depth = 0
    ...     def start(self, tag, attrib):   # Called for each opening tag.
    ...         self.depth += 1
    ...         if self.depth > self.maxDepth:
    ...             self.maxDepth = self.depth
    ...     def end(self, tag):             # Called for each closing tag.
    ...         self.depth -= 1
    ...     def data(self, data):
    ...         pass            # We do not need to do anything with data.
    ...     def close(self):    # Called when all data has been parsed.
    ...         return self.maxDepth
    ...
    >>> target = MaxDepth()
    >>> parser = XMLParser(target=target)
    >>> exampleXml = """
    ... <a>
    ...   <b>
    ...   </b>
    ...   <b>
    ...     <c>
    ...       <d>
    ...       </d>
    ...     </c>
    ...   </b>
    ... </a>"""
    >>> parser.feed(exampleXml)
    >>> parser.close()
    4


.. rubric:: Footnotes

.. [1] The encoding string included in XML output should conform to the
   appropriate standards. For example, "UTF-8" is valid, but "UTF8" is
   not. See https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
   and https://www.iana.org/assignments/character-sets/character-sets.xhtml.