:mod:`xml.dom` --- The Document Object Model API
================================================

.. module:: xml.dom
   :synopsis: Document Object Model API for Python.

.. sectionauthor:: Paul Prescod <paul@prescod.net>
.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de>

**Source code:** :source:`Lib/xml/dom/__init__.py`

--------------

The Document Object Model, or "DOM," is a cross-language API from the World Wide
Web Consortium (W3C) for accessing and modifying XML documents.  A DOM
implementation presents an XML document as a tree structure, or allows client
code to build such a structure from scratch.  It then gives access to the
structure through a set of objects which provide well-known interfaces.
19
20The DOM is extremely useful for random-access applications.  SAX only allows you
21a view of one bit of the document at a time.  If you are looking at one SAX
22element, you have no access to another.  If you are looking at a text node, you
23have no access to a containing element. When you write a SAX application, you
24need to keep track of your program's position in the document somewhere in your
25own code.  SAX does not do it for you.  Also, if you need to look ahead in the
26XML document, you are just out of luck.
27
Some applications are simply impossible in an event-driven model with no access
to a tree.  Of course you could build some sort of tree yourself from SAX events,
but the DOM allows you to avoid writing that code.  The DOM is a standard tree
representation for XML data.
32
33The Document Object Model is being defined by the W3C in stages, or "levels" in
34their terminology.  The Python mapping of the API is substantially based on the
35DOM Level 2 recommendation.
36
37.. What if your needs are somewhere between SAX and the DOM?  Perhaps
38   you cannot afford to load the entire tree in memory but you find the
39   SAX model somewhat cumbersome and low-level.  There is also a module
40   called xml.dom.pulldom that allows you to build trees of only the
41   parts of a document that you need structured access to.  It also has
42   features that allow you to find your way around the DOM.
43   See http://www.prescod.net/python/pulldom
44
45DOM applications typically start by parsing some XML into a DOM.  How this is
46accomplished is not covered at all by DOM Level 1, and Level 2 provides only
47limited improvements: There is a :class:`DOMImplementation` object class which
48provides access to :class:`Document` creation methods, but no way to access an
49XML reader/parser/Document builder in an implementation-independent way. There
50is also no well-defined way to access these methods without an existing
51:class:`Document` object.  In Python, each DOM implementation will provide a
52function :func:`getDOMImplementation`. DOM Level 3 adds a Load/Store
53specification, which defines an interface to the reader, but this is not yet
54available in the Python standard library.
55
56Once you have a DOM document object, you can access the parts of your XML
57document through its properties and methods.  These properties are defined in
58the DOM specification; this portion of the reference manual describes the
59interpretation of the specification in Python.
60
61The specification provided by the W3C defines the DOM API for Java, ECMAScript,
62and OMG IDL.  The Python mapping defined here is based in large part on the IDL
63version of the specification, but strict compliance is not required (though
64implementations are free to support the strict mapping from IDL).  See section
65:ref:`dom-conformance` for a detailed discussion of mapping requirements.
66
67
68.. seealso::
69
70   `Document Object Model (DOM) Level 2 Specification <https://www.w3.org/TR/2000/REC-DOM-Level-2-Core-20001113/>`_
71      The W3C recommendation upon which the Python DOM API is based.
72
73   `Document Object Model (DOM) Level 1 Specification <https://www.w3.org/TR/REC-DOM-Level-1/>`_
74      The W3C recommendation for the DOM supported by :mod:`xml.dom.minidom`.
75
76   `Python Language Mapping Specification <http://www.omg.org/cgi-bin/doc?formal/02-11-05.pdf>`_
77      This specifies the mapping from OMG IDL to Python.
78
79
80Module Contents
81---------------
82
The :mod:`xml.dom` module contains the following functions:
84
85
86.. function:: registerDOMImplementation(name, factory)
87
88   Register the *factory* function with the name *name*.  The factory function
89   should return an object which implements the :class:`DOMImplementation`
90   interface.  The factory function can return the same object every time, or a new
91   one for each call, as appropriate for the specific implementation (e.g. if that
92   implementation supports some customization).
93
94
95.. function:: getDOMImplementation(name=None, features=())
96
   Return a suitable DOM implementation.  The *name* is either well-known, the
   module name of a DOM implementation, or ``None``.  If it is not ``None``, the
   corresponding module is imported and a :class:`DOMImplementation` object is
   returned if the import succeeds.  If no name is given, and if the environment
   variable :envvar:`PYTHON_DOM` is set, this variable is used to find the
   implementation.

   If *name* is not given, the available implementations are examined to find one
   with the required feature set.  If no implementation can be found, an
   :exc:`ImportError` is raised.  The *features* list must be a sequence of
   ``(feature, version)`` pairs which are passed to the :meth:`hasFeature` method
   on available :class:`DOMImplementation` objects.
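
As a minimal sketch of the two functions above, the following looks up the
standard library's :mod:`xml.dom.minidom` implementation by its well-known name
and then registers the same factory under a second name.  The name
``"example-dom"`` is made up for illustration and is not a name the library
knows about.

.. code-block:: python

   import xml.dom
   from xml.dom import minidom

   # "minidom" is a well-known name, so the matching module is imported
   # and asked for its DOMImplementation object.
   impl = xml.dom.getDOMImplementation("minidom")
   has_core2 = impl.hasFeature("core", "2.0")

   # Register a factory under a hypothetical name.  The factory is
   # called each time that name is looked up.
   xml.dom.registerDOMImplementation("example-dom", minidom.getDOMImplementation)
   registered_impl = xml.dom.getDOMImplementation("example-dom")

   # The returned object can create documents like any DOMImplementation.
   doc = registered_impl.createDocument(None, "greeting", None)
   root_name = doc.documentElement.tagName

Looking implementations up by name keeps application code independent of the
module that actually provides the DOM.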
108
109Some convenience constants are also provided:
110
111
112.. data:: EMPTY_NAMESPACE
113
114   The value used to indicate that no namespace is associated with a node in the
115   DOM.  This is typically found as the :attr:`namespaceURI` of a node, or used as
116   the *namespaceURI* parameter to a namespaces-specific method.
117
118
119.. data:: XML_NAMESPACE
120
121   The namespace URI associated with the reserved prefix ``xml``, as defined by
122   `Namespaces in XML <https://www.w3.org/TR/REC-xml-names/>`_ (section 4).
123
124
125.. data:: XMLNS_NAMESPACE
126
127   The namespace URI for namespace declarations, as defined by `Document Object
128   Model (DOM) Level 2 Core Specification
129   <https://www.w3.org/TR/DOM-Level-2-Core/core.html>`_ (section 1.1.8).
130
131
132.. data:: XHTML_NAMESPACE
133
134   The URI of the XHTML namespace as defined by `XHTML 1.0: The Extensible
135   HyperText Markup Language <https://www.w3.org/TR/xhtml1/>`_ (section 3.1.1).
136
137
138In addition, :mod:`xml.dom` contains a base :class:`Node` class and the DOM
139exception classes.  The :class:`Node` class provided by this module does not
140implement any of the methods or attributes defined by the DOM specification;
141concrete DOM implementations must provide those.  The :class:`Node` class
142provided as part of this module does provide the constants used for the
143:attr:`nodeType` attribute on concrete :class:`Node` objects; they are located
144within the class rather than at the module level to conform with the DOM
145specifications.
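
The following sketch shows where the constants live.  It assumes
:mod:`xml.dom.minidom` as the concrete implementation; the ``kinds`` mapping is
only an illustration.

.. code-block:: python

   import xml.dom
   from xml.dom import Node, minidom

   doc = minidom.parseString("<root>text</root>")
   root = doc.documentElement

   # nodeType constants are attributes of the Node class, not of the module.
   kinds = {
       Node.DOCUMENT_NODE: "document",
       Node.ELEMENT_NODE: "element",
       Node.TEXT_NODE: "text",
   }
   doc_kind = kinds[doc.nodeType]
   root_kind = kinds[root.nodeType]
   text_kind = kinds[root.firstChild.nodeType]

   # An element parsed without a namespace reports EMPTY_NAMESPACE.
   no_namespace = root.namespaceURI == xml.dom.EMPTY_NAMESPACE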
146
147.. Should the Node documentation go here?
148
149
150.. _dom-objects:
151
152Objects in the DOM
153------------------
154
155The definitive documentation for the DOM is the DOM specification from the W3C.
156
157Note that DOM attributes may also be manipulated as nodes instead of as simple
158strings.  It is fairly rare that you must do this, however, so this usage is not
159yet documented.
160
+--------------------------------+-----------------------------------+---------------------------------+
| Interface                      | Section                           | Purpose                         |
+================================+===================================+=================================+
| :class:`DOMImplementation`     | :ref:`dom-implementation-objects` | Interface to the underlying     |
|                                |                                   | implementation.                 |
+--------------------------------+-----------------------------------+---------------------------------+
| :class:`Node`                  | :ref:`dom-node-objects`           | Base interface for most objects |
|                                |                                   | in a document.                  |
+--------------------------------+-----------------------------------+---------------------------------+
| :class:`NodeList`              | :ref:`dom-nodelist-objects`       | Interface for a sequence of     |
|                                |                                   | nodes.                          |
+--------------------------------+-----------------------------------+---------------------------------+
| :class:`DocumentType`          | :ref:`dom-documenttype-objects`   | Information about the           |
|                                |                                   | declarations needed to process  |
|                                |                                   | a document.                     |
+--------------------------------+-----------------------------------+---------------------------------+
| :class:`Document`              | :ref:`dom-document-objects`       | Object which represents an      |
|                                |                                   | entire document.                |
+--------------------------------+-----------------------------------+---------------------------------+
| :class:`Element`               | :ref:`dom-element-objects`        | Element nodes in the document   |
|                                |                                   | hierarchy.                      |
+--------------------------------+-----------------------------------+---------------------------------+
| :class:`Attr`                  | :ref:`dom-attr-objects`           | Attribute value nodes on        |
|                                |                                   | element nodes.                  |
+--------------------------------+-----------------------------------+---------------------------------+
| :class:`Comment`               | :ref:`dom-comment-objects`        | Representation of comments in   |
|                                |                                   | the source document.            |
+--------------------------------+-----------------------------------+---------------------------------+
| :class:`Text`                  | :ref:`dom-text-objects`           | Nodes containing textual        |
|                                |                                   | content from the document.      |
+--------------------------------+-----------------------------------+---------------------------------+
| :class:`ProcessingInstruction` | :ref:`dom-pi-objects`             | Processing instruction          |
|                                |                                   | representation.                 |
+--------------------------------+-----------------------------------+---------------------------------+
195
196An additional section describes the exceptions defined for working with the DOM
197in Python.
198
199
200.. _dom-implementation-objects:
201
202DOMImplementation Objects
203^^^^^^^^^^^^^^^^^^^^^^^^^
204
205The :class:`DOMImplementation` interface provides a way for applications to
206determine the availability of particular features in the DOM they are using.
207DOM Level 2 added the ability to create new :class:`Document` and
208:class:`DocumentType` objects using the :class:`DOMImplementation` as well.
209
210
211.. method:: DOMImplementation.hasFeature(feature, version)
212
213   Return ``True`` if the feature identified by the pair of strings *feature* and
214   *version* is implemented.
215
216
217.. method:: DOMImplementation.createDocument(namespaceUri, qualifiedName, doctype)
218
219   Return a new :class:`Document` object (the root of the DOM), with a child
220   :class:`Element` object having the given *namespaceUri* and *qualifiedName*. The
221   *doctype* must be a :class:`DocumentType` object created by
222   :meth:`createDocumentType`, or ``None``. In the Python DOM API, the first two
223   arguments can also be ``None`` in order to indicate that no :class:`Element`
224   child is to be created.
225
226
227.. method:: DOMImplementation.createDocumentType(qualifiedName, publicId, systemId)
228
229   Return a new :class:`DocumentType` object that encapsulates the given
230   *qualifiedName*, *publicId*, and *systemId* strings, representing the
231   information contained in an XML document type declaration.
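
A short sketch of the two creation methods, assuming the
:mod:`xml.dom.minidom` implementation.  The identifiers are the public XHTML
1.0 Strict ones and are used here only as sample values.

.. code-block:: python

   import xml.dom

   impl = xml.dom.getDOMImplementation("minidom")

   # Build the DocumentType first, then hand it to createDocument().
   doctype = impl.createDocumentType(
       "html",
       "-//W3C//DTD XHTML 1.0 Strict//EN",
       "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd",
   )
   doc = impl.createDocument(xml.dom.XHTML_NAMESPACE, "html", doctype)
   root = doc.documentElement

   # Passing None for the first two arguments creates no root element.
   empty_doc = impl.createDocument(None, None, None)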
232
233
234.. _dom-node-objects:
235
236Node Objects
237^^^^^^^^^^^^
238
239All of the components of an XML document are subclasses of :class:`Node`.
240
241
242.. attribute:: Node.nodeType
243
244   An integer representing the node type.  Symbolic constants for the types are on
245   the :class:`Node` object: :const:`ELEMENT_NODE`, :const:`ATTRIBUTE_NODE`,
246   :const:`TEXT_NODE`, :const:`CDATA_SECTION_NODE`, :const:`ENTITY_NODE`,
247   :const:`PROCESSING_INSTRUCTION_NODE`, :const:`COMMENT_NODE`,
248   :const:`DOCUMENT_NODE`, :const:`DOCUMENT_TYPE_NODE`, :const:`NOTATION_NODE`.
249   This is a read-only attribute.
250
251
252.. attribute:: Node.parentNode
253
254   The parent of the current node, or ``None`` for the document node. The value is
255   always a :class:`Node` object or ``None``.  For :class:`Element` nodes, this
256   will be the parent element, except for the root element, in which case it will
257   be the :class:`Document` object. For :class:`Attr` nodes, this is always
258   ``None``. This is a read-only attribute.
259
260
261.. attribute:: Node.attributes
262
263   A :class:`NamedNodeMap` of attribute objects.  Only elements have actual values
264   for this; others provide ``None`` for this attribute. This is a read-only
265   attribute.
266
267
268.. attribute:: Node.previousSibling
269
270   The node that immediately precedes this one with the same parent.  For
271   instance the element with an end-tag that comes just before the *self*
272   element's start-tag.  Of course, XML documents are made up of more than just
273   elements so the previous sibling could be text, a comment, or something else.
274   If this node is the first child of the parent, this attribute will be
275   ``None``. This is a read-only attribute.
276
277
278.. attribute:: Node.nextSibling
279
280   The node that immediately follows this one with the same parent.  See also
281   :attr:`previousSibling`.  If this is the last child of the parent, this
282   attribute will be ``None``. This is a read-only attribute.
283
284
285.. attribute:: Node.childNodes
286
287   A list of nodes contained within this node. This is a read-only attribute.
288
289
290.. attribute:: Node.firstChild
291
292   The first child of the node, if there are any, or ``None``. This is a read-only
293   attribute.
294
295
296.. attribute:: Node.lastChild
297
298   The last child of the node, if there are any, or ``None``. This is a read-only
299   attribute.
300
301
302.. attribute:: Node.localName
303
304   The part of the :attr:`tagName` following the colon if there is one, else the
305   entire :attr:`tagName`.  The value is a string.
306
307
308.. attribute:: Node.prefix
309
310   The part of the :attr:`tagName` preceding the colon if there is one, else the
311   empty string.  The value is a string, or ``None``.
312
313
314.. attribute:: Node.namespaceURI
315
316   The namespace associated with the element name.  This will be a string or
317   ``None``.  This is a read-only attribute.
318
319
320.. attribute:: Node.nodeName
321
322   This has a different meaning for each node type; see the DOM specification for
323   details.  You can always get the information you would get here from another
324   property such as the :attr:`tagName` property for elements or the :attr:`name`
325   property for attributes. For all node types, the value of this attribute will be
326   either a string or ``None``.  This is a read-only attribute.
327
328
329.. attribute:: Node.nodeValue
330
331   This has a different meaning for each node type; see the DOM specification for
332   details.  The situation is similar to that with :attr:`nodeName`.  The value is
333   a string or ``None``.
334
335
336.. method:: Node.hasAttributes()
337
338   Return ``True`` if the node has any attributes.
339
340
341.. method:: Node.hasChildNodes()
342
343   Return ``True`` if the node has any child nodes.
344
345
346.. method:: Node.isSameNode(other)
347
348   Return ``True`` if *other* refers to the same node as this node. This is especially
349   useful for DOM implementations which use any sort of proxy architecture (because
350   more than one object can refer to the same node).
351
352   .. note::
353
354      This is based on a proposed DOM Level 3 API which is still in the "working
355      draft" stage, but this particular interface appears uncontroversial.  Changes
356      from the W3C will not necessarily affect this method in the Python DOM interface
357      (though any new W3C API for this would also be supported).
358
359
360.. method:: Node.appendChild(newChild)
361
362   Add a new child node to this node at the end of the list of
363   children, returning *newChild*. If the node was already in
364   the tree, it is removed first.
365
366
367.. method:: Node.insertBefore(newChild, refChild)
368
369   Insert a new child node before an existing child.  It must be the case that
370   *refChild* is a child of this node; if not, :exc:`ValueError` is raised.
371   *newChild* is returned. If *refChild* is ``None``, it inserts *newChild* at the
372   end of the children's list.
373
374
375.. method:: Node.removeChild(oldChild)
376
377   Remove a child node.  *oldChild* must be a child of this node; if not,
378   :exc:`ValueError` is raised.  *oldChild* is returned on success.  If *oldChild*
379   will not be used further, its :meth:`unlink` method should be called.
380
381
382.. method:: Node.replaceChild(newChild, oldChild)
383
384   Replace an existing node with a new node. It must be the case that  *oldChild*
385   is a child of this node; if not, :exc:`ValueError` is raised.
386
387
388.. method:: Node.normalize()
389
390   Join adjacent text nodes so that all stretches of text are stored as single
391   :class:`Text` instances.  This simplifies processing text from a DOM tree for
392   many applications.
393
394
395.. method:: Node.cloneNode(deep)
396
397   Clone this node.  Setting *deep* means to clone all child nodes as well.  This
398   returns the clone.
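
The tree-editing methods above can be sketched as follows.  The example
assumes :mod:`xml.dom.minidom`; other implementations may differ in details
such as which exception is raised for an invalid *refChild*.

.. code-block:: python

   from xml.dom import minidom

   doc = minidom.parseString("<root><a/><c/></root>")
   root = doc.documentElement
   a, c = root.childNodes

   # insertBefore() places the new node ahead of an existing child.
   b = doc.createElement("b")
   root.insertBefore(b, c)
   order = [child.nodeName for child in root.childNodes]
   linked = b.previousSibling is a and b.nextSibling is c

   # removeChild() returns the detached node.
   removed = root.removeChild(a)

   # Two adjacent text nodes collapse into one after normalize().
   root.appendChild(doc.createTextNode("x"))
   root.appendChild(doc.createTextNode("y"))
   before = len(root.childNodes)
   root.normalize()
   after = len(root.childNodes)

   # A deep clone copies the subtree but is not attached to any parent.
   clone = root.cloneNode(True)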
399
400
401.. _dom-nodelist-objects:
402
403NodeList Objects
404^^^^^^^^^^^^^^^^
405
A :class:`NodeList` represents a sequence of nodes.  These objects are used in
two ways in the DOM Core recommendation:  an :class:`Element` object provides
one as its list of child nodes, and the :meth:`getElementsByTagName` and
:meth:`getElementsByTagNameNS` methods of :class:`Document` and
:class:`Element` return objects with this interface to represent query results.
411
412The DOM Level 2 recommendation defines one method and one attribute for these
413objects:
414
415
416.. method:: NodeList.item(i)
417
418   Return the *i*'th item from the sequence, if there is one, or ``None``.  The
419   index *i* is not allowed to be less than zero or greater than or equal to the
420   length of the sequence.
421
422
423.. attribute:: NodeList.length
424
425   The number of nodes in the sequence.
426
427In addition, the Python DOM interface requires that some additional support is
428provided to allow :class:`NodeList` objects to be used as Python sequences.  All
429:class:`NodeList` implementations must include support for
430:meth:`~object.__len__` and
431:meth:`~object.__getitem__`; this allows iteration over the :class:`NodeList` in
432:keyword:`for` statements and proper support for the :func:`len` built-in
433function.
434
435If a DOM implementation supports modification of the document, the
436:class:`NodeList` implementation must also support the
437:meth:`~object.__setitem__` and :meth:`~object.__delitem__` methods.
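
The sketch below contrasts the DOM-style accessors with the Python sequence
protocol, assuming the :class:`NodeList` returned by :mod:`xml.dom.minidom`.

.. code-block:: python

   from xml.dom import minidom

   doc = minidom.parseString("<list><item>a</item><item>b</item></list>")
   items = doc.getElementsByTagName("item")

   # DOM-style access through length and item().
   count = items.length
   first = items.item(0)
   missing = items.item(5)   # out of range; minidom answers with None

   # Python sequence protocol: iteration, len() and indexing.
   texts = [node.firstChild.data for node in items]
   same_length = len(items) == items.length
   last = items[-1]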
438
439
440.. _dom-documenttype-objects:
441
442DocumentType Objects
443^^^^^^^^^^^^^^^^^^^^
444
445Information about the notations and entities declared by a document (including
446the external subset if the parser uses it and can provide the information) is
447available from a :class:`DocumentType` object.  The :class:`DocumentType` for a
448document is available from the :class:`Document` object's :attr:`doctype`
449attribute; if there is no ``DOCTYPE`` declaration for the document, the
450document's :attr:`doctype` attribute will be set to ``None`` instead of an
451instance of this interface.
452
453:class:`DocumentType` is a specialization of :class:`Node`, and adds the
454following attributes:
455
456
457.. attribute:: DocumentType.publicId
458
459   The public identifier for the external subset of the document type definition.
460   This will be a string or ``None``.
461
462
463.. attribute:: DocumentType.systemId
464
465   The system identifier for the external subset of the document type definition.
466   This will be a URI as a string, or ``None``.
467
468
469.. attribute:: DocumentType.internalSubset
470
471   A string giving the complete internal subset from the document. This does not
472   include the brackets which enclose the subset.  If the document has no internal
473   subset, this should be ``None``.
474
475
476.. attribute:: DocumentType.name
477
478   The name of the root element as given in the ``DOCTYPE`` declaration, if
479   present.
480
481
482.. attribute:: DocumentType.entities
483
484   This is a :class:`NamedNodeMap` giving the definitions of external entities.
485   For entity names defined more than once, only the first definition is provided
486   (others are ignored as required by the XML recommendation).  This may be
487   ``None`` if the information is not provided by the parser, or if no entities are
488   defined.
489
490
491.. attribute:: DocumentType.notations
492
493   This is a :class:`NamedNodeMap` giving the definitions of notations. For
494   notation names defined more than once, only the first definition is provided
495   (others are ignored as required by the XML recommendation).  This may be
496   ``None`` if the information is not provided by the parser, or if no notations
497   are defined.
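
As an illustration, the following reads the :class:`DocumentType` of a parsed
document.  It assumes :mod:`xml.dom.minidom`; the public identifier and the
``note.dtd`` system identifier are invented sample values, and the external
subset is not fetched.

.. code-block:: python

   from xml.dom import minidom

   source = (
       '<!DOCTYPE note PUBLIC "-//Example//DTD Note 1.0//EN" "note.dtd">'
       "<note>remember</note>"
   )
   doc = minidom.parseString(source)
   doctype = doc.doctype

   # A document without a DOCTYPE declaration has no DocumentType node.
   plain = minidom.parseString("<note/>")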
498
499
500.. _dom-document-objects:
501
502Document Objects
503^^^^^^^^^^^^^^^^
504
505A :class:`Document` represents an entire XML document, including its constituent
506elements, attributes, processing instructions, comments etc.  Remember that it
507inherits properties from :class:`Node`.
508
509
510.. attribute:: Document.documentElement
511
512   The one and only root element of the document.
513
514
515.. method:: Document.createElement(tagName)
516
517   Create and return a new element node.  The element is not inserted into the
518   document when it is created.  You need to explicitly insert it with one of the
519   other methods such as :meth:`insertBefore` or :meth:`appendChild`.
520
521
522.. method:: Document.createElementNS(namespaceURI, tagName)
523
524   Create and return a new element with a namespace.  The *tagName* may have a
525   prefix.  The element is not inserted into the document when it is created.  You
526   need to explicitly insert it with one of the other methods such as
527   :meth:`insertBefore` or :meth:`appendChild`.
528
529
530.. method:: Document.createTextNode(data)
531
532   Create and return a text node containing the data passed as a parameter.  As
533   with the other creation methods, this one does not insert the node into the
534   tree.
535
536
537.. method:: Document.createComment(data)
538
539   Create and return a comment node containing the data passed as a parameter.  As
540   with the other creation methods, this one does not insert the node into the
541   tree.
542
543
544.. method:: Document.createProcessingInstruction(target, data)
545
546   Create and return a processing instruction node containing the *target* and
547   *data* passed as parameters.  As with the other creation methods, this one does
548   not insert the node into the tree.
549
550
551.. method:: Document.createAttribute(name)
552
553   Create and return an attribute node.  This method does not associate the
554   attribute node with any particular element.  You must use
555   :meth:`setAttributeNode` on the appropriate :class:`Element` object to use the
556   newly created attribute instance.
557
558
559.. method:: Document.createAttributeNS(namespaceURI, qualifiedName)
560
   Create and return an attribute node with a namespace.  The *qualifiedName* may
   have a prefix.  This method does not associate the attribute node with any
   particular element.  You must use :meth:`setAttributeNode` on the appropriate
   :class:`Element` object to use the newly created attribute instance.
565
566
567.. method:: Document.getElementsByTagName(tagName)
568
569   Search for all descendants (direct children, children's children, etc.) with a
570   particular element type name.
571
572
573.. method:: Document.getElementsByTagNameNS(namespaceURI, localName)
574
   Search for all descendants (direct children, children's children, etc.) with a
   particular namespace URI and local name.  The local name is the part of the
   qualified name after the prefix.
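
The creation and search methods above can be combined as in this sketch.  It
assumes :mod:`xml.dom.minidom`; :meth:`toxml` is a minidom convenience method
and not part of the DOM interface described here.

.. code-block:: python

   from xml.dom import minidom

   impl = minidom.getDOMImplementation()
   doc = impl.createDocument(None, "library", None)
   root = doc.documentElement

   # Creation methods return detached nodes; each is inserted explicitly.
   book = doc.createElement("book")
   book.appendChild(doc.createTextNode("Dune"))
   root.appendChild(doc.createComment(" catalogue "))
   root.appendChild(book)

   # An attribute node only takes effect once attached to an element.
   attr = doc.createAttribute("lang")
   attr.value = "en"
   book.setAttributeNode(attr)

   # Search all descendants by element type name.
   found = doc.getElementsByTagName("book")
   markup = root.toxml()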
578
579
580.. _dom-element-objects:
581
582Element Objects
583^^^^^^^^^^^^^^^
584
585:class:`Element` is a subclass of :class:`Node`, so inherits all the attributes
586of that class.
587
588
589.. attribute:: Element.tagName
590
591   The element type name.  In a namespace-using document it may have colons in it.
592   The value is a string.
593
594
595.. method:: Element.getElementsByTagName(tagName)
596
597   Same as equivalent method in the :class:`Document` class.
598
599
600.. method:: Element.getElementsByTagNameNS(namespaceURI, localName)
601
602   Same as equivalent method in the :class:`Document` class.
603
604
605.. method:: Element.hasAttribute(name)
606
607   Return ``True`` if the element has an attribute named by *name*.
608
609
610.. method:: Element.hasAttributeNS(namespaceURI, localName)
611
612   Return ``True`` if the element has an attribute named by *namespaceURI* and
613   *localName*.
614
615
616.. method:: Element.getAttribute(name)
617
618   Return the value of the attribute named by *name* as a string. If no such
619   attribute exists, an empty string is returned, as if the attribute had no value.
620
621
622.. method:: Element.getAttributeNode(attrname)
623
624   Return the :class:`Attr` node for the attribute named by *attrname*.
625
626
627.. method:: Element.getAttributeNS(namespaceURI, localName)
628
629   Return the value of the attribute named by *namespaceURI* and *localName* as a
630   string. If no such attribute exists, an empty string is returned, as if the
631   attribute had no value.
632
633
634.. method:: Element.getAttributeNodeNS(namespaceURI, localName)
635
636   Return an attribute value as a node, given a *namespaceURI* and *localName*.
637
638
639.. method:: Element.removeAttribute(name)
640
641   Remove an attribute by name.  If there is no matching attribute, a
642   :exc:`NotFoundErr` is raised.
643
644
645.. method:: Element.removeAttributeNode(oldAttr)
646
647   Remove and return *oldAttr* from the attribute list, if present. If *oldAttr* is
648   not present, :exc:`NotFoundErr` is raised.
649
650
651.. method:: Element.removeAttributeNS(namespaceURI, localName)
652
653   Remove an attribute by name.  Note that it uses a localName, not a qname.  No
654   exception is raised if there is no matching attribute.
655
656
657.. method:: Element.setAttribute(name, value)
658
659   Set an attribute value from a string.
660
661
662.. method:: Element.setAttributeNode(newAttr)
663
   Add a new attribute node to the element, replacing an existing attribute if
   its :attr:`name` attribute matches.  If a replacement occurs, the old attribute
   node will be returned.  If *newAttr* is already in use,
   :exc:`InuseAttributeErr` will be raised.
668
669
670.. method:: Element.setAttributeNodeNS(newAttr)
671
   Add a new attribute node to the element, replacing an existing attribute if
   its :attr:`namespaceURI` and :attr:`localName` attributes match.  If a
   replacement occurs, the old attribute node will be returned.  If *newAttr*
   is already in use, :exc:`InuseAttributeErr` will be raised.
676
677
678.. method:: Element.setAttributeNS(namespaceURI, qname, value)
679
   Set an attribute value from a string, given a *namespaceURI* and a *qname*.
   Note that a qname is the whole attribute name, including any prefix, unlike
   the *localName* taken by the other namespace-aware methods above.
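
A minimal sketch of the attribute methods, assuming :mod:`xml.dom.minidom`.
The namespace URI and the attribute names are hypothetical.

.. code-block:: python

   import xml.dom
   from xml.dom import minidom

   doc = minidom.parseString("<item/>")
   item = doc.documentElement

   item.setAttribute("id", "42")
   has_id = item.hasAttribute("id")
   id_value = item.getAttribute("id")
   absent = item.getAttribute("colour")   # a missing attribute gives ""

   # The namespace-aware setter takes the qualified name...
   NS = "http://example.com/ns"           # hypothetical namespace URI
   item.setAttributeNS(NS, "ex:rank", "1")
   # ...while the getter and the remover take only the local name.
   rank = item.getAttributeNS(NS, "rank")
   item.removeAttributeNS(NS, "rank")
   rank_left = item.hasAttributeNS(NS, "rank")

   # Removing an attribute that does not exist raises NotFoundErr.
   try:
       item.removeAttribute("colour")
   except xml.dom.NotFoundErr:
       missing_raised = True
   else:
       missing_raised = False

   # The attributes are also reachable through the NamedNodeMap.
   attr_count = item.attributes.length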
682
683
684.. _dom-attr-objects:
685
686Attr Objects
687^^^^^^^^^^^^
688
689:class:`Attr` inherits from :class:`Node`, so inherits all its attributes.
690
691
692.. attribute:: Attr.name
693
694   The attribute name.
695   In a namespace-using document it may include a colon.
696
697
698.. attribute:: Attr.localName
699
700   The part of the name following the colon if there is one, else the
701   entire name.
702   This is a read-only attribute.
703
704
705.. attribute:: Attr.prefix
706
707   The part of the name preceding the colon if there is one, else the
708   empty string.
709
710
711.. attribute:: Attr.value
712
713   The text value of the attribute.  This is a synonym for the
714   :attr:`nodeValue` attribute.
715
716
717.. _dom-attributelist-objects:
718
719NamedNodeMap Objects
720^^^^^^^^^^^^^^^^^^^^
721
722:class:`NamedNodeMap` does *not* inherit from :class:`Node`.
723
724
725.. attribute:: NamedNodeMap.length
726
727   The length of the attribute list.
728
729
730.. method:: NamedNodeMap.item(index)
731
732   Return an attribute with a particular index.  The order you get the attributes
733   in is arbitrary but will be consistent for the life of a DOM.  Each item is an
734   attribute node.  Get its value with the :attr:`value` attribute.
735
736There are also experimental methods that give this class more mapping behavior.
737You can use them or you can use the standardized :meth:`getAttribute\*` family
738of methods on the :class:`Element` objects.
739
740
741.. _dom-comment-objects:
742
743Comment Objects
744^^^^^^^^^^^^^^^
745
746:class:`Comment` represents a comment in the XML document.  It is a subclass of
747:class:`Node`, but cannot have child nodes.
748
749
750.. attribute:: Comment.data
751
752   The content of the comment as a string.  The attribute contains all characters
753   between the leading ``<!-``\ ``-`` and trailing ``-``\ ``->``, but does not
754   include them.
755
756
.. _dom-text-objects:

Text and CDATASection Objects
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The :class:`Text` interface represents text in the XML document.  If the parser
and DOM implementation support the DOM's XML extension, portions of the text
enclosed in CDATA marked sections are stored in :class:`CDATASection` objects.
These two interfaces are identical, but provide different values for the
:attr:`nodeType` attribute.

These interfaces extend the :class:`Node` interface.  They cannot have child
nodes.


.. attribute:: Text.data

   The content of the text node as a string.

.. note::

   The use of a :class:`CDATASection` node does not indicate that the node
   represents a complete CDATA marked section, only that the content of the node
   was part of a CDATA section.  A single CDATA section may be represented by more
   than one node in the document tree.  There is no way to determine whether two
   adjacent :class:`CDATASection` nodes represent different CDATA marked sections.
783
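The difference between the two interfaces can be seen in a short sketch.  It
assumes :mod:`xml.dom.minidom` with its default parser settings, which keep
CDATA sections as separate nodes; other implementations may merge them into
ordinary text.

```python
from xml.dom import minidom, Node

# Hypothetical input: plain text followed by a CDATA marked section.
doc = minidom.parseString("<r>plain<![CDATA[<raw>]]></r>")
text_node, cdata_node = doc.documentElement.childNodes

# Same interface, different nodeType values.
kinds = (text_node.nodeType, cdata_node.nodeType)
contents = (text_node.data, cdata_node.data)
```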
784
.. _dom-pi-objects:

ProcessingInstruction Objects
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

:class:`ProcessingInstruction` represents a processing instruction in the XML
document.  It inherits from the :class:`Node` interface and cannot have child
nodes.


.. attribute:: ProcessingInstruction.target

   The content of the processing instruction up to the first whitespace character.
   This is a read-only attribute.


.. attribute:: ProcessingInstruction.data

   The content of the processing instruction following the first whitespace
   character.
804
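The split between :attr:`target` and :attr:`data` might look like this in
practice.  The sketch assumes :mod:`xml.dom.minidom`, and the stylesheet
instruction is only an example input.

```python
from xml.dom import minidom, Node

# Hypothetical document that starts with a processing instruction.
source = '<?xml-stylesheet href="style.css" type="text/css"?><root/>'
doc = minidom.parseString(source)
pi = doc.firstChild

is_pi = pi.nodeType == Node.PROCESSING_INSTRUCTION_NODE
target = pi.target   # text up to the first whitespace character
data = pi.data       # everything after that whitespace
```

The DOM does not parse :attr:`data` any further; pseudo-attributes such as
``href`` have to be extracted by the application.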
805
.. _dom-exceptions:

Exceptions
^^^^^^^^^^

The DOM Level 2 recommendation defines a single exception, :exc:`DOMException`,
and a number of constants that allow applications to determine what sort of
error occurred.  :exc:`DOMException` instances carry a :attr:`code` attribute
that provides the appropriate value for the specific exception.

The Python DOM interface provides the constants, but also expands the set of
exceptions so that a specific exception exists for each of the exception codes
defined by the DOM.  Implementations must raise the appropriate specific
exception, each of which carries the appropriate value for the :attr:`code`
attribute.
821
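A sketch of how an application might handle these errors: catch the base class
and inspect :attr:`code`, or catch a specific subclass.  It assumes
:mod:`xml.dom.minidom`; the element names are invented for the example.

```python
import xml.dom
from xml.dom import minidom

doc = minidom.parseString("<root><a/></root>")
stranger = doc.createElement("b")   # created, but never attached to <root>

try:
    # "stranger" is not a child of <root>, so the removal fails.
    doc.documentElement.removeChild(stranger)
except xml.dom.DOMException as exc:
    caught = type(exc).__name__     # name of the specific subclass
    code = exc.code                 # matching DOM exception code
```

Catching the base class keeps the handler generic, while :attr:`code` or the
exception type still tells the application what went wrong.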
822
.. exception:: DOMException

   Base exception class used for all specific DOM exceptions.  This exception class
   cannot be directly instantiated.
827
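The restriction on direct instantiation can be observed as follows.  This
sketch reflects the behavior of the :mod:`xml.dom` package shipped with CPython,
where the attempt raises :exc:`RuntimeError`; other implementations might
signal it differently.

```python
import xml.dom

try:
    xml.dom.DOMException("generic failure")
except RuntimeError as exc:
    message = str(exc)
else:
    message = None

# Specific subclasses can be instantiated as usual.
err = xml.dom.SyntaxErr("bad string")
```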
828
.. exception:: DomstringSizeErr

   Raised when a specified range of text does not fit into a string. This is not
   known to be used in the Python DOM implementations, but may be received from DOM
   implementations not written in Python.


.. exception:: HierarchyRequestErr

   Raised when an attempt is made to insert a node where the node type is not
   allowed.
840
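One way to trigger :exc:`HierarchyRequestErr` is to add a child to a node type
that cannot have children.  The sketch below assumes :mod:`xml.dom.minidom`
and uses invented element names.

```python
import xml.dom
from xml.dom import minidom

doc = minidom.parseString("<root>text</root>")
text = doc.documentElement.firstChild   # a Text node

try:
    # Text nodes cannot have children, so this insertion is rejected.
    text.appendChild(doc.createElement("child"))
except xml.dom.HierarchyRequestErr as exc:
    code = exc.code
```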
841
.. exception:: IndexSizeErr

   Raised when an index or size parameter to a method is negative or exceeds the
   allowed values.


.. exception:: InuseAttributeErr

   Raised when an attempt is made to insert an :class:`Attr` node that is already
   present elsewhere in the document.


.. exception:: InvalidAccessErr

   Raised if a parameter or an operation is not supported on the underlying object.


.. exception:: InvalidCharacterErr

   Raised when a string parameter contains a character that is not permitted by
   the XML 1.0 recommendation in the context in which it is used.  For example,
   attempting to create an :class:`Element` node with a space in the element type
   name will cause this error to be raised.


.. exception:: InvalidModificationErr

   Raised when an attempt is made to modify the type of a node.


.. exception:: InvalidStateErr

   Raised when an attempt is made to use an object that is not defined or is no
   longer usable.


.. exception:: NamespaceErr

   Raised when an attempt is made to change any object in a way that is not
   permitted with regard to the
   `Namespaces in XML <https://www.w3.org/TR/REC-xml-names/>`_ recommendation.


.. exception:: NotFoundErr

   Raised when a node does not exist in the referenced context.  For example,
   :meth:`NamedNodeMap.removeNamedItem` will raise this if no node with the given
   name exists in the map.
890
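The :meth:`NamedNodeMap.removeNamedItem` case can be sketched as follows,
assuming :mod:`xml.dom.minidom`; the attribute name is invented for the
example.

```python
import xml.dom
from xml.dom import minidom

doc = minidom.parseString('<root id="r1"/>')
attrs = doc.documentElement.attributes

removed = attrs.removeNamedItem("id")   # succeeds and returns the Attr node

try:
    attrs.removeNamedItem("id")         # nothing is left under that name
except xml.dom.NotFoundErr as exc:
    code = exc.code
```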
891
.. exception:: NotSupportedErr

   Raised when the implementation does not support the requested type of object or
   operation.


.. exception:: NoDataAllowedErr

   Raised when data is specified for a node which does not support data.

   .. XXX  a better explanation is needed!


.. exception:: NoModificationAllowedErr

   Raised on attempts to modify an object where modifications are not allowed (such
   as for read-only nodes).


.. exception:: SyntaxErr

   Raised when an invalid or illegal string is specified.

   .. XXX  how is this different from InvalidCharacterErr?


.. exception:: WrongDocumentErr

   Raised when a node is inserted in a different document than it currently belongs
   to, and the implementation does not support migrating the node from one document
   to the other.

The exception codes defined in the DOM recommendation map to the exceptions
described above according to this table:

+--------------------------------------+---------------------------------+
| Constant                             | Exception                       |
+======================================+=================================+
| :const:`DOMSTRING_SIZE_ERR`          | :exc:`DomstringSizeErr`         |
+--------------------------------------+---------------------------------+
| :const:`HIERARCHY_REQUEST_ERR`       | :exc:`HierarchyRequestErr`      |
+--------------------------------------+---------------------------------+
| :const:`INDEX_SIZE_ERR`              | :exc:`IndexSizeErr`             |
+--------------------------------------+---------------------------------+
| :const:`INUSE_ATTRIBUTE_ERR`         | :exc:`InuseAttributeErr`        |
+--------------------------------------+---------------------------------+
| :const:`INVALID_ACCESS_ERR`          | :exc:`InvalidAccessErr`         |
+--------------------------------------+---------------------------------+
| :const:`INVALID_CHARACTER_ERR`       | :exc:`InvalidCharacterErr`      |
+--------------------------------------+---------------------------------+
| :const:`INVALID_MODIFICATION_ERR`    | :exc:`InvalidModificationErr`   |
+--------------------------------------+---------------------------------+
| :const:`INVALID_STATE_ERR`           | :exc:`InvalidStateErr`          |
+--------------------------------------+---------------------------------+
| :const:`NAMESPACE_ERR`               | :exc:`NamespaceErr`             |
+--------------------------------------+---------------------------------+
| :const:`NOT_FOUND_ERR`               | :exc:`NotFoundErr`              |
+--------------------------------------+---------------------------------+
| :const:`NOT_SUPPORTED_ERR`           | :exc:`NotSupportedErr`          |
+--------------------------------------+---------------------------------+
| :const:`NO_DATA_ALLOWED_ERR`         | :exc:`NoDataAllowedErr`         |
+--------------------------------------+---------------------------------+
| :const:`NO_MODIFICATION_ALLOWED_ERR` | :exc:`NoModificationAllowedErr` |
+--------------------------------------+---------------------------------+
| :const:`SYNTAX_ERR`                  | :exc:`SyntaxErr`                |
+--------------------------------------+---------------------------------+
| :const:`WRONG_DOCUMENT_ERR`          | :exc:`WrongDocumentErr`         |
+--------------------------------------+---------------------------------+
960
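The mapping in the table above can also be recovered at run time, since each
exception class carries its constant in :attr:`code`.  The following is only a
sketch; the name ``by_code`` is made up for the example, and it relies on the
exception classes being defined as attributes of the :mod:`xml.dom` package, as
they are in CPython.

```python
import xml.dom

# Build a code -> exception class lookup from the package namespace.
by_code = {
    cls.code: cls
    for cls in vars(xml.dom).values()
    if isinstance(cls, type)
    and issubclass(cls, xml.dom.DOMException)
    and cls is not xml.dom.DOMException   # the base class has no code
}

not_found = by_code[xml.dom.NOT_FOUND_ERR]
wrong_doc = by_code[xml.dom.WRONG_DOCUMENT_ERR]
```

A lookup like this may be useful when translating numeric codes received from a
non-Python DOM implementation into the specific Python exceptions.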
961
.. _dom-conformance:

Conformance
-----------

This section describes the conformance requirements and relationships between
the Python DOM API, the W3C DOM recommendations, and the OMG IDL mapping for
Python.


.. _dom-type-mapping:

Type Mapping
^^^^^^^^^^^^

The IDL types used in the DOM specification are mapped to Python types
according to the following table.

+------------------+-------------------------------------------+
| IDL Type         | Python Type                               |
+==================+===========================================+
| ``boolean``      | ``bool`` or ``int``                       |
+------------------+-------------------------------------------+
| ``int``          | ``int``                                   |
+------------------+-------------------------------------------+
| ``long int``     | ``int``                                   |
+------------------+-------------------------------------------+
| ``unsigned int`` | ``int``                                   |
+------------------+-------------------------------------------+
| ``DOMString``    | ``str`` or ``bytes``                      |
+------------------+-------------------------------------------+
| ``null``         | ``None``                                  |
+------------------+-------------------------------------------+
995
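To see the mapping from the table above on live objects, one can inspect a few
values.  The sketch assumes :mod:`xml.dom.minidom`, whose boolean results are
``bool`` and whose strings are ``str``; another implementation could
legitimately return ``int`` or ``bytes`` where the table allows it.

```python
from xml.dom import minidom

# Hypothetical sample document.
doc = minidom.parseString('<root id="r1"><child/></root>')
root = doc.documentElement

observed = {
    "boolean": type(root.hasChildNodes()),        # boolean result
    "integer": type(root.childNodes.length),      # unsigned integer in the IDL
    "DOMString": type(root.getAttribute("id")),   # string value
    "null": doc.parentNode,                       # a document has no parent
}
```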
.. _dom-accessor-methods:

Accessor Methods
^^^^^^^^^^^^^^^^

The mapping from OMG IDL to Python defines accessor functions for IDL
``attribute`` declarations in much the same way the Java mapping does.
Mapping the IDL declarations ::

   readonly attribute string someValue;
            attribute string anotherValue;

yields three accessor functions:  a "get" method for :attr:`someValue`
(:meth:`_get_someValue`), and "get" and "set" methods for :attr:`anotherValue`
(:meth:`_get_anotherValue` and :meth:`_set_anotherValue`).  The mapping, in
particular, does not require that the IDL attributes are accessible as normal
Python attributes:  ``object.someValue`` is *not* required to work, and may
raise an :exc:`AttributeError`.

The Python DOM API, however, *does* require that normal attribute access work.
This means that the typical surrogates generated by Python IDL compilers are not
likely to work, and wrapper objects may be needed on the client if the DOM
objects are accessed via CORBA.  While this does require some additional
consideration for CORBA DOM clients, the implementers with experience using DOM
over CORBA from Python do not consider this a problem.  Attributes that are
declared ``readonly`` may not restrict write access in all DOM
implementations.

In the Python DOM API, accessor functions are not required.  If provided, they
should take the form defined by the Python IDL mapping, but these methods are
considered unnecessary since the attributes are accessible directly from Python.
"Set" accessors should never be provided for ``readonly`` attributes.
1028
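In client code, the practical consequence is to use plain attribute access and
treat any accessor as optional.  The sketch below assumes
:mod:`xml.dom.minidom`; whether an accessor such as ``_get_documentElement``
exists is implementation-specific, so the code checks for it instead of relying
on it.

```python
from xml.dom import minidom

doc = minidom.parseString("<root/>")

# Required by the Python DOM API: normal attribute access.
root = doc.documentElement

# Optional: an IDL-style accessor, used only if the implementation
# happens to provide one.
getter = getattr(doc, "_get_documentElement", None)
via_accessor = getter() if callable(getter) else root
```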
The IDL definitions do not fully embody the requirements of the W3C DOM API,
such as the notion that certain objects, for example the return value of
:meth:`getElementsByTagName`, are "live".  The Python DOM API does not require
implementations to enforce such requirements.
1033
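As an illustration of what a non-"live" result means, the following sketch uses
:mod:`xml.dom.minidom`, where the list returned by
:meth:`getElementsByTagName` behaves like a snapshot.  This is an observation
about that one implementation, and the element names are invented for the
example.

```python
from xml.dom import minidom

doc = minidom.parseString("<list><item/></list>")
items = doc.getElementsByTagName("item")
before = len(items)

# Modify the tree after the query has been made.
doc.documentElement.appendChild(doc.createElement("item"))

after_same_list = len(items)                             # earlier result
after_new_query = len(doc.getElementsByTagName("item"))  # fresh query
```

Here the earlier result does not pick up the new element, so portable code
should repeat the query after modifying the tree instead of assuming the result
stays current.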
1034