:mod:`!xml.parsers.expat` --- Fast XML parsing using Expat
==========================================================

.. module:: xml.parsers.expat
   :synopsis: An interface to the Expat non-validating XML parser.

.. moduleauthor:: Paul Prescod <paul@prescod.net>

--------------

.. Markup notes:

   Many of the attributes of the XMLParser objects are callbacks. Since
   signature information must be presented, these are described using the method
   directive. Since they are attributes which are set by client code, in-text
   references to these attributes should be marked using the :member: role.


.. warning::

   The :mod:`pyexpat` module is not secure against maliciously
   constructed data. If you need to parse untrusted or unauthenticated data see
   :ref:`xml-vulnerabilities`.


.. index:: single: Expat

The :mod:`xml.parsers.expat` module is a Python interface to the Expat
non-validating XML parser. The module provides a single extension type,
:class:`xmlparser`, that represents the current state of an XML parser. After
an :class:`xmlparser` object has been created, various attributes of the object
can be set to handler functions. When an XML document is then fed to the
parser, the handler functions are called for the character data and markup in
the XML document.

.. index:: pair: module; pyexpat

This module uses the :mod:`pyexpat` module to provide access to the Expat
parser. Direct use of the :mod:`pyexpat` module is deprecated.

This module provides one exception and one type object:


.. exception:: ExpatError

   The exception raised when Expat reports an error. See section
   :ref:`expaterror-objects` for more information on interpreting Expat errors.


.. exception:: error

   Alias for :exc:`ExpatError`.


.. data:: XMLParserType

   The type of the return values from the :func:`ParserCreate` function.
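
The relationship between these three names can be checked with a short
sketch. It is an illustration only: ``parse_or_message`` is a hypothetical
helper name that does not exist in the module, and the two sample documents
are arbitrary.

```python
import xml.parsers.expat as expat

# ParserCreate() returns an instance of the module's single extension type.
parser = expat.ParserCreate()
is_parser_type = isinstance(parser, expat.XMLParserType)

# ``error`` is simply another name for ``ExpatError``.
alias_is_same = expat.error is expat.ExpatError


def parse_or_message(document):
    """Hypothetical helper: return None on success, else the error text."""
    p = expat.ParserCreate()  # a fresh parser for every document
    try:
        p.Parse(document, True)
    except expat.error as err:  # catches ExpatError as well
        return str(err)
    return None


print(is_parser_type, alias_is_same)
print(parse_or_message("<a/>"))
print(parse_or_message("<a>"))
```

Catching either name has the same effect; ``ExpatError`` is the more
descriptive spelling.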

The :mod:`xml.parsers.expat` module contains two functions:


.. function:: ErrorString(errno)

   Returns an explanatory string for a given error number *errno*.


.. function:: ParserCreate(encoding=None, namespace_separator=None)

   Creates and returns a new :class:`xmlparser` object. *encoding*, if specified,
   must be a string naming the encoding used by the XML data. Expat doesn't
   support as many encodings as Python does, and its repertoire of encodings can't
   be extended; it supports UTF-8, UTF-16, ISO-8859-1 (Latin1), and ASCII. If
   *encoding* [1]_ is given, it overrides the implicit or explicit encoding of the
   document.

   Expat can optionally do XML namespace processing for you, enabled by providing a
   value for *namespace_separator*. The value must be a one-character string; a
   :exc:`ValueError` will be raised if the string has an illegal length (``None``
   is considered the same as omission). When namespace processing is enabled,
   element type names and attribute names that belong to a namespace will be
   expanded. The element name passed to the element handlers
   :attr:`StartElementHandler` and :attr:`EndElementHandler` will be the
   concatenation of the namespace URI, the namespace separator character, and the
   local part of the name. If the namespace separator is a zero byte (``chr(0)``)
   then the namespace URI and the local part will be concatenated without any
   separator.

   For example, if *namespace_separator* is set to a space character (``' '``) and
   the following document is parsed:

   .. code-block:: xml

      <?xml version="1.0"?>
      <root xmlns    = "http://default-namespace.org/"
            xmlns:py = "http://www.python.org/ns/">
        <py:elem1 />
        <elem2 xmlns="" />
      </root>

   :attr:`StartElementHandler` will receive the following strings for each
   element::

      http://default-namespace.org/ root
      http://www.python.org/ns/ elem1
      elem2

   Due to limitations in the ``Expat`` library used by :mod:`pyexpat`,
   the :class:`xmlparser` instance returned can only be used to parse a single
   XML document. Call ``ParserCreate`` for each document to provide unique
   parser instances.


.. seealso::

   `The Expat XML Parser <http://www.libexpat.org/>`_
      Home page of the Expat project.


.. _xmlparser-objects:

XMLParser Objects
-----------------

:class:`xmlparser` objects have the following methods:


.. method:: xmlparser.Parse(data[, isfinal])

   Parses the contents of the string *data*, calling the appropriate handler
   functions to process the parsed data. *isfinal* must be true on the final call
   to this method; it allows the parsing of a single file in fragments,
   not the submission of multiple files.
   *data* can be the empty string at any time.


.. method:: xmlparser.ParseFile(file)

   Parse XML data reading from the object *file*. *file* only needs to provide
   the ``read(nbytes)`` method, returning the empty string when there's no more
   data.


.. method:: xmlparser.SetBase(base)

   Sets the base to be used for resolving relative URIs in system identifiers in
   declarations. Resolving relative identifiers is left to the application: this
   value will be passed through as the *base* argument to the
   :func:`ExternalEntityRefHandler`, :func:`NotationDeclHandler`, and
   :func:`UnparsedEntityDeclHandler` functions.


.. method:: xmlparser.GetBase()

   Returns a string containing the base set by a previous call to :meth:`SetBase`,
   or ``None`` if :meth:`SetBase` hasn't been called.


.. method:: xmlparser.GetInputContext()

   Returns the input data that generated the current event as a string. The data is
   in the encoding of the entity which contains the text. When called while an
   event handler is not active, the return value is ``None``.


.. method:: xmlparser.ExternalEntityParserCreate(context[, encoding])

   Create a "child" parser which can be used to parse an external parsed entity
   referred to by content parsed by the parent parser. The *context* parameter
   should be the string passed to the :meth:`ExternalEntityRefHandler` handler
   function, described below. The child parser is created with the
   :attr:`ordered_attributes` and :attr:`specified_attributes` set to the values of
   this parser.

.. method:: xmlparser.SetParamEntityParsing(flag)

   Control parsing of parameter entities (including the external DTD subset).
   Possible *flag* values are :const:`XML_PARAM_ENTITY_PARSING_NEVER`,
   :const:`XML_PARAM_ENTITY_PARSING_UNLESS_STANDALONE` and
   :const:`XML_PARAM_ENTITY_PARSING_ALWAYS`. Return true if setting the flag
   was successful.

.. method:: xmlparser.UseForeignDTD([flag])

   Calling this with a true value for *flag* (the default) will cause Expat to call
   the :attr:`ExternalEntityRefHandler` with :const:`None` for all arguments to
   allow an alternate DTD to be loaded. If the document does not contain a
   document type declaration, the :attr:`ExternalEntityRefHandler` will still be
   called, but the :attr:`StartDoctypeDeclHandler` and
   :attr:`EndDoctypeDeclHandler` will not be called.

   Passing a false value for *flag* will cancel a previous call that passed a true
   value, but otherwise has no effect.

   This method can only be called before the :meth:`Parse` or :meth:`ParseFile`
   methods are called; calling it after either of those has been called causes
   :exc:`ExpatError` to be raised with the :attr:`code` attribute set to
   ``errors.codes[errors.XML_ERROR_CANT_CHANGE_FEATURE_ONCE_PARSING]``.

.. method:: xmlparser.SetReparseDeferralEnabled(enabled)

   .. warning::

      Calling ``SetReparseDeferralEnabled(False)`` has security implications,
      as detailed below; please make sure to understand these consequences
      prior to using the ``SetReparseDeferralEnabled`` method.

   Expat 2.6.0 introduced a security mechanism called "reparse deferral":
   instead of reparsing a large unfinished token each time more input arrives,
   which causes quadratic runtime and can be abused for denial of service,
   reparsing of unfinished tokens is now delayed by default until a sufficient
   amount of input is reached.
   Due to this delay, registered handlers may (depending on the size of the
   input chunks pushed to Expat) no longer be called right after pushing new
   input to the parser. Where immediate feedback and taking over responsibility
   of protecting against denial of service from large tokens are both wanted,
   calling ``SetReparseDeferralEnabled(False)`` disables reparse deferral
   for the current Expat parser instance, temporarily or altogether.
   Calling ``SetReparseDeferralEnabled(True)`` allows re-enabling reparse
   deferral.

   Note that :meth:`SetReparseDeferralEnabled` has been backported to some
   prior releases of CPython as a security fix. Check for availability of
   :meth:`SetReparseDeferralEnabled` using :func:`hasattr` if used in code
   running across a variety of Python versions.

   .. versionadded:: 3.13

.. method:: xmlparser.GetReparseDeferralEnabled()

   Returns whether reparse deferral is currently enabled for the given
   Expat parser instance.

   .. versionadded:: 3.13


:class:`xmlparser` objects have the following attributes:


.. attribute:: xmlparser.buffer_size

   The size of the buffer used when :attr:`buffer_text` is true.
   A new buffer size can be set by assigning a new integer value
   to this attribute.
   When the size is changed, the buffer will be flushed.


.. attribute:: xmlparser.buffer_text

   Setting this to true causes the :class:`xmlparser` object to buffer textual
   content returned by Expat to avoid multiple calls to the
   :meth:`CharacterDataHandler` callback whenever possible. This can improve
   performance substantially since Expat normally breaks character data into chunks
   at every line ending. This attribute is false by default, and may be changed at
   any time. Note that when it is false, data that does not contain newlines
   may be chunked too.


.. attribute:: xmlparser.buffer_used

   If :attr:`buffer_text` is enabled, the number of bytes stored in the buffer.
   These bytes represent UTF-8 encoded text. This attribute has no meaningful
   interpretation when :attr:`buffer_text` is false.


.. attribute:: xmlparser.ordered_attributes

   Setting this attribute to a non-zero integer causes the attributes to be
   reported as a list rather than a dictionary. The attributes are presented in
   the order found in the document text. For each attribute, two list entries are
   presented: the attribute name and the attribute value. (Older versions of this
   module also used this format.) By default, this attribute is false; it may be
   changed at any time.


.. attribute:: xmlparser.specified_attributes

   If set to a non-zero integer, the parser will report only those attributes which
   were specified in the document instance and not those which were derived from
   attribute declarations.
   Applications which set this need to be especially
   careful to use what additional information is available from the declarations as
   needed to comply with the standards for the behavior of XML processors. By
   default, this attribute is false; it may be changed at any time.


The following attributes contain values relating to the most recent error
encountered by an :class:`xmlparser` object, and will only have correct values
once a call to :meth:`Parse` or :meth:`ParseFile` has raised an
:exc:`xml.parsers.expat.ExpatError` exception.


.. attribute:: xmlparser.ErrorByteIndex

   Byte index at which an error occurred.


.. attribute:: xmlparser.ErrorCode

   Numeric code specifying the problem. This value can be passed to the
   :func:`ErrorString` function, or compared to one of the constants defined in the
   ``errors`` object.


.. attribute:: xmlparser.ErrorColumnNumber

   Column number at which an error occurred.


.. attribute:: xmlparser.ErrorLineNumber

   Line number at which an error occurred.

The following attributes contain values relating to the current parse location
in an :class:`xmlparser` object. During a callback reporting a parse event they
indicate the location of the first of the sequence of characters that generated
the event. When accessed outside of a callback, the position indicated will be
just past the last parse event (regardless of whether there was an associated
callback).


.. attribute:: xmlparser.CurrentByteIndex

   Current byte index in the parser input.


.. attribute:: xmlparser.CurrentColumnNumber

   Current column number in the parser input.


.. attribute:: xmlparser.CurrentLineNumber

   Current line number in the parser input.

Here is the list of handlers that can be set. To set a handler on an
:class:`xmlparser` object *o*, use ``o.handlername = func``.
*handlername* must
be taken from the following list, and *func* must be a callable object accepting
the correct number of arguments. The arguments are all strings, unless
otherwise stated.


.. method:: xmlparser.XmlDeclHandler(version, encoding, standalone)

   Called when the XML declaration is parsed. The XML declaration is the
   (optional) declaration of the applicable version of the XML recommendation, the
   encoding of the document text, and an optional "standalone" declaration.
   *version* and *encoding* will be strings, and *standalone* will be ``1`` if the
   document is declared standalone, ``0`` if it is declared not to be standalone,
   or ``-1`` if the standalone clause was omitted. This is only available with
   Expat version 1.95.0 or newer.


.. method:: xmlparser.StartDoctypeDeclHandler(doctypeName, systemId, publicId, has_internal_subset)

   Called when Expat begins parsing the document type declaration (``<!DOCTYPE
   ...``). The *doctypeName* is provided exactly as presented. The *systemId* and
   *publicId* parameters give the system and public identifiers if specified, or
   ``None`` if omitted. *has_internal_subset* will be true if the document
   contains an internal document declaration subset. This requires Expat version
   1.2 or newer.


.. method:: xmlparser.EndDoctypeDeclHandler()

   Called when Expat is done parsing the document type declaration. This requires
   Expat version 1.2 or newer.


.. method:: xmlparser.ElementDeclHandler(name, model)

   Called once for each element type declaration. *name* is the name of the
   element type, and *model* is a representation of the content model.


.. method:: xmlparser.AttlistDeclHandler(elname, attname, type, default, required)

   Called for each declared attribute for an element type.
   If an attribute list
   declaration declares three attributes, this handler is called three times, once
   for each attribute. *elname* is the name of the element to which the
   declaration applies and *attname* is the name of the attribute declared. The
   attribute type is a string passed as *type*; the possible values are
   ``'CDATA'``, ``'ID'``, ``'IDREF'``, ... *default* gives the default value for
   the attribute used when the attribute is not specified by the document instance,
   or ``None`` if there is no default value (``#IMPLIED`` values). If the
   attribute is required to be given in the document instance, *required* will be
   true. This requires Expat version 1.95.0 or newer.


.. method:: xmlparser.StartElementHandler(name, attributes)

   Called for the start of every element. *name* is a string containing the
   element name, and *attributes* is the element attributes. If
   :attr:`ordered_attributes` is true, this is a list (see
   :attr:`ordered_attributes` for a full description). Otherwise it's a
   dictionary mapping names to values.


.. method:: xmlparser.EndElementHandler(name)

   Called for the end of every element.


.. method:: xmlparser.ProcessingInstructionHandler(target, data)

   Called for every processing instruction.


.. method:: xmlparser.CharacterDataHandler(data)

   Called for character data. This will be called for normal character data, CDATA
   marked content, and ignorable whitespace. Applications which must distinguish
   these cases can use the :attr:`StartCdataSectionHandler`,
   :attr:`EndCdataSectionHandler`, and :attr:`ElementDeclHandler` callbacks to
   collect the required information. Note that the character data may be
   chunked even if it is short and so you may receive more than one call to
   :meth:`CharacterDataHandler`. Set the :attr:`buffer_text` instance attribute
   to ``True`` to avoid that.


.. method:: xmlparser.UnparsedEntityDeclHandler(entityName, base, systemId, publicId, notationName)

   Called for unparsed (NDATA) entity declarations. This is only present for
   version 1.2 of the Expat library; for more recent versions, use
   :attr:`EntityDeclHandler` instead. (The underlying function in the Expat
   library has been declared obsolete.)


.. method:: xmlparser.EntityDeclHandler(entityName, is_parameter_entity, value, base, systemId, publicId, notationName)

   Called for all entity declarations. For parameter and internal entities,
   *value* will be a string giving the declared contents of the entity; this will
   be ``None`` for external entities. The *notationName* parameter will be
   ``None`` for parsed entities, and the name of the notation for unparsed
   entities. *is_parameter_entity* will be true if the entity is a parameter entity
   or false for general entities (most applications only need to be concerned with
   general entities). This is only available starting with version 1.95.0 of the
   Expat library.


.. method:: xmlparser.NotationDeclHandler(notationName, base, systemId, publicId)

   Called for notation declarations. *notationName*, *base*, *systemId*, and
   *publicId* are strings if given. If the public identifier is omitted,
   *publicId* will be ``None``.


.. method:: xmlparser.StartNamespaceDeclHandler(prefix, uri)

   Called when an element contains a namespace declaration. Namespace declarations
   are processed before the :attr:`StartElementHandler` is called for the element
   on which declarations are placed.


.. method:: xmlparser.EndNamespaceDeclHandler(prefix)

   Called when the closing tag is reached for an element that contained a
   namespace declaration.
   This is called once for each namespace declaration on
   the element in the reverse of the order for which the
   :attr:`StartNamespaceDeclHandler` was called to indicate the start of each
   namespace declaration's scope. Calls to this handler are made after the
   corresponding :attr:`EndElementHandler` for the end of the element.


.. method:: xmlparser.CommentHandler(data)

   Called for comments. *data* is the text of the comment, excluding the leading
   ``'<!-``\ ``-'`` and trailing ``'-``\ ``->'``.


.. method:: xmlparser.StartCdataSectionHandler()

   Called at the start of a CDATA section. This and :attr:`EndCdataSectionHandler`
   are needed to be able to identify the syntactical start and end for CDATA
   sections.


.. method:: xmlparser.EndCdataSectionHandler()

   Called at the end of a CDATA section.


.. method:: xmlparser.DefaultHandler(data)

   Called for any characters in the XML document for which no applicable handler
   has been specified. This means characters that are part of a construct which
   could be reported, but for which no handler has been supplied.


.. method:: xmlparser.DefaultHandlerExpand(data)

   This is the same as the :func:`DefaultHandler`, but doesn't inhibit expansion
   of internal entities. The entity reference will not be passed to the default
   handler.


.. method:: xmlparser.NotStandaloneHandler()

   Called if the XML document hasn't been declared as being a standalone document.
   This happens when there is an external subset or a reference to a parameter
   entity, but the XML declaration does not set standalone to ``yes``.
   If this handler returns ``0``, then the parser will raise an
   :const:`XML_ERROR_NOT_STANDALONE` error. If this handler is not set, no
   exception is raised by the parser for this condition.


.. method:: xmlparser.ExternalEntityRefHandler(context, base, systemId, publicId)

   Called for references to external entities. *base* is the current base, as set
   by a previous call to :meth:`SetBase`. The system and public identifiers,
   *systemId* and *publicId*, are strings if given; if the public identifier is not
   given, *publicId* will be ``None``. The *context* value is opaque and should
   only be used as described below.

   For external entities to be parsed, this handler must be implemented. It is
   responsible for creating the sub-parser using
   ``ExternalEntityParserCreate(context)``, initializing it with the appropriate
   callbacks, and parsing the entity. This handler should return an integer; if it
   returns ``0``, the parser will raise an
   :const:`XML_ERROR_EXTERNAL_ENTITY_HANDLING` error, otherwise parsing will
   continue.

   If this handler is not provided, external entities are reported by the
   :attr:`DefaultHandler` callback, if provided.


.. _expaterror-objects:

ExpatError Exceptions
---------------------

.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>


:exc:`ExpatError` exceptions have a number of interesting attributes:


.. attribute:: ExpatError.code

   Expat's internal error number for the specific error. The
   :data:`errors.messages <xml.parsers.expat.errors.messages>` dictionary maps
   these error numbers to Expat's error messages. For example::

      from xml.parsers.expat import ParserCreate, ExpatError, errors

      p = ParserCreate()
      try:
          p.Parse(some_xml_document)
      except ExpatError as err:
          print("Error:", errors.messages[err.code])

   The :mod:`~xml.parsers.expat.errors` module also provides error message
   constants and a dictionary :data:`~xml.parsers.expat.errors.codes` mapping
   these messages back to the error codes, see below.


.. attribute:: ExpatError.lineno

   Line number on which the error was detected. The first line is numbered ``1``.


.. attribute:: ExpatError.offset

   Character offset into the line where the error occurred. The first column is
   numbered ``0``.


.. _expat-example:

Example
-------

The following program defines three handlers that just print out their
arguments. ::

   import xml.parsers.expat

   # 3 handler functions
   def start_element(name, attrs):
       print('Start element:', name, attrs)
   def end_element(name):
       print('End element:', name)
   def char_data(data):
       print('Character data:', repr(data))

   p = xml.parsers.expat.ParserCreate()

   p.StartElementHandler = start_element
   p.EndElementHandler = end_element
   p.CharacterDataHandler = char_data

   p.Parse("""<?xml version="1.0"?>
   <parent id="top"><child1 name="paul">Text goes here</child1>
   <child2 name="fred">More text</child2>
   </parent>""", 1)

The output from this program is::

   Start element: parent {'id': 'top'}
   Start element: child1 {'name': 'paul'}
   Character data: 'Text goes here'
   End element: child1
   Character data: '\n'
   Start element: child2 {'name': 'fred'}
   Character data: 'More text'
   End element: child2
   Character data: '\n'
   End element: parent


.. _expat-content-models:

Content Model Descriptions
--------------------------

.. module:: xml.parsers.expat.model

.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>

Content models are described using nested tuples. Each tuple contains four
values: the type, the quantifier, the name, and a tuple of children. Children
are simply additional content model descriptions.

The values of the first two fields are constants defined in the
:mod:`xml.parsers.expat.model` module.
These constants can be collected in two
groups: the model type group and the quantifier group.

The constants in the model type group are:


.. data:: XML_CTYPE_ANY
   :noindex:

   The element named by the model name was declared to have a content model of
   ``ANY``.


.. data:: XML_CTYPE_CHOICE
   :noindex:

   The named element allows a choice from a number of options; this is used for
   content models such as ``(A | B | C)``.


.. data:: XML_CTYPE_EMPTY
   :noindex:

   Elements which are declared to be ``EMPTY`` have this model type.


.. data:: XML_CTYPE_MIXED
   :noindex:


.. data:: XML_CTYPE_NAME
   :noindex:


.. data:: XML_CTYPE_SEQ
   :noindex:

   Models which represent a series of models which follow one after the other are
   indicated with this model type. This is used for models such as ``(A, B, C)``.

The constants in the quantifier group are:


.. data:: XML_CQUANT_NONE
   :noindex:

   No modifier is given, so it can appear exactly once, as for ``A``.


.. data:: XML_CQUANT_OPT
   :noindex:

   The model is optional: it can appear once or not at all, as for ``A?``.


.. data:: XML_CQUANT_PLUS
   :noindex:

   The model must occur one or more times (like ``A+``).


.. data:: XML_CQUANT_REP
   :noindex:

   The model must occur zero or more times, as for ``A*``.


.. _expat-errors:

Expat error constants
---------------------

.. module:: xml.parsers.expat.errors

The following constants are provided in the :mod:`xml.parsers.expat.errors`
module. These constants are useful in interpreting some of the attributes of
the :exc:`ExpatError` exception objects raised when an error has occurred.
For backwards compatibility, the value of each constant is the error
*message* and not the numeric error *code*, so to test for a specific error,
compare the exception's :attr:`code` attribute with
:samp:`errors.codes[errors.XML_ERROR_{CONSTANT_NAME}]`.

The ``errors`` module has the following attributes:

.. data:: codes

   A dictionary mapping string descriptions to their error codes.

   .. versionadded:: 3.2


.. data:: messages

   A dictionary mapping numeric error codes to their string descriptions.

   .. versionadded:: 3.2


.. data:: XML_ERROR_ASYNC_ENTITY


.. data:: XML_ERROR_ATTRIBUTE_EXTERNAL_ENTITY_REF

   An entity reference in an attribute value referred to an external entity instead
   of an internal entity.


.. data:: XML_ERROR_BAD_CHAR_REF

   A character reference referred to a character which is illegal in XML (for
   example, character ``0``, or '``&#0;``').


.. data:: XML_ERROR_BINARY_ENTITY_REF

   An entity reference referred to an entity which was declared with a notation, so
   cannot be parsed.


.. data:: XML_ERROR_DUPLICATE_ATTRIBUTE

   An attribute was used more than once in a start tag.


.. data:: XML_ERROR_INCORRECT_ENCODING


.. data:: XML_ERROR_INVALID_TOKEN

   Raised when an input byte could not properly be assigned to a character; for
   example, a NUL byte (value ``0``) in a UTF-8 input stream.


.. data:: XML_ERROR_JUNK_AFTER_DOC_ELEMENT

   Something other than whitespace occurred after the document element.


.. data:: XML_ERROR_MISPLACED_XML_PI

   An XML declaration was found somewhere other than the start of the input data.


.. data:: XML_ERROR_NO_ELEMENTS

   The document contains no elements (XML requires all documents to contain exactly
   one top-level element).


.. data:: XML_ERROR_NO_MEMORY

   Expat was not able to allocate memory internally.


.. data:: XML_ERROR_PARAM_ENTITY_REF

   A parameter entity reference was found where it was not allowed.


.. data:: XML_ERROR_PARTIAL_CHAR

   An incomplete character was found in the input.


.. data:: XML_ERROR_RECURSIVE_ENTITY_REF

   An entity reference contained another reference to the same entity; possibly via
   a different name, and possibly indirectly.


.. data:: XML_ERROR_SYNTAX

   Some unspecified syntax error was encountered.


.. data:: XML_ERROR_TAG_MISMATCH

   An end tag did not match the innermost open start tag.


.. data:: XML_ERROR_UNCLOSED_TOKEN

   Some token (such as a start tag) was not closed before the end of the stream or
   the next token was encountered.


.. data:: XML_ERROR_UNDEFINED_ENTITY

   A reference was made to an entity which was not defined.


.. data:: XML_ERROR_UNKNOWN_ENCODING

   The document encoding is not supported by Expat.


.. data:: XML_ERROR_UNCLOSED_CDATA_SECTION

   A CDATA marked section was not closed.


.. data:: XML_ERROR_EXTERNAL_ENTITY_HANDLING


.. data:: XML_ERROR_NOT_STANDALONE

   The parser determined that the document was not "standalone" though it declared
   itself to be in the XML declaration, and the :attr:`NotStandaloneHandler` was
   set and returned ``0``.


.. data:: XML_ERROR_UNEXPECTED_STATE


.. data:: XML_ERROR_ENTITY_DECLARED_IN_PE


.. data:: XML_ERROR_FEATURE_REQUIRES_XML_DTD

   An operation was requested that requires DTD support to be compiled in, but
   Expat was configured without DTD support. This should never be reported by a
   standard build of the :mod:`xml.parsers.expat` module.


.. data:: XML_ERROR_CANT_CHANGE_FEATURE_ONCE_PARSING

   A behavioral change was requested after parsing started that can only be changed
   before parsing has started.
   This is (currently) only raised by
   :meth:`UseForeignDTD`.


.. data:: XML_ERROR_UNBOUND_PREFIX

   An undeclared prefix was found when namespace processing was enabled.


.. data:: XML_ERROR_UNDECLARING_PREFIX

   The document attempted to remove the namespace declaration associated with a
   prefix.


.. data:: XML_ERROR_INCOMPLETE_PE

   A parameter entity contained incomplete markup.


.. data:: XML_ERROR_XML_DECL

   The XML declaration was not well-formed.


.. data:: XML_ERROR_TEXT_DECL

   There was an error parsing a text declaration in an external entity.


.. data:: XML_ERROR_PUBLICID

   Characters were found in the public id that are not allowed.


.. data:: XML_ERROR_SUSPENDED

   The requested operation was made on a suspended parser, but isn't allowed. This
   includes attempts to provide additional input or to stop the parser.


.. data:: XML_ERROR_NOT_SUSPENDED

   An attempt to resume the parser was made when the parser had not been suspended.


.. data:: XML_ERROR_ABORTED

   This should not be reported to Python applications.


.. data:: XML_ERROR_FINISHED

   The requested operation was made on a parser which was finished parsing input,
   but isn't allowed. This includes attempts to provide additional input or to
   stop the parser.


.. data:: XML_ERROR_SUSPEND_PE


.. data:: XML_ERROR_RESERVED_PREFIX_XML

   An attempt was made to
   undeclare reserved namespace prefix ``xml``
   or to bind it to another namespace URI.


.. data:: XML_ERROR_RESERVED_PREFIX_XMLNS

   An attempt was made to declare or undeclare reserved namespace prefix ``xmlns``.


.. data:: XML_ERROR_RESERVED_NAMESPACE_URI

   An attempt was made to bind the URI of one of the reserved namespace
   prefixes ``xml`` and ``xmlns`` to another namespace prefix.


.. data:: XML_ERROR_INVALID_ARGUMENT

   This should not be reported to Python applications.


.. data:: XML_ERROR_NO_BUFFER

   This should not be reported to Python applications.


.. data:: XML_ERROR_AMPLIFICATION_LIMIT_BREACH

   The limit on input amplification factor (from DTD and entities)
   has been breached.


.. rubric:: Footnotes

.. [1] The encoding string included in XML output should conform to the
   appropriate standards. For example, "UTF-8" is valid, but "UTF8" is
   not. See https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
   and https://www.iana.org/assignments/character-sets/character-sets.xhtml.
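
The comparison described at the start of the error constants section can be
sketched as follows. This is a minimal illustration rather than part of the
module: ``classify_error`` is a hypothetical helper name, and the two error
conditions it checks are arbitrary picks from the constants listed above.

```python
from xml.parsers.expat import ParserCreate, ExpatError, errors


def classify_error(document):
    """Hypothetical helper: name the kind of failure, or return 'ok'."""
    parser = ParserCreate()  # one parser per document
    try:
        parser.Parse(document, True)
    except ExpatError as err:
        # The constants hold messages, so map them to codes before comparing.
        if err.code == errors.codes[errors.XML_ERROR_TAG_MISMATCH]:
            return "tag mismatch"
        if err.code == errors.codes[errors.XML_ERROR_NO_ELEMENTS]:
            return "no elements"
        # Fall back to Expat's own message for any other error code.
        return errors.messages[err.code]
    return "ok"


print(classify_error("<a/>"))
print(classify_error("<a></b>"))
print(classify_error(""))
```

Comparing ``err.code`` directly with ``errors.XML_ERROR_TAG_MISMATCH`` would
never match, because that constant is a message string while ``code`` is an
integer.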