:mod:`!email.policy`: Policy Objects
------------------------------------

.. module:: email.policy
   :synopsis: Controlling the parsing and generating of messages

.. moduleauthor:: R. David Murray <rdmurray@bitdance.com>
.. sectionauthor:: R. David Murray <rdmurray@bitdance.com>

.. versionadded:: 3.3

**Source code:** :source:`Lib/email/policy.py`

--------------

The :mod:`email` package's prime focus is the handling of email messages as
described by the various email and MIME RFCs. However, the general format of
email messages (a block of header fields each consisting of a name followed by
a colon followed by a value, the whole block followed by a blank line and an
arbitrary 'body'), is a format that has found utility outside of the realm of
email. Some of these uses conform fairly closely to the main email RFCs, some
do not. Even when working with email, there are times when it is desirable to
break strict compliance with the RFCs, such as generating emails that
interoperate with email servers that do not themselves follow the standards, or
that implement extensions you want to use in ways that violate the
standards.

Policy objects give the email package the flexibility to handle all these
disparate use cases.

A :class:`Policy` object encapsulates a set of attributes and methods that
control the behavior of various components of the email package during use.
:class:`Policy` instances can be passed to various classes and methods in the
email package to alter the default behavior. The settable values and their
defaults are described below.

There is a default policy used by all classes in the email package. For all of
the :mod:`~email.parser` classes and the related convenience functions, and for
the :class:`~email.message.Message` class, this is the :class:`Compat32`
policy, via its corresponding pre-defined instance :const:`compat32`.
This policy provides for complete backward compatibility (in some cases,
including bug compatibility) with the pre-Python 3.3 version of the email
package.

The default value for the *policy* keyword to
:class:`~email.message.EmailMessage` is the :class:`EmailPolicy` policy, via
its pre-defined instance :data:`~default`.

When a :class:`~email.message.Message` or :class:`~email.message.EmailMessage`
object is created, it acquires a policy. If the message is created by a
:mod:`~email.parser`, a policy passed to the parser will be the policy used by
the message it creates. If the message is created by the program, then the
policy can be specified when it is created. When a message is passed to a
:mod:`~email.generator`, the generator uses the policy from the message by
default, but you can also pass a specific policy to the generator that will
override the one stored on the message object.

The default value for the *policy* keyword for the :mod:`email.parser` classes
and the parser convenience functions **will be changing** in a future version of
Python. Therefore you should **always specify explicitly which policy you want
to use** when calling any of the classes and functions described in the
:mod:`~email.parser` module.

The first part of this documentation covers the features of :class:`Policy`, an
:term:`abstract base class` that defines the features that are common to all
policy objects, including :const:`compat32`. This includes certain hook
methods that are called internally by the email package, which a custom policy
could override to obtain different behavior. The second part describes the
concrete classes :class:`EmailPolicy` and :class:`Compat32`, which implement
the hooks that provide the standard behavior and the backward compatible
behavior and features, respectively.
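
As a minimal sketch of the advice above to always name the policy explicitly,
the following parses the same text once under each pre-defined policy. The
sample message text and the variable names are illustrative assumptions, not
part of the module::

   from email import message_from_string, policy
   from email.message import EmailMessage, Message

   # Hypothetical message text used only for this illustration.
   raw = "Subject: Hello\nTo: someone@example.com\n\nBody text\n"

   # Explicit compat32: the parser builds a legacy Message object and
   # header values are returned as plain strings.
   legacy = message_from_string(raw, policy=policy.compat32)

   # Explicit default: the parser builds an EmailMessage object and
   # header values are returned as structured header objects.
   modern = message_from_string(raw, policy=policy.default)

   # Each message remembers the policy it was parsed with.
   print(type(legacy).__name__, legacy.policy is policy.compat32)
   print(type(modern).__name__, modern.policy is policy.default)

   # Only the header object offers parsed address information.
   print(modern['To'].addresses[0].addr_spec)

Naming the policy at every call site keeps the result independent of whatever
default a given Python version applies.
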

:class:`Policy` instances are immutable, but they can be cloned, accepting the
same keyword arguments as the class constructor and returning a new
:class:`Policy` instance that is a copy of the original but with the specified
attribute values changed.

As an example, the following code could be used to read an email message from a
file on disk and pass it to the system ``sendmail`` program on a Unix system:

.. testsetup::

   from unittest import mock
   mocker = mock.patch('subprocess.Popen')
   m = mocker.start()
   proc = mock.MagicMock()
   m.return_value = proc
   proc.stdin.close.return_value = None
   mymsg = open('mymsg.txt', 'w')
   mymsg.write('To: abc@xyz.com\n\n')
   mymsg.flush()

.. doctest::

   >>> from email import message_from_binary_file
   >>> from email.generator import BytesGenerator
   >>> from email import policy
   >>> from subprocess import Popen, PIPE
   >>> with open('mymsg.txt', 'rb') as f:
   ...     msg = message_from_binary_file(f, policy=policy.default)
   ...
   >>> p = Popen(['sendmail', msg['To'].addresses[0].addr_spec], stdin=PIPE)
   >>> g = BytesGenerator(p.stdin, policy=msg.policy.clone(linesep='\r\n'))
   >>> g.flatten(msg)
   >>> p.stdin.close()
   >>> rc = p.wait()

.. testcleanup::

   mymsg.close()
   mocker.stop()
   import os
   os.remove('mymsg.txt')

Here we are telling :class:`~email.generator.BytesGenerator` to use the RFC
correct line separator characters when creating the binary string to feed into
``sendmail``'s ``stdin``, where the default policy would use ``\n`` line
separators.

Some email package methods accept a *policy* keyword argument, allowing the
policy to be overridden for that method.
For example, the following code uses
the :meth:`~email.message.Message.as_bytes` method of the *msg* object from
the previous example and writes the message to a file using the native line
separators for the platform on which it is running::

   >>> import os
   >>> with open('converted.txt', 'wb') as f:
   ...     f.write(msg.as_bytes(policy=msg.policy.clone(linesep=os.linesep)))
   17

Policy objects can also be combined using the addition operator, producing a
policy object whose settings are a combination of the non-default values of the
summed objects::

   >>> compat_SMTP = policy.compat32.clone(linesep='\r\n')
   >>> compat_strict = policy.compat32.clone(raise_on_defect=True)
   >>> compat_strict_SMTP = compat_SMTP + compat_strict

This operation is not commutative; that is, the order in which the objects are
added matters. To illustrate::

   >>> policy100 = policy.compat32.clone(max_line_length=100)
   >>> policy80 = policy.compat32.clone(max_line_length=80)
   >>> apolicy = policy100 + policy80
   >>> apolicy.max_line_length
   80
   >>> apolicy = policy80 + policy100
   >>> apolicy.max_line_length
   100


.. class:: Policy(**kw)

   This is the :term:`abstract base class` for all policy classes. It provides
   default implementations for a couple of trivial methods, as well as the
   implementation of the immutability property, the :meth:`clone` method, and
   the constructor semantics.

   The constructor of a policy class can be passed various keyword arguments.
   The arguments that may be specified are any non-method properties on this
   class, plus any additional non-method properties on the concrete class. A
   value specified in the constructor will override the default value for the
   corresponding attribute.

   This class defines the following properties, and thus values for the
   following may be passed in the constructor of any policy class:


   .. attribute:: max_line_length

      The maximum length of any line in the serialized output, not counting the
      end of line character(s). Default is 78, per :rfc:`5322`. A value of
      ``0`` or :const:`None` indicates that no line wrapping should be
      done at all.


   .. attribute:: linesep

      The string to be used to terminate lines in serialized output. The
      default is ``\n`` because that's the internal end-of-line discipline used
      by Python, though ``\r\n`` is required by the RFCs.


   .. attribute:: cte_type

      Controls the type of Content Transfer Encodings that may be or are
      required to be used. The possible values are:

      .. tabularcolumns:: |l|L|

      ========  ===============================================================
      ``7bit``  all data must be "7 bit clean" (ASCII-only). This means that
                where necessary data will be encoded using either
                quoted-printable or base64 encoding.

      ``8bit``  data is not constrained to be 7 bit clean. Data in headers is
                still required to be ASCII-only and so will be encoded (see
                :meth:`fold_binary` and :attr:`~EmailPolicy.utf8` below for
                exceptions), but body parts may use the ``8bit`` CTE.
      ========  ===============================================================

      A ``cte_type`` value of ``8bit`` only works with ``BytesGenerator``, not
      ``Generator``, because strings cannot contain binary data. If a
      ``Generator`` is operating under a policy that specifies
      ``cte_type=8bit``, it will act as if ``cte_type`` is ``7bit``.


   .. attribute:: raise_on_defect

      If :const:`True`, any defects encountered will be raised as errors. If
      :const:`False` (the default), defects will be passed to the
      :meth:`register_defect` method.


   .. attribute:: mangle_from_

      If :const:`True`, lines starting with *"From "* in the body are
      escaped by putting a ``>`` in front of them.
      This parameter is used when
      the message is being serialized by a generator.
      Default: :const:`False`.

      .. versionadded:: 3.5


   .. attribute:: message_factory

      A factory function for constructing a new empty message object. Used
      by the parser when building messages. Defaults to ``None``, in
      which case :class:`~email.message.Message` is used.

      .. versionadded:: 3.6


   .. attribute:: verify_generated_headers

      If ``True`` (the default), the generator will raise
      :exc:`~email.errors.HeaderWriteError` instead of writing a header
      that is improperly folded or delimited, such that it would
      be parsed as multiple headers or joined with adjacent data.
      Such headers can be generated by custom header classes or bugs
      in the ``email`` module.

      As it's a security feature, this defaults to ``True`` even in the
      :class:`~email.policy.Compat32` policy.
      For backwards compatible, but unsafe, behavior, it must be set to
      ``False`` explicitly.

      .. versionadded:: 3.13


   The following :class:`Policy` method is intended to be called by code using
   the email library to create policy instances with custom settings:


   .. method:: clone(**kw)

      Return a new :class:`Policy` instance whose attributes have the same
      values as the current instance, except where those attributes are
      given new values by the keyword arguments.


   The remaining :class:`Policy` methods are called by the email package code,
   and are not intended to be called by an application using the email package.
   A custom policy must implement all of these methods.


   .. method:: handle_defect(obj, defect)

      Handle a *defect* found on *obj*. When the email package calls this
      method, *defect* will always be a subclass of
      :class:`~email.errors.Defect`.

      The default implementation checks the :attr:`raise_on_defect` flag.
      If it is ``True``, *defect* is raised as an exception. If it is ``False``
      (the default), *obj* and *defect* are passed to :meth:`register_defect`.


   .. method:: register_defect(obj, defect)

      Register a *defect* on *obj*. In the email package, *defect* will always
      be a subclass of :class:`~email.errors.Defect`.

      The default implementation calls the ``append`` method of the ``defects``
      attribute of *obj*. When the email package calls :meth:`handle_defect`,
      *obj* will normally have a ``defects`` attribute that has an ``append``
      method. Custom object types used with the email package (for example,
      custom ``Message`` objects) should also provide such an attribute,
      otherwise defects in parsed messages will raise unexpected errors.


   .. method:: header_max_count(name)

      Return the maximum allowed number of headers named *name*.

      Called when a header is added to an :class:`~email.message.EmailMessage`
      or :class:`~email.message.Message` object. If the returned value is not
      ``0`` or ``None``, and there are already a number of headers with the
      name *name* greater than or equal to the value returned, a
      :exc:`ValueError` is raised.

      Because the default behavior of ``Message.__setitem__`` is to append the
      value to the list of headers, it is easy to create duplicate headers
      without realizing it. This method allows certain headers to be limited
      in the number of instances of that header that may be added to a
      ``Message`` programmatically. (The limit is not observed by the parser,
      which will faithfully produce as many headers as exist in the message
      being parsed.)

      The default implementation returns ``None`` for all header names.


   .. method:: header_source_parse(sourcelines)

      The email package calls this method with a list of strings, each string
      ending with the line separation characters found in the source being
      parsed.
      The first line includes the field header name and separator.
      All whitespace in the source is preserved. The method should return the
      ``(name, value)`` tuple that is to be stored in the ``Message`` to
      represent the parsed header.

      If an implementation wishes to retain compatibility with the existing
      email package policies, *name* should be the case preserved name (all
      characters up to the '``:``' separator), while *value* should be the
      unfolded value (all line separator characters removed, but whitespace
      kept intact), stripped of leading whitespace.

      *sourcelines* may contain surrogateescaped binary data.

      There is no default implementation.


   .. method:: header_store_parse(name, value)

      The email package calls this method with the name and value provided by
      the application program when the application program is modifying a
      ``Message`` programmatically (as opposed to a ``Message`` created by a
      parser). The method should return the ``(name, value)`` tuple that is to
      be stored in the ``Message`` to represent the header.

      If an implementation wishes to retain compatibility with the existing
      email package policies, the *name* and *value* should be strings or
      string subclasses that do not change the content of the passed in
      arguments.

      There is no default implementation.


   .. method:: header_fetch_parse(name, value)

      The email package calls this method with the *name* and *value* currently
      stored in the ``Message`` when that header is requested by the
      application program, and whatever the method returns is what is passed
      back to the application as the value of the header being retrieved.
      Note that there may be more than one header with the same name stored in
      the ``Message``; the method is passed the specific name and value of the
      header destined to be returned to the application.

      *value* may contain surrogateescaped binary data. There should be no
      surrogateescaped binary data in the value returned by the method.

      There is no default implementation.


   .. method:: fold(name, value)

      The email package calls this method with the *name* and *value* currently
      stored in the ``Message`` for a given header. The method should return a
      string that represents that header "folded" correctly (according to the
      policy settings) by composing the *name* with the *value* and inserting
      :attr:`linesep` characters at the appropriate places. See :rfc:`5322`
      for a discussion of the rules for folding email headers.

      *value* may contain surrogateescaped binary data. There should be no
      surrogateescaped binary data in the string returned by the method.


   .. method:: fold_binary(name, value)

      The same as :meth:`fold`, except that the returned value should be a
      bytes object rather than a string.

      *value* may contain surrogateescaped binary data. These could be
      converted back into binary data in the returned bytes object.



.. class:: EmailPolicy(**kw)

   This concrete :class:`Policy` provides behavior that is intended to be fully
   compliant with the current email RFCs. These include (but are not limited
   to) :rfc:`5322`, :rfc:`2047`, and the current MIME RFCs.

   This policy adds new header parsing and folding algorithms. Instead of
   simple strings, headers are ``str`` subclasses with attributes that depend
   on the type of the field. The parsing and folding algorithms fully implement
   :rfc:`2047` and :rfc:`5322`.

   The default value for the :attr:`~email.policy.Policy.message_factory`
   attribute is :class:`~email.message.EmailMessage`.

   In addition to the settable attributes listed above that apply to all
   policies, this policy adds the following additional attributes:

   .. versionadded:: 3.6 [1]_


   .. attribute:: utf8

      If ``False``, follow :rfc:`5322`, supporting non-ASCII characters in
      headers by encoding them as "encoded words". If ``True``, follow
      :rfc:`6532` and use ``utf-8`` encoding for headers. Messages
      formatted in this way may be passed to SMTP servers that support
      the ``SMTPUTF8`` extension (:rfc:`6531`).


   .. attribute:: refold_source

      If the value for a header in the ``Message`` object originated from a
      :mod:`~email.parser` (as opposed to being set by a program), this
      attribute indicates whether or not a generator should refold that value
      when transforming the message back into serialized form. The possible
      values are:

      ========  ===============================================================
      ``none``  all source values use original folding

      ``long``  source values that have any line that is longer than
                ``max_line_length`` will be refolded

      ``all``   all values are refolded.
      ========  ===============================================================

      The default is ``long``.


   .. attribute:: header_factory

      A callable that takes two arguments, ``name`` and ``value``, where
      ``name`` is a header field name and ``value`` is an unfolded header field
      value, and returns a string subclass that represents that header. A
      default ``header_factory`` (see :mod:`~email.headerregistry`) is provided
      that supports custom parsing for the various address and date :rfc:`5322`
      header field types, and the major MIME header field types. Support for
      additional custom parsing will be added in the future.


   .. attribute:: content_manager

      An object with at least two methods: get_content and set_content.
      When the :meth:`~email.message.EmailMessage.get_content` or
      :meth:`~email.message.EmailMessage.set_content` method of an
      :class:`~email.message.EmailMessage` object is called, it calls the
      corresponding method of this object, passing it the message object as its
      first argument, and any arguments or keywords that were passed to it as
      additional arguments. By default ``content_manager`` is set to
      :data:`~email.contentmanager.raw_data_manager`.

      .. versionadded:: 3.4


   The class provides the following concrete implementations of the abstract
   methods of :class:`Policy`:


   .. method:: header_max_count(name)

      Returns the value of the
      :attr:`~email.headerregistry.BaseHeader.max_count` attribute of the
      specialized class used to represent the header with the given name.


   .. method:: header_source_parse(sourcelines)

      The name is parsed as everything up to the '``:``' and returned
      unmodified. The value is determined by stripping leading whitespace off
      the remainder of the first line, joining all subsequent lines together,
      and stripping any trailing carriage return or linefeed characters.


   .. method:: header_store_parse(name, value)

      The name is returned unchanged. If the input value has a ``name``
      attribute and it matches *name* ignoring case, the value is returned
      unchanged. Otherwise the *name* and *value* are passed to
      ``header_factory``, and the resulting header object is returned as
      the value. In this case a ``ValueError`` is raised if the input value
      contains CR or LF characters.


   .. method:: header_fetch_parse(name, value)

      If the value has a ``name`` attribute, it is returned unmodified.
      Otherwise the *name*, and the *value* with any CR or LF characters
      removed, are passed to the ``header_factory``, and the resulting
      header object is returned.
      Any surrogateescaped bytes get turned into
      the unicode unknown-character glyph.


   .. method:: fold(name, value)

      Header folding is controlled by the :attr:`refold_source` policy setting.
      A value is considered to be a 'source value' if and only if it does not
      have a ``name`` attribute (having a ``name`` attribute means it is a
      header object of some sort). If a source value needs to be refolded
      according to the policy, it is converted into a header object by
      passing the *name* and the *value* with any CR and LF characters removed
      to the ``header_factory``. Folding of a header object is done by
      calling its ``fold`` method with the current policy.

      Source values are split into lines using :meth:`~str.splitlines`. If
      the value is not to be refolded, the lines are rejoined using the
      ``linesep`` from the policy and returned. The exception is lines
      containing non-ASCII binary data. In that case the value is refolded
      regardless of the ``refold_source`` setting, which causes the binary data
      to be CTE encoded using the ``unknown-8bit`` charset.


   .. method:: fold_binary(name, value)

      The same as :meth:`fold` if :attr:`~Policy.cte_type` is ``7bit``, except
      that the returned value is bytes.

      If :attr:`~Policy.cte_type` is ``8bit``, non-ASCII binary data is
      converted back into bytes. Headers with binary data are not refolded,
      regardless of the ``refold_source`` setting, since there is no way to
      know whether the binary data consists of single byte characters or
      multibyte characters.


The following instances of :class:`EmailPolicy` provide defaults suitable for
specific application domains. Note that in the future the behavior of these
instances (in particular the ``HTTP`` instance) may be adjusted to conform even
more closely to the RFCs relevant to their domains.


.. data:: default

   An instance of ``EmailPolicy`` with all defaults unchanged. This policy
   uses the standard Python ``\n`` line endings rather than the RFC-correct
   ``\r\n``.


.. data:: SMTP

   Suitable for serializing messages in conformance with the email RFCs.
   Like ``default``, but with ``linesep`` set to ``\r\n``, which is RFC
   compliant.


.. data:: SMTPUTF8

   The same as ``SMTP`` except that :attr:`~EmailPolicy.utf8` is ``True``.
   Useful for serializing messages to a message store without using encoded
   words in the headers. Should only be used for SMTP transmission if the
   sender or recipient addresses have non-ASCII characters (the
   :meth:`smtplib.SMTP.send_message` method handles this automatically).


.. data:: HTTP

   Suitable for serializing headers for use in HTTP traffic. Like
   ``SMTP`` except that ``max_line_length`` is set to ``None`` (unlimited).


.. data:: strict

   Convenience instance. The same as ``default`` except that
   ``raise_on_defect`` is set to ``True``. This allows any policy to be made
   strict by writing::

        somepolicy + policy.strict


With all of these :class:`EmailPolicies <.EmailPolicy>`, the effective API of
the email package is changed from the Python 3.2 API in the following ways:

* Setting a header on a :class:`~email.message.Message` results in that
  header being parsed and a header object created.

* Fetching a header value from a :class:`~email.message.Message` results
  in that header being parsed and a header object created and
  returned.

* Any header object, or any header that is refolded due to the
  policy settings, is folded using an algorithm that fully implements the
  RFC folding algorithms, including knowing where encoded words are required
  and allowed.
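
The following sketch illustrates these three changes using :data:`default`.
The address, the subject text, and the variable names are made up for the
example::

   from email import policy
   from email.message import EmailMessage

   msg = EmailMessage(policy=policy.default)

   # Setting a header parses the value and stores a header object.
   msg['To'] = 'Fred Flintstone <fred@example.com>'
   msg['Subject'] = 'Grüße'

   # Fetching a header returns a str subclass with extra attributes.
   to = msg['To']
   print(isinstance(to, str), to.name)
   print(to.addresses[0].display_name, to.addresses[0].addr_spec)

   # The string value of a header is the decoded unicode text ...
   print(str(msg['Subject']))

   # ... while serialization folds it into an RFC 2047 encoded word,
   # because the utf8 attribute of the default policy is False.
   serialized = msg.as_string()
   print('Grüße' in serialized)

The exact encoded-word form chosen for the subject is left to the folding
algorithm, so the sketch avoids depending on it.
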

From the application view, this means that any header obtained through the
:class:`~email.message.EmailMessage` is a header object with extra
attributes, whose string value is the fully decoded unicode value of the
header. Likewise, a header may be assigned a new value, or a new header
created, using a unicode string, and the policy will take care of converting
the unicode string into the correct RFC encoded form.

The header objects and their attributes are described in
:mod:`~email.headerregistry`.



.. class:: Compat32(**kw)

   This concrete :class:`Policy` is the backward compatibility policy. It
   replicates the behavior of the email package in Python 3.2. The
   :mod:`~email.policy` module also defines an instance of this class,
   :const:`compat32`, that is used as the default policy. Thus the default
   behavior of the email package is to maintain compatibility with Python 3.2.

   The following attributes have values that are different from the
   :class:`Policy` default:


   .. attribute:: mangle_from_

      The default is ``True``.


   The class provides the following concrete implementations of the
   abstract methods of :class:`Policy`:


   .. method:: header_source_parse(sourcelines)

      The name is parsed as everything up to the '``:``' and returned
      unmodified. The value is determined by stripping leading whitespace off
      the remainder of the first line, joining all subsequent lines together,
      and stripping any trailing carriage return or linefeed characters.


   .. method:: header_store_parse(name, value)

      The name and value are returned unmodified.


   .. method:: header_fetch_parse(name, value)

      If the value contains binary data, it is converted into a
      :class:`~email.header.Header` object using the ``unknown-8bit`` charset.
      Otherwise it is returned unmodified.


   .. method:: fold(name, value)

      Headers are folded using the :class:`~email.header.Header` folding
      algorithm, which preserves existing line breaks in the value, and wraps
      each resulting line to the ``max_line_length``. Non-ASCII binary data are
      CTE encoded using the ``unknown-8bit`` charset.


   .. method:: fold_binary(name, value)

      Headers are folded using the :class:`~email.header.Header` folding
      algorithm, which preserves existing line breaks in the value, and wraps
      each resulting line to the ``max_line_length``. If ``cte_type`` is
      ``7bit``, non-ascii binary data is CTE encoded using the ``unknown-8bit``
      charset. Otherwise the original source header is used, with its existing
      line breaks and any (RFC invalid) binary data it may contain.


.. data:: compat32

   An instance of :class:`Compat32`, providing backward compatibility with the
   behavior of the email package in Python 3.2.


.. rubric:: Footnotes

.. [1] Originally added in 3.3 as a :term:`provisional feature <provisional
       package>`.