:mod:`rfc822` --- Parse RFC 2822 mail headers
=============================================

.. module:: rfc822
   :synopsis: Parse RFC 2822 style mail messages.
   :deprecated:


.. deprecated:: 2.3
   The :mod:`email` package should be used in preference to the :mod:`rfc822`
   module.  This module is present only to maintain backward compatibility, and
   has been removed in Python 3.

This module defines a class, :class:`Message`, which represents an "email
message" as defined by the Internet standard :rfc:`2822`. [#]_  Such messages
consist of a collection of message headers, and a message body.  This module
also defines a helper class :class:`AddressList` for parsing :rfc:`2822`
addresses.  Please refer to the RFC for information on the specific syntax of
:rfc:`2822` messages.

.. index:: module: mailbox

The :mod:`mailbox` module provides classes to read mailboxes produced by
various end-user mail programs.


.. class:: Message(file[, seekable])

   A :class:`Message` instance is instantiated with an input object as parameter.
   :class:`Message` relies only on the input object having a :meth:`readline`
   method; in particular, ordinary file objects qualify.  Instantiation reads
   headers from the input object up to a delimiter line (normally a blank line)
   and stores them in the instance.  The message body, following the headers, is
   not consumed.

   This class can work with any input object that supports a :meth:`readline`
   method.  If the input object has seek and tell capability, the
   :meth:`rewindbody` method will work; also, illegal lines will be pushed back
   onto the input stream.  If the input object lacks seek but has an :meth:`unread`
   method that can push back a line of input, :class:`Message` will use that to
   push back illegal lines.  Thus this class can be used to parse messages coming
   from a buffered stream.

   The optional *seekable* argument is provided as a workaround for certain stdio
   libraries in which :c:func:`tell` discards buffered data before discovering that
   the :c:func:`lseek` system call doesn't work.  For maximum portability, you
   should set the *seekable* argument to zero to prevent that initial :meth:`tell`
   when passing in an unseekable object such as a file object created from a socket
   object.

   Input lines as read from the file may either be terminated by CR-LF or by a
   single linefeed; a terminating CR-LF is replaced by a single linefeed before the
   line is stored.

   All header matching is done independent of upper or lower case; e.g.
   ``m['From']``, ``m['from']`` and ``m['FROM']`` all yield the same result.
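The reading behaviour described above can be illustrated with a minimal sketch.
It is not the :mod:`rfc822` implementation: ``read_headers`` is a hypothetical
helper that ignores continuation lines and malformed headers, and it only shows
that headers are read up to the blank line, that CR-LF is normalized, that
lookups can be made case-insensitive, and that the body is left unconsumed.

```python
import io


def read_headers(fp):
    """Read header lines from fp up to the first blank line.

    Hypothetical helper, not part of rfc822: it handles neither
    continuation lines nor malformed headers.
    """
    headers = {}
    while True:
        line = fp.readline()
        # Normalize a CR-LF line ending to a single linefeed.
        if line.endswith("\r\n"):
            line = line[:-2] + "\n"
        # End of input or the blank delimiter line ends the headers.
        if line in ("", "\n"):
            break
        name, _, value = line.partition(":")
        # Lowercased keys make later lookups case-insensitive.
        headers[name.strip().lower()] = value.strip()
    return headers


fp = io.StringIO("From: jack@cwi.nl\r\nSubject: Hello\r\n\r\nBody text\n")
headers = read_headers(fp)
body = fp.read()  # the body was left unconsumed by read_headers
```

Because only ``readline`` is called on the input object, any object providing
that method could stand in for the ``io.StringIO`` used here.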


.. class:: AddressList(field)

   You may instantiate the :class:`AddressList` helper class using a single string
   parameter, a comma-separated list of :rfc:`2822` addresses to be parsed.  (The
   parameter ``None`` yields an empty list.)


.. function:: quote(str)

   Return a new string with backslashes in *str* replaced by two backslashes and
   double quotes replaced by backslash-double quote.
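As a rough illustration of the escaping rule, the sketch below reimplements it
in a few lines and compares the result with :func:`email.utils.quote`, the
Python 3 counterpart. The local ``quote`` is a stand-in written for this
example, not the module's source.

```python
from email.utils import quote as email_quote


def quote(s):
    """Escape backslashes first, then double quotes."""
    # Backslashes must be doubled before quotes are escaped; otherwise the
    # backslashes added for the quotes would themselves be doubled.
    return s.replace("\\", "\\\\").replace('"', '\\"')


sample = 'say "hi" \\ bye'
quoted = quote(sample)
```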


.. function:: unquote(str)

   Return a new string which is an *unquoted* version of *str*. If *str* begins
   and ends with double quotes, they are stripped off.  Likewise if *str* begins
   and ends with angle brackets, they are stripped off.
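A minimal sketch of the stripping rule follows. It covers only what the
description states (removing one surrounding pair of double quotes or angle
brackets); any handling of backslash escapes inside the string is deliberately
left out, so treat it as an approximation.

```python
def unquote(s):
    """Strip one pair of surrounding double quotes or angle brackets."""
    if len(s) > 1:
        if s.startswith('"') and s.endswith('"'):
            return s[1:-1]
        if s.startswith("<") and s.endswith(">"):
            return s[1:-1]
    # Anything else, including a lone quote character, is returned as-is.
    return s


name = unquote('"Jack Jansen"')
address = unquote("<jack@cwi.nl>")
```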


.. function:: parseaddr(address)

   Parse *address*, which should be the value of some address-containing field such
   as :mailheader:`To` or :mailheader:`Cc`, into its constituent "realname" and
   "email address" parts. Returns a tuple of that information, unless the parse
   fails, in which case a 2-tuple ``(None, None)`` is returned.
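Since :mod:`rfc822` is not available in Python 3, the sketch below uses
:func:`email.utils.parseaddr` from the replacement :mod:`email` package to show
the same kind of split. One known difference: on a failed parse the
:mod:`email` version returns ``('', '')`` rather than ``(None, None)``.

```python
from email.utils import parseaddr

# Angle-bracket form: the display name precedes the address.
realname, address = parseaddr("Jack Jansen <jack@cwi.nl>")

# Comment form: the name follows the address in parentheses.
comment_form = parseaddr("jack@cwi.nl (Jack Jansen)")
```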


.. function:: dump_address_pair(pair)

   The inverse of :func:`parseaddr`, this takes a 2-tuple of the form ``(realname,
   email_address)`` and returns the string value suitable for a :mailheader:`To` or
   :mailheader:`Cc` header.  If the first element of *pair* is false, then the
   second element is returned unmodified.
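The rule can be sketched as below. The exact output format is an assumption:
this stand-in uses the ``"name" <address>`` form that :class:`AddressList`
renders, and it is not the module's own code.

```python
def dump_address_pair(pair):
    """Format (realname, email_address) for a To or Cc header."""
    realname, email_address = pair
    if realname:
        # Assumed format: quoted display name followed by <address>.
        return '"%s" <%s>' % (realname, email_address)
    # A false first element means the address is returned unmodified.
    return email_address


with_name = dump_address_pair(("Jack Jansen", "jack@cwi.nl"))
without_name = dump_address_pair(("", "jack@cwi.nl"))
```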


.. function:: parsedate(date)

   Attempts to parse a date according to the rules in :rfc:`2822`.  However, some
   mailers don't follow that format as specified, so :func:`parsedate` tries to
   guess correctly in such cases.  *date* is a string containing an :rfc:`2822`
   date, such as ``'Mon, 20 Nov 1995 19:12:08 -0500'``.  If it succeeds in parsing
   the date, :func:`parsedate` returns a 9-tuple that can be passed directly to
   :func:`time.mktime`; otherwise ``None`` will be returned.  Note that indexes 6,
   7, and 8 of the result tuple are not usable.
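For illustration, the example date above can be parsed with
:func:`email.utils.parsedate`, the Python 3 counterpart of this function. Only
the first six fields are inspected, because the remaining three are not
usable.

```python
from email.utils import parsedate

parsed = parsedate("Mon, 20 Nov 1995 19:12:08 -0500")

# Year, month, day, hour, minute, second; indexes 6-8 carry no real data.
date_fields = parsed[:6]

# A string that cannot be parsed as a date yields None.
unparsable = parsedate("not a date")
```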


.. function:: parsedate_tz(date)

   Performs the same function as :func:`parsedate`, but returns either ``None`` or
   a 10-tuple; the first 9 elements make up a tuple that can be passed directly to
   :func:`time.mktime`, and the tenth is the offset of the date's timezone from UTC
   (which is the official term for Greenwich Mean Time).  (Note that the sign of
   the timezone offset is the opposite of the sign of the ``time.timezone``
   variable for the same timezone; the latter variable follows the POSIX standard
   while this module follows :rfc:`2822`.)  If the input string has no timezone,
   the last element of the tuple returned is ``None``.  Note that indexes 6, 7, and
   8 of the result tuple are not usable.
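The sign convention is easiest to see with a concrete value. This sketch uses
:func:`email.utils.parsedate_tz` from Python 3 as a stand-in; its handling of
dates without a timezone may differ from the description above, so only the
explicit-offset case is shown.

```python
from email.utils import parsedate_tz

parsed = parsedate_tz("Mon, 20 Nov 1995 19:12:08 -0500")

# The tenth element is the offset from UTC in seconds: -0500 means five
# hours behind UTC, so the value is negative (RFC 2822 sign convention).
offset_seconds = parsed[9]
```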


.. function:: mktime_tz(tuple)

   Turn a 10-tuple as returned by :func:`parsedate_tz` into a UTC timestamp.  If
   the timezone item in the tuple is ``None``, assume local time.  Minor
   deficiency: this first interprets the first 8 elements as a local time and then
   compensates for the timezone difference; this may yield a slight error around
   daylight saving time switch dates.  Not enough to worry about for common use.
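As a worked example, ``19:12:08 -0500`` on 20 November 1995 corresponds to
``00:12:08`` UTC on 21 November. The sketch below checks that with
:func:`email.utils.mktime_tz` and :func:`calendar.timegm`; both are Python 3
functions used here as stand-ins for the :mod:`rfc822` ones.

```python
import calendar
from email.utils import mktime_tz, parsedate_tz

parsed = parsedate_tz("Mon, 20 Nov 1995 19:12:08 -0500")
timestamp = mktime_tz(parsed)

# The same instant expressed directly in UTC (five hours later).
expected = calendar.timegm((1995, 11, 21, 0, 12, 8, 0, 0, 0))
```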


.. seealso::

   Module :mod:`email`
      Comprehensive email handling package; supersedes the :mod:`rfc822` module.

   Module :mod:`mailbox`
      Classes to read various mailbox formats produced by end-user mail programs.

   Module :mod:`mimetools`
      Subclass of :class:`rfc822.Message` that handles MIME encoded messages.


.. _message-objects:

Message Objects
---------------

A :class:`Message` instance has the following methods:


.. method:: Message.rewindbody()

   Seek to the start of the message body.  This only works if the file object is
   seekable.


.. method:: Message.isheader(line)

   Returns a line's canonicalized fieldname (the dictionary key that will be used
   to index it) if the line is a legal :rfc:`2822` header; otherwise returns
   ``None`` (implying that parsing should stop here and the line be pushed back on
   the input stream).  It is sometimes useful to override this method in a
   subclass.


.. method:: Message.islast(line)

   Return true if the given line is a delimiter on which Message should stop.  The
   delimiter line is consumed, and the file object's read location positioned
   immediately after it.  By default this method just checks that the line is
   blank, but you can override it in a subclass.


.. method:: Message.iscomment(line)

   Return ``True`` if the given line should be ignored entirely, just skipped. By
   default this is a stub that always returns ``False``, but you can override it in
   a subclass.


.. method:: Message.getallmatchingheaders(name)

   Return a list of lines consisting of all headers matching *name*, if any.  Each
   physical line, whether it is a continuation line or not, is a separate list
   item.  Return the empty list if no header matches *name*.


.. method:: Message.getfirstmatchingheader(name)

   Return a list of lines comprising the first header matching *name*, and its
   continuation line(s), if any.  Return ``None`` if there is no header matching
   *name*.


.. method:: Message.getrawheader(name)

   Return a single string consisting of the text after the colon in the first
   header matching *name*.  This includes leading whitespace, the trailing
   linefeed, and internal linefeeds and whitespace if any continuation
   line(s) were present.  Return ``None`` if there is no header matching *name*.


.. method:: Message.getheader(name[, default])

   Return a single string consisting of the last header matching *name*,
   but strip leading and trailing whitespace.
   Internal whitespace is not stripped.  The optional *default* argument can be
   used to specify a different default to be returned when there is no header
   matching *name*; it defaults to ``None``.
   This is the preferred way to get parsed headers.


.. method:: Message.get(name[, default])

   An alias for :meth:`getheader`, to make the interface more compatible with
   regular dictionaries.


.. method:: Message.getaddr(name)

   Return a pair ``(full name, email address)`` parsed from the string returned by
   ``getheader(name)``.  If no header matching *name* exists, return ``(None,
   None)``; otherwise both the full name and the address are (possibly empty)
   strings.

   Example: If *m*'s first :mailheader:`From` header contains the string
   ``'jack@cwi.nl (Jack Jansen)'``, then ``m.getaddr('From')`` will yield the pair
   ``('Jack Jansen', 'jack@cwi.nl')``. If the header contained ``'Jack Jansen
   <jack@cwi.nl>'`` instead, it would yield the exact same result.


.. method:: Message.getaddrlist(name)

   This is similar to ``getaddr(name)``, but parses a header containing a list of
   email addresses (e.g. a :mailheader:`To` header) and returns a list of ``(full
   name, email address)`` pairs (even if there was only one address in the header).
   If there is no header matching *name*, return an empty list.

   If multiple headers exist that match the named header (e.g. if there are several
   :mailheader:`Cc` headers), all are parsed for addresses. Any continuation lines
   the named headers contain are also parsed.
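The idea of collecting addresses from several header values can be tried in
Python 3 with :func:`email.utils.getaddresses`. This is a stand-in from the
:mod:`email` package rather than the method itself, and the header values below
are made up for the example.

```python
from email.utils import getaddresses

# Hypothetical values of two separate Cc headers.
cc_headers = [
    "jack@cwi.nl (Jack Jansen)",
    "Guido <guido@example.org>, barry@example.org",
]

# All values are parsed; each address becomes a (full name, address) pair.
pairs = getaddresses(cc_headers)
```

An address without a display name yields an empty string as its first element.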


.. method:: Message.getdate(name)

   Retrieve a header using :meth:`getheader` and parse it into a 9-tuple compatible
   with :func:`time.mktime`; note that fields 6, 7, and 8 are not usable.  If
   there is no header matching *name*, or it is unparsable, return ``None``.

   Date parsing appears to be a black art, and not all mailers adhere to the
   standard.  While it has been tested and found correct on a large collection of
   email from many sources, it is still possible that this function may
   occasionally yield an incorrect result.


.. method:: Message.getdate_tz(name)

   Retrieve a header using :meth:`getheader` and parse it into a 10-tuple; the
   first 9 elements will make a tuple compatible with :func:`time.mktime`, and the
   10th is a number giving the offset of the date's timezone from UTC.  Note that
   fields 6, 7, and 8 are not usable.  Similarly to :meth:`getdate`, if there is
   no header matching *name*, or it is unparsable, return ``None``.

:class:`Message` instances also support a limited mapping interface. In
particular: ``m[name]`` is like ``m.getheader(name)`` but raises :exc:`KeyError`
if there is no matching header; and ``len(m)``, ``m.get(name[, default])``,
``name in m``, ``m.keys()``, ``m.values()``, ``m.items()``, and
``m.setdefault(name[, default])`` act as expected, with the one difference
that :meth:`setdefault` uses an empty string as the default value.
:class:`Message` instances also support the mapping writable interface ``m[name]
= value`` and ``del m[name]``.  :class:`Message` objects do not support the
:meth:`clear`, :meth:`copy`, :meth:`popitem`, or :meth:`update` methods of the
mapping interface.  (Support for :meth:`get` and :meth:`setdefault` was only
added in Python 2.2.)
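The lookup rules in the paragraph above can be summarized with a small
stand-in class. ``HeaderMap`` is hypothetical and exists only for this sketch;
it mimics the case-insensitive keys, the :exc:`KeyError` on a missing header,
and the empty-string default of ``setdefault``, and nothing else.

```python
class HeaderMap:
    """Hypothetical stand-in for the mapping rules; not rfc822.Message."""

    def __init__(self, pairs):
        # Keys are stored lowercased so lookups ignore case.
        self._dict = {name.lower(): value for name, value in pairs}

    def __getitem__(self, name):
        return self._dict[name.lower()]  # raises KeyError when absent

    def __contains__(self, name):
        return name.lower() in self._dict

    def __len__(self):
        return len(self._dict)

    def get(self, name, default=None):
        return self._dict.get(name.lower(), default)

    def setdefault(self, name, default=""):
        # Unlike dict.setdefault, the default value is an empty string.
        return self._dict.setdefault(name.lower(), default)


m = HeaderMap([("From", "jack@cwi.nl"), ("Subject", "Hello")])
sender = m["FROM"]
missing = m.get("Cc")
created = m.setdefault("Cc")
```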

Finally, :class:`Message` instances have some public instance variables:


.. attribute:: Message.headers

   A list containing the entire set of header lines, in the order in which they
   were read (except that setitem calls may disturb this order). Each line contains
   a trailing newline.  The blank line terminating the headers is not contained in
   the list.


.. attribute:: Message.fp

   The file or file-like object passed at instantiation time.  This can be used to
   read the message content.


.. attribute:: Message.unixfrom

   The Unix ``From`` line, if the message had one, or an empty string.  This is
   needed to regenerate the message in some contexts, such as an ``mbox``\ -style
   mailbox file.


.. _addresslist-objects:

AddressList Objects
-------------------

An :class:`AddressList` instance has the following methods:


.. method:: AddressList.__len__()

   Return the number of addresses in the address list.


.. method:: AddressList.__str__()

   Return a canonicalized string representation of the address list. Addresses are
   rendered in ``"name" <host@domain>`` form, comma-separated.


.. method:: AddressList.__add__(alist)

   Return a new :class:`AddressList` instance that contains all addresses in both
   :class:`AddressList` operands, with duplicates removed (set union).


.. method:: AddressList.__iadd__(alist)

   In-place version of :meth:`__add__`; turns this :class:`AddressList` instance
   into the union of itself and the right-hand instance, *alist*.


.. method:: AddressList.__sub__(alist)

   Return a new :class:`AddressList` instance that contains every address in the
   left-hand :class:`AddressList` operand that is not present in the right-hand
   address operand (set difference).


.. method:: AddressList.__isub__(alist)

   In-place version of :meth:`__sub__`, removing addresses in this list which are
   also in *alist*.
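The set-like semantics of these operators can be sketched on plain lists of
``(name, address)`` pairs. The helper names ``union`` and ``difference`` are
hypothetical, and keeping the left operand's order is an assumption of this
sketch rather than documented behaviour.

```python
def union(left, right):
    """Pairs from left, then pairs from right not already present."""
    result = list(left)
    for pair in right:
        if pair not in result:
            result.append(pair)
    return result


def difference(left, right):
    """Pairs from left that do not appear in right."""
    return [pair for pair in left if pair not in right]


a = [("Jack Jansen", "jack@cwi.nl"), ("", "guido@example.org")]
b = [("", "guido@example.org"), ("", "barry@example.org")]

combined = union(a, b)
remaining = difference(a, b)
```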

Finally, :class:`AddressList` instances have one public instance variable:


.. attribute:: AddressList.addresslist

   A list of 2-tuples of strings, one per address.  In each tuple, the first item
   is the canonicalized name part and the second is the actual route-address
   (``'@'``\ -separated username-host.domain pair).

.. rubric:: Footnotes

.. [#] This module originally conformed to :rfc:`822`, hence the name.  Since then,
   :rfc:`2822` has been released as an update to :rfc:`822`.  This module should be
   considered :rfc:`2822`\ -conformant, especially in cases where the syntax or
   semantics have changed since :rfc:`822`.