****************************
  What's New in Python 2.5
****************************

:Author: A.M. Kuchling

.. |release| replace:: 1.01

.. $Id: whatsnew25.tex 56611 2007-07-29 08:26:10Z georg.brandl $
.. Fix XXX comments

This article explains the new features in Python 2.5.  The final release of
Python 2.5 is scheduled for August 2006; :pep:`356` describes the planned
release schedule.

The changes in Python 2.5 are an interesting mix of language and library
improvements. The library enhancements will be more important to Python's user
community, I think, because several widely useful packages were added.  New
modules include ElementTree for XML processing (:mod:`xml.etree`),
the SQLite database module (:mod:`sqlite3`), and the :mod:`ctypes`
module for calling C functions.

The language changes are of middling significance.  Some pleasant new features
were added, but most of them aren't features that you'll use every day.
Conditional expressions were finally added to the language using a novel syntax;
see section :ref:`pep-308`.  The new ':keyword:`with`' statement will make
writing cleanup code easier (section :ref:`pep-343`).  Values can now be passed
into generators (section :ref:`pep-342`).  Imports are now visible as either
absolute or relative (section :ref:`pep-328`).  Some corner cases of exception
handling are handled better (section :ref:`pep-341`).  All these improvements
are worthwhile, but they're improvements to one specific language feature or
another; none of them are broad modifications to Python's semantics.

As well as the language and library additions, other improvements and bugfixes
were made throughout the source tree.  A search through the SVN change logs
finds there were 353 patches applied and 458 bugs fixed between Python 2.4 and
2.5.  (Both figures are likely to be underestimates.)

This article doesn't try to be a complete specification of the new features;
instead changes are briefly introduced using helpful examples.  For full
details, you should always refer to the documentation for Python 2.5 at
https://docs.python.org. If you want to understand the complete implementation
and design rationale, refer to the PEP for a particular new feature.

Comments, suggestions, and error reports for this document are welcome; please
e-mail them to the author or open a bug in the Python bug tracker.

.. ======================================================================


.. _pep-308:

PEP 308: Conditional Expressions
================================

For a long time, people have been requesting a way to write conditional
expressions, which are expressions that return value A or value B depending on
whether a Boolean value is true or false.  A conditional expression lets you
write a single assignment statement that has the same effect as the following::

   if condition:
       x = true_value
   else:
       x = false_value

There have been endless tedious discussions of syntax on both python-dev and
comp.lang.python.  A vote was even held that found the majority of voters wanted
conditional expressions in some form, but there was no syntax that was preferred
by a clear majority. Candidates included C's ``cond ? true_v : false_v``, ``if
cond then true_v else false_v``, and 16 other variations.

Guido van Rossum eventually chose a surprising syntax::

   x = true_value if condition else false_value

Evaluation is still lazy as in existing Boolean expressions, so the order of
evaluation jumps around a bit.  The *condition* expression in the middle is
evaluated first, and the *true_value* expression is evaluated only if the
condition was true.  Similarly, the *false_value* expression is only evaluated
when the condition is false.

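The evaluation order can be checked with a small sketch.  The ``record``
helper and the ``calls`` list are hypothetical names introduced only for this
illustration; each call notes its label before returning its value, so the
list shows which operands were actually evaluated and in what order.

```python
calls = []

def record(label, value):
    # Hypothetical helper: note that this operand was evaluated, then return it.
    calls.append(label)
    return value

# The condition in the middle runs first; only one of the two outer
# operands is evaluated afterwards.
result = (record('true_value', 'yes')
          if record('condition', True)
          else record('false_value', 'no'))
```
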
This syntax may seem strange and backwards; why does the condition go in the
*middle* of the expression, and not in the front as in C's ``c ? x : y``?  The
decision was checked by applying the new syntax to the modules in the standard
library and seeing how the resulting code read.  In many cases where a
conditional expression is used, one value seems to be the 'common case' and one
value is an 'exceptional case', used only on rarer occasions when the condition
isn't met.  The conditional syntax makes this pattern a bit more obvious::

   contents = ((doc + '\n') if doc else '')

I read the above statement as meaning "here *contents* is usually assigned a
value of ``doc+'\n'``; sometimes *doc* is empty, in which special case an empty
string is returned."  I doubt I will use conditional expressions very often
where there isn't a clear common and uncommon case.

There was some discussion of whether the language should require surrounding
conditional expressions with parentheses.  The decision was made to *not*
require parentheses in the Python language's grammar, but as a matter of style I
think you should always use them. Consider these two statements::

   # First version -- no parens
   level = 1 if logging else 0

   # Second version -- with parens
   level = (1 if logging else 0)

In the first version, I think a reader's eye might group the statement into
'level = 1', 'if logging', 'else 0', and think that the condition decides
whether the assignment to *level* is performed.  The second version reads
better, in my opinion, because it makes it clear that the assignment is always
performed and the choice is being made between two values.

Another reason for including the parentheses: a few odd combinations of list
comprehensions and lambdas could look like incorrect conditional expressions.
See :pep:`308` for some examples.  If you put parentheses around your
conditional expressions, you won't run into this case.


.. seealso::

   :pep:`308` - Conditional Expressions
      PEP written by Guido van Rossum and Raymond D. Hettinger; implemented by Thomas
      Wouters.

.. ======================================================================


.. _pep-309:

PEP 309: Partial Function Application
=====================================

The :mod:`functools` module is intended to contain tools for functional-style
programming.

One useful tool in this module is the :func:`partial` function. For programs
written in a functional style, you'll sometimes want to construct variants of
existing functions that have some of the parameters filled in.  Consider a
Python function ``f(a, b, c)``; you could create a new function ``g(b, c)`` that
was equivalent to ``f(1, b, c)``.  This is called "partial function
application".

:func:`partial` takes the arguments ``(function, arg1, arg2, ... kwarg1=value1,
kwarg2=value2)``.  The resulting object is callable, so you can just call it to
invoke *function* with the filled-in arguments.

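As a minimal sketch of the ``f(a, b, c)`` case (the function body is a
placeholder chosen for illustration, and ``h`` is an extra name not used in the
text), positional arguments given to :func:`partial` are filled in from the
left, and keyword arguments are stored in the same way:

```python
import functools

def f(a, b, c):
    # Placeholder body: return the arguments so the binding is visible.
    return (a, b, c)

# g(b, c) behaves like f(1, b, c).
g = functools.partial(f, 1)

# Keyword arguments can be frozen too; h(a, b) behaves like f(a, b, c=30).
h = functools.partial(f, c=30)

first = g(2, 3)
second = h(10, 20)
```
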
Here's a small but realistic example::

   import functools

   def log (message, subsystem):
       "Write the contents of 'message' to the specified subsystem."
       print '%s: %s' % (subsystem, message)
       ...

   server_log = functools.partial(log, subsystem='server')
   server_log('Unable to open socket')

Here's another example, from a program that uses PyGTK.  Here a context-sensitive
pop-up menu is being constructed dynamically.  The callback provided
for the menu option is a partially applied version of the :meth:`open_item`
method, where the first argument has been provided. ::

   ...
   class Application:
       def open_item(self, path):
           ...
       def init (self):
           open_func = functools.partial(self.open_item, item_path)
           popup_menu.append( ("Open", open_func, 1) )

Another function in the :mod:`functools` module is the
``update_wrapper(wrapper, wrapped)`` function that helps you write
well-behaved decorators.  :func:`update_wrapper` copies the name, module, and
docstring attribute to a wrapper function so that tracebacks inside the wrapped
function are easier to understand.  For example, you might write::

   def my_decorator(f):
       def wrapper(*args, **kwds):
           print 'Calling decorated function'
           return f(*args, **kwds)
       functools.update_wrapper(wrapper, f)
       return wrapper

:func:`wraps` is a decorator that can be used inside your own decorators to copy
the wrapped function's information.  An alternate version of the previous
example would be::

   def my_decorator(f):
       @functools.wraps(f)
       def wrapper(*args, **kwds):
           print 'Calling decorated function'
           return f(*args, **kwds)
       return wrapper

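To see what :func:`wraps` preserves, the following sketch decorates a
hypothetical ``greet`` function and inspects the result.  Without the
``@functools.wraps(f)`` line, the reported name would be ``wrapper`` and the
docstring would be lost.

```python
import functools

def my_decorator(f):
    @functools.wraps(f)
    def wrapper(*args, **kwds):
        return f(*args, **kwds)
    return wrapper

@my_decorator
def greet(name):
    "Return a greeting for 'name'."
    return 'Hello, ' + name

# The wrapper now reports the wrapped function's metadata.
name = greet.__name__
doc = greet.__doc__
```
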

.. seealso::

   :pep:`309` - Partial Function Application
      PEP proposed and written by Peter Harris; implemented by Hye-Shik Chang and Nick
      Coghlan, with adaptations by Raymond Hettinger.

.. ======================================================================


.. _pep-314:

PEP 314: Metadata for Python Software Packages v1.1
===================================================

Some simple dependency support was added to Distutils.  The :func:`setup`
function now has ``requires``, ``provides``, and ``obsoletes`` keyword
parameters.  When you build a source distribution using the ``sdist`` command,
the dependency information will be recorded in the :file:`PKG-INFO` file.

Another new keyword parameter is ``download_url``, which should be set to a URL
for the package's source code.  This means it's now possible to look up an entry
in the package index, determine the dependencies for a package, and download the
required packages. ::

   from distutils.core import setup

   VERSION = '1.0'
   setup(name='PyPackage',
         version=VERSION,
         requires=['numarray', 'zlib (>=1.1.4)'],
         obsoletes=['OldPackage'],
         download_url=('http://www.example.com/pypackage/dist/pkg-%s.tar.gz'
                       % VERSION),
        )

Another new enhancement to the Python package index at
https://pypi.org is storing source and binary archives for a
package.  The new :command:`upload` Distutils command will upload a package to
the repository.

Before a package can be uploaded, you must be able to build a distribution using
the :command:`sdist` Distutils command.  Once that works, you can run ``python
setup.py upload`` to add your package to the PyPI archive.  Optionally you can
GPG-sign the package by supplying the :option:`!--sign` and :option:`!--identity`
options.

Package uploading was implemented by Martin von Löwis and Richard Jones.


.. seealso::

   :pep:`314` - Metadata for Python Software Packages v1.1
      PEP proposed and written by A.M. Kuchling, Richard Jones, and Fred Drake;
      implemented by Richard Jones and Fred Drake.

.. ======================================================================


.. _pep-328:

PEP 328: Absolute and Relative Imports
======================================

The simpler part of PEP 328 was implemented in Python 2.4: parentheses could now
be used to enclose the names imported from a module using the ``from ... import
...`` statement, making it easier to import many different names.

The more complicated part has been implemented in Python 2.5: importing a module
can be specified to use absolute or package-relative imports.  The plan is to
move toward making absolute imports the default in future versions of Python.

Let's say you have a package directory like this::

   pkg/
   pkg/__init__.py
   pkg/main.py
   pkg/string.py

This defines a package named :mod:`pkg` containing the :mod:`pkg.main` and
:mod:`pkg.string` submodules.

Consider the code in the :file:`main.py` module.  What happens if it executes
the statement ``import string``?  In Python 2.4 and earlier, it will first look
in the package's directory to perform a relative import, find
:file:`pkg/string.py`, import the contents of that file as the
:mod:`pkg.string` module, and bind that module to the name ``string`` in the
:mod:`pkg.main` module's namespace.

That's fine if :mod:`pkg.string` was what you wanted.  But what if you wanted
Python's standard :mod:`string` module?  There's no clean way to ignore
:mod:`pkg.string` and look for the standard module; generally you had to look at
the contents of ``sys.modules``, which is slightly unclean.  Holger Krekel's
:mod:`py.std` package provides a tidier way to perform imports from the standard
library, ``import py; py.std.string.join()``, but that package isn't available
on all Python installations.

Reading code which relies on relative imports is also less clear, because a
reader may be confused about which module, :mod:`string` or :mod:`pkg.string`,
is intended to be used.  Python users soon learned not to duplicate the names of
standard library modules in the names of their packages' submodules, but you
can't protect against having your submodule's name being used for a new module
added in a future version of Python.

In Python 2.5, you can switch :keyword:`import`'s behaviour to absolute imports
using a ``from __future__ import absolute_import`` directive.  This absolute-import
behaviour will become the default in a future version (probably Python
2.7).  Once absolute imports are the default, ``import string`` will always
find the standard library's version. It's suggested that users should begin
using absolute imports as much as possible, so it's preferable to begin writing
``from pkg import string`` in your code.

Relative imports are still possible by adding a leading period to the module
name when using the ``from ... import`` form::

   # Import names from pkg.string
   from .string import name1, name2
   # Import pkg.string
   from . import string

This imports the :mod:`string` module relative to the current package, so in
:mod:`pkg.main` this will import *name1* and *name2* from :mod:`pkg.string`.
Additional leading periods perform the relative import starting from the parent
of the current package.  For example, code in the :mod:`A.B.C` module can do::

   from . import D                 # Imports A.B.D
   from .. import E                # Imports A.E
   from ..F import G               # Imports A.F.G

Leading periods cannot be used with the ``import modname`` form of the import
statement, only the ``from ... import`` form.

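The difference between the two import forms can be tried out without touching
an existing project.  The sketch below is an illustration only: it writes a
throwaway package with the same shape as :mod:`pkg` to a temporary directory,
under the hypothetical name ``demo_pkg`` to avoid clashing with anything
installed, and then imports its ``main`` submodule.  It is written for a
current Python 3 interpreter, where ``importlib.import_module`` is available;
the ``__future__`` line inside :file:`main.py` is what Python 2.5 needs and is
accepted but redundant in later versions.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package on disk: demo_pkg/{__init__,main,string}.py
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, 'demo_pkg')
os.mkdir(pkg_dir)

sources = {
    '__init__.py': '',
    'string.py': 'name1 = "defined in demo_pkg.string"\n',
    'main.py': (
        'from __future__ import absolute_import\n'
        'import string as std_string          # absolute: standard library\n'
        'from . import string as pkg_string   # relative: sibling submodule\n'
        'from .string import name1\n'
    ),
}
for filename, text in sources.items():
    with open(os.path.join(pkg_dir, filename), 'w') as handle:
        handle.write(text)

# Make the temporary directory importable, then load the submodule.
sys.path.insert(0, root)
main = importlib.import_module('demo_pkg.main')
```
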

.. seealso::

   :pep:`328` - Imports: Multi-Line and Absolute/Relative
      PEP written by Aahz; implemented by Thomas Wouters.

   https://pylib.readthedocs.org/
      The py library by Holger Krekel, which contains the :mod:`py.std` package.

.. ======================================================================


.. _pep-338:

PEP 338: Executing Modules as Scripts
=====================================

The :option:`-m` switch added in Python 2.4 to execute a module as a script
gained a few more abilities.  Instead of being implemented in C code inside the
Python interpreter, the switch now uses an implementation in a new module,
:mod:`runpy`.

The :mod:`runpy` module implements a more sophisticated import mechanism so that
it's now possible to run modules in a package such as :mod:`pychecker.checker`.
The module also supports alternative import mechanisms such as the
:mod:`zipimport` module.  This means you can add a .zip archive's path to
``sys.path`` and then use the :option:`-m` switch to execute code from the
archive.

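The :mod:`runpy` module can also be called directly from Python code.  In the
sketch below, ``hello_338`` is a hypothetical module written to a temporary
directory purely for illustration; ``runpy.run_module`` locates it through
``sys.path``, executes it under the requested ``run_name``, and returns the
resulting module globals as a dictionary.

```python
import os
import runpy
import sys
import tempfile

# Write a one-line module to a scratch directory.
root = tempfile.mkdtemp()
handle = open(os.path.join(root, 'hello_338.py'), 'w')
handle.write('greeting = "running as " + __name__\n')
handle.close()

# Make the directory importable, then execute the module as a script would be.
sys.path.insert(0, root)
namespace = runpy.run_module('hello_338', run_name='__main__')
```
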

.. seealso::

   :pep:`338` - Executing modules as scripts
      PEP written and implemented by Nick Coghlan.

.. ======================================================================


.. _pep-341:

PEP 341: Unified try/except/finally
===================================

Until Python 2.5, the :keyword:`try` statement came in two flavours. You could
use a :keyword:`finally` block to ensure that code is always executed, or one or
more :keyword:`except` blocks to catch specific exceptions.  You couldn't
combine both :keyword:`except` blocks and a :keyword:`finally` block, because
generating the right bytecode for the combined version was complicated and it
wasn't clear what the semantics of the combined statement should be.

Guido van Rossum spent some time working with Java, which does support the
equivalent of combining :keyword:`except` blocks and a :keyword:`finally` block,
and this clarified what the statement should mean.  In Python 2.5, you can now
write::

   try:
       block-1 ...
   except Exception1:
       handler-1 ...
   except Exception2:
       handler-2 ...
   else:
       else-block
   finally:
       final-block

The code in *block-1* is executed.  If the code raises an exception, the various
:keyword:`except` blocks are tested: if the exception is of class
:class:`Exception1`, *handler-1* is executed; otherwise if it's of class
:class:`Exception2`, *handler-2* is executed, and so forth.  If no exception is
raised, the *else-block* is executed.

No matter what happened previously, the *final-block* is executed once the code
block is complete and any raised exceptions handled. Even if there's an error in
an exception handler or the *else-block* and a new exception is raised, the code
in the *final-block* is still run.

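The order of execution can be traced with a short sketch.  The ``trace``,
``succeed``, and ``fail`` functions are hypothetical names used only here;
``trace`` records which clauses of the combined statement ran for a given
callable.

```python
def trace(action):
    "Run action() and record which clauses of the try statement executed."
    events = []
    try:
        action()
    except ZeroDivisionError:
        events.append('handler')
    else:
        events.append('else-block')
    finally:
        events.append('final-block')
    return events

def succeed():
    return 1

def fail():
    return 1 / 0

on_success = trace(succeed)
on_failure = trace(fail)
```
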

.. seealso::

   :pep:`341` - Unifying try-except and try-finally
      PEP written by Georg Brandl; implementation by Thomas Lee.

.. ======================================================================


.. _pep-342:

PEP 342: New Generator Features
===============================

Python 2.5 adds a simple way to pass values *into* a generator. As introduced in
Python 2.3, generators only produce output; once a generator's code was invoked
to create an iterator, there was no way to pass any new information into the
function when its execution is resumed.  Sometimes the ability to pass in some
information would be useful.  Hackish solutions to this include making the
generator's code look at a global variable and then changing the global
variable's value, or passing in some mutable object that callers then modify.

To refresh your memory of basic generators, here's a simple example::

   def counter (maximum):
       i = 0
       while i < maximum:
           yield i
           i += 1

When you call ``counter(10)``, the result is an iterator that returns the values
from 0 up to 9.  On encountering the :keyword:`yield` statement, the iterator
returns the provided value and suspends the function's execution, preserving the
local variables. Execution resumes on the following call to the iterator's
:meth:`next` method, picking up after the :keyword:`yield` statement.

In Python 2.3, :keyword:`yield` was a statement; it didn't return any value.  In
2.5, :keyword:`yield` is now an expression, returning a value that can be
assigned to a variable or otherwise operated on::

   val = (yield i)

I recommend that you always put parentheses around a :keyword:`yield` expression
when you're doing something with the returned value, as in the above example.
The parentheses aren't always necessary, but it's easier to always add them
instead of having to remember when they're needed.

(:pep:`342` explains the exact rules, which are that a :keyword:`yield`\
-expression must always be parenthesized except when it occurs at the top-level
expression on the right-hand side of an assignment.  This means you can write
``val = yield i`` but have to use parentheses when there's an operation, as in
``val = (yield i) + 12``.)

Values are sent into a generator by calling its ``send(value)`` method.  The
generator's code is then resumed and the :keyword:`yield` expression returns the
specified *value*.  If the regular :meth:`next` method is called, the
:keyword:`yield` returns :const:`None`.

Here's the previous example, modified to allow changing the value of the
internal counter. ::

   def counter (maximum):
       i = 0
       while i < maximum:
           val = (yield i)
           # If value provided, change counter
           if val is not None:
               i = val
           else:
               i += 1

And here's an example of changing the counter::

   >>> it = counter(10)
   >>> print it.next()
   0
   >>> print it.next()
   1
   >>> print it.send(8)
   8
   >>> print it.next()
   9
   >>> print it.next()
   Traceback (most recent call last):
     File "t.py", line 15, in ?
       print it.next()
   StopIteration

:keyword:`yield` will usually return :const:`None`, so you should always check
for this case.  Don't just use its value in expressions unless you're sure that
the :meth:`send` method will be the only method used to resume your generator
function.

In addition to :meth:`send`, there are two other new methods on generators:

* ``throw(type, value=None, traceback=None)`` is used to raise an exception
  inside the generator; the exception is raised by the :keyword:`yield` expression
  where the generator's execution is paused.

* :meth:`close` raises a new :exc:`GeneratorExit` exception inside the generator
  to terminate the iteration.  On receiving this exception, the generator's code
  must either raise :exc:`GeneratorExit` or :exc:`StopIteration`.  Catching the
  :exc:`GeneratorExit` exception and returning a value is illegal and will trigger
  a :exc:`RuntimeError`; if the function raises some other exception, that
  exception is propagated to the caller.  :meth:`close` will also be called by
  Python's garbage collector when the generator is garbage-collected.

  If you need to run cleanup code when a :exc:`GeneratorExit` occurs, I suggest
  using a ``try: ... finally:`` suite instead of catching :exc:`GeneratorExit`.

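The following sketch exercises :meth:`send`, :meth:`throw`, and :meth:`close`
together.  The ``collector`` generator and its ``log`` list are hypothetical
names used only for this illustration.  The code advances the generator with
the built-in ``next()`` function found in later Python versions; under 2.5
itself the equivalent call is ``gen.next()``.

```python
def collector(log):
    # Hypothetical consumer: stores every value sent in.
    try:
        while True:
            try:
                item = (yield)
                log.append(item)
            except ValueError:
                # Raised at the paused yield by gen.throw(ValueError).
                log.append('recovered')
    finally:
        # Runs when close() raises GeneratorExit at the paused yield.
        log.append('cleaned up')

log = []
gen = collector(log)
next(gen)                 # advance to the first yield
gen.send('a')
gen.throw(ValueError)     # handled inside; the generator keeps running
gen.send('b')
gen.close()
```
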
The cumulative effect of these changes is to turn generators from one-way
producers of information into both producers and consumers.

Generators also become *coroutines*, a more generalized form of subroutines.
Subroutines are entered at one point and exited at another point (the top of the
function, and a :keyword:`return` statement), but coroutines can be entered,
exited, and resumed at many different points (the :keyword:`yield` statements).
We'll have to figure out patterns for using coroutines effectively in Python.

The addition of the :meth:`close` method has one side effect that isn't obvious.
:meth:`close` is called when a generator is garbage-collected, so this means the
generator's code gets one last chance to run before the generator is destroyed.
This last chance means that ``try...finally`` statements in generators can now
be guaranteed to work; the :keyword:`finally` clause will now always get a
chance to run.  The syntactic restriction that you couldn't mix :keyword:`yield`
statements with a ``try...finally`` suite has therefore been removed.  This
seems like a minor bit of language trivia, but using generators and
``try...finally`` is actually necessary in order to implement the
:keyword:`with` statement described by PEP 343.  I'll look at this new statement
in the following section.

Another even more esoteric effect of this change: previously, the
:attr:`gi_frame` attribute of a generator was always a frame object. It's now
possible for :attr:`gi_frame` to be ``None`` once the generator has been
exhausted.


.. seealso::

   :pep:`342` - Coroutines via Enhanced Generators
      PEP written by Guido van Rossum and Phillip J. Eby; implemented by Phillip J.
      Eby.  Includes examples of some fancier uses of generators as coroutines.

      Earlier versions of these features were proposed in :pep:`288` by Raymond
      Hettinger and :pep:`325` by Samuele Pedroni.

   https://en.wikipedia.org/wiki/Coroutine
      The Wikipedia entry for coroutines.

   http://www.sidhe.org/~dan/blog/archives/000178.html
      An explanation of coroutines from a Perl point of view, written by Dan Sugalski.

.. ======================================================================


.. _pep-343:

PEP 343: The 'with' statement
=============================

The ':keyword:`with`' statement clarifies code that previously would use
``try...finally`` blocks to ensure that clean-up code is executed.  In this
section, I'll discuss the statement as it will commonly be used.  In the next
section, I'll examine the implementation details and show how to write objects
for use with this statement.

The ':keyword:`with`' statement is a new control-flow structure whose basic
structure is::

   with expression [as variable]:
       with-block

The expression is evaluated, and it should result in an object that supports the
context management protocol (that is, has :meth:`__enter__` and :meth:`__exit__`
methods).

The object's :meth:`__enter__` is called before *with-block* is executed and
therefore can run set-up code. It also may return a value that is bound to the
name *variable*, if given.  (Note carefully that *variable* is *not* assigned
the result of *expression*.)

After execution of the *with-block* is finished, the object's :meth:`__exit__`
method is called, even if the block raised an exception, and can therefore run
clean-up code.

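A minimal sketch makes both points concrete.  ``Resource`` is a hypothetical
class written for this illustration; it records each call so that the order is
visible, and its :meth:`__enter__` deliberately returns something other than
the object itself.  Under Python 2.5 the module would also need the
``from __future__ import with_statement`` line as its first statement.

```python
class Resource(object):
    "Hypothetical context manager that records the calls made to it."

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append('enter')
        # The value returned here, not the Resource itself, is bound by 'as'.
        return 'handle'

    def __exit__(self, type, value, tb):
        # Called even though the block below raises an exception.
        self.events.append('exit')

resource = Resource()
try:
    with resource as bound:
        resource.events.append('block')
        raise RuntimeError('failure inside the with-block')
except RuntimeError:
    resource.events.append('propagated')
```
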
To enable the statement in Python 2.5, you need to add the following directive
to your module::

   from __future__ import with_statement

The statement will always be enabled in Python 2.6.

Some standard Python objects now support the context management protocol and can
be used with the ':keyword:`with`' statement. File objects are one example::

   with open('/etc/passwd', 'r') as f:
       for line in f:
           print line
           ... more processing code ...

After this statement has executed, the file object in *f* will have been
automatically closed, even if the :keyword:`for` loop raised an exception
part-way through the block.

.. note::

   In this case, *f* is the same object created by :func:`open`, because
   :meth:`file.__enter__` returns *self*.

The :mod:`threading` module's locks and condition variables also support the
':keyword:`with`' statement::

   lock = threading.Lock()
   with lock:
       # Critical section of code
       ...

The lock is acquired before the block is executed and always released once the
block is complete.

The new :func:`localcontext` function in the :mod:`decimal` module makes it easy
to save and restore the current decimal context, which encapsulates the desired
precision and rounding characteristics for computations::

   from decimal import Decimal, Context, localcontext

   # Displays with default precision of 28 digits
   v = Decimal('578')
   print v.sqrt()

   with localcontext(Context(prec=16)):
       # All code in this block uses a precision of 16 digits.
       # The original context is restored on exiting the block.
       print v.sqrt()


.. _new-25-context-managers:

Writing Context Managers
------------------------

Under the hood, the ':keyword:`with`' statement is fairly complicated. Most
people will only use ':keyword:`with`' in company with existing objects and
don't need to know these details, so you can skip the rest of this section if
you like.  Authors of new objects will need to understand the details of the
underlying implementation and should keep reading.

A high-level explanation of the context management protocol is:

* The expression is evaluated and should result in an object called a "context
  manager".  The context manager must have :meth:`__enter__` and :meth:`__exit__`
  methods.

* The context manager's :meth:`__enter__` method is called.  The value returned
  is assigned to *VAR*.  If no ``'as VAR'`` clause is present, the value is simply
  discarded.

* The code in *BLOCK* is executed.

* If *BLOCK* raises an exception, the ``__exit__(type, value, traceback)``
  is called with the exception details, the same values returned by
  :func:`sys.exc_info`.  The method's return value controls whether the exception
  is re-raised: any false value re-raises the exception, and ``True`` will result
  in suppressing it.  You'll only rarely want to suppress the exception, because
  if you do the author of the code containing the ':keyword:`with`' statement will
  never realize anything went wrong.

* If *BLOCK* didn't raise an exception, the :meth:`__exit__` method is still
  called, but *type*, *value*, and *traceback* are all ``None``.

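The role of the :meth:`__exit__` return value can be sketched with a
hypothetical ``Swallow`` class, written only for this illustration.  It
remembers the *type* argument it received and returns a true value only for
the exception class it was told to suppress.  Under Python 2.5 the module
would also need ``from __future__ import with_statement`` as its first
statement.

```python
class Swallow(object):
    "Hypothetical manager that suppresses one exception class and records it."

    def __init__(self, exc_class):
        self.exc_class = exc_class
        self.seen = None

    def __enter__(self):
        return self

    def __exit__(self, type, value, tb):
        # On a clean exit, type, value, and tb are all None.
        self.seen = type
        # A true return value suppresses the exception; false re-raises it.
        return type is not None and issubclass(type, self.exc_class)

with Swallow(KeyError) as clean:
    pass

with Swallow(KeyError) as swallowed:
    raise KeyError('missing')

try:
    with Swallow(KeyError) as passed_on:
        raise ValueError('not suppressed')
except ValueError:
    reraised = True
else:
    reraised = False
```
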
674Let's think through an example.  I won't present detailed code but will only
675sketch the methods necessary for a database that supports transactions.
676
677(For people unfamiliar with database terminology: a set of changes to the
database are grouped into a transaction.  Transactions can be either committed,
meaning that all the changes are written into the database, or rolled back,
meaning that the changes are all discarded and the database is unchanged.  See
any database textbook for more information.)

Let's assume there's an object representing a database connection. Our goal will
be to let the user write code like this::

   db_connection = DatabaseConnection()
   with db_connection as cursor:
       cursor.execute('insert into ...')
       cursor.execute('delete from ...')
       # ... more operations ...

The transaction should be committed if the code in the block runs flawlessly or
rolled back if there's an exception. Here's the basic interface for
:class:`DatabaseConnection` that I'll assume::

   class DatabaseConnection:
       # Database interface
       def cursor (self):
           "Returns a cursor object and starts a new transaction"
       def commit (self):
           "Commits current transaction"
       def rollback (self):
           "Rolls back current transaction"

The :meth:`__enter__` method is pretty easy, having only to start a new
transaction.  For this application the resulting cursor object would be a useful
result, so the method will return it.  The user can then add ``as cursor`` to
their ':keyword:`with`' statement to bind the cursor to a variable name. ::

   class DatabaseConnection:
       ...
       def __enter__ (self):
           # Code to start a new transaction
           cursor = self.cursor()
           return cursor

The :meth:`__exit__` method is the most complicated because it's where most of
the work has to be done.  The method has to check if an exception occurred.  If
there was no exception, the transaction is committed.  The transaction is rolled
back if there was an exception.

In the code below, execution will just fall off the end of the function,
returning the default value of ``None``.  ``None`` is false, so the exception
will be re-raised automatically.  If you wished, you could be more explicit and
add a :keyword:`return` statement at the marked location. ::

   class DatabaseConnection:
       ...
       def __exit__ (self, type, value, tb):
           if tb is None:
               # No exception, so commit
               self.commit()
           else:
               # Exception occurred, so rollback.
               self.rollback()
               # return False
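
Putting the pieces together, the following is a minimal, self-contained sketch
of the protocol.  The :class:`RecordingConnection` class and its ``log``
attribute are hypothetical stand-ins for a real database driver, invented here
only to make the commit/rollback behaviour visible.  (In Python 2.5 the module
would also need ``from __future__ import with_statement`` at the top.) ::

   class RecordingConnection:
       """Hypothetical stand-in for a database connection."""
       def __init__(self):
           self.log = []          # Records which methods were called

       def cursor(self):
           self.log.append('begin')
           return self            # A real driver would return a cursor object

       def commit(self):
           self.log.append('commit')

       def rollback(self):
           self.log.append('rollback')

       def __enter__(self):
           return self.cursor()

       def __exit__(self, type, value, tb):
           if tb is None:
               self.commit()
           else:
               self.rollback()
           return False           # Don't swallow the exception

   ok = RecordingConnection()
   with ok as cursor:
       pass                       # Block succeeds, so the transaction commits

   failed = RecordingConnection()
   try:
       with failed as cursor:
           raise ValueError('simulated failure')
   except ValueError:
       pass                       # __exit__ rolled back, then re-raised

Afterwards ``ok.log`` holds a ``'begin'`` followed by a ``'commit'``, while
``failed.log`` ends in ``'rollback'``.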


.. _contextlibmod:

The contextlib module
---------------------

The new :mod:`contextlib` module provides some functions and a decorator that
are useful for writing objects for use with the ':keyword:`with`' statement.

The decorator is called :func:`contextmanager`, and lets you write a single
generator function instead of defining a new class.  The generator should yield
exactly one value.  The code up to the :keyword:`yield` will be executed as the
:meth:`__enter__` method, and the value yielded will be the method's return
value that will get bound to the variable in the ':keyword:`with`' statement's
:keyword:`as` clause, if any.  The code after the :keyword:`yield` will be
executed in the :meth:`__exit__` method.  Any exception raised in the block will
be raised by the :keyword:`yield` statement.

Our database example from the previous section could be written using this
decorator as::

   from contextlib import contextmanager

   @contextmanager
   def db_transaction (connection):
       cursor = connection.cursor()
       try:
           yield cursor
       except:
           connection.rollback()
           raise
       else:
           connection.commit()

   db = DatabaseConnection()
   with db_transaction(db) as cursor:
       ...
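
To see how an exception raised inside the block surfaces at the
:keyword:`yield`, here is a small sketch that swaps the database for a plain
list.  The ``transaction`` function and the ``events`` list are illustrative
names, not part of :mod:`contextlib`::

   from contextlib import contextmanager

   events = []

   @contextmanager
   def transaction(name):
       events.append('begin ' + name)      # Runs as __enter__
       try:
           yield name                      # Value bound by the 'as' clause
       except Exception:
           events.append('rollback ' + name)
           raise                           # Let the caller see the error
       else:
           events.append('commit ' + name)

   with transaction('first') as label:
       events.append('work on ' + label)

   try:
       with transaction('second'):
           raise RuntimeError('simulated failure')
   except RuntimeError:
       pass

The first block records a commit and the second records a rollback, after which
the :exc:`RuntimeError` continues to propagate to the caller.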

The :mod:`contextlib` module also has a ``nested(mgr1, mgr2, ...)`` function
that combines a number of context managers so you don't need to write nested
':keyword:`with`' statements.  In this example, the single ':keyword:`with`'
statement both starts a database transaction and acquires a thread lock::

   import threading
   from contextlib import nested

   lock = threading.Lock()
   with nested (db_transaction(db), lock) as (cursor, locked):
       ...

Finally, the ``closing(object)`` function returns *object* so that it can be
bound to a variable, and calls ``object.close`` at the end of the block. ::

   import urllib, sys
   from contextlib import closing

   with closing(urllib.urlopen('http://www.yahoo.com')) as f:
       for line in f:
           sys.stdout.write(line)
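
:func:`closing` works with any object that has a :meth:`close` method, not just
network connections.  A minimal sketch, using a hypothetical :class:`Resource`
class so that no network access is needed::

   from contextlib import closing

   class Resource:
       """Hypothetical object with a close() method but no __enter__/__exit__."""
       def __init__(self):
           self.closed = False

       def close(self):
           self.closed = True

   with closing(Resource()) as res:
       inside = res.closed        # Still open inside the block

   after = res.closed             # close() was called when the block ended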


.. seealso::

   :pep:`343` - The "with" statement
      PEP written by Guido van Rossum and Nick Coghlan; implemented by Mike Bland,
      Guido van Rossum, and Neal Norwitz.  The PEP shows the code generated for a
      ':keyword:`with`' statement, which can be helpful in learning how the statement
      works.

   The documentation for the :mod:`contextlib` module.

.. ======================================================================


.. _pep-352:

PEP 352: Exceptions as New-Style Classes
========================================

Exception classes can now be new-style classes, not just classic classes, and
the built-in :exc:`Exception` class and all the standard built-in exceptions
(:exc:`NameError`, :exc:`ValueError`, etc.) are now new-style classes.

The inheritance hierarchy for exceptions has been rearranged a bit. In 2.5, the
inheritance relationships are::

   BaseException       # New in Python 2.5
   |- KeyboardInterrupt
   |- SystemExit
   |- Exception
      |- (all other current built-in exceptions)

This rearrangement was done because people often want to catch all exceptions
that indicate program errors.  :exc:`KeyboardInterrupt` and :exc:`SystemExit`
aren't errors, though, and usually represent an explicit action such as the user
hitting :kbd:`Control-C` or code calling :func:`sys.exit`.  A bare ``except:`` will
catch all exceptions, so you commonly need to list :exc:`KeyboardInterrupt` and
:exc:`SystemExit` in order to re-raise them.  The usual pattern is::

   try:
       ...
   except (KeyboardInterrupt, SystemExit):
       raise
   except:
       # Log error...
       # Continue running program...

In Python 2.5, you can now write ``except Exception`` to achieve the same
result, catching all the exceptions that usually indicate errors but leaving
:exc:`KeyboardInterrupt` and :exc:`SystemExit` alone.  As in previous versions,
a bare ``except:`` still catches all exceptions.
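
The difference can be checked directly.  In the following small sketch, the
``run_safely`` helper and the two task functions are made-up names for
illustration::

   import sys

   def run_safely(task):
       """Run task(), handling ordinary errors but not exit requests."""
       try:
           task()
       except Exception:
           # Catches ValueError, NameError, etc., but not SystemExit
           # or KeyboardInterrupt.
           return 'error handled'
       return 'ok'

   def broken():
       raise ValueError('bad input')

   def wants_to_exit():
       sys.exit(1)

   result = run_safely(broken)

   try:
       run_safely(wants_to_exit)
       exit_propagated = False
   except SystemExit:
       exit_propagated = True     # SystemExit passed through run_safely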

The goal for Python 3.0 is to require any class raised as an exception to derive
from :exc:`BaseException` or some descendant of :exc:`BaseException`, and future
releases in the Python 2.x series may begin to enforce this constraint.
Therefore, I suggest you begin making all your exception classes derive from
:exc:`Exception` now.  It's been suggested that the bare ``except:`` form should
be removed in Python 3.0, but Guido van Rossum hasn't decided whether to do this
or not.

Raising of strings as exceptions, as in the statement ``raise "Error
occurred"``, is deprecated in Python 2.5 and will trigger a warning.  The aim is
to be able to remove the string-exception feature in a few releases.


.. seealso::

   :pep:`352` - Required Superclass for Exceptions
      PEP written by Brett Cannon and Guido van Rossum; implemented by Brett Cannon.

.. ======================================================================


.. _pep-353:

PEP 353: Using ssize_t as the index type
========================================

A wide-ranging change to Python's C API, using a new :c:type:`Py_ssize_t` type
definition instead of :c:type:`int`, will permit the interpreter to handle more
data on 64-bit platforms. This change doesn't affect Python's capacity on 32-bit
platforms.

Various pieces of the Python interpreter used C's :c:type:`int` type to store
sizes or counts; for example, the number of items in a list or tuple was stored
in an :c:type:`int`.  The C compilers for most 64-bit platforms still define
:c:type:`int` as a 32-bit type, so that meant that lists could only hold up to
``2**31 - 1`` = 2147483647 items. (There are actually a few different
programming models that 64-bit C compilers can use -- see
http://www.unix.org/version2/whatsnew/lp64_wp.html for a discussion -- but the
most commonly available model leaves :c:type:`int` as 32 bits.)

A limit of 2147483647 items doesn't really matter on a 32-bit platform because
you'll run out of memory before hitting the length limit. Each list item
requires space for a pointer, which is 4 bytes, plus space for a
:c:type:`PyObject` representing the item.  2147483647\*4 is already more bytes
than a 32-bit address space can contain.

It's possible to address that much memory on a 64-bit platform, however.  The
pointers for a list that size would only require 16 GiB of space, so it's not
unreasonable that Python programmers might construct lists that large.
Therefore, the Python interpreter had to be changed to use some type other than
:c:type:`int`, and this will be a 64-bit type on 64-bit platforms.  The change
will cause incompatibilities on 64-bit machines, so it was deemed worth making
the transition now, while the number of 64-bit users is still relatively small.
(In 5 or 10 years, we may *all* be on 64-bit machines, and the transition would
be more painful then.)
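
The arithmetic behind these figures can be checked with a short calculation.
This sketch assumes 4-byte pointers on 32-bit platforms and 8-byte pointers on
64-bit platforms, as the text does; the variable names are illustrative::

   max_items = 2**31 - 1              # Largest value of a signed 32-bit int

   # 32-bit platform: 4-byte pointers, 2**32 bytes of address space
   pointer_bytes_32 = max_items * 4
   address_space_32 = 2**32
   exceeds_32bit = pointer_bytes_32 > address_space_32

   # 64-bit platform: 8-byte pointers; 1 GiB is 2**30 bytes
   pointer_bytes_64 = max_items * 8
   gib_needed = pointer_bytes_64 / float(2**30)   # Just under 16 GiB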

This change most strongly affects authors of C extension modules.   Python
strings and container types such as lists and tuples now use
:c:type:`Py_ssize_t` to store their size.   Functions such as
:c:func:`PyList_Size` now return :c:type:`Py_ssize_t`.  Code in extension modules
may therefore need to have some variables changed to :c:type:`Py_ssize_t`.

The :c:func:`PyArg_ParseTuple` and :c:func:`Py_BuildValue` functions have a new
conversion code, ``n``, for :c:type:`Py_ssize_t`.   :c:func:`PyArg_ParseTuple`'s
``s#`` and ``t#`` still output :c:type:`int` by default, but you can define the
macro :c:macro:`PY_SSIZE_T_CLEAN` before including :file:`Python.h` to make
them return :c:type:`Py_ssize_t`.

:pep:`353` has a section on conversion guidelines that extension authors should
read to learn about supporting 64-bit platforms.


.. seealso::

   :pep:`353` - Using ssize_t as the index type
      PEP written and implemented by Martin von Löwis.

.. ======================================================================


.. _pep-357:

PEP 357: The '__index__' method
===============================

The NumPy developers had a problem that could only be solved by adding a new
special method, :meth:`__index__`.  When using slice notation, as in
``[start:stop:step]``, the values of the *start*, *stop*, and *step* indexes
must all be either integers or long integers.  NumPy defines a variety of
specialized integer types corresponding to unsigned and signed integers of 8,
16, 32, and 64 bits, but there was no way to signal that these types could be
used as slice indexes.

Slicing can't just use the existing :meth:`__int__` method because that method
is also used to implement coercion to integers.  If slicing used
:meth:`__int__`, floating-point numbers would also become legal slice indexes
and that's clearly an undesirable behaviour.

Instead, a new special method called :meth:`__index__` was added.  It takes no
arguments and returns an integer giving the slice index to use.  For example::

   class C:
       def __index__ (self):
           return self.value

The return value must be either a Python integer or long integer. The
interpreter will check that the type returned is correct, and raises a
:exc:`TypeError` if this requirement isn't met.
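
A fuller sketch of the protocol follows.  The :class:`Offset` class is a
hypothetical example type; :func:`operator.index`, also new in 2.5, calls
:meth:`__index__` on its argument::

   import operator

   class Offset(object):
       """Hypothetical integer-like type usable as a slice index."""
       def __init__(self, value):
           self.value = value

       def __index__(self):
           return self.value

   letters = 'abcdefgh'
   piece = letters[Offset(2):Offset(5)]      # Same as letters[2:5]
   item = [10, 20, 30][Offset(1)]            # Also works for plain indexing
   as_int = operator.index(Offset(7))

   try:
       operator.index(2.5)                   # Floats have no __index__
       float_rejected = False
   except TypeError:
       float_rejected = True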

A corresponding :attr:`nb_index` slot was added to the C-level
:c:type:`PyNumberMethods` structure to let C extensions implement this protocol.
``PyNumber_Index(obj)`` can be used in extension code to call the
:meth:`__index__` function and retrieve its result.


.. seealso::

   :pep:`357` - Allowing Any Object to be Used for Slicing
      PEP written and implemented by Travis Oliphant.

.. ======================================================================


.. _other-lang:

Other Language Changes
======================

Here are all of the changes that Python 2.5 makes to the core Python language.

* The :class:`dict` type has a new hook for letting subclasses provide a default
  value when a key isn't contained in the dictionary. When a key isn't found, the
  dictionary's ``__missing__(key)`` method will be called.  This hook is used
  to implement the new :class:`defaultdict` class in the :mod:`collections`
  module.  The following example defines a dictionary that returns zero for any
  missing key::

     class zerodict (dict):
         def __missing__ (self, key):
             return 0

     d = zerodict({1:1, 2:2})
     print d[1], d[2]   # Prints 1 2
     print d[3], d[4]   # Prints 0 0

* Both 8-bit and Unicode strings have new ``partition(sep)`` and
  ``rpartition(sep)`` methods that simplify a common use case.

  The ``find(S)`` method is often used to get an index which is then used to
  slice the string and obtain the pieces that are before and after the separator.
  ``partition(sep)`` condenses this pattern into a single method call that
  returns a 3-tuple containing the substring before the separator, the separator
  itself, and the substring after the separator.  If the separator isn't found,
  the first element of the tuple is the entire string and the other two elements
  are empty.  ``rpartition(sep)`` also returns a 3-tuple but starts searching
  from the end of the string; the ``r`` stands for 'reverse'.

  Some examples::

     >>> ('http://www.python.org').partition('://')
     ('http', '://', 'www.python.org')
     >>> ('file:/usr/share/doc/index.html').partition('://')
     ('file:/usr/share/doc/index.html', '', '')
     >>> (u'Subject: a quick question').partition(':')
     (u'Subject', u':', u' a quick question')
     >>> 'www.python.org'.rpartition('.')
     ('www.python', '.', 'org')
     >>> 'www.python.org'.rpartition(':')
     ('', '', 'www.python.org')

  (Implemented by Fredrik Lundh following a suggestion by Raymond Hettinger.)

* The :meth:`startswith` and :meth:`endswith` methods of string types now accept
  tuples of strings to check for. ::

     def is_image_file (filename):
         return filename.endswith(('.gif', '.jpg', '.tiff'))

  (Implemented by Georg Brandl following a suggestion by Tom Lynn.)

  .. RFE #1491485

* The :func:`min` and :func:`max` built-in functions gained a ``key`` keyword
  parameter analogous to the ``key`` argument for :meth:`sort`.  This parameter
  supplies a function that takes a single argument and is called for every value
  in the list; :func:`min`/:func:`max` will return the element with the
  smallest/largest return value from this function. For example, to find the
  longest string in a list, you can do::

     L = ['medium', 'longest', 'short']
     # Prints 'longest'
     print max(L, key=len)
     # Prints 'short', because lexicographically 'short' has the largest value
     print max(L)

  (Contributed by Steven Bethard and Raymond Hettinger.)

* Two new built-in functions, :func:`any` and :func:`all`, test the truth of the
  values produced by an iterator.  :func:`any` returns :const:`True`
  if any value returned by the iterator is true; otherwise it will return
  :const:`False`.  :func:`all` returns :const:`True` only if all of the values
  returned by the iterator evaluate as true. (Suggested by Guido van Rossum, and
  implemented by Raymond Hettinger.)

* The result of a class's :meth:`__hash__` method can now be either a long
  integer or a regular integer.  If a long integer is returned, the hash of that
  value is taken.  In earlier versions the hash value was required to be a
  regular integer, but in 2.5 the :func:`id` built-in was changed to always
  return non-negative numbers, and users often seem to use ``id(self)`` in
  :meth:`__hash__` methods (though this is discouraged).

  .. Bug #1536021

* ASCII is now the default encoding for modules.  It's now a syntax error if a
  module contains string literals with 8-bit characters but doesn't have an
  encoding declaration.  In Python 2.4 this triggered a warning, not a syntax
  error.  See :pep:`263` for how to declare a module's encoding; for example, you
  might add a line like this near the top of the source file::

     # -*- coding: latin1 -*-

* A new warning, :class:`UnicodeWarning`, is triggered when you attempt to
  compare a Unicode string and an 8-bit string that can't be converted to Unicode
  using the default ASCII encoding.   The result of the comparison is false::

     >>> chr(128) == unichr(128)   # Can't convert chr(128) to Unicode
     __main__:1: UnicodeWarning: Unicode equal comparison failed
       to convert both arguments to Unicode - interpreting them
       as being unequal
     False
     >>> chr(127) == unichr(127)   # chr(127) can be converted
     True

  Previously such a comparison would raise a :class:`UnicodeDecodeError`
  exception, but in 2.5 this could result in puzzling problems when accessing a
  dictionary.  If you looked up ``unichr(128)`` and ``chr(128)`` was being used
  as a key, you'd get a :class:`UnicodeDecodeError` exception, because other
  changes in 2.5 resulted in this exception being raised instead of suppressed
  by the code in :file:`dictobject.c` that implements dictionaries.

  Raising an exception for such a comparison is strictly correct, but the change
  might have broken code, so instead :class:`UnicodeWarning` was introduced.

  (Implemented by Marc-André Lemburg.)

* One error that Python programmers sometimes make is forgetting to include an
  :file:`__init__.py` module in a package directory. Debugging this mistake can be
  confusing, and usually requires running Python with the :option:`-v` switch to
  log all the paths searched. In Python 2.5, a new :exc:`ImportWarning` warning is
  triggered when an import would have picked up a directory as a package but no
  :file:`__init__.py` was found.  This warning is silently ignored by default;
  provide the :option:`-Wd <-W>` option when running the Python executable to display
  the warning message. (Implemented by Thomas Wouters.)

* The list of base classes in a class definition can now be empty.   As an
  example, this is now legal::

     class C():
         pass

  (Implemented by Brett Cannon.)
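
As an illustration of the new :func:`any` and :func:`all` built-ins from the
list above, here is a short sketch; the sample data and variable names are
invented for the example::

   temperatures = [18, 21, 25, 30]

   has_hot_day = any(t > 28 for t in temperatures)    # True: 30 > 28
   all_above_zero = all(t > 0 for t in temperatures)  # True
   all_mild = all(t < 28 for t in temperatures)       # False: 30 fails

   # On an empty iterable, any() returns False and all() returns True.
   empty_any = any([])
   empty_all = all([])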

.. ======================================================================


.. _25interactive:

Interactive Interpreter Changes
-------------------------------

In the interactive interpreter, ``quit`` and ``exit`` have long been strings so
that new users get a somewhat helpful message when they try to quit::

   >>> quit
   'Use Ctrl-D (i.e. EOF) to exit.'

In Python 2.5, ``quit`` and ``exit`` are now objects that still produce string
representations of themselves, but are also callable. Newbies who try ``quit()``
or ``exit()`` will now exit the interpreter as they expect.  (Implemented by
Georg Brandl.)

The Python executable now accepts the standard long options :option:`--help`
and :option:`--version`; on Windows, it also accepts the :option:`/? <-?>` option
for displaying a help message. (Implemented by Georg Brandl.)

.. ======================================================================


.. _opts:

Optimizations
-------------

Several of the optimizations were developed at the NeedForSpeed sprint, an event
held in Reykjavik, Iceland, from May 21--28 2006. The sprint focused on speed
enhancements to the CPython implementation and was funded by EWT LLC with local
support from CCP Games.  Optimizations added at this sprint are noted as such
in the following list.

* When they were introduced in Python 2.4, the built-in :class:`set` and
  :class:`frozenset` types were built on top of Python's dictionary type.   In 2.5
  the internal data structure has been customized for implementing sets, and as a
  result sets will use a third less memory and are somewhat faster. (Implemented
  by Raymond Hettinger.)

* The speed of some Unicode operations, such as finding substrings, string
  splitting, and character map encoding and decoding, has been improved.
  (Substring search and splitting improvements were added by Fredrik Lundh and
  Andrew Dalke at the NeedForSpeed sprint. Character maps were improved by Walter
  Dörwald and Martin von Löwis.)

  .. Patch 1313939, 1359618

* The ``long(str, base)`` function is now faster on long digit strings
  because fewer intermediate results are calculated.  The peak is for strings of
  around 800--1000 digits where the function is 6 times faster. (Contributed by
  Alan McIntyre and committed at the NeedForSpeed sprint.)

  .. Patch 1442927

* It's now illegal to mix iterating over a file with ``for line in file`` and
  calling the file object's :meth:`read`/:meth:`readline`/:meth:`readlines`
  methods.  Iteration uses an internal buffer and the :meth:`read\*` methods
  don't use that buffer.   Instead they would return the data following the
  buffer, causing the data to appear out of order.  Mixing iteration and these
  methods will now trigger a :exc:`ValueError` from the :meth:`read\*` method.
  (Implemented by Thomas Wouters.)

  .. Patch 1397960

* The :mod:`struct` module now compiles structure format strings into an
  internal representation and caches this representation, yielding a 20% speedup.
  (Contributed by Bob Ippolito at the NeedForSpeed sprint.)

* The :mod:`re` module got a 1 or 2% speedup by switching to Python's allocator
  functions instead of the system's :c:func:`malloc` and :c:func:`free`.
  (Contributed by Jack Diederich at the NeedForSpeed sprint.)

* The code generator's peephole optimizer now performs simple constant folding
  in expressions.  If you write something like ``a = 2+3``, the code generator
  will do the arithmetic and produce code corresponding to ``a = 5``.  (Proposed
  and implemented by Raymond Hettinger.)

* Function calls are now faster because code objects now keep the most recently
  finished frame (a "zombie frame") in an internal field of the code object,
  reusing it the next time the code object is invoked.  (Original patch by Michael
  Hudson, modified by Armin Rigo and Richard Jones; committed at the NeedForSpeed
  sprint.)  Frame objects are also slightly smaller, which may improve cache
  locality and reduce memory usage a bit.  (Contributed by Neal Norwitz.)

  .. Patch 876206
  .. Patch 1337051

* Python's built-in exceptions are now new-style classes, a change that speeds
  up instantiation considerably.  Exception handling in Python 2.5 is therefore
  about 30% faster than in 2.4. (Contributed by Richard Jones, Georg Brandl and
  Sean Reifschneider at the NeedForSpeed sprint.)

* Importing now caches the paths tried, recording whether they exist or not so
  that the interpreter makes fewer :c:func:`open` and :c:func:`stat` calls on
  startup. (Contributed by Martin von Löwis and Georg Brandl.)

  .. Patch 921466
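
The constant folding described in the list above can be observed by inspecting
a compiled code object.  This sketch relies on the ``co_consts`` attribute, a
CPython implementation detail whose exact contents vary between versions, so it
only checks that the folded value is present.  It multiplies two larger numbers
rather than computing ``2+3``, on the assumption that some interpreter versions
may not store very small integers in ``co_consts``::

   # Compile a statement containing a constant expression
   code = compile('a = 1000 * 1000', '<example>', 'exec')

   # After folding, the product is stored as a constant in the code object
   folded = 1000000 in code.co_consts

   namespace = {}
   exec(code, namespace)          # Running the code still assigns the product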

.. ======================================================================


.. _25modules:

New, Improved, and Removed Modules
==================================

The standard library received many enhancements and bug fixes in Python 2.5.
Here's a partial list of the most notable changes, sorted alphabetically by
module name. Consult the :file:`Misc/NEWS` file in the source tree for a more
complete list of changes, or look through the SVN logs for all the details.

* The :mod:`audioop` module now supports the a-LAW encoding, and the code for
  u-LAW encoding has been improved.  (Contributed by Lars Immisch.)

* The :mod:`codecs` module gained support for incremental codecs.  The
  :func:`codecs.lookup` function now returns a :class:`CodecInfo` instance instead
  of a tuple. :class:`CodecInfo` instances behave like a 4-tuple to preserve
  backward compatibility but also have the attributes :attr:`encode`,
  :attr:`decode`, :attr:`incrementalencoder`, :attr:`incrementaldecoder`,
  :attr:`streamwriter`, and :attr:`streamreader`.  Incremental codecs can receive
  input and produce output in multiple chunks; the output is the same as if the
  entire input was fed to the non-incremental codec. See the :mod:`codecs` module
  documentation for details. (Designed and implemented by Walter Dörwald.)

  .. Patch  1436130

* The :mod:`collections` module gained a new type, :class:`defaultdict`, that
  subclasses the standard :class:`dict` type.  The new type mostly behaves like a
  dictionary but constructs a default value when a key isn't present,
  automatically adding it to the dictionary for the requested key value.

  The first argument to :class:`defaultdict`'s constructor is a factory function
  that gets called whenever a key is requested but not found. This factory
  function receives no arguments, so you can use built-in type constructors such
  as :func:`list` or :func:`int`.  For example, you can make an index of words
  based on their initial letter like this::

     from collections import defaultdict

     words = """Nel mezzo del cammin di nostra vita
     mi ritrovai per una selva oscura
     che la diritta via era smarrita""".lower().split()

     index = defaultdict(list)

     for w in words:
         init_letter = w[0]
         index[init_letter].append(w)

  Printing ``index`` results in the following output::

     defaultdict(<type 'list'>, {'c': ['cammin', 'che'], 'e': ['era'],
             'd': ['del', 'di', 'diritta'], 'm': ['mezzo', 'mi'],
             'l': ['la'], 'o': ['oscura'], 'n': ['nel', 'nostra'],
             'p': ['per'], 's': ['selva', 'smarrita'],
             'r': ['ritrovai'], 'u': ['una'], 'v': ['vita', 'via']})

  (Contributed by Guido van Rossum.)

* The :class:`deque` double-ended queue type supplied by the :mod:`collections`
  module now has a ``remove(value)`` method that removes the first occurrence
  of *value* in the queue, raising :exc:`ValueError` if the value isn't found.
  (Contributed by Raymond Hettinger.)

* New module: The :mod:`contextlib` module contains helper functions for use
  with the new ':keyword:`with`' statement.  See section :ref:`contextlibmod`
  for more about this module.

* New module: The :mod:`cProfile` module is a C implementation of the existing
  :mod:`profile` module that has much lower overhead. The module's interface is
  the same as :mod:`profile`: you run ``cProfile.run('main()')`` to profile a
  function, can save profile data to a file, etc.  It's not yet known if the
  Hotshot profiler, which is also written in C but doesn't match the
  :mod:`profile` module's interface, will continue to be maintained in future
  versions of Python.  (Contributed by Armin Rigo.)

  Also, the :mod:`pstats` module for analyzing the data measured by the profiler
  now supports directing the output to any file object by supplying a *stream*
  argument to the :class:`Stats` constructor. (Contributed by Skip Montanaro.)

* The :mod:`csv` module, which parses files in comma-separated value format,
  received several enhancements and a number of bugfixes.  You can now set the
  maximum size in bytes of a field by calling the
  ``csv.field_size_limit(new_limit)`` function; omitting the *new_limit*
  argument will return the currently-set limit.  The :class:`reader` class now has
  a :attr:`line_num` attribute that counts the number of physical lines read from
  the source; records can span multiple physical lines, so :attr:`line_num` is not
  the same as the number of records read.

  The CSV parser is now stricter about multi-line quoted fields. Previously, if a
  line ended within a quoted field without a terminating newline character, a
  newline would be inserted into the returned field. This behavior caused problems
  when reading files that contained carriage return characters within fields, so
  the code was changed to return the field without inserting newlines. As a
  consequence, if newlines embedded within fields are important, the input should
  be split into lines in a manner that preserves the newline characters.

  (Contributed by Skip Montanaro and Andrew McNamara.)

* The :class:`~datetime.datetime` class in the :mod:`datetime` module now has a
  ``strptime(string, format)`` method for parsing date strings, contributed
  by Josh Spoerri. It uses the same format characters as :func:`time.strptime` and
  :func:`time.strftime`::

     from datetime import datetime

     ts = datetime.strptime('10:13:15 2006-03-07',
                            '%H:%M:%S %Y-%m-%d')

* The :meth:`SequenceMatcher.get_matching_blocks` method in the :mod:`difflib`
  module now guarantees to return a minimal list of blocks describing matching
  subsequences.  Previously, the algorithm would occasionally break a block of
  matching elements into two list entries. (Enhancement by Tim Peters.)

* The :mod:`doctest` module gained a ``SKIP`` option that keeps an example from
  being executed at all.  This is intended for code snippets that are usage
  examples intended for the reader and aren't actually test cases.

  An *encoding* parameter was added to the :func:`testfile` function and the
  :class:`DocFileSuite` class to specify the file's encoding.  This makes it
  easier to use non-ASCII characters in tests contained within a docstring.
  (Contributed by Bjorn Tillenius.)

  .. Patch 1080727

* The :mod:`email` package has been updated to version 4.0. (Contributed by
  Barry Warsaw.)

  .. XXX need to provide some more detail here

  .. index::
     single: universal newlines; What's new

* The :mod:`fileinput` module was made more flexible. Unicode filenames are now
  supported, and a *mode* parameter that defaults to ``"r"`` was added to the
  :func:`input` function to allow opening files in binary or :term:`universal
  newlines` mode.  Another new parameter, *openhook*, lets you use a function
  other than :func:`open` to open the input files.  Once you're iterating over
  the set of files, the :class:`FileInput` object's new :meth:`fileno` returns
  the file descriptor for the currently opened file. (Contributed by Georg
  Brandl.)

* In the :mod:`gc` module, the new :func:`get_count` function returns a 3-tuple
  containing the current collection counts for the three GC generations.  This is
  accounting information for the garbage collector; when these counts reach a
  specified threshold, a garbage collection sweep will be made.  The existing
  :func:`gc.collect` function now takes an optional *generation* argument of 0, 1,
  or 2 to specify which generation to collect. (Contributed by Barry Warsaw.)

* The :func:`nsmallest` and :func:`nlargest` functions in the :mod:`heapq`
  module now support a ``key`` keyword parameter similar to the one provided by
  the :func:`min`/:func:`max` functions and the :meth:`sort` methods.  For
  example::

     >>> import heapq
     >>> L = ["short", 'medium', 'longest', 'longer still']
     >>> heapq.nsmallest(2, L)  # Return two lowest elements, lexicographically
     ['longer still', 'longest']
     >>> heapq.nsmallest(2, L, key=len)   # Return two shortest elements
     ['short', 'medium']

  (Contributed by Raymond Hettinger.)

* The :func:`itertools.islice` function now accepts ``None`` for the start and
  step arguments.  This makes it more compatible with the attributes of slice
  objects, so that you can now write the following::

     s = slice(5)     # Create slice object
     itertools.islice(iterable, s.start, s.stop, s.step)

  (Contributed by Raymond Hettinger.)

* The :func:`format` function in the :mod:`locale` module has been modified and
  two new functions were added, :func:`format_string` and :func:`currency`.

  The :func:`format` function's *format* parameter could previously be a string
  as long as no more than one %char specifier appeared; now the parameter must be
  exactly one %char specifier with no surrounding text.  An optional *monetary*
  parameter was also added which, if ``True``, will use the locale's rules for
  formatting currency in placing a separator between groups of three digits.

  To format strings with multiple %char specifiers, use the new
  :func:`format_string` function that works like :func:`format` but also supports
  mixing %char specifiers with arbitrary text.

  A new :func:`currency` function was also added that formats a number according
  to the current locale's settings.

  (Contributed by Georg Brandl.)

  .. Patch 1180296

1403* The :mod:`mailbox` module underwent a massive rewrite to add the capability to
1404  modify mailboxes in addition to reading them.  A new set of classes that include
1405  :class:`mbox`, :class:`MH`, and :class:`Maildir` are used to read mailboxes, and
1406  have an ``add(message)`` method to add messages, ``remove(key)`` to
1407  remove messages, and :meth:`lock`/:meth:`unlock` to lock/unlock the mailbox.
1408  The following example converts a maildir-format mailbox into an mbox-format
1409  one::
1410
1411     import mailbox
1412
1413     # 'factory=None' uses email.Message.Message as the class representing
1414     # individual messages.
1415     src = mailbox.Maildir('maildir', factory=None)
1416     dest = mailbox.mbox('/tmp/mbox')
1417
1418     for msg in src:
1419         dest.add(msg)
1420
1421  (Contributed by Gregory K. Johnson.  Funding was provided by Google's 2005
1422  Summer of Code.)
1423
1424* New module: the :mod:`msilib` module allows creating Microsoft Installer
1425  :file:`.msi` files and CAB files.  Some support for reading the :file:`.msi`
1426  database is also included. (Contributed by Martin von Löwis.)
1427
1428* The :mod:`nis` module now supports accessing domains other than the system
1429  default domain by supplying a *domain* argument to the :func:`nis.match` and
1430  :func:`nis.maps` functions. (Contributed by Ben Bell.)
1431
1432* The :mod:`operator` module's :func:`itemgetter`  and :func:`attrgetter`
1433  functions now support multiple fields.   A call such as
1434  ``operator.attrgetter('a', 'b')`` will return a function  that retrieves the
1435  :attr:`a` and :attr:`b` attributes.  Combining  this new feature with the
1436  :meth:`sort` method's ``key`` parameter  lets you easily sort lists using
1437  multiple fields. (Contributed by Raymond Hettinger.)
1438
1439* The :mod:`optparse` module was updated to version 1.5.1 of the Optik library.
1440  The :class:`OptionParser` class gained an :attr:`epilog` attribute, a string
1441  that will be printed after the help message, and a :meth:`destroy` method to
1442  break reference cycles created by the object. (Contributed by Greg Ward.)
1443
1444* The :mod:`os` module underwent several changes.  The :attr:`stat_float_times`
1445  variable now defaults to true, meaning that :func:`os.stat` will now return time
1446  values as floats.  (This doesn't necessarily mean that :func:`os.stat` will
1447  return times that are precise to fractions of a second; not all systems support
1448  such precision.)
1449
1450  Constants named :attr:`os.SEEK_SET`, :attr:`os.SEEK_CUR`, and
1451  :attr:`os.SEEK_END` have been added; these are the parameters to the
1452  :func:`os.lseek` function.  Two new constants for locking are
1453  :attr:`os.O_SHLOCK` and :attr:`os.O_EXLOCK`.
1454
  Two new functions, :func:`wait3` and :func:`wait4`, were added.  They're similar
  to the :func:`waitpid` function, which waits for a child process to exit and returns
1457  a tuple of the process ID and its exit status, but :func:`wait3` and
1458  :func:`wait4` return additional information.  :func:`wait3` doesn't take a
1459  process ID as input, so it waits for any child process to exit and returns a
1460  3-tuple of *process-id*, *exit-status*, *resource-usage* as returned from the
1461  :func:`resource.getrusage` function. ``wait4(pid)`` does take a process ID.
1462  (Contributed by Chad J. Schroeder.)
1463
1464  On FreeBSD, the :func:`os.stat` function now returns  times with nanosecond
1465  resolution, and the returned object now has :attr:`st_gen` and
1466  :attr:`st_birthtime`. The :attr:`st_flags` attribute is also available, if the
1467  platform supports it. (Contributed by Antti Louko and  Diego Pettenò.)
1468
1469  .. (Patch 1180695, 1212117)
1470
1471* The Python debugger provided by the :mod:`pdb` module can now store lists of
1472  commands to execute when a breakpoint is reached and execution stops.  Once
1473  breakpoint #1 has been created, enter ``commands 1`` and enter a series of
1474  commands to be executed, finishing the list with ``end``.  The command list can
1475  include commands that resume execution, such as ``continue`` or ``next``.
1476  (Contributed by Grégoire Dooms.)
1477
1478  .. Patch 790710
1479
1480* The :mod:`pickle` and :mod:`cPickle` modules no longer accept a return value
1481  of ``None`` from the :meth:`__reduce__` method; the method must return a tuple
1482  of arguments instead.  The ability to return ``None`` was deprecated in Python
1483  2.4, so this completes the removal of the feature.
1484
1485* The :mod:`pkgutil` module, containing various utility functions for finding
1486  packages, was enhanced to support PEP 302's import hooks and now also works for
1487  packages stored in ZIP-format archives. (Contributed by Phillip J. Eby.)
1488
1489* The pybench benchmark suite by Marc-André Lemburg is now included in the
1490  :file:`Tools/pybench` directory.  The pybench suite is an improvement on the
1491  commonly used :file:`pystone.py` program because pybench provides a more
1492  detailed measurement of the interpreter's speed.  It times particular operations
1493  such as function calls, tuple slicing, method lookups, and numeric operations,
1494  instead of performing many different operations and reducing the result to a
1495  single number as :file:`pystone.py` does.
1496
1497* The :mod:`pyexpat` module now uses version 2.0 of the Expat parser.
1498  (Contributed by Trent Mick.)
1499
1500* The :class:`~Queue.Queue` class provided by the :mod:`Queue` module gained two new
1501  methods.  :meth:`join` blocks until all items in the queue have been retrieved
  and all processing work on the items has been completed.  Worker threads call
1503  the other new method,  :meth:`task_done`, to signal that processing for an item
1504  has been completed.  (Contributed by Raymond Hettinger.)
1505
1506* The old :mod:`regex` and :mod:`regsub` modules, which have been  deprecated
1507  ever since Python 2.0, have finally been deleted.   Other deleted modules:
1508  :mod:`statcache`, :mod:`tzparse`, :mod:`whrandom`.
1509
* The :file:`lib-old` directory, which included ancient modules such as
  :mod:`dircmp` and :mod:`ni`, was also removed.  :file:`lib-old` wasn't on the
1512  default ``sys.path``, so unless your programs explicitly added the directory to
1513  ``sys.path``, this removal shouldn't affect your code.
1514
1515* The :mod:`rlcompleter` module is no longer  dependent on importing the
1516  :mod:`readline` module and therefore now works on non-Unix platforms. (Patch
1517  from Robert Kiendl.)
1518
1519  .. Patch #1472854
1520
1521* The :mod:`SimpleXMLRPCServer` and :mod:`DocXMLRPCServer`  classes now have a
1522  :attr:`rpc_paths` attribute that constrains XML-RPC operations to a limited set
1523  of URL paths; the default is to allow only ``'/'`` and ``'/RPC2'``.  Setting
1524  :attr:`rpc_paths` to ``None`` or an empty tuple disables  this path checking.
1525
1526  .. Bug #1473048
1527
1528* The :mod:`socket` module now supports :const:`AF_NETLINK` sockets on Linux,
1529  thanks to a patch from Philippe Biondi.   Netlink sockets are a Linux-specific
1530  mechanism for communications between a user-space process and kernel code; an
1531  introductory  article about them is at https://www.linuxjournal.com/article/7356.
1532  In Python code, netlink addresses are represented as a tuple of 2 integers,
1533  ``(pid, group_mask)``.
1534
1535  Two new methods on socket objects, ``recv_into(buffer)`` and
1536  ``recvfrom_into(buffer)``, store the received data in an object  that
1537  supports the buffer protocol instead of returning the data as a string.  This
1538  means you can put the data directly into an array or a memory-mapped file.
1539
1540  Socket objects also gained :meth:`getfamily`, :meth:`gettype`, and
1541  :meth:`getproto` accessor methods to retrieve the family, type, and protocol
1542  values for the socket.
1543
1544* New module: the :mod:`spwd` module provides functions for accessing the shadow
1545  password database on systems that support  shadow passwords.
1546
* The :mod:`struct` module is now faster because it compiles format strings into
1548  :class:`Struct` objects with :meth:`pack` and :meth:`unpack` methods.  This is
1549  similar to how the :mod:`re` module lets you create compiled regular expression
1550  objects.  You can still use the module-level  :func:`pack` and :func:`unpack`
1551  functions; they'll create  :class:`Struct` objects and cache them.  Or you can
1552  use  :class:`Struct` instances directly::
1553
1554     s = struct.Struct('ih3s')
1555
1556     data = s.pack(1972, 187, 'abc')
1557     year, number, name = s.unpack(data)
1558
1559  You can also pack and unpack data to and from buffer objects directly using the
1560  ``pack_into(buffer, offset, v1, v2, ...)`` and ``unpack_from(buffer,
1561  offset)`` methods.  This lets you store data directly into an array or a
1562  memory-mapped file.
1563
1564  (:class:`Struct` objects were implemented by Bob Ippolito at the NeedForSpeed
1565  sprint.  Support for buffer objects was added by Martin Blais, also at the
1566  NeedForSpeed sprint.)
1567
1568* The Python developers switched from CVS to Subversion during the 2.5
1569  development process.  Information about the exact build version is available as
1570  the ``sys.subversion`` variable, a 3-tuple of ``(interpreter-name, branch-name,
1571  revision-range)``.  For example, at the time of writing my copy of 2.5 was
1572  reporting ``('CPython', 'trunk', '45313:45315')``.
1573
1574  This information is also available to C extensions via the
1575  :c:func:`Py_GetBuildInfo` function that returns a  string of build information
1576  like this: ``"trunk:45355:45356M, Apr 13 2006, 07:42:19"``.   (Contributed by
1577  Barry Warsaw.)
1578
1579* Another new function, :func:`sys._current_frames`, returns the current stack
1580  frames for all running threads as a dictionary mapping thread identifiers to the
1581  topmost stack frame currently active in that thread at the time the function is
1582  called.  (Contributed by Tim Peters.)
1583
1584* The :class:`TarFile` class in the :mod:`tarfile` module now has an
1585  :meth:`extractall` method that extracts all members from the archive into the
1586  current working directory.  It's also possible to set a different directory as
1587  the extraction target, and to unpack only a subset of the archive's members.
1588
1589  The compression used for a tarfile opened in stream mode can now be autodetected
1590  using the mode ``'r|*'``. (Contributed by Lars Gustäbel.)
1591
1592  .. patch 918101
1593
1594* The :mod:`threading` module now lets you set the stack size used when new
1595  threads are created. The ``stack_size([*size*])`` function returns the
1596  currently configured stack size, and supplying the optional *size* parameter
1597  sets a new value.  Not all platforms support changing the stack size, but
1598  Windows, POSIX threading, and OS/2 all do. (Contributed by Andrew MacIntyre.)
1599
1600  .. Patch 1454481
1601
1602* The :mod:`unicodedata` module has been updated to use version 4.1.0 of the
1603  Unicode character database.  Version 3.2.0 is required  by some specifications,
1604  so it's still available as  :attr:`unicodedata.ucd_3_2_0`.
1605
1606* New module: the  :mod:`uuid` module generates  universally unique identifiers
1607  (UUIDs) according to :rfc:`4122`.  The RFC defines several different UUID
1608  versions that are generated from a starting string, from system properties, or
1609  purely randomly.  This module contains a :class:`UUID` class and  functions
1610  named :func:`uuid1`, :func:`uuid3`, :func:`uuid4`,  and  :func:`uuid5` to
1611  generate different versions of UUID.  (Version 2 UUIDs  are not specified in
1612  :rfc:`4122` and are not supported by this module.) ::
1613
1614     >>> import uuid
1615     >>> # make a UUID based on the host ID and current time
1616     >>> uuid.uuid1()
1617     UUID('a8098c1a-f86e-11da-bd1a-00112444be1e')
1618
1619     >>> # make a UUID using an MD5 hash of a namespace UUID and a name
1620     >>> uuid.uuid3(uuid.NAMESPACE_DNS, 'python.org')
1621     UUID('6fa459ea-ee8a-3ca4-894e-db77e160355e')
1622
1623     >>> # make a random UUID
1624     >>> uuid.uuid4()
1625     UUID('16fd2706-8baf-433b-82eb-8c7fada847da')
1626
1627     >>> # make a UUID using a SHA-1 hash of a namespace UUID and a name
1628     >>> uuid.uuid5(uuid.NAMESPACE_DNS, 'python.org')
1629     UUID('886313e1-3b8a-5372-9b90-0c9aee199e5d')
1630
1631  (Contributed by Ka-Ping Yee.)
1632
1633* The :mod:`weakref` module's :class:`WeakKeyDictionary` and
1634  :class:`WeakValueDictionary` types gained new methods for iterating over the
1635  weak references contained in the dictionary.  :meth:`iterkeyrefs` and
1636  :meth:`keyrefs` methods were added to :class:`WeakKeyDictionary`, and
1637  :meth:`itervaluerefs` and :meth:`valuerefs` were added to
1638  :class:`WeakValueDictionary`.  (Contributed by Fred L. Drake, Jr.)
1639
1640* The :mod:`webbrowser` module received a number of enhancements. It's now
1641  usable as a script with ``python -m webbrowser``, taking a URL as the argument;
1642  there are a number of switches  to control the behaviour (:option:`!-n` for a new
1643  browser window,  :option:`!-t` for a new tab).  New module-level functions,
1644  :func:`open_new` and :func:`open_new_tab`, were added  to support this.  The
1645  module's :func:`open` function supports an additional feature, an *autoraise*
1646  parameter that signals whether to raise the open window when possible. A number
1647  of additional browsers were added to the supported list such as Firefox, Opera,
1648  Konqueror, and elinks.  (Contributed by Oleg Broytmann and Georg Brandl.)
1649
1650  .. Patch #754022
1651
1652* The :mod:`xmlrpclib` module now supports returning  :class:`~datetime.datetime` objects
1653  for the XML-RPC date type.  Supply  ``use_datetime=True`` to the :func:`loads`
1654  function or the :class:`Unmarshaller` class to enable this feature. (Contributed
1655  by Skip Montanaro.)
1656
1657  .. Patch 1120353
1658
1659* The :mod:`zipfile` module now supports the ZIP64 version of the  format,
1660  meaning that a .zip archive can now be larger than 4 GiB and can contain
1661  individual files larger than 4 GiB.  (Contributed by Ronald Oussoren.)
1662
1663  .. Patch 1446489
1664
1665* The :mod:`zlib` module's :class:`Compress` and :class:`Decompress` objects now
1666  support a :meth:`copy` method that makes a copy of the  object's internal state
1667  and returns a new  :class:`Compress` or :class:`Decompress` object.
1668  (Contributed by Chris AtLee.)
1669
1670  .. Patch 1435422
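
The multiple-field support in :func:`operator.itemgetter` and
:func:`operator.attrgetter`, mentioned in the list above, is easiest to see in a
sort.  The following is only a sketch: the ``Employee`` class and the sample
records are invented for illustration.

```python
import operator

class Employee(object):
    """Hypothetical record type, used only for this illustration."""
    def __init__(self, dept, name, salary):
        self.dept = dept
        self.name = name
        self.salary = salary

staff = [Employee('ops', 'Lee', 50),
         Employee('dev', 'Kim', 70),
         Employee('dev', 'Ann', 70)]

# attrgetter('dept', 'name') returns a (dept, name) tuple for each object,
# so the list is ordered by department and then by name within a department.
staff.sort(key=operator.attrgetter('dept', 'name'))
ordered = [(e.dept, e.name) for e in staff]

# itemgetter(0, 2) builds the sort key from fields 0 and 2 of each tuple.
rows = [('dev', 'Kim', 70), ('ops', 'Lee', 50), ('dev', 'Ann', 70)]
by_dept_salary = sorted(rows, key=operator.itemgetter(0, 2))
```

Because Python's sort is stable, rows whose keys compare equal keep their
original relative order.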
1671
1672.. ======================================================================
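
As a rough sketch of the :mod:`struct` buffer support described in the list
above, two records can be packed side by side into one preallocated buffer and
read back by offset.  This version assumes a current Python 3 interpreter, so it
uses a ``bytearray`` and bytes literals; on 2.5 itself an :mod:`array` or a
:mod:`ctypes` buffer would play the same role.  The record values are made up.

```python
import struct

s = struct.Struct('ih3s')

# One writable buffer large enough for two packed records.
buf = bytearray(s.size * 2)

# pack_into() writes in place at the given offset instead of returning a string.
s.pack_into(buf, 0, 1972, 187, b'abc')
s.pack_into(buf, s.size, 2006, 25, b'xyz')

# unpack_from() reads a record starting at the given offset.
first = s.unpack_from(buf, 0)
second = s.unpack_from(buf, s.size)
```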
1673
1674
1675.. _module-ctypes:
1676
1677The ctypes package
1678------------------
1679
1680The :mod:`ctypes` package, written by Thomas Heller, has been added  to the
1681standard library.  :mod:`ctypes` lets you call arbitrary functions  in shared
1682libraries or DLLs.  Long-time users may remember the :mod:`dl` module, which
1683provides functions for loading shared libraries and calling functions in them.
1684The :mod:`ctypes` package is much fancier.
1685
1686To load a shared library or DLL, you must create an instance of the
1687:class:`CDLL` class and provide the name or path of the shared library or DLL.
1688Once that's done, you can call arbitrary functions by accessing them as
1689attributes of the :class:`CDLL` object.   ::
1690
1691   import ctypes
1692
1693   libc = ctypes.CDLL('libc.so.6')
1694   result = libc.printf("Line of output\n")
1695
1696Type constructors for the various C types are provided: :func:`c_int`,
1697:func:`c_float`, :func:`c_double`, :func:`c_char_p` (equivalent to :c:type:`char
1698\*`), and so forth.  Unlike Python's types, the C versions are all mutable; you
1699can assign to their :attr:`value` attribute to change the wrapped value.  Python
1700integers and strings will be automatically converted to the corresponding C
1701types, but for other types you  must call the correct type constructor.  (And I
1702mean *must*;  getting it wrong will often result in the interpreter crashing
1703with a segmentation fault.)
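
A small sketch of the mutability mentioned above, needing no shared library at
all.  It is written for a current Python 3 interpreter, which is why the string
passed to :func:`c_char_p` is a bytes literal; the variable names are arbitrary.

```python
import ctypes

# Each constructor wraps a Python value in the corresponding C type.
n = ctypes.c_int(42)
x = ctypes.c_double(2.5)
p = ctypes.c_char_p(b"abc")

# Unlike a Python int or float, the wrapper can be changed in place
# by assigning to its value attribute.
n.value = n.value + 1
x.value = x.value * 2
```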
1704
1705You shouldn't use :func:`c_char_p` with a Python string when the C function will
1706be modifying the memory area, because Python strings are  supposed to be
1707immutable; breaking this rule will cause puzzling bugs.  When you need a
1708modifiable memory area, use :func:`create_string_buffer`::
1709
1710   s = "this is a string"
1711   buf = ctypes.create_string_buffer(s)
1712   libc.strfry(buf)
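
Since :c:func:`strfry` is specific to the GNU C library, here is a portable
sketch of the same idea.  It assumes a current Python 3 interpreter (hence the
bytes literal) and uses :func:`ctypes.memset` as a stand-in for a C function
that writes into the memory it is given.

```python
import ctypes

original = b"this is a string"

# The buffer holds a mutable copy of the data plus a terminating NUL byte.
buf = ctypes.create_string_buffer(original)

# Overwrite the first four bytes of the buffer in place.
ctypes.memset(buf, ord('x'), 4)
changed = buf.value
```

The Python object bound to ``original`` is untouched; only the buffer's copy is
modified.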
1713
1714C functions are assumed to return integers, but you can set the :attr:`restype`
1715attribute of the function object to  change this::
1716
1717   >>> libc.atof('2.71828')
1718   -1783957616
1719   >>> libc.atof.restype = ctypes.c_double
1720   >>> libc.atof('2.71828')
1721   2.71828
1722
1723:mod:`ctypes` also provides a wrapper for Python's C API  as the
1724``ctypes.pythonapi`` object.  This object does *not*  release the global
1725interpreter lock before calling a function, because the lock must be held when
1726calling into the interpreter's code.   There's a :class:`py_object()` type
1727constructor that will create a  :c:type:`PyObject \*` pointer.  A simple usage::
1728
1729   import ctypes
1730
1731   d = {}
1732   ctypes.pythonapi.PyObject_SetItem(ctypes.py_object(d),
1733             ctypes.py_object("abc"),  ctypes.py_object(1))
   # d is now {'abc': 1}.
1735
1736Don't forget to use :class:`py_object()`; if it's omitted you end  up with a
1737segmentation fault.
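
The :attr:`restype` attribute applies to functions reached through
``ctypes.pythonapi`` as well.  The following sketch, which assumes a current
Python 3 interpreter, calls the C-level :c:func:`Py_GetVersion` function; the
names ``get_version`` and ``version_text`` are just local choices.

```python
import ctypes
import sys

get_version = ctypes.pythonapi.Py_GetVersion

# Py_GetVersion() returns a char *; without this the result would be
# treated as a C int.
get_version.restype = ctypes.c_char_p
get_version.argtypes = []

raw = get_version()                 # a bytes object on Python 3
version_text = raw.decode('ascii')
```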
1738
:mod:`ctypes` has been around for a while, but people still write and
distribute hand-coded extension modules because you can't rely on
1741:mod:`ctypes` being present. Perhaps developers will begin to write  Python
1742wrappers atop a library accessed through :mod:`ctypes` instead of extension
1743modules, now that :mod:`ctypes` is included with core Python.
1744
1745
1746.. seealso::
1747
1748   http://starship.python.net/crew/theller/ctypes/
1749      The ctypes web page, with a tutorial, reference, and FAQ.
1750
1751   The documentation  for the :mod:`ctypes` module.
1752
1753.. ======================================================================
1754
1755
1756.. _module-etree:
1757
1758The ElementTree package
1759-----------------------
1760
1761A subset of Fredrik Lundh's ElementTree library for processing XML has been
1762added to the standard library as :mod:`xml.etree`.  The available modules are
1763:mod:`ElementTree`, :mod:`ElementPath`, and :mod:`ElementInclude` from
1764ElementTree 1.2.6.    The :mod:`cElementTree` accelerator module is also
1765included.
1766
1767The rest of this section will provide a brief overview of using ElementTree.
1768Full documentation for ElementTree is available at
1769http://effbot.org/zone/element-index.htm.
1770
1771ElementTree represents an XML document as a tree of element nodes. The text
content of the document is stored as the :attr:`text` and :attr:`tail`
attributes of element nodes.  (This is one of the major differences between ElementTree and
1774the Document Object Model; in the DOM there are many different types of node,
1775including :class:`TextNode`.)
1776
The most commonly used parsing function is :func:`parse`, which takes either a
1778string (assumed to contain a filename) or a file-like object and returns an
1779:class:`ElementTree` instance::
1780
   import urllib
   from xml.etree import ElementTree as ET

   # Parse from a filename
   tree = ET.parse('ex-1.xml')

   # Parse from a file-like object
   feed = urllib.urlopen(
             'http://planet.python.org/rss10.xml')
   tree = ET.parse(feed)
1788
1789Once you have an :class:`ElementTree` instance, you can call its :meth:`getroot`
1790method to get the root :class:`Element` node.
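
As a self-contained sketch of :func:`parse` followed by :meth:`getroot`, an
in-memory file-like object can stand in for a real file or network stream.  The
document content here is invented, and the code assumes a current Python 3
interpreter.

```python
import io
from xml.etree import ElementTree as ET

# Any object with a read() method will do; BytesIO avoids touching the disk.
document = io.BytesIO(
    b"<feed><entry>first</entry><entry>second</entry></feed>")

tree = ET.parse(document)
root = tree.getroot()

# Iterating over an element yields its child elements in document order.
titles = [child.text for child in root]
```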
1791
1792There's also an :func:`XML` function that takes a string literal and returns an
1793:class:`Element` node (not an :class:`ElementTree`).   This function provides a
1794tidy way to incorporate XML fragments, approaching the convenience of an XML
1795literal::
1796
1797   svg = ET.XML("""<svg width="10px" version="1.0">
1798                </svg>""")
1799   svg.set('height', '320px')
1800   svg.append(elem1)
1801
1802Each XML element supports some dictionary-like and some list-like access
1803methods.  Dictionary-like operations are used to access attribute values, and
1804list-like operations are used to access child nodes.
1805
1806+-------------------------------+--------------------------------------------+
1807| Operation                     | Result                                     |
1808+===============================+============================================+
1809| ``elem[n]``                   | Returns n'th child element.                |
1810+-------------------------------+--------------------------------------------+
1811| ``elem[m:n]``                 | Returns list of m'th through n'th child    |
1812|                               | elements.                                  |
1813+-------------------------------+--------------------------------------------+
1814| ``len(elem)``                 | Returns number of child elements.          |
1815+-------------------------------+--------------------------------------------+
1816| ``list(elem)``                | Returns list of child elements.            |
1817+-------------------------------+--------------------------------------------+
1818| ``elem.append(elem2)``        | Adds *elem2* as a child.                   |
1819+-------------------------------+--------------------------------------------+
1820| ``elem.insert(index, elem2)`` | Inserts *elem2* at the specified location. |
1821+-------------------------------+--------------------------------------------+
1822| ``del elem[n]``               | Deletes n'th child element.                |
1823+-------------------------------+--------------------------------------------+
1824| ``elem.keys()``               | Returns list of attribute names.           |
1825+-------------------------------+--------------------------------------------+
1826| ``elem.get(name)``            | Returns value of attribute *name*.         |
1827+-------------------------------+--------------------------------------------+
1828| ``elem.set(name, value)``     | Sets new value for attribute *name*.       |
1829+-------------------------------+--------------------------------------------+
1830| ``elem.attrib``               | Retrieves the dictionary containing        |
1831|                               | attributes.                                |
1832+-------------------------------+--------------------------------------------+
1833| ``del elem.attrib[name]``     | Deletes attribute *name*.                  |
1834+-------------------------------+--------------------------------------------+
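
The operations in the table can be tried out on a small fragment.  This is only
a sketch; the element and attribute names are arbitrary.

```python
from xml.etree import ElementTree as ET

svg = ET.XML('<svg width="10px"><rect/><circle/></svg>')

# Dictionary-like operations work on attributes.
svg.set('height', '320px')
names = sorted(svg.keys())
width = svg.get('width')

# List-like operations work on child elements.
first = svg[0].tag
svg.append(ET.Element('line'))        # children: rect, circle, line
svg.insert(0, ET.Element('title'))    # children: title, rect, circle, line
del svg[1]                            # removes the rect element
tags = [child.tag for child in svg]

# Attributes are removed through the attrib dictionary.
del svg.attrib['width']
```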
1835
1836Comments and processing instructions are also represented as :class:`Element`
nodes.  To check whether a node is a comment or a processing instruction::
1838
1839   if elem.tag is ET.Comment:
1840       ...
1841   elif elem.tag is ET.ProcessingInstruction:
1842       ...
1843
1844To generate XML output, you should call the :meth:`ElementTree.write` method.
1845Like :func:`parse`, it can take either a string or a file-like object::
1846
1847   # Encoding is US-ASCII
1848   tree.write('output.xml')
1849
1850   # Encoding is UTF-8
1851   f = open('output.xml', 'w')
1852   tree.write(f, encoding='utf-8')
1853
1854(Caution: the default encoding used for output is ASCII.  For general XML work,
1855where an element's name may contain arbitrary Unicode characters, ASCII isn't a
1856very useful encoding because it will raise an exception if an element's name
1857contains any characters with values greater than 127.  Therefore, it's best to
1858specify a different encoding such as UTF-8 that can handle any Unicode
1859character.)
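
A minimal sketch of writing with an explicit encoding, assuming a current
Python 3 interpreter.  It writes into an in-memory buffer rather than a file,
uses non-ASCII text rather than a non-ASCII element name, and then parses the
result to show that the character survives the round trip.

```python
import io
from xml.etree import ElementTree as ET

root = ET.Element('greeting')
root.text = u'caf\xe9'              # contains one non-ASCII character
tree = ET.ElementTree(root)

# Ask for UTF-8 explicitly instead of relying on the ASCII default.
out = io.BytesIO()
tree.write(out, encoding='utf-8')
data = out.getvalue()

# Parsing the output again recovers the original text.
round_trip = ET.XML(data)
```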
1860
1861This section is only a partial description of the ElementTree interfaces. Please
1862read the package's official documentation for more details.
1863
1864
1865.. seealso::
1866
1867   http://effbot.org/zone/element-index.htm
1868      Official documentation for ElementTree.
1869
1870.. ======================================================================
1871
1872
1873.. _module-hashlib:
1874
1875The hashlib package
1876-------------------
1877
1878A new :mod:`hashlib` module, written by Gregory P. Smith,  has been added to
1879replace the :mod:`md5` and :mod:`sha` modules.  :mod:`hashlib` adds support for
1880additional secure hashes (SHA-224, SHA-256, SHA-384, and SHA-512). When
available, the module uses OpenSSL for fast, platform-optimized implementations
1882of algorithms.
1883
1884The old :mod:`md5` and :mod:`sha` modules still exist as wrappers around hashlib
1885to preserve backwards compatibility.  The new module's interface is very close
1886to that of the old modules, but not identical. The most significant difference
1887is that the constructor functions for creating new hashing objects are named
1888differently. ::
1889
1890   # Old versions
1891   h = md5.md5()
1892   h = md5.new()
1893
1894   # New version
1895   h = hashlib.md5()
1896
1897   # Old versions
1898   h = sha.sha()
1899   h = sha.new()
1900
1901   # New version
1902   h = hashlib.sha1()
1903
   # Hashes that weren't previously available
1905   h = hashlib.sha224()
1906   h = hashlib.sha256()
1907   h = hashlib.sha384()
1908   h = hashlib.sha512()
1909
1910   # Alternative form
1911   h = hashlib.new('md5')          # Provide algorithm as a string
1912
1913Once a hash object has been created, its methods are the same as before:
1914``update(string)`` hashes the specified string into the  current digest
1915state, :meth:`digest` and :meth:`hexdigest` return the digest value as a binary
1916string or a string of hex digits, and :meth:`copy` returns a new hashing object
1917with the same digest state.
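
The methods can be exercised as in the sketch below, which assumes a current
Python 3 interpreter (so the data being hashed is written as bytes literals).
The input strings are arbitrary.

```python
import hashlib

h = hashlib.sha256()
h.update(b"What's New ")

# copy() captures the digest state reached so far.
snapshot = h.copy()

# The two objects now evolve independently.
h.update(b"in Python 2.5")
snapshot.update(b"in Python 2.4")

hex_value = h.hexdigest()       # string of hex digits
raw_value = h.digest()          # the same digest as a binary string

# Feeding the data in pieces gives the same result as hashing it at once.
one_shot = hashlib.new('sha256', b"What's New in Python 2.5").hexdigest()
```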
1918
1919
1920.. seealso::
1921
1922   The documentation  for the :mod:`hashlib` module.
1923
1924.. ======================================================================
1925
1926
1927.. _module-sqlite:
1928
1929The sqlite3 package
1930-------------------
1931
1932The pysqlite module (http://www.pysqlite.org), a wrapper for the SQLite embedded
1933database, has been added to the standard library under the package name
1934:mod:`sqlite3`.
1935
1936SQLite is a C library that provides a lightweight disk-based database that
1937doesn't require a separate server process and allows accessing the database
1938using a nonstandard variant of the SQL query language. Some applications can use
1939SQLite for internal data storage.  It's also possible to prototype an
1940application using SQLite and then port the code to a larger database such as
1941PostgreSQL or Oracle.
1942
1943pysqlite was written by Gerhard Häring and provides a SQL interface compliant
1944with the DB-API 2.0 specification described by :pep:`249`.
1945
1946If you're compiling the Python source yourself, note that the source tree
1947doesn't include the SQLite code, only the wrapper module. You'll need to have
1948the SQLite libraries and headers installed before compiling Python, and the
1949build process will compile the module when the necessary headers are available.
1950
1951To use the module, you must first create a :class:`Connection` object that
1952represents the database.  Here the data will be stored in the
1953:file:`/tmp/example` file::
1954
1955   conn = sqlite3.connect('/tmp/example')
1956
1957You can also supply the special name ``:memory:`` to create a database in RAM.
1958
1959Once you have a :class:`Connection`, you can create a :class:`Cursor`  object
1960and call its :meth:`execute` method to perform SQL commands::
1961
1962   c = conn.cursor()
1963
1964   # Create table
1965   c.execute('''create table stocks
1966   (date text, trans text, symbol text,
1967    qty real, price real)''')
1968
1969   # Insert a row of data
1970   c.execute("""insert into stocks
1971             values ('2006-01-05','BUY','RHAT',100,35.14)""")
1972
1973Usually your SQL operations will need to use values from Python variables.  You
1974shouldn't assemble your query using Python's string operations because doing so
1975is insecure; it makes your program vulnerable to an SQL injection attack.
1976
1977Instead, use the DB-API's parameter substitution.  Put ``?`` as a placeholder
1978wherever you want to use a value, and then provide a tuple of values as the
1979second argument to the cursor's :meth:`execute` method.  (Other database modules
1980may use a different placeholder, such as ``%s`` or ``:1``.) For example::
1981
1982   # Never do this -- insecure!
1983   symbol = 'IBM'
1984   c.execute("... where symbol = '%s'" % symbol)
1985
1986   # Do this instead
1987   t = (symbol,)
1988   c.execute('select * from stocks where symbol=?', t)
1989
1990   # Larger example
1991   for t in (('2006-03-28', 'BUY', 'IBM', 1000, 45.00),
1992             ('2006-04-05', 'BUY', 'MSOFT', 1000, 72.00),
1993             ('2006-04-06', 'SELL', 'IBM', 500, 53.00),
1994            ):
1995       c.execute('insert into stocks values (?,?,?,?,?)', t)
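
To see why string formatting is dangerous, here is a sketch using an in-memory
database.  The simplified table layout and the ``hostile`` input are invented
for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
c = conn.cursor()
c.execute('create table stocks (symbol text, qty real)')
c.execute("insert into stocks values ('IBM', 1000)")
c.execute("insert into stocks values ('RHAT', 100)")

# Input that a malicious user might supply in place of a stock symbol.
hostile = "nothing' or '1'='1"

# String formatting splices the input into the SQL text, so its quote
# characters rewrite the WHERE clause and every row matches.
c.execute("select symbol from stocks where symbol = '%s'" % hostile)
leaked = c.fetchall()

# With a placeholder the input is only ever treated as a value,
# and no stock has that symbol.
c.execute('select symbol from stocks where symbol = ?', (hostile,))
safe = c.fetchall()

conn.close()
```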
1996
1997To retrieve data after executing a SELECT statement, you can either  treat the
1998cursor as an iterator, call the cursor's :meth:`fetchone` method to retrieve a
1999single matching row,  or call :meth:`fetchall` to get a list of the matching
2000rows.
2001
2002This example uses the iterator form::
2003
2004   >>> c = conn.cursor()
2005   >>> c.execute('select * from stocks order by price')
2006   >>> for row in c:
2007   ...    print row
2008   ...
2009   (u'2006-01-05', u'BUY', u'RHAT', 100, 35.140000000000001)
2010   (u'2006-03-28', u'BUY', u'IBM', 1000, 45.0)
2011   (u'2006-04-06', u'SELL', u'IBM', 500, 53.0)
2012   (u'2006-04-05', u'BUY', u'MSOFT', 1000, 72.0)
2013   >>>
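
The other two retrieval styles, :meth:`fetchone` and :meth:`fetchall`, might be
used as in the following sketch.  It builds its own in-memory copy of the
``stocks`` table with a few sample rows, so that it runs on its own.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
c = conn.cursor()
c.execute('''create table stocks
             (date text, trans text, symbol text, qty real, price real)''')
for t in (('2006-01-05', 'BUY', 'RHAT', 100, 35.14),
          ('2006-03-28', 'BUY', 'IBM', 1000, 45.00),
          ('2006-04-05', 'BUY', 'MSOFT', 1000, 72.00)):
    c.execute('insert into stocks values (?,?,?,?,?)', t)
conn.commit()

c.execute('select symbol, price from stocks order by price')

# fetchone() returns the next matching row as a tuple.
cheapest = c.fetchone()

# fetchall() returns the rows that have not been retrieved yet, as a list.
remaining = c.fetchall()

# Once the results are exhausted, fetchone() returns None.
after_end = c.fetchone()

conn.close()
```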
2014
2015For more information about the SQL dialect supported by SQLite, see
2016https://www.sqlite.org.
2017
2018
2019.. seealso::
2020
2021   http://www.pysqlite.org
2022      The pysqlite web page.
2023
2024   https://www.sqlite.org
2025      The SQLite web page; the documentation describes the syntax and the available
2026      data types for the supported SQL dialect.
2027
2028   The documentation  for the :mod:`sqlite3` module.
2029
2030   :pep:`249` - Database API Specification 2.0
2031      PEP written by Marc-André Lemburg.
2032
2033.. ======================================================================
2034
2035
2036.. _module-wsgiref:
2037
2038The wsgiref package
2039-------------------
2040
2041The Web Server Gateway Interface (WSGI) v1.0 defines a standard interface
2042between web servers and Python web applications and is described in :pep:`333`.
2043The :mod:`wsgiref` package is a reference implementation of the WSGI
2044specification.
2045
2046.. XXX should this be in a PEP 333 section instead?
2047
The package includes a basic HTTP server that will run a WSGI application; this
server is useful for debugging but isn't intended for production use.  Setting
up a server takes only a few lines of code::

   from wsgiref import simple_server

   wsgi_app = ...

   host = ''
   port = 8000
   httpd = simple_server.make_server(host, port, wsgi_app)
   httpd.serve_forever()

.. XXX discuss structure of WSGI applications?
.. XXX provide an example using Django or some other framework?

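The ``wsgi_app`` placeholder above stands for any WSGI application, which
:pep:`333` defines as a callable that takes the request environment and a
``start_response`` callable and returns an iterable of body strings.  A
minimal sketch follows; the names ``hello_app``, ``GREETING``,
``fake_start_response`` and ``captured`` are hypothetical, and the
application is called directly rather than through a server so that the
snippet terminates::

   from wsgiref.util import setup_testing_defaults

   # Encoding the text keeps the body a byte string on every Python version.
   GREETING = 'Hello, WSGI\n'.encode('ascii')

   def hello_app(environ, start_response):
       """Hypothetical WSGI application that always returns a greeting."""
       start_response('200 OK', [('Content-Type', 'text/plain')])
       return [GREETING]

   # Build a request environment without starting a server.
   environ = {}
   setup_testing_defaults(environ)   # fills in the required WSGI keys

   captured = {}

   def fake_start_response(status, headers, exc_info=None):
       """Record what the application passes to start_response."""
       captured['status'] = status
       captured['headers'] = headers

   chunks = list(hello_app(environ, fake_start_response))

Passing ``hello_app`` as the third argument of
``simple_server.make_server`` would serve the same response over HTTP.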

.. seealso::

   http://www.wsgi.org
      A central web site for WSGI-related resources.

   :pep:`333` - Python Web Server Gateway Interface v1.0
      PEP written by Phillip J. Eby.

.. ======================================================================


.. _build-api:

Build and C API Changes
=======================

Changes to Python's build process and to the C API include:

* The Python source tree was converted from CVS to Subversion, in a complex
  migration procedure that was supervised and flawlessly carried out by Martin von
  Löwis.  The procedure was developed as :pep:`347`.

* Coverity, a company that markets a source code analysis tool called Prevent,
  provided the results of their examination of the Python source code.  The
  analysis found about 60 bugs that were quickly fixed.  Many of the bugs were
  refcounting problems, often occurring in error-handling code.  See
  https://scan.coverity.com for the statistics.

* The largest change to the C API came from :pep:`353`, which modifies the
  interpreter to use a :c:type:`Py_ssize_t` type definition instead of
  :c:type:`int`.  See the earlier section :ref:`pep-353` for a discussion of this
  change.

* The design of the bytecode compiler has changed a great deal, no longer
  generating bytecode by traversing the parse tree.  Instead the parse tree is
  converted to an abstract syntax tree (or AST), and it is the abstract syntax
  tree that's traversed to produce the bytecode.

  It's possible for Python code to obtain AST objects by using the
  :func:`compile` built-in and specifying ``_ast.PyCF_ONLY_AST`` as the value of
  the *flags* parameter::

     from _ast import PyCF_ONLY_AST
     ast = compile("""a=0
     for i in range(10):
         a += i
     """, "<string>", 'exec', PyCF_ONLY_AST)

     assignment = ast.body[0]
     for_loop = ast.body[1]

  No official documentation has been written for the AST code yet, but :pep:`339`
  discusses the design.  To start learning about the code, read the definition of
  the various AST nodes in :file:`Parser/Python.asdl`.  A Python script reads this
  file and generates a set of C structure definitions in
  :file:`Include/Python-ast.h`.  The :c:func:`PyParser_ASTFromString` and
  :c:func:`PyParser_ASTFromFile` functions, defined in
  :file:`Include/pythonrun.h`, take Python source as input and return the root of
  an AST representing the contents.  This AST can then be turned into a code
  object by :c:func:`PyAST_Compile`.  For more information, read the source code,
  and then ask questions on python-dev.

  The AST code was developed under Jeremy Hylton's management, and implemented by
  (in alphabetical order) Brett Cannon, Nick Coghlan, Grant Edwards, John
  Ehresman, Kurt Kaiser, Neal Norwitz, Tim Peters, Armin Rigo, and Neil
  Schemenauer, plus the participants in a number of AST sprints at conferences
  such as PyCon.

  .. List of names taken from Jeremy's python-dev post at
  .. https://mail.python.org/pipermail/python-dev/2005-October/057500.html

* Evan Jones's patch to obmalloc, first described in a talk at PyCon DC 2005,
  was applied.  Python 2.4 allocated small objects in 256K-sized arenas, but never
  freed arenas.  With this patch, Python will free arenas when they're empty.  The
  net effect is that on some platforms, when you allocate many objects, Python's
  memory usage may actually drop when you delete them and the memory may be
  returned to the operating system.  (Implemented by Evan Jones, and reworked by
  Tim Peters.)

  Note that this change means extension modules must be more careful when
  allocating memory.  Python's API has many different functions for allocating
  memory that are grouped into families.  For example, :c:func:`PyMem_Malloc`,
  :c:func:`PyMem_Realloc`, and :c:func:`PyMem_Free` are one family that allocates
  raw memory, while :c:func:`PyObject_Malloc`, :c:func:`PyObject_Realloc`, and
  :c:func:`PyObject_Free` are another family that's supposed to be used for
  creating Python objects.

  Previously these different families all reduced to the platform's
  :c:func:`malloc` and :c:func:`free` functions.  This meant it didn't matter if
  you got things wrong and allocated memory with a :c:func:`PyMem` function but
  freed it with a :c:func:`PyObject` function.  With 2.5's changes to obmalloc,
  these families now do different things and mismatches will probably result in a
  segfault.  You should carefully test your C extension modules with Python 2.5.

* The built-in set types now have an official C API.  Call :c:func:`PySet_New`
  and :c:func:`PyFrozenSet_New` to create a new set, :c:func:`PySet_Add` and
  :c:func:`PySet_Discard` to add and remove elements, and :c:func:`PySet_Contains`
  and :c:func:`PySet_Size` to examine the set's state. (Contributed by Raymond
  Hettinger.)

* C code can now obtain information about the exact revision of the Python
  interpreter by calling the :c:func:`Py_GetBuildInfo` function, which returns a
  string of build information like this: ``"trunk:45355:45356M, Apr 13 2006,
  07:42:19"``.  (Contributed by Barry Warsaw.)

* Two new macros can be used to indicate C functions that are local to the
  current file so that a faster calling convention can be used.
  ``Py_LOCAL(type)`` declares the function as returning a value of the
  specified *type* and uses a fast-calling qualifier.
  ``Py_LOCAL_INLINE(type)`` does the same thing and also requests the
  function be inlined.  If :c:func:`PY_LOCAL_AGGRESSIVE` is defined before
  :file:`Python.h` is included, a set of more aggressive optimizations is enabled
  for the module; you should benchmark the results to find out if these
  optimizations actually make the code faster.  (Contributed by Fredrik Lundh at
  the NeedForSpeed sprint.)

* ``PyErr_NewException(name, base, dict)`` can now accept a tuple of base
  classes as its *base* argument.  (Contributed by Georg Brandl.)

* The :c:func:`PyErr_Warn` function for issuing warnings is now deprecated in
  favour of ``PyErr_WarnEx(category, message, stacklevel)`` which lets you
  specify the number of stack frames separating this function and the caller.  A
  *stacklevel* of 1 is the function calling :c:func:`PyErr_WarnEx`, 2 is the
  function above that, and so forth.  (Added by Neal Norwitz.)

* The CPython interpreter is still written in C, but the code can now be
  compiled with a C++ compiler without errors.  (Implemented by Anthony Baxter,
  Martin von Löwis, and Skip Montanaro.)

* The :c:func:`PyRange_New` function was removed.  It was never documented, never
  used in the core code, and had dangerously lax error checking.  In the unlikely
  case that your extensions were using it, you can replace it with something like
  the following::

     range = PyObject_CallFunction((PyObject*) &PyRange_Type, "lll",
                                   start, stop, step);

.. ======================================================================

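The AST objects that :func:`compile` returns with ``PyCF_ONLY_AST``, as
described in the list above, are ordinary Python objects, so their node
types can be inspected through their class names.  The following is a small
sketch rather than documented usage; the variable names ``tree``,
``node_names`` and ``loop_body_names`` are hypothetical::

   from _ast import PyCF_ONLY_AST

   source = "a = 0\nfor i in range(10):\n    a += i\n"
   tree = compile(source, "<string>", "exec", PyCF_ONLY_AST)

   # Class name of each top-level statement node.
   node_names = [node.__class__.__name__ for node in tree.body]

   # The body of the for loop is itself a list of statement nodes.
   loop_body_names = [node.__class__.__name__ for node in tree.body[1].body]

Here ``node_names`` is ``['Assign', 'For']`` and ``loop_body_names`` is
``['AugAssign']``, matching the node definitions in
:file:`Parser/Python.asdl`.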

.. _ports:

Port-Specific Changes
---------------------

* MacOS X (10.3 and higher): dynamic loading of modules now uses the
  :c:func:`dlopen` function instead of MacOS-specific functions.

* MacOS X: an :option:`!--enable-universalsdk` switch was added to the
  :program:`configure` script that compiles the interpreter as a universal binary
  able to run on both PowerPC and Intel processors. (Contributed by Ronald
  Oussoren; :issue:`2573`.)

* Windows: :file:`.dll` is no longer supported as a filename extension for
  extension modules.  :file:`.pyd` is now the only filename extension that will be
  searched for.

.. ======================================================================


.. _porting:

Porting to Python 2.5
=====================

This section lists previously described changes that may require changes to your
code:

* ASCII is now the default encoding for modules.  It's now a syntax error if a
  module contains string literals with 8-bit characters but doesn't have an
  encoding declaration.  In Python 2.4 this triggered a warning, not a syntax
  error.

* Previously, the :attr:`gi_frame` attribute of a generator was always a frame
  object.  Because of the :pep:`342` changes described in section :ref:`pep-342`,
  it's now possible for :attr:`gi_frame` to be ``None``.

* A new warning, :class:`UnicodeWarning`, is triggered when you attempt to
  compare a Unicode string and an 8-bit string that can't be converted to Unicode
  using the default ASCII encoding.  Previously such comparisons would raise a
  :class:`UnicodeDecodeError` exception.

* Library: the :mod:`csv` module is now stricter about multi-line quoted fields.
  If your files contain newlines embedded within fields, the input should be split
  into lines in a manner which preserves the newline characters.

* Library: the :mod:`locale` module's :func:`format` function would
  previously accept any string as long as no more than one %char specifier
  appeared.  In Python 2.5, the argument must be exactly one %char specifier with
  no surrounding text.

* Library: The :mod:`pickle` and :mod:`cPickle` modules no longer accept a
  return value of ``None`` from the :meth:`__reduce__` method; the method must
  return a tuple of arguments instead.  The modules also no longer accept the
  deprecated *bin* keyword parameter.

* Library: The :mod:`SimpleXMLRPCServer` and :mod:`DocXMLRPCServer` classes now
  have a :attr:`rpc_paths` attribute that constrains XML-RPC operations to a
  limited set of URL paths; the default is to allow only ``'/'`` and ``'/RPC2'``.
  Setting :attr:`rpc_paths` to ``None`` or an empty tuple disables this path
  checking.

* C API: Many functions now use :c:type:`Py_ssize_t` instead of :c:type:`int` to
  allow processing more data on 64-bit machines.  Extension code may need to make
  the same change to avoid warnings and to support 64-bit machines.  See the
  earlier section :ref:`pep-353` for a discussion of this change.

* C API: The obmalloc changes mean that you must be careful not to mix usage
  of the :c:func:`PyMem_\*` and :c:func:`PyObject_\*` families of functions. Memory
  allocated with one family's :c:func:`\*_Malloc` must be freed with the
  corresponding family's :c:func:`\*_Free` function.

.. ======================================================================

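The :attr:`gi_frame` change listed above can be observed directly from
Python code.  This is a minimal sketch; the generator ``countdown`` and the
variable names are hypothetical::

   def countdown(n):
       """Hypothetical generator that yields n, n-1, ..., 1."""
       while n > 0:
           yield n
           n -= 1

   gen = countdown(2)
   frame_before = gen.gi_frame   # a frame object while the generator is live
   values = list(gen)            # run the generator to exhaustion
   frame_after = gen.gi_frame    # None once the generator has finished

Code that inspects ``gi_frame``, such as a debugging helper, should
therefore check for ``None`` before accessing attributes of the frame.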

Acknowledgements
================

The author would like to thank the following people for offering suggestions,
corrections and assistance with various drafts of this article: Georg Brandl,
Nick Coghlan, Phillip J. Eby, Lars Gustäbel, Raymond Hettinger, Ralf W.
Grosse-Kunstleve, Kent Johnson, Iain Lowe, Martin von Löwis, Fredrik Lundh, Andrew
McNamara, Skip Montanaro, Gustavo Niemeyer, Paul Prescod, James Pryor, Mike
Rovner, Scott Weikart, Barry Warsaw, and Thomas Wouters.