****************************
  What's New in Python 2.5
****************************

:Author: A.M. Kuchling

.. |release| replace:: 1.01

.. $Id: whatsnew25.tex 56611 2007-07-29 08:26:10Z georg.brandl $
.. Fix XXX comments

This article explains the new features in Python 2.5. The final release of
Python 2.5 is scheduled for August 2006; :pep:`356` describes the planned
release schedule.

The changes in Python 2.5 are an interesting mix of language and library
improvements. The library enhancements will be more important to Python's user
community, I think, because several widely useful packages were added. New
modules include ElementTree for XML processing (:mod:`xml.etree`),
the SQLite database module (:mod:`sqlite3`), and the :mod:`ctypes`
module for calling C functions.

The language changes are of middling significance. Some pleasant new features
were added, but most of them aren't features that you'll use every day.
Conditional expressions were finally added to the language using a novel syntax;
see section :ref:`pep-308`. The new ':keyword:`with`' statement will make
writing cleanup code easier (section :ref:`pep-343`). Values can now be passed
into generators (section :ref:`pep-342`). Imports can now be written as either
absolute or relative (section :ref:`pep-328`). Some corner cases of exception
handling are handled better (section :ref:`pep-341`). All these improvements
are worthwhile, but they're improvements to one specific language feature or
another; none of them are broad modifications to Python's semantics.

As well as the language and library additions, other improvements and bugfixes
were made throughout the source tree. A search through the SVN change logs
finds there were 353 patches applied and 458 bugs fixed between Python 2.4 and
2.5. (Both figures are likely to be underestimates.)


This article doesn't try to be a complete specification of the new features;
instead changes are briefly introduced using helpful examples. For full
details, you should always refer to the documentation for Python 2.5 at
https://docs.python.org. If you want to understand the complete implementation
and design rationale, refer to the PEP for a particular new feature.

Comments, suggestions, and error reports for this document are welcome; please
e-mail them to the author or open a bug in the Python bug tracker.

.. ======================================================================


.. _pep-308:

PEP 308: Conditional Expressions
================================

For a long time, people have been requesting a way to write conditional
expressions, which are expressions that return value A or value B depending on
whether a Boolean value is true or false. A conditional expression lets you
write a single assignment statement that has the same effect as the following::

   if condition:
       x = true_value
   else:
       x = false_value

There have been endless tedious discussions of syntax on both python-dev and
comp.lang.python. A vote was even held that found the majority of voters wanted
conditional expressions in some form, but there was no syntax that was preferred
by a clear majority. Candidates included C's ``cond ? true_v : false_v``, ``if
cond then true_v else false_v``, and 16 other variations.

Guido van Rossum eventually chose a surprising syntax::

   x = true_value if condition else false_value

Evaluation is still lazy as in existing Boolean expressions, so the order of
evaluation jumps around a bit. The *condition* expression in the middle is
evaluated first, and the *true_value* expression is evaluated only if the
condition was true. Similarly, the *false_value* expression is only evaluated
when the condition is false.

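
The lazy evaluation described above can be observed with a small sketch. The
helper functions ``true_value()``, ``false_value()``, and ``choose()`` are
invented for this illustration; each helper records its own call so that you can
see which branch actually ran.

```python
# Hypothetical helpers: each one records that it was called.
calls = []

def true_value():
    calls.append('true_value')
    return 'A'

def false_value():
    calls.append('false_value')
    return 'B'

def choose(condition):
    # The condition in the middle is evaluated first; only the
    # selected branch is then evaluated.
    return (true_value() if condition else false_value())

first = choose(True)     # only true_value() runs
second = choose(False)   # only false_value() runs
```

After both calls, ``calls`` holds one entry per call to ``choose()``, which
shows that the branch not selected was never evaluated.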

This syntax may seem strange and backwards; why does the condition go in the
*middle* of the expression, and not in the front as in C's ``c ? x : y``? The
decision was checked by applying the new syntax to the modules in the standard
library and seeing how the resulting code read. In many cases where a
conditional expression is used, one value seems to be the 'common case' and one
value is an 'exceptional case', used only on rarer occasions when the condition
isn't met. The conditional syntax makes this pattern a bit more obvious::

   contents = ((doc + '\n') if doc else '')

I read the above statement as meaning "here *contents* is usually assigned a
value of ``doc+'\n'``; sometimes *doc* is empty, in which special case an empty
string is returned." I doubt I will use conditional expressions very often
where there isn't a clear common and uncommon case.

There was some discussion of whether the language should require surrounding
conditional expressions with parentheses. The decision was made to *not*
require parentheses in the Python language's grammar, but as a matter of style I
think you should always use them. Consider these two statements::

   # First version -- no parens
   level = 1 if logging else 0

   # Second version -- with parens
   level = (1 if logging else 0)

In the first version, I think a reader's eye might group the statement into
'level = 1', 'if logging', 'else 0', and think that the condition decides
whether the assignment to *level* is performed. The second version reads
better, in my opinion, because it makes it clear that the assignment is always
performed and the choice is being made between two values.

Another reason for including the brackets: a few odd combinations of list
comprehensions and lambdas could look like incorrect conditional expressions.
See :pep:`308` for some examples.
If you put parentheses around your
conditional expressions, you won't run into this case.


.. seealso::

   :pep:`308` - Conditional Expressions
      PEP written by Guido van Rossum and Raymond D. Hettinger; implemented by Thomas
      Wouters.

.. ======================================================================


.. _pep-309:

PEP 309: Partial Function Application
=====================================

The :mod:`functools` module is intended to contain tools for functional-style
programming.

One useful tool in this module is the :func:`partial` function. For programs
written in a functional style, you'll sometimes want to construct variants of
existing functions that have some of the parameters filled in. Consider a
Python function ``f(a, b, c)``; you could create a new function ``g(b, c)`` that
was equivalent to ``f(1, b, c)``. This is called "partial function
application".

:func:`partial` takes the arguments ``(function, arg1, arg2, ... kwarg1=value1,
kwarg2=value2)``. The resulting object is callable, so you can just call it to
invoke *function* with the filled-in arguments.

Here's a small but realistic example::

   import functools

   def log (message, subsystem):
       "Write the contents of 'message' to the specified subsystem."
       print '%s: %s' % (subsystem, message)
       ...

   server_log = functools.partial(log, subsystem='server')
   server_log('Unable to open socket')

Here's another example, from a program that uses PyGTK. Here a context-sensitive
pop-up menu is being constructed dynamically. The callback provided
for the menu option is a partially applied version of the :meth:`open_item`
method, where the first argument has been provided. ::

   ...
   class Application:
       def open_item(self, path):
          ...
       def init (self):
           open_func = functools.partial(self.open_item, item_path)
           popup_menu.append( ("Open", open_func, 1) )

Another function in the :mod:`functools` module is the
``update_wrapper(wrapper, wrapped)`` function that helps you write
well-behaved decorators. :func:`update_wrapper` copies the name, module, and
docstring attribute to a wrapper function so that tracebacks inside the wrapped
function are easier to understand. For example, you might write::

   def my_decorator(f):
       def wrapper(*args, **kwds):
           print 'Calling decorated function'
           return f(*args, **kwds)
       functools.update_wrapper(wrapper, f)
       return wrapper

:func:`wraps` is a decorator that can be used inside your own decorators to copy
the wrapped function's information. An alternate version of the previous
example would be::

   def my_decorator(f):
       @functools.wraps(f)
       def wrapper(*args, **kwds):
           print 'Calling decorated function'
           return f(*args, **kwds)
       return wrapper


.. seealso::

   :pep:`309` - Partial Function Application
      PEP proposed and written by Peter Harris; implemented by Hye-Shik Chang and Nick
      Coghlan, with adaptations by Raymond Hettinger.

.. ======================================================================


.. _pep-314:

PEP 314: Metadata for Python Software Packages v1.1
===================================================

Some simple dependency support was added to Distutils. The :func:`setup`
function now has ``requires``, ``provides``, and ``obsoletes`` keyword
parameters. When you build a source distribution using the ``sdist`` command,
the dependency information will be recorded in the :file:`PKG-INFO` file.

Another new keyword parameter is ``download_url``, which should be set to a URL
for the package's source code.
This means it's now possible to look up an entry
in the package index, determine the dependencies for a package, and download the
required packages. ::

   from distutils.core import setup

   VERSION = '1.0'
   setup(name='PyPackage',
         version=VERSION,
         requires=['numarray', 'zlib (>=1.1.4)'],
         obsoletes=['OldPackage'],
         download_url=('http://www.example.com/pypackage/dist/pkg-%s.tar.gz'
                       % VERSION),
        )

Another new enhancement to the Python package index at
https://pypi.org is storing source and binary archives for a
package. The new :command:`upload` Distutils command will upload a package to
the repository.

Before a package can be uploaded, you must be able to build a distribution using
the :command:`sdist` Distutils command. Once that works, you can run ``python
setup.py upload`` to add your package to the PyPI archive. Optionally you can
GPG-sign the package by supplying the :option:`!--sign` and :option:`!--identity`
options.

Package uploading was implemented by Martin von Löwis and Richard Jones.


.. seealso::

   :pep:`314` - Metadata for Python Software Packages v1.1
      PEP proposed and written by A.M. Kuchling, Richard Jones, and Fred Drake;
      implemented by Richard Jones and Fred Drake.

.. ======================================================================


.. _pep-328:

PEP 328: Absolute and Relative Imports
======================================

The simpler part of PEP 328 was implemented in Python 2.4: parentheses could now
be used to enclose the names imported from a module using the ``from ... import
...`` statement, making it easier to import many different names.

The more complicated part has been implemented in Python 2.5: importing a module
can be specified to use absolute or package-relative imports. The plan is to
move toward making absolute imports the default in future versions of Python.


Let's say you have a package directory like this::

   pkg/
   pkg/__init__.py
   pkg/main.py
   pkg/string.py

This defines a package named :mod:`pkg` containing the :mod:`pkg.main` and
:mod:`pkg.string` submodules.

Consider the code in the :file:`main.py` module. What happens if it executes
the statement ``import string``? In Python 2.4 and earlier, it will first look
in the package's directory to perform a relative import, find
:file:`pkg/string.py`, import the contents of that file as the
:mod:`pkg.string` module, and bind that module to the name ``string`` in the
:mod:`pkg.main` module's namespace.

That's fine if :mod:`pkg.string` was what you wanted. But what if you wanted
Python's standard :mod:`string` module? There's no clean way to ignore
:mod:`pkg.string` and look for the standard module; generally you had to look at
the contents of ``sys.modules``, which is slightly unclean. Holger Krekel's
:mod:`py.std` package provides a tidier way to perform imports from the standard
library, ``import py; py.std.string.join()``, but that package isn't available
on all Python installations.

Reading code which relies on relative imports is also less clear, because a
reader may be confused about which module, :mod:`string` or :mod:`pkg.string`,
is intended to be used. Python users soon learned not to duplicate the names of
standard library modules in the names of their packages' submodules, but you
can't protect against having your submodule's name being used for a new module
added in a future version of Python.

In Python 2.5, you can switch :keyword:`import`'s behaviour to absolute imports
using a ``from __future__ import absolute_import`` directive. This absolute-import
behaviour will become the default in a future version (probably Python
2.7). Once absolute imports are the default, ``import string`` will always
find the standard library's version.
It's suggested that users should begin
using absolute imports as much as possible, so it's preferable to begin writing
``from pkg import string`` in your code.

Relative imports are still possible by adding a leading period to the module
name when using the ``from ... import`` form::

   # Import names from pkg.string
   from .string import name1, name2
   # Import pkg.string
   from . import string

This imports the :mod:`string` module relative to the current package, so in
:mod:`pkg.main` this will import *name1* and *name2* from :mod:`pkg.string`.
Additional leading periods perform the relative import starting from the parent
of the current package. For example, code in the :mod:`A.B.C` module can do::

   from . import D                 # Imports A.B.D
   from .. import E                # Imports A.E
   from ..F import G               # Imports A.F.G

Leading periods cannot be used with the ``import modname`` form of the import
statement, only the ``from ... import`` form.


.. seealso::

   :pep:`328` - Imports: Multi-Line and Absolute/Relative
      PEP written by Aahz; implemented by Thomas Wouters.

   https://pylib.readthedocs.org/
      The py library by Holger Krekel, which contains the :mod:`py.std` package.

.. ======================================================================


.. _pep-338:

PEP 338: Executing Modules as Scripts
=====================================

The :option:`-m` switch added in Python 2.4 to execute a module as a script
gained a few more abilities. Instead of being implemented in C code inside the
Python interpreter, the switch now uses an implementation in a new module,
:mod:`runpy`.

The :mod:`runpy` module implements a more sophisticated import mechanism so that
it's now possible to run modules in a package such as :mod:`pychecker.checker`.
The module also supports alternative import mechanisms such as the
:mod:`zipimport` module.
This means you can add a .zip archive's path to
``sys.path`` and then use the :option:`-m` switch to execute code from the
archive.


.. seealso::

   :pep:`338` - Executing modules as scripts
      PEP written and implemented by Nick Coghlan.

.. ======================================================================


.. _pep-341:

PEP 341: Unified try/except/finally
===================================

Until Python 2.5, the :keyword:`try` statement came in two flavours. You could
use a :keyword:`finally` block to ensure that code is always executed, or one or
more :keyword:`except` blocks to catch specific exceptions. You couldn't
combine both :keyword:`except` blocks and a :keyword:`finally` block, because
generating the right bytecode for the combined version was complicated and it
wasn't clear what the semantics of the combined statement should be.

Guido van Rossum spent some time working with Java, which does support the
equivalent of combining :keyword:`except` blocks and a :keyword:`finally` block,
and this clarified what the statement should mean. In Python 2.5, you can now
write::

   try:
       block-1 ...
   except Exception1:
       handler-1 ...
   except Exception2:
       handler-2 ...
   else:
       else-block
   finally:
       final-block

The code in *block-1* is executed. If the code raises an exception, the various
:keyword:`except` blocks are tested: if the exception is of class
:class:`Exception1`, *handler-1* is executed; otherwise if it's of class
:class:`Exception2`, *handler-2* is executed, and so forth. If no exception is
raised, the *else-block* is executed.

No matter what happened previously, the *final-block* is executed once the code
block is complete and any raised exceptions handled. Even if there's an error in
an exception handler or the *else-block* and a new exception is raised, the code
in the *final-block* is still run.

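
To make the order of execution concrete, here is a small sketch that records
which clauses run. The ``run()`` function and the two sample actions are
invented for this illustration, with :exc:`ZeroDivisionError` and
:exc:`KeyError` standing in for *Exception1* and *Exception2*.

```python
def run(action):
    """Call action() and record which clauses of the try statement execute."""
    trace = []
    try:
        trace.append('block-1')
        action()
    except ZeroDivisionError:
        trace.append('handler-1')
    except KeyError:
        trace.append('handler-2')
    else:
        # Runs only when block-1 raised no exception.
        trace.append('else-block')
    finally:
        # Runs in every case, after the handler or the else-block.
        trace.append('final-block')
    return trace

def divide_by_zero():
    return 1 / 0

def do_nothing():
    return None

failed = run(divide_by_zero)
succeeded = run(do_nothing)
```

In the failing case the trace skips the *else-block*; in the successful case it
skips both handlers. The *final-block* entry appears last in both traces.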


.. seealso::

   :pep:`341` - Unifying try-except and try-finally
      PEP written by Georg Brandl; implementation by Thomas Lee.

.. ======================================================================


.. _pep-342:

PEP 342: New Generator Features
===============================

Python 2.5 adds a simple way to pass values *into* a generator. As introduced in
Python 2.3, generators only produce output; once a generator's code was invoked
to create an iterator, there was no way to pass any new information into the
function when its execution was resumed. Sometimes the ability to pass in some
information would be useful. Hackish solutions to this include making the
generator's code look at a global variable and then changing the global
variable's value, or passing in some mutable object that callers then modify.

To refresh your memory of basic generators, here's a simple example::

   def counter (maximum):
       i = 0
       while i < maximum:
           yield i
           i += 1

When you call ``counter(10)``, the result is an iterator that returns the values
from 0 up to 9. On encountering the :keyword:`yield` statement, the iterator
returns the provided value and suspends the function's execution, preserving the
local variables. Execution resumes on the following call to the iterator's
:meth:`next` method, picking up after the :keyword:`yield` statement.

In Python 2.3, :keyword:`yield` was a statement; it didn't return any value. In
2.5, :keyword:`yield` is now an expression, returning a value that can be
assigned to a variable or otherwise operated on::

   val = (yield i)

I recommend that you always put parentheses around a :keyword:`yield` expression
when you're doing something with the returned value, as in the above example.
The parentheses aren't always necessary, but it's easier to always add them
instead of having to remember when they're needed.


(:pep:`342` explains the exact rules, which are that a :keyword:`yield`\
-expression must always be parenthesized except when it occurs at the top-level
expression on the right-hand side of an assignment. This means you can write
``val = yield i`` but have to use parentheses when there's an operation, as in
``val = (yield i) + 12``.)

Values are sent into a generator by calling its ``send(value)`` method. The
generator's code is then resumed and the :keyword:`yield` expression returns the
specified *value*. If the regular :meth:`next` method is called, the
:keyword:`yield` returns :const:`None`.

Here's the previous example, modified to allow changing the value of the
internal counter. ::

   def counter (maximum):
       i = 0
       while i < maximum:
           val = (yield i)
           # If value provided, change counter
           if val is not None:
               i = val
           else:
               i += 1

And here's an example of changing the counter::

   >>> it = counter(10)
   >>> print it.next()
   0
   >>> print it.next()
   1
   >>> print it.send(8)
   8
   >>> print it.next()
   9
   >>> print it.next()
   Traceback (most recent call last):
     File "t.py", line 15, in ?
       print it.next()
   StopIteration

:keyword:`yield` will usually return :const:`None`, so you should always check
for this case. Don't just use its value in expressions unless you're sure that
the :meth:`send` method will be the only method used to resume your generator
function.

In addition to :meth:`send`, there are two other new methods on generators:

* ``throw(type, value=None, traceback=None)`` is used to raise an exception
  inside the generator; the exception is raised by the :keyword:`yield` expression
  where the generator's execution is paused.

* :meth:`close` raises a new :exc:`GeneratorExit` exception inside the generator
  to terminate the iteration. On receiving this exception, the generator's code
  must either raise :exc:`GeneratorExit` or :exc:`StopIteration`. Catching the
  :exc:`GeneratorExit` exception and yielding a value is illegal and will trigger
  a :exc:`RuntimeError`; if the function raises some other exception, that
  exception is propagated to the caller. :meth:`close` will also be called by
  Python's garbage collector when the generator is garbage-collected.

  If you need to run cleanup code when a :exc:`GeneratorExit` occurs, I suggest
  using a ``try: ... finally:`` suite instead of catching :exc:`GeneratorExit`.

The cumulative effect of these changes is to turn generators from one-way
producers of information into both producers and consumers.

Generators also become *coroutines*, a more generalized form of subroutines.
Subroutines are entered at one point and exited at another point (the top of the
function, and a :keyword:`return` statement), but coroutines can be entered,
exited, and resumed at many different points (the :keyword:`yield` statements).
We'll have to figure out patterns for using coroutines effectively in Python.

The addition of the :meth:`close` method has one side effect that isn't obvious.
:meth:`close` is called when a generator is garbage-collected, so this means the
generator's code gets one last chance to run before the generator is destroyed.
This last chance means that ``try...finally`` statements in generators can now
be guaranteed to work; the :keyword:`finally` clause will now always get a
chance to run. The syntactic restriction that you couldn't mix :keyword:`yield`
statements with a ``try...finally`` suite has therefore been removed. This
seems like a minor bit of language trivia, but using generators and
``try...finally`` is actually necessary in order to implement the
:keyword:`with` statement described by PEP 343. I'll look at this new statement
in the following section.

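
As a rough sketch of how :meth:`send`, :meth:`close`, and ``try...finally``
fit together, the counter generator can be wrapped in a ``finally`` clause that
records its own cleanup. The ``log`` list is invented for this illustration.
The generator is started with ``send(None)``, which behaves like the first
:meth:`next` call and is spelled the same way in later Python versions.

```python
log = []   # hypothetical record of cleanup activity

def counter(maximum):
    i = 0
    try:
        while i < maximum:
            val = (yield i)
            # If a value was sent in, jump the counter to it.
            if val is not None:
                i = val
            else:
                i += 1
    finally:
        # Runs when the generator finishes or is closed early.
        log.append('cleanup')

it = counter(10)
first = it.send(None)   # starts the generator, like the first next() call
jumped = it.send(8)     # the paused yield expression returns 8
it.close()              # raises GeneratorExit at the paused yield
```

Here :meth:`close` is called explicitly while the generator is still paused in
the middle of its loop, and the ``finally`` clause still runs.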

Another even more esoteric effect of this change: previously, the
:attr:`gi_frame` attribute of a generator was always a frame object. It's now
possible for :attr:`gi_frame` to be ``None`` once the generator has been
exhausted.


.. seealso::

   :pep:`342` - Coroutines via Enhanced Generators
      PEP written by Guido van Rossum and Phillip J. Eby; implemented by Phillip J.
      Eby. Includes examples of some fancier uses of generators as coroutines.

      Earlier versions of these features were proposed in :pep:`288` by Raymond
      Hettinger and :pep:`325` by Samuele Pedroni.

   https://en.wikipedia.org/wiki/Coroutine
      The Wikipedia entry for coroutines.

   http://www.sidhe.org/~dan/blog/archives/000178.html
      An explanation of coroutines from a Perl point of view, written by Dan Sugalski.

.. ======================================================================


.. _pep-343:

PEP 343: The 'with' statement
=============================

The ':keyword:`with`' statement clarifies code that previously would use
``try...finally`` blocks to ensure that clean-up code is executed. In this
section, I'll discuss the statement as it will commonly be used. In the next
section, I'll examine the implementation details and show how to write objects
for use with this statement.

The ':keyword:`with`' statement is a new control-flow structure whose basic
structure is::

   with expression [as variable]:
       with-block

The expression is evaluated, and it should result in an object that supports the
context management protocol (that is, has :meth:`__enter__` and :meth:`__exit__`
methods).

The object's :meth:`__enter__` is called before *with-block* is executed and
therefore can run set-up code. It also may return a value that is bound to the
name *variable*, if given. (Note carefully that *variable* is *not* assigned
the result of *expression*.)

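
The distinction between the result of *expression* and the value bound to
*variable* is easy to miss, so here is a minimal sketch. The ``Resource`` class
and its ``events`` list are invented for this illustration and are not part of
the standard library.

```python
# On Python 2.5 itself, the module would first need:
#     from __future__ import with_statement

class Resource(object):
    """Hypothetical object supporting the context management protocol."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append('enter')
        # This return value, not the Resource itself, is bound by 'as'.
        return 'handle'

    def __exit__(self, type, value, traceback):
        self.events.append('exit')
        return False   # do not suppress exceptions

manager = Resource()
with manager as variable:
    manager.events.append('block')
    bound = variable
```

After the block, ``bound`` holds the string returned by :meth:`__enter__`,
while ``manager`` is still the ``Resource`` instance that the expression
produced.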

After execution of the *with-block* is finished, the object's :meth:`__exit__`
method is called, even if the block raised an exception, and can therefore run
clean-up code.

To enable the statement in Python 2.5, you need to add the following directive
to your module::

   from __future__ import with_statement

The statement will always be enabled in Python 2.6.

Some standard Python objects now support the context management protocol and can
be used with the ':keyword:`with`' statement. File objects are one example::

   with open('/etc/passwd', 'r') as f:
       for line in f:
           print line
           ... more processing code ...

After this statement has executed, the file object in *f* will have been
automatically closed, even if the :keyword:`for` loop raised an exception
part-way through the block.

.. note::

   In this case, *f* is the same object created by :func:`open`, because
   :meth:`file.__enter__` returns *self*.

The :mod:`threading` module's locks and condition variables also support the
':keyword:`with`' statement::

   lock = threading.Lock()
   with lock:
       # Critical section of code
       ...

The lock is acquired before the block is executed and always released once the
block is complete.

The new :func:`localcontext` function in the :mod:`decimal` module makes it easy
to save and restore the current decimal context, which encapsulates the desired
precision and rounding characteristics for computations::

   from decimal import Decimal, Context, localcontext

   # Displays with default precision of 28 digits
   v = Decimal('578')
   print v.sqrt()

   with localcontext(Context(prec=16)):
       # All code in this block uses a precision of 16 digits.
       # The original context is restored on exiting the block.
       print v.sqrt()


.. _new-25-context-managers:

Writing Context Managers
------------------------

Under the hood, the ':keyword:`with`' statement is fairly complicated. Most
people will only use ':keyword:`with`' in company with existing objects and
don't need to know these details, so you can skip the rest of this section if
you like. Authors of new objects will need to understand the details of the
underlying implementation and should keep reading.

A high-level explanation of the context management protocol is:

* The expression is evaluated and should result in an object called a "context
  manager". The context manager must have :meth:`__enter__` and :meth:`__exit__`
  methods.

* The context manager's :meth:`__enter__` method is called. The value returned
  is assigned to *VAR*. If no ``'as VAR'`` clause is present, the value is simply
  discarded.

* The code in *BLOCK* is executed.

* If *BLOCK* raises an exception, the ``__exit__(type, value, traceback)``
  is called with the exception details, the same values returned by
  :func:`sys.exc_info`. The method's return value controls whether the exception
  is re-raised: any false value re-raises the exception, and ``True`` will result
  in suppressing it. You'll only rarely want to suppress the exception, because
  if you do the author of the code containing the ':keyword:`with`' statement will
  never realize anything went wrong.

* If *BLOCK* didn't raise an exception, the :meth:`__exit__` method is still
  called, but *type*, *value*, and *traceback* are all ``None``.

Let's think through an example. I won't present detailed code but will only
sketch the methods necessary for a database that supports transactions.

(For people unfamiliar with database terminology: a set of changes to the
database are grouped into a transaction.
Transactions can be either committed,
meaning that all the changes are written into the database, or rolled back,
meaning that the changes are all discarded and the database is unchanged. See
any database textbook for more information.)

Let's assume there's an object representing a database connection. Our goal will
be to let the user write code like this::

   db_connection = DatabaseConnection()
   with db_connection as cursor:
       cursor.execute('insert into ...')
       cursor.execute('delete from ...')
       # ... more operations ...

The transaction should be committed if the code in the block runs flawlessly or
rolled back if there's an exception. Here's the basic interface for
:class:`DatabaseConnection` that I'll assume::

   class DatabaseConnection:
       # Database interface
       def cursor (self):
           "Returns a cursor object and starts a new transaction"
       def commit (self):
           "Commits current transaction"
       def rollback (self):
           "Rolls back current transaction"

The :meth:`__enter__` method is pretty easy, having only to start a new
transaction. For this application the resulting cursor object would be a useful
result, so the method will return it. The user can then add ``as cursor`` to
their ':keyword:`with`' statement to bind the cursor to a variable name. ::

   class DatabaseConnection:
       ...
       def __enter__ (self):
           # Code to start a new transaction
           cursor = self.cursor()
           return cursor

The :meth:`__exit__` method is the most complicated because it's where most of
the work has to be done. The method has to check if an exception occurred. If
there was no exception, the transaction is committed. The transaction is rolled
back if there was an exception.

In the code below, execution will just fall off the end of the function,
returning the default value of ``None``. ``None`` is false, so the exception
will be re-raised automatically.
If you wished, you could be more explicit and
add a :keyword:`return` statement at the marked location. ::

   class DatabaseConnection:
       ...
       def __exit__ (self, type, value, tb):
           if tb is None:
               # No exception, so commit
               self.commit()
           else:
               # Exception occurred, so rollback.
               self.rollback()
               # return False


.. _contextlibmod:

The contextlib module
---------------------

The new :mod:`contextlib` module provides some functions and a decorator that
are useful for writing objects for use with the ':keyword:`with`' statement.

The decorator is called :func:`contextmanager`, and lets you write a single
generator function instead of defining a new class. The generator should yield
exactly one value. The code up to the :keyword:`yield` will be executed as the
:meth:`__enter__` method, and the value yielded will be the method's return
value that will get bound to the variable in the ':keyword:`with`' statement's
:keyword:`as` clause, if any. The code after the :keyword:`yield` will be
executed in the :meth:`__exit__` method. Any exception raised in the block will
be raised by the :keyword:`yield` statement.

Our database example from the previous section could be written using this
decorator as::

   from contextlib import contextmanager

   @contextmanager
   def db_transaction (connection):
       cursor = connection.cursor()
       try:
           yield cursor
       except:
           connection.rollback()
           raise
       else:
           connection.commit()

   db = DatabaseConnection()
   with db_transaction(db) as cursor:
       ...

The :mod:`contextlib` module also has a ``nested(mgr1, mgr2, ...)`` function
that combines a number of context managers so you don't need to write nested
':keyword:`with`' statements.
In this example, the single ':keyword:`with`'
statement both starts a database transaction and acquires a thread lock::

   import threading
   from contextlib import nested

   lock = threading.Lock()
   with nested (db_transaction(db), lock) as (cursor, locked):
       ...

Finally, the ``closing(object)`` function returns *object* so that it can be
bound to a variable, and calls ``object.close()`` at the end of the block. ::

   import urllib, sys
   from contextlib import closing

   with closing(urllib.urlopen('http://www.yahoo.com')) as f:
       for line in f:
           sys.stdout.write(line)


.. seealso::

   :pep:`343` - The "with" statement
      PEP written by Guido van Rossum and Nick Coghlan; implemented by Mike Bland,
      Guido van Rossum, and Neal Norwitz. The PEP shows the code generated for a
      ':keyword:`with`' statement, which can be helpful in learning how the statement
      works.

   The documentation for the :mod:`contextlib` module.

.. ======================================================================


.. _pep-352:

PEP 352: Exceptions as New-Style Classes
========================================

Exception classes can now be new-style classes, not just classic classes, and
the built-in :exc:`Exception` class and all the standard built-in exceptions
(:exc:`NameError`, :exc:`ValueError`, etc.) are now new-style classes.

The inheritance hierarchy for exceptions has been rearranged a bit. In 2.5, the
inheritance relationships are::

   BaseException       # New in Python 2.5
   |- KeyboardInterrupt
   |- SystemExit
   |- Exception
      |- (all other current built-in exceptions)

This rearrangement was done because people often want to catch all exceptions
that indicate program errors. :exc:`KeyboardInterrupt` and :exc:`SystemExit`
aren't errors, though, and usually represent an explicit action such as the user
hitting :kbd:`Control-C` or code calling :func:`sys.exit`.
A bare ``except:`` will
catch all exceptions, so you commonly need to list :exc:`KeyboardInterrupt` and
:exc:`SystemExit` in order to re-raise them. The usual pattern is::

   try:
       ...
   except (KeyboardInterrupt, SystemExit):
       raise
   except:
       # Log error...
       # Continue running program...

In Python 2.5, you can now write ``except Exception`` to achieve the same
result, catching all the exceptions that usually indicate errors but leaving
:exc:`KeyboardInterrupt` and :exc:`SystemExit` alone. As in previous versions,
a bare ``except:`` still catches all exceptions.

The goal for Python 3.0 is to require any class raised as an exception to derive
from :exc:`BaseException` or some descendant of :exc:`BaseException`, and future
releases in the Python 2.x series may begin to enforce this constraint.
Therefore, I suggest you begin making all your exception classes derive from
:exc:`Exception` now. It's been suggested that the bare ``except:`` form should
be removed in Python 3.0, but Guido van Rossum hasn't decided whether to do this
or not.

Raising strings as exceptions, as in the statement
``raise "Error occurred"``, is deprecated in Python 2.5 and will trigger a
warning. The aim is to be able to remove the string-exception feature in a few
releases.


.. seealso::

   :pep:`352` - Required Superclass for Exceptions
      PEP written by Brett Cannon and Guido van Rossum; implemented by Brett Cannon.

.. ======================================================================


.. _pep-353:

PEP 353: Using ssize_t as the index type
========================================

A wide-ranging change to Python's C API, using a new :c:type:`Py_ssize_t` type
definition instead of :c:type:`int`, will permit the interpreter to handle more
data on 64-bit platforms. This change doesn't affect Python's capacity on 32-bit
platforms.

Various pieces of the Python interpreter used C's :c:type:`int` type to store
sizes or counts; for example, the number of items in a list or tuple was stored
in an :c:type:`int`. The C compilers for most 64-bit platforms still define
:c:type:`int` as a 32-bit type, so that meant that lists could only hold up to
``2**31 - 1`` = 2147483647 items. (There are actually a few different
programming models that 64-bit C compilers can use -- see
http://www.unix.org/version2/whatsnew/lp64_wp.html for a discussion -- but the
most commonly available model leaves :c:type:`int` as 32 bits.)

A limit of 2147483647 items doesn't really matter on a 32-bit platform because
you'll run out of memory before hitting the length limit. Each list item
requires space for a pointer, which is 4 bytes, plus space for a
:c:type:`PyObject` representing the item. 2147483647\*4 is already more bytes
than a 32-bit address space can contain.

It's possible to address that much memory on a 64-bit platform, however. The
pointers for a list that size would only require 16 GiB of space, so it's not
unreasonable that Python programmers might construct lists that large.
Therefore, the Python interpreter had to be changed to use some type other than
:c:type:`int`, and this will be a 64-bit type on 64-bit platforms. The change
will cause incompatibilities on 64-bit machines, so it was deemed worth making
the transition now, while the number of 64-bit users is still relatively small.
(In 5 or 10 years, we may *all* be on 64-bit machines, and the transition would
be more painful then.)

This change most strongly affects authors of C extension modules. Python
strings and container types such as lists and tuples now use
:c:type:`Py_ssize_t` to store their size. Functions such as
:c:func:`PyList_Size` now return :c:type:`Py_ssize_t`.
Code in extension modules
may therefore need to have some variables changed to :c:type:`Py_ssize_t`.

The :c:func:`PyArg_ParseTuple` and :c:func:`Py_BuildValue` functions have a new
conversion code, ``n``, for :c:type:`Py_ssize_t`. :c:func:`PyArg_ParseTuple`'s
``s#`` and ``t#`` still output :c:type:`int` by default, but you can define the
macro :c:macro:`PY_SSIZE_T_CLEAN` before including :file:`Python.h` to make
them return :c:type:`Py_ssize_t`.

:pep:`353` has a section on conversion guidelines that extension authors should
read to learn about supporting 64-bit platforms.


.. seealso::

   :pep:`353` - Using ssize_t as the index type
      PEP written and implemented by Martin von Löwis.

.. ======================================================================


.. _pep-357:

PEP 357: The '__index__' method
===============================

The NumPy developers had a problem that could only be solved by adding a new
special method, :meth:`__index__`. When using slice notation, as in
``[start:stop:step]``, the values of the *start*, *stop*, and *step* indexes
must all be either integers or long integers. NumPy defines a variety of
specialized integer types corresponding to unsigned and signed integers of 8,
16, 32, and 64 bits, but there was no way to signal that these types could be
used as slice indexes.

Slicing can't just use the existing :meth:`__int__` method because that method
is also used to implement coercion to integers. If slicing used
:meth:`__int__`, floating-point numbers would also become legal slice indexes
and that's clearly an undesirable behaviour.

Instead, a new special method called :meth:`__index__` was added. It takes no
arguments and returns an integer giving the slice index to use.
For example::

   class C:
       def __index__ (self):
           return self.value

The return value must be either a Python integer or long integer. The
interpreter checks that the type returned is correct, and raises a
:exc:`TypeError` if this requirement isn't met.

A corresponding :attr:`nb_index` slot was added to the C-level
:c:type:`PyNumberMethods` structure to let C extensions implement this protocol.
``PyNumber_Index(obj)`` can be used in extension code to call the
:meth:`__index__` function and retrieve its result.


.. seealso::

   :pep:`357` - Allowing Any Object to be Used for Slicing
      PEP written and implemented by Travis Oliphant.

.. ======================================================================


.. _other-lang:

Other Language Changes
======================

Here are all of the changes that Python 2.5 makes to the core Python language.

* The :class:`dict` type has a new hook for letting subclasses provide a default
  value when a key isn't contained in the dictionary. When a key isn't found, the
  dictionary's ``__missing__(key)`` method will be called. This hook is used
  to implement the new :class:`defaultdict` class in the :mod:`collections`
  module. The following example defines a dictionary that returns zero for any
  missing key::

     class zerodict (dict):
         def __missing__ (self, key):
             return 0

     d = zerodict({1:1, 2:2})
     print d[1], d[2]   # Prints 1 2
     print d[3], d[4]   # Prints 0 0

* Both 8-bit and Unicode strings have new ``partition(sep)`` and
  ``rpartition(sep)`` methods that simplify a common use case.

  The ``find(S)`` method is often used to get an index which is then used to
  slice the string and obtain the pieces that are before and after the separator.
  ``partition(sep)`` condenses this pattern into a single method call that
  returns a 3-tuple containing the substring before the separator, the separator
  itself, and the substring after the separator. If the separator isn't found,
  the first element of the tuple is the entire string and the other two elements
  are empty. ``rpartition(sep)`` also returns a 3-tuple but starts searching
  from the end of the string; the ``r`` stands for 'reverse'.

  Some examples::

     >>> ('http://www.python.org').partition('://')
     ('http', '://', 'www.python.org')
     >>> ('file:/usr/share/doc/index.html').partition('://')
     ('file:/usr/share/doc/index.html', '', '')
     >>> (u'Subject: a quick question').partition(':')
     (u'Subject', u':', u' a quick question')
     >>> 'www.python.org'.rpartition('.')
     ('www.python', '.', 'org')
     >>> 'www.python.org'.rpartition(':')
     ('', '', 'www.python.org')

  (Implemented by Fredrik Lundh following a suggestion by Raymond Hettinger.)

* The :meth:`startswith` and :meth:`endswith` methods of string types now accept
  tuples of strings to check for. ::

     def is_image_file (filename):
         return filename.endswith(('.gif', '.jpg', '.tiff'))

  (Implemented by Georg Brandl following a suggestion by Tom Lynn.)

  .. RFE #1491485

* The :func:`min` and :func:`max` built-in functions gained a ``key`` keyword
  parameter analogous to the ``key`` argument for :meth:`sort`. This parameter
  supplies a function that takes a single argument and is called for every value
  in the list; :func:`min`/:func:`max` will return the element with the
  smallest/largest return value from this function.
  For example, to find the
  longest string in a list, you can do::

     L = ['medium', 'longest', 'short']
     # Prints 'longest'
     print max(L, key=len)
     # Prints 'short', because lexicographically 'short' has the largest value
     print max(L)

  (Contributed by Steven Bethard and Raymond Hettinger.)

* Two new built-in functions, :func:`any` and :func:`all`, evaluate whether an
  iterator contains any true or false values. :func:`any` returns :const:`True`
  if any value returned by the iterator is true; otherwise it will return
  :const:`False`. :func:`all` returns :const:`True` only if all of the values
  returned by the iterator evaluate as true. (Suggested by Guido van Rossum, and
  implemented by Raymond Hettinger.)

* The result of a class's :meth:`__hash__` method can now be either a long
  integer or a regular integer. If a long integer is returned, the hash of that
  value is taken. In earlier versions the hash value was required to be a
  regular integer, but in 2.5 the :func:`id` built-in was changed to always
  return non-negative numbers, and users often seem to use ``id(self)`` in
  :meth:`__hash__` methods (though this is discouraged).

  .. Bug #1536021

* ASCII is now the default encoding for modules. It's now a syntax error if a
  module contains string literals with 8-bit characters but doesn't have an
  encoding declaration. In Python 2.4 this triggered a warning, not a syntax
  error. See :pep:`263` for how to declare a module's encoding; for example, you
  might add a line like this near the top of the source file::

     # -*- coding: latin1 -*-

* A new warning, :class:`UnicodeWarning`, is triggered when you attempt to
  compare a Unicode string and an 8-bit string that can't be converted to Unicode
  using the default ASCII encoding.
  The result of the comparison is false::

     >>> chr(128) == unichr(128)   # Can't convert chr(128) to Unicode
     __main__:1: UnicodeWarning: Unicode equal comparison failed
       to convert both arguments to Unicode - interpreting them
       as being unequal
     False
     >>> chr(127) == unichr(127)   # chr(127) can be converted
     True

  Previously this would raise a :class:`UnicodeDecodeError` exception, but in 2.5
  this could result in puzzling problems when accessing a dictionary. If you
  looked up ``unichr(128)`` and ``chr(128)`` was being used as a key, you'd get a
  :class:`UnicodeDecodeError` exception. Other changes in 2.5 resulted in this
  exception being raised instead of suppressed by the code in :file:`dictobject.c`
  that implements dictionaries.

  Raising an exception for such a comparison is strictly correct, but the change
  might have broken code, so instead :class:`UnicodeWarning` was introduced.

  (Implemented by Marc-André Lemburg.)

* One error that Python programmers sometimes make is forgetting to include an
  :file:`__init__.py` module in a package directory. Debugging this mistake can be
  confusing, and usually requires running Python with the :option:`-v` switch to
  log all the paths searched. In Python 2.5, a new :exc:`ImportWarning` warning is
  triggered when an import would have picked up a directory as a package but no
  :file:`__init__.py` was found. This warning is silently ignored by default;
  provide the :option:`-Wd <-W>` option when running the Python executable to display
  the warning message. (Implemented by Thomas Wouters.)

* The list of base classes in a class definition can now be empty. As an
  example, this is now legal::

     class C():
         pass

  (Implemented by Brett Cannon.)

.. ======================================================================


.. _25interactive:

Interactive Interpreter Changes
-------------------------------

In the interactive interpreter, ``quit`` and ``exit`` have long been strings so
that new users get a somewhat helpful message when they try to quit::

   >>> quit
   'Use Ctrl-D (i.e. EOF) to exit.'

In Python 2.5, ``quit`` and ``exit`` are now objects that still produce string
representations of themselves, but are also callable. Newbies who try ``quit()``
or ``exit()`` will now exit the interpreter as they expect. (Implemented by
Georg Brandl.)

The Python executable now accepts the standard long options :option:`--help`
and :option:`--version`; on Windows, it also accepts the :option:`/? <-?>` option
for displaying a help message. (Implemented by Georg Brandl.)

.. ======================================================================


.. _opts:

Optimizations
-------------

Several of the optimizations were developed at the NeedForSpeed sprint, an event
held in Reykjavik, Iceland, from May 21--28 2006. The sprint focused on speed
enhancements to the CPython implementation and was funded by EWT LLC with local
support from CCP Games. Those optimizations added at this sprint are specially
marked in the following list.

* When they were introduced in Python 2.4, the built-in :class:`set` and
  :class:`frozenset` types were built on top of Python's dictionary type. In 2.5
  the internal data structure has been customized for implementing sets, and as a
  result sets will use a third less memory and are somewhat faster. (Implemented
  by Raymond Hettinger.)

* The speed of some Unicode operations, such as finding substrings, string
  splitting, and character map encoding and decoding, has been improved.
  (Substring search and splitting improvements were added by Fredrik Lundh and
  Andrew Dalke at the NeedForSpeed sprint.
  Character maps were improved by Walter
  Dörwald and Martin von Löwis.)

  .. Patch 1313939, 1359618

* The ``long(str, base)`` function is now faster on long digit strings
  because fewer intermediate results are calculated. The peak is for strings of
  around 800--1000 digits where the function is 6 times faster. (Contributed by
  Alan McIntyre and committed at the NeedForSpeed sprint.)

  .. Patch 1442927

* It's now illegal to mix iterating over a file with ``for line in file`` and
  calling the file object's :meth:`read`/:meth:`readline`/:meth:`readlines`
  methods. Iteration uses an internal buffer and the :meth:`read\*` methods
  don't use that buffer. Instead they would return the data following the
  buffer, causing the data to appear out of order. Mixing iteration and these
  methods will now trigger a :exc:`ValueError` from the :meth:`read\*` method.
  (Implemented by Thomas Wouters.)

  .. Patch 1397960

* The :mod:`struct` module now compiles structure format strings into an
  internal representation and caches this representation, yielding a 20% speedup.
  (Contributed by Bob Ippolito at the NeedForSpeed sprint.)

* The :mod:`re` module got a 1 or 2% speedup by switching to Python's allocator
  functions instead of the system's :c:func:`malloc` and :c:func:`free`.
  (Contributed by Jack Diederich at the NeedForSpeed sprint.)

* The code generator's peephole optimizer now performs simple constant folding
  in expressions. If you write something like ``a = 2+3``, the code generator
  will do the arithmetic and produce code corresponding to ``a = 5``. (Proposed
  and implemented by Raymond Hettinger.)

* Function calls are now faster because code objects now keep the most recently
  finished frame (a "zombie frame") in an internal field of the code object,
  reusing it the next time the code object is invoked.
  (Original patch by Michael
  Hudson, modified by Armin Rigo and Richard Jones; committed at the NeedForSpeed
  sprint.) Frame objects are also slightly smaller, which may improve cache
  locality and reduce memory usage a bit. (Contributed by Neal Norwitz.)

  .. Patch 876206
  .. Patch 1337051

* Python's built-in exceptions are now new-style classes, a change that speeds
  up instantiation considerably. Exception handling in Python 2.5 is therefore
  about 30% faster than in 2.4. (Contributed by Richard Jones, Georg Brandl and
  Sean Reifschneider at the NeedForSpeed sprint.)

* Importing now caches the paths tried, recording whether they exist or not so
  that the interpreter makes fewer :c:func:`open` and :c:func:`stat` calls on
  startup. (Contributed by Martin von Löwis and Georg Brandl.)

  .. Patch 921466

.. ======================================================================


.. _25modules:

New, Improved, and Removed Modules
==================================

The standard library received many enhancements and bug fixes in Python 2.5.
Here's a partial list of the most notable changes, sorted alphabetically by
module name. Consult the :file:`Misc/NEWS` file in the source tree for a more
complete list of changes, or look through the SVN logs for all the details.

* The :mod:`audioop` module now supports the a-LAW encoding, and the code for
  u-LAW encoding has been improved. (Contributed by Lars Immisch.)

* The :mod:`codecs` module gained support for incremental codecs. The
  :func:`codecs.lookup` function now returns a :class:`CodecInfo` instance instead
  of a tuple.
  :class:`CodecInfo` instances behave like a 4-tuple to preserve
  backward compatibility but also have the attributes :attr:`encode`,
  :attr:`decode`, :attr:`incrementalencoder`, :attr:`incrementaldecoder`,
  :attr:`streamwriter`, and :attr:`streamreader`. Incremental codecs can receive
  input and produce output in multiple chunks; the output is the same as if the
  entire input was fed to the non-incremental codec. See the :mod:`codecs` module
  documentation for details. (Designed and implemented by Walter Dörwald.)

  .. Patch 1436130

* The :mod:`collections` module gained a new type, :class:`defaultdict`, that
  subclasses the standard :class:`dict` type. The new type mostly behaves like a
  dictionary but constructs a default value when a key isn't present,
  automatically adding it to the dictionary for the requested key value.

  The first argument to :class:`defaultdict`'s constructor is a factory function
  that gets called whenever a key is requested but not found. This factory
  function receives no arguments, so you can use built-in type constructors such
  as :func:`list` or :func:`int`. For example, you can make an index of words
  based on their initial letter like this::

     from collections import defaultdict

     words = """Nel mezzo del cammin di nostra vita
     mi ritrovai per una selva oscura
     che la diritta via era smarrita""".lower().split()

     index = defaultdict(list)

     for w in words:
         init_letter = w[0]
         index[init_letter].append(w)

  Printing ``index`` results in the following output::

     defaultdict(<type 'list'>, {'c': ['cammin', 'che'], 'e': ['era'],
             'd': ['del', 'di', 'diritta'], 'm': ['mezzo', 'mi'],
             'l': ['la'], 'o': ['oscura'], 'n': ['nel', 'nostra'],
             'p': ['per'], 's': ['selva', 'smarrita'],
             'r': ['ritrovai'], 'u': ['una'], 'v': ['vita', 'via']})

  (Contributed by Guido van Rossum.)
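
  As a further sketch of the same factory mechanism (the sample sentence and the
  name ``counts`` are illustrative assumptions, not part of the example above),
  passing :func:`int` as the factory gives a simple counter, because ``int()``
  returns ``0`` for each new key::

     from collections import defaultdict

     counts = defaultdict(int)
     for word in "the cat saw the dog chase the ball".split():
         # A missing key is created with int() == 0 before being incremented.
         counts[word] += 1

     the_count = counts['the']        # 'the' occurs three times
     bird_count = counts['bird']      # Never seen, so the factory supplies 0
     bird_added = 'bird' in counts    # Merely looking the key up stored it

  Note the side effect in the last two lines: reading a missing key inserts it,
  which is worth keeping in mind when testing for membership afterwards.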

* The :class:`deque` double-ended queue type supplied by the :mod:`collections`
  module now has a ``remove(value)`` method that removes the first occurrence
  of *value* in the queue, raising :exc:`ValueError` if the value isn't found.
  (Contributed by Raymond Hettinger.)

* New module: The :mod:`contextlib` module contains helper functions for use
  with the new ':keyword:`with`' statement. See section :ref:`contextlibmod`
  for more about this module.

* New module: The :mod:`cProfile` module is a C implementation of the existing
  :mod:`profile` module that has much lower overhead. The module's interface is
  the same as :mod:`profile`: you run ``cProfile.run('main()')`` to profile a
  function, can save profile data to a file, etc. It's not yet known if the
  Hotshot profiler, which is also written in C but doesn't match the
  :mod:`profile` module's interface, will continue to be maintained in future
  versions of Python. (Contributed by Armin Rigo.)

  Also, the :mod:`pstats` module for analyzing the data measured by the profiler
  now supports directing the output to any file object by supplying a *stream*
  argument to the :class:`Stats` constructor. (Contributed by Skip Montanaro.)

* The :mod:`csv` module, which parses files in comma-separated value format,
  received several enhancements and a number of bugfixes. You can now set the
  maximum size in bytes of a field by calling the
  ``csv.field_size_limit(new_limit)`` function; omitting the *new_limit*
  argument will return the currently-set limit. The :class:`reader` class now has
  a :attr:`line_num` attribute that counts the number of physical lines read from
  the source; records can span multiple physical lines, so :attr:`line_num` is not
  the same as the number of records read.

  The CSV parser is now stricter about multi-line quoted fields.
  Previously, if a
  line ended within a quoted field without a terminating newline character, a
  newline would be inserted into the returned field. This behavior caused problems
  when reading files that contained carriage return characters within fields, so
  the code was changed to return the field without inserting newlines. As a
  consequence, if newlines embedded within fields are important, the input should
  be split into lines in a manner that preserves the newline characters.

  (Contributed by Skip Montanaro and Andrew McNamara.)

* The :class:`~datetime.datetime` class in the :mod:`datetime` module now has a
  ``strptime(string, format)`` method for parsing date strings, contributed
  by Josh Spoerri. It uses the same format characters as :func:`time.strptime` and
  :func:`time.strftime`::

     from datetime import datetime

     ts = datetime.strptime('10:13:15 2006-03-07',
                            '%H:%M:%S %Y-%m-%d')

* The :meth:`SequenceMatcher.get_matching_blocks` method in the :mod:`difflib`
  module now guarantees to return a minimal list of blocks describing matching
  subsequences. Previously, the algorithm would occasionally break a block of
  matching elements into two list entries. (Enhancement by Tim Peters.)

* The :mod:`doctest` module gained a ``SKIP`` option that keeps an example from
  being executed at all. This is intended for code snippets that are usage
  examples intended for the reader and aren't actually test cases.

  An *encoding* parameter was added to the :func:`testfile` function and the
  :class:`DocFileSuite` class to specify the file's encoding. This makes it
  easier to use non-ASCII characters in tests contained within a docstring.
  (Contributed by Bjorn Tillenius.)

  .. Patch 1080727

* The :mod:`email` package has been updated to version 4.0. (Contributed by
  Barry Warsaw.)

  .. XXX need to provide some more detail here

  .. index::
     single: universal newlines; What's new

* The :mod:`fileinput` module was made more flexible. Unicode filenames are now
  supported, and a *mode* parameter that defaults to ``"r"`` was added to the
  :func:`input` function to allow opening files in binary or :term:`universal
  newlines` mode. Another new parameter, *openhook*, lets you use a function
  other than :func:`open` to open the input files. Once you're iterating over
  the set of files, the :class:`FileInput` object's new :meth:`fileno` returns
  the file descriptor for the currently opened file. (Contributed by Georg
  Brandl.)

* In the :mod:`gc` module, the new :func:`get_count` function returns a 3-tuple
  containing the current collection counts for the three GC generations. This is
  accounting information for the garbage collector; when these counts reach a
  specified threshold, a garbage collection sweep will be made. The existing
  :func:`gc.collect` function now takes an optional *generation* argument of 0, 1,
  or 2 to specify which generation to collect. (Contributed by Barry Warsaw.)

* The :func:`nsmallest` and :func:`nlargest` functions in the :mod:`heapq`
  module now support a ``key`` keyword parameter similar to the one provided by
  the :func:`min`/:func:`max` functions and the :meth:`sort` methods. For
  example::

     >>> import heapq
     >>> L = ["short", 'medium', 'longest', 'longer still']
     >>> heapq.nsmallest(2, L)  # Return two lowest elements, lexicographically
     ['longer still', 'longest']
     >>> heapq.nsmallest(2, L, key=len)   # Return two shortest elements
     ['short', 'medium']

  (Contributed by Raymond Hettinger.)

* The :func:`itertools.islice` function now accepts ``None`` for the start and
  step arguments.
  This makes it more compatible with the attributes of slice
  objects, so that you can now write the following::

     s = slice(5)     # Create slice object
     itertools.islice(iterable, s.start, s.stop, s.step)

  (Contributed by Raymond Hettinger.)

* The :func:`format` function in the :mod:`locale` module has been modified and
  two new functions were added, :func:`format_string` and :func:`currency`.

  The :func:`format` function's *val* parameter could previously be a string as
  long as no more than one %char specifier appeared; now the parameter must be
  exactly one %char specifier with no surrounding text. An optional *monetary*
  parameter was also added which, if ``True``, will use the locale's rules for
  formatting currency in placing a separator between groups of three digits.

  To format strings with multiple %char specifiers, use the new
  :func:`format_string` function that works like :func:`format` but also supports
  mixing %char specifiers with arbitrary text.

  A new :func:`currency` function was also added that formats a number according
  to the current locale's settings.

  (Contributed by Georg Brandl.)

  .. Patch 1180296

* The :mod:`mailbox` module underwent a massive rewrite to add the capability to
  modify mailboxes in addition to reading them. A new set of classes that include
  :class:`mbox`, :class:`MH`, and :class:`Maildir` are used to read mailboxes, and
  have an ``add(message)`` method to add messages, ``remove(key)`` to
  remove messages, and :meth:`lock`/:meth:`unlock` to lock/unlock the mailbox.
  The following example converts a maildir-format mailbox into an mbox-format
  one::

     import mailbox

     # 'factory=None' uses email.Message.Message as the class representing
     # individual messages.
     src = mailbox.Maildir('maildir', factory=None)
     dest = mailbox.mbox('/tmp/mbox')

     for msg in src:
         dest.add(msg)

  (Contributed by Gregory K. Johnson. Funding was provided by Google's 2005
  Summer of Code.)

* New module: the :mod:`msilib` module allows creating Microsoft Installer
  :file:`.msi` files and CAB files. Some support for reading the :file:`.msi`
  database is also included. (Contributed by Martin von Löwis.)

* The :mod:`nis` module now supports accessing domains other than the system
  default domain by supplying a *domain* argument to the :func:`nis.match` and
  :func:`nis.maps` functions. (Contributed by Ben Bell.)

* The :mod:`operator` module's :func:`itemgetter` and :func:`attrgetter`
  functions now support multiple fields. A call such as
  ``operator.attrgetter('a', 'b')`` will return a function that retrieves the
  :attr:`a` and :attr:`b` attributes. Combining this new feature with the
  :meth:`sort` method's ``key`` parameter lets you easily sort lists using
  multiple fields. (Contributed by Raymond Hettinger.)

* The :mod:`optparse` module was updated to version 1.5.1 of the Optik library.
  The :class:`OptionParser` class gained an :attr:`epilog` attribute, a string
  that will be printed after the help message, and a :meth:`destroy` method to
  break reference cycles created by the object. (Contributed by Greg Ward.)

* The :mod:`os` module underwent several changes. The :attr:`stat_float_times`
  variable now defaults to true, meaning that :func:`os.stat` will now return time
  values as floats. (This doesn't necessarily mean that :func:`os.stat` will
  return times that are precise to fractions of a second; not all systems support
  such precision.)
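
  A quick check of the new default might look like the following minimal
  sketch; using ``os.curdir`` as the path and the name ``mtime_is_float`` are
  just convenient assumptions for the illustration::

     import os

     st = os.stat(os.curdir)
     # With float times enabled (the 2.5 default), the timestamp fields
     # are floats rather than integers.
     mtime_is_float = isinstance(st.st_mtime, float)

  Code that relied on integer timestamps, for instance when formatting them
  with ``%d``, may want to wrap the value in ``int()``.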

  Constants named :attr:`os.SEEK_SET`, :attr:`os.SEEK_CUR`, and
  :attr:`os.SEEK_END` have been added; these are the parameters to the
  :func:`os.lseek` function. Two new constants for locking are
  :attr:`os.O_SHLOCK` and :attr:`os.O_EXLOCK`.

  Two new functions, :func:`wait3` and :func:`wait4`, were added. They're similar
  to the :func:`waitpid` function, which waits for a child process to exit and
  returns a tuple of the process ID and its exit status, but :func:`wait3` and
  :func:`wait4` return additional information. :func:`wait3` doesn't take a
  process ID as input, so it waits for any child process to exit and returns a
  3-tuple of *process-id*, *exit-status*, *resource-usage* as returned from the
  :func:`resource.getrusage` function. ``wait4(pid)`` does take a process ID.
  (Contributed by Chad J. Schroeder.)

  On FreeBSD, the :func:`os.stat` function now returns times with nanosecond
  resolution, and the returned object now has :attr:`st_gen` and
  :attr:`st_birthtime`. The :attr:`st_flags` attribute is also available, if the
  platform supports it. (Contributed by Antti Louko and Diego Pettenò.)

  .. (Patch 1180695, 1212117)

* The Python debugger provided by the :mod:`pdb` module can now store lists of
  commands to execute when a breakpoint is reached and execution stops. Once
  breakpoint #1 has been created, enter ``commands 1`` and enter a series of
  commands to be executed, finishing the list with ``end``. The command list can
  include commands that resume execution, such as ``continue`` or ``next``.
  (Contributed by Grégoire Dooms.)

  .. Patch 790710

* The :mod:`pickle` and :mod:`cPickle` modules no longer accept a return value
  of ``None`` from the :meth:`__reduce__` method; the method must return a tuple
  of arguments instead.
  The ability to return ``None`` was deprecated in Python
  2.4, so this completes the removal of the feature.

* The :mod:`pkgutil` module, containing various utility functions for finding
  packages, was enhanced to support PEP 302's import hooks and now also works for
  packages stored in ZIP-format archives. (Contributed by Phillip J. Eby.)

* The pybench benchmark suite by Marc-André Lemburg is now included in the
  :file:`Tools/pybench` directory. The pybench suite is an improvement on the
  commonly used :file:`pystone.py` program because pybench provides a more
  detailed measurement of the interpreter's speed. It times particular operations
  such as function calls, tuple slicing, method lookups, and numeric operations,
  instead of performing many different operations and reducing the result to a
  single number as :file:`pystone.py` does.

* The :mod:`pyexpat` module now uses version 2.0 of the Expat parser.
  (Contributed by Trent Mick.)

* The :class:`~Queue.Queue` class provided by the :mod:`Queue` module gained two new
  methods. :meth:`join` blocks until all items in the queue have been retrieved
  and all processing work on the items has been completed. Worker threads call
  the other new method, :meth:`task_done`, to signal that processing for an item
  has been completed. (Contributed by Raymond Hettinger.)

* The old :mod:`regex` and :mod:`regsub` modules, which have been deprecated
  ever since Python 2.0, have finally been deleted. Other deleted modules:
  :mod:`statcache`, :mod:`tzparse`, :mod:`whrandom`.

* Also deleted: the :file:`lib-old` directory, which included ancient modules
  such as :mod:`dircmp` and :mod:`ni`. :file:`lib-old` wasn't on the
  default ``sys.path``, so unless your programs explicitly added the directory to
  ``sys.path``, this removal shouldn't affect your code.

* The :mod:`rlcompleter` module is no longer dependent on importing the
  :mod:`readline` module and therefore now works on non-Unix platforms. (Patch
  from Robert Kiendl.)

  .. Patch #1472854

* The :mod:`SimpleXMLRPCServer` and :mod:`DocXMLRPCServer` classes now have a
  :attr:`rpc_paths` attribute that constrains XML-RPC operations to a limited set
  of URL paths; the default is to allow only ``'/'`` and ``'/RPC2'``. Setting
  :attr:`rpc_paths` to ``None`` or an empty tuple disables this path checking.

  .. Bug #1473048

* The :mod:`socket` module now supports :const:`AF_NETLINK` sockets on Linux,
  thanks to a patch from Philippe Biondi. Netlink sockets are a Linux-specific
  mechanism for communications between a user-space process and kernel code; an
  introductory article about them is at https://www.linuxjournal.com/article/7356.
  In Python code, netlink addresses are represented as a tuple of 2 integers,
  ``(pid, group_mask)``.

  Two new methods on socket objects, ``recv_into(buffer)`` and
  ``recvfrom_into(buffer)``, store the received data in an object that
  supports the buffer protocol instead of returning the data as a string. This
  means you can put the data directly into an array or a memory-mapped file.

  Socket objects also gained :meth:`getfamily`, :meth:`gettype`, and
  :meth:`getproto` accessor methods to retrieve the family, type, and protocol
  values for the socket.

* New module: the :mod:`spwd` module provides functions for accessing the shadow
  password database on systems that support shadow passwords.

* The :mod:`struct` module is now faster because it compiles format strings into
  :class:`Struct` objects with :meth:`pack` and :meth:`unpack` methods. This is
  similar to how the :mod:`re` module lets you create compiled regular expression
  objects.
  You can still use the module-level :func:`pack` and :func:`unpack`
  functions; they'll create :class:`Struct` objects and cache them. Or you can
  use :class:`Struct` instances directly::

     s = struct.Struct('ih3s')

     data = s.pack(1972, 187, 'abc')
     year, number, name = s.unpack(data)

  You can also pack and unpack data to and from buffer objects directly using the
  ``pack_into(buffer, offset, v1, v2, ...)`` and
  ``unpack_from(buffer, offset)`` methods. This lets you store data directly
  into an array or a memory-mapped file.

  (:class:`Struct` objects were implemented by Bob Ippolito at the NeedForSpeed
  sprint. Support for buffer objects was added by Martin Blais, also at the
  NeedForSpeed sprint.)

* The Python developers switched from CVS to Subversion during the 2.5
  development process. Information about the exact build version is available as
  the ``sys.subversion`` variable, a 3-tuple of
  ``(interpreter-name, branch-name, revision-range)``. For example, at the time
  of writing my copy of 2.5 was reporting ``('CPython', 'trunk', '45313:45315')``.

  This information is also available to C extensions via the
  :c:func:`Py_GetBuildInfo` function that returns a string of build information
  like this: ``"trunk:45355:45356M, Apr 13 2006, 07:42:19"``. (Contributed by
  Barry Warsaw.)

* Another new function, :func:`sys._current_frames`, returns the current stack
  frames for all running threads as a dictionary mapping thread identifiers to the
  topmost stack frame currently active in that thread at the time the function is
  called. (Contributed by Tim Peters.)

* The :class:`TarFile` class in the :mod:`tarfile` module now has an
  :meth:`extractall` method that extracts all members from the archive into the
  current working directory.
  It's also possible to set a different directory as
  the extraction target, and to unpack only a subset of the archive's members.

  The compression used for a tarfile opened in stream mode can now be autodetected
  using the mode ``'r|*'``. (Contributed by Lars Gustäbel.)

  .. patch 918101

* The :mod:`threading` module now lets you set the stack size used when new
  threads are created. The ``stack_size([*size*])`` function returns the
  currently configured stack size, and supplying the optional *size* parameter
  sets a new value. Not all platforms support changing the stack size, but
  Windows, POSIX threading, and OS/2 all do. (Contributed by Andrew MacIntyre.)

  .. Patch 1454481

* The :mod:`unicodedata` module has been updated to use version 4.1.0 of the
  Unicode character database. Version 3.2.0 is required by some specifications,
  so it's still available as :attr:`unicodedata.ucd_3_2_0`.

* New module: the :mod:`uuid` module generates universally unique identifiers
  (UUIDs) according to :rfc:`4122`. The RFC defines several different UUID
  versions that are generated from a starting string, from system properties, or
  purely randomly. This module contains a :class:`UUID` class and functions
  named :func:`uuid1`, :func:`uuid3`, :func:`uuid4`, and :func:`uuid5` to
  generate different versions of UUID. (Version 2 UUIDs are not specified in
  :rfc:`4122` and are not supported by this module.)
  ::

     >>> import uuid
     >>> # make a UUID based on the host ID and current time
     >>> uuid.uuid1()
     UUID('a8098c1a-f86e-11da-bd1a-00112444be1e')

     >>> # make a UUID using an MD5 hash of a namespace UUID and a name
     >>> uuid.uuid3(uuid.NAMESPACE_DNS, 'python.org')
     UUID('6fa459ea-ee8a-3ca4-894e-db77e160355e')

     >>> # make a random UUID
     >>> uuid.uuid4()
     UUID('16fd2706-8baf-433b-82eb-8c7fada847da')

     >>> # make a UUID using a SHA-1 hash of a namespace UUID and a name
     >>> uuid.uuid5(uuid.NAMESPACE_DNS, 'python.org')
     UUID('886313e1-3b8a-5372-9b90-0c9aee199e5d')

  (Contributed by Ka-Ping Yee.)

* The :mod:`weakref` module's :class:`WeakKeyDictionary` and
  :class:`WeakValueDictionary` types gained new methods for iterating over the
  weak references contained in the dictionary. :meth:`iterkeyrefs` and
  :meth:`keyrefs` methods were added to :class:`WeakKeyDictionary`, and
  :meth:`itervaluerefs` and :meth:`valuerefs` were added to
  :class:`WeakValueDictionary`. (Contributed by Fred L. Drake, Jr.)

* The :mod:`webbrowser` module received a number of enhancements. It's now
  usable as a script with ``python -m webbrowser``, taking a URL as the argument;
  there are a number of switches to control the behaviour (:option:`!-n` for a new
  browser window, :option:`!-t` for a new tab). New module-level functions,
  :func:`open_new` and :func:`open_new_tab`, were added to support this. The
  module's :func:`open` function supports an additional feature, an *autoraise*
  parameter that signals whether to raise the open window when possible. A number
  of additional browsers were added to the supported list such as Firefox, Opera,
  Konqueror, and elinks. (Contributed by Oleg Broytmann and Georg Brandl.)

  .. Patch #754022

* The :mod:`xmlrpclib` module now supports returning :class:`~datetime.datetime` objects
  for the XML-RPC date type. Supply ``use_datetime=True`` to the :func:`loads`
  function or the :class:`Unmarshaller` class to enable this feature. (Contributed
  by Skip Montanaro.)

  .. Patch 1120353

* The :mod:`zipfile` module now supports the ZIP64 version of the format,
  meaning that a .zip archive can now be larger than 4 GiB and can contain
  individual files larger than 4 GiB. (Contributed by Ronald Oussoren.)

  .. Patch 1446489

* The :mod:`zlib` module's :class:`Compress` and :class:`Decompress` objects now
  support a :meth:`copy` method that makes a copy of the object's internal state
  and returns a new :class:`Compress` or :class:`Decompress` object.
  (Contributed by Chris AtLee.)

  .. Patch 1435422

.. ======================================================================


.. _module-ctypes:

The ctypes package
------------------

The :mod:`ctypes` package, written by Thomas Heller, has been added to the
standard library. :mod:`ctypes` lets you call arbitrary functions in shared
libraries or DLLs. Long-time users may remember the :mod:`dl` module, which
provides functions for loading shared libraries and calling functions in them.
The :mod:`ctypes` package is much fancier.

To load a shared library or DLL, you must create an instance of the
:class:`CDLL` class and provide the name or path of the shared library or DLL.
Once that's done, you can call arbitrary functions by accessing them as
attributes of the :class:`CDLL` object.
::

   import ctypes

   libc = ctypes.CDLL('libc.so.6')
   result = libc.printf("Line of output\n")

Type constructors for the various C types are provided: :func:`c_int`,
:func:`c_float`, :func:`c_double`, :func:`c_char_p` (equivalent to
:c:type:`char \*`), and so forth. Unlike Python's types, the C versions are all
mutable; you can assign to their :attr:`value` attribute to change the wrapped
value. Python integers and strings will be automatically converted to the
corresponding C types, but for other types you must call the correct type
constructor. (And I mean *must*; getting it wrong will often result in the
interpreter crashing with a segmentation fault.)

You shouldn't use :func:`c_char_p` with a Python string when the C function will
be modifying the memory area, because Python strings are supposed to be
immutable; breaking this rule will cause puzzling bugs. When you need a
modifiable memory area, use :func:`create_string_buffer`::

   s = "this is a string"
   buf = ctypes.create_string_buffer(s)
   libc.strfry(buf)

C functions are assumed to return integers, but you can set the :attr:`restype`
attribute of the function object to change this::

   >>> libc.atof('2.71828')
   -1783957616
   >>> libc.atof.restype = ctypes.c_double
   >>> libc.atof('2.71828')
   2.71828

:mod:`ctypes` also provides a wrapper for Python's C API as the
``ctypes.pythonapi`` object. This object does *not* release the global
interpreter lock before calling a function, because the lock must be held when
calling into the interpreter's code. There's a :class:`py_object()` type
constructor that will create a :c:type:`PyObject \*` pointer. A simple usage::

   import ctypes

   d = {}
   ctypes.pythonapi.PyObject_SetItem(ctypes.py_object(d),
             ctypes.py_object("abc"), ctypes.py_object(1))
   # d is now {'abc': 1}.
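
As a hedged variation on the example above, one way to make the conversion less error-prone is to declare the function's ``argtypes`` once, so that ctypes wraps each argument as a ``PyObject *`` itself. This is only a sketch: it assumes a CPython interpreter, the name ``set_item`` is illustrative, and the ``restype`` reflects the C signature of ``PyObject_SetItem`` (which returns an ``int``) rather than anything stated in this article.

```python
import ctypes

# Assumption: running on CPython, where ctypes.pythonapi is available.
set_item = ctypes.pythonapi.PyObject_SetItem

# Declare the C signature once; ctypes then converts every argument
# to a PyObject* automatically instead of relying on each caller.
set_item.argtypes = [ctypes.py_object, ctypes.py_object, ctypes.py_object]
set_item.restype = ctypes.c_int

d = {}
status = set_item(d, "abc", 1)   # returns 0 on success
```

With the signature declared, forgetting to wrap an argument at a call site is no longer possible, at the cost of a little setup per function.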

Don't forget to use :class:`py_object()`; if it's omitted you end up with a
segmentation fault.

:mod:`ctypes` has been around for a while, but people still write and
distribute hand-coded extension modules because you can't rely on
:mod:`ctypes` being present. Perhaps developers will begin to write Python
wrappers atop a library accessed through :mod:`ctypes` instead of extension
modules, now that :mod:`ctypes` is included with core Python.


.. seealso::

   http://starship.python.net/crew/theller/ctypes/
      The ctypes web page, with a tutorial, reference, and FAQ.

   The documentation for the :mod:`ctypes` module.

.. ======================================================================


.. _module-etree:

The ElementTree package
-----------------------

A subset of Fredrik Lundh's ElementTree library for processing XML has been
added to the standard library as :mod:`xml.etree`. The available modules are
:mod:`ElementTree`, :mod:`ElementPath`, and :mod:`ElementInclude` from
ElementTree 1.2.6. The :mod:`cElementTree` accelerator module is also
included.

The rest of this section will provide a brief overview of using ElementTree.
Full documentation for ElementTree is available at
http://effbot.org/zone/element-index.htm.

ElementTree represents an XML document as a tree of element nodes. The text
content of the document is stored as the :attr:`text` and :attr:`tail`
attributes of element nodes. (This is one of the major differences between
ElementTree and the Document Object Model; in the DOM there are many different
types of node, including :class:`TextNode`.)
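
To illustrate how text ends up in these two attributes, here is a small sketch using mixed content. The sample markup and variable names are made up for this illustration; :attr:`text` holds the text before an element's first child, and :attr:`tail` holds the text that follows an element's end tag.

```python
from xml.etree import ElementTree as ET

# Mixed content: text appears before, between, and after child elements.
para = ET.XML("<p>Hello <b>bold</b> world<br/>!</p>")

bold = para[0]   # the <b> element
br = para[1]     # the empty <br/> element

# para.text is 'Hello '   (text before the first child)
# bold.text is 'bold'     (text inside <b>)
# bold.tail is ' world'   (text after </b>, up to the next element)
# br.text   is None       (the element is empty)
# br.tail   is '!'        (text after <br/>)
```

No separate text nodes are created; every piece of character data is attached to the nearest element.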

The most commonly used parsing function is :func:`parse`, which takes either a
string (assumed to contain a filename) or a file-like object and returns an
:class:`ElementTree` instance::

   import urllib
   from xml.etree import ElementTree as ET

   tree = ET.parse('ex-1.xml')

   feed = urllib.urlopen(
             'http://planet.python.org/rss10.xml')
   tree = ET.parse(feed)

Once you have an :class:`ElementTree` instance, you can call its :meth:`getroot`
method to get the root :class:`Element` node.

There's also an :func:`XML` function that takes a string literal and returns an
:class:`Element` node (not an :class:`ElementTree`). This function provides a
tidy way to incorporate XML fragments, approaching the convenience of an XML
literal::

   svg = ET.XML("""<svg width="10px" version="1.0">
                </svg>""")
   svg.set('height', '320px')
   svg.append(elem1)    # elem1 is an existing Element

Each XML element supports some dictionary-like and some list-like access
methods. Dictionary-like operations are used to access attribute values, and
list-like operations are used to access child nodes.

+-------------------------------+--------------------------------------------+
| Operation                     | Result                                     |
+===============================+============================================+
| ``elem[n]``                   | Returns n'th child element.                |
+-------------------------------+--------------------------------------------+
| ``elem[m:n]``                 | Returns list of m'th through n'th child    |
|                               | elements.                                  |
+-------------------------------+--------------------------------------------+
| ``len(elem)``                 | Returns number of child elements.          |
+-------------------------------+--------------------------------------------+
| ``list(elem)``                | Returns list of child elements.            |
+-------------------------------+--------------------------------------------+
| ``elem.append(elem2)``        | Adds *elem2* as a child.                   |
+-------------------------------+--------------------------------------------+
| ``elem.insert(index, elem2)`` | Inserts *elem2* at the specified location. |
+-------------------------------+--------------------------------------------+
| ``del elem[n]``               | Deletes n'th child element.                |
+-------------------------------+--------------------------------------------+
| ``elem.keys()``               | Returns list of attribute names.           |
+-------------------------------+--------------------------------------------+
| ``elem.get(name)``            | Returns value of attribute *name*.         |
+-------------------------------+--------------------------------------------+
| ``elem.set(name, value)``     | Sets new value for attribute *name*.       |
+-------------------------------+--------------------------------------------+
| ``elem.attrib``               | Retrieves the dictionary containing        |
|                               | attributes.                                |
+-------------------------------+--------------------------------------------+
| ``del elem.attrib[name]``     | Deletes attribute *name*.                  |
+-------------------------------+--------------------------------------------+

Comments and processing instructions are also represented as :class:`Element`
nodes. To check if a node is a comment or a processing instruction::

   if elem.tag is ET.Comment:
       ...
   elif elem.tag is ET.ProcessingInstruction:
       ...

To generate XML output, you should call the :meth:`ElementTree.write` method.
Like :func:`parse`, it can take either a string or a file-like object::

   # Encoding is US-ASCII
   tree.write('output.xml')

   # Encoding is UTF-8
   f = open('output.xml', 'w')
   tree.write(f, encoding='utf-8')

(Caution: the default encoding used for output is ASCII.
For general XML work,
where an element's name may contain arbitrary Unicode characters, ASCII isn't a
very useful encoding because it will raise an exception if an element's name
contains any characters with values greater than 127. Therefore, it's best to
specify a different encoding such as UTF-8 that can handle any Unicode
character.)

This section is only a partial description of the ElementTree interfaces. Please
read the package's official documentation for more details.


.. seealso::

   http://effbot.org/zone/element-index.htm
      Official documentation for ElementTree.

.. ======================================================================


.. _module-hashlib:

The hashlib package
-------------------

A new :mod:`hashlib` module, written by Gregory P. Smith, has been added to
replace the :mod:`md5` and :mod:`sha` modules. :mod:`hashlib` adds support for
additional secure hashes (SHA-224, SHA-256, SHA-384, and SHA-512). When
available, the module uses OpenSSL for fast platform optimized implementations
of algorithms.

The old :mod:`md5` and :mod:`sha` modules still exist as wrappers around hashlib
to preserve backwards compatibility. The new module's interface is very close
to that of the old modules, but not identical. The most significant difference
is that the constructor functions for creating new hashing objects are named
differently.
::

   # Old versions
   h = md5.md5()
   h = md5.new()

   # New version
   h = hashlib.md5()

   # Old versions
   h = sha.sha()
   h = sha.new()

   # New version
   h = hashlib.sha1()

   # Hashes that weren't previously available
   h = hashlib.sha224()
   h = hashlib.sha256()
   h = hashlib.sha384()
   h = hashlib.sha512()

   # Alternative form
   h = hashlib.new('md5')          # Provide algorithm as a string

Once a hash object has been created, its methods are the same as before:
``update(string)`` hashes the specified string into the current digest
state, :meth:`digest` and :meth:`hexdigest` return the digest value as a binary
string or a string of hex digits, and :meth:`copy` returns a new hashing object
with the same digest state.


.. seealso::

   The documentation for the :mod:`hashlib` module.

.. ======================================================================


.. _module-sqlite:

The sqlite3 package
-------------------

The pysqlite module (http://www.pysqlite.org), a wrapper for the SQLite embedded
database, has been added to the standard library under the package name
:mod:`sqlite3`.

SQLite is a C library that provides a lightweight disk-based database that
doesn't require a separate server process and allows accessing the database
using a nonstandard variant of the SQL query language. Some applications can use
SQLite for internal data storage. It's also possible to prototype an
application using SQLite and then port the code to a larger database such as
PostgreSQL or Oracle.

pysqlite was written by Gerhard Häring and provides a SQL interface compliant
with the DB-API 2.0 specification described by :pep:`249`.

If you're compiling the Python source yourself, note that the source tree
doesn't include the SQLite code, only the wrapper module.
You'll need to have
the SQLite libraries and headers installed before compiling Python, and the
build process will compile the module when the necessary headers are available.

To use the module, you must first create a :class:`Connection` object that
represents the database. Here the data will be stored in the
:file:`/tmp/example` file::

   conn = sqlite3.connect('/tmp/example')

You can also supply the special name ``:memory:`` to create a database in RAM.

Once you have a :class:`Connection`, you can create a :class:`Cursor` object
and call its :meth:`execute` method to perform SQL commands::

   c = conn.cursor()

   # Create table
   c.execute('''create table stocks
   (date text, trans text, symbol text,
    qty real, price real)''')

   # Insert a row of data
   c.execute("""insert into stocks
             values ('2006-01-05','BUY','RHAT',100,35.14)""")

Usually your SQL operations will need to use values from Python variables. You
shouldn't assemble your query using Python's string operations because doing so
is insecure; it makes your program vulnerable to an SQL injection attack.

Instead, use the DB-API's parameter substitution. Put ``?`` as a placeholder
wherever you want to use a value, and then provide a tuple of values as the
second argument to the cursor's :meth:`execute` method. (Other database modules
may use a different placeholder, such as ``%s`` or ``:1``.) For example::

   # Never do this -- insecure!
   symbol = 'IBM'
   c.execute("... where symbol = '%s'" % symbol)

   # Do this instead
   t = (symbol,)
   c.execute('select * from stocks where symbol=?', t)

   # Larger example
   for t in (('2006-03-28', 'BUY', 'IBM', 1000, 45.00),
             ('2006-04-05', 'BUY', 'MSOFT', 1000, 72.00),
             ('2006-04-06', 'SELL', 'IBM', 500, 53.00),
            ):
       c.execute('insert into stocks values (?,?,?,?,?)', t)

To retrieve data after executing a SELECT statement, you can either treat the
cursor as an iterator, call the cursor's :meth:`fetchone` method to retrieve a
single matching row, or call :meth:`fetchall` to get a list of the matching
rows.

This example uses the iterator form::

   >>> c = conn.cursor()
   >>> c.execute('select * from stocks order by price')
   >>> for row in c:
   ...    print row
   ...
   (u'2006-01-05', u'BUY', u'RHAT', 100, 35.140000000000001)
   (u'2006-03-28', u'BUY', u'IBM', 1000, 45.0)
   (u'2006-04-06', u'SELL', u'IBM', 500, 53.0)
   (u'2006-04-05', u'BUY', u'MSOFT', 1000, 72.0)
   >>>

For more information about the SQL dialect supported by SQLite, see
https://www.sqlite.org.


.. seealso::

   http://www.pysqlite.org
      The pysqlite web page.

   https://www.sqlite.org
      The SQLite web page; the documentation describes the syntax and the available
      data types for the supported SQL dialect.

   The documentation for the :mod:`sqlite3` module.

   :pep:`249` - Database API Specification 2.0
      PEP written by Marc-André Lemburg.

.. ======================================================================


.. _module-wsgiref:

The wsgiref package
-------------------

The Web Server Gateway Interface (WSGI) v1.0 defines a standard interface
between web servers and Python web applications and is described in :pep:`333`.
The :mod:`wsgiref` package is a reference implementation of the WSGI
specification.
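
To make the interface concrete, here is a minimal sketch of a WSGI application in the shape :pep:`333` describes: a callable that receives the request environment and a ``start_response`` function, and returns an iterable of body chunks. The name ``hello_app`` and the response text are illustrative and not part of :mod:`wsgiref`; the body is encoded explicitly so that the sketch also runs on later interpreters that require byte strings.

```python
def hello_app(environ, start_response):
    """A hypothetical WSGI application that always returns a greeting."""
    status = '200 OK'
    headers = [('Content-Type', 'text/plain')]

    # The server supplies start_response; call it once before
    # returning the body.
    start_response(status, headers)

    # The return value is an iterable of strings making up the body.
    return ['Hello, world!\n'.encode('ascii')]
```

A callable like this is what a WSGI server expects to be handed as its application object.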

.. XXX should this be in a PEP 333 section instead?

The package includes a basic HTTP server that will run a WSGI application; this
server is useful for debugging but isn't intended for production use. Setting
up a server takes only a few lines of code::

   from wsgiref import simple_server

   wsgi_app = ...

   host = ''
   port = 8000
   httpd = simple_server.make_server(host, port, wsgi_app)
   httpd.serve_forever()

.. XXX discuss structure of WSGI applications?
.. XXX provide an example using Django or some other framework?


.. seealso::

   http://www.wsgi.org
      A central web site for WSGI-related resources.

   :pep:`333` - Python Web Server Gateway Interface v1.0
      PEP written by Phillip J. Eby.

.. ======================================================================


.. _build-api:

Build and C API Changes
=======================

Changes to Python's build process and to the C API include:

* The Python source tree was converted from CVS to Subversion, in a complex
  migration procedure that was supervised and flawlessly carried out by Martin von
  Löwis. The procedure was developed as :pep:`347`.

* Coverity, a company that markets a source code analysis tool called Prevent,
  provided the results of their examination of the Python source code. The
  analysis found about 60 bugs that were quickly fixed. Many of the bugs were
  refcounting problems, often occurring in error-handling code. See
  https://scan.coverity.com for the statistics.

* The largest change to the C API came from :pep:`353`, which modifies the
  interpreter to use a :c:type:`Py_ssize_t` type definition instead of
  :c:type:`int`. See the earlier section :ref:`pep-353` for a discussion of this
  change.

* The design of the bytecode compiler has changed a great deal, no longer
  generating bytecode by traversing the parse tree. Instead the parse tree is
  converted to an abstract syntax tree (or AST), and it is the abstract syntax
  tree that's traversed to produce the bytecode.

  It's possible for Python code to obtain AST objects by using the
  :func:`compile` built-in and specifying ``_ast.PyCF_ONLY_AST`` as the value of
  the *flags* parameter::

     from _ast import PyCF_ONLY_AST
     ast = compile("""a=0
     for i in range(10):
         a += i
     """, "<string>", 'exec', PyCF_ONLY_AST)

     assignment = ast.body[0]
     for_loop = ast.body[1]

  No official documentation has been written for the AST code yet, but :pep:`339`
  discusses the design. To start learning about the code, read the definition of
  the various AST nodes in :file:`Parser/Python.asdl`. A Python script reads this
  file and generates a set of C structure definitions in
  :file:`Include/Python-ast.h`. The :c:func:`PyParser_ASTFromString` and
  :c:func:`PyParser_ASTFromFile` functions, defined in
  :file:`Include/pythonrun.h`, take Python source as input and return the root of
  an AST representing the contents. This AST can then be turned into a code
  object by :c:func:`PyAST_Compile`. For more information, read the source code,
  and then ask questions on python-dev.

  The AST code was developed under Jeremy Hylton's management, and implemented by
  (in alphabetical order) Brett Cannon, Nick Coghlan, Grant Edwards, John
  Ehresman, Kurt Kaiser, Neal Norwitz, Tim Peters, Armin Rigo, and Neil
  Schemenauer, plus the participants in a number of AST sprints at conferences
  such as PyCon.

  .. List of names taken from Jeremy's python-dev post at
  .. https://mail.python.org/pipermail/python-dev/2005-October/057500.html

* Evan Jones's patch to obmalloc, first described in a talk at PyCon DC 2005,
  was applied. Python 2.4 allocated small objects in 256K-sized arenas, but never
  freed arenas. With this patch, Python will free arenas when they're empty. The
  net effect is that on some platforms, when you allocate many objects, Python's
  memory usage may actually drop when you delete them and the memory may be
  returned to the operating system. (Implemented by Evan Jones, and reworked by
  Tim Peters.)

  Note that this change means extension modules must be more careful when
  allocating memory. Python's API has many different functions for allocating
  memory that are grouped into families. For example, :c:func:`PyMem_Malloc`,
  :c:func:`PyMem_Realloc`, and :c:func:`PyMem_Free` are one family that allocates
  raw memory, while :c:func:`PyObject_Malloc`, :c:func:`PyObject_Realloc`, and
  :c:func:`PyObject_Free` are another family that's supposed to be used for
  creating Python objects.

  Previously these different families all reduced to the platform's
  :c:func:`malloc` and :c:func:`free` functions. This meant it didn't matter if
  you got things wrong and allocated memory with the :c:func:`PyMem` function but
  freed it with the :c:func:`PyObject` function. With 2.5's changes to obmalloc,
  these families now do different things and mismatches will probably result in a
  segfault. You should carefully test your C extension modules with Python 2.5.

* The built-in set types now have an official C API. Call :c:func:`PySet_New`
  and :c:func:`PyFrozenSet_New` to create a new set, :c:func:`PySet_Add` and
  :c:func:`PySet_Discard` to add and remove elements, and :c:func:`PySet_Contains`
  and :c:func:`PySet_Size` to examine the set's state. (Contributed by Raymond
  Hettinger.)

* C code can now obtain information about the exact revision of the Python
  interpreter by calling the :c:func:`Py_GetBuildInfo` function, which returns a
  string of build information like this: ``"trunk:45355:45356M, Apr 13 2006,
  07:42:19"``. (Contributed by Barry Warsaw.)

* Two new macros can be used to indicate C functions that are local to the
  current file so that a faster calling convention can be used.
  ``Py_LOCAL(type)`` declares the function as returning a value of the
  specified *type* and uses a fast-calling qualifier.
  ``Py_LOCAL_INLINE(type)`` does the same thing and also requests the
  function be inlined. If :c:func:`PY_LOCAL_AGGRESSIVE` is defined before
  :file:`Python.h` is included, a set of more aggressive optimizations is enabled
  for the module; you should benchmark the results to find out if these
  optimizations actually make the code faster. (Contributed by Fredrik Lundh at
  the NeedForSpeed sprint.)

* ``PyErr_NewException(name, base, dict)`` can now accept a tuple of base
  classes as its *base* argument. (Contributed by Georg Brandl.)

* The :c:func:`PyErr_Warn` function for issuing warnings is now deprecated in
  favour of ``PyErr_WarnEx(category, message, stacklevel)`` which lets you
  specify the number of stack frames separating this function and the caller. A
  *stacklevel* of 1 is the function calling :c:func:`PyErr_WarnEx`, 2 is the
  function above that, and so forth. (Added by Neal Norwitz.)

* The CPython interpreter is still written in C, but the code can now be
  compiled with a C++ compiler without errors. (Implemented by Anthony Baxter,
  Martin von Löwis, and Skip Montanaro.)

* The :c:func:`PyRange_New` function was removed. It was never documented, never
  used in the core code, and had dangerously lax error checking. In the unlikely
  case that your extensions were using it, you can replace it with something like
  the following::

     range = PyObject_CallFunction((PyObject*) &PyRange_Type, "lll",
                                   start, stop, step);

.. ======================================================================


.. _ports:

Port-Specific Changes
---------------------

* MacOS X (10.3 and higher): dynamic loading of modules now uses the
  :c:func:`dlopen` function instead of MacOS-specific functions.

* MacOS X: an :option:`!--enable-universalsdk` switch was added to the
  :program:`configure` script that compiles the interpreter as a universal binary
  able to run on both PowerPC and Intel processors. (Contributed by Ronald
  Oussoren; :issue:`2573`.)

* Windows: :file:`.dll` is no longer supported as a filename extension for
  extension modules. :file:`.pyd` is now the only filename extension that will be
  searched for.

.. ======================================================================


.. _porting:

Porting to Python 2.5
=====================

This section lists previously described changes that may require changes to your
code:

* ASCII is now the default encoding for modules. It's now a syntax error if a
  module contains string literals with 8-bit characters but doesn't have an
  encoding declaration. In Python 2.4 this triggered a warning, not a syntax
  error.

* Previously, the :attr:`gi_frame` attribute of a generator was always a frame
  object. Because of the :pep:`342` changes described in section :ref:`pep-342`,
  it's now possible for :attr:`gi_frame` to be ``None``.

* A new warning, :class:`UnicodeWarning`, is triggered when you attempt to
  compare a Unicode string and an 8-bit string that can't be converted to Unicode
  using the default ASCII encoding.
  Previously such comparisons would raise a
  :class:`UnicodeDecodeError` exception.

* Library: the :mod:`csv` module is now stricter about multi-line quoted fields.
  If your files contain newlines embedded within fields, the input should be split
  into lines in a manner which preserves the newline characters.

* Library: the :mod:`locale` module's :func:`format` function would
  previously accept any string as long as no more than one %char specifier
  appeared. In Python 2.5, the argument must be exactly one %char specifier with
  no surrounding text.

* Library: the :mod:`pickle` and :mod:`cPickle` modules no longer accept a
  return value of ``None`` from the :meth:`__reduce__` method; the method must
  return a tuple of arguments instead. The modules also no longer accept the
  deprecated *bin* keyword parameter.

* Library: the :mod:`SimpleXMLRPCServer` and :mod:`DocXMLRPCServer` classes now
  have a :attr:`rpc_paths` attribute that constrains XML-RPC operations to a
  limited set of URL paths; the default is to allow only ``'/'`` and ``'/RPC2'``.
  Setting :attr:`rpc_paths` to ``None`` or an empty tuple disables this path
  checking.

* C API: Many functions now use :c:type:`Py_ssize_t` instead of :c:type:`int` to
  allow processing more data on 64-bit machines. Extension code may need to make
  the same change to avoid warnings and to support 64-bit machines. See the
  earlier section :ref:`pep-353` for a discussion of this change.

* C API: The obmalloc changes mean that you must be careful not to mix usage
  of the :c:func:`PyMem_\*` and :c:func:`PyObject_\*` families of functions. Memory
  allocated with one family's :c:func:`\*_Malloc` must be freed with the
  corresponding family's :c:func:`\*_Free` function.

.. ======================================================================


Acknowledgements
================

The author would like to thank the following people for offering suggestions,
corrections and assistance with various drafts of this article: Georg Brandl,
Nick Coghlan, Phillip J. Eby, Lars Gustäbel, Raymond Hettinger, Ralf W.
Grosse-Kunstleve, Kent Johnson, Iain Lowe, Martin von Löwis, Fredrik Lundh, Andrew
McNamara, Skip Montanaro, Gustavo Niemeyer, Paul Prescod, James Pryor, Mike
Rovner, Scott Weikart, Barry Warsaw, and Thomas Wouters.