****************************
  What's New in Python 2.5
****************************

:Author: A.M. Kuchling

.. |release| replace:: 1.01

.. $Id: whatsnew25.tex 56611 2007-07-29 08:26:10Z georg.brandl $
.. Fix XXX comments

This article explains the new features in Python 2.5.  The final release of
Python 2.5 is scheduled for August 2006; :pep:`356` describes the planned
release schedule.

The changes in Python 2.5 are an interesting mix of language and library
improvements. The library enhancements will be more important to Python's user
community, I think, because several widely useful packages were added.  New
modules include ElementTree for XML processing (:mod:`xml.etree`),
the SQLite database module (:mod:`sqlite3`), and the :mod:`ctypes`
module for calling C functions.

The language changes are of middling significance.  Some pleasant new features
were added, but most of them aren't features that you'll use every day.
Conditional expressions were finally added to the language using a novel syntax;
see section :ref:`pep-308`.  The new ':keyword:`with`' statement will make
writing cleanup code easier (section :ref:`pep-343`).  Values can now be passed
into generators (section :ref:`pep-342`).  Imports are now visible as either
absolute or relative (section :ref:`pep-328`).  Some corner cases of exception
handling are handled better (section :ref:`pep-341`).  All these improvements
are worthwhile, but they're improvements to one specific language feature or
another; none of them are broad modifications to Python's semantics.

As well as the language and library additions, other improvements and bugfixes
were made throughout the source tree.  A search through the SVN change logs
finds there were 353 patches applied and 458 bugs fixed between Python 2.4 and
2.5.  (Both figures are likely to be underestimates.)

This article doesn't try to be a complete specification of the new features;
instead changes are briefly introduced using helpful examples.  For full
details, you should always refer to the documentation for Python 2.5 at
https://docs.python.org. If you want to understand the complete implementation
and design rationale, refer to the PEP for a particular new feature.

Comments, suggestions, and error reports for this document are welcome; please
e-mail them to the author or open a bug in the Python bug tracker.

.. ======================================================================


.. _pep-308:

PEP 308: Conditional Expressions
================================

For a long time, people have been requesting a way to write conditional
expressions, which are expressions that return value A or value B depending on
whether a Boolean value is true or false.  A conditional expression lets you
write a single assignment statement that has the same effect as the following::

   if condition:
       x = true_value
   else:
       x = false_value

There have been endless tedious discussions of syntax on both python-dev and
comp.lang.python.  A vote was even held that found the majority of voters wanted
conditional expressions in some form, but there was no syntax that was preferred
by a clear majority. Candidates included C's ``cond ? true_v : false_v``, ``if
cond then true_v else false_v``, and 16 other variations.

Guido van Rossum eventually chose a surprising syntax::

   x = true_value if condition else false_value

Evaluation is still lazy as in existing Boolean expressions, so the order of
evaluation jumps around a bit.  The *condition* expression in the middle is
evaluated first, and the *true_value* expression is evaluated only if the
condition was true.  Similarly, the *false_value* expression is only evaluated
when the condition is false.
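
The evaluation order described above can be checked with a small sketch.  The
helper functions ``true_branch`` and ``false_branch`` and the ``calls`` list
are made up for this illustration; they only record which branch actually ran.

```python
# Illustrative sketch only: these helper names are hypothetical.
calls = []

def true_branch():
    # Records that the "true" expression was evaluated.
    calls.append('true')
    return 'yes'

def false_branch():
    # Records that the "false" expression was evaluated.
    calls.append('false')
    return 'no'

condition = False
# The condition in the middle is evaluated first; only one branch runs.
x = true_branch() if condition else false_branch()
```

After this runs, ``calls`` contains a single entry, because the branch that
was not selected is never evaluated.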

This syntax may seem strange and backwards; why does the condition go in the
*middle* of the expression, and not in the front as in C's ``c ? x : y``?  The
decision was checked by applying the new syntax to the modules in the standard
library and seeing how the resulting code read.  In many cases where a
conditional expression is used, one value seems to be the 'common case' and one
value is an 'exceptional case', used only on rarer occasions when the condition
isn't met.  The conditional syntax makes this pattern a bit more obvious::

   contents = ((doc + '\n') if doc else '')

I read the above statement as meaning "here *contents* is  usually assigned a
value of ``doc+'\n'``; sometimes  *doc* is empty, in which special case an empty
string is returned."   I doubt I will use conditional expressions very often
where there  isn't a clear common and uncommon case.

There was some discussion of whether the language should require surrounding
conditional expressions with parentheses.  The decision was made to *not*
require parentheses in the Python language's grammar, but as a matter of style I
think you should always use them. Consider these two statements::

   # First version -- no parens
   level = 1 if logging else 0

   # Second version -- with parens
   level = (1 if logging else 0)

In the first version, I think a reader's eye might group the statement into
'level = 1', 'if logging', 'else 0', and think that the condition decides
whether the assignment to *level* is performed.  The second version reads
better, in my opinion, because it makes it clear that the assignment is always
performed and the choice is being made between two values.

Another reason for including the brackets: a few odd combinations of list
comprehensions and lambdas could look like incorrect conditional expressions.
See :pep:`308` for some examples.
If you put parentheses around your
conditional expressions, you won't run into this case.


.. seealso::

   :pep:`308` - Conditional Expressions
      PEP written by Guido van Rossum and Raymond D. Hettinger; implemented by Thomas
      Wouters.

.. ======================================================================


.. _pep-309:

PEP 309: Partial Function Application
=====================================

The :mod:`functools` module is intended to contain tools for functional-style
programming.

One useful tool in this module is the :func:`partial` function. For programs
written in a functional style, you'll sometimes want to construct variants of
existing functions that have some of the parameters filled in.  Consider a
Python function ``f(a, b, c)``; you could create a new function ``g(b, c)`` that
was equivalent to ``f(1, b, c)``.  This is called "partial function
application".

:func:`partial` takes the arguments ``(function, arg1, arg2, ... kwarg1=value1,
kwarg2=value2)``.  The resulting object is callable, so you can just call it to
invoke *function* with the filled-in arguments.

Here's a small but realistic example::

   import functools

   def log(message, subsystem):
       "Write the contents of 'message' to the specified subsystem."
       print '%s: %s' % (subsystem, message)
       ...

   server_log = functools.partial(log, subsystem='server')
   server_log('Unable to open socket')

Here's another example, from a program that uses PyGTK.  Here a context-sensitive
pop-up menu is being constructed dynamically.  The callback provided
for the menu option is a partially applied version of the :meth:`open_item`
method, where the first argument has been provided. ::

   ...
   class Application:
       def open_item(self, path):
           ...
       def init(self):
           open_func = functools.partial(self.open_item, item_path)
           popup_menu.append( ("Open", open_func, 1) )

Another function in the :mod:`functools` module is the
``update_wrapper(wrapper, wrapped)`` function that helps you write
well-behaved decorators.  :func:`update_wrapper` copies the name, module, and
docstring attribute to a wrapper function so that tracebacks inside the wrapped
function are easier to understand.  For example, you might write::

   def my_decorator(f):
       def wrapper(*args, **kwds):
           print 'Calling decorated function'
           return f(*args, **kwds)
       functools.update_wrapper(wrapper, f)
       return wrapper

:func:`wraps` is a decorator that can be used inside your own decorators to copy
the wrapped function's information.  An alternate  version of the previous
example would be::

   def my_decorator(f):
       @functools.wraps(f)
       def wrapper(*args, **kwds):
           print 'Calling decorated function'
           return f(*args, **kwds)
       return wrapper


.. seealso::

   :pep:`309` - Partial Function Application
      PEP proposed and written by Peter Harris; implemented by Hye-Shik Chang and Nick
      Coghlan, with adaptations by Raymond Hettinger.

.. ======================================================================


.. _pep-314:

PEP 314: Metadata for Python Software Packages v1.1
===================================================

Some simple dependency support was added to Distutils.  The :func:`setup`
function now has ``requires``, ``provides``, and ``obsoletes`` keyword
parameters.  When you build a source distribution using the ``sdist`` command,
the dependency information will be recorded in the :file:`PKG-INFO` file.

Another new keyword parameter is ``download_url``, which should be set to a URL
for the package's source code.
This means it's now possible to look up an entry
in the package index, determine the dependencies for a package, and download the
required packages. ::

   VERSION = '1.0'
   setup(name='PyPackage',
         version=VERSION,
         requires=['numarray', 'zlib (>=1.1.4)'],
         obsoletes=['OldPackage'],
         download_url=('http://www.example.com/pypackage/dist/pkg-%s.tar.gz'
                       % VERSION),
        )

Another new enhancement to the Python package index at
https://pypi.org is storing source and binary archives for a
package.  The new :command:`upload` Distutils command will upload a package to
the repository.

Before a package can be uploaded, you must be able to build a distribution using
the :command:`sdist` Distutils command.  Once that works, you can run ``python
setup.py upload`` to add your package to the PyPI archive.  Optionally you can
GPG-sign the package by supplying the :option:`!--sign` and :option:`!--identity`
options.

Package uploading was implemented by Martin von Löwis and Richard Jones.


.. seealso::

   :pep:`314` - Metadata for Python Software Packages v1.1
      PEP proposed and written by A.M. Kuchling, Richard Jones, and Fred Drake;
      implemented by Richard Jones and Fred Drake.

.. ======================================================================


.. _pep-328:

PEP 328: Absolute and Relative Imports
======================================

The simpler part of PEP 328 was implemented in Python 2.4: parentheses could now
be used to enclose the names imported from a module using the ``from ... import
...`` statement, making it easier to import many different names.

The more complicated part has been implemented in Python 2.5: importing a module
can be specified to use absolute or package-relative imports.  The plan is to
move toward making absolute imports the default in future versions of Python.

Let's say you have a package directory like this::

   pkg/
   pkg/__init__.py
   pkg/main.py
   pkg/string.py

This defines a package named :mod:`pkg` containing the :mod:`pkg.main` and
:mod:`pkg.string` submodules.

Consider the code in the :file:`main.py` module.  What happens if it executes
the statement ``import string``?  In Python 2.4 and earlier, Python first looks
in the package's directory to perform a relative import, finds
:file:`pkg/string.py`, imports the contents of that file as the
:mod:`pkg.string` module, and binds that module to the name ``string`` in the
:mod:`pkg.main` module's namespace.

That's fine if :mod:`pkg.string` was what you wanted.  But what if you wanted
Python's standard :mod:`string` module?  There's no clean way to ignore
:mod:`pkg.string` and look for the standard module; generally you had to look at
the contents of ``sys.modules``, which is slightly unclean.    Holger Krekel's
:mod:`py.std` package provides a tidier way to perform imports from the standard
library, ``import py; py.std.string.join()``, but that package isn't available
on all Python installations.

Reading code which relies on relative imports is also less clear, because a
reader may be confused about which module, :mod:`string` or :mod:`pkg.string`,
is intended to be used.  Python users soon learned not to duplicate the names of
standard library modules in the names of their packages' submodules, but you
can't protect against having your submodule's name being used for a new module
added in a future version of Python.

In Python 2.5, you can switch :keyword:`import`'s behaviour to  absolute imports
using a ``from __future__ import absolute_import`` directive.  This absolute-import
behaviour will become the default in a future version (probably Python
2.7).  Once absolute imports  are the default, ``import string`` will always
find the standard library's version.
It's suggested that users should begin
using absolute imports as much as possible, so it's preferable to begin writing
``from pkg import string`` in your code.

Relative imports are still possible by adding a leading period  to the module
name when using the ``from ... import`` form::

   # Import names from pkg.string
   from .string import name1, name2
   # Import pkg.string
   from . import string

This imports the :mod:`string` module relative to the current package, so in
:mod:`pkg.main` this will import *name1* and *name2* from :mod:`pkg.string`.
Additional leading periods perform the relative import starting from the parent
of the current package.  For example, code in the :mod:`A.B.C` module can do::

   from . import D                 # Imports A.B.D
   from .. import E                # Imports A.E
   from ..F import G               # Imports A.F.G

Leading periods cannot be used with the ``import modname``  form of the import
statement, only the ``from ... import`` form.


.. seealso::

   :pep:`328` - Imports: Multi-Line and Absolute/Relative
      PEP written by Aahz; implemented by Thomas Wouters.

   https://pylib.readthedocs.io/
      The py library by Holger Krekel, which contains the :mod:`py.std` package.

.. ======================================================================


.. _pep-338:

PEP 338: Executing Modules as Scripts
=====================================

The :option:`-m` switch added in Python 2.4 to execute a module as a script
gained a few more abilities.  Instead of being implemented in C code inside the
Python interpreter, the switch now uses an implementation in a new module,
:mod:`runpy`.

The :mod:`runpy` module implements a more sophisticated import mechanism so that
it's now possible to run modules in a package such as :mod:`pychecker.checker`.
The module also supports alternative import mechanisms such as the
:mod:`zipimport` module.
This means you can add a .zip archive's path to
``sys.path`` and then use the :option:`-m` switch to execute code from the
archive.


.. seealso::

   :pep:`338` - Executing modules as scripts
      PEP written and  implemented by Nick Coghlan.

.. ======================================================================


.. _pep-341:

PEP 341: Unified try/except/finally
===================================

Until Python 2.5, the :keyword:`try` statement came in two flavours. You could
use a :keyword:`finally` block to ensure that code is always executed, or one or
more :keyword:`except` blocks to catch  specific exceptions.  You couldn't
combine both :keyword:`!except` blocks and a :keyword:`!finally` block, because
generating the right bytecode for the combined version was complicated and it
wasn't clear what the semantics of the combined statement should be.

Guido van Rossum spent some time working with Java, which does support the
equivalent of combining :keyword:`except` blocks and a :keyword:`finally` block,
and this clarified what the statement should mean.  In Python 2.5, you can now
write::

   try:
       block-1 ...
   except Exception1:
       handler-1 ...
   except Exception2:
       handler-2 ...
   else:
       else-block
   finally:
       final-block

The code in *block-1* is executed.  If the code raises an exception, the various
:keyword:`except` blocks are tested: if the exception is of class
:class:`Exception1`, *handler-1* is executed; otherwise if it's of class
:class:`Exception2`, *handler-2* is executed, and so forth.  If no exception is
raised, the *else-block* is executed.

No matter what happened previously, the *final-block* is executed once the code
block is complete and any raised exceptions handled.
Even if there's an error in
an exception handler or the *else-block* and a new exception is raised, the code
in the *final-block* is still run.


.. seealso::

   :pep:`341` - Unifying try-except and try-finally
      PEP written by Georg Brandl;  implementation by Thomas Lee.

.. ======================================================================


.. _pep-342:

PEP 342: New Generator Features
===============================

Python 2.5 adds a simple way to pass values *into* a generator. As introduced in
Python 2.3, generators only produce output; once a generator's code was invoked
to create an iterator, there was no way to pass any new information into the
function when its execution is resumed.  Sometimes the ability to pass in some
information would be useful.  Hackish solutions to this include making the
generator's code look at a global variable and then changing the global
variable's value, or passing in some mutable object that callers then modify.

To refresh your memory of basic generators, here's a simple example::

   def counter(maximum):
       i = 0
       while i < maximum:
           yield i
           i += 1

When you call ``counter(10)``, the result is an iterator that returns the values
from 0 up to 9.  On encountering the :keyword:`yield` statement, the iterator
returns the provided value and suspends the function's execution, preserving the
local variables. Execution resumes on the following call to the iterator's
:meth:`next` method, picking up after the :keyword:`!yield` statement.

In Python 2.3, :keyword:`yield` was a statement; it didn't return any value.
In 2.5, :keyword:`!yield` is now an expression, returning a value that can be
assigned to a variable or otherwise operated on::

   val = (yield i)

I recommend that you always put parentheses around a :keyword:`yield` expression
when you're doing something with the returned value, as in the above example.
The parentheses aren't always necessary, but it's easier to always add them
instead of having to remember when they're needed.

(:pep:`342` explains the exact rules, which are that a
:keyword:`yield`\ -expression must always be parenthesized except when it
is the top-level
expression on the right-hand side of an assignment.  This means you can write
``val = yield i`` but have to use parentheses when there's an operation, as in
``val = (yield i) + 12``.)

Values are sent into a generator by calling its ``send(value)`` method.  The
generator's code is then resumed and the :keyword:`yield` expression returns the
specified *value*.  If the regular :meth:`next` method is called, the
:keyword:`!yield` returns :const:`None`.

Here's the previous example, modified to allow changing the value of the
internal counter. ::

   def counter(maximum):
       i = 0
       while i < maximum:
           val = (yield i)
           # If value provided, change counter
           if val is not None:
               i = val
           else:
               i += 1

And here's an example of changing the counter::

   >>> it = counter(10)
   >>> print it.next()
   0
   >>> print it.next()
   1
   >>> print it.send(8)
   8
   >>> print it.next()
   9
   >>> print it.next()
   Traceback (most recent call last):
     File "t.py", line 15, in ?
       print it.next()
   StopIteration

:keyword:`yield` will usually return :const:`None`, so you should always check
for this case.  Don't just use its value in expressions unless you're sure that
the :meth:`send` method will be the only method used to resume your generator
function.
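
As a further sketch of why the ``None`` check matters, here is a generator
that keeps a running total.  The name ``running_total`` is made up for this
illustration.  The generator is resumed with ``send(None)``, which behaves like
a plain :meth:`next` call, so that the same code runs unchanged on newer
interpreters too.

```python
def running_total():
    # Hypothetical example: accumulates the values sent in.
    total = 0
    while True:
        val = (yield total)
        # A plain next() call (or send(None)) delivers None; skip it
        # rather than trying to add None to the total.
        if val is not None:
            total += val

gen = running_total()
first = gen.send(None)       # starts the generator; runs to the first yield
after_five = gen.send(5)     # total is now 5
after_none = gen.send(None)  # resumed without a value; total unchanged
after_two = gen.send(2)      # total is now 7
```

Without the ``is not None`` test, the third call would raise a
:exc:`TypeError` when the generator tried to add ``None`` to the total.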

In addition to :meth:`send`, there are two other new methods on generators:

* ``throw(type, value=None, traceback=None)`` is used to raise an exception
  inside the generator; the exception is raised by the :keyword:`yield` expression
  where the generator's execution is paused.

* :meth:`close` raises a new :exc:`GeneratorExit` exception inside the generator
  to terminate the iteration.  On receiving this exception, the generator's code
  must either raise :exc:`GeneratorExit` or :exc:`StopIteration`.  Catching the
  :exc:`GeneratorExit` exception and returning a value is illegal and will trigger
  a :exc:`RuntimeError`; if the function raises some other exception, that
  exception is propagated to the caller.  :meth:`close` will also be called by
  Python's garbage collector when the generator is garbage-collected.

  If you need to run cleanup code when a :exc:`GeneratorExit` occurs, I suggest
  using a ``try: ... finally:`` suite instead of  catching :exc:`GeneratorExit`.

The cumulative effect of these changes is to turn generators from one-way
producers of information into both producers and consumers.

Generators also become *coroutines*, a more generalized form of subroutines.
Subroutines are entered at one point and exited at another point (the top of the
function, and a :keyword:`return` statement), but coroutines can be entered,
exited, and resumed at many different points (the :keyword:`yield` statements).
We'll have to figure out patterns for using coroutines effectively in Python.

The addition of the :meth:`close` method has one side effect that isn't obvious.
:meth:`close` is called when a generator is garbage-collected, so this means the
generator's code gets one last chance to run before the generator is destroyed.
This last chance means that ``try...finally`` statements in generators can now
be guaranteed to work; the :keyword:`finally` clause will now always get a
chance to run.  The syntactic restriction that you couldn't mix :keyword:`yield`
statements with a ``try...finally`` suite has therefore been removed.  This
seems like a minor bit of language trivia, but using generators and
``try...finally`` is actually necessary in order to implement the
:keyword:`with` statement described by PEP 343.  I'll look at this new statement
in the following  section.

Another even more esoteric effect of this change: previously, the
:attr:`gi_frame` attribute of a generator was always a frame object. It's now
possible for :attr:`gi_frame` to be ``None`` once the generator has been
exhausted.


.. seealso::

   :pep:`342` - Coroutines via Enhanced Generators
      PEP written by  Guido van Rossum and Phillip J. Eby; implemented by Phillip J.
      Eby.  Includes examples of  some fancier uses of generators as coroutines.

      Earlier versions of these features were proposed in  :pep:`288` by Raymond
      Hettinger and :pep:`325` by Samuele Pedroni.

   https://en.wikipedia.org/wiki/Coroutine
      The Wikipedia entry for  coroutines.

   http://www.sidhe.org/~dan/blog/archives/000178.html
      An explanation of coroutines from a Perl point of view, written by Dan Sugalski.

.. ======================================================================


.. _pep-343:

PEP 343: The 'with' statement
=============================

The ':keyword:`with`' statement clarifies code that previously would use
``try...finally`` blocks to ensure that clean-up code is executed.  In this
section, I'll discuss the statement as it will commonly be used.  In the next
section, I'll examine the implementation details and show how to write objects
for use with this statement.

The ':keyword:`with`' statement is a new control-flow structure whose basic
structure is::

   with expression [as variable]:
       with-block

The expression is evaluated, and it should result in an object that supports the
context management protocol (that is, has :meth:`__enter__` and :meth:`__exit__`
methods).

The object's :meth:`__enter__` is called before *with-block* is executed and
therefore can run set-up code. It also may return a value that is bound to the
name *variable*, if given.  (Note carefully that *variable* is *not* assigned
the result of *expression*.)

After execution of the *with-block* is finished, the object's :meth:`__exit__`
method is called, even if the block raised an exception, and can therefore run
clean-up code.

To enable the statement in Python 2.5, you need to add the following directive
to your module::

   from __future__ import with_statement

The statement will always be enabled in Python 2.6.

Some standard Python objects now support the context management protocol and can
be used with the ':keyword:`with`' statement. File objects are one example::

   with open('/etc/passwd', 'r') as f:
       for line in f:
           print line
           ... more processing code ...

After this statement has executed, the file object in *f* will have been
automatically closed, even if the :keyword:`for` loop raised an exception
part-way through the block.

.. note::

   In this case, *f* is the same object created by :func:`open`, because
   :meth:`file.__enter__` returns *self*.

The :mod:`threading` module's locks and condition variables  also support the
':keyword:`with`' statement::

   lock = threading.Lock()
   with lock:
       # Critical section of code
       ...

The lock is acquired before the block is executed and always released once  the
block is complete.
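
To make the acquire/release behaviour concrete, here is a minimal sketch that
puts the ':keyword:`with`' form next to the ``try...finally`` code it
replaces.  The function names and the ``shared`` list are invented for this
illustration.

```python
# On Python 2.5, 'from __future__ import with_statement' must be the
# first statement of the module for this sketch to compile.
import threading

lock = threading.Lock()
shared = []

def add_item(item):
    # New style: the lock is released when the block exits,
    # whether or not an exception was raised.
    with lock:
        shared.append(item)

def add_item_old_style(item):
    # Roughly what had to be written by hand before.
    lock.acquire()
    try:
        shared.append(item)
    finally:
        lock.release()

add_item(1)
add_item_old_style(2)
still_held = lock.locked()   # False: both forms released the lock
```

Both functions leave the lock released afterwards; the ':keyword:`with`'
version just says so in fewer lines.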

The new :func:`localcontext` function in the :mod:`decimal` module makes it easy
to save and restore the current decimal context, which encapsulates the desired
precision and rounding characteristics for computations::

   from decimal import Decimal, Context, localcontext

   # Displays with default precision of 28 digits
   v = Decimal('578')
   print v.sqrt()

   with localcontext(Context(prec=16)):
       # All code in this block uses a precision of 16 digits.
       # The original context is restored on exiting the block.
       print v.sqrt()


.. _new-25-context-managers:

Writing Context Managers
------------------------

Under the hood, the ':keyword:`with`' statement is fairly complicated. Most
people will only use ':keyword:`!with`' in company with existing objects and
don't need to know these details, so you can skip the rest of this section if
you like.  Authors of new objects will need to understand the details of the
underlying implementation and should keep reading.

A high-level explanation of the context management protocol is:

* The expression is evaluated and should result in an object called a "context
  manager".  The context manager must have :meth:`__enter__` and :meth:`__exit__`
  methods.

* The context manager's :meth:`__enter__` method is called.  The value returned
  is assigned to *VAR*.  If no ``'as VAR'`` clause is present, the value is simply
  discarded.

* The code in *BLOCK* is executed.

* If *BLOCK* raises an exception, the ``__exit__(type, value, traceback)``
  is called with the exception details, the same values returned by
  :func:`sys.exc_info`.  The method's return value controls whether the exception
  is re-raised: any false value re-raises the exception, and ``True`` will result
  in suppressing it.
  You'll only rarely want to suppress the exception, because
  if you do the author of the code containing the ':keyword:`with`' statement will
  never realize anything went wrong.

* If *BLOCK* didn't raise an exception,  the :meth:`__exit__` method is still
  called, but *type*, *value*, and *traceback* are all ``None``.

Let's think through an example.  I won't present detailed code but will only
sketch the methods necessary for a database that supports transactions.

(For people unfamiliar with database terminology: a set of changes to the
database are grouped into a transaction.  Transactions can be either committed,
meaning that all the changes are written into the database, or rolled back,
meaning that the changes are all discarded and the database is unchanged.  See
any database textbook for more information.)

Let's assume there's an object representing a database connection. Our goal will
be to let the user write code like this::

   db_connection = DatabaseConnection()
   with db_connection as cursor:
       cursor.execute('insert into ...')
       cursor.execute('delete from ...')
       # ... more operations ...

The transaction should be committed if the code in the block runs flawlessly or
rolled back if there's an exception. Here's the basic interface for
:class:`DatabaseConnection` that I'll assume::

   class DatabaseConnection:
       # Database interface
       def cursor(self):
           "Returns a cursor object and starts a new transaction"
       def commit(self):
           "Commits current transaction"
       def rollback(self):
           "Rolls back current transaction"

The :meth:`__enter__` method is pretty easy, having only to start a new
transaction.  For this application the resulting cursor object would be a useful
result, so the method will return it.  The user can then add ``as cursor`` to
their ':keyword:`with`' statement to bind the cursor to a variable name.
::

   class DatabaseConnection:
       ...
       def __enter__(self):
           # Code to start a new transaction
           cursor = self.cursor()
           return cursor

The :meth:`__exit__` method is the most complicated because it's where most of
the work has to be done.  The method has to check if an exception occurred.  If
there was no exception, the transaction is committed.  The transaction is rolled
back if there was an exception.

In the code below, execution will just fall off the end of the function,
returning the default value of ``None``.  ``None`` is false, so the exception
will be re-raised automatically.  If you wished, you could be more explicit and
add a :keyword:`return` statement at the marked location. ::

   class DatabaseConnection:
       ...
       def __exit__(self, type, value, tb):
           if tb is None:
               # No exception, so commit
               self.commit()
           else:
               # Exception occurred, so rollback.
               self.rollback()
               # return False


.. _contextlibmod:

The contextlib module
---------------------

The new :mod:`contextlib` module provides some functions and a decorator that
are useful for writing objects for use with the ':keyword:`with`' statement.

The decorator is called :func:`contextmanager`, and lets you write a single
generator function instead of defining a new class.  The generator should yield
exactly one value.  The code up to the :keyword:`yield` will be executed as the
:meth:`__enter__` method, and the value yielded will be the method's return
value that will get bound to the variable in the ':keyword:`with`' statement's
:keyword:`!as` clause, if any.  The code after the :keyword:`yield` will be
executed in the :meth:`__exit__` method.  Any exception raised in the block will
be raised by the :keyword:`!yield` statement.
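
The order of events can be traced with a small self-contained sketch.  The
``tracked`` generator and the ``events`` list are invented for this
illustration; they simply record what runs and when, including the case where
the block's exception surfaces at the :keyword:`yield`.

```python
# On Python 2.5, 'from __future__ import with_statement' must be the
# first statement of the module for this sketch to compile.
from contextlib import contextmanager

events = []

@contextmanager
def tracked(name):
    # Hypothetical context manager that logs each phase.
    events.append('enter ' + name)        # runs as __enter__
    try:
        yield name.upper()                # value bound by 'as'
    except ValueError:
        # The exception raised inside the with-block appears here.
        events.append('error seen at yield')
        raise
    else:
        events.append('clean exit')       # runs as __exit__

with tracked('job') as label:
    events.append('body ' + label)

try:
    with tracked('bad'):
        raise ValueError('boom')
except ValueError:
    events.append('propagated')
```

Because the generator re-raises the exception, it still reaches the code
around the ':keyword:`with`' statement, which is usually what you want.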

Our database example from the previous section could be written using this
decorator as::

   from contextlib import contextmanager

   @contextmanager
   def db_transaction (connection):
       cursor = connection.cursor()
       try:
           yield cursor
       except:
           connection.rollback()
           raise
       else:
           connection.commit()

   db = DatabaseConnection()
   with db_transaction(db) as cursor:
       ...

The :mod:`contextlib` module also has a ``nested(mgr1, mgr2, ...)`` function
that combines a number of context managers so you don't need to write nested
':keyword:`with`' statements. In this example, the single ':keyword:`!with`'
statement both starts a database transaction and acquires a thread lock::

   import threading
   from contextlib import nested

   lock = threading.Lock()
   with nested (db_transaction(db), lock) as (cursor, locked):
       ...

Finally, the ``closing(object)`` function returns *object* so that it can be
bound to a variable, and calls ``object.close()`` at the end of the block. ::

   import urllib, sys
   from contextlib import closing

   with closing(urllib.urlopen('http://www.yahoo.com')) as f:
       for line in f:
           sys.stdout.write(line)


.. seealso::

   :pep:`343` - The "with" statement
      PEP written by Guido van Rossum and Nick Coghlan; implemented by Mike Bland,
      Guido van Rossum, and Neal Norwitz. The PEP shows the code generated for a
      ':keyword:`with`' statement, which can be helpful in learning how the statement
      works.

   The documentation for the :mod:`contextlib` module.

.. ======================================================================


.. _pep-352:

PEP 352: Exceptions as New-Style Classes
========================================

Exception classes can now be new-style classes, not just classic classes, and
the built-in :exc:`Exception` class and all the standard built-in exceptions
(:exc:`NameError`, :exc:`ValueError`, etc.) are now new-style classes.

The inheritance hierarchy for exceptions has been rearranged a bit. In 2.5, the
inheritance relationships are::

   BaseException       # New in Python 2.5
   |- KeyboardInterrupt
   |- SystemExit
   |- Exception
      |- (all other current built-in exceptions)

This rearrangement was done because people often want to catch all exceptions
that indicate program errors. :exc:`KeyboardInterrupt` and :exc:`SystemExit`
aren't errors, though, and usually represent an explicit action such as the user
hitting :kbd:`Control-C` or code calling :func:`sys.exit`. A bare ``except:`` will
catch all exceptions, so you commonly need to list :exc:`KeyboardInterrupt` and
:exc:`SystemExit` in order to re-raise them. The usual pattern is::

   try:
       ...
   except (KeyboardInterrupt, SystemExit):
       raise
   except:
       # Log error...
       # Continue running program...

In Python 2.5, you can now write ``except Exception`` to achieve the same
result, catching all the exceptions that usually indicate errors but leaving
:exc:`KeyboardInterrupt` and :exc:`SystemExit` alone. As in previous versions,
a bare ``except:`` still catches all exceptions.

The goal for Python 3.0 is to require any class raised as an exception to derive
from :exc:`BaseException` or some descendant of :exc:`BaseException`, and future
releases in the Python 2.x series may begin to enforce this constraint.
Therefore, I suggest you begin making all your exception classes derive from
:exc:`Exception` now. It's been suggested that the bare ``except:`` form should
be removed in Python 3.0, but Guido van Rossum hasn't decided whether to do this
or not.

Raising of strings as exceptions, as in the statement ``raise "Error
occurred"``, is deprecated in Python 2.5 and will trigger a warning. The aim is
to be able to remove the string-exception feature in a few releases.


.. seealso::

   :pep:`352` - Required Superclass for Exceptions
      PEP written by Brett Cannon and Guido van Rossum; implemented by Brett Cannon.

.. ======================================================================


.. _pep-353:

PEP 353: Using ssize_t as the index type
========================================

A wide-ranging change to Python's C API, using a new :c:type:`Py_ssize_t` type
definition instead of :c:type:`int`, will permit the interpreter to handle more
data on 64-bit platforms. This change doesn't affect Python's capacity on 32-bit
platforms.

Various pieces of the Python interpreter used C's :c:type:`int` type to store
sizes or counts; for example, the number of items in a list or tuple was stored
in an :c:type:`int`. The C compilers for most 64-bit platforms still define
:c:type:`int` as a 32-bit type, so that meant that lists could only hold up to
``2**31 - 1`` = 2147483647 items. (There are actually a few different
programming models that 64-bit C compilers can use -- see
http://www.unix.org/version2/whatsnew/lp64_wp.html for a discussion -- but the
most commonly available model leaves :c:type:`int` as 32 bits.)

A limit of 2147483647 items doesn't really matter on a 32-bit platform because
you'll run out of memory before hitting the length limit. Each list item
requires space for a pointer, which is 4 bytes, plus space for a
:c:type:`PyObject` representing the item. 2147483647\*4 is already more bytes
than a 32-bit address space can contain.

It's possible to address that much memory on a 64-bit platform, however. The
pointers for a list that size would only require 16 GiB of space, so it's not
unreasonable that Python programmers might construct lists that large.
Therefore, the Python interpreter had to be changed to use some type other than
:c:type:`int`, and this will be a 64-bit type on 64-bit platforms. The change
will cause incompatibilities on 64-bit machines, so it was deemed worth making
the transition now, while the number of 64-bit users is still relatively small.
(In 5 or 10 years, we may *all* be on 64-bit machines, and the transition would
be more painful then.)

This change most strongly affects authors of C extension modules. Python
strings and container types such as lists and tuples now use
:c:type:`Py_ssize_t` to store their size. Functions such as
:c:func:`PyList_Size` now return :c:type:`Py_ssize_t`. Code in extension modules
may therefore need to have some variables changed to :c:type:`Py_ssize_t`.

The :c:func:`PyArg_ParseTuple` and :c:func:`Py_BuildValue` functions have a new
conversion code, ``n``, for :c:type:`Py_ssize_t`. :c:func:`PyArg_ParseTuple`'s
``s#`` and ``t#`` still output :c:type:`int` by default, but you can define the
macro :c:macro:`PY_SSIZE_T_CLEAN` before including :file:`Python.h` to make
them return :c:type:`Py_ssize_t`.

:pep:`353` has a section on conversion guidelines that extension authors should
read to learn about supporting 64-bit platforms.


.. seealso::

   :pep:`353` - Using ssize_t as the index type
      PEP written and implemented by Martin von Löwis.

.. ======================================================================


.. _pep-357:

PEP 357: The '__index__' method
===============================

The NumPy developers had a problem that could only be solved by adding a new
special method, :meth:`__index__`. When using slice notation, as in
``[start:stop:step]``, the values of the *start*, *stop*, and *step* indexes
must all be either integers or long integers. NumPy defines a variety of
specialized integer types corresponding to unsigned and signed integers of 8,
16, 32, and 64 bits, but there was no way to signal that these types could be
used as slice indexes.
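
To make the problem concrete, here is a small sketch. The ``Int8`` class is a
hypothetical stand-in written for this illustration, not NumPy's actual type;
it defines only :meth:`__int__`, and the interpreter refuses to accept it as a
slice index:

```python
class Int8(object):
    """Hypothetical stand-in for a NumPy-style fixed-width integer."""
    def __init__(self, value):
        self.value = value

    def __int__(self):
        # Supports explicit conversion with int(), and nothing more.
        return self.value

letters = ['a', 'b', 'c', 'd']

try:
    result = letters[Int8(1):Int8(3)]
except TypeError:
    # Defining __int__ alone does not make an object usable as a slice index.
    result = None
```

Explicit conversion still works, so ``letters[int(Int8(1)):int(Int8(3))]`` is
accepted, but writing the conversion out at every slice is exactly the
inconvenience described above.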

Slicing can't just use the existing :meth:`__int__` method because that method
is also used to implement coercion to integers. If slicing used
:meth:`__int__`, floating-point numbers would also become legal slice indexes
and that's clearly an undesirable behaviour.

Instead, a new special method called :meth:`__index__` was added. It takes no
arguments and returns an integer giving the slice index to use. For example::

   class C:
       def __index__ (self):
           return self.value

The return value must be either a Python integer or long integer. The
interpreter will check that the type returned is correct, and raises a
:exc:`TypeError` if this requirement isn't met.

A corresponding :attr:`nb_index` slot was added to the C-level
:c:type:`PyNumberMethods` structure to let C extensions implement this protocol.
``PyNumber_Index(obj)`` can be used in extension code to call the
:meth:`__index__` function and retrieve its result.


.. seealso::

   :pep:`357` - Allowing Any Object to be Used for Slicing
      PEP written and implemented by Travis Oliphant.

.. ======================================================================


.. _other-lang:

Other Language Changes
======================

Here are all of the changes that Python 2.5 makes to the core Python language.

* The :class:`dict` type has a new hook for letting subclasses provide a default
  value when a key isn't contained in the dictionary. When a key isn't found, the
  dictionary's ``__missing__(key)`` method will be called. This hook is used
  to implement the new :class:`defaultdict` class in the :mod:`collections`
  module. The following example defines a dictionary that returns zero for any
  missing key::

     class zerodict (dict):
         def __missing__ (self, key):
             return 0

     d = zerodict({1:1, 2:2})
     print d[1], d[2]   # Prints 1 2
     print d[3], d[4]   # Prints 0 0

* Both 8-bit and Unicode strings have new ``partition(sep)`` and
  ``rpartition(sep)`` methods that simplify a common use case.

  The ``find(S)`` method is often used to get an index which is then used to
  slice the string and obtain the pieces that are before and after the separator.
  ``partition(sep)`` condenses this pattern into a single method call that
  returns a 3-tuple containing the substring before the separator, the separator
  itself, and the substring after the separator. If the separator isn't found,
  the first element of the tuple is the entire string and the other two elements
  are empty. ``rpartition(sep)`` also returns a 3-tuple but starts searching
  from the end of the string; the ``r`` stands for 'reverse'.

  Some examples::

     >>> ('http://www.python.org').partition('://')
     ('http', '://', 'www.python.org')
     >>> ('file:/usr/share/doc/index.html').partition('://')
     ('file:/usr/share/doc/index.html', '', '')
     >>> (u'Subject: a quick question').partition(':')
     (u'Subject', u':', u' a quick question')
     >>> 'www.python.org'.rpartition('.')
     ('www.python', '.', 'org')
     >>> 'www.python.org'.rpartition(':')
     ('', '', 'www.python.org')

  (Implemented by Fredrik Lundh following a suggestion by Raymond Hettinger.)

* The :meth:`startswith` and :meth:`endswith` methods of string types now accept
  tuples of strings to check for. ::

     def is_image_file (filename):
         return filename.endswith(('.gif', '.jpg', '.tiff'))

  (Implemented by Georg Brandl following a suggestion by Tom Lynn.)

  .. RFE #1491485

* The :func:`min` and :func:`max` built-in functions gained a ``key`` keyword
  parameter analogous to the ``key`` argument for :meth:`sort`. This parameter
  supplies a function that takes a single argument and is called for every value
  in the list; :func:`min`/:func:`max` will return the element with the
  smallest/largest return value from this function. For example, to find the
  longest string in a list, you can do::

     L = ['medium', 'longest', 'short']
     # Prints 'longest'
     print max(L, key=len)
     # Prints 'short', because lexicographically 'short' has the largest value
     print max(L)

  (Contributed by Steven Bethard and Raymond Hettinger.)

* Two new built-in functions, :func:`any` and :func:`all`, evaluate whether an
  iterator contains any true or false values. :func:`any` returns :const:`True`
  if any value returned by the iterator is true; otherwise it will return
  :const:`False`. :func:`all` returns :const:`True` only if all of the values
  returned by the iterator evaluate as true. (Suggested by Guido van Rossum, and
  implemented by Raymond Hettinger.)

* The result of a class's :meth:`__hash__` method can now be either a long
  integer or a regular integer. If a long integer is returned, the hash of that
  value is taken. In earlier versions the hash value was required to be a
  regular integer, but in 2.5 the :func:`id` built-in was changed to always
  return non-negative numbers, and users often seem to use ``id(self)`` in
  :meth:`__hash__` methods (though this is discouraged).

  .. Bug #1536021

* ASCII is now the default encoding for modules. It's now a syntax error if a
  module contains string literals with 8-bit characters but doesn't have an
  encoding declaration. In Python 2.4 this triggered a warning, not a syntax
  error. See :pep:`263` for how to declare a module's encoding; for example, you
  might add a line like this near the top of the source file::

     # -*- coding: latin1 -*-

* A new warning, :class:`UnicodeWarning`, is triggered when you attempt to
  compare a Unicode string and an 8-bit string that can't be converted to Unicode
  using the default ASCII encoding. The result of the comparison is false::

     >>> chr(128) == unichr(128)   # Can't convert chr(128) to Unicode
     __main__:1: UnicodeWarning: Unicode equal comparison failed
       to convert both arguments to Unicode - interpreting them
       as being unequal
     False
     >>> chr(127) == unichr(127)   # chr(127) can be converted
     True

  Previously this would raise a :class:`UnicodeDecodeError` exception, but in 2.5
  this could result in puzzling problems when accessing a dictionary. If you
  looked up ``unichr(128)`` and ``chr(128)`` was being used as a key, you'd get a
  :class:`UnicodeDecodeError` exception. Other changes in 2.5 resulted in this
  exception being raised instead of suppressed by the code in :file:`dictobject.c`
  that implements dictionaries.

  Raising an exception for such a comparison is strictly correct, but the change
  might have broken code, so instead :class:`UnicodeWarning` was introduced.

  (Implemented by Marc-André Lemburg.)

* One error that Python programmers sometimes make is forgetting to include an
  :file:`__init__.py` module in a package directory. Debugging this mistake can be
  confusing, and usually requires running Python with the :option:`-v` switch to
  log all the paths searched. In Python 2.5, a new :exc:`ImportWarning` warning is
  triggered when an import would have picked up a directory as a package but no
  :file:`__init__.py` was found. This warning is silently ignored by default;
  provide the :option:`-Wd <-W>` option when running the Python executable to display
  the warning message. (Implemented by Thomas Wouters.)

* The list of base classes in a class definition can now be empty. As an
  example, this is now legal::

     class C():
         pass

  (Implemented by Brett Cannon.)

.. ======================================================================


.. _25interactive:

Interactive Interpreter Changes
-------------------------------

In the interactive interpreter, ``quit`` and ``exit`` have long been strings so
that new users get a somewhat helpful message when they try to quit::

   >>> quit
   'Use Ctrl-D (i.e. EOF) to exit.'

In Python 2.5, ``quit`` and ``exit`` are now objects that still produce string
representations of themselves, but are also callable. Newbies who try ``quit()``
or ``exit()`` will now exit the interpreter as they expect. (Implemented by
Georg Brandl.)

The Python executable now accepts the standard long options :option:`--help`
and :option:`--version`; on Windows, it also accepts the :option:`/? <-?>` option
for displaying a help message. (Implemented by Georg Brandl.)

.. ======================================================================


.. _opts:

Optimizations
-------------

Several of the optimizations were developed at the NeedForSpeed sprint, an event
held in Reykjavik, Iceland, from May 21--28 2006. The sprint focused on speed
enhancements to the CPython implementation and was funded by EWT LLC with local
support from CCP Games. Those optimizations added at this sprint are specially
marked in the following list.

* When they were introduced in Python 2.4, the built-in :class:`set` and
  :class:`frozenset` types were built on top of Python's dictionary type. In 2.5
  the internal data structure has been customized for implementing sets, and as a
  result sets will use a third less memory and are somewhat faster. (Implemented
  by Raymond Hettinger.)

* The speed of some Unicode operations, such as finding substrings, string
  splitting, and character map encoding and decoding, has been improved.
  (Substring search and splitting improvements were added by Fredrik Lundh and
  Andrew Dalke at the NeedForSpeed sprint. Character maps were improved by Walter
  Dörwald and Martin von Löwis.)

  .. Patch 1313939, 1359618

* The ``long(str, base)`` function is now faster on long digit strings
  because fewer intermediate results are calculated. The peak is for strings of
  around 800--1000 digits where the function is 6 times faster. (Contributed by
  Alan McIntyre and committed at the NeedForSpeed sprint.)

  .. Patch 1442927

* It's now illegal to mix iterating over a file with ``for line in file`` and
  calling the file object's :meth:`read`/:meth:`readline`/:meth:`readlines`
  methods. Iteration uses an internal buffer and the :meth:`read\*` methods
  don't use that buffer. Instead they would return the data following the
  buffer, causing the data to appear out of order. Mixing iteration and these
  methods will now trigger a :exc:`ValueError` from the :meth:`read\*` method.
  (Implemented by Thomas Wouters.)

  .. Patch 1397960

* The :mod:`struct` module now compiles structure format strings into an
  internal representation and caches this representation, yielding a 20% speedup.
  (Contributed by Bob Ippolito at the NeedForSpeed sprint.)

* The :mod:`re` module got a 1 or 2% speedup by switching to Python's allocator
  functions instead of the system's :c:func:`malloc` and :c:func:`free`.
  (Contributed by Jack Diederich at the NeedForSpeed sprint.)

* The code generator's peephole optimizer now performs simple constant folding
  in expressions. If you write something like ``a = 2+3``, the code generator
  will do the arithmetic and produce code corresponding to ``a = 5``. (Proposed
  and implemented by Raymond Hettinger.)

* Function calls are now faster because code objects now keep the most recently
  finished frame (a "zombie frame") in an internal field of the code object,
  reusing it the next time the code object is invoked. (Original patch by Michael
  Hudson, modified by Armin Rigo and Richard Jones; committed at the NeedForSpeed
  sprint.) Frame objects are also slightly smaller, which may improve cache
  locality and reduce memory usage a bit. (Contributed by Neal Norwitz.)

  .. Patch 876206
  .. Patch 1337051

* Python's built-in exceptions are now new-style classes, a change that speeds
  up instantiation considerably. Exception handling in Python 2.5 is therefore
  about 30% faster than in 2.4. (Contributed by Richard Jones, Georg Brandl and
  Sean Reifschneider at the NeedForSpeed sprint.)

* Importing now caches the paths tried, recording whether they exist or not so
  that the interpreter makes fewer :c:func:`open` and :c:func:`stat` calls on
  startup. (Contributed by Martin von Löwis and Georg Brandl.)

  .. Patch 921466

.. ======================================================================


.. _25modules:

New, Improved, and Removed Modules
==================================

The standard library received many enhancements and bug fixes in Python 2.5.
Here's a partial list of the most notable changes, sorted alphabetically by
module name. Consult the :file:`Misc/NEWS` file in the source tree for a more
complete list of changes, or look through the SVN logs for all the details.

* The :mod:`audioop` module now supports the a-LAW encoding, and the code for
  u-LAW encoding has been improved. (Contributed by Lars Immisch.)

* The :mod:`codecs` module gained support for incremental codecs. The
  :func:`codecs.lookup` function now returns a :class:`CodecInfo` instance instead
  of a tuple. :class:`CodecInfo` instances behave like a 4-tuple to preserve
  backward compatibility but also have the attributes :attr:`encode`,
  :attr:`decode`, :attr:`incrementalencoder`, :attr:`incrementaldecoder`,
  :attr:`streamwriter`, and :attr:`streamreader`. Incremental codecs can receive
  input and produce output in multiple chunks; the output is the same as if the
  entire input was fed to the non-incremental codec. See the :mod:`codecs` module
  documentation for details. (Designed and implemented by Walter Dörwald.)

  .. Patch 1436130

* The :mod:`collections` module gained a new type, :class:`defaultdict`, that
  subclasses the standard :class:`dict` type. The new type mostly behaves like a
  dictionary but constructs a default value when a key isn't present,
  automatically adding it to the dictionary for the requested key value.

  The first argument to :class:`defaultdict`'s constructor is a factory function
  that gets called whenever a key is requested but not found. This factory
  function receives no arguments, so you can use built-in type constructors such
  as :func:`list` or :func:`int`. For example, you can make an index of words
  based on their initial letter like this::

     from collections import defaultdict

     words = """Nel mezzo del cammin di nostra vita
     mi ritrovai per una selva oscura
     che la diritta via era smarrita""".lower().split()

     index = defaultdict(list)

     for w in words:
         init_letter = w[0]
         index[init_letter].append(w)

  Printing ``index`` results in the following output::

     defaultdict(<type 'list'>, {'c': ['cammin', 'che'], 'e': ['era'],
             'd': ['del', 'di', 'diritta'], 'm': ['mezzo', 'mi'],
             'l': ['la'], 'o': ['oscura'], 'n': ['nel', 'nostra'],
             'p': ['per'], 's': ['selva', 'smarrita'],
             'r': ['ritrovai'], 'u': ['una'], 'v': ['vita', 'via']})

  (Contributed by Guido van Rossum.)

* The :class:`deque` double-ended queue type supplied by the :mod:`collections`
  module now has a ``remove(value)`` method that removes the first occurrence
  of *value* in the queue, raising :exc:`ValueError` if the value isn't found.
  (Contributed by Raymond Hettinger.)

* New module: The :mod:`contextlib` module contains helper functions for use
  with the new ':keyword:`with`' statement. See section :ref:`contextlibmod`
  for more about this module.

* New module: The :mod:`cProfile` module is a C implementation of the existing
  :mod:`profile` module that has much lower overhead. The module's interface is
  the same as :mod:`profile`: you run ``cProfile.run('main()')`` to profile a
  function, can save profile data to a file, etc. It's not yet known if the
  Hotshot profiler, which is also written in C but doesn't match the
  :mod:`profile` module's interface, will continue to be maintained in future
  versions of Python. (Contributed by Armin Rigo.)

  Also, the :mod:`pstats` module for analyzing the data measured by the profiler
  now supports directing the output to any file object by supplying a *stream*
  argument to the :class:`Stats` constructor. (Contributed by Skip Montanaro.)

* The :mod:`csv` module, which parses files in comma-separated value format,
  received several enhancements and a number of bugfixes. You can now set the
  maximum size in bytes of a field by calling the
  ``csv.field_size_limit(new_limit)`` function; omitting the *new_limit*
  argument will return the currently-set limit. The :class:`reader` class now has
  a :attr:`line_num` attribute that counts the number of physical lines read from
  the source; records can span multiple physical lines, so :attr:`line_num` is not
  the same as the number of records read.

  The CSV parser is now stricter about multi-line quoted fields. Previously, if a
  line ended within a quoted field without a terminating newline character, a
  newline would be inserted into the returned field. This behavior caused problems
  when reading files that contained carriage return characters within fields, so
  the code was changed to return the field without inserting newlines. As a
  consequence, if newlines embedded within fields are important, the input should
  be split into lines in a manner that preserves the newline characters.

  (Contributed by Skip Montanaro and Andrew McNamara.)

* The :class:`~datetime.datetime` class in the :mod:`datetime` module now has a
  ``strptime(string, format)`` method for parsing date strings, contributed
  by Josh Spoerri. It uses the same format characters as :func:`time.strptime` and
  :func:`time.strftime`::

     from datetime import datetime

     ts = datetime.strptime('10:13:15 2006-03-07',
                            '%H:%M:%S %Y-%m-%d')

* The :meth:`SequenceMatcher.get_matching_blocks` method in the :mod:`difflib`
  module now guarantees to return a minimal list of blocks describing matching
  subsequences. Previously, the algorithm would occasionally break a block of
  matching elements into two list entries. (Enhancement by Tim Peters.)

* The :mod:`doctest` module gained a ``SKIP`` option that keeps an example from
  being executed at all. This is intended for code snippets that are usage
  examples intended for the reader and aren't actually test cases.

  An *encoding* parameter was added to the :func:`testfile` function and the
  :class:`DocFileSuite` class to specify the file's encoding. This makes it
  easier to use non-ASCII characters in tests contained within a docstring.
  (Contributed by Bjorn Tillenius.)

  .. Patch 1080727

* The :mod:`email` package has been updated to version 4.0. (Contributed by
  Barry Warsaw.)

  .. XXX need to provide some more detail here

  .. index::
     single: universal newlines; What's new

* The :mod:`fileinput` module was made more flexible. Unicode filenames are now
  supported, and a *mode* parameter that defaults to ``"r"`` was added to the
  :func:`input` function to allow opening files in binary or :term:`universal
  newlines` mode. Another new parameter, *openhook*, lets you use a function
  other than :func:`open` to open the input files. Once you're iterating over
  the set of files, the :class:`FileInput` object's new :meth:`fileno` returns
  the file descriptor for the currently opened file. (Contributed by Georg
  Brandl.)

* In the :mod:`gc` module, the new :func:`get_count` function returns a 3-tuple
  containing the current collection counts for the three GC generations. This is
  accounting information for the garbage collector; when these counts reach a
  specified threshold, a garbage collection sweep will be made. The existing
  :func:`gc.collect` function now takes an optional *generation* argument of 0, 1,
  or 2 to specify which generation to collect. (Contributed by Barry Warsaw.)

* The :func:`nsmallest` and :func:`nlargest` functions in the :mod:`heapq`
  module now support a ``key`` keyword parameter similar to the one provided by
  the :func:`min`/:func:`max` functions and the :meth:`sort` methods. For
  example::

     >>> import heapq
     >>> L = ["short", 'medium', 'longest', 'longer still']
     >>> heapq.nsmallest(2, L)  # Return two lowest elements, lexicographically
     ['longer still', 'longest']
     >>> heapq.nsmallest(2, L, key=len)   # Return two shortest elements
     ['short', 'medium']

  (Contributed by Raymond Hettinger.)

* The :func:`itertools.islice` function now accepts ``None`` for the start and
  step arguments. This makes it more compatible with the attributes of slice
  objects, so that you can now write the following::

     s = slice(5)     # Create slice object
     itertools.islice(iterable, s.start, s.stop, s.step)

  (Contributed by Raymond Hettinger.)

* The :func:`format` function in the :mod:`locale` module has been modified and
  two new functions were added, :func:`format_string` and :func:`currency`.

  The :func:`format` function's *format* parameter could previously be a string
  containing surrounding text as long as no more than one %char specifier
  appeared; now the parameter must be exactly one %char specifier with no
  surrounding text. An optional *monetary* parameter was also added which, if
  ``True``, will use the locale's rules for formatting currency in placing a
  separator between groups of three digits.

  To format strings with multiple %char specifiers, use the new
  :func:`format_string` function that works like :func:`format` but also supports
  mixing %char specifiers with arbitrary text.

  A new :func:`currency` function was also added that formats a number according
  to the current locale's settings.

  (Contributed by Georg Brandl.)

  .. Patch 1180296

* The :mod:`mailbox` module underwent a massive rewrite to add the capability to
  modify mailboxes in addition to reading them. A new set of classes that include
  :class:`mbox`, :class:`MH`, and :class:`Maildir` are used to read mailboxes, and
  have an ``add(message)`` method to add messages, ``remove(key)`` to
  remove messages, and :meth:`lock`/:meth:`unlock` to lock/unlock the mailbox.
  The following example converts a maildir-format mailbox into an mbox-format
  one::

     import mailbox

     # 'factory=None' uses email.Message.Message as the class representing
     # individual messages.
     src = mailbox.Maildir('maildir', factory=None)
     dest = mailbox.mbox('/tmp/mbox')

     for msg in src:
         dest.add(msg)

  (Contributed by Gregory K. Johnson. Funding was provided by Google's 2005
  Summer of Code.)

* New module: the :mod:`msilib` module allows creating Microsoft Installer
  :file:`.msi` files and CAB files. Some support for reading the :file:`.msi`
  database is also included. (Contributed by Martin von Löwis.)

* The :mod:`nis` module now supports accessing domains other than the system
  default domain by supplying a *domain* argument to the :func:`nis.match` and
  :func:`nis.maps` functions. (Contributed by Ben Bell.)
1432 1433* The :mod:`operator` module's :func:`itemgetter` and :func:`attrgetter` 1434 functions now support multiple fields. A call such as 1435 ``operator.attrgetter('a', 'b')`` will return a function that retrieves the 1436 :attr:`a` and :attr:`b` attributes. Combining this new feature with the 1437 :meth:`sort` method's ``key`` parameter lets you easily sort lists using 1438 multiple fields. (Contributed by Raymond Hettinger.) 1439 1440* The :mod:`optparse` module was updated to version 1.5.1 of the Optik library. 1441 The :class:`OptionParser` class gained an :attr:`epilog` attribute, a string 1442 that will be printed after the help message, and a :meth:`destroy` method to 1443 break reference cycles created by the object. (Contributed by Greg Ward.) 1444 1445* The :mod:`os` module underwent several changes. The :attr:`stat_float_times` 1446 variable now defaults to true, meaning that :func:`os.stat` will now return time 1447 values as floats. (This doesn't necessarily mean that :func:`os.stat` will 1448 return times that are precise to fractions of a second; not all systems support 1449 such precision.) 1450 1451 Constants named :attr:`os.SEEK_SET`, :attr:`os.SEEK_CUR`, and 1452 :attr:`os.SEEK_END` have been added; these are the parameters to the 1453 :func:`os.lseek` function. Two new constants for locking are 1454 :attr:`os.O_SHLOCK` and :attr:`os.O_EXLOCK`. 1455 1456 Two new functions, :func:`wait3` and :func:`wait4`, were added. They're similar 1457 the :func:`waitpid` function which waits for a child process to exit and returns 1458 a tuple of the process ID and its exit status, but :func:`wait3` and 1459 :func:`wait4` return additional information. :func:`wait3` doesn't take a 1460 process ID as input, so it waits for any child process to exit and returns a 1461 3-tuple of *process-id*, *exit-status*, *resource-usage* as returned from the 1462 :func:`resource.getrusage` function. ``wait4(pid)`` does take a process ID. 1463 (Contributed by Chad J. 
  Schroeder.)

  On FreeBSD, the :func:`os.stat` function now returns times with nanosecond
  resolution, and the returned object now has :attr:`st_gen` and
  :attr:`st_birthtime`. The :attr:`st_flags` attribute is also available, if the
  platform supports it. (Contributed by Antti Louko and Diego Pettenò.)

  .. (Patch 1180695, 1212117)

* The Python debugger provided by the :mod:`pdb` module can now store lists of
  commands to execute when a breakpoint is reached and execution stops. Once
  breakpoint #1 has been created, enter ``commands 1`` and enter a series of
  commands to be executed, finishing the list with ``end``. The command list can
  include commands that resume execution, such as ``continue`` or ``next``.
  (Contributed by Grégoire Dooms.)

  .. Patch 790710

* The :mod:`pickle` and :mod:`cPickle` modules no longer accept a return value
  of ``None`` from the :meth:`__reduce__` method; the method must return a tuple
  of arguments instead. The ability to return ``None`` was deprecated in Python
  2.4, so this completes the removal of the feature.

* The :mod:`pkgutil` module, containing various utility functions for finding
  packages, was enhanced to support PEP 302's import hooks and now also works for
  packages stored in ZIP-format archives. (Contributed by Phillip J. Eby.)

* The pybench benchmark suite by Marc-André Lemburg is now included in the
  :file:`Tools/pybench` directory. The pybench suite is an improvement on the
  commonly used :file:`pystone.py` program because pybench provides a more
  detailed measurement of the interpreter's speed. It times particular operations
  such as function calls, tuple slicing, method lookups, and numeric operations,
  instead of performing many different operations and reducing the result to a
  single number as :file:`pystone.py` does.

* The :mod:`pyexpat` module now uses version 2.0 of the Expat parser.
  (Contributed by Trent Mick.)

* The :class:`~queue.Queue` class provided by the :mod:`Queue` module gained two new
  methods. :meth:`join` blocks until all items in the queue have been retrieved
  and all processing work on the items has been completed. Worker threads call
  the other new method, :meth:`task_done`, to signal that processing for an item
  has been completed. (Contributed by Raymond Hettinger.)

* The old :mod:`regex` and :mod:`regsub` modules, which have been deprecated
  ever since Python 2.0, have finally been deleted. Other deleted modules:
  :mod:`statcache`, :mod:`tzparse`, :mod:`whrandom`.

* Also deleted: the :file:`lib-old` directory, which included ancient modules
  such as :mod:`dircmp` and :mod:`ni`. :file:`lib-old` wasn't on the
  default ``sys.path``, so unless your programs explicitly added the directory to
  ``sys.path``, this removal shouldn't affect your code.

* The :mod:`rlcompleter` module is no longer dependent on importing the
  :mod:`readline` module and therefore now works on non-Unix platforms. (Patch
  from Robert Kiendl.)

  .. Patch #1472854

* The :mod:`SimpleXMLRPCServer` and :mod:`DocXMLRPCServer` classes now have a
  :attr:`rpc_paths` attribute that constrains XML-RPC operations to a limited set
  of URL paths; the default is to allow only ``'/'`` and ``'/RPC2'``. Setting
  :attr:`rpc_paths` to ``None`` or an empty tuple disables this path checking.

  .. Bug #1473048

* The :mod:`socket` module now supports :const:`AF_NETLINK` sockets on Linux,
  thanks to a patch from Philippe Biondi. Netlink sockets are a Linux-specific
  mechanism for communications between a user-space process and kernel code; an
  introductory article about them is at https://www.linuxjournal.com/article/7356.
  In Python code, netlink addresses are represented as a tuple of 2 integers,
  ``(pid, group_mask)``.

  Two new methods on socket objects, ``recv_into(buffer)`` and
  ``recvfrom_into(buffer)``, store the received data in an object that
  supports the buffer protocol instead of returning the data as a string. This
  means you can put the data directly into an array or a memory-mapped file.

  Socket objects also gained :meth:`getfamily`, :meth:`gettype`, and
  :meth:`getproto` accessor methods to retrieve the family, type, and protocol
  values for the socket.

* New module: the :mod:`spwd` module provides functions for accessing the shadow
  password database on systems that support shadow passwords.

* The :mod:`struct` module is now faster because it compiles format strings into
  :class:`Struct` objects with :meth:`pack` and :meth:`unpack` methods. This is
  similar to how the :mod:`re` module lets you create compiled regular expression
  objects. You can still use the module-level :func:`pack` and :func:`unpack`
  functions; they'll create :class:`Struct` objects and cache them. Or you can
  use :class:`Struct` instances directly::

     import struct

     s = struct.Struct('ih3s')

     data = s.pack(1972, 187, 'abc')
     year, number, name = s.unpack(data)

  You can also pack and unpack data to and from buffer objects directly using the
  ``pack_into(buffer, offset, v1, v2, ...)`` and ``unpack_from(buffer,
  offset)`` methods. This lets you store data directly into an array or a
  memory-mapped file.

  (:class:`Struct` objects were implemented by Bob Ippolito at the NeedForSpeed
  sprint. Support for buffer objects was added by Martin Blais, also at the
  NeedForSpeed sprint.)

* The Python developers switched from CVS to Subversion during the 2.5
  development process.
  Information about the exact build version is available as
  the ``sys.subversion`` variable, a 3-tuple of ``(interpreter-name, branch-name,
  revision-range)``. For example, at the time of writing my copy of 2.5 was
  reporting ``('CPython', 'trunk', '45313:45315')``.

  This information is also available to C extensions via the
  :c:func:`Py_GetBuildInfo` function that returns a string of build information
  like this: ``"trunk:45355:45356M, Apr 13 2006, 07:42:19"``. (Contributed by
  Barry Warsaw.)

* Another new function, :func:`sys._current_frames`, returns the current stack
  frames for all running threads as a dictionary mapping thread identifiers to the
  topmost stack frame currently active in that thread at the time the function is
  called. (Contributed by Tim Peters.)

* The :class:`TarFile` class in the :mod:`tarfile` module now has an
  :meth:`extractall` method that extracts all members from the archive into the
  current working directory. It's also possible to set a different directory as
  the extraction target, and to unpack only a subset of the archive's members.

  The compression used for a tarfile opened in stream mode can now be autodetected
  using the mode ``'r|*'``. (Contributed by Lars Gustäbel.)

  .. patch 918101

* The :mod:`threading` module now lets you set the stack size used when new
  threads are created. The ``stack_size([*size*])`` function returns the
  currently configured stack size, and supplying the optional *size* parameter
  sets a new value. Not all platforms support changing the stack size, but
  Windows, POSIX threading, and OS/2 all do. (Contributed by Andrew MacIntyre.)

  .. Patch 1454481

* The :mod:`unicodedata` module has been updated to use version 4.1.0 of the
  Unicode character database.
  Version 3.2.0 is required by some specifications,
  so it's still available as :attr:`unicodedata.ucd_3_2_0`.

* New module: the :mod:`uuid` module generates universally unique identifiers
  (UUIDs) according to :rfc:`4122`. The RFC defines several different UUID
  versions that are generated from a starting string, from system properties, or
  purely randomly. This module contains a :class:`UUID` class and functions
  named :func:`uuid1`, :func:`uuid3`, :func:`uuid4`, and :func:`uuid5` to
  generate different versions of UUID. (Version 2 UUIDs are not specified in
  :rfc:`4122` and are not supported by this module.) ::

     >>> import uuid
     >>> # make a UUID based on the host ID and current time
     >>> uuid.uuid1()
     UUID('a8098c1a-f86e-11da-bd1a-00112444be1e')

     >>> # make a UUID using an MD5 hash of a namespace UUID and a name
     >>> uuid.uuid3(uuid.NAMESPACE_DNS, 'python.org')
     UUID('6fa459ea-ee8a-3ca4-894e-db77e160355e')

     >>> # make a random UUID
     >>> uuid.uuid4()
     UUID('16fd2706-8baf-433b-82eb-8c7fada847da')

     >>> # make a UUID using a SHA-1 hash of a namespace UUID and a name
     >>> uuid.uuid5(uuid.NAMESPACE_DNS, 'python.org')
     UUID('886313e1-3b8a-5372-9b90-0c9aee199e5d')

  (Contributed by Ka-Ping Yee.)

* The :mod:`weakref` module's :class:`WeakKeyDictionary` and
  :class:`WeakValueDictionary` types gained new methods for iterating over the
  weak references contained in the dictionary. :meth:`iterkeyrefs` and
  :meth:`keyrefs` methods were added to :class:`WeakKeyDictionary`, and
  :meth:`itervaluerefs` and :meth:`valuerefs` were added to
  :class:`WeakValueDictionary`. (Contributed by Fred L. Drake, Jr.)

* The :mod:`webbrowser` module received a number of enhancements.
  It's now
  usable as a script with ``python -m webbrowser``, taking a URL as the argument;
  there are a number of switches to control the behaviour (:option:`!-n` for a new
  browser window, :option:`!-t` for a new tab). New module-level functions,
  :func:`open_new` and :func:`open_new_tab`, were added to support this. The
  module's :func:`open` function supports an additional feature, an *autoraise*
  parameter that signals whether to raise the open window when possible. A number
  of additional browsers were added to the supported list such as Firefox, Opera,
  Konqueror, and elinks. (Contributed by Oleg Broytmann and Georg Brandl.)

  .. Patch #754022

* The :mod:`xmlrpclib` module now supports returning :class:`~datetime.datetime` objects
  for the XML-RPC date type. Supply ``use_datetime=True`` to the :func:`loads`
  function or the :class:`Unmarshaller` class to enable this feature. (Contributed
  by Skip Montanaro.)

  .. Patch 1120353

* The :mod:`zipfile` module now supports the ZIP64 version of the format,
  meaning that a .zip archive can now be larger than 4 GiB and can contain
  individual files larger than 4 GiB. (Contributed by Ronald Oussoren.)

  .. Patch 1446489

* The :mod:`zlib` module's :class:`Compress` and :class:`Decompress` objects now
  support a :meth:`copy` method that makes a copy of the object's internal state
  and returns a new :class:`Compress` or :class:`Decompress` object.
  (Contributed by Chris AtLee.)

  .. Patch 1435422

.. ======================================================================


.. _module-ctypes:

The ctypes package
------------------

The :mod:`ctypes` package, written by Thomas Heller, has been added to the
standard library. :mod:`ctypes` lets you call arbitrary functions in shared
libraries or DLLs.
Long-time users may remember the :mod:`dl` module, which
provides functions for loading shared libraries and calling functions in them.
The :mod:`ctypes` package is much fancier.

To load a shared library or DLL, you must create an instance of the
:class:`CDLL` class and provide the name or path of the shared library or DLL.
Once that's done, you can call arbitrary functions by accessing them as
attributes of the :class:`CDLL` object. ::

   import ctypes

   libc = ctypes.CDLL('libc.so.6')
   result = libc.printf("Line of output\n")

Type constructors for the various C types are provided: :func:`c_int`,
:func:`c_float`, :func:`c_double`, :func:`c_char_p` (equivalent to :c:type:`char
\*`), and so forth. Unlike Python's types, the C versions are all mutable; you
can assign to their :attr:`value` attribute to change the wrapped value. Python
integers and strings will be automatically converted to the corresponding C
types, but for other types you must call the correct type constructor. (And I
mean *must*; getting it wrong will often result in the interpreter crashing
with a segmentation fault.)

You shouldn't use :func:`c_char_p` with a Python string when the C function will
be modifying the memory area, because Python strings are supposed to be
immutable; breaking this rule will cause puzzling bugs. When you need a
modifiable memory area, use :func:`create_string_buffer`::

   s = "this is a string"
   buf = ctypes.create_string_buffer(s)
   libc.strfry(buf)

C functions are assumed to return integers, but you can set the :attr:`restype`
attribute of the function object to change this::

   >>> libc.atof('2.71828')
   -1783957616
   >>> libc.atof.restype = ctypes.c_double
   >>> libc.atof('2.71828')
   2.71828

:mod:`ctypes` also provides a wrapper for Python's C API as the
``ctypes.pythonapi`` object.
This object does *not* release the global
interpreter lock before calling a function, because the lock must be held when
calling into the interpreter's code. There's a :class:`py_object()` type
constructor that will create a :c:type:`PyObject \*` pointer. A simple usage::

   import ctypes

   d = {}
   ctypes.pythonapi.PyObject_SetItem(ctypes.py_object(d),
             ctypes.py_object("abc"), ctypes.py_object(1))
   # d is now {'abc': 1}.

Don't forget to use :class:`py_object()`; if it's omitted you end up with a
segmentation fault.

:mod:`ctypes` has been around for a while, but people still write and
distribute hand-coded extension modules because you can't rely on
:mod:`ctypes` being present. Perhaps developers will begin to write Python
wrappers atop a library accessed through :mod:`ctypes` instead of extension
modules, now that :mod:`ctypes` is included with core Python.


.. seealso::

   http://starship.python.net/crew/theller/ctypes/
      The ctypes web page, with a tutorial, reference, and FAQ.

   The documentation for the :mod:`ctypes` module.

.. ======================================================================


.. _module-etree:

The ElementTree package
-----------------------

A subset of Fredrik Lundh's ElementTree library for processing XML has been
added to the standard library as :mod:`xml.etree`. The available modules are
:mod:`ElementTree`, :mod:`ElementPath`, and :mod:`ElementInclude` from
ElementTree 1.2.6. The :mod:`cElementTree` accelerator module is also
included.

The rest of this section will provide a brief overview of using ElementTree.
Full documentation for ElementTree is available at
http://effbot.org/zone/element-index.htm.

ElementTree represents an XML document as a tree of element nodes.
The text
content of the document is stored as the :attr:`text` and :attr:`tail`
attributes of each element. (This is one of the major differences between
ElementTree and the Document Object Model; in the DOM there are many different
types of node, including :class:`TextNode`.)

The most commonly used parsing function is :func:`parse`, which takes either a
string (assumed to contain a filename) or a file-like object and returns an
:class:`ElementTree` instance::

   import urllib
   from xml.etree import ElementTree as ET

   tree = ET.parse('ex-1.xml')

   feed = urllib.urlopen(
                 'http://planet.python.org/rss10.xml')
   tree = ET.parse(feed)

Once you have an :class:`ElementTree` instance, you can call its :meth:`getroot`
method to get the root :class:`Element` node.

There's also an :func:`XML` function that takes a string literal and returns an
:class:`Element` node (not an :class:`ElementTree`). This function provides a
tidy way to incorporate XML fragments, approaching the convenience of an XML
literal::

   svg = ET.XML("""<svg width="10px" version="1.0">
                </svg>""")
   svg.set('height', '320px')
   svg.append(elem1)

Each XML element supports some dictionary-like and some list-like access
methods. Dictionary-like operations are used to access attribute values, and
list-like operations are used to access child nodes.

+-------------------------------+--------------------------------------------+
| Operation                     | Result                                     |
+===============================+============================================+
| ``elem[n]``                   | Returns n'th child element.                |
+-------------------------------+--------------------------------------------+
| ``elem[m:n]``                 | Returns list of m'th through n'th child    |
|                               | elements.                                  |
+-------------------------------+--------------------------------------------+
| ``len(elem)``                 | Returns number of child elements.          |
+-------------------------------+--------------------------------------------+
| ``list(elem)``                | Returns list of child elements.            |
+-------------------------------+--------------------------------------------+
| ``elem.append(elem2)``        | Adds *elem2* as a child.                   |
+-------------------------------+--------------------------------------------+
| ``elem.insert(index, elem2)`` | Inserts *elem2* at the specified location. |
+-------------------------------+--------------------------------------------+
| ``del elem[n]``               | Deletes n'th child element.                |
+-------------------------------+--------------------------------------------+
| ``elem.keys()``               | Returns list of attribute names.           |
+-------------------------------+--------------------------------------------+
| ``elem.get(name)``            | Returns value of attribute *name*.         |
+-------------------------------+--------------------------------------------+
| ``elem.set(name, value)``     | Sets new value for attribute *name*.       |
+-------------------------------+--------------------------------------------+
| ``elem.attrib``               | Retrieves the dictionary containing        |
|                               | attributes.                                |
+-------------------------------+--------------------------------------------+
| ``del elem.attrib[name]``     | Deletes attribute *name*.                  |
+-------------------------------+--------------------------------------------+

Comments and processing instructions are also represented as :class:`Element`
nodes. To check if a node is a comment or a processing instruction::

   if elem.tag is ET.Comment:
       ...
   elif elem.tag is ET.ProcessingInstruction:
       ...

To generate XML output, you should call the :meth:`ElementTree.write` method.
Like :func:`parse`, it can take either a string or a file-like object::

   # Encoding is US-ASCII
   tree.write('output.xml')

   # Encoding is UTF-8
   f = open('output.xml', 'w')
   tree.write(f, encoding='utf-8')

(Caution: the default encoding used for output is ASCII. For general XML work,
where an element's name may contain arbitrary Unicode characters, ASCII isn't a
very useful encoding because it will raise an exception if an element's name
contains any characters with values greater than 127. Therefore, it's best to
specify a different encoding such as UTF-8 that can handle any Unicode
character.)

This section is only a partial description of the ElementTree interfaces. Please
read the package's official documentation for more details.


.. seealso::

   http://effbot.org/zone/element-index.htm
      Official documentation for ElementTree.

.. ======================================================================


.. _module-hashlib:

The hashlib package
-------------------

A new :mod:`hashlib` module, written by Gregory P. Smith, has been added to
replace the :mod:`md5` and :mod:`sha` modules. :mod:`hashlib` adds support for
additional secure hashes (SHA-224, SHA-256, SHA-384, and SHA-512). When
available, the module uses OpenSSL for fast platform-optimized implementations
of algorithms.

The old :mod:`md5` and :mod:`sha` modules still exist as wrappers around hashlib
to preserve backwards compatibility. The new module's interface is very close
to that of the old modules, but not identical. The most significant difference
is that the constructor functions for creating new hashing objects are named
differently. ::

   # Old versions
   h = md5.md5()
   h = md5.new()

   # New version
   h = hashlib.md5()

   # Old versions
   h = sha.sha()
   h = sha.new()

   # New version
   h = hashlib.sha1()

   # Hashes that weren't previously available
   h = hashlib.sha224()
   h = hashlib.sha256()
   h = hashlib.sha384()
   h = hashlib.sha512()

   # Alternative form
   h = hashlib.new('md5')          # Provide algorithm as a string

Once a hash object has been created, its methods are the same as before:
``update(string)`` hashes the specified string into the current digest
state, :meth:`digest` and :meth:`hexdigest` return the digest value as a binary
string or a string of hex digits, and :meth:`copy` returns a new hashing object
with the same digest state.


.. seealso::

   The documentation for the :mod:`hashlib` module.

.. ======================================================================


.. _module-sqlite:

The sqlite3 package
-------------------

The pysqlite module (http://www.pysqlite.org), a wrapper for the SQLite embedded
database, has been added to the standard library under the package name
:mod:`sqlite3`.

SQLite is a C library that provides a lightweight disk-based database that
doesn't require a separate server process and allows accessing the database
using a nonstandard variant of the SQL query language. Some applications can use
SQLite for internal data storage. It's also possible to prototype an
application using SQLite and then port the code to a larger database such as
PostgreSQL or Oracle.

pysqlite was written by Gerhard Häring and provides a SQL interface compliant
with the DB-API 2.0 specification described by :pep:`249`.

If you're compiling the Python source yourself, note that the source tree
doesn't include the SQLite code, only the wrapper module.
You'll need to have
the SQLite libraries and headers installed before compiling Python, and the
build process will compile the module when the necessary headers are available.

To use the module, you must first create a :class:`Connection` object that
represents the database. Here the data will be stored in the
:file:`/tmp/example` file::

   import sqlite3
   conn = sqlite3.connect('/tmp/example')

You can also supply the special name ``:memory:`` to create a database in RAM.

Once you have a :class:`Connection`, you can create a :class:`Cursor` object
and call its :meth:`execute` method to perform SQL commands::

   c = conn.cursor()

   # Create table
   c.execute('''create table stocks
   (date text, trans text, symbol text,
    qty real, price real)''')

   # Insert a row of data
   c.execute("""insert into stocks
             values ('2006-01-05','BUY','RHAT',100,35.14)""")

Usually your SQL operations will need to use values from Python variables. You
shouldn't assemble your query using Python's string operations because doing so
is insecure; it makes your program vulnerable to an SQL injection attack.

Instead, use the DB-API's parameter substitution. Put ``?`` as a placeholder
wherever you want to use a value, and then provide a tuple of values as the
second argument to the cursor's :meth:`execute` method. (Other database modules
may use a different placeholder, such as ``%s`` or ``:1``.) For example::

   # Never do this -- insecure!
   symbol = 'IBM'
   c.execute("... where symbol = '%s'" % symbol)

   # Do this instead
   t = (symbol,)
   c.execute('select * from stocks where symbol=?', t)

   # Larger example
   for t in (('2006-03-28', 'BUY', 'IBM', 1000, 45.00),
             ('2006-04-05', 'BUY', 'MSOFT', 1000, 72.00),
             ('2006-04-06', 'SELL', 'IBM', 500, 53.00),
            ):
       c.execute('insert into stocks values (?,?,?,?,?)', t)

To retrieve data after executing a SELECT statement, you can either treat the
cursor as an iterator, call the cursor's :meth:`fetchone` method to retrieve a
single matching row, or call :meth:`fetchall` to get a list of the matching
rows.

This example uses the iterator form::

   >>> c = conn.cursor()
   >>> c.execute('select * from stocks order by price')
   >>> for row in c:
   ...    print row
   ...
   (u'2006-01-05', u'BUY', u'RHAT', 100, 35.140000000000001)
   (u'2006-03-28', u'BUY', u'IBM', 1000, 45.0)
   (u'2006-04-06', u'SELL', u'IBM', 500, 53.0)
   (u'2006-04-05', u'BUY', u'MSOFT', 1000, 72.0)
   >>>

For more information about the SQL dialect supported by SQLite, see
https://www.sqlite.org.


.. seealso::

   http://www.pysqlite.org
      The pysqlite web page.

   https://www.sqlite.org
      The SQLite web page; the documentation describes the syntax and the available
      data types for the supported SQL dialect.

   The documentation for the :mod:`sqlite3` module.

   :pep:`249` - Database API Specification 2.0
      PEP written by Marc-André Lemburg.

.. ======================================================================


.. _module-wsgiref:

The wsgiref package
-------------------

The Web Server Gateway Interface (WSGI) v1.0 defines a standard interface
between web servers and Python web applications and is described in :pep:`333`.
The :mod:`wsgiref` package is a reference implementation of the WSGI
specification.
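
As a rough illustration of what such an application looks like, here is a
minimal sketch of a WSGI callable in the shape :pep:`333` describes: it receives
an *environ* dictionary and a *start_response* callable and returns an iterable
of body strings. The names ``hello_app`` and ``call_app`` are made up for this
example, and the code is written for a current Python interpreter (the ``b''``
literals are not available in 2.5 itself, where plain strings would be used).

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """Hypothetical minimal WSGI application: plain text for every request."""
    status = '200 OK'
    headers = [('Content-Type', 'text/plain')]
    start_response(status, headers)
    # The response body is returned as an iterable of byte strings.
    return [b'Hello, WSGI!\n']


def call_app(app):
    """Hypothetical helper: invoke *app* once, without opening a socket."""
    environ = {}
    setup_testing_defaults(environ)   # fill in the required CGI/WSGI keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        # Record what the application reports instead of sending it anywhere.
        captured['status'] = status
        captured['headers'] = headers

    body = b''.join(app(environ, start_response))
    return captured['status'], captured['headers'], body


status, headers, body = call_app(hello_app)
```

A callable such as ``hello_app`` is the kind of object that
``simple_server.make_server()`` expects as its application argument.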

.. XXX should this be in a PEP 333 section instead?

The package includes a basic HTTP server that will run a WSGI application; this
server is useful for debugging but isn't intended for production use. Setting
up a server takes only a few lines of code::

   from wsgiref import simple_server

   wsgi_app = ...

   host = ''
   port = 8000
   httpd = simple_server.make_server(host, port, wsgi_app)
   httpd.serve_forever()

.. XXX discuss structure of WSGI applications?
.. XXX provide an example using Django or some other framework?


.. seealso::

   http://www.wsgi.org
      A central web site for WSGI-related resources.

   :pep:`333` - Python Web Server Gateway Interface v1.0
      PEP written by Phillip J. Eby.

.. ======================================================================


.. _build-api:

Build and C API Changes
=======================

Changes to Python's build process and to the C API include:

* The Python source tree was converted from CVS to Subversion, in a complex
  migration procedure that was supervised and flawlessly carried out by Martin von
  Löwis. The procedure was developed as :pep:`347`.

* Coverity, a company that markets a source code analysis tool called Prevent,
  provided the results of their examination of the Python source code. The
  analysis found about 60 bugs that were quickly fixed. Many of the bugs were
  refcounting problems, often occurring in error-handling code. See
  https://scan.coverity.com for the statistics.

* The largest change to the C API came from :pep:`353`, which modifies the
  interpreter to use a :c:type:`Py_ssize_t` type definition instead of
  :c:type:`int`. See the earlier section :ref:`pep-353` for a discussion of this
  change.

* The design of the bytecode compiler has changed a great deal, no longer
  generating bytecode by traversing the parse tree. Instead the parse tree is
  converted to an abstract syntax tree (or AST), and it is the abstract syntax
  tree that's traversed to produce the bytecode.

  It's possible for Python code to obtain AST objects by using the
  :func:`compile` built-in and specifying ``_ast.PyCF_ONLY_AST`` as the value of
  the *flags* parameter::

     from _ast import PyCF_ONLY_AST
     ast = compile("""a=0
     for i in range(10):
         a += i
     """, "<string>", 'exec', PyCF_ONLY_AST)

     assignment = ast.body[0]
     for_loop = ast.body[1]

  No official documentation has been written for the AST code yet, but :pep:`339`
  discusses the design. To start learning about the code, read the definition of
  the various AST nodes in :file:`Parser/Python.asdl`. A Python script reads this
  file and generates a set of C structure definitions in
  :file:`Include/Python-ast.h`. The :c:func:`PyParser_ASTFromString` and
  :c:func:`PyParser_ASTFromFile` functions, defined in
  :file:`Include/pythonrun.h`, take Python source as input and return the root of
  an AST representing the contents. This AST can then be turned into a code
  object by :c:func:`PyAST_Compile`. For more information, read the source code,
  and then ask questions on python-dev.

  The AST code was developed under Jeremy Hylton's management, and implemented by
  (in alphabetical order) Brett Cannon, Nick Coghlan, Grant Edwards, John
  Ehresman, Kurt Kaiser, Neal Norwitz, Tim Peters, Armin Rigo, and Neil
  Schemenauer, plus the participants in a number of AST sprints at conferences
  such as PyCon.

  .. List of names taken from Jeremy's python-dev post at
  .. https://mail.python.org/pipermail/python-dev/2005-October/057500.html

* Evan Jones's patch to obmalloc, first described in a talk at PyCon DC 2005,
  was applied. Python 2.4 allocated small objects in 256K-sized arenas, but never
  freed arenas. With this patch, Python will free arenas when they're empty. The
  net effect is that on some platforms, when you allocate many objects, Python's
  memory usage may actually drop when you delete them and the memory may be
  returned to the operating system. (Implemented by Evan Jones, and reworked by
  Tim Peters.)

  Note that this change means extension modules must be more careful when
  allocating memory. Python's API has many different functions for allocating
  memory that are grouped into families. For example, :c:func:`PyMem_Malloc`,
  :c:func:`PyMem_Realloc`, and :c:func:`PyMem_Free` are one family that allocates
  raw memory, while :c:func:`PyObject_Malloc`, :c:func:`PyObject_Realloc`, and
  :c:func:`PyObject_Free` are another family that's supposed to be used for
  creating Python objects.

  Previously these different families all reduced to the platform's
  :c:func:`malloc` and :c:func:`free` functions. This meant it didn't matter if
  you got things wrong and allocated memory with the :c:func:`PyMem` function but
  freed it with the :c:func:`PyObject` function. With 2.5's changes to obmalloc,
  these families now do different things and mismatches will probably result in a
  segfault. You should carefully test your C extension modules with Python 2.5.

* The built-in set types now have an official C API. Call :c:func:`PySet_New`
  and :c:func:`PyFrozenSet_New` to create a new set, :c:func:`PySet_Add` and
  :c:func:`PySet_Discard` to add and remove elements, and :c:func:`PySet_Contains`
  and :c:func:`PySet_Size` to examine the set's state. (Contributed by Raymond
  Hettinger.)

* C code can now obtain information about the exact revision of the Python
  interpreter by calling the :c:func:`Py_GetBuildInfo` function, which returns a
  string of build information like this: ``"trunk:45355:45356M, Apr 13 2006,
  07:42:19"``. (Contributed by Barry Warsaw.)

* Two new macros can be used to indicate C functions that are local to the
  current file so that a faster calling convention can be used.
  ``Py_LOCAL(type)`` declares the function as returning a value of the
  specified *type* and uses a fast-calling qualifier.
  ``Py_LOCAL_INLINE(type)`` does the same thing and also requests the
  function be inlined. If :c:func:`PY_LOCAL_AGGRESSIVE` is defined before
  :file:`python.h` is included, a set of more aggressive optimizations is enabled
  for the module; you should benchmark the results to find out if these
  optimizations actually make the code faster. (Contributed by Fredrik Lundh at
  the NeedForSpeed sprint.)

* ``PyErr_NewException(name, base, dict)`` can now accept a tuple of base
  classes as its *base* argument. (Contributed by Georg Brandl.)

* The :c:func:`PyErr_Warn` function for issuing warnings is now deprecated in
  favour of ``PyErr_WarnEx(category, message, stacklevel)`` which lets you
  specify the number of stack frames separating this function and the caller. A
  *stacklevel* of 1 is the function calling :c:func:`PyErr_WarnEx`, 2 is the
  function above that, and so forth. (Added by Neal Norwitz.)

* The CPython interpreter is still written in C, but the code can now be
  compiled with a C++ compiler without errors. (Implemented by Anthony Baxter,
  Martin von Löwis, Skip Montanaro.)

* The :c:func:`PyRange_New` function was removed. It was never documented, never
  used in the core code, and had dangerously lax error checking.
  In the unlikely case that your extensions were using it, you can replace it by
  something like the following::

     range = PyObject_CallFunction((PyObject*) &PyRange_Type, "lll",
                                   start, stop, step);

.. ======================================================================


.. _ports:

Port-Specific Changes
---------------------

* MacOS X (10.3 and higher): dynamic loading of modules now uses the
  :c:func:`dlopen` function instead of MacOS-specific functions.

* MacOS X: an :option:`!--enable-universalsdk` switch was added to the
  :program:`configure` script that compiles the interpreter as a universal binary
  able to run on both PowerPC and Intel processors. (Contributed by Ronald
  Oussoren; :issue:`2573`.)

* Windows: :file:`.dll` is no longer supported as a filename extension for
  extension modules. :file:`.pyd` is now the only filename extension that will be
  searched for.

.. ======================================================================


.. _porting:

Porting to Python 2.5
=====================

This section lists previously described changes that may require changes to your
code:

* ASCII is now the default encoding for modules. It's now a syntax error if a
  module contains string literals with 8-bit characters but doesn't have an
  encoding declaration. In Python 2.4 this triggered a warning, not a syntax
  error.

* Previously, the :attr:`gi_frame` attribute of a generator was always a frame
  object. Because of the :pep:`342` changes described in section :ref:`pep-342`,
  it's now possible for :attr:`gi_frame` to be ``None``.

* A new warning, :class:`UnicodeWarning`, is triggered when you attempt to
  compare a Unicode string and an 8-bit string that can't be converted to Unicode
  using the default ASCII encoding.
  Previously such comparisons would raise a
  :class:`UnicodeDecodeError` exception.

* Library: the :mod:`csv` module is now stricter about multi-line quoted fields.
  If your files contain newlines embedded within fields, the input should be split
  into lines in a manner which preserves the newline characters.

* Library: the :mod:`locale` module's :func:`format` function would
  previously accept any string as long as no more than one %char specifier
  appeared. In Python 2.5, the argument must be exactly one %char specifier with
  no surrounding text.

* Library: The :mod:`pickle` and :mod:`cPickle` modules no longer accept a
  return value of ``None`` from the :meth:`__reduce__` method; the method must
  return a tuple of arguments instead. The modules also no longer accept the
  deprecated *bin* keyword parameter.

* Library: The :mod:`SimpleXMLRPCServer` and :mod:`DocXMLRPCServer` classes now
  have a :attr:`rpc_paths` attribute that constrains XML-RPC operations to a
  limited set of URL paths; the default is to allow only ``'/'`` and ``'/RPC2'``.
  Setting :attr:`rpc_paths` to ``None`` or an empty tuple disables this path
  checking.

* C API: Many functions now use :c:type:`Py_ssize_t` instead of :c:type:`int` to
  allow processing more data on 64-bit machines. Extension code may need to make
  the same change to avoid warnings and to support 64-bit machines. See the
  earlier section :ref:`pep-353` for a discussion of this change.

* C API: The obmalloc changes mean that you must be careful to not mix usage
  of the :c:func:`PyMem_\*` and :c:func:`PyObject_\*` families of functions. Memory
  allocated with one family's :c:func:`\*_Malloc` must be freed with the
  corresponding family's :c:func:`\*_Free` function.

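
As a rough illustration of the :mod:`csv` item in the list above, the following
sketch splits in-memory text into lines while keeping the newline characters, so
that the reader can reassemble a quoted field that spans two lines. The ``data``
string and its contents are invented for this example, and using
``splitlines(True)`` is only one possible way to preserve the newlines.

```python
import csv

# Hypothetical sample data: the last field of the second record contains
# an embedded newline inside double quotes.
data = 'id,comment\n1,"first line\nsecond line"\n'

# splitlines(True) keeps the trailing newline on every piece, so the
# reader still sees the newline that belongs to the quoted field.
lines = data.splitlines(True)

# csv.reader accepts any iterable of strings; it keeps consuming pieces
# until the quoted field is closed.
rows = list(csv.reader(lines))
```

Iterating over an open file object also yields lines with their newlines intact,
so the same reasoning applies when reading from a file.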
.. ======================================================================


Acknowledgements
================

The author would like to thank the following people for offering suggestions,
corrections and assistance with various drafts of this article: Georg Brandl,
Nick Coghlan, Phillip J. Eby, Lars Gustäbel, Raymond Hettinger, Ralf W.
Grosse-Kunstleve, Kent Johnson, Iain Lowe, Martin von Löwis, Fredrik Lundh, Andrew
McNamara, Skip Montanaro, Gustavo Niemeyer, Paul Prescod, James Pryor, Mike
Rovner, Scott Weikart, Barry Warsaw, Thomas Wouters.
