:tocdepth: 2

===============
Programming FAQ
===============

.. only:: html

   .. contents::

General Questions
=================

Is there a source code level debugger with breakpoints, single-stepping, etc.?
------------------------------------------------------------------------------

Yes.

The pdb module is a simple but adequate console-mode debugger for Python. It is
part of the standard Python library, and is :mod:`documented in the Library
Reference Manual <pdb>`. You can also write your own debugger by using the code
for pdb as an example.

The IDLE interactive development environment, which is part of the standard
Python distribution (normally available as Tools/scripts/idle), includes a
graphical debugger.

Pythonwin is a Python IDE that includes a GUI debugger based on pdb. The
Pythonwin debugger colors breakpoints and has quite a few cool features such as
debugging non-Pythonwin programs. Pythonwin is available as part of the `Python
for Windows Extensions <https://sourceforge.net/projects/pywin32/>`__ project and
as a part of the ActivePython distribution (see
https://www.activestate.com/activepython\ ).

`Boa Constructor <http://boa-constructor.sourceforge.net/>`_ is an IDE and GUI
builder that uses wxWidgets. It offers visual frame creation and manipulation,
an object inspector, many views on the source like object browsers, inheritance
hierarchies, docstring-generated HTML documentation, an advanced debugger,
integrated help, and Zope support.

`Eric <http://eric-ide.python-projects.org/>`_ is an IDE built on PyQt
and the Scintilla editing component.

Pydb is a version of the standard Python debugger pdb, modified for use with DDD
(Data Display Debugger), a popular graphical debugger front end. Pydb can be
found at http://bashdb.sourceforge.net/pydb/ and DDD can be found at
https://www.gnu.org/software/ddd.

There are a number of commercial Python IDEs that include graphical debuggers.
They include:

* Wing IDE (https://wingware.com/)
* Komodo IDE (https://komodoide.com/)
* PyCharm (https://www.jetbrains.com/pycharm/)


Is there a tool to help find bugs or perform static analysis?
-------------------------------------------------------------

Yes.

PyChecker is a static analysis tool that finds bugs in Python source code and
warns about code complexity and style. You can get PyChecker from
http://pychecker.sourceforge.net/.

`Pylint <https://www.pylint.org/>`_ is another tool that checks
if a module satisfies a coding standard, and also makes it possible to write
plug-ins to add a custom feature. In addition to the bug checking that
PyChecker performs, Pylint offers some additional features such as checking line
length, whether variable names are well-formed according to your coding
standard, whether declared interfaces are fully implemented, and more.
https://docs.pylint.org/ provides a full list of Pylint's features.


How can I create a stand-alone binary from a Python script?
-----------------------------------------------------------

You don't need the ability to compile Python to C code if all you want is a
stand-alone program that users can download and run without having to install
the Python distribution first. There are a number of tools that determine the
set of modules required by a program and bind these modules together with a
Python binary to produce a single executable.

One is the freeze tool, which is included in the Python source tree as
``Tools/freeze``. It converts Python byte code to C arrays; with a C compiler
you can embed all your modules into a new program, which is then linked with
the standard Python modules.

It works by scanning your source recursively for import statements (in both
forms) and looking for the modules in the standard Python path as well as in the
source directory (for built-in modules). It then turns the bytecode for modules
written in Python into C code (array initializers that can be turned into code
objects using the marshal module) and creates a custom-made config file that
only contains those built-in modules which are actually used in the program. It
then compiles the generated C code and links it with the rest of the Python
interpreter to form a self-contained binary which acts exactly like your script.

Obviously, freeze requires a C compiler. There are several other utilities
which don't. One is Thomas Heller's py2exe (Windows only) at

    http://www.py2exe.org/

Another tool is Anthony Tuininga's `cx_Freeze <http://cx-freeze.sourceforge.net/>`_.


Are there coding standards or a style guide for Python programs?
----------------------------------------------------------------

Yes. The coding style required for standard library modules is documented as
:pep:`8`.


My program is too slow. How do I speed it up?
---------------------------------------------

That's a tough one, in general. There are many tricks to speed up Python code;
consider rewriting parts in C as a last resort.

In some cases it's possible to automatically translate Python to C or x86
assembly language, meaning that you don't have to modify your code to gain
increased speed.

.. XXX seems to have overlap with other questions!

`Pyrex <http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/>`_ can compile a
slightly modified version of Python code into a C extension, and can be used on
many different platforms.

`Psyco <http://psyco.sourceforge.net>`_ is a just-in-time compiler that
translates Python code into x86 assembly language.
If you can use it, Psyco can
provide dramatic speedups for critical functions.

The rest of this answer will discuss various tricks for squeezing a bit more
speed out of Python code. *Never* apply any optimization tricks unless you know
you need them, after profiling has indicated that a particular function is the
heavily executed hot spot in the code. Optimizations almost always make the
code less clear, and you shouldn't pay the costs of reduced clarity (increased
development time, greater likelihood of bugs) unless the resulting performance
benefit is worth it.

There is a page on the wiki devoted to `performance tips
<https://wiki.python.org/moin/PythonSpeed/PerformanceTips>`_.

Guido van Rossum has written up an anecdote related to optimization at
https://www.python.org/doc/essays/list2str.

One thing to notice is that function and (especially) method calls are rather
expensive; if you have designed a purely OO interface with lots of tiny
functions that don't do much more than get or set an instance variable or call
another method, you might consider using a more direct way such as directly
accessing instance variables. Also see the standard module :mod:`profile` which
makes it possible to find out where your program is spending most of its time
(if you have some patience -- the profiling itself can slow your program down by
an order of magnitude).

Remember that many standard optimization heuristics you may know from other
programming experience may well apply to Python. For example it may be faster
to send output to output devices using larger writes rather than smaller ones in
order to reduce the overhead of kernel system calls. Thus CGI scripts that
write all output in "one shot" may be faster than those that write lots of small
pieces of output.

Also, be sure to use Python's core features where appropriate.
For example,
slicing allows programs to chop up lists and other sequence objects in a single
tick of the interpreter's mainloop using highly optimized C implementations.
Thus to get the same effect as::

   L2 = []
   for i in range(3):
       L2.append(L1[i])

it is much shorter and far faster to use ::

   L2 = list(L1[:3])  # "list" is redundant if L1 is a list.

Note that the functionally-oriented built-in functions such as :func:`map`,
:func:`zip`, and friends can be a convenient accelerator for loops that
perform a single task. For example to pair the elements of two lists
together::

   >>> zip([1, 2, 3], [4, 5, 6])
   [(1, 4), (2, 5), (3, 6)]

or to compute a number of sines::

   >>> import math
   >>> map(math.sin, (1, 2, 3, 4))
   [0.841470984808, 0.909297426826, 0.14112000806, -0.756802495308]

The operation completes very quickly in such cases.

Other examples include the ``join()`` and ``split()`` :ref:`methods
of string objects <string-methods>`.
For example if s1..s7 are large (10K+) strings then
``"".join([s1,s2,s3,s4,s5,s6,s7])`` may be far faster than the more obvious
``s1+s2+s3+s4+s5+s6+s7``, since the "summation" will compute many
subexpressions, whereas ``join()`` does all the copying in one pass. For
manipulating strings, use the ``replace()`` and the ``format()`` :ref:`methods
on string objects <string-methods>`. Use regular expressions only when you're
not dealing with constant string patterns. You may still use :ref:`the old %
operations <string-formatting>` ``string % tuple`` and ``string % dictionary``.

Be sure to use the :meth:`list.sort` built-in method to do sorting, and see the
`sorting mini-HOWTO <https://wiki.python.org/moin/HowTo/Sorting>`_ for examples
of moderately advanced usage. :meth:`list.sort` beats other techniques for
sorting in all but the most extreme circumstances.
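As a small illustration of the sorting advice above, the sketch below sorts a list in place with a key function and then builds a separate sorted copy. The ``records`` data is made up for the example, and it assumes a Python version where ``list.sort`` and ``sorted`` accept the ``key`` and ``reverse`` arguments (2.4 or later).

```python
# Hypothetical sample data: (name, score) pairs.
records = [("carol", 95), ("alice", 72), ("bob", 85)]

# Sort in place by score, highest first.  list.sort() returns None,
# so the sorted data is found in `records` itself.
records.sort(key=lambda pair: pair[1], reverse=True)

# sorted() leaves its argument untouched and returns a new list,
# here ordered by name instead.
by_name = sorted(records, key=lambda pair: pair[0])
```

Passing a key function lets the built-in sort do the comparisons, rather than a hand-written ordering loop in Python.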

Another common trick is to "push loops into functions or methods." For example
suppose you have a program that runs slowly and you use the profiler to
determine that a Python function ``ff()`` is being called lots of times. If you
notice that ``ff()``::

   def ff(x):
       ... # do something with x computing result...
       return result

tends to be called in loops like::

   list = map(ff, oldlist)

or::

   for x in sequence:
       value = ff(x)
       ... # do something with value...

then you can often eliminate function call overhead by rewriting ``ff()`` to::

   def ffseq(seq):
       resultseq = []
       for x in seq:
           ... # do something with x computing result...
           resultseq.append(result)
       return resultseq

and rewrite the two examples to ``list = ffseq(oldlist)`` and to::

   for value in ffseq(sequence):
       ... # do something with value...

Single calls to ``ff(x)`` translate to ``ffseq([x])[0]`` with little penalty.
Of course this technique is not always appropriate and there are other variants
which you can figure out.

You can gain some performance by explicitly storing the results of a function or
method lookup into a local variable. A loop like::

   for key in token:
       dict[key] = dict.get(key, 0) + 1

resolves ``dict.get`` every iteration. If the method isn't going to change, a
slightly faster implementation is::

   dict_get = dict.get  # look up the method once
   for key in token:
       dict[key] = dict_get(key, 0) + 1

Default arguments can be used to determine values once, when the function is
defined, instead of each time it is called.
This can only be done for functions or objects which will not
be changed during program execution, such as replacing ::

   def degree_sin(deg):
       return math.sin(deg * math.pi / 180.0)

with ::

   def degree_sin(deg, factor=math.pi/180.0, sin=math.sin):
       return sin(deg * factor)

Because this trick uses default arguments for terms which should not be changed,
it should only be used when you are not concerned with presenting a possibly
confusing API to your users.


Core Language
=============

Why am I getting an UnboundLocalError when the variable has a value?
--------------------------------------------------------------------

It can be a surprise to get the UnboundLocalError in previously working
code when it is modified by adding an assignment statement somewhere in
the body of a function.

This code:

   >>> x = 10
   >>> def bar():
   ...     print x
   >>> bar()
   10

works, but this code:

   >>> x = 10
   >>> def foo():
   ...     print x
   ...     x += 1

results in an UnboundLocalError:

   >>> foo()
   Traceback (most recent call last):
     ...
   UnboundLocalError: local variable 'x' referenced before assignment

This is because when you make an assignment to a variable in a scope, that
variable becomes local to that scope and shadows any similarly named variable
in the outer scope. Since the last statement in foo assigns a new value to
``x``, the compiler recognizes it as a local variable. Consequently, when the
earlier ``print x`` attempts to print the uninitialized local variable, an
error results.

In the example above you can access the outer scope variable by declaring it
global:

   >>> x = 10
   >>> def foobar():
   ...     global x
   ...     print x
   ...     x += 1
   >>> foobar()
   10

This explicit declaration is required in order to remind you that (unlike the
superficially analogous situation with class and instance variables) you are
actually modifying the value of the variable in the outer scope:

   >>> print x
   11


What are the rules for local and global variables in Python?
------------------------------------------------------------

In Python, variables that are only referenced inside a function are implicitly
global. If a variable is assigned a value anywhere within the function's body,
it's assumed to be a local unless explicitly declared as global.

Though a bit surprising at first, a moment's consideration explains this. On
one hand, requiring :keyword:`global` for assigned variables provides a bar
against unintended side-effects. On the other hand, if ``global`` was required
for all global references, you'd be using ``global`` all the time. You'd have
to declare as global every reference to a built-in function or to a component of
an imported module. This clutter would defeat the usefulness of the ``global``
declaration for identifying side-effects.


Why do lambdas defined in a loop with different values all return the same result?
----------------------------------------------------------------------------------

Assume you use a for loop to define a few different lambdas (or even plain
functions), e.g.::

   >>> squares = []
   >>> for x in range(5):
   ...     squares.append(lambda: x**2)

This gives you a list that contains 5 lambdas that calculate ``x**2``. You
might expect that, when called, they would return, respectively, ``0``, ``1``,
``4``, ``9``, and ``16``.
However, when you actually try you will see that
they all return ``16``::

   >>> squares[2]()
   16
   >>> squares[4]()
   16

This happens because ``x`` is not local to the lambdas, but is defined in
the outer scope, and it is accessed when the lambda is called, not when it
is defined. At the end of the loop, the value of ``x`` is ``4``, so all the
functions now return ``4**2``, i.e. ``16``. You can also verify this by
changing the value of ``x`` and seeing how the results of the lambdas change::

   >>> x = 8
   >>> squares[2]()
   64

In order to avoid this, you need to save the values in variables local to the
lambdas, so that they don't rely on the value of the global ``x``::

   >>> squares = []
   >>> for x in range(5):
   ...     squares.append(lambda n=x: n**2)

Here, ``n=x`` creates a new variable ``n`` local to the lambda and computed
when the lambda is defined so that it has the same value that ``x`` had at
that point in the loop. This means that the value of ``n`` will be ``0``
in the first lambda, ``1`` in the second, ``2`` in the third, and so on.
Therefore each lambda will now return the correct result::

   >>> squares[2]()
   4
   >>> squares[4]()
   16

Note that this behaviour is not peculiar to lambdas, but applies to regular
functions too.


How do I share global variables across modules?
------------------------------------------------

The canonical way to share information across modules within a single program is
to create a special module (often called config or cfg). Just import the config
module in all modules of your application; the module then becomes available as
a global name. Because there is only one instance of each module, any changes
made to the module object get reflected everywhere.
For example:

config.py::

   x = 0   # Default value of the 'x' configuration setting

mod.py::

   import config
   config.x = 1

main.py::

   import config
   import mod
   print config.x

Note that using a module is also the basis for implementing the Singleton design
pattern, for the same reason.


What are the "best practices" for using import in a module?
-----------------------------------------------------------

In general, don't use ``from modulename import *``. Doing so clutters the
importer's namespace, and makes it much harder for linters to detect undefined
names.

Import modules at the top of a file. Doing so makes it clear what other modules
your code requires and avoids questions of whether the module name is in scope.
Using one import per line makes it easy to add and delete module imports, but
using multiple imports per line uses less screen space.

It's good practice if you import modules in the following order:

1. standard library modules -- e.g. ``sys``, ``os``, ``getopt``, ``re``
2. third-party library modules (anything installed in Python's site-packages
   directory) -- e.g. mx.DateTime, ZODB, PIL.Image, etc.
3. locally-developed modules

Only use explicit relative package imports. If you're writing code that's in
the ``package.sub.m1`` module and want to import ``package.sub.m2``, do not just
write ``import m2``, even though it's legal. Write ``from package.sub import
m2`` or ``from . import m2`` instead.

It is sometimes necessary to move imports to a function or class to avoid
problems with circular imports. Gordon McMillan says:

   Circular imports are fine where both modules use the "import <module>" form
   of import. They fail when the 2nd module wants to grab a name out of the
   first ("from module import name") and the import is at the top level. That's
   because names in the 1st are not yet available, because the first module is
   busy importing the 2nd.

In this case, if the second module is only used in one function, then the import
can easily be moved into that function. By the time the import is called, the
first module will have finished initializing, and the second module can do its
import.

It may also be necessary to move imports out of the top level of code if some of
the modules are platform-specific. In that case, it may not even be possible to
import all of the modules at the top of the file. In this case, importing the
correct modules in the corresponding platform-specific code is a good option.

Only move imports into a local scope, such as inside a function definition, if
it's necessary to solve a problem such as avoiding a circular import, or if you
are trying to reduce the initialization time of a module. This technique is
especially helpful if many of the imports are unnecessary depending on how the
program executes. You may also want to move imports into a function if the
modules are only ever used in that function. Note that loading a module the
first time may be expensive because of the one-time initialization of the
module, but loading a module multiple times is virtually free, costing only a
couple of dictionary lookups. Even if the module name has gone out of scope,
the module is probably available in :data:`sys.modules`.


Why are default values shared between objects?
----------------------------------------------

This type of bug commonly bites neophyte programmers. Consider this function::

   def foo(mydict={}):  # Danger: shared reference to one dict for all calls
       ... compute something ...
       mydict[key] = value
       return mydict

The first time you call this function, ``mydict`` contains a single item.
The
second time, ``mydict`` contains two items because when ``foo()`` begins
executing, ``mydict`` starts out with an item already in it.

It is often expected that a function call creates new objects for default
values. This is not what happens. Default values are created exactly once, when
the function is defined. If that object is changed, like the dictionary in this
example, subsequent calls to the function will refer to this changed object.

By definition, immutable objects such as numbers, strings, tuples, and ``None``,
are safe from change. Changes to mutable objects such as dictionaries, lists,
and class instances can lead to confusion.

Because of this feature, it is good programming practice to not use mutable
objects as default values. Instead, use ``None`` as the default value and
inside the function, check if the parameter is ``None`` and create a new
list/dictionary/whatever if it is. For example, don't write::

   def foo(mydict={}):
       ...

but::

   def foo(mydict=None):
       if mydict is None:
           mydict = {}  # create a new dict for local namespace

This feature can be useful. When you have a function that's time-consuming to
compute, a common technique is to cache the parameters and the resulting value
of each call to the function, and return the cached value if the same value is
requested again. This is called "memoizing", and can be implemented like this::

   # Callers will never provide a third parameter for this function.
   def expensive(arg1, arg2, _cache={}):
       if (arg1, arg2) in _cache:
           return _cache[(arg1, arg2)]

       # Calculate the value
       result = ... expensive computation ...
       _cache[(arg1, arg2)] = result           # Store result in the cache
       return result

You could use a global variable containing a dictionary instead of the default
value; it's a matter of taste.


How can I pass optional or keyword parameters from one function to another?
---------------------------------------------------------------------------

Collect the arguments using the ``*`` and ``**`` specifiers in the function's
parameter list; this gives you the positional arguments as a tuple and the
keyword arguments as a dictionary. You can then pass these arguments when
calling another function by using ``*`` and ``**``::

   def f(x, *args, **kwargs):
       ...
       kwargs['width'] = '14.3c'
       ...
       g(x, *args, **kwargs)

In the unlikely case that you care about Python versions older than 2.0, use
:func:`apply`::

   def f(x, *args, **kwargs):
       ...
       kwargs['width'] = '14.3c'
       ...
       apply(g, (x,)+args, kwargs)


.. index::
   single: argument; difference from parameter
   single: parameter; difference from argument

.. _faq-argument-vs-parameter:

What is the difference between arguments and parameters?
--------------------------------------------------------

:term:`Parameters <parameter>` are defined by the names that appear in a
function definition, whereas :term:`arguments <argument>` are the values
actually passed to a function when calling it. Parameters define what types of
arguments a function can accept. For example, given the function definition::

   def func(foo, bar=None, **kwargs):
       pass

*foo*, *bar* and *kwargs* are parameters of ``func``. However, when calling
``func``, for example::

   func(42, bar=314, extra=somevar)

the values ``42``, ``314``, and ``somevar`` are arguments.


Why did changing list 'y' also change list 'x'?
------------------------------------------------

If you wrote code like::

   >>> x = []
   >>> y = x
   >>> y.append(10)
   >>> y
   [10]
   >>> x
   [10]

you might be wondering why appending an element to ``y`` changed ``x`` too.

There are two factors that produce this result:

1) Variables are simply names that refer to objects. Doing ``y = x`` doesn't
   create a copy of the list -- it creates a new variable ``y`` that refers to
   the same object ``x`` refers to. This means that there is only one object
   (the list), and both ``x`` and ``y`` refer to it.
2) Lists are :term:`mutable`, which means that you can change their content.

After the call to :meth:`~list.append`, the content of the mutable object has
changed from ``[]`` to ``[10]``. Since both the variables refer to the same
object, using either name accesses the modified value ``[10]``.

If we instead assign an immutable object to ``x``::

   >>> x = 5  # ints are immutable
   >>> y = x
   >>> x = x + 1  # 5 can't be mutated, we are creating a new object here
   >>> x
   6
   >>> y
   5

we can see that in this case ``x`` and ``y`` are not equal anymore. This is
because integers are :term:`immutable`, and when we do ``x = x + 1`` we are not
mutating the int ``5`` by incrementing its value; instead, we are creating a
new object (the int ``6``) and assigning it to ``x`` (that is, changing which
object ``x`` refers to). After this assignment we have two objects (the ints
``6`` and ``5``) and two variables that refer to them (``x`` now refers to
``6`` but ``y`` still refers to ``5``).

Some operations (for example ``y.append(10)`` and ``y.sort()``) mutate the
object, whereas superficially similar operations (for example ``y = y + [10]``
and ``sorted(y)``) create a new object. In general in Python (and in all cases
in the standard library) a method that mutates an object will return ``None``
to help avoid getting the two types of operations confused.
So if you
mistakenly write ``y.sort()`` thinking it will give you a sorted copy of ``y``,
you'll instead end up with ``None``, which will likely cause your program to
generate an easily diagnosed error.

However, there is one class of operations where the same operation sometimes
has different behaviors with different types: the augmented assignment
operators. For example, ``+=`` mutates lists but not tuples or ints (``a_list
+= [1, 2, 3]`` is equivalent to ``a_list.extend([1, 2, 3])`` and mutates
``a_list``, whereas ``some_tuple += (1, 2, 3)`` and ``some_int += 1`` create
new objects).

In other words:

* If we have a mutable object (:class:`list`, :class:`dict`, :class:`set`,
  etc.), we can use some specific operations to mutate it and all the variables
  that refer to it will see the change.
* If we have an immutable object (:class:`str`, :class:`int`, :class:`tuple`,
  etc.), all the variables that refer to it will always see the same value,
  but operations that transform that value into a new value always return a new
  object.

If you want to know if two variables refer to the same object or not, you can
use the :keyword:`is` operator, or the built-in function :func:`id`.


How do I write a function with output parameters (call by reference)?
---------------------------------------------------------------------

Remember that arguments are passed by assignment in Python. Since assignment
just creates references to objects, there's no alias between an argument name in
the caller and callee, and so no call-by-reference per se. You can achieve the
desired effect in a number of ways.

1) By returning a tuple of the results::

      def func2(a, b):
          a = 'new-value'        # a and b are local names
          b = b + 1              # assigned to new objects
          return a, b            # return new values

      x, y = 'old-value', 99
      x, y = func2(x, y)
      print x, y                 # output: new-value 100

   This is almost always the clearest solution.

2) By using global variables. This isn't thread-safe, and is not recommended.

3) By passing a mutable (changeable in-place) object::

      def func1(a):
          a[0] = 'new-value'     # 'a' references a mutable list
          a[1] = a[1] + 1        # changes a shared object

      args = ['old-value', 99]
      func1(args)
      print args[0], args[1]     # output: new-value 100

4) By passing in a dictionary that gets mutated::

      def func3(args):
          args['a'] = 'new-value'     # args is a mutable dictionary
          args['b'] = args['b'] + 1   # change it in-place

      args = {'a': 'old-value', 'b': 99}
      func3(args)
      print args['a'], args['b']

5) Or bundle up values in a class instance::

      class callByRef:
          def __init__(self, **args):
              for (key, value) in args.items():
                  setattr(self, key, value)

      def func4(args):
          args.a = 'new-value'        # args is a mutable callByRef
          args.b = args.b + 1         # change object in-place

      args = callByRef(a='old-value', b=99)
      func4(args)
      print args.a, args.b


   There's almost never a good reason to get this complicated.

Your best choice is to return a tuple containing the multiple results.


How do you make a higher order function in Python?
--------------------------------------------------

You have two choices: you can use nested scopes or you can use callable objects.
For example, suppose you wanted to define ``linear(a,b)`` which returns a
function ``f(x)`` that computes the value ``a*x+b``.
Using nested scopes::

   def linear(a, b):
       def result(x):
           return a * x + b
       return result

Or using a callable object::

   class linear:

       def __init__(self, a, b):
           self.a, self.b = a, b

       def __call__(self, x):
           return self.a * x + self.b

In both cases, ::

   taxes = linear(0.3, 2)

gives a callable object where ``taxes(10e6) == 0.3 * 10e6 + 2``.

The callable object approach has the disadvantage that it is a bit slower and
results in slightly longer code. However, note that a collection of callables
can share their signature via inheritance::

   class exponential(linear):
       # __init__ inherited
       def __call__(self, x):
           return self.a * (x ** self.b)

Objects can encapsulate state for several methods::

   class counter:

       value = 0

       def set(self, x):
           self.value = x

       def up(self):
           self.value = self.value + 1

       def down(self):
           self.value = self.value - 1

   count = counter()
   inc, dec, reset = count.up, count.down, count.set

Here ``inc()``, ``dec()`` and ``reset()`` act like functions which share the
same counting variable.


How do I copy an object in Python?
----------------------------------

Try :func:`copy.copy` or :func:`copy.deepcopy` for the general case.
Not all objects can be copied, but most can.

Some objects can be copied more easily. Dictionaries have a :meth:`~dict.copy`
method::

   newdict = olddict.copy()

Sequences can be copied by slicing::

   new_l = l[:]


How can I find the methods or attributes of an object?
------------------------------------------------------

For an instance x of a user-defined class, ``dir(x)`` returns an alphabetized
list of the names containing the instance attributes and methods and attributes
defined by its class.


How can my code discover the name of an object?
-----------------------------------------------

Generally speaking, it can't, because objects don't really have names.
Essentially, assignment always binds a name to a value; the same is true of
``def`` and ``class`` statements, but in that case the value is a
callable. Consider the following code::

   >>> class A:
   ...     pass
   ...
   >>> B = A
   >>> a = B()
   >>> b = a
   >>> print b
   <__main__.A instance at 0x16D07CC>
   >>> print a
   <__main__.A instance at 0x16D07CC>

Arguably the class has a name: even though it is bound to two names and invoked
through the name B, the created instance is still reported as an instance of
class A. However, it is impossible to say whether the instance's name is a or
b, since both names are bound to the same value.

Generally speaking it should not be necessary for your code to "know the names"
of particular values. Unless you are deliberately writing introspective
programs, this is usually an indication that a change of approach might be
beneficial.

In comp.lang.python, Fredrik Lundh once gave an excellent analogy in answer to
this question:

   The same way as you get the name of that cat you found on your porch: the cat
   (object) itself cannot tell you its name, and it doesn't really care -- so
   the only way to find out what it's called is to ask all your neighbours
   (namespaces) if it's their cat (object)...

   ....and don't be surprised if you'll find that it's known by many names, or
   no name at all!


What's up with the comma operator's precedence?
-----------------------------------------------

Comma is not an operator in Python.
Consider this session:: 855 856 >>> "a" in "b", "a" 857 (False, 'a') 858 859Since the comma is not an operator, but a separator between expressions the 860above is evaluated as if you had entered:: 861 862 ("a" in "b"), "a" 863 864not:: 865 866 "a" in ("b", "a") 867 868The same is true of the various assignment operators (``=``, ``+=`` etc). They 869are not truly operators but syntactic delimiters in assignment statements. 870 871 872Is there an equivalent of C's "?:" ternary operator? 873---------------------------------------------------- 874 875Yes, this feature was added in Python 2.5. The syntax would be as follows:: 876 877 [on_true] if [expression] else [on_false] 878 879 x, y = 50, 25 880 881 small = x if x < y else y 882 883For versions previous to 2.5 the answer would be 'No'. 884 885 886Is it possible to write obfuscated one-liners in Python? 887-------------------------------------------------------- 888 889Yes. Usually this is done by nesting :keyword:`lambda` within 890:keyword:`lambda`. 
See the following three examples, due to Ulf Bartelt:: 891 892 # Primes < 1000 893 print filter(None,map(lambda y:y*reduce(lambda x,y:x*y!=0, 894 map(lambda x,y=y:y%x,range(2,int(pow(y,0.5)+1))),1),range(2,1000))) 895 896 # First 10 Fibonacci numbers 897 print map(lambda x,f=lambda x,f:(f(x-1,f)+f(x-2,f)) if x>1 else 1: f(x,f), 898 range(10)) 899 900 # Mandelbrot set 901 print (lambda Ru,Ro,Iu,Io,IM,Sx,Sy:reduce(lambda x,y:x+y,map(lambda y, 902 Iu=Iu,Io=Io,Ru=Ru,Ro=Ro,Sy=Sy,L=lambda yc,Iu=Iu,Io=Io,Ru=Ru,Ro=Ro,i=IM, 903 Sx=Sx,Sy=Sy:reduce(lambda x,y:x+y,map(lambda x,xc=Ru,yc=yc,Ru=Ru,Ro=Ro, 904 i=i,Sx=Sx,F=lambda xc,yc,x,y,k,f=lambda xc,yc,x,y,k,f:(k<=0)or (x*x+y*y 905 >=4.0) or 1+f(xc,yc,x*x-y*y+xc,2.0*x*y+yc,k-1,f):f(xc,yc,x,y,k,f):chr( 906 64+F(Ru+x*(Ro-Ru)/Sx,yc,0,0,i)),range(Sx))):L(Iu+y*(Io-Iu)/Sy),range(Sy 907 ))))(-2.1, 0.7, -1.2, 1.2, 30, 80, 24) 908 # \___ ___/ \___ ___/ | | |__ lines on screen 909 # V V | |______ columns on screen 910 # | | |__________ maximum of "iterations" 911 # | |_________________ range on y axis 912 # |____________________________ range on x axis 913 914Don't try this at home, kids! 915 916 917Numbers and strings 918=================== 919 920How do I specify hexadecimal and octal integers? 921------------------------------------------------ 922 923To specify an octal digit, precede the octal value with a zero, and then a lower 924or uppercase "o". For example, to set the variable "a" to the octal value "10" 925(8 in decimal), type:: 926 927 >>> a = 0o10 928 >>> a 929 8 930 931Hexadecimal is just as easy. Simply precede the hexadecimal number with a zero, 932and then a lower or uppercase "x". Hexadecimal digits can be specified in lower 933or uppercase. For example, in the Python interpreter:: 934 935 >>> a = 0xa5 936 >>> a 937 165 938 >>> b = 0XB2 939 >>> b 940 178 941 942 943Why does -22 // 10 return -3? 944----------------------------- 945 946It's primarily driven by the desire that ``i % j`` have the same sign as ``j``. 
947If you want that, and also want:: 948 949 i == (i // j) * j + (i % j) 950 951then integer division has to return the floor. C also requires that identity to 952hold, and then compilers that truncate ``i // j`` need to make ``i % j`` have 953the same sign as ``i``. 954 955There are few real use cases for ``i % j`` when ``j`` is negative. When ``j`` 956is positive, there are many, and in virtually all of them it's more useful for 957``i % j`` to be ``>= 0``. If the clock says 10 now, what did it say 200 hours 958ago? ``-190 % 12 == 2`` is useful; ``-190 % 12 == -10`` is a bug waiting to 959bite. 960 961.. note:: 962 963 On Python 2, ``a / b`` returns the same as ``a // b`` if 964 ``__future__.division`` is not in effect. This is also known as "classic" 965 division. 966 967 968How do I convert a string to a number? 969-------------------------------------- 970 971For integers, use the built-in :func:`int` type constructor, e.g. ``int('144') 972== 144``. Similarly, :func:`float` converts to floating-point, 973e.g. ``float('144') == 144.0``. 974 975By default, these interpret the number as decimal, so that ``int('0144') == 976144`` and ``int('0x144')`` raises :exc:`ValueError`. ``int(string, base)`` takes 977the base to convert from as a second optional argument, so ``int('0x144', 16) == 978324``. If the base is specified as 0, the number is interpreted using Python's 979rules: a leading '0' indicates octal, and '0x' indicates a hex number. 980 981Do not use the built-in function :func:`eval` if all you need is to convert 982strings to numbers. :func:`eval` will be significantly slower and it presents a 983security risk: someone could pass you a Python expression that might have 984unwanted side effects. For example, someone could pass 985``__import__('os').system("rm -rf $HOME")`` which would erase your home 986directory. 987 988:func:`eval` also has the effect of interpreting numbers as Python expressions, 989so that e.g. 
``eval('09')`` gives a syntax error because Python regards numbers
starting with '0' as octal (base 8).


How do I convert a number to a string?
--------------------------------------

To convert, e.g., the number 144 to the string '144', use the built-in type
constructor :func:`str`.  If you want a hexadecimal or octal representation, use
the built-in functions :func:`hex` or :func:`oct`.  For fancy formatting, see
the :ref:`formatstrings` section, e.g. ``"{:04d}".format(144)`` yields
``'0144'`` and ``"{:.3f}".format(1.0/3)`` yields ``'0.333'``.  You may also use
:ref:`the % operator <string-formatting>` on strings.  See the library reference
manual for details.


How do I modify a string in place?
----------------------------------

You can't, because strings are immutable.  If you need an object with this
ability, try converting the string to a list or use the array module::

   >>> s = "Hello, world"
   >>> a = list(s)
   >>> print a
   ['H', 'e', 'l', 'l', 'o', ',', ' ', 'w', 'o', 'r', 'l', 'd']
   >>> a[7:] = list("there!")
   >>> ''.join(a)
   'Hello, there!'

   >>> import array
   >>> a = array.array('c', s)
   >>> print a
   array('c', 'Hello, world')
   >>> a[0] = 'y'; print a
   array('c', 'yello, world')
   >>> a.tostring()
   'yello, world'


How do I use strings to call functions/methods?
-----------------------------------------------

There are various techniques.

* The best is to use a dictionary that maps strings to functions.  The primary
  advantage of this technique is that the strings do not need to match the names
  of the functions.
This is also the primary technique used to emulate a case 1038 construct:: 1039 1040 def a(): 1041 pass 1042 1043 def b(): 1044 pass 1045 1046 dispatch = {'go': a, 'stop': b} # Note lack of parens for funcs 1047 1048 dispatch[get_input()]() # Note trailing parens to call function 1049 1050* Use the built-in function :func:`getattr`:: 1051 1052 import foo 1053 getattr(foo, 'bar')() 1054 1055 Note that :func:`getattr` works on any object, including classes, class 1056 instances, modules, and so on. 1057 1058 This is used in several places in the standard library, like this:: 1059 1060 class Foo: 1061 def do_foo(self): 1062 ... 1063 1064 def do_bar(self): 1065 ... 1066 1067 f = getattr(foo_instance, 'do_' + opname) 1068 f() 1069 1070 1071* Use :func:`locals` or :func:`eval` to resolve the function name:: 1072 1073 def myFunc(): 1074 print "hello" 1075 1076 fname = "myFunc" 1077 1078 f = locals()[fname] 1079 f() 1080 1081 f = eval(fname) 1082 f() 1083 1084 Note: Using :func:`eval` is slow and dangerous. If you don't have absolute 1085 control over the contents of the string, someone could pass a string that 1086 resulted in an arbitrary function being executed. 1087 1088Is there an equivalent to Perl's chomp() for removing trailing newlines from strings? 1089------------------------------------------------------------------------------------- 1090 1091Starting with Python 2.2, you can use ``S.rstrip("\r\n")`` to remove all 1092occurrences of any line terminator from the end of the string ``S`` without 1093removing other trailing whitespace. If the string ``S`` represents more than 1094one line, with several empty lines at the end, the line terminators for all the 1095blank lines will be removed:: 1096 1097 >>> lines = ("line 1 \r\n" 1098 ... "\r\n" 1099 ... "\r\n") 1100 >>> lines.rstrip("\n\r") 1101 'line 1 ' 1102 1103Since this is typically only desired when reading text one line at a time, using 1104``S.rstrip()`` this way works well. 
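As a small illustration of the point above, the following sketch wraps the
``rstrip("\r\n")`` call in a helper and contrasts it with a bare ``rstrip()``.
The name ``chomp`` is hypothetical, chosen only to mirror the Perl function; it
is not part of the standard library.

```python
def chomp(s):
    """Remove trailing line terminators only (hypothetical helper name)."""
    return s.rstrip("\r\n")


# Trailing spaces and tabs survive; only "\r" and "\n" are removed.
kept_space = chomp("line 1 \r\n\r\n")
kept_tab = chomp("tab\t\n")

# A bare rstrip() removes every kind of trailing whitespace instead.
all_stripped = "tab\t\n".rstrip()
```

Which variant is appropriate depends on whether trailing spaces or tabs are
significant in the data being read.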
1105 1106For older versions of Python, there are two partial substitutes: 1107 1108- If you want to remove all trailing whitespace, use the ``rstrip()`` method of 1109 string objects. This removes all trailing whitespace, not just a single 1110 newline. 1111 1112- Otherwise, if there is only one line in the string ``S``, use 1113 ``S.splitlines()[0]``. 1114 1115 1116Is there a scanf() or sscanf() equivalent? 1117------------------------------------------ 1118 1119Not as such. 1120 1121For simple input parsing, the easiest approach is usually to split the line into 1122whitespace-delimited words using the :meth:`~str.split` method of string objects 1123and then convert decimal strings to numeric values using :func:`int` or 1124:func:`float`. ``split()`` supports an optional "sep" parameter which is useful 1125if the line uses something other than whitespace as a separator. 1126 1127For more complicated input parsing, regular expressions are more powerful 1128than C's :c:func:`sscanf` and better suited for the task. 1129 1130 1131What does 'UnicodeError: ASCII [decoding,encoding] error: ordinal not in range(128)' mean? 1132------------------------------------------------------------------------------------------ 1133 1134This error indicates that your Python installation can handle only 7-bit ASCII 1135strings. There are a couple ways to fix or work around the problem. 1136 1137If your programs must handle data in arbitrary character set encodings, the 1138environment the application runs in will generally identify the encoding of the 1139data it is handing you. You need to convert the input to Unicode data using 1140that encoding. For example, a program that handles email or web input will 1141typically find character set encoding information in Content-Type headers. This 1142can then be used to properly convert input data to Unicode. 
Assuming the string 1143referred to by ``value`` is encoded as UTF-8:: 1144 1145 value = unicode(value, "utf-8") 1146 1147will return a Unicode object. If the data is not correctly encoded as UTF-8, 1148the above call will raise a :exc:`UnicodeError` exception. 1149 1150If you only want strings converted to Unicode which have non-ASCII data, you can 1151try converting them first assuming an ASCII encoding, and then generate Unicode 1152objects if that fails:: 1153 1154 try: 1155 x = unicode(value, "ascii") 1156 except UnicodeError: 1157 value = unicode(value, "utf-8") 1158 else: 1159 # value was valid ASCII data 1160 pass 1161 1162It's possible to set a default encoding in a file called ``sitecustomize.py`` 1163that's part of the Python library. However, this isn't recommended because 1164changing the Python-wide default encoding may cause third-party extension 1165modules to fail. 1166 1167Note that on Windows, there is an encoding known as "mbcs", which uses an 1168encoding specific to your current locale. In many cases, and particularly when 1169working with COM, this may be an appropriate default encoding to use. 1170 1171 1172Sequences (Tuples/Lists) 1173======================== 1174 1175How do I convert between tuples and lists? 1176------------------------------------------ 1177 1178The type constructor ``tuple(seq)`` converts any sequence (actually, any 1179iterable) into a tuple with the same items in the same order. 1180 1181For example, ``tuple([1, 2, 3])`` yields ``(1, 2, 3)`` and ``tuple('abc')`` 1182yields ``('a', 'b', 'c')``. If the argument is a tuple, it does not make a copy 1183but returns the same object, so it is cheap to call :func:`tuple` when you 1184aren't sure that an object is already a tuple. 1185 1186The type constructor ``list(seq)`` converts any sequence or iterable into a list 1187with the same items in the same order. For example, ``list((1, 2, 3))`` yields 1188``[1, 2, 3]`` and ``list('abc')`` yields ``['a', 'b', 'c']``. 
If the argument
is a list, it makes a copy just like ``seq[:]`` would.


What's a negative index?
------------------------

Python sequences are indexed with positive numbers and negative numbers.  For
positive numbers, 0 is the first index, 1 is the second index, and so forth.  For
negative indices, -1 is the last index, -2 is the penultimate (next to last)
index, and so forth.  Think of ``seq[-n]`` as the same as ``seq[len(seq)-n]``.

Using negative indices can be very convenient.  For example ``S[:-1]`` is all of
the string except for its last character, which is useful for removing the
trailing newline from a string.


How do I iterate over a sequence in reverse order?
--------------------------------------------------

Use the :func:`reversed` built-in function, which is new in Python 2.4::

   for x in reversed(sequence):
       ...  # do something with x ...

This won't touch your original sequence; it returns an iterator that walks the
sequence in reverse order.

With Python 2.3, you can use an extended slice syntax, which does build a
reversed copy::

   for x in sequence[::-1]:
       ...  # do something with x ...


How do you remove duplicates from a list?
-----------------------------------------

See the Python Cookbook for a long discussion of many ways to do this:

   https://code.activestate.com/recipes/52560/

If you don't mind reordering the list, sort it and then scan from the end of the
list, deleting duplicates as you go::

   if mylist:
       mylist.sort()
       last = mylist[-1]
       for i in range(len(mylist)-2, -1, -1):
           if last == mylist[i]:
               del mylist[i]
           else:
               last = mylist[i]

If all elements of the list may be used as dictionary keys (i.e.
they are all 1242hashable) this is often faster :: 1243 1244 d = {} 1245 for x in mylist: 1246 d[x] = 1 1247 mylist = list(d.keys()) 1248 1249In Python 2.5 and later, the following is possible instead:: 1250 1251 mylist = list(set(mylist)) 1252 1253This converts the list into a set, thereby removing duplicates, and then back 1254into a list. 1255 1256 1257How do you make an array in Python? 1258----------------------------------- 1259 1260Use a list:: 1261 1262 ["this", 1, "is", "an", "array"] 1263 1264Lists are equivalent to C or Pascal arrays in their time complexity; the primary 1265difference is that a Python list can contain objects of many different types. 1266 1267The ``array`` module also provides methods for creating arrays of fixed types 1268with compact representations, but they are slower to index than lists. Also 1269note that the Numeric extensions and others define array-like structures with 1270various characteristics as well. 1271 1272To get Lisp-style linked lists, you can emulate cons cells using tuples:: 1273 1274 lisp_list = ("like", ("this", ("example", None) ) ) 1275 1276If mutability is desired, you could use lists instead of tuples. Here the 1277analogue of lisp car is ``lisp_list[0]`` and the analogue of cdr is 1278``lisp_list[1]``. Only do this if you're sure you really need to, because it's 1279usually a lot slower than using Python lists. 1280 1281 1282.. _faq-multidimensional-list: 1283 1284How do I create a multidimensional list? 
1285---------------------------------------- 1286 1287You probably tried to make a multidimensional array like this:: 1288 1289 >>> A = [[None] * 2] * 3 1290 1291This looks correct if you print it:: 1292 1293 >>> A 1294 [[None, None], [None, None], [None, None]] 1295 1296But when you assign a value, it shows up in multiple places: 1297 1298 >>> A[0][0] = 5 1299 >>> A 1300 [[5, None], [5, None], [5, None]] 1301 1302The reason is that replicating a list with ``*`` doesn't create copies, it only 1303creates references to the existing objects. The ``*3`` creates a list 1304containing 3 references to the same list of length two. Changes to one row will 1305show in all rows, which is almost certainly not what you want. 1306 1307The suggested approach is to create a list of the desired length first and then 1308fill in each element with a newly created list:: 1309 1310 A = [None] * 3 1311 for i in range(3): 1312 A[i] = [None] * 2 1313 1314This generates a list containing 3 different lists of length two. You can also 1315use a list comprehension:: 1316 1317 w, h = 2, 3 1318 A = [[None] * w for i in range(h)] 1319 1320Or, you can use an extension that provides a matrix datatype; `NumPy 1321<http://www.numpy.org/>`_ is the best known. 1322 1323 1324How do I apply a method to a sequence of objects? 1325------------------------------------------------- 1326 1327Use a list comprehension:: 1328 1329 result = [obj.method() for obj in mylist] 1330 1331More generically, you can try the following function:: 1332 1333 def method_map(objects, method, arguments): 1334 """method_map([a,b], "meth", (1,2)) gives [a.meth(1,2), b.meth(1,2)]""" 1335 nobjects = len(objects) 1336 methods = map(getattr, objects, [method]*nobjects) 1337 return map(apply, methods, [arguments]*nobjects) 1338 1339 1340Why does a_tuple[i] += ['item'] raise an exception when the addition works? 
1341--------------------------------------------------------------------------- 1342 1343This is because of a combination of the fact that augmented assignment 1344operators are *assignment* operators, and the difference between mutable and 1345immutable objects in Python. 1346 1347This discussion applies in general when augmented assignment operators are 1348applied to elements of a tuple that point to mutable objects, but we'll use 1349a ``list`` and ``+=`` as our exemplar. 1350 1351If you wrote:: 1352 1353 >>> a_tuple = (1, 2) 1354 >>> a_tuple[0] += 1 1355 Traceback (most recent call last): 1356 ... 1357 TypeError: 'tuple' object does not support item assignment 1358 1359The reason for the exception should be immediately clear: ``1`` is added to the 1360object ``a_tuple[0]`` points to (``1``), producing the result object, ``2``, 1361but when we attempt to assign the result of the computation, ``2``, to element 1362``0`` of the tuple, we get an error because we can't change what an element of 1363a tuple points to. 1364 1365Under the covers, what this augmented assignment statement is doing is 1366approximately this:: 1367 1368 >>> result = a_tuple[0] + 1 1369 >>> a_tuple[0] = result 1370 Traceback (most recent call last): 1371 ... 1372 TypeError: 'tuple' object does not support item assignment 1373 1374It is the assignment part of the operation that produces the error, since a 1375tuple is immutable. 1376 1377When you write something like:: 1378 1379 >>> a_tuple = (['foo'], 'bar') 1380 >>> a_tuple[0] += ['item'] 1381 Traceback (most recent call last): 1382 ... 
1383 TypeError: 'tuple' object does not support item assignment 1384 1385The exception is a bit more surprising, and even more surprising is the fact 1386that even though there was an error, the append worked:: 1387 1388 >>> a_tuple[0] 1389 ['foo', 'item'] 1390 1391To see why this happens, you need to know that (a) if an object implements an 1392``__iadd__`` magic method, it gets called when the ``+=`` augmented assignment 1393is executed, and its return value is what gets used in the assignment statement; 1394and (b) for lists, ``__iadd__`` is equivalent to calling ``extend`` on the list 1395and returning the list. That's why we say that for lists, ``+=`` is a 1396"shorthand" for ``list.extend``:: 1397 1398 >>> a_list = [] 1399 >>> a_list += [1] 1400 >>> a_list 1401 [1] 1402 1403This is equivalent to:: 1404 1405 >>> result = a_list.__iadd__([1]) 1406 >>> a_list = result 1407 1408The object pointed to by a_list has been mutated, and the pointer to the 1409mutated object is assigned back to ``a_list``. The end result of the 1410assignment is a no-op, since it is a pointer to the same object that ``a_list`` 1411was previously pointing to, but the assignment still happens. 1412 1413Thus, in our tuple example what is happening is equivalent to:: 1414 1415 >>> result = a_tuple[0].__iadd__(['item']) 1416 >>> a_tuple[0] = result 1417 Traceback (most recent call last): 1418 ... 1419 TypeError: 'tuple' object does not support item assignment 1420 1421The ``__iadd__`` succeeds, and thus the list is extended, but even though 1422``result`` points to the same object that ``a_tuple[0]`` already points to, 1423that final assignment still results in an error, because tuples are immutable. 1424 1425 1426Dictionaries 1427============ 1428 1429How can I get a dictionary to display its keys in a consistent order? 1430--------------------------------------------------------------------- 1431 1432You can't. 
Dictionaries store their keys in an unpredictable order, so the 1433display order of a dictionary's elements will be similarly unpredictable. 1434 1435This can be frustrating if you want to save a printable version to a file, make 1436some changes and then compare it with some other printed dictionary. In this 1437case, use the ``pprint`` module to pretty-print the dictionary; the items will 1438be presented in order sorted by the key. 1439 1440A more complicated solution is to subclass ``dict`` to create a 1441``SortedDict`` class that prints itself in a predictable order. Here's one 1442simpleminded implementation of such a class:: 1443 1444 class SortedDict(dict): 1445 def __repr__(self): 1446 keys = sorted(self.keys()) 1447 result = ("{!r}: {!r}".format(k, self[k]) for k in keys) 1448 return "{{{}}}".format(", ".join(result)) 1449 1450 __str__ = __repr__ 1451 1452This will work for many common situations you might encounter, though it's far 1453from a perfect solution. The largest flaw is that if some values in the 1454dictionary are also dictionaries, their values won't be presented in any 1455particular order. 1456 1457 1458I want to do a complicated sort: can you do a Schwartzian Transform in Python? 1459------------------------------------------------------------------------------ 1460 1461The technique, attributed to Randal Schwartz of the Perl community, sorts the 1462elements of a list by a metric which maps each element to its "sort value". In 1463Python, use the ``key`` argument for the :func:`sort()` function:: 1464 1465 Isorted = L[:] 1466 Isorted.sort(key=lambda s: int(s[10:15])) 1467 1468 1469How can I sort one list by values from another list? 1470---------------------------------------------------- 1471 1472Merge them into a single list of tuples, sort the resulting list, and then pick 1473out the element you want. 
:: 1474 1475 >>> list1 = ["what", "I'm", "sorting", "by"] 1476 >>> list2 = ["something", "else", "to", "sort"] 1477 >>> pairs = zip(list1, list2) 1478 >>> pairs 1479 [('what', 'something'), ("I'm", 'else'), ('sorting', 'to'), ('by', 'sort')] 1480 >>> pairs.sort() 1481 >>> result = [ x[1] for x in pairs ] 1482 >>> result 1483 ['else', 'sort', 'to', 'something'] 1484 1485An alternative for the last step is:: 1486 1487 >>> result = [] 1488 >>> for p in pairs: result.append(p[1]) 1489 1490If you find this more legible, you might prefer to use this instead of the final 1491list comprehension. However, it is almost twice as slow for long lists. Why? 1492First, the ``append()`` operation has to reallocate memory, and while it uses 1493some tricks to avoid doing that each time, it still has to do it occasionally, 1494and that costs quite a bit. Second, the expression "result.append" requires an 1495extra attribute lookup, and third, there's a speed reduction from having to make 1496all those function calls. 1497 1498 1499Objects 1500======= 1501 1502What is a class? 1503---------------- 1504 1505A class is the particular object type created by executing a class statement. 1506Class objects are used as templates to create instance objects, which embody 1507both the data (attributes) and code (methods) specific to a datatype. 1508 1509A class can be based on one or more other classes, called its base class(es). It 1510then inherits the attributes and methods of its base classes. This allows an 1511object model to be successively refined by inheritance. You might have a 1512generic ``Mailbox`` class that provides basic accessor methods for a mailbox, 1513and subclasses such as ``MboxMailbox``, ``MaildirMailbox``, ``OutlookMailbox`` 1514that handle various specific mailbox formats. 1515 1516 1517What is a method? 1518----------------- 1519 1520A method is a function on some object ``x`` that you normally call as 1521``x.name(arguments...)``. 
Methods are defined as functions inside the class 1522definition:: 1523 1524 class C: 1525 def meth(self, arg): 1526 return arg * 2 + self.attribute 1527 1528 1529What is self? 1530------------- 1531 1532Self is merely a conventional name for the first argument of a method. A method 1533defined as ``meth(self, a, b, c)`` should be called as ``x.meth(a, b, c)`` for 1534some instance ``x`` of the class in which the definition occurs; the called 1535method will think it is called as ``meth(x, a, b, c)``. 1536 1537See also :ref:`why-self`. 1538 1539 1540How do I check if an object is an instance of a given class or of a subclass of it? 1541----------------------------------------------------------------------------------- 1542 1543Use the built-in function ``isinstance(obj, cls)``. You can check if an object 1544is an instance of any of a number of classes by providing a tuple instead of a 1545single class, e.g. ``isinstance(obj, (class1, class2, ...))``, and can also 1546check whether an object is one of Python's built-in types, e.g. 1547``isinstance(obj, str)`` or ``isinstance(obj, (int, long, float, complex))``. 1548 1549Note that most programs do not use :func:`isinstance` on user-defined classes 1550very often. If you are developing the classes yourself, a more proper 1551object-oriented style is to define methods on the classes that encapsulate a 1552particular behaviour, instead of checking the object's class and doing a 1553different thing based on what class it is. For example, if you have a function 1554that does something:: 1555 1556 def search(obj): 1557 if isinstance(obj, Mailbox): 1558 ... # code to search a mailbox 1559 elif isinstance(obj, Document): 1560 ... # code to search a document 1561 elif ... 1562 1563A better approach is to define a ``search()`` method on all the classes and just 1564call it:: 1565 1566 class Mailbox: 1567 def search(self): 1568 ... # code to search a mailbox 1569 1570 class Document: 1571 def search(self): 1572 ... 
           # code to search a document

   obj.search()


What is delegation?
-------------------

Delegation is an object oriented technique (also called a design pattern).
Let's say you have an object ``x`` and want to change the behaviour of just one
of its methods.  You can create a new class that provides a new implementation
of the method you're interested in changing and delegates all other methods to
the corresponding method of ``x``.

Python programmers can easily implement delegation.  For example, the following
class behaves like a file but converts all written data to uppercase::

   class UpperOut:

       def __init__(self, outfile):
           self._outfile = outfile

       def write(self, s):
           self._outfile.write(s.upper())

       def __getattr__(self, name):
           return getattr(self._outfile, name)

Here the ``UpperOut`` class redefines the ``write()`` method to convert the
argument string to uppercase before calling the underlying
``self._outfile.write()`` method.  All other methods are delegated to the
underlying ``self._outfile`` object.  The delegation is accomplished via the
``__getattr__`` method; consult :ref:`the language reference <attribute-access>`
for more information about controlling attribute access.

Note that for more general cases delegation can get trickier.  When attributes
must be set as well as retrieved, the class must define a :meth:`__setattr__`
method too, and it must do so carefully.  The basic implementation of
:meth:`__setattr__` is roughly equivalent to the following::

   class X:
       ...
       def __setattr__(self, name, value):
           self.__dict__[name] = value
       ...

Most :meth:`__setattr__` implementations must modify ``self.__dict__`` to store
local state for self without causing an infinite recursion.
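The ``__setattr__`` caution above can be sketched as follows.  This is only one
possible arrangement, not a definitive implementation: the wrapper keeps its own
state in ``self.__dict__`` and forwards every other attribute write to the
wrapped object.  The ``Sink`` class is a hypothetical stand-in for a file-like
object, introduced only so the example is self-contained.

```python
class Sink(object):
    """Hypothetical stand-in for a file-like object."""

    def __init__(self):
        self.chunks = []
        self.label = "sink"

    def write(self, s):
        self.chunks.append(s)


class UpperOut(object):
    """Delegating wrapper that also forwards attribute assignment."""

    def __init__(self, outfile):
        # Store local state through __dict__ so that our own
        # __setattr__ is not triggered (which would forward the write).
        self.__dict__['_outfile'] = outfile

    def write(self, s):
        self._outfile.write(s.upper())

    def __getattr__(self, name):
        # Called only when normal lookup fails, so '_outfile' itself
        # is found in __dict__ and never reaches this method.
        return getattr(self._outfile, name)

    def __setattr__(self, name, value):
        # Forward all attribute writes to the wrapped object.
        setattr(self._outfile, name, value)


sink = Sink()
out = UpperOut(sink)
out.write("hello")
out.label = "renamed"   # lands on sink, not on the wrapper
```

Writing ``self._outfile = outfile`` in ``__init__`` here would call the custom
``__setattr__`` before ``_outfile`` exists, and the resulting lookups would
recurse through ``__getattr__``; going through ``self.__dict__`` avoids that.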
1621 1622 1623How do I call a method defined in a base class from a derived class that overrides it? 1624-------------------------------------------------------------------------------------- 1625 1626If you're using new-style classes, use the built-in :func:`super` function:: 1627 1628 class Derived(Base): 1629 def meth(self): 1630 super(Derived, self).meth() 1631 1632If you're using classic classes: For a class definition such as ``class 1633Derived(Base): ...`` you can call method ``meth()`` defined in ``Base`` (or one 1634of ``Base``'s base classes) as ``Base.meth(self, arguments...)``. Here, 1635``Base.meth`` is an unbound method, so you need to provide the ``self`` 1636argument. 1637 1638 1639How can I organize my code to make it easier to change the base class? 1640---------------------------------------------------------------------- 1641 1642You could define an alias for the base class, assign the real base class to it 1643before your class definition, and use the alias throughout your class. Then all 1644you have to change is the value assigned to the alias. Incidentally, this trick 1645is also handy if you want to decide dynamically (e.g. depending on availability 1646of resources) which base class to use. Example:: 1647 1648 BaseAlias = <real base class> 1649 1650 class Derived(BaseAlias): 1651 def meth(self): 1652 BaseAlias.meth(self) 1653 ... 1654 1655 1656How do I create static class data and static class methods? 1657----------------------------------------------------------- 1658 1659Both static data and static methods (in the sense of C++ or Java) are supported 1660in Python. 1661 1662For static data, simply define a class attribute. 
To assign a new value to the
attribute, you have to explicitly use the class name in the assignment::

   class C:
       count = 0   # number of times C.__init__ called

       def __init__(self):
           C.count = C.count + 1

       def getcount(self):
           return C.count  # or return self.count

``c.count`` also refers to ``C.count`` for any ``c`` such that ``isinstance(c,
C)`` holds, unless overridden by ``c`` itself or by some class on the base-class
search path from ``c.__class__`` back to ``C``.

Caution: within a method of C, an assignment like ``self.count = 42`` creates a
new and unrelated instance attribute named ``count`` in ``self``'s own dict.
Rebinding of a class-static data name must always specify the class, whether
inside a method or not::

   C.count = 314

Static methods are possible since Python 2.2::

   class C:
       def static(arg1, arg2, arg3):
           # No 'self' parameter!
           ...
       static = staticmethod(static)

With Python 2.4's decorators, this can also be written as ::

   class C:
       @staticmethod
       def static(arg1, arg2, arg3):
           # No 'self' parameter!
           ...

However, a far more straightforward way to get the effect of a static method is
via a simple module-level function::

   def getcount():
       return C.count

If your code is structured so as to define one class (or tightly related class
hierarchy) per module, this supplies the desired encapsulation.


How can I overload constructors (or methods) in Python?
-------------------------------------------------------

This answer actually applies to all methods, but the question usually comes up
first in the context of constructors.

In C++ you'd write

.. code-block:: c

   class C {
       C() { cout << "No arguments\n"; }
       C(int i) { cout << "Argument is " << i << "\n"; }
   };

In Python you have to write a single constructor that catches all cases using
default arguments.  For example::

   class C:
       def __init__(self, i=None):
           if i is None:
               print "No arguments"
           else:
               print "Argument is", i

This is not entirely equivalent, but close enough in practice.

You could also try a variable-length argument list, e.g. ::

   def __init__(self, *args):
       ...

The same approach works for all method definitions.


I try to use __spam and I get an error about _SomeClassName__spam.
------------------------------------------------------------------

Variable names with double leading underscores are "mangled" to provide a simple
but effective way to define class private variables.  Any identifier of the form
``__spam`` (at least two leading underscores, at most one trailing underscore)
is textually replaced with ``_classname__spam``, where ``classname`` is the
current class name with any leading underscores stripped.

This doesn't guarantee privacy: an outside user can still deliberately access
the ``_classname__spam`` attribute, and private values are visible in the
object's ``__dict__``.  Many Python programmers never bother to use private
variable names at all.


My class defines __del__ but it is not called when I delete the object.
-----------------------------------------------------------------------

There are several possible reasons for this.

The ``del`` statement does not necessarily call :meth:`__del__` -- it simply
decrements the object's reference count, and if this reaches zero
:meth:`__del__` is called.

If your data structures contain circular links (e.g.
a tree where each child has
a parent reference and each parent has a list of children) the reference counts
will never go back to zero.  Once in a while Python runs an algorithm to detect
such cycles, but the garbage collector might run some time after the last
reference to your data structure vanishes, so your :meth:`__del__` method may be
called at an inconvenient and unpredictable time.  This is a nuisance if you're
trying to reproduce a problem.  Worse, the order in which objects'
:meth:`__del__` methods are executed is arbitrary.  You can run
:func:`gc.collect` to force a collection, but there *are* pathological cases
where objects will never be collected.

Despite the cycle collector, it's still a good idea to define an explicit
``close()`` method on objects to be called whenever you're done with them.  The
``close()`` method can then remove attributes that refer to subobjects.  Don't
call :meth:`__del__` directly -- :meth:`__del__` should call ``close()`` and
``close()`` should make sure that it can be called more than once for the same
object.

Another way to avoid cyclical references is to use the :mod:`weakref` module,
which allows you to point to objects without incrementing their reference count.
Tree data structures, for instance, should use weak references for their parent
and sibling references (if they need them!).

If the object has ever been a local variable in a function that caught an
exception in an except clause, chances are that a reference to the object still
exists in that function's stack frame as contained in the stack trace.
Normally, calling :func:`sys.exc_clear` will take care of this by clearing the
last recorded exception.

Finally, if your :meth:`__del__` method raises an exception, a warning message
is printed to :data:`sys.stderr`.


How do I get a list of all instances of a given class?
------------------------------------------------------

Python does not keep track of all instances of a class (or of a built-in type).
You can program the class's constructor to keep track of all instances by
keeping a list of weak references to each instance.


Why does the result of ``id()`` appear to be not unique?
--------------------------------------------------------

The :func:`id` builtin returns an integer that is guaranteed to be unique during
the lifetime of the object.  Since in CPython, this is the object's memory
address, it happens frequently that after an object is deleted from memory, the
next freshly created object is allocated at the same position in memory.  This
is illustrated by this example:

>>> id(1000)
13901272
>>> id(2000)
13901272

The two ids belong to different integer objects that are created before, and
deleted immediately after execution of the ``id()`` call.  To be sure that
objects whose id you want to examine are still alive, create another reference
to the object:

>>> a = 1000; b = 2000
>>> id(a)
13901272
>>> id(b)
13891296


Modules
=======

How do I create a .pyc file?
----------------------------

When a module is imported for the first time (or when the source is more recent
than the current compiled file) a ``.pyc`` file containing the compiled code
should be created in the same directory as the ``.py`` file.

One reason that a ``.pyc`` file may not be created is permissions problems with
the directory.  This can happen, for example, if you develop as one user but run
as another, such as if you are testing with a web server.  Creation of a
``.pyc`` file is automatic if you're importing a module and Python has the
ability (permissions, free space, etc.) to write the compiled module back to the
directory.

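
One way to check the permissions side of this is to ask the operating system
whether the module's directory is writable by the current user.  The sketch
below is illustrative only: ``can_write_compiled`` is a hypothetical helper, not
part of the standard library, and it covers permissions but not free space or
other failure modes.

```python
import os


def can_write_compiled(source_path):
    """Return True if the directory holding source_path looks writable.

    Hypothetical helper for illustration; a True result does not
    guarantee that a compiled file will actually be written.
    """
    directory = os.path.dirname(os.path.abspath(source_path))
    # os.access() reports False for missing or read-only directories.
    return os.access(directory, os.W_OK)
```

Running ``can_write_compiled('xyz.py')`` under the same account as the web
server and getting ``False`` would be consistent with a missing ``xyz.pyc``.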

Running Python on a top level script is not considered an import and no
``.pyc`` will be created.  For example, if you have a top-level module
``foo.py`` that imports another module ``xyz.py``, when you run ``foo``,
``xyz.pyc`` will be created since ``xyz`` is imported, but no ``foo.pyc`` file
will be created since ``foo.py`` isn't being imported.

If you need to create ``foo.pyc`` -- that is, to create a ``.pyc`` file for a module
that is not imported -- you can, using the :mod:`py_compile` and
:mod:`compileall` modules.

The :mod:`py_compile` module can manually compile any module.  One way is to use
the ``compile()`` function in that module interactively::

   >>> import py_compile
   >>> py_compile.compile('foo.py')                 # doctest: +SKIP

This will write the ``.pyc`` to the same location as ``foo.py`` (or you can
override that with the optional parameter ``cfile``).

You can also automatically compile all files in a directory or directories using
the :mod:`compileall` module.  You can do it from the shell prompt by running
``compileall.py`` and providing the path of a directory containing Python files
to compile::

   python -m compileall .


How do I find the current module name?
--------------------------------------

A module can find out its own module name by looking at the predefined global
variable ``__name__``.  If this has the value ``'__main__'``, the program is
running as a script.  Many modules that are usually used by importing them also
provide a command-line interface or a self-test, and only execute this code
after checking ``__name__``::

   def main():
       print 'Running test...'
       ...

   if __name__ == '__main__':
       main()


How can I have modules that mutually import each other?
-------------------------------------------------------

Suppose you have the following modules:

foo.py::

   from bar import bar_var
   foo_var = 1

bar.py::

   from foo import foo_var
   bar_var = 2

The problem is that the interpreter will perform the following steps:

* main imports foo
* Empty globals for foo are created
* foo is compiled and starts executing
* foo imports bar
* Empty globals for bar are created
* bar is compiled and starts executing
* bar imports foo (which is a no-op since there already is a module named foo)
* bar.foo_var = foo.foo_var

The last step fails, because Python isn't done with interpreting ``foo`` yet and
the global symbol dictionary for ``foo`` is still empty.

The same thing happens when you use ``import foo``, and then try to access
``foo.foo_var`` in global code.

There are (at least) three possible workarounds for this problem.

Guido van Rossum recommends avoiding all uses of ``from <module> import ...``,
and placing all code inside functions.  Initializations of global variables and
class variables should use constants or built-in functions only.  This means
everything from an imported module is referenced as ``<module>.<name>``.

Jim Roskind suggests performing steps in the following order in each module:

* exports (globals, functions, and classes that don't need imported base
  classes)
* ``import`` statements
* active code (including globals that are initialized from imported values).

van Rossum doesn't like this approach much because the imports appear in a
strange place, but it does work.

Matthias Urlichs recommends restructuring your code so that the recursive import
is not necessary in the first place.

These solutions are not mutually exclusive.


__import__('x.y.z') returns <module 'x'>; how do I get z?
---------------------------------------------------------

Consider using the convenience function :func:`~importlib.import_module` from
:mod:`importlib` instead::

   import importlib

   z = importlib.import_module('x.y.z')


When I edit an imported module and reimport it, the changes don't show up.  Why does this happen?
-------------------------------------------------------------------------------------------------

For reasons of efficiency as well as consistency, Python only reads the module
file on the first time a module is imported.  If it didn't, in a program
consisting of many modules where each one imports the same basic module, the
basic module would be parsed and re-parsed many times.  To force rereading of a
changed module, do this::

   import modname
   reload(modname)

Warning: this technique is not 100% fool-proof.  In particular, modules
containing statements like ::

   from modname import some_objects

will continue to work with the old version of the imported objects.  If the
module contains class definitions, existing class instances will *not* be
updated to use the new class definition.  This can result in the following
paradoxical behaviour:

   >>> import cls
   >>> c = cls.C()                # Create an instance of C
   >>> reload(cls)
   <module 'cls' from 'cls.pyc'>
   >>> isinstance(c, cls.C)       # isinstance is false?!?
   False

The nature of the problem is made clear if you print out the class objects:

   >>> c.__class__
   <class cls.C at 0x7352a0>
   >>> cls.C
   <class cls.C at 0x4198d0>

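
The same mismatch can be reproduced without a module on disk.  The sketch below
is only an analogy, not what ``reload()`` does internally: ``make_class`` is a
hypothetical helper that executes the same ``class`` statement twice, which
yields two distinct class objects with the same name, much as re-executing the
module's code would.

```python
def make_class():
    # Each call runs the class statement again and builds a new class object.
    class C(object):
        pass
    return C


OldC = make_class()      # stands in for cls.C before the reload
c = OldC()               # instance created from the old class object
NewC = make_class()      # stands in for cls.C after the reload

print(isinstance(c, NewC))   # False: c still points at the old class object
print(isinstance(c, OldC))   # True
print(OldC is NewC)          # False, although both classes are named 'C'
```

Instances created before the second definition keep their reference to the old
class object, so they have to be recreated to pick up the new definition.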