************************************
  Idioms and Anti-Idioms in Python
************************************

:Author: Moshe Zadka

This document is placed in the public domain.


.. topic:: Abstract

   This document can be considered a companion to the tutorial. It shows how to use
   Python, and even more importantly, how *not* to use Python.


Language Constructs You Should Not Use
======================================

While Python has relatively few gotchas compared to other languages, it still
has some constructs which are only useful in corner cases, or are plain
dangerous.


from module import \*
---------------------


Inside Function Definitions
^^^^^^^^^^^^^^^^^^^^^^^^^^^

``from module import *`` is *invalid* inside function definitions. While many
versions of Python do not check for the invalidity, it does not make it more
valid, no more than having a smart lawyer makes a man innocent. Do not use it
like that ever. Even in versions where it was accepted, it made the function
execution slower, because the compiler could not be certain which names were
local and which were global. In Python 2.1 this construct causes warnings, and
sometimes even errors.


At Module Level
^^^^^^^^^^^^^^^

While it is valid to use ``from module import *`` at module level it is usually
a bad idea. For one, this loses an important property Python otherwise has ---
you can know where each toplevel name is defined by a simple "search" function
in your favourite editor. You also open yourself to trouble in the future, if
some module grows additional functions or classes.

One of the most awful questions asked on the newsgroup is why this code::

   f = open("www")
   f.read()

does not work. Of course, it works just fine (assuming you have a file called
"www".) But it does not work if somewhere in the module, the statement ``from
os import *`` is present.
The :mod:`os` module has a function called
:func:`open` which returns an integer. While it is very useful, shadowing a
builtin is one of its least useful properties.

Remember, you can never know for sure what names a module exports, so either
take what you need --- ``from module import name1, name2``, or keep them in the
module and access on a per-need basis --- ``import module;print module.name``.


When It Is Just Fine
^^^^^^^^^^^^^^^^^^^^

There are situations in which ``from module import *`` is just fine:

* The interactive prompt. For example, ``from math import *`` makes Python an
  amazing scientific calculator.

* When extending a module in C with a module in Python.

* When the module advertises itself as ``from import *`` safe.


Unadorned :keyword:`exec`, :func:`execfile` and friends
-------------------------------------------------------

The word "unadorned" refers to the use without an explicit dictionary, in which
case those constructs evaluate code in the *current* environment. This is
dangerous for the same reasons ``from import *`` is dangerous --- it might step
over variables you are counting on and mess up things for the rest of your code.
Simply do not do that.

Bad examples::

   >>> for name in sys.argv[1:]:
   >>>     exec "%s=1" % name
   >>> def func(s, **kw):
   >>>     for var, val in kw.items():
   >>>         exec "s.%s=val" % var  # invalid!
   >>> execfile("handler.py")
   >>> handle()

Good examples::

   >>> d = {}
   >>> for name in sys.argv[1:]:
   >>>     d[name] = 1
   >>> def func(s, **kw):
   >>>     for var, val in kw.items():
   >>>         setattr(s, var, val)
   >>> d = {}
   >>> execfile("handler.py", d, d)
   >>> handle = d['handle']
   >>> handle()


from module import name1, name2
-------------------------------

This is a "don't" which is much weaker than the previous "don't"s but is still
something you should not do if you don't have good reasons to do that. The
reason it is usually a bad idea is because you suddenly have an object which lives
in two separate namespaces. When the binding in one namespace changes, the
binding in the other will not, so there will be a discrepancy between them. This
happens when, for example, one module is reloaded, or changes the definition of
a function at runtime.

Bad example::

   # foo.py
   a = 1

   # bar.py
   from foo import a
   if something():
       a = 2 # danger: foo.a != a

Good example::

   # foo.py
   a = 1

   # bar.py
   import foo
   if something():
       foo.a = 2


except:
-------

Python has the ``except:`` clause, which catches all exceptions. Since *every*
error in Python raises an exception, using ``except:`` can make many
programming errors look like runtime problems, which hinders the debugging
process.

The following code shows a great example of why this is bad::

   try:
       foo = opne("file") # misspelled "open"
   except:
       sys.exit("could not open file!")

The second line triggers a :exc:`NameError`, which is caught by the except
clause. The program will exit, and the error message the program prints will
make you think the problem is the readability of ``"file"`` when in fact
the real error has nothing to do with ``"file"``.

A better way to write the above is ::

   try:
       foo = opne("file")
   except IOError:
       sys.exit("could not open file")

When this is run, Python will produce a traceback showing the :exc:`NameError`,
and it will be immediately apparent what needs to be fixed.

.. index:: bare except, except; bare

Because ``except:`` catches *all* exceptions, including :exc:`SystemExit`,
:exc:`KeyboardInterrupt`, and :exc:`GeneratorExit` (which is not an error and
should not normally be caught by user code), using a bare ``except:`` is almost
never a good idea. In situations where you need to catch all "normal" errors,
such as in a framework that runs callbacks, you can catch the base class for
all normal exceptions, :exc:`Exception`. Unfortunately in Python 2.x it is
possible for third-party code to raise exceptions that do not inherit from
:exc:`Exception`, so in Python 2.x there are some cases where you may have to
use a bare ``except:`` and manually re-raise the exceptions you don't want
to catch.


Exceptions
==========

Exceptions are a useful feature of Python. You should learn to raise them
whenever something unexpected occurs, and catch them only where you can do
something about them.

The following is a very popular anti-idiom ::

   def get_status(file):
       if not os.path.exists(file):
           print "file not found"
           sys.exit(1)
       return open(file).readline()

Consider the case where the file gets deleted between the time the call to
:func:`os.path.exists` is made and the time :func:`open` is called. In that
case the last line will raise an :exc:`IOError`. The same thing would happen
if *file* exists but has no read permission. Since testing this on a normal
machine on existent and non-existent files makes it seem bugless, the test
results will seem fine, and the code will get shipped.
Later an unhandled
:exc:`IOError` (or perhaps some other :exc:`EnvironmentError`) escapes to the
user, who gets to watch the ugly traceback.

Here is a somewhat better way to do it. ::

   def get_status(file):
       try:
           return open(file).readline()
       except EnvironmentError as err:
           print "Unable to open file: {}".format(err)
           sys.exit(1)

In this version, *either* the file gets opened and the line is read (so it
works even on flaky NFS or SMB connections), or an error message is printed
that provides all the available information on why the open failed, and the
application is aborted.

However, even this version of :func:`get_status` makes too many assumptions ---
that it will only be used in a short running script, and not, say, in a long
running server. Sure, the caller could do something like ::

   try:
       status = get_status(log)
   except SystemExit:
       status = None

But there is a better way. You should try to use as few ``except`` clauses in
your code as you can --- the ones you do use will usually be inside calls which
should always succeed, or a catch-all in a main function.

So, an even better version of :func:`get_status()` is probably ::

   def get_status(file):
       return open(file).readline()

The caller can deal with the exception if it wants (for example, if it tries
several files in a loop), or just let the exception filter upwards to *its*
caller.

But the last version still has a serious problem --- due to implementation
details in CPython, the file would not be closed when an exception is raised
until the exception handler finishes; and, worse, in other implementations
(e.g., Jython) it might not be closed at all regardless of whether or not
an exception is raised.

The best version of this function uses the ``open()`` call as a context
manager, which will ensure that the file gets closed as soon as the
function returns::

   def get_status(file):
       with open(file) as fp:
           return fp.readline()


Using the Batteries
===================

Every so often, people seem to be writing stuff in the Python library again,
usually poorly. While the occasional module has a poor interface, it is usually
much better to use the rich standard library and data types that come with
Python than inventing your own.

A useful module very few people know about is :mod:`os.path`. It always has the
correct path arithmetic for your operating system, and will usually be much
better than whatever you come up with yourself.

Compare::

   # ugh!
   return dir+"/"+file
   # better
   return os.path.join(dir, file)

More useful functions in :mod:`os.path`: :func:`basename`, :func:`dirname` and
:func:`splitext`.

There are also many useful built-in functions people seem not to be aware of
for some reason: :func:`min` and :func:`max` can find the minimum/maximum of
any sequence with comparable semantics, for example, yet many people write
their own :func:`max`/:func:`min`. Another highly useful function is
:func:`reduce` which can be used to repeatedly apply a binary operation to a
sequence, reducing it to a single value. For example, compute a factorial
with a series of multiply operations::

   >>> n = 4
   >>> import operator
   >>> reduce(operator.mul, range(1, n+1))
   24

When it comes to parsing numbers, note that :func:`float`, :func:`int` and
:func:`long` all accept string arguments and will reject ill-formed strings
by raising a :exc:`ValueError`.
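
As a rough illustration of the point about parsing numbers, the sketch below
leans on :func:`int` and :func:`float` instead of hand-written digit checks.
The helper name ``parse_number`` is hypothetical and not part of the standard
library. It is written to run on both Python 2 and Python 3, which is why
:func:`long` is left out.

```python
def parse_number(text):
    """Return an int or a float parsed from text, or None if it is ill-formed.

    Hypothetical helper for illustration only.
    """
    # Try the stricter conversion first, then fall back to float.
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            # The built-in rejected the string; try the next converter.
            continue
    return None


print(parse_number("42"))    # 42
print(parse_number("3.5"))   # 3.5
print(parse_number("4 2"))   # None
```

Returning ``None`` is only one possible design for this sketch. In many
programs it is better to let the :exc:`ValueError` propagate to the caller.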


Using Backslash to Continue Statements
======================================

Since Python treats a newline as a statement terminator, and since statements
are often more than is comfortable to put in one line, many people do::

   if foo.bar()['first'][0] == baz.quux(1, 2)[5:9] and \
           calculate_number(10, 20) != forbulate(500, 360):
       pass

You should realize that this is dangerous: a stray space after the ``\`` would
make this line wrong, and stray spaces are notoriously hard to see in editors.
In this case, at least it would be a syntax error, but if the code was::

   value = foo.bar()['first'][0]*baz.quux(1, 2)[5:9] \
           + calculate_number(10, 20)*forbulate(500, 360)

then it would just be subtly wrong.

It is usually much better to use the implicit continuation inside parentheses:

This version is bulletproof::

   value = (foo.bar()['first'][0]*baz.quux(1, 2)[5:9]
           + calculate_number(10, 20)*forbulate(500, 360))
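
The claim that a stray space after the backslash in an ``if`` statement is a
syntax error can be checked with the built-in :func:`compile`. The snippet
below is only a sketch: the source strings and the helper ``compiles`` are
made up for illustration.

```python
# Hypothetical source snippets, used only for this demonstration.
clean_backslash = "if 1 == 1 and \\\n        2 == 2:\n    pass\n"
stray_space = "if 1 == 1 and \\ \n        2 == 2:\n    pass\n"   # space after the backslash
parenthesized = "if (1 == 1 and \n        2 == 2):\n    pass\n"  # trailing space after 'and'


def compiles(source):
    """Return True if source is accepted by the compiler, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


print(compiles(clean_backslash))  # True
print(compiles(stray_space))      # False
print(compiles(parenthesized))    # True
```

The parenthesized form still compiles with trailing whitespace at the end of
its first line, which is what makes it the more robust choice.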