1******************************** 2 Functional Programming HOWTO 3******************************** 4 5:Author: A. M. Kuchling 6:Release: 0.32 7 8In this document, we'll take a tour of Python's features suitable for 9implementing programs in a functional style. After an introduction to the 10concepts of functional programming, we'll look at language features such as 11:term:`iterator`\s and :term:`generator`\s and relevant library modules such as 12:mod:`itertools` and :mod:`functools`. 13 14 15Introduction 16============ 17 18This section explains the basic concept of functional programming; if 19you're just interested in learning about Python language features, 20skip to the next section on :ref:`functional-howto-iterators`. 21 22Programming languages support decomposing problems in several different ways: 23 24* Most programming languages are **procedural**: programs are lists of 25 instructions that tell the computer what to do with the program's input. C, 26 Pascal, and even Unix shells are procedural languages. 27 28* In **declarative** languages, you write a specification that describes the 29 problem to be solved, and the language implementation figures out how to 30 perform the computation efficiently. SQL is the declarative language you're 31 most likely to be familiar with; a SQL query describes the data set you want 32 to retrieve, and the SQL engine decides whether to scan tables or use indexes, 33 which subclauses should be performed first, etc. 34 35* **Object-oriented** programs manipulate collections of objects. Objects have 36 internal state and support methods that query or modify this internal state in 37 some way. Smalltalk and Java are object-oriented languages. C++ and Python 38 are languages that support object-oriented programming, but don't force the 39 use of object-oriented features. 40 41* **Functional** programming decomposes a problem into a set of functions. 42 Ideally, functions only take inputs and produce outputs, and don't have any 43 internal state that affects the output produced for a given input. Well-known 44 functional languages include the ML family (Standard ML, OCaml, and other 45 variants) and Haskell. 46 47The designers of some computer languages choose to emphasize one 48particular approach to programming. This often makes it difficult to 49write programs that use a different approach. Other languages are 50multi-paradigm languages that support several different approaches. 51Lisp, C++, and Python are multi-paradigm; you can write programs or 52libraries that are largely procedural, object-oriented, or functional 53in all of these languages. In a large program, different sections 54might be written using different approaches; the GUI might be 55object-oriented while the processing logic is procedural or 56functional, for example. 57 58In a functional program, input flows through a set of functions. Each function 59operates on its input and produces some output. Functional style discourages 60functions with side effects that modify internal state or make other changes 61that aren't visible in the function's return value. Functions that have no side 62effects at all are called **purely functional**. Avoiding side effects means 63not using data structures that get updated as a program runs; every function's 64output must only depend on its input. 65 66Some languages are very strict about purity and don't even have assignment 67statements such as ``a=3`` or ``c = a + b``, but it's difficult to avoid all 68side effects. Printing to the screen or writing to a disk file are side 69effects, for example. For example, in Python a call to the :func:`print` or 70:func:`time.sleep` function both return no useful value; they're only called for 71their side effects of sending some text to the screen or pausing execution for a 72second. 73 74Python programs written in functional style usually won't go to the extreme of 75avoiding all I/O or all assignments; instead, they'll provide a 76functional-appearing interface but will use non-functional features internally. 77For example, the implementation of a function will still use assignments to 78local variables, but won't modify global variables or have other side effects. 79 80Functional programming can be considered the opposite of object-oriented 81programming. Objects are little capsules containing some internal state along 82with a collection of method calls that let you modify this state, and programs 83consist of making the right set of state changes. Functional programming wants 84to avoid state changes as much as possible and works with data flowing between 85functions. In Python you might combine the two approaches by writing functions 86that take and return instances representing objects in your application (e-mail 87messages, transactions, etc.). 88 89Functional design may seem like an odd constraint to work under. Why should you 90avoid objects and side effects? There are theoretical and practical advantages 91to the functional style: 92 93* Formal provability. 94* Modularity. 95* Composability. 96* Ease of debugging and testing. 97 98 99Formal provability 100------------------ 101 102A theoretical benefit is that it's easier to construct a mathematical proof that 103a functional program is correct. 104 105For a long time researchers have been interested in finding ways to 106mathematically prove programs correct. This is different from testing a program 107on numerous inputs and concluding that its output is usually correct, or reading 108a program's source code and concluding that the code looks right; the goal is 109instead a rigorous proof that a program produces the right result for all 110possible inputs. 111 112The technique used to prove programs correct is to write down **invariants**, 113properties of the input data and of the program's variables that are always 114true. For each line of code, you then show that if invariants X and Y are true 115**before** the line is executed, the slightly different invariants X' and Y' are 116true **after** the line is executed. This continues until you reach the end of 117the program, at which point the invariants should match the desired conditions 118on the program's output. 119 120Functional programming's avoidance of assignments arose because assignments are 121difficult to handle with this technique; assignments can break invariants that 122were true before the assignment without producing any new invariants that can be 123propagated onward. 124 125Unfortunately, proving programs correct is largely impractical and not relevant 126to Python software. Even trivial programs require proofs that are several pages 127long; the proof of correctness for a moderately complicated program would be 128enormous, and few or none of the programs you use daily (the Python interpreter, 129your XML parser, your web browser) could be proven correct. Even if you wrote 130down or generated a proof, there would then be the question of verifying the 131proof; maybe there's an error in it, and you wrongly believe you've proved the 132program correct. 133 134 135Modularity 136---------- 137 138A more practical benefit of functional programming is that it forces you to 139break apart your problem into small pieces. Programs are more modular as a 140result. It's easier to specify and write a small function that does one thing 141than a large function that performs a complicated transformation. Small 142functions are also easier to read and to check for errors. 143 144 145Ease of debugging and testing 146----------------------------- 147 148Testing and debugging a functional-style program is easier. 149 150Debugging is simplified because functions are generally small and clearly 151specified. When a program doesn't work, each function is an interface point 152where you can check that the data are correct. You can look at the intermediate 153inputs and outputs to quickly isolate the function that's responsible for a bug. 154 155Testing is easier because each function is a potential subject for a unit test. 156Functions don't depend on system state that needs to be replicated before 157running a test; instead you only have to synthesize the right input and then 158check that the output matches expectations. 159 160 161Composability 162------------- 163 164As you work on a functional-style program, you'll write a number of functions 165with varying inputs and outputs. Some of these functions will be unavoidably 166specialized to a particular application, but others will be useful in a wide 167variety of programs. For example, a function that takes a directory path and 168returns all the XML files in the directory, or a function that takes a filename 169and returns its contents, can be applied to many different situations. 170 171Over time you'll form a personal library of utilities. Often you'll assemble 172new programs by arranging existing functions in a new configuration and writing 173a few functions specialized for the current task. 174 175 176.. _functional-howto-iterators: 177 178Iterators 179========= 180 181I'll start by looking at a Python language feature that's an important 182foundation for writing functional-style programs: iterators. 183 184An iterator is an object representing a stream of data; this object returns the 185data one element at a time. A Python iterator must support a method called 186:meth:`~iterator.__next__` that takes no arguments and always returns the next 187element of the stream. If there are no more elements in the stream, 188:meth:`~iterator.__next__` must raise the :exc:`StopIteration` exception. 189Iterators don't have to be finite, though; it's perfectly reasonable to write 190an iterator that produces an infinite stream of data. 191 192The built-in :func:`iter` function takes an arbitrary object and tries to return 193an iterator that will return the object's contents or elements, raising 194:exc:`TypeError` if the object doesn't support iteration. Several of Python's 195built-in data types support iteration, the most common being lists and 196dictionaries. An object is called :term:`iterable` if you can get an iterator 197for it. 198 199You can experiment with the iteration interface manually: 200 201 >>> L = [1, 2, 3] 202 >>> it = iter(L) 203 >>> it #doctest: +ELLIPSIS 204 <...iterator object at ...> 205 >>> it.__next__() # same as next(it) 206 1 207 >>> next(it) 208 2 209 >>> next(it) 210 3 211 >>> next(it) 212 Traceback (most recent call last): 213 File "<stdin>", line 1, in <module> 214 StopIteration 215 >>> 216 217Python expects iterable objects in several different contexts, the most 218important being the :keyword:`for` statement. In the statement ``for X in Y``, 219Y must be an iterator or some object for which :func:`iter` can create an 220iterator. These two statements are equivalent:: 221 222 223 for i in iter(obj): 224 print(i) 225 226 for i in obj: 227 print(i) 228 229Iterators can be materialized as lists or tuples by using the :func:`list` or 230:func:`tuple` constructor functions: 231 232 >>> L = [1, 2, 3] 233 >>> iterator = iter(L) 234 >>> t = tuple(iterator) 235 >>> t 236 (1, 2, 3) 237 238Sequence unpacking also supports iterators: if you know an iterator will return 239N elements, you can unpack them into an N-tuple: 240 241 >>> L = [1, 2, 3] 242 >>> iterator = iter(L) 243 >>> a, b, c = iterator 244 >>> a, b, c 245 (1, 2, 3) 246 247Built-in functions such as :func:`max` and :func:`min` can take a single 248iterator argument and will return the largest or smallest element. The ``"in"`` 249and ``"not in"`` operators also support iterators: ``X in iterator`` is true if 250X is found in the stream returned by the iterator. You'll run into obvious 251problems if the iterator is infinite; :func:`max`, :func:`min` 252will never return, and if the element X never appears in the stream, the 253``"in"`` and ``"not in"`` operators won't return either. 254 255Note that you can only go forward in an iterator; there's no way to get the 256previous element, reset the iterator, or make a copy of it. Iterator objects 257can optionally provide these additional capabilities, but the iterator protocol 258only specifies the :meth:`~iterator.__next__` method. Functions may therefore 259consume all of the iterator's output, and if you need to do something different 260with the same stream, you'll have to create a new iterator. 261 262 263 264Data Types That Support Iterators 265--------------------------------- 266 267We've already seen how lists and tuples support iterators. In fact, any Python 268sequence type, such as strings, will automatically support creation of an 269iterator. 270 271Calling :func:`iter` on a dictionary returns an iterator that will loop over the 272dictionary's keys:: 273 274 >>> m = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6, 275 ... 'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12} 276 >>> for key in m: 277 ... print(key, m[key]) 278 Jan 1 279 Feb 2 280 Mar 3 281 Apr 4 282 May 5 283 Jun 6 284 Jul 7 285 Aug 8 286 Sep 9 287 Oct 10 288 Nov 11 289 Dec 12 290 291Note that starting with Python 3.7, dictionary iteration order is guaranteed 292to be the same as the insertion order. In earlier versions, the behaviour was 293unspecified and could vary between implementations. 294 295Applying :func:`iter` to a dictionary always loops over the keys, but 296dictionaries have methods that return other iterators. If you want to iterate 297over values or key/value pairs, you can explicitly call the 298:meth:`~dict.values` or :meth:`~dict.items` methods to get an appropriate 299iterator. 300 301The :func:`dict` constructor can accept an iterator that returns a finite stream 302of ``(key, value)`` tuples: 303 304 >>> L = [('Italy', 'Rome'), ('France', 'Paris'), ('US', 'Washington DC')] 305 >>> dict(iter(L)) 306 {'Italy': 'Rome', 'France': 'Paris', 'US': 'Washington DC'} 307 308Files also support iteration by calling the :meth:`~io.TextIOBase.readline` 309method until there are no more lines in the file. This means you can read each 310line of a file like this:: 311 312 for line in file: 313 # do something for each line 314 ... 315 316Sets can take their contents from an iterable and let you iterate over the set's 317elements:: 318 319 S = {2, 3, 5, 7, 11, 13} 320 for i in S: 321 print(i) 322 323 324 325Generator expressions and list comprehensions 326============================================= 327 328Two common operations on an iterator's output are 1) performing some operation 329for every element, 2) selecting a subset of elements that meet some condition. 330For example, given a list of strings, you might want to strip off trailing 331whitespace from each line or extract all the strings containing a given 332substring. 333 334List comprehensions and generator expressions (short form: "listcomps" and 335"genexps") are a concise notation for such operations, borrowed from the 336functional programming language Haskell (https://www.haskell.org/). You can strip 337all the whitespace from a stream of strings with the following code:: 338 339 line_list = [' line 1\n', 'line 2 \n', ...] 340 341 # Generator expression -- returns iterator 342 stripped_iter = (line.strip() for line in line_list) 343 344 # List comprehension -- returns list 345 stripped_list = [line.strip() for line in line_list] 346 347You can select only certain elements by adding an ``"if"`` condition:: 348 349 stripped_list = [line.strip() for line in line_list 350 if line != ""] 351 352With a list comprehension, you get back a Python list; ``stripped_list`` is a 353list containing the resulting lines, not an iterator. Generator expressions 354return an iterator that computes the values as necessary, not needing to 355materialize all the values at once. This means that list comprehensions aren't 356useful if you're working with iterators that return an infinite stream or a very 357large amount of data. Generator expressions are preferable in these situations. 358 359Generator expressions are surrounded by parentheses ("()") and list 360comprehensions are surrounded by square brackets ("[]"). Generator expressions 361have the form:: 362 363 ( expression for expr in sequence1 364 if condition1 365 for expr2 in sequence2 366 if condition2 367 for expr3 in sequence3 ... 368 if condition3 369 for exprN in sequenceN 370 if conditionN ) 371 372Again, for a list comprehension only the outside brackets are different (square 373brackets instead of parentheses). 374 375The elements of the generated output will be the successive values of 376``expression``. The ``if`` clauses are all optional; if present, ``expression`` 377is only evaluated and added to the result when ``condition`` is true. 378 379Generator expressions always have to be written inside parentheses, but the 380parentheses signalling a function call also count. If you want to create an 381iterator that will be immediately passed to a function you can write:: 382 383 obj_total = sum(obj.count for obj in list_all_objects()) 384 385The ``for...in`` clauses contain the sequences to be iterated over. The 386sequences do not have to be the same length, because they are iterated over from 387left to right, **not** in parallel. For each element in ``sequence1``, 388``sequence2`` is looped over from the beginning. ``sequence3`` is then looped 389over for each resulting pair of elements from ``sequence1`` and ``sequence2``. 390 391To put it another way, a list comprehension or generator expression is 392equivalent to the following Python code:: 393 394 for expr1 in sequence1: 395 if not (condition1): 396 continue # Skip this element 397 for expr2 in sequence2: 398 if not (condition2): 399 continue # Skip this element 400 ... 401 for exprN in sequenceN: 402 if not (conditionN): 403 continue # Skip this element 404 405 # Output the value of 406 # the expression. 407 408This means that when there are multiple ``for...in`` clauses but no ``if`` 409clauses, the length of the resulting output will be equal to the product of the 410lengths of all the sequences. If you have two lists of length 3, the output 411list is 9 elements long: 412 413 >>> seq1 = 'abc' 414 >>> seq2 = (1, 2, 3) 415 >>> [(x, y) for x in seq1 for y in seq2] #doctest: +NORMALIZE_WHITESPACE 416 [('a', 1), ('a', 2), ('a', 3), 417 ('b', 1), ('b', 2), ('b', 3), 418 ('c', 1), ('c', 2), ('c', 3)] 419 420To avoid introducing an ambiguity into Python's grammar, if ``expression`` is 421creating a tuple, it must be surrounded with parentheses. The first list 422comprehension below is a syntax error, while the second one is correct:: 423 424 # Syntax error 425 [x, y for x in seq1 for y in seq2] 426 # Correct 427 [(x, y) for x in seq1 for y in seq2] 428 429 430Generators 431========== 432 433Generators are a special class of functions that simplify the task of writing 434iterators. Regular functions compute a value and return it, but generators 435return an iterator that returns a stream of values. 436 437You're doubtless familiar with how regular function calls work in Python or C. 438When you call a function, it gets a private namespace where its local variables 439are created. When the function reaches a ``return`` statement, the local 440variables are destroyed and the value is returned to the caller. A later call 441to the same function creates a new private namespace and a fresh set of local 442variables. But, what if the local variables weren't thrown away on exiting a 443function? What if you could later resume the function where it left off? This 444is what generators provide; they can be thought of as resumable functions. 445 446Here's the simplest example of a generator function: 447 448 >>> def generate_ints(N): 449 ... for i in range(N): 450 ... yield i 451 452Any function containing a :keyword:`yield` keyword is a generator function; 453this is detected by Python's :term:`bytecode` compiler which compiles the 454function specially as a result. 455 456When you call a generator function, it doesn't return a single value; instead it 457returns a generator object that supports the iterator protocol. On executing 458the ``yield`` expression, the generator outputs the value of ``i``, similar to a 459``return`` statement. The big difference between ``yield`` and a ``return`` 460statement is that on reaching a ``yield`` the generator's state of execution is 461suspended and local variables are preserved. On the next call to the 462generator's :meth:`~generator.__next__` method, the function will resume 463executing. 464 465Here's a sample usage of the ``generate_ints()`` generator: 466 467 >>> gen = generate_ints(3) 468 >>> gen #doctest: +ELLIPSIS 469 <generator object generate_ints at ...> 470 >>> next(gen) 471 0 472 >>> next(gen) 473 1 474 >>> next(gen) 475 2 476 >>> next(gen) 477 Traceback (most recent call last): 478 File "stdin", line 1, in <module> 479 File "stdin", line 2, in generate_ints 480 StopIteration 481 482You could equally write ``for i in generate_ints(5)``, or ``a, b, c = 483generate_ints(3)``. 484 485Inside a generator function, ``return value`` causes ``StopIteration(value)`` 486to be raised from the :meth:`~generator.__next__` method. Once this happens, or 487the bottom of the function is reached, the procession of values ends and the 488generator cannot yield any further values. 489 490You could achieve the effect of generators manually by writing your own class 491and storing all the local variables of the generator as instance variables. For 492example, returning a list of integers could be done by setting ``self.count`` to 4930, and having the :meth:`~iterator.__next__` method increment ``self.count`` and 494return it. 495However, for a moderately complicated generator, writing a corresponding class 496can be much messier. 497 498The test suite included with Python's library, 499:source:`Lib/test/test_generators.py`, contains 500a number of more interesting examples. Here's one generator that implements an 501in-order traversal of a tree using generators recursively. :: 502 503 # A recursive generator that generates Tree leaves in in-order. 504 def inorder(t): 505 if t: 506 for x in inorder(t.left): 507 yield x 508 509 yield t.label 510 511 for x in inorder(t.right): 512 yield x 513 514Two other examples in ``test_generators.py`` produce solutions for the N-Queens 515problem (placing N queens on an NxN chess board so that no queen threatens 516another) and the Knight's Tour (finding a route that takes a knight to every 517square of an NxN chessboard without visiting any square twice). 518 519 520 521Passing values into a generator 522------------------------------- 523 524In Python 2.4 and earlier, generators only produced output. Once a generator's 525code was invoked to create an iterator, there was no way to pass any new 526information into the function when its execution is resumed. You could hack 527together this ability by making the generator look at a global variable or by 528passing in some mutable object that callers then modify, but these approaches 529are messy. 530 531In Python 2.5 there's a simple way to pass values into a generator. 532:keyword:`yield` became an expression, returning a value that can be assigned to 533a variable or otherwise operated on:: 534 535 val = (yield i) 536 537I recommend that you **always** put parentheses around a ``yield`` expression 538when you're doing something with the returned value, as in the above example. 539The parentheses aren't always necessary, but it's easier to always add them 540instead of having to remember when they're needed. 541 542(:pep:`342` explains the exact rules, which are that a ``yield``-expression must 543always be parenthesized except when it occurs at the top-level expression on the 544right-hand side of an assignment. This means you can write ``val = yield i`` 545but have to use parentheses when there's an operation, as in ``val = (yield i) 546+ 12``.) 547 548Values are sent into a generator by calling its :meth:`send(value) 549<generator.send>` method. This method resumes the generator's code and the 550``yield`` expression returns the specified value. If the regular 551:meth:`~generator.__next__` method is called, the ``yield`` returns ``None``. 552 553Here's a simple counter that increments by 1 and allows changing the value of 554the internal counter. 555 556.. testcode:: 557 558 def counter(maximum): 559 i = 0 560 while i < maximum: 561 val = (yield i) 562 # If value provided, change counter 563 if val is not None: 564 i = val 565 else: 566 i += 1 567 568And here's an example of changing the counter: 569 570 >>> it = counter(10) #doctest: +SKIP 571 >>> next(it) #doctest: +SKIP 572 0 573 >>> next(it) #doctest: +SKIP 574 1 575 >>> it.send(8) #doctest: +SKIP 576 8 577 >>> next(it) #doctest: +SKIP 578 9 579 >>> next(it) #doctest: +SKIP 580 Traceback (most recent call last): 581 File "t.py", line 15, in <module> 582 it.next() 583 StopIteration 584 585Because ``yield`` will often be returning ``None``, you should always check for 586this case. Don't just use its value in expressions unless you're sure that the 587:meth:`~generator.send` method will be the only method used to resume your 588generator function. 589 590In addition to :meth:`~generator.send`, there are two other methods on 591generators: 592 593* :meth:`throw(type, value=None, traceback=None) <generator.throw>` is used to 594 raise an exception inside the generator; the exception is raised by the 595 ``yield`` expression where the generator's execution is paused. 596 597* :meth:`~generator.close` raises a :exc:`GeneratorExit` exception inside the 598 generator to terminate the iteration. On receiving this exception, the 599 generator's code must either raise :exc:`GeneratorExit` or 600 :exc:`StopIteration`; catching the exception and doing anything else is 601 illegal and will trigger a :exc:`RuntimeError`. :meth:`~generator.close` 602 will also be called by Python's garbage collector when the generator is 603 garbage-collected. 604 605 If you need to run cleanup code when a :exc:`GeneratorExit` occurs, I suggest 606 using a ``try: ... finally:`` suite instead of catching :exc:`GeneratorExit`. 607 608The cumulative effect of these changes is to turn generators from one-way 609producers of information into both producers and consumers. 610 611Generators also become **coroutines**, a more generalized form of subroutines. 612Subroutines are entered at one point and exited at another point (the top of the 613function, and a ``return`` statement), but coroutines can be entered, exited, 614and resumed at many different points (the ``yield`` statements). 615 616 617Built-in functions 618================== 619 620Let's look in more detail at built-in functions often used with iterators. 621 622Two of Python's built-in functions, :func:`map` and :func:`filter` duplicate the 623features of generator expressions: 624 625:func:`map(f, iterA, iterB, ...) <map>` returns an iterator over the sequence 626 ``f(iterA[0], iterB[0]), f(iterA[1], iterB[1]), f(iterA[2], iterB[2]), ...``. 627 628 >>> def upper(s): 629 ... return s.upper() 630 631 >>> list(map(upper, ['sentence', 'fragment'])) 632 ['SENTENCE', 'FRAGMENT'] 633 >>> [upper(s) for s in ['sentence', 'fragment']] 634 ['SENTENCE', 'FRAGMENT'] 635 636You can of course achieve the same effect with a list comprehension. 637 638:func:`filter(predicate, iter) <filter>` returns an iterator over all the 639sequence elements that meet a certain condition, and is similarly duplicated by 640list comprehensions. A **predicate** is a function that returns the truth 641value of some condition; for use with :func:`filter`, the predicate must take a 642single value. 643 644 >>> def is_even(x): 645 ... return (x % 2) == 0 646 647 >>> list(filter(is_even, range(10))) 648 [0, 2, 4, 6, 8] 649 650 651This can also be written as a list comprehension: 652 653 >>> list(x for x in range(10) if is_even(x)) 654 [0, 2, 4, 6, 8] 655 656 657:func:`enumerate(iter, start=0) <enumerate>` counts off the elements in the 658iterable returning 2-tuples containing the count (from *start*) and 659each element. :: 660 661 >>> for item in enumerate(['subject', 'verb', 'object']): 662 ... print(item) 663 (0, 'subject') 664 (1, 'verb') 665 (2, 'object') 666 667:func:`enumerate` is often used when looping through a list and recording the 668indexes at which certain conditions are met:: 669 670 f = open('data.txt', 'r') 671 for i, line in enumerate(f): 672 if line.strip() == '': 673 print('Blank line at line #%i' % i) 674 675:func:`sorted(iterable, key=None, reverse=False) <sorted>` collects all the 676elements of the iterable into a list, sorts the list, and returns the sorted 677result. The *key* and *reverse* arguments are passed through to the 678constructed list's :meth:`~list.sort` method. :: 679 680 >>> import random 681 >>> # Generate 8 random numbers between [0, 10000) 682 >>> rand_list = random.sample(range(10000), 8) 683 >>> rand_list #doctest: +SKIP 684 [769, 7953, 9828, 6431, 8442, 9878, 6213, 2207] 685 >>> sorted(rand_list) #doctest: +SKIP 686 [769, 2207, 6213, 6431, 7953, 8442, 9828, 9878] 687 >>> sorted(rand_list, reverse=True) #doctest: +SKIP 688 [9878, 9828, 8442, 7953, 6431, 6213, 2207, 769] 689 690(For a more detailed discussion of sorting, see the :ref:`sortinghowto`.) 691 692 693The :func:`any(iter) <any>` and :func:`all(iter) <all>` built-ins look at the 694truth values of an iterable's contents. :func:`any` returns ``True`` if any element 695in the iterable is a true value, and :func:`all` returns ``True`` if all of the 696elements are true values: 697 698 >>> any([0, 1, 0]) 699 True 700 >>> any([0, 0, 0]) 701 False 702 >>> any([1, 1, 1]) 703 True 704 >>> all([0, 1, 0]) 705 False 706 >>> all([0, 0, 0]) 707 False 708 >>> all([1, 1, 1]) 709 True 710 711 712:func:`zip(iterA, iterB, ...) <zip>` takes one element from each iterable and 713returns them in a tuple:: 714 715 zip(['a', 'b', 'c'], (1, 2, 3)) => 716 ('a', 1), ('b', 2), ('c', 3) 717 718It doesn't construct an in-memory list and exhaust all the input iterators 719before returning; instead tuples are constructed and returned only if they're 720requested. (The technical term for this behaviour is `lazy evaluation 721<https://en.wikipedia.org/wiki/Lazy_evaluation>`__.) 722 723This iterator is intended to be used with iterables that are all of the same 724length. If the iterables are of different lengths, the resulting stream will be 725the same length as the shortest iterable. :: 726 727 zip(['a', 'b'], (1, 2, 3)) => 728 ('a', 1), ('b', 2) 729 730You should avoid doing this, though, because an element may be taken from the 731longer iterators and discarded. This means you can't go on to use the iterators 732further because you risk skipping a discarded element. 733 734 735The itertools module 736==================== 737 738The :mod:`itertools` module contains a number of commonly-used iterators as well 739as functions for combining several iterators. This section will introduce the 740module's contents by showing small examples. 741 742The module's functions fall into a few broad classes: 743 744* Functions that create a new iterator based on an existing iterator. 745* Functions for treating an iterator's elements as function arguments. 746* Functions for selecting portions of an iterator's output. 747* A function for grouping an iterator's output. 748 749Creating new iterators 750---------------------- 751 752:func:`itertools.count(start, step) <itertools.count>` returns an infinite 753stream of evenly spaced values. You can optionally supply the starting number, 754which defaults to 0, and the interval between numbers, which defaults to 1:: 755 756 itertools.count() => 757 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ... 758 itertools.count(10) => 759 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, ... 760 itertools.count(10, 5) => 761 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, ... 762 763:func:`itertools.cycle(iter) <itertools.cycle>` saves a copy of the contents of 764a provided iterable and returns a new iterator that returns its elements from 765first to last. The new iterator will repeat these elements infinitely. :: 766 767 itertools.cycle([1, 2, 3, 4, 5]) => 768 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, ... 769 770:func:`itertools.repeat(elem, [n]) <itertools.repeat>` returns the provided 771element *n* times, or returns the element endlessly if *n* is not provided. :: 772 773 itertools.repeat('abc') => 774 abc, abc, abc, abc, abc, abc, abc, abc, abc, abc, ... 775 itertools.repeat('abc', 5) => 776 abc, abc, abc, abc, abc 777 778:func:`itertools.chain(iterA, iterB, ...) <itertools.chain>` takes an arbitrary 779number of iterables as input, and returns all the elements of the first 780iterator, then all the elements of the second, and so on, until all of the 781iterables have been exhausted. :: 782 783 itertools.chain(['a', 'b', 'c'], (1, 2, 3)) => 784 a, b, c, 1, 2, 3 785 786:func:`itertools.islice(iter, [start], stop, [step]) <itertools.islice>` returns 787a stream that's a slice of the iterator. With a single *stop* argument, it 788will return the first *stop* elements. If you supply a starting index, you'll 789get *stop-start* elements, and if you supply a value for *step*, elements 790will be skipped accordingly. Unlike Python's string and list slicing, you can't 791use negative values for *start*, *stop*, or *step*. :: 792 793 itertools.islice(range(10), 8) => 794 0, 1, 2, 3, 4, 5, 6, 7 795 itertools.islice(range(10), 2, 8) => 796 2, 3, 4, 5, 6, 7 797 itertools.islice(range(10), 2, 8, 2) => 798 2, 4, 6 799 800:func:`itertools.tee(iter, [n]) <itertools.tee>` replicates an iterator; it 801returns *n* independent iterators that will all return the contents of the 802source iterator. 803If you don't supply a value for *n*, the default is 2. Replicating iterators 804requires saving some of the contents of the source iterator, so this can consume 805significant memory if the iterator is large and one of the new iterators is 806consumed more than the others. :: 807 808 itertools.tee( itertools.count() ) => 809 iterA, iterB 810 811 where iterA -> 812 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ... 813 814 and iterB -> 815 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ... 816 817 818Calling functions on elements 819----------------------------- 820 821The :mod:`operator` module contains a set of functions corresponding to Python's 822operators. Some examples are :func:`operator.add(a, b) <operator.add>` (adds 823two values), :func:`operator.ne(a, b) <operator.ne>` (same as ``a != b``), and 824:func:`operator.attrgetter('id') <operator.attrgetter>` 825(returns a callable that fetches the ``.id`` attribute). 826 827:func:`itertools.starmap(func, iter) <itertools.starmap>` assumes that the 828iterable will return a stream of tuples, and calls *func* using these tuples as 829the arguments:: 830 831 itertools.starmap(os.path.join, 832 [('/bin', 'python'), ('/usr', 'bin', 'java'), 833 ('/usr', 'bin', 'perl'), ('/usr', 'bin', 'ruby')]) 834 => 835 /bin/python, /usr/bin/java, /usr/bin/perl, /usr/bin/ruby 836 837 838Selecting elements 839------------------ 840 841Another group of functions chooses a subset of an iterator's elements based on a 842predicate. 843 844:func:`itertools.filterfalse(predicate, iter) <itertools.filterfalse>` is the 845opposite of :func:`filter`, returning all elements for which the predicate 846returns false:: 847 848 itertools.filterfalse(is_even, itertools.count()) => 849 1, 3, 5, 7, 9, 11, 13, 15, ... 850 851:func:`itertools.takewhile(predicate, iter) <itertools.takewhile>` returns 852elements for as long as the predicate returns true. Once the predicate returns 853false, the iterator will signal the end of its results. :: 854 855 def less_than_10(x): 856 return x < 10 857 858 itertools.takewhile(less_than_10, itertools.count()) => 859 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 860 861 itertools.takewhile(is_even, itertools.count()) => 862 0 863 864:func:`itertools.dropwhile(predicate, iter) <itertools.dropwhile>` discards 865elements while the predicate returns true, and then returns the rest of the 866iterable's results. :: 867 868 itertools.dropwhile(less_than_10, itertools.count()) => 869 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, ... 870 871 itertools.dropwhile(is_even, itertools.count()) => 872 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, ... 873 874:func:`itertools.compress(data, selectors) <itertools.compress>` takes two 875iterators and returns only those elements of *data* for which the corresponding 876element of *selectors* is true, stopping whenever either one is exhausted:: 877 878 itertools.compress([1, 2, 3, 4, 5], [True, True, False, False, True]) => 879 1, 2, 5 880 881 882Combinatoric functions 883---------------------- 884 885The :func:`itertools.combinations(iterable, r) <itertools.combinations>` 886returns an iterator giving all possible *r*-tuple combinations of the 887elements contained in *iterable*. :: 888 889 itertools.combinations([1, 2, 3, 4, 5], 2) => 890 (1, 2), (1, 3), (1, 4), (1, 5), 891 (2, 3), (2, 4), (2, 5), 892 (3, 4), (3, 5), 893 (4, 5) 894 895 itertools.combinations([1, 2, 3, 4, 5], 3) => 896 (1, 2, 3), (1, 2, 4), (1, 2, 5), (1, 3, 4), (1, 3, 5), (1, 4, 5), 897 (2, 3, 4), (2, 3, 5), (2, 4, 5), 898 (3, 4, 5) 899 900The elements within each tuple remain in the same order as 901*iterable* returned them. For example, the number 1 is always before 9022, 3, 4, or 5 in the examples above. A similar function, 903:func:`itertools.permutations(iterable, r=None) <itertools.permutations>`, 904removes this constraint on the order, returning all possible 905arrangements of length *r*:: 906 907 itertools.permutations([1, 2, 3, 4, 5], 2) => 908 (1, 2), (1, 3), (1, 4), (1, 5), 909 (2, 1), (2, 3), (2, 4), (2, 5), 910 (3, 1), (3, 2), (3, 4), (3, 5), 911 (4, 1), (4, 2), (4, 3), (4, 5), 912 (5, 1), (5, 2), (5, 3), (5, 4) 913 914 itertools.permutations([1, 2, 3, 4, 5]) => 915 (1, 2, 3, 4, 5), (1, 2, 3, 5, 4), (1, 2, 4, 3, 5), 916 ... 917 (5, 4, 3, 2, 1) 918 919If you don't supply a value for *r* the length of the iterable is used, 920meaning that all the elements are permuted. 921 922Note that these functions produce all of the possible combinations by 923position and don't require that the contents of *iterable* are unique:: 924 925 itertools.permutations('aba', 3) => 926 ('a', 'b', 'a'), ('a', 'a', 'b'), ('b', 'a', 'a'), 927 ('b', 'a', 'a'), ('a', 'a', 'b'), ('a', 'b', 'a') 928 929The identical tuple ``('a', 'a', 'b')`` occurs twice, but the two 'a' 930strings came from different positions. 931 932The :func:`itertools.combinations_with_replacement(iterable, r) <itertools.combinations_with_replacement>` 933function relaxes a different constraint: elements can be repeated 934within a single tuple. Conceptually an element is selected for the 935first position of each tuple and then is replaced before the second 936element is selected. :: 937 938 itertools.combinations_with_replacement([1, 2, 3, 4, 5], 2) => 939 (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), 940 (2, 2), (2, 3), (2, 4), (2, 5), 941 (3, 3), (3, 4), (3, 5), 942 (4, 4), (4, 5), 943 (5, 5) 944 945 946Grouping elements 947----------------- 948 949The last function I'll discuss, :func:`itertools.groupby(iter, key_func=None) 950<itertools.groupby>`, is the most complicated. ``key_func(elem)`` is a function 951that can compute a key value for each element returned by the iterable. If you 952don't supply a key function, the key is simply each element itself. 953 954:func:`~itertools.groupby` collects all the consecutive elements from the 955underlying iterable that have the same key value, and returns a stream of 9562-tuples containing a key value and an iterator for the elements with that key. 957 958:: 959 960 city_list = [('Decatur', 'AL'), ('Huntsville', 'AL'), ('Selma', 'AL'), 961 ('Anchorage', 'AK'), ('Nome', 'AK'), 962 ('Flagstaff', 'AZ'), ('Phoenix', 'AZ'), ('Tucson', 'AZ'), 963 ... 964 ] 965 966 def get_state(city_state): 967 return city_state[1] 968 969 itertools.groupby(city_list, get_state) => 970 ('AL', iterator-1), 971 ('AK', iterator-2), 972 ('AZ', iterator-3), ... 973 974 where 975 iterator-1 => 976 ('Decatur', 'AL'), ('Huntsville', 'AL'), ('Selma', 'AL') 977 iterator-2 => 978 ('Anchorage', 'AK'), ('Nome', 'AK') 979 iterator-3 => 980 ('Flagstaff', 'AZ'), ('Phoenix', 'AZ'), ('Tucson', 'AZ') 981 982:func:`~itertools.groupby` assumes that the underlying iterable's contents will 983already be sorted based on the key. Note that the returned iterators also use 984the underlying iterable, so you have to consume the results of iterator-1 before 985requesting iterator-2 and its corresponding key. 986 987 988The functools module 989==================== 990 991The :mod:`functools` module in Python 2.5 contains some higher-order functions. 992A **higher-order function** takes one or more functions as input and returns a 993new function. The most useful tool in this module is the 994:func:`functools.partial` function. 995 996For programs written in a functional style, you'll sometimes want to construct 997variants of existing functions that have some of the parameters filled in. 998Consider a Python function ``f(a, b, c)``; you may wish to create a new function 999``g(b, c)`` that's equivalent to ``f(1, b, c)``; you're filling in a value for 1000one of ``f()``'s parameters. This is called "partial function application". 1001 1002The constructor for :func:`~functools.partial` takes the arguments 1003``(function, arg1, arg2, ..., kwarg1=value1, kwarg2=value2)``. The resulting 1004object is callable, so you can just call it to invoke ``function`` with the 1005filled-in arguments. 1006 1007Here's a small but realistic example:: 1008 1009 import functools 1010 1011 def log(message, subsystem): 1012 """Write the contents of 'message' to the specified subsystem.""" 1013 print('%s: %s' % (subsystem, message)) 1014 ... 1015 1016 server_log = functools.partial(log, subsystem='server') 1017 server_log('Unable to open socket') 1018 1019:func:`functools.reduce(func, iter, [initial_value]) <functools.reduce>` 1020cumulatively performs an operation on all the iterable's elements and, 1021therefore, can't be applied to infinite iterables. *func* must be a function 1022that takes two elements and returns a single value. :func:`functools.reduce` 1023takes the first two elements A and B returned by the iterator and calculates 1024``func(A, B)``. It then requests the third element, C, calculates 1025``func(func(A, B), C)``, combines this result with the fourth element returned, 1026and continues until the iterable is exhausted. If the iterable returns no 1027values at all, a :exc:`TypeError` exception is raised. If the initial value is 1028supplied, it's used as a starting point and ``func(initial_value, A)`` is the 1029first calculation. :: 1030 1031 >>> import operator, functools 1032 >>> functools.reduce(operator.concat, ['A', 'BB', 'C']) 1033 'ABBC' 1034 >>> functools.reduce(operator.concat, []) 1035 Traceback (most recent call last): 1036 ... 1037 TypeError: reduce() of empty sequence with no initial value 1038 >>> functools.reduce(operator.mul, [1, 2, 3], 1) 1039 6 1040 >>> functools.reduce(operator.mul, [], 1) 1041 1 1042 1043If you use :func:`operator.add` with :func:`functools.reduce`, you'll add up all the 1044elements of the iterable. This case is so common that there's a special 1045built-in called :func:`sum` to compute it: 1046 1047 >>> import functools, operator 1048 >>> functools.reduce(operator.add, [1, 2, 3, 4], 0) 1049 10 1050 >>> sum([1, 2, 3, 4]) 1051 10 1052 >>> sum([]) 1053 0 1054 1055For many uses of :func:`functools.reduce`, though, it can be clearer to just 1056write the obvious :keyword:`for` loop:: 1057 1058 import functools 1059 # Instead of: 1060 product = functools.reduce(operator.mul, [1, 2, 3], 1) 1061 1062 # You can write: 1063 product = 1 1064 for i in [1, 2, 3]: 1065 product *= i 1066 1067A related function is :func:`itertools.accumulate(iterable, func=operator.add) 1068<itertools.accumulate>`. It performs the same calculation, but instead of 1069returning only the final result, :func:`accumulate` returns an iterator that 1070also yields each partial result:: 1071 1072 itertools.accumulate([1, 2, 3, 4, 5]) => 1073 1, 3, 6, 10, 15 1074 1075 itertools.accumulate([1, 2, 3, 4, 5], operator.mul) => 1076 1, 2, 6, 24, 120 1077 1078 1079The operator module 1080------------------- 1081 1082The :mod:`operator` module was mentioned earlier. It contains a set of 1083functions corresponding to Python's operators. These functions are often useful 1084in functional-style code because they save you from writing trivial functions 1085that perform a single operation. 1086 1087Some of the functions in this module are: 1088 1089* Math operations: ``add()``, ``sub()``, ``mul()``, ``floordiv()``, ``abs()``, ... 1090* Logical operations: ``not_()``, ``truth()``. 1091* Bitwise operations: ``and_()``, ``or_()``, ``invert()``. 1092* Comparisons: ``eq()``, ``ne()``, ``lt()``, ``le()``, ``gt()``, and ``ge()``. 1093* Object identity: ``is_()``, ``is_not()``. 1094 1095Consult the operator module's documentation for a complete list. 1096 1097 1098Small functions and the lambda expression 1099========================================= 1100 1101When writing functional-style programs, you'll often need little functions that 1102act as predicates or that combine elements in some way. 1103 1104If there's a Python built-in or a module function that's suitable, you don't 1105need to define a new function at all:: 1106 1107 stripped_lines = [line.strip() for line in lines] 1108 existing_files = filter(os.path.exists, file_list) 1109 1110If the function you need doesn't exist, you need to write it. One way to write 1111small functions is to use the :keyword:`lambda` expression. ``lambda`` takes a 1112number of parameters and an expression combining these parameters, and creates 1113an anonymous function that returns the value of the expression:: 1114 1115 adder = lambda x, y: x+y 1116 1117 print_assign = lambda name, value: name + '=' + str(value) 1118 1119An alternative is to just use the ``def`` statement and define a function in the 1120usual way:: 1121 1122 def adder(x, y): 1123 return x + y 1124 1125 def print_assign(name, value): 1126 return name + '=' + str(value) 1127 1128Which alternative is preferable? That's a style question; my usual course is to 1129avoid using ``lambda``. 1130 1131One reason for my preference is that ``lambda`` is quite limited in the 1132functions it can define. The result has to be computable as a single 1133expression, which means you can't have multiway ``if... elif... else`` 1134comparisons or ``try... except`` statements. If you try to do too much in a 1135``lambda`` statement, you'll end up with an overly complicated expression that's 1136hard to read. Quick, what's the following code doing? :: 1137 1138 import functools 1139 total = functools.reduce(lambda a, b: (0, a[1] + b[1]), items)[1] 1140 1141You can figure it out, but it takes time to disentangle the expression to figure 1142out what's going on. Using a short nested ``def`` statements makes things a 1143little bit better:: 1144 1145 import functools 1146 def combine(a, b): 1147 return 0, a[1] + b[1] 1148 1149 total = functools.reduce(combine, items)[1] 1150 1151But it would be best of all if I had simply used a ``for`` loop:: 1152 1153 total = 0 1154 for a, b in items: 1155 total += b 1156 1157Or the :func:`sum` built-in and a generator expression:: 1158 1159 total = sum(b for a, b in items) 1160 1161Many uses of :func:`functools.reduce` are clearer when written as ``for`` loops. 1162 1163Fredrik Lundh once suggested the following set of rules for refactoring uses of 1164``lambda``: 1165 11661. Write a lambda function. 11672. Write a comment explaining what the heck that lambda does. 11683. Study the comment for a while, and think of a name that captures the essence 1169 of the comment. 11704. Convert the lambda to a def statement, using that name. 11715. Remove the comment. 1172 1173I really like these rules, but you're free to disagree 1174about whether this lambda-free style is better. 1175 1176 1177Revision History and Acknowledgements 1178===================================== 1179 1180The author would like to thank the following people for offering suggestions, 1181corrections and assistance with various drafts of this article: Ian Bicking, 1182Nick Coghlan, Nick Efford, Raymond Hettinger, Jim Jewett, Mike Krell, Leandro 1183Lameiro, Jussi Salmela, Collin Winter, Blake Winton. 1184 1185Version 0.1: posted June 30 2006. 1186 1187Version 0.11: posted July 1 2006. Typo fixes. 1188 1189Version 0.2: posted July 10 2006. Merged genexp and listcomp sections into one. 1190Typo fixes. 1191 1192Version 0.21: Added more references suggested on the tutor mailing list. 1193 1194Version 0.30: Adds a section on the ``functional`` module written by Collin 1195Winter; adds short section on the operator module; a few other edits. 1196 1197 1198References 1199========== 1200 1201General 1202------- 1203 1204**Structure and Interpretation of Computer Programs**, by Harold Abelson and 1205Gerald Jay Sussman with Julie Sussman. Full text at 1206https://mitpress.mit.edu/sicp/. In this classic textbook of computer science, 1207chapters 2 and 3 discuss the use of sequences and streams to organize the data 1208flow inside a program. The book uses Scheme for its examples, but many of the 1209design approaches described in these chapters are applicable to functional-style 1210Python code. 1211 1212http://www.defmacro.org/ramblings/fp.html: A general introduction to functional 1213programming that uses Java examples and has a lengthy historical introduction. 1214 1215https://en.wikipedia.org/wiki/Functional_programming: General Wikipedia entry 1216describing functional programming. 1217 1218https://en.wikipedia.org/wiki/Coroutine: Entry for coroutines. 1219 1220https://en.wikipedia.org/wiki/Currying: Entry for the concept of currying. 1221 1222Python-specific 1223--------------- 1224 1225http://gnosis.cx/TPiP/: The first chapter of David Mertz's book 1226:title-reference:`Text Processing in Python` discusses functional programming 1227for text processing, in the section titled "Utilizing Higher-Order Functions in 1228Text Processing". 1229 1230Mertz also wrote a 3-part series of articles on functional programming 1231for IBM's DeveloperWorks site; see 1232`part 1 <https://developer.ibm.com/articles/l-prog/>`__, 1233`part 2 <https://developer.ibm.com/tutorials/l-prog2/>`__, and 1234`part 3 <https://developer.ibm.com/tutorials/l-prog3/>`__, 1235 1236 1237Python documentation 1238-------------------- 1239 1240Documentation for the :mod:`itertools` module. 1241 1242Documentation for the :mod:`functools` module. 1243 1244Documentation for the :mod:`operator` module. 1245 1246:pep:`289`: "Generator Expressions" 1247 1248:pep:`342`: "Coroutines via Enhanced Generators" describes the new generator 1249features in Python 2.5. 1250 1251.. comment 1252 1253 Handy little function for printing part of an iterator -- used 1254 while writing this document. 1255 1256 import itertools 1257 def print_iter(it): 1258 slice = itertools.islice(it, 10) 1259 for elem in slice[:-1]: 1260 sys.stdout.write(str(elem)) 1261 sys.stdout.write(', ') 1262 print(elem[-1]) 1263