• Home
  • Line#
  • Scopes#
  • Navigate#
  • Raw
  • Download
1:mod:`!bisect` --- Array bisection algorithm
2============================================
3
4.. module:: bisect
5   :synopsis: Array bisection algorithms for binary searching.
6.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>
7.. sectionauthor:: Raymond Hettinger <python at rcn.com>
8.. example based on the PyModules FAQ entry by Aaron Watters <arw@pythonpros.com>
9
10**Source code:** :source:`Lib/bisect.py`
11
12--------------
13
14This module provides support for maintaining a list in sorted order without
15having to sort the list after each insertion.  For long lists of items with
16expensive comparison operations, this can be an improvement over
17linear searches or frequent resorting.
18
19The module is called :mod:`bisect` because it uses a basic bisection
20algorithm to do its work.  Unlike other bisection tools that search for a
21specific value, the functions in this module are designed to locate an
22insertion point. Accordingly, the functions never call an :meth:`~object.__eq__`
23method to determine whether a value has been found.  Instead, the
24functions only call the :meth:`~object.__lt__` method and will return an insertion
25point between values in an array.
26
27.. _bisect functions:
28
29The following functions are provided:
30
31
32.. function:: bisect_left(a, x, lo=0, hi=len(a), *, key=None)
33
34   Locate the insertion point for *x* in *a* to maintain sorted order.
35   The parameters *lo* and *hi* may be used to specify a subset of the list
36   which should be considered; by default the entire list is used.  If *x* is
37   already present in *a*, the insertion point will be before (to the left of)
38   any existing entries.  The return value is suitable for use as the first
39   parameter to ``list.insert()`` assuming that *a* is already sorted.
40
41   The returned insertion point *ip* partitions the array *a* into two
42   slices such that ``all(elem < x for elem in a[lo : ip])`` is true for the
43   left slice and ``all(elem >= x for elem in a[ip : hi])`` is true for the
44   right slice.
45
46   *key* specifies a :term:`key function` of one argument that is used to
47   extract a comparison key from each element in the array.  To support
48   searching complex records, the key function is not applied to the *x* value.
49
50   If *key* is ``None``, the elements are compared directly and
51   no key function is called.
52
53   .. versionchanged:: 3.10
54      Added the *key* parameter.
55
56
57.. function:: bisect_right(a, x, lo=0, hi=len(a), *, key=None)
58              bisect(a, x, lo=0, hi=len(a), *, key=None)
59
60   Similar to :py:func:`~bisect.bisect_left`, but returns an insertion point which comes
61   after (to the right of) any existing entries of *x* in *a*.
62
63   The returned insertion point *ip* partitions the array *a* into two slices
64   such that ``all(elem <= x for elem in a[lo : ip])`` is true for the left slice and
65   ``all(elem > x for elem in a[ip : hi])`` is true for the right slice.
66
67   .. versionchanged:: 3.10
68      Added the *key* parameter.
69
70
71.. function:: insort_left(a, x, lo=0, hi=len(a), *, key=None)
72
73   Insert *x* in *a* in sorted order.
74
75   This function first runs :py:func:`~bisect.bisect_left` to locate an insertion point.
76   Next, it runs the :meth:`!insert` method on *a* to insert *x* at the
77   appropriate position to maintain sort order.
78
79   To support inserting records in a table, the *key* function (if any) is
80   applied to *x* for the search step but not for the insertion step.
81
82   Keep in mind that the *O*\ (log *n*) search is dominated by the slow *O*\ (*n*)
83   insertion step.
84
85   .. versionchanged:: 3.10
86      Added the *key* parameter.
87
88
89.. function:: insort_right(a, x, lo=0, hi=len(a), *, key=None)
90              insort(a, x, lo=0, hi=len(a), *, key=None)
91
92   Similar to :py:func:`~bisect.insort_left`, but inserting *x* in *a* after any existing
93   entries of *x*.
94
95   This function first runs :py:func:`~bisect.bisect_right` to locate an insertion point.
96   Next, it runs the :meth:`!insert` method on *a* to insert *x* at the
97   appropriate position to maintain sort order.
98
99   To support inserting records in a table, the *key* function (if any) is
100   applied to *x* for the search step but not for the insertion step.
101
102   Keep in mind that the *O*\ (log *n*) search is dominated by the slow *O*\ (*n*)
103   insertion step.
104
105   .. versionchanged:: 3.10
106      Added the *key* parameter.
107
108
109Performance Notes
110-----------------
111
112When writing time sensitive code using *bisect()* and *insort()*, keep these
113thoughts in mind:
114
115* Bisection is effective for searching ranges of values.
116  For locating specific values, dictionaries are more performant.
117
118* The *insort()* functions are *O*\ (*n*) because the logarithmic search step
119  is dominated by the linear time insertion step.
120
121* The search functions are stateless and discard key function results after
122  they are used.  Consequently, if the search functions are used in a loop,
123  the key function may be called again and again on the same array elements.
124  If the key function isn't fast, consider wrapping it with
125  :py:func:`functools.cache` to avoid duplicate computations.  Alternatively,
126  consider searching an array of precomputed keys to locate the insertion
127  point (as shown in the examples section below).
128
129.. seealso::
130
131   * `Sorted Collections
132     <https://grantjenks.com/docs/sortedcollections/>`_ is a high performance
133     module that uses *bisect* to managed sorted collections of data.
134
135   * The `SortedCollection recipe
136     <https://code.activestate.com/recipes/577197-sortedcollection/>`_ uses
137     bisect to build a full-featured collection class with straight-forward search
138     methods and support for a key-function.  The keys are precomputed to save
139     unnecessary calls to the key function during searches.
140
141
142Searching Sorted Lists
143----------------------
144
145The above `bisect functions`_ are useful for finding insertion points but
146can be tricky or awkward to use for common searching tasks. The following five
147functions show how to transform them into the standard lookups for sorted
148lists::
149
150    def index(a, x):
151        'Locate the leftmost value exactly equal to x'
152        i = bisect_left(a, x)
153        if i != len(a) and a[i] == x:
154            return i
155        raise ValueError
156
157    def find_lt(a, x):
158        'Find rightmost value less than x'
159        i = bisect_left(a, x)
160        if i:
161            return a[i-1]
162        raise ValueError
163
164    def find_le(a, x):
165        'Find rightmost value less than or equal to x'
166        i = bisect_right(a, x)
167        if i:
168            return a[i-1]
169        raise ValueError
170
171    def find_gt(a, x):
172        'Find leftmost value greater than x'
173        i = bisect_right(a, x)
174        if i != len(a):
175            return a[i]
176        raise ValueError
177
178    def find_ge(a, x):
179        'Find leftmost item greater than or equal to x'
180        i = bisect_left(a, x)
181        if i != len(a):
182            return a[i]
183        raise ValueError
184
185
186Examples
187--------
188
189.. _bisect-example:
190
191The :py:func:`~bisect.bisect` function can be useful for numeric table lookups. This
192example uses :py:func:`~bisect.bisect` to look up a letter grade for an exam score (say)
193based on a set of ordered numeric breakpoints: 90 and up is an 'A', 80 to 89 is
194a 'B', and so on::
195
196   >>> def grade(score, breakpoints=[60, 70, 80, 90], grades='FDCBA'):
197   ...     i = bisect(breakpoints, score)
198   ...     return grades[i]
199   ...
200   >>> [grade(score) for score in [33, 99, 77, 70, 89, 90, 100]]
201   ['F', 'A', 'C', 'C', 'B', 'A', 'A']
202
203The :py:func:`~bisect.bisect` and :py:func:`~bisect.insort` functions also work with
204lists of tuples.  The *key* argument can serve to extract the field used for ordering
205records in a table::
206
207    >>> from collections import namedtuple
208    >>> from operator import attrgetter
209    >>> from bisect import bisect, insort
210    >>> from pprint import pprint
211
212    >>> Movie = namedtuple('Movie', ('name', 'released', 'director'))
213
214    >>> movies = [
215    ...     Movie('Jaws', 1975, 'Spielberg'),
216    ...     Movie('Titanic', 1997, 'Cameron'),
217    ...     Movie('The Birds', 1963, 'Hitchcock'),
218    ...     Movie('Aliens', 1986, 'Cameron')
219    ... ]
220
221    >>> # Find the first movie released after 1960
222    >>> by_year = attrgetter('released')
223    >>> movies.sort(key=by_year)
224    >>> movies[bisect(movies, 1960, key=by_year)]
225    Movie(name='The Birds', released=1963, director='Hitchcock')
226
227    >>> # Insert a movie while maintaining sort order
228    >>> romance = Movie('Love Story', 1970, 'Hiller')
229    >>> insort(movies, romance, key=by_year)
230    >>> pprint(movies)
231    [Movie(name='The Birds', released=1963, director='Hitchcock'),
232     Movie(name='Love Story', released=1970, director='Hiller'),
233     Movie(name='Jaws', released=1975, director='Spielberg'),
234     Movie(name='Aliens', released=1986, director='Cameron'),
235     Movie(name='Titanic', released=1997, director='Cameron')]
236
237If the key function is expensive, it is possible to avoid repeated function
238calls by searching a list of precomputed keys to find the index of a record::
239
240    >>> data = [('red', 5), ('blue', 1), ('yellow', 8), ('black', 0)]
241    >>> data.sort(key=lambda r: r[1])       # Or use operator.itemgetter(1).
242    >>> keys = [r[1] for r in data]         # Precompute a list of keys.
243    >>> data[bisect_left(keys, 0)]
244    ('black', 0)
245    >>> data[bisect_left(keys, 1)]
246    ('blue', 1)
247    >>> data[bisect_left(keys, 5)]
248    ('red', 5)
249    >>> data[bisect_left(keys, 8)]
250    ('yellow', 8)
251