1:mod:`random` --- Generate pseudo-random numbers 2================================================ 3 4.. module:: random 5 :synopsis: Generate pseudo-random numbers with various common distributions. 6 7**Source code:** :source:`Lib/random.py` 8 9-------------- 10 11This module implements pseudo-random number generators for various 12distributions. 13 14For integers, there is uniform selection from a range. For sequences, there is 15uniform selection of a random element, a function to generate a random 16permutation of a list in-place, and a function for random sampling without 17replacement. 18 19On the real line, there are functions to compute uniform, normal (Gaussian), 20lognormal, negative exponential, gamma, and beta distributions. For generating 21distributions of angles, the von Mises distribution is available. 22 23Almost all module functions depend on the basic function :func:`.random`, which 24generates a random float uniformly in the semi-open range [0.0, 1.0). Python 25uses the Mersenne Twister as the core generator. It produces 53-bit precision 26floats and has a period of 2\*\*19937-1. The underlying implementation in C is 27both fast and threadsafe. The Mersenne Twister is one of the most extensively 28tested random number generators in existence. However, being completely 29deterministic, it is not suitable for all purposes, and is completely unsuitable 30for cryptographic purposes. 31 32The functions supplied by this module are actually bound methods of a hidden 33instance of the :class:`random.Random` class. You can instantiate your own 34instances of :class:`Random` to get generators that don't share state. 35 36Class :class:`Random` can also be subclassed if you want to use a different 37basic generator of your own devising: in that case, override the :meth:`~Random.random`, 38:meth:`~Random.seed`, :meth:`~Random.getstate`, and :meth:`~Random.setstate` methods. 39Optionally, a new generator can supply a :meth:`~Random.getrandbits` method --- this 40allows :meth:`randrange` to produce selections over an arbitrarily large range. 41 42The :mod:`random` module also provides the :class:`SystemRandom` class which 43uses the system function :func:`os.urandom` to generate random numbers 44from sources provided by the operating system. 45 46.. warning:: 47 48 The pseudo-random generators of this module should not be used for 49 security purposes. For security or cryptographic uses, see the 50 :mod:`secrets` module. 51 52.. seealso:: 53 54 M. Matsumoto and T. Nishimura, "Mersenne Twister: A 623-dimensionally 55 equidistributed uniform pseudorandom number generator", ACM Transactions on 56 Modeling and Computer Simulation Vol. 8, No. 1, January pp.3--30 1998. 57 58 59 `Complementary-Multiply-with-Carry recipe 60 <https://code.activestate.com/recipes/576707/>`_ for a compatible alternative 61 random number generator with a long period and comparatively simple update 62 operations. 63 64 65Bookkeeping functions 66--------------------- 67 68.. function:: seed(a=None, version=2) 69 70 Initialize the random number generator. 71 72 If *a* is omitted or ``None``, the current system time is used. If 73 randomness sources are provided by the operating system, they are used 74 instead of the system time (see the :func:`os.urandom` function for details 75 on availability). 76 77 If *a* is an int, it is used directly. 78 79 With version 2 (the default), a :class:`str`, :class:`bytes`, or :class:`bytearray` 80 object gets converted to an :class:`int` and all of its bits are used. 81 82 With version 1 (provided for reproducing random sequences from older versions 83 of Python), the algorithm for :class:`str` and :class:`bytes` generates a 84 narrower range of seeds. 85 86 .. versionchanged:: 3.2 87 Moved to the version 2 scheme which uses all of the bits in a string seed. 88 89 .. deprecated:: 3.9 90 In the future, the *seed* must be one of the following types: 91 *NoneType*, :class:`int`, :class:`float`, :class:`str`, 92 :class:`bytes`, or :class:`bytearray`. 93 94.. function:: getstate() 95 96 Return an object capturing the current internal state of the generator. This 97 object can be passed to :func:`setstate` to restore the state. 98 99 100.. function:: setstate(state) 101 102 *state* should have been obtained from a previous call to :func:`getstate`, and 103 :func:`setstate` restores the internal state of the generator to what it was at 104 the time :func:`getstate` was called. 105 106 107Functions for bytes 108------------------- 109 110.. function:: randbytes(n) 111 112 Generate *n* random bytes. 113 114 This method should not be used for generating security tokens. 115 Use :func:`secrets.token_bytes` instead. 116 117 .. versionadded:: 3.9 118 119 120Functions for integers 121---------------------- 122 123.. function:: randrange(stop) 124 randrange(start, stop[, step]) 125 126 Return a randomly selected element from ``range(start, stop, step)``. This is 127 equivalent to ``choice(range(start, stop, step))``, but doesn't actually build a 128 range object. 129 130 The positional argument pattern matches that of :func:`range`. Keyword arguments 131 should not be used because the function may use them in unexpected ways. 132 133 .. versionchanged:: 3.2 134 :meth:`randrange` is more sophisticated about producing equally distributed 135 values. Formerly it used a style like ``int(random()*n)`` which could produce 136 slightly uneven distributions. 137 138 .. deprecated:: 3.10 139 The automatic conversion of non-integer types to equivalent integers is 140 deprecated. Currently ``randrange(10.0)`` is losslessly converted to 141 ``randrange(10)``. In the future, this will raise a :exc:`TypeError`. 142 143 .. deprecated:: 3.10 144 The exception raised for non-integral values such as ``randrange(10.5)`` 145 or ``randrange('10')`` will be changed from :exc:`ValueError` to 146 :exc:`TypeError`. 147 148.. function:: randint(a, b) 149 150 Return a random integer *N* such that ``a <= N <= b``. Alias for 151 ``randrange(a, b+1)``. 152 153.. function:: getrandbits(k) 154 155 Returns a non-negative Python integer with *k* random bits. This method 156 is supplied with the MersenneTwister generator and some other generators 157 may also provide it as an optional part of the API. When available, 158 :meth:`getrandbits` enables :meth:`randrange` to handle arbitrarily large 159 ranges. 160 161 .. versionchanged:: 3.9 162 This method now accepts zero for *k*. 163 164 165Functions for sequences 166----------------------- 167 168.. function:: choice(seq) 169 170 Return a random element from the non-empty sequence *seq*. If *seq* is empty, 171 raises :exc:`IndexError`. 172 173.. function:: choices(population, weights=None, *, cum_weights=None, k=1) 174 175 Return a *k* sized list of elements chosen from the *population* with replacement. 176 If the *population* is empty, raises :exc:`IndexError`. 177 178 If a *weights* sequence is specified, selections are made according to the 179 relative weights. Alternatively, if a *cum_weights* sequence is given, the 180 selections are made according to the cumulative weights (perhaps computed 181 using :func:`itertools.accumulate`). For example, the relative weights 182 ``[10, 5, 30, 5]`` are equivalent to the cumulative weights 183 ``[10, 15, 45, 50]``. Internally, the relative weights are converted to 184 cumulative weights before making selections, so supplying the cumulative 185 weights saves work. 186 187 If neither *weights* nor *cum_weights* are specified, selections are made 188 with equal probability. If a weights sequence is supplied, it must be 189 the same length as the *population* sequence. It is a :exc:`TypeError` 190 to specify both *weights* and *cum_weights*. 191 192 The *weights* or *cum_weights* can use any numeric type that interoperates 193 with the :class:`float` values returned by :func:`random` (that includes 194 integers, floats, and fractions but excludes decimals). Weights are assumed 195 to be non-negative and finite. A :exc:`ValueError` is raised if all 196 weights are zero. 197 198 For a given seed, the :func:`choices` function with equal weighting 199 typically produces a different sequence than repeated calls to 200 :func:`choice`. The algorithm used by :func:`choices` uses floating 201 point arithmetic for internal consistency and speed. The algorithm used 202 by :func:`choice` defaults to integer arithmetic with repeated selections 203 to avoid small biases from round-off error. 204 205 .. versionadded:: 3.6 206 207 .. versionchanged:: 3.9 208 Raises a :exc:`ValueError` if all weights are zero. 209 210 211.. function:: shuffle(x[, random]) 212 213 Shuffle the sequence *x* in place. 214 215 The optional argument *random* is a 0-argument function returning a random 216 float in [0.0, 1.0); by default, this is the function :func:`.random`. 217 218 To shuffle an immutable sequence and return a new shuffled list, use 219 ``sample(x, k=len(x))`` instead. 220 221 Note that even for small ``len(x)``, the total number of permutations of *x* 222 can quickly grow larger than the period of most random number generators. 223 This implies that most permutations of a long sequence can never be 224 generated. For example, a sequence of length 2080 is the largest that 225 can fit within the period of the Mersenne Twister random number generator. 226 227 .. deprecated-removed:: 3.9 3.11 228 The optional parameter *random*. 229 230 231.. function:: sample(population, k, *, counts=None) 232 233 Return a *k* length list of unique elements chosen from the population sequence 234 or set. Used for random sampling without replacement. 235 236 Returns a new list containing elements from the population while leaving the 237 original population unchanged. The resulting list is in selection order so that 238 all sub-slices will also be valid random samples. This allows raffle winners 239 (the sample) to be partitioned into grand prize and second place winners (the 240 subslices). 241 242 Members of the population need not be :term:`hashable` or unique. If the population 243 contains repeats, then each occurrence is a possible selection in the sample. 244 245 Repeated elements can be specified one at a time or with the optional 246 keyword-only *counts* parameter. For example, ``sample(['red', 'blue'], 247 counts=[4, 2], k=5)`` is equivalent to ``sample(['red', 'red', 'red', 'red', 248 'blue', 'blue'], k=5)``. 249 250 To choose a sample from a range of integers, use a :func:`range` object as an 251 argument. This is especially fast and space efficient for sampling from a large 252 population: ``sample(range(10000000), k=60)``. 253 254 If the sample size is larger than the population size, a :exc:`ValueError` 255 is raised. 256 257 .. versionchanged:: 3.9 258 Added the *counts* parameter. 259 260 .. deprecated:: 3.9 261 In the future, the *population* must be a sequence. Instances of 262 :class:`set` are no longer supported. The set must first be converted 263 to a :class:`list` or :class:`tuple`, preferably in a deterministic 264 order so that the sample is reproducible. 265 266 267.. _real-valued-distributions: 268 269Real-valued distributions 270------------------------- 271 272The following functions generate specific real-valued distributions. Function 273parameters are named after the corresponding variables in the distribution's 274equation, as used in common mathematical practice; most of these equations can 275be found in any statistics text. 276 277 278.. function:: random() 279 280 Return the next random floating point number in the range [0.0, 1.0). 281 282 283.. function:: uniform(a, b) 284 285 Return a random floating point number *N* such that ``a <= N <= b`` for 286 ``a <= b`` and ``b <= N <= a`` for ``b < a``. 287 288 The end-point value ``b`` may or may not be included in the range 289 depending on floating-point rounding in the equation ``a + (b-a) * random()``. 290 291 292.. function:: triangular(low, high, mode) 293 294 Return a random floating point number *N* such that ``low <= N <= high`` and 295 with the specified *mode* between those bounds. The *low* and *high* bounds 296 default to zero and one. The *mode* argument defaults to the midpoint 297 between the bounds, giving a symmetric distribution. 298 299 300.. function:: betavariate(alpha, beta) 301 302 Beta distribution. Conditions on the parameters are ``alpha > 0`` and 303 ``beta > 0``. Returned values range between 0 and 1. 304 305 306.. function:: expovariate(lambd) 307 308 Exponential distribution. *lambd* is 1.0 divided by the desired 309 mean. It should be nonzero. (The parameter would be called 310 "lambda", but that is a reserved word in Python.) Returned values 311 range from 0 to positive infinity if *lambd* is positive, and from 312 negative infinity to 0 if *lambd* is negative. 313 314 315.. function:: gammavariate(alpha, beta) 316 317 Gamma distribution. (*Not* the gamma function!) Conditions on the 318 parameters are ``alpha > 0`` and ``beta > 0``. 319 320 The probability distribution function is:: 321 322 x ** (alpha - 1) * math.exp(-x / beta) 323 pdf(x) = -------------------------------------- 324 math.gamma(alpha) * beta ** alpha 325 326 327.. function:: gauss(mu, sigma) 328 329 Normal distribution, also called the Gaussian distribution. *mu* is the mean, 330 and *sigma* is the standard deviation. This is slightly faster than 331 the :func:`normalvariate` function defined below. 332 333 Multithreading note: When two threads call this function 334 simultaneously, it is possible that they will receive the 335 same return value. This can be avoided in three ways. 336 1) Have each thread use a different instance of the random 337 number generator. 2) Put locks around all calls. 3) Use the 338 slower, but thread-safe :func:`normalvariate` function instead. 339 340 341.. function:: lognormvariate(mu, sigma) 342 343 Log normal distribution. If you take the natural logarithm of this 344 distribution, you'll get a normal distribution with mean *mu* and standard 345 deviation *sigma*. *mu* can have any value, and *sigma* must be greater than 346 zero. 347 348 349.. function:: normalvariate(mu, sigma) 350 351 Normal distribution. *mu* is the mean, and *sigma* is the standard deviation. 352 353 354.. function:: vonmisesvariate(mu, kappa) 355 356 *mu* is the mean angle, expressed in radians between 0 and 2\*\ *pi*, and *kappa* 357 is the concentration parameter, which must be greater than or equal to zero. If 358 *kappa* is equal to zero, this distribution reduces to a uniform random angle 359 over the range 0 to 2\*\ *pi*. 360 361 362.. function:: paretovariate(alpha) 363 364 Pareto distribution. *alpha* is the shape parameter. 365 366 367.. function:: weibullvariate(alpha, beta) 368 369 Weibull distribution. *alpha* is the scale parameter and *beta* is the shape 370 parameter. 371 372 373Alternative Generator 374--------------------- 375 376.. class:: Random([seed]) 377 378 Class that implements the default pseudo-random number generator used by the 379 :mod:`random` module. 380 381 .. deprecated:: 3.9 382 In the future, the *seed* must be one of the following types: 383 :class:`NoneType`, :class:`int`, :class:`float`, :class:`str`, 384 :class:`bytes`, or :class:`bytearray`. 385 386.. class:: SystemRandom([seed]) 387 388 Class that uses the :func:`os.urandom` function for generating random numbers 389 from sources provided by the operating system. Not available on all systems. 390 Does not rely on software state, and sequences are not reproducible. Accordingly, 391 the :meth:`seed` method has no effect and is ignored. 392 The :meth:`getstate` and :meth:`setstate` methods raise 393 :exc:`NotImplementedError` if called. 394 395 396Notes on Reproducibility 397------------------------ 398 399Sometimes it is useful to be able to reproduce the sequences given by a 400pseudo-random number generator. By re-using a seed value, the same sequence should be 401reproducible from run to run as long as multiple threads are not running. 402 403Most of the random module's algorithms and seeding functions are subject to 404change across Python versions, but two aspects are guaranteed not to change: 405 406* If a new seeding method is added, then a backward compatible seeder will be 407 offered. 408 409* The generator's :meth:`~Random.random` method will continue to produce the same 410 sequence when the compatible seeder is given the same seed. 411 412.. _random-examples: 413 414Examples 415-------- 416 417Basic examples:: 418 419 >>> random() # Random float: 0.0 <= x < 1.0 420 0.37444887175646646 421 422 >>> uniform(2.5, 10.0) # Random float: 2.5 <= x <= 10.0 423 3.1800146073117523 424 425 >>> expovariate(1 / 5) # Interval between arrivals averaging 5 seconds 426 5.148957571865031 427 428 >>> randrange(10) # Integer from 0 to 9 inclusive 429 7 430 431 >>> randrange(0, 101, 2) # Even integer from 0 to 100 inclusive 432 26 433 434 >>> choice(['win', 'lose', 'draw']) # Single random element from a sequence 435 'draw' 436 437 >>> deck = 'ace two three four'.split() 438 >>> shuffle(deck) # Shuffle a list 439 >>> deck 440 ['four', 'two', 'ace', 'three'] 441 442 >>> sample([10, 20, 30, 40, 50], k=4) # Four samples without replacement 443 [40, 10, 50, 30] 444 445Simulations:: 446 447 >>> # Six roulette wheel spins (weighted sampling with replacement) 448 >>> choices(['red', 'black', 'green'], [18, 18, 2], k=6) 449 ['red', 'green', 'black', 'black', 'red', 'black'] 450 451 >>> # Deal 20 cards without replacement from a deck 452 >>> # of 52 playing cards, and determine the proportion of cards 453 >>> # with a ten-value: ten, jack, queen, or king. 454 >>> dealt = sample(['tens', 'low cards'], counts=[16, 36], k=20) 455 >>> dealt.count('tens') / 20 456 0.15 457 458 >>> # Estimate the probability of getting 5 or more heads from 7 spins 459 >>> # of a biased coin that settles on heads 60% of the time. 460 >>> def trial(): 461 ... return choices('HT', cum_weights=(0.60, 1.00), k=7).count('H') >= 5 462 ... 463 >>> sum(trial() for i in range(10_000)) / 10_000 464 0.4169 465 466 >>> # Probability of the median of 5 samples being in middle two quartiles 467 >>> def trial(): 468 ... return 2_500 <= sorted(choices(range(10_000), k=5))[2] < 7_500 469 ... 470 >>> sum(trial() for i in range(10_000)) / 10_000 471 0.7958 472 473Example of `statistical bootstrapping 474<https://en.wikipedia.org/wiki/Bootstrapping_(statistics)>`_ using resampling 475with replacement to estimate a confidence interval for the mean of a sample:: 476 477 # http://statistics.about.com/od/Applications/a/Example-Of-Bootstrapping.htm 478 from statistics import fmean as mean 479 from random import choices 480 481 data = [41, 50, 29, 37, 81, 30, 73, 63, 20, 35, 68, 22, 60, 31, 95] 482 means = sorted(mean(choices(data, k=len(data))) for i in range(100)) 483 print(f'The sample mean of {mean(data):.1f} has a 90% confidence ' 484 f'interval from {means[5]:.1f} to {means[94]:.1f}') 485 486Example of a `resampling permutation test 487<https://en.wikipedia.org/wiki/Resampling_(statistics)#Permutation_tests>`_ 488to determine the statistical significance or `p-value 489<https://en.wikipedia.org/wiki/P-value>`_ of an observed difference 490between the effects of a drug versus a placebo:: 491 492 # Example from "Statistics is Easy" by Dennis Shasha and Manda Wilson 493 from statistics import fmean as mean 494 from random import shuffle 495 496 drug = [54, 73, 53, 70, 73, 68, 52, 65, 65] 497 placebo = [54, 51, 58, 44, 55, 52, 42, 47, 58, 46] 498 observed_diff = mean(drug) - mean(placebo) 499 500 n = 10_000 501 count = 0 502 combined = drug + placebo 503 for i in range(n): 504 shuffle(combined) 505 new_diff = mean(combined[:len(drug)]) - mean(combined[len(drug):]) 506 count += (new_diff >= observed_diff) 507 508 print(f'{n} label reshufflings produced only {count} instances with a difference') 509 print(f'at least as extreme as the observed difference of {observed_diff:.1f}.') 510 print(f'The one-sided p-value of {count / n:.4f} leads us to reject the null') 511 print(f'hypothesis that there is no difference between the drug and the placebo.') 512 513Simulation of arrival times and service deliveries for a multiserver queue:: 514 515 from heapq import heapify, heapreplace 516 from random import expovariate, gauss 517 from statistics import mean, quantiles 518 519 average_arrival_interval = 5.6 520 average_service_time = 15.0 521 stdev_service_time = 3.5 522 num_servers = 3 523 524 waits = [] 525 arrival_time = 0.0 526 servers = [0.0] * num_servers # time when each server becomes available 527 heapify(servers) 528 for i in range(1_000_000): 529 arrival_time += expovariate(1.0 / average_arrival_interval) 530 next_server_available = servers[0] 531 wait = max(0.0, next_server_available - arrival_time) 532 waits.append(wait) 533 service_duration = max(0.0, gauss(average_service_time, stdev_service_time)) 534 service_completed = arrival_time + wait + service_duration 535 heapreplace(servers, service_completed) 536 537 print(f'Mean wait: {mean(waits):.1f} Max wait: {max(waits):.1f}') 538 print('Quartiles:', [round(q, 1) for q in quantiles(waits)]) 539 540.. seealso:: 541 542 `Statistics for Hackers <https://www.youtube.com/watch?v=Iq9DzN6mvYA>`_ 543 a video tutorial by 544 `Jake Vanderplas <https://us.pycon.org/2016/speaker/profile/295/>`_ 545 on statistical analysis using just a few fundamental concepts 546 including simulation, sampling, shuffling, and cross-validation. 547 548 `Economics Simulation 549 <http://nbviewer.jupyter.org/url/norvig.com/ipython/Economics.ipynb>`_ 550 a simulation of a marketplace by 551 `Peter Norvig <http://norvig.com/bio.html>`_ that shows effective 552 use of many of the tools and distributions provided by this module 553 (gauss, uniform, sample, betavariate, choice, triangular, and randrange). 554 555 `A Concrete Introduction to Probability (using Python) 556 <http://nbviewer.jupyter.org/url/norvig.com/ipython/Probability.ipynb>`_ 557 a tutorial by `Peter Norvig <http://norvig.com/bio.html>`_ covering 558 the basics of probability theory, how to write simulations, and 559 how to perform data analysis using Python. 560 561 562Recipes 563------- 564 565The default :func:`.random` returns multiples of 2⁻⁵³ in the range 566*0.0 ≤ x < 1.0*. All such numbers are evenly spaced and are exactly 567representable as Python floats. However, many other representable 568floats in that interval are not possible selections. For example, 569``0.05954861408025609`` isn't an integer multiple of 2⁻⁵³. 570 571The following recipe takes a different approach. All floats in the 572interval are possible selections. The mantissa comes from a uniform 573distribution of integers in the range *2⁵² ≤ mantissa < 2⁵³*. The 574exponent comes from a geometric distribution where exponents smaller 575than *-53* occur half as often as the next larger exponent. 576 577:: 578 579 from random import Random 580 from math import ldexp 581 582 class FullRandom(Random): 583 584 def random(self): 585 mantissa = 0x10_0000_0000_0000 | self.getrandbits(52) 586 exponent = -53 587 x = 0 588 while not x: 589 x = self.getrandbits(32) 590 exponent += x.bit_length() - 32 591 return ldexp(mantissa, exponent) 592 593All :ref:`real valued distributions <real-valued-distributions>` 594in the class will use the new method:: 595 596 >>> fr = FullRandom() 597 >>> fr.random() 598 0.05954861408025609 599 >>> fr.expovariate(0.25) 600 8.87925541791544 601 602The recipe is conceptually equivalent to an algorithm that chooses from 603all the multiples of 2⁻¹⁰⁷⁴ in the range *0.0 ≤ x < 1.0*. All such 604numbers are evenly spaced, but most have to be rounded down to the 605nearest representable Python float. (The value 2⁻¹⁰⁷⁴ is the smallest 606positive unnormalized float and is equal to ``math.ulp(0.0)``.) 607 608 609.. seealso:: 610 611 `Generating Pseudo-random Floating-Point Values 612 <https://allendowney.com/research/rand/downey07randfloat.pdf>`_ a 613 paper by Allen B. Downey describing ways to generate more 614 fine-grained floats than normally generated by :func:`.random`. 615