1<html>
2<head>
3<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
4<title>Generic operations common to all distributions are non-member functions</title>
5<link rel="stylesheet" href="../../../math.css" type="text/css">
6<meta name="generator" content="DocBook XSL Stylesheets V1.79.1">
7<link rel="home" href="../../../index.html" title="Math Toolkit 2.12.0">
8<link rel="up" href="../overview.html" title="Overview of Statistical Distributions">
9<link rel="prev" href="objects.html" title="Distributions are Objects">
10<link rel="next" href="complements.html" title="Complements are supported too - and when to use them">
11</head>
12<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
13<table cellpadding="2" width="100%"><tr>
14<td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../../../../../boost.png"></td>
15<td align="center"><a href="../../../../../../../index.html">Home</a></td>
16<td align="center"><a href="../../../../../../../libs/libraries.htm">Libraries</a></td>
17<td align="center"><a href="http://www.boost.org/users/people.html">People</a></td>
18<td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td>
19<td align="center"><a href="../../../../../../../more/index.htm">More</a></td>
20</tr></table>
21<hr>
22<div class="spirit-nav">
23<a accesskey="p" href="objects.html"><img src="../../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../overview.html"><img src="../../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../../index.html"><img src="../../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="complements.html"><img src="../../../../../../../doc/src/images/next.png" alt="Next"></a>
24</div>
25<div class="section">
26<div class="titlepage"><div><div><h4 class="title">
27<a name="math_toolkit.stat_tut.overview.generic"></a><a class="link" href="generic.html" title="Generic operations common to all distributions are non-member functions">Generic operations
28        common to all distributions are non-member functions</a>
29</h4></div></div></div>
30<p>
31          Want to calculate the PDF (Probability Density Function) of a distribution?
32          No problem, just use:
33        </p>
34<pre class="programlisting"><span class="identifier">pdf</span><span class="special">(</span><span class="identifier">my_dist</span><span class="special">,</span> <span class="identifier">x</span><span class="special">);</span>  <span class="comment">// Returns PDF (density) at point x of distribution my_dist.</span>
35</pre>
36<p>
37          Or how about the CDF (Cumulative Distribution Function):
38        </p>
39<pre class="programlisting"><span class="identifier">cdf</span><span class="special">(</span><span class="identifier">my_dist</span><span class="special">,</span> <span class="identifier">x</span><span class="special">);</span>  <span class="comment">// Returns CDF (integral from -infinity to point x)</span>
40                  <span class="comment">// of distribution my_dist.</span>
41</pre>
42<p>
43          And quantiles are just the same:
44        </p>
45<pre class="programlisting"><span class="identifier">quantile</span><span class="special">(</span><span class="identifier">my_dist</span><span class="special">,</span> <span class="identifier">p</span><span class="special">);</span>  <span class="comment">// Returns the value of the random variable x</span>
46                       <span class="comment">// such that cdf(my_dist, x) == p.</span>
47</pre>
48<p>
49          If you're wondering why these aren't member functions, it's to make the
50          library more easily extensible: if you want to add additional generic operations
51          - let's say the <span class="emphasis"><em>n'th moment</em></span> - then all you have to
52          do is add the appropriate non-member functions, overloaded for each implemented
53          distribution type.
54        </p>
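<p>
          The extensibility argument can be sketched with a pair of toy types. Everything in the block below (<code class="computeroutput">toy_exponential</code>, <code class="computeroutput">toy_uniform</code>, <code class="computeroutput">raw_moment</code>) is hypothetical and invented for illustration; none of it is part of the library. It only shows that a new generic operation, here the n'th raw moment, can be added as overloaded non-member functions without editing the distribution types themselves.
        </p>

```cpp
#include <cassert>
#include <cmath>

// Hypothetical toy distribution types, invented for this sketch.
// They are NOT the library's distribution classes.
struct toy_exponential { double lambda; };        // rate, assumed > 0
struct toy_uniform     { double lower, upper; };  // assumed lower < upper

// An existing generic operation, overloaded for each distribution type.
inline double cdf(const toy_exponential& d, double x) {
    return x <= 0 ? 0.0 : 1.0 - std::exp(-d.lambda * x);
}
inline double cdf(const toy_uniform& d, double x) {
    if (x <= d.lower) return 0.0;
    if (x >= d.upper) return 1.0;
    return (x - d.lower) / (d.upper - d.lower);
}

// A generic operation added later: the n'th raw moment E[X^n].
// Neither struct above had to change to support it.
inline double raw_moment(const toy_exponential& d, unsigned n) {
    assert(d.lambda > 0);
    // n! / lambda^n
    return std::tgamma(n + 1.0) / std::pow(d.lambda, static_cast<double>(n));
}
inline double raw_moment(const toy_uniform& d, unsigned n) {
    assert(d.lower < d.upper);
    // (b^(n+1) - a^(n+1)) / ((n+1)(b-a))
    return (std::pow(d.upper, n + 1.0) - std::pow(d.lower, n + 1.0))
           / ((n + 1.0) * (d.upper - d.lower));
}
```

<p>
          With member functions, the same addition would have meant reopening every distribution class.
        </p>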
55<div class="tip"><table border="0" summary="Tip">
56<tr>
57<td rowspan="2" align="center" valign="top" width="25"><img alt="[Tip]" src="../../../../../../../doc/src/images/tip.png"></td>
58<th align="left">Tip</th>
59</tr>
60<tr><td align="left" valign="top">
61<p>
62            <span class="bold"><strong>Random numbers that approximate Quantiles of Distributions</strong></span>
63          </p>
64<p>
65            If you want random numbers that are distributed in a specific way, for
66            example uniform, normal or triangular, see <a href="http://www.boost.org/libs/random/" target="_top">Boost.Random</a>.
67          </p>
68<p>
69            Whilst in principle there's nothing to prevent you from using the quantile
70            function to convert a uniformly distributed random number to another
71            distribution, in practice there are much more efficient algorithms available
72            that are specific to random number generation.
73          </p>
74</td></tr>
75</table></div>
76<p>
77          For example, the binomial distribution has two parameters: n (the number
78          of trials) and p (the probability of success on any one trial).
79        </p>
80<p>
81          The <code class="computeroutput"><span class="identifier">binomial_distribution</span></code>
82          constructor therefore has two parameters:
83        </p>
84<p>
85          <code class="computeroutput"><span class="identifier">binomial_distribution</span><span class="special">(</span><span class="identifier">RealType</span> <span class="identifier">n</span><span class="special">,</span> <span class="identifier">RealType</span>
86          <span class="identifier">p</span><span class="special">);</span></code>
87        </p>
88<p>
89          For this distribution the <a href="http://en.wikipedia.org/wiki/Random_variate" target="_top">random
90          variate</a> is k: the number of successes observed. The probability
91          density/mass function (pdf) is therefore written as <span class="emphasis"><em>f(k; n, p)</em></span>.
92        </p>
93<div class="note"><table border="0" summary="Note">
94<tr>
95<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../../../../../doc/src/images/note.png"></td>
96<th align="left">Note</th>
97</tr>
98<tr><td align="left" valign="top">
99<p>
100            <span class="bold"><strong>Random Variates and Distribution Parameters</strong></span>
101          </p>
102<p>
103            The concept of a <a href="http://en.wikipedia.org/wiki/Random_variable" target="_top">random
104            variable</a> is closely linked to the term <a href="http://en.wikipedia.org/wiki/Random_variate" target="_top">random
105            variate</a>: a random variate is a particular value (outcome) of
106            a random variable. Random variates and <a href="http://en.wikipedia.org/wiki/Parameter" target="_top">distribution
107            parameters</a> are conventionally distinguished (for example in Wikipedia
108            and Wolfram MathWorld) by placing a semi-colon or vertical bar <span class="emphasis"><em>after</em></span>
109            the <a href="http://en.wikipedia.org/wiki/Random_variable" target="_top">random
110            variable</a> (whose value you 'choose'), to separate the variate
111            from the parameter(s) that define the shape of the distribution.
112          </p>
113<p>
114            For example, the binomial distribution probability density/mass function
115            (PDF) is written as <span class="serif_italic"><span class="emphasis"><em>f(k| n, p)</em></span>
116            = Pr(K = k|n, p) = </span> probability of observing k successes out
117            of n trials. K is the <a href="http://en.wikipedia.org/wiki/Random_variable" target="_top">random
118            variable</a>, k is the <a href="http://en.wikipedia.org/wiki/Random_variate" target="_top">random
119            variate</a>, the parameters are n (trials) and p (probability).
120          </p>
121</td></tr>
122</table></div>
123<div class="note"><table border="0" summary="Note">
124<tr>
125<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../../../../../doc/src/images/note.png"></td>
126<th align="left">Note</th>
127</tr>
128<tr><td align="left" valign="top"><p>
129            By convention, <a href="http://en.wikipedia.org/wiki/Random_variate" target="_top">random
130            variates</a> are lower case, usually k if integral, x if real, and
131            <a href="http://en.wikipedia.org/wiki/Random_variable" target="_top">random variables</a>
132            are upper case, K if integral, X if real. But this implementation treats
133            all as floating point values <code class="computeroutput"><span class="identifier">RealType</span></code>,
134            so if you really want an integral result, you must round: see note on
135            Discrete Probability Distributions below for details.
136          </p></td></tr>
137</table></div>
138<p>
139          As noted above, the non-member function <code class="computeroutput"><span class="identifier">pdf</span></code>
140          has one parameter for the distribution object, and a second for the random
141          variate. So taking our binomial distribution example, we would write:
142        </p>
143<p>
144          <code class="computeroutput"><span class="identifier">pdf</span><span class="special">(</span><span class="identifier">binomial_distribution</span><span class="special">&lt;</span><span class="identifier">RealType</span><span class="special">&gt;(</span><span class="identifier">n</span><span class="special">,</span> <span class="identifier">p</span><span class="special">),</span> <span class="identifier">k</span><span class="special">);</span></code>
145        </p>
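<p>
          To make the calling pattern concrete, here is a minimal stand-in written with the standard library only. <code class="computeroutput">toy_binomial_distribution</code> and its accessors are hypothetical names chosen for this sketch; it is not the library's implementation and skips the edge cases p == 0 and p == 1. It evaluates f(k; n, p) = C(n, k) p^k (1-p)^(n-k) through <code class="computeroutput">std::lgamma</code>, which also shows why real-valued n and k are easy to accept.
        </p>

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for illustration only; it mimics the calling
// style of a distribution object but is NOT the library's class.
template <class RealType = double>
class toy_binomial_distribution {
public:
    toy_binomial_distribution(RealType n, RealType p) : n_(n), p_(p) {
        // This sketch assumes 0 < p < 1 to avoid log(0).
        assert(n >= 0 && p > 0 && p < 1);
    }
    RealType trials() const { return n_; }            // parameter n
    RealType success_fraction() const { return p_; }  // parameter p
private:
    RealType n_;
    RealType p_;
};

// Non-member pdf: first argument is the distribution object (carrying the
// parameters n and p), second argument is the random variate k.
template <class RealType>
RealType pdf(const toy_binomial_distribution<RealType>& dist, RealType k) {
    const RealType n = dist.trials();
    const RealType p = dist.success_fraction();
    assert(k >= 0 && k <= n);
    // log C(n, k) via log-gamma, so k and n need not be integers.
    const RealType log_choose =
        std::lgamma(n + 1) - std::lgamma(k + 1) - std::lgamma(n - k + 1);
    return std::exp(log_choose + k * std::log(p) + (n - k) * std::log(1 - p));
}
```

<p>
          For instance, <code class="computeroutput">pdf(toy_binomial_distribution&lt;double&gt;(4, 0.5), 2.0)</code> evaluates C(4, 2) * 0.5^4 = 0.375, up to floating-point rounding.
        </p>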
146<p>
147          The ranges of <a href="http://en.wikipedia.org/wiki/Random_variate" target="_top">random
148          variate</a> values that are permitted and are supported can be tested
149          by using two functions <code class="computeroutput"><span class="identifier">range</span></code>
150          and <code class="computeroutput"><span class="identifier">support</span></code>.
151        </p>
152<p>
153          The distribution (effectively the <a href="http://en.wikipedia.org/wiki/Random_variate" target="_top">random
154          variate</a>) is said to be 'supported' over a range that is <a href="http://en.wikipedia.org/wiki/Probability_distribution" target="_top">"the smallest
155          closed set whose complement has probability zero"</a>. MathWorld
156          uses the word 'defined' for this range. Non-mathematicians might say it
157          means the 'interesting' smallest range of random variate x that has the
158          cdf going from zero to unity. Outside are uninteresting zones where the
159          pdf is zero, and the cdf zero or unity.
160        </p>
161<p>
162          For most distributions, with probability distribution functions one might
163          describe as 'well-behaved', we have decided that it is most useful for
164          the supported range to <span class="bold"><strong>exclude</strong></span> random
165          variate values like exact zero <span class="bold"><strong>if the end point is
166          discontinuous</strong></span>. For example, the pdf of the Weibull (scale 1, shape 1)
167          distribution smoothly heads for unity as the random variate x declines towards zero.
168          But at x = zero, the value of the pdf is suddenly exactly zero, by definition.
169          If you are plotting the PDF, or otherwise calculating, zero is not the
170          most useful value for the lower limit of the supported range, as we discovered. So
171          for this, and similar distributions, we have decided it is most numerically
172          useful to use the closest value to zero, min_value, for the limit of the
173          supported range. (The <code class="computeroutput"><span class="identifier">range</span></code>
174          remains from zero, so you will still get <code class="computeroutput"><span class="identifier">pdf</span><span class="special">(</span><span class="identifier">weibull</span><span class="special">,</span> <span class="number">0</span><span class="special">)</span>
175          <span class="special">==</span> <span class="number">0</span></code>).
176          (Exponential and gamma distributions have similarly discontinuous functions).
177        </p>
178<p>
179          Mathematically, the functions may make sense with an (+ or -) infinite
180          value, but except for a few special cases (in the Normal and Cauchy distributions)
181          this implementation limits random variates to finite values from the <code class="computeroutput"><span class="identifier">max</span></code> to <code class="computeroutput"><span class="identifier">min</span></code>
182          for the <code class="computeroutput"><span class="identifier">RealType</span></code>. (See
183          <a class="link" href="../../sf_implementation.html#math_toolkit.sf_implementation.handling_of_floating_point_infin">Handling
184          of Floating-Point Infinity</a> for rationale).
185        </p>
186<div class="note"><table border="0" summary="Note">
187<tr>
188<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../../../../../doc/src/images/note.png"></td>
189<th align="left">Note</th>
190</tr>
191<tr><td align="left" valign="top">
192<p>
193            <span class="bold"><strong>Discrete Probability Distributions</strong></span>
194          </p>
195<p>
196            Note that the <a href="http://en.wikipedia.org/wiki/Discrete_probability_distribution" target="_top">discrete
197            distributions</a>, including the binomial, negative binomial, Poisson
198            &amp; Bernoulli, are all mathematically defined as discrete functions:
199            that is to say the functions <code class="computeroutput"><span class="identifier">cdf</span></code>
200            and <code class="computeroutput"><span class="identifier">pdf</span></code> are only defined
201            for integral values of the random variate.
202          </p>
203<p>
204            However, because the method of calculation often uses continuous functions
205            it is convenient to treat them as if they were continuous functions,
206            and permit non-integral values of their parameters.
207          </p>
208<p>
209            Users wanting to enforce a strict mathematical model may use <code class="computeroutput"><span class="identifier">floor</span></code> or <code class="computeroutput"><span class="identifier">ceil</span></code>
210            functions on the random variate prior to calling the distribution function.
211          </p>
212<p>
213            The quantile functions for these distributions are hard to specify in
214            a manner that will satisfy everyone all of the time. The default behaviour
215            is to return an integer result, that has been rounded <span class="emphasis"><em>outwards</em></span>:
216            that is to say, lower quantiles - where the probability is less than
217            0.5 - are rounded down, while upper quantiles - where the probability is
218            greater than 0.5 - are rounded up. This behaviour ensures that if an
219            X% quantile is requested, then <span class="emphasis"><em>at least</em></span> the requested
220            coverage will be present in the central region, and <span class="emphasis"><em>no more
221            than</em></span> the requested coverage will be present in the tails.
222          </p>
223<p>
224            This behaviour can be changed so that the quantile functions are rounded
225            differently, or return a real-valued result using <a class="link" href="../../pol_overview.html" title="Policy Overview">Policies</a>.
226            It is strongly recommended that you read the tutorial <a class="link" href="../../pol_tutorial/understand_dis_quant.html" title="Understanding Quantiles of Discrete Distributions">Understanding
227            Quantiles of Discrete Distributions</a> before using the quantile
228            function on a discrete distribution. The <a class="link" href="../../pol_ref/discrete_quant_ref.html" title="Discrete Quantile Policies">reference
229            docs</a> describe how to change the rounding policy for these distributions.
230          </p>
231<p>
232            For similar reasons, parameters of continuous distributions such as "degrees
233            of freedom" that might appear to be integral are treated as real
234            values (and are promoted from integer to floating-point if necessary).
235            In this case however, there are a small number of situations where non-integral
236            degrees of freedom do have a genuine meaning.
237          </p>
238</td></tr>
239</table></div>
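<p>
          The outward rounding described in the note on discrete distributions can be modelled with a simplified sketch. The names <code class="computeroutput">toy_binomial</code>, <code class="computeroutput">cdf_table</code> and <code class="computeroutput">quantile_outwards</code> are hypothetical, and the rule used here is an assumption made for illustration: a lower quantile is taken as the largest k with cdf(k) &lt;= q, an upper quantile as the smallest k with cdf(k) &gt;= q. Treating q == 0.5 as an upper quantile is an arbitrary choice of this sketch. The library's own algorithm is different and more general.
        </p>

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical toy type for this sketch; NOT the library's binomial class.
struct toy_binomial {
    std::size_t n;  // number of trials
    double p;       // success probability, assumed 0 < p < 1
};

// cdf at every integer k = 0..n, accumulated from the pmf recurrence
// pmf(k+1) = pmf(k) * (n-k)/(k+1) * p/(1-p), starting at pmf(0) = (1-p)^n.
inline std::vector<double> cdf_table(const toy_binomial& d) {
    assert(d.p > 0.0 && d.p < 1.0);
    std::vector<double> cdf(d.n + 1);
    double pmf = std::pow(1.0 - d.p, static_cast<double>(d.n));
    double running = 0.0;
    for (std::size_t k = 0; k <= d.n; ++k) {
        running += pmf;
        cdf[k] = running;
        pmf *= static_cast<double>(d.n - k) / static_cast<double>(k + 1)
               * d.p / (1.0 - d.p);
    }
    return cdf;
}

// Outward rounding as modelled in this sketch:
//   q <  0.5 : largest k with cdf(k) <= q (round down; 0 if there is none)
//   q >= 0.5 : smallest k with cdf(k) >= q (round up; capped at n)
inline std::size_t quantile_outwards(const toy_binomial& d, double q) {
    assert(q >= 0.0 && q <= 1.0);
    const std::vector<double> cdf = cdf_table(d);
    std::size_t k = 0;
    if (q < 0.5) {
        while (k < d.n && cdf[k + 1] <= q) ++k;
    } else {
        while (k < d.n && cdf[k] < q) ++k;
    }
    return k;
}
```

<p>
          For n = 10 and p = 0.5 this sketch gives 1 for the 5% quantile and 8 for the 95% quantile, so the central region from 1 to 8 holds at least the requested 90% of the probability.
        </p>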
240</div>
241<table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
242<td align="left"></td>
243<td align="right"><div class="copyright-footer">Copyright © 2006-2019 Nikhar
244      Agrawal, Anton Bikineev, Paul A. Bristow, Marco Guazzone, Christopher Kormanyos,
245      Hubert Holin, Bruno Lalande, John Maddock, Jeremy Murphy, Matthew Pulver, Johan
246      Råde, Gautam Sewani, Benjamin Sobotta, Nicholas Thompson, Thijs van den Berg,
247      Daryle Walker and Xiaogang Zhang<p>
248        Distributed under the Boost Software License, Version 1.0. (See accompanying
249        file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
250      </p>
251</div></td>
252</tr></table>
253<hr>
254<div class="spirit-nav">
255<a accesskey="p" href="objects.html"><img src="../../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../overview.html"><img src="../../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../../index.html"><img src="../../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="complements.html"><img src="../../../../../../../doc/src/images/next.png" alt="Next"></a>
256</div>
257</body>
258</html>
259