• Home
  • Line#
  • Scopes#
  • Navigate#
  • Raw
  • Download
1[section:binom_eg Binomial Distribution Examples]
2
3See also the reference documentation for the __binomial_distrib.
4
5[section:binomial_coinflip_example Binomial Coin-Flipping Example]
6
7[import ../../example/binomial_coinflip_example.cpp]
8[binomial_coinflip_example1]
9
10See [@../../example/binomial_coinflip_example.cpp binomial_coinflip_example.cpp]
11for full source code, the program output looks like this:
12
13[binomial_coinflip_example_output]
14
15[endsect] [/section:binomial_coinflip_example Binomial coinflip example]
16
17[section:binomial_quiz_example Binomial Quiz Example]
18
19[import ../../example/binomial_quiz_example.cpp]
20[binomial_quiz_example1]
21[binomial_quiz_example2]
22[discrete_quantile_real]
23
24See [@../../example/binomial_quiz_example.cpp binomial_quiz_example.cpp]
25for full source code and output.
26
27[endsect] [/section:binomial_coinflip_quiz Binomial Coin-Flipping example]
28
29[section:binom_conf Calculating Confidence Limits on the Frequency of Occurrence for a Binomial Distribution]
30
31Imagine you have a process that follows a binomial distribution: for each
32trial conducted, an event either occurs or does it does not, referred
33to as "successes" and "failures".  If, by experiment, you want to measure the
34frequency with which successes occur, the best estimate is given simply
35by /k/ \/ /N/, for /k/ successes out of /N/ trials.  However our confidence in that
36estimate will be shaped by how many trials were conducted, and how many successes
37were observed.  The static member functions
38`binomial_distribution<>::find_lower_bound_on_p` and
39`binomial_distribution<>::find_upper_bound_on_p` allow you to calculate
40the confidence intervals for your estimate of the occurrence frequency.
41
42The sample program [@../../example/binomial_confidence_limits.cpp
43binomial_confidence_limits.cpp] illustrates their use.  It begins by defining
44a procedure that will print a table of confidence limits for various degrees
45of certainty:
46
47   #include <iostream>
48   #include <iomanip>
49   #include <boost/math/distributions/binomial.hpp>
50
51   void confidence_limits_on_frequency(unsigned trials, unsigned successes)
52   {
53      //
54      // trials = Total number of trials.
55      // successes = Total number of observed successes.
56      //
57      // Calculate confidence limits for an observed
58      // frequency of occurrence that follows a binomial
59      // distribution.
60      //
61      using namespace std;
62      using namespace boost::math;
63
64      // Print out general info:
65      cout <<
66         "___________________________________________\n"
67         "2-Sided Confidence Limits For Success Ratio\n"
68         "___________________________________________\n\n";
69      cout << setprecision(7);
70      cout << setw(40) << left << "Number of Observations" << "=  " << trials << "\n";
71      cout << setw(40) << left << "Number of successes" << "=  " << successes << "\n";
72      cout << setw(40) << left << "Sample frequency of occurrence" << "=  " << double(successes) / trials << "\n";
73
74The procedure now defines a table of significance levels: these are the
75probabilities that the true occurrence frequency lies outside the calculated
76interval:
77
78      double alpha[] = { 0.5, 0.25, 0.1, 0.05, 0.01, 0.001, 0.0001, 0.00001 };
79
80Some pretty printing of the table header follows:
81
82   cout << "\n\n"
83           "_______________________________________________________________________\n"
84           "Confidence        Lower CP       Upper CP       Lower JP       Upper JP\n"
85           " Value (%)        Limit          Limit          Limit          Limit\n"
86           "_______________________________________________________________________\n";
87
88
89And now for the important part - the intervals themselves - for each
90value of /alpha/, we call `find_lower_bound_on_p` and
91`find_lower_upper_on_p` to obtain lower and upper bounds
92respectively.  Note that since we are calculating a two-sided interval,
93we must divide the value of alpha in two.
94
95Please note that calculating two separate /single sided bounds/, each with risk
96level [alpha] is not the same thing as calculating a two sided interval.
97Had we calculate two single-sided intervals each with a risk
98that the true value is outside the interval of [alpha], then:
99
100* The risk that it is less than the lower bound is [alpha].
101
102and
103
104* The risk that it is greater than the upper bound is also [alpha].
105
106So the risk it is outside *upper or lower bound*, is *twice* alpha, and the
107probability that it is inside the bounds is therefore not nearly as high as
108one might have thought.  This is why [alpha]/2 must be used in
109the calculations below.
110
111In contrast, had we been calculating a
112single-sided interval, for example: ['"Calculate a lower bound so that we are P%
113sure that the true occurrence frequency is greater than some value"]
114then we would *not* have divided by two.
115
116Finally note that `binomial_distribution` provides a choice of two
117methods for the calculation, we print out the results from both
118methods in this example:
119
120      for(unsigned i = 0; i < sizeof(alpha)/sizeof(alpha[0]); ++i)
121      {
122         // Confidence value:
123         cout << fixed << setprecision(3) << setw(10) << right << 100 * (1-alpha[i]);
124         // Calculate Clopper Pearson bounds:
125         double l = binomial_distribution<>::find_lower_bound_on_p(
126                        trials, successes, alpha[i]/2);
127         double u = binomial_distribution<>::find_upper_bound_on_p(
128                        trials, successes, alpha[i]/2);
129         // Print Clopper Pearson Limits:
130         cout << fixed << setprecision(5) << setw(15) << right << l;
131         cout << fixed << setprecision(5) << setw(15) << right << u;
132         // Calculate Jeffreys Prior Bounds:
133         l = binomial_distribution<>::find_lower_bound_on_p(
134               trials, successes, alpha[i]/2,
135               binomial_distribution<>::jeffreys_prior_interval);
136         u = binomial_distribution<>::find_upper_bound_on_p(
137               trials, successes, alpha[i]/2,
138               binomial_distribution<>::jeffreys_prior_interval);
139         // Print Jeffreys Prior Limits:
140         cout << fixed << setprecision(5) << setw(15) << right << l;
141         cout << fixed << setprecision(5) << setw(15) << right << u << std::endl;
142      }
143      cout << endl;
144   }
145
146And that's all there is to it.  Let's see some sample output for a 2 in 10
147success ratio, first for 20 trials:
148
149[pre'''___________________________________________
1502-Sided Confidence Limits For Success Ratio
151___________________________________________
152
153Number of Observations                  =  20
154Number of successes                     =  4
155Sample frequency of occurrence          =  0.2
156
157
158_______________________________________________________________________
159Confidence        Lower CP       Upper CP       Lower JP       Upper JP
160 Value (%)        Limit          Limit          Limit          Limit
161_______________________________________________________________________
162    50.000        0.12840        0.29588        0.14974        0.26916
163    75.000        0.09775        0.34633        0.11653        0.31861
164    90.000        0.07135        0.40103        0.08734        0.37274
165    95.000        0.05733        0.43661        0.07152        0.40823
166    99.000        0.03576        0.50661        0.04655        0.47859
167    99.900        0.01905        0.58632        0.02634        0.55960
168    99.990        0.01042        0.64997        0.01530        0.62495
169    99.999        0.00577        0.70216        0.00901        0.67897
170''']
171
172As you can see, even at the 95% confidence level the bounds are
173really quite wide (this example is chosen to be easily compared to the one
174in the __handbook
175[@http://www.itl.nist.gov/div898/handbook/prc/section2/prc241.htm
176here]).  Note also that the Clopper-Pearson calculation method (CP above)
177produces quite noticeably more pessimistic estimates than the Jeffreys Prior
178method (JP above).
179
180
181Compare that with the program output for
1822000 trials:
183
184[pre'''___________________________________________
1852-Sided Confidence Limits For Success Ratio
186___________________________________________
187
188Number of Observations                  =  2000
189Number of successes                     =  400
190Sample frequency of occurrence          =  0.2000000
191
192
193_______________________________________________________________________
194Confidence        Lower CP       Upper CP       Lower JP       Upper JP
195 Value (%)        Limit          Limit          Limit          Limit
196_______________________________________________________________________
197    50.000        0.19382        0.20638        0.19406        0.20613
198    75.000        0.18965        0.21072        0.18990        0.21047
199    90.000        0.18537        0.21528        0.18561        0.21503
200    95.000        0.18267        0.21821        0.18291        0.21796
201    99.000        0.17745        0.22400        0.17769        0.22374
202    99.900        0.17150        0.23079        0.17173        0.23053
203    99.990        0.16658        0.23657        0.16681        0.23631
204    99.999        0.16233        0.24169        0.16256        0.24143
205''']
206
207Now even when the confidence level is very high, the limits are really
208quite close to the experimentally calculated value of 0.2.  Furthermore
209the difference between the two calculation methods is now really quite small.
210
211[endsect] [/section:binom_conf Calculating Confidence Limits on the Frequency of Occurrence for a Binomial Distribution]
212
213[section:binom_size_eg Estimating Sample Sizes for a Binomial Distribution.]
214
215Imagine you have a critical component that you know will fail in 1 in
216N "uses" (for some suitable definition of "use").  You may want to schedule
217routine replacement of the component so that its chance of failure between
218routine replacements is less than P%.  If the failures follow a binomial
219distribution (each time the component is "used" it either fails or does not)
220then the static member function `binomial_distibution<>::find_maximum_number_of_trials`
221can be used to estimate the maximum number of "uses" of that component for some
222acceptable risk level /alpha/.
223
224The example program
225[@../../example/binomial_sample_sizes.cpp binomial_sample_sizes.cpp]
226demonstrates its usage.  It centres on a routine that prints out
227a table of maximum sample sizes for various probability thresholds:
228
229   void find_max_sample_size(
230      double p,              // success ratio.
231      unsigned successes)    // Total number of observed successes permitted.
232   {
233
234The routine then declares a table of probability thresholds: these are the
235maximum acceptable probability that /successes/ or fewer events will be
236observed.  In our example, /successes/ will be always zero, since we want
237no component failures, but in other situations non-zero values may well
238make sense.
239
240   double alpha[] = { 0.5, 0.25, 0.1, 0.05, 0.01, 0.001, 0.0001, 0.00001 };
241
242Much of the rest of the program is pretty-printing, the important part
243is in the calculation of maximum number of permitted trials for each
244value of alpha:
245
246   for(unsigned i = 0; i < sizeof(alpha)/sizeof(alpha[0]); ++i)
247   {
248      // Confidence value:
249      cout << fixed << setprecision(3) << setw(10) << right << 100 * (1-alpha[i]);
250      // calculate trials:
251      double t = binomial::find_maximum_number_of_trials(
252                     successes, p, alpha[i]);
253      t = floor(t);
254      // Print Trials:
255      cout << fixed << setprecision(5) << setw(15) << right << t << endl;
256   }
257
258Note that since we're
259calculating the maximum number of trials permitted, we'll err on the safe
260side and take the floor of the result.  Had we been calculating the
261/minimum/ number of trials required to observe a certain number of /successes/
262using `find_minimum_number_of_trials` we would have taken the ceiling instead.
263
264We'll finish off by looking at some sample output, firstly for
265a 1 in 1000 chance of component failure with each use:
266
267[pre
268'''________________________
269Maximum Number of Trials
270________________________
271
272Success ratio                           =  0.001
273Maximum Number of "successes" permitted =  0
274
275
276____________________________
277Confidence        Max Number
278 Value (%)        Of Trials
279____________________________
280    50.000            692
281    75.000            287
282    90.000            105
283    95.000             51
284    99.000             10
285    99.900              0
286    99.990              0
287    99.999              0'''
288]
289
290So 51 "uses" of the component would yield a 95% chance that no
291component failures would be observed.
292
293Compare that with a 1 in 1 million chance of component failure:
294
295[pre'''
296________________________
297Maximum Number of Trials
298________________________
299
300Success ratio                           =  0.0000010
301Maximum Number of "successes" permitted =  0
302
303
304____________________________
305Confidence        Max Number
306 Value (%)        Of Trials
307____________________________
308    50.000         693146
309    75.000         287681
310    90.000         105360
311    95.000          51293
312    99.000          10050
313    99.900           1000
314    99.990            100
315    99.999             10'''
316]
317
318In this case, even 1000 uses of the component would still yield a
319less than 1 in 1000 chance of observing a component failure
320(i.e. a 99.9% chance of no failure).
321
322[endsect] [/section:binom_size_eg Estimating Sample Sizes for a Binomial Distribution.]
323
324[endsect] [/section:binom_eg Binomial Distribution]
325
326[/
327  Copyright 2006 John Maddock and Paul A. Bristow.
328  Distributed under the Boost Software License, Version 1.0.
329  (See accompanying file LICENSE_1_0.txt or copy at
330  http://www.boost.org/LICENSE_1_0.txt).
331]
332
333