[section:binom_eg Binomial Distribution Examples]

See also the reference documentation for the __binomial_distrib.

[section:binomial_coinflip_example Binomial Coin-Flipping Example]

[import ../../example/binomial_coinflip_example.cpp]
[binomial_coinflip_example1]

See [@../../example/binomial_coinflip_example.cpp binomial_coinflip_example.cpp]
for the full source code; the program output looks like this:

[binomial_coinflip_example_output]

[endsect] [/section:binomial_coinflip_example Binomial Coin-Flipping Example]

[section:binomial_quiz_example Binomial Quiz Example]

[import ../../example/binomial_quiz_example.cpp]
[binomial_quiz_example1]
[binomial_quiz_example2]
[discrete_quantile_real]

See [@../../example/binomial_quiz_example.cpp binomial_quiz_example.cpp]
for the full source code and output.

[endsect] [/section:binomial_quiz_example Binomial Quiz Example]

[section:binom_conf Calculating Confidence Limits on the Frequency of Occurrence for a Binomial Distribution]

Imagine you have a process that follows a binomial distribution: for each
trial conducted, an event either occurs or it does not, the two outcomes
being referred to as "successes" and "failures".  If, by experiment, you want
to measure the frequency with which successes occur, the best estimate is given
simply by /k/ \/ /N/, for /k/ successes out of /N/ trials.  However, our confidence
in that estimate will be shaped by how many trials were conducted, and how many
successes were observed.  The static member functions
`binomial_distribution<>::find_lower_bound_on_p` and
`binomial_distribution<>::find_upper_bound_on_p` allow you to calculate
the confidence intervals for your estimate of the occurrence frequency.

The sample program [@../../example/binomial_confidence_limits.cpp
binomial_confidence_limits.cpp] illustrates their use.  It begins by defining
a procedure that will print a table of confidence limits for various degrees
of certainty:

    #include <iostream>
    #include <iomanip>
    #include <boost/math/distributions/binomial.hpp>

    void confidence_limits_on_frequency(unsigned trials, unsigned successes)
    {
       //
       // trials = Total number of trials.
       // successes = Total number of observed successes.
       //
       // Calculate confidence limits for an observed
       // frequency of occurrence that follows a binomial
       // distribution.
       //
       using namespace std;
       using namespace boost::math;

       // Print out general info:
       cout <<
          "___________________________________________\n"
          "2-Sided Confidence Limits For Success Ratio\n"
          "___________________________________________\n\n";
       cout << setprecision(7);
       cout << setw(40) << left << "Number of Observations" << "= " << trials << "\n";
       cout << setw(40) << left << "Number of successes" << "= " << successes << "\n";
       cout << setw(40) << left << "Sample frequency of occurrence" << "= " << double(successes) / trials << "\n";

The procedure now defines a table of significance levels: these are the
probabilities that the true occurrence frequency lies outside the calculated
interval:

       double alpha[] = { 0.5, 0.25, 0.1, 0.05, 0.01, 0.001, 0.0001, 0.00001 };

Some pretty printing of the table header follows:

       cout << "\n\n"
               "_______________________________________________________________________\n"
               "Confidence        Lower CP       Upper CP       Lower JP       Upper JP\n"
               " Value (%)        Limit          Limit          Limit          Limit\n"
               "_______________________________________________________________________\n";

And now for the important part - the intervals themselves - for each
value of /alpha/, we call `find_lower_bound_on_p` and
`find_upper_bound_on_p` to obtain lower and upper bounds
respectively.  Note that since we are calculating a two-sided interval,
we must divide the value of alpha in two.

Please note that calculating two separate /single-sided bounds/, each with risk
level [alpha], is not the same thing as calculating a two-sided interval.
Had we calculated two single-sided intervals, each with a risk of [alpha]
that the true value is outside the interval, then:

* The risk that it is less than the lower bound is [alpha].

and

* The risk that it is greater than the upper bound is also [alpha].

So the risk that it lies outside the *upper or lower bound* is *twice* [alpha], and the
probability that it lies inside the bounds is therefore not nearly as high as
one might have thought.  This is why [alpha]/2 must be used in
the calculations below.

In contrast, had we been calculating a
single-sided interval, for example: ['"Calculate a lower bound so that we are P%
sure that the true occurrence frequency is greater than some value"]
then we would *not* have divided by two.
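
To make the distinction concrete, here is a minimal standalone sketch (it is
not part of [@../../example/binomial_confidence_limits.cpp binomial_confidence_limits.cpp];
the 20 trials and 4 successes simply match the sample output further below):

    #include <iostream>
    #include <boost/math/distributions/binomial.hpp>

    int main()
    {
       using boost::math::binomial_distribution;
       double alpha = 0.05;   // 95% confidence overall.
       // Two-sided interval: the risk is split between the two tails,
       // so each call receives alpha/2:
       double lower = binomial_distribution<>::find_lower_bound_on_p(20, 4, alpha / 2);
       double upper = binomial_distribution<>::find_upper_bound_on_p(20, 4, alpha / 2);
       // Single-sided lower bound: all of the risk is that the true value
       // lies below the bound, so the full alpha is passed:
       double lower_1_sided = binomial_distribution<>::find_lower_bound_on_p(20, 4, alpha);
       std::cout << "two-sided 95% interval:    [" << lower << ", " << upper << "]\n";
       std::cout << "one-sided 95% lower bound: " << lower_1_sided << "\n";
    }

Because it spends none of its risk on the upper tail, the single-sided 95%
lower bound is closer to the observed ratio than the lower limit of the
two-sided 95% interval; indeed it is computed by exactly the same call as the
lower CP limit in the 90% row of the table below.
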
Finally, note that `binomial_distribution` provides a choice of two
methods for the calculation; we print out the results from both
methods in this example:

       for(unsigned i = 0; i < sizeof(alpha)/sizeof(alpha[0]); ++i)
       {
          // Confidence value:
          cout << fixed << setprecision(3) << setw(10) << right << 100 * (1-alpha[i]);
          // Calculate Clopper Pearson bounds:
          double l = binomial_distribution<>::find_lower_bound_on_p(
                        trials, successes, alpha[i]/2);
          double u = binomial_distribution<>::find_upper_bound_on_p(
                        trials, successes, alpha[i]/2);
          // Print Clopper Pearson Limits:
          cout << fixed << setprecision(5) << setw(15) << right << l;
          cout << fixed << setprecision(5) << setw(15) << right << u;
          // Calculate Jeffreys Prior Bounds:
          l = binomial_distribution<>::find_lower_bound_on_p(
                 trials, successes, alpha[i]/2,
                 binomial_distribution<>::jeffreys_prior_interval);
          u = binomial_distribution<>::find_upper_bound_on_p(
                 trials, successes, alpha[i]/2,
                 binomial_distribution<>::jeffreys_prior_interval);
          // Print Jeffreys Prior Limits:
          cout << fixed << setprecision(5) << setw(15) << right << l;
          cout << fixed << setprecision(5) << setw(15) << right << u << std::endl;
       }
       cout << endl;
    }

And that's all there is to it.  Let's see some sample output for a 2 in 10
success ratio, first for 20 trials:

[pre'''___________________________________________
2-Sided Confidence Limits For Success Ratio
___________________________________________

Number of Observations                  = 20
Number of successes                     = 4
Sample frequency of occurrence          = 0.2


_______________________________________________________________________
Confidence        Lower CP       Upper CP       Lower JP       Upper JP
 Value (%)        Limit          Limit          Limit          Limit
_______________________________________________________________________
    50.000        0.12840        0.29588        0.14974        0.26916
    75.000        0.09775        0.34633        0.11653        0.31861
    90.000        0.07135        0.40103        0.08734        0.37274
    95.000        0.05733        0.43661        0.07152        0.40823
    99.000        0.03576        0.50661        0.04655        0.47859
    99.900        0.01905        0.58632        0.02634        0.55960
    99.990        0.01042        0.64997        0.01530        0.62495
    99.999        0.00577        0.70216        0.00901        0.67897
''']

As you can see, even at the 95% confidence level the bounds are
really quite wide (this example is chosen to be easily compared to the one
in the __handbook
[@http://www.itl.nist.gov/div898/handbook/prc/section2/prc241.htm
here]).  Note also that the Clopper-Pearson calculation method (CP above)
produces quite noticeably more pessimistic estimates than the Jeffreys Prior
method (JP above).
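
If you only need a single pair of limits rather than the whole table, the two
methods can also be compared directly.  The following minimal sketch is not
part of the example program; it reuses the 20 trials, 4 successes and 95%
level from the table above, requesting the Jeffreys Prior bounds by passing
`binomial_distribution<>::jeffreys_prior_interval` as the final argument
(leaving that argument at its default selects Clopper-Pearson, as in the loop
above):

    #include <iostream>
    #include <boost/math/distributions/binomial.hpp>

    int main()
    {
       using boost::math::binomial_distribution;
       // 95% two-sided interval, so each tail gets alpha/2 = 0.025:
       double l_cp = binomial_distribution<>::find_lower_bound_on_p(20, 4, 0.025);
       double u_cp = binomial_distribution<>::find_upper_bound_on_p(20, 4, 0.025);
       // The same bounds computed with the Jeffreys Prior method instead:
       double l_jp = binomial_distribution<>::find_lower_bound_on_p(
          20, 4, 0.025, binomial_distribution<>::jeffreys_prior_interval);
       double u_jp = binomial_distribution<>::find_upper_bound_on_p(
          20, 4, 0.025, binomial_distribution<>::jeffreys_prior_interval);
       std::cout << "Clopper-Pearson: [" << l_cp << ", " << u_cp << "]\n";
       std::cout << "Jeffreys Prior:  [" << l_jp << ", " << u_jp << "]\n";
    }
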
Compare that with the program output for 2000 trials:

[pre'''___________________________________________
2-Sided Confidence Limits For Success Ratio
___________________________________________

Number of Observations                  = 2000
Number of successes                     = 400
Sample frequency of occurrence          = 0.2000000


_______________________________________________________________________
Confidence        Lower CP       Upper CP       Lower JP       Upper JP
 Value (%)        Limit          Limit          Limit          Limit
_______________________________________________________________________
    50.000        0.19382        0.20638        0.19406        0.20613
    75.000        0.18965        0.21072        0.18990        0.21047
    90.000        0.18537        0.21528        0.18561        0.21503
    95.000        0.18267        0.21821        0.18291        0.21796
    99.000        0.17745        0.22400        0.17769        0.22374
    99.900        0.17150        0.23079        0.17173        0.23053
    99.990        0.16658        0.23657        0.16681        0.23631
    99.999        0.16233        0.24169        0.16256        0.24143
''']

Now even when the confidence level is very high, the limits are really
quite close to the experimentally calculated value of 0.2.  Furthermore,
the difference between the two calculation methods is now very small.

[endsect] [/section:binom_conf Calculating Confidence Limits on the Frequency of Occurrence for a Binomial Distribution]

[section:binom_size_eg Estimating Sample Sizes for a Binomial Distribution.]

Imagine you have a critical component that you know will fail in 1 in
N "uses" (for some suitable definition of "use").  You may want to schedule
routine replacement of the component so that its chance of failure between
routine replacements is less than P%.  If the failures follow a binomial
distribution (each time the component is "used" it either fails or does not),
then the static member function `binomial_distribution<>::find_maximum_number_of_trials`
can be used to estimate the maximum number of "uses" of that component for some
acceptable risk level /alpha/.

The example program
[@../../example/binomial_sample_sizes.cpp binomial_sample_sizes.cpp]
demonstrates its usage.  It centres on a routine that prints out
a table of maximum sample sizes for various probability thresholds:

    void find_max_sample_size(
       double p,            // success ratio.
       unsigned successes)  // Total number of observed successes permitted.
    {

The routine then declares a table of probability thresholds: these are the
maximum acceptable probabilities that more than /successes/ events will be
observed.  In our example, /successes/ will always be zero, since we want
no component failures, but in other situations non-zero values may well
make sense.

       double alpha[] = { 0.5, 0.25, 0.1, 0.05, 0.01, 0.001, 0.0001, 0.00001 };

Much of the rest of the program is pretty-printing; the important part
is the calculation of the maximum number of permitted trials for each
value of alpha:

       for(unsigned i = 0; i < sizeof(alpha)/sizeof(alpha[0]); ++i)
       {
          // Confidence value:
          cout << fixed << setprecision(3) << setw(10) << right << 100 * (1-alpha[i]);
          // calculate trials:
          double t = binomial::find_maximum_number_of_trials(
                        successes, p, alpha[i]);
          t = floor(t);
          // Print Trials:
          cout << fixed << setprecision(5) << setw(15) << right << t << endl;
       }

Note that since we're calculating the maximum number of trials permitted,
we'll err on the safe side and take the floor of the result.  Had we been
calculating the /minimum/ number of trials required to observe a certain
number of /successes/ using `find_minimum_number_of_trials`, we would have
taken the ceiling instead.
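
As a minimal standalone sketch of that distinction (it is not part of
[@../../example/binomial_sample_sizes.cpp binomial_sample_sizes.cpp]; the
1 in 1000 failure ratio matches the first table below, while the target of
10 events passed to `find_minimum_number_of_trials` is just an illustrative
value), the two functions and the two rounding directions might be used like
this:

    #include <cmath>
    #include <iostream>
    #include <boost/math/distributions/binomial.hpp>

    int main()
    {
       using boost::math::binomial;  // convenience typedef for binomial_distribution<>
       double p = 0.001;             // 1 in 1000 chance of failure per "use".
       double alpha = 0.05;          // Accept a 5% risk, i.e. 95% confidence.
       // Maximum number of "uses" for which the chance of seeing any
       // failure at all is still no more than alpha: round *down* to
       // stay on the safe side.
       double max_uses = std::floor(
          binomial::find_maximum_number_of_trials(0, p, alpha));
       // Minimum number of trials for which the chance of seeing 10 or
       // fewer events is no more than alpha: here we round *up* instead.
       double min_trials = std::ceil(
          binomial::find_minimum_number_of_trials(10, p, alpha));
       std::cout << "maximum number of uses:   " << max_uses << "\n";
       std::cout << "minimum number of trials: " << min_trials << "\n";
    }
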
We'll finish off by looking at some sample output, firstly for
a 1 in 1000 chance of component failure with each use:

[pre'''________________________
Maximum Number of Trials
________________________

Success ratio                           = 0.001
Maximum Number of "successes" permitted = 0


____________________________
Confidence        Max Number
 Value (%)        Of Trials
____________________________
    50.000            692
    75.000            287
    90.000            105
    95.000             51
    99.000             10
    99.900              0
    99.990              0
    99.999              0
''']

So 51 "uses" of the component would yield a 95% chance that no
component failures would be observed.

Compare that with a 1 in 1 million chance of component failure:

[pre'''________________________
Maximum Number of Trials
________________________

Success ratio                           = 0.0000010
Maximum Number of "successes" permitted = 0


____________________________
Confidence        Max Number
 Value (%)        Of Trials
____________________________
    50.000         693146
    75.000         287681
    90.000         105360
    95.000          51293
    99.000          10050
    99.900           1000
    99.990            100
    99.999             10
''']

In this case, even 1000 uses of the component would still yield a
less than 1 in 1000 chance of observing a component failure
(i.e. a 99.9% chance of no failure).

[endsect] [/section:binom_size_eg Estimating Sample Sizes for a Binomial Distribution.]

[endsect] [/section:binom_eg Binomial Distribution Examples]

[/
  Copyright 2006 John Maddock and Paul A. Bristow.
  Distributed under the Boost Software License, Version 1.0.
  (See accompanying file LICENSE_1_0.txt or copy at
  http://www.boost.org/LICENSE_1_0.txt).
]