1<html> 2<head> 3<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> 4<title>Calculating Confidence Limits on the Frequency of Occurrence for a Binomial Distribution</title> 5<link rel="stylesheet" href="../../../../math.css" type="text/css"> 6<meta name="generator" content="DocBook XSL Stylesheets V1.79.1"> 7<link rel="home" href="../../../../index.html" title="Math Toolkit 2.12.0"> 8<link rel="up" href="../binom_eg.html" title="Binomial Distribution Examples"> 9<link rel="prev" href="binomial_quiz_example.html" title="Binomial Quiz Example"> 10<link rel="next" href="binom_size_eg.html" title="Estimating Sample Sizes for a Binomial Distribution."> 11</head> 12<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"> 13<table cellpadding="2" width="100%"><tr> 14<td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../../../../../../boost.png"></td> 15<td align="center"><a href="../../../../../../../../index.html">Home</a></td> 16<td align="center"><a href="../../../../../../../../libs/libraries.htm">Libraries</a></td> 17<td align="center"><a href="http://www.boost.org/users/people.html">People</a></td> 18<td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td> 19<td align="center"><a href="../../../../../../../../more/index.htm">More</a></td> 20</tr></table> 21<hr> 22<div class="spirit-nav"> 23<a accesskey="p" href="binomial_quiz_example.html"><img src="../../../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../binom_eg.html"><img src="../../../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../../../index.html"><img src="../../../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="binom_size_eg.html"><img src="../../../../../../../../doc/src/images/next.png" alt="Next"></a> 24</div> 25<div class="section"> 26<div class="titlepage"><div><div><h5 class="title"> 27<a name="math_toolkit.stat_tut.weg.binom_eg.binom_conf"></a><a class="link" href="binom_conf.html" title="Calculating Confidence Limits on the Frequency of Occurrence for a Binomial Distribution">Calculating 28 Confidence Limits on the Frequency of Occurrence for a Binomial Distribution</a> 29</h5></div></div></div> 30<p> 31 Imagine you have a process that follows a binomial distribution: for 32 each trial conducted, an event either occurs or does it does not, referred 33 to as "successes" and "failures". If, by experiment, 34 you want to measure the frequency with which successes occur, the best 35 estimate is given simply by <span class="emphasis"><em>k</em></span> / <span class="emphasis"><em>N</em></span>, 36 for <span class="emphasis"><em>k</em></span> successes out of <span class="emphasis"><em>N</em></span> trials. 37 However our confidence in that estimate will be shaped by how many trials 38 were conducted, and how many successes were observed. The static member 39 functions <code class="computeroutput"><span class="identifier">binomial_distribution</span><span class="special"><>::</span><span class="identifier">find_lower_bound_on_p</span></code> 40 and <code class="computeroutput"><span class="identifier">binomial_distribution</span><span class="special"><>::</span><span class="identifier">find_upper_bound_on_p</span></code> 41 allow you to calculate the confidence intervals for your estimate of 42 the occurrence frequency. 43 </p> 44<p> 45 The sample program <a href="../../../../../../example/binomial_confidence_limits.cpp" target="_top">binomial_confidence_limits.cpp</a> 46 illustrates their use. It begins by defining a procedure that will print 47 a table of confidence limits for various degrees of certainty: 48 </p> 49<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span> 50<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iomanip</span><span class="special">></span> 51<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">math</span><span class="special">/</span><span class="identifier">distributions</span><span class="special">/</span><span class="identifier">binomial</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span> 52 53<span class="keyword">void</span> <span class="identifier">confidence_limits_on_frequency</span><span class="special">(</span><span class="keyword">unsigned</span> <span class="identifier">trials</span><span class="special">,</span> <span class="keyword">unsigned</span> <span class="identifier">successes</span><span class="special">)</span> 54<span class="special">{</span> 55 <span class="comment">//</span> 56 <span class="comment">// trials = Total number of trials.</span> 57 <span class="comment">// successes = Total number of observed successes.</span> 58 <span class="comment">//</span> 59 <span class="comment">// Calculate confidence limits for an observed</span> 60 <span class="comment">// frequency of occurrence that follows a binomial</span> 61 <span class="comment">// distribution.</span> 62 <span class="comment">//</span> 63 <span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">std</span><span class="special">;</span> 64 <span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">math</span><span class="special">;</span> 65 66 <span class="comment">// Print out general info:</span> 67 <span class="identifier">cout</span> <span class="special"><<</span> 68 <span class="string">"___________________________________________\n"</span> 69 <span class="string">"2-Sided Confidence Limits For Success Ratio\n"</span> 70 <span class="string">"___________________________________________\n\n"</span><span class="special">;</span> 71 <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">setprecision</span><span class="special">(</span><span class="number">7</span><span class="special">);</span> 72 <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">40</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">left</span> <span class="special"><<</span> <span class="string">"Number of Observations"</span> <span class="special"><<</span> <span class="string">"= "</span> <span class="special"><<</span> <span class="identifier">trials</span> <span class="special"><<</span> <span class="string">"\n"</span><span class="special">;</span> 73 <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">40</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">left</span> <span class="special"><<</span> <span class="string">"Number of successes"</span> <span class="special"><<</span> <span class="string">"= "</span> <span class="special"><<</span> <span class="identifier">successes</span> <span class="special"><<</span> <span class="string">"\n"</span><span class="special">;</span> 74 <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">40</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">left</span> <span class="special"><<</span> <span class="string">"Sample frequency of occurrence"</span> <span class="special"><<</span> <span class="string">"= "</span> <span class="special"><<</span> <span class="keyword">double</span><span class="special">(</span><span class="identifier">successes</span><span class="special">)</span> <span class="special">/</span> <span class="identifier">trials</span> <span class="special"><<</span> <span class="string">"\n"</span><span class="special">;</span> 75</pre> 76<p> 77 The procedure now defines a table of significance levels: these are the 78 probabilities that the true occurrence frequency lies outside the calculated 79 interval: 80 </p> 81<pre class="programlisting"><span class="keyword">double</span> <span class="identifier">alpha</span><span class="special">[]</span> <span class="special">=</span> <span class="special">{</span> <span class="number">0.5</span><span class="special">,</span> <span class="number">0.25</span><span class="special">,</span> <span class="number">0.1</span><span class="special">,</span> <span class="number">0.05</span><span class="special">,</span> <span class="number">0.01</span><span class="special">,</span> <span class="number">0.001</span><span class="special">,</span> <span class="number">0.0001</span><span class="special">,</span> <span class="number">0.00001</span> <span class="special">};</span> 82</pre> 83<p> 84 Some pretty printing of the table header follows: 85 </p> 86<pre class="programlisting"><span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"\n\n"</span> 87 <span class="string">"_______________________________________________________________________\n"</span> 88 <span class="string">"Confidence Lower CP Upper CP Lower JP Upper JP\n"</span> 89 <span class="string">" Value (%) Limit Limit Limit Limit\n"</span> 90 <span class="string">"_______________________________________________________________________\n"</span><span class="special">;</span> 91</pre> 92<p> 93 And now for the important part - the intervals themselves - for each 94 value of <span class="emphasis"><em>alpha</em></span>, we call <code class="computeroutput"><span class="identifier">find_lower_bound_on_p</span></code> 95 and <code class="computeroutput"><span class="identifier">find_lower_upper_on_p</span></code> 96 to obtain lower and upper bounds respectively. Note that since we are 97 calculating a two-sided interval, we must divide the value of alpha in 98 two. 99 </p> 100<p> 101 Please note that calculating two separate <span class="emphasis"><em>single sided bounds</em></span>, 102 each with risk level α is not the same thing as calculating a two sided 103 interval. Had we calculate two single-sided intervals each with a risk 104 that the true value is outside the interval of α, then: 105 </p> 106<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"> 107 The risk that it is less than the lower bound is α. 108 </li></ul></div> 109<p> 110 and 111 </p> 112<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"> 113 The risk that it is greater than the upper bound is also α. 114 </li></ul></div> 115<p> 116 So the risk it is outside <span class="bold"><strong>upper or lower bound</strong></span>, 117 is <span class="bold"><strong>twice</strong></span> alpha, and the probability 118 that it is inside the bounds is therefore not nearly as high as one might 119 have thought. This is why α/2 must be used in the calculations below. 120 </p> 121<p> 122 In contrast, had we been calculating a single-sided interval, for example: 123 <span class="emphasis"><em>"Calculate a lower bound so that we are P% sure that the 124 true occurrence frequency is greater than some value"</em></span> 125 then we would <span class="bold"><strong>not</strong></span> have divided by two. 126 </p> 127<p> 128 Finally note that <code class="computeroutput"><span class="identifier">binomial_distribution</span></code> 129 provides a choice of two methods for the calculation, we print out the 130 results from both methods in this example: 131 </p> 132<pre class="programlisting"> <span class="keyword">for</span><span class="special">(</span><span class="keyword">unsigned</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span> <span class="identifier">i</span> <span class="special"><</span> <span class="keyword">sizeof</span><span class="special">(</span><span class="identifier">alpha</span><span class="special">)/</span><span class="keyword">sizeof</span><span class="special">(</span><span class="identifier">alpha</span><span class="special">[</span><span class="number">0</span><span class="special">]);</span> <span class="special">++</span><span class="identifier">i</span><span class="special">)</span> 133 <span class="special">{</span> 134 <span class="comment">// Confidence value:</span> 135 <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">fixed</span> <span class="special"><<</span> <span class="identifier">setprecision</span><span class="special">(</span><span class="number">3</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">10</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">right</span> <span class="special"><<</span> <span class="number">100</span> <span class="special">*</span> <span class="special">(</span><span class="number">1</span><span class="special">-</span><span class="identifier">alpha</span><span class="special">[</span><span class="identifier">i</span><span class="special">]);</span> 136 <span class="comment">// Calculate Clopper Pearson bounds:</span> 137 <span class="keyword">double</span> <span class="identifier">l</span> <span class="special">=</span> <span class="identifier">binomial_distribution</span><span class="special"><>::</span><span class="identifier">find_lower_bound_on_p</span><span class="special">(</span> 138 <span class="identifier">trials</span><span class="special">,</span> <span class="identifier">successes</span><span class="special">,</span> <span class="identifier">alpha</span><span class="special">[</span><span class="identifier">i</span><span class="special">]/</span><span class="number">2</span><span class="special">);</span> 139 <span class="keyword">double</span> <span class="identifier">u</span> <span class="special">=</span> <span class="identifier">binomial_distribution</span><span class="special"><>::</span><span class="identifier">find_upper_bound_on_p</span><span class="special">(</span> 140 <span class="identifier">trials</span><span class="special">,</span> <span class="identifier">successes</span><span class="special">,</span> <span class="identifier">alpha</span><span class="special">[</span><span class="identifier">i</span><span class="special">]/</span><span class="number">2</span><span class="special">);</span> 141 <span class="comment">// Print Clopper Pearson Limits:</span> 142 <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">fixed</span> <span class="special"><<</span> <span class="identifier">setprecision</span><span class="special">(</span><span class="number">5</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">15</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">right</span> <span class="special"><<</span> <span class="identifier">l</span><span class="special">;</span> 143 <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">fixed</span> <span class="special"><<</span> <span class="identifier">setprecision</span><span class="special">(</span><span class="number">5</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">15</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">right</span> <span class="special"><<</span> <span class="identifier">u</span><span class="special">;</span> 144 <span class="comment">// Calculate Jeffreys Prior Bounds:</span> 145 <span class="identifier">l</span> <span class="special">=</span> <span class="identifier">binomial_distribution</span><span class="special"><>::</span><span class="identifier">find_lower_bound_on_p</span><span class="special">(</span> 146 <span class="identifier">trials</span><span class="special">,</span> <span class="identifier">successes</span><span class="special">,</span> <span class="identifier">alpha</span><span class="special">[</span><span class="identifier">i</span><span class="special">]/</span><span class="number">2</span><span class="special">,</span> 147 <span class="identifier">binomial_distribution</span><span class="special"><>::</span><span class="identifier">jeffreys_prior_interval</span><span class="special">);</span> 148 <span class="identifier">u</span> <span class="special">=</span> <span class="identifier">binomial_distribution</span><span class="special"><>::</span><span class="identifier">find_upper_bound_on_p</span><span class="special">(</span> 149 <span class="identifier">trials</span><span class="special">,</span> <span class="identifier">successes</span><span class="special">,</span> <span class="identifier">alpha</span><span class="special">[</span><span class="identifier">i</span><span class="special">]/</span><span class="number">2</span><span class="special">,</span> 150 <span class="identifier">binomial_distribution</span><span class="special"><>::</span><span class="identifier">jeffreys_prior_interval</span><span class="special">);</span> 151 <span class="comment">// Print Jeffreys Prior Limits:</span> 152 <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">fixed</span> <span class="special"><<</span> <span class="identifier">setprecision</span><span class="special">(</span><span class="number">5</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">15</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">right</span> <span class="special"><<</span> <span class="identifier">l</span><span class="special">;</span> 153 <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">fixed</span> <span class="special"><<</span> <span class="identifier">setprecision</span><span class="special">(</span><span class="number">5</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">15</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">right</span> <span class="special"><<</span> <span class="identifier">u</span> <span class="special"><<</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span> 154 <span class="special">}</span> 155 <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> 156<span class="special">}</span> 157</pre> 158<p> 159 And that's all there is to it. Let's see some sample output for a 2 in 160 10 success ratio, first for 20 trials: 161 </p> 162<pre class="programlisting">___________________________________________ 1632-Sided Confidence Limits For Success Ratio 164___________________________________________ 165 166Number of Observations = 20 167Number of successes = 4 168Sample frequency of occurrence = 0.2 169 170 171_______________________________________________________________________ 172Confidence Lower CP Upper CP Lower JP Upper JP 173 Value (%) Limit Limit Limit Limit 174_______________________________________________________________________ 175 50.000 0.12840 0.29588 0.14974 0.26916 176 75.000 0.09775 0.34633 0.11653 0.31861 177 90.000 0.07135 0.40103 0.08734 0.37274 178 95.000 0.05733 0.43661 0.07152 0.40823 179 99.000 0.03576 0.50661 0.04655 0.47859 180 99.900 0.01905 0.58632 0.02634 0.55960 181 99.990 0.01042 0.64997 0.01530 0.62495 182 99.999 0.00577 0.70216 0.00901 0.67897 183</pre> 184<p> 185 As you can see, even at the 95% confidence level the bounds are really 186 quite wide (this example is chosen to be easily compared to the one in 187 the <a href="http://www.itl.nist.gov/div898/handbook/" target="_top">NIST/SEMATECH 188 e-Handbook of Statistical Methods.</a> <a href="http://www.itl.nist.gov/div898/handbook/prc/section2/prc241.htm" target="_top">here</a>). 189 Note also that the Clopper-Pearson calculation method (CP above) produces 190 quite noticeably more pessimistic estimates than the Jeffreys Prior method 191 (JP above). 192 </p> 193<p> 194 Compare that with the program output for 2000 trials: 195 </p> 196<pre class="programlisting">___________________________________________ 1972-Sided Confidence Limits For Success Ratio 198___________________________________________ 199 200Number of Observations = 2000 201Number of successes = 400 202Sample frequency of occurrence = 0.2000000 203 204 205_______________________________________________________________________ 206Confidence Lower CP Upper CP Lower JP Upper JP 207 Value (%) Limit Limit Limit Limit 208_______________________________________________________________________ 209 50.000 0.19382 0.20638 0.19406 0.20613 210 75.000 0.18965 0.21072 0.18990 0.21047 211 90.000 0.18537 0.21528 0.18561 0.21503 212 95.000 0.18267 0.21821 0.18291 0.21796 213 99.000 0.17745 0.22400 0.17769 0.22374 214 99.900 0.17150 0.23079 0.17173 0.23053 215 99.990 0.16658 0.23657 0.16681 0.23631 216 99.999 0.16233 0.24169 0.16256 0.24143 217</pre> 218<p> 219 Now even when the confidence level is very high, the limits are really 220 quite close to the experimentally calculated value of 0.2. Furthermore 221 the difference between the two calculation methods is now really quite 222 small. 223 </p> 224</div> 225<table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr> 226<td align="left"></td> 227<td align="right"><div class="copyright-footer">Copyright © 2006-2019 Nikhar 228 Agrawal, Anton Bikineev, Paul A. Bristow, Marco Guazzone, Christopher Kormanyos, 229 Hubert Holin, Bruno Lalande, John Maddock, Jeremy Murphy, Matthew Pulver, Johan 230 Råde, Gautam Sewani, Benjamin Sobotta, Nicholas Thompson, Thijs van den Berg, 231 Daryle Walker and Xiaogang Zhang<p> 232 Distributed under the Boost Software License, Version 1.0. (See accompanying 233 file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>) 234 </p> 235</div></td> 236</tr></table> 237<hr> 238<div class="spirit-nav"> 239<a accesskey="p" href="binomial_quiz_example.html"><img src="../../../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../binom_eg.html"><img src="../../../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../../../index.html"><img src="../../../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="binom_size_eg.html"><img src="../../../../../../../../doc/src/images/next.png" alt="Next"></a> 240</div> 241</body> 242</html> 243