[section:st_eg Student's t Distribution Examples]

[section:tut_mean_intervals Calculating confidence intervals on the mean with the Students-t distribution]

Let's say you have a sample mean: you may wish to know what confidence intervals
you can place on that mean. Colloquially: "I want an interval that I can be
P% sure contains the true mean". (On a technical point, note that
the interval either contains the true mean or it does not: the
meaning of the confidence level is subtly
different from this colloquialism. More background information can be found on the
[@http://www.itl.nist.gov/div898/handbook/eda/section3/eda352.htm NIST site]).

The formula for the interval can be expressed as:

[equation dist_tutorial4]

Where, ['Y[sub s]] is the sample mean, /s/ is the sample standard deviation,
/N/ is the sample size, /[alpha]/ is the desired significance level and
['t[sub ([alpha]/2,N-1)]] is the upper critical value of the Students-t
distribution with /N-1/ degrees of freedom.

[note
The quantity [alpha] is the maximum acceptable risk of falsely rejecting
the null-hypothesis. The smaller the value of [alpha] the greater the
strength of the test.

The confidence level of the test is defined as 1 - [alpha], and often expressed
as a percentage. So for example a significance level of 0.05, is equivalent
to a 95% confidence level. Refer to
[@http://www.itl.nist.gov/div898/handbook/prc/section1/prc14.htm
"What are confidence intervals?"] in __handbook for more information.
] [/ Note]

[note
The usual assumptions of
[@http://en.wikipedia.org/wiki/Independent_and_identically-distributed_random_variables independent and identically distributed (i.i.d.)]
variables and [@http://en.wikipedia.org/wiki/Normal_distribution normal distribution]
of course apply here, as they do in other examples.
]

From the formula, it should be clear that:

* The width of the confidence interval decreases as the sample size increases.
* The width increases as the standard deviation increases.
* The width increases as the ['confidence level increases] (0.5 towards 0.99999 - stronger).
* The width increases as the ['significance level decreases] (0.5 towards 0.00000...01 - stronger).

The following example code is taken from the example program
[@../../example/students_t_single_sample.cpp students_t_single_sample.cpp].

We'll begin by defining a procedure to calculate intervals for
various confidence levels; the procedure will print these out
as a table:

   // Needed includes:
   #include <boost/math/distributions/students_t.hpp>
   #include <cmath>
   #include <iostream>
   #include <iomanip>
   // Bring everything into global namespace for ease of use:
   using namespace boost::math;
   using namespace std;

   void confidence_limits_on_mean(
      double Sm,           // Sm = Sample Mean.
      double Sd,           // Sd = Sample Standard Deviation.
      unsigned Sn)         // Sn = Sample Size.
   {
      using namespace std;
      using namespace boost::math;

      // Print out general info:
      cout <<
         "__________________________________\n"
         "2-Sided Confidence Limits For Mean\n"
         "__________________________________\n\n";
      cout << setprecision(7);
      cout << setw(40) << left << "Number of Observations" << "= " << Sn << "\n";
      cout << setw(40) << left << "Mean" << "= " << Sm << "\n";
      cout << setw(40) << left << "Standard Deviation" << "= " << Sd << "\n";

We'll define a table of significance/risk levels for which we'll compute intervals:

      double alpha[] = { 0.5, 0.25, 0.1, 0.05, 0.01, 0.001, 0.0001, 0.00001 };

Note that these are the complements of the confidence/probability levels: 0.5, 0.75, 0.9 .. 0.99999.

Next we'll declare the distribution object we'll need; note that
the /degrees of freedom/ parameter is the sample size less one:

      students_t dist(Sn - 1);

Most of what follows in the program is pretty printing, so let's focus
on the calculation of the interval. First we need the t-statistic,
computed using the /quantile/ function and our significance level. Note
that since the significance levels are the complement of the probability,
we have to wrap the arguments in a call to /complement(...)/:

      double T = quantile(complement(dist, alpha[i] / 2));

Note that alpha was divided by two, since we'll be calculating
both the upper and lower bounds: had we been interested in a single
sided interval then we would have omitted this step.

Now to complete the picture, we'll get the (one-sided) width of the
interval from the t-statistic
by multiplying by the standard deviation, and dividing by the square
root of the sample size:

      double w = T * Sd / sqrt(double(Sn));

The two-sided interval is then the sample mean plus and minus this width.

And apart from some more pretty-printing that completes the procedure.

Let's take a look at some sample output, first using the
[@http://www.itl.nist.gov/div898/handbook/eda/section4/eda428.htm
Heat flow data] from the NIST site. The data set was collected
by Bob Zarr of NIST in January, 1990 from a heat flow meter
calibration and stability analysis.
The corresponding dataplot
output for this test can be found in
[@http://www.itl.nist.gov/div898/handbook/eda/section3/eda352.htm
section 3.5.2] of the __handbook.

[pre'''
   __________________________________
   2-Sided Confidence Limits For Mean
   __________________________________

   Number of Observations                  = 195
   Mean                                    = 9.26146
   Standard Deviation                      = 0.02278881


   ___________________________________________________________________
   Confidence       T           Interval          Lower          Upper
    Value (%)     Value            Width          Limit          Limit
   ___________________________________________________________________
       50.000     0.676       1.103e-003        9.26036        9.26256
       75.000     1.154       1.883e-003        9.25958        9.26334
       90.000     1.653       2.697e-003        9.25876        9.26416
       95.000     1.972       3.219e-003        9.25824        9.26468
       99.000     2.601       4.245e-003        9.25721        9.26571
       99.900     3.341       5.453e-003        9.25601        9.26691
       99.990     3.973       6.484e-003        9.25498        9.26794
       99.999     4.537       7.404e-003        9.25406        9.26886
''']

As you can see the large sample size (195) and small standard deviation (0.023)
have combined to give very small intervals: even at the 99.999% level the
limits are 9.254 and 9.269, so we can be
very confident that the true mean is close to 9.26.

For comparison the next example data output is taken from
['P.K.Hou, O. W. Lau & M.C. Wong, Analyst (1983) vol. 108, p 64.
and from Statistics for Analytical Chemistry, 3rd ed. (1994), pp 54-55
J. C. Miller and J. N. Miller, Ellis Horwood ISBN 0 13 0309907.]
The values result from the determination of mercury by cold-vapour
atomic absorption.

[pre'''
   __________________________________
   2-Sided Confidence Limits For Mean
   __________________________________

   Number of Observations                  = 3
   Mean                                    = 37.8000000
   Standard Deviation                      = 0.9643650


   ___________________________________________________________________
   Confidence       T           Interval          Lower          Upper
    Value (%)     Value            Width          Limit          Limit
   ___________________________________________________________________
       50.000     0.816            0.455       37.34539       38.25461
       75.000     1.604            0.893       36.90717       38.69283
       90.000     2.920            1.626       36.17422       39.42578
       95.000     4.303            2.396       35.40438       40.19562
       99.000     9.925            5.526       32.27408       43.32592
       99.900    31.599           17.594       20.20639       55.39361
       99.990    99.992           55.673      -17.87346       93.47346
       99.999   316.225          176.067     -138.26683      213.86683
''']

This time the fact that there are only three measurements leads to
much wider intervals, indeed such large intervals that it's hard
to be very confident in the location of the mean.

[endsect] [/section:tut_mean_intervals Calculating confidence intervals on the mean with the Students-t distribution]

[section:tut_mean_test Testing a sample mean for difference from a "true" mean]

When calibrating or comparing a scientific instrument or measurement method of some kind,
we want to be able to answer the question "Does an observed sample mean differ from the
"true" mean in any significant way?". If it does, then we have evidence of
a systematic difference. This question can be answered with a Students-t test:
more information can be found
[@http://www.itl.nist.gov/div898/handbook/eda/section3/eda352.htm
on the NIST site].

Of course, the assignment of "true" to one mean may be quite arbitrary:
often this is simply a "traditional" method of measurement.

The following example code is taken from the example program
[@../../example/students_t_single_sample.cpp students_t_single_sample.cpp].

We'll begin by defining a procedure to determine which of the
possible hypotheses are rejected or not-rejected
at a given significance level:

[note
Non-statisticians might say 'not-rejected' means 'accepted',
(often of the null-hypothesis) implying, wrongly, that there really *IS* no difference,
but statisticians eschew this to avoid implying that there is positive evidence of 'no difference'.
'Not-rejected' here means there is *no evidence* of difference, but there still might well be a difference.
For example, see [@http://en.wikipedia.org/wiki/Argument_from_ignorance argument from ignorance] and
[@http://www.bmj.com/cgi/content/full/311/7003/485 Absence of evidence does not constitute evidence of absence.]
] [/ note]


   // Needed includes:
   #include <boost/math/distributions/students_t.hpp>
   #include <cmath>
   #include <iostream>
   #include <iomanip>
   // Bring everything into global namespace for ease of use:
   using namespace boost::math;
   using namespace std;

   void single_sample_t_test(double M, double Sm, double Sd, unsigned Sn, double alpha)
   {
      //
      // M = true mean.
      // Sm = Sample Mean.
      // Sd = Sample Standard Deviation.
      // Sn = Sample Size.
      // alpha = Significance Level.

Most of the procedure is pretty-printing, so let's just focus on the
calculation. We begin by calculating the t-statistic:

      // Difference in means:
      double diff = Sm - M;
      // Degrees of freedom:
      unsigned v = Sn - 1;
      // t-statistic:
      double t_stat = diff * sqrt(double(Sn)) / Sd;

Finally calculate the probability from the t-statistic.
If we're interested
in simply whether there is a difference (either less or greater) or not,
we don't care about the sign of the t-statistic,
and we take the complement of the probability for comparison
to the significance level:

      students_t dist(v);
      double q = cdf(complement(dist, fabs(t_stat)));

The procedure then prints out the results of the various tests
that can be done, these
can be summarised in the following table:

[table
[[Hypothesis][Test]]
[[The Null-hypothesis: there is
*no difference* in means]
[Reject if complement of CDF for |t| < significance level / 2:

`cdf(complement(dist, fabs(t))) < alpha / 2`]]

[[The Alternative-hypothesis: there
*is a difference* in means]
[Reject if complement of CDF for |t| > significance level / 2:

`cdf(complement(dist, fabs(t))) > alpha / 2`]]

[[The Alternative-hypothesis: the sample mean *is less* than
the true mean.]
[Reject if complement of CDF of t < significance level:

`cdf(complement(dist, t)) < alpha`]]

[[The Alternative-hypothesis: the sample mean *is greater* than
the true mean.]
[Reject if CDF of t < significance level:

`cdf(dist, t) < alpha`]]
]

[note
Notice that the comparisons are against `alpha / 2` for a two-sided test
and against `alpha` for a one-sided test]

Now that we have all the parts in place, let's take a look at some
sample output, first using the
[@http://www.itl.nist.gov/div898/handbook/eda/section4/eda428.htm
Heat flow data] from the NIST site. The data set was collected
by Bob Zarr of NIST in January, 1990 from a heat flow meter
calibration and stability analysis. The corresponding dataplot
output for this test can be found in
[@http://www.itl.nist.gov/div898/handbook/eda/section3/eda352.htm
section 3.5.2] of the __handbook.

[pre
__________________________________
Student t test for a single sample
__________________________________

Number of Observations                                 = 195
Sample Mean                                            = 9.26146
Sample Standard Deviation                              = 0.02279
Expected True Mean                                     = 5.00000

Sample Mean - Expected Test Mean                       = 4.26146
Degrees of Freedom                                     = 194
T Statistic                                            = 2611.28380
Probability that difference is due to chance           = 0.000e+000

Results for Alternative Hypothesis and alpha           = 0.0500

Alternative Hypothesis     Conclusion
Mean != 5.000              NOT REJECTED
Mean  < 5.000              REJECTED
Mean  > 5.000              NOT REJECTED
]

You will note the line that says the probability that the difference is
due to chance is zero. From a philosophical point of view, of course,
the probability can never reach zero. However, in this case the calculated
probability is smaller than the smallest representable double precision number,
hence the appearance of a zero here. Whatever its "true" value is, we know it
must be extraordinarily small, so the alternative hypothesis - that there is
a difference in means - is not rejected.

For comparison the next example data output is taken from
['P.K.Hou, O. W. Lau & M.C. Wong, Analyst (1983) vol. 108, p 64.
and from Statistics for Analytical Chemistry, 3rd ed. (1994), pp 54-55
J. C. Miller and J. N. Miller, Ellis Horwood ISBN 0 13 0309907.]
The values result from the determination of mercury by cold-vapour
atomic absorption.

[pre
__________________________________
Student t test for a single sample
__________________________________

Number of Observations                                 = 3
Sample Mean                                            = 37.80000
Sample Standard Deviation                              = 0.96437
Expected True Mean                                     = 38.90000

Sample Mean - Expected Test Mean                       = -1.10000
Degrees of Freedom                                     = 2
T Statistic                                            = -1.97566
Probability that difference is due to chance           = 1.869e-001

Results for Alternative Hypothesis and alpha           = 0.0500

Alternative Hypothesis     Conclusion
Mean != 38.900             REJECTED
Mean  < 38.900             NOT REJECTED
Mean  > 38.900             NOT REJECTED
]

As you can see the small number of measurements (3) has led to a large uncertainty
in the location of the true mean. So even though there appears to be a difference
between the sample mean and the expected true mean, we conclude that there
is no significant difference, and are unable to reject the null hypothesis.
However, if we were to lower the bar for acceptance down to alpha = 0.1
(a 90% confidence level) we see a different output:

[pre
__________________________________
Student t test for a single sample
__________________________________

Number of Observations                                 = 3
Sample Mean                                            = 37.80000
Sample Standard Deviation                              = 0.96437
Expected True Mean                                     = 38.90000

Sample Mean - Expected Test Mean                       = -1.10000
Degrees of Freedom                                     = 2
T Statistic                                            = -1.97566
Probability that difference is due to chance           = 1.869e-001

Results for Alternative Hypothesis and alpha           = 0.1000

Alternative Hypothesis     Conclusion
Mean != 38.900             REJECTED
Mean  < 38.900             NOT REJECTED
Mean  > 38.900             REJECTED
]

In this case, we really have a borderline result,
and more data (and/or more accurate data),
is needed for a more convincing conclusion.
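
The T Statistic line in these outputs can be checked by hand from the summary
statistics alone. The following is a minimal sketch using only the standard
library; the function name `t_statistic` is a hypothetical helper introduced
here for illustration and is not part of the example program:

```cpp
#include <cmath>

// Hypothetical helper (an assumption for illustration, not taken from
// students_t_single_sample.cpp): the t-statistic of a single sample
// tested against a reference ("true") mean.
double t_statistic(double M,     // M  = reference ("true") mean.
                   double Sm,    // Sm = sample mean.
                   double Sd,    // Sd = sample standard deviation.
                   unsigned Sn)  // Sn = sample size.
{
   // Difference in means, scaled by the standard error Sd / sqrt(Sn):
   return (Sm - M) * std::sqrt(static_cast<double>(Sn)) / Sd;
}
```

For the mercury data, `t_statistic(38.9, 37.8, 0.96437, 3)` evaluates to
approximately -1.976, in agreement with the output above.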

[endsect] [/section:tut_mean_test Testing a sample mean for difference from a "true" mean]


[section:tut_mean_size Estimating how large a sample size would have to become
in order to give a significant Students-t test result with a single sample test]

Imagine you have conducted a Students-t test on a single sample in order
to check for systematic errors in your measurements. Imagine that the
result is borderline. At this point one might go off and collect more data,
but it might be prudent to first ask the question "How much more?".
The parameter estimators of the students_t_distribution class
can provide this information.

This section is based on the example code in
[@../../example/students_t_single_sample.cpp students_t_single_sample.cpp]
and we begin by defining a procedure that will print out a table of
estimated sample sizes for various confidence levels:

   // Needed includes:
   #include <boost/math/distributions/students_t.hpp>
   #include <cmath>
   #include <iostream>
   #include <iomanip>
   // Bring everything into global namespace for ease of use:
   using namespace boost::math;
   using namespace std;

   void single_sample_find_df(
      double M,          // M = true mean.
      double Sm,         // Sm = Sample Mean.
      double Sd)         // Sd = Sample Standard Deviation.
   {

Next we define a table of significance levels:

      double alpha[] = { 0.5, 0.25, 0.1, 0.05, 0.01, 0.001, 0.0001, 0.00001 };

Printing out the table of sample sizes required for various confidence levels
begins with the table header:

      cout << "\n\n"
              "_______________________________________________________________\n"
              "Confidence       Estimated          Estimated\n"
              " Value (%)      Sample Size        Sample Size\n"
              "              (one sided test)    (two sided test)\n"
              "_______________________________________________________________\n";


And now the important part: the sample sizes required.
Class
`students_t_distribution` has a static member function
`find_degrees_of_freedom` that will calculate how large
a sample size needs to be in order to give a definitive result.

The first argument is the difference between the means that you
wish to be able to detect, here it's the absolute value of the
difference between the sample mean, and the true mean.

Then come two probability values: alpha and beta. Alpha is the
maximum acceptable risk of rejecting the null-hypothesis when it is
in fact true. Beta is the maximum acceptable risk of failing to reject
the null-hypothesis when in fact it is false.
Also note that for a two-sided test, alpha must be divided by 2.

The final parameter of the function is the standard deviation of the sample.

In this example, we assume that alpha and beta are the same, and call
`find_degrees_of_freedom` twice: once with alpha for a one-sided test,
and once with alpha/2 for a two-sided test.

      for(unsigned i = 0; i < sizeof(alpha)/sizeof(alpha[0]); ++i)
      {
         // Confidence value:
         cout << fixed << setprecision(3) << setw(10) << right << 100 * (1-alpha[i]);
         // calculate df for single sided test:
         double df = students_t::find_degrees_of_freedom(
            fabs(M - Sm), alpha[i], alpha[i], Sd);
         // convert to sample size:
         double size = ceil(df) + 1;
         // Print size:
         cout << fixed << setprecision(0) << setw(16) << right << size;
         // calculate df for two sided test:
         df = students_t::find_degrees_of_freedom(
            fabs(M - Sm), alpha[i]/2, alpha[i], Sd);
         // convert to sample size:
         size = ceil(df) + 1;
         // Print size:
         cout << fixed << setprecision(0) << setw(16) << right << size << endl;
      }
      cout << endl;
   }

Let's now look at some sample output using data taken from
['P.K.Hou, O. W. Lau & M.C. Wong, Analyst (1983) vol. 108, p 64.
and from Statistics for Analytical Chemistry, 3rd ed. (1994), pp 54-55
J. C.
Miller and J. N. Miller, Ellis Horwood ISBN 0 13 0309907.]
The values result from the determination of mercury by cold-vapour
atomic absorption.

Only three measurements were made, and the Students-t test above
gave a borderline result, so this example
will show us how many samples would need to be collected:

[pre'''
_____________________________________________________________
Estimated sample sizes required for various confidence levels
_____________________________________________________________

True Mean                               = 38.90000
Sample Mean                             = 37.80000
Sample Standard Deviation               = 0.96437


_______________________________________________________________
Confidence       Estimated          Estimated
 Value (%)      Sample Size        Sample Size
              (one sided test)    (two sided test)
_______________________________________________________________
    75.000               3                   4
    90.000               7                   9
    95.000              11                  13
    99.000              20                  22
    99.900              35                  37
    99.990              50                  53
    99.999              66                  68
''']

So in this case, many more measurements would have had to be made,
for example at the 95% level, 13 measurements in total for a two-sided test.

[endsect] [/section:tut_mean_size Estimating how large a sample size would have to become in order to give a significant Students-t test result with a single sample test]

[section:two_sample_students_t Comparing the means of two samples with the Students-t test]

Imagine that we have two samples, and we wish to determine whether
their means are different or not. This situation often arises when
determining whether a new process or treatment is better than an old one.

In this example, we'll be using the
[@http://www.itl.nist.gov/div898/handbook/eda/section3/eda3531.htm
Car Mileage sample data] from the
[@http://www.itl.nist.gov NIST website]. The data compares
miles per gallon of US cars with miles per gallon of Japanese cars.

The sample code is in
[@../../example/students_t_two_samples.cpp students_t_two_samples.cpp].

There are two ways in which this test can be conducted: we can assume
that the true standard deviations of the two samples are equal or not.
If the standard deviations are assumed to be equal, then the calculation
of the t-statistic is greatly simplified, so we'll examine that case first.
In real life we should verify whether this assumption is valid with an
F-test for equal variances.

We begin by defining a procedure that will conduct our test assuming equal
variances:

   // Needed headers:
   #include <boost/math/distributions/students_t.hpp>
   #include <cmath>
   #include <iostream>
   #include <iomanip>
   // Simplify usage:
   using namespace boost::math;
   using namespace std;

   void two_samples_t_test_equal_sd(
      double Sm1,       // Sm1 = Sample 1 Mean.
      double Sd1,       // Sd1 = Sample 1 Standard Deviation.
      unsigned Sn1,     // Sn1 = Sample 1 Size.
      double Sm2,       // Sm2 = Sample 2 Mean.
      double Sd2,       // Sd2 = Sample 2 Standard Deviation.
      unsigned Sn2,     // Sn2 = Sample 2 Size.
      double alpha)     // alpha = Significance Level.
   {


Our procedure will begin by calculating the t-statistic. Assuming
equal variances, the needed formulae are:

[equation dist_tutorial1]

where Sp is the "pooled" standard deviation of the two samples,
and /v/ is the number of degrees of freedom of the two combined
samples.
We can now write the code to calculate the t-statistic:

      // Degrees of freedom:
      double v = Sn1 + Sn2 - 2;
      cout << setw(55) << left << "Degrees of Freedom" << "= " << v << "\n";
      // Pooled standard deviation:
      double sp = sqrt(((Sn1-1) * Sd1 * Sd1 + (Sn2-1) * Sd2 * Sd2) / v);
      cout << setw(55) << left << "Pooled Standard Deviation" << "= " << sp << "\n";
      // t-statistic:
      double t_stat = (Sm1 - Sm2) / (sp * sqrt(1.0 / Sn1 + 1.0 / Sn2));
      cout << setw(55) << left << "T Statistic" << "= " << t_stat << "\n";

The next step is to define our distribution object, and calculate the
complement of the probability:

      students_t dist(v);
      double q = cdf(complement(dist, fabs(t_stat)));
      cout << setw(55) << left << "Probability that difference is due to chance" << "= "
         << setprecision(3) << scientific << 2 * q << "\n\n";

Here we've used the absolute value of the t-statistic, because we initially
want to know simply whether there is a difference or not (a two-sided test).
However, we can also test whether the mean of the second sample is greater
or is less (one-sided test) than that of the first:
all the possible tests are summed up in the following table:

[table
[[Hypothesis][Test]]
[[The Null-hypothesis: there is
*no difference* in means]
[Reject if complement of CDF for |t| < significance level / 2:

`cdf(complement(dist, fabs(t))) < alpha / 2`]]

[[The Alternative-hypothesis: there is a
*difference* in means]
[Reject if complement of CDF for |t| > significance level / 2:

`cdf(complement(dist, fabs(t))) > alpha / 2`]]

[[The Alternative-hypothesis: Sample 1 Mean is *less* than
Sample 2 Mean.]
[Reject if CDF of t > significance level:

`cdf(dist, t) > alpha`]]

[[The Alternative-hypothesis: Sample 1 Mean is *greater* than
Sample 2 Mean.]

[Reject if complement of CDF of t > significance level:

`cdf(complement(dist, t)) > alpha`]]
]

[note
For a two-sided test we must compare against alpha / 2 and not alpha.]

Most of the rest of the sample program is pretty-printing, so we'll
skip over that, and take a look at the sample output for alpha=0.05
(a 95% probability level). For comparison the dataplot output
for the same data is in
[@http://www.itl.nist.gov/div898/handbook/eda/section3/eda353.htm
section 1.3.5.3] of the __handbook.

[pre'''
   ________________________________________________
   Student t test for two samples (equal variances)
   ________________________________________________

   Number of Observations (Sample 1)                      = 249
   Sample 1 Mean                                          = 20.145
   Sample 1 Standard Deviation                            = 6.4147
   Number of Observations (Sample 2)                      = 79
   Sample 2 Mean                                          = 30.481
   Sample 2 Standard Deviation                            = 6.1077
   Degrees of Freedom                                     = 326
   Pooled Standard Deviation                              = 6.3426
   T Statistic                                            = -12.621
   Probability that difference is due to chance           = 5.273e-030

   Results for Alternative Hypothesis and alpha           = 0.0500'''

   Alternative Hypothesis              Conclusion
   Sample 1 Mean != Sample 2 Mean       NOT REJECTED
   Sample 1 Mean <  Sample 2 Mean       NOT REJECTED
   Sample 1 Mean >  Sample 2 Mean       REJECTED
]

So with a probability that the difference is due to chance of just
5.273e-030, we can safely conclude that there is indeed a difference.

The tests on the alternative hypothesis show that we must
also reject the hypothesis that Sample 1 Mean is
greater than that for Sample 2: in this case Sample 1 represents the
miles per gallon for US cars, and Sample 2 the miles per gallon for
Japanese cars, so we conclude that Japanese cars are on average more
fuel efficient.
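
As a cross-check, the pooled standard deviation and t-statistic reported above
can be recomputed from the summary statistics alone. The following is a minimal
sketch using only the standard library; the function names `pooled_sd` and
`pooled_t` are hypothetical helpers introduced here for illustration and do not
appear in the example program. Both assume each sample holds at least one
observation and that the two samples together hold more than two:

```cpp
#include <cmath>

// Hypothetical helpers (assumptions for illustration, not taken from
// students_t_two_samples.cpp).

// Pooled standard deviation of two samples assumed to share one variance.
double pooled_sd(double Sd1, unsigned Sn1, double Sd2, unsigned Sn2)
{
   // Combined degrees of freedom:
   double v = static_cast<double>(Sn1) + static_cast<double>(Sn2) - 2;
   // Weighted average of the two sample variances, then square root:
   return std::sqrt(((Sn1 - 1) * Sd1 * Sd1 + (Sn2 - 1) * Sd2 * Sd2) / v);
}

// Two-sample t-statistic under the equal-variance assumption.
double pooled_t(double Sm1, double Sd1, unsigned Sn1,
                double Sm2, double Sd2, unsigned Sn2)
{
   double sp = pooled_sd(Sd1, Sn1, Sd2, Sn2);
   return (Sm1 - Sm2) / (sp * std::sqrt(1.0 / Sn1 + 1.0 / Sn2));
}
```

With the car mileage figures, `pooled_sd(6.4147, 249, 6.1077, 79)` evaluates to
about 6.343 and `pooled_t(20.145, 6.4147, 249, 30.481, 6.1077, 79)` to about
-12.62, matching the output above to within the precision of the rounded inputs.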

Now that we have the simple case out of the way, let's look for a moment
at the more complex one: that the standard deviations of the two samples
are not equal. In this case the formula for the t-statistic becomes:

[equation dist_tutorial2]

And for the combined degrees of freedom we use the
[@http://en.wikipedia.org/wiki/Welch-Satterthwaite_equation Welch-Satterthwaite]
approximation:

[equation dist_tutorial3]

Note that this is one of the rare situations where the degrees-of-freedom
parameter to the Student's t distribution is a real number, and not an
integer value.

[note
Some statistical packages truncate the effective degrees of freedom to
an integer value: this may be necessary if you are relying on lookup tables,
but since our code fully supports non-integer degrees of freedom there is no
need to truncate in this case. Also note that when the degrees of freedom
is small then the Welch-Satterthwaite approximation may be a significant
source of error.]

Putting these formulae into code we get:

      // Degrees of freedom:
      double v = Sd1 * Sd1 / Sn1 + Sd2 * Sd2 / Sn2;
      v *= v;
      double t1 = Sd1 * Sd1 / Sn1;
      t1 *= t1;
      t1 /= (Sn1 - 1);
      double t2 = Sd2 * Sd2 / Sn2;
      t2 *= t2;
      t2 /= (Sn2 - 1);
      v /= (t1 + t2);
      cout << setw(55) << left << "Degrees of Freedom" << "= " << v << "\n";
      // t-statistic:
      double t_stat = (Sm1 - Sm2) / sqrt(Sd1 * Sd1 / Sn1 + Sd2 * Sd2 / Sn2);
      cout << setw(55) << left << "T Statistic" << "= " << t_stat << "\n";

Thereafter the code and the tests are performed the same as before.
Using
our car mileage data again, here's what the output looks like:

[pre'''
   __________________________________________________
   Student t test for two samples (unequal variances)
   __________________________________________________

   Number of Observations (Sample 1)                      = 249
   Sample 1 Mean                                          = 20.145
   Sample 1 Standard Deviation                            = 6.4147
   Number of Observations (Sample 2)                      = 79
   Sample 2 Mean                                          = 30.481
   Sample 2 Standard Deviation                            = 6.1077
   Degrees of Freedom                                     = 136.87
   T Statistic                                            = -12.946
   Probability that difference is due to chance           = 1.571e-025

   Results for Alternative Hypothesis and alpha           = 0.0500'''

   Alternative Hypothesis              Conclusion
   Sample 1 Mean != Sample 2 Mean       NOT REJECTED
   Sample 1 Mean <  Sample 2 Mean       NOT REJECTED
   Sample 1 Mean >  Sample 2 Mean       REJECTED
]

This time allowing the variances in the two samples to differ has yielded
a higher likelihood that the observed difference is down to chance alone
(1.571e-025 compared to 5.273e-030 when equal variances were assumed).
However, the conclusion remains the same: US cars are less fuel efficient
than Japanese models.

[endsect] [/section:two_sample_students_t Comparing the means of two samples with the Students-t test]

[section:paired_st Comparing two paired samples with the Student's t distribution]

Imagine that we have a before and after reading for each item in the sample:
for example we might have measured blood pressure before and after administration
of a new drug. We can't pool the results and compare the means before and after
the change, because each patient will have a different baseline reading.
Instead we calculate the difference between before and after measurements
in each patient, and calculate the mean and standard deviation of the differences.
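
That reduction step can be sketched as follows. This is a minimal illustration
using only the standard library; the type `paired_summary` and the function
`summarise_differences` are hypothetical names introduced here, and the caller
is assumed to supply two equally sized vectors holding at least two pairs:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical summary type (an assumption for illustration).
struct paired_summary
{
   double mean;     // Mean of the (after - before) differences.
   double sd;       // Sample standard deviation of the differences.
   std::size_t n;   // Number of pairs.
};

// Reduce paired readings to the statistics of their differences.
// Precondition: before.size() == after.size() and before.size() >= 2.
paired_summary summarise_differences(const std::vector<double>& before,
                                     const std::vector<double>& after)
{
   const std::size_t n = before.size();
   // Mean of the differences:
   double sum = 0;
   for(std::size_t i = 0; i < n; ++i)
      sum += after[i] - before[i];
   const double mean = sum / static_cast<double>(n);
   // Sum of squared deviations from that mean:
   double ss = 0;
   for(std::size_t i = 0; i < n; ++i)
   {
      const double d = (after[i] - before[i]) - mean;
      ss += d * d;
   }
   paired_summary result;
   result.mean = mean;
   result.sd = std::sqrt(ss / static_cast<double>(n - 1));
   result.n = n;
   return result;
}
```

The three values in the result play the roles of the sample mean, sample
standard deviation and sample size in the single-sample procedures.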
To test whether a significant change has taken place, we can then test
the null-hypothesis that the true mean is zero using the same procedure
we used in the single sample cases previously discussed.

That means we can:

* [link math_toolkit.stat_tut.weg.st_eg.tut_mean_intervals Calculate confidence intervals of the mean].
If the endpoints of the interval differ in sign then we are unable to reject
the null-hypothesis that there is no change.
* [link math_toolkit.stat_tut.weg.st_eg.tut_mean_test Test whether the true mean is zero]. If the
result is consistent with a true mean of zero, then we are unable to reject the
null-hypothesis that there is no change.
* [link math_toolkit.stat_tut.weg.st_eg.tut_mean_size Calculate how many pairs of readings we would need
in order to obtain a significant result].

[endsect] [/section:paired_st Comparing two paired samples with the Student's t distribution]


[endsect] [/section:st_eg Student's t]

[/
  Copyright 2006, 2012 John Maddock and Paul A. Bristow.
  Distributed under the Boost Software License, Version 1.0.
  (See accompanying file LICENSE_1_0.txt or copy at
  http://www.boost.org/LICENSE_1_0.txt).
]
