[section:st_eg Student's t Distribution Examples]

[section:tut_mean_intervals Calculating confidence intervals on the mean with the Students-t distribution]

Let's say you have a sample mean; you may wish to know what confidence intervals
you can place on that mean.  Colloquially: "I want an interval that I can be
P% sure contains the true mean".  (On a technical point, note that
the interval either contains the true mean or it does not: the
meaning of the confidence level is subtly
different from this colloquialism.  More background information can be found on the
[@http://www.itl.nist.gov/div898/handbook/eda/section3/eda352.htm NIST site]).

The formula for the interval can be expressed as:

[equation dist_tutorial4]

where ['Y[sub s]] is the sample mean, /s/ is the sample standard deviation,
/N/ is the sample size, /[alpha]/ is the desired significance level and
['t[sub ([alpha]/2,N-1)]] is the upper critical value of the Students-t
distribution with /N-1/ degrees of freedom.

[note
The quantity [alpha] is the maximum acceptable risk of falsely rejecting
the null-hypothesis.  The smaller the value of [alpha] the greater the
strength of the test.

The confidence level of the test is defined as 1 - [alpha], and is often expressed
as a percentage.  So for example a significance level of 0.05 is equivalent
to a 95% confidence level.  Refer to
[@http://www.itl.nist.gov/div898/handbook/prc/section1/prc14.htm
"What are confidence intervals?"] in the __handbook for more information.
] [/ Note]

[note
The usual assumptions of
[@http://en.wikipedia.org/wiki/Independent_and_identically-distributed_random_variables independent and identically distributed (i.i.d.)]
variables and [@http://en.wikipedia.org/wiki/Normal_distribution normal distribution]
of course apply here, as they do in other examples.
]

From the formula, it should be clear that:

* The width of the confidence interval decreases as the sample size increases.
* The width increases as the standard deviation increases.
* The width increases as the ['confidence level increases] (0.5 towards 0.99999 - stronger).
* The width increases as the ['significance level decreases] (0.5 towards 0.00000...01 - stronger).

The following example code is taken from the example program
[@../../example/students_t_single_sample.cpp students_t_single_sample.cpp].

We'll begin by defining a procedure to calculate intervals for
various confidence levels; the procedure will print these out
as a table:

   // Needed includes:
   #include <boost/math/distributions/students_t.hpp>
   #include <iostream>
   #include <iomanip>
   // Bring everything into global namespace for ease of use:
   using namespace boost::math;
   using namespace std;

   void confidence_limits_on_mean(
      double Sm,           // Sm = Sample Mean.
      double Sd,           // Sd = Sample Standard Deviation.
      unsigned Sn)         // Sn = Sample Size.
   {
      using namespace std;
      using namespace boost::math;

      // Print out general info:
      cout <<
         "__________________________________\n"
         "2-Sided Confidence Limits For Mean\n"
         "__________________________________\n\n";
      cout << setprecision(7);
      cout << setw(40) << left << "Number of Observations" << "=  " << Sn << "\n";
      cout << setw(40) << left << "Mean" << "=  " << Sm << "\n";
      cout << setw(40) << left << "Standard Deviation" << "=  " << Sd << "\n";

We'll define a table of significance/risk levels for which we'll compute intervals:

      double alpha[] = { 0.5, 0.25, 0.1, 0.05, 0.01, 0.001, 0.0001, 0.00001 };

Note that these are the complements of the confidence/probability levels (0.5, 0.75, 0.9 ... 0.99999).

Next we'll declare the distribution object we'll need; note that
the /degrees of freedom/ parameter is the sample size less one:

   students_t dist(Sn - 1);

Most of what follows in the program is pretty-printing, so let's focus
on the calculation of the interval. First we need the t-statistic,
computed using the /quantile/ function and our significance level.  Note
that since the significance levels are the complement of the probability,
we have to wrap the arguments in a call to /complement(...)/:

   double T = quantile(complement(dist, alpha[i] / 2));

Note that alpha was divided by two, since we'll be calculating
both the upper and lower bounds: had we been interested in a single-sided
interval then we would have omitted this step.
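
For example, for a one-sided interval the critical value would be computed
without that division, along these lines (a sketch reusing the `dist` object
and `alpha` array defined above):

   // One-sided critical value: use alpha directly, not alpha / 2:
   double T_oneside = quantile(complement(dist, alpha[i]));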

Now to complete the picture, we'll get the (one-sided) width of the
interval from the t-statistic
by multiplying by the standard deviation, and dividing by the square
root of the sample size:

   double w = T * Sd / sqrt(double(Sn));

The two-sided interval is then the sample mean plus and minus this width.
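
In code, the limits themselves are then just (a small sketch using the
variables defined above):

   double lower_limit = Sm - w;   // Lower confidence limit.
   double upper_limit = Sm + w;   // Upper confidence limit.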

And, apart from some more pretty-printing, that completes the procedure.

Let's take a look at some sample output, first using the
[@http://www.itl.nist.gov/div898/handbook/eda/section4/eda428.htm
Heat flow data] from the NIST site.  The data set was collected
by Bob Zarr of NIST in January, 1990 from a heat flow meter
calibration and stability analysis.
The corresponding dataplot
output for this test can be found in
[@http://www.itl.nist.gov/div898/handbook/eda/section3/eda352.htm
section 3.5.2] of the __handbook.
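
The output below was produced by a call along these lines (a usage sketch,
with the mean and standard deviation as shown in the output):

   confidence_limits_on_mean(9.26146, 0.02278881, 195);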

[pre'''
   __________________________________
   2-Sided Confidence Limits For Mean
   __________________________________

   Number of Observations                  =  195
   Mean                                    =  9.26146
   Standard Deviation                      =  0.02278881


   ___________________________________________________________________
   Confidence       T           Interval          Lower          Upper
    Value (%)     Value          Width            Limit          Limit
   ___________________________________________________________________
       50.000     0.676       1.103e-003        9.26036        9.26256
       75.000     1.154       1.883e-003        9.25958        9.26334
       90.000     1.653       2.697e-003        9.25876        9.26416
       95.000     1.972       3.219e-003        9.25824        9.26468
       99.000     2.601       4.245e-003        9.25721        9.26571
       99.900     3.341       5.453e-003        9.25601        9.26691
       99.990     3.973       6.484e-003        9.25498        9.26794
       99.999     4.537       7.404e-003        9.25406        9.26886
''']

As you can see, the large sample size (195) and small standard deviation (0.023)
have combined to give very small intervals; indeed, we can be
very confident that the true mean is 9.2.

For comparison, the next example data output is taken from
['P.K.Hou, O. W. Lau & M.C. Wong, Analyst (1983) vol. 108, p 64.
and from Statistics for Analytical Chemistry, 3rd ed. (1994), pp 54-55
J. C. Miller and J. N. Miller, Ellis Horwood ISBN 0 13 0309907.]
The values result from the determination of mercury by cold-vapour
atomic absorption.

[pre'''
   __________________________________
   2-Sided Confidence Limits For Mean
   __________________________________

   Number of Observations                  =  3
   Mean                                    =  37.8000000
   Standard Deviation                      =  0.9643650


   ___________________________________________________________________
   Confidence       T           Interval          Lower          Upper
    Value (%)     Value          Width            Limit          Limit
   ___________________________________________________________________
       50.000     0.816            0.455       37.34539       38.25461
       75.000     1.604            0.893       36.90717       38.69283
       90.000     2.920            1.626       36.17422       39.42578
       95.000     4.303            2.396       35.40438       40.19562
       99.000     9.925            5.526       32.27408       43.32592
       99.900    31.599           17.594       20.20639       55.39361
       99.990    99.992           55.673      -17.87346       93.47346
       99.999   316.225          176.067     -138.26683      213.86683
''']

This time the fact that there are only three measurements leads to
much wider intervals; indeed, the intervals are so large that it's hard
to be very confident in the location of the mean.

[endsect] [/section:tut_mean_intervals Calculating confidence intervals on the mean with the Students-t distribution]

[section:tut_mean_test Testing a sample mean for difference from a "true" mean]

When calibrating or comparing a scientific instrument or measurement method of some kind,
we want to answer the question "Does an observed sample mean differ from the
'true' mean in any significant way?".  If it does, then we have evidence of
a systematic difference.  This question can be answered with a Students-t test:
more information can be found
[@http://www.itl.nist.gov/div898/handbook/eda/section3/eda352.htm
on the NIST site].

Of course, the assignment of "true" to one mean may be quite arbitrary;
often this is simply a "traditional" method of measurement.

The following example code is taken from the example program
[@../../example/students_t_single_sample.cpp students_t_single_sample.cpp].

We'll begin by defining a procedure to determine which of the
possible hypotheses are rejected or not-rejected
at a given significance level:

[note
Non-statisticians might say 'not-rejected' means 'accepted'
(often of the null-hypothesis), implying, wrongly, that there really *IS* no difference,
but statisticians eschew this to avoid implying that there is positive evidence of 'no difference'.
'Not-rejected' here means there is *no evidence* of difference, but there still might well be a difference.
For example, see [@http://en.wikipedia.org/wiki/Argument_from_ignorance argument from ignorance] and
[@http://www.bmj.com/cgi/content/full/311/7003/485 Absence of evidence does not constitute evidence of absence.]
] [/ note]


   // Needed includes:
   #include <boost/math/distributions/students_t.hpp>
   #include <iostream>
   #include <iomanip>
   // Bring everything into global namespace for ease of use:
   using namespace boost::math;
   using namespace std;

   void single_sample_t_test(double M, double Sm, double Sd, unsigned Sn, double alpha)
   {
      //
      // M = true mean.
      // Sm = Sample Mean.
      // Sd = Sample Standard Deviation.
      // Sn = Sample Size.
      // alpha = Significance Level.

Most of the procedure is pretty-printing, so let's just focus on the
calculation; we begin by calculating the t-statistic:

   // Difference in means:
   double diff = Sm - M;
   // Degrees of freedom:
   unsigned v = Sn - 1;
   // t-statistic:
   double t_stat = diff * sqrt(double(Sn)) / Sd;

Finally, calculate the probability from the t-statistic. If we're interested
simply in whether there is a difference (either less or greater) or not,
we don't care about the sign of the t-statistic,
and we take the complement of the probability for comparison
to the significance level:

   students_t dist(v);
   double q = cdf(complement(dist, fabs(t_stat)));

The procedure then prints out the results of the various tests
that can be done; these
can be summarised in the following table:

[table
[[Hypothesis][Test]]
[[The Null-hypothesis: there is
*no difference* in means]
[Reject if complement of CDF for |t| < significance level / 2:

`cdf(complement(dist, fabs(t))) < alpha / 2`]]

[[The Alternative-hypothesis: there
*is a difference* in means]
[Reject if complement of CDF for |t| > significance level / 2:

`cdf(complement(dist, fabs(t))) > alpha / 2`]]

[[The Alternative-hypothesis: the sample mean *is less* than
the true mean.]
[Reject if complement of CDF of t < significance level:

`cdf(complement(dist, t)) < alpha`]]

[[The Alternative-hypothesis: the sample mean *is greater* than
the true mean.]
[Reject if CDF of t < significance level:

`cdf(dist, t) < alpha`]]
]

[note
Notice that the comparisons are against `alpha / 2` for a two-sided test
and against `alpha` for a one-sided test.]
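
As a sketch of how the printed conclusions follow from these comparisons
(the actual example program formats things a little differently):

   // Two-sided test: is the sample mean different from M at all?
   if(cdf(complement(dist, fabs(t_stat))) < alpha / 2)
      cout << "Mean != " << M << ": NOT REJECTED\n";  // Evidence of a difference.
   else
      cout << "Mean != " << M << ": REJECTED\n";      // No evidence of a difference.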

Now that we have all the parts in place, let's take a look at some
sample output, first using the
[@http://www.itl.nist.gov/div898/handbook/eda/section4/eda428.htm
Heat flow data] from the NIST site.  The data set was collected
by Bob Zarr of NIST in January, 1990 from a heat flow meter
calibration and stability analysis.  The corresponding dataplot
output for this test can be found in
[@http://www.itl.nist.gov/div898/handbook/eda/section3/eda352.htm
section 3.5.2] of the __handbook.
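
This output was produced by a call along these lines (a usage sketch, with
the values as shown in the output and an expected true mean of 5):

   single_sample_t_test(5.0, 9.26146, 0.02279, 195, 0.05);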

[pre
__________________________________
Student t test for a single sample
__________________________________

Number of Observations                                 =  195
Sample Mean                                            =  9.26146
Sample Standard Deviation                              =  0.02279
Expected True Mean                                     =  5.00000

Sample Mean - Expected Test Mean                       =  4.26146
Degrees of Freedom                                     =  194
T Statistic                                            =  2611.28380
Probability that difference is due to chance           =  0.000e+000

Results for Alternative Hypothesis and alpha           =  0.0500

Alternative Hypothesis     Conclusion
Mean != 5.000            NOT REJECTED
Mean  < 5.000            REJECTED
Mean  > 5.000            NOT REJECTED
]

You will note the line that says the probability that the difference is
due to chance is zero.  From a philosophical point of view, of course,
the probability can never reach zero.  However, in this case the calculated
probability is smaller than the smallest representable double precision number,
hence the appearance of a zero here.  Whatever its "true" value is, we know it
must be extraordinarily small, so the alternative hypothesis - that there is
a difference in means - is not rejected.

For comparison, the next example data output is taken from
['P.K.Hou, O. W. Lau & M.C. Wong, Analyst (1983) vol. 108, p 64.
and from Statistics for Analytical Chemistry, 3rd ed. (1994), pp 54-55
J. C. Miller and J. N. Miller, Ellis Horwood ISBN 0 13 0309907.]
The values result from the determination of mercury by cold-vapour
atomic absorption.

[pre
__________________________________
Student t test for a single sample
__________________________________

Number of Observations                                 =  3
Sample Mean                                            =  37.80000
Sample Standard Deviation                              =  0.96437
Expected True Mean                                     =  38.90000

Sample Mean - Expected Test Mean                       =  -1.10000
Degrees of Freedom                                     =  2
T Statistic                                            =  -1.97566
Probability that difference is due to chance           =  1.869e-001

Results for Alternative Hypothesis and alpha           =  0.0500

Alternative Hypothesis     Conclusion
Mean != 38.900            REJECTED
Mean  < 38.900            NOT REJECTED
Mean  > 38.900            NOT REJECTED
]

As you can see, the small number of measurements (3) has led to a large uncertainty
in the location of the true mean.  So even though there appears to be a difference
between the sample mean and the expected true mean, we conclude that there
is no significant difference, and are unable to reject the null hypothesis.
However, if we were to lower the bar for acceptance down to alpha = 0.1
(a 90% confidence level) we see a different output:

[pre
__________________________________
Student t test for a single sample
__________________________________

Number of Observations                                 =  3
Sample Mean                                            =  37.80000
Sample Standard Deviation                              =  0.96437
Expected True Mean                                     =  38.90000

Sample Mean - Expected Test Mean                       =  -1.10000
Degrees of Freedom                                     =  2
T Statistic                                            =  -1.97566
Probability that difference is due to chance           =  1.869e-001

Results for Alternative Hypothesis and alpha           =  0.1000

Alternative Hypothesis     Conclusion
Mean != 38.900            REJECTED
Mean  < 38.900            NOT REJECTED
Mean  > 38.900            REJECTED
]

In this case, we really have a borderline result,
and more data (and/or more accurate data)
is needed for a more convincing conclusion.

[endsect] [/section:tut_mean_test Testing a sample mean for difference from a "true" mean]


[section:tut_mean_size Estimating how large a sample size would have to become
in order to give a significant Students-t test result with a single sample test]

Imagine you have conducted a Students-t test on a single sample in order
to check for systematic errors in your measurements.  Imagine that the
result is borderline.  At this point one might go off and collect more data,
but it might be prudent to first ask the question "How much more?".
The parameter estimators of the students_t_distribution class
can provide this information.

This section is based on the example code in
[@../../example/students_t_single_sample.cpp students_t_single_sample.cpp]
and we begin by defining a procedure that will print out a table of
estimated sample sizes for various confidence levels:

   // Needed includes:
   #include <boost/math/distributions/students_t.hpp>
   #include <iostream>
   #include <iomanip>
   // Bring everything into global namespace for ease of use:
   using namespace boost::math;
   using namespace std;

   void single_sample_find_df(
      double M,          // M = true mean.
      double Sm,         // Sm = Sample Mean.
      double Sd)         // Sd = Sample Standard Deviation.
   {

Next we define a table of significance levels:

      double alpha[] = { 0.5, 0.25, 0.1, 0.05, 0.01, 0.001, 0.0001, 0.00001 };

Printing out the table of sample sizes required for various confidence levels
begins with the table header:

      cout << "\n\n"
              "_______________________________________________________________\n"
              "Confidence       Estimated          Estimated\n"
              " Value (%)      Sample Size        Sample Size\n"
              "              (one sided test)    (two sided test)\n"
              "_______________________________________________________________\n";

And now the important part: the sample sizes required.  Class
`students_t_distribution` has a static member function
`find_degrees_of_freedom` that will calculate how large
a sample size needs to be in order to give a definitive result.

The first argument is the difference between the means that you
wish to be able to detect; here it's the absolute value of the
difference between the sample mean and the true mean.

Then come two probability values: alpha and beta.  Alpha is the
maximum acceptable risk of rejecting the null-hypothesis when it is
in fact true.  Beta is the maximum acceptable risk of failing to reject
the null-hypothesis when in fact it is false.
Also note that for a two-sided test, alpha must be divided by 2.

The final parameter of the function is the standard deviation of the sample.

In this example, we assume that alpha and beta are the same, and call
`find_degrees_of_freedom` twice: once with alpha for a one-sided test,
and once with alpha/2 for a two-sided test.

      for(unsigned i = 0; i < sizeof(alpha)/sizeof(alpha[0]); ++i)
      {
         // Confidence value:
         cout << fixed << setprecision(3) << setw(10) << right << 100 * (1-alpha[i]);
         // calculate df for single sided test:
         double df = students_t::find_degrees_of_freedom(
            fabs(M - Sm), alpha[i], alpha[i], Sd);
         // convert to sample size:
         double size = ceil(df) + 1;
         // Print size:
         cout << fixed << setprecision(0) << setw(16) << right << size;
         // calculate df for two sided test:
         df = students_t::find_degrees_of_freedom(
            fabs(M - Sm), alpha[i]/2, alpha[i], Sd);
         // convert to sample size:
         size = ceil(df) + 1;
         // Print size:
         cout << fixed << setprecision(0) << setw(16) << right << size << endl;
      }
      cout << endl;
   }

Let's now look at some sample output using data taken from
['P.K.Hou, O. W. Lau & M.C. Wong, Analyst (1983) vol. 108, p 64.
and from Statistics for Analytical Chemistry, 3rd ed. (1994), pp 54-55
J. C. Miller and J. N. Miller, Ellis Horwood ISBN 0 13 0309907.]
The values result from the determination of mercury by cold-vapour
atomic absorption.

Only three measurements were made, and the Students-t test above
gave a borderline result, so this example
will show us how many samples would need to be collected:
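
A call such as the following sketch (using the values shown in the table
header below) produces this output:

   single_sample_find_df(38.9, 37.8, 0.96437);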

[pre'''
_____________________________________________________________
Estimated sample sizes required for various confidence levels
_____________________________________________________________

True Mean                               =  38.90000
Sample Mean                             =  37.80000
Sample Standard Deviation               =  0.96437


_______________________________________________________________
Confidence       Estimated          Estimated
 Value (%)      Sample Size        Sample Size
              (one sided test)    (two sided test)
_______________________________________________________________
    75.000               3               4
    90.000               7               9
    95.000              11              13
    99.000              20              22
    99.900              35              37
    99.990              50              53
    99.999              66              68
''']

So in this case, many more measurements would have had to be made;
for example, at the 95% level, 13 measurements in total for a two-sided test.

[endsect] [/section:tut_mean_size Estimating how large a sample size would have to become in order to give a significant Students-t test result with a single sample test]

[section:two_sample_students_t Comparing the means of two samples with the Students-t test]

Imagine that we have two samples, and we wish to determine whether
their means are different or not.  This situation often arises when
determining whether a new process or treatment is better than an old one.

In this example, we'll be using the
[@http://www.itl.nist.gov/div898/handbook/eda/section3/eda3531.htm
Car Mileage sample data] from the
[@http://www.itl.nist.gov NIST website].  The data compares
miles per gallon of US cars with miles per gallon of Japanese cars.

The sample code is in
[@../../example/students_t_two_samples.cpp students_t_two_samples.cpp].

There are two ways in which this test can be conducted: we can assume
that the true standard deviations of the two samples are equal or not.
If the standard deviations are assumed to be equal, then the calculation
of the t-statistic is greatly simplified, so we'll examine that case first.
In real life we should verify whether this assumption is valid, for example
with an F test for equal variances.

We begin by defining a procedure that will conduct our test assuming equal
variances:

   // Needed headers:
   #include <boost/math/distributions/students_t.hpp>
   #include <iostream>
   #include <iomanip>
   // Simplify usage:
   using namespace boost::math;
   using namespace std;

   void two_samples_t_test_equal_sd(
           double Sm1,       // Sm1 = Sample 1 Mean.
           double Sd1,       // Sd1 = Sample 1 Standard Deviation.
           unsigned Sn1,     // Sn1 = Sample 1 Size.
           double Sm2,       // Sm2 = Sample 2 Mean.
           double Sd2,       // Sd2 = Sample 2 Standard Deviation.
           unsigned Sn2,     // Sn2 = Sample 2 Size.
           double alpha)     // alpha = Significance Level.
   {

Our procedure will begin by calculating the t-statistic; assuming
equal variances, the needed formulae are:

[equation dist_tutorial1]

where Sp is the "pooled" standard deviation of the two samples,
and /v/ is the number of degrees of freedom of the two combined
samples.  We can now write the code to calculate the t-statistic:

   // Degrees of freedom:
   double v = Sn1 + Sn2 - 2;
   cout << setw(55) << left << "Degrees of Freedom" << "=  " << v << "\n";
   // Pooled standard deviation:
   double sp = sqrt(((Sn1-1) * Sd1 * Sd1 + (Sn2-1) * Sd2 * Sd2) / v);
   cout << setw(55) << left << "Pooled Standard Deviation" << "=  " << sp << "\n";
   // t-statistic:
   double t_stat = (Sm1 - Sm2) / (sp * sqrt(1.0 / Sn1 + 1.0 / Sn2));
   cout << setw(55) << left << "T Statistic" << "=  " << t_stat << "\n";

The next step is to define our distribution object, and calculate the
complement of the probability:

   students_t dist(v);
   double q = cdf(complement(dist, fabs(t_stat)));
   cout << setw(55) << left << "Probability that difference is due to chance" << "=  "
      << setprecision(3) << scientific << 2 * q << "\n\n";

Here we've used the absolute value of the t-statistic, because we initially
want to know simply whether there is a difference or not (a two-sided test).
However, we can also test whether the mean of the second sample is greater
or less than that of the first (a one-sided test);
all the possible tests are summed up in the following table:

[table
[[Hypothesis][Test]]
[[The Null-hypothesis: there is
*no difference* in means]
[Reject if complement of CDF for |t| < significance level / 2:

`cdf(complement(dist, fabs(t))) < alpha / 2`]]

[[The Alternative-hypothesis: there is a
*difference* in means]
[Reject if complement of CDF for |t| > significance level / 2:

`cdf(complement(dist, fabs(t))) > alpha / 2`]]

[[The Alternative-hypothesis: Sample 1 Mean is *less* than
Sample 2 Mean.]
[Reject if CDF of t > significance level:

`cdf(dist, t) > alpha`]]

[[The Alternative-hypothesis: Sample 1 Mean is *greater* than
Sample 2 Mean.]
[Reject if complement of CDF of t > significance level:

`cdf(complement(dist, t)) > alpha`]]
]

[note
For a two-sided test we must compare against alpha / 2 and not alpha.]

Most of the rest of the sample program is pretty-printing, so we'll
skip over that, and take a look at the sample output for alpha=0.05
(a 95% probability level).  For comparison, the dataplot output
for the same data is in
[@http://www.itl.nist.gov/div898/handbook/eda/section3/eda353.htm
section 1.3.5.3] of the __handbook.
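
The output below corresponds to a call along these lines (a usage sketch
using the rounded values shown in the output):

   two_samples_t_test_equal_sd(20.145, 6.4147, 249, 30.481, 6.1077, 79, 0.05);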

[pre'''
   ________________________________________________
   Student t test for two samples (equal variances)
   ________________________________________________

   Number of Observations (Sample 1)                      =  249
   Sample 1 Mean                                          =  20.145
   Sample 1 Standard Deviation                            =  6.4147
   Number of Observations (Sample 2)                      =  79
   Sample 2 Mean                                          =  30.481
   Sample 2 Standard Deviation                            =  6.1077
   Degrees of Freedom                                     =  326
   Pooled Standard Deviation                              =  6.3426
   T Statistic                                            =  -12.621
   Probability that difference is due to chance           =  5.273e-030

   Results for Alternative Hypothesis and alpha           =  0.0500'''

   Alternative Hypothesis              Conclusion
   Sample 1 Mean != Sample 2 Mean       NOT REJECTED
   Sample 1 Mean <  Sample 2 Mean       NOT REJECTED
   Sample 1 Mean >  Sample 2 Mean       REJECTED
]

So with a probability that the difference is due to chance of just
5.273e-030, we can safely conclude that there is indeed a difference.

The tests on the alternative hypothesis show that we must
also reject the hypothesis that Sample 1 Mean is
greater than that for Sample 2: in this case Sample 1 represents the
miles per gallon for US cars, and Sample 2 the miles per gallon for
Japanese cars, so we conclude that Japanese cars are on average more
fuel efficient.

Now that we have the simple case out of the way, let's look for a moment
at the more complex one: that the standard deviations of the two samples
are not equal.  In this case the formula for the t-statistic becomes:

[equation dist_tutorial2]

And for the combined degrees of freedom we use the
[@http://en.wikipedia.org/wiki/Welch-Satterthwaite_equation Welch-Satterthwaite]
approximation:

[equation dist_tutorial3]

Note that this is one of the rare situations where the degrees-of-freedom
parameter to the Student's t distribution is a real number, and not an
integer value.

[note
Some statistical packages truncate the effective degrees of freedom to
an integer value: this may be necessary if you are relying on lookup tables,
but since our code fully supports non-integer degrees of freedom there is no
need to truncate in this case.  Also note that when the degrees of freedom
is small then the Welch-Satterthwaite approximation may be a significant
source of error.]

Putting these formulae into code we get:

   // Degrees of freedom:
   double v = Sd1 * Sd1 / Sn1 + Sd2 * Sd2 / Sn2;
   v *= v;
   double t1 = Sd1 * Sd1 / Sn1;
   t1 *= t1;
   t1 /= (Sn1 - 1);
   double t2 = Sd2 * Sd2 / Sn2;
   t2 *= t2;
   t2 /= (Sn2 - 1);
   v /= (t1 + t2);
   cout << setw(55) << left << "Degrees of Freedom" << "=  " << v << "\n";
   // t-statistic:
   double t_stat = (Sm1 - Sm2) / sqrt(Sd1 * Sd1 / Sn1 + Sd2 * Sd2 / Sn2);
   cout << setw(55) << left << "T Statistic" << "=  " << t_stat << "\n";

Thereafter the code and the tests are performed the same as before.  Using
our car mileage data again, here's what the output looks like:

[pre'''
   __________________________________________________
   Student t test for two samples (unequal variances)
   __________________________________________________

   Number of Observations (Sample 1)                      =  249
   Sample 1 Mean                                          =  20.145
   Sample 1 Standard Deviation                            =  6.4147
   Number of Observations (Sample 2)                      =  79
   Sample 2 Mean                                          =  30.481
   Sample 2 Standard Deviation                            =  6.1077
   Degrees of Freedom                                     =  136.87
   T Statistic                                            =  -12.946
   Probability that difference is due to chance           =  1.571e-025

   Results for Alternative Hypothesis and alpha           =  0.0500'''

   Alternative Hypothesis              Conclusion
   Sample 1 Mean != Sample 2 Mean       NOT REJECTED
   Sample 1 Mean <  Sample 2 Mean       NOT REJECTED
   Sample 1 Mean >  Sample 2 Mean       REJECTED
]
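
This output could be produced by a call along these lines (a usage sketch;
the procedure name `two_samples_t_test_unequal_sd` is assumed here, by
analogy with the equal-variance version above):

   two_samples_t_test_unequal_sd(20.145, 6.4147, 249, 30.481, 6.1077, 79, 0.05);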

This time allowing the variances in the two samples to differ has yielded
a higher likelihood that the observed difference is down to chance alone
(1.571e-025 compared to 5.273e-030 when equal variances were assumed).
However, the conclusion remains the same: US cars are less fuel efficient
than Japanese models.

[endsect] [/section:two_sample_students_t Comparing the means of two samples with the Students-t test]

[section:paired_st Comparing two paired samples with the Student's t distribution]

Imagine that we have a before and after reading for each item in the sample:
for example we might have measured blood pressure before and after administration
of a new drug.  We can't pool the results and compare the means before and after
the change, because each patient will have a different baseline reading.
Instead we calculate the difference between before and after measurements
in each patient, and calculate the mean and standard deviation of the differences.
To test whether a significant change has taken place, we can then test
the null-hypothesis that the true mean is zero using the same procedure
we used in the single sample cases previously discussed (see the sketch after the list below).

That means we can:

* [link math_toolkit.stat_tut.weg.st_eg.tut_mean_intervals Calculate confidence intervals of the mean].
If the endpoints of the interval differ in sign then we are unable to reject
the null-hypothesis that there is no change.
* [link math_toolkit.stat_tut.weg.st_eg.tut_mean_test Test whether the true mean is zero]. If the
result is consistent with a true mean of zero, then we are unable to reject the
null-hypothesis that there is no change.
* [link math_toolkit.stat_tut.weg.st_eg.tut_mean_size Calculate how many pairs of readings we would need
in order to obtain a significant result].
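
For example, here is a minimal sketch of the paired procedure, assuming the
`single_sample_t_test` procedure defined earlier in these examples and
hypothetical `before`/`after` vectors holding the paired readings:

   #include <vector>
   #include <cmath>

   void paired_t_test(const std::vector<double>& before,
                      const std::vector<double>& after, double alpha)
   {
      unsigned n = static_cast<unsigned>(before.size());
      // Mean of the per-patient differences:
      double mean = 0;
      for(unsigned i = 0; i < n; ++i)
         mean += after[i] - before[i];
      mean /= n;
      // Sample standard deviation of the differences:
      double sum_sq = 0;
      for(unsigned i = 0; i < n; ++i)
         sum_sq += (after[i] - before[i] - mean) * (after[i] - before[i] - mean);
      double sd = sqrt(sum_sq / (n - 1));
      // Test the null-hypothesis that the true mean difference is zero:
      single_sample_t_test(0.0, mean, sd, n, alpha);
   }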

[endsect] [/section:paired_st Comparing two paired samples with the Student's t distribution]


[endsect] [/section:st_eg Student's t Distribution Examples]

[/
  Copyright 2006, 2012 John Maddock and Paul A. Bristow.
  Distributed under the Boost Software License, Version 1.0.
  (See accompanying file LICENSE_1_0.txt or copy at
  http://www.boost.org/LICENSE_1_0.txt).
]