Transcript Slide 1

Chapter 15
The Analysis of Variance
Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc.
A Problem
A study was done on the survival time of
patients with advanced cancer of the
stomach, bronchus, colon, ovary or breast
when treated with ascorbate [1]. In this study,
the authors wanted to determine if the
survival times differ based on the affected
organ.
[1] Cameron, E. and Pauling, L. (1978) Supplemental ascorbate in the supportive
treatment of cancer: re-evaluation of prolongation of survival time in terminal human
cancer. Proceedings of the National Academy of Sciences, USA, 75, 4538-4542.
A Problem
A comparative dotplot of the survival times is
shown below.
[Figure: Dotplot for Survival Time. Survival Time (in days), 0 to 3000, by Cancer Type: Stomach, Ovary, Colon, Bronchus, Breast.]
A Problem
The hypotheses used to answer the question
of interest are
H0: µstomach = µbronchus = µcolon = µovary = µbreast
Ha: At least two of the µ’s are different
This question is similar to those encountered in
Chapter 11, where we looked at tests for the
difference between the means of two populations or
treatments. In this case we are interested in
comparing more than two means.
Single-factor Analysis of Variance
(ANOVA)
A single-factor analysis of variance
(ANOVA) problem involves a comparison of k
population or treatment means µ1, µ2, … , µk.
The objective is to test the hypotheses:
H0: µ1 = µ2 = µ3 = … = µk
Ha: At least two of the µ’s are different
Single-factor Analysis of Variance
(ANOVA)
The analysis is based on k independently
selected samples, one from each population
or for each treatment.
In the case of populations, a random
sample from each population is selected
independently of that from any other
population.
When comparing treatments, the
experimental units (subjects or objects)
that receive any particular treatment are
chosen at random from those available
for the experiment.
Single-factor Analysis of Variance
(ANOVA)
A comparison of treatments based on
independently selected experimental units is
often referred to as a completely randomized
design.
Single-factor Analysis of Variance
(ANOVA)
[Figure: Dotplots of Yield by Fertilizer (group means are indicated by lines). Yield (40 to 70) for Fertilizer Type 1, Type 2 and Type 3.]
Notice that in the above comparative dotplot, the
differences in the treatment means are large relative to
the variability within the samples.
Single-factor Analysis of Variance
(ANOVA)
[Figure: Dotplots of Price by Subject (group means are indicated by lines). Price (65 to 85) for Statistics, Psychology, Economics and Business.]
Notice that in the above comparative dotplot, the
differences in the treatment means are not easily
judged relative to the sample variability.
ANOVA techniques will allow us to determine if those
differences are significant.
ANOVA Notation
k = number of populations or treatments being compared
Population or treatment             1              2              …   k
Population or treatment mean        $\mu_1$        $\mu_2$        …   $\mu_k$
Population or treatment variance    $\sigma_1^2$   $\sigma_2^2$   …   $\sigma_k^2$
Sample size                         $n_1$          $n_2$          …   $n_k$
Sample mean                         $\bar{x}_1$    $\bar{x}_2$    …   $\bar{x}_k$
Sample variance                     $s_1^2$        $s_2^2$        …   $s_k^2$
ANOVA Notation
N = n1 + n2 + … + nk
(Total number of observations in the data set)
T = grand total = sum of all N observations
$$T = n_1\bar{x}_1 + n_2\bar{x}_2 + \cdots + n_k\bar{x}_k$$
$$\bar{\bar{x}} = \text{grand mean} = \frac{T}{N}$$
Assumptions for ANOVA
1. For each of the k populations or treatments,
the response distribution is normal.
2. σ1 = σ2 = … = σk (The k normal
distributions have identical standard
deviations.)
3. The observations in the sample from any
particular one of the k populations or treatments
are independent of one another.
4. When comparing population means, k random
samples are selected independently of one
another. When comparing treatment means,
treatments are assigned at random to subjects or
objects.
Definitions
A measure of disparity among the sample
means is the treatment sum of squares,
denoted by SSTr and given by
$$\text{SSTr} = n_1(\bar{x}_1 - \bar{\bar{x}})^2 + n_2(\bar{x}_2 - \bar{\bar{x}})^2 + \cdots + n_k(\bar{x}_k - \bar{\bar{x}})^2$$
A measure of variation within the k samples, called the
error sum of squares and denoted by SSE, is
given by
$$\text{SSE} = (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \cdots + (n_k - 1)s_k^2$$
Definitions
A mean square is a sum of squares divided
by its df. In particular,
$$\text{mean square for treatments} = \text{MSTr} = \frac{\text{SSTr}}{k - 1}$$
$$\text{mean square for error} = \text{MSE} = \frac{\text{SSE}}{N - k}$$
The error df comes from adding the df’s associated
with each of the sample variances:
(n1 - 1) + (n2 - 1) + … + (nk - 1)
= n1 + n2 + … + nk - 1 - 1 - … - 1 = N - k
Example
Three filling machines are used by a bottler to
fill 12 oz cans of soda. In an attempt to
determine if the three machines are filling the
cans to the same (mean) level, independent
samples of cans filled by each were selected
and the amounts of soda in the cans measured.
The samples are given below.
Machine 1: 12.033  11.985  12.009  12.009  12.033  12.025  12.054  12.050
Machine 2: 12.031  11.985  11.998  11.992  11.985  12.027  11.987
Machine 3: 12.034  12.001  12.021  12.021  12.020  12.038  12.029  12.058  12.011
Example
$n_1 = 8,\ \bar{x}_1 = 12.0248,\ s_1 = 0.02301$
$n_2 = 7,\ \bar{x}_2 = 12.0007,\ s_2 = 0.01989$
$n_3 = 9,\ \bar{x}_3 = 12.0259,\ s_3 = 0.01650$
$\bar{\bar{x}} = 12.018167$
$$
\begin{aligned}
\text{SSTr} &= n_1(\bar{x}_1 - \bar{\bar{x}})^2 + n_2(\bar{x}_2 - \bar{\bar{x}})^2 + \cdots + n_k(\bar{x}_k - \bar{\bar{x}})^2 \\
&= 8(0.0065833)^2 + 7(-0.0174524)^2 + 9(0.0077222)^2 \\
&= 0.00034672 + 0.00213210 + 0.00053669 \\
&= 0.00301552
\end{aligned}
$$
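The SSTr arithmetic above can be reproduced from the raw fill data. The following is a minimal Python sketch; the dictionary `fills` and the function name `treatment_sum_of_squares` are illustrative choices, not part of the original slides.

```python
from statistics import mean

# Fill amounts (oz) copied from the example's data table.
fills = {
    "Machine 1": [12.033, 11.985, 12.009, 12.009, 12.033, 12.025, 12.054, 12.050],
    "Machine 2": [12.031, 11.985, 11.998, 11.992, 11.985, 12.027, 11.987],
    "Machine 3": [12.034, 12.001, 12.021, 12.021, 12.020, 12.038, 12.029, 12.058, 12.011],
}

def treatment_sum_of_squares(samples):
    """SSTr = sum over groups of n_i * (group mean - grand mean)^2."""
    all_obs = [x for obs in samples.values() for x in obs]
    grand_mean = sum(all_obs) / len(all_obs)  # T / N
    return sum(len(obs) * (mean(obs) - grand_mean) ** 2 for obs in samples.values())

sstr = treatment_sum_of_squares(fills)
print(f"SSTr = {sstr:.8f}")
```

Working from the raw data avoids the small rounding differences that arise when the rounded sample means are plugged in by hand.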
Example
$n_1 = 8,\ \bar{x}_1 = 12.0248,\ s_1 = 0.02301$
$n_2 = 7,\ \bar{x}_2 = 12.0007,\ s_2 = 0.01989$
$n_3 = 9,\ \bar{x}_3 = 12.0259,\ s_3 = 0.01650$
$\bar{\bar{x}} = 12.018167$
$$
\begin{aligned}
\text{SSE} &= (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \cdots + (n_k - 1)s_k^2 \\
&= 7(0.0230078)^2 + 6(0.0198890)^2 + 8(0.01649579)^2 \\
&= 0.0037055 + 0.0023734 + 0.0021769 \\
&= 0.00825582
\end{aligned}
$$
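As a cross-check on the SSE calculation, here is a short Python sketch that works from the raw fills. It relies on `statistics.variance`, which returns the sample variance (divisor n - 1); the names `fills` and `error_sum_of_squares` are illustrative.

```python
from statistics import variance

# Fill amounts (oz) copied from the example's data table.
fills = {
    "Machine 1": [12.033, 11.985, 12.009, 12.009, 12.033, 12.025, 12.054, 12.050],
    "Machine 2": [12.031, 11.985, 11.998, 11.992, 11.985, 12.027, 11.987],
    "Machine 3": [12.034, 12.001, 12.021, 12.021, 12.020, 12.038, 12.029, 12.058, 12.011],
}

def error_sum_of_squares(samples):
    """SSE = sum over groups of (n_i - 1) * s_i^2."""
    return sum((len(obs) - 1) * variance(obs) for obs in samples.values())

sse = error_sum_of_squares(fills)
print(f"SSE = {sse:.8f}")
```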
Example
$n_1 = 8,\ \bar{x}_1 = 12.0248,\ s_1 = 0.02301$
$n_2 = 7,\ \bar{x}_2 = 12.0007,\ s_2 = 0.01989$
$n_3 = 9,\ \bar{x}_3 = 12.0259,\ s_3 = 0.01650$
$\bar{\bar{x}} = 12.018167$
$$\text{MSTr} = \frac{\text{SSTr}}{k - 1} = \frac{0.00301552}{3 - 1} = 0.0015078$$
$$\text{MSE} = \frac{\text{SSE}}{N - k} = \frac{0.00825582}{24 - 3} = 0.00039313$$
Comments
Both MSTr and MSE are quantities that are
calculated from sample data.
As such, both MSTr and MSE are statistics
and have sampling distributions.
More specifically, when H0 is true, µMSTr = µMSE.
However, when H0 is false, µMSTr > µMSE, and the
greater the differences among the µ’s, the larger µMSTr
will be relative to µMSE.
The Single-Factor ANOVA F Test
Null hypothesis: H0: µ1 = µ2 = µ3 = … = µk
Alternate hypothesis: Ha: At least two of the µ’s
are different
Test Statistic:
$$F = \frac{\text{MSTr}}{\text{MSE}}$$
The Single-Factor ANOVA F Test
When H0 is true and the ANOVA assumptions
are reasonable, F has an F distribution with
df1 = k - 1 and df2 = N - k.
Values of F more contradictory to H0 than what was
calculated are values even farther out in the upper tail,
so the P-value is the area captured in the upper tail of
the corresponding F curve.
Example
Consider the earlier example involving the
three filling machines.
Machine 1: 12.033  11.985  12.009  12.009  12.033  12.025  12.054  12.050
Machine 2: 12.031  11.985  11.998  11.992  11.985  12.027  11.987
Machine 3: 12.034  12.001  12.021  12.021  12.020  12.038  12.029  12.058  12.011
Example
$n_1 = 8,\ \bar{x}_1 = 12.0248,\ s_1 = 0.02301$
$n_2 = 7,\ \bar{x}_2 = 12.0007,\ s_2 = 0.01989$
$n_3 = 9,\ \bar{x}_3 = 12.0259,\ s_3 = 0.01650$
$\bar{\bar{x}} = 12.018167$
SSTr = 0.00301552
SSE = 0.00825582
MSTr = 0.0015078
MSE = 0.00039313
Example
1. Let µ1, µ2 and µ3 denote the true mean
amount of soda in the cans filled by
machines 1, 2 and 3, respectively.
2. H0: µ1 = µ2 = µ3
3. Ha: At least two among µ1, µ2 and µ3 are
different
4. Significance level: α = 0.01
5. Test statistic: $F = \frac{\text{MSTr}}{\text{MSE}}$
Example
6. Looking at the comparative dotplot, it
seems reasonable to assume that the
distributions have the same σ’s. We shall
look at the normality assumption on the
next slide.*
[Figure: Dotplot for Fill. Fill (11.99 to 12.06) by Machine: Machine 3, Machine 2, Machine 1.]
*When the sample sizes are large, we can make judgments about
both the equality of the standard deviations and the normality of the
underlying populations with a comparative boxplot.
Example
6. Looking at normal plots for the samples, it
certainly appears reasonable to assume that
the samples from Machines 1 and 3 are
samples from normal distributions.
Unfortunately, the normal plot for the sample
from Machine 2 does not look like that of a
sample from a normal population. So as to
have a computational example, we shall
continue and finish the test, treating the
result with a “grain of salt.”
Example
7. Computation:
$n_1 = 8,\ \bar{x}_1 = 12.0248,\ s_1 = 0.02301$
$n_2 = 7,\ \bar{x}_2 = 12.0007,\ s_2 = 0.01989$
$n_3 = 9,\ \bar{x}_3 = 12.0259,\ s_3 = 0.01650$
$\bar{\bar{x}} = 12.018167$
SSTr = 0.00301552
SSE = 0.00825582
MSTr = 0.0015078
MSE = 0.00039313
N = n1 + n2 + n3 = 8 + 7 + 9 = 24, k = 3
$$F = \frac{\text{MSTr}}{\text{MSE}} = \frac{0.0015078}{0.00039313} = 3.835$$
df1 = treatment df = k - 1 = 3 - 1 = 2
df2 = error df = N - k = 24 - 3 = 21
Example
8. P-value:
Recall
$$F = \frac{\text{MSTr}}{\text{MSE}} = \frac{0.0015078}{0.00039313} = 3.835$$
df1 = treatment df = k - 1 = 3 - 1 = 2
df2 = error df = N - k = 24 - 3 = 21
From the F table with
numerator df1 = 2 and
denominator df2 = 21 we
can see that
0.025 < P-value < 0.05
(Minitab reports this value
to be 0.038.)
F table excerpt (numerator df = 2, denominator df = 21):
Tail area    0.100   0.050   0.025   0.010   0.001
F value      2.57    3.47    4.42    5.78    9.77
The computed value F = 3.835 falls between 3.47 and 4.42.
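As a rough cross-check of the table lookup: when the numerator df equals 2, the upper-tail area of the F curve has the closed form (1 + 2F/df2)^(-df2/2). The Python sketch below applies that special case only; the function name is illustrative and the formula does not apply for other numerator df.

```python
def f_upper_tail_df1_equals_2(f_value, df2):
    """Upper-tail area P(F > f_value) for an F distribution with df1 = 2.

    Special case only: for numerator df = 2 the tail area is
    (1 + 2 * f / df2) ** (-df2 / 2).
    """
    return (1 + 2 * f_value / df2) ** (-df2 / 2)

# Filling-machine example: F = 3.835 with df1 = 2, df2 = 21.
p_value = f_upper_tail_df1_equals_2(3.835, 21)
print(f"P-value = {p_value:.3f}")
```

The result lies between 0.025 and 0.05, in line with the table bracket and the Minitab figure quoted above.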
Example
9. Conclusion:
Since P-value > α = 0.01, we fail to reject
H0. The differences among the mean fills of
the three machines are not statistically
significant at the 1% level of significance,
so we are unable to show that the mean
fills differ.
Total Sum of Squares
Total sum of squares, denoted by SSTo,
is given by
$$\text{SSTo} = \sum_{\text{all } N \text{ obs.}} (x - \bar{\bar{x}})^2$$
with associated df = N - 1.
The relationship between the three sums of
squares is SSTo = SSTr + SSE
which is often called the fundamental identity
for single-factor ANOVA.
Informally this relation is expressed as
Total variation = Explained variation + Unexplained variation
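The fundamental identity can be checked numerically on the filling-machine data from the earlier example. The following Python sketch computes the three sums of squares separately; variable names are illustrative.

```python
from statistics import mean, variance

# Fill amounts (oz) from the filling-machine example.
fills = {
    "Machine 1": [12.033, 11.985, 12.009, 12.009, 12.033, 12.025, 12.054, 12.050],
    "Machine 2": [12.031, 11.985, 11.998, 11.992, 11.985, 12.027, 11.987],
    "Machine 3": [12.034, 12.001, 12.021, 12.021, 12.020, 12.038, 12.029, 12.058, 12.011],
}

all_obs = [x for obs in fills.values() for x in obs]
grand_mean = sum(all_obs) / len(all_obs)

# Total, treatment, and error sums of squares, each from its own definition.
ssto = sum((x - grand_mean) ** 2 for x in all_obs)
sstr = sum(len(obs) * (mean(obs) - grand_mean) ** 2 for obs in fills.values())
sse = sum((len(obs) - 1) * variance(obs) for obs in fills.values())

print(f"SSTo        = {ssto:.6f}")
print(f"SSTr + SSE  = {sstr + sse:.6f}")
```

Up to floating-point rounding, the two printed values agree.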
Single-factor ANOVA Table
The following is a fairly standard way of
presenting the important calculations from
a single-factor ANOVA. The output from
most statistical packages will contain an
additional column giving the P-value.
Single-factor ANOVA Table
The ANOVA table supplied by Minitab
One-way ANOVA: Fills versus Machine
Analysis of Variance for Fills
Source     DF         SS         MS       F       P
Machine     2   0.003016   0.001508    3.84   0.038
Error      21   0.008256   0.000393
Total      23   0.011271
Another Example
A food company produces 4 different
brands of salsa. In order to determine if the
four brands had the same sodium levels, 10
bottles of each Brand were randomly (and
independently) obtained and the sodium
content in milligrams (mg) per tablespoon
serving was measured.
The sample data are given on the next
slide.
Use the data to perform an appropriate
hypothesis test at the 0.05 level of
significance.
Another Example
Brand A
43.85 44.30 45.69 47.13 43.35
45.59 45.92 44.89 43.69 44.59
Brand B
42.50 45.63 44.98 43.74 44.95
42.99 44.95 45.93 45.54 44.70
Brand C
45.84 48.74 49.25 47.30 46.41
46.35 46.31 46.93 48.30 45.13
Brand D
43.81 44.77 43.52 44.63 44.84
46.30 46.68 47.55 44.24 45.46
Another Example
1. Let µ1, µ2 , µ3 and µ4 denote the true
mean sodium content per tablespoon in
each of the brands respectively.
2. H0: µ1 = µ2 = µ3 = µ4
3. Ha: At least two among µ1, µ2, µ3 and
µ4 are different
4. Significance level: α = 0.05
5. Test statistic: $F = \frac{\text{MSTr}}{\text{MSE}}$
Another Example
6. Looking at the following comparative
boxplot, it seems reasonable to assume
that the distributions have equal σ’s and
that the samples are samples from
normal distributions.
[Figure: Boxplots of Brand A - Brand D (means are indicated by solid circles). Axis from 42 to 49; Brand D, Brand C, Brand B, Brand A.]
Example
7. Computation:
Brand      $n_i$   $\bar{x}_i$   $s_i$
Brand A    10      44.900        1.180
Brand B    10      44.591        1.148
Brand C    10      47.056        1.331
Brand D    10      45.180        1.304
$\bar{\bar{x}} = 45.432$
$$
\begin{aligned}
\text{SSTr} &= n_1(\bar{x}_1 - \bar{\bar{x}})^2 + n_2(\bar{x}_2 - \bar{\bar{x}})^2 + n_3(\bar{x}_3 - \bar{\bar{x}})^2 + n_4(\bar{x}_4 - \bar{\bar{x}})^2 \\
&= 10(44.900 - 45.432)^2 + 10(44.591 - 45.432)^2 + 10(47.056 - 45.432)^2 + 10(45.180 - 45.432)^2 \\
&= 36.912
\end{aligned}
$$
Treatment df = k - 1 = 4 - 1 = 3
Example
7. Computation (continued):
$$
\begin{aligned}
\text{SSE} &= (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + (n_3 - 1)s_3^2 + (n_4 - 1)s_4^2 \\
&= 9(1.180)^2 + 9(1.148)^2 + 9(1.331)^2 + 9(1.304)^2 \\
&= 55.627
\end{aligned}
$$
Error df = N - k = 40 - 4 = 36
$$F = \frac{\text{MSTr}}{\text{MSE}} = \frac{\text{SSTr}/\text{df}_{\text{SSTr}}}{\text{SSE}/\text{df}_{\text{SSE}}} = \frac{36.912/3}{55.627/36} = \frac{12.304}{1.5452} = 7.963$$
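The whole computation step for the salsa example can be reproduced from the raw sodium measurements. This is a minimal Python sketch using only the standard library; the names `sodium`, `mstr`, `mse` and `f_stat` are illustrative.

```python
from statistics import mean, variance

# Sodium content (mg per tablespoon) copied from the data slide.
sodium = {
    "Brand A": [43.85, 44.30, 45.69, 47.13, 43.35, 45.59, 45.92, 44.89, 43.69, 44.59],
    "Brand B": [42.50, 45.63, 44.98, 43.74, 44.95, 42.99, 44.95, 45.93, 45.54, 44.70],
    "Brand C": [45.84, 48.74, 49.25, 47.30, 46.41, 46.35, 46.31, 46.93, 48.30, 45.13],
    "Brand D": [43.81, 44.77, 43.52, 44.63, 44.84, 46.30, 46.68, 47.55, 44.24, 45.46],
}

k = len(sodium)                                  # number of treatments
all_obs = [x for obs in sodium.values() for x in obs]
N = len(all_obs)                                 # total number of observations
grand_mean = sum(all_obs) / N

sstr = sum(len(obs) * (mean(obs) - grand_mean) ** 2 for obs in sodium.values())
sse = sum((len(obs) - 1) * variance(obs) for obs in sodium.values())

mstr = sstr / (k - 1)    # treatment df = k - 1
mse = sse / (N - k)      # error df = N - k
f_stat = mstr / mse

print(f"SSTr = {sstr:.3f}, SSE = {sse:.3f}")
print(f"MSTr = {mstr:.3f}, MSE = {mse:.4f}, F = {f_stat:.3f}")
```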
Example
8. P-value:
F = 7.96 with numerator df = 3 and denominator df = 36.
Denominator df = 36 does not appear in the F table, so
using df = 30 we find
P-value < 0.001
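Without software or a table, the upper-tail area can also be approximated numerically. The Python sketch below is one possible approach, not the method used on the slides: it uses the standard identity P(F > f) = I_x(df2/2, df1/2) with x = df2/(df2 + df1·f), where I is the regularized incomplete beta function, and evaluates the integral with Simpson's rule. The function name and the `steps` parameter are illustrative, and the sketch assumes df2 > 2.

```python
from math import exp, lgamma, log

def f_upper_tail(f_value, df1, df2, steps=2000):
    """Approximate P(F > f_value) for an F(df1, df2) distribution.

    Integrates the Beta(df2/2, df1/2) density from 0 to x = df2/(df2 + df1*f)
    with Simpson's rule. Assumes df2 > 2 and an even number of steps.
    """
    if f_value <= 0:
        return 1.0
    a, b = df2 / 2, df1 / 2
    x = df2 / (df2 + df1 * f_value)
    log_beta = lgamma(a) + lgamma(b) - lgamma(a + b)

    def density(t):
        if t <= 0.0:
            return 0.0  # correct limit because a > 1 when df2 > 2
        return exp((a - 1) * log(t) + (b - 1) * log(1 - t) - log_beta)

    h = x / steps
    total = density(0.0) + density(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return total * h / 3

# Salsa example: F = 7.963 with df1 = 3, df2 = 36.
p_value = f_upper_tail(7.963, 3, 36)
print(f"P-value = {p_value:.5f}")
```

The approximation falls below 0.001, consistent with the table-based conclusion.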
Example
9. Conclusion:
Since P-value < 0.001 < α = 0.05, we reject
H0. We can conclude that the mean
sodium content is different for at least
two of the Brands.
We need to learn how to interpret the results and
will spend some time on developing techniques to
describe the differences among the µ’s.
Multiple Comparisons
A multiple comparison procedure is a
method for identifying differences among the
µ’s once the hypothesis of overall equality
(H0) has been rejected.
The technique we will present is based on
computing confidence intervals for difference
of means for the pairs.
Specifically, if k populations or treatments are studied,
we would create k(k-1)/2 differences. (For example, with 3
treatments one would generate confidence intervals for
µ1 - µ2, µ1 - µ3 and µ2 - µ3.) Notice that it is only
necessary to look at a confidence interval for µ1 - µ2 to
see if µ1 and µ2 differ.
The Tukey-Kramer Multiple
Comparison Procedure
When there are k populations or treatments
being compared, k(k-1)/2 confidence
intervals must be computed. If we denote the
relevant Studentized range critical value by
q, the intervals are as follows:
For µi - µj:
$$(\bar{x}_i - \bar{x}_j) \pm q\sqrt{\frac{\text{MSE}}{2}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$
Two means are judged to differ significantly if
the corresponding interval does not include
zero.
The Tukey-Kramer Multiple
Comparison Procedure
When all of the sample sizes are the same,
we denote the common sample size by
n = n1 = n2 = n3 = … = nk,
and the confidence intervals (for µi - µj)
simplify to
$$(\bar{x}_i - \bar{x}_j) \pm q\sqrt{\frac{\text{MSE}}{n}}$$
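A small Python sketch can confirm that the general Tukey-Kramer half-width reduces to the equal-sample-size form. The function name and the numeric inputs below are hypothetical, chosen only for illustration; they are not taken from the slides.

```python
from math import isclose, sqrt

def tukey_kramer_half_width(q, mse, n_i, n_j):
    """Half-width q * sqrt((MSE / 2) * (1/n_i + 1/n_j)) of the interval for mu_i - mu_j."""
    return q * sqrt((mse / 2) * (1 / n_i + 1 / n_j))

# Hypothetical inputs for illustration only.
q, mse, n = 3.5, 2.0, 8

general = tukey_kramer_half_width(q, mse, n, n)   # general formula with n_i = n_j = n
simplified = q * sqrt(mse / n)                    # equal-sample-size formula

print(isclose(general, simplified))
# With unequal sizes the interval is wider when one sample is smaller.
print(tukey_kramer_half_width(q, mse, 4, 8) > general)
```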
Example (continued)
Continuing with the example dealing with the
sodium content for the four Brands of salsa, we
shall compute the 95% Tukey-Kramer confidence intervals for µA - µB, µA - µC,
µA - µD, µB - µC, µB - µD and µC - µD.
$$\text{MSE} = \frac{55.627}{36} = 1.5452, \quad n = n_A = n_B = n_C = n_D = 10$$
q = 3.81 (interpolating from the table, i.e., 60% of the way from 3.85 to 3.79)
$$q\sqrt{\frac{\text{MSE}}{n}} = 3.81\sqrt{\frac{1.5452}{10}} = 1.498$$
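The interpolation and the resulting margin can be sketched in Python as follows. The slides only say "60% of the way from 3.85 to 3.79"; the sketch assumes these are the tabled critical values at error df 30 and 40, which is what puts df = 36 at 60% of the way. The helper name `interpolate` is illustrative.

```python
from math import sqrt

def interpolate(x, x0, y0, x1, y1):
    """Linear interpolation between the points (x0, y0) and (x1, y1)."""
    return y0 + (x - x0) / (x1 - x0) * (y1 - y0)

# Assumption: 3.85 and 3.79 are the tabled q values at error df 30 and 40.
q = round(interpolate(36, 30, 3.85, 40, 3.79), 2)

mse, n = 1.5452, 10
margin = q * sqrt(mse / n)   # common half-width of every pairwise interval

print(f"q = {q}, margin = {margin:.3f}")
```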
Example (continued)
Difference    95% Confidence Interval    95% Confidence Limits
µA - µB        0.309 ± 1.498             (-1.189, 1.807)
µA - µC       -2.156 ± 1.498             (-3.654, -0.658)
µA - µD       -0.280 ± 1.498             (-1.778, 1.218)
µB - µC       -2.465 ± 1.498             (-3.963, -0.967)
µB - µD       -0.589 ± 1.498             (-2.087, 0.909)
µC - µD        1.876 ± 1.498             (0.378, 3.374)
Notice that the confidence intervals for µA – µC, µB – µC
and µC – µD do not contain 0, so we can infer that the mean
sodium content for Brand C is different from that of Brands A, B
and D.
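The six intervals can be generated mechanically from the sample means and the common margin. The Python sketch below uses the rounded means and the margin 1.498 from the example; the variable names are illustrative.

```python
from itertools import combinations

# Sample means (mg per tablespoon) and the common Tukey-Kramer margin from the example.
means = {"A": 44.900, "B": 44.591, "C": 47.056, "D": 45.180}
margin = 1.498

differing = []
for (b1, m1), (b2, m2) in combinations(means.items(), 2):
    diff = m1 - m2
    lower, upper = diff - margin, diff + margin
    excludes_zero = lower > 0 or upper < 0   # the two means are judged to differ
    if excludes_zero:
        differing.append((b1, b2))
    print(f"mu{b1} - mu{b2}: ({lower:.3f}, {upper:.3f})"
          f"{'  <- excludes 0' if excludes_zero else ''}")

print("Pairs judged different:", differing)
```

Only the pairs involving Brand C are flagged.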
Example (continued)
We also illustrate the differences with the
following listing of the sample means in
increasing order with lines underneath those
blocks of means that are indistinguishable.
Brand B    Brand A    Brand D    Brand C
44.591     44.900     45.180     47.056
----------------------------
Notice that the confidence intervals for µA – µC, µB – µC, and
µC – µD do not contain 0, so we can infer that the mean
sodium content for Brand C differs from that of all the others.
Minitab Output for Example
One-way ANOVA: Sodium versus Brand
Analysis of Variance for Sodium
Source     DF        SS        MS        F        P
Brand       3     36.91     12.30     7.96    0.000
Error      36     55.63      1.55
Total      39     92.54
                                   Individual 95% CIs For Mean
                                   Based on Pooled StDev
Level       N      Mean     StDev  ------+---------+---------+---------+
Brand A    10    44.900     1.180      (-----*------)
Brand B    10    44.591     1.148   (------*-----)
Brand C    10    47.056     1.331                       (------*------)
Brand D    10    45.180     1.304        (------*-----)
                                   ------+---------+---------+---------+
Pooled StDev =    1.243                44.4      45.6      46.8      48.0
Minitab Output for Example
Tukey's pairwise comparisons
Family error rate = 0.0500
Individual error rate = 0.0107
Critical value = 3.81
Intervals for (column level mean) - (row level mean)
              Brand A     Brand B     Brand C
Brand B        -1.189
                1.807
Brand C        -3.654      -3.963
               -0.658      -0.967
Brand D        -1.778      -2.087       0.378
                1.218       0.909       3.374
Simultaneous Confidence Level
The Tukey-Kramer intervals are created in a
manner that controls the simultaneous
confidence level.
For example, at the 95% level, if the procedure is used
repeatedly on many different data sets, in the long run only
about 5% of the time would at least one of the intervals fail to
include the value it is estimating.
We then say that the family error rate is 5%, which is
the maximum probability that one or more of the confidence
intervals for the differences of means does not contain the true
difference of means.
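The long-run interpretation can be illustrated with a small simulation. The Python sketch below is a hypothetical setup mirroring the salsa example's design (k = 4 groups, n = 10 each, q = 3.81): it draws all groups from the same normal distribution, so every true difference is 0, and counts how often at least one Tukey-Kramer interval misses 0. The function name, seed and number of replications are arbitrary choices.

```python
import random
from itertools import combinations
from math import sqrt

def family_error_rate(k=4, n=10, q=3.81, reps=5000, seed=1):
    """Estimate how often at least one pairwise interval misses the true difference (0)."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(reps):
        means, sse = [], 0.0
        for _ in range(k):
            sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
            m = sum(sample) / n
            means.append(m)
            sse += sum((x - m) ** 2 for x in sample)   # within-sample variation
        mse = sse / (k * n - k)                         # error df = N - k
        margin = q * sqrt(mse / n)                      # equal-sample-size half-width
        # An interval misses the true difference 0 when |mean_i - mean_j| > margin.
        if any(abs(a - b) > margin for a, b in combinations(means, 2)):
            failures += 1
    return failures / reps

rate = family_error_rate()
print(f"estimated family error rate: {rate:.3f}")
```

Under these assumptions the estimate should land near 0.05, up to simulation noise.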