Transcript Slide 1

Statistics for Managers
Using Microsoft® Excel
5th Edition
Chapter 11
Analysis of Variance
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
Chap 11-1
Learning Objectives
After completing this chapter, you should be able to:
 Recognize situations in which to use analysis of variance






(ANOVA)
Understand different analysis of variance designs
Evaluate assumptions of the model
Perform a single-factor ANOVA and interpret the results
Conduct and interpret a Tukey-Kramer post-analysis to
determine which means are different
Analyze two-factor analysis of variance tests
Conduct and interpret a Tukey-Kramer post-analysis procedure
to determine which factors are different
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
Chap 11-2
General ANOVA Analysis
 Investigator controls one or more independent variables
 Called factors or treatment variables
 One factor contains three or more levels or groups or
categories/classifications
 Other factors contains two or more levels or groups or
categories/classifications
 Experimental design: the plan used to test the hypothesis
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
Chap 11-3
One-Factor ANOVA
 Also known as Completely Randomized
Design and One-way ANOVA
 Experimental units (subjects) are assigned
randomly to treatments
 Subjects are assumed homogeneous
 Only one factor or independent variable
 With three or more treatment levels
 Analyzed by one-factor analysis of variance
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
Chap 11-4
One-Factor Analysis of Variance
Evaluates the difference among the means of three or more
groups
Examples: Accident rates for 1st, 2nd, and 3rd shift
Expected mileage for five brands of tires
Assumptions
 Populations are normally distributed
(test with Box plot or Normal Probability Plot)
 Populations have equal variances
(use Levene’s Test for Homogeneity of Variance)
 Samples are randomly and independently drawn
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
Chap 11-5
Why Analysis of Variance?
 We could compare the means in pairs using a t test for
difference of means
 Each t test contains Type 1 error
 The total Type 1 error with k pairs of means is 1- (1 - a) k
 If there are 5 means and you use a = .05
 Must perform 10 comparisons
 Type I error is 1 – (.95) 10 = .40
 40% of the time you will reject the null hypothesis of equal
means in favor of the alternative even when the null is true!
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
Chap 11-6
Hypotheses: One-Factor ANOVA

H0 : μ1  μ2  μ3    μc
 All population means are equal
 i.e., no treatment effect (no variation in means among groups)

H1 : Not all of the population means are the same
 At least one population mean is different
 i.e., there is a treatment (groups) effect
 Does not mean that all population means are different (at
least one of the means is different from the others)
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
Chap 11-7
Hypotheses: One-Factor
ANOVA
H0 : μ1  μ2  μ3    μc
H1 : Not all μ j are thesame
All Means are the same:
The Null Hypothesis is True
(No Group Effect)
μ1  μ 2  μ 3
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
Chap 11-8
Hypotheses: One-Factor
ANOVA
At least one mean is different:
The Null Hypothesis is NOT true
(Treatment Effect is present)
H0 : μ1  μ 2  μ3    μ c
H1 : Not all μj are thesame
or
μ1  μ2  μ3
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
μ1  μ2  μ3
Chap 11-9
One-Factor ANOVA Table
Source of
Variation
df
SS
MS
(Variance)
P-value
Between
Groups
c-1
SSA
MSA
P(X=F)
Within
Groups
n-c
SSW
MSW
Total
n-1
SST =
SSA+SSW
F-Ratio
F
MSA
MSW
c = number of groups
n = sum of the sample sizes from all groups
df = degrees of freedom
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
Chap 11-10
One-Factor ANOVA
Test Statistic
H0: μ1= μ2 = … = μc
H1: At least two population means are different
 Test statistic
MSA
F
MSW
 MSA is mean squares among variances
 MSW is mean squares within variances
 Degrees of freedom
 df1 = c – 1
 df2 = n – c
(c = number of groups)
(n = sum of all sample sizes)
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
Chap 11-11
One-Factor ANOVA
Test Statistic
 The F statistic is the ratio of the among variance to the
within variance
 The ratio must always be positive
 df1 = c -1 will typically be small
 df2 = n - c will typically be large
Decision Rule:
Reject H0 if F > FU,
otherwise do not reject H0
a = .05
0
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
Do not
reject H0
Reject H0
FU
Chap 11-12
One-Factor ANOVA
F Test Example
You want to see if three different
golf clubs yield different
distances. You randomly select
five measurements from trials
on an automated driving
machine for each club. At the
.05 significance level, is there a
difference in mean driving
distance?
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
Club 1
254
263
241
237
251
Club 2
234
218
235
227
216
Club 3
200
222
197
206
204
Chap 11-13
One-Way ANOVA
Example
Distance
270
Club 1
254
263
241
237
251
Club 2
234
218
235
227
216
Club 3
200
222
197
206
204
x1  249.2 x 2  226.0 x 3  205.8
x  227.0
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
260
250
240
230
220
•
•• X 1
•
•
••
•
••
X2
210
••
••
200
190
•
1
2
Club
X
X3
3
Chap 11-14
ANOVA -- Single Factor:
Excel Output
EXCEL: Tools | Data Analysis | ANOVA: Single Factor
SUMMARY
Groups
Count
Sum
Average
Variance
Club 1
5
1246
249.2
108.2
Club 2
5
1130
226
77.5
Club 3
5
1029
205.8
94.2
ANOVA
Source of
Variation
SS
df
MS
F
P-value
25.275
4.99E-05
Between
Groups
4716.4
2
2358.2
Within
Groups
1119.6
12
93.3
Total
5836.0
14
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
F crit
3.89
Chap 11-15
One-Factor ANOVA Example
Solution
H0: μ1 = μ2 = μ3
H1: μi not all equal
a = .05
p-value: 4.99E-05
Decision:
Reject H0 at a = 0.05
Conclusion:
There is evidence that
at least one μi differs
from the rest
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
Chap 11-16
The Tukey-Kramer Procedure
 Tells which population means are significantly different
 e.g.: μ1 = μ2 ≠ μ3
 Done after rejection of equal means in ANOVA
 Allows pair-wise comparisons
 Compare absolute mean differences with critical
range
μ1= μ2
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
μ3
x
Chap 11-17
Tukey-Kramer Critical Range
CriticalRange  QU
MSW  1 1 

2  n j n j' 
where:
QU = Value from Studentized Range Distribution with c
and n - c degrees of freedom for the desired level
of a (see appendix E.9 table)
MSW = Mean Square Within
nj and nj’ = Sample sizes from groups j and j’
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
Chap 11-18
The Tukey-Kramer Procedure:
Example
Club 1
254
263
241
237
251
Club 2
234
218
235
227
216
Club 3
200
222
197
206
204
1. PhStat computes the absolute
mean differences:
x1  x 2  249.2  226.0  23.2
x1  x 3  249.2  205.8  43.4
x 2  x 3  226.0  205.8  20.2
2. You find a QU value from the table in appendix E.9 with
c = 3 (across the table) and n – c = 15 – 3 = 12 degrees of
freedom (down the table) for the desired level of a (a = .05
used here):
QU  3.77
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
Chap 11-19
The Tukey-Kramer Procedure:
Example
(continued)
3. PhStat computes the Critical Range:
Critical Range  QU
MSW
2
1 1
    3.77 93.3  1  1   16.285
n n 
2 5 5
j' 
 j
4. Compare:
5. All of the absolute mean differences
are greater than critical range.
Therefore there is a significant
difference between each pair of
means at 5% level of significance.
x1  x 2  23.2
x1  x 3  43.4
x 2  x 3  20.2
PhStat does all the calculations for you
but you must input the Q value
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
Chap 11-20
Tukey-Kramer in PHStat
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
Chap 11-21
ANOVA Assumptions
Levene’s Test
 Tests the assumption that the variances of each
group are equal.
 First, define the null and alternative hypotheses:
 H0: σ21 = σ22 = …=σ2c
 H1: Not all σ2j are equal
 Second, compute the absolute value of the difference
between each value and the median of each group.
 Third, perform a one-way ANOVA on these absolute
differences.
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
Chap 11-22
Two-Factor ANOVA
 Examines the effect of
 Two factors of interest on the dependent
variable
 e.g., Percent carbonation and line speed on soft
drink bottling process
 Interaction between the different levels of these
two factors
 e.g., Does the effect of one particular carbonation
level depend on which level the line speed is set?
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
Chap 11-23
Two-Factor ANOVA
 Assumptions
 Populations are normally distributed
 Populations have equal variances
 Independent random samples are selected
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
Chap 11-24
Two-Factor ANOVA
Sources of Variation
Two Factors of interest: A and B
r = number of levels of factor A
c = number of levels of factor B
n/ = number of replications for each cell
n = total number of observations in all cells
(n = rcn/)
Xijk = value of the kth observation of level i
of factor A and level j of factor B
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
Chap 11-25
Two-Factor ANOVA Summary Table
With Replication
Source of
Variation
Degrees of
Freedom
Sum of
Squares
Mean
Squares
F
Statistic
p-value
Sample
Factor A
(Row)
r–1
SSA
MSA =
SSA/(r – 1)
MSA/
MSE
f (FA)
Columns
Factor B
c–1
SSB
MSB =
SSB/(c – 1)
MSB/
MSE
f (FB)
SSAB
MSAB =
SSAB/ [(r – 1)(c – 1)]
MSAB/
MSE
f (FA&B)
SSE
MSE =
SSE/[rc (n’ – 1)]
Interaction
(r – 1)(c – 1)
(AB)
Within
Error
Total
rc
(n’ –
1)
rc n’ – 1
SST
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
Chap 11-26
Two-Factor ANOVA
With Replication
As production manager,
you want to see if 3 filling
machines have different
mean filling times when
used with 5 types of boxes.
At the .05 level, is there a
difference in machines, in
boxes? Is there an
interaction?
Box Machine1 Machine2 Machine3
1
25.40
23.40
20.00
26.40
24.40
21.00
2
26.31
21.80
22.20
25.90
23.00
22.00
3
24.10
23.50
19.75
24.40
22.40
19.00
4
23.74
22.75
20.60
25.40
23.40
20.00
5
25.10
21.60
20.40
26.20
22.90
21.90
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
Chap 11-28
Summary Table
Source of
Variation
Sample
(Boxes)
Columns
(Machines)
Degrees of Sum of
Mean
Freedom
Squares Square
F
P-Value
5-1=4
7.4714
1.8678
3.6868
3-1=2
106.298
53.149
104.908 1.52E-09
Interaction (5-1)(3-1) = 8
9.7032
1.2129
7.5994
.5066
Within
(Error)
5·3·(2-1)=15
Total
3·5·2 -1 = 29 131.0720
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
2.3941
.0277
.0690
Chap 11-29
The Tukey-Kramer Procedure:
With Replication Factor A
critical range  Qr ,rc ( n ' 1)
MSW
cn'
1. No Macro - compute by formula only
2. MSW (Within) from ANOVA Printout
3. Q from Table E.9 in the Book page 860
alpha = .05 or .01
r is the number of levels of factor A (across table)
rc(n’-1) (down the table)
c is the number of levels of factor B
n’ is the number of replications
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
Chap 11-30
The Tukey-Kramer Procedure:
With Replication Factor B
MSW
critical range  Qc,rc ( n ' 1)
rn'
1. No Macro compute by formula only
2. MSW (Within) from ANOVA Printout
3. Q from Table E.9 in the Book page 860
alpha = .05 or .01
c is the number of levels of factor B (across table)
rc(n’-1) (down the table)
r is the number of levels of factor A
n’ is the number of replications
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
Chap 11-31
Chapter Summary
 Described one-factor analysis of variance
 ANOVA assumptions
 ANOVA test for difference in c means
 The Tukey-Kramer procedure for multiple comparisons
 Described two-factor analysis of variance
 Examined effects of multiple factors
 Examined interaction between factors for the model with replicated
observations
 The Tukey-Kramer procedure for multiple comparisons for both
factor A and factor B for the model with replication
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
Chap 11-32