Statistics for Managers Using Microsoft Excel, 3/e


Statistics for Managers
Using Microsoft Excel
3rd Edition
Chapter 9
Analysis of Variance
© 2002 Prentice-Hall, Inc.
Chapter Topics

The completely randomized design: one-factor analysis of variance
    ANOVA assumptions
    F test for difference in c means
    The Tukey-Kramer procedure
The factorial design: two-way analysis of variance
    Examine effects of factors and interaction
Kruskal-Wallis rank test for differences in c medians

General Experimental Setting

Investigator controls one or more independent variables
    Called treatment variables or factors
    Each treatment factor contains two or more levels (or categories/classifications)
Observe effects on dependent variable
    Response to levels of independent variable
Experimental design: the plan used to test a hypothesis

Completely Randomized Design

Experimental units (subjects) are assigned randomly to treatments
    Subjects are assumed homogeneous
Only one factor or independent variable
    With two or more treatment levels
Analyzed by
    One-factor analysis of variance (one-way ANOVA)

Randomized Design Example
[Diagram: a factor (training method) with its factor levels (treatments); experimental units are randomly assigned to the levels, and the dependent variable (response) is recorded in hours]
Responses shown in the diagram: 21, 17, 31, 27, 25, 28, 29, 20, 22 hrs

One-Factor Analysis of Variance
F Test

Evaluate the difference among the mean responses of two or more (c) populations
    e.g.: Several types of tires, oven temperature settings
Assumptions
    Samples are randomly and independently drawn
        This condition must be met
    Populations are normally distributed
        F test is robust to moderate departure from normality
    Populations have equal variances
        Less sensitive to this requirement when samples are of equal size from each population

Why ANOVA?



Could compare the means one by one using Z or t tests for difference of means
Each Z or t test contains Type I error
The total Type I error with k pairs of means is $1 - (1 - \alpha)^k$ (treating the k tests as independent)
    e.g.: If there are 5 means and use $\alpha = .05$
        Must perform 10 comparisons
        Type I error is $1 - (.95)^{10} = .40$
        40% of the time you will reject the null hypothesis of equal means in favor of the alternative even when the null is true!

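The arithmetic on this slide can be checked with a short sketch. The function name `familywise_error` is a hypothetical helper introduced here for illustration, and the formula assumes the k pairwise tests are independent, which is only an approximation when the same samples are reused.

```python
from math import comb

def familywise_error(num_means, alpha):
    """Return (number of pairwise comparisons, overall Type I error rate)."""
    k = comb(num_means, 2)           # number of distinct pairs of means
    overall = 1 - (1 - alpha) ** k   # assumes the k tests are independent
    return k, overall

k, overall = familywise_error(5, 0.05)
print(k, round(overall, 2))
```

With 5 means the function counts 10 pairs, and the overall error rate comes out near the .40 quoted above.
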
Hypotheses of One-Way ANOVA

$H_0: \mu_1 = \mu_2 = \cdots = \mu_c$
    All population means are equal
    No treatment effect (no variation in means among groups)
$H_1$: Not all $\mu_i$ are the same
    At least one population mean is different (others may be the same!)
    There is treatment effect
    Does not mean that all population means are different

One-Factor ANOVA
(No Treatment Effect)
$H_0: \mu_1 = \mu_2 = \cdots = \mu_c$
$H_1$: Not all $\mu_i$ are the same
The Null Hypothesis is True
$\mu_1 = \mu_2 = \mu_3$

One-Factor ANOVA
(Treatment Effect Present)
$H_0: \mu_1 = \mu_2 = \cdots = \mu_c$
$H_1$: Not all $\mu_i$ are the same
The Null Hypothesis is NOT True
$\mu_1 = \mu_2 \ne \mu_3$
$\mu_1 \ne \mu_2 \ne \mu_3$

One-Factor ANOVA
(Partition of Total Variation)
Total Variation SST = Variation Due to Treatment SSA + Variation Due to Random Sampling SSW

SSA is commonly referred to as:
    Sum of Squares Among
    Sum of Squares Between
    Sum of Squares Model
    Sum of Squares Explained
    Sum of Squares Treatment
    Among Groups Variation
SSW is commonly referred to as:
    Sum of Squares Within
    Sum of Squares Error
    Sum of Squares Unexplained
    Within Groups Variation

Total Variation
\[ SST = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{\bar{X}})^2 \]
$X_{ij}$: the i-th observation in group j
$n_j$: the number of observations in group j
$n$: the total number of observations in all groups
$c$: the number of groups
\[ \bar{\bar{X}} = \frac{\sum_{j=1}^{c} \sum_{i=1}^{n_j} X_{ij}}{n} \]
$\bar{\bar{X}}$: the overall or grand mean

Total Variation (continued)
\[ SST = (X_{11} - \bar{\bar{X}})^2 + (X_{21} - \bar{\bar{X}})^2 + \cdots + (X_{n_c c} - \bar{\bar{X}})^2 \]
[Figure: responses X for Group 1, Group 2, and Group 3 plotted around the grand mean $\bar{\bar{X}}$]

Among-Group Variation
\[ SSA = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{\bar{X}})^2 \qquad MSA = \frac{SSA}{c - 1} \]
$\bar{X}_j$: The sample mean of group j
$\bar{\bar{X}}$: The overall or grand mean
Variation Due to Differences Among Groups.
[Figure: two group distributions labeled $\mu_i$ and $\mu_j$]

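Among-Group Variation (continued)
\[ SSA = n_1 (\bar{X}_1 - \bar{\bar{X}})^2 + n_2 (\bar{X}_2 - \bar{\bar{X}})^2 + \cdots + n_c (\bar{X}_c - \bar{\bar{X}})^2 \]
[Figure: responses X for Group 1, Group 2, and Group 3, with the group means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$ and the grand mean $\bar{\bar{X}}$ marked]
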
Within-Group Variation
\[ SSW = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2 \qquad MSW = \frac{SSW}{n - c} \]
$\bar{X}_j$: The sample mean of group j
$X_{ij}$: The i-th observation in group j
Summing the variation within each group and then adding over all groups.

Within-Group Variation (continued)
\[ SSW = (X_{11} - \bar{X}_1)^2 + (X_{21} - \bar{X}_1)^2 + \cdots + (X_{n_c c} - \bar{X}_c)^2 \]
[Figure: responses X for Group 1, Group 2, and Group 3, with the group means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$ marked]

Within-Group Variation (continued)
\[ MSW = \frac{SSW}{n - c} = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2 + \cdots + (n_c - 1) S_c^2}{(n_1 - 1) + (n_2 - 1) + \cdots + (n_c - 1)} \]
For c = 2, this is the pooled variance in the t-Test.
•If more than two groups, use F Test.
•For two groups, use t-Test. F Test more limited.

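A small sketch can confirm that the two expressions for MSW agree. The data below are hypothetical values chosen only for illustration (they are not from the chapter), and `msw_two_ways` is a helper name introduced here.

```python
from statistics import mean, variance

def msw_two_ways(groups):
    """Compute MSW directly from SSW and as a pooled sample variance."""
    n = sum(len(g) for g in groups)   # total number of observations
    c = len(groups)                   # number of groups
    ssw = 0.0
    for g in groups:
        g_mean = mean(g)
        ssw += sum((x - g_mean) ** 2 for x in g)
    msw_direct = ssw / (n - c)
    # Pooled variance: weight each sample variance S_j^2 by (n_j - 1)
    pooled = (sum((len(g) - 1) * variance(g) for g in groups)
              / sum(len(g) - 1 for g in groups))
    return msw_direct, pooled

# Hypothetical data: two groups of unequal size
groups = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0, 8.0]]
msw_direct, pooled = msw_two_ways(groups)
print(msw_direct, pooled)
```

Both values should coincide up to floating-point rounding, since $(n_j - 1) S_j^2$ is exactly the within-group sum of squares for group j.
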
One-Factor ANOVA
F Test Statistic

Test statistic
\[ F = \frac{MSA}{MSW} \]
    MSA is mean squares among or between variances
    MSW is mean squares within or error variances
Degrees of freedom
    $df_1 = c - 1$
    $df_2 = n - c$

One-Factor ANOVA
Summary Table
Source of Variation   Degrees of Freedom   Sum of Squares    Mean Squares (Variance)   F Statistic
Among (Factor)        c – 1                SSA               MSA = SSA/(c – 1)         MSA/MSW
Within (Error)        n – c                SSW               MSW = SSW/(n – c)
Total                 n – 1                SST = SSA + SSW

Features of One-Factor ANOVA
F Statistic

The F Statistic is the ratio of the among estimate of variance and the within estimate of variance
    The ratio can never be negative (both estimates are variances)
    Df1 = c - 1 will typically be small
    Df2 = n - c will typically be large
The ratio should be close to 1 if the null is true

Features of One-Factor ANOVA
F Statistic
(continued)


If the null is false, the numerator (MSA) is expected to be greater than the denominator (MSW)
The ratio will then be larger than 1

One-Factor ANOVA F Test
Example
As production manager, you want to see if three filling machines have different mean filling times. You assign 15 similarly trained and experienced workers, five per machine, to the machines. At the .05 significance level, is there a difference in mean filling times?

Machine1   Machine2   Machine3
25.40      23.40      20.00
26.31      21.80      22.20
24.10      23.50      19.75
23.74      22.75      20.60
25.10      21.60      20.40

One-Factor ANOVA Example:
Scatter Diagram
[Scatter diagram: filling times (vertical axis from 19 to 27) plotted for Machine1, Machine2, and Machine3, with the group means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$ and the grand mean $\bar{\bar{X}}$ marked]
$\bar{X}_1 = 24.93$
$\bar{X}_2 = 22.61$
$\bar{X}_3 = 20.59$
$\bar{\bar{X}} = 22.71$

One-Factor ANOVA Example
Computations
$\bar{X}_1 = 24.93$, $\bar{X}_2 = 22.61$, $\bar{X}_3 = 20.59$, $\bar{\bar{X}} = 22.71$
$n_j = 5$, $c = 3$, $n = 15$
\[ SSA = 5 \left[ (24.93 - 22.71)^2 + (22.61 - 22.71)^2 + (20.59 - 22.71)^2 \right] = 47.164 \]
\[ SSW = 4.2592 + 3.112 + 3.682 = 11.0532 \]
\[ MSA = SSA/(c - 1) = 47.164/2 = 23.5820 \]
\[ MSW = SSW/(n - c) = 11.0532/12 = .9211 \]

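The computations for the filling-machine example can be reproduced with a minimal sketch using only the Python standard library. The function name `one_way_anova` and the dictionary keys are choices made here for illustration; the data are the 15 filling times from the example.

```python
from statistics import mean

def one_way_anova(groups):
    """Return the sums of squares, mean squares, and F for a one-factor design."""
    all_values = [x for g in groups for x in g]
    n, c = len(all_values), len(groups)
    grand_mean = mean(all_values)
    # Among-group variation: group size times squared distance of group mean from grand mean
    ssa = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variation: squared distance of each value from its own group mean
    ssw = 0.0
    for g in groups:
        g_mean = mean(g)
        ssw += sum((x - g_mean) ** 2 for x in g)
    sst = sum((x - grand_mean) ** 2 for x in all_values)
    msa = ssa / (c - 1)
    msw = ssw / (n - c)
    return {"SSA": ssa, "SSW": ssw, "SST": sst, "MSA": msa, "MSW": msw, "F": msa / msw}

machines = [
    [25.40, 26.31, 24.10, 23.74, 25.10],  # Machine1
    [23.40, 21.80, 23.50, 22.75, 21.60],  # Machine2
    [20.00, 22.20, 19.75, 20.60, 20.40],  # Machine3
]
anova = one_way_anova(machines)
for name, value in anova.items():
    print(f"{name} = {value:.4f}")
```

The printed values should match the slide figures up to rounding, and SST should equal SSA + SSW.
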
Summary Table
Source of Variation   Degrees of Freedom   Sum of Squares   Mean Squares (Variance)   F Statistic
Among (Factor)        3 - 1 = 2            47.1640          23.5820                   MSA/MSW = 25.60
Within (Error)        15 - 3 = 12          11.0532          .9211
Total                 15 - 1 = 14          58.2172

One-Factor ANOVA Example
Solution
$H_0: \mu_1 = \mu_2 = \mu_3$
$H_1$: Not All Equal
$\alpha = .05$
$df_1 = 2$, $df_2 = 12$
Critical Value(s): $F_U = 3.89$ (upper tail, $\alpha = 0.05$)
Test Statistic:
\[ F = \frac{MSA}{MSW} = \frac{23.5820}{.9211} = 25.6 \]
Decision: Reject at $\alpha = 0.05$
Conclusion: There is evidence that at least one $\mu_i$ differs from the rest.

Solution In EXCEL


Use Tools | Data Analysis | Anova: Single Factor
EXCEL worksheet that performs the one-factor ANOVA of the example

The Tukey-Kramer Procedure

Tells which population means are significantly different
    e.g.: $\mu_1 = \mu_2 \ne \mu_3$
Post hoc (a posteriori) procedure
    Done after rejection of equal means in ANOVA
Ability for pair-wise comparisons
    Compare absolute mean differences with critical range
[Figure: densities f(X) with $\mu_1 = \mu_2$ and a separate $\mu_3$; two groups whose means may be significantly different]

The Tukey-Kramer Procedure:
Example
Machine1   Machine2   Machine3
25.40      23.40      20.00
26.31      21.80      22.20
24.10      23.50      19.75
23.74      22.75      20.60
25.10      21.60      20.40

1. Compute absolute mean differences:
\[ |\bar{X}_1 - \bar{X}_2| = |24.93 - 22.61| = 2.32 \]
\[ |\bar{X}_1 - \bar{X}_3| = |24.93 - 20.59| = 4.34 \]
\[ |\bar{X}_2 - \bar{X}_3| = |22.61 - 20.59| = 2.02 \]
2. Compute Critical Range:
\[ \text{Critical Range} = Q_{U(c,\, n-c)} \sqrt{\frac{MSW}{2} \left( \frac{1}{n_j} + \frac{1}{n_{j'}} \right)} = 1.618 \]
3. All of the absolute mean differences are greater than the critical range. There is a significant difference between each pair of means at the 5% level of significance.

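The three steps above can be sketched as follows. The slide gives only the resulting critical range (1.618); the value `Q_U = 3.77` used below is an assumption taken from a standard studentized range table for $\alpha = .05$ with 3 and 12 degrees of freedom, and the helper name `critical_range` is hypothetical.

```python
from itertools import combinations
from math import sqrt

# Assumption: studentized range table value for alpha = .05, c = 3, n - c = 12
# (not stated in the slides, which report only the final critical range)
Q_U = 3.77
MSW = 0.9211  # from the one-factor ANOVA summary table

means = {"Machine1": 24.93, "Machine2": 22.61, "Machine3": 20.59}
sizes = {"Machine1": 5, "Machine2": 5, "Machine3": 5}

def critical_range(q_u, msw, n_a, n_b):
    """Tukey-Kramer critical range for one pair of groups."""
    return q_u * sqrt(msw / 2 * (1 / n_a + 1 / n_b))

results = {}
for a, b in combinations(means, 2):
    diff = abs(means[a] - means[b])
    cr = critical_range(Q_U, MSW, sizes[a], sizes[b])
    results[(a, b)] = (diff, cr, diff > cr)
    print(f"{a} vs {b}: |diff| = {diff:.2f}, critical range = {cr:.3f}, significant = {diff > cr}")
```

Because all groups have five observations, the critical range is the same for every pair; with unequal group sizes it would be recomputed per pair, which is the reason for passing both sizes to the helper.
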
Solution in PHStat


Use PHStat | c-sample tests | Tukey-Kramer procedure …
EXCEL worksheet that performs the Tukey-Kramer procedure for the previous example

Two-Way ANOVA

Examines the effect of
    Two factors on the dependent variable
        e.g.: Percent carbonation and line speed on soft drink bottling process
    Interaction between the different levels of these two factors
        e.g.: Does the effect of one particular percentage of carbonation depend on which level the line speed is set?

Two-Way ANOVA

(continued)
Assumptions
    Normality
        Populations are normally distributed
    Homogeneity of Variance
        Populations have equal variances
    Independence of Errors
        Independent random samples are drawn

Two-Way ANOVA
Total Variation Partitioning
Total Variation SST (d.f. = n - 1)
=
Variation Due to Treatment A: SSA (d.f. = r - 1)
+ Variation Due to Treatment B: SSB (d.f. = c - 1)
+ Variation Due to Interaction: SSAB (d.f. = (r - 1)(c - 1))
+ Variation Due to Random Sampling: SSE (d.f. = rc(n' - 1))

Two-Way ANOVA
Total Variation Partitioning
$r$ = the number of levels of factor A
$c$ = the number of levels of factor B
$n'$ = the number of values (replications) for each cell
$n$ = the total number of observations in the experiment
$X_{ijk}$ = the value of the k-th observation for level i of factor A and level j of factor B

Total Variation
\[ SST = \sum_{i=1}^{r} \sum_{j=1}^{c} \sum_{k=1}^{n'} (X_{ijk} - \bar{\bar{X}})^2 \]
Sum of Squares Total = total variation among all observations around the grand mean
\[ \bar{\bar{X}} = \frac{\sum_{i=1}^{r} \sum_{j=1}^{c} \sum_{k=1}^{n'} X_{ijk}}{rcn'} = \frac{\sum_{i=1}^{r} \sum_{j=1}^{c} \sum_{k=1}^{n'} X_{ijk}}{n} \]
$\bar{\bar{X}}$ = the overall or grand mean

Factor A Variation
\[ SSA = c n' \sum_{i=1}^{r} (\bar{X}_{i..} - \bar{\bar{X}})^2 \]
Sum of Squares Due to Factor A = the difference among the various levels of factor A and the grand mean

Factor B Variation
\[ SSB = r n' \sum_{j=1}^{c} (\bar{X}_{.j.} - \bar{\bar{X}})^2 \]
Sum of Squares Due to Factor B = the difference among the various levels of factor B and the grand mean

Interaction Variation
\[ SSAB = n' \sum_{i=1}^{r} \sum_{j=1}^{c} (\bar{X}_{ij.} - \bar{X}_{i..} - \bar{X}_{.j.} + \bar{\bar{X}})^2 \]
Sum of Squares Due to Interaction between A and B = the effect of the combinations of factor A and factor B

Random Error
\[ SSE = \sum_{i=1}^{r} \sum_{j=1}^{c} \sum_{k=1}^{n'} (X_{ijk} - \bar{X}_{ij.})^2 \]
Sum of Squares Error = the differences among the observations within each cell and the corresponding cell means

Two-Way ANOVA:
The F Test Statistic
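F Test for Factor A Main Effect
    $H_0: \mu_{1.} = \mu_{2.} = \cdots = \mu_{r.}$
    $H_1$: Not all $\mu_{i.}$ are equal
    \[ F = \frac{MSA}{MSE}, \qquad MSA = \frac{SSA}{r - 1} \]
    Reject if $F > F_U$
F Test for Factor B Main Effect
    $H_0: \mu_{.1} = \mu_{.2} = \cdots = \mu_{.c}$
    $H_1$: Not all $\mu_{.j}$ are equal
    \[ F = \frac{MSB}{MSE}, \qquad MSB = \frac{SSB}{c - 1} \]
    Reject if $F > F_U$
F Test for Interaction Effect
    $H_0$: the interaction effect of level i of factor A with level j of factor B is 0 (for all i and j)
    $H_1$: at least one interaction effect is not 0
    \[ F = \frac{MSAB}{MSE}, \qquad MSAB = \frac{SSAB}{(r - 1)(c - 1)} \]
    Reject if $F > F_U$
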
Two-Way ANOVA
Summary Table
Source of Variation   Degrees of Freedom   Sum of Squares   Mean Squares                     F Statistic
Factor A (Row)        r – 1                SSA              MSA = SSA/(r – 1)                MSA/MSE
Factor B (Column)     c – 1                SSB              MSB = SSB/(c – 1)                MSB/MSE
AB (Interaction)      (r – 1)(c – 1)       SSAB             MSAB = SSAB/[(r – 1)(c – 1)]     MSAB/MSE
Error                 rc(n' – 1)           SSE              MSE = SSE/[rc(n' – 1)]
Total                 rcn' – 1             SST

Features of Two-Way ANOVA
F Test

Degrees of freedom always add up
    rcn' - 1 = rc(n' - 1) + (c - 1) + (r - 1) + (c - 1)(r - 1)
    Total = error + column + row + interaction
The denominator of the F Test is always the same but the numerator is different.
The sums of squares always add up
    Total = error + column + row + interaction

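The additivity of the sums of squares and degrees of freedom can be illustrated with a short sketch for a balanced design (equal replications per cell). The data are hypothetical values made up for illustration, not from the chapter, and `two_way_anova` is a helper name introduced here; `cells[i][j]` holds the n' replications for level i of factor A and level j of factor B.

```python
from statistics import mean

def two_way_anova(cells):
    """Sums of squares, degrees of freedom, and F ratios for a balanced two-factor design."""
    r, c, n_rep = len(cells), len(cells[0]), len(cells[0][0])
    all_values = [x for row in cells for cell in row for x in cell]
    grand = mean(all_values)
    row_means = [mean([x for cell in row for x in cell]) for row in cells]
    col_means = [mean([x for row in cells for x in row[j]]) for j in range(c)]
    cell_means = [[mean(cell) for cell in row] for row in cells]

    ss = {
        "A": c * n_rep * sum((m - grand) ** 2 for m in row_means),
        "B": r * n_rep * sum((m - grand) ** 2 for m in col_means),
        "AB": n_rep * sum(
            (cell_means[i][j] - row_means[i] - col_means[j] + grand) ** 2
            for i in range(r) for j in range(c)
        ),
        "E": sum(
            (x - cell_means[i][j]) ** 2
            for i in range(r) for j in range(c) for x in cells[i][j]
        ),
        "T": sum((x - grand) ** 2 for x in all_values),
    }
    df = {"A": r - 1, "B": c - 1, "AB": (r - 1) * (c - 1),
          "E": r * c * (n_rep - 1), "T": r * c * n_rep - 1}
    mse = ss["E"] / df["E"]
    # Every F ratio shares the same denominator, MSE
    f = {k: (ss[k] / df[k]) / mse for k in ("A", "B", "AB")}
    return ss, df, f

# Hypothetical data: r = 2 levels of A, c = 3 levels of B, n' = 2 replications per cell
cells = [
    [[4.0, 6.0], [7.0, 9.0], [10.0, 12.0]],
    [[5.0, 7.0], [6.0, 8.0], [15.0, 17.0]],
]
ss, df, f = two_way_anova(cells)
print(ss)
print(df)
print(f)
```

The sketch assumes every cell has the same number of replications; with unequal cell sizes the simple formulas above no longer partition SST exactly.
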
Kruskal-Wallis Rank Test
for c Medians

Extension of Wilcoxon rank-sum test
    Tests the equality of more than 2 (c) population medians
Distribution-free test procedure
Used to analyze completely randomized experimental designs
Use $\chi^2$ distribution to approximate if each sample group size $n_j > 5$
    df = c – 1

Kruskal-Wallis Rank Test

Assumptions
    Independent random samples are drawn
    Continuous dependent variable
    Data may be ranked both within and among samples
    Populations have same variability
    Populations have same shape
Robust with regard to last two conditions
    Use F test in completely randomized designs and when the more stringent assumptions hold

Kruskal-Wallis Rank Test
Procedure

Obtain ranks
    In event of tie, each of the tied values gets the average of the tied ranks
Add the ranks for data from each of the c groups to obtain the rank sums $T_j$
    Square to obtain $T_j^2$
\[ H = \left[ \frac{12}{n(n+1)} \sum_{j=1}^{c} \frac{T_j^2}{n_j} \right] - 3(n+1) \]
\[ n = n_1 + n_2 + \cdots + n_c \]

Kruskal-Wallis Rank Test
Procedure
(continued)

Compute test statistic
\[ H = \left[ \frac{12}{n(n+1)} \sum_{j=1}^{c} \frac{T_j^2}{n_j} \right] - 3(n+1) \]
\[ n = n_1 + n_2 + \cdots + n_c \]
    $n_j$ = Number of observations in j-th sample
    H may be approximated by chi-square distribution with df = c – 1 when each $n_j > 5$

Kruskal-Wallis Rank Test
Procedure
(continued)

Critical value for a given $\alpha$
    Upper tail $\chi_U^2$
Decision rule
    Reject $H_0: M_1 = M_2 = \cdots = M_c$ if test statistic $H > \chi_U^2$
    Otherwise do not reject $H_0$

Kruskal-Wallis Rank Test:
Example
As production manager, you want to see if three filling machines have different median filling times. You assign 15 similarly trained & experienced workers, five per machine, to the machines. At the .05 significance level, is there a difference in median filling times?

Machine1   Machine2   Machine3
25.40      23.40      20.00
26.31      21.80      22.20
24.10      23.50      19.75
23.74      22.75      20.60
25.10      21.60      20.40

Example Solution: Step 1
Obtaining a Ranking
Raw Data
Machine1   Machine2   Machine3
25.40      23.40      20.00
26.31      21.80      22.20
24.10      23.50      19.75
23.74      22.75      20.60
25.10      21.60      20.40

Ranks
Machine1   Machine2   Machine3
14         9          2
15         6          7
12         10         1
11         8          4
13         5          3
Rank sums:
65         38         17

Example Solution: Step 2
Test Statistic Computation
\[ H = \left[ \frac{12}{n(n+1)} \sum_{j=1}^{c} \frac{T_j^2}{n_j} \right] - 3(n+1) \]
\[ = \left[ \frac{12}{15(15+1)} \left( \frac{65^2}{5} + \frac{38^2}{5} + \frac{17^2}{5} \right) \right] - 3(15+1) \]
\[ = 11.58 \]

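The ranking step and the H computation can be reproduced with a minimal standard-library sketch. The function names `average_ranks` and `kruskal_wallis_h` are introduced here for illustration; the statistic follows the formula on the slide and applies no additional tie correction (the example data contain no ties).

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # average of the 1-based ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks

def kruskal_wallis_h(groups):
    """Return (rank sums T_j, H statistic) for a list of sample groups."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    n = len(pooled)
    rank_sums, start = [], 0
    for g in groups:
        rank_sums.append(sum(ranks[start:start + len(g)]))
        start += len(g)
    total = sum(t ** 2 / len(g) for t, g in zip(rank_sums, groups))
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    return rank_sums, h

machines = [
    [25.40, 26.31, 24.10, 23.74, 25.10],  # Machine1
    [23.40, 21.80, 23.50, 22.75, 21.60],  # Machine2
    [20.00, 22.20, 19.75, 20.60, 20.40],  # Machine3
]
rank_sums, h = kruskal_wallis_h(machines)
print(rank_sums, round(h, 2))
```

The rank sums should come out as 65, 38, and 17, and H as 11.58, matching the slide.
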
Kruskal-Wallis Test Example
Solution
$H_0: M_1 = M_2 = M_3$
$H_1$: Not all equal
$\alpha = .05$
df = c - 1 = 3 - 1 = 2
Critical Value(s): $\chi_U^2 = 5.991$ (upper tail, $\alpha = .05$)
Test Statistic: H = 11.58
Decision: Reject at $\alpha = .05$
Conclusion: There is evidence that population medians are not all equal.

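For this example the critical value and an approximate p-value can be checked without tables. The sketch below relies on the fact that a chi-square distribution with exactly 2 degrees of freedom has upper-tail probability $e^{-x/2}$; it does not generalize to other degrees of freedom, and the helper names are hypothetical.

```python
from math import exp, log

def chi2_upper_tail_df2(x):
    """P(chi-square with 2 df > x); valid only for df = 2."""
    return exp(-x / 2)

def chi2_critical_df2(alpha):
    """Upper-tail critical value for a chi-square with 2 df; valid only for df = 2."""
    return -2 * log(alpha)

H = 11.58       # test statistic from the example
ALPHA = 0.05

critical = chi2_critical_df2(ALPHA)
p_value = chi2_upper_tail_df2(H)
reject = H > critical
print(round(critical, 3), round(p_value, 4), reject)
```

The critical value agrees with the 5.991 on the slide, and the p-value is well below .05, consistent with the decision to reject.
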
Kruskal-Wallis Test in PHStat


PHStat | c-sample tests | Kruskal-Wallis rank sum test …
Example solution in Excel spreadsheet

Chapter Summary

Described the completely randomized design: one-factor analysis of variance
    ANOVA assumptions
    F test for difference in c means
    The Tukey-Kramer procedure
Described the factorial design: two-way analysis of variance
    Examine effects of factors and interaction
Discussed Kruskal-Wallis rank test for differences in c medians