Basic Business Statistics, 10/e


Basic Business Statistics
10th Edition
Chapter 11
Analysis of Variance
Basic Business Statistics, 10e © 2006 Prentice-Hall, Inc.
Learning Objectives
In this chapter, you learn:

The basic concepts of experimental design

How to use one-way analysis of variance to test
for differences among the means of several
populations (also referred to as “groups” in this
chapter)

When to use a randomized block design

How to use two-way analysis of variance and the
concept of interaction
Chapter Overview
Analysis of Variance (ANOVA)
  One-Way ANOVA
    F-test
    Tukey-Kramer test
  Randomized Block Design
    Multiple Comparisons
  Two-Way ANOVA
    Interaction Effects
General ANOVA Setting

Investigator controls one or more independent variables
  Called factors (or treatment variables)
  Each factor contains two or more levels (or groups or categories/classifications)

Observe effects on the dependent variable
  Response to levels of independent variable

Experimental design: the plan used to collect the data
Completely Randomized Design

Experimental units (subjects) are assigned randomly to treatments
  Subjects are assumed homogeneous

Only one factor or independent variable
  With two or more treatment levels

Analyzed by one-way analysis of variance (ANOVA)
One-Way Analysis of Variance

Evaluate the difference among the means of three
or more groups
Examples: Accident rates for 1st, 2nd, and 3rd shift
Expected mileage for five brands of tires

Assumptions
  Populations are normally distributed
  Populations have equal variances
  Samples are randomly and independently drawn
Hypotheses of One-Way ANOVA


H0: μ1 = μ2 = μ3 = … = μc

All population means are equal

i.e., no treatment effect (no variation in means among
groups)
H1 : Not all of the population means are the same

At least one population mean is different

i.e., there is a treatment effect

Does not mean that all population means are different
(some pairs may be the same)
One-Factor ANOVA
H0: μ1 = μ2 = μ3 = … = μc
H1: Not all μj are the same
All means are the same:
The null hypothesis is true
(No treatment effect)
μ1 = μ2 = μ3
One-Factor ANOVA
(continued)
H0: μ1 = μ2 = μ3 = … = μc
H1: Not all μj are the same
At least one mean is different:
The null hypothesis is NOT true
(Treatment effect is present)
μ1 = μ2 ≠ μ3   or   μ1 ≠ μ2 ≠ μ3
Partitioning the Variation

Total variation can be split into two parts:
SST = SSA + SSW
SST = Total Sum of Squares
(Total variation)
SSA = Sum of Squares Among Groups
(Among-group variation)
SSW = Sum of Squares Within Groups
(Within-group variation)
Partitioning the Variation
(continued)
SST = SSA + SSW
Total Variation = the aggregate dispersion of the individual
data values across the various factor levels (SST)
Among-Group Variation = dispersion between the factor
sample means (SSA)
Within-Group Variation = dispersion that exists among
the data values within a particular factor level (SSW)
Partition of Total Variation
Total Variation (SST), d.f. = n – 1
  = Variation Due to Factor (SSA), d.f. = c – 1
  + Variation Due to Random Sampling (SSW), d.f. = n – c

SSA is commonly referred to as:
  Sum of Squares Between
  Sum of Squares Among
  Sum of Squares Explained
  Among Groups Variation

SSW is commonly referred to as:
  Sum of Squares Within
  Sum of Squares Error
  Sum of Squares Unexplained
  Within-Group Variation
Total Sum of Squares
SST = SSA + SSW
$$SST = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2$$
Where:
SST = Total sum of squares
c = number of groups (levels or treatments)
$n_j$ = number of observations in group j
$X_{ij}$ = ith observation from group j
$\bar{X}$ = grand mean (mean of all data values)
Total Variation
(continued)
$$SST = (X_{11} - \bar{X})^2 + (X_{12} - \bar{X})^2 + \dots + (X_{n_c c} - \bar{X})^2$$
[Figure: response X for Group 1, Group 2, and Group 3, with the grand mean $\bar{X}$ marked]
Among-Group Variation
SST = SSA + SSW
$$SSA = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{X})^2$$
Where:
SSA = Sum of squares among groups
c = number of groups
$n_j$ = sample size from group j
$\bar{X}_j$ = sample mean from group j
$\bar{X}$ = grand mean (mean of all data values)
Among-Group Variation
(continued)
$$SSA = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{X})^2$$
Variation due to differences among groups
$$MSA = \frac{SSA}{c - 1}$$
Mean Square Among = SSA/degrees of freedom
Among-Group Variation
(continued)
$$SSA = n_1(\bar{X}_1 - \bar{X})^2 + n_2(\bar{X}_2 - \bar{X})^2 + \dots + n_c(\bar{X}_c - \bar{X})^2$$
[Figure: response X for Group 1, Group 2, and Group 3, with the group means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$ and the grand mean $\bar{X}$ marked]
Within-Group Variation
SST = SSA + SSW
$$SSW = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$$
Where:
SSW = Sum of squares within groups
c = number of groups
$n_j$ = sample size from group j
$\bar{X}_j$ = sample mean from group j
$X_{ij}$ = ith observation in group j
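The three sums of squares defined on the preceding slides can be checked numerically. The sketch below is illustrative only: the function name `sums_of_squares` and the data are hypothetical and do not come from the textbook. It computes SST, SSA, and SSW directly from their definitions and confirms that SST = SSA + SSW.

```python
from statistics import mean

def sums_of_squares(groups):
    """Return (SST, SSA, SSW) for a list of groups (each a list of numbers)."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # SST: squared deviation of every observation from the grand mean
    sst = sum((x - grand_mean) ** 2 for x in all_values)
    # SSA: squared deviation of each group mean from the grand mean, weighted by n_j
    ssa = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # SSW: squared deviation of each observation from its own group mean
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return sst, ssa, ssw

# Hypothetical data: three groups of unequal size
groups = [[4, 6, 8], [10, 12], [1, 2, 3, 6]]
sst, ssa, ssw = sums_of_squares(groups)
print(f"SST={sst:.4f}  SSA={ssa:.4f}  SSW={ssw:.4f}")
print("Partition holds:", abs(sst - (ssa + ssw)) < 1e-9)
```

The groups are deliberately of unequal size to show that the weighting by n_j in SSA is what makes the partition hold.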
Within-Group Variation
(continued)
$$SSW = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$$
Summing the variation within each group and then adding over all groups
$$MSW = \frac{SSW}{n - c}$$
Mean Square Within = SSW/degrees of freedom
Within-Group Variation
(continued)
$$SSW = (X_{11} - \bar{X}_1)^2 + (X_{12} - \bar{X}_2)^2 + \dots + (X_{n_c c} - \bar{X}_c)^2$$
[Figure: response X for Group 1, Group 2, and Group 3, with the group means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$ marked]
Obtaining the Mean Squares
$$MSA = \frac{SSA}{c - 1}$$
$$MSW = \frac{SSW}{n - c}$$
$$MST = \frac{SST}{n - 1}$$
One-Way ANOVA Table
Source of Variation   SS                df      MS (Variance)         F ratio
Among Groups          SSA               c - 1   MSA = SSA/(c - 1)     F = MSA/MSW
Within Groups         SSW               n - c   MSW = SSW/(n - c)
Total                 SST = SSA + SSW   n - 1

c = number of groups
n = sum of the sample sizes from all groups
df = degrees of freedom
One-Way ANOVA
F Test Statistic
H0: μ1= μ2 = … = μc
H1: At least two population means are different

Test statistic
$$F = \frac{MSA}{MSW}$$
MSA is the mean square among groups
MSW is the mean square within groups

Degrees of freedom
df1 = c – 1 (c = number of groups)
df2 = n – c (n = sum of sample sizes from all populations)
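As a rough illustration of how the test statistic is assembled from the mean squares, the following sketch computes MSA, MSW, F, and both degrees of freedom. The function name `one_way_anova` and the data are hypothetical, chosen so the arithmetic is easy to follow by hand.

```python
from statistics import mean

def one_way_anova(groups):
    """Return the one-way ANOVA mean squares, F statistic, and degrees of freedom."""
    c = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = mean([x for g in groups for x in g])
    ssa = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)
    msa = ssa / (c - 1)                  # mean square among groups
    msw = ssw / (n - c)                  # mean square within groups
    return {"MSA": msa, "MSW": msw, "F": msa / msw, "df1": c - 1, "df2": n - c}

# Hypothetical data: group means 2, 3, and 6
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
result = one_way_anova(groups)
print(result)
```

The resulting F value would then be compared with the upper-tail critical value FU for df1 and df2 at the chosen significance level.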
Interpreting One-Way ANOVA
F Statistic

The F statistic is the ratio of the among estimate of variance and the within estimate of variance
  The ratio must always be positive
  df1 = c – 1 will typically be small
  df2 = n – c will typically be large

Decision Rule:
  Reject H0 if F > FU, otherwise do not reject H0

[Figure: F distribution starting at 0, with upper-tail area α = .05 to the right of FU; "Do not reject H0" region below FU, "Reject H0" region above FU]
One-Way ANOVA
F Test Example
You want to see if three different golf clubs yield different distances.
You randomly select five measurements from trials on an automated driving machine for each club.
At the 0.05 significance level, is there a difference in mean distance?

Club 1   Club 2   Club 3
254      234      200
263      218      222
241      235      197
237      227      206
251      216      204
One-Way ANOVA Example:
Scatter Diagram
[Figure: scatter diagram of Distance (axis from 190 to 270) by Club (1, 2, 3), marking the sample means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$ and the grand mean $\bar{X}$]
$\bar{X}_1 = 249.2$, $\bar{X}_2 = 226.0$, $\bar{X}_3 = 205.8$, $\bar{X} = 227.0$
One-Way ANOVA Example
Computations
$\bar{X}_1$ = 249.2, $n_1$ = 5
$\bar{X}_2$ = 226.0, $n_2$ = 5
$\bar{X}_3$ = 205.8, $n_3$ = 5
$\bar{X}$ = 227.0, n = 15, c = 3

$$SSA = 5(249.2 - 227)^2 + 5(226 - 227)^2 + 5(205.8 - 227)^2 = 4716.4$$
$$SSW = (254 - 249.2)^2 + (263 - 249.2)^2 + \dots + (204 - 205.8)^2 = 1119.6$$
$$MSA = 4716.4 / (3 - 1) = 2358.2$$
$$MSW = 1119.6 / (15 - 3) = 93.3$$
$$F = \frac{2358.2}{93.3} = 25.275$$
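The golf club computations can be reproduced with a few lines of Python. This is only a sketch for checking the arithmetic; the variable names are illustrative, and the data are the fifteen distances given in the example.

```python
from statistics import mean

# Distances from the golf club example
clubs = {
    "Club 1": [254, 263, 241, 237, 251],
    "Club 2": [234, 218, 235, 227, 216],
    "Club 3": [200, 222, 197, 206, 204],
}

c = len(clubs)                                   # number of groups
n = sum(len(v) for v in clubs.values())          # total sample size
grand_mean = mean([x for v in clubs.values() for x in v])

# Among-group and within-group sums of squares
ssa = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in clubs.values())
ssw = sum((x - mean(v)) ** 2 for v in clubs.values() for x in v)

msa = ssa / (c - 1)
msw = ssw / (n - c)
f_stat = msa / msw

print(f"grand mean = {grand_mean:.1f}")
print(f"SSA = {ssa:.1f}, SSW = {ssw:.1f}")
print(f"MSA = {msa:.1f}, MSW = {msw:.1f}, F = {f_stat:.3f}")
```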
One-Way ANOVA Example
Solution
H0: μ1 = μ2 = μ3
H1: μj not all equal
α = 0.05
df1 = 2, df2 = 12

Critical Value: FU = 3.89

Test Statistic:
$$F = \frac{MSA}{MSW} = \frac{2358.2}{93.3} = 25.275$$

Decision: Reject H0 at α = 0.05

Conclusion: There is evidence that at least one μj differs from the rest

[Figure: F distribution with upper-tail area α = .05 to the right of FU = 3.89; the test statistic F = 25.275 falls in the "Reject H0" region]
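The critical value and p-value for this example can be checked without statistical tables. The sketch below relies on a standard closed form that holds only when the numerator degrees of freedom equal 2, namely P(F > f) = (1 + 2f/df2)^(-df2/2). That formula is not part of the slides, and the function names are hypothetical; for other values of df1 a statistics library or table would be needed.

```python
def f_upper_tail_df1_2(f, df2):
    """Upper-tail probability P(F > f) for an F distribution with df1 = 2."""
    return (1 + 2 * f / df2) ** (-df2 / 2)

def f_critical_df1_2(alpha, df2):
    """Upper-tail critical value FU for df1 = 2 (inverse of the formula above)."""
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)

f_stat = 2358.2 / 93.3          # MSA / MSW from the golf club example
df2 = 12                        # n - c = 15 - 3
alpha = 0.05

f_u = f_critical_df1_2(alpha, df2)
p_value = f_upper_tail_df1_2(f_stat, df2)

print(f"FU = {f_u:.2f}")
print(f"p-value = {p_value:.2e}")
print("Reject H0" if f_stat > f_u else "Do not reject H0")
```

Comparing F with FU and comparing the p-value with α are equivalent ways of reaching the same decision.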
One-Way ANOVA
Excel Output
EXCEL: tools | data analysis | ANOVA: single factor
SUMMARY
Groups   Count   Sum    Average   Variance
Club 1   5       1246   249.2     108.2
Club 2   5       1130   226       77.5
Club 3   5       1029   205.8     94.2

ANOVA
Source of Variation   SS       df   MS       F        P-value    F crit
Between Groups        4716.4   2    2358.2   25.275   4.99E-05   3.89
Within Groups         1119.6   12   93.3
Total                 5836.0   14
The Tukey-Kramer Procedure

Tells which population means are significantly different
  e.g.: μ1 = μ2 ≠ μ3
  Done after rejection of equal means in ANOVA

Allows pair-wise comparisons
  Compare absolute mean differences with critical range

[Figure: distributions along the x axis, with μ1 = μ2 and a separate μ3]
Tukey-Kramer Critical Range
$$\text{Critical Range} = Q_U \sqrt{\frac{MSW}{2} \left( \frac{1}{n_j} + \frac{1}{n_{j'}} \right)}$$
where:
$Q_U$ = Value from Studentized Range Distribution with c and n – c degrees of freedom for the desired level of α (see appendix E.9 table)
MSW = Mean Square Within
$n_j$ and $n_{j'}$ = Sample sizes from groups j and j'
The Tukey-Kramer Procedure:
Example
1. Compute absolute mean differences:
$$|\bar{X}_1 - \bar{X}_2| = |249.2 - 226.0| = 23.2$$
$$|\bar{X}_1 - \bar{X}_3| = |249.2 - 205.8| = 43.4$$
$$|\bar{X}_2 - \bar{X}_3| = |226.0 - 205.8| = 20.2$$
2. Find the QU value from the table in appendix E.10 with c = 3 and (n – c) = (15 – 3) = 12 degrees of freedom for the desired level of α (α = 0.05 used here):
$$Q_U = 3.77$$
The Tukey-Kramer Procedure:
Example
(continued)
3. Compute Critical Range:
$$\text{Critical Range} = Q_U \sqrt{\frac{MSW}{2} \left( \frac{1}{n_j} + \frac{1}{n_{j'}} \right)} = 3.77 \sqrt{\frac{93.3}{2} \left( \frac{1}{5} + \frac{1}{5} \right)} = 16.285$$
4. Compare:
$$|\bar{X}_1 - \bar{X}_2| = 23.2$$
$$|\bar{X}_1 - \bar{X}_3| = 43.4$$
$$|\bar{X}_2 - \bar{X}_3| = 20.2$$
5. All of the absolute mean differences are greater than the critical range. Therefore there is a significant difference between each pair of means at the 5% level of significance.
Thus, with 95% confidence we can conclude that the mean distance for club 1 is greater than for clubs 2 and 3, and that the mean distance for club 2 is greater than for club 3.
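The pairwise comparison steps above can be sketched in code. The helper name `tukey_kramer_range` is hypothetical; QU = 3.77 and MSW = 93.3 are taken from the example rather than computed, since QU comes from the Studentized range table.

```python
from itertools import combinations
from math import sqrt

def tukey_kramer_range(q_u, msw, n_j, n_j_prime):
    """Critical range for comparing two group means with sample sizes n_j and n_j'."""
    return q_u * sqrt((msw / 2) * (1 / n_j + 1 / n_j_prime))

means = {"Club 1": 249.2, "Club 2": 226.0, "Club 3": 205.8}
sizes = {"Club 1": 5, "Club 2": 5, "Club 3": 5}
Q_U = 3.77    # table value for c = 3, n - c = 12, alpha = 0.05 (from the example)
MSW = 93.3    # mean square within from the one-way ANOVA

results = {}
for a, b in combinations(means, 2):
    critical_range = tukey_kramer_range(Q_U, MSW, sizes[a], sizes[b])
    difference = abs(means[a] - means[b])
    results[(a, b)] = (difference, critical_range, difference > critical_range)
    print(f"{a} vs {b}: |diff| = {difference:.1f}, "
          f"critical range = {critical_range:.3f}, "
          f"significant = {difference > critical_range}")
```

Because all three sample sizes are equal here, every pair shares the same critical range; with unequal sizes the range would differ by pair.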
The Randomized Block Design

Like One-Way ANOVA, we test for equal
population means (for different factor levels, for
example)...

...but we want to control for possible variation
from a second factor (with two or more levels)

Levels of the secondary factor are called blocks
Partitioning the Variation

Total variation can now be split into three parts:
SST = SSA + SSBL + SSE
SST = Total variation
SSA = Among-Group variation
SSBL = Among-Block variation
SSE = Random variation
Sum of Squares for Blocking
SST = SSA + SSBL + SSE
$$SSBL = c \sum_{i=1}^{r} (\bar{X}_{i.} - \bar{X})^2$$
Where:
c = number of groups
r = number of blocks
$\bar{X}_{i.}$ = mean of all values in block i
$\bar{X}$ = grand mean (mean of all data values)
Partitioning the Variation

Total variation can now be split into three parts:
SST = SSA + SSBL + SSE
SST and SSA are
computed as they were
in One-Way ANOVA
SSE = SST – (SSA + SSBL)
Mean Squares
$$MSBL = \text{Mean square blocking} = \frac{SSBL}{r - 1}$$
$$MSA = \text{Mean square among groups} = \frac{SSA}{c - 1}$$
$$MSE = \text{Mean square error} = \frac{SSE}{(r - 1)(c - 1)}$$
Randomized Block ANOVA Table
Source of Variation   SS     df               MS     F ratio
Among Treatments      SSA    c - 1            MSA    MSA/MSE
Among Blocks          SSBL   r - 1            MSBL   MSBL/MSE
Error                 SSE    (r – 1)(c - 1)   MSE
Total                 SST    rc - 1

c = number of populations
r = number of blocks
rc = sum of the sample sizes from all populations
df = degrees of freedom
Blocking Test
H0: μ1. = μ2. = μ3. = ...
H1: Not all block means are equal
$$F = \frac{MSBL}{MSE}$$
Blocking test:
df1 = r – 1
df2 = (r – 1)(c – 1)
Reject H0 if F > FU
Main Factor Test
H0: μ.1 = μ.2 = μ.3 = ... = μ.c
H1: Not all population means are equal
$$F = \frac{MSA}{MSE}$$
Main Factor test:
df1 = c – 1
df2 = (r – 1)(c – 1)
Reject H0 if F > FU
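A minimal sketch of the randomized block calculations follows, assuming one observation per block and group combination. The function name `randomized_block_anova` and the 4 x 3 data table are hypothetical; the slides give no numerical example for this design.

```python
from statistics import mean

def randomized_block_anova(table):
    """table[i][j] is the observation for block i (row) and group j (column)."""
    r, c = len(table), len(table[0])
    grand = mean([x for row in table for x in row])
    block_means = [mean(row) for row in table]
    group_means = [mean(col) for col in zip(*table)]

    sst = sum((x - grand) ** 2 for row in table for x in row)
    ssa = r * sum((m - grand) ** 2 for m in group_means)    # each group has r values
    ssbl = c * sum((m - grand) ** 2 for m in block_means)   # each block has c values
    sse = sst - (ssa + ssbl)

    msa = ssa / (c - 1)
    msbl = ssbl / (r - 1)
    mse = sse / ((r - 1) * (c - 1))
    return {"SST": sst, "SSA": ssa, "SSBL": ssbl, "SSE": sse,
            "F_main": msa / mse, "F_block": msbl / mse,
            "df_main": (c - 1, (r - 1) * (c - 1)),
            "df_block": (r - 1, (r - 1) * (c - 1))}

# Hypothetical data: 4 blocks (rows) by 3 groups (columns)
table = [
    [10, 12, 15],
    [8, 11, 12],
    [12, 13, 17],
    [9, 10, 15],
]
rb = randomized_block_anova(table)
for key, value in rb.items():
    print(key, value)
```

F_main is the statistic for the main factor test and F_block is the statistic for the blocking test; both share MSE as the denominator.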
The Tukey Procedure

To test which population means are significantly different
  e.g.: μ1 = μ2 ≠ μ3
  Done after rejection of equal means in randomized block ANOVA design

Allows pair-wise comparisons
  Compare absolute mean differences with critical range

[Figure: distributions along the x axis, with μ1 = μ2 and a separate μ3]
The Tukey Procedure
(continued)
$$\text{Critical Range} = Q_U \sqrt{\frac{MSE}{r}}$$
Compare:
Is $|\bar{X}_{.j} - \bar{X}_{.j'}|$ > Critical Range?
  $|\bar{X}_{.1} - \bar{X}_{.2}|$
  $|\bar{X}_{.1} - \bar{X}_{.3}|$
  $|\bar{X}_{.2} - \bar{X}_{.3}|$
  etc...
If the absolute mean difference is greater than the critical range then there is a significant difference between that pair of means at the chosen level of significance.
Factorial Design:
Two-Way ANOVA

Examines the effect of

Two factors of interest on the dependent variable
  e.g., Percent carbonation and line speed on soft drink bottling process

Interaction between the different levels of these two factors
  e.g., Does the effect of one particular carbonation level depend on which level the line speed is set?
Two-Way ANOVA
(continued)

Assumptions

Populations are normally distributed

Populations have equal variances

Independent random samples are
drawn
Two-Way ANOVA
Sources of Variation
Two Factors of interest: A and B
r = number of levels of factor A
c = number of levels of factor B
n’ = number of replications for each cell
n = total number of observations in all cells
(n = rcn’)
Xijk = value of the kth observation of level i of
factor A and level j of factor B
Two-Way ANOVA
Sources of Variation
(continued)
SST = SSA + SSB + SSAB + SSE

Source                                                  Degrees of Freedom
SST    Total Variation                                  n – 1
SSA    Factor A Variation                               r – 1
SSB    Factor B Variation                               c – 1
SSAB   Variation due to interaction between A and B     (r – 1)(c – 1)
SSE    Random variation (Error)                         rc(n’ – 1)
Two Factor ANOVA Equations
Total Variation:
$$SST = \sum_{i=1}^{r} \sum_{j=1}^{c} \sum_{k=1}^{n'} (X_{ijk} - \bar{X})^2$$
Factor A Variation:
$$SSA = cn' \sum_{i=1}^{r} (\bar{X}_{i..} - \bar{X})^2$$
Factor B Variation:
$$SSB = rn' \sum_{j=1}^{c} (\bar{X}_{.j.} - \bar{X})^2$$
Two Factor ANOVA Equations
(continued)
Interaction Variation:
$$SSAB = n' \sum_{i=1}^{r} \sum_{j=1}^{c} (\bar{X}_{ij.} - \bar{X}_{i..} - \bar{X}_{.j.} + \bar{X})^2$$
Sum of Squares Error:
$$SSE = \sum_{i=1}^{r} \sum_{j=1}^{c} \sum_{k=1}^{n'} (X_{ijk} - \bar{X}_{ij.})^2$$
Two Factor ANOVA Equations
(continued)
where:
$$\bar{X} = \frac{\sum_{i=1}^{r} \sum_{j=1}^{c} \sum_{k=1}^{n'} X_{ijk}}{rcn'} = \text{Grand Mean}$$
$$\bar{X}_{i..} = \frac{\sum_{j=1}^{c} \sum_{k=1}^{n'} X_{ijk}}{cn'} = \text{Mean of ith level of factor A } (i = 1, 2, \dots, r)$$
$$\bar{X}_{.j.} = \frac{\sum_{i=1}^{r} \sum_{k=1}^{n'} X_{ijk}}{rn'} = \text{Mean of jth level of factor B } (j = 1, 2, \dots, c)$$
$$\bar{X}_{ij.} = \sum_{k=1}^{n'} \frac{X_{ijk}}{n'} = \text{Mean of cell } ij$$
r = number of levels of factor A
c = number of levels of factor B
n’ = number of replications in each cell
Mean Square Calculations
$$MSA = \text{Mean square factor A} = \frac{SSA}{r - 1}$$
$$MSB = \text{Mean square factor B} = \frac{SSB}{c - 1}$$
$$MSAB = \text{Mean square interaction} = \frac{SSAB}{(r - 1)(c - 1)}$$
$$MSE = \text{Mean square error} = \frac{SSE}{rc(n' - 1)}$$
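The two-factor equations can be sketched in code for a balanced design, where every cell has the same number of replications n'. The function name `two_way_anova` and the 2 x 3 data set with two replicates per cell are hypothetical, chosen only to illustrate the arithmetic.

```python
from statistics import mean

def two_way_anova(cells):
    """cells[i][j] holds the n' replicates for level i of factor A and level j of factor B."""
    r, c, n_rep = len(cells), len(cells[0]), len(cells[0][0])
    grand = mean([x for row in cells for cell in row for x in cell])
    a_means = [mean([x for cell in row for x in cell]) for row in cells]
    b_means = [mean([x for row in cells for x in row[j]]) for j in range(c)]
    cell_means = [[mean(cell) for cell in row] for row in cells]

    sst = sum((x - grand) ** 2 for row in cells for cell in row for x in cell)
    ssa = c * n_rep * sum((m - grand) ** 2 for m in a_means)
    ssb = r * n_rep * sum((m - grand) ** 2 for m in b_means)
    ssab = n_rep * sum((cell_means[i][j] - a_means[i] - b_means[j] + grand) ** 2
                       for i in range(r) for j in range(c))
    sse = sum((x - cell_means[i][j]) ** 2
              for i in range(r) for j in range(c) for x in cells[i][j])

    mse = sse / (r * c * (n_rep - 1))
    return {"SST": sst, "SSA": ssa, "SSB": ssb, "SSAB": ssab, "SSE": sse,
            "F_A": (ssa / (r - 1)) / mse,
            "F_B": (ssb / (c - 1)) / mse,
            "F_AB": (ssab / ((r - 1) * (c - 1))) / mse}

# Hypothetical data: r = 2 levels of A, c = 3 levels of B, n' = 2 replicates
cells = [
    [[10, 12], [14, 16], [20, 22]],
    [[11, 13], [13, 15], [15, 17]],
]
tw = two_way_anova(cells)
for key, value in tw.items():
    print(f"{key} = {value:.4f}")
```

All three F statistics share the same denominator, MSE, and the four component sums of squares add up to SST.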
Two-Way ANOVA:
The F Test Statistic
F Test for Factor A Effect
H0: μ1.. = μ2.. = μ3.. = • • •
H1: Not all μi.. are equal
$$F = \frac{MSA}{MSE}$$
Reject H0 if F > FU

F Test for Factor B Effect
H0: μ.1. = μ.2. = μ.3. = • • •
H1: Not all μ.j. are equal
$$F = \frac{MSB}{MSE}$$
Reject H0 if F > FU

F Test for Interaction Effect
H0: the interaction of A and B is equal to zero
H1: interaction of A and B is not zero
$$F = \frac{MSAB}{MSE}$$
Reject H0 if F > FU
Two-Way ANOVA
Summary Table
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Squares                      F Statistic
Factor A              SSA              r – 1                MSA = SSA/(r – 1)                 MSA/MSE
Factor B              SSB              c – 1                MSB = SSB/(c – 1)                 MSB/MSE
AB (Interaction)      SSAB             (r – 1)(c – 1)       MSAB = SSAB/[(r – 1)(c – 1)]      MSAB/MSE
Error                 SSE              rc(n’ – 1)           MSE = SSE/[rc(n’ – 1)]
Total                 SST              n – 1
Features of Two-Way ANOVA
F Test

Degrees of freedom always add up

n-1 = rc(n’-1) + (r-1) + (c-1) + (r-1)(c-1)

Total = error + factor A + factor B + interaction

The denominator of the F Test is always the
same but the numerator is different

The sums of squares always add up

SST = SSE + SSA + SSB + SSAB

Total = error + factor A + factor B + interaction
Examples:
Interaction vs. No Interaction
[Figure: two plots of Mean Response against Factor A Levels, each with one line per Factor B level (Level 1, Level 2, Level 3). No interaction: the lines are parallel. Interaction is present: the lines are not parallel.]
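One way to see the difference between the two cases numerically is to compute the interaction term from the SSAB formula, cell mean minus row mean minus column mean plus grand mean, for each cell of a table of means. The sketch below uses two hypothetical tables of cell means and a hypothetical function name `interaction_terms`: one table has parallel profiles, the other has crossing profiles.

```python
def interaction_terms(cell_means):
    """Return cell mean - row mean - column mean + grand mean for every cell."""
    r, c = len(cell_means), len(cell_means[0])
    grand = sum(sum(row) for row in cell_means) / (r * c)
    row_means = [sum(row) / c for row in cell_means]
    col_means = [sum(cell_means[i][j] for i in range(r)) / r for j in range(c)]
    return [[cell_means[i][j] - row_means[i] - col_means[j] + grand
             for j in range(c)] for i in range(r)]

# Hypothetical cell means: rows are factor B levels, columns are factor A levels
parallel = [[10, 20, 30], [15, 25, 35], [12, 22, 32]]   # profiles differ by a constant
crossing = [[10, 20, 30], [30, 20, 10], [20, 20, 20]]   # profiles cross

largest_parallel = max(abs(t) for row in interaction_terms(parallel) for t in row)
largest_crossing = max(abs(t) for row in interaction_terms(crossing) for t in row)

print(f"largest interaction term, parallel profiles: {largest_parallel:.6f}")
print(f"largest interaction term, crossing profiles: {largest_crossing:.6f}")
```

When the profiles are parallel every term is zero (up to rounding), so SSAB is zero. With sample data the terms are rarely exactly zero, which is why the F test for interaction is needed.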
Multiple Comparisons:
The Tukey Procedure

Unless there is a significant interaction, you
can determine the levels that are significantly
different using the Tukey procedure

Consider all absolute mean differences and
compare to the calculated critical range

Example: Absolute differences for factor A, assuming three levels:
$$|\bar{X}_{1..} - \bar{X}_{2..}|$$
$$|\bar{X}_{1..} - \bar{X}_{3..}|$$
$$|\bar{X}_{2..} - \bar{X}_{3..}|$$
Multiple Comparisons:
The Tukey Procedure

Critical Range for Factor A:
$$\text{Critical Range} = Q_U \sqrt{\frac{MSE}{cn'}}$$
(where QU is from Table E.10 with r and rc(n’ – 1) d.f.)

Critical Range for Factor B:
$$\text{Critical Range} = Q_U \sqrt{\frac{MSE}{rn'}}$$
(where QU is from Table E.10 with c and rc(n’ – 1) d.f.)
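The two critical ranges differ only in the divisor under the square root. The sketch below shows the calculation; the function names are hypothetical, and the values of MSE and QU are placeholders, not table values. In practice QU must be looked up in the Studentized range table for the stated degrees of freedom.

```python
from math import sqrt

def tukey_range_factor_a(q_u, mse, c, n_rep):
    """Critical range for factor A: each factor A level mean averages c * n' values."""
    return q_u * sqrt(mse / (c * n_rep))

def tukey_range_factor_b(q_u, mse, r, n_rep):
    """Critical range for factor B: each factor B level mean averages r * n' values."""
    return q_u * sqrt(mse / (r * n_rep))

# Placeholder inputs for illustration only (not taken from the textbook)
r, c, n_rep = 2, 3, 2
mse = 2.0
q_u_a = 4.0   # hypothetical QU for r and rc(n' - 1) d.f.
q_u_b = 4.0   # hypothetical QU for c and rc(n' - 1) d.f.

range_a = tukey_range_factor_a(q_u_a, mse, c, n_rep)
range_b = tukey_range_factor_b(q_u_b, mse, r, n_rep)
print(f"critical range, factor A: {range_a:.4f}")
print(f"critical range, factor B: {range_b:.4f}")
```

Any pair of level means whose absolute difference exceeds the corresponding critical range would be declared significantly different.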
Chapter Summary

Described one-way analysis of variance
  The logic of ANOVA
  ANOVA assumptions
  F test for difference in c means
  The Tukey-Kramer procedure for multiple comparisons

Considered the Randomized Block Design
  Treatment and Block Effects
  Multiple Comparisons: Tukey Procedure

Described two-way analysis of variance
  Examined effects of multiple factors
  Examined interaction between factors