Rules for determining Expected Mean Squares (EMS) in an Anova Table


ANOVA TABLE
Factorial Experiment
Completely Randomized Design
Anova table for the 3 factor Experiment

| Source | SS | df | MS | F | p-value |
|---|---|---|---|---|---|
| A | SSA | a - 1 | MSA | MSA/MSError | |
| B | SSB | b - 1 | MSB | MSB/MSError | |
| C | SSC | c - 1 | MSC | MSC/MSError | |
| AB | SSAB | (a - 1)(b - 1) | MSAB | MSAB/MSError | |
| AC | SSAC | (a - 1)(c - 1) | MSAC | MSAC/MSError | |
| BC | SSBC | (b - 1)(c - 1) | MSBC | MSBC/MSError | |
| ABC | SSABC | (a - 1)(b - 1)(c - 1) | MSABC | MSABC/MSError | |
| Error | SSError | abc(n - 1) | MSError | | |
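Sum of squares entries

$$SS_A = nbc \sum_{i=1}^{a} \hat{\alpha}_i^2 = nbc \sum_{i=1}^{a} \left( \bar{y}_{i\cdot\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot\cdot} \right)^2$$

Similar expressions for SSB and SSC.

$$SS_{AB} = nc \sum_{i=1}^{a} \sum_{j=1}^{b} \widehat{(\alpha\beta)}_{ij}^2 = nc \sum_{i=1}^{a} \sum_{j=1}^{b} \left( \bar{y}_{ij\cdot\cdot} - \bar{y}_{i\cdot\cdot\cdot} - \bar{y}_{\cdot j\cdot\cdot} + \bar{y}_{\cdot\cdot\cdot\cdot} \right)^2$$

Similar expressions for SSBC and SSAC.

$$SS_{ABC} = n \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{c} \widehat{(\alpha\beta\gamma)}_{ijk}^2 = n \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{c} \left( \bar{y}_{ijk\cdot} - \bar{y}_{ij\cdot\cdot} - \bar{y}_{i\cdot k\cdot} - \bar{y}_{\cdot jk\cdot} + \bar{y}_{i\cdot\cdot\cdot} + \bar{y}_{\cdot j\cdot\cdot} + \bar{y}_{\cdot\cdot k\cdot} - \bar{y}_{\cdot\cdot\cdot\cdot} \right)^2$$

Finally

$$SS_{Error} = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{c} \sum_{l=1}^{n} \left( y_{ijkl} - \bar{y}_{ijk\cdot} \right)^2$$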
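The sums of squares above can be checked numerically. The following is a minimal sketch, assuming a balanced design stored as a NumPy array `y[i, j, k, l]`; the level counts and the simulated data are hypothetical and serve only to confirm that the eight entries add up to the total sum of squares.

```python
import numpy as np

rng = np.random.default_rng(0)
a, b, c, n = 3, 2, 4, 5                # hypothetical numbers of levels and replicates
y = rng.normal(size=(a, b, c, n))      # simulated responses y[i, j, k, l]

def mean_over(arr, axes):
    """Average over the given axes, keeping them as length-1 axes for broadcasting."""
    return arr.mean(axis=axes, keepdims=True)

g = mean_over(y, (0, 1, 2, 3))         # grand mean
mA = mean_over(y, (1, 2, 3))           # means for each level of A
mB = mean_over(y, (0, 2, 3))
mC = mean_over(y, (0, 1, 3))
mAB = mean_over(y, (2, 3))             # two-way marginal means
mAC = mean_over(y, (1, 3))
mBC = mean_over(y, (0, 3))
mABC = mean_over(y, (3,))              # cell means

ss = {
    "A": n * b * c * np.sum((mA - g) ** 2),
    "B": n * a * c * np.sum((mB - g) ** 2),
    "C": n * a * b * np.sum((mC - g) ** 2),
    "AB": n * c * np.sum((mAB - mA - mB + g) ** 2),
    "AC": n * b * np.sum((mAC - mA - mC + g) ** 2),
    "BC": n * a * np.sum((mBC - mB - mC + g) ** 2),
    "ABC": n * np.sum((mABC - mAB - mAC - mBC + mA + mB + mC - g) ** 2),
    "Error": np.sum((y - mABC) ** 2),
}
ss_total = np.sum((y - g) ** 2)

for source, value in ss.items():
    print(f"{source:6s} {value:10.4f}")
print("sum of entries:", sum(ss.values()), "total SS:", ss_total)
```

Keeping the averaged axes with `keepdims=True` lets each marginal mean broadcast against the others, so every line mirrors the corresponding formula term by term.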
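The statistical model for the 3 factor Experiment

$$y_{ijkl} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + (\alpha\beta\gamma)_{ijk} + \varepsilon_{ijkl}$$

• $\mu$: mean effect
• $\alpha_i$, $\beta_j$, $\gamma_k$: main effects
• $(\alpha\beta)_{ij}$, $(\alpha\gamma)_{ik}$, $(\beta\gamma)_{jk}$: 2 factor interactions
• $(\alpha\beta\gamma)_{ijk}$: 3 factor interaction
• $\varepsilon_{ijkl}$: random error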
The testing in factorial experiments
1. Test first the higher order interactions.
2. If an interaction is present, there is no need to test lower order interactions or main effects involving those factors. All factors in the interaction affect the response, and they interact.
3. The testing continues with lower order interactions and main effects for factors which have not yet been determined to affect the response.
Random Effects and Fixed
Effects Factors
• So far the factors that we have considered are fixed effects factors.
• This is the case if the levels of the factor are a fixed set of levels and the conclusions of any analysis relate to these levels.
• If the levels have been selected at random from a population of levels, the factor is called a random effects factor.
• The conclusions of the analysis will be directed at the population of levels and not only the levels selected for the experiment.
Example - Fixed Effects
Source of Protein, Level of Protein, Weight Gain
Dependent
– Weight Gain
Independent
– Source of Protein
• Beef
• Cereal
• Pork
– Level of Protein
• High
• Low
Example - Random Effects
In this example a taxi company is interested in comparing the effects of three brands of tires (A, B and C) on mileage (mpg). Mileage will also be affected by driver. The company selects b = 4 drivers at random from its collection of drivers. Each driver has n = 3 opportunities to use each brand of tire, in which mileage is measured.
Dependent
– Mileage
Independent
– Tire brand (A, B, C)
• Fixed effects factor
– Driver (1, 2, 3, 4)
• Random effects factor
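The Model for the fixed effects experiment

$$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$$

where $\mu$, $\alpha_1$, $\alpha_2$, $\alpha_3$, $\beta_1$, $\beta_2$, $(\alpha\beta)_{11}$, $(\alpha\beta)_{21}$, $(\alpha\beta)_{31}$, $(\alpha\beta)_{12}$, $(\alpha\beta)_{22}$, $(\alpha\beta)_{32}$ are fixed unknown constants, and $\varepsilon_{ijk}$ is random, normally distributed with mean 0 and variance $\sigma^2$.

Note:

$$\sum_{i=1}^{a} \alpha_i = \sum_{j=1}^{b} \beta_j = \sum_{i=1}^{a} (\alpha\beta)_{ij} = \sum_{j=1}^{b} (\alpha\beta)_{ij} = 0$$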
The Model for the case when factor B is a random effects factor

$$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$$

where $\mu$, $\alpha_1$, $\alpha_2$, $\alpha_3$ are fixed unknown constants, and $\varepsilon_{ijk}$ is random, normally distributed with mean 0 and variance $\sigma^2$.
$\beta_j$ is normal with mean 0 and variance $\sigma_B^2$, and $(\alpha\beta)_{ij}$ is normal with mean 0 and variance $\sigma_{AB}^2$.

Note: $\sum_{i=1}^{a} \alpha_i = 0$

This model is called a variance components model.
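The Anova table for the two factor model

$$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$$

| Source | SS | df | MS |
|---|---|---|---|
| A | SSA | a - 1 | SSA/(a - 1) |
| B | SSB | b - 1 | SSB/(b - 1) |
| AB | SSAB | (a - 1)(b - 1) | SSAB/[(a - 1)(b - 1)] |
| Error | SSError | ab(n - 1) | SSError/[ab(n - 1)] |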
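The Anova table for the two factor model (A, B fixed)

$$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$$

| Source | SS | df | MS | EMS | F |
|---|---|---|---|---|---|
| A | SSA | a - 1 | MSA | $\sigma^2 + \frac{nb}{a-1} \sum_{i=1}^{a} \alpha_i^2$ | MSA/MSError |
| B | SSB | b - 1 | MSB | $\sigma^2 + \frac{na}{b-1} \sum_{j=1}^{b} \beta_j^2$ | MSB/MSError |
| AB | SSAB | (a - 1)(b - 1) | MSAB | $\sigma^2 + \frac{n}{(a-1)(b-1)} \sum_{i=1}^{a} \sum_{j=1}^{b} (\alpha\beta)_{ij}^2$ | MSAB/MSError |
| Error | SSError | ab(n - 1) | MSError | $\sigma^2$ | |

EMS = Expected Mean Square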
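The Anova table for the two factor model (A fixed, B random)

$$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$$

| Source | SS | df | MS | EMS | F |
|---|---|---|---|---|---|
| A | SSA | a - 1 | MSA | $\sigma^2 + n\sigma_{AB}^2 + \frac{nb}{a-1} \sum_{i=1}^{a} \alpha_i^2$ | MSA/MSAB |
| B | SSB | b - 1 | MSB | $\sigma^2 + na\sigma_B^2$ | MSB/MSError |
| AB | SSAB | (a - 1)(b - 1) | MSAB | $\sigma^2 + n\sigma_{AB}^2$ | MSAB/MSError |
| Error | SSError | ab(n - 1) | MSError | $\sigma^2$ | |

Note: The divisor for testing the main effects of A is no longer MSError but MSAB.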
Rules for determining Expected Mean Squares (EMS) in an Anova Table
Both fixed and random effects
Formulated by Schultz [1]
[1] Schultz, E. F., Jr. "Rules of Thumb for Determining Expectations of Mean Squares in Analysis of Variance," Biometrics, Vol. 11, 1955, 123-48.
1. The EMS for Error is $\sigma^2$.
2. The EMS for each ANOVA term contains two or more terms, the first of which is $\sigma^2$.
3. All other terms in each EMS contain both coefficients and subscripts (the total number of letters being one more than the number of factors; if the number of factors is k = 3, then the number of letters is 4).
4. The subscript of $\sigma^2$ in the last term of each EMS is the same as the treatment designation.
5. The subscripts of all $\sigma^2$ other than the first contain the treatment designation. These are written with the combination involving the most letters written first and ending with the treatment designation.
6. When a capital letter is omitted from a subscript, the corresponding small letter appears in the coefficient.
7. For each EMS in the table, ignore the letter or letters that designate the effect. If any of the remaining letters designate a fixed effect, delete that term from the EMS.
8. Replace $\sigma^2$ whose subscripts are composed entirely of fixed effects by the appropriate sum:

$$\sigma_A^2 \text{ by } \frac{\sum_{i=1}^{a} \alpha_i^2}{a-1}, \qquad \sigma_{AB}^2 \text{ by } \frac{\sum_{i=1}^{a} \sum_{j=1}^{b} (\alpha\beta)_{ij}^2}{(a-1)(b-1)}$$
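The rules above are mechanical enough to sketch in code. The following is an illustrative sketch, not Schultz's own formulation: it assumes a fully crossed design with n replicates, and the function names `ems_table`, `render` and `f_denominator` are hypothetical. A term whose subscript is entirely fixed is printed as `Q_...` to stand for the sum that rule 8 substitutes. `f_denominator` looks for a mean square whose EMS equals that of the effect with its last term removed, which is the divisor an exact F test would need.

```python
from itertools import combinations

def ems_table(factors, random_factors, reps="n"):
    """EMS terms for a fully crossed design (illustrative sketch).

    factors: capital letters such as "ABC"; random_factors: the random ones.
    Returns {source: [(coefficient, subscript), ...]}; subscript "" is sigma^2.
    """
    factors = list(factors)
    random_factors = set(random_factors)
    effects = ["".join(c) for r in range(1, len(factors) + 1)
               for c in combinations(factors, r)]
    table = {}
    for effect in effects:
        terms = [("", "")]                       # rules 1-2: start with sigma^2
        # rules 4-5: subscripts contain the effect, longest first
        candidates = [e for e in effects if set(effect) <= set(e)]
        candidates.sort(key=lambda e: (-len(e), e))
        for cand in candidates:
            extra = set(cand) - set(effect)      # rule 7: ignore the effect's letters
            if any(f not in random_factors for f in extra):
                continue                         # a remaining letter is fixed: delete
            # rule 6: omitted capitals appear as small letters in the coefficient
            coef = reps + "".join(f.lower() for f in factors if f not in cand)
            terms.append((coef, cand))
        table[effect] = terms
    table["Error"] = [("", "")]
    return table

def render(terms, random_factors):
    """Format EMS terms; Q_X marks an all-fixed subscript (rule 8)."""
    parts = []
    for coef, sub in terms:
        if sub == "":
            parts.append("sigma^2")
        elif set(sub).isdisjoint(random_factors):
            parts.append(f"{coef} Q_{sub}")
        else:
            parts.append(f"{coef} sigma^2_{sub}")
    return " + ".join(parts)

def f_denominator(table, effect):
    """Source whose EMS equals EMS(effect) without its last term, or None."""
    target = table[effect][:-1]
    for source, terms in table.items():
        if source != effect and terms == target:
            return source
    return None

random_factors = {"B", "C"}                      # A fixed; B and C random
table = ems_table("ABC", random_factors)
for source, terms in table.items():
    if source != "Error":
        print(source, "|", render(terms, random_factors), "|", f_denominator(table, source))
```

When no single mean square has the required expectation, `f_denominator` returns `None`, so no exact divisor is reported for that effect.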
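Example: 3 factors A, B, C – all are random effects

| Source | EMS | F |
|---|---|---|
| A | $\sigma^2 + n\sigma_{ABC}^2 + nc\sigma_{AB}^2 + nb\sigma_{AC}^2 + nbc\sigma_A^2$ | |
| B | $\sigma^2 + n\sigma_{ABC}^2 + nc\sigma_{AB}^2 + na\sigma_{BC}^2 + nac\sigma_B^2$ | |
| C | $\sigma^2 + n\sigma_{ABC}^2 + na\sigma_{BC}^2 + nb\sigma_{AC}^2 + nab\sigma_C^2$ | |
| AB | $\sigma^2 + n\sigma_{ABC}^2 + nc\sigma_{AB}^2$ | MSAB/MSABC |
| AC | $\sigma^2 + n\sigma_{ABC}^2 + nb\sigma_{AC}^2$ | MSAC/MSABC |
| BC | $\sigma^2 + n\sigma_{ABC}^2 + na\sigma_{BC}^2$ | MSBC/MSABC |
| ABC | $\sigma^2 + n\sigma_{ABC}^2$ | MSABC/MSError |
| Error | $\sigma^2$ | |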
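Example: 3 factors A fixed, B, C random

| Source | EMS | F |
|---|---|---|
| A | $\sigma^2 + n\sigma_{ABC}^2 + nc\sigma_{AB}^2 + nb\sigma_{AC}^2 + \frac{nbc}{a-1} \sum_{i=1}^{a} \alpha_i^2$ | |
| B | $\sigma^2 + na\sigma_{BC}^2 + nac\sigma_B^2$ | MSB/MSBC |
| C | $\sigma^2 + na\sigma_{BC}^2 + nab\sigma_C^2$ | MSC/MSBC |
| AB | $\sigma^2 + n\sigma_{ABC}^2 + nc\sigma_{AB}^2$ | MSAB/MSABC |
| AC | $\sigma^2 + n\sigma_{ABC}^2 + nb\sigma_{AC}^2$ | MSAC/MSABC |
| BC | $\sigma^2 + na\sigma_{BC}^2$ | MSBC/MSError |
| ABC | $\sigma^2 + n\sigma_{ABC}^2$ | MSABC/MSError |
| Error | $\sigma^2$ | |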
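Example: 3 factors A, B fixed, C random

| Source | EMS | F |
|---|---|---|
| A | $\sigma^2 + nb\sigma_{AC}^2 + \frac{nbc}{a-1} \sum_{i=1}^{a} \alpha_i^2$ | MSA/MSAC |
| B | $\sigma^2 + na\sigma_{BC}^2 + \frac{nac}{b-1} \sum_{j=1}^{b} \beta_j^2$ | MSB/MSBC |
| C | $\sigma^2 + nab\sigma_C^2$ | MSC/MSError |
| AB | $\sigma^2 + n\sigma_{ABC}^2 + \frac{nc}{(a-1)(b-1)} \sum_{i=1}^{a} \sum_{j=1}^{b} (\alpha\beta)_{ij}^2$ | MSAB/MSABC |
| AC | $\sigma^2 + nb\sigma_{AC}^2$ | MSAC/MSError |
| BC | $\sigma^2 + na\sigma_{BC}^2$ | MSBC/MSError |
| ABC | $\sigma^2 + n\sigma_{ABC}^2$ | MSABC/MSError |
| Error | $\sigma^2$ | |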
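Example: 3 factors A, B and C fixed

| Source | EMS | F |
|---|---|---|
| A | $\sigma^2 + \frac{nbc}{a-1} \sum_{i=1}^{a} \alpha_i^2$ | MSA/MSError |
| B | $\sigma^2 + \frac{nac}{b-1} \sum_{j=1}^{b} \beta_j^2$ | MSB/MSError |
| C | $\sigma^2 + \frac{nab}{c-1} \sum_{k=1}^{c} \gamma_k^2$ | MSC/MSError |
| AB | $\sigma^2 + \frac{nc}{(a-1)(b-1)} \sum_{i=1}^{a} \sum_{j=1}^{b} (\alpha\beta)_{ij}^2$ | MSAB/MSError |
| AC | $\sigma^2 + \frac{nb}{(a-1)(c-1)} \sum_{i=1}^{a} \sum_{k=1}^{c} (\alpha\gamma)_{ik}^2$ | MSAC/MSError |
| BC | $\sigma^2 + \frac{na}{(b-1)(c-1)} \sum_{j=1}^{b} \sum_{k=1}^{c} (\beta\gamma)_{jk}^2$ | MSBC/MSError |
| ABC | $\sigma^2 + \frac{n}{(a-1)(b-1)(c-1)} \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{c} (\alpha\beta\gamma)_{ijk}^2$ | MSABC/MSError |
| Error | $\sigma^2$ | |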
Example - Random Effects
In this example a taxi company is interested in comparing the effects of three brands of tires (A, B and C) on mileage (mpg). Mileage will also be affected by driver. The company selects b = 4 drivers at random from its collection of drivers. Each driver has n = 3 opportunities to use each brand of tire, in which mileage is measured.
Dependent
– Mileage
Independent
– Tire brand (A, B, C)
• Fixed effects factor
– Driver (1, 2, 3, 4)
• Random effects factor
The Data

| Driver | Tire | Mileage (3 runs) |
|---|---|---|
| 1 | A | 39.6, 38.6, 41.9 |
| 1 | B | 18.1, 20.4, 19 |
| 1 | C | 31.1, 29.8, 26.6 |
| 2 | A | 38.1, 35.4, 38.8 |
| 2 | B | 18.2, 14, 15.6 |
| 2 | C | 30.2, 27.9, 27.2 |
| 3 | A | 33.9, 43.2, 41.3 |
| 3 | B | 17.8, 21.3, 22.3 |
| 3 | C | 31.3, 28.7, 29.7 |
| 4 | A | 36.9, 30.3, 35 |
| 4 | B | 17.8, 21.2, 24.3 |
| 4 | C | 27.4, 26.6, 21 |
Asking SPSS to perform Univariate ANOVA
Select the dependent variable, fixed factors, random factors
The Output
Tests of Between-Subjects Effects
Dependent Variable: MILEAGE

| Source | | Type III Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|---|
| Intercept | Hypothesis | 28928.340 | 1 | 28928.340 | 1270.836 | .000 |
| | Error | 68.290 | 3 | 22.763 (a) | | |
| TIRE | Hypothesis | 2072.931 | 2 | 1036.465 | 71.374 | .000 |
| | Error | 87.129 | 6 | 14.522 (b) | | |
| DRIVER | Hypothesis | 68.290 | 3 | 22.763 | 1.568 | .292 |
| | Error | 87.129 | 6 | 14.522 (b) | | |
| TIRE * DRIVER | Hypothesis | 87.129 | 6 | 14.522 | 2.039 | .099 |
| | Error | 170.940 | 24 | 7.123 (c) | | |

a. MS(DRIVER)
b. MS(TIRE * DRIVER)
c. MS(Error)

The divisor for both the fixed and the random main effect is MSAB.
This is contrary to the advice of some texts.
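The sums of squares and F ratios in the SPSS output can be reproduced directly from the taxi data. The following is a minimal sketch using NumPy; the array layout `mileage[driver, tire, replicate]` and the variable names are choices made for this illustration. It computes the driver F ratio with both divisors so the two conventions can be compared.

```python
import numpy as np

# mileage[driver, tire, replicate]: drivers 1-4 (random), tires A, B, C (fixed)
mileage = np.array([
    [[39.6, 38.6, 41.9], [18.1, 20.4, 19.0], [31.1, 29.8, 26.6]],
    [[38.1, 35.4, 38.8], [18.2, 14.0, 15.6], [30.2, 27.9, 27.2]],
    [[33.9, 43.2, 41.3], [17.8, 21.3, 22.3], [31.3, 28.7, 29.7]],
    [[36.9, 30.3, 35.0], [17.8, 21.2, 24.3], [27.4, 26.6, 21.0]],
])
b, a, n = mileage.shape                      # b = 4 drivers, a = 3 tires, n = 3 runs

grand = mileage.mean()
tire_means = mileage.mean(axis=(0, 2))       # one mean per tire brand
driver_means = mileage.mean(axis=(1, 2))     # one mean per driver
cell_means = mileage.mean(axis=2)            # driver x tire cell means

ss_tire = n * b * np.sum((tire_means - grand) ** 2)
ss_driver = n * a * np.sum((driver_means - grand) ** 2)
ss_inter = n * np.sum(
    (cell_means - driver_means[:, None] - tire_means[None, :] + grand) ** 2
)
ss_error = np.sum((mileage - cell_means[:, :, None]) ** 2)

ms_tire = ss_tire / (a - 1)
ms_driver = ss_driver / (b - 1)
ms_inter = ss_inter / ((a - 1) * (b - 1))
ms_error = ss_error / (a * b * (n - 1))

f_tire = ms_tire / ms_inter                  # fixed factor A tested against MSAB
f_inter = ms_inter / ms_error
f_driver_vs_inter = ms_driver / ms_inter     # divisor used in the SPSS output
f_driver_vs_error = ms_driver / ms_error     # divisor if EMS(B) has no sigma^2_AB term

print(f"SS: tire={ss_tire:.3f} driver={ss_driver:.3f} "
      f"interaction={ss_inter:.3f} error={ss_error:.3f}")
print(f"F: tire={f_tire:.3f} interaction={f_inter:.3f} "
      f"driver/MSAB={f_driver_vs_inter:.3f} driver/MSError={f_driver_vs_error:.3f}")
```

Only the F ratios are computed here; the significance levels would additionally need the F distribution with the matching degrees of freedom.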
The Anova table for the two factor model (A fixed, B random)

$$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$$

| Source | SS | df | MS | EMS | F |
|---|---|---|---|---|---|
| A | SSA | a - 1 | MSA | $\sigma^2 + n\sigma_{AB}^2 + \frac{nb}{a-1} \sum_{i=1}^{a} \alpha_i^2$ | MSA/MSAB |
| B | SSB | b - 1 | MSB | $\sigma^2 + na\sigma_B^2$ | MSB/MSError |
| AB | SSAB | (a - 1)(b - 1) | MSAB | $\sigma^2 + n\sigma_{AB}^2$ | MSAB/MSError |
| Error | SSError | ab(n - 1) | MSError | $\sigma^2$ | |

Note: The divisor for testing the main effects of A is no longer MSError but MSAB.
References: Guenther, W. C. "Analysis of Variance", Prentice Hall, 1964
The Anova table for the two factor model (A fixed, B random)

$$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$$

| Source | SS | df | MS | EMS | F |
|---|---|---|---|---|---|
| A | SSA | a - 1 | MSA | $\sigma^2 + n\sigma_{AB}^2 + \frac{nb}{a-1} \sum_{i=1}^{a} \alpha_i^2$ | MSA/MSAB |
| B | SSB | b - 1 | MSB | $\sigma^2 + n\sigma_{AB}^2 + na\sigma_B^2$ | MSB/MSAB |
| AB | SSAB | (a - 1)(b - 1) | MSAB | $\sigma^2 + n\sigma_{AB}^2$ | MSAB/MSError |
| Error | SSError | ab(n - 1) | MSError | $\sigma^2$ | |

Note: In this case the divisor for testing the main effects of both A and B is MSAB. This is the approach used by SPSS.
References: Searle "Linear Models", John Wiley, 1964
Crossed and Nested Factors
The factors A, B are called crossed if every level of A appears with every level of B in the treatment combinations.
[Diagram: levels of A arranged against levels of B]
Factor B is said to be nested within factor A if the levels of B differ for each level of A.
[Diagram: levels of B grouped under each level of A]
Example: A company has a = 4 plants for producing paper. Each plant has 6 machines for producing the paper. The company is interested in how paper strength (Y) differs from plant to plant and from machine to machine within plant.
[Diagram: machines grouped under plants]
Machines (B) are nested within plants (A).
The model for a two factor experiment with B nested within A.

$$y_{ijk} = \mu + \alpha_i + \beta_{j(i)} + \varepsilon_{ijk}$$

• $\mu$: overall mean
• $\alpha_i$: effect of factor A
• $\beta_{j(i)}$: effect of B within A
• $\varepsilon_{ijk}$: random error
The ANOVA table

| Source | SS | df | MS | F | p-value |
|---|---|---|---|---|---|
| A | SSA | a - 1 | MSA | MSA/MSError | |
| B(A) | SSB(A) | a(b - 1) | MSB(A) | MSB(A)/MSError | |
| Error | SSError | ab(n - 1) | MSError | | |

Note: SSB(A) = SSB + SSAB and a(b - 1) = (b - 1) + (a - 1)(b - 1)
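The pooling identity in the note can be confirmed on any balanced data set. The following is a minimal sketch with simulated data; the sizes and the array layout `y[i, j, k]` are hypothetical. It computes SSB(A) directly from the nested model and compares it with SSB + SSAB from an analysis that treats the same labels as crossed.

```python
import numpy as np

rng = np.random.default_rng(1)
a, b, n = 4, 6, 3                        # hypothetical: levels of A, levels of B per A, replicates
y = rng.normal(size=(a, b, n))           # y[i, j, k]: level i of A, level j of B within A

grand = y.mean()
a_means = y.mean(axis=(1, 2))            # mean for each level of A
cell_means = y.mean(axis=2)              # mean for each (A, B) combination
b_means = y.mean(axis=(0, 2))            # used only when B is treated as crossed

# Nested analysis: B within A, measured around the mean of its own level of A
ss_b_within_a = n * np.sum((cell_means - a_means[:, None]) ** 2)

# Crossed analysis of the same numbers
ss_b = n * a * np.sum((b_means - grand) ** 2)
ss_ab = n * np.sum(
    (cell_means - a_means[:, None] - b_means[None, :] + grand) ** 2
)

df_nested = a * (b - 1)
df_pooled = (b - 1) + (a - 1) * (b - 1)

print(ss_b_within_a, ss_b + ss_ab)
print(df_nested, df_pooled)
```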
Example: A company has a = 4 plants for producing paper. Each plant has 6 machines for producing the paper. The company is interested in how paper strength (Y) differs from plant to plant and from machine to machine within plant. Also we have n = 3 measurements of paper strength for each of the 24 machines.
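The Data

| Plant | Machine | Paper strength (3 measurements) |
|---|---|---|
| 1 | 1 | 98.7, 93.1, 100.0 |
| 1 | 2 | 59.2, 87.8, 84.1 |
| 1 | 3 | 84.1, 86.3, 83.4 |
| 1 | 4 | 72.3, 110.3, 81.6 |
| 1 | 5 | 83.5, 89.3, 86.1 |
| 1 | 6 | 60.6, 84.8, 83.6 |
| 2 | 7 | 33.6, 48.2, 68.9 |
| 2 | 8 | 44.8, 57.3, 66.5 |
| 2 | 9 | 58.9, 51.6, 45.2 |
| 2 | 10 | 63.9, 62.3, 61.1 |
| 2 | 11 | 63.7, 54.6, 55.3 |
| 2 | 12 | 48.1, 50.6, 39.9 |
| 3 | 13 | 83.6, 84.6, 90.6 |
| 3 | 14 | 76.1, 55.4, 92.3 |
| 3 | 15 | 64.2, 58.4, 75.4 |
| 3 | 16 | 69.2, 86.7, 60.8 |
| 3 | 17 | 77.4, 63.3, 76.6 |
| 3 | 18 | 61.0, 81.3, 73.8 |
| 4 | 19 | 64.2, 50.3, 32.1 |
| 4 | 20 | 35.5, 30.8, 36.3 |
| 4 | 21 | 46.9, 43.1, 40.8 |
| 4 | 22 | 37.0, 47.8, 41.0 |
| 4 | 23 | 43.8, 62.4, 60.8 |
| 4 | 24 | 30.0, 43.0, 56.9 |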
Anova Table Treating Factors (Plant, Machine) as crossed
Tests of Between-Subjects Effects
Dependent Variable: STRENGTH

| Source | Type III Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Corrected Model | 21031.065 (a) | 23 | 914.394 | 7.972 | .000 |
| Intercept | 298531.4 | 1 | 298531.4 | 2602.776 | .000 |
| PLANT | 18174.761 | 3 | 6058.254 | 52.820 | .000 |
| MACHINE | 1238.379 | 5 | 247.676 | 2.159 | .074 |
| PLANT * MACHINE | 1617.925 | 15 | 107.862 | .940 | .528 |
| Error | 5505.469 | 48 | 114.697 | | |
| Total | 325067.9 | 72 | | | |
| Corrected Total | 26536.534 | 71 | | | |

a. R Squared = .793 (Adjusted R Squared = .693)
Anova Table: Two factor experiment, B (machine) nested in A (plant)

| Source | Sum of Squares | df | Mean Square | F | p-value |
|---|---|---|---|---|---|
| Plant | 18174.76119 | 3 | 6058.253731 | 52.819506 | 0.00000 |
| Machine(Plant) | 2856.303672 | 20 | 142.8151836 | 1.2451488 | 0.26171 |
| Error | 5505.469467 | 48 | 114.6972806 | | |
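The Machine(Plant) line can be obtained from the crossed table by pooling the MACHINE and PLANT * MACHINE lines. The following is a short arithmetic sketch using the rounded values printed in the crossed table, so the last digits may differ slightly from the nested table.

```python
# Values copied from the crossed-factor table (rounded as printed there)
ss_machine, df_machine = 1238.379, 5
ss_plant_machine, df_plant_machine = 1617.925, 15
ss_error, df_error = 5505.469, 48

# SSB(A) = SSB + SSAB and a(b - 1) = (b - 1) + (a - 1)(b - 1)
ss_nested = ss_machine + ss_plant_machine
df_nested = df_machine + df_plant_machine

ms_nested = ss_nested / df_nested
ms_error = ss_error / df_error
f_nested = ms_nested / ms_error

print(f"SS = {ss_nested:.3f}, df = {df_nested}, MS = {ms_nested:.3f}, F = {f_nested:.4f}")
```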
[Graph: Paper Strength (vertical axis, 0 to 120) plotted against Machine (horizontal axis), with a separate series for each plant]