Two Factor Designs

Consider studying the impact of two factors, BRAND and DEVICE, on the yield (response):

                          BRAND
DEVICE        1            2            3            4
  1       17.9, 18.1   17.8, 17.8   18.1, 18.2   17.8, 17.9
  2       18.0, 18.2   18.0, 18.3   18.4, 18.1   18.1, 18.5
  3       18.0, 17.8   17.8, 18.0   18.1, 18.3   18.1, 17.9

NOTE: The "1", "2", etc. mean Level 1, Level 2, etc., NOT metric values.

Here we have R = 3 rows (levels of the row factor), C = 4 columns (levels of the column factor), and n = 2 replicates per cell [n_ij for the (i,j)th cell if the cell sizes are not all equal].
MODEL:
$$Y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \varepsilon_{ijk}$$
i = 1, ..., R
j = 1, ..., C
k = 1, ..., n
In general, n observations per cell, R • C cells.
$\mu$: the grand mean
$\tau_i$: the difference between the i-th row mean and the grand mean
$\beta_j$: the difference between the j-th column mean and the grand mean
$(\tau\beta)_{ij}$: the interaction associated with the i-th row and the j-th column ($= \mu_{ij} - \tau_i - \beta_j - \mu$, where $\mu_{ij}$ is the mean of cell (i,j))
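$$Y_{ijk} = \bar{Y}_{\cdot\cdot\cdot} + (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot}) + (\bar{Y}_{\cdot j\cdot} - \bar{Y}_{\cdot\cdot\cdot}) + (\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot j\cdot} + \bar{Y}_{\cdot\cdot\cdot}) + (Y_{ijk} - \bar{Y}_{ij\cdot})$$
Where
$\bar{Y}_{\cdot\cdot\cdot}$ = Grand mean
$\bar{Y}_{i\cdot\cdot}$ = Mean of row i
$\bar{Y}_{\cdot j\cdot}$ = Mean of column j
$\bar{Y}_{ij\cdot}$ = Mean of cell (i,j)
[All the terms are somewhat "intuitive", except for $(\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot j\cdot} + \bar{Y}_{\cdot\cdot\cdot})$]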
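The decomposition can be checked numerically on the BRAND x DEVICE data. The following is a minimal sketch, not part of the original slides; the names `data`, `decompose`, and `max_error` are illustrative assumptions. It computes the five terms for every observation and confirms that they add back up to the observed value.

```python
# Yield data from the example: data[i][j] holds the n = 2 replicates
# for device (row) i and brand (column) j.
data = [
    [[17.9, 18.1], [17.8, 17.8], [18.1, 18.2], [17.8, 17.9]],
    [[18.0, 18.2], [18.0, 18.3], [18.4, 18.1], [18.1, 18.5]],
    [[18.0, 17.8], [17.8, 18.0], [18.1, 18.3], [18.1, 17.9]],
]
R, C, n = len(data), len(data[0]), len(data[0][0])


def mean(values):
    values = list(values)
    return sum(values) / len(values)


grand = mean(y for row in data for cell in row for y in cell)
row_mean = [mean(y for cell in data[i] for y in cell) for i in range(R)]
col_mean = [mean(y for i in range(R) for y in data[i][j]) for j in range(C)]
cell_mean = [[mean(data[i][j]) for j in range(C)] for i in range(R)]


def decompose(i, j, k):
    """Return the five terms of the decomposition for observation (i, j, k)."""
    row_effect = row_mean[i] - grand
    col_effect = col_mean[j] - grand
    interaction = cell_mean[i][j] - row_mean[i] - col_mean[j] + grand
    residual = data[i][j][k] - cell_mean[i][j]
    return grand, row_effect, col_effect, interaction, residual


# Largest reconstruction error over all 24 observations (should be ~0).
max_error = max(
    abs(sum(decompose(i, j, k)) - data[i][j][k])
    for i in range(R) for j in range(C) for k in range(n)
)
print(round(grand, 2), [round(m, 2) for m in row_mean], max_error < 1e-9)
```

The identity holds for any balanced layout, because the row, column, and cell means cancel term by term.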
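The term $(\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot j\cdot} + \bar{Y}_{\cdot\cdot\cdot})$ is more intuitively written as:
$$(\bar{Y}_{ij\cdot} - \bar{Y}_{\cdot\cdot\cdot}) - (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot}) - (\bar{Y}_{\cdot j\cdot} - \bar{Y}_{\cdot\cdot\cdot})$$
- $(\bar{Y}_{ij\cdot} - \bar{Y}_{\cdot\cdot\cdot})$: how a cell mean differs from the grand mean
- $(\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot})$: adjustment for "row membership"
- $(\bar{Y}_{\cdot j\cdot} - \bar{Y}_{\cdot\cdot\cdot})$: adjustment for "column membership"
We can, without loss of generality, assume (for a moment) that there is no error (random part); why then might the above be non-zero?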
ANSWER: "INTERACTION"
Two basic ways to look at interaction:
1)
         AL    AH
  BL      5    10
  BH      8     ?

If AHBH = 13, no interaction
If AHBH > 13, + interaction
If AHBH < 13, - interaction
- When B goes from BL → BH, yield goes up by 3 (5 → 8).
- When A goes from AL → AH, yield goes up by 5 (5 → 10).
- When both changes of level occur, does yield go up by the sum, 3 + 5 = 8?
Interaction = degree of difference from sum of separate effects
2)
         AL    AH
  BL      5    10
  BH      8    17

- Holding BL, what happens as A goes from AL → AH? +5
- Holding BH, what happens as A goes from AL → AH? +9
If the effect of one factor (i.e., the impact of changing its level) is DIFFERENT for different levels of another factor, then INTERACTION exists between the two factors.
NOTE:
- Holding AL, BL → BH has impact +3
- Holding AH, BL → BH has impact +7
(AB) = (BA), or (9 - 5) = (7 - 3).
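Both ways of looking at interaction can be expressed in a few lines. This is an illustrative sketch; the function name `interaction_2x2` and its argument order are assumptions, not something from the slides. It returns the interaction measured as the change in the effect of A across levels of B, and as the change in the effect of B across levels of A.

```python
def interaction_2x2(y_al_bl, y_ah_bl, y_al_bh, y_ah_bh):
    """Interaction in a 2x2 table of yields.

    Returns (AB, BA): the difference between the effect of A at BH and at BL,
    and the difference between the effect of B at AH and at AL.
    """
    effect_a_at_bl = y_ah_bl - y_al_bl
    effect_a_at_bh = y_ah_bh - y_al_bh
    effect_b_at_al = y_al_bh - y_al_bl
    effect_b_at_ah = y_ah_bh - y_ah_bl
    return effect_a_at_bh - effect_a_at_bl, effect_b_at_ah - effect_b_at_al


print(interaction_2x2(5, 10, 8, 17))  # yields from the second table
print(interaction_2x2(5, 10, 8, 13))  # AHBH = 13, the no-interaction case
```

The two returned values are always equal, since both reduce to AHBH - AHBL - ALBH + ALBL.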
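Going back to the decomposition of $Y_{ijk}$ given earlier, and bringing $\bar{Y}_{\cdot\cdot\cdot}$ to the other side of the equation, we get
$$(Y_{ijk} - \bar{Y}_{\cdot\cdot\cdot}) = (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot}) + (\bar{Y}_{\cdot j\cdot} - \bar{Y}_{\cdot\cdot\cdot}) + [(\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot}) - (\bar{Y}_{\cdot j\cdot} - \bar{Y}_{\cdot\cdot\cdot})] + (Y_{ijk} - \bar{Y}_{ij\cdot})$$
In the bracketed term, $(\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot})$ is the effect of column j at row i, and $(\bar{Y}_{\cdot j\cdot} - \bar{Y}_{\cdot\cdot\cdot})$ is the effect of column j.
If we then square both sides and triple sum both sides over i, j, and k, we get (after noting that all cross-product terms cancel):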
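$$\sum_i \sum_j \sum_k (Y_{ijk} - \bar{Y}_{\cdot\cdot\cdot})^2 = nC \sum_i (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2 + nR \sum_j (\bar{Y}_{\cdot j\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2 + n \sum_i \sum_j (\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot j\cdot} + \bar{Y}_{\cdot\cdot\cdot})^2 + \sum_i \sum_j \sum_k (Y_{ijk} - \bar{Y}_{ij\cdot})^2$$
OR,
TSS = SSB_Rows + SSB_Cols + SSI_R,C + SSW_Error
and, in terms of degrees of freedom,
R•C•n - 1 = (R-1) + (C-1) + (R-1)(C-1) + R•C•(n-1);
DF of Interaction = (RC-1) - (R-1) - (C-1) = (R-1)(C-1).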
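The sum-of-squares identity can be sketched in code as follows. This is a hedged illustration for balanced data only; the function name `two_way_ss` and the variable `yield_data` are assumptions introduced here, not names from the slides.

```python
def two_way_ss(data):
    """Sums of squares for a balanced two-factor layout.

    data[i][j] is the list of n replicates for row level i, column level j.
    Returns (ss_rows, ss_cols, ss_interaction, ss_error, tss).
    """
    R, C, n = len(data), len(data[0]), len(data[0][0])
    grand = sum(y for row in data for cell in row for y in cell) / (R * C * n)
    cell = [[sum(data[i][j]) / n for j in range(C)] for i in range(R)]
    row = [sum(cell[i]) / C for i in range(R)]
    col = [sum(cell[i][j] for i in range(R)) / R for j in range(C)]

    ss_rows = n * C * sum((row[i] - grand) ** 2 for i in range(R))
    ss_cols = n * R * sum((col[j] - grand) ** 2 for j in range(C))
    ss_int = n * sum(
        (cell[i][j] - row[i] - col[j] + grand) ** 2
        for i in range(R) for j in range(C)
    )
    ss_err = sum(
        (y - cell[i][j]) ** 2
        for i in range(R) for j in range(C) for y in data[i][j]
    )
    tss = sum((y - grand) ** 2 for r in data for c in r for y in c)
    return ss_rows, ss_cols, ss_int, ss_err, tss


# BRAND x DEVICE yields: rows are devices 1-3, columns are brands 1-4.
yield_data = [
    [[17.9, 18.1], [17.8, 17.8], [18.1, 18.2], [17.8, 17.9]],
    [[18.0, 18.2], [18.0, 18.3], [18.4, 18.1], [18.1, 18.5]],
    [[18.0, 17.8], [17.8, 18.0], [18.1, 18.3], [18.1, 17.9]],
]
ss_rows, ss_cols, ss_int, ss_err, tss = two_way_ss(yield_data)
print([round(v, 2) for v in (ss_rows, ss_cols, ss_int, ss_err, tss)])
```

TSS is computed directly from the observations rather than as the sum of the four components, so agreement between the two is a check on the identity.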
In our example (each cell mean is shown below its pair of observations; row and column means are in the margins):

                           BRAND
DEVICE         1            2            3            4         Row mean
  1        17.9, 18.1   17.8, 17.8   18.1, 18.2   17.8, 17.9
             18.00        17.80        18.15        17.85        17.95
  2        18.2, 18.0   18.0, 18.3   18.4, 18.1   18.1, 18.5
             18.10        18.15        18.25        18.30        18.20
  3        18.0, 17.8   17.8, 18.0   18.1, 18.3   18.1, 17.9
             17.90        17.90        18.20        18.00        18.00
Col. mean    18.00        17.95        18.20        18.05        18.05
$SSB_{rows} = 2 \cdot 4\,[(17.95-18.05)^2 + (18.20-18.05)^2 + (18.0-18.05)^2] = 8\,(.01 + .0225 + .0025) = .28$
$SSB_{col} = 2 \cdot 3\,[(18-18.05)^2 + (17.95-18.05)^2 + (18.2-18.05)^2 + (18.05-18.05)^2] = 6\,(.0025 + .01 + .0225 + 0) = .21$
$SSI_{R,C} = 2\,[(18-17.95-18+18.05)^2 + (17.8-17.95-17.95+18.05)^2 + \dots + (18-18-18.05+18.05)^2] = 2\,[.055] = .11$
$SSW = (17.9-18.0)^2 + (18.1-18.0)^2 + (17.8-17.8)^2 + (17.8-17.8)^2 + \dots + (18.1-18.0)^2 + (17.9-18.0)^2 = .30$
$TSS = .28 + .21 + .11 + .30 = .90$
ANOVA ($\alpha$ = .05)

SOURCE    SSQ    df    M.S.     Fcalc
Rows      .28     2    .14      5.6
COL       .21     3    .07      2.8
Int'n     .11     6    .0183    .73
Error     .30    12    .025

1) Ho: All Row Means Equal
   H1: Not all Row Means Equal
   FTV (2, 12) = 3.89; Reject Ho
2) Ho: All Col. Means Equal
   H1: Not All Col. Means Equal
   FTV (3, 12) = 3.49; Accept Ho
3) Ho: No Int'n between factors
   H1: There is int'n between factors
   FTV (6, 12) = 3.00; Accept Ho
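The three tests follow mechanically from the sums of squares and degrees of freedom. The sketch below is illustrative only: the dictionary names are assumptions, and the tabled F values are the ones quoted in the slides, not computed here.

```python
# Sums of squares and degrees of freedom as quoted in the ANOVA table.
ss = {"rows": 0.28, "cols": 0.21, "interaction": 0.11, "error": 0.30}
df = {"rows": 2, "cols": 3, "interaction": 6, "error": 12}
# Tabled F values at alpha = .05, copied from the slides.
f_table = {"rows": 3.89, "cols": 3.49, "interaction": 3.00}

# Mean square = sum of squares / degrees of freedom.
ms = {source: ss[source] / df[source] for source in ss}

# In the fixed model each effect is tested against the error mean square.
decisions = {}
for source in f_table:
    f_calc = ms[source] / ms["error"]
    reject = f_calc > f_table[source]
    decisions[source] = (round(f_calc, 2), reject)

print(decisions)
```

Only the row (device) effect exceeds its tabled value, matching the Reject/Accept decisions listed above.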
Minitab: Stat >> Anova >> General Linear Model

General Linear Model: time versus brand, device

Factor  Type   Levels  Values
brand   fixed       4  1, 2, 3, 4
device  fixed       3  1, 2, 3

Analysis of Variance for time, using Adjusted SS for Tests

Source         DF   Seq SS   Adj SS   Adj MS     F      P
brand           3  0.21000  0.21000  0.07000  2.80  0.085
device          2  0.28000  0.28000  0.14000  5.60  0.019
brand*device    6  0.11000  0.11000  0.01833  0.73  0.633
Error          12  0.30000  0.30000  0.02500
Total          23  0.90000

S = 0.158114   R-Sq = 66.67%   R-Sq(adj) = 36.11%
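The summary line of the Minitab output can be reproduced from the error and total rows of the table. This is a small sketch using the standard definitions of S, R-Sq, and adjusted R-Sq; the variable names are illustrative assumptions.

```python
import math

# Error and total rows of the ANOVA table.
ss_error, df_error = 0.30, 12
ss_total, df_total = 0.90, 23

s = math.sqrt(ss_error / df_error)   # root mean square error
r_sq = 1 - ss_error / ss_total       # share of TSS explained by the model
r_sq_adj = 1 - (ss_error / df_error) / (ss_total / df_total)

print(round(s, 6), round(100 * r_sq, 2), round(100 * r_sq_adj, 2))
```

The adjusted value is much lower than R-Sq because the model spends 11 of the 23 total degrees of freedom.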
Assumption:
$\varepsilon_{ijk}$ follows $N(0, \sigma^2)$ for all i, j, k, and they are independent.
Test for Normality
1. Store residuals when doing Anova.
2. Stat >> Basic Statistics >> Normality Test
Mean     -4.44089E-16
StDev    0.1142
N        24
AD       0.815
P-Value  0.030
The residuals are not really normal (P-Value = 0.030 < .05), but not too far from normal.
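The residuals that go into the normality test are the deviations of each observation from its cell mean. The sketch below reproduces the Mean, StDev, and N lines of the output; it does not compute the Anderson-Darling statistic. The variable names are assumptions.

```python
import statistics

# BRAND x DEVICE yields: rows are devices 1-3, columns are brands 1-4.
data = [
    [[17.9, 18.1], [17.8, 17.8], [18.1, 18.2], [17.8, 17.9]],
    [[18.0, 18.2], [18.0, 18.3], [18.4, 18.1], [18.1, 18.5]],
    [[18.0, 17.8], [17.8, 18.0], [18.1, 18.3], [18.1, 17.9]],
]

# Residual = observation minus its cell mean (the fitted value
# when the model includes the interaction term).
residuals = [
    y - statistics.mean(cell)
    for row in data for cell in row for y in cell
]

res_mean = statistics.mean(residuals)
res_sd = statistics.stdev(residuals)  # sample SD, divisor N - 1
print(len(residuals), abs(res_mean) < 1e-9, round(res_sd, 4))
```

The mean is zero up to rounding error, which is why Minitab reports a value of order 1E-16.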
Test for Equal Variances
Minitab: Stat >> Anova >> Test for Equal Variances

Test for Equal Variances: time versus device, brand
95% Bonferroni confidence intervals for standard deviations

device  brand  N      Lower     StDev    Upper
     1      1  2  0.0463363  0.141421  49.6486
     1      2  2          *  0.000000        *
     …
     3      4  2  0.0463363  0.141421  49.6486

Bartlett's Test (normal distribution)
Test statistic = 2.33, p-value = 0.993
* NOTE * Levene's test cannot be computed for these data.
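The StDev column of this output is just the sample standard deviation of the two replicates in each cell. The following sketch reproduces those values only; it does not compute the Bonferroni intervals or Bartlett's test. The name `cell_sd` is an assumption.

```python
import statistics

# BRAND x DEVICE yields: rows are devices 1-3, columns are brands 1-4.
data = [
    [[17.9, 18.1], [17.8, 17.8], [18.1, 18.2], [17.8, 17.9]],
    [[18.0, 18.2], [18.0, 18.3], [18.4, 18.1], [18.1, 18.5]],
    [[18.0, 17.8], [17.8, 18.0], [18.1, 18.3], [18.1, 17.9]],
]

# Sample standard deviation of the n = 2 replicates in each cell.
cell_sd = [[statistics.stdev(cell) for cell in row] for row in data]

for device, row in enumerate(cell_sd, start=1):
    for brand, sd in enumerate(row, start=1):
        print(device, brand, round(sd, 6))
```

The device 1, brand 2 cell has identical replicates, so its standard deviation is exactly zero; that is the row printed with asterisks in place of interval limits.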
Fixed Effect Model
Random Effect Model
Additional assumptions:
• $\tau_i$ follows $N(0, \sigma^2_\tau)$ for all i, and they are independent.
• $\beta_j$ follows $N(0, \sigma^2_\beta)$ for all j, and they are independent.
• $(\tau\beta)_{ij}$ follows $N(0, \sigma^2_{\tau\beta})$ for all i, j, and they are independent.
• All these random components $\tau_i$, $\beta_j$, $(\tau\beta)_{ij}$, $\varepsilon_{ijk}$ are (mutually) independent.
Mixed Effect Model
(fixed rows and random columns)
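Another issue: MEAN SQUARE EXPECTATIONS
Table 17.17 (O/L 6th ed., p. 1057); for the mixed model, row = fixed and col = random.

           Fixed                            Random                                                 Mixed
MSRows     $\sigma^2 + Cn\theta_\tau$       $\sigma^2 + n\sigma^2_{\tau\beta} + Cn\sigma^2_\tau$   $\sigma^2 + n\sigma^2_{\tau\beta} + Cn\theta_\tau$
MSCol      $\sigma^2 + Rn\theta_\beta$      $\sigma^2 + n\sigma^2_{\tau\beta} + Rn\sigma^2_\beta$  $\sigma^2 + n\sigma^2_{\tau\beta} + Rn\sigma^2_\beta$
MSRC       $\sigma^2 + n\theta_{\tau\beta}$ $\sigma^2 + n\sigma^2_{\tau\beta}$                     $\sigma^2 + n\sigma^2_{\tau\beta}$
MSError    $\sigma^2$                       $\sigma^2$                                             $\sigma^2$

Reference: Design and Analysis of Experiments by D.C. Montgomery, 4th edition, Chapter 11.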
Fixed:  Specific levels chosen by the experimenter
Random: Levels chosen randomly from a large number of possibilities

Fixed:  All levels about which inferences are to be made are included in the experiment
Random: Levels are some of a large number possible

Fixed:  A definite number of qualitatively distinguishable levels, and we plan to study them all; or a continuous set of quantitative settings, but we choose a suitable, definite subset in a limited region and confine inferences to that subset
Random: Levels are a random sample from an infinite (or large) population
“In a great number of cases the investigator
may argue either way, depending on his mood and
his handling of the subject matter. In other words,
it is more a matter of assumption than of reality.”
Some authors say that if in doubt, assume
fixed model. Others say things like “I think in most
experimental situations the random model is
applicable.” [The latter quote is from a person
whose experiments are in the field of biology].
My own feeling is that in most areas of
management, a majority of experiments involve the
fixed model [e.g., specific promotional campaigns,
two specific ways of handling an issue on an income
statement, etc.] . Many cases involve neither a
“pure” fixed nor a “pure” random situation [e.g.,
selecting 3 prices from 6 “practical” possibilities].
Note that the issue sometimes becomes
irrelevant in a practical sense when (certain)
interactions are not present. Also note that each
assumption may yield you the same “answer” in
terms of practical application, in which case the
distinction may not be an important one.
How to Fit these Models in Minitab
• "Balanced ANOVA" can fit both the restricted and the unrestricted version. By default, it shows the unrestricted model.
• "General Linear Model" can only fit the unrestricted model.
• There is no difference between the restricted and unrestricted versions for the fixed effect and random effect models. It only matters for the mixed effect model.
More on Minitab
• The notation in the EMS output under the restricted model matches ours, but the notation under the unrestricted model differs from ours.
• General suggestion: use "General Linear Model" to fit the models, BUT use "Balanced ANOVA" with the restricted-model option to find the EMS for fixed and random effect models.
Two-Way ANOVA in Minitab
Stat >> Anova >> General Linear Model:
Model:           device brand device*brand
Random factors:  device
Results:         Tick "Display expected mean squares and variance components"
Factor plots:    Main effects plots & Interactions plots
Graphs:          Use standardized residuals for plots
General Linear Model: time versus device, brand

Factor  Type    Levels  Values
device  random       3  1, 2, 3
brand   fixed        4  1, 2, 3, 4

Analysis of Variance for time, using Adjusted SS for Tests

Source         DF   Seq SS   Adj SS   Adj MS     F      P
device          2  0.28000  0.28000  0.14000  7.64  0.022
brand           3  0.21000  0.21000  0.07000  3.82  0.076
device*brand    6  0.11000  0.11000  0.01833  0.73  0.633
Error          12  0.30000  0.30000  0.02500
Total          23  0.90000
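The F values in this output differ from the fixed-model ones because the denominators change. The sketch below assumes, as the reported F values suggest, that both main effects are tested against the interaction mean square while the interaction is still tested against error; the dictionary name `ms` is illustrative.

```python
# Mean squares from the ANOVA table (SS / DF).
ms = {
    "device": 0.28 / 2,
    "brand": 0.21 / 3,
    "device*brand": 0.11 / 6,
    "error": 0.30 / 12,
}

# Assumed denominators: interaction MS for the main effects,
# error MS for the interaction.
f_device = ms["device"] / ms["device*brand"]
f_brand = ms["brand"] / ms["device*brand"]
f_interaction = ms["device*brand"] / ms["error"]

print(round(f_device, 2), round(f_brand, 2), round(f_interaction, 2))
```

Dividing by the interaction mean square (0.01833) instead of the error mean square (0.025) is what moves the device F from 5.60 to 7.64 and the brand F from 2.80 to 3.82.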
Exercise: Lifetime of a Special-purpose Battery
It is important in battery testing to consider different temperatures and modes of use; a battery that is superior at one temperature and mode of use is not necessarily superior at other treatment combinations. The batteries were tested at four different temperatures for three modes of use (I for intermittent, C for continuous, S for sporadic). Analyze the data.
Battery Lifetime (2 replicates)

                      Temperature
Mode of use      1         2         3         4
    I          12, 16    15, 19    31, 39    53, 55
    C          15, 19    17, 17    30, 34    51, 49
    S          11, 17    24, 22    33, 37    61, 67
Interesting Example:* Brand Name Appeal for Men & Women

Design: 2 x 2, gender (M, F) by brand name (Frontiersman, April), 50 people per cell.

Mean Scores (dependent variable: intent-to-purchase)

                   "Frontiersman"   "April"
males   (n=50)          4.44          3.50
females (n=50)          2.04          4.52

(*) "Decision Sciences", Vol. 9, p. 470, 1978
[Figure: Interaction Plot - Data Means for y. Vertical axis: mean; horizontal axis: gender; one line per brand.]
ANOVA Results
Dependent variable: Intent-to-purchase (7 pt. scale)

Source            d.f.       MS         F
Sex (A)              1      23.80      5.61*
Brand name (B)       1      29.64      6.99**
A x B                1     146.21     34.48***
Error              196       4.24

*p<.05   **p<.01   ***p<.001
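The three effect mean squares in this table can be recovered from the four cell means alone, since each effect has one degree of freedom. The sketch below uses the standard single-degree-of-freedom contrast formula, SS = n (contrast)^2 / 4 for a 2 x 2 design with n observations per cell. The helper name `ss_from_contrast` is an assumption, and the error mean square is taken from the table rather than computed.

```python
n = 50           # people per cell
ms_error = 4.24  # error mean square quoted in the ANOVA table

mean = {
    ("male", "Frontiersman"): 4.44,
    ("male", "April"): 3.50,
    ("female", "Frontiersman"): 2.04,
    ("female", "April"): 4.52,
}


def ss_from_contrast(signs):
    """Sum of squares (1 d.f.) for a +1/-1 contrast among the cell means."""
    contrast = sum(sign * mean[cell] for cell, sign in signs.items())
    return n * contrast ** 2 / 4


ss_sex = ss_from_contrast({
    ("male", "Frontiersman"): 1, ("male", "April"): 1,
    ("female", "Frontiersman"): -1, ("female", "April"): -1,
})
ss_brand = ss_from_contrast({
    ("male", "Frontiersman"): 1, ("male", "April"): -1,
    ("female", "Frontiersman"): 1, ("female", "April"): -1,
})
ss_ab = ss_from_contrast({
    ("male", "Frontiersman"): 1, ("male", "April"): -1,
    ("female", "Frontiersman"): -1, ("female", "April"): 1,
})

f_ratios = {
    "sex": round(ss_sex / ms_error, 2),
    "brand": round(ss_brand / ms_error, 2),
    "AxB": round(ss_ab / ms_error, 2),
}
print(f_ratios)
```

The interaction contrast (4.44 - 3.50 - 2.04 + 4.52 = 3.42) is far larger than either main-effect contrast, which is why the A x B term dominates the table.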
Two Factors with No Replication

Factors A and B; the row factor has 4 levels and the column factor has 3 levels:

          1     2     3
  1       7     3     4
  2      10     6     8
  3       6     2     5
  4       9     5     7

When there's no replication, there is no "pure" way to estimate ERROR.
Error is measured by considering more than one observation (i.e., replication) at the same "treatment combination" (i.e., experimental conditions).
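Our model for analysis is "technically":
$$Y_{ij} = \mu + \rho_i + \tau_j + I_{ij}$$
i = 1, ..., R
j = 1, ..., C
We can write:
$$Y_{ij} = \bar{Y}_{\cdot\cdot} + (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot}) + (\bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot}) + (Y_{ij} - \bar{Y}_{i\cdot} - \bar{Y}_{\cdot j} + \bar{Y}_{\cdot\cdot})$$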
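After bringing $\bar{Y}_{\cdot\cdot}$ to the other side of the equation, squaring both sides, and double summing over i and j, we find:
$$\sum_{j=1}^{C} \sum_{i=1}^{R} (Y_{ij} - \bar{Y}_{\cdot\cdot})^2 = C \sum_{i=1}^{R} (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2 + R \sum_{j=1}^{C} (\bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot})^2 + \sum_{i=1}^{R} \sum_{j=1}^{C} (Y_{ij} - \bar{Y}_{i\cdot} - \bar{Y}_{\cdot j} + \bar{Y}_{\cdot\cdot})^2$$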
TSS = SSB_ROWS + SSB_Col + SSI_R,C
Degrees of freedom: R•C - 1 = (R - 1) + (C - 1) + (R - 1)(C - 1)
We know, $E(MS_{Int.}) = \sigma^2 + V_{Int.}$
If we assume $V_{Int.} = 0$, then $E(MS_{Int.}) = \sigma^2$, and we can call SSI_R,C "SSW" and MSInt "MSW".
And, our model may be rewritten:
$$Y_{ij} = \mu + \rho_i + \tau_j + \varepsilon_{ij}$$
and the "labels" would become:
TSS = SSB_ROWS + SSB_Col + SSW (Error)
In our problem:
SSB_rows = 28.67
SSB_col = 32
SSW = 1.33
and:
ANOVA (at $\alpha$ = .01)

Source    SSQ      df    MSQ      Fcalc    FTV
rows      28.67     3     9.56    43       FTV (3,6) = 9.78
col       32.00     2    16.00    72       FTV (2,6) = 10.93
Error      1.33     6     0.22
TSS       62       11
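The no-replication analysis can be sketched in a few lines. The table layout below (four rows by three columns) is an assumption about how the twelve values are arranged; it is the arrangement that reproduces the quoted sums of squares. The interaction sum of squares is relabelled as error, as described above.

```python
# Assumed layout: 4 row levels by 3 column levels, one observation per cell.
table = [
    [7, 3, 4],
    [10, 6, 8],
    [6, 2, 5],
    [9, 5, 7],
]
R, C = len(table), len(table[0])

grand = sum(map(sum, table)) / (R * C)
row_mean = [sum(r) / C for r in table]
col_mean = [sum(table[i][j] for i in range(R)) / R for j in range(C)]

ss_rows = C * sum((m - grand) ** 2 for m in row_mean)
ss_cols = R * sum((m - grand) ** 2 for m in col_mean)
tss = sum((y - grand) ** 2 for r in table for y in r)

# With no replication, what is left over (the interaction SS) serves as SSW.
ss_w = tss - ss_rows - ss_cols
ms_w = ss_w / ((R - 1) * (C - 1))

f_rows = (ss_rows / (R - 1)) / ms_w
f_cols = (ss_cols / (C - 1)) / ms_w
print(round(ss_rows, 2), round(ss_cols, 2), round(ss_w, 2),
      round(f_rows, 1), round(f_cols, 1))
```

Both F ratios are far above the tabled values at the .01 level quoted in the ANOVA table.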
What if we're wrong about there being no interaction?
If we "think" our ratio is, in expectation (say, for ROWS),
$$\frac{\sigma^2 + V_{ROWS}}{\sigma^2}$$
and it really is (because there's interaction)
$$\frac{\sigma^2 + V_{ROWS}}{\sigma^2 + V_{int'n}}$$
then being wrong can lead only to giving us an underestimated Fcalc.
Thus, if we've REJECTED Ho, we can feel confident of our conclusion, even if there's interaction.
If we've ACCEPTED Ho, only then could the no-interaction assumption be CRITICAL.