Three or More Factors: Latin Squares

Always be contented, be grateful,
be understanding and be compassionate.
Blocking
• We add a factor, even if it is not of interest in itself, so that
the prime factors are studied under more homogeneous
conditions. Such a factor is called a “block”. Most of the time, the
block does not interact with the prime factors.
• Popular block factors are “location”, “gender”, and so
on.
• A two-factor design with one block factor is called a
“randomized block design”.
RBD Model (Section 15.2)
• A randomized (complete) block design is an
experimental design for comparing t
treatments (or levels) in b blocks.
Within each block, treatments are randomly
assigned to units, without replication.
• The probability model of the RBD is the same as
the two-way ANOVA model with no interaction
term (so multiple comparisons can be conducted
for each factor separately).
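As a sketch of what "two-way ANOVA model with no interaction term" means here, the model can be written as below. The symbols $\tau_i$ (treatment effect) and $\beta_j$ (block effect) are assumptions chosen for this note, not notation taken from the slides.

```latex
y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij},
\qquad i = 1, \ldots, t, \quad j = 1, \ldots, b,
\qquad \varepsilon_{ij} \sim N(0, \sigma^2) \text{ independently.}
```

With one observation per treatment-block cell, the error term has $(t - 1)(b - 1)$ degrees of freedom.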
For example, suppose that we are studying worker
absenteeism as a function of the worker’s age,
with three age levels: 25-30, 40-55, and
55-60. However, a worker’s gender may also affect
the amount of absenteeism. Even though we are
not particularly concerned with the impact of gender,
we want to ensure that the gender factor does not
pollute our conclusions about the effect of age.
Moreover, it seems unlikely that “gender” interacts
with “age”. We therefore include “gender” as a block factor.
O/L: Example 15.1
• Goal: To compare the effects of 3 different
insecticides on a variety of string beans.
• Condition: It was necessary to use 4 different
plots of land.
• Response of interest: the number of seedlings
that emerged per row.
Data:
insecticide  plot  seedlings
1            1     56
1            2     48
1            3     66
1            4     62
2            1     83
2            2     78
2            3     94
2            4     93
3            1     80
3            2     72
3            3     83
3            4     85
Minitab >> General Linear Model; response: seedlings; model: insecticide and plot

General Linear Model: seedlings versus insecticide, plot

Analysis of Variance for seedlings, using Adjusted SS for Tests

Source       DF   Seq SS   Adj SS  Adj MS       F      P
insecticide   2  1832.00  1832.00  916.00  211.38  0.000
plot          3   438.00   438.00  146.00   33.69  0.000
Error         6    26.00    26.00    4.33
Total        11  2296.00

S = 2.08167   R-Sq = 98.87%   R-Sq(adj) = 97.92%

Unusual Observations for seedlings

Obs  seedlings      Fit  SE Fit  Residual  St Resid
 11    83.0000  86.0000  1.4720   -3.0000   -2.04 R

R denotes an observation with a large standardized residual.
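The sums of squares in this output can be checked by hand from the insecticide data. The following is a minimal sketch in plain Python, not Minitab's own procedure; the function name `rbd_anova` and the nested-list layout (rows = insecticides, columns = plots) are choices made for this illustration.

```python
# Seedling counts from the insecticide example:
# rows = insecticides 1-3 (treatments), columns = plots 1-4 (blocks).
seedlings = [
    [56, 48, 66, 62],
    [83, 78, 94, 93],
    [80, 72, 83, 85],
]


def rbd_anova(y):
    """Sums of squares, degrees of freedom and F ratios for an unreplicated RBD."""
    t, b = len(y), len(y[0])
    grand = sum(sum(row) for row in y) / (t * b)
    trt_means = [sum(row) / b for row in y]
    blk_means = [sum(y[i][j] for i in range(t)) / t for j in range(b)]

    ss_trt = b * sum((m - grand) ** 2 for m in trt_means)
    ss_blk = t * sum((m - grand) ** 2 for m in blk_means)
    ss_tot = sum((v - grand) ** 2 for row in y for v in row)
    ss_err = ss_tot - ss_trt - ss_blk  # error obtained by subtraction

    df_trt, df_blk = t - 1, b - 1
    df_err = df_trt * df_blk
    ms_err = ss_err / df_err
    return {
        "ss": {"treatment": ss_trt, "block": ss_blk, "error": ss_err, "total": ss_tot},
        "df": {"treatment": df_trt, "block": df_blk, "error": df_err},
        "F": {
            "treatment": (ss_trt / df_trt) / ms_err,
            "block": (ss_blk / df_blk) / ms_err,
        },
    }


result = rbd_anova(seedlings)
for source, ss in result["ss"].items():
    print(f"{source:10s} SS = {ss:8.2f}")
for source, f in result["F"].items():
    print(f"{source:10s} F  = {f:8.2f}")
```

The p-values are left out because they need an F distribution function, which the Python standard library does not provide.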
[Figure: Residual Plots for seedlings. Panels: Normal Probability Plot (Percent vs. Residual), Versus Fits (Residual vs. Fitted Value), Histogram (Frequency vs. Residual), Versus Order (Residual vs. Observation Order).]
RBD with random blocks
• We would like our conclusions to apply to
a large pool of blocks.
• We are able to sample blocks randomly.
• Example: Minitab unit 5
– Goal: to study the differences among 3 appraisers in
their appraised values
– Blocks: 5 randomly selected properties
Latin Square Design (Section 15.3)
Example:
Three factors, A (block factor), B (block factor), and C
(treatment factor), each at three levels. A possible arrangement:

      B1   B2   B3
A1    C1   C1   C1
A2    C2   C2   C2
A3    C3   C3   C3
Notice, first, that these designs are squares; all factors
are at the same number of levels, though there is no
restriction on the nature of the levels themselves.
Notice that these squares are balanced: each letter
(level) appears the same number of times; this ensures
unbiased estimates of main effects.
How is this done in a square? Each treatment appears exactly
once in every row and every column.
Notice that these designs are incomplete; of the 27
possible combinations of three factors each at three
levels, we use only 9.
Example:
Three factors, A (block factor), B (block factor), and C
(treatment factor), each at three levels, in a Latin Square design;
nine combinations.

      B1   B2   B3
A1    C1   C2   C3
A2    C2   C3   C1
A3    C3   C1   C2
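The "once in every row and every column" condition is easy to check mechanically. The sketch below builds a cyclic square, which for m = 3 coincides with the arrangement in this example, and tests the Latin property. The helper names `cyclic_latin_square` and `is_latin_square` are made up for this illustration, and the cyclic construction is only one of many ways to obtain a Latin square.

```python
def cyclic_latin_square(m):
    """m x m square whose (i, j) entry is treatment ((i + j) mod m) + 1."""
    return [[(i + j) % m + 1 for j in range(m)] for i in range(m)]


def is_latin_square(square):
    """True if every treatment 1..m appears exactly once in each row and column."""
    m = len(square)
    levels = set(range(1, m + 1))
    rows_ok = all(len(row) == m and set(row) == levels for row in square)
    cols_ok = rows_ok and all(
        {square[i][j] for i in range(m)} == levels for j in range(m)
    )
    return rows_ok and cols_ok


square = cyclic_latin_square(3)       # rows A1..A3, columns B1..B3, entries = level of C
blocked_by_row = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]  # C constant within each row of A

print(square)
print(is_latin_square(square), is_latin_square(blocked_by_row))
print("cells used:", len(square) ** 2, "of", len(square) ** 3)
```

The second arrangement fails the check because each level of C is confined to a single level of A, so the effects of A and C cannot be separated.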
Example with 4 Levels per Factor

FACTORS           VARIABLE   LEVELS
Automobiles       A          four levels
Tire positions    B          four levels
Tire treatments   C          four levels

Response: lifetime of a tire (days)

        B1         B2         B3         B4
A1    C4  855    C3  877    C2  890    C1  997
A2    C1  962    C2  817    C3  845    C4  776
A3    C3  848    C4  841    C1  784    C2  776
A4    C2  831    C1  952    C4  806    C3  871
The Model for (Unreplicated)
Latin Squares
Example:
Three factors r, t, and g, each at m levels:

$$y_{ijk} = \mu + r_i + t_j + g_k + \varepsilon_{ijk}, \qquad i = 1, \ldots, m; \quad j = 1, \ldots, m; \quad k = 1, \ldots, m$$

In factor notation: Y = A + B + C + e
Note that interaction is not present in the model: the terms AB, AC, BC, and ABC are left out.
Same three assumptions: normality, constant variance, and randomness.
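Putting in Estimates:

$$y_{ijk} = \bar{y}_{\cdot\cdot\cdot} + (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot}) + (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot}) + (\bar{y}_{\cdot\cdot k} - \bar{y}_{\cdot\cdot\cdot}) + R$$

or, bringing $\bar{y}_{\cdot\cdot\cdot}$ to the left-hand side,

$$(y_{ijk} - \bar{y}_{\cdot\cdot\cdot}) = (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot}) + (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot}) + (\bar{y}_{\cdot\cdot k} - \bar{y}_{\cdot\cdot\cdot}) + R,$$

where

$$R = y_{ijk} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot k} + 2\bar{y}_{\cdot\cdot\cdot}$$

In words:
Total variability among yields
= Variability among yields associated with Rows
+ Variability among yields associated with Columns
+ Variability among yields associated with Inside Factor
+ R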
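Actually,

$$R = y_{ijk} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot k} + 2\bar{y}_{\cdot\cdot\cdot} = (y_{ijk} - \bar{y}_{\cdot\cdot\cdot}) - (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot}) - (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot}) - (\bar{y}_{\cdot\cdot k} - \bar{y}_{\cdot\cdot\cdot}),$$

an “interaction-like” term. (After all, there’s no
replication!)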
The analysis of variance (omitting the mean squares,
which are the ratios of second to third entries), and
expectations of mean squares:

Source of variation   Sum of squares                                                                      Degrees of freedom   Expected value of mean square
Rows                  $m \sum_{i=1}^{m} (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot})^2$             m - 1                $\sigma^2 + V_{\text{Rows}}$
Columns               $m \sum_{j=1}^{m} (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot})^2$            m - 1                $\sigma^2 + V_{\text{Col}}$
Inside factor         $m \sum_{k=1}^{m} (\bar{y}_{\cdot\cdot k} - \bar{y}_{\cdot\cdot\cdot})^2$            m - 1                $\sigma^2 + V_{\text{Inside factor}}$
Error                 by subtraction                                                                      (m - 1)(m - 2)       $\sigma^2$
Total                 $\sum_i \sum_j \sum_k (y_{ijk} - \bar{y}_{\cdot\cdot\cdot})^2$                       m^2 - 1
The expected values of the mean squares
immediately suggest the F ratios
appropriate for testing null hypotheses on
rows, columns and inside factor.
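Written out, the ratios suggested by the expected mean squares take the following form. The $MS$ symbols are shorthand introduced here (sum of squares divided by its degrees of freedom), not notation from the slides.

```latex
F_{\text{Rows}} = \frac{MS_{\text{Rows}}}{MS_{\text{Error}}}, \qquad
F_{\text{Columns}} = \frac{MS_{\text{Columns}}}{MS_{\text{Error}}}, \qquad
F_{\text{Inside factor}} = \frac{MS_{\text{Inside factor}}}{MS_{\text{Error}}}
```

Under the corresponding null hypothesis (the $V$ term equal to zero), numerator and denominator both estimate $\sigma^2$, and each ratio is referred to an $F$ distribution with $m - 1$ and $(m - 1)(m - 2)$ degrees of freedom. For $m = 4$ this gives 3 and 6 degrees of freedom.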
Our Example:
(Inside factor = Tire Treatment)

                       Tire Position
            B1        B2        B3        B4
Auto. A1   4  855    3  877    2  890    1  997
      A2   1  962    2  817    3  845    4  776
      A3   3  848    4  841    1  784    2  776
      A4   2  831    1  952    4  806    3  871

(Each cell shows the tire treatment number followed by the lifetime in days.)
General Linear Model: Lifetime versus Auto, Postn, Trtmnt

Factor  Type   Levels  Values
Auto    fixed       4  1 2 3 4
Postn   fixed       4  1 2 3 4
Trtmnt  fixed       4  1 2 3 4

Analysis of Variance for Lifetime, using Adjusted SS for Tests

Source  DF  Seq SS  Adj SS  Adj MS     F      P
Auto     3   17567   17567    5856  2.17  0.192
Postn    3    4679    4679    1560  0.58  0.650
Trtmnt   3   26722   26722    8907  3.31  0.099
Error    6   16165   16165    2694
Total   15   65132

Unusual Observations for Lifetime

Obs  Lifetime      Fit  SE Fit  Residual  St Resid
 11   784.000  851.250  41.034   -67.250    -2.12R
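The Latin square sums of squares can be recomputed directly from the row, column, and treatment totals. The sketch below is plain Python rather than Minitab's procedure. The cell layout in `tire_data` is an assumption: it is a reading of the tire table that agrees with the data-entry rows shown in these slides and with the ANOVA output, and the function name `latin_square_anova` is hypothetical.

```python
# (auto, position, treatment, lifetime in days).
# ASSUMPTION: this cell layout is a reading of the 4 x 4 tire table, listed
# position by position, with the automobile index varying fastest.
tire_data = [
    (1, 1, 4, 855), (2, 1, 1, 962), (3, 1, 3, 848), (4, 1, 2, 831),
    (1, 2, 3, 877), (2, 2, 2, 817), (3, 2, 4, 841), (4, 2, 1, 952),
    (1, 3, 2, 890), (2, 3, 3, 845), (3, 3, 1, 784), (4, 3, 4, 806),
    (1, 4, 1, 997), (2, 4, 4, 776), (3, 4, 2, 776), (4, 4, 3, 871),
]


def latin_square_anova(data):
    """Sums of squares and F ratios for an unreplicated m x m Latin square."""
    m = round(len(data) ** 0.5)
    grand = sum(obs[3] for obs in data) / len(data)

    def factor_ss(index):
        # m * sum over levels of (level mean - grand mean)^2
        totals = {}
        for obs in data:
            totals[obs[index]] = totals.get(obs[index], 0) + obs[3]
        return m * sum((tot / m - grand) ** 2 for tot in totals.values())

    ss = {"Auto": factor_ss(0), "Postn": factor_ss(1), "Trtmnt": factor_ss(2)}
    ss_total = sum((obs[3] - grand) ** 2 for obs in data)
    ss_error = ss_total - sum(ss.values())  # error obtained by subtraction

    df_factor, df_error = m - 1, (m - 1) * (m - 2)
    ms_error = ss_error / df_error
    f_ratios = {name: (value / df_factor) / ms_error for name, value in ss.items()}
    return ss, ss_error, ss_total, f_ratios


ss, ss_error, ss_total, f_ratios = latin_square_anova(tire_data)
for name in ss:
    print(f"{name:7s} SS = {ss[name]:9.1f}   F = {f_ratios[name]:.2f}")
print(f"Error   SS = {ss_error:9.1f}")
print(f"Total   SS = {ss_total:9.1f}")
```

Under this layout the computed sums of squares agree with the Minitab table to within its rounding to whole numbers, and the F ratios match to two decimals.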
Minitab DATA ENTRY

VAR1  VAR2  VAR3  VAR4
 855     1     1     4
 962     2     1     1
 848     3     1     3
 831     4     1     2
 877     1     2     3
 817     2     2     2
   .     .     .     .
   .     .     .     .
   .     .     .     .
 871     4     4     3
Latin Square with REPLICATION
• Case One: using the same rows and columns
for all Latin squares.
• Case Two: using different rows and columns
for different Latin squares.
• Case Three: using the same rows but
different columns for different Latin squares.
Treatment Assignments for n
Replications
• Case One: repeat the same Latin square n
times.
• Case Two: randomly select one Latin
square for each replication.
• Case Three: randomly select one Latin
square for each replication.
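One simple way to "randomly select" a Latin square is to shuffle the rows, columns, and treatment labels of a cyclic square. The sketch below does that. It is an assumption about how the selection might be carried out, not a method given in the slides, and for m of 4 or more this shuffling does not in general reach every possible Latin square. The function names are hypothetical.

```python
import random


def random_latin_square(treatments, rng=None):
    """Shuffle rows, columns and labels of a cyclic square (always a Latin square)."""
    rng = rng or random.Random()
    m = len(treatments)
    base = [[(i + j) % m for j in range(m)] for i in range(m)]
    rows = list(range(m))
    cols = list(range(m))
    labels = list(treatments)
    rng.shuffle(rows)
    rng.shuffle(cols)
    rng.shuffle(labels)
    return [[labels[base[r][c]] for c in cols] for r in rows]


def replicate_squares(treatments, n, same_square, rng=None):
    """Case One: same_square=True repeats one square n times.
    Cases Two and Three: same_square=False draws a fresh square per replication."""
    rng = rng or random.Random()
    if same_square:
        first = random_latin_square(treatments, rng)
        return [[row[:] for row in first] for _ in range(n)]
    return [random_latin_square(treatments, rng) for _ in range(n)]


for square in replicate_squares("ABCD", n=2, same_square=False, rng=random.Random(0)):
    for row in square:
        print(" ".join(row))
    print()
```

Passing a seeded `random.Random` makes the assignment reproducible, which helps when the layout has to be documented.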
Example: n = 2, m = 4, trtmnt = A, B, C, D
Case One:
            column                       column
row     1   2   3   4        row     1   2   3   4
1       A   B   C   D        1       A   B   C   D
2       B   C   D   A        2       B   C   D   A
3       C   D   A   B        3       C   D   A   B
4       D   A   B   C        4       D   A   B   C
• Row = 4 tire positions; column = 4 cars
Case Two
            column                       column
row     1   2   3   4        row     5   6   7   8
1       A   B   C   D        5       B   C   D   A
2       B   C   D   A        6       A   D   C   B
3       C   D   A   B        7       D   B   A   C
4       D   A   B   C        8       C   A   B   D
• Row = clinics; column = patients; letter = drugs for flu
Case Three
                        column
row     1   2   3   4   5   6   7   8
1       A   B   C   D   B   C   D   A
2       B   C   D   A   A   D   C   B
3       C   D   A   B   D   B   A   C
4       D   A   B   C   C   A   B   D
• Row = 4 tire positions; column = 8 cars
ANOVA for Case 1
SSBR, SSBC, SSBIF are computed the same way
as before, except that the multiplier m (say, for
rows) in $m \sum_i (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2$ becomes mn:
$$mn \sum_i (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2$$
and the degrees of freedom for error become
$$(nm^2 - 1) - 3(m - 1) = nm^2 - 3m + 2$$
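As a quick arithmetic check of the Case One degrees of freedom, here is a small sketch. The function name `case_one_df` is made up for this illustration.

```python
def case_one_df(m, n):
    """Degrees of freedom for n replications of the same m x m Latin square (Case One)."""
    total = n * m * m - 1
    per_factor = m - 1                 # rows, columns and inside factor each
    error = total - 3 * per_factor     # equals n*m^2 - 3*m + 2
    return {
        "rows": per_factor,
        "columns": per_factor,
        "inside factor": per_factor,
        "error": error,
        "total": total,
    }


df = case_one_df(m=4, n=2)
print(df)
```

For n = 2 and m = 4 the error term has 22 degrees of freedom, and with n = 1 the formula reduces to (m - 1)(m - 2), the unreplicated value.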
ANOVA for other cases:
1. SS: please refer to the book Statistical Principles of
Research Design and Analysis by R. Kuehl.
2. DF: (number of levels - 1) for all terms except error.
DF of error = total DF - the sum of the other
DFs.
Using Minitab in the same way gives ANOVA tables for
all cases.
Three or More Factors
Notation:
• Y = response; A, B, C, … = input factors
• AB = interaction between A and B
• ABC = interaction between A, B, and C
• A term involving k factors has order k,
e.g.
AB → order 2 term
ABC → order 3 term
• Full model = the model that includes all factors
and all of their interactions, denoted as
(1) Two factors
A|B (= A+B+AB)
(2) Three factors
A|B|C (= A+B+C+AB+AC+BC+ABC)
(3) And so on.
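The expansion of the bar notation follows a simple pattern: one term for every non-empty subset of the factors. A minimal sketch, assuming single-letter factor names (the function name `full_model_terms` is made up for this illustration):

```python
from itertools import combinations


def full_model_terms(factors):
    """All main effects and interactions of the given single-letter factors,
    listed by increasing order."""
    return [
        "".join(combo)
        for order in range(1, len(factors) + 1)
        for combo in combinations(factors, order)
    ]


print("A|B   =", " + ".join(full_model_terms("AB")))
print("A|B|C =", " + ".join(full_model_terms("ABC")))
print("terms in A|B|C|D:", len(full_model_terms("ABCD")))
```

With k factors the full model has 2^k - 1 terms, which is why the number of candidate terms grows quickly as factors are added.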
Backward Model Selection
1. Fit the full model and delete the least significant
(largest p-value) insignificant highest-order term.
2. Fit the reduced model from step 1 and again delete the
least significant insignificant highest-order term.
3. Repeat step 2 until all remaining highest-order terms
are significant.
4. Repeat the same procedure (deleting the least
significant term each time until no insignificant
terms remain) for the 2nd-highest order, then the 3rd-highest
order, …, and finally the order 1 terms.
5. Determine the final model and check its
assumptions.
Note.
If a term is in the current model, then all
lower-order terms involving factors in that
term must not be deleted, even if they are
insignificant.
E.g., if ABC is significant (so it is in the
model), then A, B, C, AB, AC, BC cannot be
deleted.
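Backward selection combined with this hierarchy rule can be sketched as a loop over orders, highest first. The code below is an illustration under stated assumptions, not a statistical package's routine: terms are strings of single-letter factors, `p_values_for` is a hypothetical callback standing in for "refit the model and read off the p-values", and the p-values in the usage example are invented purely to show the mechanics.

```python
def is_protected(term, model):
    """A term is protected if its factors all appear in a higher-order term in the model."""
    return any(set(term) < set(other) for other in model)


def backward_select(terms, p_values_for, alpha=0.05):
    """Delete one insignificant, unprotected term at a time, highest order first."""
    model = list(terms)
    for order in range(max(len(t) for t in model), 0, -1):
        while True:
            p = p_values_for(model)  # refit after every deletion
            candidates = [
                t for t in model
                if len(t) == order and p[t] > alpha and not is_protected(t, model)
            ]
            if not candidates:
                break
            model.remove(max(candidates, key=lambda t: p[t]))
    return model


# Hypothetical p-values, held fixed for the illustration. In a real analysis
# they change each time the model is refitted.
made_up_p = {"A": 0.30, "B": 0.01, "C": 0.60, "AB": 0.02, "AC": 0.40, "BC": 0.70, "ABC": 0.50}
full = ["A", "B", "C", "AB", "AC", "BC", "ABC"]

final_model = backward_select(full, lambda model: {t: made_up_p[t] for t in model})
print(final_model)
```

In this made-up run, A stays in the final model despite its large p-value because AB is retained, which is exactly the situation the note describes.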
Note.
The backward model selection procedure
can be very time-consuming if the number
of factors, k, is large. In such cases, we
delete all insignificant terms together when
we are processing terms of order 4 or
higher.
• Examples are in Minitab unit 11.