Chapter 4 Randomized Blocks, Latin Squares, and Related

Chapter 4 Experiments with
Blocking Factors
4.1 The Randomized Complete
Block Design
• Nuisance factor: a design factor that probably has
an effect on the response, but we are not interested
in that factor.
• Typical nuisance factors include batches of raw
material, operators, pieces of test equipment, time
(shifts, days, etc.), different experimental units
• If the nuisance variable is known and
controllable, we use blocking
• If the nuisance factor is known and
uncontrollable, sometimes we can use the
analysis of covariance (see Chapter 14) to
remove the effect of the nuisance factor from the
analysis
• If the nuisance factor is unknown and
uncontrollable (a “lurking” variable), we hope
that randomization balances out its impact across
the experiment
• Sometimes several sources of variability are
combined in a block, so the block becomes an
aggregate variable
• We wish to determine whether 4 different tips
produce different (mean) hardness readings on a
Rockwell hardness tester
• Assignment of the tips to an experimental unit;
that is, a test coupon
• Structure of a completely randomized experiment
• The test coupons are a source of nuisance
variability
• Alternatively, the experimenter may want to test
the tips across coupons of various hardness levels
• The need for blocking
• Randomized Complete block design (RCBD)
• To conduct this experiment as an RCBD, assign all
4 tips to each coupon
• Each coupon is called a “block”; that is, it’s a
more homogeneous experimental unit on which to
test the tips
• Variability between blocks can be large,
variability within a block should be relatively
small
• In general, a block is a specific level of the
nuisance factor
• A complete replicate of the basic experiment is
conducted in each block
• A block represents a restriction on
randomization
• All runs within a block are randomized
• Suppose that we use b = 4 blocks:
• Once again, we are interested in testing the
equality of treatment means, but now we have to
remove the variability associated with the
nuisance factor (the blocks)
Statistical Analysis of the RCBD
• Suppose that there are a treatments (factor levels)
and b blocks
• A statistical model (effects model) for the RCBD
is
$$y_{ij} = \mu + \tau_i + \beta_j + \epsilon_{ij}, \quad i = 1, 2, \ldots, a; \; j = 1, 2, \ldots, b$$
– $\mu$ is an overall mean, $\tau_i$ is the effect of the ith
treatment, and $\beta_j$ is the effect of the jth block
– $\epsilon_{ij} \sim \mathrm{NID}(0, \sigma^2)$
– $\sum_{i=1}^{a} \tau_i = 0, \quad \sum_{j=1}^{b} \beta_j = 0$
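• Means model for the RCBD
$$y_{ij} = \mu_{ij} + \epsilon_{ij}, \quad \mu_{ij} = \mu + \tau_i + \beta_j$$
• The relevant (fixed effects) hypotheses are
$$H_0: \mu_1 = \mu_2 = \cdots = \mu_a, \quad \text{where } \mu_i = (1/b) \sum_{j=1}^{b} (\mu + \tau_i + \beta_j) = \mu + \tau_i$$
• An equivalent way to state the above hypothesis is
$$H_0: \tau_1 = \tau_2 = \cdots = \tau_a = 0$$
• Notation:
$$y_{i.} = \sum_{j=1}^{b} y_{ij}, \quad i = 1, \ldots, a$$
$$y_{.j} = \sum_{i=1}^{a} y_{ij}, \quad j = 1, \ldots, b$$
$$y_{..} = \sum_{i=1}^{a} \sum_{j=1}^{b} y_{ij} = \sum_{i=1}^{a} y_{i.} = \sum_{j=1}^{b} y_{.j}$$
$$\bar{y}_{i.} = y_{i.} / b, \quad \bar{y}_{.j} = y_{.j} / a, \quad \bar{y}_{..} = y_{..} / N$$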
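• ANOVA partitioning of total variability:
$$\sum_{i=1}^{a} \sum_{j=1}^{b} (y_{ij} - \bar{y}_{..})^2 = \sum_{i=1}^{a} \sum_{j=1}^{b} \left[ (\bar{y}_{i.} - \bar{y}_{..}) + (\bar{y}_{.j} - \bar{y}_{..}) + (y_{ij} - \bar{y}_{i.} - \bar{y}_{.j} + \bar{y}_{..}) \right]^2$$
$$= b \sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2 + a \sum_{j=1}^{b} (\bar{y}_{.j} - \bar{y}_{..})^2 + \sum_{i=1}^{a} \sum_{j=1}^{b} (y_{ij} - \bar{y}_{i.} - \bar{y}_{.j} + \bar{y}_{..})^2$$
$$SS_T = SS_{Treatments} + SS_{Blocks} + SS_E$$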
• SST = SSTreatments + SSBlocks + SSE
• There are N = ab observations in total, so SST has N – 1 degrees
of freedom.
• With a treatments and b blocks, SSTreatments and SSBlocks
have a – 1 and b – 1 degrees of freedom, respectively.
• SSE has ab – 1 – (a – 1) – (b – 1) = (a – 1)(b – 1)
degrees of freedom.
• From Theorem 3.1, $SS_{Treatments}/\sigma^2$, $SS_{Blocks}/\sigma^2$ and
$SS_E/\sigma^2$ are independently distributed chi-square random variables.
• The expected values of mean squares:
$$E(MS_{Treatments}) = \sigma^2 + \frac{b \sum_{i=1}^{a} \tau_i^2}{a - 1}$$
$$E(MS_{Blocks}) = \sigma^2 + \frac{a \sum_{j=1}^{b} \beta_j^2}{b - 1}$$
$$E(MS_E) = \sigma^2$$
• For testing the equality of treatment means,
$$F_0 = \frac{MS_{Treatments}}{MS_E} \sim F_{a-1,\,(a-1)(b-1)}$$
• The ANOVA table
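• Alternative computing formulas:
$$SS_T = \sum_{i=1}^{a} \sum_{j=1}^{b} y_{ij}^2 - \frac{y_{..}^2}{N}, \quad SS_{Treatments} = \frac{1}{b} \sum_{i=1}^{a} y_{i.}^2 - \frac{y_{..}^2}{N}$$
$$SS_{Blocks} = \frac{1}{a} \sum_{j=1}^{b} y_{.j}^2 - \frac{y_{..}^2}{N}, \quad SS_E = SS_T - SS_{Treatments} - SS_{Blocks}$$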
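The computing formulas for the RCBD can be sketched in a few lines of Python. This is only an illustration: the function name `rcbd_anova` and the 3 x 4 data table are hypothetical and are not taken from the chapter's examples.

```python
def rcbd_anova(y):
    """ANOVA quantities for an RCBD, where y[i][j] is treatment i in block j."""
    a, b = len(y), len(y[0])
    n = a * b
    trt_totals = [sum(row) for row in y]                              # y_i.
    blk_totals = [sum(y[i][j] for i in range(a)) for j in range(b)]   # y_.j
    correction = sum(trt_totals) ** 2 / n                             # y..^2 / N
    ss_t = sum(v * v for row in y for v in row) - correction
    ss_trt = sum(t * t for t in trt_totals) / b - correction
    ss_blk = sum(t * t for t in blk_totals) / a - correction
    ss_e = ss_t - ss_trt - ss_blk
    df_trt, df_blk, df_e = a - 1, b - 1, (a - 1) * (b - 1)
    f0 = (ss_trt / df_trt) / (ss_e / df_e)   # MS_Treatments / MS_E
    return {"ss_t": ss_t, "ss_trt": ss_trt, "ss_blk": ss_blk, "ss_e": ss_e,
            "df_trt": df_trt, "df_blk": df_blk, "df_e": df_e, "f0": f0}


# Hypothetical responses: 3 treatments (rows) by 4 blocks (columns).
data = [
    [12.0, 14.0, 11.0, 15.0],
    [13.0, 16.0, 12.0, 17.0],
    [15.0, 17.0, 15.0, 18.0],
]
result = rcbd_anova(data)
for name, value in result.items():
    print(name, round(value, 4))
```

The value `f0` would then be compared with the upper percentage point of the F distribution with a – 1 and (a – 1)(b – 1) degrees of freedom.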
• To conduct this experiment as an RCBD, assign all
4 pressures to each of the 6 batches of resin
• Each batch of resin is called a “block”; that is, it’s
a more homogeneous experimental unit on which to
test the extrusion pressures
4.1.2 Model Adequacy Checking
• Residual Analysis
• Residual: $e_{ij} = y_{ij} - \hat{y}_{ij} = y_{ij} - \bar{y}_{i.} - \bar{y}_{.j} + \bar{y}_{..}$
• Basic residual plots indicate that the normality and
constant variance assumptions are satisfied
• No obvious problems with randomization
Multiple Comparisons (Fisher
LSD)
• Can also plot residuals versus the type of tip
(residuals by factor) and versus the blocks. Also
plot residuals versus the fitted values.
• These plots provide more information about the
constant variance assumption and possible outliers
4.1.3 Some Other Aspects of the Randomized
Complete Block Design
• The model for the RCBD is completely additive.
$$y_{ij} = \mu + \tau_i + \beta_j + \epsilon_{ij}, \quad i = 1, 2, \ldots, a; \; j = 1, 2, \ldots, b$$
• Interactions?
• For example:
$$E(y_{ij}) = \mu \tau_i \beta_j \;\Rightarrow\; \ln E(y_{ij}) = \ln \mu + \ln \tau_i + \ln \beta_j$$
• The treatments and blocks are random.
• Choice of sample size:
– As the number of blocks increases, the number of replicates
and the number of error degrees of freedom both increase
– $$\Phi^2 = \frac{b \sum_{i=1}^{a} \tau_i^2}{a \sigma^2}$$
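As a rough illustration of how the number of blocks enters the sample size calculation, the sketch below evaluates Φ² and the error degrees of freedom for a few values of b. The treatment effects, the error variance, and the helper name `phi_squared` are all assumed values chosen for the example, not quantities from the slides.

```python
def phi_squared(tau, sigma2, b):
    """Phi^2 = b * sum(tau_i^2) / (a * sigma^2) for an RCBD with b blocks."""
    a = len(tau)
    return b * sum(t * t for t in tau) / (a * sigma2)


tau = [-1.0, 0.0, 1.0]   # hypothetical treatment effects (they sum to zero)
sigma2 = 1.0             # hypothetical error variance
for b in (2, 4, 6):
    error_df = (len(tau) - 1) * (b - 1)   # (a - 1)(b - 1)
    print(b, error_df, phi_squared(tau, sigma2, b))
```

Both Φ² and the error degrees of freedom grow with b, which is why adding blocks makes the test more sensitive.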
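• Estimating missing values:
– Approximate analysis: estimate the missing
value and then do the ANOVA.
– Assume the missing value is x. Minimize SSE to
find x:
$$SS_E = x^2 - \frac{1}{b} (y'_{i.} + x)^2 - \frac{1}{a} (y'_{.j} + x)^2 + \frac{1}{ab} (y'_{..} + x)^2 + R$$
$$x = \frac{a y'_{i.} + b y'_{.j} - y'_{..}}{(a - 1)(b - 1)}$$
where $y'_{i.}$, $y'_{.j}$ and $y'_{..}$ are the treatment, block and grand totals computed without the missing observation, and R collects the terms that do not involve x.
– The error degrees of freedom are reduced by 1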
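A minimal sketch of the missing-value estimate follows. The function name `estimate_missing` and the data table are hypothetical; the missing cell is marked with `None` and is skipped when the primed totals are formed.

```python
def estimate_missing(y, i_miss, j_miss):
    """Estimate one missing RCBD observation; y[i][j] is treatment i, block j."""
    a, b = len(y), len(y[0])
    # Primed totals: sums that leave out the missing cell.
    trt_total = sum(y[i_miss][j] for j in range(b) if j != j_miss)
    blk_total = sum(y[i][j_miss] for i in range(a) if i != i_miss)
    grand_total = sum(y[i][j] for i in range(a) for j in range(b)
                      if (i, j) != (i_miss, j_miss))
    return (a * trt_total + b * blk_total - grand_total) / ((a - 1) * (b - 1))


# Hypothetical 3 x 4 layout with the observation for treatment 1, block 2 lost.
table = [
    [12.0, 14.0, 11.0, 15.0],
    [13.0, 16.0, None, 17.0],
    [15.0, 17.0, 15.0, 18.0],
]
x = estimate_missing(table, 1, 2)
print(round(x, 4))
```

After filling in `x`, the usual ANOVA is run with the error degrees of freedom reduced by one.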
4.1.4 Estimating Model Parameters and the General
Regression Significance Test
• The linear statistical model
$$y_{ij} = \mu + \tau_i + \beta_j + \epsilon_{ij}, \quad i = 1, 2, \ldots, a; \; j = 1, 2, \ldots, b$$
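• The normal equations
$$
\begin{aligned}
\mu &: \; ab\hat{\mu} + b\hat{\tau}_1 + b\hat{\tau}_2 + \cdots + b\hat{\tau}_a + a\hat{\beta}_1 + a\hat{\beta}_2 + \cdots + a\hat{\beta}_b = y_{..} \\
\tau_1 &: \; b\hat{\mu} + b\hat{\tau}_1 + \hat{\beta}_1 + \hat{\beta}_2 + \cdots + \hat{\beta}_b = y_{1.} \\
&\;\;\vdots \\
\tau_a &: \; b\hat{\mu} + b\hat{\tau}_a + \hat{\beta}_1 + \hat{\beta}_2 + \cdots + \hat{\beta}_b = y_{a.} \\
\beta_1 &: \; a\hat{\mu} + \hat{\tau}_1 + \hat{\tau}_2 + \cdots + \hat{\tau}_a + a\hat{\beta}_1 = y_{.1} \\
&\;\;\vdots \\
\beta_b &: \; a\hat{\mu} + \hat{\tau}_1 + \hat{\tau}_2 + \cdots + \hat{\tau}_a + a\hat{\beta}_b = y_{.b}
\end{aligned}
$$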
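• Under the constraints
$$\sum_{i=1}^{a} \hat{\tau}_i = 0, \quad \sum_{j=1}^{b} \hat{\beta}_j = 0$$
the solution is
$$\hat{\mu} = \bar{y}_{..}, \quad \hat{\tau}_i = \bar{y}_{i.} - \bar{y}_{..}, \quad \hat{\beta}_j = \bar{y}_{.j} - \bar{y}_{..}$$
and the fitted values are
$$\hat{y}_{ij} = \hat{\mu} + \hat{\tau}_i + \hat{\beta}_j = \bar{y}_{i.} + \bar{y}_{.j} - \bar{y}_{..}$$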
• The sum of squares for fitting the full model:
$$R(\mu, \tau, \beta) = \hat{\mu} y_{..} + \sum_{i=1}^{a} \hat{\tau}_i y_{i.} + \sum_{j=1}^{b} \hat{\beta}_j y_{.j} = \sum_{i=1}^{a} \frac{y_{i.}^2}{b} + \sum_{j=1}^{b} \frac{y_{.j}^2}{a} - \frac{y_{..}^2}{ab}$$
• The error sum of squares
$$SS_E = \sum_{i=1}^{a} \sum_{j=1}^{b} y_{ij}^2 - R(\mu, \tau, \beta) = \sum_{i=1}^{a} \sum_{j=1}^{b} (y_{ij} - \bar{y}_{i.} - \bar{y}_{.j} + \bar{y}_{..})^2$$
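• The sum of squares due to treatments:
$$R(\tau \mid \mu, \beta) = R(\mu, \tau, \beta) - R(\mu, \beta) = \sum_{i=1}^{a} \frac{y_{i.}^2}{b} - \frac{y_{..}^2}{ab}$$
where
$$R(\mu, \beta) = \hat{\mu} y_{..} + \sum_{j=1}^{b} \hat{\beta}_j y_{.j} = \sum_{j=1}^{b} \frac{y_{.j}^2}{a}$$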
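The general regression significance test can be checked numerically with a short sketch. The function name `regression_sums` and the data are hypothetical; the code only evaluates the closed-form expressions for the full-model and reduced-model reductions in sums of squares.

```python
def regression_sums(y):
    """Return R(mu,tau,beta), R(mu,beta), R(tau | mu,beta) and SS_E for an RCBD."""
    a, b = len(y), len(y[0])
    trt = [sum(row) for row in y]                              # y_i.
    blk = [sum(y[i][j] for i in range(a)) for j in range(b)]   # y_.j
    grand = sum(trt)                                           # y..
    r_full = (sum(t * t for t in trt) / b
              + sum(t * t for t in blk) / a
              - grand ** 2 / (a * b))
    r_reduced = sum(t * t for t in blk) / a                    # model without tau
    ss_e = sum(v * v for row in y for v in row) - r_full
    return r_full, r_reduced, r_full - r_reduced, ss_e


# Hypothetical responses: 3 treatments (rows) by 4 blocks (columns).
responses = [
    [12.0, 14.0, 11.0, 15.0],
    [13.0, 16.0, 12.0, 17.0],
    [15.0, 17.0, 15.0, 18.0],
]
r_full, r_reduced, r_trt, ss_e = regression_sums(responses)
print(round(r_trt, 4), round(ss_e, 4))
```

The extra sum of squares `r_trt` agrees with the treatment sum of squares from the ANOVA partition, which is the point of the test.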
4.2 The Latin Square Design
• RCBD removes a known and controllable
nuisance variable.
• Example: the effects of five different formulations
of a rocket propellant used in aircrew escape
systems on the observed burning rate.
– Remove two nuisance factors: batches of raw
material and operators
• Latin square design: rows and columns are
orthogonal to treatments.
• The Latin square design is used to eliminate two
nuisance sources, and allows blocking in two
directions (rows and columns)
• A Latin square is usually a p × p square in which each
cell contains one of the p letters that correspond
to the treatments, and each letter occurs once and
only once in each row and column.
• See Page 139
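The defining property (each letter once and only once in every row and column) is easy to check in code. The sketch below builds a square by cyclic shifts, which is just one convenient construction assumed for illustration; the function names are hypothetical.

```python
def cyclic_latin_square(p):
    """Build a p x p Latin square by shifting the first row one step per row."""
    letters = [chr(ord("A") + i) for i in range(p)]
    return [[letters[(i + j) % p] for j in range(p)] for i in range(p)]


def is_latin_square(square):
    """True if every symbol occurs exactly once in each row and each column."""
    p = len(square)
    symbols = set(square[0])
    if len(symbols) != p or any(len(row) != p for row in square):
        return False
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[i][j] for i in range(p)} == symbols for j in range(p))
    return rows_ok and cols_ok


square = cyclic_latin_square(5)
for row in square:
    print(" ".join(row))
print(is_latin_square(square))
```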
• The statistical (effects) model is
$$y_{ijk} = \mu + \alpha_i + \tau_j + \beta_k + \epsilon_{ijk}, \quad i, j, k = 1, 2, \ldots, p$$
– $y_{ijk}$ is the observation in the ith row and kth
column for the jth treatment, $\mu$ is the overall
mean, $\alpha_i$ is the ith row effect, $\tau_j$ is the jth
treatment effect, $\beta_k$ is the kth column effect and
$\epsilon_{ijk}$ is the random error.
– This model is completely additive.
– Only two of three subscripts are needed to
denote a particular observation.
• Sum of squares:
SST = SSRows + SSColumns + SSTreatments + SSE
• The degrees of freedom:
p² – 1 = (p – 1) + (p – 1) + (p – 1) + (p – 2)(p – 1)
• The appropriate statistic for testing for no
differences in treatment means is
$$F_0 = \frac{MS_{Treatments}}{MS_E} \sim F_{p-1,\,(p-2)(p-1)}$$
• ANOVA table
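The Latin square partition can be sketched along the same lines as the RCBD. The function name, the 4 × 4 responses, and the cyclic treatment layout below are all hypothetical and serve only to show the arithmetic.

```python
def latin_square_anova(y, trt):
    """y[i][k]: response in row i, column k; trt[i][k]: treatment index there."""
    p = len(y)
    n = p * p
    row_tot = [sum(y[i]) for i in range(p)]
    col_tot = [sum(y[i][k] for i in range(p)) for k in range(p)]
    trt_tot = [0.0] * p
    for i in range(p):
        for k in range(p):
            trt_tot[trt[i][k]] += y[i][k]
    correction = sum(row_tot) ** 2 / n                 # y...^2 / p^2
    ss_t = sum(v * v for row in y for v in row) - correction
    ss_rows = sum(t * t for t in row_tot) / p - correction
    ss_cols = sum(t * t for t in col_tot) / p - correction
    ss_trt = sum(t * t for t in trt_tot) / p - correction
    ss_e = ss_t - ss_rows - ss_cols - ss_trt
    df_trt, df_e = p - 1, (p - 2) * (p - 1)
    f0 = (ss_trt / df_trt) / (ss_e / df_e)
    return {"ss_rows": ss_rows, "ss_cols": ss_cols, "ss_trt": ss_trt,
            "ss_e": ss_e, "df_trt": df_trt, "df_e": df_e, "f0": f0}


# Hypothetical 4 x 4 experiment; treatment in cell (i, k) is (i + k) mod 4.
layout = [[(i + k) % 4 for k in range(4)] for i in range(4)]
y_ls = [
    [10.0, 12.0, 15.0, 11.0],
    [13.0, 14.0, 12.0, 16.0],
    [17.0, 11.0, 13.0, 14.0],
    [12.0, 18.0, 14.0, 15.0],
]
ls_result = latin_square_anova(y_ls, layout)
for name, value in ls_result.items():
    print(name, round(value, 4))
```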
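• Least squares estimates of the model parameters $\mu$, $\alpha_i$, $\tau_j$, $\beta_k$: the normal equations are
$$
\begin{aligned}
\mu &: \; p^2\hat{\mu} + p \sum_{i=1}^{p} \hat{\alpha}_i + p \sum_{j=1}^{p} \hat{\tau}_j + p \sum_{k=1}^{p} \hat{\beta}_k = y_{...} \\
\alpha_i &: \; p\hat{\mu} + p\hat{\alpha}_i + \sum_{j=1}^{p} \hat{\tau}_j + \sum_{k=1}^{p} \hat{\beta}_k = y_{i..} \\
\tau_j &: \; p\hat{\mu} + \sum_{i=1}^{p} \hat{\alpha}_i + p\hat{\tau}_j + \sum_{k=1}^{p} \hat{\beta}_k = y_{.j.} \\
\beta_k &: \; p\hat{\mu} + \sum_{i=1}^{p} \hat{\alpha}_i + \sum_{j=1}^{p} \hat{\tau}_j + p\hat{\beta}_k = y_{..k}
\end{aligned}
$$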
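• Under the constraints
$$\sum_{i=1}^{p} \hat{\alpha}_i = \sum_{j=1}^{p} \hat{\tau}_j = \sum_{k=1}^{p} \hat{\beta}_k = 0$$
the solution is
$$\hat{\mu} = \frac{y_{...}}{p^2} = \bar{y}_{...}$$
$$\hat{\alpha}_i = \bar{y}_{i..} - \bar{y}_{...}$$
$$\hat{\tau}_j = \bar{y}_{.j.} - \bar{y}_{...}$$
$$\hat{\beta}_k = \bar{y}_{..k} - \bar{y}_{...}$$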
• Example 4.3
• The residuals
$$e_{ijk} = y_{ijk} - \hat{y}_{ijk} = y_{ijk} - \bar{y}_{i..} - \bar{y}_{.j.} - \bar{y}_{..k} + 2\bar{y}_{...}$$
• If one observation is missing,
$$y_{ijk} = \frac{p (y'_{i..} + y'_{.j.} + y'_{..k}) - 2 y'_{...}}{(p - 2)(p - 1)}$$
• Standard Latin square
• Random order
• Replication of Latin Squares:
– The same batches and operators
– The same batches and different operators
– Different batches and different operators
4.3 The Graeco-Latin Square Design
• Graeco-Latin square:
– Two Latin Squares
– One uses Greek letters and the other uses Latin letters.
– Two Latin Squares are orthogonal
– Table 4.17
– Block in three directions
– Four factors (row, column, Latin letter and
Greek letter)
– Each factor has p levels, for a total of p² runs
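Orthogonality of two Latin squares means that, when one is superimposed on the other, every (Latin letter, Greek letter) pair appears exactly once. The sketch below checks this for two squares built with the assumed constructions (i + j) mod p and (2i + j) mod p, which happen to be orthogonal for p = 5; they are not the squares of Table 4.17.

```python
def are_orthogonal(sq1, sq2):
    """True if superimposing the squares yields each ordered pair exactly once."""
    p = len(sq1)
    pairs = {(sq1[i][j], sq2[i][j]) for i in range(p) for j in range(p)}
    return len(pairs) == p * p


p = 5
latin = [[(i + j) % p for j in range(p)] for i in range(p)]       # "Latin letters"
greek = [[(2 * i + j) % p for j in range(p)] for i in range(p)]   # "Greek letters"

print(are_orthogonal(latin, greek))   # the pair forms a Graeco-Latin square
print(are_orthogonal(latin, latin))   # a square is never orthogonal to itself
```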
• The statistical model:
$$y_{ijkl} = \mu + \theta_i + \tau_j + \omega_k + \Psi_l + \epsilon_{ijkl}, \quad i, j, k, l = 1, \ldots, p$$
– $y_{ijkl}$ is the observation in the ith row and lth
column for Latin letter j and Greek letter k
– $\mu$ is the overall mean, $\theta_i$ is the ith row effect, $\tau_j$
is the effect of Latin letter treatment j, $\omega_k$ is the
effect of Greek letter treatment k, and $\Psi_l$ is the
effect of column l.
– ANOVA table (Table 4.18)
– Under H0, the test statistic follows an $F_{p-1,\,(p-3)(p-1)}$
distribution.
• Example 4.4
– Add a block factor: 5 test assemblies
4.4 Balanced Incomplete Block
Designs
• It may not be possible to run all the treatment
combinations in each block.
• Balanced incomplete block design (BIBD)
• Any two treatments appear together an equal
number of times.
• There are a treatments and each block can hold
exactly k (k < a) treatments.
• For example: A chemical process is a function of
the type of catalyst employed.
4.4.1 Statistical Analysis of the BIBD
• a treatments and b blocks. Each block contains k
treatments, and each treatment occurs r times.
There are N = ar = bk total observations. The
number of times each pair of treatments appears
in the same block is
$$\lambda = \frac{r (k - 1)}{a - 1}$$
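A small sketch can confirm the balance condition for a candidate design. The design used here, all blocks of size 3 drawn from 4 treatments, is an assumed example, and `check_bibd` is a hypothetical helper name.

```python
from itertools import combinations


def check_bibd(blocks, a):
    """Return (r, k, lam) if blocks form a BIBD on treatments 0..a-1."""
    k_values = {len(blk) for blk in blocks}
    r_values = {sum(t in blk for blk in blocks) for t in range(a)}
    lam_values = {sum(i in blk and j in blk for blk in blocks)
                  for i, j in combinations(range(a), 2)}
    if len(k_values) != 1 or len(r_values) != 1 or len(lam_values) != 1:
        raise ValueError("design is not balanced")
    return r_values.pop(), k_values.pop(), lam_values.pop()


# Assumed example: a = 4 treatments, every subset of size k = 3 used as a block.
a = 4
blocks = list(combinations(range(a), 3))
r, k, lam = check_bibd(blocks, a)
print(r, k, lam, r * (k - 1) / (a - 1))   # counted lambda and r(k - 1)/(a - 1)
```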
• The statistical model for the BIBD is
$$y_{ij} = \mu + \tau_i + \beta_j + \epsilon_{ij}$$
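• The sum of squares
$$SS_T = SS_{Treatments(adjusted)} + SS_{Blocks} + SS_E$$
$$SS_T = \sum_{i} \sum_{j} y_{ij}^2 - \frac{y_{..}^2}{N}$$
$$SS_{Blocks} = \frac{1}{k} \sum_{j=1}^{b} y_{.j}^2 - \frac{y_{..}^2}{N}$$
$$SS_{Treatments(adjusted)} = \frac{k \sum_{i=1}^{a} Q_i^2}{\lambda a}, \quad Q_i = y_{i.} - \frac{1}{k} \sum_{j=1}^{b} n_{ij} y_{.j}$$
$$SS_E = SS_T - SS_{Treatments(adjusted)} - SS_{Blocks}$$
where $n_{ij} = 1$ if treatment i appears in block j and $n_{ij} = 0$ otherwise.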
• The degrees of freedom:
– Treatments (adjusted): a – 1
– Error: N – a – b + 1
• The test statistic for testing equality of the
treatment effects:
$$F_0 = \frac{MS_{Treatments(adjusted)}}{MS_E}$$
• ANOVA table
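The adjusted analysis can be sketched as follows. The design (all blocks of size 3 from 4 treatments), the responses, and the function name `bibd_analysis` are hypothetical; observations are stored in a dictionary keyed by (treatment, block), so a missing key plays the role of n_ij = 0.

```python
def bibd_analysis(obs, a, b):
    """Intrablock analysis of a BIBD; obs maps (treatment, block) -> response."""
    n_total = len(obs)
    r, k = n_total // a, n_total // b
    lam = r * (k - 1) // (a - 1)
    trt_tot = [sum(v for (i, _), v in obs.items() if i == t) for t in range(a)]
    blk_tot = [sum(v for (_, j), v in obs.items() if j == s) for s in range(b)]
    correction = sum(obs.values()) ** 2 / n_total
    ss_t = sum(v * v for v in obs.values()) - correction
    ss_blk = sum(t * t for t in blk_tot) / k - correction
    # Adjusted treatment totals Q_i = y_i. - (1/k) * sum_j n_ij * y_.j
    q = [trt_tot[t] - sum(blk_tot[j] for (i, j) in obs if i == t) / k
         for t in range(a)]
    ss_trt_adj = k * sum(x * x for x in q) / (lam * a)
    ss_e = ss_t - ss_trt_adj - ss_blk
    return {"q": q, "ss_trt_adj": ss_trt_adj, "ss_blk": ss_blk, "ss_e": ss_e,
            "df_trt": a - 1, "df_e": n_total - a - b + 1}


# Hypothetical data: blocks 0..3 contain treatments {0,1,2}, {0,1,3}, {0,2,3}, {1,2,3}.
obs = {
    (0, 0): 20.0, (1, 0): 23.0, (2, 0): 25.0,
    (0, 1): 22.0, (1, 1): 24.0, (3, 1): 29.0,
    (0, 2): 19.0, (2, 2): 24.0, (3, 2): 27.0,
    (1, 3): 21.0, (2, 3): 26.0, (3, 3): 28.0,
}
fit = bibd_analysis(obs, a=4, b=4)
f0 = (fit["ss_trt_adj"] / fit["df_trt"]) / (fit["ss_e"] / fit["df_e"])
print(round(fit["ss_trt_adj"], 4), round(fit["ss_e"], 4), round(f0, 4))
```

The adjusted totals `q` sum to zero, which is a quick sanity check on the bookkeeping.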
• Example 4.5
• The contrast sum of squares
$$SS_C = \frac{k \left( \sum_{i=1}^{a} c_i Q_i \right)^2}{\lambda a \sum_{i=1}^{a} c_i^2}$$
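4.4.2 Least Squares Estimation of the Parameters
Model:
$$n_{ij} y_{ij} = n_{ij} (\mu + \tau_i + \beta_j + \epsilon_{ij}), \quad i = 1, \ldots, a; \; j = 1, \ldots, b, \qquad n_{ij} = \begin{cases} 1 & \text{if treatment } i \text{ is in block } j \\ 0 & \text{otherwise} \end{cases}$$
The least squares criterion is
$$Q = \sum_{i=1}^{a} \sum_{j=1}^{b} n_{ij} \epsilon_{ij}^2 = \sum_{i=1}^{a} \sum_{j=1}^{b} n_{ij} (y_{ij} - \mu - \tau_i - \beta_j)^2$$
Setting the partial derivatives to zero:
$$\frac{\partial Q}{\partial \mu} = -2 \sum_{i=1}^{a} \sum_{j=1}^{b} n_{ij} (y_{ij} - \mu - \tau_i - \beta_j) = 0 \;\Rightarrow\; \sum_{i=1}^{a} \sum_{j=1}^{b} n_{ij} y_{ij} = y_{..} = N\hat{\mu} + r \sum_{i=1}^{a} \hat{\tau}_i + k \sum_{j=1}^{b} \hat{\beta}_j$$
$$\frac{\partial Q}{\partial \tau_i} = -2 \sum_{j=1}^{b} n_{ij} (y_{ij} - \mu - \tau_i - \beta_j) = 0 \;\Rightarrow\; \sum_{j=1}^{b} n_{ij} y_{ij} = y_{i.} = r\hat{\mu} + r\hat{\tau}_i + \sum_{j=1}^{b} n_{ij} \hat{\beta}_j, \quad i = 1, \ldots, a$$
$$\frac{\partial Q}{\partial \beta_j} = -2 \sum_{i=1}^{a} n_{ij} (y_{ij} - \mu - \tau_i - \beta_j) = 0 \;\Rightarrow\; \sum_{i=1}^{a} n_{ij} y_{ij} = y_{.j} = k\hat{\mu} + \sum_{i=1}^{a} n_{ij} \hat{\tau}_i + k\hat{\beta}_j, \quad j = 1, \ldots, b$$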
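• The least squares normal equations:
$$\mu: \; N\hat{\mu} + r \sum_{i=1}^{a} \hat{\tau}_i + k \sum_{j=1}^{b} \hat{\beta}_j = y_{..}$$
$$\tau_i: \; r\hat{\mu} + r\hat{\tau}_i + \sum_{j=1}^{b} n_{ij} \hat{\beta}_j = y_{i.}$$
$$\beta_j: \; k\hat{\mu} + \sum_{i=1}^{a} n_{ij} \hat{\tau}_i + k\hat{\beta}_j = y_{.j}$$
• Under the constraints
$$\sum_{i=1}^{a} \hat{\tau}_i = 0, \quad \sum_{j=1}^{b} \hat{\beta}_j = 0$$
we have $\hat{\mu} = \bar{y}_{..}$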
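• For the treatment effects, solve the $\beta_j$ equation for $\hat{\beta}_j$:
$$\hat{\beta}_j = y_{.j} / k - \hat{\mu} - \sum_{p=1}^{a} n_{pj} \hat{\tau}_p / k$$
Substituting into the $\tau_i$ equation gives
$$r\hat{\mu} + r\hat{\tau}_i + \sum_{j=1}^{b} n_{ij} \left( y_{.j} / k - \hat{\mu} - \sum_{p=1}^{a} n_{pj} \hat{\tau}_p / k \right) = y_{i.}$$
$$r\hat{\mu} + r\hat{\tau}_i - \sum_{j=1}^{b} n_{ij} \left( \hat{\mu} + \sum_{p=1}^{a} n_{pj} \hat{\tau}_p / k \right) = y_{i.} - \sum_{j=1}^{b} n_{ij} y_{.j} / k$$
Because $\sum_{j=1}^{b} n_{ij} = r$, the $\hat{\mu}$ terms cancel, and multiplying by k gives
$$k r \hat{\tau}_i - \sum_{j=1}^{b} n_{ij} \sum_{p=1}^{a} n_{pj} \hat{\tau}_p = k y_{i.} - \sum_{j=1}^{b} n_{ij} y_{.j} = k Q_i$$
$$k r \hat{\tau}_i - \sum_{j=1}^{b} n_{ij}^2 \hat{\tau}_i - \sum_{j=1}^{b} \sum_{p \ne i} n_{ij} n_{pj} \hat{\tau}_p = k Q_i$$
Since $\sum_{j=1}^{b} n_{ij}^2 = r$ and $\sum_{j=1}^{b} n_{ij} n_{pj} = \lambda$ for $p \ne i$,
$$r (k - 1) \hat{\tau}_i - \lambda \sum_{p=1,\, p \ne i}^{a} \hat{\tau}_p = k Q_i$$
Using $r (k - 1) = \lambda (a - 1)$ and $\sum_{p \ne i} \hat{\tau}_p = -\hat{\tau}_i$,
$$\lambda (a - 1) \hat{\tau}_i - \lambda (-\hat{\tau}_i) = k Q_i$$
$$\hat{\tau}_i = \frac{k Q_i}{\lambda a}, \quad i = 1, 2, \ldots, a$$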