Chapter 6 The $2^k$ Factorial Design
6.1 Introduction
• A special case of the general factorial design (Chapter 5)
• k factors, each at only two levels
• Levels:
– quantitative (temperature, pressure, …) or qualitative (machine, operator, …)
– "high" and "low"
– Each replicate has $2 \times 2 \times \cdots \times 2 = 2^k$ observations
• Assumptions: (1) the factors are fixed, (2) the design is completely randomized, and (3) the usual normality assumptions are satisfied
• Widely used in factor screening experiments
6.2 The $2^2$ Factorial Design
• Two factors, A and B, and each factor has two levels, low and high.
• Example: the concentration of reactant vs. the amount of catalyst (Page 208)
• "-" and "+" denote the low and high levels of a factor, respectively
• Low and high are arbitrary terms
• Geometrically, the four runs form the corners of a square
• Factors can be quantitative or qualitative, although their treatment in the final model will be different
• Average effect of a factor = the change in response produced by a change in the level of that factor, averaged over the levels of the other factor
• (1), a, b and ab: the totals of the n replicates taken at each treatment combination
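• The main effects:
$$A = \frac{1}{2n}\{[ab - b] + [a - (1)]\} = \frac{1}{2n}[ab + a - b - (1)]$$
$$A = \bar{y}_{A^+} - \bar{y}_{A^-} = \frac{ab + a}{2n} - \frac{b + (1)}{2n}$$
$$B = \frac{1}{2n}\{[ab - a] + [b - (1)]\} = \frac{1}{2n}[ab + b - a - (1)]$$
$$B = \bar{y}_{B^+} - \bar{y}_{B^-} = \frac{ab + b}{2n} - \frac{a + (1)}{2n}$$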
• The interaction effect:
$$AB = \frac{1}{2n}\{[ab - b] - [a - (1)]\} = \frac{1}{2n}[ab + (1) - a - b] = \frac{ab + (1)}{2n} - \frac{a + b}{2n}$$
• In that example, A = 8.33, B = -5.00 and AB = 1.67
• Analysis of variance
• The total effects (contrasts):
$$\text{Contrast}_A = ab + a - b - (1)$$
$$\text{Contrast}_B = ab + b - a - (1)$$
$$\text{Contrast}_{AB} = ab + (1) - a - b$$
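• Sum of squares:
$$SS_A = \frac{[ab + a - b - (1)]^2}{4n}, \qquad SS_B = \frac{[ab + b - a - (1)]^2}{4n}, \qquad SS_{AB} = \frac{[ab + (1) - a - b]^2}{4n}$$
$$SS_T = \sum_{i=1}^{2}\sum_{j=1}^{2}\sum_{k=1}^{n} y_{ijk}^2 - \frac{y_{...}^2}{4n}$$
$$SS_E = SS_T - SS_A - SS_B - SS_{AB}$$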
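The effect and sum-of-squares formulas for the 2^2 design can be sketched in a few lines of Python. The treatment totals and n = 3 below are assumptions for illustration (the slides do not list them); they were chosen because they reproduce the quoted effects A = 8.33, B = -5.00 and AB = 1.67.

```python
# Treatment totals over n replicates. These values are ASSUMED for
# illustration (the slides do not list them); they reproduce the
# effects quoted in the slides.
n = 3
totals = {"(1)": 80, "a": 100, "b": 60, "ab": 90}

# Sign of each treatment total in the contrast of each effect.
signs = {
    "A":  {"(1)": -1, "a": +1, "b": -1, "ab": +1},
    "B":  {"(1)": -1, "a": -1, "b": +1, "ab": +1},
    "AB": {"(1)": +1, "a": -1, "b": -1, "ab": +1},
}

def contrast(effect):
    """Signed sum of the treatment totals for one effect."""
    return sum(signs[effect][t] * totals[t] for t in totals)

def effect_estimate(effect):
    """Effect = contrast / (2n) in a 2^2 design."""
    return contrast(effect) / (2 * n)

def sum_of_squares(effect):
    """SS = contrast^2 / (4n) in a 2^2 design."""
    return contrast(effect) ** 2 / (4 * n)

for name in ("A", "B", "AB"):
    print(name, round(effect_estimate(name), 2), round(sum_of_squares(name), 2))
```

Keeping the signs in one dictionary means the same `contrast` helper serves both the effect estimate and the sum of squares, which mirrors how both quantities are built from the same contrast.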
Response: Conversion
ANOVA for Selected Factorial Model
Analysis of variance table [Partial sum of squares]

Source        Sum of Squares   DF   Mean Square   F Value   Prob > F
Model              291.67       3       97.22      24.82     0.0002
A                  208.33       1      208.33      53.19   < 0.0001
B                   75.00       1       75.00      19.15     0.0024
AB                   8.33       1        8.33       2.13     0.1828
Pure Error          31.33       8        3.92
Cor Total          323.00      11

Std. Dev.    1.98      R-Squared         0.9030
Mean        27.50      Adj R-Squared     0.8666
C.V.         7.20      Pred R-Squared    0.7817
PRESS       70.50      Adeq Precision   11.669

The F test for the "model" source tests the significance of the overall model; that is, are A, B, AB, or some combination of these effects important?
• Table of plus and minus signs:

Treatment      Factorial Effect
Combination    I    A    B    AB
(1)            +    –    –    +
a              +    +    –    –
b              +    –    +    –
ab             +    +    +    +
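• The regression model:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$$
– $x_1$ and $x_2$ are coded variables that represent the two factors, i.e., $x_1$ (or $x_2$) takes only the values –1 and +1:
$$x_1 = \frac{\text{Conc} - (\text{Conc}_{low} + \text{Conc}_{high})/2}{(\text{Conc}_{high} - \text{Conc}_{low})/2}$$
$$x_2 = \frac{\text{Catalyst} - (\text{Catalyst}_{low} + \text{Catalyst}_{high})/2}{(\text{Catalyst}_{high} - \text{Catalyst}_{low})/2}$$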
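– Use the least squares method to estimate the coefficients; each coefficient is one-half of the corresponding effect estimate
– For that example,
$$\hat{y} = 27.5 + \left(\frac{8.33}{2}\right) x_1 + \left(\frac{-5.00}{2}\right) x_2$$
– Model adequacy: residuals (Pages 213~214)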
• Response surface plot:
$$\hat{y} = 18.33 + 0.8333\,\text{Conc} - 5.00\,\text{Catalyst}$$
– Figure 6.3
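The link between the coded model and the natural-unit model can be illustrated with a short sketch. The factor ranges used here (concentration 15 to 25, catalyst 1 to 2) are assumptions not stated in the slides; they are consistent with the natural-unit equation 18.33 + 0.8333 Conc - 5.00 Catalyst. The helper names are hypothetical.

```python
def to_coded(value, low, high):
    """Map a natural-unit level to the coded scale (low -> -1, high -> +1)."""
    center = (low + high) / 2
    half_range = (high - low) / 2
    return (value - center) / half_range

# ASSUMED factor ranges for illustration; the slides do not state them.
CONC_LOW, CONC_HIGH = 15.0, 25.0   # reactant concentration
CAT_LOW, CAT_HIGH = 1.0, 2.0       # amount of catalyst

def predict_coded(x1, x2):
    """Fitted model in coded units: 27.5 + (A/2) x1 + (B/2) x2."""
    return 27.5 + (8.33 / 2) * x1 + (-5.00 / 2) * x2

def predict_natural(conc, catalyst):
    """Same prediction, starting from natural units."""
    return predict_coded(to_coded(conc, CONC_LOW, CONC_HIGH),
                         to_coded(catalyst, CAT_LOW, CAT_HIGH))

# At the design center both coded variables are 0, so the prediction
# equals the intercept (the grand average).
print(predict_natural(20.0, 1.5))
```

Substituting the coded-variable definitions into the coded model and collecting terms gives the natural-unit equation, so both forms describe the same fitted plane.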
6.3 The $2^3$ Design
• Three factors, A, B and C, each at two levels (Figure 6.4 (a))
• Design matrix (Figure 6.4 (b))
• Treatment combinations: (1), a, b, ab, c, ac, bc, abc
• 7 degrees of freedom: each main effect has 1, and each interaction has 1
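• Estimate main effect:
$$A = \frac{1}{4n}[a - (1) + ab - b + ac - c + abc - bc]$$
$$A = \bar{y}_{A^+} - \bar{y}_{A^-} = \frac{a + ab + ac + abc}{4n} - \frac{(1) + b + c + bc}{4n} = \frac{1}{4n}[a + ab + ac + abc - (1) - b - c - bc]$$
• Estimate two-factor interaction: one-half the difference between the average A effects at the two levels of B
$$AB = \frac{1}{4n}[abc - bc + ab - b - ac + c - a + (1)] = \frac{abc + ab + c + (1)}{4n} - \frac{bc + b + ac + a}{4n}$$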
• Three-factor interaction:
$$ABC = \frac{1}{4n}\{[abc - bc] - [ac - c] - [ab - b] + [a - (1)]\} = \frac{1}{4n}[abc - bc - ac + c - ab + b + a - (1)]$$
• Contrast: Table 6.3
– Equal number of plus and minus signs in every column except I
– The inner product of any two columns = 0
– I is an identity element
– The product of any two columns yields another column
– Orthogonal design
• Sum of squares: $SS = (\text{Contrast})^2 / (8n)$
Table of – and + Signs for the $2^3$ Factorial Design (pg. 218)

Treatment      Factorial Effect
Combination    I     A     B     AB    C     AC    BC    ABC
(1)            +     –     –     +     –     +     +     –
a              +     +     –     –     –     –     +     +
b              +     –     +     –     –     +     –     +
ab             +     +     +     +     –     –     –     –
c              +     –     –     +     +     –     –     +
ac             +     +     –     –     +     +     –     –
bc             +     –     +     –     +     –     +     –
abc            +     +     +     +     +     +     +     +
Contrast             24    18    6     14    2     4     4
Effect               3.00  2.25  0.75  1.75  0.25  0.50  0.50
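The sign-table properties listed for Table 6.3 can be checked mechanically. The sketch below builds the table for any number of factors in standard order and is only an illustration; the function name `sign_table` is hypothetical.

```python
from itertools import combinations, product

def sign_table(factors):
    """Return (labels, effects, table) for a 2^k design in standard order.

    table[label][effect] is +1 or -1.
    """
    k = len(factors)
    # product() varies the last position fastest; reversing each tuple
    # makes the first factor vary fastest, which is standard order.
    runs = [tuple(reversed(levels)) for levels in product((-1, 1), repeat=k)]
    effects = ["".join(c) for r in range(1, k + 1)
               for c in combinations(factors, r)]
    table = {}
    for run in runs:
        level = dict(zip(factors, run))
        label = "".join(f.lower() for f in factors if level[f] == 1) or "(1)"
        row = {}
        for eff in effects:
            sign = 1
            for f in eff:          # an interaction column is the product
                sign *= level[f]   # of its main-effect columns
            row[eff] = sign
        table[label] = row
    return list(table), effects, table

labels, effects, table = sign_table("ABC")

# Every effect column has as many plus signs as minus signs.
balanced = all(sum(table[l][e] for l in labels) == 0 for e in effects)

# Any two distinct effect columns have inner product zero.
orthogonal = all(sum(table[l][e1] * table[l][e2] for l in labels) == 0
                 for e1, e2 in combinations(effects, 2))
print(labels, balanced, orthogonal)
```

Given treatment totals in the same order, a contrast is the signed sum down a column, and the effect estimate follows by dividing by 4n for the 2^3 case.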
• Example 6.1: A = gap, B = flow, C = power, y = etch rate
• The regression model and response surface:
– The regression model:
$$\hat{y} = 776.0625 - \left(\frac{101.625}{2}\right) x_1 + \left(\frac{306.125}{2}\right) x_3 - \left(\frac{153.625}{2}\right) x_1 x_3$$
– Response surface and contour plot (Figure 6.7)
6.4 The General $2^k$ Design
• k factors, each at two levels
• Interactions: $\binom{k}{2}$ two-factor interactions, $\binom{k}{3}$ three-factor interactions, …, and one k-factor interaction
• The standard order for a $2^4$ design: (1), a, b, ab, c, ac, bc, abc, d, ad, bd, abd, cd, acd, bcd, abcd
• The general approach for the statistical analysis:
– Estimate factor effects
– Form initial model (full model)
– Perform analysis of variance (Table 6.9)
– Refine the model
– Analyze residuals
– Interpret results
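$$\text{Contrast}_{ABC \cdots K} = (a \pm 1)(b \pm 1) \cdots (k \pm 1)$$
$$ABC \cdots K = \frac{2}{n 2^k}\left(\text{Contrast}_{ABC \cdots K}\right)$$
$$SS_{ABC \cdots K} = \frac{1}{n 2^k}\left(\text{Contrast}_{ABC \cdots K}\right)^2$$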
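As a worked instance of the contrast rule, consider the AB contrast in a $2^3$ design. The sketch assumes the usual convention that a factor's term takes the minus sign when the factor appears in the effect and the plus sign otherwise, and that the "1" in the expanded product is read as the treatment total (1).

```latex
\begin{aligned}
\text{Contrast}_{AB} &= (a - 1)(b - 1)(c + 1) \\
                     &= (ab - a - b + 1)(c + 1) \\
                     &= abc + ab + c + (1) - ac - bc - a - b \\[4pt]
AB &= \frac{2}{n\,2^{3}}\,\text{Contrast}_{AB}
    = \frac{1}{4n}\,[\,abc + ab + c + (1) - ac - bc - a - b\,] \\[4pt]
SS_{AB} &= \frac{1}{n\,2^{3}}\,(\text{Contrast}_{AB})^{2}
         = \frac{(\text{Contrast}_{AB})^{2}}{8n}
\end{aligned}
```

With k = 3 the general expressions reduce to the divisor 4n for the effect and 8n for the sum of squares, matching the $2^3$ results of Section 6.3.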
6.5 A Single Replicate of the $2^k$ Design
• These are $2^k$ factorial designs with one observation at each corner of the "cube"
• An unreplicated $2^k$ factorial design is also sometimes called a "single replicate" of the $2^k$
• If the factor levels are spaced too closely, the chances increase that the noise will overwhelm the signal in the data
• Lack of replication causes potential problems in statistical testing
– Replication admits an estimate of "pure error" (a better phrase is an internal estimate of error)
– With no replication, fitting the full model results in zero degrees of freedom for error
• Potential solutions to this problem
– Pooling high-order interactions to estimate error (sparsity of effects principle)
– Normal probability plotting of effects (Daniel, 1959)
• Example 6.2 (A single replicate of the $2^4$ design)
– A $2^4$ factorial was used to investigate the effects of four factors on the filtration rate of a resin
– The factors are A = temperature, B = pressure, C = concentration of formaldehyde, D = stirring rate
• Estimates of the effects

         Term        Effect     SumSqr      % Contribution
Model    Intercept
Error    A           21.625     1870.56     32.6397
Error    B            3.125       39.0625    0.681608
Error    C            9.875      390.062     6.80626
Error    D           14.625      855.563    14.9288
Error    AB           0.125        0.0625    0.00109057
Error    AC         -18.125     1314.06     22.9293
Error    AD          16.625     1105.56     19.2911
Error    BC           2.375       22.5625    0.393696
Error    BD          -0.375        0.5625    0.00981515
Error    CD          -1.125        5.0625    0.0883363
Error    ABC          1.875       14.0625    0.245379
Error    ABD          4.125       68.0625    1.18763
Error    ACD         -1.625       10.5625    0.184307
Error    BCD         -2.625       27.5625    0.480942
Error    ABCD         1.375        7.5625    0.131959

Lenth's ME   6.74778
Lenth's SME  13.699
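The "Lenth's ME" figure reported with the effect estimates can be reproduced with a short sketch of Lenth's pseudo standard error (PSE). The slides do not describe the method, so the steps below follow the commonly cited definition, and the t quantile is hard-coded as an assumed table value rather than computed.

```python
from statistics import median

# Effect estimates from the table of effects for the filtration rate example.
effects = {
    "A": 21.625, "B": 3.125, "C": 9.875, "D": 14.625,
    "AB": 0.125, "AC": -18.125, "AD": 16.625, "BC": 2.375,
    "BD": -0.375, "CD": -1.125, "ABC": 1.875, "ABD": 4.125,
    "ACD": -1.625, "BCD": -2.625, "ABCD": 1.375,
}

def lenth_pse(values):
    """Pseudo standard error: 1.5 * median of the |effects| that are
    smaller than 2.5 * s0, where s0 = 1.5 * median of all |effects|."""
    abs_vals = [abs(v) for v in values]
    s0 = 1.5 * median(abs_vals)
    trimmed = [v for v in abs_vals if v < 2.5 * s0]
    return 1.5 * median(trimmed)

pse = lenth_pse(effects.values())

# ASSUMED table value: 0.975 quantile of t with m/3 = 15/3 = 5 degrees of freedom.
T_975_5DF = 2.5706
margin_of_error = T_975_5DF * pse

# Effects whose magnitude exceeds the margin of error are flagged as active.
active = sorted(name for name, e in effects.items() if abs(e) > margin_of_error)
print(pse, round(margin_of_error, 3), active)
```

The flagged effects are A, C, D, AC and AD, the same five that stand apart on the normal probability plot.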
• The normal probability plot of the effects
[Design-Expert normal plot for Filtration Rate (A: Temperature, B: Pressure, C: Concentration, D: Stirring Rate); horizontal axis: Effect; labeled points: A, C, D, AC, AD]
[Design-Expert interaction graph for Filtration Rate: X = A: Temperature, Y = C: Concentration; actual factors B: Pressure = 0.00, D: Stirring Rate = 0.00]
[Design-Expert interaction graph for Filtration Rate: X = A: Temperature, Y = D: Stirring Rate; actual factors B: Pressure = 0.00, C: Concentration = 0.00]
• B is not significant and all interactions involving B are negligible
• Design projection: $2^4$ design => $2^3$ design in A, C and D
• ANOVA table (Table 6.13)
Response: Filtration Rate
ANOVA for Selected Factorial Model
Analysis of variance table [Partial sum of squares]

Source        Sum of Squares   DF   Mean Square   F Value   Prob > F
Model             5535.81       5     1107.16      56.74   < 0.0001
A                 1870.56       1     1870.56      95.86   < 0.0001
C                  390.06       1      390.06      19.99     0.0012
D                  855.56       1      855.56      43.85   < 0.0001
AC                1314.06       1     1314.06      67.34   < 0.0001
AD                1105.56       1     1105.56      56.66   < 0.0001
Residual           195.12      10       19.51
Cor Total         5730.94      15

Std. Dev.     4.42      R-Squared         0.9660
Mean         70.06      Adj R-Squared     0.9489
C.V.          6.30      Pred R-Squared    0.9128
PRESS       499.52      Adeq Precision   20.841
• The regression model:
Final Equation in Terms of Coded Factors:
Filtration Rate = +70.06250
  +10.81250 * Temperature
  +4.93750 * Concentration
  +7.31250 * Stirring Rate
  -9.06250 * Temperature * Concentration
  +8.31250 * Temperature * Stirring Rate
• Residual Analysis (P. 235)
• Response surface (P. 236)
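Each coefficient in the coded equation is one-half of the corresponding effect estimate, because an effect measures the change over two coded units (from -1 to +1). The sketch below checks this against the reported values; the `predict` helper is hypothetical and only evaluates the fitted equation.

```python
# Effect estimates of the retained terms (from the table of effect estimates).
effects = {"A": 21.625, "C": 9.875, "D": 14.625, "AC": -18.125, "AD": 16.625}
INTERCEPT = 70.0625  # grand average of the 16 runs

# Regression coefficient = effect / 2 in coded units.
coefficients = {term: eff / 2 for term, eff in effects.items()}

def predict(a, c, d):
    """Predicted filtration rate at coded levels of A, C and D."""
    return (INTERCEPT
            + coefficients["A"] * a
            + coefficients["C"] * c
            + coefficients["D"] * d
            + coefficients["AC"] * a * c
            + coefficients["AD"] * a * d)

print(coefficients)
print(predict(1, -1, 1))  # high temperature, low concentration, high stirring rate
```

Residuals for the model-adequacy checks are the observed responses minus `predict` evaluated at each run's coded levels.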
• Half-normal plot: the absolute values of the effect estimates plotted against the cumulative normal probabilities
[Design-Expert half-normal plot for Filtration Rate (A: Temperature, B: Pressure, C: Concentration, D: Stirring Rate); horizontal axis: |Effect|; labeled points: A, AC, AD, D, C]
• Example 6.3 (Data transformation in a factorial design): A = drill load, B = flow, C = speed, D = type of mud, y = advance rate of the drill
• The normal probability plot of the effect estimates
[Design-Expert half-normal plot for adv._rate (A: load, B: flow, C: speed, D: mud); horizontal axis: |Effect|; labeled points: B, C, D, BC, BD]
• Residual analysis
[Design-Expert plots for adv._rate: normal plot of residuals; residuals vs. predicted]
• The residual plots indicate that there are problems with the equality of variance assumption
• The usual approach to this problem is to employ a transformation on the response
• In this example, $y^* = \ln y$
[Design-Expert half-normal plot for Ln(adv._rate) (A: load, B: flow, C: speed, D: mud); horizontal axis: |Effect|; labeled points: B, C, D]
• Three main effects are large
• No indication of large interaction effects
• What happened to the interactions?
Response: adv._rate    Transform: Natural log    Constant: 0.000
ANOVA for Selected Factorial Model
Analysis of variance table [Partial sum of squares]

Source        Sum of Squares   DF   Mean Square   F Value   Prob > F
Model                7.11       3       2.37      164.82   < 0.0001
B                    5.35       1       5.35      371.49   < 0.0001
C                    1.34       1       1.34       93.05   < 0.0001
D                    0.43       1       0.43       29.92     0.0001
Residual             0.17      12       0.014
Cor Total            7.29      15

Std. Dev.    0.12      R-Squared         0.9763
Mean         1.60      Adj R-Squared     0.9704
C.V.         7.51      Pred R-Squared    0.9579
PRESS        0.31      Adeq Precision   34.391
• Following the log transformation
Final Equation in Terms of Coded Factors:
Ln(adv._rate) = +1.60
  +0.58 * B
  +0.29 * C
  +0.16 * D
[Design-Expert plots for Ln(adv._rate): normal plot of residuals; residuals vs. predicted]
• Example 6.4:
– Two factors (A and C) affect the mean number of defects
– A third factor (B) affects variability
– Residual plots were useful in identifying the dispersion effect
– The magnitude of the dispersion effects:
$$F_i^* = \ln \frac{S^2(i^+)}{S^2(i^-)}$$
– When the variances of the residuals at the positive and negative levels of effect i are equal, this statistic has an approximate normal distribution
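The dispersion statistic can be sketched as follows. The residuals and sign column are hypothetical numbers made up purely for illustration (they are not the data of Example 6.4), and the use of the sample variance (divisor n - 1) is an assumption.

```python
from math import log
from statistics import variance

def dispersion_statistic(residuals, signs):
    """F* = ln( S^2(i+) / S^2(i-) ) for one column of the sign table.

    residuals: model residuals, one per run
    signs:     +1 / -1 level of effect i in each run
    """
    plus = [r for r, s in zip(residuals, signs) if s > 0]
    minus = [r for r, s in zip(residuals, signs) if s < 0]
    return log(variance(plus) / variance(minus))

# HYPOTHETICAL residuals from eight runs, with the factor's column in
# standard order for the second of three factors.
residuals = [-2.0, 2.0, -0.5, 0.5, -2.0, 2.0, -0.5, 0.5]
b_signs = [-1, -1, 1, 1, -1, -1, 1, 1]

# The spread is smaller at the high level, so the statistic is negative.
print(dispersion_statistic(residuals, b_signs))
```

Computing the statistic for every column of the sign table and placing the values on a normal probability plot is one way to single out a factor with a large dispersion effect.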
6.7 $2^k$ Designs are Optimal Designs
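• Consider the $2^2$ design with a single replicate.
• Fit the following model:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + \varepsilon$$
• Matrix form: $\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}$, with
$$\mathbf{y} = \begin{bmatrix} (1) \\ a \\ b \\ ab \end{bmatrix}, \qquad
\mathbf{X} = \begin{bmatrix} 1 & -1 & -1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & 1 & 1 \end{bmatrix}, \qquad
\boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \beta_{12} \end{bmatrix}$$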
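• The LS estimate:
$$\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y} =
\begin{bmatrix}
\dfrac{(1) + a + b + ab}{4} \\[6pt]
\dfrac{a + ab - b - (1)}{4} \\[6pt]
\dfrac{b + ab - a - (1)}{4} \\[6pt]
\dfrac{(1) - a - b + ab}{4}
\end{bmatrix}$$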
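• D-optimal criterion, $|\mathbf{X}'\mathbf{X}|$: the volume of the joint confidence region that contains all coefficients is inversely proportional to the square root of $|\mathbf{X}'\mathbf{X}|$, so a D-optimal design maximizes $|\mathbf{X}'\mathbf{X}|$.
• G-optimal design: min max $\mathrm{Var}(\hat{y})$, where
$$\mathrm{Var}(\hat{y}) = \frac{\sigma^2}{4}\left(1 + x_1^2 + x_2^2 + x_1^2 x_2^2\right)$$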
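The optimality claims for the $2^2$ design can be checked numerically with a small sketch in plain Python. It relies on the fact that the columns of X are orthogonal, so X'X is 4 times the identity and no general matrix inversion is needed; the function names are hypothetical.

```python
# Model matrix for the 2^2 design with columns (I, x1, x2, x1*x2),
# rows in standard order.
X = [
    [1, -1, -1,  1],   # (1)
    [1,  1, -1, -1],   # a
    [1, -1,  1, -1],   # b
    [1,  1,  1,  1],   # ab
]

def gram(matrix):
    """Return X'X as a list of lists."""
    cols = list(zip(*matrix))
    return [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]

XtX = gram(X)  # 4 * identity, so its determinant is 4**4 = 256

def least_squares(y):
    """beta_hat = (X'X)^{-1} X'y, which reduces to X'y / 4 here."""
    return [sum(x * yi for x, yi in zip(col, y)) / 4 for col in zip(*X)]

def scaled_prediction_variance(x1, x2):
    """Var(y_hat) / sigma^2 = x'(X'X)^{-1}x with x = (1, x1, x2, x1*x2)."""
    x = (1, x1, x2, x1 * x2)
    return sum(v * v for v in x) / 4

print(XtX)
print(scaled_prediction_variance(1, 1), scaled_prediction_variance(0, 0))
```

Over the square -1 <= x1, x2 <= 1 the scaled prediction variance is largest at the corners, where it equals 1, and smallest at the center, where it equals 1/4.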
6.8 The Addition of Center Points to the $2^k$ Design
• Based on the idea of replicating some of the runs in a factorial design
• Runs at the center provide an estimate of error and allow the experimenter to distinguish between two possible models:
First-order model (interaction):
$$y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i<j} \beta_{ij} x_i x_j + \varepsilon$$
Second-order model:
$$y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i<j} \beta_{ij} x_i x_j + \sum_{i=1}^{k} \beta_{ii} x_i^2 + \varepsilon$$
• $\bar{y}_F = \bar{y}_C$ => no "curvature"
• To detect the possibility of quadratic effects: add center points
• The hypotheses are:
$$H_0: \sum_{i=1}^{k} \beta_{ii} = 0 \qquad H_1: \sum_{i=1}^{k} \beta_{ii} \neq 0$$
$$SS_{\text{Pure Quad}} = \frac{n_F n_C (\bar{y}_F - \bar{y}_C)^2}{n_F + n_C}$$
• This sum of squares has a single degree of freedom
• Example 6.6
Refer to the original experiment shown in Table 6.10. Suppose that four center points are added to this experiment, and at the points $x_1 = x_2 = x_3 = x_4 = 0$ the four observed filtration rates were 73, 75, 66, and 69. The average of these four center points is 70.75, and the average of the 16 factorial runs is 70.06. Since these two averages are very similar, we suspect that there is no strong curvature present.
• $n_C = 4$; usually between 3 and 6 center points will work well
• Design-Expert provides the analysis, including the F-test for pure quadratic curvature
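The curvature sum of squares for Example 6.6 can be worked out directly from the numbers quoted. The sketch takes the factorial average as 70.0625, the intercept of the coded regression equation for the filtration rate; the pure-error mean square is computed from the four center points only, which is an assumption about how the error estimate would be formed.

```python
from statistics import mean, variance

center_runs = [73, 75, 66, 69]   # filtration rates at the center point
n_F, y_bar_F = 16, 70.0625       # factorial runs and their average
n_C, y_bar_C = len(center_runs), mean(center_runs)

# Single-degree-of-freedom sum of squares for pure quadratic curvature.
ss_pure_quad = n_F * n_C * (y_bar_F - y_bar_C) ** 2 / (n_F + n_C)

# Pure-error mean square from the replicated center points
# (n_C - 1 = 3 degrees of freedom).
pure_error_ms = variance(center_runs)

print(y_bar_C, ss_pure_quad, pure_error_ms)
```

The curvature sum of squares is small relative to the center-point error estimate, which agrees with the conclusion that there is no strong curvature.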
• If curvature is significant, augment the design with axial runs to create a central composite design (CCD). The CCD is a very effective design for fitting a second-order response surface model.