#### Transcript: The 2^2 Design

```
Design and Analysis of Experiments
Dr. Tai-Yue Wang
Department of Industrial and Information Management
National Cheng Kung University
Tainan, TAIWAN, ROC
Two-Level Factorial Designs
Outline
Introduction
The 2^2 Design
The 2^3 Design
The general 2^k Design
A single replicate of the 2^k design
2^k Designs are Optimal Designs
The addition of center points to the 2^k Design
Introduction
Special case of general factorial designs
k factors, each with two levels
Factors may be qualitative or quantitative
A complete replicate of such a design is a 2^k factorial design
Assumptions: the factors are fixed, the designs are completely randomized, and the usual normality assumptions hold
Used as factor screening experiments
Response between levels is assumed linear
The 2^2 Design
Factor                              Replicate
A   B   Treatment combination      I     II    III    Total
-   -   A low, B low               28    25    27     80
+   -   A high, B low              36    32    32     100
-   +   A low, B high              18    19    23     60
+   +   A high, B high             31    30    29     90
The 2^2 Design
"-" and "+" denote the low and high levels of a factor, respectively
Low and high are arbitrary terms
Geometrically, the four runs form the corners of a square
Factors can be quantitative or qualitative, although their treatment in the final model will be different
The 2^2 Design
Estimate factor effects
Formulate model
    With replication, use full model
    With an unreplicated design, use normal probability plots
Statistical testing (ANOVA)
Refine the model
Analyze residuals (graphical)
Interpret results
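The 2^2 Design
A = \bar{y}_{A^+} - \bar{y}_{A^-} = \frac{ab + a}{2n} - \frac{b + (1)}{2n} = \frac{1}{2n}[ab + a - b - (1)]
B = \bar{y}_{B^+} - \bar{y}_{B^-} = \frac{ab + b}{2n} - \frac{a + (1)}{2n} = \frac{1}{2n}[ab + b - a - (1)]
AB = \frac{ab + (1)}{2n} - \frac{a + b}{2n} = \frac{1}{2n}[ab + (1) - a - b]
SS_A = \frac{[ab + a - b - (1)]^2}{4n}
SS_B = \frac{[ab + b - a - (1)]^2}{4n}
SS_{AB} = \frac{[ab + (1) - a - b]^2}{4n}
SS_T = \sum_{i=1}^{2} \sum_{j=1}^{2} \sum_{k=1}^{n} y_{ijk}^2 - \frac{y_{...}^2}{4n}
SS_E = SS_T - SS_A - SS_B - SS_{AB}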
The 2^2 Design
Standard order (Yates's order)
Run     A     B     AB
(1)     -1    -1    +1
a       +1    -1    -1
b       -1    +1    -1
ab      +1    +1    +1
Effects A, B, and AB are orthogonal contrasts, each with one degree of freedom
Thus 2^k designs are orthogonal designs
The 2^2 Design
ANOVA table
The 2^2 Design
Algebraic signs for calculating effects in the 2^2 design
The 2^2 Design
Regression model
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon
x_1 and x_2 are coded variables in this case:
x_1 = \frac{con - (con_{low} + con_{high})/2}{(con_{high} - con_{low})/2}
x_2 = \frac{catalyst - (catalyst_{low} + catalyst_{high})/2}{(catalyst_{high} - catalyst_{low})/2}
where con and catalyst are the natural variables
The 2^2 Design
Regression model
Factorial Fit: Yield versus Conc., Catalyst
Estimated Effects and Coefficients for Yield (coded units)
Term              Effect    Coef      SE Coef   T       P
Constant                    27.500    0.5713    48.14   0.000
Conc.             8.333     4.167     0.5713    7.29    0.000
Catalyst          -5.000    -2.500    0.5713    -4.38   0.002
Conc.*Catalyst    1.667     0.833     0.5713    1.46    0.183

S = 1.97906   PRESS = 70.5
R-Sq = 90.30%   R-Sq(pred) = 78.17%   R-Sq(adj) = 86.66%

Analysis of Variance for Yield (coded units)
Source                DF   Seq SS    Adj SS    Adj MS    F       P
Main Effects          2    283.333   283.333   141.667   36.17   0.000
2-Way Interactions    1    8.333     8.333     8.333     2.13    0.183
Residual Error        8    31.333    31.333    3.917
  Pure Error          8    31.333    31.333    3.917
Total                 11   323.000
The 2^2 Design
Regression model
Estimated Coefficients for Yield using data in uncoded units
Term              Coef
Constant          28.3333
Conc.             0.333333
Catalyst          -11.6667
Conc.*Catalyst    0.333333

Regression model (without interaction)
Estimated Coefficients for Yield using data in uncoded units
Term              Coef
Constant          18.3333
Conc.             0.833333
Catalyst          -5.00000
The 2^2 Design
Response surface
The 2^2 Design
Response surface (note: the catalyst axis is reversed relative to the one in the textbook)
The 2^3 Design
Three factors, each at two levels
Eight treatment combinations
The 2^3 Design
Design matrix
Or geometric notation
The 2^3 Design
Algebraic signs
The 2^3 Design -- Properties of the Table
Except for column I, every column has an equal number of + and - signs
The sum of the products of signs in any two columns is zero
Multiplying any column by I leaves that column unchanged (identity element)
The 2^3 Design -- Properties of the Table
The product of any two columns yields a column in the table:
A \times B = AB
AB \times BC = AB^2C = AC
Orthogonal design
Orthogonality is an important property shared by all factorial designs
The 2^3 Design -- example
Nitride etch process
Factors: gap, gas flow, and RF power
The 2^3 Design -- example
Full model
Estimated Effects and Coefficients for Etch Rate (coded units)
Term                  Effect     Coef      SE Coef   T       P
Constant                         776.06    11.87     65.41   0.000
Gap                   -101.62    -50.81    11.87     -4.28   0.003
Gas Flow              7.37       3.69      11.87     0.31    0.764
Power                 306.12     153.06    11.87     12.90   0.000
Gap*Gas Flow          -24.88     -12.44    11.87     -1.05   0.325
Gap*Power             -153.63    -76.81    11.87     -6.47   0.000
Gas Flow*Power        -2.12      -1.06     11.87     -0.09   0.931
Gap*Gas Flow*Power    5.62       2.81      11.87     0.24    0.819

S = 47.4612   PRESS = 72082
R-Sq = 96.61%   R-Sq(pred) = 86.44%   R-Sq(adj) = 93.64%

Analysis of Variance for Etch Rate (coded units)
Source                DF   Seq SS    Adj SS    Adj MS    F       P
Main Effects          3    416378    416378    138793    61.62   0.000
2-Way Interactions    3    96896     96896     32299     14.34   0.001
3-Way Interactions    1    127       127       127       0.06    0.819
Residual Error        8    18020     18020     2253
  Pure Error          8    18021     18021     2253
Total                 15   531421
The 2^3 Design -- example
Reduced model
Factorial Fit: Etch Rate versus Gap, Power
Estimated Effects and Coefficients for Etch Rate (coded units)
Term         Effect     Coef      SE Coef   T       P
Constant                776.06    10.42     74.46   0.000
Gap          -101.62    -50.81    10.42     -4.88   0.000
Power        306.12     153.06    10.42     14.69   0.000
Gap*Power    -153.63    -76.81    10.42     -7.37   0.000

S = 41.6911   PRESS = 37080.4
R-Sq = 96.08%   R-Sq(pred) = 93.02%   R-Sq(adj) = 95.09%

Analysis of Variance for Etch Rate (coded units)
Source                DF   Seq SS    Adj SS    Adj MS    F        P
Main Effects          2    416161    416161    208080    119.71   0.000
2-Way Interactions    1    94403     94403     94403     54.31    0.000
Residual Error        12   20858     20858     1738
  Pure Error          12   20858     20858     1738
Total                 15   531421
The 2^3 Design -- example -- Model Summary Statistics for Reduced Model
R^2 = \frac{SS_{Model}}{SS_T} = \frac{5.106 \times 10^5}{5.314 \times 10^5} = 0.9608
R^2_{Adj} = 1 - \frac{SS_E / df_E}{SS_T / df_T} = 1 - \frac{20857.75 / 12}{5.314 \times 10^5 / 15} = 0.9509
R^2 for prediction (based on PRESS):
R^2_{Pred} = 1 - \frac{PRESS}{SS_T} = 1 - \frac{37080.44}{5.314 \times 10^5} = 0.9302
The 2^3 Design -- example
The Regression Model
Cube Plot of Ranges
What do the large ranges when gap and power are at the high level tell you?
The General 2^k Factorial Design
There will be k main effects, and
\binom{k}{2} two-factor interactions
\binom{k}{3} three-factor interactions
1 k-factor interaction
The General 2^k Factorial Design
Statistical Analysis
Unreplicated 2^k Factorial Designs
These are 2^k factorial designs with one observation at each corner of the “cube”
An unreplicated 2^k factorial design is also sometimes called a “single replicate” of the 2^k
These designs are very widely used
Risks: if there is only one observation at each corner, is there a chance of unusual response observations spoiling the results?
Modeling “noise”?
Unreplicated 2^k Factorial Designs
If the factor levels are spaced too closely, the chances increase that the noise will overwhelm the signal in the data
More aggressive spacing is usually best
Unreplicated 2^k Factorial Designs
Lack of replication causes potential problems in statistical testing
    Replication admits an estimate of “pure error” (a better phrase is an internal estimate of error)
    With no replication, fitting the full model results in zero degrees of freedom for error
Potential solutions to this problem
    Pooling high-order interactions to estimate error
    Normal probability plotting of effects (Daniel, 1959)
Unreplicated 2^k Factorial Designs -- example
A 2^4 factorial was used to investigate the effects of four factors on the filtration rate of a resin
The factors are A = temperature, B = pressure, C = mole ratio, D = stirring rate
Experiment was performed in a pilot plant
Unreplicated 2^k Factorial Designs -- example -- full model
Unreplicated 2^k Factorial Designs -- example -- reduced model
Factorial Fit: Filtration versus Temperature, Conc., Stir Rate
Estimated Effects and Coefficients for Filtration (coded units)
Term                     Effect     Coef      SE Coef   T       P
Constant                            70.063    1.104     63.44   0.000
Temperature              21.625     10.812    1.104     9.79    0.000
Conc.                    9.875      4.938     1.104     4.47    0.001
Stir Rate                14.625     7.312     1.104     6.62    0.000
Temperature*Conc.        -18.125    -9.062    1.104     -8.21   0.000
Temperature*Stir Rate    16.625     8.313     1.104     7.53    0.000

S = 4.41730   PRESS = 499.52
R-Sq = 96.60%   R-Sq(pred) = 91.28%   R-Sq(adj) = 94.89%

Analysis of Variance for Filtration (coded units)
Source                DF   Seq SS     Adj SS     Adj MS     F       P
Main Effects          3    3116.19    3116.19    1038.73    53.23   0.000
2-Way Interactions    2    2419.62    2419.62    1209.81    62.00   0.000
Residual Error        10   195.12     195.12     19.51
  Lack of Fit         2    15.62      15.62      7.81       0.35    0.716
  Pure Error          8    179.50     179.50     22.44
Total                 15   5730.94
Unreplicated 2^k Factorial Designs -- example -- Design projection
Since factor B is negligible, the experiment can be interpreted as a 2^3 factorial design in factors A, C, and D with 2 replicates
Unreplicated 2^k Factorial Designs -- example -- Design projection
Factorial Fit: Filtration versus Temperature, Conc., Stir Rate
Estimated Effects and Coefficients for Filtration (coded units)
Term                           Effect     Coef      SE Coef   T       P
Constant                                  70.063    1.184     59.16   0.000
Temperature                    21.625     10.812    1.184     9.13    0.000
Conc.                          9.875      4.938     1.184     4.17    0.003
Stir Rate                      14.625     7.312     1.184     6.18    0.000
Temperature*Conc.              -18.125    -9.062    1.184     -7.65   0.000
Temperature*Stir Rate          16.625     8.313     1.184     7.02    0.000
Conc.*Stir Rate                -1.125     -0.562    1.184     -0.48   0.647
Temperature*Conc.*Stir Rate    -1.625     -0.813    1.184     -0.69   0.512

S = 4.73682   PRESS = 718
R-Sq = 96.87%   R-Sq(pred) = 87.47%   R-Sq(adj) = 94.13%

Analysis of Variance for Filtration (coded units)
Source                DF   Seq SS     Adj SS     Adj MS     F       P
Main Effects          3    3116.19    3116.19    1038.73    46.29   0.000
2-Way Interactions    3    2424.69    2424.69    808.23     36.02   0.000
3-Way Interactions    1    10.56      10.56      10.56      0.47    0.512
Residual Error        8    179.50     179.50     22.44
  Pure Error          8    179.50     179.50     22.44
Total                 15   5730.94
Dealing with Outliers
Replace with an estimate
    Make the highest-order interaction zero
    In this case, estimate cd such that ABCD = 0
Analyze only the data you have
    Now the design isn’t orthogonal
    Consequences?
Duplicate Measurements on the Response
Four wafers are stacked in the furnace
Four factors: temperature, time, gas flow, and pressure
Response: thickness
The four measurements are treated as duplicates, not replicates
Use the average as the response
Duplicate Measurements on the Response
Stat > DOE > Factorial > Preprocess Response for Analyze
Duplicate Measurements on the Response
Stat > DOE > Factorial > Analyze Factorial Design
Duplicate Measurements on the Response
Factorial Fit: average versus Temperature, Time, Pressure
Estimated Effects and Coefficients for average (coded units)
Term                    Effect     Coef       SE Coef   T        P
Constant                           399.188    1.049     380.48   0.000
Temperature             43.125     21.562     1.049     20.55    0.000
Time                    18.125     9.062      1.049     8.64     0.000
Pressure                -10.375    -5.187     1.049     -4.94    0.001
Temperature*Time        16.875     8.438      1.049     8.04     0.000
Temperature*Pressure    -10.625    -5.312     1.049     -5.06    0.000

S = 4.19672   PRESS = 450.88
R-Sq = 98.39%   R-Sq(pred) = 95.88%   R-Sq(adj) = 97.59%

Analysis of Variance for average (coded units)
Source                DF   Seq SS     Adj SS     Adj MS     F        P
Main Effects          3    9183.7     9183.69    3061.23    173.81   0.000
2-Way Interactions    2    1590.6     1590.62    795.31     45.16    0.000
Residual Error        10   176.1      176.12     17.61
  Lack of Fit         2    60.6       60.62      30.31      2.10     0.185
  Pure Error          8    115.5      115.50     14.44
Total                 15   10950.4
The 2^k design and design optimality
The model parameter estimates in a 2^k design (and the effect estimates) are least squares estimates. For example, for a 2^2 design the model is
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + \epsilon
The 2^k design and design optimality
The four observations from a 2^2 design:
(1) = \beta_0 + \beta_1(-1) + \beta_2(-1) + \beta_{12}(-1)(-1) + \epsilon_1
a   = \beta_0 + \beta_1(1)  + \beta_2(-1) + \beta_{12}(1)(-1)  + \epsilon_2
b   = \beta_0 + \beta_1(-1) + \beta_2(1)  + \beta_{12}(-1)(1)  + \epsilon_3
ab  = \beta_0 + \beta_1(1)  + \beta_2(1)  + \beta_{12}(1)(1)   + \epsilon_4
In matrix form:
y = X\beta + \epsilon
The 2^k design and design optimality
The least squares estimate of \beta is
\hat{\beta} = (X'X)^{-1} X'y
X'X = \begin{bmatrix} 4 & 0 & 0 & 0 \\ 0 & 4 & 0 & 0 \\ 0 & 0 & 4 & 0 \\ 0 & 0 & 0 & 4 \end{bmatrix} = 4 I_4
X'y = \begin{bmatrix} (1) + a + b + ab \\ a + ab - b - (1) \\ b + ab - a - (1) \\ (1) - a - b + ab \end{bmatrix}    (the “usual” contrasts)
\hat{\beta} = \begin{bmatrix} \hat{\beta}_0 \\ \hat{\beta}_1 \\ \hat{\beta}_2 \\ \hat{\beta}_{12} \end{bmatrix} = \frac{1}{4} I_4 X'y = \begin{bmatrix} [(1) + a + b + ab]/4 \\ [a + ab - b - (1)]/4 \\ [b + ab - a - (1)]/4 \\ [(1) - a - b + ab]/4 \end{bmatrix}
The X'X matrix is diagonal, a consequence of an orthogonal design
The regression coefficient estimates are exactly half of the “usual” effect estimates
The 2^k design and design optimality
The matrix X'X has interesting and useful properties:
V(\hat{\beta}) = \sigma^2 (diagonal element of (X'X)^{-1}) = \frac{\sigma^2}{4}    (minimum possible value for a four-run design)
|X'X| = 256    (maximum possible value for a four-run design)
Notice that these results depend on both the design that you have chosen and the model
The 2^k design and design optimality
The 2^2 design is called a D-optimal design
In fact, every 2^k design is a D-optimal design for fitting the first-order model with interaction
Consider the variance of the predicted response in the 2^2 design:
The 2^k design and design optimality
V[\hat{y}(x_1, x_2)] = \sigma^2 x'(X'X)^{-1} x
x' = [1, x_1, x_2, x_1 x_2]
V[\hat{y}(x_1, x_2)] = \frac{\sigma^2}{4}(1 + x_1^2 + x_2^2 + x_1^2 x_2^2)
The maximum prediction variance occurs when x_1 = \pm 1, x_2 = \pm 1:
V[\hat{y}(x_1, x_2)] = \sigma^2
The prediction variance when x_1 = x_2 = 0 is
V[\hat{y}(x_1, x_2)] = \frac{\sigma^2}{4}
What about the average prediction variance over the design space?
The 2^k design and design optimality
I = \frac{1}{A} \int_{-1}^{1} \int_{-1}^{1} V[\hat{y}(x_1, x_2)] \, dx_1 dx_2
A = area of design space = 2^2 = 4
I = \frac{1}{4} \int_{-1}^{1} \int_{-1}^{1} \frac{\sigma^2}{4}(1 + x_1^2 + x_2^2 + x_1^2 x_2^2) \, dx_1 dx_2 = \frac{4\sigma^2}{9}
G-optimality: minimize the maximum prediction variance
The 2^2 design is called a G-optimal design
In fact, every 2^k design is a G-optimal design for fitting the first-order model with interaction
The 2^k design and design optimality
I-optimality: smallest possible value of the average prediction variance
The 2^2 design is called an I-optimal design
In fact, every 2^k design is an I-optimal design for fitting the first-order model with interaction
The 2^k design and design optimality
Minitab provides the “Select Optimal Design” function for reducing a full factorial design to a partial design or “fractional design”
It provides only the “D-optimal design”
One needs to have a full factorial design first and then choose the number of data points allowed
The 2^k design and design optimality
These results give us some assurance that these designs are “good” designs in some general ways
Factorial designs typically share some (most) of these properties
There are excellent computer routines for finding optimal designs
The addition of center points to a 2^k design
Based on the idea of replicating some of the runs in a factorial design
Runs at the center provide an estimate of error and allow the experimenter to distinguish between two possible models:
First-order model (interaction):
y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i=1}^{k} \sum_{j>i}^{k} \beta_{ij} x_i x_j + \epsilon
Second-order model (adds quadratic effects):
y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i=1}^{k} \sum_{j>i}^{k} \beta_{ij} x_i x_j + \sum_{i=1}^{k} \beta_{ii} x_i^2 + \epsilon
The addition of center points to a 2^k design
When adding center points, we assume that the k factors are quantitative
Example on a 2^2 design
The addition of center points to a 2^k design
Five points: (-,-), (-,+), (+,-), (+,+), and (0,0)
n_F = 4 and n_C = 4
Let \bar{y}_F be the average of the four runs at the four factorial points and let \bar{y}_C be the average of the n_C runs at the center point
The addition of center points to a 2^k design
If the difference \bar{y}_F - \bar{y}_C is small, the center points lie on or near the plane passing through the factorial points, and there is no quadratic curvature
The hypotheses are:
H_0: \sum_{i=1}^{k} \beta_{ii} = 0
H_1: \sum_{i=1}^{k} \beta_{ii} \neq 0
The addition of center points to a 2^k design
Test statistic:
SS_{Pure Quadratic} = \frac{n_F n_C (\bar{y}_F - \bar{y}_C)^2}{n_F + n_C}
with one degree of freedom
The addition of center points to a 2^k design -- example
Example 6.2 is a 2^4 factorial
Adding center points at x_1 = x_2 = x_3 = x_4 = 0 gives four additional responses (filtration rates): 73, 75, 66, 69
So \bar{y}_C = 70.75 and \bar{y}_F = 70.06
The addition of center points to a 2^k design -- example
Term                                    Effect     Coef      SE Coef   T       P
Constant                                           70.063    1.008     69.52   0.000
Temperature                             21.625     10.812    1.008     10.73   0.002
Pressure                                3.125      1.562     1.008     1.55    0.219
Conc.                                   9.875      4.937     1.008     4.90    0.016
Stir Rate                               14.625     7.312     1.008     7.26    0.005
Temperature*Pressure                    0.125      0.063     1.008     0.06    0.954
Temperature*Conc.                       -18.125    -9.063    1.008     -8.99   0.003
Temperature*Stir Rate                   16.625     8.313     1.008     8.25    0.004
Pressure*Conc.                          2.375      1.188     1.008     1.18    0.324
Pressure*Stir Rate                      -0.375     -0.187    1.008     -0.19   0.864
Conc.*Stir Rate                         -1.125     -0.563    1.008     -0.56   0.616
Temperature*Pressure*Conc.              1.875      0.937     1.008     0.93    0.421
Temperature*Pressure*Stir Rate          4.125      2.063     1.008     2.05    0.133
Temperature*Conc.*Stir Rate             -1.625     -0.813    1.008     -0.81   0.479
Pressure*Conc.*Stir Rate                -2.625     -1.312    1.008     -1.30   0.284
Temperature*Pressure*Conc.*Stir Rate    1.375      0.687     1.008     0.68    0.544
Ct Pt                                              0.687     2.253     0.31    0.780
The addition of center points to a 2^k design -- example
Analysis of Variance for Filtration (coded units)
Source                DF   Seq SS     Adj SS     Adj MS     F       P
Main Effects          4    3155.25    3155.25    788.813    48.54   0.005
2-Way Interactions    6    2447.88    2447.88    407.979    25.11   0.012
3-Way Interactions    4    120.25     120.25     30.062     1.85    0.320
4-Way Interactions    1    7.56       7.56       7.562      0.47    0.544
Curvature             1    1.51       1.51       1.512      0.09    0.780
Residual Error        3    48.75      48.75      16.250
  Pure Error          3    48.75      48.75      16.250
Total                 19   5781.20
The addition of center points to a 2^k design
If curvature is significant, augment the design with axial runs to create a central composite design. The CCD is a very effective design for fitting a second-order response surface model
The addition of center points to a 2^k design
Use current operating conditions as the center point
Check for “abnormal” conditions during the time the experiment was conducted
Check for time trends
Use center points as the first few runs when there is little or no information available about the magnitude of error
Center Points and Qualitative Factors
```
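The effect and sum-of-squares formulas for the 2^2 design can be checked against the yield data in the transcript. The following is a minimal sketch in plain Python; the variable names (`data`, `signs`, `contrast`) are illustrative choices, not anything from the slides or from Minitab. It uses the three replicates per treatment combination and the sign table in standard (Yates) order.

```python
# Yield data from the 2^2 slide: three replicates per treatment combination,
# listed in standard (Yates) order.
data = {
    "(1)": [28, 25, 27],
    "a":   [36, 32, 32],
    "b":   [18, 19, 23],
    "ab":  [31, 30, 29],
}
order = ["(1)", "a", "b", "ab"]
n = 3  # replicates per treatment combination

# Sign table for the contrasts A, B, AB (rows follow `order`).
signs = {
    "A":  [-1, +1, -1, +1],
    "B":  [-1, -1, +1, +1],
    "AB": [+1, -1, -1, +1],
}

totals = [sum(data[t]) for t in order]  # (1), a, b, ab treatment totals


def contrast(name):
    """Signed sum of treatment totals for the named effect."""
    return sum(s * t for s, t in zip(signs[name], totals))


# Effect = contrast / (2n); SS = contrast^2 / (4n).
effects = {k: contrast(k) / (2 * n) for k in signs}
ss = {k: contrast(k) ** 2 / (4 * n) for k in signs}

# Total and error sums of squares.
all_obs = [y for t in order for y in data[t]]
ss_total = sum(y * y for y in all_obs) - sum(all_obs) ** 2 / (4 * n)
ss_error = ss_total - sum(ss.values())

# Regression coefficients in coded units are half the effects.
coef = {k: e / 2 for k, e in effects.items()}
grand_mean = sum(all_obs) / (4 * n)

for k in signs:
    print(f"{k:>2}: effect = {effects[k]:7.3f}, coef = {coef[k]:7.3f}, SS = {ss[k]:8.3f}")
print(f"SS_T = {ss_total:.3f}, SS_E = {ss_error:.3f}, mean = {grand_mean:.3f}")
```

The printed effects (8.333, -5.000, 1.667), the total sum of squares (323.000), and the error sum of squares (31.333) agree with the Minitab output for the yield example; the main-effects line of that ANOVA table is SS_A + SS_B.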
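The optimality claims for the 2^2 design (diagonal X'X, |X'X| = 256, maximum scaled prediction variance of 1, average of 4/9) can be verified numerically. This sketch assumes the first-order model with interaction and works in units of sigma^2; the helper name `scaled_pred_var` and the midpoint-grid integration are illustrative choices, not part of the slides.

```python
# Model matrix for the 2^2 design with columns [1, x1, x2, x1*x2],
# runs in standard order (1), a, b, ab.
runs = [(-1, -1), (1, -1), (-1, 1), (1, 1)]
X = [[1, x1, x2, x1 * x2] for x1, x2 in runs]
p = 4

# X'X computed directly; for an orthogonal design it is 4 * I.
XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
is_diagonal = all(XtX[i][j] == 0 for i in range(p) for j in range(p) if i != j)

# For a diagonal matrix the determinant is the product of the diagonal.
det = 1
for i in range(p):
    det *= XtX[i][i]


def scaled_pred_var(x1, x2):
    """V[y_hat(x1, x2)] / sigma^2 = x'(X'X)^{-1}x, valid because X'X is diagonal."""
    x = [1, x1, x2, x1 * x2]
    return sum(xi * xi / XtX[i][i] for i, xi in enumerate(x))


# Average scaled prediction variance over [-1, 1]^2 via a midpoint grid.
m = 200
h = 2.0 / m
grid = [-1 + (i + 0.5) * h for i in range(m)]
avg_var = sum(scaled_pred_var(u, v) for u in grid for v in grid) / (m * m)

print("X'X diagonal:", is_diagonal, " |X'X| =", det)
print("variance at corner:", scaled_pred_var(1, 1))
print("variance at center:", scaled_pred_var(0, 0))
print("average variance  :", round(avg_var, 4))
```

The corner value of 1 and the center value of 1/4 match the slide, and the grid average approaches 4/9 (about 0.4444) as the grid is refined.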
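The single-degree-of-freedom curvature test can be reproduced from the four center-point responses quoted in the example. This is a hedged sketch: the factorial-point average is taken as the rounded "Constant" coefficient (70.063) from the transcript rather than recomputed from the 16 factorial runs, which the transcript does not list, and the variable names are illustrative.

```python
# Center-point filtration rates quoted in the transcript.
center = [73, 75, 66, 69]
n_c = len(center)

n_f = 16         # runs in the unreplicated 2^4 factorial
ybar_f = 70.063  # assumption: rounded factorial average from the "Constant" coefficient

ybar_c = sum(center) / n_c

# Curvature (pure quadratic) sum of squares, one degree of freedom.
ss_curvature = n_f * n_c * (ybar_f - ybar_c) ** 2 / (n_f + n_c)

# Pure error is estimated from the replicated center points only.
ss_pure_error = sum((y - ybar_c) ** 2 for y in center)
df_pure_error = n_c - 1
ms_pure_error = ss_pure_error / df_pure_error

f_curvature = ss_curvature / ms_pure_error

print(f"ybar_C = {ybar_c:.2f}")
print(f"SS_curvature = {ss_curvature:.3f}")
print(f"SS_pure_error = {ss_pure_error:.2f} on {df_pure_error} df")
print(f"F = {f_curvature:.3f}")
```

Within rounding, these values agree with the Curvature and Pure Error lines of the ANOVA table (SS of 1.51 and 48.75, F of 0.09), so the data give no indication of quadratic curvature.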