#### Transcript Chapter 6 The 2k Factorial Design

## Chapter 6 The $2^k$ Factorial Design

## 6.1 Introduction

• The $2^k$ design is a special case of the general factorial design (Chapter 5)
• k factors, each with only two levels
• Levels:
  – quantitative (temperature, pressure, …) or qualitative (machine, operator, …)
  – High and low
  – Each replicate has $2 \times 2 \times \cdots \times 2 = 2^k$ observations

• Assumptions: (1) the factors are fixed, (2) the design is completely randomized, and (3) the usual normality assumptions are satisfied
• Widely used in **factor screening experiments**

## 6.2 The $2^2$ Factorial Design

• Two factors, A and B, and each factor has two levels, low and high.

• Example: the concentration of reactant vs. the amount of catalyst (page 208)

• "-" and "+" denote the low and high levels of a factor, respectively
• Low and high are arbitrary terms
• Geometrically, the four runs form the corners of a square
• Factors can be quantitative or qualitative, although their treatment in the final model will be different

• Average effect of a factor = the change in response produced by a change in the level of that factor, averaged over the levels of the other factors.

• $(1)$, $a$, $b$ and $ab$: the totals of the $n$ replicates taken at each treatment combination.

• The main effects:

$$A = \frac{1}{2n}\{[ab - b] + [a - (1)]\} = \frac{1}{2n}[ab + a - b - (1)]$$

$$A = \bar{y}_{A^+} - \bar{y}_{A^-} = \frac{ab + a}{2n} - \frac{b + (1)}{2n}$$

$$B = \frac{1}{2n}\{[ab - a] + [b - (1)]\} = \frac{1}{2n}[ab + b - a - (1)]$$

$$B = \bar{y}_{B^+} - \bar{y}_{B^-} = \frac{ab + b}{2n} - \frac{a + (1)}{2n}$$

• The interaction effect:

$$AB = \frac{1}{2n}\{[ab - b] - [a - (1)]\} = \frac{1}{2n}[ab + (1) - a - b] = \frac{ab + (1)}{2n} - \frac{a + b}{2n}$$

• In that example, $A = 8.33$, $B = -5.00$ and $AB = 1.67$
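The effect formulas above can be sketched in a few lines of Python. The treatment-combination totals and `n = 3` below are assumptions: they do not appear in these notes and were chosen only because they reproduce the quoted estimates (8.33, -5.00, 1.67). The function name `effects_2x2` is hypothetical.

```python
# Treatment-combination totals over n replicates.
# ASSUMPTION: illustrative totals, chosen to be consistent with the
# estimates quoted in the notes; they are not taken from this transcript.
n = 3
totals = {"(1)": 80, "a": 100, "b": 60, "ab": 90}


def effects_2x2(t, n):
    """Return the A, B and AB effect estimates of a replicated 2^2 design."""
    one, a, b, ab = t["(1)"], t["a"], t["b"], t["ab"]
    return {
        "A": (ab + a - b - one) / (2 * n),   # average change when A goes low -> high
        "B": (ab + b - a - one) / (2 * n),   # average change when B goes low -> high
        "AB": (ab + one - a - b) / (2 * n),  # half the difference of the two A effects
    }


effects = effects_2x2(totals, n)
for name, value in effects.items():
    print(f"{name} = {value:.2f}")
```

Each numerator is the contrast for that effect, so the same totals can be reused for the sums of squares.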

• Analysis of variance
• The total effects (contrasts):

$$\text{Contrast}_A = ab + a - b - (1)$$

$$\text{Contrast}_B = ab + b - a - (1)$$

$$\text{Contrast}_{AB} = ab + (1) - a - b$$

• Sum of squares:

$$SS_A = \frac{[ab + a - b - (1)]^2}{4n}, \qquad SS_B = \frac{[ab + b - a - (1)]^2}{4n}, \qquad SS_{AB} = \frac{[ab + (1) - a - b]^2}{4n}$$

$$SS_T = \sum_{i=1}^{2}\sum_{j=1}^{2}\sum_{k=1}^{n} y_{ijk}^2 - \frac{y_{\cdot\cdot\cdot}^2}{4n}$$

$$SS_E = SS_T - SS_A - SS_B - SS_{AB}$$

**Response: Conversion. ANOVA for Selected Factorial Model. Analysis of variance table [Partial sum of squares]**

| Source | Sum of Squares | DF | Mean Square | F Value | Prob > F |
|---|---|---|---|---|---|
| Model | 291.67 | 3 | 97.22 | 24.82 | 0.0002 |
| *A* | 208.33 | 1 | 208.33 | 53.19 | < 0.0001 |
| *B* | 75.00 | 1 | 75.00 | 19.15 | 0.0024 |
| *AB* | 8.33 | 1 | 8.33 | 2.13 | 0.1828 |
| Pure Error | 31.33 | 8 | 3.92 | | |
| Cor Total | 323.00 | 11 | | | |

| Statistic | Value | Statistic | Value |
|---|---|---|---|
| Std. Dev. | 1.98 | R-Squared | 0.9030 |
| Mean | 27.50 | Adj R-Squared | 0.8666 |
| C.V. | 7.20 | Pred R-Squared | 0.7817 |
| PRESS | 70.50 | Adeq Precision | 11.669 |

**The F test for the "model" source tests the significance of the overall model; that is, is A, B, or AB, or some combination of these effects, important?**
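As a rough check on how the ANOVA columns relate to one another, the sketch below recomputes the F values and the R-squared statistics from the sums of squares and degrees of freedom in the table. It uses only the numbers quoted in these notes; small differences in the last digit come from rounding in the printed table.

```python
# Sums of squares and degrees of freedom as quoted in the ANOVA table.
ss_effects = {"A": 208.33, "B": 75.00, "AB": 8.33}  # 1 df each
ss_error, df_error = 31.33, 8
ss_total, df_total = 323.00, 11

# Mean square for error and F ratio for each single-df effect.
ms_error = ss_error / df_error
f_values = {term: ss / ms_error for term, ss in ss_effects.items()}

# Model-level summaries.
ss_model = ss_total - ss_error
r_squared = ss_model / ss_total
adj_r_squared = 1 - (ss_error / df_error) / (ss_total / df_total)
std_dev = ms_error ** 0.5

for term, f in f_values.items():
    print(f"F({term}) = {f:.2f}")
print(f"R^2 = {r_squared:.4f}, adjusted R^2 = {adj_r_squared:.4f}, s = {std_dev:.2f}")
```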

• Table of plus and minus signs:

| Treatment combination | I | A | B | AB |
|---|---|---|---|---|
| (1) | + | – | – | + |
| *a* | + | + | – | – |
| *b* | + | – | + | – |
| *ab* | + | + | + | + |

• The regression model:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$$

– $x_1$ and $x_2$ are coded variables that represent the two factors, i.e. $x_1$ (or $x_2$) only takes the values –1 and 1:

$$x_1 = \frac{\text{Conc} - (\text{Conc}_{low} + \text{Conc}_{high})/2}{(\text{Conc}_{high} - \text{Conc}_{low})/2}, \qquad x_2 = \frac{\text{Catalyst} - (\text{Catalyst}_{low} + \text{Catalyst}_{high})/2}{(\text{Catalyst}_{high} - \text{Catalyst}_{low})/2}$$

– Use the least squares method to estimate the coefficients
– For that example,

$$\hat{y} = 27.5 + \left(\frac{8.33}{2}\right) x_1 + \left(\frac{-5.00}{2}\right) x_2$$

– Model adequacy: residuals (pages 213–214)

• Response surface plot:

$$\hat{y} = 18.33 + 0.8333\,\text{Conc} - 5.00\,\text{Catalyst}$$

– Figure 6.3
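The step from the coded-variable model to the natural-unit equation can be illustrated with a short sketch. The factor levels below (concentration 15 and 25, catalyst 1 and 2) are assumptions that are not stated in these notes; with them, the coded model reproduces the natural-unit coefficients up to rounding of the quoted 8.33. The helper names are hypothetical.

```python
# ASSUMPTION: low/high levels in natural units; not given in the notes.
CONC_LOW, CONC_HIGH = 15.0, 25.0   # reactant concentration
CAT_LOW, CAT_HIGH = 1.0, 2.0       # amount of catalyst


def to_coded(value, low, high):
    """Map a natural-unit value to the coded scale where low -> -1, high -> +1."""
    center = (low + high) / 2
    half_range = (high - low) / 2
    return (value - center) / half_range


def predict_coded(x1, x2):
    """Fitted model in coded units; each coefficient is half the effect estimate."""
    return 27.5 + (8.33 / 2) * x1 + (-5.00 / 2) * x2


def predict_natural(conc, catalyst):
    """Same fitted model, evaluated from natural-unit inputs."""
    return predict_coded(
        to_coded(conc, CONC_LOW, CONC_HIGH),
        to_coded(catalyst, CAT_LOW, CAT_HIGH),
    )


# Recover the natural-unit intercept and slopes by evaluating the model.
intercept = predict_natural(0.0, 0.0)
slope_conc = predict_natural(1.0, 0.0) - intercept
slope_catalyst = predict_natural(0.0, 1.0) - intercept
print(f"y_hat = {intercept:.2f} + {slope_conc:.4f} Conc + ({slope_catalyst:.2f}) Catalyst")
```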

## 6.3 The $2^3$ Design

• Three factors, A, B and C, each with two levels (Figure 6.4(a))
• Design matrix (Figure 6.4(b))
• Treatment combinations: (1), *a, b, ab, c, ac, bc, abc*
• Seven degrees of freedom: one for each of the three main effects and one for each of the four interactions

• Estimate main effect:

$$A = \frac{1}{4n}[a - (1) + ab - b + ac - c + abc - bc]$$

$$A = \bar{y}_{A^+} - \bar{y}_{A^-} = \frac{a + ab + ac + abc}{4n} - \frac{(1) + b + c + bc}{4n} = \frac{1}{4n}[a + ab + ac + abc - (1) - b - c - bc]$$

• Estimate two-factor interaction: one-half the difference between the average A effects at the two levels of B

$$AB = \frac{1}{4n}[abc - bc + ab - b - ac + c - a + (1)] = \frac{abc + ab + c + (1)}{4n} - \frac{bc + b + ac + a}{4n}$$

• Three-factor interaction:

$$ABC = \frac{1}{4n}\{[abc - bc] - [ac - c] - [ab - b] + [a - (1)]\} = \frac{1}{4n}[abc - bc - ac + c - ab + b + a - (1)]$$

• Contrast: Table 6.3
  – Every column except I has an equal number of plus and minus signs
  – The inner product of any two columns = 0
  – I is an identity element
  – The product of any two columns yields another column of the table
  – Orthogonal design
• Sum of squares: $SS = (\text{Contrast})^2 / 8n$

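**Table of – and + Signs for the $2^3$ Factorial Design (pg. 218)**

| Treatment Combination | *I* | *A* | *B* | *AB* | *C* | *AC* | *BC* | *ABC* |
|---|---|---|---|---|---|---|---|---|
| (1) | + | – | – | + | – | + | + | – |
| *a* | + | + | – | – | – | – | + | + |
| *b* | + | – | + | – | – | + | – | + |
| *ab* | + | + | + | + | – | – | – | – |
| *c* | + | – | – | + | + | – | – | + |
| *ac* | + | + | – | – | + | + | – | – |
| *bc* | + | – | + | – | + | – | + | – |
| *abc* | + | + | + | + | + | + | + | + |
| Contrast | | 24 | 18 | 6 | 14 | 2 | 4 | 4 |
| Effect | | 3.00 | 2.25 | 0.75 | 1.75 | 0.25 | 0.50 | 0.50 |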
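The sign table and its listed properties can be generated and checked mechanically. The following is a minimal sketch, not the textbook's procedure: `sign_table` is a hypothetical helper that builds the columns for any number of factors by multiplying the main-effect signs, with rows in standard order.

```python
from itertools import combinations, product
from math import prod


def sign_table(factors):
    """Build the +/-1 table of a 2^k design; rows are in standard order."""
    k = len(factors)
    # product() varies the last position fastest; reversing makes A vary fastest.
    runs = [levels[::-1] for levels in product((-1, 1), repeat=k)]
    # Every subset of factors is a column: () is I, then main effects, interactions.
    effects = [c for r in range(k + 1) for c in combinations(range(k), r)]
    table = {}
    for levels in runs:
        label = "".join(f.lower() for f, s in zip(factors, levels) if s == 1) or "(1)"
        table[label] = {
            ("".join(factors[i] for i in eff) or "I"): prod(levels[i] for i in eff)
            for eff in effects
        }
    return table


table = sign_table("ABC")
names = list(table["(1)"])
columns = {name: [row[name] for row in table.values()] for name in names}

# Property checks corresponding to the bullet list for the contrast table.
balanced = all(sum(columns[name]) == 0 for name in names if name != "I")
orthogonal = all(
    sum(u * v for u, v in zip(columns[p], columns[q])) == 0
    for p, q in combinations(names, 2)
)
a_times_b_is_ab = [u * v for u, v in zip(columns["A"], columns["B"])] == columns["AB"]

for label, row in table.items():
    print(f"{label:>4}", " ".join("+" if row[name] > 0 else "-" for name in names))
```

The columns are emitted grouped by interaction order (I, A, B, C, AB, AC, BC, ABC), which differs from the column order of the printed table but contains the same signs.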

• Example 6.1: *A* = gap, *B* = flow, *C* = power, *y* = etch rate

• The regression model and response surface:
  – The regression model:

$$\hat{y} = 776.0625 - \left(\frac{101.625}{2}\right) x_1 + \left(\frac{306.125}{2}\right) x_3 - \left(\frac{153.625}{2}\right) x_1 x_3$$

  – Response surface and contour plot (Figure 6.7)

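## 6.4 The General $2^k$ Design

• k factors, each with two levels
• Interactions: $\binom{k}{2}$ two-factor interactions, $\binom{k}{3}$ three-factor interactions, …, and one k-factor interaction
• The standard order for a $2^4$ design: (1), *a, b, ab, c, ac, bc, abc, d, ad, bd, abd, cd, acd, bcd, abcd*
• The general approach for the statistical analysis:
  – Estimate factor effects
  – Form initial model (full model)
  – Perform analysis of variance (Table 6.9)
  – Refine the model
  – Analyze residuals
  – Interpret results
• Contrast, effect estimate and sum of squares for the effect $AB\cdots K$:

$$\text{Contrast}_{AB\cdots K} = (a \pm 1)(b \pm 1)\cdots(k \pm 1)$$

The sign in each set of parentheses is negative if the factor is included in the effect and positive otherwise; after expanding, "1" is replaced by (1).

$$AB\cdots K = \frac{2}{n 2^k}\left(\text{Contrast}_{AB\cdots K}\right)$$

$$SS_{AB\cdots K} = \frac{1}{n 2^k}\left(\text{Contrast}_{AB\cdots K}\right)^2$$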
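A small sketch of the general formulas: `contrast_coefficient` (a hypothetical helper) gives the sign that a treatment combination receives when the product $(a \pm 1)(b \pm 1)\cdots$ is expanded, and `effect_and_ss` applies the effect and sum-of-squares formulas to a contrast. The usage lines take the A contrast of 24 from the $2^3$ sign table; the value n = 2 is inferred from that table's contrast and effect rows (24 / 4n = 3.00), not stated in the notes.

```python
def contrast_coefficient(effect, run):
    """Sign of treatment combination `run` (e.g. "ab", "(1)") in the
    contrast for `effect` (e.g. "AB"), from expanding (a -/+ 1)(b -/+ 1)...

    A factor in the effect contributes +1 if its letter is in the run and
    -1 otherwise; a factor not in the effect always contributes +1.
    """
    present = set() if run == "(1)" else set(run.upper())
    coeff = 1
    for factor in effect:
        coeff *= 1 if factor in present else -1
    return coeff


def effect_and_ss(contrast, n, k):
    """Effect estimate and sum of squares for one contrast of a 2^k design."""
    runs = n * 2 ** k
    return 2 * contrast / runs, contrast ** 2 / runs


# 2^2 check: the AB contrast is ab + (1) - a - b.
print({run: contrast_coefficient("AB", run) for run in ["(1)", "a", "b", "ab"]})

# 2^3 usage with the A contrast from the sign table (n = 2 is inferred).
effect_a, ss_a = effect_and_ss(contrast=24, n=2, k=3)
print(effect_a, ss_a)
```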

## 6.5 A Single Replicate of the $2^k$ Design

• These are $2^k$ factorial designs with **one observation** at each corner of the "cube"
• An unreplicated $2^k$ factorial design is also sometimes called a "**single replicate**" of the $2^k$
• If the factor levels are spaced too closely, the chance increases that the noise will overwhelm the signal in the data

• Lack of replication causes potential **problems** in statistical testing
  – Replication admits an estimate of "pure error" (a better phrase is an **internal estimate** of error)
  – With no replication, fitting the full model results in zero degrees of freedom for error
• Potential **solutions** to this problem
  – Pooling high-order interactions to estimate error (**sparsity of effects principle**)
  – **Normal probability plotting** of effects (Daniel, 1959)

• Example 6.2 (A single replicate of the $2^4$ design)
  – A $2^4$ factorial was used to investigate the effects of four factors on the filtration rate of a resin
  – The factors are *A* = temperature, *B* = pressure, *C* = concentration of formaldehyde, *D* = stirring rate

• Estimates of the effects

| | Term | Effect | SumSqr | % Contribution |
|---|---|---|---|---|
| Model | Intercept | | | |
| Error | A | 21.625 | 1870.56 | 32.6397 |
| Error | B | 3.125 | 39.0625 | 0.681608 |
| Error | C | 9.875 | 390.062 | 6.80626 |
| Error | D | 14.625 | 855.563 | 14.9288 |
| Error | AB | 0.125 | 0.0625 | 0.00109057 |
| Error | AC | -18.125 | 1314.06 | 22.9293 |
| Error | AD | 16.625 | 1105.56 | 19.2911 |
| Error | BC | 2.375 | 22.5625 | 0.393696 |
| Error | BD | -0.375 | 0.5625 | 0.00981515 |
| Error | CD | -1.125 | 5.0625 | 0.0883363 |
| Error | ABC | 1.875 | 14.0625 | 0.245379 |
| Error | ABD | 4.125 | 68.0625 | 1.18763 |
| Error | ACD | -1.625 | 10.5625 | 0.184307 |
| Error | BCD | -2.625 | 27.5625 | 0.480942 |
| Error | ABCD | 1.375 | 7.5625 | 0.131959 |

Lenth's ME = 6.74778, Lenth's SME = 13.699
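The columns of this table are tied together by simple arithmetic, sketched below with the effect estimates as listed. For a single replicate with N = 16 runs, each sum of squares is N × effect² / 4, and the percent contribution is that sum of squares divided by the total. The Lenth margin of error is computed with the usual pseudo-standard-error recipe; that recipe and the hardcoded t quantile are not described in the notes and are included here as assumptions.

```python
from statistics import median

# Effect estimates as listed in the table.
effects = {
    "A": 21.625, "B": 3.125, "C": 9.875, "D": 14.625,
    "AB": 0.125, "AC": -18.125, "AD": 16.625, "BC": 2.375,
    "BD": -0.375, "CD": -1.125, "ABC": 1.875, "ABD": 4.125,
    "ACD": -1.625, "BCD": -2.625, "ABCD": 1.375,
}
N = 16  # runs in a single replicate of the 2^4 design

# Sum of squares and percent contribution of each effect.
ss = {term: N * e ** 2 / 4 for term, e in effects.items()}
ss_total = sum(ss.values())
pct = {term: 100 * s / ss_total for term, s in ss.items()}

# Lenth's pseudo standard error (PSE) and margin of error (ME).
abs_effects = [abs(e) for e in effects.values()]
s0 = 1.5 * median(abs_effects)
pse = 1.5 * median(e for e in abs_effects if e < 2.5 * s0)
T_975_5DF = 2.5706  # ASSUMPTION: tabled t quantile, 0.975, m/3 = 5 df
lenth_me = T_975_5DF * pse

for term in sorted(effects, key=lambda t: -abs(effects[t]))[:5]:
    print(f"{term:>3}: SS = {ss[term]:.2f}, {pct[term]:.2f}%")
print(f"PSE = {pse:.3f}, ME = {lenth_me:.4f}")
```

Effects whose absolute value exceeds the margin of error are the ones that stand out; here those are A, C, D, AC and AD.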

• The normal probability plot of the effects

[Figure: Design-Expert normal probability plot of the effect estimates for filtration rate (x-axis: effect; y-axis: normal % probability). Labeled points: A, C, D, AC, AD.]

[Figure: Design-Expert interaction graphs for filtration rate. First graph: A (temperature) against C (concentration) at C– and C+, with B (pressure) = 0 and D (stirring rate) = 0. Second graph: A (temperature) against D (stirring rate) at D– and D+, with B (pressure) = 0 and C (concentration) = 0.]

• B is not significant and all interactions involving B are negligible
• Design projection: $2^4$ design => $2^3$ design in A, C and D
• ANOVA table (Table 6.13)

**Response: Filtration Rate. ANOVA for Selected Factorial Model. Analysis of variance table [Partial sum of squares]**

| Source | Sum of Squares | DF | Mean Square | F Value | Prob > F |
|---|---|---|---|---|---|
| Model | 5535.81 | 5 | 1107.16 | 56.74 | < 0.0001 |
| *A* | 1870.56 | 1 | 1870.56 | 95.86 | < 0.0001 |
| *C* | 390.06 | 1 | 390.06 | 19.99 | 0.0012 |
| *D* | 855.56 | 1 | 855.56 | 43.85 | < 0.0001 |
| *AC* | 1314.06 | 1 | 1314.06 | 67.34 | < 0.0001 |
| *AD* | 1105.56 | 1 | 1105.56 | 56.66 | < 0.0001 |
| Residual | 195.12 | 10 | 19.51 | | |
| Cor Total | 5730.94 | 15 | | | |

| Statistic | Value | Statistic | Value |
|---|---|---|---|
| Std. Dev. | 4.42 | R-Squared | 0.9660 |
| Mean | 70.06 | Adj R-Squared | 0.9489 |
| C.V. | 6.30 | Pred R-Squared | 0.9128 |
| PRESS | 499.52 | Adeq Precision | 20.841 |

• The regression model:

**Final Equation in Terms of Coded Factors:**

Filtration Rate =
+70.06250
+10.81250 * Temperature
+4.93750 * Concentration
+7.31250 * Stirring Rate
-9.06250 * Temperature * Concentration
+8.31250 * Temperature * Stirring Rate

• Residual analysis (p. 235)
• Response surface (p. 236)
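Each coefficient in the coded-factor equation is half of the corresponding effect estimate, with the grand mean as intercept. The sketch below rebuilds the equation from the effect estimates reported earlier for A, C, D, AC and AD and evaluates it at one corner of the design; the function name `predict` and the chosen corner are illustrative.

```python
GRAND_MEAN = 70.0625

# Effect estimates of the terms kept in the reduced model.
EFFECTS = {
    ("A",): 21.625,        # temperature
    ("C",): 9.875,         # concentration
    ("D",): 14.625,        # stirring rate
    ("A", "C"): -18.125,
    ("A", "D"): 16.625,
}


def predict(levels):
    """Predicted filtration rate at coded settings, e.g. {"A": 1, "C": -1, "D": 1}."""
    y_hat = GRAND_MEAN
    for term, effect in EFFECTS.items():
        x = 1
        for factor in term:
            x *= levels[factor]
        y_hat += (effect / 2) * x  # regression coefficient = effect / 2
    return y_hat


coefficients = {"*".join(term): effect / 2 for term, effect in EFFECTS.items()}
print(coefficients)
print(predict({"A": 1, "C": -1, "D": 1}))
```

At the design center (all coded factors equal to 0) the prediction is simply the grand mean.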

• Half-normal plot: the absolute values of the effect estimates plotted against their cumulative normal probabilities.

[Figure: Design-Expert half-normal plot for filtration rate (x-axis: |effect|; y-axis: half-normal % probability). Labeled points: A, AC, AD, D, C.]

• Example 6.3 (Data transformation in a factorial design): *A* = drill load, *B* = flow, *C* = speed, *D* = type of mud, *y* = advance rate of the drill

• The half-normal probability plot of the effect estimates

[Figure: Design-Expert half-normal plot for adv._rate (x-axis: |effect|; y-axis: half-normal % probability). Labeled points: B, C, D, BC, BD.]

• Residual analysis

[Figure: Design-Expert normal plot of residuals and plot of residuals vs. predicted values for adv._rate.]

• The residual plots indicate that there are problems with the **equality of variance** assumption
• The usual approach to this problem is to employ a **transformation** on the response
• In this example, $y^* = \ln y$

[Figure: Design-Expert half-normal plot for Ln(adv._rate) (x-axis: |effect|; y-axis: half-normal % probability). Labeled points: B, C, D.]

• Three main effects are large
• No indication of large interaction effects
• What happened to the interactions?

**Response: adv._rate. Transform: Natural log. Constant: 0.000. ANOVA for Selected Factorial Model. Analysis of variance table [Partial sum of squares]**

| Source | Sum of Squares | DF | Mean Square | F Value | Prob > F |
|---|---|---|---|---|---|
| Model | 7.11 | 3 | 2.37 | 164.82 | < 0.0001 |
| *B* | 5.35 | 1 | 5.35 | 371.49 | < 0.0001 |
| *C* | 1.34 | 1 | 1.34 | 93.05 | < 0.0001 |
| *D* | 0.43 | 1 | 0.43 | 29.92 | 0.0001 |
| Residual | 0.17 | 12 | 0.014 | | |
| Cor Total | 7.29 | 15 | | | |

| Statistic | Value | Statistic | Value |
|---|---|---|---|
| Std. Dev. | 0.12 | R-Squared | 0.9763 |
| Mean | 1.60 | Adj R-Squared | 0.9704 |
| C.V. | 7.51 | Pred R-Squared | 0.9579 |
| PRESS | 0.31 | Adeq Precision | 34.391 |

• Following the log transformation

**Final Equation in Terms of Coded Factors:**

Ln(adv._rate) =
+1.60
+0.58 * B
+0.29 * C
+0.16 * D
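One way to read the log-scale model is that it is additive in ln(y) and therefore multiplicative in y, which is a plausible reason the interaction terms disappear after the transformation. The sketch below evaluates the fitted equation and back-transforms it; the function names are hypothetical and the coefficients are the rounded values quoted above.

```python
import math


def predict_log_rate(b, c, d):
    """Fitted ln(advance rate) at coded settings of B (flow), C (speed), D (mud)."""
    return 1.60 + 0.58 * b + 0.29 * c + 0.16 * d


def predict_rate(b, c, d):
    """Back-transform the log-scale prediction to the original response scale."""
    return math.exp(predict_log_rate(b, c, d))


# Range of the log-scale predictions over the corners of the design.
low_corner = predict_log_rate(-1, -1, -1)
high_corner = predict_log_rate(1, 1, 1)

# On the original scale, moving B from low to high multiplies the predicted
# rate by exp(2 * 0.58), regardless of the settings of C and D.
ratio_b = predict_rate(1, 0, 0) / predict_rate(-1, 0, 0)
print(f"ln-scale predictions span {low_corner:.2f} to {high_corner:.2f}")
print(f"multiplicative effect of B: {ratio_b:.3f}")
```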

[Figure: Design-Expert normal plot of residuals and plot of residuals vs. predicted values for Ln(adv._rate).]

• Example 6.4:
  – Two factors (A and C) affect the mean number of defects
  – A third factor (B) affects variability
  – Residual plots were useful in identifying the dispersion effect
  – The magnitude of the dispersion effects:

$$F_i^* = \ln\frac{S^2(i^+)}{S^2(i^-)}$$

  where $S(i^+)$ and $S(i^-)$ are the standard deviations of the residuals at the high and low levels of factor $i$
  – When the variances at the positive and negative levels are equal, this statistic has an approximate normal distribution
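A minimal sketch of the dispersion statistic follows. The residuals and the sign column are made-up illustrative numbers, not data from Example 6.4, and `dispersion_effect` is a hypothetical helper; it splits the residuals by the sign of one column and takes the log of the ratio of sample variances.

```python
import math
from statistics import variance


def dispersion_effect(residuals, signs):
    """F* = ln( S^2(i+) / S^2(i-) ) for one factor or interaction column."""
    plus = [r for r, s in zip(residuals, signs) if s > 0]
    minus = [r for r, s in zip(residuals, signs) if s < 0]
    return math.log(variance(plus) / variance(minus))


# ASSUMPTION: hypothetical residuals from an 8-run design, for illustration only.
residuals = [1.0, 2.0, -1.0, -2.0, 2.0, 4.0, -2.0, -4.0]
column_b = [-1, 1, -1, 1, -1, 1, -1, 1]  # hypothetical sign column for factor B

f_star_b = dispersion_effect(residuals, column_b)
print(f"F*_B = {f_star_b:.3f}")
```

A value near zero suggests similar spread at both levels; a large positive or negative value flags a possible dispersion effect for that column.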

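## 6.7 $2^k$ Designs are Optimal Designs

• Consider a $2^2$ design with one replicate.

• Fit the following model:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + \varepsilon$$

• Matrix form: $y = X\beta + \varepsilon$, with

$$y = \begin{bmatrix} (1) \\ a \\ b \\ ab \end{bmatrix}, \qquad X = \begin{bmatrix} 1 & -1 & -1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & 1 & 1 \end{bmatrix}, \qquad \beta = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \beta_{12} \end{bmatrix}$$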

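• The LS estimate:

$$\hat{\beta} = (X'X)^{-1}X'y = \frac{1}{4} X'y = \begin{bmatrix} \dfrac{(1) + a + b + ab}{4} \\[2mm] \dfrac{a + ab - b - (1)}{4} \\[2mm] \dfrac{b + ab - a - (1)}{4} \\[2mm] \dfrac{(1) - a - b + ab}{4} \end{bmatrix}$$

• D-optimal criterion, $|X'X|$: the volume of the joint confidence region that contains all coefficients is inversely proportional to the square root of $|X'X|$.
• G-optimal design: $\min \max \operatorname{Var}(\hat{y})$, where

$$\operatorname{Var}(\hat{y}) = \frac{\sigma^2}{4}\left(1 + x_1^2 + x_2^2 + x_1^2 x_2^2\right)$$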
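The quantities behind the D- and G-criteria can be computed directly for this small design. The following is a pure-Python sketch with hypothetical variable names: it forms $X'X$ for the four runs, confirms that it equals $4I$ (so its determinant is $4^4$ and its inverse is $I/4$), and evaluates the prediction variance in units of $\sigma^2$.

```python
# Runs of the 2^2 design in standard order: (1), a, b, ab.
runs = [(-1, -1), (1, -1), (-1, 1), (1, 1)]

# Model matrix with columns: intercept, x1, x2, x1*x2.
X = [[1, x1, x2, x1 * x2] for x1, x2 in runs]
p = len(X[0])

# X'X, computed entry by entry.
xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]

# The columns are orthogonal, so X'X is diagonal and its determinant is the
# product of the diagonal entries.
is_diagonal = all(xtx[i][j] == 0 for i in range(p) for j in range(p) if i != j)
det_xtx = 1
for i in range(p):
    det_xtx *= xtx[i][i]


def scaled_prediction_variance(x1, x2):
    """Var(y_hat) / sigma^2 = x'(X'X)^{-1}x, using (X'X)^{-1} = I / 4."""
    x = [1, x1, x2, x1 * x2]
    return sum(v * v for v in x) / 4


print(xtx, det_xtx)
print(scaled_prediction_variance(1, 1), scaled_prediction_variance(0, 0))
```

Over the square $-1 \le x_1, x_2 \le 1$ the scaled variance is largest at the corners, where it equals 1, i.e. $\operatorname{Var}(\hat{y}) = \sigma^2$.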

## 6.8 The Addition of Center Points to the $2^k$ Design

• Based on the idea of replicating **some** of the runs in a factorial design
• Runs at the center provide an estimate of error and allow the experimenter to distinguish between two possible models:

First-order model (with interaction):

$$y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i<j} \beta_{ij} x_i x_j + \varepsilon$$

Second-order model:

$$y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i<j} \beta_{ij} x_i x_j + \sum_{i=1}^{k} \beta_{ii} x_i^2 + \varepsilon$$

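• To detect the possibility of quadratic effects: add center points
• $\bar{y}_F = \bar{y}_C \Rightarrow$ no "curvature", where $\bar{y}_F$ is the average of the $n_F$ factorial runs and $\bar{y}_C$ is the average of the $n_C$ center runs
• The hypotheses are:

$$H_0: \sum_{i=1}^{k} \beta_{ii} = 0 \qquad H_1: \sum_{i=1}^{k} \beta_{ii} \neq 0$$

$$SS_{\text{Pure Quad}} = \frac{n_F\, n_C\, (\bar{y}_F - \bar{y}_C)^2}{n_F + n_C}$$

• This sum of squares has a single degree of freedom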

• Example 6.6

Refer to the original experiment shown in Table 6.10. Suppose that four center points ($n_C = 4$) are added to this experiment, and at the points $x_1 = x_2 = x_3 = x_4 = 0$ the four observed filtration rates were 73, 75, 66, and 69. The average of these four center points is 70.75, and the average of the 16 factorial runs is 70.06. Since these two averages are very similar, we suspect that there is no strong curvature present.

• Usually between 3 and 6 center points will work well
• Design-Expert provides the analysis, including the *F*-test for pure quadratic curvature
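The curvature check in Example 6.6 can be worked through numerically. The sketch below uses the four center-point observations from the example and the factorial-run mean of 70.0625 (the intercept of the coded-factor equation, which rounds to the quoted 70.06). Using only the center-point replicates as the error estimate is an assumption made here for illustration; software may form the error term differently.

```python
from statistics import mean

center_runs = [73, 75, 66, 69]  # observations at x1 = x2 = x3 = x4 = 0
n_f = 16                        # factorial runs
y_bar_f = 70.0625               # mean of the factorial runs (70.06 in the notes)

n_c = len(center_runs)
y_bar_c = mean(center_runs)

# Single-degree-of-freedom sum of squares for pure quadratic curvature.
ss_pure_quad = n_f * n_c * (y_bar_f - y_bar_c) ** 2 / (n_f + n_c)

# ASSUMPTION: pure error estimated from the center points only (n_c - 1 df).
ss_pure_error = sum((y - y_bar_c) ** 2 for y in center_runs)
ms_pure_error = ss_pure_error / (n_c - 1)

f_curvature = ss_pure_quad / ms_pure_error
print(f"SS_PureQuad = {ss_pure_quad:.4f}")
print(f"MS_E = {ms_pure_error:.2f}, F = {f_curvature:.3f}")
```

The F ratio is far below 1, which is consistent with the example's conclusion that no strong curvature is present.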

• If curvature is significant, augment the design with axial runs to create a central composite design (CCD). The CCD is a very effective design for fitting a second-order response surface model.