#### Transcript: The 2^2 Design

Design and Analysis of Experiments

Dr. Tai-Yue Wang, Department of Industrial and Information Management, National Cheng Kung University, Tainan, TAIWAN, ROC

Two-Level Factorial Designs

Outline

- Introduction
- The $2^2$ Design
- The $2^3$ Design
- The general $2^k$ Design
- A single replicate of the $2^k$ design
- Additional examples of unreplicated $2^k$ designs
- $2^k$ designs are optimal designs
- The addition of center points to the $2^k$ design

Introduction

- Special case of general factorial designs: k factors, each with two levels.
- Factors may be qualitative or quantitative.
- A complete replicate of such a design is a $2^k$ factorial design.
- Assumptions: the factors are fixed, the design is completely randomized, and the normality assumption holds.
- Used as factor screening experiments.
- The response between levels is assumed to be linear.

The $2^2$ Design

| Factor A | Factor B | Treatment combination | Replicate I | Replicate II | Replicate III | Total |
|---|---|---|---|---|---|---|
| - | - | A low, B low | 28 | 25 | 27 | 80 |
| + | - | A high, B low | 36 | 32 | 32 | 100 |
| - | + | A low, B high | 18 | 19 | 23 | 60 |
| + | + | A high, B high | 31 | 30 | 29 | 90 |

- "-" and "+" denote the low and high levels of a factor, respectively.
- Low and high are arbitrary terms.
- Geometrically, the four runs form the corners of a square.
- Factors can be quantitative or qualitative, although their treatment in the final model will be different.

The $2^2$ Design

- Estimate factor effects.
- Formulate the model. With replication, use the full model. With an unreplicated design, use normal probability plots.
- Statistical testing (ANOVA).
- Refine the model.
- Analyze residuals (graphical).
- Interpret results.

The $2^2$ Design

$$A = \bar{y}_{A^+} - \bar{y}_{A^-} = \frac{ab + a}{2n} - \frac{b + (1)}{2n} = \frac{1}{2n}\,[ab + a - b - (1)]$$

$$B = \bar{y}_{B^+} - \bar{y}_{B^-} = \frac{ab + b}{2n} - \frac{a + (1)}{2n} = \frac{1}{2n}\,[ab + b - a - (1)]$$

$$AB = \frac{ab + (1)}{2n} - \frac{a + b}{2n} = \frac{1}{2n}\,[ab + (1) - a - b]$$

$$SS_A = \frac{[ab + a - b - (1)]^2}{4n}, \qquad SS_B = \frac{[ab + b - a - (1)]^2}{4n}, \qquad SS_{AB} = \frac{[ab + (1) - a - b]^2}{4n}$$

$$SS_T = \sum_{i=1}^{2}\sum_{j=1}^{2}\sum_{k=1}^{n} y_{ijk}^2 - \frac{y_{...}^2}{4n}, \qquad SS_E = SS_T - SS_A - SS_B - SS_{AB}$$

The $2^2$ Design

Standard order (Yates's order):

| Effect | (1) | a | b | ab |
|---|---|---|---|---|
| A | -1 | +1 | -1 | +1 |
| B | -1 | -1 | +1 | +1 |
| AB | +1 | -1 | -1 | +1 |

- Effects A, B, and AB are orthogonal contrasts, each with one degree of freedom.
- Thus $2^k$ designs are orthogonal designs.

The $2^2$ Design: ANOVA table [table not captured in the transcript]

The $2^2$ Design: algebraic signs for calculating effects in the $2^2$ design [table not captured in the transcript]

The $2^2$ Design: regression model

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$$

$x_1$ and $x_2$ are coded variables in this case:

$$x_1 = \frac{\text{con} - (\text{con}_{low} + \text{con}_{high})/2}{(\text{con}_{high} - \text{con}_{low})/2}, \qquad x_2 = \frac{\text{catalyst} - (\text{catalyst}_{low} + \text{catalyst}_{high})/2}{(\text{catalyst}_{high} - \text{catalyst}_{low})/2}$$

where con and catalyst are the natural variables.

The $2^2$ Design: regression model

Factorial Fit: Yield versus Conc., Catalyst

Estimated Effects and Coefficients for Yield (coded units)

| Term | Effect | Coef | SE Coef | T | P |
|---|---|---|---|---|---|
| Constant | | 27.500 | 0.5713 | 48.14 | 0.000 |
| Conc. | 8.333 | 4.167 | 0.5713 | 7.29 | 0.000 |
| Catalyst | -5.000 | -2.500 | 0.5713 | -4.38 | 0.002 |
| Conc.\*Catalyst | 1.667 | 0.833 | 0.5713 | 1.46 | 0.183 |

S = 1.97906, PRESS = 70.5, R-Sq = 90.30%, R-Sq(pred) = 78.17%, R-Sq(adj) = 86.66%

Analysis of Variance for Yield (coded units)

| Source | DF | Seq SS | Adj SS | Adj MS | F | P |
|---|---|---|---|---|---|---|
| Main Effects | 2 | 283.333 | 283.333 | 141.667 | 36.17 | 0.000 |
| 2-Way Interactions | 1 | 8.333 | 8.333 | 8.333 | 2.13 | 0.183 |
| Residual Error | 8 | 31.333 | 31.333 | 3.917 | | |
| Pure Error | 8 | 31.333 | 31.333 | 3.917 | | |
| Total | 11 | 323.000 | | | | |

The $2^2$ Design: regression model

Estimated Coefficients for Yield using data in uncoded units

| Term | Coef |
|---|---|
| Constant | 28.3333 |
| Conc. | 0.333333 |
| Catalyst | -11.6667 |
| Conc.\*Catalyst | 0.333333 |

Regression model (without interaction): Estimated Coefficients for Yield using data in uncoded units

| Term | Coef |
|---|---|
| Constant | 18.3333 |
| Conc. | 0.833333 |
| Catalyst | -5.00000 |

The $2^2$ Design: response surface [figure] (note: the catalyst axis is reversed relative to the one in the textbook)

The $2^3$ Design: 3 factors, each at two levels.
Eight combinations.

The $2^3$ Design: design matrix, or geometric notation [figure]

The $2^3$ Design: algebraic signs [table not captured in the transcript]

The $2^3$ Design: properties of the table

- Except for column I, every column has an equal number of + and - signs.
- The sum of the products of signs in any two columns is zero.
- Multiplying any column by I leaves that column unchanged (identity element).
- The product of any two columns yields a column in the table: $A \times B = AB$ and $AB \times BC = AB^2C = AC$.
- Orthogonal design: orthogonality is an important property shared by all factorial designs.

The $2^3$ Design: example

Nitride etch process. Factors: gap, gas flow, and RF power.

The $2^3$ Design: example, full model

Estimated Effects and Coefficients for Etch Rate (coded units)

| Term | Effect | Coef | SE Coef | T | P |
|---|---|---|---|---|---|
| Constant | | 776.06 | 11.87 | 65.41 | 0.000 |
| Gap | -101.62 | -50.81 | 11.87 | -4.28 | 0.003 |
| Gas Flow | 7.37 | 3.69 | 11.87 | 0.31 | 0.764 |
| Power | 306.12 | 153.06 | 11.87 | 12.90 | 0.000 |
| Gap\*Gas Flow | -24.88 | -12.44 | 11.87 | -1.05 | 0.325 |
| Gap\*Power | -153.63 | -76.81 | 11.87 | -6.47 | 0.000 |
| Gas Flow\*Power | -2.12 | -1.06 | 11.87 | -0.09 | 0.931 |
| Gap\*Gas Flow\*Power | 5.62 | 2.81 | 11.87 | 0.24 | 0.819 |

S = 47.4612, PRESS = 72082, R-Sq = 96.61%, R-Sq(pred) = 86.44%, R-Sq(adj) = 93.64%

Analysis of Variance for Etch Rate (coded units)

| Source | DF | Seq SS | Adj SS | Adj MS | F | P |
|---|---|---|---|---|---|---|
| Main Effects | 3 | 416378 | 416378 | 138793 | 61.62 | 0.000 |
| 2-Way Interactions | 3 | 96896 | 96896 | 32299 | 14.34 | 0.001 |
| 3-Way Interactions | 1 | 127 | 127 | 127 | 0.06 | 0.819 |
| Residual Error | 8 | 18020 | 18020 | 2253 | | |
| Pure Error | 8 | 18021 | 18021 | 2253 | | |
| Total | 15 | 531421 | | | | |

The $2^3$ Design: example, reduced model

Factorial Fit: Etch Rate versus Gap, Power

Estimated Effects and Coefficients for Etch Rate (coded units)

| Term | Effect | Coef | SE Coef | T | P |
|---|---|---|---|---|---|
| Constant | | 776.06 | 10.42 | 74.46 | 0.000 |
| Gap | -101.62 | -50.81 | 10.42 | -4.88 | 0.000 |
| Power | 306.12 | 153.06 | 10.42 | 14.69 | 0.000 |
| Gap\*Power | -153.63 | -76.81 | 10.42 | -7.37 | 0.000 |

S = 41.6911, PRESS = 37080.4, R-Sq = 96.08%, R-Sq(pred) = 93.02%, R-Sq(adj) = 95.09%

Analysis of Variance for Etch Rate (coded units)

| Source | DF | Seq SS | Adj SS | Adj MS | F | P |
|---|---|---|---|---|---|---|
| Main Effects | 2 | 416161 | 416161 | 208080 | 119.71 | 0.000 |
| 2-Way Interactions | 1 | 94403 | 94403 | 94403 | 54.31 | 0.000 |
| Residual Error | 12 | 20858 | 20858 | 1738 | | |
| Pure Error | 12 | 20858 | 20858 | 1738 | | |
| Total | 15 | 531421 | | | | |

The $2^3$ Design: example, model summary statistics for the reduced model

$R^2$ and adjusted $R^2$:

$$R^2 = \frac{SS_{Model}}{SS_T} = \frac{5.106 \times 10^5}{5.314 \times 10^5} = 0.9608$$

$$R^2_{Adj} = 1 - \frac{SS_E / df_E}{SS_T / df_T} = 1 - \frac{20857.75 / 12}{5.314 \times 10^5 / 15} = 0.9509$$

$R^2$ for prediction (based on PRESS):

$$R^2_{Pred} = 1 - \frac{PRESS}{SS_T} = 1 - \frac{37080.44}{5.314 \times 10^5} = 0.9302$$

The $2^3$ Design: example, the regression model [figure]

Cube Plot of Ranges [figure]

What do the large ranges when gap and power are at the high level tell you?

The General $2^k$ Factorial Design

There will be $k$ main effects, $\binom{k}{2}$ two-factor interactions, $\binom{k}{3}$ three-factor interactions, and so on, up to one $k$-factor interaction.

The General $2^k$ Factorial Design: statistical analysis [tables not captured in the transcript]

Unreplicated $2^k$ Factorial Designs

- These are $2^k$ factorial designs with one observation at each corner of the "cube".
- An unreplicated $2^k$ factorial design is also sometimes called a "single replicate" of the $2^k$.
- These designs are very widely used.
- Risks: if there is only one observation at each corner, is there a chance of unusual response observations spoiling the results?
- Modeling "noise"?
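The sign-table properties and the effect counts for the general $2^k$ design can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure used in the slides: the helper names `sign_table` and `effects` are hypothetical, and the check at the end reuses the treatment totals 80, 100, 60, 90 (with n = 3) from the $2^2$ table.

```python
from itertools import combinations, product
from math import comb, prod


def sign_table(k):
    """Runs and contrast columns of a full 2^k design in standard (Yates) order."""
    letters = "ABCDEFGH"[:k]
    # product() varies the last position fastest; reversing each tuple makes
    # factor A change fastest, which gives the order (1), a, b, ab, ...
    runs = [tuple(reversed(levels)) for levels in product((-1, 1), repeat=k)]
    columns = {}
    for order in range(1, k + 1):
        for idx in combinations(range(k), order):
            name = "".join(letters[i] for i in idx)
            # An interaction column is the elementwise product of its factor columns.
            columns[name] = [prod(run[i] for i in idx) for run in runs]
    return runs, columns


def effects(totals, n, k):
    """Effect = contrast / (n * 2^(k-1)), with totals listed in standard order."""
    _, columns = sign_table(k)
    return {name: sum(s * t for s, t in zip(col, totals)) / (n * 2 ** (k - 1))
            for name, col in columns.items()}


# Number of effects of each order for k = 4: C(4,1), C(4,2), C(4,3), C(4,4).
counts = [comb(4, order) for order in range(1, 5)]

# Treatment totals (1), a, b, ab from the 2^2 example, n = 3 replicates.
example = effects([80, 100, 60, 90], n=3, k=2)
print(counts, {name: round(value, 3) for name, value in example.items()})
```

For the $2^2$ totals this gives A = 8.333, B = -5.0 and AB = 1.667, which agree with the Effect column of the Minitab output for Yield, and for k = 4 the counts 4, 6, 4, 1 add up to the 15 effects of a $2^4$ design.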
Unreplicated $2^k$ Factorial Designs

- If the factors are spaced too closely, the chance that noise will overwhelm the signal in the data increases.
- More aggressive spacing is usually best.

Unreplicated $2^k$ Factorial Designs

- Lack of replication causes potential problems in statistical testing.
- Replication admits an estimate of "pure error" (a better phrase is an internal estimate of error).
- With no replication, fitting the full model results in zero degrees of freedom for error.
- Potential solutions to this problem: pooling high-order interactions to estimate error, or normal probability plotting of effects (Daniel, 1959).

Unreplicated $2^k$ Factorial Designs: example

- A $2^4$ factorial was used to investigate the effects of four factors on the filtration rate of a resin.
- The factors are A = temperature, B = pressure, C = mole ratio, D = stirring rate.
- The experiment was performed in a pilot plant.

Unreplicated $2^k$ Factorial Designs: example, full model [figures and tables not captured in the transcript]

Unreplicated $2^k$ Factorial Designs: example, reduced model

Factorial Fit: Filtration versus Temperature, Conc., Stir Rate

Estimated Effects and Coefficients for Filtration (coded units)

| Term | Effect | Coef | SE Coef | T | P |
|---|---|---|---|---|---|
| Constant | | 70.063 | 1.104 | 63.44 | 0.000 |
| Temperature | 21.625 | 10.812 | 1.104 | 9.79 | 0.000 |
| Conc. | 9.875 | 4.938 | 1.104 | 4.47 | 0.001 |
| Stir Rate | 14.625 | 7.312 | 1.104 | 6.62 | 0.000 |
| Temperature\*Conc. | -18.125 | -9.062 | 1.104 | -8.21 | 0.000 |
| Temperature\*Stir Rate | 16.625 | 8.313 | 1.104 | 7.53 | 0.000 |

S = 4.41730, PRESS = 499.52, R-Sq = 96.60%, R-Sq(pred) = 91.28%, R-Sq(adj) = 94.89%

Analysis of Variance for Filtration (coded units)

| Source | DF | Seq SS | Adj SS | Adj MS | F | P |
|---|---|---|---|---|---|---|
| Main Effects | 3 | 3116.19 | 3116.19 | 1038.73 | 53.23 | 0.000 |
| 2-Way Interactions | 2 | 2419.62 | 2419.62 | 1209.81 | 62.00 | 0.000 |
| Residual Error | 10 | 195.12 | 195.12 | 19.51 | | |
| Lack of Fit | 2 | 15.62 | 15.62 | 7.81 | 0.35 | 0.716 |
| Pure Error | 8 | 179.50 | 179.50 | 22.44 | | |
| Total | 15 | 5730.94 | | | | |

Unreplicated $2^k$ Factorial Designs: example, design projection

Since factor B is negligible, the experiment can be interpreted as a $2^3$ factorial design in factors A, C, and D with 2 replicates.

Factorial Fit: Filtration versus Temperature, Conc., Stir Rate

Estimated Effects and Coefficients for Filtration (coded units)

| Term | Effect | Coef | SE Coef | T | P |
|---|---|---|---|---|---|
| Constant | | 70.063 | 1.184 | 59.16 | 0.000 |
| Temperature | 21.625 | 10.812 | 1.184 | 9.13 | 0.000 |
| Conc. | 9.875 | 4.938 | 1.184 | 4.17 | 0.003 |
| Stir Rate | 14.625 | 7.312 | 1.184 | 6.18 | 0.000 |
| Temperature\*Conc. | -18.125 | -9.062 | 1.184 | -7.65 | 0.000 |
| Temperature\*Stir Rate | 16.625 | 8.313 | 1.184 | 7.02 | 0.000 |
| Conc.\*Stir Rate | -1.125 | -0.562 | 1.184 | -0.48 | 0.647 |
| Temperature\*Conc.\*Stir Rate | -1.625 | -0.813 | 1.184 | -0.69 | 0.512 |

S = 4.73682, PRESS = 718, R-Sq = 96.87%, R-Sq(pred) = 87.47%, R-Sq(adj) = 94.13%

Analysis of Variance for Filtration (coded units)

| Source | DF | Seq SS | Adj SS | Adj MS | F | P |
|---|---|---|---|---|---|---|
| Main Effects | 3 | 3116.19 | 3116.19 | 1038.73 | 46.29 | 0.000 |
| 2-Way Interactions | 3 | 2424.69 | 2424.69 | 808.23 | 36.02 | 0.000 |
| 3-Way Interactions | 1 | 10.56 | 10.56 | 10.56 | 0.47 | 0.512 |
| Residual Error | 8 | 179.50 | 179.50 | 22.44 | | |
| Pure Error | 8 | 179.50 | 179.50 | 22.44 | | |
| Total | 15 | 5730.94 | | | | |

Dealing with Outliers

- Replace the outlier with an estimate: make the highest-order interaction zero. In this case, estimate cd such that ABCD = 0.
- Alternatively, analyze only the data you have. Now the design isn't orthogonal. Consequences?

Duplicate Measurements on the Response

- Four wafers are stacked in the furnace.
- Four factors: temperature, time, gas flow, and pressure.
- Response: thickness.
- The four measurements are treated as duplicates, not replicates.
- Use the average as the response.

Duplicate Measurements on the Response

Minitab menu: Stat > DOE > Factorial > Preprocess Response for Analyze

Minitab menu: Stat > DOE > Factorial > Analyze Factorial Design

Duplicate Measurements on the Response

Factorial Fit: average versus Temperature, Time, Pressure

Estimated Effects and Coefficients for average (coded units)

| Term | Effect | Coef | SE Coef | T | P |
|---|---|---|---|---|---|
| Constant | | 399.188 | 1.049 | 380.48 | 0.000 |
| Temperature | 43.125 | 21.562 | 1.049 | 20.55 | 0.000 |
| Time | 18.125 | 9.062 | 1.049 | 8.64 | 0.000 |
| Pressure | -10.375 | -5.187 | 1.049 | -4.94 | 0.001 |
| Temperature\*Time | 16.875 | 8.438 | 1.049 | 8.04 | 0.000 |
| Temperature\*Pressure | -10.625 | -5.312 | 1.049 | -5.06 | 0.000 |

S = 4.19672, PRESS = 450.88, R-Sq = 98.39%, R-Sq(pred) = 95.88%, R-Sq(adj) = 97.59%

Analysis of Variance for average (coded units)

| Source | DF | Seq SS | Adj SS | Adj MS | F | P |
|---|---|---|---|---|---|---|
| Main Effects | 3 | 9183.7 | 9183.69 | 3061.23 | 173.81 | 0.000 |
| 2-Way Interactions | 2 | 1590.6 | 1590.62 | 795.31 | 45.16 | 0.000 |
| Residual Error | 10 | 176.1 | 176.12 | 17.61 | | |
| Lack of Fit | 2 | 60.6 | 60.62 | 30.31 | 2.10 | 0.185 |
| Pure Error | 8 | 115.5 | 115.50 | 14.44 | | |
| Total | 15 | 10950.4 | | | | |

The $2^k$ design and design optimality

The model parameter estimates in a $2^k$ design (and the effect estimates) are least squares estimates. For example, for a $2^2$ design the model is

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + \varepsilon$$

The four observations from a $2^2$ design:

$$(1) = \beta_0 + \beta_1(-1) + \beta_2(-1) + \beta_{12}(-1)(-1) + \varepsilon_1$$

$$a = \beta_0 + \beta_1(1) + \beta_2(-1) + \beta_{12}(1)(-1) + \varepsilon_2$$

$$b = \beta_0 + \beta_1(-1) + \beta_2(1) + \beta_{12}(-1)(1) + \varepsilon_3$$

$$ab = \beta_0 + \beta_1(1) + \beta_2(1) + \beta_{12}(1)(1) + \varepsilon_4$$

In matrix form: $\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}$.

The $2^k$ design and design optimality

$$\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}, \qquad \mathbf{X}'\mathbf{X} = \begin{bmatrix} 4 & 0 & 0 & 0 \\ 0 & 4 & 0 & 0 \\ 0 & 0 & 4 & 0 \\ 0 & 0 & 0 & 4 \end{bmatrix} = 4\,\mathbf{I}_4$$

$$\begin{bmatrix} \hat{\beta}_0 \\ \hat{\beta}_1 \\ \hat{\beta}_2 \\ \hat{\beta}_{12} \end{bmatrix} = \frac{1}{4}\,\mathbf{I}_4 \begin{bmatrix} (1) + a + b + ab \\ a + ab - b - (1) \\ b + ab - a - (1) \\ (1) - a - b + ab \end{bmatrix} = \begin{bmatrix} \dfrac{(1) + a + b + ab}{4} \\ \dfrac{a + ab - b - (1)}{4} \\ \dfrac{b + ab - a - (1)}{4} \\ \dfrac{(1) - a - b + ab}{4} \end{bmatrix}$$

- The entries of $\mathbf{X}'\mathbf{y}$ are the "usual" contrasts.
- The $\mathbf{X}'\mathbf{X}$ matrix is diagonal, a consequence of an orthogonal design.
- The regression coefficient estimates are exactly half of the "usual" effect estimates.

The $2^k$ design and design optimality

The matrix $\mathbf{X}'\mathbf{X}$ has interesting and useful properties:

- $V(\hat{\beta}) = \sigma^2 \times (\text{diagonal element of } (\mathbf{X}'\mathbf{X})^{-1}) = \sigma^2/4$, the minimum possible value for a four-run design.
- $|\mathbf{X}'\mathbf{X}| = 256$, the maximum possible value for a four-run design.
- Notice that these results depend on both the design that you have chosen and the model.

The $2^k$ design and design optimality

The $2^2$ design is called a D-optimal design. In fact, all $2^k$ designs are D-optimal designs for fitting the first-order model with interaction.

Consider the variance of the predicted response in the $2^2$ design:

$$V[\hat{y}(x_1, x_2)] = \sigma^2\, \mathbf{x}'(\mathbf{X}'\mathbf{X})^{-1}\mathbf{x}, \qquad \mathbf{x}' = [1,\ x_1,\ x_2,\ x_1 x_2]$$

$$V[\hat{y}(x_1, x_2)] = \frac{\sigma^2}{4}\left(1 + x_1^2 + x_2^2 + x_1^2 x_2^2\right)$$

- The maximum prediction variance occurs when $x_1 = \pm 1$, $x_2 = \pm 1$: $V[\hat{y}(x_1, x_2)] = \sigma^2$.
- The prediction variance when $x_1 = x_2 = 0$ is $V[\hat{y}(x_1, x_2)] = \sigma^2/4$.
- What about the average prediction variance over the design space?
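The $\mathbf{X}'\mathbf{X}$ properties and the two prediction-variance values quoted above can be checked numerically. The following is a small plain-Python sketch under stated assumptions: the function name `scaled_prediction_variance` is hypothetical, and it returns the variance divided by $\sigma^2$, which is unknown here.

```python
from itertools import product

# Model matrix of the 2^2 design: rows (1), a, b, ab; columns 1, x1, x2, x1*x2.
X = [[1, x1, x2, x1 * x2] for x2, x1 in product((-1, 1), repeat=2)]
p = len(X[0])

# X'X: entry (i, j) is the dot product of columns i and j.
XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]

# Orthogonal columns make X'X diagonal, so its determinant is the product
# of the diagonal entries.
is_diagonal = all(XtX[i][j] == 0 for i in range(p) for j in range(p) if i != j)
det = 1
for i in range(p):
    det *= XtX[i][i]


def scaled_prediction_variance(x1, x2):
    """V[y_hat(x1, x2)] / sigma^2 = x'(X'X)^{-1} x, valid because X'X is diagonal."""
    x = [1, x1, x2, x1 * x2]
    return sum(v * v / XtX[i][i] for i, v in enumerate(x))


print(XtX, det)
print(scaled_prediction_variance(1, 1), scaled_prediction_variance(0, 0))
```

The shortcut of dividing by the diagonal entries only works because the design is orthogonal; for a non-orthogonal design a full matrix inverse would be needed.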
The $2^k$ design and design optimality

$$I = \frac{1}{A}\int_{-1}^{1}\int_{-1}^{1} V[\hat{y}(x_1, x_2)]\, dx_1\, dx_2, \qquad A = \text{area of design space} = 2^2 = 4$$

$$I = \frac{1}{4}\int_{-1}^{1}\int_{-1}^{1} \frac{\sigma^2}{4}\left(1 + x_1^2 + x_2^2 + x_1^2 x_2^2\right) dx_1\, dx_2 = \frac{4\sigma^2}{9}$$

The $2^k$ design and design optimality

- The $2^2$ design minimizes the maximum prediction variance, so it is called a G-optimal design. In fact, all $2^k$ designs are G-optimal designs for fitting the first-order model with interaction.
- The $2^2$ design gives the smallest possible value of the average prediction variance, so it is called an I-optimal design. In fact, all $2^k$ designs are I-optimal designs for fitting the first-order model with interaction.

The $2^k$ design and design optimality

- Minitab provides the "Select Optimal Design" function for the case where you have a full factorial design and are trying to reduce it to a partial design or "fractional design".
- It only provides the "D-optimal design".
- One needs to have a full factorial design first and then choose the number of data points allowed.

The $2^k$ design and design optimality

- These results give us some assurance that these designs are "good" designs in some general ways.
- Factorial designs typically share some (most) of these properties.
- There are excellent computer routines for finding optimal designs.

Addition of Center Points to a $2^k$ Design

- Based on the idea of replicating some of the runs in a factorial design.
- Runs at the center provide an estimate of error and allow the experimenter to distinguish between two possible models.

First-order model (with interaction):

$$y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i=1}^{k}\sum_{j>i}^{k} \beta_{ij} x_i x_j + \varepsilon$$

Second-order model (the $\beta_{ii} x_i^2$ terms are the quadratic effects):

$$y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i=1}^{k}\sum_{j>i}^{k} \beta_{ij} x_i x_j + \sum_{i=1}^{k} \beta_{ii} x_i^2 + \varepsilon$$

Addition of Center Points to a $2^k$ Design

When adding center points, we assume that the k factors are quantitative. Example on a $2^2$ design: [figure]

Addition of Center Points to a $2^k$ Design

- Five points: (-,-), (-,+), (+,-), (+,+), and (0,0).
- $n_F = 4$ and $n_C = 4$.
- Let $\bar{y}_F$ be the average of the four runs at the four factorial points and let $\bar{y}_C$ be the average of the $n_C$ runs at the center point.

Addition of Center Points to a $2^k$ Design

If the difference $\bar{y}_F - \bar{y}_C$ is small, the center points lie on or near the plane passing through the factorial points and there are no quadratic effects. The hypotheses are:

$$H_0: \sum_{i=1}^{k} \beta_{ii} = 0, \qquad H_1: \sum_{i=1}^{k} \beta_{ii} \neq 0$$

Addition of Center Points to a $2^k$ Design

Test statistic:

$$SS_{Pure\ Quad} = \frac{n_F\, n_C\, (\bar{y}_F - \bar{y}_C)^2}{n_F + n_C}$$

with one degree of freedom.

Addition of Center Points to a $2^k$ Design: example

Example 6.2 is a $2^4$ factorial. Adding center points at $x_1 = x_2 = x_3 = x_4 = 0$ gives four additional responses (filtration rates): 73, 75, 66, 69. So $\bar{y}_C = 70.75$ and $\bar{y}_F = 70.06$.

Addition of Center Points to a $2^k$ Design: example

| Term | Effect | Coef | SE Coef | T | P |
|---|---|---|---|---|---|
| Constant | | 70.063 | 1.008 | 69.52 | 0.000 |
| Temperature | 21.625 | 10.812 | 1.008 | 10.73 | 0.002 |
| Pressure | 3.125 | 1.562 | 1.008 | 1.55 | 0.219 |
| Conc. | 9.875 | 4.937 | 1.008 | 4.90 | 0.016 |
| Stir Rate | 14.625 | 7.312 | 1.008 | 7.26 | 0.005 |
| Temperature\*Pressure | 0.125 | 0.063 | 1.008 | 0.06 | 0.954 |
| Temperature\*Conc. | -18.125 | -9.063 | 1.008 | -8.99 | 0.003 |
| Temperature\*Stir Rate | 16.625 | 8.313 | 1.008 | 8.25 | 0.004 |
| Pressure\*Conc. | 2.375 | 1.188 | 1.008 | 1.18 | 0.324 |
| Pressure\*Stir Rate | -0.375 | -0.187 | 1.008 | -0.19 | 0.864 |
| Conc.\*Stir Rate | -1.125 | -0.563 | 1.008 | -0.56 | 0.616 |
| Temperature\*Pressure\*Conc. | 1.875 | 0.937 | 1.008 | 0.93 | 0.421 |
| Temperature\*Pressure\*Stir Rate | 4.125 | 2.063 | 1.008 | 2.05 | 0.133 |
| Temperature\*Conc.\*Stir Rate | -1.625 | -0.813 | 1.008 | -0.81 | 0.479 |
| Pressure\*Conc.\*Stir Rate | -2.625 | -1.312 | 1.008 | -1.30 | 0.284 |
| Temperature\*Pressure\*Conc.\*Stir Rate | 1.375 | 0.687 | 1.008 | 0.68 | 0.544 |
| Ct Pt | | 0.687 | 2.253 | 0.31 | 0.780 |

Addition of Center Points to a $2^k$ Design: example

Analysis of Variance for Filtration (coded units)

| Source | DF | Seq SS | Adj SS | Adj MS | F | P |
|---|---|---|---|---|---|---|
| Main Effects | 4 | 3155.25 | 3155.25 | 788.813 | 48.54 | 0.005 |
| 2-Way Interactions | 6 | 2447.88 | 2447.88 | 407.979 | 25.11 | 0.012 |
| 3-Way Interactions | 4 | 120.25 | 120.25 | 30.062 | 1.85 | 0.320 |
| 4-Way Interactions | 1 | 7.56 | 7.56 | 7.562 | 0.47 | 0.544 |
| Curvature | 1 | 1.51 | 1.51 | 1.512 | 0.09 | 0.780 |
| Residual Error | 3 | 48.75 | 48.75 | 16.250 | | |
| Pure Error | 3 | 48.75 | 48.75 | 16.250 | | |
| Total | 19 | 5781.20 | | | | |

Addition of Center Points to a $2^k$ Design

If curvature is significant, augment the design with axial runs to create a central composite design (CCD). The CCD is a very effective design for fitting a second-order response surface model. [figure]

Addition of Center Points to a $2^k$ Design

- Use current operating conditions as the center point.
- Check for "abnormal" conditions during the time the experiment was conducted.
- Check for time trends.
- Use center points as the first few runs when there is little or no information available about the magnitude of error.

Center Points and Qualitative Factors [figure]
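As a worked check of the curvature test for the filtration-rate example, the single-degree-of-freedom sum of squares and its F ratio can be recomputed from the four center-point responses. This is a sketch under one labeled assumption: $\bar{y}_F$ is taken as 70.0625, which is consistent with the rounded 70.06 on the slide and with the Constant coefficient 70.063 in the Minitab output. The variable names are illustrative only.

```python
from statistics import mean

center = [73, 75, 66, 69]   # center-point filtration rates from the example
n_f, n_c = 16, len(center)  # 2^4 factorial runs, 4 center runs
ybar_f = 70.0625            # assumption: unrounded factorial average (slide shows 70.06)
ybar_c = mean(center)

# Single-degree-of-freedom sum of squares for pure quadratic curvature.
ss_pure_quad = n_f * n_c * (ybar_f - ybar_c) ** 2 / (n_f + n_c)

# The center runs are the only replicated runs, so they supply the pure error
# with n_c - 1 degrees of freedom.
ss_e = sum((y - ybar_c) ** 2 for y in center)
ms_e = ss_e / (n_c - 1)

f0 = ss_pure_quad / ms_e
print(ybar_c, round(ss_pure_quad, 4), ss_e, ms_e, round(f0, 2))
```

The values reproduce the Curvature and Pure Error rows of the ANOVA table (1.51 and 48.75 with mean square 16.25, F about 0.09), so there is no evidence of second-order curvature in this example.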