Transcript lecture10
Analysis of Variance
Outline:
Designing Engineering Experiments
Completely Randomized Single-Factor Experiment
Random Effects Model
Randomized Complete Block Design
Designing Engineering Experiments
Factor: Parameter of interest
Levels: Possible values of a factor.
Analysis of Variance (ANOVA): Analyzes the effects of different factor
levels on the response.
Randomization: Random running order.
Controllable variables: Other parameters involved in the experiment.
Experiment activities:
1. Conjecture
2. Experiment
3. Analysis
4. Conclusion
Completely Randomized Single-Factor Experiment
Ex: A manufacturer of paper is interested in improving the tensile strength of the
product. The product engineer thinks that tensile strength is a function of the
hardwood concentration in the paper. The range of hardwood concentrations of
practical interest is between 5% and 20%. The engineers decide to investigate four
levels of hardwood concentration: 5%, 10%, 15%, and 20%. They decide to make up
six test specimens at each concentration level.
Randomizing order: randomly select the order of the runs to reduce the
effect of nuisance variables such as the warm-up effect.
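A minimal sketch of how a randomized run order could be generated for the hardwood example (4 levels, 6 specimens each). The helper name `randomized_run_order` and the `seed` parameter are assumptions made for illustration; they do not come from the lecture.

```python
import random


def randomized_run_order(levels, replicates, seed=None):
    """Return every (level, replicate) run once, in random order.

    Hypothetical helper: builds the full list of runs, then shuffles it.
    """
    rng = random.Random(seed)  # a seed makes the order reproducible
    runs = [level for level in levels for _ in range(replicates)]
    rng.shuffle(runs)
    return runs


# Hardwood concentrations (%) from the example, six specimens per level.
order = randomized_run_order([5, 10, 15, 20], replicates=6, seed=1)
print(order)
```

Shuffling the complete list of runs, rather than running all specimens of one level back to back, spreads any time-related nuisance effect across all treatments.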
Box plot: Represents the variability within a treatment
and the variability between treatments.
Analysis of Variance
Typical data for single-factor experiment
Linear Statistical Model
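$Y_{ij} = \mu + \tau_i + \epsilon_{ij}$, $i = 1, 2, \ldots, a$; $j = 1, 2, \ldots, n$
For each treatment, $\epsilon_{ij}$ has a normal distribution with mean 0 and standard deviation $\sigma$.
Equivalently, $Y_{ij} = \mu_i + \epsilon_{ij}$, where $\mu_i = \mu + \tau_i$, $i = 1, 2, \ldots, a$; $j = 1, 2, \ldots, n$
For each treatment, $y_{ij}$ is normally distributed with mean $\mu_i$ and standard deviation $\sigma$.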
We are interested in the equality of the a treatment means µ1, µ2,…, µa
$H_0: \tau_1 = \tau_2 = \cdots = \tau_a = 0$
$H_1: \tau_i \neq 0$ for at least one $i$
If H0 is true, each observation consists of the overall mean µ plus the random error
$\epsilon_{ij}$, so changing the level of the factor has no effect on the mean response.
ANOVA partitions the total variability in the sample data into two components
1. The variation between treatments
2. The variation within treatment
If there are no differences between treatments, variation 1 is about equal to variation 2.
Total variation is described by the total sum of squares (SST)
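In the usual dot notation (assumed here: $\bar{y}_{i\cdot}$ is the mean of treatment $i$ and $\bar{y}_{\cdot\cdot}$ is the grand mean), the partition of the total sum of squares for a balanced design can be sketched as:

```latex
SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{\cdot\cdot})^2
     = n \sum_{i=1}^{a} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2
       + \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{i\cdot})^2
     = SS_{\text{treatment}} + SS_E
```

The first term measures variation between treatments and the second measures variation within treatments.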
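Degrees of freedom
$SS_T = SS_{\text{treatment}} + SS_E$
$an - 1 = (a - 1) + a(n - 1)$
Mean square for treatment
$MS_{\text{treatment}} = SS_{\text{treatment}} / (a - 1)$
Mean square for error
$MS_E = SS_E / [a(n - 1)]$
To test the hypothesis, we compare $MS_{\text{treatment}}$ and $MS_E$
using the F test statistic $F_0 = MS_{\text{treatment}} / MS_E$.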
H0 is rejected when $MS_{\text{treatment}}$ is sufficiently larger than $MS_E$, so the test is upper-tailed.
Reject H0 if $f_0 > f_{\alpha, a-1, a(n-1)}$
Computing formulas for ANOVA
ANOVA Table
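The quantities in the ANOVA table can be sketched in a few lines of Python. The function name `one_way_anova` and the small data set are hypothetical and chosen only so the arithmetic is easy to follow; the critical value $f_{\alpha, a-1, a(n-1)}$ still has to be read from an F table.

```python
def one_way_anova(groups):
    """ANOVA quantities for a balanced single-factor experiment.

    groups: list of a lists, each with the n observations of one treatment.
    """
    a = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes the same n for every treatment")

    grand_mean = sum(sum(g) for g in groups) / (a * n)
    means = [sum(g) / n for g in groups]

    # Between-treatment, within-treatment, and total sums of squares.
    ss_treatment = n * sum((m - grand_mean) ** 2 for m in means)
    ss_error = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    ss_total = sum((y - grand_mean) ** 2 for g in groups for y in g)

    ms_treatment = ss_treatment / (a - 1)
    ms_error = ss_error / (a * (n - 1))
    return {
        "SS_treatment": ss_treatment, "SS_E": ss_error, "SS_T": ss_total,
        "df_treatment": a - 1, "df_error": a * (n - 1),
        "MS_treatment": ms_treatment, "MS_E": ms_error,
        "f0": ms_treatment / ms_error,
    }


# Hypothetical data: 3 treatments, 3 observations each (not from the lecture).
table = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(table)
```

Computing SS_T directly, instead of as a sum of the other two, gives a quick check that the partition holds.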
Ex. Consider the tensile strength. We can use ANOVA to test the hypothesis
that different hardwood concentrations do not affect the mean tensile
strength of the paper. Use α=0.01
$H_0: \tau_1 = \tau_2 = \tau_3 = \tau_4 = 0$
$H_1: \tau_i \neq 0$ for at least one $i$
$f_0 = 19.60 > f_{0.01,3,20} = 4.94$, so reject H0.
Conclusion: hardwood concentration significantly affects the mean strength
of the paper.
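As a rough check, the test statistic can be rebuilt from summary values quoted in this lecture for the hardwood example: the four treatment means (10.00, 15.67, 17.00, 21.17 psi), n = 6, and MS_E = 6.51. This is a sketch that assumes a balanced design; because the means are rounded, the result only approximates the reported 19.60.

```python
# Summary values quoted in the lecture for the hardwood example.
treatment_means = [10.00, 15.67, 17.00, 21.17]  # psi
n = 6          # specimens per concentration
ms_error = 6.51
f_critical = 4.94  # f_{0.01,3,20}, as quoted in the lecture

a = len(treatment_means)
grand_mean = sum(treatment_means) / a  # valid because the design is balanced

# SS_treatment = n * sum of squared deviations of treatment means.
ss_treatment = n * sum((m - grand_mean) ** 2 for m in treatment_means)
ms_treatment = ss_treatment / (a - 1)
f0 = ms_treatment / ms_error

reject_h0 = f0 > f_critical
print(f"f0 is about {f0:.1f}; reject H0: {reject_h0}")
```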
Confidence interval on the mean of the ith treatment
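Ex. Find the 95% CI on the mean strength at 20% hardwood concentration.
$\bar{y}_{4\cdot} - t_{0.025,20}\sqrt{MS_E/n} \le \mu_4 \le \bar{y}_{4\cdot} + t_{0.025,20}\sqrt{MS_E/n}$
$21.17 - (2.086)\sqrt{6.51/6} \le \mu_4 \le 21.17 + (2.086)\sqrt{6.51/6}$
$19.00 \le \mu_4 \le 23.34$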
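The same interval can be computed with a short sketch. The function name `treatment_mean_ci` is hypothetical; the t value is taken from the lecture rather than computed.

```python
import math


def treatment_mean_ci(ybar, t_crit, ms_error, n):
    """Two-sided CI on one treatment mean: ybar +/- t * sqrt(MS_E / n)."""
    half_width = t_crit * math.sqrt(ms_error / n)
    return ybar - half_width, ybar + half_width


# Values from the 20% hardwood example: mean 21.17, t_{0.025,20} = 2.086,
# MS_E = 6.51, n = 6.
lower, upper = treatment_mean_ci(21.17, 2.086, 6.51, 6)
print(f"{lower:.2f} <= mu_4 <= {upper:.2f}")
```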
Confidence interval on a difference in treatment means
Ex. Find a 95% CI on the difference in means µ3-µ2
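A worked sketch of this interval, assuming the standard form $\bar{y}_{i\cdot} - \bar{y}_{j\cdot} \pm t_{\alpha/2, a(n-1)}\sqrt{2MS_E/n}$ and the values used in this lecture's hardwood example ($\bar{y}_{3\cdot} = 17.00$, $\bar{y}_{2\cdot} = 15.67$, $MS_E = 6.51$, $n = 6$, $t_{0.025,20} = 2.086$):

```latex
\bar{y}_{3\cdot} - \bar{y}_{2\cdot} \pm t_{0.025,20}\sqrt{2MS_E/n}
  = (17.00 - 15.67) \pm 2.086\sqrt{2(6.51)/6}
  = 1.33 \pm 3.07
\quad\Longrightarrow\quad
-1.74 \le \mu_3 - \mu_2 \le 4.40
```

The interval contains zero, so these two means are not shown to differ at this confidence level.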
Multiple comparisons following ANOVA
When $H_0: \tau_1 = \tau_2 = \cdots = \tau_a = 0$ is rejected in ANOVA, we know that some of
the treatment means are different.
ANOVA doesn’t identify which means are different.
Methods for investigating this issue are called multiple comparison methods.
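Fisher LSD: compares all pairs of means (µi and µj)
H0: µi = µj with test statistic
$t_0 = \dfrac{\bar{y}_{i\cdot} - \bar{y}_{j\cdot}}{\sqrt{2MS_E/n}}$
The pair of means µi and µj would be declared significantly different if
$|\bar{y}_{i\cdot} - \bar{y}_{j\cdot}| > LSD$, where $LSD = t_{\alpha/2, a(n-1)}\sqrt{2MS_E/n}$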
Ex. Apply the Fisher LSD to the hardwood concentration experiment.
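$a = 4$ levels, $n = 6$ replicates, and $t_{0.025,20} = 2.086$
The treatment means are: $\bar{y}_{1\cdot} = 10.00$ psi, $\bar{y}_{2\cdot} = 15.67$ psi, $\bar{y}_{3\cdot} = 17.00$ psi, $\bar{y}_{4\cdot} = 21.17$ psi
The value of LSD:
$LSD = t_{0.025,20}\sqrt{2MS_E/n} = 2.086\sqrt{2(6.51)/6} = 3.07$
Compare the difference for every pair of treatment means with LSD:
4 vs. 1: 21.17 - 10.00 = 11.17 > 3.07
4 vs. 2: 21.17 - 15.67 = 5.50 > 3.07
4 vs. 3: 21.17 - 17.00 = 4.17 > 3.07
3 vs. 1: 17.00 - 10.00 = 7.00 > 3.07
3 vs. 2: 17.00 - 15.67 = 1.33 < 3.07 * (not significantly different)
2 vs. 1: 15.67 - 10.00 = 5.67 > 3.07
[Figure: treatment means for 5%, 10%, 15%, and 20% hardwood concentration on an axis from 0 to 25]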
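The pairwise comparisons in this example can be reproduced with a short sketch. The function name `fisher_lsd` is hypothetical, and the t value is the one quoted in the lecture.

```python
import math
from itertools import combinations


def fisher_lsd(means, ms_error, n, t_crit):
    """Compare every pair of treatment means against the LSD.

    means: dict mapping a treatment label to its sample mean.
    Returns the LSD and a list of (label_i, label_j, |difference|, differs).
    """
    lsd = t_crit * math.sqrt(2 * ms_error / n)
    results = []
    for (label_i, mean_i), (label_j, mean_j) in combinations(means.items(), 2):
        diff = abs(mean_i - mean_j)
        results.append((label_i, label_j, diff, diff > lsd))
    return lsd, results


# Treatment means (psi) from the hardwood concentration example.
means = {"5%": 10.00, "10%": 15.67, "15%": 17.00, "20%": 21.17}
lsd, comparisons = fisher_lsd(means, ms_error=6.51, n=6, t_crit=2.086)

not_significant = [(i, j) for i, j, _, differs in comparisons if not differs]
print(f"LSD = {lsd:.2f}; pairs not shown to differ: {not_significant}")
```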
Model Adequacy Checking
Residual Analysis and Model Checking
Residuals vs. time: checks the independence assumption on the errors.
Residuals vs. fitted values: checks the constant-variance assumption on the errors.
Normal probability plot: checks the normality assumption on the errors.
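For the single-factor model, the fitted value of every observation is its treatment mean, so the residual is the observation minus that mean. A minimal sketch of computing the residuals that feed these plots, using a hypothetical function name and made-up data:

```python
def residuals_by_treatment(groups):
    """Return e_ij = y_ij - ybar_i, keeping the treatment layout."""
    result = []
    for observations in groups:
        treatment_mean = sum(observations) / len(observations)
        result.append([y - treatment_mean for y in observations])
    return result


# Hypothetical data: 2 treatments, 3 observations each (not from the lecture).
data = [[7.0, 8.0, 9.0], [12.0, 10.0, 14.0]]
fitted = [sum(g) / len(g) for g in data]   # one fitted value per treatment
resid = residuals_by_treatment(data)
print(fitted, resid)
```

Plotting `resid` against run order, against `fitted`, and on a normal probability scale gives the three checks listed above.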
Randomized Complete Block Design
An extension of the paired t-test to more than 2 treatments.
Reduces the effect of a nuisance factor.
Ex. 3 methods could be used to evaluate the strength reading on steel plate
girders. If there are 4 plates and each plate is large enough to hold all the
treatments, the experimental design would appear as in the figure.
ANOVA sums of squares
Computing formulas for ANOVA: randomized block
ANOVA for a Randomized Complete Block Design
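A minimal sketch of the randomized complete block ANOVA, assuming the standard computing formulas for a treatments and b blocks with one observation per cell. The function name `rcbd_anova` and the data are hypothetical and are not the lecture's values.

```python
def rcbd_anova(table):
    """ANOVA quantities for a randomized complete block design.

    table[i][j] is the response for treatment i in block j.
    """
    a = len(table)        # number of treatments
    b = len(table[0])     # number of blocks
    grand_total = sum(sum(row) for row in table)
    correction = grand_total ** 2 / (a * b)

    ss_total = sum(y * y for row in table for y in row) - correction
    ss_treatment = sum(sum(row) ** 2 for row in table) / b - correction
    block_totals = [sum(table[i][j] for i in range(a)) for j in range(b)]
    ss_block = sum(t ** 2 for t in block_totals) / a - correction
    ss_error = ss_total - ss_treatment - ss_block

    df_error = (a - 1) * (b - 1)
    ms_treatment = ss_treatment / (a - 1)
    ms_error = ss_error / df_error
    return {
        "SS_T": ss_total, "SS_treatment": ss_treatment,
        "SS_blocks": ss_block, "SS_E": ss_error,
        "df_error": df_error, "MS_E": ms_error,
        "f0": ms_treatment / ms_error,
    }


# Hypothetical data: 3 treatments (rows) by 4 blocks (columns).
result = rcbd_anova([
    [10, 12, 11, 13],
    [12, 14, 14, 16],
    [15, 16, 17, 18],
])
print(result)
```

Removing the block sum of squares from the error term is what lets the design filter out the nuisance factor.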
Ex. Fabric strength data from a randomized complete block design are
shown in the table. We want to test the effect of chemical type on the strength
of fabric using α=0.01
H0: all chemical types provide identical mean strength
H1: the chemical types do not all provide equal mean strength
$f_0 = 75.13 > f_{0.01,3,12} = 5.95$, so reject H0 and conclude that there is a
significant difference among the chemical types.
Multiple comparison
Similar to simple ANOVA but $LSD = t_{\alpha/2,(a-1)(b-1)}\sqrt{2MS_E/b}$
Ex. Refer to the previous example; use Fisher’s LSD method to analyze the
difference between each pair of treatments.
$LSD = t_{0.025,12}\sqrt{2(0.08)/5} = 2.179\sqrt{2(0.08)/5} = 0.39$
Type 4 results in significantly different strengths than the other three types.
Types 2 and 3 do not differ, and types 1 and 3 do not differ.
There may be a small difference in strength between types 1 and 2.
Exercise
1. A civil engineer is interested in determining whether four different methods
of estimating flood flow frequency produce equivalent estimates of peak
discharge when applied to the same watershed. Each procedure is used
six times on the watershed and the resulting discharge data are shown in
the table.
Estimation method   Observation
1                    0.34    0.12    1.23    0.70    1.75    0.12
2                    0.91    2.94    2.14    2.36    2.86    4.55
3                    6.31    8.37    9.75    6.09    9.82    7.24
4                   17.15   11.82   10.95   17.20   14.35   16.82
2. An experiment was conducted to investigate leaking current in a SOS
MOSFETS device. The purpose of the experiment was to investigate how
leakage current varies as the channel length changes. Four channel lengths
were selected. For each channel length, five different widths were also
used, and width is to be considered a nuisance factor. The data are as
follows:
3. An article in the American Industrial Hygiene Association Journal (Vol. 37, 1976, pp. 418–
422) describes a field test for detecting the presence of arsenic in urine samples. The test has
been proposed for use among forestry workers because of the increasing use of organic
arsenics in that industry. The experiment compared the test as performed by both a trainee
and an experienced trainer to an analysis at a remote laboratory. Four subjects were selected
for testing and are considered as blocks. The response variable is arsenic content (in ppm) in
the subject’s urine. The data are as follows:
4. An article in the Food Technology Journal (Vol. 10, 1956, pp. 39–42)
describes a study on the protopectin content of tomatoes during storage.
Four storage times were selected, and samples from nine lots of tomatoes
were analyzed. The protopectin content (expressed as hydrochloric acid
soluble fraction mg/kg) is in the following table.
5. An article in the IEEE Transactions on Components, Hybrids, and
Manufacturing Technology (Vol.15, No. 2, 1992, pp. 146–153) describes an
experiment in which the contact resistance of a brake-only relay was
studied for three different materials (all were silver-based alloys). The data
are as follows.
6. An article in Lubrication Engineering (December 1990) describes the results
of an experiment designed to investigate the effects of carbon material
properties on the progression of blisters on carbon face seals. The carbon
face seals are used extensively in equipment such as air turbine starters.
Five different carbon materials were tested, and the surface roughness was
measured. The data are as follows:
7. An article in Communications of the ACM (Vol. 30, No. 5, 1987) studied
different algorithms for estimating software development costs. Six
algorithms were applied to eight software development projects and the
percent error in estimating the development cost was observed. The data
are in the table at the bottom of the page.