Transcript lecture10

Analysis of Variance
Outline:
- Designing Engineering Experiments
- Completely Randomized Single-Factor Experiment
- Random Effects Model
- Randomized Complete Block Design
Designing Engineering Experiments

Factor: The parameter of interest.

Levels: The possible values of a factor.

Analysis of Variance (ANOVA): Analyzes the effect of different factor
levels on the response.

Randomization: Running the experimental trials in random order.

Controllable variables: Other parameters involved in the experiment.

Experiment activities:
1. Conjecture
2. Experiment
3. Analysis
4. Conclusion
Completely Randomized Single-Factor Experiment
Ex: A manufacturer of paper is interested in improving the tensile strength of the
product. The product engineer thinks that tensile strength is a function of the
hardwood concentration in the paper. The range of hardwood concentrations of
practical interest is between 5% and 20%. The engineers decide to investigate four
levels of hardwood concentration: 5%, 10%, 15%, and 20%. They decide to make up
six test specimens at each concentration level.

Randomized order: the order of the runs is selected at random to reduce the
effect of nuisance variables such as a warm-up effect.
Box plot: represents the variability within each treatment
and the variability between treatments.
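A minimal sketch of how a randomized run order could be generated for the hardwood experiment (four levels, six specimens each). The function name `randomized_run_order` and the fixed seed are illustrative assumptions, not part of the lecture.

```python
import random

def randomized_run_order(levels, replicates, seed=None):
    """Return the treatment level to use on each run, in random order."""
    # One entry per specimen: every level appears `replicates` times.
    runs = [level for level in levels for _ in range(replicates)]
    rng = random.Random(seed)  # a seed only makes the example repeatable
    rng.shuffle(runs)
    return runs

levels = [5, 10, 15, 20]  # hardwood concentration in percent
order = randomized_run_order(levels, replicates=6, seed=1)
for run_number, level in enumerate(order, start=1):
    print(f"run {run_number:2d}: {level}% hardwood")
```

Each of the 24 runs is assigned a level, and every level still appears exactly six times; only the order is random.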
Analysis of Variance

Typical data for single-factor experiment

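Linear Statistical Model

\[ Y_{ij} = \mu + \tau_i + E_{ij}, \quad i = 1, 2, \ldots, a, \quad j = 1, 2, \ldots, n \]

Here $\mu$ is the overall mean and $\tau_i$ is the effect of the ith treatment.
For each treatment, $E_{ij}$ has a normal distribution with mean 0 and
standard deviation $\sigma$.

\[ Y_{ij} = \mu_i + E_{ij}, \quad \text{where } \mu_i = \mu + \tau_i, \quad i = 1, 2, \ldots, a, \quad j = 1, 2, \ldots, n \]

For each treatment, $y_{ij}$ has a normal distribution with mean $\mu_i$ and
standard deviation $\sigma$.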

We are interested in the equality of the a treatment means µ1, µ2,…, µa
\[ H_0: \tau_1 = \tau_2 = \tau_3 = \cdots = \tau_a = 0 \]
\[ H_1: \tau_i \neq 0 \text{ for at least one } i \]


If H0 is true, each observation consists of the overall mean µ plus the random error
Eij, so changing the level of the factor has no effect on the mean response.
ANOVA partitions the total variability in the sample data into two components:

1. The variation between treatments

2. The variation within treatments
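If there are no differences between treatments, components 1 and 2,
once divided by their degrees of freedom, estimate the same variance,
so they should be about equal.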

Total variation is described by the total sum of squares (SST)
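As a reminder, the standard sum of squares identity for a balanced single-factor experiment (a treatments, n observations each) can be sketched as follows. The dot notation is assumed to follow the usual convention: $\bar{y}_{i\cdot}$ is the mean of treatment i and $\bar{y}_{\cdot\cdot}$ is the grand mean.

```latex
\begin{aligned}
SS_T &= \sum_{i=1}^{a}\sum_{j=1}^{n} \left(y_{ij} - \bar{y}_{\cdot\cdot}\right)^2 \\
     &= n\sum_{i=1}^{a} \left(\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}\right)^2
      + \sum_{i=1}^{a}\sum_{j=1}^{n} \left(y_{ij} - \bar{y}_{i\cdot}\right)^2 \\
     &= SS_{treatment} + SS_E
\end{aligned}
```

The first term measures the variation between treatments and the second the variation within treatments.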

Degrees of freedom
\[ SS_T = SS_{treatment} + SS_E \]
\[ an - 1 = (a - 1) + a(n - 1) \]

Mean square for treatment
\[ MS_{treatment} = SS_{treatment} / (a - 1) \]
Mean square for error
\[ MS_E = SS_E / [a(n - 1)] \]

To test the hypothesis, we compare MStreatment and MSE

by using the F test statistic
\[ F_0 = MS_{treatment} / MS_E \]

H0 is rejected when MStreatment is sufficiently larger than MSE, so the
test is upper-tailed.

Reject H0 if
\[ f_0 > f_{\alpha, a-1, a(n-1)} \]

Computing formulas for ANOVA

ANOVA Table
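The computation described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the lecture's own procedure: the function name `one_way_anova` is hypothetical, the data are made-up toy numbers (not the tensile strength data), and the critical value is read from an F table rather than computed.

```python
def one_way_anova(groups):
    """ANOVA quantities for a balanced single-factor experiment.

    groups: list of a lists, each holding the n observations of one treatment.
    """
    a = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes the same n for every treatment")

    grand_mean = sum(sum(g) for g in groups) / (a * n)
    means = [sum(g) / n for g in groups]

    # Between-treatment, within-treatment, and total sums of squares.
    ss_treatment = n * sum((m - grand_mean) ** 2 for m in means)
    ss_error = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    ss_total = sum((y - grand_mean) ** 2 for g in groups for y in g)

    ms_treatment = ss_treatment / (a - 1)
    ms_error = ss_error / (a * (n - 1))
    return {
        "ss_treatment": ss_treatment,
        "ss_error": ss_error,
        "ss_total": ss_total,
        "ms_treatment": ms_treatment,
        "ms_error": ms_error,
        "f0": ms_treatment / ms_error,
    }

# Hypothetical toy data: a = 3 treatments, n = 3 observations each.
toy_data = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]
result = one_way_anova(toy_data)
f_critical = 5.14  # f_{0.05, 2, 6} taken from an F table
print(result)
print("reject H0" if result["f0"] > f_critical else "fail to reject H0")
```

For the toy data the sums of squares are 42 (treatment) and 6 (error), which add up to the total of 48, and f0 = 21.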

Ex. Consider the tensile strength experiment. We can use ANOVA to test the
hypothesis that different hardwood concentrations do not affect the mean
tensile strength of the paper. Use α = 0.01.
\[ H_0: \tau_1 = \tau_2 = \tau_3 = \tau_4 = 0 \]
\[ H_1: \tau_i \neq 0 \text{ for at least one } i \]


$f_0 = 19.60 > f_{0.01,3,20} = 4.94$, so we reject H0.
Conclusion: hardwood concentration significantly affects the mean strength
of the paper.
Confidence interval on the mean of the ith treatment
Ex. Find the 95% CI on the mean strength at 20% hardwood concentration.
\[ \bar{y}_{4\cdot} - t_{0.025,20}\sqrt{MS_E / n} \le \mu_4 \le \bar{y}_{4\cdot} + t_{0.025,20}\sqrt{MS_E / n} \]
\[ 21.17 - (2.086)\sqrt{6.51/6} \le \mu_4 \le 21.17 + (2.086)\sqrt{6.51/6} \]
\[ 19.00 \le \mu_4 \le 23.34 \]
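The arithmetic of this interval can be checked with a short script. The function name `treatment_mean_ci` is a hypothetical helper; the inputs (mean 21.17, MSE 6.51, n = 6, t = 2.086) are the values used in the example.

```python
import math

def treatment_mean_ci(mean, ms_error, n, t_crit):
    """Two-sided CI on one treatment mean: mean +/- t * sqrt(MSE / n)."""
    half_width = t_crit * math.sqrt(ms_error / n)
    return mean - half_width, mean + half_width

lower, upper = treatment_mean_ci(mean=21.17, ms_error=6.51, n=6, t_crit=2.086)
print(f"{lower:.2f} <= mu_4 <= {upper:.2f}")  # 19.00 <= mu_4 <= 23.34
```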

Confidence interval on a difference in treatment means

Ex. Find a 95% CI on the difference in means µ3-µ2
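A worked sketch of this interval, assuming the standard form in which the difference of two treatment means has standard error $\sqrt{2 MS_E / n}$. The numbers are the ones used elsewhere in this lecture: treatment means 17.00 and 15.67, MSE = 6.51, n = 6, and t0.025,20 = 2.086.

```latex
\begin{aligned}
\bar{y}_{3\cdot} - \bar{y}_{2\cdot} - t_{0.025,20}\sqrt{\frac{2\,MS_E}{n}}
  &\le \mu_3 - \mu_2 \le
  \bar{y}_{3\cdot} - \bar{y}_{2\cdot} + t_{0.025,20}\sqrt{\frac{2\,MS_E}{n}} \\
(17.00 - 15.67) - (2.086)\sqrt{\frac{2(6.51)}{6}}
  &\le \mu_3 - \mu_2 \le
  (17.00 - 15.67) + (2.086)\sqrt{\frac{2(6.51)}{6}} \\
1.33 - 3.07 &\le \mu_3 - \mu_2 \le 1.33 + 3.07 \\
-1.74 &\le \mu_3 - \mu_2 \le 4.40
\end{aligned}
```

Because the interval contains zero, these two means are not shown to differ at this confidence level.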
Multiple comparisons following ANOVA

When $H_0: \tau_1 = \tau_2 = \tau_3 = \cdots = \tau_a = 0$ is rejected in ANOVA, we know that some of
the treatment means are different.

ANOVA doesn’t identify which means are different.

Methods for investigating this issue are called multiple comparisons methods.

Fisher LSD: compares all pairs of means (µi and µj)

H0: µi = µj with test statistic

\[ t_0 = \frac{\bar{y}_{i\cdot} - \bar{y}_{j\cdot}}{\sqrt{2 MS_E / n}} \]
The pair of means µi and µj would be declared significantly different if
\[ |\bar{y}_{i\cdot} - \bar{y}_{j\cdot}| > LSD, \quad \text{where } LSD = t_{\alpha/2, a(n-1)}\sqrt{2 MS_E / n} \]

Ex. Apply the Fisher LSD to the hardwood concentration experiment.

a = 4 levels, n = 6 replicates, and t0.025,20 = 2.086

The treatment means are: $\bar{y}_{1\cdot} = 10.00$ psi, $\bar{y}_{2\cdot} = 15.67$ psi, $\bar{y}_{3\cdot} = 17.00$ psi, $\bar{y}_{4\cdot} = 21.17$ psi

The value of LSD,
\[ LSD = t_{0.025,20}\sqrt{2 MS_E / n} = 2.086\sqrt{2(6.51)/6} = 3.07 \]

Compare the difference for every pair of treatments with the LSD:
4 vs. 1 = 21.17 - 10.00 = 11.17 > 3.07
4 vs. 2 = 21.17 - 15.67 = 5.50 > 3.07
4 vs. 3 = 21.17 - 17.00 = 4.17 > 3.07
3 vs. 1 = 17.00 - 10.00 = 7.00 > 3.07
3 vs. 2 = 17.00 - 15.67 = 1.33 < 3.07 * (not significantly different)
2 vs. 1 = 15.67 - 10.00 = 5.67 > 3.07
[Figure: the four treatment means (5%, 10%, 15%, 20%) plotted on an axis from 0 to 25]
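The pairwise comparison above can be sketched in Python. The helper name `fisher_lsd` is hypothetical; the treatment means, MSE = 6.51, n = 6, and t0.025,20 = 2.086 are the values from the example.

```python
import math
from itertools import combinations

def fisher_lsd(ms_error, n, t_crit):
    """Least significant difference for a balanced single-factor design."""
    return t_crit * math.sqrt(2 * ms_error / n)

means = {"5%": 10.00, "10%": 15.67, "15%": 17.00, "20%": 21.17}  # psi
lsd = fisher_lsd(ms_error=6.51, n=6, t_crit=2.086)
print(f"LSD = {lsd:.2f}")

not_different = []
for (name_i, mean_i), (name_j, mean_j) in combinations(means.items(), 2):
    diff = abs(mean_i - mean_j)
    if diff > lsd:
        verdict = "different"
    else:
        verdict = "not significantly different"
        not_different.append((name_i, name_j))
    print(f"{name_i} vs. {name_j}: |diff| = {diff:.2f} -> {verdict}")
```

Only the 10% and 15% pair falls below the LSD, which matches the starred row in the comparison.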
Model Adequacy Checking

Residual Analysis and Model Checking



Residuals vs. time: checks the independence assumption of the errors.
Residuals vs. fitted values: checks the constant-variance assumption of the errors.
Normal probability plot: checks the normality assumption of the errors.
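A minimal sketch of how the residuals behind these plots could be computed, assuming the single-factor model, in which the fitted value of every observation is its treatment mean and the residual is $e_{ij} = y_{ij} - \bar{y}_{i\cdot}$. The function name and the data are hypothetical.

```python
def residuals_and_fitted(groups):
    """Return (fitted, residuals) lists for a single-factor experiment.

    groups: list of lists, one list of observations per treatment.
    """
    fitted, residuals = [], []
    for observations in groups:
        treatment_mean = sum(observations) / len(observations)
        for y in observations:
            fitted.append(treatment_mean)         # fitted value = treatment mean
            residuals.append(y - treatment_mean)  # residual e_ij
    return fitted, residuals

# Hypothetical toy data: 3 treatments, 3 observations each.
toy_data = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]
fitted, residuals = residuals_and_fitted(toy_data)
for f, e in zip(fitted, residuals):
    print(f"fitted = {f:.2f}, residual = {e:+.2f}")
```

The (fitted, residual) pairs are what would be plotted for the constant-variance check; plotting the residuals in run order or on normal probability paper gives the other two checks.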
Randomized Complete Block Design

An extension of the paired t-test to more than two treatments.

Reduces the effect of a nuisance factor.

Ex. Three methods could be used to evaluate the strength reading on steel plate
girders. If there are four plates and each plate is large enough to hold all the
treatments, the experimental design would appear as in the figure.

ANOVA sums of squares
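For reference, a sketch of the standard sum of squares partition for a randomized complete block design with a treatments and b blocks. The notation is assumed to follow the usual convention: $\bar{y}_{i\cdot}$ is a treatment mean, $\bar{y}_{\cdot j}$ a block mean, and $\bar{y}_{\cdot\cdot}$ the grand mean.

```latex
\begin{aligned}
SS_T &= \sum_{i=1}^{a}\sum_{j=1}^{b} \left(y_{ij} - \bar{y}_{\cdot\cdot}\right)^2 \\
     &= b\sum_{i=1}^{a} \left(\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}\right)^2
      + a\sum_{j=1}^{b} \left(\bar{y}_{\cdot j} - \bar{y}_{\cdot\cdot}\right)^2
      + \sum_{i=1}^{a}\sum_{j=1}^{b} \left(y_{ij} - \bar{y}_{i\cdot} - \bar{y}_{\cdot j} + \bar{y}_{\cdot\cdot}\right)^2 \\
     &= SS_{treatments} + SS_{blocks} + SS_E \\
ab - 1 &= (a - 1) + (b - 1) + (a - 1)(b - 1)
\end{aligned}
```

The last line gives the matching split of the degrees of freedom, so the error mean square uses (a - 1)(b - 1) degrees of freedom.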

Computing Formulas for ANOVA randomized block

ANOVA for a Randomized Complete Block Design
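The block-design ANOVA can be sketched as below. This is an illustration under stated assumptions: the function name `rcbd_anova` is hypothetical, the table is made-up toy data (not the fabric strength data), and the critical value is read from an F table.

```python
def rcbd_anova(table):
    """ANOVA quantities for a randomized complete block design.

    table[i][j] is the observation for treatment i in block j.
    """
    a = len(table)      # number of treatments
    b = len(table[0])   # number of blocks
    grand_mean = sum(sum(row) for row in table) / (a * b)
    treatment_means = [sum(row) / b for row in table]
    block_means = [sum(table[i][j] for i in range(a)) / a for j in range(b)]

    ss_total = sum((y - grand_mean) ** 2 for row in table for y in row)
    ss_treatments = b * sum((m - grand_mean) ** 2 for m in treatment_means)
    ss_blocks = a * sum((m - grand_mean) ** 2 for m in block_means)
    ss_error = ss_total - ss_treatments - ss_blocks  # by subtraction

    ms_treatments = ss_treatments / (a - 1)
    ms_error = ss_error / ((a - 1) * (b - 1))
    return {
        "ss_total": ss_total,
        "ss_treatments": ss_treatments,
        "ss_blocks": ss_blocks,
        "ss_error": ss_error,
        "ms_error": ms_error,
        "f0": ms_treatments / ms_error,
    }

# Hypothetical toy data: 3 treatments (rows) in 3 blocks (columns).
toy_table = [[4, 6, 8],
             [5, 8, 8],
             [9, 10, 14]]
rcbd = rcbd_anova(toy_table)
f_critical = 6.94  # f_{0.05, 2, 4} taken from an F table
print(rcbd)
print("reject H0" if rcbd["f0"] > f_critical else "fail to reject H0")
```

For the toy table the total of 70 splits into 42 (treatments), 24 (blocks), and 4 (error), giving f0 = 21. Removing the block sum of squares from the error term is what makes blocking reduce the effect of the nuisance factor.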

Ex. Fabric strength data from a randomized complete block design are
shown in the table. We want to test the effect of chemical type on the strength
of the fabric using α = 0.01.
H0: all chemical types provide identical mean strength
H1: the mean strengths are not all equal

$f_0 = 75.13 > f_{0.01,3,12} = 5.95$, so we reject H0 and conclude that there is a
significant difference among the chemical types.

Multiple comparison

Similar to the single-factor ANOVA, but
\[ LSD = t_{\alpha/2,(a-1)(b-1)}\sqrt{2 MS_E / b} \]
Ex. Refer to the previous example and use Fisher’s LSD method to analyze the
difference between each pair of treatments.
\[ LSD = t_{0.025,(3)(4)}\sqrt{2(0.08)/5} = 2.179\sqrt{2(0.08)/5} = 0.39 \]
Type 4 results in significantly different strengths than the other three types.
Types 2 and 3 do not differ, and types 1 and 3 do not differ.
There may be a small difference in strength between types 1 and 2.
Exercise
1. A civil engineer is interested in determining whether four different methods
of estimating flood flow frequency produce equivalent estimates of peak
discharge when applied to the same watershed. Each procedure is used
six times on the watershed and the resulting discharge data are shown in
the table.

Estimation method   Observations
1                    0.34    0.12    1.23    0.70    1.75    0.12
2                    0.91    2.94    2.14    2.36    2.86    4.55
3                    6.31    8.37    9.75    6.09    9.82    7.24
4                   17.15   11.82   10.95   17.20   14.35   16.82
2. An experiment was conducted to investigate leakage current in an SOS
MOSFET device. The purpose of the experiment was to investigate how
leakage current varies as the channel length changes. Four channel lengths
were selected. For each channel length, five different widths were also
used, and width is to be considered a nuisance factor. The data are as
follows:

3. An article in the American Industrial Hygiene Association Journal (Vol. 37, 1976, pp. 418–
422) describes a field test for detecting the presence of arsenic in urine samples. The test has
been proposed for use among forestry workers because of the increasing use of organic
arsenics in that industry. The experiment compared the test as performed by both a trainee
and an experienced trainer to an analysis at a remote laboratory. Four subjects were selected
for testing and are considered as blocks. The response variable is arsenic content (in ppm) in
the subject’s urine. The data are as follows:
4. An article in the Food Technology Journal (Vol. 10, 1956, pp. 39–42)
describes a study on the protopectin content of tomatoes during storage.
Four storage times were selected, and samples from nine lots of tomatoes
were analyzed. The protopectin content (expressed as hydrochloric acid
soluble fraction mg/kg) is in the following table.

5. An article in the IEEE Transactions on Components, Hybrids, and
Manufacturing Technology (Vol.15, No. 2, 1992, pp. 146–153) describes an
experiment in which the contact resistance of a brake-only relay was
studied for three different materials (all were silver-based alloys). The data
are as follows.
6. An article in Lubrication Engineering (December 1990) describes the results
of an experiment designed to investigate the effects of carbon material
properties on the progression of blisters on carbon face seals. The carbon
face seals are used extensively in equipment such as air turbine starters.
Five different carbon materials were tested, and the surface roughness was
measured. The data are as follows:
7. An article in Communications of the ACM (Vol. 30, No. 5, 1987) studied
different algorithms for estimating software development costs. Six
algorithms were applied to eight software development projects and the
percent error in estimating the development cost was observed. The data
are in the table at the bottom of the page.