Experimental Design: Is it Important?

Analysis of Variance & One-Factor Designs

Y = DEPENDENT VARIABLE (“yield”, “response variable”, “quality indicator”)
X = INDEPENDENT VARIABLE (a possibly influential FACTOR)

OBJECTIVE: To determine the impact of X on Y

Mathematical Model:
$Y = f(X, \varepsilon)$, where $\varepsilon$ = (impact of) all factors other than X

Ex:
Y = Battery Life (hours)
X = Brand of Battery
$\varepsilon$ = Many other factors (possibly, some we’re unaware of)

Statistical Model

                     “LEVEL” OF BRAND
Replication      1        2       • • •      C
     1          Y11      Y12      • • •     Y1C
     2          Y21       •                  •
     •           •        •       Yij        •
     R          YR1       •       • • •     YRC

(Brand is, of course, represented as “categorical”)

\[ Y_{ij} = \mu + \tau_j + \varepsilon_{ij}, \qquad i = 1, \dots, R; \quad j = 1, \dots, C \]

Where
$\mu$ = OVERALL AVERAGE
j = index for FACTOR (Brand) LEVEL
i = index for “replication”
$\tau_j$ = Differential effect (response) associated with the jth level of X
and
$\varepsilon_{ij}$ = “noise” or “error” associated with the (particular) (i,j)th data value.

Let $\mu_j$ = AVERAGE associated with the jth level of X. Then
$\tau_j = \mu_j - \mu$ and $\mu$ = AVERAGE of the $\mu_j$.

\[ Y_{ij} = \mu + \tau_j + \varepsilon_{ij} \]

By definition, $\sum_{j=1}^{C} \tau_j = 0$.

The experiment produces $R \times C$ data values $Y_{ij}$.
The analysis produces estimates of $\mu, \tau_1, \tau_2, \dots, \tau_C$. (We can then get estimates of the $\varepsilon_{ij}$ by subtraction.)

                       Level of X
Replication      1        2        3      • • •      C
     1          Y11      Y12       •      • • •     Y1C
     2          Y21       •        •                 •
     •           •        •        •                 •
     R          YR1       •        •      • • •     YRC
Column mean     Y•1      Y•2       •     (Y•j)      Y•C

Y•1, Y•2, etc., are Column Means ($\bar{Y}_{\cdot j}$ in formula notation)

\[ \bar{Y}_{\cdot\cdot} = \frac{1}{C} \sum_{j=1}^{C} \bar{Y}_{\cdot j} = \text{“GRAND MEAN”} \]

(assuming the same number of data points in each column; otherwise, $\bar{Y}_{\cdot\cdot}$ = mean of all the data)

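The caveat about equal column sizes can be checked numerically. The sketch below uses small illustrative numbers (they are assumptions for demonstration, not data from the slides) and two hypothetical helper functions; it compares the mean of the column means with the mean of all the data for a balanced and an unbalanced layout.

```python
def mean_of_all_data(columns):
    """Grand mean computed directly from every data value."""
    values = [y for col in columns for y in col]
    return sum(values) / len(values)

def mean_of_column_means(columns):
    """Average of the column means, each column weighted equally."""
    col_means = [sum(col) / len(col) for col in columns]
    return sum(col_means) / len(col_means)

# Illustrative numbers only (assumed for this sketch).
balanced = [[1.8, 5.0, 1.0], [4.2, 5.4, 4.2]]   # same count in each column
unbalanced = [[1.8, 5.0, 1.0], [4.2, 5.4]]      # one value missing in column 2

for name, cols in (("balanced", balanced), ("unbalanced", unbalanced)):
    print(name, round(mean_of_all_data(cols), 3), round(mean_of_column_means(cols), 3))
```

With equal column sizes the two quantities agree; with unequal sizes they generally differ, which is why the grand mean is then defined as the mean of all the data.
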
MODEL: $Y_{ij} = \mu + \tau_j + \varepsilon_{ij}$

$\bar{Y}_{\cdot\cdot}$ estimates $\mu$
$\bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot}$ estimates $\tau_j$ $(= \mu_j - \mu)$ (for all j)

These estimates are based on Gauss’ (1796) PRINCIPLE OF LEAST SQUARES and (I would argue) on COMMON SENSE

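MODEL: $Y_{ij} = \mu + \tau_j + \varepsilon_{ij}$

If you insert the estimates into the MODEL,
(1)  $Y_{ij} = \bar{Y}_{\cdot\cdot} + (\bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot}) + \hat{\varepsilon}_{ij}$,
it follows that our estimate of $\varepsilon_{ij}$ is
(2)  $\hat{\varepsilon}_{ij} = Y_{ij} - \bar{Y}_{\cdot j}$
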
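Then, $Y_{ij} = \bar{Y}_{\cdot\cdot} + (\bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot}) + (Y_{ij} - \bar{Y}_{\cdot j})$

or,
(3)  $(Y_{ij} - \bar{Y}_{\cdot\cdot}) = (\bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot}) + (Y_{ij} - \bar{Y}_{\cdot j})$

TOTAL VARIABILITY in Y = Variability in Y associated with X + Variability in Y associated with all other factors
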
If you square both sides of (3), and double sum both sides (over i and j), you get [after some unpleasant algebra, but lots of terms which “cancel”]

\[ \sum_{j=1}^{C} \sum_{i=1}^{R} (Y_{ij} - \bar{Y}_{\cdot\cdot})^2 = R \sum_{j=1}^{C} (\bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot})^2 + \sum_{j=1}^{C} \sum_{i=1}^{R} (Y_{ij} - \bar{Y}_{\cdot j})^2 \]

TSS = SSBC + SSW (SSE)
TOTAL SUM OF SQUARES = SUM OF SQUARES BETWEEN COLUMNS + SUM OF SQUARES WITHIN COLUMNS

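The terms that “cancel” can be made explicit. The following is a short sketch of that algebra, written with $\bar{Y}_{\cdot j}$ for the column means and $\bar{Y}_{\cdot\cdot}$ for the grand mean, and assuming R data values in every column:

```latex
\begin{align*}
\sum_{j=1}^{C}\sum_{i=1}^{R} (Y_{ij}-\bar{Y}_{\cdot\cdot})^2
 &= \sum_{j=1}^{C}\sum_{i=1}^{R}
    \left[(\bar{Y}_{\cdot j}-\bar{Y}_{\cdot\cdot}) + (Y_{ij}-\bar{Y}_{\cdot j})\right]^2 \\
 &= R\sum_{j=1}^{C}(\bar{Y}_{\cdot j}-\bar{Y}_{\cdot\cdot})^2
  + \sum_{j=1}^{C}\sum_{i=1}^{R}(Y_{ij}-\bar{Y}_{\cdot j})^2
  + 2\sum_{j=1}^{C}(\bar{Y}_{\cdot j}-\bar{Y}_{\cdot\cdot})
      \underbrace{\sum_{i=1}^{R}(Y_{ij}-\bar{Y}_{\cdot j})}_{=\,0}
\end{align*}
```

The cross-product term vanishes because, within each column, the deviations from that column's own mean sum to zero.
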
ANOVA TABLE

SOURCE OF VARIABILITY              SSQ     DF           Mean square (M.S.)
Between Columns (due to brand)     SSBC    C-1          MSBC = SSBC / (C-1)
Within Columns (due to error)      SSW     (R-1)•C      MSW = SSW / ((R-1)•C)
TOTAL                              TSS     RC-1

Example: Y = LIFETIME (HOURS), 3 replications per level

                              BRAND
                 1     2     3     4     5     6     7     8
                1.8   4.2   8.6   7.0   4.2   4.2   7.8   9.0
                5.0   5.4   4.6   5.0   7.8   4.2   7.0   7.4
                1.0   4.2   4.2   9.0   6.6   5.4   9.8   5.8
Column mean     2.6   4.6   5.8   7.0   6.2   4.6   8.2   7.4     (Grand mean = 5.8)

\[ \text{SSBC} = 3\left([2.6 - 5.8]^2 + [4.6 - 5.8]^2 + \cdots + [7.4 - 5.8]^2\right) = 3\,(23.04) = 69.12 \]

SSW =

Brand 1                      Brand 2                     • • •    Brand 8
(1.8 - 2.6)^2 = .64          (4.2 - 4.6)^2 = .16                  (9.0 - 7.4)^2 = 2.56
(5.0 - 2.6)^2 = 5.76         (5.4 - 4.6)^2 = .64                  (7.4 - 7.4)^2 = 0
(1.0 - 2.6)^2 = 2.56         (4.2 - 4.6)^2 = .16                  (5.8 - 7.4)^2 = 2.56
Subtotal        8.96                         .96                                  5.12

Total of (8.96 + .96 + • • • + 5.12): SSW = 46.72

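The hand computation of the sums of squares can be reproduced in a few lines. This is a minimal sketch in plain Python; the function name `sums_of_squares` and the list-of-columns layout are choices made for illustration, not part of the original slides.

```python
# Battery lifetimes (hours) from the example: one inner list per brand, 3 replications each.
lifetimes = [
    [1.8, 5.0, 1.0],
    [4.2, 5.4, 4.2],
    [8.6, 4.6, 4.2],
    [7.0, 5.0, 9.0],
    [4.2, 7.8, 6.6],
    [4.2, 4.2, 5.4],
    [7.8, 7.0, 9.8],
    [9.0, 7.4, 5.8],
]

def sums_of_squares(columns):
    """Return (SSBC, SSW, TSS) for a one-factor layout given as a list of columns."""
    all_values = [y for col in columns for y in col]
    grand_mean = sum(all_values) / len(all_values)
    col_means = [sum(col) / len(col) for col in columns]
    # Between columns: squared distance of each column mean from the grand mean,
    # weighted by the number of values in that column.
    ssbc = sum(len(col) * (m - grand_mean) ** 2 for col, m in zip(columns, col_means))
    # Within columns: squared distance of each value from its own column mean.
    ssw = sum((y - m) ** 2 for col, m in zip(columns, col_means) for y in col)
    # Total: squared distance of each value from the grand mean.
    tss = sum((y - grand_mean) ** 2 for y in all_values)
    return ssbc, ssw, tss

ssbc, ssw, tss = sums_of_squares(lifetimes)
print(f"SSBC = {ssbc:.2f}, SSW = {ssw:.2f}, TSS = {tss:.2f}")
```

Computing TSS independently, rather than as SSBC + SSW, gives a quick check that the decomposition holds for these data.
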
ANOVA TABLE

Source of Variability     SSQ        df                      M.S.
BRAND                     69.12      7   (= 8 - 1)           9.87
ERROR                     46.72      16  (= 2 (8))           2.92
TOTAL                     115.84     23  (= (3 • 8) - 1)

We can show:

\[ E(\text{MSBC}) = \sigma^2 + V_{COL}, \qquad V_{COL} = \frac{R}{C-1} \sum_{j} (\mu_j - \mu)^2 \]
(“VCOL” is a MEASURE OF DIFFERENCES AMONG COLUMN MEANS)

\[ E(\text{MSW}) = \sigma^2 \]

(Assuming each $Y_{ij}$ has (constant) standard deviation $\sigma$)
(More about assumptions later)

$E(\text{MSBC}) = \sigma^2 + V_{COL}$
$E(\text{MSW}) = \sigma^2$

This suggests that
if MSBC / MSW > 1: There’s some evidence of non-zero $V_{COL}$, or “level of X affects Y”
if MSBC / MSW < 1: No evidence that $V_{COL} > 0$, or that “level of X affects Y”

With
$H_0$: Level of X has no impact on Y
$H_1$: Level of X does have impact on Y,
we need MSBC / MSW >> 1 to reject $H_0$.

More Formally,
$H_0$: $\tau_1 = \tau_2 = \cdots = \tau_C = 0$
$H_1$: not all $\tau_j = 0$
OR
$H_0$: $\mu_1 = \mu_2 = \cdots = \mu_C$ (All column means are equal)
$H_1$: not all $\mu_j$ are EQUAL

The probability law of MSBC / MSW = “Fcalc” is the F-distribution with (C-1, (R-1)C) degrees of freedom, assuming $H_0$ true.

[Figure: F density with upper-tail area $\alpha$ to the right of C = Table Value]

In our problem:

ANOVA TABLE

Source of Variability     SSQ       df     M.S.     Fcalc
BRAND                     69.12     7      9.87     3.38 (= 9.87 / 2.92)
ERROR                     46.72     16     2.92

(F table coming up)

[Figure: F density with (7, 16) DF; upper-tail area $\alpha$ = .05 to the right of C = 2.66; Fcalc = 3.38 lies beyond C]

[Figure: F-Table]

Hence, at $\alpha$ = .05, Reject $H_0$.
(i.e., Conclude that level of BRAND does have an impact on battery lifetime.)

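The decision rule amounts to a few arithmetic steps, sketched here in plain Python. The sums of squares, degrees of freedom, and the table value 2.66 are taken from the slides; the variable names are illustrative.

```python
# Sums of squares and degrees of freedom from the ANOVA table for the battery example.
ssbc, df_between = 69.12, 7    # between columns (brand)
ssw, df_within = 46.72, 16     # within columns (error)

msbc = ssbc / df_between       # mean square between columns
msw = ssw / df_within          # mean square within columns
f_calc = msbc / msw

# Critical value read from the F table for alpha = .05 and (7, 16) df, as quoted in the slides.
f_table = 2.66

decision = "reject H0" if f_calc > f_table else "do not reject H0"
print(f"MSBC = {msbc:.2f}, MSW = {msw:.2f}, Fcalc = {f_calc:.2f}: {decision}")
```

The critical value is hard-coded rather than computed so that the sketch needs nothing beyond the standard library; a statistics package would normally supply it.
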
ONE FACTOR ANOVA, Using EXCEL

Data (one column per brand):

Column 1   Column 2   Column 3   Column 4   Column 5   Column 6   Column 7   Column 8
1.8        4.2        8.6        7          4.2        4.2        7.8        9
5          5.4        4.6        5          7.8        4.2        7          7.4
1          4.2        4.2        9          6.6        5.4        9.8        5.8

Anova: Single-Factor

Summary
Groups      Count   Sum     Average   Variance
Column 1    3       7.8     2.6       4.48
Column 2    3       13.8    4.6       0.48
Column 3    3       17.4    5.8       5.92
Column 4    3       21      7         4
Column 5    3       18.6    6.2       3.36
Column 6    3       13.8    4.6       0.48
Column 7    3       24.6    8.2       2.08
Column 8    3       22.2    7.4       2.56

ANOVA
Source of Variation   SS       df    MS        F        P-value   F crit
Between Groups        69.12    7     9.87429   3.3816   0.02064   2.657
Within Groups         46.72    16    2.92
Total                 115.8    23

SPSS/MINITAB INPUT

VAR001   VAR002
1.8      1
5.0      1
1.0      1
4.2      2
5.4      2
4.2      2
.        .
.        .
.        .
9.0      8
7.4      8
5.8      8

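The stacked input layout (one row per observation, with the response in VAR001 and the brand code in VAR002) can be produced from the column-per-brand layout as follows. This is a small sketch in plain Python; the names `wide` and `long_rows` are illustrative.

```python
# Wide layout: one list of lifetimes (hours) per brand, as in the battery example.
wide = {
    1: [1.8, 5.0, 1.0],
    2: [4.2, 5.4, 4.2],
    3: [8.6, 4.6, 4.2],
    4: [7.0, 5.0, 9.0],
    5: [4.2, 7.8, 6.6],
    6: [4.2, 4.2, 5.4],
    7: [7.8, 7.0, 9.8],
    8: [9.0, 7.4, 5.8],
}

# Long ("stacked") layout: one (VAR001, VAR002) = (lifetime, brand) pair per observation.
long_rows = [(y, brand) for brand, values in wide.items() for y in values]

print("VAR001  VAR002")
for y, brand in long_rows[:6]:
    print(f"{y:6.1f}  {brand}")
```
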
ONE-FACTOR ANOVA, using SPSS

----- ONEWAY -----
Variable Lifetime
By Variable Device

Analysis of Variance

Source            D.F.   Sum of Squares   Mean Squares   F Ratio   F Prob.
Between Groups    7      69.1200          9.8743         3.3816    .0206
Within Groups     16     46.7200          2.9200
Total             23     115.8400

ONE FACTOR ANOVA (MINITAB)

MINITAB: STAT>>ANOVA>>ONE-WAY

Analysis of Variance for life
Source   DF   SS        MS     F      P
brand    7    69.12     9.87   3.38   0.021
Error    16   46.72     2.92
Total    23   115.84

[Figure: Dotplots of life by brand (group means are indicated by lines); axes: life, brand (1 to 8)]

[Figure: Boxplots of life by brand (means are indicated by solid circles); axes: life (0 to 10), brand (1 to 8)]

EXAMPLE: MORTAR
The tension bond strength of cement mortar is an
important characteristic of the product. An
engineer is interested in comparing the strength of
a modified formulation in which polymer latex
emulsions have been added during mixing to the
strength of the unmodified mortar. The
experimenter has collected 10 observations on
strength for the modified formulation and another
10 observations for the unmodified formulation.

Modified   Unmodified
16.85      17.50
16.40      17.63
17.21      18.25
16.35      18.00
16.52      17.86
17.04      17.75
16.96      18.22
17.15      17.90
16.59      17.96
16.57      18.15

One-way ANOVA: strength versus type (Minitab)

Analysis of Variance for strength
Source   DF   SS       MS       F       P
type     1    6.7048   6.7048   82.98   0.000
Error    18   1.4544   0.0808
Total    19   8.1592

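The Minitab table for the mortar data can be cross-checked by hand. The sketch below is plain Python under the one-factor model used throughout; `one_way_anova` is a hypothetical helper written for this illustration, not a library function.

```python
# Tension bond strength observations from the mortar example.
modified = [16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57]
unmodified = [17.50, 17.63, 18.25, 18.00, 17.86, 17.75, 18.22, 17.90, 17.96, 18.15]

def one_way_anova(groups):
    """Return the entries of a one-factor ANOVA table as a dict."""
    values = [y for g in groups for y in g]
    grand = sum(values) / len(values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "df_between": df_between, "ms_between": ms_between,
        "ss_within": ss_within, "df_within": df_within, "ms_within": ms_within,
        "f": ms_between / ms_within,
    }

table = one_way_anova([modified, unmodified])
print(f"type : DF={table['df_between']}  SS={table['ss_between']:.4f}  F={table['f']:.2f}")
print(f"Error: DF={table['df_within']}  SS={table['ss_within']:.4f}  MS={table['ms_within']:.4f}")
```

The function does not assume equal group sizes, so it also covers unbalanced one-factor layouts.
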
[Figure: Boxplots of strength by type (means are indicated by solid circles); axes: strength (16.5 to 18.5), type (1, 2)]

ONE FACTOR ANOVA, using JMP

MVPC Survey Results

Amesbury   Andover   Methuen   Salem
66         55        56        64
66         50        56        70
66         51        57        62
67         47        58        64
70         57        61        66
64         48        54        62
71         52        62        67
66         50        57        60
71         48        61        68
67         50        58        68
63         48        54        66
60         49        51        66
66         52        57        61
70         48        60        63
69         48        59        67
66         48        56        67
70         51        61        70
65         49        55        62
71         46        62        62
63         51        53        68
69         54        59        70
67         54        58        62
64         49        54        63
68         55        58        65
65         47        55        68
67         47        58        68
65         53        55        64
70         51        60        65
68         50        58        69
73         54        64        62

Assumptions

Basically, the same as in Regression analysis:
• Run order plot
• Normality plot
• Residual plot

MODEL: $Y_{ij} = \mu + \tau_j + \varepsilon_{ij}$
1.) the $\varepsilon_{ij}$ are indep. random variables
2.) Each $\varepsilon_{ij}$ is Normally Distributed, with $E(\varepsilon_{ij}) = 0$ for all i, j
3.) $\sigma^2(\varepsilon_{ij})$ = constant for all i, j

Diagnosis: Normality
• The points on the normality plot must more or less follow a straight line to claim that the residuals are “normally distributed”.
• There are formal statistical tests to check this.
• The ANOVA method we learn here is not sensitive to the normality assumption. That is, a mild departure from the normal distribution will not change our conclusions much.
Normality plot: normal scores vs. residuals

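The coordinates of a normality plot can be computed directly. The sketch below uses the mortar data and plain Python; the residual is each observation minus its own group mean, and the normal scores use Blom's plotting positions (rank - 3/8)/(n + 1/4), which is an assumed convention here since packages differ slightly in the formula they use.

```python
from statistics import NormalDist, mean

# Mortar data from the example.
modified = [16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57]
unmodified = [17.50, 17.63, 18.25, 18.00, 17.86, 17.75, 18.22, 17.90, 17.96, 18.15]

# Residual = observation minus the mean of its own group (column).
residuals = [y - mean(group) for group in (modified, unmodified) for y in group]

# Normal score of each residual, based on its rank among all residuals.
# Assumption: Blom's plotting positions; no tie handling is needed for these data.
n = len(residuals)
order = sorted(range(n), key=lambda k: residuals[k])
normal_scores = [0.0] * n
for rank, k in enumerate(order, start=1):
    normal_scores[k] = NormalDist().inv_cdf((rank - 0.375) / (n + 0.25))

# Points to plot: (residual, normal score), listed from smallest to largest residual.
for e, z in sorted(zip(residuals, normal_scores)):
    print(f"{e:7.3f}  {z:6.2f}")
```

If the printed pairs fall close to a straight line when plotted, the normality assumption looks reasonable.
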
From Mortar data:
[Figure: Normal Probability Plot of the Residuals (response is strength); vertical axis: Normal Score (-2 to 2), horizontal axis: Residual (-0.5 to 0.5)]

Diagnosis: Constant Variances
• The points on the residual plot must be more or less within a horizontal band to claim “constant variances”.
• There are formal statistical tests to check this.
• The ANOVA method we learn here is not sensitive to the constant variances assumption. That is, slightly different variances within groups will not change our conclusions much.
Residual plot: fitted values vs. residuals

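A rough numerical companion to the residual plot is to compare the sample variances of the groups. The sketch below does this for the mortar data in plain Python; the ratio of largest to smallest variance is only an informal indicator chosen for illustration, not a formal test.

```python
from statistics import mean, variance

# Mortar data from the example, keyed by formulation.
groups = {
    "modified": [16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57],
    "unmodified": [17.50, 17.63, 18.25, 18.00, 17.86, 17.75, 18.22, 17.90, 17.96, 18.15],
}

# In the one-factor model the fitted value for every observation is its group mean.
fitted = {name: mean(values) for name, values in groups.items()}
# Sample variance (divisor n - 1) of each group.
group_var = {name: variance(values) for name, values in groups.items()}

for name in groups:
    print(f"{name:10s}  fitted = {fitted[name]:.3f}  variance = {group_var[name]:.4f}")

# Informal indicator: how far apart the group variances are.
ratio = max(group_var.values()) / min(group_var.values())
print(f"largest / smallest variance = {ratio:.2f}")
```
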
From Mortar data:
[Figure: Residuals Versus the Fitted Values (response is strength); vertical axis: Residual (-0.5 to 0.5), horizontal axis: Fitted Value (17.0 to 18.0)]

Diagnosis: Randomness/Independence
• The run order plot must show no “systematic” patterns to claim “randomness”.
• There are formal statistical tests to check this.
• The ANOVA method is sensitive to the independence assumption. That is, even a small degree of dependence between data points can change our conclusions a lot.
Run order plot: order vs. residuals

From Mortar data:
[Figure: Residuals Versus the Order of the Data (response is strength); vertical axis: Residual (-0.5 to 0.5), horizontal axis: Observation Order (2 to 20)]

This assumes a “fixed model”:
Inherent interest in the specific levels of the factors under study. There’s no direct interest in extrapolating to other levels; inference will be limited to the levels that appear in the experiment. The experimenter selects the levels.

If a “random model”:
Levels in the experiment are randomly selected from a population of such levels, and inference is to be made about the entire population of levels.
Then, besides assumptions 1 to 3, there is another assumption:
4) a) the $\tau_j$ are independent random variables which are normally distributed with constant variance
   b) the $\tau_j$ and $\varepsilon_{ij}$ are independent

With these assumptions, the estimates ($\bar{Y}_{\cdot\cdot}$ and the $\bar{Y}_{\cdot j}$) are “Maximum likelihood estimates” (a statistical notion which could be thought of as “efficiency” [“most likely value”]), and, more directly relevant:
The “Conventional” F- and t-tests are applicable (VALID) for a variety of hypothesis testing and confidence interval computations.

KRUSKAL-WALLIS TEST
(Non-Parametric Alternative)

$H_0$: The probability distributions are identical for each level of the factor
$H_1$: Not all the distributions are the same

BATTERY LIFETIME (hours)
(each column rank ordered, for simplicity)

              Brand
          A       B       C
          32      32      28
          30      32      21
          30      26      15
          29      26      15
          26      22      14
          23      20      14
          20      19      14
          19      16      11
          18      14      9
          12      14      8
Mean:     23.9    22.1    14.9    (here, irrelevant!!)

$H_0$: no difference in distribution among the three brands with respect to battery lifetime
$H_1$: At least one of the 3 brands differs in distribution from the others with respect to lifetime

Ranks (rank in the combined data shown in parentheses)

                 Brand
     A              B              C
     32 (29)        32 (29)        28 (24)
     30 (26.5)      32 (29)        21 (18)
     30 (26.5)      26 (22)        15 (10.5)
     29 (25)        26 (22)        15 (10.5)
     26 (22)        22 (19)        14 (7)
     23 (20)        20 (16.5)      14 (7)
     20 (16.5)      19 (14.5)      14 (7)
     19 (14.5)      16 (12)        11 (3)
     18 (13)        14 (7)         9 (2)
     12 (4)         14 (7)         8 (1)

     T1 = 197       T2 = 178       T3 = 90
     n1 = 10        n2 = 10        n3 = 10

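The ranks and rank sums can be obtained mechanically: pool all 30 lifetimes, rank them from smallest to largest, give tied values the average of the ranks they occupy, and add up the ranks within each brand. A minimal sketch in plain Python follows; `average_ranks` is a hypothetical helper written for this illustration.

```python
# Battery lifetimes (hours) for the three brands in the example.
brands = {
    "A": [32, 30, 30, 29, 26, 23, 20, 19, 18, 12],
    "B": [32, 32, 26, 26, 22, 20, 19, 16, 14, 14],
    "C": [28, 21, 15, 15, 14, 14, 14, 11, 9, 8],
}

def average_ranks(values):
    """Map each distinct value to its rank in the pooled data; ties share the average rank."""
    positions = {}
    for pos, v in enumerate(sorted(values), start=1):
        positions.setdefault(v, []).append(pos)
    return {v: sum(p) / len(p) for v, p in positions.items()}

# Rank every observation within the combined data from all brands.
combined = [y for values in brands.values() for y in values]
rank_of = average_ranks(combined)

# T_j = sum of the ranks of the data in column j.
rank_sums = {name: sum(rank_of[y] for y in values) for name, values in brands.items()}
print(rank_sums)
```

As a sanity check, the rank sums must add up to N(N + 1)/2 = 465 for N = 30.
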
TEST STATISTIC:

\[ H = \frac{12}{N(N+1)} \sum_{j=1}^{K} \frac{T_j^2}{n_j} - 3(N+1) \]

$n_j$ = # data values in column j
$N = \sum_{j=1}^{K} n_j$
K = # Columns (levels)
$T_j$ = SUM OF RANKS OF DATA IN COL j, when all DATA are COMBINED
(There is a slight adjustment in the formula as a function of the number of ties in rank.)

\[ H = \frac{12}{30\,(31)} \left[ \frac{197^2}{10} + \frac{178^2}{10} + \frac{90^2}{10} \right] - 3\,(31) = 8.41 \]

(with adjustment for ties, we get 8.46)

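The arithmetic for H can be sketched in plain Python from the rank sums and column sizes given in the slides. The function below implements only the unadjusted formula; the tie adjustment mentioned in the slides is deliberately left out, and the name `kruskal_wallis_h` is illustrative.

```python
def kruskal_wallis_h(rank_sums, sizes):
    """Unadjusted Kruskal-Wallis H from the rank sums T_j and column sizes n_j."""
    n_total = sum(sizes)
    weighted = sum(t ** 2 / n for t, n in zip(rank_sums, sizes))
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3.0 * (n_total + 1)

# Rank sums and column sizes from the battery example (brands A, B, C).
h = kruskal_wallis_h([197, 178, 90], [10, 10, 10])
print(f"H = {h:.2f}")
```
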
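What do we do with H?
We can show that, under $H_0$, H is well approximated by a $\chi^2$ distribution with df = K - 1.
Here, df = 2, and at $\alpha$ = .05, the critical value = 5.99
($\chi^2_{\alpha, df} = df \cdot F_{\alpha, df, \infty}$)

[Figure: $\chi^2$ density with 2 df; upper-tail area $\alpha$ = .05 to the right of C = 5.99; H = 8.41 lies beyond C]

Reject $H_0$; conclude that the lifetime distribution is NOT the same for all 3 BRANDS
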
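For df = 2 specifically, the $\chi^2$ upper-tail probability has the closed form exp(-x/2), so both the critical value and an approximate p-value can be checked without tables. The sketch below relies on that special case only; other degrees of freedom would need a statistics library or a table.

```python
import math

h = 8.41       # Kruskal-Wallis statistic from the battery example (unadjusted for ties)
alpha = 0.05

# Chi-square with 2 df: P(X > x) = exp(-x / 2).
critical_value = -2.0 * math.log(alpha)   # solves exp(-c / 2) = alpha
p_value = math.exp(-h / 2.0)              # approximate p-value for the observed H

decision = "reject H0" if h > critical_value else "do not reject H0"
print(f"critical value = {critical_value:.2f}, p-value = {p_value:.4f}: {decision}")
```
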