2 Way Analysis of Variance (ANOVA)

Peter Shaw
RU




ANOVA - a recapitulation.
This is a parametric test, examining whether the means differ between 2 or more populations.
It generates a test statistic F, which can be thought of as a signal:noise ratio. Thus large values of F indicate a high degree of pattern within the data and imply rejection of H0.
It is thus similar to the t test - in fact ANOVA on 2 groups is equivalent to a t test [F = t^2].
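The F = t^2 equivalence is easy to check numerically. The following is a minimal sketch in plain Python, borrowing two groups of lamb data (A1B1 and A1B2) from the worked example later in these notes purely as convenient numbers. The function names are illustrative, and the t statistic is assumed to be the equal-variance (pooled) two-sample form.

```python
from math import sqrt

def pooled_t(g1, g2):
    """Two-sample t statistic assuming equal variances (pooled)."""
    n1, n2 = len(g1), len(g2)
    m1, m2 = sum(g1) / n1, sum(g2) / n2
    ss1 = sum((x - m1) ** 2 for x in g1)
    ss2 = sum((x - m2) ** 2 for x in g2)
    pooled_var = (ss1 + ss2) / (n1 + n2 - 2)
    return (m1 - m2) / sqrt(pooled_var * (1 / n1 + 1 / n2))

def one_way_f(groups):
    """One-way ANOVA F statistic via the correction-factor method."""
    data = [x for g in groups for x in g]
    n = len(data)
    cf = sum(data) ** 2 / n
    ss_tot = sum(x * x for x in data) - cf
    ss_trt = sum(sum(g) ** 2 / len(g) for g in groups) - cf
    ss_err = ss_tot - ss_trt
    df_trt, df_err = len(groups) - 1, n - len(groups)
    return (ss_trt / df_trt) / (ss_err / df_err)

g1 = [8.53, 20.53, 12.53, 14.00, 10.80]
g2 = [17.53, 21.07, 20.80, 17.33, 20.07]
t = pooled_t(g1, g2)
f = one_way_f([g1, g2])
print(f"t^2 = {t * t:.4f}, F = {f:.4f}")
```

The two printed values should agree to rounding error, which is all the slide's claim amounts to.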
How to do an ANOVA:

1: Calculate total Sum of Squares for the data:
SStot = Σ_i (x_i - μ)^2 = Σ_i (x_i^2) - CF
where CF = Correction factor = (Σ_i x_i)^2 / N
2: Calculate Treatment Sum of Squares:
SStrt = Σ_t (X_t.)^2 / r - CF
where X_t. = sum of all values within treatment t, and r = number of replicates per treatment
3: Draw up ANOVA table
ANOVA tables
Exact layout varies somewhat - I dislike SPSS’s version!
Learn as parrots: Source DF SS MS F

Source      DF               SS                       MS              F
Treatment   T-1              SStrt                    SStrt / (T-1)   MStrt / MSerr
Error       by subtraction   SSerr (by subtraction)   SSerr / DFerr
Total       N-1              SStot                    = Variance
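The three steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `one_way_anova` is hypothetical, the design is assumed to have no missing values, and the numbers are made up for easy arithmetic rather than taken from the lecture.

```python
def one_way_anova(groups):
    """Return the rows of a one-way ANOVA table as a dict.

    groups: list of lists, one list of replicate values per treatment.
    Each row is (DF, SS, MS, F); F is None where it does not apply.
    """
    data = [x for g in groups for x in g]
    n, t = len(data), len(groups)
    cf = sum(data) ** 2 / n                      # correction factor
    ss_tot = sum(x * x for x in data) - cf       # step 1: total SS
    ss_trt = sum(sum(g) ** 2 / len(g) for g in groups) - cf  # step 2
    ss_err = ss_tot - ss_trt                     # error SS by subtraction
    df_trt, df_tot = t - 1, n - 1
    df_err = df_tot - df_trt                     # error DF by subtraction
    ms_trt, ms_err = ss_trt / df_trt, ss_err / df_err
    return {
        "Treatment": (df_trt, ss_trt, ms_trt, ms_trt / ms_err),
        "Error": (df_err, ss_err, ms_err, None),
        "Total": (df_tot, ss_tot, ss_tot / df_tot, None),
    }

# Made-up numbers chosen for easy arithmetic (not from the lecture).
table = one_way_anova([[1, 2, 3], [2, 3, 4], [6, 7, 8]])
for source, (df, ss, ms, f) in table.items():
    f_txt = "" if f is None else f"{f:.2f}"
    print(f"{source:<10}{df:>4}{ss:>8.2f}{ms:>8.2f}{f_txt:>8}")
```

For these toy values the treatment SS is 42 on 2 DF and the error SS is 6 on 6 DF, giving F = 21.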
One way ANOVA’s limitations


This technique is only applicable when there
is one treatment used.
Note that the one treatment can be at 3, 4,…
many levels. Thus fertiliser trials with 10
concentrations of fertiliser could be analysed
this way, but a trial of BOTH fertiliser and
insecticide could not.
Linear models..





Although rather worrying-looking, these equations formally define the ANOVA model being used. (By understanding these equations you can readily derive all of ANOVA from scratch.)
The formal model underlying 1-Way ANOVA with Treatment A and r replicates:
X_ir = μ + A_i + Err_ir
X_ir is the rth replicate of Treatment A applied at level i.
A_i is the effect of treatment level i (= difference between μ and the mean of all data in treatment level i).
Err_ir is the unexplained error in observation X_ir.
Note that Σ A_i = Σ Err_ir = 0
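To see the model at work, each observation can be split into μ, a treatment effect and an error term. The sketch below does this in plain Python for a made-up balanced data set (3 treatments, 3 replicates; the numbers are illustrative, not from the lecture) and checks that the effects and the errors each sum to zero.

```python
groups = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]   # made-up values, 3 replicates each

data = [x for g in groups for x in g]
mu = sum(data) / len(data)                             # grand mean
effects = [sum(g) / len(g) - mu for g in groups]       # A_i for each treatment
errors = [[x - (mu + a) for x in g]                    # Err_ir for each value
          for g, a in zip(groups, effects)]

sum_effects = sum(effects)
sum_errors = sum(e for row in errors for e in row)
ss_err = sum(e * e for row in errors for e in row)     # squared errors
ss_trt = sum(len(g) * a * a for g, a in zip(groups, effects))

print("mu =", mu)
print("A_i =", effects)
print("sum A_i =", sum_effects, " sum Err =", sum_errors)
print("SStrt =", ss_trt, " SSerr =", ss_err)
```

The squared effects (weighted by replication) and the squared errors reproduce SStrt and SSerr, which is the sense in which ANOVA can be derived from the model.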
Basic model: Data are deviations from the global mean:
X_ir = μ + Err_ir
Sum of vertical deviations squared = SStot
[Figure: data for Trt 1 and Trt 2 shown as vertical deviations from the global mean μ]
1 way model: Data are deviations from treatment means:
X_ir = μ + A_i + Err_ir
Sum of vertical deviations squared = SSerr
[Figure: data for Trt 1 and Trt 2 shown as vertical deviations from the treatment means, which are offset from μ by A1 and A2]
No model: X_ir just is!
H0 model: X_ir = μ + Err_ir
1 way anova model: X_ir = μ + A_i + Err_ir
Two-way ANOVA


Allows two different treatments to be examined simultaneously.
In its simplest form it is all but identical to 1 way, except that you calculate 2 different treatment sums of squares:
Calculate total Sum of Squares:
SStot = Σ_i (x_i^2) - CF
Calculate Sum of Squares for treatment A:
SSA = Σ_A (X_A.)^2 / r - CF
Calculate Sum of Squares for treatment B:
SSB = Σ_B (X_B.)^2 / r - CF
Here X_A. is the total of all data at one level of A, and r is the number of observations making up that total (likewise for B).
2 Way ANOVA table

Source        DF               SS               MS              F
Treatment A   NA-1             SSA              SSA / (NA-1)    MSA / MSerr
Treatment B   NB-1             SSB              SSB / (NB-1)    MSB / MSerr
Error         By subtraction   By subtraction   SSerr / DFerr
Total         N-1              SStot            Variance
The 2 way Linear model

The formal model underlying 2-Way ANOVA, with 2 treatments A and B:
X_ikr = μ + A_i + B_k + Err_ikr
X_ikr is the rth replicate of Treatment A level i and Treatment B level k.
A_i is the effect of the ith level of treatment A (= difference between μ and the mean of all data at this level of A).
B_k is the effect of the kth level of treatment B (= difference between μ and the mean of all data at this level of B).
Err_ikr is the unexplained error in observation X_ikr.
Note that Σ A_i = Σ B_k = Σ Err_ikr = 0
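As a rough illustration, the effects in this model can be estimated directly from level means. The sketch below uses the lamb data from the worked example that follows; the names `cells`, `level_effect`, `a_eff` and `b_eff` are illustrative choices, and the zero-sum checks rely on the design being balanced.

```python
cells = {
    ("A1", "B1"): [8.53, 20.53, 12.53, 14.00, 10.80],
    ("A1", "B2"): [17.53, 21.07, 20.80, 17.33, 20.07],
    ("A2", "B1"): [39.14, 26.20, 31.33, 45.80, 40.20],
    ("A2", "B2"): [32.00, 23.80, 28.87, 25.06, 29.33],
}

def mean(values):
    return sum(values) / len(values)

mu = mean([x for v in cells.values() for x in v])      # grand mean

def level_effect(position, level):
    """Mean of all data at one level of a factor, minus the grand mean."""
    vals = [x for key, v in cells.items() if key[position] == level for x in v]
    return mean(vals) - mu

a_eff = {lvl: level_effect(0, lvl) for lvl in ("A1", "A2")}
b_eff = {lvl: level_effect(1, lvl) for lvl in ("B1", "B2")}

# Residuals from the additive model X_ikr = mu + A_i + B_k + Err_ikr
residuals = [x - (mu + a_eff[a] + b_eff[b])
             for (a, b), v in cells.items() for x in v]
ss_resid = sum(e * e for e in residuals)

print(f"mu = {mu:.3f}")
print("A effects:", {k: round(v, 3) for k, v in a_eff.items()})
print("B effects:", {k: round(v, 3) for k, v in b_eff.items()})
print(f"sum of residuals = {sum(residuals):.6f}, residual SS = {ss_resid:.2f}")
```

The residual sum of squares from this additive fit is the error SS of the two-way table without an interaction term.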
To take a worked example (Steel & Torrie p. 343).
Effect of 2 treatments on blood phospholipids in lambs. One was a handling treatment, the other the time of day.

           A1B1     A1B2     A2B1     A2B2
           8.53    17.53    39.14    32.00
          20.53    21.07    26.20    23.80
          12.53    20.80    31.33    28.87
          14.00    17.33    45.80    25.06
          10.80    20.07    40.20    29.33
totals:   66.39    96.80   182.67   139.06
2 Way ANOVA on these data:
Start by a preliminary eyeballing of the data: They are
continuous, plausibly normally distributed. There are 2
handling treatments and 2 time treatments, which are
combined in a factorial design so that each of the 4
combinations is replicated 5 times.
Get the basics:
n = 20
Σx = 484.92
Σx^2 = 13676.7
CF = 484.92^2 / 20 = 11757.37
SStot = 13676.7 - CF = 1919.33
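These basics are quick to reproduce. A minimal sketch in plain Python, with the 20 lamb values typed in group by group (variable names are illustrative):

```python
lamb = [
    8.53, 20.53, 12.53, 14.00, 10.80,    # A1B1
    17.53, 21.07, 20.80, 17.33, 20.07,   # A1B2
    39.14, 26.20, 31.33, 45.80, 40.20,   # A2B1
    32.00, 23.80, 28.87, 25.06, 29.33,   # A2B2
]

n = len(lamb)
sum_x = sum(lamb)
sum_x2 = sum(x * x for x in lamb)
cf = sum_x ** 2 / n          # correction factor
ss_tot = sum_x2 - cf         # total sum of squares

print(f"n = {n}, sum x = {sum_x:.2f}, sum x^2 = {sum_x2:.1f}")
print(f"CF = {cf:.2f}, SStot = {ss_tot:.2f}")
```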
Now get totals for treatments A and B:

          B1       B2        Σ
A1      66.39    96.80   163.19
A2     182.67   139.06   321.73
Σ      249.06   235.86   484.92
Hence the sums of squares for A and B can be calculated:
SSA = 163.19^2/10 + 321.73^2/10 - CF = 1256.75
SSB = 249.06^2/10 + 235.86^2/10 - CF = 8.712
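The same marginal-total arithmetic can be written once and reused for either factor. This is a sketch only: `factor_ss` is a hypothetical helper, and it assumes a balanced design so that every marginal total is built from the same number of observations (10 here).

```python
# Cell totals from the lamb data, 5 replicates per cell
totals = {("A1", "B1"): 66.39, ("A1", "B2"): 96.80,
          ("A2", "B1"): 182.67, ("A2", "B2"): 139.06}
reps = 5
n = reps * len(totals)
cf = sum(totals.values()) ** 2 / n

def factor_ss(position):
    """Sum of squares for one factor (0 = A, 1 = B) from its marginal totals."""
    margins = {}
    for key, total in totals.items():
        margins[key[position]] = margins.get(key[position], 0.0) + total
    per_level = n // len(margins)     # observations behind each marginal total
    return sum(t * t / per_level for t in margins.values()) - cf

ss_a = factor_ss(0)
ss_b = factor_ss(1)
print(f"SSA = {ss_a:.2f}, SSB = {ss_b:.3f}")
```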
A alone
Source   Df        SS        MS          F
A         1   1256.75   1256.75    34.14**
error    18    662.58     36.81
total    19   1919.33

B alone
Source   Df        SS        MS          F
B         1      8.71      8.71    0.08 NS
error    18   1910.62    106.15
total    19   1919.33
Pooled (the correct format)
Source   Df        SS        MS          F
A         1   1256.75   1256.75    32.67**
B         1      8.71      8.71    0.23 NS
error    17    653.87     38.46
total    19   1919.33
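The pooled table follows from the sums of squares by subtraction and division alone. A small sketch, taking the rounded SS values from the slides as inputs (so the results match the table only to rounding); variable names are illustrative.

```python
ss_tot, ss_a, ss_b = 1919.33, 1256.75, 8.71   # rounded values from the slides
n, levels_a, levels_b = 20, 2, 2

df_a, df_b, df_tot = levels_a - 1, levels_b - 1, n - 1
df_err = df_tot - df_a - df_b        # error DF by subtraction
ss_err = ss_tot - ss_a - ss_b        # error SS by subtraction
ms_err = ss_err / df_err
f_a = (ss_a / df_a) / ms_err
f_b = (ss_b / df_b) / ms_err

print(f"error: DF = {df_err}, SS = {ss_err:.2f}, MS = {ms_err:.2f}")
print(f"F(A) = {f_a:.2f}, F(B) = {f_b:.2f}")
```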
Note that we have reduced the error sum of squares and error DF by incorporating 2 treatments into one table. This is not just good practice but technically required: by including only one treatment in the table you are implicitly calling the effects of the other treatment random noise, which is incorrect.
ANOVA tables can have many different treatments
included. The skill in ANOVA is not working out
the sums of squares, it is the interpretation of
ANOVA tables.
The clues to look for are always in the DF column.
A treatment with N levels has N-1 DF - this always
applies and allows you to infer the model a
researcher was using to analyse data.
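Reading a design off the DF column is mechanical enough to sketch in code. The helper below is hypothetical and deliberately simple: it assumes the table contains only main-effect rows plus "error" and "total" (an interaction row would need different handling, since its DF is a product).

```python
def describe_design(df_column):
    """Infer levels per treatment and total N from an ANOVA table's DF column.

    df_column maps source name -> DF and must include "total";
    the "error" row is ignored.
    """
    levels = {src: df + 1 for src, df in df_column.items()
              if src not in ("error", "total")}
    return levels, df_column["total"] + 1

levels, n_obs = describe_design({"A": 1, "B": 1, "error": 17, "total": 19})
print(levels, n_obs)
```

Applied to the pooled lamb table, this recovers 2 levels for each treatment and 20 observations.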
Your turn! These data come from a factorial experiment with 2 treatments applied at 3 levels each, with 2 replicates of each treatment combination.
Hence the design contains 3 (A)*3 (B)*2(reps) = 18 data points.
They are specially contrived to make the calculations easy for ANOVA.
Remember the sequence:
Get: n, Σx, Σx^2
Calculate CF then SStot
Get the totals for each
treatment: A1, A2, A3, B1,
B2 and B3 hence get SSA
and SSB
A   B   Data
1   1   18
1   1   22
1   2   25
1   2   35
1   3   47
1   3   53
2   1   29
2   1   31
2   2   38
2   2   42
2   3   45
2   3   51
3   1   38
3   1   42
3   2   46
3   2   44
3   3   35
3   3   45
These model data:





N = 18
Σx = 686.00
Σx^2 = 27822.00
CF = 26144.22
SStot = 27822.00 - 26144.22 = 1677.78
Totals for each treatment:

        B1    B2    B3     Σ
A1      40    60   100   200
A2      60    80    96   236
A3      80    90    80   250
Σ      180   230   276   686
Sums of squares:
SSA = 200^2/6 + 236^2/6 + 250^2/6 - CF = 221.78
SSB = 180^2/6 + 230^2/6 + 276^2/6 - CF = 768.44
Source   Df        SS       MS         F
A         2    221.78   110.89    2.1 NS
B         2    768.44   384.22    7.26**
error    13    687.56    52.89
Σ        17   1677.78
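If you want to check your hand calculation, the whole exercise fits in a short script. This is a sketch in plain Python with illustrative names (`cells`, `factor_ss`); it follows the same sequence as the slides and assumes the balanced 3*3*2 layout of the exercise.

```python
# (A level, B level) -> the two replicate values from the exercise
cells = {
    (1, 1): [18, 22], (1, 2): [25, 35], (1, 3): [47, 53],
    (2, 1): [29, 31], (2, 2): [38, 42], (2, 3): [45, 51],
    (3, 1): [38, 42], (3, 2): [46, 44], (3, 3): [35, 45],
}

data = [x for v in cells.values() for x in v]
n = len(data)
cf = sum(data) ** 2 / n
ss_tot = sum(x * x for x in data) - cf

def factor_ss(position):
    """Sum of squares for one factor (0 = A, 1 = B) from its level totals."""
    totals = {}
    for key, values in cells.items():
        totals[key[position]] = totals.get(key[position], 0) + sum(values)
    per_level = n // len(totals)      # 6 observations per level here
    return sum(t * t / per_level for t in totals.values()) - cf

ss_a, ss_b = factor_ss(0), factor_ss(1)
df_a = df_b = 2
df_err = (n - 1) - df_a - df_b
ss_err = ss_tot - ss_a - ss_b
ms_err = ss_err / df_err
f_a = (ss_a / df_a) / ms_err
f_b = (ss_b / df_b) / ms_err

print(f"SStot = {ss_tot:.2f}, SSA = {ss_a:.2f}, SSB = {ss_b:.2f}")
print(f"SSerr = {ss_err:.2f} on {df_err} DF, F(A) = {f_a:.2f}, F(B) = {f_b:.2f}")
```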
Interaction terms



We now meet a unique, powerful feature of
ANOVA. It can examine data for interactions
between treatments - synergism or
antagonism.
No other test allows this, while in ANOVA it is a
standard feature of any 2 way table.
Note that this interaction analysis is only valid
if the design is perfectly balanced. Unequal
replication or missing data points make this
invalid (unlike 1 way, which is robust to
imbalance).
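Since the interaction analysis depends on balance, it is worth checking replication before starting. A minimal sketch, assuming the data are held as a dict from treatment combination to a list of replicates (the function name and the placeholder values are made up for illustration). Note that it only compares the cells it is given, so a combination that is missing altogether would have to be caught separately.

```python
def is_balanced(cells):
    """True if every treatment combination has the same, non-zero replication."""
    counts = {len(values) for values in cells.values()}
    return len(counts) == 1 and 0 not in counts

# Placeholder values, purely to exercise the check
balanced = {("A1", "B1"): [1.0, 2.0], ("A1", "B2"): [3.0, 4.0],
            ("A2", "B1"): [5.0, 6.0], ("A2", "B2"): [7.0, 8.0]}
unbalanced = dict(balanced)
unbalanced[("A2", "B2")] = [7.0]          # one observation lost

print(is_balanced(balanced), is_balanced(unbalanced))
```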
Synergism and antagonism
Some treatments intensify each other’s effects: The classic examples come from pharmacology. Alcohol alone is lethal at the 20-40 unit range. Barbiturates are lethal. Together they are a vastly more lethal combination, as the 2 drugs synergise. (In fact most sedatives and depressants show similar dangerous synergism.)
In ecology, SO2 + NO2 is more damaging than the additive effects of each gas alone.
Antagonism is the opposite - 2 treatments nullifying each other.
Drought antagonises effects of air pollution on plants, as drought leads to closed stomata excluding the noxious gas.
[Figure: three panels (No interaction, Antagonism, Synergism), each plotting Response against Treatment A at levels 1 to 3, with separate lines for Treatment B at levels 1 and 2]
How to do this?




Easy! We work out a sum of squares caused
by ALL treatments at ALL levels. Thus for a 3*3
design there are really 9 treatments, etc. Call
this SStrt
Now we can partition this Sum of squares:
SStrt = SSA + SSB + SSInteraction
We know SSA, we know SSB, so we get
SSinteraction by subtraction.
To get SStrt we just add up all data in each
treatment, square this total, divide by replicates,
add up and remove CF.
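The partition can be sketched as a single function. The version below is illustrative only (the name `interaction_ss` and the dict layout are assumptions) and is valid just for balanced designs. It is applied to the 3*3 "Your turn" data, treating the 9 treatment combinations as 9 treatments.

```python
def interaction_ss(cells):
    """Partition SStrt into SSA, SSB and SSinteraction for a balanced design.

    cells maps (A level, B level) -> list of replicate values.
    Returns (SStrt, SSA, SSB, SSinteraction).
    """
    data = [x for v in cells.values() for x in v]
    n = len(data)
    cf = sum(data) ** 2 / n

    def ss_from_totals(totals, per_total):
        # square each total, divide by the number of values in it, remove CF
        return sum(t * t / per_total for t in totals) - cf

    reps = n // len(cells)
    ss_trt = ss_from_totals([sum(v) for v in cells.values()], reps)

    def margin_ss(position):
        totals = {}
        for key, v in cells.items():
            totals[key[position]] = totals.get(key[position], 0.0) + sum(v)
        return ss_from_totals(totals.values(), n // len(totals))

    ss_a, ss_b = margin_ss(0), margin_ss(1)
    return ss_trt, ss_a, ss_b, ss_trt - ss_a - ss_b

cells = {
    (1, 1): [18, 22], (1, 2): [25, 35], (1, 3): [47, 53],
    (2, 1): [29, 31], (2, 2): [38, 42], (2, 3): [45, 51],
    (3, 1): [38, 42], (3, 2): [46, 44], (3, 3): [35, 45],
}
ss_trt, ss_a, ss_b, ss_ab = interaction_ss(cells)
print(f"SStrt = {ss_trt:.2f}, SSA = {ss_a:.2f}, "
      f"SSB = {ss_b:.2f}, SSinteraction = {ss_ab:.2f}")
```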
For the lamb blood data:
We have 4 separate treatments: A1B1, A1B2, A2B1, A2B2
The data within these 4 groups add to: 66.39, 96.80, 182.67, 139.06.
There are 5 replicates
SStrt = 66.39^2/5 + 96.80^2/5 + 182.67^2/5 + 139.06^2/5 - CF = 1539.41
2 Way anova table with interaction

Source     Df        SS        MS          F
All trts    3   1539.41
A           1   1256.75   1256.75    52.93**
B           1      8.71      8.71    0.37 NS
A*B         1    273.95    273.95    11.54**
error      16    379.92     23.75
Σ          19   1919.33
Interpreting the interaction term
The hardest part of 2 way anova is trying to explain what a significant interaction term means, in terms that make sense to most people! Formally it is easy; you are testing H0: the MS for the interaction term comes from the same population as the MS for error.
In English let’s try “It means that you can’t reliably predict the effect of Treatment A at level m with B at level n, knowing only the effect of Am and Bn on their own.”
Treatment A - big effect (A2 > A1)
Treatment B - mean (B1) is very close to mean (B2) so no effect
Interaction: When A = 1, B1 < B2 but when A = 2, B1 > B2
[Figure: bar chart, vertical axis "Mean data" (0 to 200), with bars for A1B1, A1B2, A2B1 and A2B2]
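The reversal is easy to see from the cell means. A small sketch using the lamb cell totals from earlier (5 replicates each); the names `means` and `b_effect` are illustrative.

```python
totals = {("A1", "B1"): 66.39, ("A1", "B2"): 96.80,
          ("A2", "B1"): 182.67, ("A2", "B2"): 139.06}
reps = 5
means = {key: total / reps for key, total in totals.items()}

# Effect of B (B2 minus B1) worked out separately at each level of A
b_effect = {a: means[(a, "B2")] - means[(a, "B1")] for a in ("A1", "A2")}
# Averaging these gives the overall (main) effect of B
b_main = sum(b_effect.values()) / len(b_effect)

for a, diff in b_effect.items():
    print(f"At {a}: B2 - B1 = {diff:+.3f}")
print(f"Averaged over A: B2 - B1 = {b_main:+.3f}")
```

The effect of B changes sign between A1 and A2, so the two differences largely cancel in the average. That is why B shows no main effect while the interaction term is significant.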