Statistics review
Basic concepts:
• Variability measures
• Distributions
• Hypotheses
• Types of error
Common analyses:
• T-tests
• One-way ANOVA
• Randomized block ANOVA
• Two-way ANOVA
The t-test
Asks: do two samples come from different populations?
[Figure: data for samples A and B, with the two possible answers: YES, or NO (Ho)]
The t-test
Depends on whether the difference between samples is much greater than the difference within each sample.
[Figure: samples A and B in two cases: Between >> within… and Between < within…]
The t-test
$$t = \frac{\text{Difference between means}}{\text{Standard error within each sample}}, \qquad \text{standard error} = \sqrt{\frac{s^2}{n_1} + \frac{s^2}{n_2}}$$
The t-test
How many degrees of freedom?
$$df = (n_1 - 1) + (n_2 - 1)$$
Why does this seem familiar? Compare with the standard error term:
$$\sqrt{\frac{s^2}{n_1} + \frac{s^2}{n_2}}$$
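A minimal sketch of the t-statistic and its degrees of freedom, using only the Python standard library. It assumes that s² is the pooled variance of the two samples (the slide does not say so explicitly). The function name `t_statistic` and the sample values are made up for illustration.

```python
from math import sqrt
from statistics import mean

def t_statistic(a, b):
    """Return (t, df) for two samples, assuming a pooled variance s^2."""
    n1, n2 = len(a), len(b)
    # Sum of squared differences from each sample's own mean
    ss1 = sum((x - mean(a)) ** 2 for x in a)
    ss2 = sum((x - mean(b)) ** 2 for x in b)
    df = (n1 - 1) + (n2 - 1)
    s2 = (ss1 + ss2) / df             # pooled variance (assumption)
    se = sqrt(s2 / n1 + s2 / n2)      # standard error of the difference
    return (mean(a) - mean(b)) / se, df

# Hypothetical data, not from the slides
sample_a = [5.0, 6.0, 7.0]
sample_b = [2.0, 3.0, 4.0]
t, df = t_statistic(sample_a, sample_b)
print(f"t = {t:.2f}, df = {df}")
```

With equal sample sizes the pooled form gives the same t as using each sample's own variance, so the assumption mainly matters when n differs between samples.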
T-tables
Critical values of t, by degrees of freedom (v) and alpha:

v          0.10     0.05     0.025
1          3.078    6.314    12.706
2          1.886    2.920    4.303
3          1.638    2.353    3.182
4          1.533    2.132    2.776
infinity   1.282    1.645    1.960

Careful! This table is built for one-tailed tests. Only common stats table where to do a two-tailed test (A …
Two samples, each n=3, with t-statistic of 2.50: significantly different?
No! Here df = (3-1) + (3-1) = 4. For a two-tailed test at alpha = 0.05, read the 0.025 column: the critical value is 2.776, and 2.50 < 2.776.
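The lookup above can be sketched in Python. The dictionary holds only the values printed in the table. The function name `is_significant` and its `two_tailed` flag are illustrative assumptions, not part of the slides.

```python
# One-tailed critical values of t, copied from the table above.
# Keys are degrees of freedom; float("inf") stands for the "infinity" row.
T_TABLE = {
    1: {0.10: 3.078, 0.05: 6.314, 0.025: 12.706},
    2: {0.10: 1.886, 0.05: 2.920, 0.025: 4.303},
    3: {0.10: 1.638, 0.05: 2.353, 0.025: 3.182},
    4: {0.10: 1.533, 0.05: 2.132, 0.025: 2.776},
    float("inf"): {0.10: 1.282, 0.05: 1.645, 0.025: 1.960},
}

def is_significant(t, df, alpha=0.05, two_tailed=True):
    """Compare |t| with the tabulated critical value."""
    # A two-tailed test splits alpha between both tails, so look up alpha / 2
    column = alpha / 2 if two_tailed else alpha
    critical = T_TABLE[df][column]
    return abs(t) > critical

df = (3 - 1) + (3 - 1)
print(is_significant(2.50, df))                    # two-tailed test
print(is_significant(2.50, df, two_tailed=False))  # one-tailed test
```

The same t of 2.50 passes the one-tailed threshold (2.132) but not the two-tailed one (2.776), which is why the warning about the table matters.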
If you have two samples with similar n and S.E., why do you know instantly that they are not significantly different if their error bars overlap?
• the difference in means < 2 × S.E., i.e. t-statistic < 2
• and, for any df, t must be > 1.96 to be significant (two-tailed, alpha = 0.05: the infinity row of the 0.025 column)!
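A small numerical check of the error-bar argument, assuming the bars show ± 1 S.E. and both samples have the same S.E. (the slide only says "similar"). Under those assumptions the standard error of the difference is √2 × S.E., so overlapping bars give t < √2 ≈ 1.41, which is even tighter than the bound of 2 quoted above. The function names and numbers are hypothetical.

```python
from math import sqrt

def bars_overlap(mean_a, mean_b, se):
    """True if the +/- 1 S.E. error bars of the two means overlap."""
    return abs(mean_a - mean_b) < 2 * se

def t_from_se(mean_a, mean_b, se):
    """t-statistic when both samples share the same standard error."""
    # The standard error of the difference combines both samples' S.E.
    return abs(mean_a - mean_b) / sqrt(se ** 2 + se ** 2)

# Hypothetical means and S.E.; the bars only just overlap
mean_a, mean_b, se = 10.0, 11.9, 1.0
print(bars_overlap(mean_a, mean_b, se))
print(t_from_se(mean_a, mean_b, se) < 1.96)
```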
One-way ANOVA
General form of the t-test; can have more than 2 samples.
Ho: All samples the same…
Ha: At least one sample different
[Figure: example data for samples A, B and C under Ho and under Ha]
One-way ANOVA
Just like the t-test, compares differences between samples to differences within samples.
T-test statistic (t) = Difference between means / Standard error within sample
ANOVA statistic (F) = MS between groups / MS within group
Mean squares:
$$MS = \frac{\text{Sum of squares}}{df}$$
Everyone gets a lot of cake (high MS) when:
• Lots of cake (high SS)
• Few forks (low df)
Analogous to variance:
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$
The numerator is the sum of squared differences; the denominator (n - 1) is the df.
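To make the analogy concrete, here is a short sketch that computes a mean square as SS / df for one sample and compares it with the sample variance from Python's `statistics` module. The helper name `mean_square` and the data are made up for illustration.

```python
from math import isclose
from statistics import mean, variance

def mean_square(values):
    """Return (SS, df, MS) for one sample."""
    centre = mean(values)
    ss = sum((x - centre) ** 2 for x in values)  # sum of squared differences
    df = len(values) - 1
    return ss, df, ss / df

# Hypothetical observations
data = [2.0, 4.0, 4.0, 6.0]
ss, df, ms = mean_square(data)
print(ss, df, ms)
# The mean square of a single sample is its variance
print(isclose(ms, variance(data)))
```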
ANOVA tables

Source                       df      SS    MS                F             p
Treatment (between groups)   df(X)   SSX   MSX = SSX/df(X)   F = MSX/MSE   Look up!
Error (within groups)        df(E)   SSE   MSE = SSE/df(E)
Total                        df(T)   SST

SST = SSX + SSE
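The table entries can be computed directly from grouped data. The following is a minimal sketch in plain Python, not a replacement for a statistics package; the function name `one_way_anova` and the data are assumptions for illustration, and the p-value still has to be looked up in an F-table.

```python
from statistics import mean

def one_way_anova(groups):
    """Return the df, SS, MS and F entries of a one-way ANOVA table."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between groups: each group mean vs the grand mean, weighted by group size
    ssx = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within groups: each observation vs its own group mean
    sse = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_x = len(groups) - 1
    df_e = len(all_values) - len(groups)
    msx, mse = ssx / df_x, sse / df_e
    return {
        "df": (df_x, df_e, df_x + df_e),   # treatment, error, total
        "SS": (ssx, sse, ssx + sse),       # SST = SSX + SSE
        "MS": (msx, mse),
        "F": msx / mse,
    }

# Hypothetical data: three groups of three observations
table = one_way_anova([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
print(table)
```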
Do three species of palms differ in growth rate? We have 5 observations per species. Complete the table!

Source                       df       SS    MS   F   p
Treatment (between groups)            69
Error (within groups)        k(n-1)
Total                                 104

Hint: For the total df, remember that we calculate total SS as if there are no groups (total variance)…
Note: treatment df is always k-1.
Is it significant? At alpha = 0.05, the critical value is F(2,12) = 3.89.

Completed table:

Source                       df   SS    MS     F      p
Treatment (between groups)   2    69    34.5   11.8   ?
Error (within groups)        12   35    2.92
Total                        14   104

Since F = 11.8 is greater than 3.89, the difference is significant at alpha = 0.05 (p < 0.05).
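The arithmetic behind the completed table can be checked step by step. This sketch uses only the numbers given on the slide (k = 3 species, n = 5 observations each, treatment SS = 69, total SS = 104, critical F = 3.89); the variable names are illustrative.

```python
k, n = 3, 5                              # groups, observations per group
ss_treatment, ss_total = 69.0, 104.0     # given on the slide

df_treatment = k - 1
df_error = k * (n - 1)
df_total = k * n - 1                     # total SS ignores the groups
ss_error = ss_total - ss_treatment       # SST = SSX + SSE
ms_treatment = ss_treatment / df_treatment
ms_error = ss_error / df_error
f_stat = ms_treatment / ms_error
f_critical = 3.89                        # F(2,12) at alpha = 0.05, from the slide

print(df_treatment, df_error, df_total)
print(ss_error, ms_treatment, round(ms_error, 2), round(f_stat, 1))
print(f_stat > f_critical)
```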
Randomized block
[Figure: three blocks: Good patch (BLOCK A), Medium patch (BLOCK B), Poor patch (BLOCK C)]
Pro: Can remove between-block SS from error SS… may increase power of test
Con: Blocks use up error degrees of freedom
[Figure: total SS split into Treatment and Error, compared with a split into Treatment, Block and Error]
Do the benefits outweigh the costs? Does MS error go down?
$$F = \frac{\text{Treatment SS} / \text{treatment df}}{\text{Error SS} / \text{error df}}$$
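The trade-off can be illustrated with a small sketch that analyses the same data with and without blocks. It assumes one observation per treatment in each block; the function name `block_anova` and the data are hypothetical and chosen so that blocks differ strongly.

```python
from statistics import mean

def block_anova(data):
    """data[b][t] is the single observation for treatment t in block b."""
    n_blocks, n_treat = len(data), len(data[0])
    grand = mean(x for row in data for x in row)
    treat_means = [mean(row[t] for row in data) for t in range(n_treat)]
    block_means = [mean(row) for row in data]
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_treat = n_blocks * sum((m - grand) ** 2 for m in treat_means)
    ss_block = n_treat * sum((m - grand) ** 2 for m in block_means)
    ms_treat = ss_treat / (n_treat - 1)
    # Ignoring blocks: block variation stays in the error term
    sse_plain = ss_total - ss_treat
    df_plain = n_treat * (n_blocks - 1)
    # With blocks: block SS leaves the error term, but so do block df
    sse_block = sse_plain - ss_block
    df_block = (n_treat - 1) * (n_blocks - 1)
    return {
        "one-way": (df_plain, sse_plain / df_plain, ms_treat / (sse_plain / df_plain)),
        "blocked": (df_block, sse_block / df_block, ms_treat / (sse_block / df_block)),
    }

# Hypothetical yields: rows are blocks (good, medium, poor), columns are treatments
result = block_anova([[10.0, 13.0, 14.0], [7.0, 9.0, 12.0], [5.0, 6.0, 8.0]])
for name, (df_error, ms_error, f_stat) in result.items():
    print(name, df_error, round(ms_error, 2), round(f_stat, 2))
```

Here the error df drop from 6 to 4, but MS error falls much further, so F rises. With blocks that barely differ, the lost df could outweigh the gain.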
Two-way ANOVA
Just like one-way ANOVA, except subdivides the treatment SS into:
• Treatment 1
• Treatment 2
• Interaction 1&2
Two-way ANOVA
Suppose we wanted to know if moss grows thicker on the north or south side of trees, and we look at 10 aspen and 10 fir trees:
• Aspect (2 levels, so 1 df)
• Tree species (2 levels, so 1 df)
• Aspect x species interaction (1 df x 1 df = 1 df)
• Error? k(n-1) = 4 (10-1) = 36

Source                  df   SS            MS            F
Aspect                  1    SS(Aspect)    MS(Aspect)    MS(As)/MSE
Species                 1    SS(Species)   MS(Species)   MS(Sp)/MSE
Aspect x Species        1    SS(Int)       MS(Int)       MS(Int)/MSE
Error (within groups)   36   SSE           MSE
Total                   39   SST
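The df bookkeeping for this design can be written as a small helper. It assumes equal numbers of observations in every combination of levels, and, as on the slide, treats the four aspect-by-species combinations as k = 4 independent groups. The function name `two_way_df` is illustrative.

```python
def two_way_df(levels_a, levels_b, n_per_cell):
    """Degrees of freedom for a two-way ANOVA with equal cell sizes."""
    k = levels_a * levels_b              # number of groups (cells)
    df_a = levels_a - 1
    df_b = levels_b - 1
    df_interaction = df_a * df_b
    df_error = k * (n_per_cell - 1)
    df_total = k * n_per_cell - 1
    return df_a, df_b, df_interaction, df_error, df_total

# Moss example: 2 aspects x 2 species, 10 measurements per combination
moss = two_way_df(2, 2, 10)
print(moss)
# The component df add up to the total df
print(sum(moss[:4]) == moss[4])
```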
Interactions
Combination of treatments gives a non-additive effect.
[Figure: additive effect, with Alder and Fir plotted against North and South; labelled values 5, 3, 2]
Interaction: anything not parallel!
[Figure: examples of non-parallel lines plotted against North and South]
Careful! If you log-transformed your variables, the absence of interaction is a multiplicative effect: log(a) + log(b) = log(ab)
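A quick numerical illustration of the log-transform warning. The sketch builds a 2 x 2 table of cell means from purely multiplicative effects, then measures the interaction as the difference of differences, on the raw scale and on the log scale. The base value, the effect sizes and the function name `interaction_contrast` are all hypothetical.

```python
from math import log, isclose

def interaction_contrast(cells):
    """(c11 - c10) - (c01 - c00): zero when the two effects are additive."""
    return (cells[1][1] - cells[1][0]) - (cells[0][1] - cells[0][0])

# Hypothetical multiplicative effects: factor a doubles, factor b triples
base, a, b = 1.5, 2.0, 3.0
raw = [[base, base * b], [base * a, base * a * b]]
logged = [[log(x) for x in row] for row in raw]

# Raw scale: lines are not parallel, so there is an apparent interaction
print(interaction_contrast(raw))
# Log scale: the same data are additive, so the contrast vanishes
print(isclose(interaction_contrast(logged), 0.0, abs_tol=1e-12))
```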