Descriptive Statistics:
To describe your sample, you need to state:
The mean, and the number of measurements (n) it was based on
The standard deviation: A measure of the variability of the data about
the mean
Other useful information derived from the mean and standard deviation,
such as:
1) The confidence intervals: the range within which 95%, 99%, or 99.9% of
measurements of this sort would be expected to fall
2) The standard error: the range within which we could expect the mean to fall
95%, 99%, or 99.9% of the time if we were to repeat the same type of
measurement again and again on different samples.
Descriptive Statistics II:
Degrees of freedom: $n - 1$
For each observation ($x$), the deviation ($d$) from the mean ($\bar{x}$) is $d = x - \bar{x}$
The sum of squares: $\sum d^2 = \sum (x - \bar{x})^2$
Sample variance: $S^2 = \frac{\sum d^2}{n - 1}$
Standard deviation: $S = \sqrt{\frac{\sum d^2}{n - 1}}$
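To make these formulas concrete, here is a minimal Python sketch (the measurement values are hypothetical; NumPy's `ddof=1` uses the same n − 1 denominator):

```python
import numpy as np

data = np.array([4.2, 5.1, 4.8, 5.5, 4.9])   # hypothetical measurements

n = len(data)
mean = data.sum() / n
d = data - mean                      # deviations from the mean
sum_sq = (d ** 2).sum()              # sum of squares
variance = sum_sq / (n - 1)          # sample variance, n - 1 degrees of freedom
sd = variance ** 0.5                 # standard deviation

assert np.isclose(sd, np.std(data, ddof=1))  # ddof=1 gives the same n - 1 denominator
```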
Descriptive Statistics III:
Standard error (SE) or Standard error of the mean (SEM):
This shows how good our estimate of the mean is; the SE is actually the standard
deviation of the mean. The notation for the SE/SEM is $\sigma_{\bar{x}}$; to calculate it, use
$\sigma_{\bar{x}} = \sigma / \sqrt{n}$, or $SE = S / \sqrt{n}$.
Because of the $\sqrt{n}$ in the denominator, SE equals S only when n = 1; the larger
n is, the smaller SE becomes relative to S.
Many people prefer to cite the SE rather than S, because it makes the range of
variation appear to be smaller!
Another advantage of using SE is that if anyone were to repeat an experiment,
then the mean would be likely to fall within $\bar{x} \pm t \cdot SE$, which is the confidence
interval.
Descriptive Statistics IV:
Confidence interval of a mean: the range within which, in 95%, 99%, or 99.9% (level of
confidence) of cases, the mean would be expected to fall if we repeated an experiment over and
over again. The limiting values are the confidence limits.
CI $= \bar{x} \pm t \times SE$; the t value can be obtained from a t table.
The number of degrees of freedom determines the t value. So, when designing
experiments, we have to find a compromise between the level of confidence we want and
the amount of work involved.
The t values fall off rapidly from 2 to 5 df (3 to 6 replicate observations) but more slowly
thereafter. For a complicated or costly experiment there would be little justification for
using more than, say, 4 or 5 replicates.
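As an illustrative sketch, the same calculation in Python, with SciPy supplying the t value in place of a printed t table (data values are hypothetical):

```python
import numpy as np
from scipy import stats

data = np.array([4.2, 5.1, 4.8, 5.5, 4.9])   # hypothetical measurements
n = len(data)
mean = data.mean()
se = data.std(ddof=1) / np.sqrt(n)           # SE = S / sqrt(n)

t_crit = stats.t.ppf(0.975, df=n - 1)        # two-tailed 95%, n - 1 df
print(f"mean = {mean:.2f}, "
      f"95% CI = {mean - t_crit * se:.2f} to {mean + t_crit * se:.2f}")
```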
Statistical tests – hypothesis testing
The statistical hypothesis is sometimes called the null hypothesis.
The null hypothesis is often the reverse of what the experimenter actually believes;
it is put forward to allow the data to contradict it. (Science proceeds conservatively,
always assuming that something interesting is not happening unless convincing evidence
suggests, for the moment, that it might be.)
Errors in hypothesis testing
Example – Exp. hypothesis: there is a difference between OFF and NF. Null: there is no difference between OFF and NF.

| Decision | Null True (exp. hypothesis wrong): truth is there is no effect | Null False (exp. hypothesis correct): truth is there is a real effect |
|---|---|---|
| Reject Null: decide there are real effects (believe our exp. hypothesis) | Type I ERROR, prob = α ("significance level") | Correct, prob = 1 − β ("power") |
| Accept Null: decide there are no real effects (decide our exp. hypothesis is wrong) | Correct, prob = 1 − α | Type II ERROR, prob = β |
Statistical Tests – What test to do?
Do the values fall into discrete categories?
Yes → Chi-squared test.
No → Is the number of sample groups < 3 (i.e., two groups)?
Yes → Can the two sets of replicate data be arranged in pairs?
Yes → Paired-samples test. No → Student's t-test.
No → ANOVA. Are the treatments separate?
Yes → One-way ANOVA. No (combinations of treatments) → Two-way ANOVA.
Student’s t-test (I):
Usually a minimum of 2, but ideally at least 3, replicates of each sample or
treatment. You DON'T need the same number of replicates of each treatment.
The t-test compares the actual difference between two means ($\bar{x}_1 - \bar{x}_2$) in
relation to the variation in the data ($SE_1^2 + SE_2^2$):
$t = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{SE_1^2 + SE_2^2}}$
One-tailed t-test: test only that one particular mean is larger (or smaller) than
the other.
For most purposes, use a two-tailed test
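For reference, the same comparison can be run with SciPy rather than Excel; a minimal sketch with hypothetical replicates, using SciPy's default two-tailed, equal-variance test:

```python
import numpy as np
from scipy import stats

group1 = np.array([4.2, 5.1, 4.8, 5.5])   # hypothetical replicates, treatment 1
group2 = np.array([3.1, 3.8, 3.5])        # treatment 2 (unequal n is fine)

# Two-tailed test assuming equal variances (SciPy's default)
t_stat, p_value = stats.ttest_ind(group1, group2)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```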
Student’s t-test (II):
Usually the variances of the two treatments are assumed to be similar.
To test whether this is true, divide the larger variance by the smaller, and
compare this variance ratio with a value from a table of F (variance ratio)
for p = 0.05. If the calculated variance ratio is less than the tabulated F value,
the variances do not differ significantly, and doing a t-test with the assumption of
equal variances is valid.
In Excel: Tools → Data Analysis → t-Test: Two-Sample Assuming Equal
Variances → OK. For 'Output Range', choose the top-left cell of the area
where you want the results of the analysis to be displayed.
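A sketch of this variance-ratio check in Python (hypothetical data; `scipy.stats.f.ppf` plays the role of the printed F table):

```python
import numpy as np
from scipy import stats

group1 = np.array([4.2, 5.1, 4.8, 5.5])   # hypothetical data
group2 = np.array([3.1, 3.8, 3.5])

v1, v2 = np.var(group1, ddof=1), np.var(group2, ddof=1)
ratio = max(v1, v2) / min(v1, v2)          # larger variance over the smaller

# df: numerator comes from the group with the larger variance
df_num = (len(group1) if v1 >= v2 else len(group2)) - 1
df_den = (len(group2) if v1 >= v2 else len(group1)) - 1
f_crit = stats.f.ppf(0.95, df_num, df_den)  # tabulated F for p = 0.05

ok = ratio < f_crit
print(f"ratio = {ratio:.2f}, F crit = {f_crit:.2f}: "
      + ("equal-variance t-test is valid" if ok else "variances differ"))
```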
Student’s t-test (III) – Workflow
1. NULL hypothesis: there is no difference between the means in the data.
2. Question: if the null hypothesis is true, what is the probability (p-value) of
obtaining our given result (a Type I error)? The test compares the size of the
difference between the two means in relation to the amount of inherent
variability in the data (the random error, not related to treatment differences).
3. Calculate the index of our given result: the calculated t value (t, the ratio
between the difference of the two mean values and the variance of the
difference between the two means).
4. Decide the probability level (significance level) below which we would accept
that the hypothesis is wrong; usually 0.05 in biological studies.
5. Calculate: 1) the probability (p-value) of getting the calculated t value (t) by
chance alone; 2) using a t table, get the t value corresponding to the desired
significance level (Tt). In other words, we want to know what value the
calculated t has to exceed (Tt) for the difference between the means to be
significant at the desired level.
6. Test: p < 0.05? Or t > Tt?
Yes → the null hypothesis is wrong: the difference is significant, or, we can be
95% confident that the means differ significantly.
No → the null hypothesis is right: there is no significant difference.
Paired-samples test (I):
Use this test as an alternative to the t-test for cases where data can be paired to
reduce incidental variation, i.e. variation that you expect to be present but that is
irrelevant to the hypothesis you want to test.
In general, more replicates are needed for this test than for a t-test (say, a minimum
of 5 for each treatment), and you will need the same number of replicates for each
treatment.
This test compares the mean difference between the pairs of treatments in relation
to the standard error of this mean difference, testing whether the mean difference is
significantly different from zero. Use a one-tailed test (p = 0.1).
In Excel: Tools → Data Analysis → t-Test: Paired Two Sample for Means →
OK. For 'Output Range', choose the top-left cell of the area where you want the
results of the analysis to be displayed.
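The same paired test in SciPy (the paired measurements are hypothetical); for a one-tailed test, the two-tailed p is halved when the difference is in the predicted direction:

```python
import numpy as np
from scipy import stats

before = np.array([12.1, 9.8, 11.5, 10.2, 13.0])   # hypothetical paired data
after = np.array([13.0, 10.1, 12.2, 10.0, 14.1])

# Tests whether the mean pairwise difference differs from zero
t_stat, p_two = stats.ttest_rel(before, after)
p_one = p_two / 2    # one-tailed, valid if the sign matches the prediction
print(f"t = {t_stat:.2f}, one-tailed p = {p_one:.4f}")
```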
Paired-samples test (II) – Workflow
1. Hypothesis: there is no difference between the means in the data.
2. Question: if this hypothesis is true, what is the probability (p-value) of
obtaining our given result? Compare the mean value of the difference
between the pairs of treatments in relation to the variance of the difference
between the pairs of treatments.
3. Calculate the index of our given result: the calculated t value (Tc, the ratio
between the mean of the differences of the pairs of treatments and the
standard error of this mean).
4. Decide the probability level (significance level) below which we would accept
that the hypothesis is wrong; usually 0.1 for a one-tailed test.
5. Calculate: 1) the probability (p-value) of getting the calculated t value (Tc) by
chance alone; 2) using a t table, get the t value corresponding to the desired
significance level (Tt). In other words, we want to know what value the
calculated t has to exceed (Tt) for the difference between the treatments to be
significant at the desired level.
6. Test: p < 0.1? Or Tc > Tt?
Yes → the hypothesis is wrong: the difference is significant, or, we can be 90%
confident that the means differ significantly.
No → the hypothesis is right: there is no significant difference.
Analysis of variance (ANOVA) or F-test:
Use this test to compare several treatments. With only two treatments, it is
equivalent to a t-test.
One-way ANOVA: the simplest form of ANOVA, used to compare
several separate treatments. One-way ANOVA needs at least 2 replicates of
each treatment. It tells you whether there are differences between the treatments
as a whole, but it can also be used, with caution, like a multiple t-test, to tell
you which of the treatments differ from each other.
Two-way ANOVA: compares combinations of treatments. You can get
useful information even without replication, but you get much more
information if you have 2 (or more) replicates of each combination of
treatments; the test can then tell you whether there is a significant interaction.
One-way ANOVA (I):
Use this test for comparing the means of 3 or more samples/treatments, to
avoid the error inherent in performing multiple t-tests. (In each t-test we
accept a 5% chance of our conclusion being wrong; if we ran many
separate t-tests, say 21 for 7 treatments, we would expect by probability alone
that one test would give us a false result.)
Ideally, for this test we would have the same number of replicates for each
treatment, but this is not essential. Advanced computer programmes can
overcome the problem of unequal replicates by entering 'missing values'.
One-way ANOVA (II):
An important assumption underlies the ANOVA: all treatments have
similar variance.
To test whether this is true, divide the highest variance by the lowest to
obtain a variance ratio (F), and compare this variance ratio with a value
from a table of Fmax for p = 0.05. If the calculated variance ratio is less
than the tabulated value, the variances do not differ significantly. If not,
the data might need to be transformed.
When ANOVA is applied to 2 treatments, $F = t^2$.
Degrees of freedom (df):
df of the between-treatments mean square = number of treatments − 1
df of the residual mean square = number of treatments × (number of replicates − 1)
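A quick numerical check of the F = t² identity for two treatments, using hypothetical data:

```python
import numpy as np
from scipy import stats

a = np.array([4.2, 5.1, 4.8, 5.5])   # hypothetical treatment 1
b = np.array([3.1, 3.8, 3.5, 3.3])   # hypothetical treatment 2

t_stat, _ = stats.ttest_ind(a, b)
f_stat, _ = stats.f_oneway(a, b)
print(f"t^2 = {t_stat ** 2:.4f}, F = {f_stat:.4f}")   # identical for 2 treatments
```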
One-way ANOVA (III):
ANOVA involves partitioning the total variance into (1) variance
associated with the different treatments/samples and (2) random variance,
evidenced by the variability within the treatments. In this test we ask, in
effect: is there a large amount of variance associated with the different
treatments compared with the amount of random variance?
ANOVA tells you only whether there are differences between treatments in the
experiment as a whole; it doesn't give any information on which
treatment differs from which. To solve this problem, we need a multiple
comparison test.
Although Excel can run an ANOVA test, it doesn't have the multiple
comparison tests that usually follow the analysis, so it is recommended to
use StatView instead for ANOVA and the tests that follow it.
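For completeness, a one-way ANOVA in SciPy, using the lecturer-style data from the planned-comparisons example later in these notes:

```python
import numpy as np
from scipy import stats

# Three treatments, 5 replicates each (A1-A3 from the later example)
a1 = np.array([16, 18, 10, 12, 19])
a2 = np.array([4, 6, 8, 10, 2])
a3 = np.array([2, 10, 9, 13, 11])

f_stat, p_value = stats.f_oneway(a1, a2, a3)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
# A significant p says the treatments differ somewhere, but not which
# pairs differ -- that is the job of a multiple comparison test.
```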
One-way ANOVA (IV) – Workflow
1. Hypothesis: there is no difference between the means in the data.
2. Question: if this hypothesis is true, what is the probability (p-value) of
obtaining our given result? Is there a large amount of variance associated
with the different treatments compared with the amount of random
variance?
3. Calculate the index of our given result: the calculated F value (Fc, the ratio
between the amount of variance associated with the different treatments and
the amount of random variance).
4. Decide the probability level (significance level) below which we would accept
that the hypothesis is wrong; usually 0.05 in biological studies.
5. Calculate: 1) the probability (p-value) of getting the calculated F value (Fc) by
chance alone; 2) using an F table, get the F value corresponding to the desired
significance level (Ft). In other words, we want to know what value the
calculated F has to exceed (Ft) for the difference between the treatments to be
significant at the desired level.
6. Test: p < 0.05? Or Fc > Ft?
Yes → the hypothesis is wrong: the difference is significant, or, we can be 95%
confident that the means differ significantly.
No → the hypothesis is right: there is no significant difference.
After ANOVA: Analytical Comparisons
The F ratio we have calculated from ANOVA cannot tell us
which group is different from which group.
The procedure for finding out the differences between groups is
known as making analytical comparisons.
With a non-significant omnibus F we are prepared to assert
that there are no real differences among the means – we can
stop the analysis there.
A significant omnibus F demands further analysis of the data
– which differences between the means are real and which are
not?
– Exception: a factor with only two levels, where the significant F already
tells us that the two means differ.
Multiple comparison tests – Background
After one-way ANOVA, the follow-up tests can be called "planned
comparisons", "post-hoc tests", "multiple comparison tests" or "post
tests". There are several ways to make multiple comparisons:
1. All possible comparisons, including averages of groups. So you might
compare the average of groups A and B with the average of groups C, D
and E, or compare group A to the average of B-F. Scheffe's test does this.
2. All possible pairwise comparisons. Compare the mean of every group
with the mean of every other group. Use Tukey or Newman-Keuls
comparisons.
3. All against a control. If group A is the control, you may only want to
compare A with B, A with C, A with D... but not compare B with C or C
with D. Dunnett's test does this.
4. Only a few comparisons, based on your scientific goals. So you might
want to compare A with B and B with C, and that's it. Bonferroni's test
does this.
Multiple comparison tests – Terminology
Multiple comparison test applies whenever you make several comparisons at
once.
Post test is generally used interchangeably with multiple comparison test.
Post-hoc test is used for situations where you can decide which comparisons you
want to make after looking at the data. You don't need to plan ahead.
Planned comparison tests require that you focus in on a few scientifically
sensible comparisons. You can't decide which comparisons to do after looking at
the data. The choice must be based on the scientific questions you are asking, and
be chosen when you design the experiment.
Multiple comparison tests – Basic Test Types
Planned comparisons: a priori; few in number; theoretically motivated.
Post hoc comparisons: based on looking at the data; exploratory; risky business.
Multiple comparison tests – Familywise alpha
The probability that we will have one or more false alarms =
familywise α (F.A.) = experimentwise α.
It increases as the number of possible comparisons increases; this
is termed α-inflation.
e.g. for 7 levels (21 pairwise comparisons), we have a .66 chance of making at least one F.A.
Post hoc tests were created to deal with α-inflation.
Post hoc tests should be used when you are examining more
than two categories in an ANOVA.
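The .66 figure can be reproduced in a couple of lines (this assumes the 21 tests are independent, which is the usual simplification):

```python
from math import comb

k = 7                                    # number of levels (groups)
m = comb(k, 2)                           # 21 possible pairwise comparisons
alpha = 0.05
familywise = 1 - (1 - alpha) ** m        # assumes the tests are independent
print(f"{m} comparisons -> familywise alpha ~ {familywise:.2f}")   # ~0.66
```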
Multiple comparison tests – Post Hoc Test
– Fisher’s Protected LSD
Basic strategy: only perform pairwise t-tests if the ANOVA
provides a significant F.
Insight: requiring a significant F reduces the number of
opportunities for false alarms.
Problem: opportunities remain, especially with many levels.
Solution: don't use it!
(Use it for 3-group situations only, if you like.)
Multiple comparison tests – Post Hoc Test
– Commonly used post hoc tests
Bonferroni (For pairwise comparison for ANOVA design,
Dunn test is identical to Bonferroni)
Tukey’s Honestly Significant Difference (Tukey’s HSD)
Multiple comparison tests – Post Hoc Test
– Bonferroni
It calculates a new pairwise alpha to keep the familywise alpha value at .05.
e.g., for all possible pairwise comparisons on five means, the new pairwise α would
be 0.05/10 = 0.005 (a code sketch of this correction follows the Cons list below).
Pros: the Bonferroni is probably the most commonly used post hoc test. It is highly
flexible, very simple to compute, and can be used with any type of statistical test, not just
post hoc tests with ANOVA.
Cons: it tends to lack power, because:
– the familywise error calculation depends on the assumption that, for all tests, the null
hypothesis is true. This is unlikely to be the case, especially after a significant
omnibus test;
– all tests are assumed to be orthogonal (i.e., independent or nonoverlapping) when
calculating the familywise error, and this is usually not the case when all pairwise
comparisons are made;
– the test does not take into account whether the findings are consistent with theory and
past research. If consistent with previous findings and theory, an individual test should
be less likely to be a Type I error;
– Type II error rates are too high for individual tests: it overcorrects for Type I error.
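A sketch of the Bonferroni correction in practice (the five groups are hypothetical, randomly generated data; the per-test alpha of 0.05/10 comes from the example above):

```python
from itertools import combinations
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Five hypothetical groups of 6 observations; D and E are shifted upward
groups = {name: rng.normal(loc, 1.0, 6)
          for name, loc in zip("ABCDE", [0, 0, 0, 1, 1])}

pairs = list(combinations(groups, 2))      # 10 pairwise comparisons
alpha_per_test = 0.05 / len(pairs)         # Bonferroni: 0.05 / 10 = 0.005

for g1, g2 in pairs:
    _, p = stats.ttest_ind(groups[g1], groups[g2])
    flag = "significant" if p < alpha_per_test else "n.s."
    print(f"{g1} vs {g2}: p = {p:.4f} ({flag})")
```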
Multiple comparison tests – Post Hoc Test
– Modified Bonferroni Approaches
Several alternatives to the traditional Bonferroni have been
developed, including the Dunn-Bonferroni, the Dunn-Sidak
method, Hochberg's sequential method, and Keppel's modified
Bonferroni, among several others.
These tests have greater power than the Bonferroni while
retaining its general nature.
Multiple comparison tests – Post Hoc Test
– Tukey’s Honestly Significant Difference (HSD) Test
Mechanism: It calculates a new critical value that can be used
to evaluate whether differences between any two pairs of
means are significant. One simply calculates one critical value
and then the difference between all possible pairs of means.
Each difference is then compared to the Tukey critical value.
If the difference is larger than the Tukey value, the
comparison is significant.
Insight: the greatest chance of making a Type I error (F.A.)
arises in comparing the largest mean with the smallest.
If we can protect against an F.A. in this case, all other comparisons are
also protected.
If this comparison is not significant, neither is any other!
Advantage: provides protection of familywise alpha
Multiple comparison tests – Post Hoc Test
– Tukey’s Honestly Significant Difference (HSD) Test
– Doing Tukey HSD tests (equal n)
Suppose we do an experiment with one factor and 5 levels. Here is
the ANOVA table:

| Source         | SS   | df | MS  | F    | p    |
|----------------|------|----|-----|------|------|
| Between        | 2942 | 4  | 736 | 4.13 | <.05 |
| Within (error) | 9801 | 55 | 178 |      |      |

Here are the group means:

| I  | II | III | IV | V  |
|----|----|-----|----|----|
| 63 | 82 | 80  | 77 | 70 |
Multiple comparison tests – Post Hoc Test
– Tukey’s Honestly Significant Difference (HSD) Test
– Doing Tukey HSD tests (equal n) (II)
Here are the differences between the means (ordered from smallest to
largest). Which are significant?

|        | I 63 | V 70 | IV 77 | III 80 | II 82 |
|--------|------|------|-------|--------|-------|
| I 63   | 0    | 7    | 14    | 17     | 19    |
| V 70   |      | 0    | 7     | 10     | 12    |
| IV 77  |      |      | 0     | 3      | 5     |
| III 80 |      |      |       | 0      | 2     |
| II 82  |      |      |       |        | 0     |

We need to find a critical value for the difference between group means…
Multiple comparison tests – Post Hoc Test
– Tukey’s Honestly Significant Difference (HSD) Test
– Doing Tukey HSD tests (equal n) (III)
The studentized range statistic, q:
$q = \dfrac{\bar{x}_{largest} - \bar{x}_{smallest}}{\sqrt{MS_{within}/n}}$ (n = observations per group)
This is a statistic (just like F or t) for which an
expected distribution is known under the null
hypothesis of no differences.
However, we don't compute this for our data. Rather, we use this
formula to figure out a critical value for the numerator:
set alpha to the desired level (.05)
use tables to obtain the critical value of q for the appropriate
degrees of freedom
work backward to get the critical difference between means
Multiple comparison tests – Post Hoc Test
– Tukey’s Honestly Significant Difference (HSD) Test
– Doing Tukey HSD tests (equal n) (IV)
Get the critical q using the correct degrees of freedom:
•
Numerator = number of groups
•
Denominator = d.f. for MSW (use the closest available)
Here, d.f. = 5, 55, so qcrit = 3.98 (Table A11 in Cohen).
Critical difference = $q_{crit}\sqrt{MS_W/n}$ = 3.98 × √(178/12) = 15.34 in this case
(n = 12 observations per group).
Which pairs of means exceed this?
I & II (19) and I & III (17)
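SciPy (version 1.7 or later) exposes the studentized range distribution, so the table lookup and back-calculation can be reproduced directly from the ANOVA-table values above:

```python
import numpy as np
from scipy import stats

k, df_error, ms_within, n = 5, 55, 178, 12   # values from the ANOVA table above

q_crit = stats.studentized_range.ppf(0.95, k, df_error)   # replaces the q table
hsd = q_crit * np.sqrt(ms_within / n)                     # critical difference
print(f"q_crit = {q_crit:.2f}, critical difference = {hsd:.2f}")   # ~3.98, ~15.3
```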
Multiple comparison tests – Post Hoc Test
– Tukey’s Honestly Significant Difference (HSD) Test
– Practical use of Tukey’s HSD
Has greater power than the other tests under most circumstances.
Readily available in computer packages.
Fine if all groups have equal n; do not use it if group sizes are very unequal.
The Tukey-Kramer test is used by SPSS when the group sizes are unequal.
Used only for post hoc testing: stronger tests are available if only a small number of
comparisons are to be made (planned comparisons).
When not all possible comparisons are needed, other tests, such as the Dunnett or a
modified Bonferroni method, should be considered because they may have
power advantages.
Multiple comparison tests – Post Hoc Test
– Other methods
Scheffe: computes a new critical value for an F test conducted when comparing
two groups from the larger ANOVA (i.e., a correction for a standard t-test). The
formula simply modifies the F-critical value by taking into account the number
of groups being compared: (a − 1) × Fcrit, where a is the number of groups. The
new critical value represents the critical value for the maximum possible
familywise error rate; it results in a higher than desired Type II error rate by
imposing a severe correction.
Dunn: identical to the Bonferroni correction.
Dunnett: similar to the Tukey test, but used only if a set of comparisons is
being made to one particular group (e.g. several treatment groups are compared
to one control group).
Games-Howell: used when variances are unequal; it also takes into account
unequal group sizes. It appears to do better than the Tukey HSD if variances are
very unequal (or moderately so in combination with a small sample size), and can
be used if the sample size per cell is very small (e.g., <6).
Multiple Comparison Tests – Post Hoc Test
– What test to do?
Are all possible comparisons needed?
No → Dunnett or modified Bonferroni.
Yes → Are the group sizes equal?
Yes → Tukey HSD. No → Games-Howell or Tukey-Kramer.
Multiple Comparison Tests – Planned comparisons
Post hoc comparisons are fine for exploratory studies.
Ideally, however, we know in advance which differences we expect
to find in our data….
… and these may or may not be simple differences between two
means.
Unlike post hoc tests, these can be done whether or not the ANOVA
is significant.
Multiple Comparison Tests – Planned comparisons (II)
The biggest problem with post hoc comparisons was the
proliferation of possible comparisons.
Planned comparisons put a lid on that. For a single factor with 6
levels, there are at most 5 (n-1) independent comparisons among
means possible.
Independent comparisons are also called orthogonal contrasts.
For a comparison between two treatments, the planned comparison is an
unadjusted t-test.
Multiple Comparison Tests – Planned comparisons
– Rationale that underpins planned contrasts
The rationale that underpins planned comparisons or planned contrasts
differs from that of post hoc tests on three important attributes. Specifically:
First, planned contrasts are undertaken in lieu of the initial
ANOVA. In contrast, post hoc tests are conducted after the
initial ANOVA.
Second, the level of alpha associated with each planned
contrast is 0.05 and does not need to be adjusted, which
optimizes power.
Third, unlike most – but not all – post hoc procedures, planned
contrasts can compare amalgams or combinations of groups
with one another.
Multiple Comparison Tests – Planned comparisons
– an Example
A study looked at the influence of the medium used (lecture, movie)
on the tendency of subjects to change their attitude towards the Bush
Administration (BA). The media used were:
A movie, favorable to BA
A lecture, also favorable
A combination of lecture and movie.
Subjects were assigned at random to groups, each having been given
a preliminary attitude test. After the treatment, each was tested
again, and the change in attitude was the dependent variable.
Multiple Comparison Tests – Planned comparisons
– an Example (II)
Problem: mere repetition of the test may affect a subject’s score, so
a control group without any exposure to lecture or movie was
included.
Problem: Perhaps seeing any movie or hearing any lecture would
cause a change in score.
…so 2 more control groups were introduced:
Experimental Groups: I Movie; II Lecture; III Mov + Lec
Control Groups: IV Nothing; V Neutral Movie; VI Neutral Lecture
Multiple Comparison Tests – Planned comparisons
– an Example (III)
The investigators now had the following specific questions:
[1] Do the experimental groups (as a whole) differ from the control
groups?
[2] Among the experimental groups, is the Movie + Lecture different
from the average effect of either Movie alone or Lecture alone?
[3] Is the Experimental Lecture different from the Experimental
Movie?
[4] Among control groups, does “Nothing” differ from either “Movie”
or “Lecture”?
Each specific question can be expressed as a comparison among
sample means…..
Multiple Comparison Tests – Planned comparisons
– an Example (IV)
[1] Do the experimental groups (as a whole) differ from the control
groups?

| Group              | I Movie | II Lecture | III Mov + Lec | IV Nothing | V Neutral Movie | VI Neutral Lecture |
|--------------------|---------|------------|---------------|------------|-----------------|--------------------|
| Comparison weights | +1      | +1         | +1            | −1         | −1              | −1                 |
Multiple Comparison Tests – Planned comparisons
– an Example (V)
[2] Among the experimental groups, is the Movie + Lecture different
from the average effect of either Movie alone or Lecture alone?

| Group              | I Movie | II Lecture | III Mov + Lec | IV Nothing | V Neutral Movie | VI Neutral Lecture |
|--------------------|---------|------------|---------------|------------|-----------------|--------------------|
| Comparison weights | +1      | +1         | −2            | 0          | 0               | 0                  |
Multiple Comparison Tests – Planned comparisons
– an Example (VI)
[3] Is the Experimental Lecture different from the Experimental
Movie?

| Group              | I Movie | II Lecture | III Mov + Lec | IV Nothing | V Neutral Movie | VI Neutral Lecture |
|--------------------|---------|------------|---------------|------------|-----------------|--------------------|
| Comparison weights | +1      | −1         | 0             | 0          | 0               | 0                  |
Multiple Comparison Tests – Planned comparisons
– an Example (VII)
[4] Among the control groups, does "Nothing" differ from either
"Movie" or "Lecture"?

| Group              | I Movie | II Lecture | III Mov + Lec | IV Nothing | V Neutral Movie | VI Neutral Lecture |
|--------------------|---------|------------|---------------|------------|-----------------|--------------------|
| Comparison weights | 0       | 0          | 0             | +2         | −1              | −1                 |
Multiple Comparison Tests – Planned comparisons
– another Example
For example, assume the three levels of the independent
variable are lecturer style:

|      | A1 | A2 | A3 |
|------|----|----|----|
|      | 16 | 4  | 2  |
|      | 18 | 6  | 10 |
|      | 10 | 8  | 9  |
|      | 12 | 10 | 13 |
|      | 19 | 2  | 11 |
| Mean | 15 | 6  | 9  |

Residual mean square = 14.17

From previous research, we anticipate that A1 > A2 and A1 > A3
(we are making no predictions about A2 vs. A3).
Do the data support this?
Multiple Comparison Tests – Planned comparisons
– another Example (II)
Let us adopt the symbol Y to represent the difference we are
interested in between A1 and A2: $Y = \bar{X}_1 - \bar{X}_2$
We can rewrite this as: $Y = (1)\bar{X}_1 + (-1)\bar{X}_2$
Including all the means in the experiment:
$Y = (1)\bar{X}_1 + (-1)\bar{X}_2 + (0)\bar{X}_3$
Multiple Comparison Tests – Planned comparisons
– another Example (III)
Planned comparisons are based on the calculation of an F-ratio.
Very similar to ANOVA (the F-test), the F-ratio is calculated as the ratio of the
between-treatments variance (Vb) over the random variance (Va, which is the
residual mean square). The difference is that the Vb in ANOVA is calculated as
total variance − random variance, and is therefore associated with all
the different treatments, whereas the Vb in a planned comparison is calculated
from the exact difference between the specific treatments we are interested in.
To calculate Vb, a sum of squares associated with the comparison (SS_Acomp) has
to be calculated first:
$SS_{Acomp} = \dfrac{n Y^2}{\sum c_i^2}$
Y = the difference between the compared means ($Y = \sum c_i \bar{X}_i$)
n = the number of subjects that contribute to each mean
$c_i$ = the coefficient with which we weight each mean
In our case:
$SS_{Acomp} = \dfrac{5\,\left((1)(15) + (-1)(6) + (0)(9)\right)^2}{(1)^2 + (-1)^2 + (0)^2} = 202.5$
Multiple Comparison Tests – Planned comparisons
– another Example (IV)
FAcomp
The F- ratio is calculated by:
MS Acomp
MSS / A
MSS/A: Residual mean square calculated in ANOVA test
MSAcomp: The mean square for the comparison, which is given by
MS Acomp
SSAcomp
dfAcomp
SSAcomp
1
SSAcomp
All planned comparisons have on 1 degree of freedom.
In this case,
FAcomp
MS Acomp
MSS / A
202.5
14.29
14.17
Get critical F value from F-table using the same method as in ANOVA.
in this case, df is (1, 12), Fcritical = 4.8 for p < 0.05
If FAcomp > F critical , as in this case, null hypothesis is rejected.
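The whole calculation, using the data and contrast weights from this example, can be written out in a few lines of Python:

```python
import numpy as np
from scipy import stats

a1 = np.array([16, 18, 10, 12, 19])   # mean 15
a2 = np.array([4, 6, 8, 10, 2])       # mean 6
a3 = np.array([2, 10, 9, 13, 11])     # mean 9
groups = [a1, a2, a3]
c = np.array([1, -1, 0])              # contrast weights: A1 vs A2
n = 5                                 # subjects contributing to each mean

y = sum(ci * g.mean() for ci, g in zip(c, groups))                # Y = 9
ss_comp = n * y ** 2 / (c ** 2).sum()                             # 202.5
ms_error = sum(((g - g.mean()) ** 2).sum() for g in groups) / 12  # ~14.17
f_comp = ss_comp / ms_error           # MS_Acomp = SS_Acomp since df = 1
p = stats.f.sf(f_comp, 1, 12)
print(f"F = {f_comp:.2f}, p = {p:.4f}")   # F ~ 14.29, p < 0.05
```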
Multiple Comparison Tests – Planned comparisons
If you have strong and precise theoretical questions before you run
the experiment, planned comparisons are considerably more
powerful than post hoc tests.
Power Analysis
"How precise will my parameter estimates tend to be if I select a particular sample size?" and
"How big a sample do I need to attain a desirable level of precision?"
| Decision | Null True (exp. hypothesis wrong): No Effect | Null False (exp. hypothesis correct): Real Effect |
|---|---|---|
| Reject Null: decide there are real effects (accept exp. hypothesis) | Type I ERROR, prob = α ("significance level") | Correct, prob = 1 − β ("power") |
| Accept Null: decide there are no real effects (reject exp. hypothesis) | Correct, prob = 1 − α | Type II ERROR, prob = β |

Ideally, power should be at least .80 to detect a reasonable departure from
the null hypothesis.
Online power calculators (see references) can calculate the power of a design,
or the sample size needed for a desired power; StatView can also be used.
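For a scriptable alternative to the online calculators, the statsmodels package (an assumption; it is not mentioned in the original notes) can solve for power or sample size in a two-sample t-test design:

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Sample size per group for a medium effect (Cohen's d = 0.5),
# alpha = .05, power = .80
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80)
print(f"n per group ~ {n_per_group:.0f}")        # ~64

# Power actually achieved with only 20 subjects per group
power = analysis.solve_power(effect_size=0.5, nobs1=20, alpha=0.05)
print(f"power with n = 20: {power:.2f}")         # ~0.34
```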
References:
1. 'Asking questions in biology', Chris Barnard, Francis Gilbert & Peter McGregor.
Prentice Hall, 2nd edition, 2001. (A fun book to read :)
2. The Really Easy Statistics Site (strongly recommended!):
http://helios.bto.ed.ac.uk/bto/statistics/tress1.html
3. Post hoc tests: www.ioa.pdx.edu/newsom/da1/ho_posthoc.doc
4. One-way ANOVA post hoc tests:
http://www.une.edu.au/WebStat/unit_materials/c7_anova/oneway_post_hoc.htm
5. Power analysis: http://www.math.yorku.ca/SCS/Online/power/;
http://calculators.stat.ucla.edu/powercalc/