Review of the Basic Logic of NHST
• Significance tests are used to accept or reject the null
hypothesis.
• This is done by studying the sampling distribution for a
statistic.
– If the probability of observing your result is < .05 if the null is true,
reject the null
– If the probability of observing your result is > .05, accept the null.
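The decision rule above can be sketched in a few lines of code. This is a minimal illustration, not a library function; the .05 cutoff is the conventional alpha level described in the bullets.

```python
# Minimal sketch of the NHST decision rule described above.
# alpha = .05 is the conventional cutoff used throughout this lecture.
def nhst_decision(p_value, alpha=0.05):
    """Return the decision about the null hypothesis for a given p-value."""
    return "reject the null" if p_value < alpha else "accept the null"
```

For example, `nhst_decision(0.03)` rejects the null, while `nhst_decision(0.20)` accepts it.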
• Today we’re going to discuss ANOVA.
ANOVA
• A common situation in psychology is when an
experimenter randomly assigns people to more than two
groups to study the effect of the manipulation on a
continuous outcome.
• The significance test used in this kind of situation is called
an ANOVA, which stands for analysis of variance.
ANOVA example
• We are interested in whether people’s diet affects their
happiness.
• We take a sample of 60 people and randomly assign (a) 20
people to eat at least three fruits a day, (b) 20 people to eat
at least three vegetables a day, and (c) 20 people to eat at
least three donuts a day.
• Subsequently we measure how happy people are on a
continuous scale.
• Note: The independent variable is categorical (people are
assigned to groups); there are three groups total.
• The dependent variable is continuous—we measure how
happy people are on a continuous metric.
ANOVA example
• Let’s say we find that the fruit group has a mean
score of 3 (σ̂₁² = 1), the veggie group has a mean
score of 3.2 (σ̂₂² = .9), and the donut group has a
mean score of 4.0 (σ̂₃² = 1.2).
• Two possibilities
– The differences between groups are due to sampling error, not a
real effect of diet. In other words, the three samples are drawn
from populations with identical means and variances.
– The differences between groups are due to the effect of diet, not
sampling error. In other words, the samples are drawn from
populations with different means.
ANOVA example
• In order to evaluate the null hypothesis, we need to know
how likely it is that we would observe the differences we
observed if the null hypothesis is true.
• How can we do this?
• As before, we construct a sampling distribution.
• The sampling distribution used in ANOVA is conceptually
different from the one used with a t-test.
F-ratio
• The sampling distribution for a t-test was based on mean
differences, and the t-distribution itself is based on mean
differences relative to the sampling error.
• In ANOVA, the sampling distribution is based on a ratio
called the F-ratio.
• The F-ratio is the ratio of the population variance as
estimated between groups vs. within groups.
Variation in Sample Means drawn from the
same population
• We begin by noting that anytime we take random samples
of the same size from a population, we will observe
variation in our sample means—despite the fact that the
samples come from the same population.
• Why? This occurs because of sampling error. The sample
is only a subset of the population, and, hence, only
represents a portion of the scores composing the
population.
Implications of the variation
• What are the implications of sampling error in this
context?
• We will observe variability in the sample means for our
fruit, veggie, and donut groups even if the null hypothesis
is true.
• How much variability will we observe?
• This depends on two factors:
– Sample size. As N increases, sampling error decreases. We’ll
ignore this factor for now.
– The variance of the scores in the population. When there is a lot of
variation in happiness in the population, we’ll be more likely to
observe variation among our sample means.
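The two factors above can be seen in a simulation. The sketch below (an illustration under arbitrary choices of population SD and sample size, using only the Python standard library) draws many same-size samples from a single population, so the null is true by construction, and measures how much the sample means vary; the result comes out close to σ²/N.

```python
import random
import statistics

# Simulation sketch: draw many same-size samples from ONE population
# (so the null is true by construction) and measure the variance of
# the resulting sample means. It should come out near sigma**2 / n.
# sigma and n here are arbitrary illustrative choices.
def variance_of_sample_means(sigma, n, n_samples=10000, seed=1):
    rng = random.Random(seed)
    means = [statistics.mean(rng.gauss(0, sigma) for _ in range(n))
             for _ in range(n_samples)]
    return statistics.variance(means)
```

With sigma = 1 and n = 20, the variance of the sample means lands near 1/20 = .05; doubling sigma increases it, matching the second factor above.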
Estimating the variance in the population
• Thus, our first step in conducting a significance test
ANOVA-style is to estimate the variance of the dependent
variable in the population.
• PREVIEW: We’re going to use this information to
determine whether the variation in sample means that we
observe is greater than what we would expect if the
samples all came from the same population.
• We’re going to estimate this variance in two ways: within
groups and between groups.
Method # 1 | Within Groups
• How can we estimate the variance of the scores in the
population?
• We can draw on the logic of pooled variances that we
discussed in the lecture on t-tests.
• If the samples come from populations with identical means
and variances, then each of the three sample variances is an
estimate of the same quantity (i.e., the population
variance).
• Thus, we can pool or average the three variances (using the
N-1 correction) to estimate the population variance.
MSWithin
• In our example, we pool (i.e., average) 1, .9, and 1.2.
• (1 + .9 + 1.2)/3 = 1.03
• Thus, our pooled estimate of the population variance is
1.03
• In ANOVA-talk, this pooled estimate of the population
variance is called mean squares within or MSWithin.
• We use the term “within” because we are estimating the
population variance separately within each sample or
condition.
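In code, the pooling step is just an average when the groups are the same size (as they are here, with 20 people per condition); the three variances are the ones from the running diet example.

```python
import statistics

# MSWithin sketch: with equal group sizes, the pooled estimate of the
# population variance is simply the average of the per-group variances.
# The three variances come from the diet example (fruit, veggie, donut).
group_variances = [1.0, 0.9, 1.2]
ms_within = statistics.mean(group_variances)
```

With unequal group sizes, each variance would instead be weighted by its degrees of freedom before averaging.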
Method # 2 | Between Groups
• There is another way to estimate the variance in the
population, based on studying the variation in sample
means across or between conditions.
• Recall from our discussion of sampling distributions of
means that, given certain facts about the population and the
sampling process, we could characterize the range of long-run
sample outcomes.
• For example, N characterizes the average difference we
might expect between sample means and the population
mean.
2
Method # 2
• We can view our three sample (i.e., condition) means as
constituting a “sampling distribution” of sorts based on
three samples instead of an infinite number of samples (as
is the case in a theoretical sampling distribution).
• Hence, if we calculate the variance among these three
sample means we’re essentially estimating the variance of
the sampling distribution of means.
[Figure] A theoretical sampling distribution of
means based on an infinite number of samples.
This distribution has a standard deviation of σ/√N,
or a variance of σ²/N.
[Figure] An approximate version of the theoretical
sampling distribution, based on the three
means obtained in our study.
The variance of these three means then
provides an estimate of σ²/N.
Why is this cool? Because if we can
estimate σ²/N, then we have an estimate
of σ² (the population variance)!!!
How do we do it?
• Well, we have three sample means. To find the variance among them,
we simply find the average squared difference among them.
• First, we need to find the Grand Mean (GM): the mean of the three
means
• GM = (3 + 3.2 + 4)/3 = 3.4
How do we do it?
• Now we can estimate the variance of the sampling distribution of
means, and hence the variance of the population, by studying the
average squared deviation of these three means from the Grand Mean.
Σ(M – GM)² / (Ngroups – 1)  (estimates σ²/N)
• Note: We use 1 less than the number of groups in the denominator for
the same reason we use 1 less than the number of people when
estimating a population variance: we are estimating a variance, albeit
the variance of a sampling distribution of means instead of a
population variance. Without this correction, the estimate would be a
bit too small.
For our example
Σ(M – GM)² / (Ngroups – 1)

M     GM    (M – GM)   (M – GM)²
3     3.4    -.40        .16
3.2   3.4    -.20        .04
4     3.4     .60        .36

Ngroups = 3, so Ngroups – 1 = 2
Σ(M – GM)²/(Ngroups – 1) = (.16 + .04 + .36)/2 = .28
MSBetween
• We’re almost there.
• Now we have an estimate of the variance of the
sampling distribution of means: σ̂²/N = .28.
• We know that N (the sample size within a
condition) is 20, so simple algebraic
manipulation reveals that our estimate of
the population variance is 5.6:

σ̂²/N = .28
σ̂² = .28 × 20 = 5.6

• In short, we can estimate the population
variance between samples by multiplying
the variance among means by the sample
size.
• This quantity is called Mean Squares
Between or MSBetween.
Note: In class when we did this on the
board, we accidentally took the square
root of N.
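The MSBetween calculation for the diet example can be sketched directly. Note that `statistics.variance` already divides by Ngroups – 1, matching the formula above.

```python
import statistics

# MSBetween sketch for the diet example: the variance among the three
# condition means estimates sigma^2 / N, so multiplying by the per-group
# sample size (N = 20) recovers an estimate of the population variance.
condition_means = [3.0, 3.2, 4.0]
n_per_group = 20

var_of_means = statistics.variance(condition_means)  # divides by Ngroups - 1
ms_between = var_of_means * n_per_group
```

This reproduces the hand calculation: .28 for the variance among means, and 5.6 for MSBetween.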
Getting to the big picture
• Ok, we’ve been traveling some dark back roads. What
does this all mean?
• First, we should note that there are two ways to estimate
the population variance from our three sample/condition
means
– MSWithin: We can pool the variance estimates that are
calculated within each condition
– MSBetween : We can treat the three sample means as if
they comprise an approximate sampling distribution of
means. By estimating the variance of this sampling
distribution, we can also estimate the variance of the
population.
F ratio
• Now, here is the important part.
• If the null hypothesis is true, then these two
mathematically distinct estimates of the population
variance should be identical.
• If we express them as a ratio, the ratio should be close to
1.00.
F = MSBetween / MSWithin
F ratio
• However—and this is critical—if the null hypothesis is
false, and we are sampling from populations with different
means, the variance between means will be greater than
what we would observe if the null hypothesis was true.
• The size of MSWithin, however, will be the same.
• Thus, the F ratio will become increasingly larger than 1.00
as the difference between population means increases.
F ratio
• Even if the null hypothesis is true, the F ratio will depart
from 1 because of sampling error.
• The degree to which it departs from 1, under various
sample sizes, can be quantified probabilistically.
• The significance test works as follows: When the p-value
associated with the F-ratio for a specific sample size is <
.05, we reject the null hypothesis. When it is larger than
.05, we accept the null hypothesis.
F ratio our example
• In our example, MSWithin was 1.03 and MSBetween was 5.6. Thus,

F = MSBetween / MSWithin = 5.6 / 1.03 ≈ 5.4
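The final ratio is one line of arithmetic from the two estimates computed earlier. (Looking up the p-value itself requires an F table or a library; for instance, assuming SciPy is available, `scipy.stats.f.sf(f_ratio, 2, 57)` would give it, with 2 and 57 being the between- and within-group degrees of freedom for 3 groups of 20.)

```python
# F-ratio sketch for the diet example, built from the two variance
# estimates derived above. A value well above 1 favors rejecting the null.
ms_within = 1.03
ms_between = 5.6

f_ratio = ms_between / ms_within  # about 5.4
```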
• If we were to look up some probability values in a table, we would
find the p-value associated with this particular F is less than .05. Thus,
we would reject the null hypothesis and conclude that diet has an effect
on happiness.
Summary
• ANOVA is a significance test that is frequently used when
there are two or more conditions and a continuous outcome
variable.
• ANOVA is based on the comparison of two estimates of
the population variance
• When the null hypothesis is true, these two estimates will
converge.
• When the null hypothesis is false (i.e., the groups are being
sampled from populations with different means), the two
estimates will diverge, leading to an F ratio larger than 1.
• When the p-value associated with an F-ratio calculated in a
study is < .05, reject null. If not, accept null.
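The whole procedure above can be sketched end to end. This is a minimal one-way ANOVA for equal-sized groups operating on raw scores (an illustration of the lecture's logic, not a replacement for a statistics package); it returns the F ratio but not the p-value, which requires F-distribution tables or a library such as SciPy.

```python
import statistics

def one_way_anova_f(groups):
    """Minimal one-way ANOVA F-ratio for equal-sized groups.

    `groups` is a list of lists of raw scores, one list per condition.
    Returns (ms_between, ms_within, f_ratio).
    """
    n = len(groups[0])  # per-group sample size (assumed equal across groups)
    means = [statistics.mean(g) for g in groups]

    # MSWithin: pool (average) the within-group variance estimates.
    ms_within = statistics.mean(statistics.variance(g) for g in groups)

    # MSBetween: variance among the group means, scaled up by n.
    ms_between = statistics.variance(means) * n

    return ms_between, ms_within, ms_between / ms_within
```

Feeding in three groups of 20 scores whose means are 3, 3.2, and 4.0 and whose variances are near 1, .9, and 1.2 would reproduce the F ≈ 5.4 from the diet example.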