Psych 5500/6500

The t Test for a Single Group Mean (Part 1):
Two-tail Tests & Confidence Intervals
Fall, 2008
Back to Our Example
We are testing a theory which predicts that
Elbonians should have a different mean IQ
than people in the USA. In general H0 is the
hypothesis of ‘no difference’, while HA is the
hypothesis that reflects what the theory
predicts.
H0: μElbonia = 100 (same as USA)
HA: μElbonia ≠ 100 (different than the USA)
Challenge
The sample mean of the Elbonians is 106. The challenge is to determine whether this is because the population mean of Elbonians is different than 100, or because the Elbonians have a population mean of 100 and we obtained a sample mean six points above that just due to chance (i.e. our sample had random bias).
Approach
1. Determine the probability of obtaining our
data if H0 were true.
2. If that probability is low enough (less than
or equal to our significance level of .05)
then reject H0.
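As a minimal sketch of this decision rule (added here for illustration, not from the original slides; the p-value shown is a made-up placeholder):

alpha = 0.05          # significance level
p_value = 0.03        # hypothetical probability of data this extreme if H0 were true
if p_value <= alpha:
    print("Reject H0")
else:
    print("Do not reject H0")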
Test Statistic
The statistic that is going to make or break H0 is the sample mean; this is our ‘test statistic’. We need to see what values the test statistic could take on if H0 were true.
Sampling Distribution
This is the population of values the sample mean could take on if H0 is true.
Normality of the Sampling Distribution
We will be estimating the standard deviation of the
population so the SDM will be modeled using the t
distribution, rather than the normal distribution.
For the probabilities in the t table to be accurate we
need to either sample our scores from a normal
population or make sure that our sample N is 30 or
greater. As we will be using a small N in this
example (to make number crunching easier) let’s
assume that the IQ scores of Elbonians are
normally distributed.
Mean of the SDM
We are looking at the sampling distribution of the mean assuming H0 is true. If H0 is true, then the mean IQ of Elbonians is 100; we will use that to determine the mean of the SDM.
μ_Ȳ = μ_Y
If H0 is true then μ_Y = 100, so μ_Ȳ = 100.
SDM (so far)
Standard Error
To find the standard deviation of the SDM
(i.e. the ‘standard error’) we need to know
the standard deviation of the population.
We don’t, so we will have to estimate it
from our sample of 6 scores.
Y  106,103,108,100,109,110 Y  106

Y

 Y 
N
2
SSY
2
6362
 67,490
 67,490 67,416 74
6
SS Y 74

 14.8 est. Y  est.σ 2Y  14.8  3.85
N 1 5
est. Y 3.85
est.σ Y 

 1.57
N
6
est.σ 2Y 
10
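A small Python sketch of these calculations (added here for illustration; the variable names are mine, not from the slides):

from math import sqrt

scores = [106, 103, 108, 100, 109, 110]
N = len(scores)
mean = sum(scores) / N                                  # Ȳ = 106
ss = sum(y**2 for y in scores) - sum(scores)**2 / N     # SS_Y = 74
est_var = ss / (N - 1)                                  # est. σ²_Y = 14.8
est_sd = sqrt(est_var)                                  # est. σ_Y ≈ 3.85
est_se = est_sd / sqrt(N)                               # est. σ_Ȳ ≈ 1.57 (the standard error)
print(mean, ss, round(est_sd, 2), round(est_se, 2))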
SDM (so far)
Rejection Regions
Now, we are interested in those values of the
sample mean that have a 5% chance or less
of happening if H0 were true. If we get a
sample mean in that area of the curve we
will decide to ‘reject H0’. If we get a sample
mean close to what H0 predicts (i.e. close to
100) then we will ‘not reject H0’.
SDM (so far)
t-critical values
The next step is to find out how many standard
deviations above and below the mean we have to
go to cut off the 5% most unlikely means (2.5% on
each tail).
d.f. = N − 1 = 6 − 1 = 5
Looking in the t table we see that this leads to a t value of 2.571 (the lower tail would be −2.571). As this establishes where we make our decision about H0, these are called the ‘t-critical values’.
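If you want to check the table value with software, here is one way to do it with SciPy (an assumption on my part that SciPy is available; the slides only use the printed t table). The normal-distribution cutoff is shown for comparison, which is why the t cutoff is wider for a small N:

from scipy import stats

df = 5
t_crit = stats.t.ppf(0.975, df)        # cuts off the upper 2.5% of the t distribution
z_crit = stats.norm.ppf(0.975)         # the normal-distribution cutoff, for comparison
print(round(t_crit, 3), round(z_crit, 3))   # 2.571 versus 1.96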
SDM (so far)
t-obtained value
OK, our criteria are set: there is only a 5% chance that the sample mean will fall 2.571 or more standard deviations away from the mean if H0 is true. If the sample mean does fall that far away from the mean, then we will reject H0.
Now, the sample mean we obtained was 106; we need to change that into a standard score to see where it falls on our curve. The standard score of the sample mean is called the ‘t-obtained value’.
t-obtained value (cont.)
Remember that a standard score is always some
point on a curve minus the mean of the curve divided
by the standard deviation of the curve.
t_obt = (Ȳ − μ_Ȳ) / est.σ_Ȳ = (106 − 100) / 1.57 = 3.82
So, our sample mean of 106 falls almost four standard
deviations above the mean on the curve representing
what H0 predicts.
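The same result can be checked with SciPy’s one-sample t test (again an assumption that SciPy is available; the slides do the computation by hand):

from scipy import stats

scores = [106, 103, 108, 100, 109, 110]
t_obt, p_two_tail = stats.ttest_1samp(scores, 100)   # H0: μ = 100
print(round(t_obt, 2))       # ≈ 3.82, matching the hand calculation
print(p_two_tail <= 0.05)    # True, so reject H0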
SDM (so far)
Our decision is “to reject H0”
Equivalent Approach
We could also have found what values of the sample mean fall at the rejection regions. From the graph above it is clear that our sample mean of 106 would lead to a decision to reject H0.
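Concretely (a worked computation added here; these numbers follow from the values already calculated and are not on the original slide), the sample means that sit at the edges of the rejection regions are:

Ȳ_critical = 100 ± t_critical (est.σ_Ȳ) = 100 ± (2.571)(1.57) = 100 ± 4.04, i.e. 95.96 and 104.04

Since 106 is above 104.04, it falls in the upper rejection region.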
Stating the Decision
Remember:
H0: μElbonia = 100 (same as USA)
HA: μElbonia ≠ 100 (different than the USA)
Decision: “We reject H0.”
Since ‘rejecting H0’ implies ‘accepting HA’, we can
go on to say...
“We can conclude that the mean IQ of the population of Elbonians does not equal 100.”
Possible Error
If our decision is to ‘reject H0’, then if we are
wrong, that would be a Type I error.
If the null hypothesis happens to actually be
true (μElbonia = 100) then there is a .05
chance that we would obtain a sample
mean that would lead us to erroneously
reject H0 (i.e. α = .05).
What if...
Let’s take a look at an example where everything is exactly the same except the data lead to ‘not rejecting’. I’ll simply subtract 4 from each of the scores to give us a sample mean closer to 100, without changing d.f. or our estimate of the standard deviation.
Y = 102, 99, 104, 96, 105, 106
Mean = 102
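A quick Python check of this claim (added for illustration; subtracting a constant shifts the mean but leaves SS, and therefore the standard error, untouched):

from math import sqrt

new_scores = [y - 4 for y in [106, 103, 108, 100, 109, 110]]        # 102, 99, 104, 96, 105, 106
N = len(new_scores)
mean = sum(new_scores) / N                                          # 102
ss = sum(y**2 for y in new_scores) - sum(new_scores)**2 / N         # still 74
est_se = sqrt(ss / (N - 1)) / sqrt(N)                               # still ≈ 1.57
print(mean, ss, round(est_se, 2))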
t-obtained value
Remember that a standard score is always some
point on a curve minus the mean of the curve divided
by the standard deviation of the curve.
t_obt = (Ȳ − μ_Ȳ) / est.σ_Ȳ = (102 − 100) / 1.57 = 1.27
So, our sample mean of 102 falls a little more
than one standard deviation above the mean of the curve
representing what H0 predicts.
SDM
Our decision is “do not reject H0”
Equivalent Approach
It is clear that a sample mean of 102 would lead to not rejecting H0.
Stating the Decision
Remember:
H0: μElbonia = 100 (same as USA)
HA: μElbonia ≠ 100 (different than the USA)
Decision: Do not reject H0.
Since ‘not rejecting H0’ implies ‘not accepting HA’, we can
say...
• “We cannot conclude that the mean IQ of the population of
Elbonians differs from 100”, or clearer still,
• “We cannot determine whether or not the mean IQ of
Elbonians differs from 100” (I like this wording as it best
conveys the ambiguity of the result).
However, we can’t say...
• “We can conclude that the mean IQ of Elbonians equals
100.”
Huh?
Under most circumstances, when the mean falls in
the ‘do not reject H0’ area, you can say you ‘fail to
reject H0’ but you can’t say that you ‘accept H0’ or
that you’ve ‘proven H0 is correct’. The reason for
this will be developed in the lecture on ‘power’.
For now, think of it this way: You set out to try to find
a difference between the actual population mean
and the value proposed by H0. If you find a
difference, then that result is interpretable (reject
H0). If you don’t find a difference, is that because
the difference doesn’t exist or because you didn’t
search hard enough to find it?
Confidence Intervals
We can now return to confidence intervals, as we
now have the tool we need to generate them (i.e.
the t table). The 95% confidence interval of the
mean can be generated using the following
formula, where t_(d.f., α=.05, 2-tail) stands for the t-critical value
for the two-tailed alpha of .05 with the appropriate
degrees of freedom. For our sample that had a
mean of 106...
limits  Y  t df, α .05,2- tail est.σ Y 
limits  106  (2.571)(1.57)  106  4.04
101.96 μ  110.04
28
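A Python sketch of the same interval (assuming SciPy, as in the earlier sketches; the slides use the t table directly):

from math import sqrt
from scipy import stats

scores = [106, 103, 108, 100, 109, 110]
N = len(scores)
mean = sum(scores) / N
se = sqrt(sum((y - mean)**2 for y in scores) / (N - 1)) / sqrt(N)   # est. σ_Ȳ ≈ 1.57
lower, upper = stats.t.interval(0.95, N - 1, loc=mean, scale=se)    # 95% confidence interval
print(round(lower, 2), round(upper, 2))                             # ≈ 101.96 and 110.04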
99% Confidence Interval
For other intervals use a different value for
alpha. For the 99% confidence interval use
the two-tailed t-critical value for alpha = .01.
limits  Y  t df, α .01,2- tail est.σ Y 
limits  106  (4.032)(1.57)  106  6.33
99.67  μ  112.33
29
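The same call handles other confidence levels; continuing the sketch above with the 99% level:

from scipy import stats

# Using the values already computed: Ȳ = 106, est. σ_Ȳ ≈ 1.57, d.f. = 5
lower99, upper99 = stats.t.interval(0.99, 5, loc=106, scale=1.57)
print(round(lower99, 2), round(upper99, 2))    # ≈ 99.67 and 112.33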
Using Confidence Intervals
Confidence intervals can be used to make
decisions about a priori hypotheses. Any
hypothesis that proposed (a priori) a μ
outside that range can be rejected at the .05
significance level.
Using Confidence Intervals
The 95% confidence interval for our data was:
101.96 ≤ μ_Y ≤ 110.04
Note that H0 said that μ =100, thus H0 can be
rejected using the confidence interval.
There is absolutely no difference between testing H0
using a two-tailed test and doing the same thing
using confidence intervals. Confidence intervals
are useful in other ways, however, and we will take
a look at those towards the end of the semester
when we examine alternatives to null hypothesis
testing.
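To see that equivalence in code (a sketch under the same SciPy assumption, not part of the original slides): the test rejects at α = .05 exactly when the value proposed by H0 falls outside the 95% confidence interval.

from math import sqrt
from scipy import stats

scores = [106, 103, 108, 100, 109, 110]
mu0 = 100                                                           # value proposed by H0
N = len(scores)
mean = sum(scores) / N
se = sqrt(sum((y - mean)**2 for y in scores) / (N - 1)) / sqrt(N)
t_obt, p = stats.ttest_1samp(scores, mu0)
lower, upper = stats.t.interval(0.95, N - 1, loc=mean, scale=se)
print(p <= 0.05)                         # True: reject H0 with the two-tailed test
print(not (lower <= mu0 <= upper))       # True: 100 lies outside the 95% confidence interval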