3680 Lecture 14

Math 3680
Lecture #15
Hypothesis Testing:
The t Test
• In all of the previous examples, we assumed that
we knew the population standard deviation σ.
• In practice, this is an extremely rare situation!
• Instead, the sample standard deviation S is used in
lieu of σ.
• This approximation is called the bootstrap estimate.
Example. The manufacturer of a new fiberglass tire
claims that its average life will be at least 40,000
miles. To verify this claim, a sample of 12 tires is
tested, and the lifetimes were found to be
36,100
42,000
36,800
40,200
35,800
37,200
33,800
37,000
33,000
38,500
41,000
36,000
Test the manufacturer’s claim using α = 0.05.
Solution.
H0: The average life is at
least 40,000 miles.
Ha: The average life is less
than 40,000 miles.
We choose α = 0.05.
Notice that we have to
compute the sample SD
this time; it’s not given.
Mean: 37283.33
SD: 2731.91
We now can compute the
test statistic
\[ t = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{37283.33 - 40000}{2731.91/\sqrt{12}} \approx -3.4448. \]
But there’s a catch: since we used S instead of σ, the
test statistic t does NOT follow the normal curve.
Instead, there’s a theorem which says that the test
statistic t follows the Student t distribution with
n - 1 degrees of freedom. (There’s a slight catch with
this theorem that we’ll discuss later.)
For the current problem, instead of using the normal
curve to compute the observed significance level, we
will use the Student t distribution with 11 degrees of
freedom.
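As a quick check, the summary statistics, the test statistic, and the degrees of freedom for the tire example can be reproduced with a few lines of Python. This is only a sketch using the standard library; the variable names (lifetimes, mu0, t_stat, df) are illustrative labels, not part of the lecture.

```python
import math
import statistics

# Tire lifetimes (miles) from the example.
lifetimes = [36100, 42000, 36800, 40200, 35800, 37200,
             33800, 37000, 33000, 38500, 41000, 36000]

mu0 = 40000                        # mean claimed under H0
n = len(lifetimes)
xbar = statistics.mean(lifetimes)  # sample mean
s = statistics.stdev(lifetimes)    # sample SD (divides by n - 1)

t_stat = (xbar - mu0) / (s / math.sqrt(n))
df = n - 1                         # degrees of freedom for the t distribution

print(f"mean = {xbar:.2f}, SD = {s:.2f}, t = {t_stat:.4f}, df = {df}")
```

Note that statistics.stdev divides by n - 1, which is the sample SD the t statistic calls for; statistics.pstdev would divide by n instead.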
For the sake of completeness, here’s the pdf of the
Student’s t-distribution with r degrees of freedom:
\[ f(t) = \frac{\Gamma\left(\frac{r+1}{2}\right)}{\sqrt{\pi r}\;\Gamma\left(\frac{r}{2}\right)} \left(1 + \frac{t^2}{r}\right)^{-(r+1)/2} \]
If you find this intimidating, don’t worry: we will
never use it.
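Even though we never use the pdf by hand, it is easy to evaluate on a computer. The sketch below codes the standard Student t density with r degrees of freedom and compares its height at t = 0 with that of the standard normal curve for a few values of r. The function names are illustrative.

```python
import math

def student_t_pdf(t, r):
    """Density of the Student t distribution with r degrees of freedom."""
    coeff = math.gamma((r + 1) / 2) / (math.sqrt(math.pi * r) * math.gamma(r / 2))
    return coeff * (1 + t * t / r) ** (-(r + 1) / 2)

def standard_normal_pdf(z):
    """Density of the standard normal curve."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

# Height of each curve at the center, t = 0.
for r in (1, 5, 30, 100):
    print(f"r = {r:3d}: f(0) = {student_t_pdf(0.0, r):.4f}")
print(f"normal : f(0) = {standard_normal_pdf(0.0):.4f}")
```

The t curve is lower in the middle (and so heavier in the tails) than the normal curve, and the gap shrinks as r grows.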
[Figures: a sequence of plots comparing the Student t distribution (red) with the standard normal curve (blue), for 1, 2, 3, 4, 5, 10, 20, 30, 40, 50, and 100 degrees of freedom.]
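For the t distribution with 11 df, we can compute the
critical value for α = 0.05 using Table 4 (p. 510).
[Figure: t density with 11 df, with the left-tail area of 5% shaded. Marked values: critical value -1.79588 and observed t = -3.4448.]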
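In Excel, be careful: the command is TINV(0.1,11), not TINV(0.05,11).
(TINV works with two tails, not one tail, so the one-tail α is doubled. It returns the positive value 1.79588; the left-tail critical value is its negative.)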
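Outside of Excel or a printed table, the same critical value can be found numerically. The sketch below is one possible approach and not the method behind TINV: it integrates the Student t pdf with Simpson's rule to get the cdf, then bisects for the point that has area alpha to its left. All function names are made up for illustration, and it assumes 0 < alpha < 0.5.

```python
import math

def t_pdf(t, r):
    """Density of the Student t distribution with r degrees of freedom."""
    coeff = math.gamma((r + 1) / 2) / (math.sqrt(math.pi * r) * math.gamma(r / 2))
    return coeff * (1 + t * t / r) ** (-(r + 1) / 2)

def t_cdf(x, r, steps=2000):
    """P(T <= x), using symmetry about 0 and Simpson's rule on [0, |x|]."""
    a = abs(x)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, r) + t_pdf(a, r)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, r)
    area = total * h / 3               # area under the pdf from 0 to |x|
    return 0.5 + area if x >= 0 else 0.5 - area

def t_left_critical(alpha, r):
    """Bisect for c < 0 with P(T <= c) = alpha (assumes 0 < alpha < 0.5)."""
    lo, hi = -50.0, 0.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, r) < alpha:
            lo = mid                   # mid is too far left
        else:
            hi = mid
    return (lo + hi) / 2

crit = t_left_critical(0.05, 11)
print(f"left-tail critical value = {crit:.4f}")
```

The result should agree with the table value for 11 df up to rounding. Because the test is left-tailed, we reject the null hypothesis when the observed t falls below this number.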
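Observed Significance Level
[Figure: t density with 11 df, with the left-tail area below the observed t = -3.4448 shaded; the critical value -1.79588 is also marked.]
The observed significance level is approximately 0.0027.
In Excel, the command is TDIST(3.4448,11,1).
(The third entry specifies the number of tails.)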
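The observed significance level is just the area under the t curve to the left of the observed statistic, so it can also be approximated by numerical integration. The sketch below is an illustration under that assumption, using Simpson's rule rather than whatever Excel does internally; the function names are hypothetical.

```python
import math

def t_pdf(t, r):
    """Density of the Student t distribution with r degrees of freedom."""
    coeff = math.gamma((r + 1) / 2) / (math.sqrt(math.pi * r) * math.gamma(r / 2))
    return coeff * (1 + t * t / r) ** (-(r + 1) / 2)

def t_cdf(x, r, steps=2000):
    """P(T <= x), using symmetry about 0 and Simpson's rule on [0, |x|]."""
    a = abs(x)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, r) + t_pdf(a, r)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, r)
    area = total * h / 3               # area under the pdf from 0 to |x|
    return 0.5 + area if x >= 0 else 0.5 - area

t_obs = -3.4448                        # observed test statistic (tire example)
df = 11                                # n - 1 with n = 12
alpha = 0.05

p_value = t_cdf(t_obs, df)             # left-tail area
print(f"one-tailed p-value = {p_value:.6f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

Comparing the p-value with α gives the same decision as comparing the observed t with the critical value.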
Conclusion: We reject the null hypothesis. There is
good reason to believe that the average lifespan of the
tires is less than 40,000 miles.
Note: It is possible to compute power with the
Student’s t-distribution, but the computations are
much, much more complicated than the normal case
(Larsen & Marx, 3rd ed., p. 447). Many statistical
software packages are able to compute power for
the t-test automatically.
Excel: Use the command
=TTEST(A1:A12, B1:B12, 1, 1)
• This is silly, I know, but you have
to list the claimed average once for
each entry in the list.
• The first 1 (the third argument) stands for a one-tailed
test; the second 1 (the fourth argument, which selects a paired test) is required.
Column A   Column B
36100      40000
40200      40000
33800      40000
38500      40000
42000      40000
35800      40000
37000      40000
41000      40000
36800      40000
37200      40000
33000      40000
36000      40000
Result: 0.00273917
Remember: If the sample is small (n < 30) and the
population standard deviation σ is unknown, then we use the
t-test and not the z-test.
On the other hand, if either σ is known or the sample
is sufficiently large (n > 30), then we may safely use
the z-test instead.
Also, we must be careful about stating the null and
alternative hypotheses so that we correctly choose
whether to use a left-tail, a right-tail, or both tails.
Example. Before a substance can be deemed safe for
landfilling, its chemical properties must be assessed.
In a sample of six replicates of sludge from a New
Hampshire wastewater treatment plant, the mean pH
was 6.68 with a standard deviation of 0.20. Can we
conclude that the mean pH is less than 7.0?
J. Benoit, T. Eighmy and B. Crannell, Journal of Geotechnical and
Geoenvironmental Engineering 1999, pp. 877--888.
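Here is a sketch of how the test statistic for the sludge example could be set up, working only from the summary statistics given. Since the question asks whether the mean pH is less than 7.0, this is a left-tail test with n - 1 = 5 degrees of freedom. The example does not state a significance level, so none is assumed here, and the variable names are illustrative.

```python
import math

n = 6          # number of replicates
xbar = 6.68    # sample mean pH
s = 0.20       # sample standard deviation
mu0 = 7.0      # boundary value in H0: mean pH is at least 7.0

t_stat = (xbar - mu0) / (s / math.sqrt(n))
df = n - 1

print(f"t = {t_stat:.4f} with {df} degrees of freedom")
```

The resulting t can then be compared with the left-tail critical value from the t table with 5 df at whatever α is chosen.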
Example. Certain rectangles
appear more pleasing to the eye
than others. The ancient Greeks
called a rectangle with
\[ \text{width} = \left(\frac{\sqrt{5} - 1}{2}\right)(\text{length}) \approx 0.618\,(\text{length}) \]
the golden rectangle, and this
ratio was called the
golden ratio. The golden ratio
has been claimed to be a
deliberate design element in various
works of art and architecture.
The data below show the width-to-length ratios of
beaded rectangles used by the Shoshone Native
Americans to decorate their leather goods.
Does it appear that the golden rectangle is also an
aesthetic standard for the Shoshones?
0.693
0.749
0.654
0.670
0.662
0.672
0.615
0.606
0.690
0.628
0.668
0.611
0.606
0.609
0.601
0.553
0.570
0.844
0.576
0.933
C. Dubois, ed., Lowie’s Selected Papers in Anthropology
(UC Press, Berkeley, 1960), pp. 137--142
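A sketch of the computation for the Shoshone data follows. It takes the null hypothesis to be that the mean width-to-length ratio equals the golden ratio, (sqrt(5) - 1)/2, roughly 0.618. Because departures in either direction would count against that standard, a two-tailed test is assumed here; that choice, like the variable names, is an assumption of this sketch rather than something stated in the example.

```python
import math
import statistics

# Width-to-length ratios of the Shoshone beaded rectangles.
ratios = [0.693, 0.749, 0.654, 0.670, 0.662, 0.672, 0.615, 0.606, 0.690, 0.628,
          0.668, 0.611, 0.606, 0.609, 0.601, 0.553, 0.570, 0.844, 0.576, 0.933]

golden = (math.sqrt(5) - 1) / 2    # hypothesized mean ratio, about 0.618
n = len(ratios)
xbar = statistics.mean(ratios)
s = statistics.stdev(ratios)       # sample SD (divides by n - 1)

t_stat = (xbar - golden) / (s / math.sqrt(n))
df = n - 1

print(f"mean = {xbar:.4f}, SD = {s:.4f}, t = {t_stat:.3f}, df = {df}")
```

For a two-tailed test, the observed t is compared with both the upper and lower critical values for 19 df.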
Robustness of the t Test
The t statistic is defined by
\[ t = \frac{\bar{X} - \mu}{S/\sqrt{n}}. \]
If X1, X2, …, Xn follow a normal distribution, then
there’s a theorem that says that this t statistic follows
the Student t-distribution with n - 1 degrees of
freedom.
But, in real life, this assumption is almost certainly not
true. Models are idealized; real data are, well, real.
Now what?
The good news is that the underlying pdf doesn’t have
to be very close to normal in order for the test statistic
to be close to the Student t-distribution.
The following graphs are empirical histograms of the
t statistic computed from 10,000 data sets drawn from
a “triangular” distribution with pdf proportional to
\[ f(x) = \begin{cases} x/2, & 0 \le x \le 1, \\ (2 - x)/2, & 1 \le x \le 2, \end{cases} \]
which is not too far off from bell shaped. Even for
very small samples, the t distribution is accurate.
[Figure: graph of the triangular function f(x) on 0 ≤ x ≤ 2, peaking at x = 1.]
[Figure: empirical histogram of the t statistic from a triangular distribution with a sample of size 4.]
The following graphs are empirical histograms of the
t statistic computed from 10,000 data sets drawn from
a Uniform(0,1) distribution, which is symmetric but
decisively not bell-shaped. Notice that convergence
does not occur as quickly.
[Figure: pdf of the Uniform(0,1) distribution.]
[Figures: empirical histograms of the t statistic from a Uniform(0,1) distribution, for samples of size 2, 3, 4, 5, 6, 7, 8, 9, and 10.]
The following graphs are empirical histograms of the
t statistic computed from 10,000 data sets drawn from
an Exponential(1) distribution, which is neither
symmetric nor bell-shaped.
This time, the sample has to be of size 40 or so for the
t distribution to be accurate… that’s mostly due to the
Central Limit Theorem.
[Figure: pdf of the Exponential(1) distribution.]
[Figures: empirical histograms of the t statistic from an Exponential(1) distribution, for samples of size 5, 10, 15, 20, 30, 40, 50, and 100.]
Observations
• The distribution of the t statistic is relatively
unaffected by the pdf of the Xi, as long as
– The pdf is not too skewed, and
– The sample size is not too small.
• As the sample size n increases, the distribution of
the t statistic gets closer to the Student t-distribution
with n - 1 degrees of freedom.
We succinctly describe this as saying that the t test is
robust, meaning that it is not heavily dependent on the
underlying assumption of normality. The practical
importance of this robustness is that the t test can be
used in real-life situations.
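The kind of experiment behind these observations can be sketched in a few lines. The code below is an illustration, not the lecture's own simulation: it draws 10,000 data sets from a chosen distribution, computes the t statistic for each using the true mean, and reports how often t lands in the nominal 5% left tail. The helper names and the seed are arbitrary, and the two critical values are the usual t-table entries for 4 and 9 degrees of freedom.

```python
import math
import random

def t_statistic(sample, mu):
    """t = (sample mean - mu) / (S / sqrt(n)), with S the sample SD."""
    n = len(sample)
    xbar = sum(sample) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
    return (xbar - mu) / (s / math.sqrt(n))

def left_tail_rate(draw, mu, n, t_crit, trials=10_000, seed=0):
    """Fraction of simulated t statistics that fall below -t_crit."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [draw(rng) for _ in range(n)]
        if t_statistic(sample, mu) < -t_crit:
            hits += 1
    return hits / trials

# Upper 5% points of the Student t distribution (standard t table), keyed by df.
T_CRIT_05 = {4: 2.132, 9: 1.833}

# Uniform(0,1): symmetric, true mean 0.5, samples of size 10 (9 df).
uniform_rate = left_tail_rate(lambda rng: rng.random(), 0.5, 10, T_CRIT_05[9])

# Exponential(1): skewed, true mean 1, samples of size 5 (4 df).
expo_rate = left_tail_rate(lambda rng: rng.expovariate(1.0), 1.0, 5, T_CRIT_05[4])

print(f"Uniform(0,1), n = 10: left-tail rate = {uniform_rate:.3f}")
print(f"Exponential(1), n = 5: left-tail rate = {expo_rate:.3f}")
```

If the Student t distribution describes the t statistic well for a given pdf and sample size, the printed rate should sit near 0.05; a rate well above that signals that the approximation is poor there, as happens with skewed data and small samples.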
Practical Implications
• If n < 15, the data should be nearly normal. Make a
histogram. If there are outliers or strong skewness, do
not use the t-test.
• If 15 ≤ n ≤ 40, make a histogram to check that the
data is unimodal, free of outliers, and reasonably
symmetric. Again, make a histogram.
• If n > 40, the t-test is safe even if the data is skewed.