Hypothesis Testing - Wayne State College

Hypothesis Testing
An Inference Procedure
We will study procedures for both the unknown
population mean on a quantitative variable and
the unknown population proportion on a
qualitative variable.
Background
There are times we would like to know about the unknown mean
in a population. But it is often too expensive and too time
consuming to investigate the whole population. So, a sample is
taken. The method of confidence intervals is based on the idea that a
point estimate would, in theory, vary from sample to sample. So
from the one sample we do take we build in that variability and
are then a certain percent confident our interval contains the
unknown value.
Hypothesis testing will rely on some of the same ideas used in
confidence intervals, but here there is at least a starting point for
the unknown value. The starting point can come from past work or
from a belief one has about a process.
Example
Consider an example about a company that puts cereal in
boxes. The label on each box says there are 368 grams of
cereal in the box. Does each box have exactly 368 grams?
Probably not, because maybe a few extra flakes fall in one box and
a few fewer in another. But in the grand scheme of things the
process is filling the boxes to 368 grams on average. (Now if one
box had 268 grams and another had 468 grams for an average
of 368 we would have a problem, but not of the kind we are
talking about here.)
Null hypothesis
From the cereal example we would say the null hypothesis is that
the mean amount of cereal put in the boxes is 368 grams and in a
shorthand notation we would write:
Ho: μ = 368.
The Ho stands for null hypothesis. Here this basically means if
the company believes they are putting 368 grams in each box
then we will not, on face value, object to that assertion.
The mu, μ, signals that we are making a hypothesis about the
population of all boxes. Of course we will only take a sample,
but our hypothesis is about the population mean.
Alternative Hypothesis
In hypothesis testing there will always be a mutually exclusive
alternative hypothesis to the null.
In the cereal example the alternative hypothesis may be that the
cereal boxes are not being filled to an average of 368 grams and we
would write this as
H1: μ ≠ 368.
The general process of hypothesis testing starts with the null and
alternative hypotheses. Then a sample is collected and analyzed. The
analysis leads to one of two decisions: fail to reject the null and
continue to work with it, or reject the null and conclude
the alternative is the one to go with. Note in the cereal example, if
the null is rejected the firm had better find out why the machine is not
filling the cereal boxes properly and get that situation fixed.
Analogy
Story about hypothesis tests. Not really stats, but an idea to
consider. Say I have two decks of cards. One deck is a regular deck –
spades, hearts, diamonds and clubs. The other deck is special – 4
sets of hearts.
Now, I take out one of the decks, but you do not know which one. In
the language of statistics the null hypothesis will be that I took out
the regular deck. You will accept the null hypothesis unless an event
occurs that has a really low probability. If a really low probability
event occurs you will reject the null hypothesis and go with the
alternative hypothesis.
So, I take out a deck and deal you five cards – a royal flush in hearts!
You would reject the null hypothesis of a regular deck and go with
the alternative that the deck I pulled out is the special one because a
royal flush in hearts has a low probability in a regular deck.
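To put rough numbers on the card story, here is a minimal Python sketch. It assumes the special deck has 52 cards made of 4 copies of each of the 13 hearts (the slide only says "4 sets of hearts"), and the variable names are made up for illustration.

```python
from math import comb

# Number of distinct 5-card hands that can be dealt from 52 cards.
total_hands = comb(52, 5)

# Regular deck: exactly one hand is the 10, J, Q, K, A of hearts.
p_regular = 1 / total_hands

# Special deck (assumption: 4 copies of each of the 13 hearts).
# Each of the five needed ranks can be any one of its 4 copies.
p_special = 4**5 / total_hands

print(f"hands possible: {total_hands}")
print(f"regular deck:   {p_regular:.2e}")
print(f"special deck:   {p_special:.2e}")
print(f"special / regular: {p_special / p_regular:.0f}")
```

Under that assumption the hand is still uncommon in the special deck, but it is far more likely there than in the regular deck, which is the reasoning behind rejecting the null.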
Sampling Distribution
You may recall that when we have a quantitative variable
and the population standard deviation of the variable is
known, the distribution of the sample mean is
1) Normal (exactly so when the population itself is normal,
approximately so when the sample is large),
2) Has the same mean as the mean of the variable in the
population,
3) Has standard error = standard deviation in the
population divided by the square root of the sample size.
When the population standard deviation is not known we
rely on the sample standard deviation and the standardized
sample mean follows a t distribution.
In what follows we assume the population standard deviation is
known, but the ideas we bring up are also relevant later.
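The three properties of the sample mean can be checked with a small simulation. This is only a sketch: the population values (mean 368, standard deviation 15), the sample size of 25, and the number of repeated samples are assumptions chosen for illustration, and the population is assumed to be normal.

```python
import random
from math import sqrt
from statistics import fmean, stdev

random.seed(0)  # fixed seed so the run is repeatable

mu, sigma, n = 368, 15, 25   # assumed population mean, SD, and sample size
num_samples = 5000           # hypothetical number of repeated samples

# Take many samples of size n and keep each sample's mean.
sample_means = [
    fmean([random.gauss(mu, sigma) for _ in range(n)])
    for _ in range(num_samples)
]

theoretical_se = sigma / sqrt(n)       # population SD / sqrt(sample size)
simulated_mean = fmean(sample_means)   # should land near mu
simulated_se = stdev(sample_means)     # should land near theoretical_se

print(f"mean of sample means: {simulated_mean:.2f}")
print(f"SD of sample means:   {simulated_se:.2f}")
print(f"sigma / sqrt(n):      {theoretical_se:.2f}")
```

The spread of the sample means should come out close to sigma / sqrt(n), which is much smaller than the spread of individual observations.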
Regions of Sampling Distribution
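[Figure: sampling distribution of X bar centered at μ, with two arrows
starting at the center and pushing vertical lines outward.]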
Imagine that this slide has animation. Think about the arrows as both starting
out in the center and as the arrows move out they push the vertical lines with
them.
Using the cereal example, the center of the distribution is thought to be at 368.
As we move in either direction from the center we have sample means that are
possible when the population mean really is 368. But at some point as we move
out we start to doubt that our one sample mean really came from a
distribution with mean equal to 368.
Regions of Sampling Distribution
In the process of hypothesis testing the area of the sampling
distribution is divided up into regions.
The nonrejection region is the area in the middle of the
distribution. These values are relatively close to the center. So
if we get a sample value in this area we do not have enough
evidence to reject the null hypothesis.
The “tail” areas that I have on the previous screen are
considered rejection regions. While sample mean values could
occur in these regions when in fact the true mean is 368, the
probability is low, and thus this raises suspicion about the null
hypothesized value and leads us to reject the null. (Could I deal
you a royal flush in hearts from a regular deck? Yes, but the chance is
small, and it is much better under the alternative hypothesis.)
Critical Values
The values of x bar that occur where the arrows are pushed
out are called critical values of x bar. Note that the critical
values are not determined from the sample. The null
hypothesized value is also NOT determined from the
sample. Remember the null hypothesis value is determined
from past work or knowledge of some process. The critical
values are picked based on some additional ideas I want to
explore next.
Type I Error
A Type I error is a situation where you reject the null
hypothesis, Ho, when it is true and should not be rejected.
The probability of making a type I error is called alpha and
is often referred to as the level of significance.
In the cereal example if we reject the null hypothesis we
will have to shut down production and investigate the
production process to see why it is not putting in the
“correct” amount of cereal. There is a consequence to
rejecting a true null hypothesis. Depending on the nature
of the consequence we pick the value of alpha. Traditional
values of alpha are .01, .05 and .1. The choice of alpha
will be part of determining the critical x bar values.
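One way to see that alpha really is the probability of a type I error is to simulate a world where the null is true and count how often a test rejects it anyway. The sketch below assumes a normal population with standard deviation 15, samples of size 25, alpha = .05, and a two tailed Z cutoff computed from alpha; all of these are illustrative assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

random.seed(1)  # fixed seed so the run is repeatable

mu0, sigma, n = 368, 15, 25   # null mean; sigma and n are assumed values
alpha = 0.05
# Upper cutoff on the standard normal scale, with alpha/2 in each tail.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

trials = 10000   # hypothetical number of repeated samples
rejections = 0
for _ in range(trials):
    # The null is true here: every sample comes from mean mu0.
    sample = [random.gauss(mu0, sigma) for _ in range(n)]
    z_stat = (fmean(sample) - mu0) / (sigma / sqrt(n))
    if abs(z_stat) > z_crit:
        rejections += 1   # rejecting a true null is a type I error

type_one_rate = rejections / trials
print(f"share of true nulls rejected: {type_one_rate:.3f} (alpha = {alpha})")
```

The share of rejections should land close to alpha, so picking a smaller alpha means shutting down a good production line less often.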
Type II Error
A type II error is a situation where the null hypothesis is not
rejected when it should be because the null is false.
The probability of making a type II error is called beta, β.
A type II error also has consequences. In the cereal
example if we do not reject the null when we should we
could either be giving more cereal than we say we are (and
thus not charging for it – we certainly have costs in making
it), or giving less than we say we are and thus cheating
customers.
In an introductory statistics class such as ours we typically
focus on the type I error.
Critical Value approach
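[Figure: sampling distribution of X bar centered at μ = μo. The middle
is the do not reject region. Each tail, below the lower critical value
and above the upper critical value, is a reject region with area alpha/2.]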
Critical value approach
The null and alternative hypotheses can be stated in a
generic way as
Ho: μ = μo
H1: μ ≠ μo,
where μo is a specific number. In our cereal example we
would have
Ho: μ = 368
H1: μ ≠ 368.
When the alternative has a not equal sign we have what is
called a two tailed test because if we are off in either
direction we are concerned. In this case we split the
alpha value in half so that the two rejection regions have
areas that add up to alpha. If alpha = .05 we would have .025
in each tail of the distribution.
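Splitting alpha between the two tails can be turned into cutoffs on the standard normal (Z) scale, which fits the case where the population standard deviation is known. The helper name below is made up for this sketch.

```python
from statistics import NormalDist

def two_tailed_critical_z(alpha):
    """Return (lower, upper) critical Z values with alpha/2 in each tail."""
    upper = NormalDist().inv_cdf(1 - alpha / 2)
    return -upper, upper

# The traditional alpha values.
for alpha in (0.01, 0.05, 0.10):
    lower, upper = two_tailed_critical_z(alpha)
    print(f"alpha = {alpha:.2f}: reject if Z < {lower:.3f} or Z > {upper:.3f}")
```

A smaller alpha pushes the cutoffs farther from the center, so the sample has to be more extreme before the null is rejected.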
Critical Value Approach
Our context here is that we know the population standard
deviation so we use the Z table (the standard normal table).
While my graph a few slides back is of X bar, we translate
to Z values.
With alpha = .05 and thus .025 in either tail, the lower
critical Z = -1.96 and the upper critical Z = 1.96. We would
reject the null if from our sample the Zstat is less than -1.96
or greater than 1.96.
Now, let’s say we take a sample of 25 observations and we
get a mean of 372.5 grams and we know the population
standard deviation is 15. The
Zstat = (372.5 – 368)/(15/sqrt(25)) = 4.5/3 = 1.50.
Since 1.50 is between -1.96 and 1.96 we cannot
reject the null. The data are consistent with the filling process being ok!
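The cereal calculation can be reproduced in a few lines of Python. This is a sketch of the critical value approach for a two tailed test with known population standard deviation; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test_two_tailed(x_bar, mu0, sigma, n, alpha):
    """Return (z_stat, z_crit, reject) for Ho: mu = mu0 vs H1: mu != mu0."""
    z_stat = (x_bar - mu0) / (sigma / sqrt(n))      # standardize the sample mean
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)    # upper critical Z value
    return z_stat, z_crit, abs(z_stat) > z_crit

# Numbers from the cereal example: sample mean 372.5, n = 25, sigma = 15.
z_stat, z_crit, reject = z_test_two_tailed(372.5, 368, 15, 25, 0.05)
print(f"Zstat = {z_stat:.2f}, critical Z = +/-{z_crit:.2f}")
print("reject Ho" if reject else "do not reject Ho")
```

Comparing the absolute value of the Zstat to the upper critical value covers both tails at once.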
p – value approach
The critical value approach had you set up rejection
regions and in the end work with a sample. In the p – value
approach you will work with the sample almost as soon as
you can.
Remember we had a sample mean of 372.5 and the Zstat
for this is 1.50. A Z of 1.50 has area .9332 to the left and
.0668 to the right. The area to the right is the upper tail
associated with the actual sample mean. In the critical
value approach we had .025 in the upper tail. So, the
.0668, being larger than .025, tells us our sample mean is
in the do not reject region. With a two tail test we look at
the Zstat from the sample and the negative of the Zstat,
here -1.50. The two tail areas add up to .1336, which is
larger than alpha = .05.
p – value approach
The p – value for a sample mean is the probability, given the
null hypothesis is true, of getting a sample mean at least as far
out in the tail as the one we observed. If we have a two tail
test we just double the one tail value to get the p – value.
Then if p – value > alpha we do not reject the null,
but if the p – value < alpha we reject the null because we
know the Zstat is more extreme than the critical values.
If the p – value is low, then Ho must go. Note in our work
“low” is defined from problem to problem: the cutoff is the
level of significance, alpha, chosen for that problem.
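As a check on the p – value rule, the two tail p – value for the cereal sample (mean 372.5, n = 25, population standard deviation 15) can be computed directly. This is a minimal sketch with a made-up function name.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_p_value(x_bar, mu0, sigma, n):
    """Two tail p-value for Ho: mu = mu0 when sigma is known."""
    z_stat = (x_bar - mu0) / (sigma / sqrt(n))
    upper_tail = 1 - NormalDist().cdf(abs(z_stat))   # area beyond |Zstat|
    return 2 * upper_tail                            # double it for two tails

p_value = two_tailed_p_value(372.5, 368, 15, 25)
alpha = 0.05
print(f"p-value = {p_value:.4f}")
print("reject Ho" if p_value < alpha else "do not reject Ho")
```

The decision matches the critical value approach because both compare the same Zstat to the same tail areas.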
With a .01 level of significance we have .005 as the area in
each tail. We would reject the null if
1) The Zstat is less than -2.575, or
2) The Zstat is greater than 2.575.
[Figure: standard normal curve with area = .005 in each tail, below
-2.575 and above 2.575.]