z-Scores and the Normal Curve


Significance Testing
Statistical testing of the mean (z test)
Binomial Distribution
Mathematicians have worked out formulas to
estimate long-run relative frequencies for
simple events, such as how many heads will
appear in a given number of coin tosses.
The binomial distribution is one such formula.
Recall dice.
[Figure: Number of ‘heads’ in 10 flips of a coin. x-axis: number of heads; y-axis: probability (PROB).]
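The coin-flip probabilities can be checked with a few lines of Python. This is a minimal sketch, assuming a fair coin (p = 0.5); the function name `binomial_pmf` and the variable names are illustrative and do not come from the slides.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n_flips = 10
p_heads = 0.5  # assumption: the coin is fair

# One probability for each possible number of heads, 0 through 10.
probs = [binomial_pmf(k, n_flips, p_heads) for k in range(n_flips + 1)]

for k, prob in enumerate(probs):
    print(f"{k:2d} heads: {prob:.4f}")
```

The probabilities sum to 1, and the most likely outcome is 5 heads.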
Normal Distribution

We have already figured percentages of the normal curve.
Percentages of the normal correspond to probabilities of finding individual cases in the distribution.
The sampling distribution of the mean is approximately normal if N is large.
[Figure: Standard Normal Curve. x-axis: scores in standard deviations from mu (-3 to 3); y-axis: probability (relative frequency). Areas: 50 percent on each side of the mean; 34.13% between 0 and 1 SD; 13.59% between 1 and 2 SD; 2.15% between 2 and 3 SD.]
The middle 95 percent of the distribution lies within
1.96 SDs above and below the mean.
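The areas under the standard normal curve can be reproduced with Python's standard library. This is a sketch only; the helper name `area_between` is made up for illustration. Note that the band from 2 to 3 SDs computes to about 2.14 percent, which is close to the rounded 2.15% on the slide.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1

def area_between(lo, hi):
    """Proportion of the standard normal curve between lo and hi SDs."""
    return z.cdf(hi) - z.cdf(lo)

print(f"0 to 1 SD:      {area_between(0, 1):.2%}")
print(f"1 to 2 SD:      {area_between(1, 2):.2%}")
print(f"2 to 3 SD:      {area_between(2, 3):.2%}")
print(f"within 1.96 SD: {area_between(-1.96, 1.96):.2%}")
```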
Significance Testing 1



Significance testing is a ‘what if’ game.
We make an assumption and ask what will happen if the assumption is true.
The assumption is the null hypothesis.
Significance testing is based on probabilities that come from the ‘what if’ scenario (from the null hypothesis).



What if the true mean height of students at USF is 66 inches and the SD is 5 inches? What if we draw people from USF 100 at a time and plot the means?
We can figure the sampling distribution from these assumptions and figure probabilities.
The rejection region is the part of the sampling distribution that is unlikely to occur if the null is true.
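The ‘what if’ game of drawing students 100 at a time can be played by simulation. The sketch below assumes heights are normally distributed in the population (the slide does not say so) and draws 2000 samples, an arbitrary number chosen for illustration.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is repeatable

MU, SIGMA, N = 66, 5, 100  # the 'what if' values from the slide
N_SAMPLES = 2000           # assumption: how many samples of 100 to draw

# Draw N_SAMPLES samples of N heights each and keep each sample's mean.
sample_means = [
    mean(random.gauss(MU, SIGMA) for _ in range(N))
    for _ in range(N_SAMPLES)
]

print(f"mean of the sample means: {mean(sample_means):.2f}")
print(f"SD of the sample means:   {stdev(sample_means):.2f}")
```

The mean of the sample means should land near 66, and their SD should land near 5 / sqrt(100) = .5, which is the standard error of the mean.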

Significance Testing 2
Given: $\mu = 66$; $\sigma = 5$; $N = 100$
Result 1: Sampling distribution of the mean is normal (mu = 66).
Result 2: Standard error of the mean is:
$\sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{N}} = \frac{5}{\sqrt{100}} = .5$
What is a rejection region?
[Figure: Sampling Distribution of Means. x-axis: Inches (64 to 68). The Middle 95 percent lies between Lower (about 65) and Upper (about 67); the 2.5 percent in each tail is a Rejection Region.]
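The standard error and the edges of the rejection regions follow directly from the given values. This is a minimal sketch; the function name `rejection_bounds` and the `z_crit` parameter are illustrative, with 1.96 taken from the middle 95 percent of the normal curve.

```python
from math import sqrt

def rejection_bounds(mu, sigma, n, z_crit=1.96):
    """Return (standard error, lower bound, upper bound) for a two-tailed test."""
    se = sigma / sqrt(n)
    return se, mu - z_crit * se, mu + z_crit * se

se, lower, upper = rejection_bounds(mu=66, sigma=5, n=100)
print(f"standard error = {se}")
print(f"reject if the sample mean is below {lower:.2f} or above {upper:.2f}")
```

For the height example this gives bounds of 65.02 and 66.98, the "about 65" and "about 67" marked on the figure. Calling the same helper with mu=500, sigma=100, n=100 gives 480.4 and 519.6.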
Review


Suppose the population mean is 500, the population SD is 100 (SAT data), and the sample size is 100.
Draw the sampling distribution of means.




What is the shape of this distribution?
What is the mean of this distribution?
What is the standard deviation of this distribution?
Find, mark, and label the rejection regions.
Review
$\mu = 500$; $\sigma = 100$; $\sigma_{\bar{X}} = 10$
Shape is normal, mean is 500, SD is 10.
Upper = 500 + 1.96(10) = 519.6
Lower = 500 - 1.96(10) = 480.4
Rejection regions (RR): RR > 519.6 and RR < 480.4
[Figure: Sampling Distribution of Means, SAT Data. x-axis: SAT Means (440 to 560); y-axis: Relative Frequency; a rejection region (RR) marked in each tail.]
Significance Testing 3






Establish the ‘what if’ (the null hypothesis).
Collect sample data.
Examine the probability of the sample result given the null.
If the probability is low, say that the null is false, i.e., reject the null hypothesis.
This is a significance test. If we reject the null, we say the result is statistically significant.
Significance testing lets us make decisions about populations from sample data.
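The steps above can be sketched as a small z test. This is an illustration under stated assumptions, not a definitive implementation: the function name `z_test` is made up, the cutoff `alpha=0.05` is assumed because it matches the middle 95 percent (1.96 SDs) used on the slides, and the sample mean of 67.2 is hypothetical data.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed z test of a sample mean against the null mean mu0."""
    se = sigma / sqrt(n)                # standard error of the mean
    z = (sample_mean - mu0) / se        # distance from the null mean in SEs
    # Probability of a result at least this extreme if the null is true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha  # True means: reject the null

# Null from the height example (mu = 66, SD = 5, N = 100);
# the observed mean of 67.2 is hypothetical.
z, p_value, reject = z_test(sample_mean=67.2, mu0=66, sigma=5, n=100)
print(f"z = {z:.2f}, p = {p_value:.4f}, reject the null: {reject}")
```

A low probability means the sample result sits in a rejection region, so the null is rejected and the result is called statistically significant.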
Example
Mean # beers at Skipper’s Smokehouse?
Null (what if): $\mu = 4$; $\sigma = 2$
Data from Skipper’s: $\bar{X} = 5$; $N = 81$
Derive: $\sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{N}} = \frac{2}{\sqrt{81}} = .22$
CI = 4 ± .22(1.96); Lower = 3.57, Upper = 4.43
[Figure: Sampling Distribution of Means. x-axis: Mean Beers Consumed (2 to 6); y-axis: Relative Frequency; a Rejection Region in each tail; Skipper Mean = 5 marked.]
Reject the null. Result is significant.
Observed data are very unlikely if the null is
true, so we conclude the null is false. Lots of beer at
Skipper’s. Note: data are fictitious.
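The same decision can be expressed as a z score, which may make it easier to see how far the Skipper’s mean is from the null. The worked arithmetic below uses only the values given in the example; the symbol z is the usual standardized distance and is not written out on the slide.

```latex
z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}
  = \frac{5 - 4}{2 / \sqrt{81}}
  = \frac{1}{.22}
  \approx 4.5
```

Since 4.5 is well beyond 1.96, the sample mean falls in the upper rejection region.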
Review

We want to know if a workbook helps with
learning stats. We know from past classes
that students average 75 percent on the final
with a SD of 5. Our new class of 225 has a
mean of 78. Did the workbook help? (Hint:
sqrt(225) = 15.)
Review
$\mu = 75$; $\sigma = 5$; $\sigma_{\bar{X}} = .33$
Upper = 75 + 1.96(.33) = 75.65.
This is far below 78.
The workbook helps.
[Figure: Sampling Distribution of Means, Final Exam Data. x-axis: Percent Correct (70 to 80); y-axis: Relative Frequency.]
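For readers who want the intermediate steps, the standard error and the standardized distance of the class mean can be worked out from the values in the problem. The z score is an added illustration; the slide itself compares 78 with the upper bound only.

```latex
\sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{N}} = \frac{5}{\sqrt{225}} = \frac{5}{15} \approx .33
\qquad
z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{78 - 75}{5/15} = 9
```

A class mean 9 standard errors above 75 lies far inside the upper rejection region.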
Definition

The term probability refers to the long run
______.

1 Frequency of outcome
2 Odds ratio
3 Relative frequency
4 Rolling of dice
Definition

The calculation of probabilities in hypothesis
testing rests upon assumptions described in
the ______.




1 Alternative hypothesis
2 Null hypothesis
3 Sampling distribution
4 Standard error
Definition

The rejection region is the place in the
sampling distribution that is _____

1 Close to the mean
2 Obtained when the result is not significant
3 Very unlikely if the null hypothesis is true
4 Visited by losers