Stat 31, Section 1, Last Time
• T distribution
– For unknown σ, replace σ with s
– Compute with TDIST & TINV (different!)
• Paired Samples
– Similar to above, work with differences
• Inference for Proportions
– Counts & Proportions
– CIs: Best Guess & Conservative
Inference for proportions
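Case 2: Choice of Sample Size:
Idea: Given the margin of error m, find sample size n to make:
$0.95 = P\{|\hat{p} - p| \le m\}$
i.e. where
$\hat{p} - p \sim N\left(0, \sqrt{\frac{p(1-p)}{n}}\right)$
i.e. the dist'n has area 0.95 between $-m$ and $m$, so area 0.975 to the left of $m$
[Figure: dist'n of $\hat{p} - p$, with area 0.95 between $-m$ and $m$ and area 0.975 to the left of $m$]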
Sample Size for Proportions
i.e. find n so that
$0.975 = \mathrm{NORMDIST}\left(m, 0, \sqrt{\frac{p(1-p)}{n}}\right)$
i.e.
$m = \mathrm{NORMINV}\left(0.975, 0, \sqrt{\frac{p(1-p)}{n}}\right)$
Problem: in both cases, can’t “get at” n
Solution: Standardize, i.e. put on N(0,1) scale
Inference for proportions
Find n so that
$0.95 = P\{|\hat{p} - p| \le m\} = P\left\{ \frac{|\hat{p} - p|}{\sqrt{p(1-p)/n}} \le \frac{m}{\sqrt{p(1-p)/n}} \right\} = P\left\{ |Z| \le \frac{m}{\sqrt{p(1-p)/n}} \right\}$
[Figure: N(0,1) dist'n, with area 0.975 to the left of $m / \sqrt{p(1-p)/n}$]
Sample Size for Proportions
i.e. find n so that:
$\frac{m}{\sqrt{p(1-p)/n}} = \mathrm{NORMINV}(0.975, 0, 1)$
Now solve to get:
$\sqrt{n} = \frac{\mathrm{NORMINV}(0.975, 0, 1)\,\sqrt{p(1-p)}}{m}$
$n = \left(\frac{\mathrm{NORMINV}(0.975, 0, 1)}{m}\right)^2 p(1-p)$
Problem: don’t know p
Sample Size for Proportions
Solution 1: Best Guess
Use p̂ from:
– Earlier Study
– Previous Experience
– Prior Idea
Sample Size for Proportions
Solution 2: Conservative
Recall
$\max_{p \in [0,1]} p(1-p) = \frac{1}{4}$
So “safe” to use:
$n = \left(\frac{\mathrm{NORMINV}(0.975, 0, 1)}{m}\right)^2 \cdot \frac{1}{4}$
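The two sample size formulas (best guess and conservative) can be sketched in Python as a cross-check on the Excel route. This is a minimal sketch, not the class spreadsheet: the function name `sample_size_for_proportion` and its parameters are made up for illustration, and `statistics.NormalDist().inv_cdf` stands in for NORMINV(…,0,1). Rounding up to the next whole number is an assumption, made so the margin of error is no larger than m.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(m, p=None, conf=0.95):
    """Smallest n giving margin of error at most m (hypothetical helper).

    m    : desired margin of error
    p    : best-guess proportion; None means use the conservative p(1-p) = 1/4
    conf : confidence level of the interval
    """
    # For conf = 0.95 this is NORMINV(0.975, 0, 1), about 1.96
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    variance_factor = 0.25 if p is None else p * (1 - p)
    # n = (z / m)^2 * p(1 - p), rounded up to a whole observation
    return math.ceil((z / m) ** 2 * variance_factor)


print(sample_size_for_proportion(0.05))  # conservative, m = 0.05 -> 385
```

Passing a value for `p` gives the "best guess" answer; leaving it out gives the "conservative" one.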
Sample Size for Proportions
E.g. Old textbook problem 8.14 (now 8.16)
An opinion poll found that 44% of adults
agree that parents should be given
vouchers for education at a school of
their choice. The result was based on a
small sample. How large an SRS is
required to obtain a margin of error of
±0.03, in a 95% CI?
Sample Size for Proportions
E.g. Old textbook problem 8.14 (now 8.16)
See Class Example 26, Part 2:
https://www.unc.edu/~marron/UNCstat31-2005/Stat31Eg26.xls
Sample Size for Proportions
Note: conservative version not much bigger, since 0.44 ~ 0.5, so gap is small
[Figure: 0.44 and 0.5 marked on the p axis]
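To put numbers on the note, here is a rough Python check of the voucher example (p̂ = 0.44, m = 0.03, 95% CI). It is a sketch under the assumption that n is rounded up, and it has not been compared against the figures in the class spreadsheet; the variable names are illustrative only.

```python
import math
from statistics import NormalDist

m = 0.03                          # desired margin of error
z = NormalDist().inv_cdf(0.975)   # plays the role of NORMINV(0.975, 0, 1)

best_guess_factor = 0.44 * (1 - 0.44)   # p(1 - p) at the best guess, 0.2464
conservative_factor = 0.25              # max of p(1 - p), attained at p = 0.5

n_best_guess = math.ceil((z / m) ** 2 * best_guess_factor)
n_conservative = math.ceil((z / m) ** 2 * conservative_factor)

print(n_best_guess, n_conservative)  # 1052 1068
```

The two factors, 0.2464 and 0.25, differ by under 2%, which is why the two sample sizes are so close.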
Sample Size for Proportions
HW: 8.23, 8.25, give both “best guess” and “conservative” answers
Hypo. Tests for Proportions
Case 3: Hypothesis Testing
General Setup:
$H_0: p = \text{Given Value}$ (or a 1-sided version)
$H_A: p \ne \text{Given Value}$ (or a 1-sided version)
Hypo. Tests for Proportions
Assess strength of evidence by:
P-value = P{what saw or m.c. (more conclusive) | B’dry}
= P{observed p̂ or m.c. | p = Given Value}
Problem: sd of p̂ is $\sqrt{p(1-p)/n}$
Hypo. Tests for Proportions
Problem: sd of p̂ is $\sqrt{p(1-p)/n}$
Solution: (different from above “best guess” and “conservative”)
calculation is done based on: p = Given Value (the B’dry value from $H_0$)
Hypo. Tests for Proportions
e.g. Old Text Problem 8.16 (now 8.18)
Of 500 respondents in a Christmas tree
marketing survey, 44% had no children
at home and 56% had at least one child
at home. The corresponding figures
from the most recent census are 48%
with no children, and 52% with at least
one. Test the null hypothesis that the
telephone survey has a probability of
selecting a household with no children
that is equal to the value of the last
census. Give a Z-statistic and P-value.
Hypo. Tests for Proportions
e.g. Old Text Problem 8.16 (now 8.18)
Let p = % with no child (worth writing down)
$H_0: p = 0.48$
$H_A: p \ne 0.48$
Hypo. Tests for Proportions
Observed $\hat{p} = 0.44$, from $n = 500$
P-value
$= P\{\hat{p} = 0.44 \text{ or m.c.} \mid p = 0.48\}$
$= P\{|\hat{p} - p| \ge 0.04 \mid p = 0.48\}$
$= 2 \cdot P\{\hat{p} \le 0.44\}$
Hypo. Tests for Proportions
P-value $= 2 \cdot P\{\hat{p} \le 0.44\}$
= 2 * NORMDIST(0.44,0.48,sqrt(0.48*(1-0.48)/500),true)
= 0.0734
See Class Example 26, Part 3
https://www.unc.edu/~marron/UNCstat31-2005/Stat31Eg26.xls
Yes-No: no strong evidence
Gray-level: somewhat strong evidence
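The same 2-sided P-value can be reproduced outside Excel. The sketch below assumes Python's `statistics.NormalDist` as a stand-in for NORMDIST(x, mean, sd, true); the variable names are illustrative and not taken from the class spreadsheet.

```python
import math
from statistics import NormalDist

p0 = 0.48      # Given Value from H_0
p_hat = 0.44   # observed proportion
n = 500        # sample size

# sd of p-hat, computed from the H_0 value p0 (not from p-hat)
sd = math.sqrt(p0 * (1 - p0) / n)

# Same as 2 * NORMDIST(0.44, 0.48, sd, true)
p_value = 2 * NormalDist(mu=p0, sigma=sd).cdf(p_hat)

print(round(p_value, 4))  # 0.0734
```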
Hypo. Tests for Proportions
Z-score version:
P-value =
$P\{|\hat{p} - p| \ge 0.04\} = P\left\{ \frac{|\hat{p} - p|}{\sqrt{p(1-p)/n}} \ge \frac{0.04}{\sqrt{0.48(1-0.48)/500}} \right\}$
So Z-score is:
$\frac{0.04}{\sqrt{0.48(1-0.48)/500}} = 1.79$
Hypo. Tests for Proportions
Note also 1-sided version:
P-value = P{p̂ ≤ 0.44 | p = 0.48} = 0.0734 / 2 = 0.0367
Yes-no: is strong evidence
Gray Level: stronger evidence
HW: 8.19, 8.21, interpret from both
yes-no and gray-level viewpoints
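For checking HW answers, the Z-score and the 1-sided P-value for the survey example can be sketched as follows. This is an illustration using Python's standard library, not the class spreadsheet. Note that the signed Z-score comes out negative because p̂ is below the H_0 value; the 1.79 on the slide is its absolute value.

```python
import math
from statistics import NormalDist

p0, p_hat, n = 0.48, 0.44, 500

# Standardize: put p-hat on the N(0,1) scale using the H_0 value of p
sd = math.sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / sd

standard_normal = NormalDist()                      # N(0,1)
one_sided = standard_normal.cdf(z)                  # P{Z <= z}
two_sided = 2 * standard_normal.cdf(-abs(z))        # P{|Z| >= |z|}

print(round(z, 2), round(one_sided, 4), round(two_sided, 4))
```

The 1-sided value is half the 2-sided one, which is why the same data read as stronger evidence in the 1-sided test.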
And now for something
completely different….
Another fun movie
Thanks to Trent Williamson
Chapter 9: Two-Way Tables
Main idea: Divide up populations in two ways
– E.g. 1: Age & Sex
– E.g. 2: Education & Income
• Typical Major Question: How do divisions relate?
• Are the divisions independent?
– Similar idea to independence in prob. theory
– Statistical Inference?
Two-Way Tables
Class Example 40, Textbook Problem 9.20
Market Researchers know that background
music can influence mood and
purchasing behavior. A supermarket
compared three treatments: No music,
French accordion music and Italian
string music. Under each condition, the
researchers recorded the numbers of
bottles of French, Italian and other wine
purchased.
Two-Way Tables
Class Example 40, Textbook Problem 9.20
Here is the two way table that summarizes
the data:
                 Wine:
Music      French   Italian   Other
None         30       11       43
French       39        1       35
Italian      30       43       35
Are the type of wine purchased, and the
background music related?
Two-Way Tables
Class Example 40: Visualization
[Bar chart: "Class Example 40 - Counts"; vertical axis: # Bottles purchased; horizontal axis: Music (None, French, Italian); series: French Wine, Italian Wine, Other Wine]
Shows how counts are broken down by:
– music type
– wine type
Two-Way Tables
Big Question: Is there a relationship?
[Bar chart: "Class Example 40 - Counts"; vertical axis: # Bottles purchased; horizontal axis: Music (None, French, Italian)]
Note: tallest bars
French Wine ↔ French Music
Italian Wine ↔ Italian Music
Other Wine ↔ No Music
Suggests there is a relationship
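The "tallest bars" observation can be checked directly from the counts. The sketch below types in the two-way table from the slide as a nested dictionary (the layout and names are illustrative, not the spreadsheet's) and finds, for each wine type, the music condition with the largest count.

```python
# Counts from the two-way table: counts[music][wine]
counts = {
    "None":    {"French": 30, "Italian": 11, "Other": 43},
    "French":  {"French": 39, "Italian": 1,  "Other": 35},
    "Italian": {"French": 30, "Italian": 43, "Other": 35},
}
wines = ["French", "Italian", "Other"]

# For each wine type, which music condition has the tallest bar?
tallest = {
    wine: max(counts, key=lambda music: counts[music][wine])
    for wine in wines
}

for wine, music in tallest.items():
    print(f"{wine} wine: tallest bar under {music} music")
```

This only describes the pattern in the chart; it does not say whether the pattern could be a chance effect.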
Two-Way Tables
General Directions:
• Can we make this precise?
• Could it happen just by chance?
– Really: how likely to be a chance effect?
• Or is it statistically significant?
– I.e. music and wine purchase are related?
Two-Way Tables
Class Example 40, a look under the hood…
Excel Analysis, Part 1:
https://www.unc.edu/~marron/UNCstat31-2005/Stat31Eg40.xls
Notes:
• Read data from file
• Only appeared as column
• Had to re-arrange
• Better way to do this???
• Made graphic with chart wizard
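One possible answer to "better way to do this???" is to re-arrange the single column programmatically. The sketch below is only an illustration: the order of values in the file is not given on the slide, so the column order used here (one music condition at a time, wines in the order French, Italian, Other) is an assumption, as are all the names.

```python
# ASSUMED file order: None music first, then French, then Italian music,
# each with counts for French, Italian, Other wine.
column = [30, 11, 43, 39, 1, 35, 30, 43, 35]

music_types = ["None", "French", "Italian"]
wine_types = ["French", "Italian", "Other"]
width = len(wine_types)

# Cut the column into consecutive chunks of 3, one chunk per music type
table = {
    music: dict(zip(wine_types, column[i * width:(i + 1) * width]))
    for i, music in enumerate(music_types)
}

for music, row in table.items():
    print(music, row)
```

If the file lists the counts wine by wine instead, swapping the roles of `music_types` and `wine_types` handles that case.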
Two-Way Tables
HW:
Make 2-way bar graphs, and discuss
relationships between the divisions, for
the data in:
9.1 (younger people tend to be better educated)
9.13, 9.15 (you try these…)
Two-Way Tables
An alternate view:
Replace counts by proportions (or %-ages)
Class Example 40 (Wine & Music), Part 2
https://www.unc.edu/~marron/UNCstat31-2005/Stat31Eg40.xls
Advantage:
May be more interpretable
Drawback:
No real difference (just rescaled)
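A minimal sketch of the rescaling, assuming the proportions are taken out of the grand total of all bottles (which matches "just rescaled"; the spreadsheet may instead use row or column totals). The counts are typed in from the two-way table and the names are illustrative.

```python
music_types = ["None", "French", "Italian"]
wine_types = ["French", "Italian", "Other"]

# Rows: music type; columns: wine type (counts from the two-way table)
counts = [
    [30, 11, 43],
    [39, 1, 35],
    [30, 43, 35],
]

total = sum(sum(row) for row in counts)  # grand total of bottles

# Every count is divided by the same number, so the picture keeps its shape
proportions = [[count / total for count in row] for row in counts]

for music, row in zip(music_types, proportions):
    print(music, [f"{100 * value:.1f}%" for value in row])
```

Because every cell is divided by the same total, the relative heights of the bars are unchanged; only the vertical scale differs.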