Older Version of 4.1 Notes

Ch 4 - Designing
Studies
I can identify the population and
sample in a survey.
Population - the entire group of individuals
about which we want information.
Sample - the part of the population from
which we actually collect information.
Sample Survey
1st - determine what population we want to
describe
2nd - determine exactly what we want to
measure (define our variables)
The student government at a high school
surveys 100 of the students at the school
to get their opinions about a change to the
bell schedule.
What’s the population? Sample?
What was being studied?
An archaeological dig turns up large
numbers of pottery shards, broken stone
tools, and other artifacts. Students working
on the project classify each artifact and
assign it a number. The counts in different
categories are important for understanding
the site, so the project director chooses 2%
of the artifacts at random and checks the
students’ work. Identify the population and
sample.
I can understand two types of bias
in sampling.
Bias - when the design of a study systematically
favors certain outcomes
Convenience Sample - choosing individuals
who are easiest to reach
Voluntary Response Sample - when people
choose themselves for the sample by
responding to a general appeal.
Why does each lead to bias?
Convenience Bias
• unrepresentative of the entire population because
answers will be influenced by where you are.
• e.g., if you are surveying how people feel about the
library tax and only ask people who are at the library
Voluntary Response Bias
• only people with strong opinions (in either direction)
will respond.
• people can respond more than once
• e.g., call-ins, write-ins, internet voting
When you identify the bias - also state
IN WHICH DIRECTION!
You are on the staff of a member of Congress who is
considering a bill that would provide government-sponsored insurance for nursing-home care. You
report that 1128 letters have been received on the
issue, of which 871 oppose the legislation. “I’m
surprised that most of my constituents oppose the
bill. I thought it would be quite popular,” says the
congresswoman. Are you convinced that a majority
of the voters oppose the bill? How would you explain
the statistical issue to the congresswoman?
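A small calculation can help frame the issue. The sketch below first computes the share of letters that oppose the bill, then shows how a minority of voters could still produce most of the mail if opponents are more motivated to write. The 40% opposition rate and both letter-writing rates are made-up assumptions for illustration only, not data from the problem.

```python
# Share of the letters received that oppose the bill (numbers from the problem).
letters_total = 1128
letters_oppose = 871
observed_share = letters_oppose / letters_total
print(f"Share of letters opposing: {observed_share:.1%}")

# Hypothetical district: all three values below are assumptions.
true_oppose = 0.40          # assumed share of all voters who oppose the bill
write_rate_oppose = 0.05    # assumed chance that an opponent writes a letter
write_rate_support = 0.01   # assumed chance that a supporter writes a letter


def share_of_letters_opposing(p_oppose, rate_oppose, rate_support):
    """Expected fraction of all letters that come from opponents."""
    from_opponents = p_oppose * rate_oppose
    from_supporters = (1 - p_oppose) * rate_support
    return from_opponents / (from_opponents + from_supporters)


expected_share = share_of_letters_opposing(
    true_oppose, write_rate_oppose, write_rate_support
)
print(f"Expected share of letters opposing: {expected_share:.1%}")
```

Under these assumed rates, about 77% of the letters would oppose the bill even though only 40% of voters do. The letters describe the people who chose to write, not the voters as a whole.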
In June 2008, Parade magazine posed the
following question: “Should drivers be banned
from using all cell phones?” Readers were
encouraged to vote online at parade.com. The
July 13, 2008 issue reported 2407 (85%) said
“Yes” and 410 (15%) said “No.”
a) What type of sample did Parade survey
obtain?
b) Explain why this sample is biased. Is 85% likely
too high or too low? Why?
HW: Day 1 on the outline
So, what’s a good method?
SRS - simple random sample - n individuals
chosen from a population in such a way
that every set of n individuals has an equal
chance to be in the sample actually
selected.
Random Rectangle Activity
HW: day 2 on outline
How can I select a SRS of 4 students from
this class?
Ideas: put all names in a hat, on equally
sized slips of paper and select 4 of them
Assign everyone a number and use a
Random Digit Table (Table D) to select the
four people
How to use Table D
1. assign every individual in the population a numerical label.
2. every label must have the same number of digits
(enough digits to label the whole population).
3. start with 0 (or 00 or 000...)
4. decide what to do if you get a repeated label or
a label not in the range you need
5. pick a line to start at and read consecutive
groups of digits to select your sample
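The steps above can be sketched in Python. This is only an illustration: the class size of 30, the seed, and the helper names `make_digit_line` and `srs_from_digits` are assumptions, and the generated string of digits stands in for one line of Table D rather than reproducing the actual table.

```python
import random


def make_digit_line(n_digits, seed):
    """Stand-in for one long line of a random digit table (not the real Table D)."""
    rng = random.Random(seed)
    return "".join(str(rng.randrange(10)) for _ in range(n_digits))


def srs_from_digits(digits, population_size, sample_size):
    """Read consecutive groups of digits and return the selected labels."""
    # Labels run from 0 to population_size - 1, all written with the same width.
    label_width = len(str(population_size - 1))
    chosen = []
    for start in range(0, len(digits) - label_width + 1, label_width):
        label = int(digits[start:start + label_width])
        if label >= population_size:   # label not in the range we need: skip it
            continue
        if label in chosen:            # repeated label: skip it
            continue
        chosen.append(label)
        if len(chosen) == sample_size:
            return chosen
    raise ValueError("ran out of digits before the sample was complete")


# Assumed class of 30 students labeled 00 to 29; select 4 of them.
digits = make_digit_line(200, seed=4)
sample = srs_from_digits(digits, population_size=30, sample_size=4)
print(sample)
```

For a hand check, the digit string `45120712992903` read two digits at a time gives 45 (skip), 12, 07, 12 (repeat, skip), 99 (skip), 29, 03, so the students labeled 12, 07, 29, and 03 are selected.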
Day 3
Other Sampling Methods
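Stratified Random Sample: when the
population is first divided into groups of similar
individuals (strata) and a separate SRS is chosen
from each group.
Cluster Sample: when the population is divided
into smaller groups (clusters) that each mirror the
characteristics of the population, and whole
clusters are chosen at random.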
Other Sampling Methods
Stratified Random Sample
• first, classify population into similar groups (strata)
• next, choose a separate SRS from each stratum
• combine all SRSs to form the full sample
• groups are homogeneous - like the math class you’re in
Cluster Sample
• first, divide the population into smaller groups (mirror the
population)
• next, choose an SRS of the clusters
• all individuals in each cluster are included in the sample
• groups are heterogeneous - like CLC groups
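The two procedures can be sketched side by side in Python. The school below is hypothetical: the grade levels used as strata, the homerooms used as clusters, the group sizes, and the function names are all assumptions made for illustration.

```python
import random


def stratified_sample(strata, per_stratum, rng):
    """Choose a separate SRS from each stratum and combine them."""
    sample = []
    for name, members in strata.items():
        sample.extend(rng.sample(members, per_stratum[name]))
    return sample


def cluster_sample(clusters, n_clusters, rng):
    """Choose an SRS of clusters and keep everyone in the chosen clusters."""
    chosen_names = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen_names:
        sample.extend(clusters[name])
    return sample


rng = random.Random(1)

# Hypothetical strata: four grade levels with 50 students each.
grades = ("9th", "10th", "11th", "12th")
strata = {g: [f"{g}-student{i}" for i in range(50)] for g in grades}

# Hypothetical clusters: eight homerooms with 25 students each.
clusters = {f"room{r}": [f"room{r}-student{i}" for i in range(25)] for r in range(8)}

strat = stratified_sample(strata, {g: 5 for g in grades}, rng)
clus = cluster_sample(clusters, 2, rng)
print(len(strat), "students from the stratified sample")
print(len(clus), "students from the cluster sample")
```

The stratified sample always contains some students from every grade, while the cluster sample contains every student from just two homerooms.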
A manager of a beach-front hotel wants to survey guests in the hotel to
estimate overall customer satisfaction. The hotel has two towers, an older
one to the south and a newer one to the north. Each tower has 10 floors of
standard rooms (40 rooms per floor) and 2 floors of suites (20 suites per
floor). Half of the rooms in each tower face the beach, while the other half of
the rooms face the street. There are a total of 880 rooms.
a) Explain how to select a simple random sample of 88 rooms.
b) Explain how to select a stratified random sample of rooms.
c) Explain how to select a cluster sample of rooms.
d) Explain why selecting 2 of the 24 different floors would not be a good way
to obtain a cluster sample.
Advantages & Disadvantages
SRS
Advantages:
• simple to carry out
• each individual in the population has the same chance
to be selected
Disadvantage:
• chance of over- or under-representing groups in the sample
Stratified
Advantages:
• no chance of over- or under-representing the groups used as strata
• each individual in the population still has the same
chance of being selected (when each stratum is sampled in
proportion to its size)
Disadvantage:
• a little more complicated to execute
Clustering
Advantages:
• can be convenient when groups are already “created”
• each individual in the population still has the same
chance of being selected
Disadvantage:
• chance of over- or under-representing groups in the sample
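A short simulation can illustrate the over- and under-representation point. The sketch below repeatedly draws an SRS and a proportionally allocated stratified sample from the same population and compares how much the two estimates vary. The population, the three strata, their sizes, and their satisfaction rates are all hypothetical values chosen for illustration.

```python
import random
import statistics

rng = random.Random(2013)

# Hypothetical population: stratum name -> (size, proportion answering "yes").
strata_spec = {"elementary": (1000, 0.9), "middle": (800, 0.5), "high": (700, 0.1)}

# Build each stratum as a list of responses: 1 = yes, 0 = no.
strata = {}
for name, (size, p_yes) in strata_spec.items():
    n_yes = round(size * p_yes)
    strata[name] = [1] * n_yes + [0] * (size - n_yes)

population = [x for members in strata.values() for x in members]
true_p = sum(population) / len(population)

n = 200  # assumed total sample size
# Proportional allocation: each stratum gets its share of the 200.
alloc = {name: n * len(members) // len(population) for name, members in strata.items()}


def srs_estimate():
    """Sample proportion from one SRS of n individuals."""
    return sum(rng.sample(population, n)) / n


def stratified_estimate():
    """Sample proportion from separate SRSs in each stratum, combined."""
    yes_total = sum(
        sum(rng.sample(members, alloc[name])) for name, members in strata.items()
    )
    return yes_total / n


reps = 2000
srs_results = [srs_estimate() for _ in range(reps)]
strat_results = [stratified_estimate() for _ in range(reps)]
sd_srs = statistics.stdev(srs_results)
sd_strat = statistics.stdev(strat_results)

print(f"true proportion: {true_p:.3f}")
print(f"SRS:        mean {statistics.mean(srs_results):.3f}, sd {sd_srs:.4f}")
print(f"stratified: mean {statistics.mean(strat_results):.3f}, sd {sd_strat:.4f}")
```

Both methods center on the true proportion, but the stratified estimates vary less because every sample contains the same mix of groups. The gain shrinks when the groups hold similar opinions.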
A Sample Free Response
In response to nutrition concerns raised last year about food served in
school cafeterias, the Smallville School District entered into a one-year
contract with the Healthy Alternative Meals (HAM) company. Under this
contract, the company plans and prepares meals for 2,500 elementary,
middle, and high school students, with a focus on good nutrition. The school
administration would like to survey the students in the district to estimate the
proportion of students who are satisfied with the food under this contract.
Two sampling plans for selecting the students to be surveyed are under
consideration by the administration. One plan is to take a simple random
sample of students in the district and then survey those students. The other
plan is to take a stratified random sample of students in the district and then
survey those students.
(a) Describe a simple random sampling procedure that the administrators
could use to select 200 students from the 2,500 students in the district.
(b) If a stratified random sampling procedure is used, give one example of
an effective variable on which to stratify in this survey. Explain your
reasoning.
(c) Describe one statistical advantage of using a stratified random sample
over a simple random sample in the context of this study.
Answers to part A
Answers to part B
Answers to part C
What type of sampling is this?
At a party there are 30 students over age 21 and
20 students under age 21. You choose at random
3 of those over 21 and separately at random 2 of
those under 21 to interview about attitudes towards
alcohol. You have given every student at the party
the same chance to be interviewed.
What is that chance?
What type of sampling procedure was this?
HINT: an SRS allows for
a sample to have all of a
certain “group” or none of a
“group”
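The hint can be checked with a short calculation. The sketch below computes each student's chance of being interviewed and then counts how many different samples this procedure can produce compared with an SRS of 5 from all 50 students. The variable names are illustrative only.

```python
from fractions import Fraction
from math import comb

# Chance of being interviewed within each age group.
chance_over_21 = Fraction(3, 30)
chance_under_21 = Fraction(2, 20)
print(chance_over_21, chance_under_21)

# Samples this procedure can produce: 3 of the 30 AND 2 of the 20.
possible_this_way = comb(30, 3) * comb(20, 2)

# Samples an SRS of 5 from all 50 students can produce.
possible_srs = comb(50, 5)
print(possible_this_way, possible_srs)
```

Both chances equal 1/10, yet the procedure can produce only 771,400 of the 2,118,760 samples an SRS could give. A sample of five students who are all over 21, for example, is impossible here, so giving every individual the same chance is not enough to make a sample an SRS.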
On the west side of Rocky Mountain National Park,
many mature pine trees are dying due to infestation by
pine beetles. Scientists would like to use sampling to
estimate the proportion of all pine trees in the area that
have been infested.
Why would an SRS not be practical?
Could they just sample the pines along the road?
Suppose the sampling was carried out randomly and
accurately and 35% of the pine trees sampled were
infested. Can they conclude 35% of all pine trees are
infested?
Day 4
Inference and what can go
wrong?
Why do we sample?
to infer about a population
surveying a population takes too much time and money!
Can we trust it?
• YES - the laws of probability allow random
sampling to work!
• there are margins of error to account for the
variability between the sample and the population.
Nothing was wrong with the procedure!
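A quick simulation can show what this variability looks like. The sketch below draws many random samples from a hypothetical population in which exactly 35% of individuals have the trait of interest. The population size of 10,000, the sample size of 100, and the number of repetitions are all assumptions.

```python
import random
import statistics

rng = random.Random(35)

# Hypothetical population: 10,000 individuals, exactly 35% with the trait (1 = yes).
population = [1] * 3500 + [0] * 6500

sample_size = 100   # assumed size of each random sample
reps = 1000         # assumed number of repeated samples

# Sample proportion from each of the repeated random samples.
props = [sum(rng.sample(population, sample_size)) / sample_size for _ in range(reps)]

mean_prop = statistics.mean(props)
sd_prop = statistics.stdev(props)
print(f"smallest {min(props):.2f}, largest {max(props):.2f}")
print(f"mean {mean_prop:.3f}, sd {sd_prop:.3f}")
```

Individual samples rarely hit 35% exactly, but the results center on the true value with no tendency to run high or low. That sample-to-sample spread is what a margin of error describes.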
What can go wrong?
Different types of error can cause
sampling to go wrong:
Sampling Errors
• Voluntary Response
• Convenience Sample
• Undercoverage
(we already know about voluntary and convenience)
Nonsampling Errors
• Nonresponse
• Response Bias
• Wording of Question
Undercoverage
*when some groups in the population are left
out of the process when choosing the
sample*
Example: if you were to go to people’s
houses and survey about the unemployment
rate - you are leaving out all the homeless
people and those who have jobs and are not
home.
Nonresponse
*when an individual chosen for the sample can’t
be contacted or refuses to participate*
WARNING: this is not voluntary response bias;
these individuals were chosen to be in the
sample and do not want to be
Example: when you make a call at SIRS and they hang up on
you or rudely tell you why they don’t want to
participate!
Response Bias
*when the individual gives an inaccurate or untruthful answer*
Many factors contribute to this:
• people know what the answer should be and
give that
• what the interviewer looks like
• recalling past events
Wording of Questions
*confusing/leading questions that lead to a
certain response*
MOST IMPORTANT INFLUENCE ON
ANSWERS! Never trust a survey unless you
have seen the questions!
Examples: Order of the questions, any
prompts/cues given before the question
Ch 4 Project
by yourself or with a partner
You will design and conduct an experiment to investigate the effects of
response bias in surveys
You can choose the topic, but you must design your experiment to answer
one of the following questions:
1. Can the wording of a question create response bias?
2. Do the characteristics of the interviewer create response bias?
3. Does anonymity change the responses to sensitive questions?
4. Does manipulating the answer choices change the response?
Ch 4 Project
see page 267 for what is required
I will hand out a rubric - USE IT!
Not only are you going to analyze the survey
results, you will also analyze whether the way the survey was
conducted biased the results
due: October 31, 2013
(approved by Friday, October 24)
start on HW: day 4