Wording the question

Sample surveys and polls
Year   Sample size   Winner       Gallup prediction   Election result   Error
1936   ~50,000       Roosevelt    55.7% ↑             62.5%             -6.8%
1940   ~50,000       Roosevelt    52.0% ↑             55.0%             -3.0%
1944   ~50,000       Roosevelt    51.5% ↑             53.8%             -2.3%
1948   ~50,000       Truman       44.5% ↓             49.5%             -5.0%
1952   5,385         Eisenhower   51.0% ↑             55.4%             -4.4%
1956   8,144         Eisenhower   59.5% ↑             57.8%             +1.7%
1960   8,015         Kennedy      51.0% ↑             50.1%             +0.9%
1964   6,625         Johnson      64.0% ↑             61.3%             +2.7%
1968   4,414         Nixon        43.0% ↑             43.5%             -0.5%
1972   3,689         Nixon        62.0% ↑             61.8%             +0.2%
1976   3,439         Carter       48.0% ↓             50.1%             -2.1%
1980   3,500         Reagan       47.0% ↑             50.8%             -3.8%
1984   3,456         Reagan       59.0% ↑             59.2%             +0.2%
1988   4,089         Bush         56.0% ↑             53.9%             +2.1%
1992   2,019         Clinton      49.0% ↑             43.3%             +5.7%
1996   2,417         Clinton      52.0% ↑             50.1%             +1.9%
2000   3,129         Bush         48.0% ↑             47.9%             +0.1%
2004   1,866         Bush         49.0% ↔             51.0%             -2.0%
Some classic mistakes
The Literary Digest Poll
• 1936 presidential election: Franklin Delano Roosevelt vs. Alf Landon
• The Literary Digest had called every presidential election since 1916
• Sample size: 2.4 million!
• They predicted Roosevelt would lose, with only 43% of the vote
• In fact it was a landslide for Roosevelt at 62%
Literary Digest poll
• Context
  – Midst of the Great Depression
  – 9 million unemployed; real income down by one third
  – Landon: “Cut spending”
  – Roosevelt: “Balance the people’s budgets before the government’s budget”
• How the polling was done
  – Survey sent to 10 million people
  – 2.4 million responded (huge!)
Literary Digest poll was biased
• Sampling frame not representative
  – Phone numbers, subscription lists, drivers’ registrations, country club memberships
  – Lists not representative
  – Telephones were a luxury
  – Biased toward better-off groups (and more Republican)
  – Selection bias and non-response bias
• Voluntary response bias
  – Main issue was the economy
  – The anti-Roosevelt forces were angry---and had a higher response rate!
Beginning of the Gallup Poll and
scientific sampling methods
• The young pollster George Gallup used a sample of 3,000 of the 2.4 million responses to reproduce the Literary Digest’s prediction
• Then, by using a completely different sample of 50,000, Gallup predicted 56% for Roosevelt and 44% for Landon
• Roosevelt received 62% of the vote
• Gallup used random sampling methods
• Despite the improvement, note the bias against the Democratic candidates from 1936 to 1948
• This had disastrous consequences in 1948
The Year the Polls Elected Dewey
• 1948 Election: Harry Truman versus Thomas Dewey
• Every major poll (including Gallup) predicted Dewey would win by 5 percentage points
What went wrong?
• Pollsters chose their samples using quota sampling
• Each interviewer was assigned a fixed quota of subjects in certain categories (race, sex, age)
• E.g., a Gallup Poll interviewer in St. Louis was required to interview 13 people, of whom
  – 6 lived in the suburbs, 7 in the central city
  – 7 were men and 6 women; of the 7 men (similar for the women):
    • 3 under 40 years old, 4 over 40
    • 1 Black, 6 white
• Even the monthly rentals paid by the subjects were specified
• Within each category, interviewers were free to choose whom to interview
• This left room for human choice and inevitable bias
• Republicans were easier to reach
  – Had telephones, permanent addresses, “nicer” neighborhoods
• Interviewers ended up with too many Republicans
• Quota sampling was abandoned in favor of random sampling
How surveys can get it wrong
• Sampling error
  – Errors caused by taking a sample (versus a census)
• Random sampling error
  – Deviation between the statistic and the parameter
  – Error due to chance, inevitable with a random sample
  – The margin of error in a confidence statement includes only random sampling error
• Non-sampling error
  – Errors not related to the act of selecting a sample
  – Could happen even in a census
• The distinction between sampling error and non-sampling error: could it happen in a census?
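As a quick illustration of random sampling error, here is a small Python sketch that draws repeated simple random samples from a simulated population; the population, its size, and the 62% parameter are all invented for illustration and are not from the source.

```python
import random

# Simulated illustration of random sampling error: the sample statistic
# (a proportion here) varies by chance around the population parameter.
# The population and its 62% parameter are invented for illustration.
random.seed(42)
population = [1] * 62_000 + [0] * 38_000         # parameter p = 0.62

for _ in range(3):
    sample = random.sample(population, 1_000)    # simple random sample of 1,000
    p_hat = sum(sample) / len(sample)            # statistic
    print(f"sample proportion = {p_hat:.3f}, random sampling error = {p_hat - 0.62:+.3f}")
```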
Sampling error
• Most common form is undercoverage
• The sampling frame leaves out parts of the population
• Using telephone directories for a phone survey
  – Half the households in large cities are unlisted
  – About 5% of households are without phones
• Random digit dialing
  – Misses students in dorms, inmates in prison, soldiers in the military, homeless people
  – Too expensive to call Hawaii and Alaska
Nonsampling error
• From the Gannett News Service, Lafayette Journal and Courier, Nov. 24, 1983
• The initial release of income data from the 1980 census showed Stumpy Point, North Carolina (pop. 205) with a median household income of $84,413
• Income from census forms was entered in tens of dollars: $8,000 should be entered as “0800”. Many incomes were incorrectly entered as “8000”, which the computer read as $80,000.
• An example of processing error
• Response error is another kind of nonsampling error
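A tiny sketch of the data-entry arithmetic behind this processing error; the decode_income helper is hypothetical and exists only to show how a field keyed in tens of dollars turns a mis-keyed “8000” into $80,000.

```python
# Sketch of the Stumpy Point processing error. The helper name is
# hypothetical; census incomes were keyed in tens of dollars, so the
# correct entry for $8,000 is "0800".
def decode_income(field: str) -> int:
    """Interpret a keyed field that holds income in tens of dollars."""
    return int(field) * 10

print(decode_income("0800"))   # correct entry   -> 8000   ($8,000)
print(decode_income("8000"))   # mis-keyed entry -> 80000  ($80,000)
```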
Nonsampling error: nonresponse
• Serious problem facing sample surveys
• Common for opinion polls and market research studies to have 75% to 80% nonresponse rates
• Current Population Survey (US Bureau of Labor Statistics and Census Bureau): 6-7% nonresponse rate
• General Social Survey (U of Chicago):
  – Run by a university
  – Contacts people in person, goes house to house
  – Many advantages
  – 24% nonresponse rate
Wording the question
• Do you agree? (From The New York Times, April, 1982)
– (1) “A freeze in nuclear weapons should be opposed because
it would do nothing to reduce the danger of thousands of
nuclear weapons already in place and would leave the Soviet
Union in a position of nuclear superiority.”
– (2) “A freeze in nuclear weapons should be favored because it
would begin a much-needed process to stop everyone in the
world from building nuclear weapons now and reduce the
possibility of nuclear war in the future.”
• Results: 58% agreed with (1), 56% agreed with (2), and 27% agreed with both!
Open versus closed questions
• “What do you think is the most important problem facing the country today?”
• “Which of the following do you think is the most important problem facing the country today---the energy shortage, the quality of public schools, legalized abortion, or pollution---or, if you prefer, you may name a different problem as most important.”
  – From “Problems in the use of survey questions to measure public opinion,” Science, Volume 236 (1987)
Open versus closed questions
• Results of 171 responses to the open question and 178 responses to the closed question

Problem      Open     Closed
Energy        0.0%      5.6%
Schools       1.2%     32.0%
Abortion      0.0%      8.4%
Pollution     1.2%     14.0%
Others       93.0%     39.3%
Don’t know    4.7%      0.6%
Response bias
• People may respond differently from what they actually believe
• Deliberate bias
  – “Do you agree that abortion, the murder of innocent beings, should be outlawed?”
• Unintentional bias
  – “Do you or do you not use drugs?”
• People often want to please the interviewer
  – “Do you think your professor is doing a good job teaching statistics?”
• Responses are affected by the sex, attire, race, and behavior of the interviewer
• Wording, ordering, and complexity of questions also matter
Another type of response bias
• “Some people say that the 1975 Public Affairs Act should be repealed. Do you agree or disagree that it should be repealed?” (Washington Post, Feb. 1995)
• Results: For repeal: 24%, Against repeal: 19%, No opinion: 57%
• There is no such thing as the Public Affairs Act!
How to cope with errors:
weighting the sample
“The sample first was weighted to take into account unequal
probabilities of selection from sampling: Weighting accounts for
the number of telephones going into the household, and household
size. It then was weighted for age, gender, and education to take
care of minor fluctuations in the sample, and align it with the
findings of the 2000 Census of the adult population. It is assumed
to be representative of all Minnesota households with telephones,
within the margin of sampling error.”
– How the Poll was Conducted,
Minneapolis Star Tribune
Weighting responses in a sample
• Weighting responses is a common method to deal with nonresponse
• Example for a telephone poll: Suppose women are twice as likely to answer the phone as men
• Then weight the survey results by multiplying women’s responses by ½
• For instance: “Will you vote for X?”
  – Responses: 150 men (90 Yes, 60 No); 300 women (100 Yes, 200 No)
• After weighting:
  – 150 men: 90 Yes, 60 No
  – 150 women: 50 Yes, 100 No
• Report the weighted sample proportion: (90 + 50)/300 = 46.67%
• In practice, it’s very complicated
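A minimal Python sketch of this weighting calculation, using the response counts from the slide; the weight of ½ for women reflects the slide’s assumption that women were twice as likely to answer the phone.

```python
# Sketch of the weighted-proportion calculation from the slide.
# Assumption (from the slide): women are twice as likely to answer the
# phone as men, so each woman's response is given weight 1/2.
responses = {
    "men":   {"yes": 90,  "no": 60,  "weight": 1.0},
    "women": {"yes": 100, "no": 200, "weight": 0.5},
}

weighted_yes = sum(g["yes"] * g["weight"] for g in responses.values())
weighted_total = sum((g["yes"] + g["no"]) * g["weight"] for g in responses.values())

# (90*1.0 + 100*0.5) / (150*1.0 + 300*0.5) = 140 / 300
print(f"Weighted proportion voting for X: {weighted_yes / weighted_total:.2%}")
```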
Stratified sampling
• More complex sampling methods to ensure better representation
• Goal: a random sample of 240 Carleton students
• To ensure each discipline is represented, divide into strata according to population:
  – Arts and Literature 20%
  – Humanities 15%
  – Social Sciences 30%
  – Mathematics and Natural Sciences 35%
• Within each discipline, choose at random
• Choose 240 x .20 = 48 Arts and Literature students
  240 x .15 = 36 Humanities
  240 x .30 = 72 Social Sciences
  240 x .35 = 84 Mathematics and Natural Sciences
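A short Python sketch of proportional allocation with simple random selection within each stratum; the stratum shares come from the slide, while the student roster and ID format are invented for illustration.

```python
import random

# Proportional allocation for a stratified sample of 240 students.
# The stratum shares come from the slide; the student rosters are made up.
total_n = 240
strata = {
    "Arts and Literature": 0.20,
    "Humanities": 0.15,
    "Social Sciences": 0.30,
    "Mathematics and Natural Sciences": 0.35,
}

# Hypothetical sampling frame: a list of student IDs for each stratum.
frame = {name: [f"{name[:4]}-{i:04d}" for i in range(500)] for name in strata}

sample = {}
for name, share in strata.items():
    n_stratum = round(total_n * share)                    # 48, 36, 72, 84
    sample[name] = random.sample(frame[name], n_stratum)  # SRS within the stratum
    print(f"{name}: {n_stratum} students sampled")
```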
Stratified sampling
• Advantages: The sample will be representative for the strata; can gain precision in the estimate
• Disadvantages: Logistically difficult; must know about the population; may not be possible
• Note that technically a stratified sample is not a simple random sample
  – Every possible group of 240 students is not equally likely to be selected
Cluster sampling
• Warehouse contains 10,000 window frames stored on pallets
• Each pallet contains 20 to 30 window frames
• Goal: Estimate how many window frames have wood rot
• Would like to sample about 500 frames
• Cluster sample
  – Sample pallets, not windows. Choose, say, 20.
  – Include in the sample all the windows on each chosen pallet
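A Python sketch of the pallet example; the warehouse contents and the 8% rot rate are simulated, and the point is only that whole pallets are selected and every frame on a chosen pallet is inspected.

```python
import random

# Cluster sample for the warehouse example: sample whole pallets, then
# inspect every frame on each chosen pallet. The warehouse is simulated,
# with an assumed 8% wood-rot rate, purely for illustration.
random.seed(1)
pallets = [[random.random() < 0.08 for _ in range(random.randint(20, 30))]
           for _ in range(400)]                 # ~10,000 frames in ~400 pallets

chosen = random.sample(pallets, 20)             # choose 20 pallets at random
frames_inspected = sum(len(p) for p in chosen)  # roughly 400-600 frames
rotten = sum(sum(p) for p in chosen)            # every frame on a chosen pallet is checked

print(f"Inspected {frames_inspected} frames; estimated rot rate {rotten / frames_inspected:.1%}")
```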
Cluster sampling
• Door-to-door surveys
  – City blocks are the clusters
• Survey farms throughout the Midwest on pesticide use
  – Counties are the clusters
• Airlines get customer opinions
  – Individual flights are the clusters
• Advantage: Much easier to implement, depending on context
• Disadvantage: Greater sampling variability; less statistical accuracy
Current Population Survey:
Multistage cluster sampling
• The country is divided into 2,007 Primary Sampling Units (PSUs)
• Stage 1: 792 PSUs chosen (but not quite at random)
  – 432 highly populated PSUs (like Chicago and LA) are automatically in the sample
• PSUs are divided into smaller census blocks
• Blocks are grouped into strata
• Households in each block are grouped into clusters of about 4 households each
• The final sample consists of clusters, and interviewers go to all households in the chosen clusters
• Offers some of the advantages of quota sampling but with no selection bias
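A heavily simplified Python sketch of the multistage idea (certainty PSUs, then blocks within PSUs, then household clusters); all counts and data structures here are invented and do not reproduce the actual CPS design.

```python
import random

# Simplified multistage cluster sampling in the spirit of the CPS description.
# All numbers and structures here are illustrative, not the actual CPS design.
random.seed(0)

# Hypothetical frame: PSUs -> blocks -> clusters of ~4 households.
psus = {f"PSU{i}": {f"block{j}": [[f"hh{i}-{j}-{k}-{h}" for h in range(4)]
                                  for k in range(10)]
                    for j in range(5)}
        for i in range(50)}

# Stage 1: a few "certainty" PSUs enter automatically; sample 10 more at random.
certainty = list(psus)[:5]
rest = random.sample([p for p in psus if p not in certainty], 10)
stage1 = certainty + rest

# Stage 2: within each chosen PSU, sample 2 blocks.
# Stage 3: sample 3 household clusters per block and interview EVERY
# household in each chosen cluster.
sample_households = []
for psu in stage1:
    for block in random.sample(list(psus[psu]), 2):
        for cluster in random.sample(psus[psu][block], 3):
            sample_households.extend(cluster)

print(f"Households in sample: {len(sample_households)}")
```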
How to evaluate a poll or survey
• Who carried out and funded the survey?
• What is the population?
• How was the sample selected?
– Random methods?
• How large was the sample?
– What’s the margin of error?
• What was the response rate?
• How were subjects contacted?
• When was the survey conducted?
• What are the exact questions asked?