Survey Design Slides - LISA

Designing Surveys and Interpreting the Results
A LISA Short Course
Eric Vance
July 16, 2012
Fralin Auditorium
Director of LISA
Assistant Research Professor
Department of Statistics
This workshop will consist of six sections:
1. Introductions
2. Survey Fundamentals
3. Questionnaire Design
4. Survey Implementation
5. Interpreting Results
6. Questions and Answers
[Figure: Scatter plot of Height and Weight; x-axis: Height in inches (55 to 75); y-axis: Weight in pounds (120 to 200)]
Section 1: Introductions
Data
Experiments
Knowledge
Understanding
Decisions
Laboratory for Interdisciplinary
Statistical Analysis
LISA helps VT researchers benefit
from the use of Statistics
Experimental Design • Data Analysis • Interpreting Results
Grant Proposals • Software (R, SAS, JMP, SPSS...)
LISA’s mission is to train statisticians to become
interdisciplinary collaborators and promote the
value of statistical thinking in all phases of
scientific research.
In 2011 there were 1046 total clients of LISA’s three services:
• 355 Collaborative Projects from 62 VT departments
• 304 visitors to Walk-in Consulting (M-F, 1-3PM in the GLC)
• 387 attendees at LISA Short Courses
www.lisa.stat.vt.edu
www.lisa.stat.vt.edu/?=short_courses
Slides, plots, and
R code are
available on this
website by clicking
on the course title
Statistics in All Phases of Research
Control sources of variation, detect outliers
Data
Experiments
Visualize the data; analyze with statistical models
Knowledge
Design experiments to answer research questions
Understanding
Determine practical and statistical significance of results
Make scientifically sound decisions and communicate them
Decisions
Section 2: Survey Fundamentals
Statistical analyses should start with
a research question
1. What are your questions?
What do you want to know?
2. Do you need data to answer these
questions? If so, what data do you
need?
3. How will you collect these data?
There are many ways to collect data
besides surveys
• Direct measurements or observations: instead of
asking someone how much water they use, install a
water meter.
• Use existing data sources: collect data from the
water utility
• Explore complex issues with focus groups: interview
a group of women about how they use water in their
daily lives.
Scientific surveys measure the
statistics behind the stories
• Personal interviews and focus groups are designed to
collect anecdotes and to understand “why” something is
happening
• Surveys can tell you “what” is happening to a large
number of people under different situations
• Surveys can also help explain “why” something is
happening
• Surveys can transform stories into numbers and allow
us to make statistical comparisons: before vs. after;
here vs. there; Group 1 vs. Group 2
Scientific surveys elicit unbiased answers from
samples representative of the overall population
• In Section 3 we will discuss ways to elicit accurate
responses
• The key concept of a survey is that one can draw
conclusions about the overall population based on the
results from a much smaller sample
• An exit poll of 1000 randomly sampled voters will be within
3% of the final vote percentage 95% of the time
• The more precise you need your final result to be, the more
people you must sample: an exit poll of 9600 voters will be
within 1% of the final vote percentage 95% of the time
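The two exit-poll figures above can be checked with the usual normal-approximation margin of error for a proportion. This is a minimal sketch, assuming a simple random sample and the worst case p = 0.5; the function name margin_of_error is only illustrative.

```python
import math
from statistics import NormalDist


def margin_of_error(n, p=0.5, confidence=0.95):
    """Half-width of a normal-approximation confidence interval for a proportion."""
    # Two-sided critical value, about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * math.sqrt(p * (1 - p) / n)


moe_1000 = margin_of_error(1000)
moe_9600 = margin_of_error(9600)
print(f"n=1000: +/- {moe_1000:.1%}")
print(f"n=9600: +/- {moe_9600:.1%}")
```

Because the margin shrinks with the square root of n, cutting it from about 3% to about 1% takes roughly nine to ten times as many respondents.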
5 steps involved in designing a survey
1. Clearly define your research objectives (What
do you want to know?)
2. Define the population to be sampled (Who do
you want to know about?)
3. Develop a sampling plan (Will my sample be
representative of the population I care about?)
4. Design a questionnaire to minimize errors and
biases (How does each question relate to your
research objectives?)
5. Pilot test and retest your survey
(Fix errors and start again at Step 1)
Step 1: Clearly define your research
objectives
• State CLEARLY and CONCISELY your
– Overall research goals
– Specific scientific questions
• Refer to these objectives constantly throughout the
design of your survey to ensure your survey is
answering the desired questions of interest
Step 2: Define the population to be
sampled
Who will you interview to answer your research questions?
The population is the overall group of interest or the target group.
• Subject: Any material we
measure
Plant, person, piano, etc.
• Population: representation of
all the possible outcomes or
measurements of interest
• Sample: subset of the
population to be measured
(i.e., a group of subjects that
represent the population)
Step 3: Develop a sampling plan
• Once the target population has been identified, next the
sampling plan must be devised
• Goal: Randomly select a small percent of the population that
will in turn represent the ideas of the population as a whole.
• The sampling plan involves:
A. The technique used to select the subjects for your study
• Simple random sampling (draw names out of a hat or from a list)
• Stratified random sampling (sample one person per team/table)
• Cluster sampling (randomly pick a team/table and sample
everyone from that table)
• Systematic sample (sample the first person from each team/table)
B. The number of people needed for your study
• Sample size calculations (sample enough for precise results for $)
Simple random sampling
• Subjects chosen by random mechanism
• Each subject has an equal chance of being in the study
• Easiest to summarize BUT most tedious to implement
in the field
Example:
Randomly select 10
students from the Stat 3005
class roster to ask a
question.
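A simple random sample like the one in this example could be drawn as in the sketch below. The roster is hypothetical (40 made-up student IDs); in practice it would be the actual class list.

```python
import random

# Hypothetical roster; real entries would come from the class list
roster = [f"student_{i:02d}" for i in range(1, 41)]

rng = random.Random(2012)        # fixed seed so the draw can be reproduced
srs = rng.sample(roster, k=10)   # sampling without replacement: no student is picked twice
print(srs)
```

Every student on the roster has the same chance of ending up in the sample.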
Stratified random sampling
• First divide population into strata (groups) based on
similarity
• Then randomly select subjects within each stratum
o Easier to implement
o May result in more precise summary
Example:
Randomly select 5 male
students and randomly select
5 female students from the
STAT 5615 class roster to ask
a question.
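One way to sketch the stratified draw in this example: split the roster into strata, then take a simple random sample inside each one. The roster and the helper name stratified_sample are assumptions made for illustration.

```python
import random


def stratified_sample(units, stratum_of, n_per_stratum, seed=None):
    """Draw n_per_stratum units at random, without replacement, from each stratum."""
    rng = random.Random(seed)
    strata = {}
    for unit in units:
        strata.setdefault(stratum_of(unit), []).append(unit)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}


# Hypothetical roster of (student id, gender) pairs: 15 male, 15 female
roster = [(f"student_{i:02d}", "male" if i % 2 else "female") for i in range(1, 31)]
picked = stratified_sample(roster, stratum_of=lambda s: s[1], n_per_stratum=5, seed=1)
for stratum, members in picked.items():
    print(stratum, [name for name, _ in members])
```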
Cluster sampling
• Population has many clusters
• First randomly select a number of clusters
• Then sample all the units within each cluster
• Requires clusters to be representative of the population
Example:
Population: opinions of all students
(attending class) at Virginia Tech
1. Randomly select a certain number of
classes
2. Ask all students in each class their
opinion
Note: Cluster sampling is often NOT as
efficient as stratified sampling for surveying.
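The two steps in the example above might look like the following sketch. The class names and sizes are made up; the point is that randomness enters only when choosing the clusters, and then every unit in a chosen cluster is surveyed.

```python
import random

# Hypothetical classes (clusters) and their students
sizes = {"A": 12, "B": 30, "C": 8, "D": 25, "E": 15, "F": 20}
classes = {
    f"class_{c}": [f"class_{c}_student_{i}" for i in range(1, size + 1)]
    for c, size in sizes.items()
}

rng = random.Random(7)
chosen_classes = rng.sample(sorted(classes), k=2)   # step 1: pick clusters at random
cluster_sample = [s for c in chosen_classes for s in classes[c]]  # step 2: take everyone
print(chosen_classes, len(cluster_sample))
```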
Systematic sampling
• Select every kth subject from a list of all possible subjects
Example:
Telemarketers sample every
10th phone number in the
Yellow Book to make
marketing calls.
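A systematic sample can be sketched as below. The random starting point is an assumption added here (a common convention so that every entry has a chance of selection); the phone list is hypothetical.

```python
import random


def systematic_sample(frame, k, seed=None):
    """Take every k-th entry of the list, beginning at a random position in the first k."""
    start = random.Random(seed).randrange(k)
    return frame[start::k]


# Hypothetical list of 100 phone numbers
phone_list = [f"555-{i:04d}" for i in range(100)]
calls = systematic_sample(phone_list, k=10, seed=3)
print(calls)
```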
Sample size calculations
• How many people do we interview?
Answer: It depends
• Sample size calculations can be computed using statistical
methods (Ask a LISA statistical collaborator for help!)
• Sample size calculations also involve characteristics of the
study:
Time, money, precision required
• For many Gallup polls, the population of interest is all adult
Americans. To represent this population, the sample usually
consists of around 1,000 adults.
When sample sizes approach 500 or more the gains in accuracy
get smaller and smaller for the increase in sample size.
Sample size calculation
for a proportion
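Let
n = sample size
σ = standard deviation
d = confidence interval size (full width)
α = significance level
Then, to get a (1-α)*100% confidence interval, we
need a sample size of:
$$ n = \left[ \frac{2\sigma}{d}\,\Phi^{-1}\!\left(1 - \frac{\alpha}{2}\right) \right]^{2} $$
where Φ⁻¹ is the inverse of the standard normal cumulative distribution function.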
Sample size calculation
for a proportion
For example, suppose we want an estimate for a 95%
confidence interval of width 0.2 (meaning we have a 0.1
margin of error). If we know from a pilot study that the
standard deviation of the population is 1, then,
σ=1
d = 0.2
α = 0.05
And plugging these numbers into the previous equation,
we get,
n = 384.15
Which means we need to sample 385 people.
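The worked example can be reproduced with a few lines, assuming the sample size formula is n = [(2σ/d) · Φ⁻¹(1 − α/2)]², with Φ⁻¹ the standard normal quantile function. The function name sample_size is illustrative only.

```python
import math
from statistics import NormalDist


def sample_size(sigma, d, alpha):
    """Sample size giving a (1 - alpha) confidence interval of full width d."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 when alpha = 0.05
    return (2 * sigma * z / d) ** 2


n_exact = sample_size(sigma=1, d=0.2, alpha=0.05)
n_needed = math.ceil(n_exact)   # always round up to a whole person
print(f"n = {n_exact:.2f}, so sample {n_needed} people")
```

Halving the interval width d quadruples the required sample size, since d enters the formula squared.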
Section 3: Questionnaire Design
Step 4: Minimize biases and errors
when designing the questionnaire
and sampling plan
Three major types of biases and errors:
Selection bias or coverage error:
Your sample is not representative of your population
Nonresponse bias:
Those who respond to your survey are different in
important ways from those who choose not to respond
Measurement error:
Survey responses are inaccurate
Selection bias or coverage error
• Definition:
– Not all members of a population have a known, nonzero chance
of being selected for survey
• Problem:
– Survey may turn out to be biased
• Possible Solutions:
– Identify target population (might require some expertise in the
subject of the survey)
– Construct a sampling frame - a list of all possible respondents
– Avoid duplicates, respondents outside the target population,
and excluding any portion of the target population
– Randomize
Nonresponse bias
• Definition:
– Survey error that happens when respondents are different from
nonrespondents in a significant way
• Problems:
– Filters out certain types of respondents
– The reason for which a person responds (or, conversely, does
not respond) to a survey is related to the subject of the survey
• Possible Solutions:
– Provide incentives for completing survey
– Explain why the survey is important
– Keep the survey short and sweet
– Give more weight to answers from hard-to-reach respondents
(and ask a statistician for advice)
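The weighting idea in the last bullet can be illustrated with a small sketch. Everything here is hypothetical: the population shares are assumed known, the responses are made up, and real nonresponse adjustments usually need more care (hence the advice to ask a statistician).

```python
from collections import Counter

# Assumed known population shares for each group
population_share = {"male": 0.5, "female": 0.5}

# Hypothetical (group, answer) pairs, 1 = yes and 0 = no; females are under-represented
responses = ([("male", 1)] * 5 + [("male", 0)] * 2 +
             [("female", 1)] * 1 + [("female", 0)] * 2)

n = len(responses)
counts = Counter(group for group, _ in responses)

# Weight = population share / sample share, so under-represented groups count more
weights = {g: population_share[g] / (counts[g] / n) for g in counts}

unweighted = sum(y for _, y in responses) / n
weighted = (sum(weights[g] * y for g, y in responses) /
            sum(weights[g] for g, _ in responses))
print(f"unweighted: {unweighted:.3f}, weighted: {weighted:.3f}")
```

In this made-up data the weighted estimate is lower than the unweighted one because the under-represented group answers "yes" less often.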
Nonresponse bias
In a national sample of board-certified physicians, a short survey
was mailed asking physicians to nominate the five best hospitals in
their specialty regardless of cost or location. Up to three follow-ups
were mailed to nonresponders to gain participation. The final
response rate was 47.3%.
Males were significantly more likely to respond than females, which
would not be an issue if men and women answered in the same
way…
But, men were significantly more likely to nominate one or two top
hospitals in their specialty. In addition, women were significantly
more likely to nominate hospitals only in their region.
Measurement error
• Definition:
– Inaccurate answers to survey questions (sometimes
due to lack of clarity in writing)
• Problems:
– Makes it difficult to judge if answers are accurate
– May lead to incorrect conclusions about target
population
• Possible Solutions:
– Write clear, concise questions
– Be aware of leading questions
– Be aware of social factors that may influence
responses
– Explain why the survey is important
Measurement error
In a study about measurement error in earnings
data, respondents were asked to report their
annual wages. The reported wages were then
compared to earnings statements on detailed W-2 records.
Not surprisingly, the study found that
respondents tended to over-report their wages
when compared to their W-2 records. Also, the
discrepancy between reported and official wages
decreased as official wage increased.
The best way to write good survey
questions is to pilot test and re-write
• An article appearing in the International Journal of Market
Research gives great advice about questionnaire design. This
YouTube video summarizes 10 things to look out for.
http://youtu.be/53mASVzGRF4
Keep the questionnaire
as short as possible
• Creative Research Systems has the following useful
suggestions: (http://www.surveysystem.com/sdesign.htm)
• Follow the “KISS” method meaning “Keep it short and simple!”
• Categorize questions into 3 groups:
– Must Know
– Useful to Know
– Nice to Know
• If the questionnaire seems too long, start omitting the “nice to
know” questions
• Don’t get caught in the trap where you find that you have a captive
audience, so you begin asking questions that are not pertinent
Think about the order of questions
• Group related questions together
• Choose first question carefully. The first question should:
– Apply to everyone
– Be easy to read
– Be interesting
• Place sensitive questions near the end
– Give respondents a chance to become comfortable with
questionnaire
• Ask about sequential events in the order that they occurred
• Avoid unintended question order effects
Avoid potential question order effects
• Priming
– Early questions refresh respondents’ memory for subsequent
questions
• Carryover
– Respondents believe questions are similar and answer them with
same criteria
• Consistency
– Respondents answer questions similarly to try to appear consistent
• Norm of evenhandedness
– Respondents answer questions similarly to try to be fair
• Anchoring
– Early questions set a standard for comparison to later questions
• Subtraction
– Considerations in answers to early questions are left out of
subsequent judgments
• Avoiding extremeness
– Respondents try to seem neutral by choosing some items while
rejecting others
Save demographic questions for
the end of the survey
• The following demographic questions should be saved for
the end of the questionnaire:
Age, education, income, marital status, etc.
• Ensures that respondents will not feel that they are losing
their anonymity when answering the rest of the questions
• Choose the most important questions for your survey to be
asked at the beginning of the survey
Open versus closed questions
• Open questions allow the respondent to freely answer the
question.
Imposes fewer restrictions and allows for more depth in the
overall answer
• Closed questions force the respondent to answer the question
by choosing from predetermined choices.
Advantage: Ease in analysis
• One suggestion is to test the survey on a small group with an
open question. From those responses form a closed question
that encompasses the categories expressed in the responses
to the open questions.
• Allow for an “other” option in closed questions, to permit
respondents to write their own responses
Avoid double-barreled questions
• Refrain from having two concepts embedded in one question
Example:
“Do you have time to read the newspaper every day?”
• Notice you are asking about “time” and “reading the newspaper every
day”.
Revision:
“Do you read the newspaper every day?”
• If the answer is no, you can create a question to determine the reasons
the person does not read the newspaper.
Convert opinions and words into
numbers using the Likert scale
• A popular technique in survey design
is the use of scaling questions.
– Respondents are able to select a
number or category that represents their
answer to the survey question.
• Likert scaling is a common technique
used in questionnaires.
– A Likert item is a question or statement on
a questionnaire where the respondent
gives a rating for their response on a
topic.
– The rating is usually the level of
agreement the respondent has
concerning the statement or question.
– A Likert item is balanced, meaning there
is an equal number of positive and
negative positions.
http://en.wikipedia.org/wiki/File:Example_Likert_Scale.jpg
Convert opinions and words into
numbers using the Likert scale
• The 5-point and 7-point scale responses are the most common
• Make sure the visual middle option actually corresponds to the
middle value:
Example (visual middle: Slightly Agree):
Disagree | Neither agree nor disagree | Slightly Agree | Agree | Strongly Agree
Revision (visual middle: Neither agree nor disagree):
Disagree | Slightly Disagree | Neither agree nor disagree | Slightly Agree | Agree
• Likert items can be analyzed separately or the items may be
summed and the sum can be analyzed. The sum of Likert items
is called the Likert Scale.
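Converting Likert responses to numbers and summing them might look like this sketch. The labels, question names, and answers are hypothetical, and the code assumes every item is worded in the same direction (no reverse-scored items).

```python
# Hypothetical balanced 5-point scale, scored 1 to 5
LABELS = [
    "Strongly disagree",
    "Disagree",
    "Neither agree nor disagree",
    "Agree",
    "Strongly agree",
]
SCORE = {label: value for value, label in enumerate(LABELS, start=1)}

# One respondent's answers to three hypothetical Likert items
respondent = {
    "Q1": "Agree",
    "Q2": "Strongly agree",
    "Q3": "Neither agree nor disagree",
}

item_scores = {question: SCORE[answer] for question, answer in respondent.items()}
likert_scale = sum(item_scores.values())   # the sum of the items is the Likert Scale
print(item_scores, likert_scale)
```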
Step 5: Pilot test (and retest)
your survey
• You should pretest the survey on a smaller sample
whenever possible!
• This pilot test can
Allow you to revise the questionnaire if needed
Allow you to create a closed question from the responses
for an open question
Help you estimate the variability in the responses to your
questions and determine the necessary sample size
Section 4: Survey Implementation
Decide how to collect the survey data
• After we know what data we want to collect, who we want to
survey, and how we ask the questions, we must determine
the best instrument for collecting the survey data
• Data collection options:
Personal Interviews—either paper or PDA/Smartphone
Telephone Interviews
Mail Surveys
Email Surveys
• For more discussion of data collection options see
http://www.surveysystem.com/sdesign.htm.
Personal interviews
• A face-to-face encounter between the interviewer and
the subject
• Advantages:
– People usually respond when confronted face-to-face
– Can get a better sense of the reaction of the subject
– Prevent misunderstandings
• Disadvantages:
– More costly
– Interviewers who are not trained properly may
introduce bias into the sample
Telephone interviews
• Most popular instrument for survey in the United States since
96% of homes have telephones
• Personal interviews and telephone interviews are usually the
most successful forms of surveying with response rates around
60 to 75%
• Advantages:
– Less expensive than personal interviewing
– Random phone numbers can be dialed
– Fast results
• Disadvantages:
– People are reluctant to answer phone interviews
– Phone calls can usually only be made from around 6pm to 9pm
– Phone surveys normally need to be shorter in length than
personal interviews
Mail surveys
• Advantages:
– Cheap
– Questionnaire can include pictures
– People are able to answer on their own time
• Disadvantages:
– Time-consuming process
– Response rates have a tendency to be low
Email surveys
• Advantages
– Cheap
– Fast
– You can attach pictures or sound files
• Disadvantages
– People may respond multiple times
– People who have email may not be representative of the
population as a whole
In 2011 in Mozambique we interviewed
1600 households using PDAs
• An on-the-ground statistician ensured the data were of
high quality by downloading the data every night, checking
them for errors, giving feedback to the surveyors, and
correcting any errors found
Section 5: Interpreting the Results
[Figure: Scatter plot of Height and Weight; x-axis: Height in inches (55 to 75); y-axis: Weight in pounds (120 to 200)]
Data must be processed, analyzed,
and reported
• Download or input your data onto a computer
• Clean your data—start with the most important variables
1. Ensure all data are in the correct format
2. Decide what to do with missing data
3. Detect outliers and coding errors by visual or
graphical inspection
• Process textual data by reading, classifying, and counting
Example of processing data in Excel
• Pre-workshop survey data downloaded to Excel
• Clean your data
1. Ensure all data are in the correct format
Fix numerical data so they are all numerical
2. Decide what to do with missing data
Ignore some missing values
Decide which missing values should be 0
Decide what to do with “I Don’t Know” responses
3. Detect outliers and coding errors by visual or
graphical inspection
• Process text by classifying it and creating a Pareto chart
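The same three cleaning steps can be sketched outside of Excel. The raw entries, the list of missing-value codes, and the plausible height range (48 to 84 inches) are all assumptions chosen for illustration.

```python
# Entries treated as missing (assumed codes; adjust to the actual survey)
MISSING = {"", "na", "n/a", "i don't know"}


def clean_number(text):
    """Return a float, or None when the entry is missing or not numeric."""
    value = text.strip().lower()
    if value in MISSING:
        return None
    try:
        return float(value)
    except ValueError:
        return None   # e.g. a number spelled out in words; review these by hand


# Hypothetical raw height answers as typed by respondents
raw_heights = ["64", " 70 ", "I don't know", "", "sixty-five", "660"]
heights = [clean_number(h) for h in raw_heights]

# Flag values outside an assumed plausible range as possible coding errors
suspect = [h for h in heights if h is not None and not 48 <= h <= 84]
print(heights, suspect)
```

Flagged values are listed for review here, not deleted; whether 660 was meant to be 66 is a judgment call for the analyst.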
The best statistical analysis is often
just a plot or graph of your data
• Summarize your data one variable (height) at a time
Histograms show the distribution of the data points
[Figure: Histogram of Height; x-axis: Height in inches (60 to 110); y-axis: Number of respondents (0 to 4)]
The best statistical analysis is often
just a plot or graph of your data
• Summarize your data one variable (height) at a time
Histograms show the distribution of the data points
[Figure: Histogram of Height; x-axis: Height in inches (55 to 75); y-axis: Number of respondents (0 to 4)]
The best statistical analysis is often
just a plot or graph of your data
• Summarize your data one variable (height) at a time
Box plots summarize the distribution of the data
[Figure: Boxplot of Height; axis ticks from 55 to 75]
The best statistical analysis is often
just a plot or graph of your data
• Summarize your data one variable (height) at a time
Box plots summarize the distribution of the data
[Figure: Boxplot of Height, annotated with the median, the 25th percentile, the 75th percentile, and a possible outlier; axis ticks from 55 to 75]
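The quantities a box plot displays can be computed directly. This sketch uses made-up heights and flags possible outliers with the common 1.5 × IQR convention, which is an assumption here and not something the slides specify.

```python
from statistics import quantiles

# Hypothetical heights in inches
heights = [54, 62, 63, 64, 64, 65, 65, 66, 67, 68, 69, 70, 71, 72, 74]

# 25th percentile, median, and 75th percentile
q1, median, q3 = quantiles(heights, n=4)
iqr = q3 - q1

# Assumed convention: points beyond 1.5 * IQR from the box are possible outliers
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [h for h in heights if h < low_fence or h > high_fence]

print(f"Q1={q1}, median={median}, Q3={q3}, possible outliers={outliers}")
```

A flagged point is only a candidate: it may be a data-entry error or a genuinely unusual respondent, so check it before removing it.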
Summarize two variables based on
quant/quant, quant/qual, qual/qual
• The relationship between two quantitative variables can be
visualized in a scatter plot and quantified by correlation or
regression
• The relationship between a quantitative and qualitative
variable can be shown in a side-by-side box plot and
summarized with a t-test
• The relationship between two qualitative variables can be
shown in a table or a mosaic plot and summarized by
Fisher’s Exact Test or a Chi-squared test
Plot two quantitative variables
on a scatter plot
• The relationship between two quantitative variables can be
visualized in a scatter plot and quantified by correlation or
regression
[Figure: Scatter plot of Height and Weight; x-axis: Height in inches (55 to 75); y-axis: Weight in pounds (120 to 200)]
Plot two quantitative variables
on a scatter plot
• The relationship between two quantitative variables can be
visualized in a scatter plot and quantified by correlation or
regression
[Figure: Scatter plot of Height and Weight, with points marked as Males or Females; x-axis: Height in inches (55 to 75); y-axis: Weight in pounds (120 to 200)]
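Quantifying the relationship in a scatter plot by correlation and simple linear regression could be sketched as follows. The height and weight values are hypothetical, not the workshop data, and the function names are illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def least_squares(xs, ys):
    """Slope and intercept of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys)) /
             sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


# Hypothetical data: height in inches, weight in pounds
heights = [60, 62, 65, 68, 70, 72]
weights = [115, 120, 140, 155, 165, 180]

r = pearson_r(heights, weights)
slope, intercept = least_squares(heights, weights)
print(f"r = {r:.2f}; each extra inch goes with about {slope:.1f} more pounds")
```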
Use side-by-side box plots to show
the difference between two groups
• The relationship between a quantitative and qualitative
variable can be shown in a side-by-side box plot and
summarized with a t-test
[Figure: Side-by-side boxplots of Height by Gender (Females, Males); x-axis: Height in inches (55 to 75)]
Summarize qualitative responses by
classifying and using a Pareto chart
• Process textual data by reading, classifying, and counting
• A Pareto chart orders categories from highest to lowest
frequency
[Figure: Pareto chart "8 Wants for the Workshop"; y-axis: Number of Responses (0 to 16); category labels as extracted: Sampling, References, Interpret, Online, Ethics, Assess, Analysis, Design]
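Once each free-text response has been read and assigned a category, the counting and ordering behind a Pareto chart is short. The coded responses below are made up for illustration and do not reproduce the workshop's counts.

```python
from collections import Counter

# Hypothetical categories assigned by a reader to each free-text response
coded = ["Design", "Design", "Analysis", "Design", "Sampling", "Analysis", "Ethics"]

# most_common() returns categories ordered from highest to lowest frequency
pareto = Counter(coded).most_common()

total = len(coded)
running = 0
for category, count in pareto:
    running += count
    # Text version of the chart: one bar per category plus the cumulative share
    print(f"{category:<10} {'#' * count:<5} {count:>2}  cumulative {running / total:.0%}")
```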
Guideline for reporting
statistical results
• Focus on the statistical estimates of quantities you care
about and how they compare to other quantities rather than
describing the results of a statistical test
Example:
The average height of females (n=15) in our sample was 64.9 in.
This was 4.5 inches shorter than the average male (69.4 in, n=5).
A 95% Confidence Interval for this difference is (-9.4 to 0.7).
A t-test showed that this difference was not statistically
significant (p=0.08).
Section 6: Questions and Answers
• What is the question you most want answered today?
• Can you convince your team that your question should be
the one Dr. Vance will answer?
References
• Dillman, Don A., Jolene D. Smyth, and Leah Melani Christian.
Internet, Mail, and Mixed-Mode Surveys: The Tailored Design
Method. 3rd ed. Hoboken, NJ: John Wiley & Sons, Inc, 2009.
• Lietz, P. (2010) Research into Questionnaire Design.
International Journal of Market Research, 52, 2, pp. 249-272.
• Scheaffer, Richard L., William Mendenhall III, and R. Lyman Ott.
Elementary Survey Sampling. 6th ed. Belmont, CA: Duxbury,
2006.
• http://en.wikipedia.org/wiki/Likert_scale
• http://www.surveysystem.com/sdesign.htm
• http://www.csudh.edu/dearhabermas/sampling01.htm
• http://www.youtube.com/watch?v=53mASVzGRF4
• Eric Vance [email protected]
• LISA, http://www.lisa.stat.vt.edu
Mozambique Survey
• Asked people living in villages to rate how “painful” a task
it was to fetch water on a 6-point Likert scale (ranging
from 1- not painful at all, to 6- extremely painful)
– Question was given to households in villages both with
and without a water pump
– Some households, especially those without water
pumps, must travel hours per day to fetch water
• How can we best depict the resulting data?
– Histograms
– Box-and-whisker plots
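[Figure: Histogram "Water Fetch Pain for Pumps"; x-axis: Level of Pain (1 to 6); y-axis: Frequency (0 to 60)]
[Figure: Histogram "Water Fetch Pain for No Pumps"; x-axis: Level of Pain (1 to 6); y-axis: Frequency (0 to 200)]
[Figure: "Water Fetch Pain" plot comparing the No Pump and Pump groups; x-axis: Level of Pain (1 to 6)]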