
Creating a Successful Survey
Anne Ryan
Faculty Collaborator for LISA
Visiting Assistant Professor
Department of Statistics, VT
Marcos Carzolio
Associate Collaborator for LISA
Graduate Student
Department of Statistics, VT
Laboratory for Interdisciplinary Statistical Analysis
Laboratory for Interdisciplinary Statistical
Analysis
LISA helps VT researchers benefit
from the use of Statistics
Experimental Design • Data Analysis • Interpreting Results
Grant Proposals • Software (R, SAS, JMP, SPSS...)
Our goal is to improve the quality of
research and the use of statistics at Virginia
Tech.
www.lisa.stat.vt.edu
How can LISA help?
• Formulate research question.
• Screen data for integrity and unusual observations.
• Implement graphical techniques to showcase the
data – what is the story?
• Develop and implement an analysis plan to address
research question.
• Help interpret results.
• Communicate! Help with writing the report or giving
the talk.
• Identify future research directions.
Designing Experiments • Analyzing Data • Interpreting Results
Grant Proposals • Using Software (R, SAS, JMP, Minitab...)
Collaboration
From our website request a meeting for personalized
statistical advice
Great advice right now:
Meet with LISA before collecting your data
Walk-In Consulting
Monday-Friday 1-3 pm GLC Video Conf. Room
Mondays and Fridays 3-5 pm in 312 Sandy Hall
Tuesdays and Wednesdays 11-1 pm in Port
Thursdays 9:30-11:30 am ICTAS Café X
for questions requiring <30 mins
Short Courses
Designed to help graduate students
apply statistics in their research
All services are FREE for VT researchers.
www.lisa.stat.vt.edu
3 Stages of Statistical Thinking
1. Design – How do we obtain the data?
2. Description – How do we summarize the data?
– Statistical Summaries
– Graphical Summaries
3. Inference – How do we make
decisions/predictions based on data?
Outline: Elements of Survey Design
• Clearly Define Research Objectives
• Define Population to Be Sampled
• Develop Sampling Plan
• Data Collection Options
• Errors with Surveys
• Questionnaire Design
• Pretest
• Histograms, Boxplots, and Scatterplots
• Factor Analysis
Clearly Define Research Objectives
• State CLEARLY and CONCISELY your
– Overall Research Goals
– Specific Scientific Questions
• Refer to these objectives constantly throughout the design of
your survey to ensure your survey is answering the desired
questions of interest.
Define Population to Be Sampled
Who will you interview to answer your research questions?
The overall group of interest or the target group is the population.
• Subject: Any material we
measure.
– Plant, Person, Piano etc.
• Population: representation of
all the possible outcomes or
measurements of interest.
• Sample: Subset of the
population to be measured (i.e.
group of subjects that
represent the population).
Sampling Plan
• Once the target population has been identified, next the
sampling plan must be devised.
• Goal: Randomly select a small percent of the population that
will in turn represent the ideas of the population as a whole.
• The sampling plan involves:
– The technique used to select the subjects for your study.
• Simple Random Sampling
• Stratified Random Sampling
• Cluster Sampling
• Systematic Sample
– The number of people needed for your study.
• Sample size calculations.
Simple Random Sampling
• Subjects chosen by random mechanism.
• Each subject has an equal chance of being part of the study.
• Easiest to summarize BUT most tedious to implement in the
field.
Example: Randomly select 10
students from the Stat 3005 class
roster to ask a question.
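The roster example above can be sketched in a few lines of Python. This is a minimal illustration only: the 40-name roster and the function name simple_random_sample are made up for the example, and the seed is fixed just so the draw can be repeated.

```python
import random

def simple_random_sample(roster, n, seed=None):
    """Draw n distinct subjects; every subject has the same chance of selection."""
    rng = random.Random(seed)       # seed only makes the draw repeatable
    return rng.sample(roster, n)    # sampling without replacement

# Hypothetical class roster (not real data)
roster = [f"student_{i:02d}" for i in range(1, 41)]
chosen = simple_random_sample(roster, 10, seed=1)
print(chosen)
```

Sampling without replacement matters here: no student can be picked twice.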
Stratified Random Sampling
• First divide population into strata (groups) based on similarity.
• Then randomly select subjects within each stratum.
– Easier to implement.
– May result in more precise summary.
Example: Randomly select 5
male students and randomly
select 5 female students from
the STAT 5615 class roster to ask
a question.
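A sketch of the same idea in Python, assuming a hypothetical roster of (name, sex) pairs. The helper name stratified_sample and the roster contents are illustrative assumptions, not part of the original example.

```python
import random

def stratified_sample(roster, stratum_of, n_per_stratum, seed=None):
    """Group subjects into strata, then take a simple random sample in each."""
    rng = random.Random(seed)
    strata = {}
    for subject in roster:
        strata.setdefault(stratum_of(subject), []).append(subject)
    return {label: rng.sample(members, n_per_stratum)
            for label, members in strata.items()}

# Hypothetical roster: 30 students, every third one labeled "male"
roster = [(f"student_{i:02d}", "female" if i % 3 else "male")
          for i in range(1, 31)]
by_sex = stratified_sample(roster, stratum_of=lambda s: s[1],
                           n_per_stratum=5, seed=1)
print(by_sex)
```

Each stratum is guaranteed exactly 5 students, which a simple random sample of 10 would not ensure.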
Cluster Sampling
• Population has many clusters.
• First randomly select a number of clusters.
• Then sample all the units within each cluster.
• Requires clusters to be representative of the population.
Example
Population: opinions of all students (attending
class) at VT
1) Randomly select a certain number of
classes
2) ask all students in each class their opinion
Note: Cluster sampling is often NOT as efficient
as stratified sampling for surveying.
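The two steps of the class example can be sketched as follows. The class names, class sizes, and the function name cluster_sample are hypothetical and only serve the illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters, then keep every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # step 1: pick clusters
    return {name: list(clusters[name]) for name in chosen}  # step 2: take all units

# Hypothetical classes, each a list of student IDs
classes = {f"class_{c}": [f"{c}{i}" for i in range(1, 6)] for c in "ABCDE"}
surveyed = cluster_sample(classes, n_clusters=2, seed=0)
print(surveyed)
```

The randomness is only in which classes are picked; inside a selected class nobody is left out.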
Systematic Sampling
Example: Telemarketers pick a
random starting point and then call
every 10th phone number in the
Yellow Book to make
marketing calls.
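A minimal sketch of systematic sampling in Python, assuming the usual convention of a random starting position followed by every k-th entry. The list of 100 numbered entries stands in for the phone listing and is purely hypothetical.

```python
import random

def systematic_sample(frame, k, seed=None):
    """Pick a random start among the first k entries, then every k-th entry."""
    rng = random.Random(seed)
    start = rng.randrange(k)    # random position 0..k-1
    return frame[start::k]

# Hypothetical listing of 100 numbered entries
listing = list(range(1, 101))
every_tenth = systematic_sample(listing, k=10, seed=3)
print(every_tenth)
```

Only the starting point is random; after that the selection is fully determined by k.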
• Determine the sampling technique for the
following situations:
– You are studying sleeping patterns among freshmen,
sophomores, juniors, and seniors at Virginia Tech. You
group the students based on grade level and then take a
simple random sample of 10 students from each grade
level.
• Stratified Sampling
– You are studying sleeping patterns at Virginia Tech. From
the registrar you obtain a master list of students at
Virginia Tech. You then randomly select 5,000 students to
survey about their sleeping habits.
• Simple Random Sample
– A light bulb manufacturer produces approximately
100,000 light bulbs per day. The quality control
department must monitor the defect rate of the bulbs.
Testing each bulb would be costly and inefficient, so
the department decides to test every 100th bulb produced.
• Systematic Sampling
– You are studying the sleeping patterns of college
students. From a list of all the colleges and universities
across the country, you perform a simple random sample
to select 10 colleges/universities. Then you measure
every student attending the 10 colleges/universities.
• Cluster Sampling
Sample Size Calculation
• How many people do we interview?
– Answer: It depends.
• Sample size calculations can be computed using statistical methods.
(Come to LISA; we can help!)
• Sample size calculations also involve characteristics of the study:
– Time, money, precision.
• For many Gallup polls, the population of interest is all adult
Americans. To represent this population, the sample usually consists
of around 1,000 adults.
– Once the sample size reaches around 500 or more, each further
increase in sample size yields a smaller and smaller gain in accuracy.
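The diminishing gains can be seen by computing an approximate margin of error for several sample sizes. This sketch assumes a simple random sample from a large population, an estimated proportion near 0.5 (the worst case), and a 95% confidence level (z = 1.96); the function name margin_of_error is illustrative.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from a sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (100, 500, 1000, 2000):
    print(n, round(margin_of_error(n), 3))
```

Under these assumptions a sample of 1,000 gives a margin of error of roughly 3 percentage points, and doubling the sample to 2,000 improves it by less than 1 point.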
Data Collection Options
• Once we know the subjects we want to survey, we must
determine the best instrument for collecting data.
• Data Collection Options:
– Personal Interviews
– Telephone Interviews
– Mail Surveys
– Email Surveys
• For more discussion of data collection options see
http://www.surveysystem.com/sdesign.htm.
Personal Interviews
• A face-to-face encounter between the interviewer and the
subject.
• Advantages:
– People usually respond when confronted face-to-face
– Can get a better sense of the reaction of the subject
– Prevent misunderstandings
• Disadvantages:
– More Costly
– Interviewers who are not trained properly may introduce
bias into the sample.
Telephone Interviews
• Most popular instrument for survey in the United States since 96% of
homes have telephones.
• Personal Interviews and telephone interviews are usually the most
successful forms of surveying with response rates around 60 to 75%.
• Advantages:
– Less expensive than personal interviewing
– Random phone numbers can be dialed
– Fast results
• Disadvantages:
– People are reluctant to answer phone interviews
– Phone calls can usually only be made from around 6pm-9pm
– Phone surveys normally need to be shorter in length than
personal interviews
Mail Surveys
• Advantages:
– Cheap
– Questionnaire can include pictures
– People are able to answer on their own time
• Disadvantages:
– Time-consuming process
– Response rates have a tendency to be low
Email Surveys
• Advantages
– Cheap
– Fast
– You can attach pictures or sound files
• Disadvantages
– People may respond multiple times
– People who have email may not be representative of the
population as a whole
Nonresponse Bias
In a national sample of board-certified physicians, a short survey was
mailed asking physicians to nominate the five best hospitals in their
specialty regardless of cost or location. Up to three follow-ups were
mailed to nonresponders to gain participation. The final response rate was
47.3%.
Males were significantly more likely to respond than females, which would
not be an issue if men and women answered in the same way…
But, men were significantly more likely to nominate one or two top
hospitals in their specialty. In addition, women were significantly more
likely to nominate hospitals only in their region.
Nonresponse Bias
• Definition:
– Survey error that happens when respondents are different from
nonrespondents in a significant way
• Problems:
– Filters out certain types of respondents
– The reason for which a person responds (or, conversely, does not respond) to
a survey is related to the subject of the survey
• Possible Solutions:
– Provide incentives for completing survey
– Explain why survey is important
– Keep survey short and sweet
– Give more weight to answers from hard-to-reach respondents (Come to LISA)
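One simple way to give more weight to under-represented respondents is to average within each group and then combine the group averages using known population shares (often called post-stratification). This is a sketch of that one technique, not necessarily the method LISA would recommend; the groups, answers, and shares below are invented for illustration.

```python
def poststratified_mean(responses, population_shares):
    """Weight each group's average answer by its share of the population."""
    total = 0.0
    for group, share in population_shares.items():
        answers = responses[group]
        total += share * (sum(answers) / len(answers))
    return total

# Hypothetical yes(1)/no(0) answers; women responded less often than men
responses = {"men": [1, 1, 0, 1], "women": [0, 1]}
shares = {"men": 0.5, "women": 0.5}   # assumed known population shares

all_answers = responses["men"] + responses["women"]
unweighted = sum(all_answers) / len(all_answers)
weighted = poststratified_mean(responses, shares)
print(unweighted, weighted)
```

Here the unweighted mean is pulled toward the men's answers simply because more men responded; the weighted mean corrects for that imbalance.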
Measurement Error
In a study about measurement error in earnings
data, respondents were asked to report their annual
wages. The reported wages were then compared to
earnings statements on detailed W-2 records.
Not surprisingly, the study found that respondents
tended to over-report their wages when compared
to their W-2 records. Also, the discrepancy between
reported and official wages decreased as official
wage increased.
Measurement Error
• Definition:
– Inaccurate answers to survey questions (sometimes due to
lack of clarity in writing)
• Problems:
– Makes it difficult to judge if answers are accurate
– May lead to incorrect conclusions about target population
• Possible Solutions:
– Write clear, concise questions
– Be aware of leading questions
– Be aware of social factors that may influence responses
– Explain why survey is important
Coverage Error
• Definition:
– Not all members of a population have a known, nonzero chance of
being selected for survey
• Problem:
– Survey may turn out to be biased
• Possible Solutions:
– Identify target population (might require some expertise in the subject
of the survey)
– Construct a sampling frame - a list of all possible respondents
– Avoid: duplicates; respondents that are outside of target population;
and excluding a portion of target population
– Randomize
Sampling Error
• Definition:
– Inherent inaccuracy due to one’s inability to sample entire
population
• Problem:
– Variability among individual respondents makes it difficult
to learn about group as a whole
• Possible Solutions:
– Find right sample size (Come to LISA)
– Know difference between sample and population
Questionnaire Design
• Our goal in this section is to comment on some of the important
aspects of questionnaire design.
• An article appearing in the International Journal of Market Research
gives great advice about questionnaire design. This YouTube video
summarizes the findings in the article:
http://www.youtube.com/watch?v=53mASVzGRF4.
• We will discuss the following topics associated with questionnaire
design. This list of topics is not comprehensive, so we suggest that
you explore the topic of questionnaire design further.
– Length
– Question Ordering
– Don’t Know Option
– Open versus Closed Questions
– Wording
– Scaling Questions
Length of Questionnaire
• Keep the questionnaire as short as possible.
• Creative Research Systems has the following useful suggestions
(http://www.surveysystem.com/sdesign.htm):
• Follow the “KISS” method, meaning “Keep it short and simple!”
• Categorize questions into 3 groups:
– Must Know
– Useful to Know
– Nice to Know
• If the questionnaire seems too long, start omitting the “nice to know”
questions.
• Don’t get caught in the trap where you find that you have a captive
audience, so you begin asking questions that are not pertinent.
Question Order Effects
• Priming
– Early questions refresh respondents’ memory for subsequent questions
• Carryover
– Respondents believe questions are similar and answer them with same
criteria
• Consistency
– Respondents answer questions similarly to try to appear consistent
• Norm of Evenhandedness
– Respondents answer questions similarly to try to be fair
• Anchoring
– Early questions set a standard for comparison to later questions
• Subtraction
– Considerations in answers to early questions are left out of subsequent
judgments
• Avoiding Extremeness
– Respondents try to seem neutral by choosing some items while rejecting
others
Priming
An NIH Survey on Disability asked respondents to
list causes of their disabilities. Nearly 49% of
respondents who were previously asked about
sensory impairments reported those as the causes
for their disability, while only 41% of those who
had not previously been asked about sensory
impairments reported the same causes.
Carryover
• General questions should precede specific questions.
– A study was conducted in 1979 to determine a person’s
overall happiness and a person’s happiness in their marriage.
– Possible ordering for questions:
• General happiness question first followed by specific question
concerning happiness in marriage.
• Specific question concerning happiness in marriage first followed by
general happiness question.
– Results: Over 60% of respondents indicated that they were
very happy in their marriage.
• General happiness question followed by specific marriage happiness
question: 52% responded they were very happy in general.
• Specific marriage happiness question followed by general happiness
question: 38% responded they were very happy in general.
– Overall respondents were happier with their marriage than
life in general.
– The marriage question first caused people to rank their level
of overall happiness lower.
Consistency
Three questionnaires about criminals were
administered to students, where one was strongly
worded against criminals, another was biased
toward leniency for criminals, and the third was
constructed to be neutral.
Afterwards, the students were asked to complete
scales measuring their opinions about criminals.
Student responses tended to reflect a similar level
of leniency to the questionnaire they answered
beforehand.
Norm of Evenhandedness
Students at Washington State University were
asked about the consequences of plagiarism. Two
questions in particular were given: “Should a
student who plagiarizes be expelled?” and “Should
a professor who plagiarizes be fired?”
When the professor question was asked first, 34%
of respondents indicated on the student question
that students should be expelled. But when the
professor question was asked second, only 21%
indicated that students should be expelled.
Anchoring
In 1997, a Gallup poll asked respondents “Do you
generally think Bill Clinton is honest and
trustworthy?” and “Do you generally think Al Gore
is honest and trustworthy?” in different orders.
When the Bill Clinton question was asked first, 50%
stated that he was honest, then 60% answered that
Gore was honest. But when the Gore question was
asked first, 68% answered that he was honest, then
57% responded that Clinton was honest.
Subtraction
In 1994, a survey asked respondents how they would
describe the economic situation of their communities over
the next 5 years and how they felt about the economic
situation in their state over the next 5 years.
The survey found that 7-10% more people responded that
the state economy would get better when the state
economy question was asked before the community
economy question.
The conclusion of the study was that people tend to remove
considerations from subsequent questions after they have
been used in previous questions.
Avoiding Extremeness
Students were presented a survey about the controversial
topics of euthanasia and reduced training for doctors. Then
half of them were told they would interact with another
student about the topics face-to-face, while the other half
were told they would listen to a recording of another
student talking about the subject. Before they would
proceed, however, they were given more questions relating
to the topics.
Students who were told they would interact face-to-face
with other students answered more moderately than the
students who were told they would only listen to a
recording. In general, people tend to be more moderate in
social settings.
Question Order
• Group related questions together
• Choose first question carefully. The first question should:
– Apply to everyone
– Be easy to read
– Be interesting
• Place sensitive questions near the end
– Give respondents a chance to become comfortable with questionnaire
• Ask about sequential events in the order that they occurred
• Avoid unintended question order effects
Question Order
• The following demographic questions should be saved for the end
of the questionnaire.
– Age, education, income, marital status, etc.
• Ensures that respondents will not feel that they are losing their
anonymity when answering the rest of the questions.
• Choose the most important questions for your survey to be asked
at the beginning of the survey.
Don’t Know Option
• Add a “don’t know” or “not applicable” option to all
questions unless you are positive that every respondent
will have an answer or will feel comfortable answering the
question.
• You do not want people to feel as though they are being
forced to give an answer.
• An alternative to the “don’t know” option
– Create screening questions before the actual question to determine if
the respondent has the knowledge to answer the question.
– If it is determined the respondent has the background knowledge the
question is given without a “don’t know” option.
– If the respondent does not have the background knowledge, then the
respondent may skip that question completely.
Open versus Closed Questions
• Open questions allow the respondent to freely answer the
question.
– Fewer restrictions, which allows for more depth in the overall
answer.
• Closed questions force the respondent to answer the question
by choosing from predetermined choices.
– Advantage: Ease in analysis.
• One suggestion is to test the survey on a small group with an
open question. From those responses form a closed question
that encompasses the categories expressed in the responses to
the open questions.
• Allow for an “other” option in closed questions, to permit
respondents to write their own responses.
Wording
• Refrain from having two concepts embedded in one question.
– Example: “Do you have time to read the newspaper every day?”
• Notice you are asking about “time” and “reading the newspaper every day”.
– A better revision, “Do you read the newspaper every day?”
• If the answer is no, you can create a question to determine the reasons the person
does not read the newspaper.
• Refrain from negatively worded questions.
– Example: Students should not be required to attend weekly
colloquium.
a) Agree
b) Disagree
– Example: Question: What is your view on the concept that
students should not be unhelpful with recruiting new
graduate students to the statistics department?
Revision: What is your view on the concept that students
should be helpful with recruiting new graduate students
to the statistics department?
Scaling Questions
• A popular technique in survey design
is the use of scaling questions.
– Respondents are able to select a number
or category that represents their answer
to the survey question.
• Likert scaling is a common technique
used in questionnaires.
– A Likert item is a question or statement on a
questionnaire where the respondent gives
a rating for their response on a topic.
– The rating is usually the level of
agreement the respondent has concerning
the statement or question.
– A Likert item is balanced, meaning there is
an equal number of positive and negative
positions.
http://en.wikipedia.org/wiki/File:Example_Likert_Scale.jpg
Scaling Questions
• Research reports that 5-point and 7-point scale responses are the
most common.
• The inclusion of the middle option increases the validity and
reliability of a response scale slightly.
– Example: Disagree | Slightly Disagree | Neither agree or disagree |
Slightly Agree | Agree
• Likert items can be analyzed separately or the items may be
summed and the sum can be analyzed. The sum of Likert items is
called the Likert Scale.
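Summing Likert items can be sketched as below. The numeric coding 1 to 5 for the five response labels is an assumption made for this example (a common convention), and the three responses are hypothetical.

```python
# Assumed coding of the 5-point response labels (1 = Disagree ... 5 = Agree)
LIKERT_CODES = {
    "Disagree": 1,
    "Slightly Disagree": 2,
    "Neither agree or disagree": 3,
    "Slightly Agree": 4,
    "Agree": 5,
}

def likert_scale_score(item_responses):
    """Sum the coded responses to several Likert items for one respondent."""
    return sum(LIKERT_CODES[r] for r in item_responses)

# One hypothetical respondent answering three items
one_respondent = ["Agree", "Slightly Agree", "Disagree"]
score = likert_scale_score(one_respondent)
print(score)
```

If some items are worded in the opposite direction, they would need to be reverse-coded before summing; that step is not shown here.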
Pretest
• A pretest of the survey to a smaller sample is suggested if
possible.
• This pretest can
– Allow you to revise the questionnaire if needed.
– Allow you to create a closed question from the responses for
an open question.
– Help you estimate the variability in the responses to your
questions.
Sample Size Calculation
for a Proportion
Let
n = sample size
σ = standard deviation
d = confidence interval size
α = significance level
Then, to get a (1-α)*100% confidence interval of size d, we
need a sample size of:
\[
n = \left[ \frac{2\sigma}{d} \, \Phi^{-1}\!\left(1 - \frac{\alpha}{2}\right) \right]^{2}
\]
Sample Size Calculation
for a Proportion
For example, suppose we want an estimate for a 95%
confidence interval of width 0.2 (meaning we have a 0.1
margin of error). If we know from a pilot study that the
standard deviation of the population is 0.5, then,
σ = 0.5
d = 0.2
α = 0.05
And plugging these numbers into the previous equation,
we get,
n ≈ 96.04,
which means we need to sample 97 people.
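The worked example can be checked with a short Python sketch. It assumes the formula n = [(2σ/d) Φ⁻¹(1 − α/2)]², where Φ⁻¹ is the inverse of the standard normal distribution function; the function name sample_size is illustrative.

```python
import math
from statistics import NormalDist

def sample_size(sigma, d, alpha):
    """Sample size for a confidence interval of total width d."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 when alpha = 0.05
    return (2 * sigma * z / d) ** 2

n = sample_size(sigma=0.5, d=0.2, alpha=0.05)
people = math.ceil(n)    # always round up to a whole person
print(round(n, 2), people)
```

Rounding up rather than to the nearest integer keeps the interval no wider than the target width.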
Mozambican Survey
• Asked people living in villages to rate how “painful” a task it
was to fetch water on a 6-point Likert scale (ranging from 1- not
painful at all, to 6- extremely painful)
– Question was given to households in villages both with and
without a water pump
– Some households, especially those without water pumps,
must travel hours per day to fetch water
• How can we best depict the resulting data?
– Histograms
– Box-and-whisker plots
[Figure: histogram, "Water Fetch Pain for Pumps"; x-axis: Level of Pain (1 to 6); y-axis: Frequency]
[Figure: histogram, "Water Fetch Pain for No Pumps"; x-axis: Level of Pain (1 to 6); y-axis: Frequency]
[Figure: "Water Fetch Pain" for the No Pump and Pump groups; axis: Level of Pain (1 to 6)]
[Figure: "Water Fetch Pain" for the No Pump and Pump groups, with individual data points shown; axis: Level of Pain (1 to 6)]
Two-Sample t-Test Results
data: pump and no.pump
t = -7.3009
df = 217.264
p-value = 5.329e-12
alternative hypothesis: true difference
in means is not equal to 0
95 percent confidence interval:
[-1.2752194, -0.7330663]
sample estimates:
mean of x = 2.563758
mean of y = 3.567901
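The fractional degrees of freedom in the output above suggest an unequal-variance (Welch) two-sample t-test, although the slides do not say so explicitly. Under that assumption, the t statistic and degrees of freedom can be sketched as follows. The two small samples are made up for illustration and are not the Mozambican data; the p-value is omitted because it needs a t-distribution routine outside the standard library.

```python
import math
from statistics import mean, variance

def welch_t(x, y):
    """Welch two-sample t statistic and approximate degrees of freedom."""
    nx, ny = len(x), len(y)
    vx, vy = variance(x) / nx, variance(y) / ny   # squared standard errors
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df

# Hypothetical pain ratings for two small groups
group_x = [1, 2, 3, 4, 5]
group_y = [2, 4, 6, 8, 10, 12]
t_stat, dof = welch_t(group_x, group_y)
print(round(t_stat, 3), round(dof, 3))
```

A negative t statistic here means the first group's mean is lower than the second's, matching the sign convention in the reported results (mean of x minus mean of y).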
Scatterplots
• Draws data points on a 2-dimensional plane
• Great for visualizing relationship between two continuous
variables
• Requires paired data: (x1,y1), (x2,y2), (x3,y3), etc.
• Examples are: income vs. expenditure; height vs. weight;
electricity consumption vs. home square footage
• Should be used in the exploratory phase of analysis
• Can be misleading, especially if there is a lurking variable
Scatterplot Example
• 736 households in northeastern Mozambique were asked about
their water consumption habits
• Variable of interest is called improved LPCD – or liters per capita
daily from an improved (clean) water source
• We compare improved LPCD with the distance to the nearest
handpump
[Figure: scatterplot of improved LPCD versus distance to the nearest handpump]
References
• Dillman, Don A., Jolene D. Smyth, and Leah Melani Christian.
Internet, Mail, and Mixed-Mode Surveys: The Tailored Design
Method. 3rd ed. Hoboken, NJ: John Wiley & Sons, Inc, 2009.
• Lietz, P. (2010) Research into Questionnaire Design. International
Journal of Market Research, 52, 2, pp. 249-272.
• Scheaffer, Richard L., William Mendenhall III, and R. Lyman Ott.
Elementary Survey Sampling. 6th ed. Belmont, CA: Duxbury, 2006.
• http://en.wikipedia.org/wiki/Likert_scale
• http://www.surveysystem.com/sdesign.htm
• http://www.csudh.edu/dearhabermas/sampling01.htm
• http://www.youtube.com/watch?v=53mASVzGRF4
• http://www.statsoft.com/textbook/principal-components-factoranalysis/