Designs for Research: The Xs and Os Framework
Research Methods for Public Administrators
Dr. Gail Johnson
Steps in the Research Process
Planning
1. Determining Your Questions
2. Identifying Your Measures and Measurement Strategy
3. Selecting a Research Design
4. Developing Your Data Collection Strategy
5. Identifying Your Analysis Strategy
6. Reviewing and Testing Your Plan
Narrow Definition of Design
• While the overall research plan is sometimes called a “design,” this discussion focuses on the narrow definition
• The narrow definition focuses on three design elements
Three Design Elements
1. When measures are taken
   • After
   • Before and After
   • Multiple times before and/or after
2. Whether there are comparison groups
3. Whether there is random assignment to comparison groups
Three Broad Categories for
Research Design
• Experimental
• Quasi-Experimental
• Non-Experimental
Experimental Design
• The best design to use for cause-effect questions because it rules out most other possible explanations for the results obtained.
• Random assignment assures that the two groups are comparable.
The Xs and Os Framework
R  indicates Random assignment to the treatment group or the comparison group
O  is the Observation, that is, the measure for the dependent variable
   • Examples: earnings, weight, test scores, stock market trading, reported crime rate, kilowatt hours, reported discrimination, poverty rate, number of people unemployed, GDP, etc.
   • The researchers are looking to see if these measures change because of the treatment
The Xs and Os Framework
X  is the treatment, which may be:
   • A particular medication
   • A particular exercise regimen
   • A program (e.g., the Head Start Program or the Troubled Asset Relief Program)
   • An independent variable (e.g., economic news stories, sunspot activity, a change in daylight saving time, etc.)
Example:
Which approach works better in learning statistics: using computer software or calculating formulas by hand?
Experimental Design:
• Create Comparison Groups:
  • Group 1: X: computers to do formulas
  • Group 2: no computers
• Randomly assign students into 2 groups
• Observe: test scores before and after
Using the Xs and Os Framework
R   O1   X   O2
R   O1        O2

R indicates Random assignment
O is the Observation (test scores): testing statistical knowledge before and after
X is the treatment (in this case, the use of computers)
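A minimal simulation of this design may help make the notation concrete. All scores, group sizes, and effect sizes below are invented for illustration (none of the numbers come from the slides): random shuffling plays the role of R, and the pre/post means stand in for O1 and O2.

```python
import random
import statistics

# Hypothetical R O1 X O2 simulation; every number here is invented.
random.seed(42)
students = list(range(40))
random.shuffle(students)                      # R: random assignment
treatment, control = students[:20], students[20:]

pre = {s: random.gauss(70, 10) for s in students}    # O1: pre-test scores
post = {s: pre[s] + random.gauss(8 if s in treatment else 3, 5)
        for s in students}                           # O2: post-test scores

def mean_gain(group):
    return statistics.mean(post[s] - pre[s] for s in group)

print("Treatment group gain:", round(mean_gain(treatment), 1))
print("Control group gain:  ", round(mean_gain(control), 1))
```

Because assignment is random, a systematic difference between the two gains can be attributed to the treatment rather than to pre-existing differences between the groups.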
Experimental Design:
Xs and Os
Variation: No Pre-Measure
• Sometimes it is not possible to have a pre-measure
• For example: I am testing to see whether a welfare-to-work training program results in people getting jobs with above-poverty wages. I can randomly assign people to the program or the control group, but I will not have a good measure for wages before they entered the program, since they are all on welfare.
Experimental Design:
Xs and Os Notation
Variation: No Pre-Measure (note there are no observations before the treatment)

R   X   O2
R       O2
Quasi-Experimental Designs
• Non-Equivalent Comparison Design
  • Like experimental except no random assignment
  • Use when you cannot control the process for deciding who gets the treatment.
  • Weak because there may be selection bias
  • But this is often more practical in public sector research
Quasi-Experimental Design:
Xs and Os
Treatment Group:   O1   X   O2
Control Group:     O1        O2

Key elements:
• No random assignment
• Pre and post measurement
• Treatment given to the test group
• Control group (or comparison group without the treatment, but there is no random assignment).
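A hedged sketch of one common way to read this O1 X O2 / O1 O2 pattern: a difference-in-differences on the two groups' pre-to-post gains. The classrooms, scores, and the difference-in-differences technique itself are illustrative additions, not from the slides.

```python
import statistics

# Non-equivalent comparison sketch: two intact classrooms, no random
# assignment. All scores are invented for illustration.
treatment_pre,  treatment_post  = [62, 68, 71, 65], [74, 80, 82, 77]
comparison_pre, comparison_post = [60, 66, 70, 64], [63, 69, 73, 66]

gain_treatment  = statistics.mean(treatment_post)  - statistics.mean(treatment_pre)
gain_comparison = statistics.mean(comparison_post) - statistics.mean(comparison_pre)

# Difference-in-differences: the treatment group's extra gain
print("Estimated effect:", round(gain_treatment - gain_comparison, 1))
```

The estimate is only as credible as the match between the groups; selection bias remains this design's core weakness.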
Quasi-Experimental Designs
• Does spanking make a difference?
• Can we randomly assign children to spanking and non-spanking parents?
• No: we have to deal with the world as it exists
• At best we can compare the behavior of children whose parents spank with that of children whose parents don’t spank.
Types of Quasi-Experimental
Designs
• Statistical Controls (sometimes called Correlation with Statistical Controls)
  • Variations: Causal Comparative or Ex Post Facto design
  • Basically: statistical procedures are used to create comparison groups
Ex-post Facto Design: Study of
Child Abuse and Neglect
A study funded by the Army Medical Research and Materiel Command reported, “During the 40 months covered by the study, 1,858 parents in 1,771 families of enlisted soldiers neglected or abused their children, in a total of 3,334 incidents involving 2,968 children. Of those, 942 incidents occurred during deployments.”[1]

[1] Aaron Levin, “Children of U.S. Army soldiers face increased risk of maltreatment while a parent is deployed away from home,” Psychiatric News, September 7, 2007, Volume 42, Number 17, page 8; “Child Abuse, Neglect Rise Dramatically When Army Parents Deploy To Combat,” ScienceDaily, August 1, 2007, http://www.sciencedaily.com/releases/2007/07/070731175911.htm
Ex-post Facto Design: Study of
Child Abuse and Neglect
• In this study, the researchers gathered data about children at a child care center serving military families and compared the characteristics of those who were reported to have been abused or neglected with those of children who were not.
• They looked backwards to see if there were differences that might explain why some children were abused and neglected.
• They found that deployments were a factor.
• From a policy perspective, this suggests that families need more support to handle the stresses associated with deployments.
Correlational Design with
Statistical Controls
We cannot randomly assign people, but we can create comparison groups using statistical software and then compare outcomes.
• E.g., we can compare people from different income groups to see if income is related to the birth weights of their babies.
• E.g., we can compare citizen policy preferences to see if there are differences based on age, race, or gender.
Does Head Start Make a
Difference?
• Select all 8th graders from two inner-city schools
• Obtain school records, which have information about whether the students attended Head Start, as well as other information
• Statistical software can divide all the 8th graders into two groups: those who attended Head Start and those who didn’t
• The 8th-grade reading scores can be compared
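A brief pandas sketch of this procedure, using hypothetical school-record data; the column names, scores, incomes, and the income-band control are invented for illustration and are not from any actual study.

```python
import pandas as pd

# Hypothetical stand-in for the school records described above; the
# column names, scores, and incomes are all invented for illustration.
records = pd.DataFrame({
    "attended_head_start": [True, False, True, False, True, False, True, False],
    "reading_score":       [78, 72, 81, 69, 75, 70, 80, 74],
    "family_income":       [28000, 31000, 42000, 33000, 27000, 52000, 45000, 30000],
})

# The software divides the 8th graders into two groups and compares scores
print(records.groupby("attended_head_start")["reading_score"].mean())

# A simple statistical control: also compare within family-income bands
records["income_band"] = pd.cut(records["family_income"],
                                bins=[0, 35000, 60000],
                                labels=["lower", "higher"])
print(records.groupby(["income_band", "attended_head_start"],
                      observed=True)["reading_score"].mean())
```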
Does Head Start Make a
Difference?
• If Head Start made a difference, then:
  • The attendees’ scores will be higher than those of students who did not attend
  • Their scores will be similar to the scores of other 8th graders in the school district
• It might be possible to look at other factors, assuming the data are in their permanent records: education of parents, family income, other pre-school experiences
More Quasi-experimental
Designs
• Longitudinal and Time Series
  • Measures taken over time
  • Time series: many measures
  • Longitudinal: a few measures
  • No clear dividing point when longitudinal becomes a time series
  • Example: Federal budget deficit over time
  • Noted: O O O O O O O O O O O O O
More Quasi-experimental
Designs
• Interrupted Time Series
  • Measures taken before and after an event
  • Time series: at least 15 measures before and after
  • Example: Number of smog warnings before and after air pollution legislation was passed in the city
  • Noted: O O O O X O O O O O
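A minimal sketch of reading such a series; the monthly counts and the placement of the intervention (X) are invented. A real interrupted time series analysis would model the pre-existing trend, not just compare the two means.

```python
import statistics

# Interrupted time series sketch; all counts are invented.
warnings_per_month = [12, 14, 13, 15, 11, 14,   # O O O O O O (before X)
                      9, 8, 10, 7, 8, 6]        # O O O O O O (after X)

before, after = warnings_per_month[:6], warnings_per_month[6:]
print("Mean before legislation:", round(statistics.mean(before), 1))
print("Mean after legislation: ", round(statistics.mean(after), 1))
# A real analysis would also check for a pre-existing downward trend
# (history and maturation threats), not just compare the two means.
```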
More Quasi-experimental
Designs
• Multiple Time Series: Comparison
  • Example: number of smog days after a city passes air pollution legislation as compared to a city of equal size and density that did not pass an air pollution law
  • Noted:
      City with law:    O O O O O X O O O O
      Comparison city:  O O O O O   O O O O
More Quasi-experimental
Designs
• Two ways to select:
  • Cross-sectional: a slice of the population; a different group of people, roads, or cities at each point in time
    • Drug survey of high school seniors
  • Panel: track the same people, roads, or cities over time
    • National Longitudinal Survey of Youth: the same group of people has been surveyed since 1979
Non-Experimental Designs
• Sometimes researchers are just trying to take a picture at one point in time
• They are not trying to answer a cause-effect/impact question
• These designs are appropriate for answering the descriptive and normative questions discussed earlier
Non-Experimental Design
One shot:   X   O

Key elements:
• No random assignment
• No pre-measures
• No comparison
Weakest design for cause-effect questions!!
Non-Experimental Design
Variations:

Before and After Design:    O   X   O

Static Group Comparison:    X   O
                                O
True Confessions
• Immigration Reform and Control Act
  • Employers would be fined if they knowingly hired illegal workers
• GAO was asked to determine whether this law caused a widespread pattern of discrimination against those who look or sound foreign.
• Type of Question: Cause-Effect
True Confessions
• What design elements can be used?
• Random Assignment? No. Congress does not randomly require some states to implement a law and some states not.
• Comparison Groups? No. All states had to implement at the same time.
• Before measure? No. The law was implemented before any measure could be taken.
True Confessions
• What design is left?
  • Implement the law (X) and measure discrimination (O)
  • A one-shot design
  • The weakest design to answer an impact question.
• You play the hand you are dealt.
Sometimes Experimental Designs
are not Possible
• Designs reflect the situation, and an experimental design is not always possible or practical.
• You can’t assign children to parents who spank and those who do not
• It might be more practical to conduct a reading program in a specific school rather than randomly assigning children across the school district into a reading program or not.
Sometimes Experimental Designs
are not Possible
• In public administration, the use of experimental designs is limited by ethical and legal considerations:
  • You cannot require anyone to participate.
  • You cannot deny services or benefits to which people are entitled.
  • You cannot deny life-saving treatments to people in need.
Sometimes Experimental Designs
Are Not Possible
Politics may play a role: mayors may object to their city being in the “control group” while other cities get money to implement a program.
Design and Internal Validity
• You may see changes after a program has been implemented, but those changes might be caused by something other than the program.
• The intention of design is to ensure that you are not tricked into believing an explanation that is not true.
• Design helps ensure internal validity.
• Design helps rule out other possible (or rival) explanations.
Threats to Internal Validity
History: Changes due to a particular event that took place while data were being collected.
• A drug-related death just before the post-test may explain a “no drug” attitude, not the program.
• Using a comparison group in the same environment will reduce this threat.
• If a comparison group is not possible, ask: “What has happened? Was there some event that might affect the results?”
Threats to Internal Validity
Maturation: Changes based on aging, growth, or natural increases in skills
• Improved study skills because of maturity, not the program
• This matters in studies where the behavior or attitude is likely to be affected by getting older or becoming more experienced
• Using a comparison group will reduce this threat
Threats to Internal Validity
Testing: Changes due to learning how to take the test.
• A risk in pre/post designs where participants “learned” how to do the test.
• Using a comparison group would reduce this threat because both groups would have taken the pre- and post-tests. Any learning from the testing alone would be controlled.
Threats to Internal Validity
Instrumentation: Changes in data collection
• Pre/post and comparative designs are vulnerable
• Example: Interviewer changes (race/gender) may get different results, especially on race/gender questions.
• Example: Changing the wording of questions or changing measures is a problem because different things have been measured. The results are not truly comparable.
• Ask: Are the measures reliable?
Threats to Internal Validity
Regression to the Mean: Things tend to average out over time
• A problem when a group is selected for treatment, or a program is enacted, because of an unusually high or low score.
• The next set of scores is likely to change, to “regress to the mean,” regardless of treatment.
• Using measures over time, or a comparison time series, helps identify trends and makes it easier to distinguish real change from the mere appearance of change caused by the regression-to-the-mean effect.
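A small simulation can make the effect concrete; the population, scores, and noise levels below are all invented. Selecting the lowest scorers on one test and simply retesting them raises their average with no treatment at all.

```python
import random

# Regression-to-the-mean simulation; every number here is invented.
# An extreme score is partly true ability and partly luck that does
# not repeat on the next test.
random.seed(1)
ability = [random.gauss(70, 8) for _ in range(200)]        # true skill
test1 = [a + random.gauss(0, 6) for a in ability]          # skill + luck

lowest = sorted(range(200), key=lambda i: test1[i])[:10]   # "selected for treatment"
test2 = [ability[i] + random.gauss(0, 6) for i in lowest]  # retest, fresh luck

print("Test 1 mean of selected group:", round(sum(test1[i] for i in lowest) / 10, 1))
print("Test 2 mean of selected group:", round(sum(test2) / 10, 1))
```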
Threats to Internal Validity
Selection: The group under study may be different in ways that affect the results.
• A school selected for a program is different from the schools that were not selected
  • A low-income school may score differently than a high-income school
• Volunteers may be different from those who chose not to participate.
• “Did the program officials select the people most likely to succeed to make the program look successful?”
Threats to Internal Validity
Selection: The group under study may be different in ways that affect the results.
• Random selection and assignment avoid this problem
• But if randomization is not possible, collect data that might help examine differences (demographic data usually work)
Threats to Internal Validity
Attrition: Different rates of dropping out may affect results.
• “Problem” people may drop out, so results may look better based on those left behind.
• E.g., test scores may be higher because the failing students had dropped out.
• Do what is possible to avoid attrition. If there is attrition, researchers should note it as a limitation on the conclusions that can be drawn.
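A tiny numeric sketch of this threat, with invented scores: no student improves, yet the post-test mean rises once the two lowest scorers drop out.

```python
import statistics

# Attrition sketch; all scores are invented. Nobody improves, but
# losing the lowest scorers before the post-test inflates the mean.
pre_scores  = [45, 50, 55, 60, 75, 80, 85, 90]
post_scores = [45, 50, 55, 60, 75, 80, 85, 90]   # no real change

remaining_post = post_scores[2:]                 # two lowest drop out

print("Pre-test mean (all students):   ", statistics.mean(pre_scores))
print("Post-test mean (those who stay):", round(statistics.mean(remaining_post), 1))
```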
Did the Poverty Program Fail?
Year    Poverty Rate
1960    22.2%
1970    12.6%
1980    12.3%
1990    13.5%
2000    11.3%
How to Decide?
• Measurement:
  • How do you define the “poverty program”?
    • What components of the poverty program were specifically designed to reduce poverty?
  • How was poverty operationalized?
    • Does food account for 1/3 of our living expenses?
• Design: No control group, no random assignment
  • At best, an interrupted time series design
  • We do not know what percent of people would be below the poverty line if the “poverty program” had not been in place during any of the recessions between 1960 and 2000.
External Validity
• Is what happens in the lab under controlled settings likely to be the same as what happens outside of the lab?
• Does what happens in this study reflect what occurs in other places where the program is also being conducted?
  • Programs may share the same name but be implemented differently.
External Validity
• Experimental designs are strong on internal validity
• But they are often weak on external validity
  • Experimental groups are relatively small and therefore rarely representative of the larger population
  • Much of what we know about social psychology comes from experiments involving college students, but they may or may not accurately reflect how other people behave.
External Validity
• It is easy for policymakers, program managers, and advocates to get excited about an innovative program or policy and decide to implement it in their community.
• The tough question is one of external validity: will this program or policy work in their particular situation?
• Public administrators have long known about the limits of “cookie cutter” or “one-size-fits-all” approaches.
No Perfect Design
• One-shot designs:
  • Useful for descriptive and normative questions
  • Very weak for cause/effect questions: many threats
    • However, it is often used in public administration. We implement a program and then see if it worked.
  • Multiple one-shot designs begin to build a case
No Perfect Design
• Pre/post designs:
  • Useful in giving context for measuring change.
  • Testing, instrumentation, regression to the mean, attrition, history, and maturation may be threats.
  • Threats tend to be context-related
    • For example, regression to the mean is only a threat if an unusually high or low score was used as the selection criterion.
    • For example, testing is only a threat if the researchers used a before-and-after test as part of their research design.
No Perfect Design
• Comparison designs:
  • Useful in looking at differences
  • Controls for history and maturation if the comparison group is a close match.
  • Selection and attrition are threats
No Perfect Design
• Experimental design:
  • Controls for most threats by design
  • Hard to do in the public sector
    • It is hard to randomly assign people or localities to receive a program or not.
    • It is sometimes unethical to deny people access to treatment just to form a control group.
Linkage: Question → Design
• Descriptive Questions (What is?):
  • One-shot designs
  • Pre/post designs
  • Cross-sectional surveys
  • Time series
• Describes inputs and outputs
Linkage: Question → Design
• Normative questions: Does the observed condition meet a given criterion?
  • One-shot design
  • Pre/post design
  • Time series
• Benchmarking is a normative question.
Linkage: Question → Design
• Impact Questions: To determine the relationship or effect between two variables, or program impact
  • Experimental designs: considered the gold standard
  • Quasi-experimental designs using a comparison group and a pre/post design
  • Interrupted time series
  • Correlational statistical designs
Elements of Good Research
• No single design can be applied to every research question because every situation is unique.
• Good researchers identify the reasons for the tradeoffs and the potential weaknesses of the study’s design.
• Sometimes there really is not much choice because the situation itself is very limited.
• In these cases, researchers state the limitations of the design and present their conclusions within the context of those limitations.
Takeaway Lessons
• If the research is claiming a cause-effect relationship, caution is warranted if:
  • You cannot determine the design in terms of the Xs and Os framework
  • The researchers did not use an experimental design or a strong quasi-experimental design
Tough Questions to Ask
• Is there something else other than the program (or hypothesized causal variable) that could explain these results?
• Has something been left out that might alter the results?
• Is the research design really strong enough to support their conclusions about program or policy success or failure?
Creative Commons
• This PowerPoint is meant to be used and shared with attribution
• Please provide feedback
• If you make changes, please share freely and send me a copy of the changes:
  • [email protected]
• Visit www.creativecommons.org for more information