Transcript: Session 2
Specifying the Conceptual
and Operational Models and
the Research Questions that
Follow
Mark W. Lipsey
Vanderbilt University
IES/NCER Summer Research Training Institute, 2010
Focus on randomized controlled
trials
Purpose of the Summer Training Institute:
Increasing capacity to develop and conduct
rigorous evaluations of the effectiveness of
education interventions
Caveat: “Rigorous evaluations” are not
appropriate for every intervention or every
research project involving an intervention
They require special resources (funding,
amenable circumstances, expertise, time)
They can produce misleading or uninformative
results if not done well
The preconditions for making them meaningful
may not be met.
Critical preconditions for rigorous
evaluation
A well-specified, fully developed
intervention with useful scope
basis in theory and prior research
identified target population
specification of intended outcomes/effects
“theory of change” explication of what it does
and why it should have the intended effects for
the intended population
operators’ manual: complete instructions for
implementing
ready-to-go materials, training procedures,
software, etc.
Critical preconditions for rigorous
evaluation (continued)
A plausible rationale that the intervention is
needed; reason to believe it has
advantages over what’s currently proven
and available
Clarity about the relevant counterfactual–
what it is supposed to be better than
Demonstrated “implementability”– can be
implemented well enough in practice to
plausibly have effects
Some evidence that it can produce the
intended effects albeit short of standards
for rigorous evaluation
Critical preconditions for rigorous
evaluation (continued)
Amenable research sites and
circumstances:
cooperative schools, teachers, parents,
and administrators willing to participate
student sample appropriate in terms of
representativeness and size for showing
educationally meaningful effects
access to students (e.g., for testing),
records, classrooms (e.g., for
observations)
IES funding categories
Goal 2 (intervention development) for
advancing an intervention concept to the
point where rigorous evaluation of its
effects may be justified
Goal 3 (efficacy studies) for determining
whether an intervention can produce
worthwhile effects; RCT evaluations
preferred.
Goal 4 (effectiveness studies) for
investigating the effects of an intervention
implemented under realistic conditions at
scale; RCT evaluations preferred.
Specifying the theory of change
embodied in the intervention
1. Nature of the need addressed
what and for whom (e.g., 2nd grade
students who don’t read well)
why (e.g., poor decoding skills, limited
vocabulary)
where the issues addressed fit in the
developmental progression (e.g.,
prerequisites to fluency and
comprehension, assumes concepts of
print)
rationale/evidence supporting these
specific intervention targets at this
particular time
Specifying the theory of change
2. How the intervention addresses the need
and why it should work
content: what the student should know or be
able to do; why this meets the need
pedagogy: instructional techniques and
methods to be used; why appropriate
delivery system: how the intervention will
arrange to deliver the instruction
Most important: What aspects of the above
are different from the counterfactual
condition
What are the key factors or core ingredients
most essential and distinctive to the
intervention
Logic models as theory schematics
[Logic model diagram, read left to right:]
Target population: 4-year-old pre-K children
Intervention: exposed to intervention
Proximal outcomes: improved pre-literacy skills; positive attitudes to school; learn appropriate school behavior
Distal outcomes: increased school readiness; greater cognitive gains in K
Mapping variables onto the intervention
theory: Sample characteristics
[Annotations on the logic model diagram:]
Sample descriptors: basic demographics; diagnostic, need/eligibility identification; nuisance factors (for variance control)
Potential moderators: setting, context; personal and family characteristics; prior experience
Mapping variables onto the intervention
theory: Intervention characteristics
[Annotations on the logic model diagram:]
Independent variable: T vs. C experimental condition
Generic fidelity: T and C exposure to the generic aspects of the intervention (type, amount, quality)
Specific fidelity: T and C(?) exposure to distinctive aspects of the intervention (type, amount, quality)
Potential moderators: characteristics of personnel; intervention setting, context (e.g., class size)
Mapping variables onto the intervention
theory: Intervention outcomes
[Annotations on the logic model diagram:]
Focal dependent variables: pretests (pre-intervention); posttests (at end of intervention); follow-ups (lagged after end of intervention)
Other dependent variables: construct controls (related DVs not expected to be affected); side effects (unplanned positive or negative outcomes); mediators (DVs on causal pathways from intervention to other DVs)
Main relationships of (possible)
interest
Causal relationship between IV and DVs (effects
of causes); tested as T-C differences
Duration of effects post-intervention; growth
trajectories
Moderator relationships; ATIs (aptitude-treatment
interactions): differential T effects for different
subgroups; tested as T x M interactions or T-C
differences between subgroups
Mediator relationships: stepwise causal
relationships in which the effect on one DV causes
an effect on another; tested via Baron & Kenny
(1986) or SEM-type techniques.
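The three kinds of relationships above can be sketched with simulated trial data. This is a minimal illustration, not the session's analysis: the effect sizes, sample size, and variable names are invented, and plain least-squares estimates stand in for the full modeling machinery.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400

# Simulated trial: T = treatment indicator, M = a moderator,
# Med = hypothesized mediator, Y = distal outcome (all invented).
T = rng.integers(0, 2, n).astype(float)
M = rng.normal(0, 1, n)
Med = 0.5 * T + rng.normal(0, 1, n)            # intervention moves the mediator
Y = 0.3 * T + 0.4 * Med + 0.2 * T * M + rng.normal(0, 1, n)

def ols(X, y):
    """Least-squares coefficients, with an intercept column prepended."""
    X = np.column_stack([np.ones(len(y)), *X])
    return np.linalg.lstsq(X, y, rcond=None)[0]

# 1. Causal IV-DV relationship: tested as the T-C difference on Y
b_total = ols([T], Y)[1]

# 2. Moderator relationship: tested as the T x M interaction term
b_interaction = ols([T, M, T * M], Y)[3]

# 3. Mediator relationship (Baron & Kenny-style steps):
a = ols([T], Med)[1]          # T -> mediator
b = ols([T, Med], Y)[2]       # mediator -> Y, controlling for T
indirect = a * b              # indirect (mediated) effect

print(round(b_total, 2), round(b_interaction, 2), round(indirect, 2))
```

With this simulation the three estimates should land near the generating values (total effect about .5, interaction about .2, indirect effect about .2), up to sampling error.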
Formulation of the research
questions
Organized around key variables and
relationships
Specific with regard to the nature of
the variables and relationships
Supported with a rationale for why
the question is important to answer
Connected to real-world education
issues
What works, for whom, under what
circumstances, how, and why?
Describing and Quantifying
Outcomes
Mark W. Lipsey
Vanderbilt University
IES/NCER Summer Research Training Institute, 2010
Outcome constructs to measure
Identifying the relevant outcome constructs
follows from the theory development and other
considerations covered in the earlier session
What: proximal/mediating and distal outcomes
When: temporal status– baseline, immediate
outcome, longer term outcomes
What else:
possible positive or negative side effects
construct control outcomes not targeted for
change
Aligning the outcome constructs and measures with
the intervention and policy objectives
[Diagram linking instruction, assessment, and policy-relevant outcomes (e.g., state achievement standards)]
Alignment of instructional tasks, activities, and content with the assessment tasks:
identical
analogous (near transfer)
generalized (far transfer)
Basic psychometric issues
Validity (typically correlation with established
measures or subgroup differences)
Reliability (typically internal consistency or test-retest correlation)
standardized measures of established validity
and reliability
researcher developed measures with validity
and reliability demonstrated in prior research
new measures with validity and/or reliability
to be investigated in present study
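Internal consistency is commonly summarized with Cronbach's alpha. A minimal sketch on simulated item scores (the item structure and sample are invented for illustration):

```python
import numpy as np

def cronbach_alpha(items):
    """Internal-consistency reliability; items is an (examinees x items) array."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)     # variance of total scores
    return k / (k - 1) * (1 - item_vars / total_var)

# Illustration: 6 simulated items sharing a common factor (half signal, half noise)
rng = np.random.default_rng(1)
ability = rng.normal(0, 1, (500, 1))
items = ability + rng.normal(0, 1, (500, 6))
alpha = cronbach_alpha(items)
print(round(alpha, 2))
```

With equal signal and noise variance the expected alpha for 6 items is roughly .86, so a researcher-developed measure simulated this way would look acceptably reliable.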
Sensitivity to change: Achievement effect sizes
from 124 randomized education studies

Type of Outcome Measure           Mean Effect Size   Number of Measures
Standardized test, broad                .04                 103
Standardized test, narrow               .28                 426
Focal topic test, mastery test          .40                 300
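Effect sizes like those above are standardized mean differences. A sketch of computing Hedges' g from two groups' posttest scores (the score distributions here are simulated, not drawn from the 124 studies):

```python
import numpy as np

def hedges_g(treat, control):
    """Standardized mean difference with the small-sample (Hedges) correction."""
    n_t, n_c = len(treat), len(control)
    pooled_sd = np.sqrt(((n_t - 1) * treat.var(ddof=1) +
                         (n_c - 1) * control.var(ddof=1)) / (n_t + n_c - 2))
    d = (treat.mean() - control.mean()) / pooled_sd
    correction = 1 - 3 / (4 * (n_t + n_c) - 9)    # removes small-sample bias
    return d * correction

# Example: posttest scores on a focal topic test (invented distributions)
rng = np.random.default_rng(2)
treat = rng.normal(104, 10, 60)    # treatment group
control = rng.normal(100, 10, 60)  # control group
g = hedges_g(treat, control)
print(round(g, 2))
```

The generating difference of 4 points on an SD of 10 corresponds to a true effect of .40, comparable to the focal topic test row in the table.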
Data from which measurement
sensitivity can be inferred
Observed effects from other intervention
studies using the measure
Mean effect sizes and their standard
deviations from meta-analysis
Longitudinal research and descriptive
research showing change over time or
differences between relevant criterion groups
Archival data allowing ad hoc analysis of,
e.g., change over time, differences between
groups
Pilot data on change over time or group
differences with the measure
Variance control and
measurement sensitivity
Variance control via procedural consistency
Statistical control using covariates for, e.g., pre-intervention individual differences and differences in testing procedures or conditions
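A sketch of why a pretest covariate sharpens sensitivity: adjusting for the covariate shrinks the residual (error) variance against which the T-C difference is tested. The data and coefficients below are simulated for illustration only.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300
pretest = rng.normal(0, 1, n)                  # pre-intervention individual differences
T = rng.integers(0, 2, n).astype(float)        # randomized T vs. C indicator
posttest = 0.3 * T + 0.8 * pretest + rng.normal(0, 0.5, n)

def residual_sd(X, y):
    """Residual standard deviation from a least-squares fit (intercept added)."""
    X = np.column_stack([np.ones(len(y)), *X])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    return (y - X @ beta).std(ddof=X.shape[1])

sd_unadjusted = residual_sd([T], posttest)         # T-C comparison alone
sd_adjusted = residual_sd([T, pretest], posttest)  # pretest as covariate
print(round(sd_unadjusted, 2), round(sd_adjusted, 2))
```

Because the pretest explains much of the outcome variance in this simulation, the covariate-adjusted residual SD is roughly half the unadjusted one, which is what buys the added statistical power.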
Issues related to multiple
outcome measures
Correlated measures:
overlap and efficiency
Factor Analysis of Preschool Outcome Variables
(factor loadings)

Subtest                       Pre-K Pretest   Pre-K Posttest   Kindergarten Follow-up
Letter Word Identification         .60              .69                .73
Quantitative Concepts              .82              .82                .78
Applied Problems                   .82              .80                .75
Picture Vocabulary                 .75              .76                .67
Oral Comprehension                 .82              .79                .74
Story Recall                       .53              .55                .61
Correlated change may be even
more relevant
Factor Analysis of Gain Scores for Pre-K Outcomes
(factor loadings; F1 = Basic School Skills, F2 = Complex Language)

                                 Pre to Post    Post to Follow-up   Pre to Follow-up
Subtest                          F1      F2      F1       F2          F1       F2
Basic School Skills
Letter Word Identification       .74    -.19     .73      .14         .79     -.04
Quantitative Concepts            .66     .14     .70      .17         .74      .13
Applied Problems                 .54     .08     .47     -.16         .40     -.01
Complex Language
Picture Vocabulary               .09     .77    -.06      .48        -.15      .74
Oral Comprehension               .16     .75     .06      .72         .13      .69
Story Recall                    -.08     .37     .16      .68         .41      .37
Handling multiple correlated
outcome measures
Pruning– try to avoid measures that have
high conceptual overlap and are likely to
have relatively large intercorrelations
Procedural– organize assessment and data
collection to combine where possible for
efficiency
Analytic
create composite variables to use in the analysis
use multivariate techniques like MANOVA to
examine omnibus effects as context for
univariate effects
use latent variable analysis, e.g., in SEM
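A minimal sketch of the composite-variable option: standardize each measure and average the z-scores, so correlated outcomes enter the analysis as one variable. The subtest data here are simulated for illustration.

```python
import numpy as np

def composite_z(scores):
    """Average of standardized (z-scored) measures; one column per measure."""
    z = (scores - scores.mean(axis=0)) / scores.std(axis=0, ddof=1)
    return z.mean(axis=1)

# e.g., three correlated pre-literacy subtests combined into one composite
rng = np.random.default_rng(4)
common = rng.normal(0, 1, (200, 1))             # shared construct
subtests = common + rng.normal(0, 1, (200, 3))  # three noisy indicators of it
composite = composite_z(subtests)
print(composite.shape)
```

Averaging z-scores weights each subtest equally; weighted composites or latent-variable scores are the alternatives when equal weighting is not defensible.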
IES Guidelines on multiple
significance tests
Schochet, P.Z. (2008). Technical methods report: Guidelines for
multiple testing in impact evaluations. IES/NCEE 2008-4108.
http://ies.ed.gov/pubsearch/pubsinfo.asp?pubid=NCEE20084018
Delineate separate outcome domains in the study protocol.
Define confirmatory and exploratory analyses prior to data analysis
Specify which subgroups will be part of the confirmatory analysis and
which will be part of the exploratory analysis
Design the evaluation to have sufficient statistical power for
examining effects for all prespecified confirmatory analyses
For domain-specific confirmatory analysis, conduct hypothesis testing
for domain outcomes as a group
Multiplicity adjustments are not required for exploratory analysis
Qualify confirmatory and exploratory analysis findings in the study
report
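For hypothesis testing on domain outcomes as a group, one widely used multiplicity adjustment is the Benjamini-Hochberg step-up procedure. This sketch uses invented p-values, and BH is offered here as one common choice rather than as the specific method the guidelines require.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """BH step-up procedure: returns which tests in a domain are significant."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])   # indices, smallest p first
    max_k = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank / m * alpha:
            max_k = rank          # largest rank meeting the BH criterion
    significant = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= max_k:
            significant[i] = True
    return significant

# Four outcome tests within one domain (invented p-values)
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.20]))  # → [True, False, False, False]
```

Note that only the smallest p-value survives here even though .03 and .04 would each clear an unadjusted .05 threshold, which is exactly the multiplicity protection the guidelines call for in confirmatory analyses.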
Practicality and appropriateness to
the circumstances
Feasibility– time and resources required
Respondent burden– minimize demands,
provide incentives/compensation
Developmental appropriateness– consider
not only age but performance level, possible
ceiling and floor effects
For follow-up beyond one school year, may
need measures designed for a broad age
span to maintain comparability
May need to tailor measures or assessment
procedures for special populations
(disabilities, English language learners)