Session 2: Specifying the Conceptual and Operational Models and the Research Questions that Follow


Specifying the Conceptual and Operational Models and the Research Questions that Follow
Mark W. Lipsey
Vanderbilt University
IES/NCER Summer Research Training Institute, 2010

Focus on randomized controlled trials

• Purpose of the Summer Training Institute: increasing capacity to develop and conduct rigorous evaluations of the effectiveness of education interventions
• Caveat: "rigorous evaluations" are not appropriate for every intervention or every research project involving an intervention
  - They require special resources (funding, amenable circumstances, expertise, time).
  - They can produce misleading or uninformative results if not done well.
  - The preconditions for making them meaningful may not be met.

Critical preconditions for rigorous evaluation

• A well-specified, fully developed intervention with useful scope:
  - basis in theory and prior research
  - identified target population
  - specification of intended outcomes/effects
  - "theory of change" explication of what it does and why it should have the intended effects for the intended population
  - operators' manual: complete instructions for implementing
  - ready-to-go materials, training procedures, software, etc.

Critical preconditions for rigorous evaluation (continued)

• A plausible rationale that the intervention is needed; reason to believe it has advantages over what's currently proven and available
• Clarity about the relevant counterfactual: what the intervention is supposed to be better than
• Demonstrated "implementability": it can be implemented well enough in practice to plausibly have effects
• Some evidence that it can produce the intended effects, albeit evidence short of the standards for rigorous evaluation

Critical preconditions for rigorous evaluation (continued)

• Amenable research sites and circumstances:
  - cooperative schools, teachers, parents, and administrators willing to participate
  - a student sample appropriate in terms of representativeness and size for showing educationally meaningful effects
  - access to students (e.g., for testing), records, and classrooms (e.g., for observations)

IES funding categories

• Goal 2 (intervention development): for advancing an intervention concept to the point where rigorous evaluation of its effects may be justified
• Goal 3 (efficacy studies): for determining whether an intervention can produce worthwhile effects; RCT evaluations preferred
• Goal 4 (effectiveness studies): for investigating the effects of an intervention implemented under realistic conditions at scale; RCT evaluations preferred

Specifying the theory of change embodied in the intervention

1. Nature of the need addressed
  - what and for whom (e.g., 2nd grade students who don't read well)
  - why (e.g., poor decoding skills, limited vocabulary)
  - where the issues addressed fit in the developmental progression (e.g., prerequisites to fluency and comprehension, assumes concepts of print)
  - rationale/evidence supporting these specific intervention targets at this particular time

Specifying the theory of change (continued)

2. How the intervention addresses the need and why it should work
  - content: what the student should know or be able to do; why this meets the need
  - pedagogy: instructional techniques and methods to be used; why they are appropriate
  - delivery system: how the intervention will arrange to deliver the instruction
  - Most important: what aspects of the above are different from the counterfactual condition? What are the key factors or core ingredients most essential and distinctive to the intervention?

Logic models as theory schematics

[Logic model schematic: Target Population → Intervention → Proximal Outcomes → Distal Outcomes. In the example, 4-year-old pre-K children are exposed to the intervention; proximal outcomes are improved pre-literacy skills, positive attitudes toward school, and learning appropriate school behavior; distal outcomes are increased school readiness and greater cognitive gains in kindergarten.]

Mapping variables onto the intervention theory: Sample characteristics

[The same logic model, annotated at the target-population box with the variables to measure on the sample.]

Sample descriptors:
  - basic demographics
  - diagnostic, need/eligibility identification
  - nuisance factors (for variance control)

Potential moderators:
  - setting, context
  - personal and family characteristics
  - prior experience

Mapping variables onto the intervention theory: Intervention characteristics

[The same logic model, annotated at the intervention box with the variables to measure on the intervention.]

Independent variable:
  - T vs. C experimental condition

Generic fidelity:
  - T and C exposure to the generic aspects of the intervention (type, amount, quality)

Specific fidelity:
  - T and C(?) exposure to the distinctive aspects of the intervention (type, amount, quality)

Potential moderators:
  - characteristics of personnel
  - intervention setting, context (e.g., class size)

Mapping variables onto the intervention theory: Intervention outcomes

[The same logic model, annotated at the proximal and distal outcome boxes with the outcome variables to measure.]

Focal dependent variables:
  - pretests (pre-intervention)
  - posttests (at end of intervention)
  - follow-ups (lagged after the end of intervention)

Other dependent variables:
  - construct controls: related DVs not expected to be affected
  - side effects: unplanned positive or negative outcomes
  - mediators: DVs on causal pathways from the intervention to other DVs

Main relationships of (possible) interest

• Causal relationship between IV and DVs (effects of causes); tested as T-C differences
• Duration of effects post-intervention; growth trajectories
• Moderator relationships, ATIs (aptitude-treatment interactions): differential T effects for different subgroups; tested as T x M interactions or T-C differences between subgroups
• Mediator relationships: stepwise causal relationships in which the effect on one DV causes an effect on another; tested via Baron & Kenny (1986) or SEM-type techniques (see the sketch below)

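To make the moderator and mediator tests concrete, here is a minimal sketch in Python with statsmodels, run on simulated data; the variable names (treat, moderator, mediator, outcome) and effect sizes are illustrative assumptions, not any particular study's model.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    # Simulated data with hypothetical names; a real analysis would use study data.
    rng = np.random.default_rng(0)
    n = 400
    df = pd.DataFrame({
        "treat": rng.integers(0, 2, n),      # T vs. C assignment (the IV)
        "moderator": rng.normal(size=n),     # e.g., a pretest aptitude measure
    })
    df["mediator"] = 0.5 * df["treat"] + rng.normal(size=n)          # proximal DV
    df["outcome"] = (0.3 * df["treat"] + 0.4 * df["mediator"]
                     + 0.2 * df["treat"] * df["moderator"] + rng.normal(size=n))

    # Moderation: a significant treat:moderator coefficient indicates differential
    # treatment effects across subgroups (an aptitude-treatment interaction).
    moderation = smf.ols("outcome ~ treat * moderator", data=df).fit()
    print(moderation.summary().tables[1])

    # Mediation, Baron & Kenny style: (a) T affects the mediator; (b) the mediator
    # predicts the outcome controlling for T, and the direct T effect shrinks.
    path_a = smf.ols("mediator ~ treat", data=df).fit()
    path_b = smf.ols("outcome ~ treat + mediator", data=df).fit()
    print(path_a.params["treat"], path_b.params["mediator"], path_b.params["treat"])
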
Formulation of the research questions

• Organized around key variables and relationships
• Specific with regard to the nature of the variables and relationships
• Supported with a rationale for why the question is important to answer
• Connected to real-world education issues
• What works, for whom, under what circumstances, how, and why?

Describing and Quantifying Outcomes
Mark W. Lipsey
Vanderbilt University
IES/NCER Summer Research Training Institute, 2010

Outcome constructs to measure

Identifying the relevant outcome constructs follows from the theory development and other considerations covered in the earlier session.

• What: proximal/mediating and distal outcomes
• When: temporal status (baseline, immediate outcome, longer-term outcomes)
• What else:
  - possible positive or negative side effects
  - construct control outcomes not targeted for change

Aligning the outcome constructs and measures with the intervention and policy objectives

[Schematic: instruction and assessment should be aligned with each other and with policy-relevant outcomes (e.g., state achievement standards). Assessment tasks can be aligned with the instructional tasks, activities, and content at three levels:
  - identical
  - analogous (near transfer)
  - generalized (far transfer)]

Basic psychometric issues

Validity (typically correlation with established measures or subgroup differences)
Reliability (typically internal consistency or test-retest correlation)

• standardized measures of established validity and reliability
• researcher-developed measures with validity and reliability demonstrated in prior research
• new measures with validity and/or reliability to be investigated in the present study (a reliability sketch follows below)

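For researcher-developed or new measures, internal consistency is commonly summarized with Cronbach's alpha. Here is a minimal sketch in Python using the standard formula on simulated item responses; the 6-item scale and the data are assumptions for illustration.

    import numpy as np

    def cronbach_alpha(items):
        # items: 2-D array, rows = respondents, columns = scale items
        k = items.shape[1]
        sum_item_var = items.var(axis=0, ddof=1).sum()  # sum of item variances
        total_var = items.sum(axis=1).var(ddof=1)       # variance of total scores
        return (k / (k - 1)) * (1 - sum_item_var / total_var)

    rng = np.random.default_rng(1)
    true_score = rng.normal(size=(200, 1))                      # simulated latent ability
    items = true_score + rng.normal(scale=0.8, size=(200, 6))   # 6 noisy items
    print(f"alpha = {cronbach_alpha(items):.2f}")
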
Sensitivity to change: Achievement effect sizes from 124 randomized education studies

Type of Outcome Measure           Mean Effect Size   Number of Measures
Standardized test, broad                .04                 103
Standardized test, narrow               .28                 426
Focal topic test, mastery test          .40                 300

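The effect sizes in the table are standardized mean differences: the T-C mean difference divided by the pooled standard deviation. A minimal sketch of the computation in Python, with Hedges' small-sample correction and simulated posttest scores (the group sizes and means are illustrative):

    import numpy as np

    def hedges_g(treatment, control):
        nt, nc = len(treatment), len(control)
        # Pooled standard deviation across the two groups
        pooled_sd = np.sqrt(((nt - 1) * treatment.var(ddof=1) +
                             (nc - 1) * control.var(ddof=1)) / (nt + nc - 2))
        d = (treatment.mean() - control.mean()) / pooled_sd
        correction = 1 - 3 / (4 * (nt + nc) - 9)  # small-sample bias correction
        return correction * d

    rng = np.random.default_rng(2)
    t_scores = rng.normal(loc=0.3, size=60)  # simulated treatment posttests
    c_scores = rng.normal(loc=0.0, size=60)  # simulated control posttests
    print(f"g = {hedges_g(t_scores, c_scores):.2f}")
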
Data from which measurement sensitivity can be inferred

• Observed effects from other intervention studies using the measure
• Mean effect sizes and their standard deviations from meta-analysis
• Longitudinal and descriptive research showing change over time or differences between relevant criterion groups
• Archival data allowing ad hoc analysis of, e.g., change over time or differences between groups
• Pilot data on change over time or group differences with the measure

Variance control and measurement sensitivity

Variance can be controlled through procedural consistency and through statistical control, using covariates for, e.g., pre-intervention individual differences and differences in testing procedures or conditions. A covariate-adjustment sketch follows below.

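Here is a minimal sketch of the statistical-control side in Python (statsmodels), on simulated data with illustrative names: adding a pretest covariate leaves the treatment estimate interpretable but shrinks its standard error, which is what buys the extra sensitivity.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(3)
    n = 300
    pre = rng.normal(size=n)       # pre-intervention individual differences
    treat = rng.integers(0, 2, n)  # T vs. C assignment
    post = 0.25 * treat + 0.7 * pre + rng.normal(scale=0.5, size=n)
    df = pd.DataFrame({"pre": pre, "treat": treat, "post": post})

    unadjusted = smf.ols("post ~ treat", data=df).fit()
    adjusted = smf.ols("post ~ treat + pre", data=df).fit()  # covariate-adjusted
    # Similar treatment estimates, but the adjusted standard error is smaller.
    print(unadjusted.bse["treat"], adjusted.bse["treat"])
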
Issues related to multiple outcome measures

Correlated measures: overlap and efficiency

Factor Analysis of Preschool Outcome Variables (factor loadings)

Subtest                       Pre-K Pretest   Pre-K Posttest   Kindergarten Follow-up
Letter Word Identification         .60              .69                .73
Quantitative Concepts              .82              .82                .78
Applied Problems                   .82              .80                .75
Picture Vocabulary                 .75              .76                .67
Oral Comprehension                 .82              .79                .74
Story Recall                       .53              .55                .61

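Loadings like those above come from factor-analyzing the subtest scores. A minimal sketch in Python with scikit-learn's FactorAnalysis, on simulated scores with an assumed one-factor structure; real use would substitute the study's subtest data and choose the number of factors on substantive grounds.

    import numpy as np
    from sklearn.decomposition import FactorAnalysis

    rng = np.random.default_rng(4)
    loadings = np.array([[0.8, 0.8, 0.7, 0.6, 0.7, 0.5]])  # assumed structure
    latent = rng.normal(size=(500, 1))                     # one common factor
    scores = latent @ loadings + rng.normal(scale=0.6, size=(500, 6))

    fa = FactorAnalysis(n_components=1, random_state=0).fit(scores)
    print(np.round(fa.components_, 2))  # estimated loading of each subtest
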
Correlated change may be even more relevant

Factor Analysis of Gain Scores for Pre-K Outcomes (factor loadings; two factors per gain-score period)

                              Pre to Post    Post to Follow-up   Pre to Follow-up
Subtest                        F1     F2        F1      F2         F1      F2
Basic School Skills
  Letter Word Identification  .74   -.19       .73    -.06        .79    -.15
  Quantitative Concepts       .66    .14       .70     .06        .74     .13
  Applied Problems            .54    .08       .47     .16        .40     .41
Complex Language
  Picture Vocabulary          .09    .77       .14     .48       -.04     .74
  Oral Comprehension          .16    .75       .17     .72        .13     .69
  Story Recall               -.08    .37      -.16     .68       -.01     .37

Handling multiple correlated outcome measures

• Pruning: try to avoid measures that have high conceptual overlap and are likely to have relatively large intercorrelations
• Procedural: organize assessment and data collection to combine where possible for efficiency
• Analytic (sketched below):
  - create composite variables to use in the analysis
  - use multivariate techniques like MANOVA to examine omnibus effects as context for univariate effects
  - use latent variable analysis, e.g., in SEM

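A minimal sketch of the first two analytic options in Python (pandas and statsmodels), on simulated data with illustrative outcome names: a z-score composite of correlated outcomes, then a MANOVA omnibus test of the treatment effect.

    import numpy as np
    import pandas as pd
    from statsmodels.multivariate.manova import MANOVA

    rng = np.random.default_rng(5)
    n = 200
    treat = rng.integers(0, 2, n)
    common = rng.normal(size=n) + 0.3 * treat   # shared outcome variance
    df = pd.DataFrame({
        "treat": treat,
        "vocab": common + rng.normal(scale=0.7, size=n),
        "comprehension": common + rng.normal(scale=0.7, size=n),
    })

    # Composite variable: mean of the z-scored outcomes
    outcomes = df[["vocab", "comprehension"]]
    df["composite"] = ((outcomes - outcomes.mean()) / outcomes.std()).mean(axis=1)

    # Omnibus multivariate test of the treatment effect across both outcomes
    print(MANOVA.from_formula("vocab + comprehension ~ treat", data=df).mv_test())
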
IES Guidelines on multiple significance tests

Schochet, P. Z. (2008). Technical methods report: Guidelines for multiple testing in impact evaluations. IES/NCEE 2008-4018. http://ies.ed.gov/pubsearch/pubsinfo.asp?pubid=NCEE20084018

• Delineate separate outcome domains in the study protocol.
• Define confirmatory and exploratory analyses prior to data analysis.
• Specify which subgroups will be part of the confirmatory analysis and which will be part of the exploratory analysis.
• Design the evaluation to have sufficient statistical power for examining effects for all prespecified confirmatory analyses.
• For domain-specific confirmatory analyses, conduct hypothesis testing for domain outcomes as a group (see the sketch below).
• Multiplicity adjustments are not required for exploratory analyses.
• Qualify confirmatory and exploratory analysis findings in the study report.

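As one way to operationalize the domain-level testing point above, the sketch below adjusts p-values within a single outcome domain using the Benjamini-Hochberg procedure via statsmodels; the p-values are made up for illustration, and the Schochet report discusses several alternative adjustment procedures.

    from statsmodels.stats.multitest import multipletests

    # Hypothetical p-values for the confirmatory outcomes in one domain
    domain_pvalues = [0.012, 0.034, 0.210, 0.048]
    reject, adjusted, _, _ = multipletests(domain_pvalues, alpha=0.05,
                                           method="fdr_bh")
    for p, p_adj, r in zip(domain_pvalues, adjusted, reject):
        print(f"p = {p:.3f}  adjusted p = {p_adj:.3f}  significant: {r}")
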
Practicality and appropriateness to the circumstances

• Feasibility: time and resources required
• Respondent burden: minimize demands, provide incentives/compensation
• Developmental appropriateness: consider not only age but performance level and possible ceiling and floor effects
• For follow-up beyond one school year, measures designed for a broad age span may be needed to maintain comparability
• Measures or assessment procedures may need to be tailored for special populations (disabilities, English language learners)