
Modern Approach to Causal
Inference
Brian C. Sauer, PhD MS
SLC VA Career Development Awardee
About Me
• SLC VA Career Development Awardee
• PhD in Pharmacoepidemiology from College of
Pharmacy at University of Florida
• MS in Biomedical Informatics from the University of Utah
• Assistant Research Professor in Division of
Epidemiology, Department of Internal
Medicine
Acknowledgments
• Mentors
– Matthew Samore, MD
– Tom Greene, PhD
– Jonathan Nebeker, MD
• Simulation
– Chen Wang, MS statistics
• Primary References:
– Causal Inference Book: Jamie Robins & Miguel Hernán
• http://www.hsph.harvard.edu/faculty/miguel-hernan/causalinference-book/
– Modern Epidemiology 3rd Ed. Chapter 12. Rothman,
Greenland, Lash
Outline
• Causal Inference and the counterfactual
framework
• Exchangeability & conditional exchangeability
• Use of Directed Acyclic Graphs (DAGs) to
identify a minimal set of covariates to remove
confounding.
Key Learning Points
• Understand the Rationale for
– randomized control trials.
– covariate selection in observational research.
• Identify the minimal set of covariates needed
to produce unbiased effect estimates.
• Develop terminology and language to describe
these ideas with precision.
• Become familiar with causal inference notation, which is often a barrier to this literature.
Counterfactual Framework
• Neyman (1923)
– Effects of point exposures in randomized
experiments
• Rubin (1974)
– Effects of point exposures in randomized and
observational studies (potential outcomes and
Rubin Causal Framework)
• Robins (1986)
– Effects of time-varying exposures in randomized and
observational studies. (counterfactuals)
Counterfactual
Working Example:
• Zeus took the heart pill; 5 days later he died
• Had he not taken the heart pill, he would still be alive on that 5th day
– that is, all things being equal
• Did the pill cause Zeus’s death?
Counterfactual
Working Example:
• Hera didn't take the pill
– 5 days later she was alive
• Had she taken the pill, she would still be alive 5 days later.
• Did the pill cause Hera’s survival?
Gettysburg: A Novel of the Civil War
• Newt Gingrich and
William R Forstchen
• Historical Fiction
– Imagines how the war would have ended had there been a Confederate victory at Gettysburg
Notation for Actual Data
• Y = 1 if the patient died, 0 otherwise
– Y_Zeus = 1, Y_Hera = 0
• A = 1 if the patient was treated, 0 otherwise
– A_Zeus = 1, A_Hera = 0

Pat ID   A   Y
Zeus     1   1
Hera     0   0
Notation for Ideal Data
Outcome under No Treatment
• Y^{a=0} = 1 if the subject would have died had he not taken the pill
– Y_Zeus^{a=0} = 0, Y_Hera^{a=0} = 0
Outcome under Treatment
• Y^{a=1} = 1 if the subject would have died had he taken the pill
– Y_Zeus^{a=1} = 1, Y_Hera^{a=1} = 0

Pat ID   A   Y^{a=0}   Y^{a=1}
Zeus     1      0         1
Hera     0      0         0
Available Research Data Set
ID        A   Y   Y^{a=0}   Y^{a=1}
Zeus      1   1      ?         1
Hera      0   0      0         ?
Apollo    1   0      ?         0
Cyclope   0   0      0         ?
(Individual) Causal Effect
Formal definition of causal effects:
– For Zeus: the pill has a causal effect because Y_Zeus^{a=1} ≠ Y_Zeus^{a=0}
– For Hera: the pill doesn't have a causal effect because Y_Hera^{a=1} = Y_Hera^{a=0}
Average Causal Effects
Formal definition of average causal effects
• In the population, exposure A has a causal effect on outcome Y if
Pr[Y^{a=1} = 1] ≠ Pr[Y^{a=0} = 1]
• The causal null hypothesis holds if
Pr[Y^{a=1} = 1] = Pr[Y^{a=0} = 1], equivalently E[Y^{a=1}] = E[Y^{a=0}]
Representation of causal null
Causal effects can be measured on many scales
• Risk difference: Pr[Y^{a=1} = 1] − Pr[Y^{a=0} = 1] = 0
• Risk ratio: Pr[Y^{a=1} = 1] ÷ Pr[Y^{a=0} = 1] = 1
• Odds ratio, hazard ratio, etc.
Average Causal Effects
No average causal effect:
• Risk difference: Pr[Y^{a=1} = 1] − Pr[Y^{a=0} = 1] = 0
• Risk ratio: Pr[Y^{a=1} = 1] / Pr[Y^{a=0} = 1] = 1
• Are there individual causal effects?
Pat ID       Y^{a=0}   Y^{a=1}
Rheia           0         1
Kronos          1         0
Demeter         0         0
Hades           0         0
Hestia          0         0
Poseidon        1         0
Hera            0         0
Zeus            0         1
Artemis         1         1
Apollo          1         0
Leto            0         1
Ares            1         1
Athena          1         1
Hephaestus      0         1
Aphrodite       0         1
Cyclope         0         1
Persephone      1         1
Hermes          1         0
Hebe            1         0
Dionysus        1         0
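As a check on the arithmetic, the counterfactual risks can be computed directly from the table. A minimal sketch in Python (pandas assumed; the rows are copied from the table above, everything else is illustrative):

```python
# Sketch: encode the potential-outcomes table above and check the
# average and individual causal effects (values copied from the table).
import pandas as pd

gods = pd.DataFrame({
    "id": ["Rheia", "Kronos", "Demeter", "Hades", "Hestia", "Poseidon",
           "Hera", "Zeus", "Artemis", "Apollo", "Leto", "Ares", "Athena",
           "Hephaestus", "Aphrodite", "Cyclope", "Persephone", "Hermes",
           "Hebe", "Dionysus"],
    "y_a0": [0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1],
    "y_a1": [1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0],
})

# Counterfactual risks Pr[Y^{a=1} = 1] and Pr[Y^{a=0} = 1]
risk_treated = gods["y_a1"].mean()     # 0.5
risk_untreated = gods["y_a0"].mean()   # 0.5
print("Risk difference:", risk_treated - risk_untreated)  # 0.0
print("Risk ratio:", risk_treated / risk_untreated)       # 1.0

# Individual causal effects: Y^{a=1} != Y^{a=0}
print("Individuals with a causal effect:",
      (gods["y_a1"] != gods["y_a0"]).sum())                # 12 of 20
```

So the average causal effect is null even though 12 of the 20 individuals experience an individual causal effect.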
Causal Effects
Definition
• Causal effects are calculated by contrasting
counterfactual risks within the population.
• Counterfactual contrasts are by definition
causal effects.
Associational Measures
Observed in the real world – association ≠ causation
• Pr[Y = 1 | A = 1] = Pr[Y = 1 | A = 0]: treatment A and outcome Y are independent
• Strength of association can also be quantified with the risk difference, risk ratio, OR, HR, etc.
• Risk difference: Pr[Y = 1 | A = 1] − Pr[Y = 1 | A = 0] = 0
• Risk ratio: Pr[Y = 1 | A = 1] ÷ Pr[Y = 1 | A = 0] = 1
Causation vs. Association
The key conceptual difference:
• A causal effect defines a comparison of the same subjects under different actions
– Assumes the counterfactual approach
– Everyone simultaneously treated and untreated
– Marginal effects
• Association is defined as a comparison of
different subjects under different conditions
– Effects conditional on treatment assignment group
Causation?
Question:
• Under what conditions can associational
measures be used to estimate causal
effects?
Answer:
• Ideal randomized experiments
Randomized Experiments
• Generate the missing counterfactual data
• The missing counterfactuals are missing completely at random (MCAR)
• Because of this, causal effects can be consistently estimated from ideal RCTs despite the missing data
Ideal Randomized Experiments
Exchangeability:
• The risk under potential treatment value a among the treated equals the risk under potential treatment value a among the untreated
• Pr[Y^{a} = 1 | A = 1] = Pr[Y^{a} = 1 | A = 0] for a = 0, 1
• Because these conditional risks are equal in every subset defined by treatment status, they must also equal the marginal risk under treatment value a in the whole population
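A minimal simulation sketch of this property, assuming Python with NumPy; every parameter below is invented for illustration:

```python
# Sketch: in an ideal marginal RCT, treatment assignment is independent of
# the potential outcomes, so the treated and untreated are exchangeable.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Fixed potential outcomes, driven by a latent frailty (invented numbers)
frailty = rng.uniform(size=n)
y_a0 = (rng.uniform(size=n) < 0.3 + 0.4 * frailty).astype(int)
y_a1 = (rng.uniform(size=n) < 0.1 + 0.4 * frailty).astype(int)

# Randomization ignores the potential outcomes entirely
a = rng.integers(0, 2, size=n)

# Exchangeability: Pr[Y^{a=1} = 1 | A = 1] ~= Pr[Y^{a=1} = 1 | A = 0]
print(y_a1[a == 1].mean(), y_a1[a == 0].mean())

# ...and both ~= the marginal risk Pr[Y^{a=1} = 1], so the observed risk
# in the treated consistently estimates the counterfactual risk
print(y_a1[a == 1].mean(), y_a1.mean())
```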
Population of Interest
Key Issue
• In the presence of exchangeability, the counterfactual risk under treatment in the white part of the population would equal the counterfactual risk under treatment in the entire population.
RCT?
• A = heart transplant
• Y = death
• L = prognostic factor
• Counts
– 13 of 20 gods (65%) were treated
– 9 of 12 (75%) with the prognostic factor (L=1) were treated
– 3 of 7 (43%) of the untreated had the prognostic factor
Pat ID       L   A   Y
Rheia        0   0   0
Kronos       0   0   1
Demeter      0   0   0
Hades        0   0   0
Hestia       0   1   0
Poseidon     0   1   0
Hera         0   1   0
Zeus         0   1   1
Artemis      1   0   1
Apollo       1   0   1
Leto         1   0   0
Ares         1   1   1
Athena       1   1   1
Eros         1   1   1
Aphrodite    1   1   1
Cyclope      1   1   1
Persephone   1   1   1
Hermes       1   1   0
Hebe         1   1   0
Dionysus     1   1   0
RCT?
• Design 1:
– 13 of 20 treated: Randomly
selected 65% for treatment
• Design 2:
– 9 out of 12 in critical
condition (75%) treated
– 4 out of 8 not in critical
condition were treated
(50%)
Conditionally Randomized Trial
• Simply a combination of two marginally randomized experiments:
one conducted in the subset of the population with L=0, the other in the subset with L=1
• The counterfactual values are not MCAR, but they are MAR conditional on the covariate L
• Marginal exchangeability is not achieved; randomization within levels of L generates conditional exchangeability
Analysis of Randomized Trials
• Question 1: How do you typically analyze a marginally randomized trial?
• Hint: What are the dependent and independent variables?
• Answer: A crude or unadjusted analysis with treatment and outcome.
Analysis of Randomized Trials
• Question 2: How do you typically analyze a conditionally randomized trial?
• Answer 2:
– Robins recommends standardization and IPW
– Stratification-type methods are more common
– There are conditions where standardization ≠ stratification; review their text.
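A minimal sketch of both approaches applied to the conditionally randomized heart-transplant data above (Python with pandas assumed; the L, A, Y values are copied from the table, the code itself is illustrative):

```python
# Sketch: standardization and IPW on the 20-god heart-transplant table.
import pandas as pd

df = pd.DataFrame({
    "L": [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    "A": [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    "Y": [0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0],
})

# Standardization: Pr[Y^{a} = 1] = sum_l Pr[Y=1 | A=a, L=l] * Pr[L=l]
p_l = df["L"].value_counts(normalize=True)
std_risk = {a: sum(df[(df["A"] == a) & (df["L"] == l)]["Y"].mean() * p_l[l]
                   for l in (0, 1))
            for a in (0, 1)}
print(std_risk[1], std_risk[0])    # 0.5 and 0.5 -> causal RR = 1

# IPW: weight each subject by 1 / Pr[A = observed treatment | L]
p_a1 = df["L"].map(df.groupby("L")["A"].mean())   # Pr[A=1 | L] per subject
w = 1.0 / p_a1.where(df["A"] == 1, 1.0 - p_a1)
ipw_risk = {a: (w * df["Y"])[df["A"] == a].sum() / len(df) for a in (0, 1)}
print(ipw_risk[1], ipw_risk[0])    # 0.5 and 0.5 -> causal RR = 1
```

Here both estimators give a counterfactual risk of 0.5 under each treatment value, i.e., a causal risk ratio of 1.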
Summary
• Randomization produces
– Marginal exchangeability
– Conditional exchangeability
• Exchangeability
– Allows us to use associational measures to estimate causal effects
Observational Studies
• The investigator has no control over treatment assignment, e.g., no randomization
• Exchangeability cannot be achieved by design
• To estimate a causal contrast we must obtain valid observable substitutes for the desired counterfactual quantities
• If we don't have good substitutes, then we have a confounded relationship, i.e., the associational RR ≠ the causal RR
Observational Studies
Conceptual justification
• Conceptualize observational studies as
though they are conditionally randomized
experiments.
• We assume that some components of the
observational study happen by chance.
Identifiability Conditions
• Consistency: treatment levels, although not assigned by the researcher, correspond to well-defined interventions
• Positivity: all conditional probabilities of treatment are greater than zero
• Conditional exchangeability: the conditional probabilities of being assigned a specific treatment were not chosen by the investigator, but they can be calculated from the data
Observational Studies?
Causal Inference
• Exchangeability and conditional exchangeability cannot be achieved by design.
• Question 1: How do we address conditional exchangeability in observational studies?
• Question 2: How should we pick covariates for our observational studies?
Big Picture
• Covariates should be selected to produce conditional exchangeability
• Confounding must be removed to produce conditional exchangeability
– A variable that removes confounding is a confounder
• Adjusting for certain types of covariates can block paths, open paths, or do nothing
• We want to adjust for variables that block all backdoor paths between the treatment and the outcome, i.e., remove confounding.
Theory of Causal DAGs
• Mathematically
formalized by
– Pearl (1988, 1995, 2000)
– Spirtes, Glymour, and Scheines (1993, 2000)
Directed Acyclic Graphs
• Are abstract mathematical objects.
• Encode an investigator's a priori assumptions about the causal relations among the exposure, outcome, and covariates.
• They represent:
– joint probability distributions
– causal structures.
Value of DAGs
• Support communication among researchers
and clinicians
• Explicate our beliefs and background knowledge about causal structures
• Allow us to determine what needs to be measured to remove confounding
• Help us determine how bias can be induced
• Help us choose appropriate statistics
Directed Acyclic Graphs (DAGs)
• Directed edges (arrows) linking nodes
(variables)
• Variables joined by an arrow are said
to be adjacent or neighbors
• Acyclic because there are no arrows from descendants (effects) to ancestors (causes)
• Descendants of variable X are
variables affected either directly or
indirectly by X
• Ancestors of X are all the variables
that affect X directly or indirectly
• Paths between two variables can be
directed or undirected
d-separation Criteria
• Rules linking absence of open
paths to statistical independencies
• Describe expected data
distributions if the causal structure
represented by the graph is correct
• Unconditional d-separation
– A path is open (unblocked) if there is no collider on the path
– A collider blocks a path
• d-connected
– An open path exists between the two variables
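These rules can be checked mechanically. A sketch using networkx's d-separation query (`d_separated` exists in networkx 2.4–3.2 and is renamed `is_d_separator` in newer releases; the toy graphs below are generic illustrations, not taken from the slides):

```python
# Sketch: querying d-separation on small DAGs with networkx.
import networkx as nx

# Common cause: L -> A, L -> Y, plus a direct effect A -> Y
g = nx.DiGraph([("L", "A"), ("L", "Y"), ("A", "Y")])
print(nx.d_separated(g, {"A"}, {"Y"}, set()))     # False: A and Y d-connected

# Drop the direct effect: the only path is the backdoor A <- L -> Y
g2 = nx.DiGraph([("L", "A"), ("L", "Y")])
print(nx.d_separated(g2, {"A"}, {"Y"}, set()))    # False: open backdoor path
print(nx.d_separated(g2, {"A"}, {"Y"}, {"L"}))    # True: L blocks the path

# Collider: U1 -> F <- U2
g3 = nx.DiGraph([("U1", "F"), ("U2", "F")])
print(nx.d_separated(g3, {"U1"}, {"U2"}, set()))  # True: collider blocks path
print(nx.d_separated(g3, {"U1"}, {"U2"}, {"F"}))  # False: conditioning opens it
```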
Graphical Conditioning
• Conditioning (adjustment) on a
collider F on a path, or any
descendant of F, opens the path
at F
– U1 and U2 are marginally
independent, but conditionally
associated (conditioning on F)
• Conditioning on a non-collider
closes the path and removes C as
a source of association between
A and Y
– A and Y are marginally associated,
but conditionally independent
(conditioning on C)
Graphical vs. Statistical
Criteria for Identifying Confounders
• Statistical: a confounder must
– Be associated with the exposure under study in
the source population
– Be a risk factor for the outcome, though it need
not actually cause the outcome
– Not be affected by the exposure or the outcome
• Graphical: a confounder must
– Be a common cause
– Have an unblocked back-door path
Unified Theory of Bias
• Bias can be reduced to or explained by 3
structures
– Reverse causation: the outcome precedes exposure measurement (e.g., in case-control studies) or the outcome can affect the exposure; also measurement error or information bias.
– Common cause: confounding, confounding by
indication
– Conditioning on common effects: collider,
selection bias, time varying confounding
Covariate Selection
• Adequate Background Knowledge
– Confounder identification must be grounded in an understanding of the causal structure linking the variables being studied (treatment and disease)
– Build a directed acyclic graph (DAG) to check whether the necessary criteria for confounding exist.
– Condition on the minimal set of variables necessary to remove confounding
• Inadequate Background Knowledge
– Remove known instrumental variables, colliders, and intermediates (variables measured post-treatment)
– Use automated selection procedures such as the high-dimensional propensity score (HDPS)
Confounding and Bias
• Underadjustment occurs when
– An open backdoor path was not closed
• Overadjustment can occur from adjusting for
– Instrumental variables
– Intermediate variables
– Colliders
– Variables caused by the outcome
• Discussion of variable types follows
Confounder
• A common cause, i.e., a confounder
• Confounder L distorts the effect of treatment A on disease Y
• Always adjust for confounders, unless the data set is small and the confounder has a strong association with treatment but a weak association with the outcome
• The goal is to produce conditional exchangeability
Confounder Example
• A = treatment
– a=1 statin alone
– a=0 niacin alone
• L = Baseline Cholesterol
– l=1: LDL ≥ 160 mg/dL
– l=0: LDL < 160 mg/dL
• Y = Myocardial infarction
– Y=1: Yes
– Y=0: No
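A minimal simulation sketch of this structure, assuming Python with NumPy; every probability below is invented for illustration, and treatment is given no true effect:

```python
# Sketch: confounding by baseline LDL (L causes both statin use and MI).
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

l = rng.integers(0, 2, size=n)     # 1 = baseline LDL >= 160 mg/dL
# High-LDL patients are more likely to get the statin (invented rates)
a = (rng.uniform(size=n) < np.where(l == 1, 0.7, 0.3)).astype(int)
# MI risk depends on L only: the true effect of A is null
y = (rng.uniform(size=n) < np.where(l == 1, 0.20, 0.05)).astype(int)

print("crude RR:", y[a == 1].mean() / y[a == 0].mean())   # > 1 despite no effect

# Stratifying on L recovers the null within each stratum
for stratum in (0, 1):
    m = l == stratum
    print(f"RR | L={stratum}:",
          y[m & (a == 1)].mean() / y[m & (a == 0)].mean())  # ~1.0
```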
Intermediate Variable
• Adjusting for an intermediate variable I in a fixed-covariate model will remove the effect of treatment A on the disease/outcome Y
• In a fixed-covariate model we do not want to include variables influenced by A or Y
• A time-varying treatment model, however, can include a time-varying confounder that is also an intermediate variable
Intermediate Example
• A = treatment
– a=1 statin alone
– a=0 niacin alone
• I = Post-treatment Cholesterol
– i=1: LDL ≥ 160 mg/dL
– i=0: LDL < 160 mg/dL
• Y = Myocardial infarction
– Y=1: Yes
– Y=0: No
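A companion sketch for the intermediate case, again with invented parameters: here the treatment has a real protective effect that is fully mediated by post-treatment LDL, and conditioning on the intermediate erases it:

```python
# Sketch: adjusting for a post-treatment intermediate removes a real effect.
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

a = rng.integers(0, 2, size=n)     # 1 = statin, 0 = niacin
# Statin lowers post-treatment LDL (invented rates); I = 1 means LDL >= 160
i = (rng.uniform(size=n) < np.where(a == 1, 0.2, 0.6)).astype(int)
# MI risk depends only on post-treatment LDL: the effect of A is fully mediated
y = (rng.uniform(size=n) < np.where(i == 1, 0.20, 0.05)).astype(int)

print("crude RR:", y[a == 1].mean() / y[a == 0].mean())   # < 1: real benefit

# Conditioning on the intermediate makes the treatment look useless
for stratum in (0, 1):
    m = i == stratum
    print(f"RR | I={stratum}:",
          y[m & (a == 1)].mean() / y[m & (a == 0)].mean())  # ~1.0
```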
Collider
• Adjusting for a collider can produce bias
• Conditioning on the common effect F without adjustment for U1 or U2 will induce an association between U1 and U2, which will confound the association between A and Y
Collider
• A = antidepressant use
• Y = lung cancer
• U1 = depression
• U2 = smoking status
• F = cardiovascular disease
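A minimal simulation sketch of this structure using the variable names above (Python with NumPy; all prevalences are invented for illustration):

```python
# Sketch: conditioning on the common effect F (cardiovascular disease)
# induces an association between independent causes U1 and U2.
import numpy as np

rng = np.random.default_rng(3)
n = 500_000

u1 = rng.integers(0, 2, size=n)    # depression
u2 = rng.integers(0, 2, size=n)    # smoking, independent of depression
# CVD risk raised by both (invented rates)
f = (rng.uniform(size=n) < 0.1 + 0.3 * u1 + 0.3 * u2).astype(int)

print(np.corrcoef(u1, u2)[0, 1])           # ~0: marginally independent
m = f == 1
print(np.corrcoef(u1[m], u2[m])[0, 1])     # clearly nonzero: collider bias
```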
Variables associated with treatment or
disease only
• Inclusion of variables associated with treatment only (instruments for A) can cause bias and imprecision
• Variables associated with disease but not treatment (risk factors) can be included in models; they are expected to decrease the variance of the treatment effect without increasing bias
• Including variables associated with disease reduces the chance of missing important confounders
Reality is Complicated
Shrier I, Platt RW. Reducing bias through directed acyclic graphs. BMC Medical Research Methodology. 2008;8:70.
Determining Minimal Set of Variables
• Produce a DAG and get clinical experts to agree on the underlying causal network
• Block (condition on) the variables that leave backdoor paths open
– Open backdoor paths produce confounding
• Pearl's 6-step approach for determining the minimal set of variables (illustrated by Shrier & Platt, Reducing bias through directed acyclic graphs, BMC Medical Research Methodology 2008;8:70)
Limitations of DAG approach
• Subject matter knowledge is often not good enough to draw a DAG that can be used to determine the minimal set of covariates needed to produce conditional exchangeability
• In large database studies with many providers, it is difficult to know all the factors that influence treatment decisions.
Insufficient Background Knowledge
• Recommendations:
– Propensity score (PS) approach:
• Remove colliders and instruments (variables associated with treatment but not disease)
• In a large PS study we should include as many of the remaining variables as possible.
• Focus should be on variables that are a priori thought to be strongly causally related to the outcome (risk factors, confounders)
– Outcome model approach:
• Use a change-in-estimate approach to select variables
– Since the evidence on the best variable selection approaches is limited, researchers should explore the sensitivity of their results to different variable selection strategies, as well as to removal and inclusion of variables that could be IVs or colliders.
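A minimal sketch of the PS recommendation above, assuming Python with NumPy and scikit-learn; the covariates, coefficients, and simulated data are all placeholders, not anything from the talk:

```python
# Sketch: estimate a propensity score with logistic regression and use
# inverse-probability weights to estimate a treatment effect.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
n = 50_000

x = rng.normal(size=(n, 3))        # measured risk factors (placeholders)
logit_a = x @ np.array([0.8, 0.5, 0.0])
a = (rng.uniform(size=n) < 1 / (1 + np.exp(-logit_a))).astype(int)
logit_y = x @ np.array([1.0, 0.7, 0.2]) + 1.0 * a - 2.0
y = (rng.uniform(size=n) < 1 / (1 + np.exp(-logit_y))).astype(int)

ps = LogisticRegression().fit(x, a).predict_proba(x)[:, 1]  # Pr[A=1 | X]
w = np.where(a == 1, 1 / ps, 1 / (1 - ps))                  # IP weights

# Weighted (Hajek-style) risk difference between treatment arms
rd = (np.average(y[a == 1], weights=w[a == 1])
      - np.average(y[a == 0], weights=w[a == 0]))
print("IPW risk difference:", round(rd, 3))
```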
Analysis of Observational Data Based
on Counterfactuals
• Fixed treatments
– Propensity Score
– Instrumental Variables
– IPW
• Time-varying treatments (sequential randomization)
– IPW
– G-estimation
– Doubly robust
Simulation
• We have developed simulations to understand and teach these concepts.
• Poster at the CDA conference.
• If interested, please contact me:
• [email protected]; [email protected]