Modern Approach to Causal Inference
Brian C. Sauer, PhD MS
SLC VA Career Development Awardee

About Me
• SLC VA Career Development Awardee
• PhD in Pharmacoepidemiology from the College of Pharmacy at the University of Florida
• MS in Biomedical Informatics from the University of Utah
• Assistant Research Professor in the Division of Epidemiology, Department of Internal Medicine

Acknowledgments
• Mentors
  – Matthew Samore, MD
  – Tom Greene, PhD
  – Jonathan Nebeker, MD
• Simulation
  – Chen Wang, MS Statistics
• Primary references:
  – Causal Inference book: Jamie Robins & Miguel Hernán
    • http://www.hsph.harvard.edu/faculty/miguel-hernan/causalinference-book/
  – Modern Epidemiology, 3rd Ed., Chapter 12. Rothman, Greenland, Lash

Outline
• Causal inference and the counterfactual framework
• Exchangeability and conditional exchangeability
• Use of directed acyclic graphs (DAGs) to identify a minimal set of covariates to remove confounding

Key Learning Points
• Understand the rationale for
  – randomized controlled trials
  – covariate selection in observational research
• Identify the minimal set of covariates needed to produce unbiased effect estimates
• Develop terminology and language to describe these ideas with precision
• Become familiar with the notation for causal inference, which is a barrier to this literature

Counterfactual Framework
• Neyman (1923)
  – Effects of point exposures in randomized experiments
• Rubin (1974)
  – Effects of point exposures in randomized and observational studies (potential outcomes and the Rubin causal framework)
• Robins (1986)
  – Effects of time-varying exposures in randomized and observational studies (counterfactuals)

Counterfactual Working Example
• Zeus took the heart pill; five days later he died
• Had he not taken the heart pill, he would still have been alive on that fifth day, all other things being equal
• Did the pill cause Zeus's death?

Counterfactual Working Example
• Hera did not take the pill; five days later she was alive
• Had she taken the pill, she would still have been alive five days later
• Did the pill cause Hera's survival?

Gettysburg: A Novel of the Civil War
• Newt Gingrich and William R. Forstchen
• Historical fiction
  – Imagines how the war would have ended had there been a Confederate victory at Gettysburg

Notation for Actual Data
• Y = 1 if the patient died, 0 otherwise
  – Y_Zeus = 1, Y_Hera = 0
• A = 1 if the patient was treated, 0 otherwise
  – A_Zeus = 1, A_Hera = 0

  Pat ID   A   Y
  Zeus     1   1
  Hera     0   0

Notation for Ideal Data
Outcome under no treatment:
• Y^{a=0} = 1 if the subject would have died had he not taken the pill
  – Y_Zeus^{a=0} = 0, Y_Hera^{a=0} = 0
Outcome under treatment:
• Y^{a=1} = 1 if the subject would have died had he taken the pill
  – Y_Zeus^{a=1} = 1, Y_Hera^{a=1} = 0

  Pat ID   A   Y^{a=0}   Y^{a=1}
  Zeus     1   0         1
  Hera     0   0         0

Available Research Data Set

  ID        A   Y   Y^{a=0}   Y^{a=1}
  Zeus      1   1   ?         1
  Hera      0   0   0         ?
  Apollo    1   0   ?         0
  Cyclope   0   0   0         ?
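To make the notation concrete, here is a minimal sketch in Python (assuming pandas and numpy are available; the code and variable names are illustrative, not from the slides). It encodes the research data set above and uses consistency, Y = Y^{a=A}, to fill in the one counterfactual column that is observable for each subject; the other stays missing:

```python
import numpy as np
import pandas as pd

# Available research data set: observed treatment A and outcome Y.
df = pd.DataFrame(
    {"id": ["Zeus", "Hera", "Apollo", "Cyclope"],
     "A":  [1, 0, 1, 0],
     "Y":  [1, 0, 0, 0]}
)

# By consistency, Y = Y^{a=A}: the counterfactual under the treatment
# actually received equals the observed outcome; the other one is the
# missing "?" in the table above.
df["Y_a0"] = np.where(df["A"] == 0, df["Y"], np.nan)  # Y^{a=0}
df["Y_a1"] = np.where(df["A"] == 1, df["Y"], np.nan)  # Y^{a=1}
print(df)
```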
(Individual) Causal Effect
Formal definition of causal effects:
• For Zeus, the pill has a causal effect because Y_Zeus^{a=1} ≠ Y_Zeus^{a=0}
• For Hera, the pill has no causal effect because Y_Hera^{a=1} = Y_Hera^{a=0}

Average Causal Effects
Formal definition of average causal effects:
• In the population, exposure A has a causal effect on the outcome Y if
  Pr[Y^{a=1} = 1] ≠ Pr[Y^{a=0} = 1]
• The causal null hypothesis holds if
  Pr[Y^{a=1} = 1] = Pr[Y^{a=0} = 1], equivalently (for binary Y) E[Y^{a=1}] = E[Y^{a=0}]

Representation of the Causal Null
Causal effects can be measured on many scales:
• Risk difference: Pr[Y^{a=1} = 1] - Pr[Y^{a=0} = 1] = 0
• Risk ratio: Pr[Y^{a=1} = 1] ÷ Pr[Y^{a=0} = 1] = 1
• Odds ratio, hazard ratio, etc.

Average Causal Effects
No average causal effect in the table below, since Pr[Y^{a=1} = 1] = Pr[Y^{a=0} = 1] = 10/20 = 0.5:
• Risk difference = 0
• Risk ratio = 1
• Are there individual causal effects?

  Pat ID       Y^{a=0}   Y^{a=1}
  Rheia        0         1
  Kronos       1         0
  Demeter      0         0
  Hades        0         0
  Hestia       0         0
  Poseidon     1         0
  Hera         0         0
  Zeus         0         1
  Artemis      1         1
  Apollo       1         0
  Leto         0         1
  Ares         1         1
  Athena       1         1
  Hephaestus   0         1
  Aphrodite    0         1
  Cyclope      0         1
  Persephone   1         1
  Hermes       1         0
  Hebe         1         0
  Dionysus     1         0

Causal Effects Definition
• Causal effects are calculated by contrasting counterfactual risks within the population
• Counterfactual contrasts are, by definition, causal effects

Associational Measures
Observed in the real world; association ≠ causation:
• Pr[Y=1 | A=1] = Pr[Y=1 | A=0]: treatment A and outcome Y are independent
• We can also quantify the strength of association with the risk difference, risk ratio, OR, HR, etc. Under independence:
  – Pr[Y=1 | A=1] - Pr[Y=1 | A=0] = 0
  – Pr[Y=1 | A=1] ÷ Pr[Y=1 | A=0] = 1

Causation vs. Association
The key conceptual difference:
• A causal effect is defined as a comparison of the same subjects under different actions
  – Assumes the counterfactual approach
  – Everyone simultaneously treated and untreated
  – Marginal effects
• An association is defined as a comparison of different subjects under different conditions
  – Effects conditional on treatment assignment group

Causation?
• Question: Under what conditions can associational measures be used to estimate causal effects?
• Answer: In ideal randomized experiments

Randomized Experiments
• Generate the missing counterfactual data
• The missing counterfactual is missing completely at random (MCAR)
• Because of this, causal effects can be consistently estimated with ideal RCTs despite the missing data

Ideal Randomized Experiments
Exchangeability:
• The risk under the potential treatment value a among the treated equals the risk under the potential treatment value a among the untreated:
  Pr[Y^a = 1 | A = 1] = Pr[Y^a = 1 | A = 0] for each treatment value a
• Because these conditional risks are equal in all subsets defined by treatment status, they must also equal the marginal risk under treatment value a in the whole population

Population of Interest
Key issue:
• In the presence of exchangeability, the counterfactual risk under treatment in the white part of the population (a subset defined by treatment status in the original figure) would equal the counterfactual risk under treatment in the entire population
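A small simulation ties these ideas together. This sketch (Python with numpy; the simulation setup is my illustration, not taken from the slides) verifies the causal null in the 20-god table and shows that under randomization the risk of Y^{a=1} among the treated matches the marginal counterfactual risk, which is exactly exchangeability:

```python
import numpy as np

rng = np.random.default_rng(0)

# Counterfactual outcomes for the 20 gods, in table order.
y_a0 = np.array([0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1])
y_a1 = np.array([1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0])

# Causal null: both counterfactual risks are 0.5, even though 12 of 20
# individuals have a nonzero individual causal effect.
print(y_a1.mean(), y_a0.mean(), (y_a1 != y_a0).sum())

# Exchangeability via randomization: averaged over repeated random
# assignments, the risk of Y^{a=1} among the treated equals the
# marginal risk Pr[Y^{a=1} = 1].
risks = []
for _ in range(10_000):
    a = rng.binomial(1, 0.5, size=20)
    if a.any():
        risks.append(y_a1[a == 1].mean())  # Pr[Y^{a=1}=1 | A=1]
print(np.mean(risks))  # ~0.5
```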
RCT?
• A = heart transplant
• Y = death
• L = prognostic factor (critical condition)
• Counts
  – 13 of 20 (65%) gods were treated
  – 9 of the 12 with the prognostic factor (L=1) were treated (75%)
  – 4 of the 8 without the prognostic factor (L=0) were treated (50%)

  Pat ID       L   A   Y
  Rheia        0   0   0
  Kronos       0   0   1
  Demeter      0   0   0
  Hades        0   0   0
  Hestia       0   1   0
  Poseidon     0   1   0
  Hera         0   1   0
  Zeus         0   1   1
  Artemis      1   0   1
  Apollo       1   0   1
  Leto         1   0   0
  Ares         1   1   1
  Athena       1   1   1
  Eros         1   1   1
  Aphrodite    1   1   1
  Cyclope      1   1   1
  Persephone   1   1   1
  Hermes       1   1   0
  Hebe         1   1   0
  Dionysus     1   1   0

RCT?
Two designs could have generated the data above:
• Design 1:
  – 13 of 20 treated: randomly selected 65% for treatment
• Design 2:
  – 9 of 12 in critical condition (75%) were treated
  – 4 of 8 not in critical condition (50%) were treated

Conditionally Randomized Experiment
• Simply a combination of two marginally randomized experiments
• One conducted in the subset of the population with L=0, the other in the subset with L=1
• Counterfactual values are not MCAR, but they are MAR conditional on the covariate L
• Marginal exchangeability is not achieved
• Randomization within levels of L generates conditional exchangeability

Analysis of Randomized Trials
• Question 1: How do you typically analyze a marginally randomized trial?
• Hint: What are the dependent and independent variables?
• Answer 1: A crude or unadjusted analysis with treatment and outcome

Analysis of Randomized Trials
• Question 2: How do you typically analyze a conditionally randomized trial?
• Answer 2:
  – Robins recommends standardization and IPW (see the worked sketch below)
  – Stratification-type methods are more common
  – There are conditions where standardization ≠ stratification; review the Hernán & Robins text

Summary
• Randomization produces
  – Marginal exchangeability
  – Conditional exchangeability
• Exchangeability
  – Allows us to use associational measures to estimate causal effects

Observational Studies
• The investigator has no control over treatment assignment, e.g., no randomization
• Exchangeability cannot be achieved by design
• To estimate a causal contrast we must obtain valid observable substitute quantities for the desired counterfactual quantities
• If we don't have good substitutes, then the relationship is confounded, i.e., the associational RR ≠ the causal RR

Observational Studies
Conceptual justification:
• Conceptualize observational studies as though they were conditionally randomized experiments
• We assume that some components of the observational study happen by chance

Identifiability Conditions
• Consistency: treatment levels are not assigned by the researcher, but correspond to well-defined interventions
• Positivity: all conditional probabilities of treatment are greater than zero
• Conditional exchangeability: the conditional probabilities of being assigned to a specific treatment are not chosen by the investigator, but can be calculated from the data
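Here is the worked version of Answer 2 promised above: a sketch (Python with pandas; illustrative code, not from the slides) that applies both standardization and IP weighting to the heart-transplant table. Both methods recover a risk of 0.5 under either treatment value, i.e., a causal risk ratio of 1:

```python
import pandas as pd

# L = prognostic factor, A = heart transplant, Y = death.
data = [
    ("Rheia", 0, 0, 0), ("Kronos", 0, 0, 1), ("Demeter", 0, 0, 0),
    ("Hades", 0, 0, 0), ("Hestia", 0, 1, 0), ("Poseidon", 0, 1, 0),
    ("Hera", 0, 1, 0), ("Zeus", 0, 1, 1), ("Artemis", 1, 0, 1),
    ("Apollo", 1, 0, 1), ("Leto", 1, 0, 0), ("Ares", 1, 1, 1),
    ("Athena", 1, 1, 1), ("Eros", 1, 1, 1), ("Aphrodite", 1, 1, 1),
    ("Cyclope", 1, 1, 1), ("Persephone", 1, 1, 1), ("Hermes", 1, 1, 0),
    ("Hebe", 1, 1, 0), ("Dionysus", 1, 1, 0),
]
df = pd.DataFrame(data, columns=["id", "L", "A", "Y"])

# Standardization: weight the stratum-specific risks Pr[Y=1 | L=l, A=a]
# by the marginal distribution of L.
risk = df.groupby(["L", "A"])["Y"].mean()   # stratum-specific risks
pL = df["L"].value_counts(normalize=True)   # Pr[L=l]
std = {a: sum(risk[(l, a)] * pL[l] for l in (0, 1)) for a in (0, 1)}
print(std)  # {0: 0.5, 1: 0.5} -> causal risk ratio = 1

# IP weighting gives the same answer: weight each subject by
# 1 / Pr[A = observed treatment | L].
pA = df.groupby("L")["A"].mean()            # Pr[A=1 | L=l]
w = 1 / df.apply(lambda r: pA[r.L] if r.A == 1 else 1 - pA[r.L], axis=1)
for a in (0, 1):
    m = df["A"] == a
    print(a, (w[m] * df.loc[m, "Y"]).sum() / len(df))  # both 0.5
```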
Observational Studies and Causal Inference
• Exchangeability and conditional exchangeability cannot be achieved by design
• Question 1: How do we address conditional exchangeability in observational studies?
• Question 2: How should we pick covariates for our observational studies?

Big Picture
• Covariates should be selected to produce conditional exchangeability
• Confounding must be removed to produce conditional exchangeability
  – A variable that removes confounding is a confounder
• Adjusting for certain types of covariates can block paths, open paths, or do nothing
• We want to adjust for variables that block all back-door paths between the treatment and the outcome, i.e., remove confounding

Theory of Causal DAGs
• Mathematically formalized by
  – Pearl (1988, 1995, 2000)
  – Spirtes, Glymour, and Scheines (1993, 2000)

Directed Acyclic Graphs
• Are abstract mathematical objects
• Encode an investigator's a priori assumptions about the causal relations among the exposure, outcomes, and covariates
• They represent:
  – joint probability distributions
  – causal structures

Value of DAGs
• Support communication among researchers and clinicians
• Explicate our beliefs and background knowledge about causal structures
• Allow us to determine what needs to be measured to remove confounding
• Help us determine how bias can be induced
• Help us choose appropriate statistics

Directed Acyclic Graphs (DAGs)
• Directed edges (arrows) link nodes (variables)
• Variables joined by an arrow are said to be adjacent or neighbors
• Acyclic because there are no arrows from descendants (effects) to ancestors (causes)
• Descendants of a variable X are the variables affected either directly or indirectly by X
• Ancestors of X are all the variables that affect X directly or indirectly
• Paths between two variables can be directed or undirected

d-Separation Criteria
• Rules linking the absence of open paths to statistical independencies
• Describe the data distributions expected if the causal structure represented by the graph is correct
• Unconditional d-separation
  – A path is open or unblocked if there is no collider on the path
  – A collider blocks a path
• d-connected
  – An open path exists between two variables

Graphical Conditioning
• Conditioning (adjustment) on a collider F on a path, or on any descendant of F, opens the path at F
  – U1 and U2 are marginally independent but conditionally associated (conditioning on F); see the simulation sketch below
• Conditioning on a non-collider C closes the path and removes C as a source of association between A and Y
  – A and Y are marginally associated but conditionally independent (conditioning on C)

Graphical vs. Statistical Criteria for Identifying Confounders
• Statistical: a confounder must
  – Be associated with the exposure under study in the source population
  – Be a risk factor for the outcome, though it need not actually cause the outcome
  – Not be affected by the exposure or the outcome
• Graphical: a confounder must
  – Be a common cause
  – Lie on an unblocked back-door path

Unified Theory of Bias
• Bias can be reduced to, or explained by, three structures:
  – Reverse causation: the outcome precedes the exposure measurement (e.g., in case-control studies), or the outcome can affect the exposure; includes measurement error and information bias
  – Common cause: confounding, confounding by indication
  – Conditioning on common effects: collider bias, selection bias, time-varying confounding
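The collider behavior described under Graphical Conditioning can be checked numerically. A minimal sketch (Python with numpy; the distributions are my illustrative choices): U1 and U2 are independent causes of a common effect F, and conditioning on F induces an association between them:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# U1 and U2 are independent causes; F is their common effect (a collider).
u1 = rng.normal(size=n)
u2 = rng.normal(size=n)
f = (u1 + u2 > 0).astype(int)

# Marginally, U1 and U2 are independent (correlation ~ 0)...
print(np.corrcoef(u1, u2)[0, 1])

# ...but conditioning on the collider F opens the path and induces a
# (here negative) association between U1 and U2.
print(np.corrcoef(u1[f == 1], u2[f == 1])[0, 1])
```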
Covariate Selection
• Adequate background knowledge
  – Confounder identification must be grounded in an understanding of the causal structure linking the variables being studied (treatment and disease)
  – Build a directed acyclic graph (DAG) to check whether the necessary criteria for confounding exist
  – Condition on the minimal set of variables necessary to remove confounding
• Inadequate background knowledge
  – Remove known instrumental variables, colliders, and intermediates (variables measured post-treatment)
  – Use automated selection procedures such as the high-dimensional propensity score (HDPS)

Confounding and Bias
• Under-adjustment occurs when
  – An open back-door path was not closed
• Over-adjustment can occur from adjusting for
  – Instrumental variables
  – Intermediate variables
  – Colliders
  – Variables caused by the outcome
• Discussion of these variable types follows

Confounder
• A common cause, i.e., a confounder
• Confounder L distorts the effect of treatment A on disease Y
• Always adjust for confounders, unless the data set is small and the confounder has a strong association with treatment and a weak association with the outcome
• The goal is to produce conditional exchangeability

Confounder Example
• A = treatment
  – a=1: statin alone
  – a=0: niacin alone
• L = baseline cholesterol
  – l=1: LDL ≥ 160 mg/dL
  – l=0: LDL < 160 mg/dL
• Y = myocardial infarction
  – Y=1: yes
  – Y=0: no

Intermediate Variable
• Adjusting for an intermediate variable I in a fixed-covariate model will remove the effect of treatment A on disease/outcome Y (see the simulation sketch below)
• In a fixed-covariate model we do not want to include variables influenced by A or Y
• A time-varying treatment model does include time-varying confounders that are also intermediate variables

Intermediate Example
• A = treatment
  – a=1: statin alone
  – a=0: niacin alone
• I = post-treatment cholesterol
  – i=1: LDL ≥ 160 mg/dL
  – i=0: LDL < 160 mg/dL
• Y = myocardial infarction
  – Y=1: yes
  – Y=0: no

Collider
• Adjusting for a collider C can produce bias
• Conditioning on a common effect F without adjustment for U1 or U2 will induce an association between U1 and U2, which will confound the association between A and Y

Collider Example
• A = antidepressant use
• Y = lung cancer
• U1 = depression
• U2 = smoking status
• F = cardiovascular disease

Variables Associated with Treatment or Disease Only
• Inclusion of variables associated with treatment only (A) can cause bias and imprecision
• Variables associated with disease but not treatment (risk factors) can be included in models; they are expected to decrease the variance of the treatment effect without increasing bias
• Including variables associated with disease reduces the chance of missing important confounders

Reality is Complicated
• Shrier I, Platt RW. Reducing bias through directed acyclic graphs. BMC Medical Research Methodology. 2008;8:70

Determining the Minimal Set of Variables
• Produce a DAG and get clinical experts to agree on the underlying causal network
• Block (condition on) the variables that leave open back-door paths
  – Open back-door paths are the source of confounding
• Pearl's 6-step approach for determining the minimal set of variables (illustrated by Shrier & Platt, cited above)

Limitations of the DAG Approach
• Subject-matter knowledge is often not good enough to draw a DAG that can be used to determine the minimal set of covariates needed to produce conditional exchangeability
• In large database studies with many providers it is difficult to know all the factors that influence treatment decisions
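The intermediate-variable warning above can be demonstrated numerically as well. In this sketch (Python with numpy; the linear model is an illustrative choice, not from the slides), A affects Y only through I, and adjusting for I drives the estimated effect of A toward zero:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# A affects Y only through the intermediate I (A -> I -> Y).
a = rng.binomial(1, 0.5, size=n)
i = a + rng.normal(size=n)   # post-treatment intermediate
y = i + rng.normal(size=n)   # outcome caused by I

# The crude contrast recovers the total effect of A on Y (~1)...
print(y[a == 1].mean() - y[a == 0].mean())

# ...but adjusting for I (here via linear regression of Y on A and I)
# blocks the only causal path and drives the A coefficient toward 0.
X = np.column_stack([np.ones(n), a, i])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
print(beta[1])  # coefficient on A, ~0
```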
Insufficient Background Knowledge
• Recommendations:
  – Propensity score (PS) approach:
    • Remove colliders and instruments (variables associated with treatment but not disease)
    • In a large PS study we should include as many of the remaining variables as possible
    • Focus should be on variables that are a priori thought to be strongly causally related to the outcome (risk factors, confounders)
  – Outcome model approach:
    • Use a change-in-estimate approach to select variables
  – Since evidence on the best variable selection approaches is limited, researchers should explore the sensitivity of their results to different variable selection strategies, as well as to the removal and inclusion of variables that could be instrumental variables or colliders

Analysis of Observational Data Based on Counterfactuals
• Fixed treatments
  – Propensity scores
  – Instrumental variables
  – IPW
• Time-varying treatments (sequential randomization)
  – IPW
  – G-estimation
  – Doubly robust estimation

Simulation
• We have developed simulations to understand and teach these concepts
• Poster at the CDA conference
• If interested, please contact me
• [email protected]; [email protected]
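In the same spirit as the simulations mentioned above, here is a minimal confounding sketch (Python with numpy; my own illustration, not the authors' simulation code). L is a common cause of A and Y, the true effect of A on Y is null, the crude risk difference is biased, and standardizing over L recovers the null:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# L is a common cause of treatment A and outcome Y (confounding);
# the true causal effect of A on Y is zero.
l = rng.binomial(1, 0.5, size=n)
a = rng.binomial(1, np.where(l == 1, 0.8, 0.2))
y = rng.binomial(1, np.where(l == 1, 0.6, 0.2))

# The crude (associational) risk difference is biased away from 0 (~0.24)...
print(y[a == 1].mean() - y[a == 0].mean())

# ...while standardizing the L-specific risk differences to the
# marginal distribution of L recovers the null (~0).
rd = 0.0
for lv in (0, 1):
    pl = (l == lv).mean()
    rd += pl * (y[(a == 1) & (l == lv)].mean()
                - y[(a == 0) & (l == lv)].mean())
print(rd)
```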