Transcript Slide 1
Strengthening Causal Inference in HIV Studies:
+
Examples
CAPS Methods Core Presentation, April 18, 2012 Starley Shade, Sheri Lippman, Mi-Suk Kang Dufour & Carol Camlin
+
Outline
Answering causal questions: common roadblocks in HIV research
Causal inference framework and overview of methods
Concrete example: using treatment and censoring weighting in Prevention with Positives
Concrete example: G-computation for population-level attributable risk in the SHAZ! study
Q & A
+
Roadblocks in HIV research: selection bias / who gets exposed
Population surveillance and surveys in probability-based samples: study participants (in testing, in survey research, etc.) almost always differ systematically from non-participants
Observational studies using ‘comparison’ clinics or communities: systematic differences between study arms exist and/or may accrue over time
+
Common roadblocks in HIV research: Loss To Follow-up
Cohort studies of HIV+ individuals: highly susceptible to loss to follow-up (>20% after 2 years)
In resource-poor settings: medical records don’t capture patient mobility
Death registries are rarely available, and those who die are mistakenly assumed to be lost to follow-up
Those who drop out are systematically different from those who stay engaged in care
+
Roadblocks in HIV research: time dependent confounding
[Figure: causal diagram over time points 0 to 3, with nodes C (&U), Expos, and STI at each time point]
Time-dependent confounding: C is related to prior exposure and affects subsequent exposure
C = group of confounders
U = unmeasured confounders
+
Common roadblocks in HIV research: Complex, multi component intervention studies
Increasing calls for comprehensive HIV prevention interventions addressing multiple levels and domains of influence on individual behavior
Evaluation of such studies is hampered by:
Diverse levels of exposure to individual intervention components
Difficulty distinguishing the relative contributions of individual intervention components to observed outcomes
+
Mending our comparison – the causal /counter factual framework
“We may define a cause to be an object followed by another… where, if the first object had not been, the second never had existed” (Hume 1748)
An association can be considered causal when, if the exposure had been altered, the outcome would have been different
The key part is the counterfactual element: reference to what would have happened if, contrary to fact, the exposure had been something other than what it actually was
+
Counterfactual framework
The “ideal experiment” illustrates the framework: a hypothetical study which, if we could actually conduct it, would allow us to infer causality
Ideal experiment:
A person or population experiences one exposure and is observed for the outcome over a given time period
Roll back the clock
Change the exposure but leave everything else the same, and observe for the outcome over the same time period
+
Counterfactual framework
[Figure: OBSERVED timeline for Person A; axis: Time; endpoint: AIDS]
Counterfactual question: how long would Person A have survived if he/she had not received treatment?
+
Counterfactual framework
[Figure: OBSERVED and UNOBSERVED timelines for the same person; axis: Time; endpoint on each timeline: AIDS]
+
Counterfactuals – specifying what we really want to know
Thinking about the counterfactual outcome(s) as something we are missing, and something we are trying to estimate when we analyze HIV studies or any epidemiologic data, is instructive
Akin to a missing data problem
When we compare groups of people observed as exposed or unexposed, we want to compare groups that best estimate the counterfactual outcomes that are unobserved or missing
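The missing-data view can be made concrete with a small sketch. The eight records below are entirely hypothetical: each person is given both potential outcomes (something no real study observes), and the exposure actually received decides which one is seen. The names `people`, `true_effect`, and `naive_difference` are illustrative only.

```python
# Hypothetical toy data: (outcome if unexposed, outcome if exposed, exposure received).
# In real data only one of the two outcomes is ever observed per person.
people = [
    (1, 1, 1), (1, 0, 1), (1, 1, 1), (0, 0, 1),
    (0, 0, 0), (0, 0, 0), (1, 0, 0), (0, 0, 0),
]

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# Effect we would see in the "ideal experiment": everyone exposed vs. everyone unexposed.
true_effect = mean(y1 for _, y1, _ in people) - mean(y0 for y0, _, _ in people)

# What the observed data allow: compare people who happened to be exposed
# with people who happened to be unexposed.
observed_exposed = [y1 for _, y1, a in people if a == 1]
observed_unexposed = [y0 for y0, _, a in people if a == 0]
naive_difference = mean(observed_exposed) - mean(observed_unexposed)

print("true effect:", true_effect)
print("naive difference:", naive_difference)
```

In this made-up population the exposure is protective for everyone combined, yet the naive comparison points the other way, because the people who ended up exposed were at higher risk to begin with.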
+
Notation for presentation
A = treatment
Y = outcome
W = confounders (point treatment)
L = confounders (longitudinal)
[Diagram: nodes A and Y, with W, L]
The likelihood of the data simplifies to:
$$\mathcal{L}(O) = P(Y \mid A, W, L)\, P(A \mid W, L)$$
+
Rationale for causal inference approach
Basic regression models produce stratum-specific, or conditional, estimates (i.e., “while holding constant a set of covariates”):
$$E[Y \mid A, L] = b_0 + b_1 A + b_3 L(j) + \dots$$
where Y is the outcome, A is the observed exposure, and L is the matrix of time-dependent covariates.
Therefore, our estimates of effect are also conditional:
$$E[Y \mid A = 1, L] - E[Y \mid A = 0, L] = b_1$$
+
Rationale for causal inference approach
Causal inference approaches help us model our way back to the ideal (counterfactual) experiment:
$$E[Y(a = 1) - Y(a = 0)]$$
where Y is the outcome and a is the counterfactual exposure, with all individuals exposed (a = 1) or unexposed (a = 0).
+
Inverse Probability Weighting
+
Inverse Probability of Treatment Weighting (IPTW)
Re-create the counterfactual data set by weighting.
IPTW assigns each subject a weight equal to the inverse of the probability of being in their exposure group at each interval:
$$w_t = 1 / P[A(j) = 1 \mid A(j-1), L(j)]$$
As written, this is the weight for exposed subjects; unexposed subjects receive the inverse of one minus this probability.
The treatment model is based on values of past and current covariates (L(j)) and past exposures (A(j-1)):
$$E[A(j) \mid A(j-1), L(j)] = a_0 + a_2 L(j) + a_3 A(j-1) + a_4 L(j-1) + \dots$$
+
Inverse Probability of Treatment Weighting (IPTW)
The treatment weights are applied to the observed population (e.g., weighted logistic regression):
$$w_t\,[E(Y \mid A)] = b_0 + b_1 A$$
This creates a new pseudo-population in which the distribution of confounders is balanced between the two exposure groups, essentially mimicking a randomized trial:
$$E[Y(a = 1) - Y(a = 0)] = b_1$$
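A minimal sketch of the weighting idea, under simplifying assumptions that are not in the slides: a single time point, one binary confounder L, treatment probabilities estimated as simple proportions within strata of L rather than from a fitted treatment model, and a weighted risk difference in place of weighted logistic regression. The cohort counts and all function names are hypothetical.

```python
from collections import defaultdict

def make_rows():
    """Hypothetical cohort as (confounder L, exposure A, outcome Y) records."""
    cells = [  # (L, A, number of subjects, number with the outcome)
        (0, 1, 10, 2), (0, 0, 30, 12),
        (1, 1, 30, 15), (1, 0, 10, 7),
    ]
    rows = []
    for l, a, n, events in cells:
        rows += [(l, a, 1)] * events + [(l, a, 0)] * (n - events)
    return rows

def iptw_weights(rows):
    """Weight = 1 / P(A = exposure actually received | L), estimated within strata of L."""
    n_by_l = defaultdict(int)
    exposed_by_l = defaultdict(int)
    for l, a, _ in rows:
        n_by_l[l] += 1
        exposed_by_l[l] += a
    weights = []
    for l, a, _ in rows:
        p_exposed = exposed_by_l[l] / n_by_l[l]
        weights.append(1 / p_exposed if a == 1 else 1 / (1 - p_exposed))
    return weights

def risk(rows, weights, exposure):
    """Weighted proportion with the outcome among subjects at the given exposure level."""
    num = sum(w * y for (_, a, y), w in zip(rows, weights) if a == exposure)
    den = sum(w for (_, a, _), w in zip(rows, weights) if a == exposure)
    return num / den

rows = make_rows()
unweighted = [1.0] * len(rows)
weights = iptw_weights(rows)
crude_rd = risk(rows, unweighted, 1) - risk(rows, unweighted, 0)
iptw_rd = risk(rows, weights, 1) - risk(rows, weights, 0)
print(round(crude_rd, 3), round(iptw_rd, 3))
```

In this toy cohort the exposure lowers risk by 0.20 within each stratum of L, but exposed subjects are concentrated in the higher-risk stratum, so the crude difference is only about -0.05. After weighting, each exposure group has the same distribution of L as the whole cohort and the weighted difference is about -0.20.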
+
Inverse Probability of Censoring Weighting (IPCW)
Like IPTW, IPCW assigns a weight equal to the inverse of the probability of remaining in the study at each interval, based on values of observed covariates and past outcomes and exposures (C = 1 here denotes remaining in the study):
$$w_c = 1 / P[C = 1 \mid A(j), L(j)]$$
The censoring weights are applied to the observed population, creating a new pseudo-population in which censored subjects are “replaced” by up-weighting uncensored subjects with the same values of past exposures and covariates.
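A small sketch of how up-weighting "replaces" censored subjects, assuming retention probabilities are estimated as simple proportions within covariate strata (the slides instead describe a model based on covariates and past exposures). The strata, counts, and function names are hypothetical.

```python
def stratum_censoring_weights(enrolled, retained):
    """Map each stratum to 1 / P(retained | stratum), estimated from counts."""
    weights = {}
    for stratum, n_enrolled in enrolled.items():
        n_retained = retained.get(stratum, 0)
        if n_retained == 0:
            # Nobody left to up-weight: the weight is undefined for this stratum.
            raise ValueError(f"no retained subjects in stratum {stratum!r}")
        weights[stratum] = n_enrolled / n_retained
    return weights

def cumulative_weights(retention_probs):
    """Running product of 1 / P(retained at interval j) over successive intervals."""
    out, running = [], 1.0
    for p in retention_probs:
        running /= p
        out.append(running)
    return out

# Hypothetical counts at enrollment and at follow-up.
enrolled = {"uses drugs": 20, "no drug use": 30}
retained = {"uses drugs": 10, "no drug use": 24}

w = stratum_censoring_weights(enrolled, retained)
pseudo_n = sum(w[s] * retained[s] for s in retained)
print(w, pseudo_n)

# A subject retained with probability 0.8 in the first interval and 0.5 in the second.
print(cumulative_weights([0.8, 0.5]))
```

Each retained subject in the stratum with heavier drop-out stands in for two enrolled subjects, so the weighted, retained sample has the same size and covariate mix as the enrolled cohort. Over several intervals the per-interval weights multiply.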
+
Example: Prevention with Positives Demonstration Projects
Fifteen HRSA-funded demonstration projects implemented prevention with positives in clinical settings
Each site decided whether to randomize patients to:
Provider-delivered intervention vs. Assessment
Specialist-delivered intervention vs. Assessment
Mixed intervention vs. Provider intervention
How do we assess the effectiveness of each intervention type?
+
Example: Prevention with Positives Patient characteristics
Values are n (%).
Characteristic | Standard of care | Provider | Specialist | Mixed | p<
Male | 781 (74) | 490 (64) | 705 (72) | 530 (72) | .001
White | 410 (39) | 282 (37) | 332 (25) | 298 (22) | .001
Heterosexual | 453 (43) | 371 (48) | 478 (49) | 297 (39) | .001
Age 40 or more | 720 (68) | 423 (55) | 704 (72) | 431 (57) | .001
Education (less than HS) | 540 (51) | 377 (49) | 524 (54) | 371 (49) | ns
Employed | 411 (39) | 355 (46) | 324 (33) | 279 (37) | .001
CD4 < 200 | 152 (14) | 109 (14) | 154 (16) | 120 (16) | ns
VL < 75 | 381 (36) | 216 (28) | 418 (43) | 219 (29) | .001
+
Example: Prevention with Positives Retention
At the 12-month follow-up assessment, 58% of patients were retained in the standard of care group, 76% in the provider intervention sites, 62% in the specialist sites, and 44% in the mixed intervention sites. There were differences in retention by patient characteristics.
Older, white, gay males with more than a high school education who did not use cocaine or injection drugs were more likely to be retained in the study at 12 months.
+
Example: Prevention with Positives Risk Behavior
[Figure: risk behavior (vertical axis 0% to 30%) at baseline, 6 months, and 12 months for the Provider-led, Specialist-led, Mixed, and Assessment groups]
+
Example: Prevention with Positives Analysis
Inverse probability of treatment weights
$$E[A \mid L] = a_0 + a_1(\text{male}) + a_2(\text{white}) + a_3(\text{gay}) + \dots$$
$$w_t = 1 / P(A \mid L)$$
+
Example: Prevention with Positives Analysis
Inverse probability of censoring weights
$$E[C(j) = 1 \mid A, L] = c_0 + c_1(\text{provider}) + c_2(\text{specialist}) + \dots + c_m(\text{male}) + c_{m+1}(\text{white}) + c_{m+2}(\text{gay}) + \dots$$
$$w_c = \frac{1}{P[C(j) \mid A, L]} \times \frac{1}{P[C(j-1) \mid A, L]} \times \dots$$
Weighted logistic regression
$$w_t \times w_c \times \operatorname{logit}[E(Y \mid A)] = b_0 + b_1(\text{provider}) + b_2(\text{specialist}) + b_3(\text{mixed})$$
+
Example: Prevention with Positives Results
Intervention type | 6 months OR (95% CI) | 12 months OR (95% CI)
Provider-delivered | 0.93 (0.60, 1.20) | 0.55 (0.32, 0.94)
Specialist-delivered | 0.58 (0.35, 0.96) | 0.67 (0.39, 1.14)
Mixed | 0.89 (0.53, 1.51) | 0.89 (0.53, 1.51)
Assessment only | Reference | Reference
+
G-computation and Population intervention Models
G-computation
Sometimes called the substitution estimation approach
The G-computation approach is to model the exposure and outcome relationship and then “control” exposure in the population by substituting counterfactual exposures in your model
Population intervention models use this approach to answer practical questions
+
Population Intervention Models
Standard regression models give a conditional estimate:
$$E(Y \mid A = 1, W = w) - E(Y \mid A = 0, W = w)$$
Marginal structural models allow a total effect estimate:
$$E_W(Y_1) - E_W(Y_0)$$
For interventions, what we care about is the population difference when the intervention is present or absent:
$$E_W(Y_a) - E_W(Y)$$
+
Analogous to Attributable Risk
Traditional population Attributable Risk or Attributable Fraction: the proportion of the disease risk in the total population associated with the exposure
$$\frac{\text{Incidence}_{\text{exposed}} - \text{Incidence}_{\text{unexposed}}}{\text{Incidence}_{\text{exposed}}} \times \text{proportion}_{\text{exposed}} \times 100$$
This assumes the exposure causes the outcome and that there are no other causes, i.e., in the absence of that exposure there would be no outcome
+
Why PIMS?
We are rarely looking at outcomes with only one important predictor/confounder
PIMS allow assessment of the effect averaged across covariates
We are rarely able to completely eliminate a risk factor from the population
PIMS allow estimation for realistic interventions
+
Population Intervention Models: estimation
1) Estimate the outcome model
2) Create a new dataset, setting the covariate(s) of interest to intervention levels
3) Predict the outcome of interest using the model estimated in step 1
4) Calculate the difference between the predicted mean outcome and the observed mean outcome
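The four steps can be sketched in a few lines. This is an illustration under stated assumptions, not the analysis used in the study: the outcome model is a saturated one (the observed outcome proportion in each exposure-by-covariate cell) rather than a fitted regression, there is one binary covariate W and one binary risk factor A, and the counts and function names are hypothetical.

```python
from collections import defaultdict

def build_rows():
    """Hypothetical population as (covariate W, risk factor A, outcome Y) records."""
    cells = [  # (W, A, number of subjects, number with the outcome)
        (0, 1, 10, 4), (0, 0, 30, 6),
        (1, 1, 30, 21), (1, 0, 10, 5),
    ]
    rows = []
    for w, a, n, events in cells:
        rows += [(w, a, 1)] * events + [(w, a, 0)] * (n - events)
    return rows

def fit_outcome_model(rows):
    """Step 1: estimate E[Y | A, W] as the outcome proportion in each (W, A) cell."""
    totals = defaultdict(lambda: [0, 0])  # (W, A) -> [events, subjects]
    for w, a, y in rows:
        totals[(w, a)][0] += y
        totals[(w, a)][1] += 1
    return {cell: events / n for cell, (events, n) in totals.items()}

def population_intervention_effect(rows, intervention_level=0):
    model = fit_outcome_model(rows)
    # Step 2: copy the data, setting the risk factor to the intervention level for everyone.
    intervened = [(w, intervention_level) for w, _, _ in rows]
    # Step 3: predict each subject's outcome under the intervention.
    predicted = [model[cell] for cell in intervened]
    # Step 4: predicted mean outcome minus observed mean outcome.
    observed_mean = sum(y for _, _, y in rows) / len(rows)
    return sum(predicted) / len(predicted) - observed_mean

rows = build_rows()
effect = population_intervention_effect(rows, intervention_level=0)
print(round(effect, 3))
```

Here the observed outcome prevalence is 0.45 and the predicted prevalence with the risk factor removed from everyone is 0.35, so the population intervention parameter is about -0.10. Unlike an odds ratio, the result depends on how common the risk factor is as well as on how strongly it is associated with the outcome.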
+
Example: SHAZ! study
SHAZ! (Shaping the Health of Adolescents in Zimbabwe)
Enrolled adolescent orphan girls ages 16 to 19
The overall project was designed as an HIV prevention intervention based on provision of reproductive health services, economic livelihoods training, and life-skills education
+
Example: SHAZ! study
Using baseline data to look at a secondary outcome
Interested in the potential of interventions to improve mental health for adolescent orphan girls
Several structural factors considered as potentially modifiable with intervention
[Conceptual diagram, by domain:]
Orphaning: age at orphaning
Socioeconomic status: food security, ability to pay for medication, ever homeless, changes in household, completed education
Social environment: female caregiver relationship, social support, exposure to violence, feeling safe at home, caring for ill person
Psychological distress (unmeasured)
Baseline self efficacy
Poor physical health: general health status, viral infection
Baseline mental health status: SSQ
+
PIMS Question:
What is the potential impact of intervening on these factors on this population’s mental health status?
+
Domain/variable | Prevalence in population, N | % | Hypothesized intervention level
Social environment
Physical violence | 18 | 4.7% | no experience of physical violence
Sexual violence | 29 | 7.6% | no experience of sexual violence
Forced sex | 28 | 7.3% | no experience of forced sex
Unsafe home environment | 241 | 62.9% | home environment considered very safe
Household experience of violence | 34 | 8.9% | no one in the house experiencing violence
Caring for ill | 115 | 30.0% | not caring for someone ill in the household
Low social support | 231 | 60.3% | "enough" people you can count on
Absence of supportive female caregiver | 116 | 30.3% | presence of a female caregiver who is "often" or "always" supportive
Socioeconomic status
Food security | 132 | 34.5% | never going to bed hungry or not eating because there is no food
Unable to buy medicine | 235 | 61.4% | able to buy needed medicine within 2 days
Changes in household location | 197 | 51.4% | no changes in household location within the past 5 years
Ever homeless | 86 | 22.5% | never homeless
Less than form 4 education | 99 | 25.8% | at least form 4 (secondary) education
Low baseline self efficacy | 335 | 87.5% | average response of "agree/strongly agree" with positive statements, "disagree/strongly disagree" with negative statements
Poor physical health
Less than excellent health | 278 | 72.6% | excellent self-reported health
Viral infection HIV/HSV-2 | 42 | 11.0% | no viral infection with HIV or HSV-2
+
Traditional regression results
Conditional effects parameter (standard regression)
Domain/variable | Dichotomized OR
Social environment
Physical violence | 3.67
Sexual violence | 0.61
Forced sex | 2.99
Unsafe home environment | 1.50
Household experience of violence | 1.85
Caring for ill | 5.19
Low social support | 1.64
Absence of supportive female caregiver | 2.57
Socioeconomic status
Food security | 0.88
Unable to buy medicine | 1.30
Changes in household location | 1.11
Ever homeless | 2.40
Less than form 4 education | 1.38
Low baseline self efficacy | 4.84
Poor physical health
Less than excellent health | 2.67
Viral infection HIV/HSV-2 | 2.57
+
Potential Impact of Interventions
Domain/variable | Prevalence in population, N | % | Population intervention parameter
Social environment
Physical violence | 18 | 4.7% | -1.1%
Sexual violence | 29 | 7.6% | 0.0%
Forced sex | 28 | 7.3% | -0.7%
Unsafe home environment | 241 | 62.9% | -3.5%
Household experience of violence | 34 | 8.9% | -1.1%
Caring for ill | 115 | 30.0% | -5.8%
Low social support | 231 | 60.3% | -4.4%
Absence of supportive female caregiver | 116 | 30.3% | -3.9%
Socioeconomic status
Food security | 132 | 34.5% | 0.4%
Unable to buy medicine | 235 | 61.4% | -2.7%
Changes in household location | 197 | 51.4% | -0.9%
Ever homeless | 86 | 22.5% | -2.8%
Less than form 4 education | 99 | 25.8% | -0.5%
Low baseline self efficacy | 335 | 87.5% | -9.2%
Poor physical health
Less than excellent health | 278 | 72.6% | -7.4%
Viral infection HIV/HSV-2 | 42 | 11.0% | -1.3%
+
Extension of this approach to longitudinal context:
[Figure: timeline (axis: Time) with three parallel tracks]
Covariates: baseline, 6 month, 12 month, 18 month
Intervention participation: Life-skills, Red Cross, start vocational training, finish vocational training, receive grant
Mental health: baseline, 6 months, 12 months, 18 months, 24 months
+
Question:
Does poor mental health status affect participation in the intervention over time?
+
Analytic approach
Interested in the effect of exposure (A) on outcome (Y) given covariates and past exposure and outcome:
$$E_W[E_0(Y \mid A = 1, W) - E_0(Y \mid A = 0, W)]$$
where W includes past exposure and outcome and other covariates
+
Analytic approach cont.
Fit a series of point treatment models for outcomes at timepoints following exposure(s) of interest
+
Example 1:
[Diagram nodes, with the role of each in this model:]
Baseline covariates (W)
Baseline Mental Health (A)
Intervention Participation: Life-skills (Y), Red Cross (Y)
6 month covariates
Mental Health at 6 months
Intervention Participation: Start vocational training
+
Example 2:
[Diagram nodes, with the role of each in this model:]
Baseline covariates (W)
Baseline Mental Health (W)
Intervention Participation: Life-skills, Red Cross (W)
6 month covariates (W)
Mental Health at 6 months (A)
Intervention Participation: Start vocational training (Y)
+
Odds of Completion of Intervention Components by Symptomatic Status for Mental Health Distress at Baseline, Conditional on Completing Previous Intervention Components: Estimates from Logistic Regression
Intervention component | Sample size | OR (95% CI)
Lifeskills | 300 | 1.1 (0.35, 3.42)
Red Cross | 282 | 0.57 (0.30, 1.11)
Start vocational training | 114 | 1.30 (0.14, 12.14)
Completed vocational training | 114 | 0.63 (0.26, 1.54)
Received grant | 78 | 0.54 (0.05, 6.37)
+
Difference in Intervention Component Completion by Mental Health Distress Symptoms, Conditional on Completing Previous Intervention Components: Average Treatment Effects (ATE) using tmle(D/S/A) estimation
Columns: Lifeskills, Red Cross, Start vocational training, Completed vocational training (each with sample size and ATE (95% CI))
Rows: Symptomatic at baseline, Symptomatic at 6 months, Symptomatic at 12 months, Symptomatic at 18 months
Values as listed: 300, 0.03 (-0.02, 0.08); 282, -0.23 (-0.41, -0.05); 119, -0.01 (-0.16, 0.14); 118, 0.05 (0.02, 0.10); 114, 113, 110; -0.18 (-0.43, 0.07); 0.04 (-0.19, 0.26); -0.01 (-0.28, 0.26)
Bold numbers indicate parameters statistically significant at p<0.05
+
Assumptions and Limitations
+
Assumptions
No unmeasured confounding: there is no way to test empirically for unmeasured confounding; collection of data on a complete set of covariates should be incorporated in the design phase
Time-ordering (temporality): we need to be certain that covariates used in the treatment weights were measured prior to treatment, and that treatment is prior to the outcome
Experimental Treatment Assignment (ETA) or positivity: groups defined by all possible combinations of covariates must have the potential to be in any (either) treatment group. If there are covariate groups that will only be observed in one treatment state, then we cannot estimate the effect of the exposure within that group
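The positivity assumption, unlike unmeasured confounding, can be partly checked in the data. A minimal sketch, assuming categorical covariates and a binary treatment; the function name and the example records are hypothetical.

```python
from collections import defaultdict

def positivity_violations(rows, treatment_levels=(0, 1)):
    """Return covariate strata in which some treatment level is never observed.

    rows: iterable of (covariate_tuple, treatment) pairs.
    Result maps each offending stratum to the treatment levels missing from it.
    """
    seen = defaultdict(set)
    for covariates, treatment in rows:
        seen[covariates].add(treatment)
    violations = {}
    for covariates, observed in seen.items():
        missing = sorted(set(treatment_levels) - observed)
        if missing:
            violations[covariates] = missing
    return violations

# Hypothetical records: (sex, race) strata and treatment received.
rows = [
    (("male", "white"), 1),
    (("male", "white"), 0),
    (("female", "white"), 1),
    (("female", "white"), 1),
]
print(positivity_violations(rows))
```

In this example no one in the ("female", "white") stratum is untreated, so the weight for the untreated state is undefined there and the effect cannot be estimated within that group from these data. Strata that are observed in both states but only rarely in one of them do not show up in this check, yet they can still produce very large weights.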
+
Acknowledgements
Thanks to: Alan Hubbard, UCB; Mark van der Laan, UCB; Jennifer Ahern, UCB