
Strengthening Causal Inference in HIV Studies: Examples

CAPS Methods Core Presentation, April 18, 2012
Starley Shade, Sheri Lippman, Mi-Suk Kang Dufour & Carol Camlin

+

Outline

- Answering causal questions: common roadblocks in HIV research
- Causal inference framework and overview of methods
- Concrete example: using treatment and censoring weighting in Prevention with Positives
- Concrete example: G-computation for population-level attributable risk in the SHAZ! study
- Q & A

+

Roadblocks in HIV research: selection bias / who gets exposed

- Population surveillance and surveys in probability-based samples
  - Study participants (in testing, in survey research, etc.) almost always differ systematically from non-participants
- Observational studies using 'comparison' clinics or communities
  - Systematic differences between study arms exist and/or may accrue over time

+

Common roadblocks in HIV research: Loss To Follow-up

- Cohort studies of HIV+ individuals are highly susceptible to loss to follow-up
  - >20% after 2 years, in resource-poor settings: medical records don't capture patient mobility
  - Death registries are rarely available, and those who die are mistakenly assumed to be lost to follow-up
- Those who drop out are systematically different from those who stay engaged in care

+

Roadblocks in HIV research: time dependent confounding

[Diagram: confounders C (& U), exposure (Expos), and outcome (STI) shown at time points 0, 1, 2, and 3]

Time-dependent confounding occurs if C is related to prior exposure and affects subsequent exposure
C = group of confounders
U = unmeasured confounders

+

Common roadblocks in HIV research: Complex, multi component intervention studies

- Increasing calls for comprehensive HIV prevention interventions addressing multiple levels and domains of influence on individual behavior
- Evaluation of such studies is hampered by:
  - Diverse levels of exposure to individual intervention components
  - Difficulty distinguishing the relative contributions of individual intervention components to observed outcomes

+

Mending our comparison – the causal/counterfactual framework

- “We may define a cause to be an object followed by another… where, if the first object had not been, the second never had existed” (Hume 1748)
- An association can be considered causal when, if the exposure had been altered, the outcome would have been different
- The key part is the counterfactual element: reference to what would have happened if, contrary to fact, the exposure had been something other than what it actually was

+

Counterfactual framework

- The “ideal experiment” illustrates the framework: a hypothetical study which, if we could actually conduct it, would allow us to infer causality
- Ideal experiment:
  - A person or population experiences one exposure and is observed for the outcome over a given time period
  - Roll back the clock
  - Change the exposure but leave everything else the same, and observe for the outcome over the same time period

+

Counterfactual framework

[Diagram: OBSERVED timeline for Person A ending in AIDS; horizontal axis: time]

Counterfactual question: how long would Person A have survived if he/she had not received treatment?

+

Counterfactual framework

[Diagram: OBSERVED and UNOBSERVED timelines, each ending in AIDS; horizontal axis: time]

+

Counterfactuals – specifying what we really want to know

- It is instructive to think about the counterfactual outcome(s) as something we are missing and are trying to estimate when we analyze HIV studies or any epidemiologic data
- Akin to a missing data problem
- When we compare groups of people observed as exposed or unexposed, we want to compare groups that best estimate the counterfactual outcomes that are unobserved or missing

+

Notation for presentation

[Diagram: nodes W, L, A, and Y]

- A = treatment
- Y = outcome
- W = confounders (point treatment)
- L = confounders (longitudinal)
- The likelihood of the data simplifies to: $L(O) = P(Y \mid A, W, L) \, P(A \mid W, L)$

+

Rationale for causal inference approach

 Basic regression models produce stratum specific, or conditional, estimates (i.e., “while holding constant a set of covariates”)

$E[Y \mid A, L] = b_0 + b_1 A + b_3 L(j) + \dots$

Where Y is the outcome, A is the observed exposure, and L is a matrix of time-dependent covariates

Therefore, our estimates of effect are also conditional:

$E[Y \mid A = 1, L] - E[Y \mid A = 0, L] = b_1$

+

Rationale for causal inference approach

 Causal inference approaches help us model our way back to the ideal (counterfactual) experiment

$E[Y(a = 1) - Y(a = 0)]$

Where Y is the outcome and a is the counterfactual exposure, with all individuals exposed (a = 1) or all unexposed (a = 0)

+

Inverse Probability Weighting

+

Inverse Probability of Treatment Weighting (IPTW)

- Re-create the counterfactual data set by weighting
- IPTW assigns each subject a weight equal to the inverse probability of being in their exposure group at each interval

$w_t = 1 / P[A(j) = 1 \mid A(j-1), L(j)]$

The treatment model is based on values of past and current covariates (L(j)) and past exposures (A(j-1)).

$E[A(j) \mid A(j-1), L(j)] = a_0 + a_2 L(j) + a_3 A(j-1) + a_4 L(j-1) + \dots$

+

Inverse Probability of Treatment Weighting (IPTW)

 The treatment weights are applied to the observed population (e.g. weighted logistic regression)

$w_t \, [E(Y \mid A)] = b_0 + b_1 A$

This creates a new pseudo-population in which the distribution of confounders is balanced between the two exposure groups, essentially mimicking a randomized trial.

$E[Y(a = 1) - Y(a = 0)] = b_1$
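
As a rough illustration of the weighting idea, the Python sketch below applies IPTW in the simplest point-treatment case: a single binary confounder L, with P(A | L) estimated by stratification instead of a fitted logistic treatment model. The records, the function names, and the saturated treatment model are all assumptions made for illustration; none of them come from the presentation's data or code.

```python
from collections import defaultdict

# Hypothetical toy records (L, A, Y); NOT study data.
# L = binary confounder, A = binary exposure, Y = binary outcome.
records = (
    [(0, 0, 0)] * 6 + [(0, 0, 1)] * 2    # L=0, unexposed: 2 events in 8
    + [(0, 1, 0)] * 1 + [(0, 1, 1)] * 1  # L=0, exposed:   1 event  in 2
    + [(1, 0, 0)] * 1 + [(1, 0, 1)] * 1  # L=1, unexposed: 1 event  in 2
    + [(1, 1, 0)] * 2 + [(1, 1, 1)] * 6  # L=1, exposed:   6 events in 8
)


def treatment_probabilities(data):
    """Estimate P(A = a | L = l) by stratification (an assumed saturated model)."""
    n_l = defaultdict(int)
    n_la = defaultdict(int)
    for l, a, _ in data:
        n_l[l] += 1
        n_la[(l, a)] += 1
    return {(l, a): n / n_l[l] for (l, a), n in n_la.items()}


def crude_risk(data, a):
    """Unweighted mean of Y among subjects observed with A = a."""
    ys = [y for _, a_obs, y in data if a_obs == a]
    return sum(ys) / len(ys)


def iptw_risk(data, probs, a):
    """Weighted mean of Y among A = a, using weights w_t = 1 / P(A = a | L)."""
    num = den = 0.0
    for l, a_obs, y in data:
        if a_obs == a:
            w_t = 1.0 / probs[(l, a_obs)]
            num += w_t * y
            den += w_t
    return num / den


probs = treatment_probabilities(records)
crude_rd = crude_risk(records, 1) - crude_risk(records, 0)
iptw_rd = iptw_risk(records, probs, 1) - iptw_risk(records, probs, 0)
print(f"crude risk difference: {crude_rd:.3f}")
print(f"IPTW risk difference:  {iptw_rd:.3f}")
```

In this toy data the crude risk difference is 0.40, while the weighted difference is 0.25, the same as the difference within each level of L. With a realistic number of covariates, the stratified probabilities would normally be replaced by predictions from a fitted treatment model.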

+

Inverse Probability of Censoring Weighting (IPCW)

 Like IPTW, IPCW assigns a weight equivalent to the inverse probability of remaining in the study at each interval, based on values of observed covariates and past outcomes and exposures.

$w_c = 1 / P[C = 1 \mid A(j), L(j)]$

The censoring weights are applied to the observed population, creating a new pseudo-population in which censored subjects are “replaced” by up-weighting uncensored subjects with the same values of past exposures and covariates.
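
The "replacement" of censored subjects can be sketched the same way. The Python example below is a minimal illustration under stated assumptions: the records are invented, retention is coded C = 1, and P(C = 1 | A, L) is estimated by stratification on one exposure group and one binary covariate instead of a fitted censoring model.

```python
from collections import defaultdict

# Hypothetical toy records (A, L, C, Y); NOT study data.
# C = 1 means the subject remained in the study; Y is None when censored.
records = (
    [(1, 0, 1, 0)] * 6 + [(1, 0, 1, 1)] * 2 + [(1, 0, 0, None)] * 2    # L=0: 8 of 10 retained
    + [(1, 1, 1, 0)] * 1 + [(1, 1, 1, 1)] * 3 + [(1, 1, 0, None)] * 6  # L=1: 4 of 10 retained
)


def retention_probabilities(data):
    """Estimate P(C = 1 | A, L) within each observed (A, L) stratum."""
    total = defaultdict(int)
    retained = defaultdict(int)
    for a, l, c, _ in data:
        total[(a, l)] += 1
        retained[(a, l)] += c
    return {key: retained[key] / total[key] for key in total}


def complete_case_mean(data):
    """Unweighted mean of Y among uncensored subjects."""
    ys = [y for _, _, c, y in data if c == 1]
    return sum(ys) / len(ys)


def ipcw_mean(data, p_retained):
    """Mean of Y among uncensored subjects, weighted by w_c = 1 / P(C = 1 | A, L)."""
    num = den = 0.0
    for a, l, c, y in data:
        if c == 1:
            w_c = 1.0 / p_retained[(a, l)]
            num += w_c * y
            den += w_c
    return num / den


p_retained = retention_probabilities(records)
naive = complete_case_mean(records)
weighted = ipcw_mean(records, p_retained)
print(f"complete-case mean: {naive:.3f}")
print(f"IPCW mean:          {weighted:.3f}")
```

Because the L = 1 stratum has both higher risk and lower retention in this toy data, the complete-case mean understates the outcome, and up-weighting the retained L = 1 subjects restores their share of the population.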

+

Example: Prevention with Positives Demonstration Projects

- Fifteen HRSA-funded demonstration projects implemented prevention with positives in clinical settings
- Each site decided whether to randomize patients to:
  - Provider-delivered intervention vs. assessment
  - Specialist-delivered intervention vs. assessment
  - Mixed intervention vs. provider intervention
- How do we assess the effectiveness of each intervention type?

+

Example: Prevention with Positives Patient characteristics

Values are n (%)

Characteristic              Standard of care  Provider  Specialist  Mixed     p<
Male                        781 (74)          490 (64)  705 (72)    530 (72)  .001
White                       410 (39)          282 (37)  332 (25)    298 (22)  .001
Heterosexual                453 (43)          371 (48)  478 (49)    297 (39)  .001
Age 40 or more              720 (68)          423 (55)  704 (72)    431 (57)  .001
Education (less than HS)    540 (51)          377 (49)  524 (54)    371 (49)  ns
Employed                    411 (39)          355 (46)  324 (33)    279 (37)  .001
CD4 < 200                   152 (14)          109 (14)  154 (16)    120 (16)  ns
VL < 75                     381 (36)          216 (28)  418 (43)    219 (29)  .001

+

Example: Prevention with Positives Retention

- At the 12-month follow-up assessment:
  - 58% of patients were retained in the standard of care group
  - 76% were retained in the provider intervention sites
  - 62% were retained in the specialist sites
  - 44% were retained in the mixed intervention sites
- There were differences in retention by patient characteristics
  - Older, white, gay males with more than a high school education who did not use cocaine or injection drugs were more likely to be retained in the study at 12 months

+

Example: Prevention with Positives Risk Behavior

[Figure: risk behavior (vertical axis 0% to 30%) at baseline, 6 months, and 12 months for the provider-led, specialist-led, mixed, and assessment groups]

+

Example: Prevention with Positives Analysis

 Inverse probability of treatment weights

$E[A \mid L] = a_0 + a_1(\text{male}) + a_2(\text{white}) + a_3(\text{gay}) + \dots$

$w_t = 1 / P(A \mid L)$

+

Example: Prevention with Positives Analysis

 Inverse probability of censoring weights

$E[C(j) = 1 \mid A, L] = c_0 + c_1(\text{provider}) + c_2(\text{specialist}) + \dots + c_k(\text{male}) + c_{k+1}(\text{white}) + c_{k+2}(\text{gay}) + \dots$

$w_c = 1 / P[C(j) \mid A, L] \times 1 / P[C(j-1) \mid A, L] \times \dots$
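
The product form of the censoring weight can be sketched as a running product over follow-up intervals. The function name and the example probabilities below are hypothetical; in practice each probability would come from the fitted censoring model for that subject and interval.

```python
from itertools import accumulate
from operator import mul


def cumulative_censoring_weights(p_uncensored):
    """Return w_c at each interval as the running product of 1 / P(uncensored).

    p_uncensored[k] is one subject's estimated probability of remaining
    uncensored at interval k, given exposure and covariate history.
    """
    if any(not 0.0 < p <= 1.0 for p in p_uncensored):
        raise ValueError("probabilities must lie in (0, 1]")
    return list(accumulate((1.0 / p for p in p_uncensored), mul))


# Hypothetical fitted probabilities for one subject over three intervals.
example_probs = [0.9, 0.8, 0.5]
weights = cumulative_censoring_weights(example_probs)
print([round(w, 3) for w in weights])
```

A subject who is unlikely to remain in the study at several consecutive intervals accumulates a large weight, which is one reason weight distributions are usually inspected before the weighted outcome model is fitted.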

 Weighted logistic regression

$w_t \times w_c \times \text{logit}[E(Y \mid A)] = b_0 + b_1(\text{provider}) + b_2(\text{specialist}) + b_3(\text{mixed})$

+

Example: Prevention with Positives Results

Intervention type     6 months OR (95% CI)  12 months OR (95% CI)
Provider-delivered    0.93 (0.60, 1.20)     0.55 (0.32, 0.94)
Specialist-delivered  0.58 (0.35, 0.96)     0.67 (0.39, 1.14)
Mixed                 0.89 (0.53, 1.51)     0.89 (0.53, 1.51)
Assessment only       Reference             Reference

+

G-computation and Population intervention Models

G-computation

- Sometimes called the substitution estimation approach
- The G-computation approach is to model the exposure and outcome relationship and then “control” exposure in the population by substituting counterfactual exposures in your model
- Population intervention models use this approach to answer practical questions

+

Population Intervention Models

Standard regression models give conditional estimate:

$E(Y \mid A = 1, W = w) - E(Y \mid A = 0, W = w)$

Marginal structural models allow total effect estimate:

$E_w(Y_1) - E_w(Y_0)$

For interventions, what we care about is the population difference when the intervention is present or absent:

$E_w(Y_a) - E_w(Y)$

+

Analogous to Attributable Risk

- Traditional population Attributable Risk or Attributable Fraction:
  - The proportion of the disease risk in the total population associated with the exposure

$\frac{\text{Incidence}_{\text{exposed}} - \text{Incidence}_{\text{unexposed}}}{\text{Incidence}_{\text{exposed}}} \times \text{proportion exposed} \times 100$

This assumes that the exposure causes the outcome and that there are no other causes, i.e., in the absence of that exposure there would be no outcome

+

Why PIMS?

- We are rarely looking at outcomes with only one important predictor/confounder
  - PIMS allow assessment of effect averaged across covariates
- We are rarely able to completely eliminate a risk factor from a population
  - PIMS allow estimation for realistic interventions

+

Population Intervention Models: estimation

1) Estimate outcome model
2) Create new dataset setting covariate(s) of interest to intervention levels
3) Predict outcome of interest using model estimated in step 1
4) Calculate the difference between predicted mean outcome and observed mean outcome
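
The four steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the records are invented, there is a single binary covariate W, and the outcome model is the observed mean of Y in each (A, W) cell where a fitted regression would normally be used.

```python
from collections import defaultdict

# Hypothetical toy records (W, A, Y); NOT SHAZ! data.
# W = binary covariate, A = binary risk factor, Y = binary outcome.
records = (
    [(0, 0, 0)] * 6 + [(0, 0, 1)] * 2    # W=0, A=0: 2 events in 8
    + [(0, 1, 0)] * 1 + [(0, 1, 1)] * 1  # W=0, A=1: 1 event  in 2
    + [(1, 0, 0)] * 1 + [(1, 0, 1)] * 1  # W=1, A=0: 1 event  in 2
    + [(1, 1, 0)] * 2 + [(1, 1, 1)] * 6  # W=1, A=1: 6 events in 8
)


def fit_outcome_model(data):
    """Step 1: estimate E[Y | A, W] as the mean of Y in each (A, W) cell."""
    total = defaultdict(int)
    events = defaultdict(int)
    for w, a, y in data:
        total[(a, w)] += 1
        events[(a, w)] += y
    return {key: events[key] / total[key] for key in total}


def population_intervention_effect(data, a_star):
    """Return E_w(Y_a) - E(Y) for the intervention 'set A to a_star for everyone'."""
    model = fit_outcome_model(data)
    # Step 2: new dataset with the exposure set to the intervention level.
    intervened = [(w, a_star) for w, _, _ in data]
    # Step 3: predict the outcome for every row of the new dataset.
    # (Requires every W stratum to contain someone observed at a_star.)
    predicted = [model[(a, w)] for w, a in intervened]
    # Step 4: predicted mean outcome minus observed mean outcome.
    predicted_mean = sum(predicted) / len(predicted)
    observed_mean = sum(y for _, _, y in data) / len(data)
    return predicted_mean - observed_mean


effect = population_intervention_effect(records, a_star=0)
print(f"population intervention parameter: {effect:+.3f}")
```

In this toy data, removing the risk factor for everyone lowers the mean outcome from 0.500 to 0.375, a population intervention parameter of -0.125. Confidence intervals are not shown; a bootstrap over steps 1 to 4 is one common way to obtain them.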

+

Example: SHAZ! study

- SHAZ! (Shaping the Health of Adolescents in Zimbabwe)
- Enrolled adolescent orphan girls ages 16 to 19
- Overall project was designed as an HIV prevention intervention based on provision of reproductive health services, economic livelihoods training, and life-skills education

+

Example: SHAZ! study

- Using baseline data to look at a secondary outcome
- Interested in the potential of interventions to improve mental health for adolescent orphan girls
- Several structural factors considered as potentially modifiable with intervention

[Diagram: conceptual model of factors related to baseline mental health status]

- Orphaning: age at orphaning
- Socioeconomic status: food security, ability to pay for medication, ever homeless, changes in household, completed education
- Social environment: female caregiver relationship, social support, exposure to violence, feeling safe at home, caring for ill person
- Psychological distress (unmeasured)
- Baseline self efficacy
- Poor physical health: general health status, viral infection
- Baseline mental health status: SSQ

+

PIMS Question:

What is the potential impact of intervening on these factors on this population’s mental health status?

+

Prevalence in population (N, %) and hypothesized intervention level

Domain/variable                           N    %      Hypothesized intervention level
Social environment
  Physical violence                       18   4.7%   no experience of physical violence
  Sexual violence                         29   7.6%   no experience of sexual violence
  Forced sex                              28   7.3%   no experience of forced sex
  Unsafe home environment                 241  62.9%  home environment considered very safe
  Household experience of violence        34   8.9%   no one in the house experiencing violence
  Caring for ill                          115  30.0%  not caring for someone ill in the household
  Low social support                      231  60.3%  "enough" people you can count on
  Absence of supportive female caregiver  116  30.3%  presence of a female caregiver who is "often" or "always" supportive
Socioeconomic status
  Food security                           132  34.5%  never going to bed hungry or not eating because there is no food
  Unable to buy medicine                  235  61.4%  able to buy needed medicine within 2 days
  Changes in household location           197  51.4%  no changes in household location within the past 5 years
  Ever homeless                           86   22.5%  never homeless
  Less than form 4 education              99   25.8%  at least form 4 (secondary) education
Low baseline self efficacy                335  87.5%  average response of "agree/strongly agree" with positive statements, "disagree/strongly disagree" with negative statements
Poor physical health
  Less than excellent health              278  72.6%  excellent self reported health
  Viral infection HIV/HSV-2               42   11.0%  no viral infection with HIV or HSV-2

+

Traditional regression results

Conditional effects parameter (standard regression)

Domain/variable                           Dichotomized OR
Social environment
  Physical violence                       3.67
  Sexual violence                         0.61
  Forced sex                              2.99
  Unsafe home environment                 1.50
  Household experience of violence        1.85
  Caring for ill                          5.19
  Low social support                      1.64
  Absence of supportive female caregiver  2.57
Socioeconomic status
  Food security                           0.88
  Unable to buy medicine                  1.30
  Changes in household location           1.11
  Ever homeless                           2.40
  Less than form 4 education              1.38
Low baseline self efficacy                4.84
Poor physical health
  Less than excellent health              2.67
  Viral infection HIV/HSV-2               2.57

Potential Impact of Interventions

Prevalence in population (N, %) and population intervention parameter

Domain/variable                           N    %      Population intervention parameter
Social environment
  Physical violence                       18   4.7%   -1.1%
  Sexual violence                         29   7.6%   0.0%
  Forced sex                              28   7.3%   -0.7%
  Unsafe home environment                 241  62.9%  -3.5%
  Household experience of violence        34   8.9%   -1.1%
  Caring for ill                          115  30.0%  -5.8%
  Low social support                      231  60.3%  -4.4%
  Absence of supportive female caregiver  116  30.3%  -3.9%
Socioeconomic status
  Food security                           132  34.5%  0.4%
  Unable to buy medicine                  235  61.4%  -2.7%
  Changes in household location           197  51.4%  -0.9%
  Ever homeless                           86   22.5%  -2.8%
  Less than form 4 education              99   25.8%  -0.5%
Low baseline self efficacy                335  87.5%  -9.2%
Poor physical health
  Less than excellent health              278  72.6%  -7.4%
  Viral infection HIV/HSV-2               42   11.0%  -1.3%

+

Extension of this approach to longitudinal context:

[Diagram: timeline of covariates (baseline, 6, 12, and 18 months), mental health (baseline, 6, 12, 18, and 24 months), and intervention participation (life-skills, Red Cross, start vocational training, finish vocational training, receive grant)]

+

Question:

Does poor mental health status affect participation in the intervention over time?

+

Analytic approach

Interested in the effect of exposure (A) on outcome (Y) given covariates and past exposure and outcome:

$E_W[E_0(Y \mid A = 1, W) - E_0(Y \mid A = 0, W)]$

Where W includes past exposure and outcome and other covariates

+

Analytic approach cont.

Fit a series of point treatment models for outcomes at timepoints following exposure(s) of interest

+

Example 1:

- Baseline covariates (W)
- Baseline mental health (A)
- Intervention participation: life-skills (Y), Red Cross (Y)
- Also in the diagram: 6 month covariates; intervention participation: start vocational training; mental health at 6 months

+

Example 2:

- Baseline covariates (W)
- Intervention participation: life-skills, Red Cross (W)
- 6 month covariates (W)
- Baseline mental health (W)
- Mental health at 6 months (A)
- Intervention participation: start vocational training (Y)

Odds of Completion of Intervention Components by Symptomatic Status for Mental Health Distress at Baseline, Conditional on Completing Previous Intervention Components: Estimates from Logistic Regression

Intervention component         Sample size  OR (95% CI)
Lifeskills                     300          1.1 (0.35, 3.42)
Red Cross                      282          0.57 (0.30, 1.11)
Start vocational training      114          1.30 (0.14, 12.14)
Completed vocational training  114          0.63 (0.26, 1.54)
Received grant                 78           0.54 (0.05, 6.37)

Difference in Intervention Component Completion by Mental Health Distress Symptoms, Conditional on Completing Previous Intervention Components: Average Treatment Effects (ATE) using tmle(D/S/A) estimation

Columns (each with sample size and ATE (95% CI)): Lifeskills; Red Cross; Start vocational training; Completed vocational training
Rows: Symptomatic at baseline; Symptomatic at 6 months; Symptomatic at 12 months; Symptomatic at 18 months

Sample size and ATE (95% CI), in slide order:
300, 0.03 (-0.02, 0.08)
282, -0.23 (-0.41, -0.05)
119, -0.01 (-0.16, 0.14)
118, 0.05 (0.02, 0.10)
114, 113, 110: -0.18 (-0.43, 0.07); 0.04 (-0.19, 0.26); -0.01 (-0.28, 0.26)

Bold numbers indicate parameters statistically significant at p<0.05

+

Assumptions and Limitations

+

Assumptions

- No unmeasured confounding
  - There is no way to empirically test for no unmeasured confounding; collection of data on a complete set of covariates should be incorporated in the design phase
- Time-ordering (temporality)
  - Need to be certain that covariates used in the treatment (Tx) weights were measured prior to treatment, and that treatment is prior to outcome
- Experimental Treatment Assignment (ETA) or positivity
  - Groups defined by all possible combinations of covariates must have the potential to be in any (either) treatment group. If there are covariate groups that will only be observed in one treatment state, then we cannot estimate the effect of the exposure within that group
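
Unlike unmeasured confounding, empirical positivity violations can be screened for in the data. The sketch below is a hypothetical helper (the function name, the covariate coding, and the toy records are all assumptions) that lists covariate strata observed in only some treatment states.

```python
from collections import defaultdict


def positivity_violations(data, treatment_levels=(0, 1)):
    """Map each covariate stratum to the treatment levels never observed in it.

    data is an iterable of (covariates, treatment) pairs; covariates must be
    hashable, for example a tuple of coded values. Strata observed at every
    treatment level are omitted from the result.
    """
    seen = defaultdict(set)
    for covariates, a in data:
        seen[covariates].add(a)
    all_levels = set(treatment_levels)
    return {
        stratum: sorted(all_levels - levels)
        for stratum, levels in seen.items()
        if all_levels - levels
    }


# Hypothetical (male, white) covariate patterns with a binary treatment.
toy = [
    ((1, 1), 0), ((1, 1), 1),
    ((1, 0), 0), ((1, 0), 1),
    ((0, 1), 1), ((0, 1), 1),  # this stratum is never observed untreated
]
violations = positivity_violations(toy)
print(violations)
```

A check like this only flags strata that appear in the data; with many or continuous covariates, near-violations usually show up instead as estimated treatment probabilities close to 0 or 1 and therefore as very large weights.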

+

Acknowledgements

Thanks to:
- Alan Hubbard, UCB
- Mark van der Laan, UCB
- Jennifer Ahern, UCB