Transcript Slide 1

Measures of association
Intermediate methods in
observational epidemiology
2008
Measures of Association
1) Measures of association based on ratios
– Cohort studies
• Relative risk (RR)
• Odds ratio (OR)
– Case control studies
• OR of exposure and OR of disease
• OR when the controls are a sample of the total population
– Prevalence ratio (or Prevalence OR) as an estimate of
the RR
2) Measures of association based on absolute
differences: attributable risk
Cohort studies
Hypothetical cohort study of the one-year incidence (q) of acute myocardial
infarction for individuals with severe systolic hypertension (HTN, ≥180 mm Hg) or
normal systolic blood pressure (<120 mm Hg).
Severe systolic HTN   Number   MI present   MI absent   Probability (q)   Odds (odds_dis)
Yes                   10000    180          9820        0.0180            0.01833
No                    10000    30           9970        0.0030            0.00301

(q_+ : incidence in the exposed; q_- : incidence in the unexposed)

$$RR = \frac{q_+}{q_-} = \frac{180/10000}{30/10000} = \frac{0.0180}{0.0030} = 6.00$$

$$OR_{dis} = \frac{q_+/(1-q_+)}{q_-/(1-q_-)} = \frac{0.0180/(1.0-0.0180)}{0.0030/(1.0-0.0030)} = \frac{180/9820}{30/9970} = \frac{0.01833}{0.00301} = 6.09$$
Severe systolic HTN   Number   MI present   MI absent   Probability (q)   Odds (odds_dis)
Yes                   10000    180 (a)      9820 (b)    0.0180            0.01833
No                    10000    30 (c)       9970 (d)    0.0030            0.00301
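The OR can also be calculated from the “cross-products ratio” if the table is organized exactly as above:

$$OR_{disease} = \frac{q_+/(1-q_+)}{q_-/(1-q_-)} = \frac{\dfrac{a}{a+b} \Big/ \left(1-\dfrac{a}{a+b}\right)}{\dfrac{c}{c+d} \Big/ \left(1-\dfrac{c}{c+d}\right)} = \frac{\dfrac{a}{a+b} \Big/ \dfrac{b}{a+b}}{\dfrac{c}{c+d} \Big/ \dfrac{d}{c+d}} = \frac{a/b}{c/d} = \frac{ad}{bc}$$

$$OR_{disease} = \frac{180 \times 9970}{9820 \times 30} = 6.09$$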
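The calculations above can be sketched in a few lines of Python. This is only an illustration: the function name `cohort_measures` and its argument order (a, b, c, d as labeled in the table) are assumptions made for this example, not part of the original slides.

```python
def cohort_measures(a, b, c, d):
    """Return (RR, OR from odds, OR from cross-products) for a 2x2 cohort table.

    a, b: exposed with / without disease
    c, d: unexposed with / without disease
    """
    q_exp = a / (a + b)      # incidence in the exposed (q+)
    q_unexp = c / (c + d)    # incidence in the unexposed (q-)
    rr = q_exp / q_unexp
    # Odds ratio as the ratio of the two disease odds q / (1 - q)
    or_from_odds = (q_exp / (1 - q_exp)) / (q_unexp / (1 - q_unexp))
    # Same quantity via the cross-products ratio ad / bc
    or_cross = (a * d) / (b * c)
    return rr, or_from_odds, or_cross


# Hypothetical cohort from the slides: severe systolic HTN and incident MI
rr, or_odds, or_cross = cohort_measures(180, 9820, 30, 9970)
print(f"RR = {rr:.2f}")
print(f"OR (ratio of odds) = {or_odds:.2f}")
print(f"OR (cross-products) = {or_cross:.2f}")
```

Both routes to the OR give the same number because the denominators (a+b) and (c+d) cancel, which is what the derivation shows algebraically.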
When (and only when) the OR is used to estimate the
RR, there is a “built-in” bias:
$$OR = \frac{q_+/(1-q_+)}{q_-/(1-q_-)} = \frac{q_+}{q_-} \times \frac{1-q_-}{1-q_+} = RR \times \text{“bias”}$$
Example:
Severe systolic HTN   Number   MI present   MI absent   Probability (q)   Odds (odds_dis)
Yes                   10000    180 (a)      9820 (b)    0.0180            0.01833
No                    10000    30 (c)       9970 (d)    0.0030            0.00301

RR = 6.0; OR = 6.09

$$OR_{dis} = 6.0 \times \frac{1-0.003}{1-0.018} = 6.09$$
IN GENERAL:
• The OR is always further away from 1.0
than the RR.
• The higher the incidence, the higher
the discrepancy.
Relationship between RR and OR
… when probability of the event (q) is low:
$$\frac{q}{1-q} \approx q$$
or, in other words, (1 − q) ≈ 1, and thus the “built-in bias” term
$$\frac{1-q_-}{1-q_+} \approx 1.0$$
and OR ≈ RR.
Example:
Severe systolic HTN   Number   MI present   MI absent
Yes                   10000    180          9820
No                    10000    30           9970

$$RR = \frac{180/10000}{30/10000} = 6.00$$

$$OR = \frac{180/9820}{30/9970} = 6.09$$

$$OR = 6.0 \times \frac{1-0.003}{1-0.018} = 6.0 \times \frac{0.997}{0.982} = 6.09$$
Relationship between RR and OR
… when probability of the event (q) is high:
Example:
Cohort study of the one-year recurrence of acute myocardial infarction
(MI) among MI survivors with severe systolic hypertension (HTN, ≥180
mm Hg) and normal systolic blood pressure (<120 mm Hg).
Severe systolic HTN   Number   Recurrent MI present   Recurrent MI absent   q
Yes                   10000    3600                   6400                  0.36
No                    10000    600                    9400                  0.06

$$RR = \frac{3600/10000}{600/10000} = 6.00$$

$$OR = \frac{3600/6400}{600/9400} = 8.81$$

$$OR = 6.0 \times \frac{1-0.06}{1-0.36} = 6.0 \times \frac{0.94}{0.64} = 8.81$$
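To see how the “built-in bias” term behaves at low and high incidence, the two examples can be put side by side. The sketch below is illustrative only; the helper name `rr_bias_or` is an assumption made for this example.

```python
def rr_bias_or(q_exp, q_unexp):
    """Return (RR, bias term, OR), using OR = RR * (1 - q-) / (1 - q+)."""
    rr = q_exp / q_unexp
    bias = (1 - q_unexp) / (1 - q_exp)
    return rr, bias, rr * bias


examples = [
    ("incident MI (low q)", 0.018, 0.003),
    ("recurrent MI (high q)", 0.36, 0.06),
]
results = {}
for label, q_exp, q_unexp in examples:
    rr, bias, odds_ratio = rr_bias_or(q_exp, q_unexp)
    results[label] = (rr, bias, odds_ratio)
    print(f"{label}: RR = {rr:.2f}, bias term = {bias:.3f}, OR = {odds_ratio:.2f}")
```

With the same RR of 6.0 in both settings, the bias term is close to 1 when the event is rare and clearly above 1 when it is common, so the OR moves further from the RR.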
OR vs. RR: Advantages
• OR can be estimated from logistic regression.
• OR can be estimated from a case-control study
Case-control studies
A) Odds ratio of exposure and odds ratio of disease
Hypothetical cohort study of the one-year incidence of acute
myocardial infarction for individuals with severe systolic hypertension
(HTN, ≥180 mm Hg) and normal systolic blood pressure (<120 mm Hg).

Severe systolic HTN   Number   MI present   MI absent
Yes                   10000    180          9820
No                    10000    30           9970

$$OR_{dis} = \frac{\text{Odds}_{dis,\ exp}}{\text{Odds}_{dis,\ non\text{-}exp}} = \frac{180/9820}{30/9970} = 6.09$$
Hypothetical case-control study assuming that all members of the
cohort (cases and non cases) were identified
Severe systolic HTN   Cases   Controls
Yes                   180     9820
No                    30      9970

$$OR_{exp} = \frac{\text{Odds}_{exp,\ cases}}{\text{Odds}_{exp,\ non\text{-}cases}} = \frac{180/30}{9820/9970} = 6.09$$

This is the same value as the OR of disease.
Retrospective (case-control) studies can estimate the OR of disease
because:
ORexposure = ORdisease
Because ORexp = ORdis, interpretation of the OR is always “prospective”.
Calculation of the Odds Ratios: Example of Use of
Salicylates and Reye’s Syndrome
Past use of salicylates   Cases   Controls
Yes                       26      53
No                        1       87
Total                     27      140

Odds ratio: (26/1) ÷ (53/87) = 42.7 ≈ 43
Preferred Interpretation: Children using salicylates have an odds (≈risk) of
Reye’s syndrome 43 times higher than that of non-users.
Another interpretation (less useful): Odds of past salicylate use is 43
times greater in cases than in controls.
(Hurwitz et al, 1987, cited by Lilienfeld & Stolley, 1994)
Cohort study:
Severe systolic HTN   Number   MI present   MI absent
Yes                   10000    180          9820
No                    10000    30           9970

$$OR_{dis} = \frac{\text{Odds}_{dis,\ exp}}{\text{Odds}_{dis,\ unexp}} = \frac{180/9820}{30/9970} = 6.09$$
In a retrospective (case-control) study, an unbiased sample of the cases and
controls yields an unbiased OR.
It is not necessary that the sampling fraction be the same in both cases and
controls. For example, a majority of cases (e.g., 90%) and a small sample of controls
(e.g., 20%) could be chosen (assume no random variability).
(As cases are less frequent, the sampling fraction for cases is usually greater than that
for controls).
Severe systolic HTN   Cases               Controls
Yes                   162                 1964
No                    27                  1994
Total                 210 x 0.9 = 189     19790 x 0.2 = 3958

$$OR_{exp} = \frac{\text{Odds}_{exp}\text{ in cases}}{\text{Odds}_{exp}\text{ in controls}} = \frac{162/27}{1964/1994} = 6.09$$
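The point that the sampling fractions need not be equal can be checked numerically. The following sketch assumes, as the slide does, no random variability, so each cell is simply scaled by its sampling fraction; the names `CASES`, `NON_CASES` and `sampled_exposure_or` are invented for this illustration.

```python
# Full cohort counts from the hypothetical study in the text
CASES = {"exposed": 180, "unexposed": 30}
NON_CASES = {"exposed": 9820, "unexposed": 9970}


def sampled_exposure_or(case_fraction, control_fraction):
    """Exposure OR after sampling cases and controls at the given fractions.

    No random variability is assumed: every cell is multiplied by the
    sampling fraction of its column (cases or controls).
    """
    exp_cases = CASES["exposed"] * case_fraction
    unexp_cases = CASES["unexposed"] * case_fraction
    exp_controls = NON_CASES["exposed"] * control_fraction
    unexp_controls = NON_CASES["unexposed"] * control_fraction
    return (exp_cases / unexp_cases) / (exp_controls / unexp_controls)


or_full = sampled_exposure_or(1.0, 1.0)    # everyone included
or_slide = sampled_exposure_or(0.9, 0.2)   # 90% of cases, 20% of controls
or_other = sampled_exposure_or(0.5, 0.05)  # another arbitrary pair of fractions
print(f"{or_full:.2f} {or_slide:.2f} {or_other:.2f}")
```

The sampling fraction cancels within the exposure odds of the cases and, separately, within the exposure odds of the controls, so the OR is unchanged as long as sampling does not depend on exposure.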
Case-control studies
B) OR when controls are a sample of the total population
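Risk factor   CASES   NON-CASES   TOTAL POPULATION
Present       a       b           a+b
Absent        c       d           c+d

When the controls are non-cases:

$$OR_{exp} = \frac{\text{Odds}_{exp,\ cases}}{\text{Odds}_{exp,\ non\text{-}cases}} = \frac{a/c}{b/d}$$

When the controls are the total population:

$$OR_{exp} = \frac{\text{Odds}_{exp,\ cases}}{\text{Odds}_{exp,\ population}} = \frac{a/c}{(a+b)/(c+d)} = \frac{a/(a+b)}{c/(c+d)} = RR$$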
In a case-control study, when the control group is a sample of the total
population (rather than only of the non-cases), the odds ratio of
exposure is an unbiased estimate of the RELATIVE RISK
Example:
Hypothetical cohort study of the one-year recurrence of acute myocardial
infarction (MI) among MI survivors with severe systolic hypertension (HTN,
≥180 mm Hg) or normal systolic blood pressure (<120 mm Hg).
Severe systolic HTN   Recurrent MI present   Recurrent MI absent   Total population
Yes                   3600                   6400                  10000
No                    600                    9400                  10000

$$RR = \frac{3600/10000}{600/10000} = 6.00$$
• Using a traditional case-control strategy, cases of recurrent MI are compared to non-cases, i.e., individuals without recurrent MI:

$$OR_{exp} = \frac{3600/600}{6400/9400} = 8.81 = OR_{dis}$$

• Using a case-cohort strategy, the controls are formed by the total population:

$$OR_{exp} = \frac{3600/600}{10000/10000} = \frac{3600/10000}{600/10000} = 6.00 = RR$$
Note that it is not necessary to have a total group of cases and non-cases or the total
population to assess an association in a case-control study. What is needed is a sample
estimate of cases and either non-cases (to obtain the odds ratio of disease) or the total
population (to obtain the relative risk). Example: samples of 20% cases and 10% total
population:
$$OR_{exp} = \frac{720/120}{1000/1000} = 6.0 = RR$$
Thus… RR= unbiased exposure odds estimate in cases divided by
unbiased exposure odds estimate in the total population.
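The contrast between the two choices of control group can be reproduced with the recurrent-MI numbers. This is a minimal sketch that assumes exact sampling (no random variability); the dictionary layout and the helper `exposure_odds` are assumptions made for the example.

```python
cases = {"exposed": 3600, "unexposed": 600}
non_cases = {"exposed": 6400, "unexposed": 9400}
# Total population = cases + non-cases within each exposure category
population = {k: cases[k] + non_cases[k] for k in cases}


def exposure_odds(group):
    """Odds of exposure (exposed / unexposed) in a group."""
    return group["exposed"] / group["unexposed"]


def sample(group, fraction):
    """Scale every cell by a sampling fraction (no random variability)."""
    return {k: v * fraction for k, v in group.items()}


# Traditional case-control: controls are non-cases -> OR of disease
or_traditional = exposure_odds(cases) / exposure_odds(non_cases)

# Case-cohort: controls are a sample of the total population -> RR
or_case_cohort = exposure_odds(sample(cases, 0.20)) / exposure_odds(sample(population, 0.10))

rr = (3600 / 10000) / (600 / 10000)
print(f"OR vs non-cases = {or_traditional:.2f}")
print(f"OR vs population = {or_case_cohort:.2f}, RR = {rr:.2f}")
```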
To summarize, in a case-control study:
What is the control group?       What is calculated?                                   To obtain ...
Sample of NON-CASES              OR_exp = Odds_exp cases / Odds_exp non-cases          OR_disease
Sample of the TOTAL POPULATION   OR_exp = Odds_exp cases / Odds_exp total population   RR
How to calculate the OR when there are
more than two exposure categories
Example:
Univariate analysis of the relationship between parity and
eclampsia.*
Parity        Cases   Controls
2 or more     11      40
1             21      27
Nulliparous   68      33

* Abi-Said et al: Am J Epidemiol 1995;142:437-41.

[Figure: bar chart of the OR (linear scale, axis from 0 to 8) by number of pregnancies (2+, 1, nulliparous).]

Parity        OR
2 or more     1.0 (Reference)
1             (21/11) ÷ (27/40) = 2.8
Nulliparous   (68/11) ÷ (33/40) = 7.5
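With more than two exposure categories, each OR is the exposure odds of a category relative to the reference category in cases, divided by the same odds in controls. A minimal sketch using the parity counts above follows; the function name `odds_ratios_vs_reference` and the data layout are assumptions for illustration.

```python
# (cases, controls) for each parity category, from the table above
counts = {
    "2 or more": (11, 40),
    "1": (21, 27),
    "Nulliparous": (68, 33),
}


def odds_ratios_vs_reference(counts, reference):
    """OR of each category against the reference category."""
    ref_cases, ref_controls = counts[reference]
    ors = {}
    for category, (n_cases, n_controls) in counts.items():
        # (cases in category / cases in reference) divided by
        # (controls in category / controls in reference)
        ors[category] = (n_cases / ref_cases) / (n_controls / ref_controls)
    return ors


ors = odds_ratios_vs_reference(counts, reference="2 or more")
for category, value in ors.items():
    print(f"{category}: OR = {value:.1f}")
```

The reference category always has OR = 1.0 by construction, which is why 1.0 (not 0) is the natural baseline when the ORs are plotted.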
How to calculate the OR when there are
more than two exposure categories
Example:
Univariate analysis of the relationship between parity and
eclampsia.*
Parity        Cases   Controls   OR
2 or more     11      40         1.0
1             21      27         2.8
Nulliparous   68      33         7.5

* Abi-Said et al: Am J Epidemiol 1995;142:437-41.

Correct display:
[Figure: OR plotted on a log scale (axis from 1 to 10) by number of pregnancies (2+, 1, nulliparous). Baseline is 1.0.]

$\chi^2_1$ for linear trend = 29.215, p < 0.0001
A note on the use of estimates from a
cross-sectional study (prevalence ratio, OR) to estimate the RR
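(P: prevalence; I: incidence; D: duration of the disease; subscripts + and − denote exposed and unexposed)

Prevalence odds:
$$\frac{P}{1-P} = I \times D \quad\Rightarrow\quad P = I \times D \times (1-P)$$

Prevalence ratio:
$$\frac{P_+}{P_-} = \frac{I_+}{I_-} \times \frac{D_+}{D_-} \times \frac{1-P_+}{1-P_-}$$

If the prevalence is low (~≤5%):
$$\frac{1-P_+}{1-P_-} \approx 1$$

If the ratio $D_+/D_-$ = 1.0, the duration (prognosis) of the disease after onset is independent of exposure (similar in exposed and unexposed), and
$$\frac{P_+}{P_-} \approx \frac{I_+}{I_-}$$

However, if exposure is also associated with shorter survival (D+ < D−), then D+/D− < 1 and the prevalence ratio will underestimate the RR:
$$\frac{P_+}{P_-} < \frac{I_+}{I_-}$$

Example? Smoking and emphysema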
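A small numerical sketch can make the role of disease duration concrete. The incidences and durations below are hypothetical values chosen only for illustration (they are not from the slides), and the code assumes the steady-state relation P/(1−P) = I × D stated above; the function names are also invented for this example.

```python
def prevalence(incidence, duration):
    """Point prevalence from the steady-state relation P / (1 - P) = I * D."""
    prevalence_odds = incidence * duration
    return prevalence_odds / (1 + prevalence_odds)


def prevalence_ratio(i_exp, i_unexp, d_exp, d_unexp):
    """Prevalence ratio P+ / P- for exposed versus unexposed."""
    return prevalence(i_exp, d_exp) / prevalence(i_unexp, d_unexp)


# Hypothetical yearly incidences giving RR = 2.0
i_exp, i_unexp = 0.02, 0.01
rr = i_exp / i_unexp

# Same duration (4 years) in exposed and unexposed: PR is close to the RR
pr_same_duration = prevalence_ratio(i_exp, i_unexp, d_exp=4, d_unexp=4)

# Exposed cases survive half as long (2 vs 4 years): PR underestimates the RR
pr_shorter_survival = prevalence_ratio(i_exp, i_unexp, d_exp=2, d_unexp=4)

print(f"RR = {rr:.2f}")
print(f"PR, equal duration = {pr_same_duration:.2f}")
print(f"PR, shorter survival in exposed = {pr_shorter_survival:.2f}")
```

Even with equal durations the prevalence ratio falls slightly below the RR because of the (1−P+)/(1−P−) term; with shorter survival among the exposed the underestimation becomes substantial.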
Measures of association based on absolute differences
(absolute measures of “effect”)
• Attributable risk in the exposed: the excess risk (e.g., incidence) among individuals exposed to a certain risk factor that can be attributed to the risk factor per se:

$$AR_{exp} = q_+ - q_- = \frac{20}{1000} - \frac{10}{1000} = 10/1000$$

Or, expressed as a proportion (e.g., percentage):

$$\%AR_{exp} = \frac{q_+ - q_-}{q_+} \times 100 = \frac{20/1000 - 10/1000}{20/1000} \times 100 = 50\%$$

Alternative formula for the %ARexp:

$$\%AR_{exp} = \frac{RR - 1}{RR} \times 100 = \frac{2.0 - 1.0}{2.0} \times 100 = 50\%$$

[Figure: bar chart of incidence (per 1000) in the unexposed (10/1000) and the exposed (20/1000); the difference between the two bars is marked as ARexp.]
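The two formulas for the percent attributable risk in the exposed can be checked against each other. The sketch below uses the incidences from the example; the function name `attributable_risk_exposed` is an assumption for illustration.

```python
def attributable_risk_exposed(q_exp, q_unexp):
    """Return (AR_exp, %AR_exp from incidences, %AR_exp from the RR)."""
    ar = q_exp - q_unexp                  # absolute excess risk
    pct_from_q = ar / q_exp * 100         # (q+ - q-) / q+ * 100
    rr = q_exp / q_unexp
    pct_from_rr = (rr - 1) / rr * 100     # (RR - 1) / RR * 100
    return ar, pct_from_q, pct_from_rr


ar, pct_from_q, pct_from_rr = attributable_risk_exposed(20 / 1000, 10 / 1000)
print(f"AR_exp = {ar * 1000:.0f} per 1000")
print(f"%AR_exp = {pct_from_q:.0f}% (from incidences), {pct_from_rr:.0f}% (from RR)")
```

The two percentages agree because dividing numerator and denominator of (q+ − q−)/q+ by q− gives (RR − 1)/RR.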
• Population attributable risk: the excess risk in the population that can be attributed to a given risk factor. Usually expressed as a percentage:

$$\%PopAR = \frac{q_{pop} - q_-}{q_{pop}} \times 100$$

The Pop AR will depend not only on the RR, but also on the prevalence of the risk factor ($p_e$).

Levin’s formula:

$$\%PopAR = \frac{p_e (RR - 1)}{p_e (RR - 1) + 1} \times 100$$

(Levin: Acta Un Intern Cancer 1953;9:531-41)

[Figure: two bar charts of incidence (per 1000) in the unexposed, the total population, and the exposed, one for low exposure prevalence and one for high exposure prevalence, with ARexp and Pop AR marked in each panel.]
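How the population attributable risk changes with exposure prevalence can be sketched as follows. The prevalences 0.1 and 0.8 are hypothetical values picked for illustration, the RR of 2.0 and baseline incidence of 10/1000 reuse the earlier example, and the function names are assumptions. The second function computes the same quantity directly from the definition, as a check on Levin's formula.

```python
def levin_pct_pop_ar(p_e, rr):
    """Levin's formula: p_e (RR - 1) / [p_e (RR - 1) + 1] * 100."""
    excess = p_e * (rr - 1)
    return excess / (excess + 1) * 100


def direct_pct_pop_ar(p_e, q_unexp, rr):
    """%PopAR from the definition (q_pop - q-) / q_pop * 100."""
    q_exp = rr * q_unexp
    # Population incidence is the prevalence-weighted average of q+ and q-
    q_pop = p_e * q_exp + (1 - p_e) * q_unexp
    return (q_pop - q_unexp) / q_pop * 100


rr, q_unexp = 2.0, 10 / 1000
pop_ar = {}
for label, p_e in [("low exposure prevalence", 0.1), ("high exposure prevalence", 0.8)]:
    pop_ar[label] = (levin_pct_pop_ar(p_e, rr), direct_pct_pop_ar(p_e, q_unexp, rr))
    print(f"{label} (p_e = {p_e}): %PopAR = {pop_ar[label][0]:.1f}%")
```

The RR, and therefore the %ARexp, is identical in both scenarios; only the share of the population that is exposed differs.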
Chu SP et al. Risk factors for proximal humerus fracture. Am J Epi 2004; 160:360-367
Cases: 448 incident cases identified at Kaiser Permanente. 45+ yrs old, identified through
radiology reports and outpatient records, confirmed by radiography, bone scan or MRI.
Pathologic fractures excluded (e.g., metastatic cancer).
Controls: 2,023 controls sampled from Kaiser Permanente membership (random sample).
Dietary Calcium (mg/day)    Odds Ratio (95% CI)
Highest quartile (≥970)     1.0 (reference)
Third quartile (771-969)    1.36 (0.96, 1.91)
Second quartile (496-770)   1.11 (0.81, 1.52)
Lowest quartile (≤495)      1.54 (1.14, 2.07)
What is the %AR in those exposed to the lowest quartile?
(Second and third quartiles: OR more or less 1.0.)

$$\text{Percent } AR_{exposed} = \frac{RR - 1}{RR} \times 100 \approx \frac{OR - 1}{OR} \times 100 = \frac{1.54 - 1}{1.54} \times 100 = 35\%$$
Interpretation: If those exposed to values in the lowest
quartile had been exposed to other values, their odds
(risk) would have been 35% lower.
What is the Percent AR in the total population due to exposure in the lowest quartile?
Levin’s formula for the Percent ARpopulation:

$$\text{Percent Population AR} = \frac{p_{exp}(RR - 1)}{p_{exp}(RR - 1) + 1} \times 100 \approx \frac{p_{exp}(OR - 1)}{p_{exp}(OR - 1) + 1} \times 100$$

RR estimate ~ 1.54
$p_{exp}$ ~ 0.25

$$\frac{0.25\,(1.54 - 1)}{0.25\,(1.54 - 1) + 1} \times 100 = 11.9\%$$
Interpretation: Exposure to the lowest quartile is responsible for about 12% of the total
incidence of humerus fracture in the Kaiser Permanente population.
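Both percentages in this worked example can be reproduced with the OR standing in for the RR. This is a sketch under the assumptions stated on the slides (OR of 1.54 taken as the RR estimate, exposure prevalence of about 0.25 for a quartile); the function names are invented for this example.

```python
def pct_ar_exposed(ratio):
    """Percent AR in the exposed: (ratio - 1) / ratio * 100."""
    return (ratio - 1) / ratio * 100


def pct_ar_population(p_exp, ratio):
    """Levin's formula with the OR used as the RR estimate."""
    excess = p_exp * (ratio - 1)
    return excess / (excess + 1) * 100


or_lowest_quartile = 1.54   # OR for the lowest calcium quartile, used as RR estimate
p_exp = 0.25                # approximate prevalence of being in the lowest quartile

pct_exposed = pct_ar_exposed(or_lowest_quartile)
pct_population = pct_ar_population(p_exp, or_lowest_quartile)
print(f"Percent AR in the exposed: {pct_exposed:.1f}%")
print(f"Percent population AR: {pct_population:.1f}%")
```

Substituting the OR for the RR is reasonable here only to the extent that the outcome is rare enough for the OR to approximate the RR, as discussed earlier in the lecture.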