Review of observational study
design and basic statistics for
contingency tables
Coffee Chronicles

BY MELISSA AUGUST, ANN MARIE BONARDI, VAL CASTRONOVO, MATTHEW
JOE'S BLOWS
Last week researchers reported that coffee might help prevent
Parkinson's disease. So is the caffeine bean good for you or not? Over the years,
studies haven't exactly been clear:

According to scientists, too much coffee may cause...
 1986 --phobias, --panic attacks
 1990 --heart attacks, --stress, --osteoporosis
 1991 --underweight babies, --hypertension
 1992 --higher cholesterol
 1993, 08 --miscarriages
 1994 --intensified stress
 1995 --delayed conception
But scientists say coffee also may help prevent...
 1988 --asthma
 1990 --colon and rectal cancer,...
 2004—Type II Diabetes (*6 cups per day!)
 2006—alcohol-induced liver damage
 2007—skin cancer
Medical Studies
The General Idea…
Evaluate whether a risk factor (or preventative factor)
increases (or decreases) your risk for an outcome (usually
disease, death or intermediary to disease).
[Diagram: Exposure → ? → Disease]
Observational vs.
Experimental Studies
Observational studies – the population is observed without
any interference by the investigator
Experimental studies – the investigator tries to control the
environment in which the hypothesis is tested (the
randomized, double-blind clinical trial is the gold standard)
Limitation of observational
research: confounding

Confounding: risk factors don’t happen in isolation, except
in a controlled experiment.
– Example: In a case-control study of a salmonella outbreak,
tomatoes were identified as the source of the infection. But the
association was spurious. Tomatoes are often eaten with serrano
and jalapeno peppers, which turned out to be the true source of
infection.
– Example: Breastfeeding has been linked to higher IQ in infants,
but the association could be due to confounding by socioeconomic
status. Women who breastfeed tend to be better educated and have
better prenatal care, which may explain the higher IQ in their
infants.
Confounding: A major problem
for observational studies
[Diagram: Exposure → ? → Disease, with a Confounder connected to both]
Why Observational Studies?

Cheaper
 Faster
 Can examine long-term effects
 Hypothesis-generating
 Sometimes, experimental studies are not
ethical (e.g., randomizing subjects to
smoke)
Possible Observational
Study Designs
Cross-sectional studies
Cohort studies
Case-control studies
Cross-Sectional (Prevalence)
Studies
Measure disease and exposure on a random sample of
the population of interest. Are they associated?
Marginal probabilities of exposure AND disease are valid, but the design
measures association at only a single time point.
The 2x2 Table
                   Exposure (E)      No Exposure (~E)
Disease (D)        a                 b                   (a+b)/T = P(D)
No Disease (~D)    c                 d                   (c+d)/T = P(~D)
                   (a+c)/T = P(E)    (b+d)/T = P(~E)     T = a+b+c+d (total, N)
Column margins: marginal probability of exposure.
Row margins: marginal probability of disease.
Example: cross-sectional
study

Relationship between atherosclerosis and late-life
depression (Tiemeier et al. Arch Gen Psychiatry, 2004).

Methods: Researchers measured the
prevalence of coronary artery calcification
(atherosclerosis) and the prevalence of
depressive symptoms in a large cohort of
elderly men and women in Rotterdam
(n=1920).
Example: cross-sectional
study
P(“D”)= Prevalence of depression (subthreshold or depressive disorder)
(20+13+12+9+11+16)/1920 = 4.2%
P(“E”)= Prevalence of atherosclerosis (coronary calcification >500):
(511+12+16)/1920 = 28.1%
The 2x2 table:
                  Coronary calc >500   Coronary calc <=500   Total
Any depression    28                   53                    81
None              511                  1328                  1839
Total             539                  1381                  1920
P(depression)= 81/1920 = 4.2%
P(atherosclerosis) = 539/1920 = 28.1%
P(depression/atherosclerosis) = 28/539 = 5.2%
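The three probabilities above can be reproduced directly from the four cell counts. The following is a minimal sketch; the variable names are illustrative and not from the original slides.

```python
# Cell counts from the 2x2 table above (columns: calc >500, calc <=500).
a, b = 28, 53        # any depression
c, d = 511, 1328     # no depression
total = a + b + c + d

p_depression = (a + b) / total        # marginal probability of disease
p_athero = (a + c) / total            # marginal probability of exposure
p_dep_given_athero = a / (a + c)      # conditional probability of disease given exposure

print(f"P(depression) = {p_depression:.1%}")
print(f"P(atherosclerosis) = {p_athero:.1%}")
print(f"P(depression/atherosclerosis) = {p_dep_given_athero:.1%}")
```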
Difference of proportions Z-test:
$$p_{\text{depression/atherosclerosis}} = \frac{28}{539} = .052; \quad p_{\text{depression/unblocked}} = \frac{53}{1381} = .038$$
$$Z = \frac{\text{difference}}{\text{s.e.(difference)}} = \frac{.052 - .038}{\sqrt{\frac{(.042)(1-.042)}{539} + \frac{(.042)(1-.042)}{1381}}} = \frac{.014}{.0101} = 1.33 \text{ (from unrounded values)}; \quad p = .18$$
Here .042 is the pooled proportion with depression, 81/1920.
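The Z-test above can be checked numerically. This is a sketch of the pooled two-proportion Z-test using only the standard library; the function name is hypothetical, and the two-sided p-value is taken from the normal tail via `math.erfc`.

```python
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sample Z-test for a difference in proportions (two-sided)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)                       # 81/1920 for the slide data
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))           # two-sided normal tail area
    return z, p_value

# 28 of 539 with calcification >500 vs. 53 of 1381 with calcification <=500.
z, p_value = two_proportion_z_test(28, 539, 53, 1381)
print(f"Z = {z:.2f}, p = {p_value:.2f}")
```

Working from the raw counts rather than the rounded proportions avoids the small rounding drift visible in the hand calculation.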
Or, use relative risk (risk ratio):
$$RR = \frac{.052}{.038} = 1.37; \quad 95\%\ CI = (0.86, 2.19)$$
Interpretation: those with coronary calcification are 37%
more likely to have depression (not significant).
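As a rough check, the risk ratio and a 95% interval can be computed from the counts. This sketch assumes the usual large-sample standard error of ln(RR), which the slides do not state. From unrounded counts the ratio comes out near 1.35 with an interval of roughly (0.87, 2.12), slightly different from the figures quoted above. The quoted 1.37 and (0.86, 2.19) appear to match what the odds-ratio formula gives for the same table.

```python
import math

def risk_ratio_ci(a, b, c, d, z_crit=1.96):
    """Risk ratio with a Wald interval on the log scale.

    a, c: diseased / not diseased among the exposed
    b, d: diseased / not diseased among the unexposed
    """
    risk_exposed = a / (a + c)
    risk_unexposed = b / (b + d)
    rr = risk_exposed / risk_unexposed
    # Assumed large-sample standard error of ln(RR).
    se_log_rr = math.sqrt(1 / a - 1 / (a + c) + 1 / b - 1 / (b + d))
    lower = math.exp(math.log(rr) - z_crit * se_log_rr)
    upper = math.exp(math.log(rr) + z_crit * se_log_rr)
    return rr, lower, upper

rr, lower, upper = risk_ratio_ci(28, 53, 511, 1328)
print(f"RR = {rr:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Either way the interval includes 1.0, so the conclusion (not significant) is unchanged.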
Or, use chi-square test:
Observed:
                  Coronary calc >500   Coronary calc <=500   Total
Any depression    28                   53                    81
None              511                  1328                  1839
Total             539                  1381                  1920
Expected:
                  Coronary calc >500    Coronary calc <=500    Total
Any depression    539*81/1920 = 22.7    81-22.7 = 58.3         81
None              539-22.7 = 516.3      1381-58.3 = 1322.7     1839
Total             539                   1381                   1920
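Chi-square test:
$$\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$
$$\chi^2_1 = \frac{(28-22.7)^2}{22.7} + \frac{(53-58.3)^2}{58.3} + \frac{(511-516.3)^2}{516.3} + \frac{(1328-1322.7)^2}{1322.7} = 1.77$$
p = .18
Note: 1.77 = 1.33^2 (the chi-square statistic with 1 degree of freedom is the square of the Z statistic)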
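The expected counts and the chi-square statistic for the 2x2 table can be sketched as follows. The function name is illustrative. The p-value uses the fact that a chi-square variable with 1 degree of freedom is a squared standard normal, so this shortcut applies only to 2x2 tables.

```python
import math

def chi_square_2x2(observed):
    """Pearson chi-square for a 2x2 table given as [[a, b], [c, d]]."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (obs - expected) ** 2 / expected
    # 1 df only: upper-tail area equals the two-sided normal tail at sqrt(chi2).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

chi2, p_value = chi_square_2x2([[28, 53], [511, 1328]])
print(f"chi-square = {chi2:.2f}, p = {p_value:.2f}")
```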
Chi-square test also works for
bigger contingency tables (RxC):
Observed:
Coronary         No            Subthreshold           Clinical
calcification    depression    depressive symptoms    depressive disorder    Total
0-100            865           20                     9                      894
101-500          463           13                     11                     487
>500             511           12                     16                     539
Total            1839          45                     36                     1920
Expected:
Coronary         No                            Subthreshold            Clinical
calcification    depression                    depressive symptoms     depressive disorder
0-100            894*1839/1920 = 856.3         894*45/1920 = 21        894-(21+856.3) = 16.7
101-500          487*1839/1920 = 466.5         487*45/1920 = 11.4      487-(466.5+11.4) = 9.1
>500             1839-(856.3+466.5) = 516.2    45-(21+11.4) = 12.6     36-(16.7+9.1) = 10.2
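Chi-square test:
$$\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$
$$\chi^2_4 = \frac{(865-856.3)^2}{856.3} + \frac{(20-21)^2}{21} + \frac{(9-16.7)^2}{16.7} + \frac{(463-466.5)^2}{466.5} + \frac{(13-11.4)^2}{11.4} + \frac{(11-9.1)^2}{9.1} + \frac{(511-516.2)^2}{516.2} + \frac{(12-12.6)^2}{12.6} + \frac{(16-10.2)^2}{10.2} = 7.877$$
p = .096
The test has (3-1)(3-1) = 4 degrees of freedom.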
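The same calculation generalizes to any RxC table. The sketch below uses illustrative function names. For the p-value it relies on a closed form for the chi-square upper tail that holds only for an even number of degrees of freedom; that restriction is a simplification made here, not something from the slides.

```python
import math

def chi_square_rxc(observed):
    """Pearson chi-square statistic and degrees of freedom for an RxC table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    chi2 = sum(
        (obs - row_totals[i] * col_totals[j] / total) ** 2
        / (row_totals[i] * col_totals[j] / total)
        for i, row in enumerate(observed)
        for j, obs in enumerate(row)
    )
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df

def chi2_upper_tail_even_df(x, df):
    """P(chi-square_df > x); this closed form is valid only when df is even."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))

observed = [
    [865, 20, 9],    # calcification 0-100
    [463, 13, 11],   # calcification 101-500
    [511, 12, 16],   # calcification >500
]
chi2, df = chi_square_rxc(observed)
p_value = chi2_upper_tail_even_df(chi2, df)
print(f"chi-square = {chi2:.3f}, df = {df}, p = {p_value:.3f}")
```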
Cause and effect?
[Diagram: atherosclerosis and depression in elderly, joined by two questioned arrows labeled "Biological changes" and "Lack of exercise / Poor Eating"]
Confounding?
[Diagram: the same two questioned arrows, with "Advancing Age" added as a possible common cause]
Cross-Sectional Studies

Advantages:
– cheap and easy
– generalizable
– good for characteristics that (generally) don’t change
like genes or gender

Disadvantages
– difficult to determine cause and effect
– problematic for rare diseases and exposures
2. Cohort studies:
Sample on exposure status and track
disease development (for rare exposures)

Marginal probabilities (and rates) of developing
disease for exposure groups are valid.
Example: The Framingham
Heart Study

The Framingham Heart Study was established in
1948, when 5209 residents of Framingham, Mass,
aged 28 to 62 years, were enrolled in a prospective
epidemiologic cohort study.
 Health and lifestyle factors were measured (blood
pressure, weight, exercise, etc.).
 Interim cardiovascular events were ascertained
from medical histories, physical examinations,
ECGs, and review of interim medical record.
Example 2: Johns Hopkins Precursors Study
(medical students 1948 through 1964)
http://www.jhu.edu/~jhumag/0601web/study.html
From the Johns Hopkins Magazine website (URL above).
Cohort Studies
[Diagram: Target population → Disease-free cohort → Exposed and Not Exposed groups, each followed over TIME to Disease or Disease-free]
The Risk Ratio, or Relative Risk (RR)
                   Exposure (E)   No Exposure (~E)
Disease (D)        a              b
No Disease (~D)    c              d
Total              a+c            b+d
$$RR = \frac{P(D/E)}{P(D/\sim E)} = \frac{a/(a+c)}{b/(b+d)} = \frac{\text{risk to the exposed}}{\text{risk to the unexposed}}$$
Hypothetical Data
                            High Systolic BP   Normal BP
Congestive Heart Failure    400                400
No CHF                      1100               2600
Total                       1500               3000
$$RR = \frac{400/1500}{400/3000} = 2.0$$

Advantages/Limitations:
Cohort Studies
Advantages:
– Allows you to measure true rates and risks of disease
for the exposed and the unexposed groups.
– Temporality is correct (easier to infer cause and effect).
– Can be used to study multiple outcomes.
– Prevents bias in the ascertainment of exposure that may
occur after a person develops a disease.

Disadvantages:
– Can be lengthy and costly! 60 years for Framingham.
– Loss to follow-up is a problem (especially if non-
random).
– Selection Bias: Participation may be associated with
exposure status for some exposures
Case-Control Studies
Sample on disease status and ask
retrospectively about exposures (for rare
diseases)
 Marginal probabilities of exposure for cases and
controls are valid.
• Doesn’t require knowledge of the absolute risks of disease
• For rare diseases, can approximate relative risk
Case-Control Studies
[Diagram: Target population → Disease (Cases) and No Disease (Controls); each group is classified as Exposed in past or Not exposed]
Example: the AIDS epidemic
in the early 1980’s

Early case-control studies among AIDS cases and
matched controls indicated that AIDS was
transmitted by sexual contact or blood products.
 In 1982, an early case-control study matched
AIDS cases to controls and found a positive
association between amyl nitrites (“poppers”) and
AIDS; odds ratio of 8.6 (Marmor et al. 1982).
This is an example of confounding.
Case-Control Studies in
History

In 1843, Guy compared occupations of men with
pulmonary consumption to those of men with
other diseases (Lilienfeld and Lilienfeld 1979).
 Case-control studies identified associations
between lip cancer and pipe smoking (Broders
1920), breast cancer and reproductive history
(Lane-Claypon 1926) and between oral cancer and
pipe smoking (Lombard and Doering 1928). All
rare diseases.
 Case-control studies identified an association
between smoking and lung cancer in the 1950’s.
Case-control example

A study of the relation between body mass index and
the incidence of age-related macular degeneration
(Moeini et al. Br. J. Ophthalmol, 2005).

Methods: Researchers compared 50 Iranian
patients with confirmed age-related macular
degeneration and 80 control subjects with respect
to BMI, smoking habits, hypertension, and
diabetes. The researchers were specifically
interested in the relationship of BMI to age-related
macular degeneration.
Results
Table 2 Comparison of body mass index (BMI) in case and control
groups
                             Case n = 50 (%)   Control n = 80 (%)   p Value
Lean BMI <20                 7 (14)            6 (7.5)              NS
Normal 20 ≤ BMI <25          16 (32)           20 (25)              NS
Overweight 25 ≤ BMI <30      21 (42)           36 (45)              NS
Obese BMI ≥ 30               6 (12)            18 (22.5)            NS
NS, not significant.
Corresponding 2x2 Table
           Overweight   Normal   Total
ARMD       27           23       50
No ARMD    54           26       80
What is the risk ratio here?
Tricky: There is no risk ratio, because we
cannot calculate the risk of disease!!
The odds ratio…

We cannot calculate a risk ratio from a case-control study.

BUT, we can calculate a measure called the odds ratio…
Odds vs. Risk
If the risk is…    Then the odds are…
½ (50%)            1:1
¾ (75%)            3:1
1/10 (10%)         1:9
1/100 (1%)         1:99
Note: An odds is always higher than its corresponding probability,
unless the probability is 0% (at 100% the odds is undefined).
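The conversion between risk and odds is odds = risk / (1 - risk). A small sketch using exact fractions reproduces the pairs listed above; the helper names are illustrative.

```python
from fractions import Fraction

def risk_to_odds(risk):
    """Convert a probability to odds; undefined when risk equals 1."""
    return risk / (1 - risk)

def odds_to_risk(odds):
    """Convert odds back to a probability."""
    return odds / (1 + odds)

for risk in (Fraction(1, 2), Fraction(3, 4), Fraction(1, 10), Fraction(1, 100)):
    odds = risk_to_odds(risk)
    print(f"risk {risk} -> odds {odds.numerator}:{odds.denominator}")
```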
The Odds Ratio (OR)
                   Exposure (E)   No Exposure (~E)
Disease (D)        a              b                  a+b = cases
No Disease (~D)    c              d                  c+d = controls
$$OR = \frac{\text{odds of exposure in the cases}}{\text{odds of exposure in the controls}} = \frac{P(E/D)\,/\,P(\sim E/D)}{P(E/\sim D)\,/\,P(\sim E/\sim D)} = \frac{\frac{a}{a+b} \Big/ \frac{b}{a+b}}{\frac{c}{c+d} \Big/ \frac{d}{c+d}} = \frac{a/b}{c/d} = \frac{ad}{bc}$$
The proportions of cases and controls are set by the investigator; therefore, they
do not represent the risk (probability) of developing disease.
The Odds Ratio (OR)
                   Exposure (E)   No Exposure (~E)
Disease (D)        a              b
No Disease (~D)    c              d
$$OR = \frac{a/b}{c/d} = \frac{ad}{bc} = \frac{a/c}{b/d}$$
a/b = odds of exposure for the cases; c/d = odds of exposure for the controls.
a/c = odds of disease for the exposed; b/d = odds of disease for the unexposed.
Proof via Bayes’ Rule (optional)
$$\frac{\text{odds of exposure in the cases}}{\text{odds of exposure in the controls}} = \frac{P(E/D)\,/\,P(\sim E/D)}{P(E/\sim D)\,/\,P(\sim E/\sim D)}$$
Bayes’ Rule
$$= \frac{\dfrac{P(D/E)P(E)}{P(D)} \Big/ \dfrac{P(D/\sim E)P(\sim E)}{P(D)}}{\dfrac{P(\sim D/E)P(E)}{P(\sim D)} \Big/ \dfrac{P(\sim D/\sim E)P(\sim E)}{P(\sim D)}}$$
$$= \frac{P(D/E)\,/\,P(\sim D/E)}{P(D/\sim E)\,/\,P(\sim D/\sim E)} = \frac{\text{odds of disease in the exposed}}{\text{odds of disease in the unexposed}}$$
What we want!
The Odds Ratio (OR)
           Overweight   Normal
ARMD       a            b
No ARMD    c            d
$$OR = \frac{a/b}{c/d} = \frac{ad}{bc} = \frac{a/c}{b/d}$$
a/b = odds of overweight for the cases; c/d = odds of overweight for the controls.
a/c = odds of ARMD for the overweight; b/d = odds of ARMD for the normal weight.
The Odds Ratio (OR)
           Overweight   Normal
ARMD       27           23
No ARMD    54           26
$$OR = \frac{27/23}{54/26} = \frac{27 \times 26}{23 \times 54} = .57$$
Can be interpreted as: Overweight people have a 43%
decrease in their ODDS of age-related macular
degeneration. (not statistically significant here)
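To illustrate why the odds ratio is usable in a case-control design, the sketch below computes it both ways for the ARMD table: as a ratio of exposure odds (cases vs. controls) and as a ratio of disease odds (overweight vs. normal). Both reduce to the cross-product ad/bc. The function name is illustrative.

```python
import math

def odds_ratio(a, b, c, d):
    """Cross-product odds ratio for a table laid out as
                 exposed   unexposed
    cases           a          b
    controls        c          d
    """
    return (a * d) / (b * c)

a, b, c, d = 27, 23, 54, 26               # ARMD case-control counts
exposure_odds_ratio = (a / b) / (c / d)   # odds of overweight: cases vs. controls
disease_odds_ratio = (a / c) / (b / d)    # odds of ARMD: overweight vs. normal
or_armd = odds_ratio(a, b, c, d)

print(f"OR = {or_armd:.2f}")
print(math.isclose(exposure_odds_ratio, disease_odds_ratio))
```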
The odds ratio is a good
approximation of the risk ratio
if the disease is rare.
If the disease is rare (affecting <10% of the population), then:
$$OR \approx RR$$
WHY?
If the disease is rare, the probability of it NOT happening is
close to 1, and the odds is close to the risk. Eg:
$$OR = \frac{1/19}{1/9} = .474 \qquad RR = \frac{1/20}{1/10} = .50$$
The rare disease assumption
$$OR = \frac{P(D/E)\,/\,P(\sim D/E)}{P(D/\sim E)\,/\,P(\sim D/\sim E)} \approx \frac{P(D/E)}{P(D/\sim E)} = RR$$
because P(~D/E) ≈ 1 and P(~D/~E) ≈ 1.
When a disease is rare:
P(~D) = 1 - P(D) ≈ 1
The odds ratio vs. the risk ratio
[Figure: positions of the odds ratio and risk ratio relative to 1.0 (null), in two panels: Rare Outcome and Common Outcome]
When is the OR a good
approximation of the RR?
General Rule of
Thumb:
“OR is a good
approximation as long
as the probability of the
outcome in the
unexposed is less than
10%”
Prevalence of age-related macular
degeneration is about 6.5% in people
over 40 in the US (according to a
2011 estimate). So, the OR is a
reasonable approximation of the RR.
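The rule of thumb can be explored numerically. The sketch below holds the risk ratio fixed at 2.0 (an arbitrary illustrative choice, as are the baseline risks) and shows the odds ratio drifting away from it as the risk in the unexposed grows.

```python
def odds_ratio_from_risks(risk_exposed, risk_unexposed):
    """Odds ratio implied by two risks (probabilities strictly between 0 and 1)."""
    odds_exposed = risk_exposed / (1 - risk_exposed)
    odds_unexposed = risk_unexposed / (1 - risk_unexposed)
    return odds_exposed / odds_unexposed

# Example from the slides: risks of 1/20 vs. 1/10 give RR = .50 and OR = .474.
print(f"{odds_ratio_from_risks(1 / 20, 1 / 10):.3f}")

# Hypothetical baseline risks with the risk ratio held at 2.0.
for baseline in (0.01, 0.05, 0.10, 0.30):
    or_value = odds_ratio_from_risks(2 * baseline, baseline)
    print(f"risk in unexposed {baseline:.2f}: RR = 2.00, OR = {or_value:.2f}")
```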
Advantages/Limitations:
Case-control studies

Advantages:
– Cheap and fast
– Efficient for rare diseases

Disadvantages:
– Getting comparable controls is often tricky
– Temporality is a problem (did risk factor cause disease
or disease cause risk factor?)
– Recall bias
Inferences about the odds
ratio…
Properties of the OR (simulation)
(50 cases/50 controls/20% exposed)
If the Odds Ratio=1.0 then with 50
cases and 50 controls, of whom 20%
are exposed, this is the expected
variability of the sample OR (note
the right skew).
Properties of the lnOR
$$\text{Standard deviation} = \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}}$$
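The simulation described above can be sketched as follows. Everything about the setup other than 50 cases, 50 controls, 20% exposure and a true OR of 1.0 is an assumption: the number of simulations, the seed, and the choice to skip samples with an empty cell.

```python
import math
import random
import statistics

def simulate_odds_ratios(n_cases=50, n_controls=50, p_exposed=0.20,
                         n_sims=5000, seed=0):
    """Sample ORs under the null: exposure is 20% in both cases and controls."""
    rng = random.Random(seed)
    ors = []
    for _ in range(n_sims):
        a = sum(rng.random() < p_exposed for _ in range(n_cases))     # exposed cases
        c = sum(rng.random() < p_exposed for _ in range(n_controls))  # exposed controls
        b, d = n_cases - a, n_controls - c
        if min(a, b, c, d) == 0:
            continue  # OR is undefined with an empty cell; skip (assumption)
        ors.append((a * d) / (b * c))
    return ors

ors = simulate_odds_ratios()
mean_or = statistics.mean(ors)
median_or = statistics.median(ors)
sd_log_or = statistics.stdev(math.log(x) for x in ors)

# Expected cell counts are a=10, b=40, c=10, d=40.
theoretical_sd = math.sqrt(1 / 10 + 1 / 40 + 1 / 10 + 1 / 40)
print(f"mean OR = {mean_or:.2f}, median OR = {median_or:.2f}")
print(f"SD of lnOR: simulated {sd_log_or:.2f}, formula {theoretical_sd:.2f}")
```

A mean above the median reflects the right skew of the OR; on the log scale the distribution is roughly symmetric, which is why inference is done on lnOR.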
Hypothetical Data
                      Amyl Nitrite Use   No Amyl Nitrite   Total
AIDS                  20                 10                30
Does not have AIDS    6                  24                30
$$OR = \frac{(20)(24)}{(6)(10)} = 8.0$$
$$95\%\ CI = \left( (8.0)\,e^{-1.96\sqrt{\frac{1}{20}+\frac{1}{6}+\frac{1}{10}+\frac{1}{24}}},\ (8.0)\,e^{+1.96\sqrt{\frac{1}{20}+\frac{1}{6}+\frac{1}{10}+\frac{1}{24}}} \right) = (2.47, 25.8)$$
Note that the size of the smallest 2x2 cell determines the magnitude of the variance.
When can the OR mislead?
Example:
Does dementia predict death?

Dementia: The leading predictor of death in
a defined elderly population. Neurology
2004; 62: 1156-1162
 Among patients with dementia: 291/355
(82%) died
 Among patients without dementia:
947/4328 (22%) died
Dementia study

Authors report OR = 16.23 (12.27, 21.48)
 But the RR = 3.72
 Fortunately, they do not dwell on the OR,
but it could mislead if not interpreted
correctly…
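The gap between the two measures follows directly from the reported counts. In this sketch (function name illustrative) the odds ratio matches the authors' 16.23, while the risk ratio from unrounded counts comes out near 3.75, close to the 3.72 quoted above.

```python
def risk_ratio_and_odds_ratio(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Return (RR, OR) for a cohort-style comparison of two groups."""
    risk1 = events_exposed / n_exposed
    risk0 = events_unexposed / n_unexposed
    rr = risk1 / risk0
    odds_ratio = (risk1 / (1 - risk1)) / (risk0 / (1 - risk0))
    return rr, odds_ratio

# Deaths: 291 of 355 with dementia vs. 947 of 4328 without.
rr, odds_ratio = risk_ratio_and_odds_ratio(291, 355, 947, 4328)
print(f"RR = {rr:.2f}, OR = {odds_ratio:.2f}")
```

Because death is common in both groups (82% and 22%), the rare-outcome approximation fails and the OR greatly overstates the RR.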
Better to give OR or RR?
From an RCT (prospective!) of a new diet drug, the authors
showed the following table:
Odds Ratios for losing at least 5kg were:
4.0 (low dose vs. placebo)
20.9 (medium dose vs. placebo)
31.5 (high dose vs. placebo)
Better to give OR or RR?
Corresponding RRs are:
59%/29%=2 (low dose vs. placebo)
87%/29%=3 (medium dose vs. placebo)
91%/29%=3 (high dose vs. placebo)
Summary of statistical tests
for contingency tables
Table Size   Test or measures of association
2x2          risk ratio (cohort or cross-sectional studies)
             odds ratio (case-control studies)
             Chi-square
             difference in proportions
             Fisher’s Exact test (expected cell size less than 5)
RxC          Chi-square
             Fisher’s Exact test (expected cell size <5)
Fisher’s Exact Test
Fisher’s “Tea-tasting
experiment”
Claim: Fisher’s colleague (call her “Cathy”) claimed that, when drinking
tea, she could distinguish whether milk or tea was added to the cup first.
To test her claim, Fisher designed an experiment in which she tasted 8
cups of tea (4 cups had milk poured first, 4 had tea poured first).
Null hypothesis: Cathy’s guessing abilities are no better than chance.
Alternative hypotheses:
Right-tail: She guesses right more than expected by chance.
Left-tail: She guesses wrong more than expected by chance
Fisher’s “Tea-tasting
experiment”
Experimental Results:
                Guess poured first
Poured First    Milk   Tea   Total
Milk            3      1     4
Tea             1      3     4
Fisher’s Exact Test
Step 1: Identify tables that are as extreme or more extreme than what
actually happened:
Here she identified 3 out of 4 of the milk-poured-first teas correctly. Is
that good luck or real talent?
The only way she could have done better is if she identified 4 of 4
correct.
Observed table:
                Guess poured first
Poured First    Milk   Tea   Total
Milk            3      1     4
Tea             1      3     4
Total           4      4     8
More extreme table:
                Guess poured first
Poured First    Milk   Tea   Total
Milk            4      0     4
Tea             0      4     4
Total           4      4     8
Fisher’s Exact Test
Step 2: Calculate the probability of the tables (assuming fixed marginals)
                Guess poured first
Poured First    Milk   Tea   Total
Milk            3      1     4
Tea             1      3     4
Total           4      4     8
$$P(3) = \frac{\binom{4}{3}\binom{4}{1}}{\binom{8}{4}} = .229$$
                Guess poured first
Poured First    Milk   Tea   Total
Milk            4      0     4
Tea             0      4     4
Total           4      4     8
$$P(4) = \frac{\binom{4}{4}\binom{4}{0}}{\binom{8}{4}} = .014$$
Step 3: to get the left tail and right-tail p-values, consider the probability
mass function:
Probability mass function of X, where X= the number of correct
identifications of the cups with milk-poured-first:
$$P(4) = \frac{\binom{4}{4}\binom{4}{0}}{\binom{8}{4}} = .014$$
$$P(3) = \frac{\binom{4}{3}\binom{4}{1}}{\binom{8}{4}} = .229$$
$$P(2) = \frac{\binom{4}{2}\binom{4}{2}}{\binom{8}{4}} = .514$$
$$P(1) = \frac{\binom{4}{1}\binom{4}{3}}{\binom{8}{4}} = .229$$
$$P(0) = \frac{\binom{4}{0}\binom{4}{4}}{\binom{8}{4}} = .014$$
“right-hand tail probability”: p=.243 (P(3) + P(4))
“left-hand tail probability”: testing the alternative hypothesis that she’s
systematically wrong (P(0) + P(1) + P(2) + P(3))
SAS also gives a “two-sided p-value” which is calculated by adding up all
probabilities in the distribution that are less than or equal to the
probability of the observed table (“equal or more extreme”). Here:
0.229+.014+0.229+.014= .4857
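The three steps above amount to summing hypergeometric probabilities. The following is a minimal sketch with `math.comb`; the function and variable names are illustrative, and the small tolerance in the two-sided sum is an added guard against floating-point ties.

```python
from math import comb

def tea_pmf(k, n_milk=4, n_tea=4, n_guessed_milk=4):
    """P(X = k): k of the milk-first cups are correctly labeled, margins fixed."""
    return (comb(n_milk, k) * comb(n_tea, n_guessed_milk - k)
            / comb(n_milk + n_tea, n_guessed_milk))

pmf = {k: tea_pmf(k) for k in range(5)}
observed = 3

right_tail = sum(p for k, p in pmf.items() if k >= observed)   # as good or better
left_tail = sum(p for k, p in pmf.items() if k <= observed)    # as bad or worse
# Two-sided: all outcomes no more probable than the observed table.
two_sided = sum(p for p in pmf.values() if p <= pmf[observed] + 1e-12)

print(f"right tail = {right_tail:.3f}")
print(f"left tail  = {left_tail:.3f}")
print(f"two-sided  = {two_sided:.4f}")
```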
Summary of statistical tests
for contingency tables
Table Size   Test or measures of association
2x2          risk ratio (cohort or cross-sectional study)
             odds ratio (case-control study)
             Chi-square
             difference in proportions
             Fisher’s Exact test (expected cell size less than 5)
RxC          Chi-square
             Fisher’s Exact test (expected cell size <5)