Transcript bias

Bias in Epidemiology
Wenjie Yang
2007.12
“The search for subtle links between diet,
lifestyle, or environmental factors and disease
is an unending source of fear but often yields
little certainty.”
Epidemiology faces its limits.
Science 1995; 269: 164-169.
Residential radon and lung cancer
Sweden: yes
Canada: no
DDT metabolite in blood stream, abortion and breast cancer
Maybe yes, maybe no
Electromagnetic fields (EMF)
Canada & France: leukemia
America: brain cancer
What can be wrong in the study?
Random error
Results in low precision of the epidemiological measure → the measure is not precise, but true
1 Imprecise measuring
2 Too small groups
Systematic errors (= bias)
Results in low validity of the epidemiological measure → the measure is not true
1 Selection bias
2 Information bias
3 Confounding
Errors in epidemiological studies: random errors and systematic errors
Random error
• Low precision because of
– Imprecise measuring
– Too small groups
• Decreases with increasing group size
• Can be quantified by confidence interval
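The last two points can be sketched in a few lines of Python. This is only an illustration: it uses the normal-approximation (Wald) 95% confidence interval for a proportion, and the prevalence of 0.2 and the group sizes are assumed values, not data from the lecture.

```python
import math

def wald_ci(p_hat, n, z=1.96):
    """Normal-approximation (Wald) 95% confidence interval for a proportion."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se

# Hypothetical observed prevalence of 20% at increasing group sizes.
widths = []
for n in (50, 200, 800, 3200):
    low, high = wald_ci(0.2, n)
    widths.append(high - low)
    print(f"n={n:5d}  95% CI = ({low:.3f}, {high:.3f})  width = {high - low:.3f}")
```

Quadrupling the group size halves the width of the interval. A larger group reduces random error only; it does nothing to remove a systematic error.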
Bias in epidemiology
1 Concept of bias
2 Classification and controlling of bias
2.1 selection bias
2.2 information bias
2.3 confounding bias
Overestimate?
Underestimate?
Random error:
Definition
Deviation of results and inferences
from the truth, occurring only as a
result of the operation of chance.
Bias:
Definition: Systematic, non-random
deviation of results and inferences
from the truth.
2 Classification and controlling of bias
Over time, each phase of a study has its own bias:
Assembling subjects: selection bias
Collecting data: information bias
Analyzing data: confounding bias
VALIDITY OF EPIDEMIOLOGIC STUDIES
[Figure: external validity relates the study population to the reference population; internal validity relates the exposed and unexposed groups within the study population.]
2.1 Selection bias
2.1.1 Definition
Because of an improper method of assembling subjects, or limitations in doing so, the study population does not represent the target population, and the results deviate from the truth.
2.1.2 Several common selection biases
(1) Admission bias (Berkson's bias)
There are 50,000 male citizens aged 30-50 years in a community. The prevalence of hypertension and of lung cancer is considerably high. Researcher A wants to know whether hypertension is a risk factor for lung cancer and conducts a case-control study in the community.
                   case    control    sum
Hypertension       1000     9000     10000
No hypertension    4000    36000     40000
sum                5000    45000     50000
χ2 = 0
OR = (1000×36000)/(9000×4000) = 1
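As a minimal sketch, the two statistics quoted for the community table can be computed as follows. The function names are illustrative, and the chi-square is the uncorrected Pearson statistic for a 2×2 table, which is an assumption about how the slide value was obtained.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio of a 2x2 table.

    a, b = exposed cases, exposed controls
    c, d = unexposed cases, unexposed controls
    """
    return (a * d) / (b * c)

def chi_square(a, b, c, d):
    """Pearson chi-square for a 2x2 table, without continuity correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Researcher A's community study: hypertension by lung cancer status.
community = (1000, 9000, 4000, 36000)
print("OR   =", odds_ratio(*community))
print("chi2 =", chi_square(*community))
```

With the community counts the odds ratio is 1 and the chi-square is 0: in the whole community, hypertension and lung cancer are not associated.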
Researcher B conducts another case-control study in the hospital of the community, with chronic gastritis patients as controls.
There is no association between hypertension and chronic gastritis.
Admission rate
Lung cancer with hypertension: 20%
Lung cancer without hypertension: 20%
Chronic gastritis with hypertension: 20%
Chronic gastritis without hypertension: 20%
                   case           control         sum
Hypertension        200 (1000)    200 (2000)       400
No hypertension     800 (4000)    400 (8000)      1200
sum                1000 (5000)    600 (10000)     1600
                   case    control    sum
Hypertension         40      100      140
No hypertension     160      200      360
sum                 200      300      500
χ2 = 10.58, P<0.01
OR = (40×200)/(100×160) = 0.5
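The mechanism behind this result can be sketched by applying admission rates to the community counts. The sketch rests on an interpretation of the hospital table: the numbers in parentheses are read as community counts and the numbers in front of them as admitted patients, so the admitted fractions are 200/1000 and 800/4000 for lung cancer and 200/2000 and 400/8000 for chronic gastritis. The dictionary keys and the function name are illustrative.

```python
# Community counts, read from the numbers in parentheses in the hospital table.
community = {
    "cancer_ht": 1000,        # lung cancer with hypertension
    "cancer_no_ht": 4000,     # lung cancer without hypertension
    "gastritis_ht": 2000,     # chronic gastritis with hypertension
    "gastritis_no_ht": 8000,  # chronic gastritis without hypertension
}

def hospital_or(rates):
    """Apply admission rates to the community counts and return the hospital OR."""
    adm = {group: community[group] * rates[group] for group in community}
    return (adm["cancer_ht"] * adm["gastritis_no_ht"]) / (
        adm["gastritis_ht"] * adm["cancer_no_ht"]
    )

# Scenario 1: every group is admitted at the same rate (20%).
equal_rates = {group: 0.20 for group in community}

# Scenario 2: rates implied by the admitted counts in the hospital table.
table_rates = {
    "cancer_ht": 0.20,
    "cancer_no_ht": 0.20,
    "gastritis_ht": 0.10,
    "gastritis_no_ht": 0.05,
}

print("equal admission rates:   OR =", round(hospital_or(equal_rates), 2))
print("unequal admission rates: OR =", round(hospital_or(table_rates), 2))
```

With equal admission rates the hospital odds ratio stays at 1, as in the community. When hypertensive gastritis patients are admitted more readily than non-hypertensive ones, the odds ratio drops to 0.5 and hypertension appears protective, although no association exists in the community.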
(2) Prevalence-incidence bias (Neyman's bias)
Risk factor A, prognostic factor B

A            case    control    sum
exposed        50       25       75
unexposed      50       75      125
sum           100      100      200
χ2 = 13.33, P<0.01
OR = 3
B            case    control    sum
exposed        80      100      180
unexposed      40      100      140
sum           120      200      320
χ2 = 8.47, P<0.01
OR = 2.0
(3) Non-respondent bias
Survey technique for sensitive questions
Each subject draws a ball and answers according to its colour:
Red ball: abortion yes = 1, no = 2
Other ball: abortion yes = 2, no = 1
Number of subjects: N = 1000
Proportion of red balls: A = 40%
Number of subjects whose answer is "1": K = 540
Abortion rate: X = ?
N*A*X + N*(1-A)*(1-X) = K
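The equation above is linear in X and can be solved directly: X = (K/N - (1-A)) / (2A - 1). A minimal sketch, with an illustrative function name:

```python
def abortion_rate(n, a, k):
    """Solve N*A*X + N*(1-A)*(1-X) = K for X.

    n = number of subjects, a = proportion of red balls,
    k = number of subjects whose answer is "1".
    """
    if a == 0.5:
        # With A = 0.5 the X terms cancel and the rate cannot be estimated.
        raise ValueError("A must differ from 0.5")
    return (k / n - (1 - a)) / (2 * a - 1)

x = abortion_rate(n=1000, a=0.40, k=540)
print(f"Estimated abortion rate X = {x:.2f}")
```

With N = 1000, A = 40% and K = 540 the estimated abortion rate is 0.30, that is 30%. The interviewer never learns which card an individual subject answered, which is what makes the question easier to answer honestly.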
(4) Detection signal bias
Estrogen intake and endometrial cancer
Estrogen intake → uterine bleeding → frequent checks → early detection
[Figure: stage at diagnosis (early, medium and terminal stage); percentages shown on the slide: 50%, 50%; early stage 90%, medium stage 30%; 50%, 5%]
(5) Susceptibility bias
[Figure: exposed (E) and unexposed (UE) groups; physical check; drop out]
2.2 Information bias
(1) Recall bias
(2) Reporting bias
(3) Diagnostic/exposure suspicion bias
(4) Measurement bias
2.3 Confounding bias
Definition:
The apparent effect of the exposure of
interest is distorted because the effect
of an extraneous factor is mistaken for
or mixed with the actual exposure
effect.
Properties of a Confounder:
• A confounding factor must be a risk factor for the
disease.
• The confounding factor must be associated with
the exposure under study in the source population.
• A confounding factor must not be affected by the
exposure or the disease.
The confounder cannot be an intermediate step in the
causal path between the exposure and the disease.
2.3.2 Control of confounding bias
1 In the design phase
1) Restriction
2) Randomization
3) Matching
2 In the analysis phase
1) Stratified analysis (Mantel-Haenszel method)
2) Standardization
3) Logistic regression analysis
A case-control study of oral contraceptives (OC) and myocardial infarction (MI)

OC       MI     control    sum
+         29      135       164
-        205     1607      1812
sum      234     1742      1976
χ2 = 5.84, P<0.05, cOR = 1.68, OR 95% C.I. (1.10, 2.56)
Is age a potential confounding factor?
Age distribution in the two groups

Age (year)    MI     proportion(%)    control    proportion(%)    OR
25~             6         2.6            286         16.4          1.0
30~            21         9.0            423         24.3          2.36
35~            37        15.8            356         20.4          4.95
40~            71        30.3            371         21.3          9.12
45~49          99        42.3            306         17.6         15.42
Total         234       100.0           1742        100.0          ----
OC exposure proportion in different age groups (%)

              OC exposure in MI                    OC exposure in control
Age (year)    +     -     sum    proportion(%)     +      -       sum     proportion(%)
25~            4     2      6       66.7            62     224     286       21.7
30~            9    12     21       42.9            33     390     423        7.8
35~            4    33     37       10.8            26     330     356        7.3
40~            6    65     71        8.5             9     362     371        2.4
45~49          6    93     99        6.1             5     301     306        1.6
sum           29   205    234       12.4           135    1607    1742        7.7
MI: χ2 = 38.99, P<0.01
Control: χ2 = 108.43, P<0.01
Stratified analysis

Age (year)    OC    MI    Control    OR     OR 95% C.I.
25~           +      4       62      7.2    (1.64, 31.65)
              -      2      224
30~           +      9       33      8.9    (3.96, 19.98)
              -     12      390
35~           +      4       26      1.5    (0.53, 4.24)
              -     33      330
40~           +      6        9      3.7    (1.36, 10.04)
              -     65      362
45~49         +      6        5      3.9    (1.26, 12.10)
              -     93      301
Woolf's chi-square test (homogeneity of the stratum ORs)
χ2 = 6.212
ν = 5-1 = 4
P>0.05
Pooled OR
OR_MH = 3.97
(crude OR - adjusted OR)/adjusted OR × 100% = (1.68 - 3.97)/3.97 × 100% = -57.7%
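The pooled estimate can be reproduced from the age-stratified counts with the Mantel-Haenszel formula, OR_MH = Σ(a·d/n) / Σ(b·c/n), summed over the strata. A minimal sketch; the variable and function names are illustrative:

```python
# Per age stratum: (OC+ MI, OC+ control, OC- MI, OC- control).
strata = {
    "25~": (4, 62, 2, 224),
    "30~": (9, 33, 12, 390),
    "35~": (4, 26, 33, 330),
    "40~": (6, 9, 65, 362),
    "45~49": (6, 5, 93, 301),
}

def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled odds ratio over a sequence of 2x2 tables."""
    tables = list(tables)
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den

# Collapsing the strata gives back the crude table (29, 135, 205, 1607).
crude = tuple(sum(col) for col in zip(*strata.values()))
crude_or = crude[0] * crude[3] / (crude[1] * crude[2])
or_mh = mantel_haenszel_or(strata.values())

for age, (a, b, c, d) in strata.items():
    print(f"{age:6s} OR = {a * d / (b * c):.1f}")
print(f"crude OR = {crude_or:.2f}, Mantel-Haenszel OR = {or_mh:.2f}")
```

The crude odds ratio (1.68) is well below the age-adjusted one (3.97). Age masks part of the association because older women have more MI but use OC less often.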
Analytic epidemiology:
Case-control study; HIV "carried" by mosquitoes?

                     HIV+    Controls    sum
Mosquito exposure     158       247      405
No exposure            17       143      160
sum                   175       390      565
O.R. = 5.38
Analytic epidemiology: stratification for confounding
Case-control study; HIV "carried" by mosquitoes? Stratified by sex (females, males)

                     HIV+    Controls
Mosquito exposure     155        81
No exposure            15        10
sum = 261, O.R. = 1.27

                     HIV+    Controls
Mosquito exposure       3       166
No exposure             2       133
sum = 304, O.R. = 1.21
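The contrast between the crude and the stratified result can be checked numerically. The grouping of the stratum counts below is an assumption: the counts are paired so that they add up to the stratum totals 261 and 304 and to the cells of the crude table, and no sex label is attached to either stratum.

```python
def odds_ratio(a, b, c, d):
    """a, b = exposed HIV+, exposed controls; c, d = unexposed HIV+, unexposed controls."""
    return (a * d) / (b * c)

# Crude table: mosquito exposure by HIV status.
crude = (158, 247, 17, 143)

# Two sex strata (assumed grouping, see above).
strata = [
    (155, 81, 15, 10),  # stratum with 261 subjects
    (3, 166, 2, 133),   # stratum with 304 subjects
]

crude_or = odds_ratio(*crude)
stratum_ors = [odds_ratio(*s) for s in strata]

print(f"crude OR = {crude_or:.2f}")
for s, value in zip(strata, stratum_ors):
    print(f"stratum of {sum(s)} subjects: OR = {value:.2f}")
```

The crude odds ratio is about 5.4, while both stratum-specific odds ratios are close to 1. The apparent association between mosquito exposure and HIV is therefore almost entirely produced by the stratification variable.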