Presentation and interpretation of epidemiological data: objectives Raj Bhopal, Bruce and John Usher Professor of Public Health, Public Health Sciences Section, Division of Community Health Sciences, University of.

Presentation and
interpretation of
epidemiological data:
objectives
Raj Bhopal,
Bruce and John Usher Professor of Public Health,
Public Health Sciences Section,
Division of Community Health Sciences,
University of Edinburgh, Edinburgh EH89AG
Presentation and interpretation of
epidemiological data: objectives
You should understand:
• The aim of manipulating epidemiological data is
to sharpen understanding of risk and burden of
disease, but distortions occur.
• Epidemiological studies measure, present and
interpret risk, comparing one population to
another.
• The idea, definition, and calculation of:
proportional mortality, proportional mortality
ratio,
actual overall (crude) rates,
directly and indirectly standardised rates, the
standardised mortality ratio,
relative risk, odds ratio,
attributable risk, population attributable risk
and
Presentation and interpretation of
epidemiological data: objectives 2




The principal relative measure is the relative
risk while the odds ratio can approximate it
in particular circumstances.
Attributable and population attributable risk
are measures that help assess the proportion
of the burden of disease that is caused by a
particular risk factor.
How epidemiological data contributes to
assessing the health needs and health status
of populations.
Different ways of presenting data have a
major impact on the perception of risk, so
epidemiological studies should provide
measures of both relative and actual risk
Proportional mortality ratio
(PMR)


Sometimes the only data we have is cases
e.g. no accurate population denominators
for outcomes by hospital
PMR is commonly used to study disease
patterns by cause in settings where
population denominators are not available
Proportional mortality (PM) = number of deaths due to cause X / total number of deaths
Proportional mortality ratio (PMR) 2




The proportional mortality can be calculated by
sex, age group or any other appropriate
sub-division of the population
Figures can be compared between populations,
places or time periods by calculating the proportional
mortality ratio (PMR), which is simply the ratio of the PMs
in the two comparison populations, i.e.
PMR = PM in population A / PM in population B
Proportional mortality is a simple and potentially
useful way of portraying the burden of a specific
disease within a population, and the PMR provides a
way to compare populations
PMR is one measure of the strength of the
association
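The two definitions above can be sketched in a few lines of Python. The death counts are hypothetical, chosen only to make the arithmetic easy to follow, and the function names are illustrative rather than taken from the slides.

```python
def proportional_mortality(deaths_from_cause, total_deaths):
    """Proportion of all deaths that are due to one cause."""
    if total_deaths <= 0:
        raise ValueError("total_deaths must be positive")
    return deaths_from_cause / total_deaths


def proportional_mortality_ratio(pm_a, pm_b):
    """Ratio of the proportional mortality in population A to that in B."""
    return pm_a / pm_b


# Hypothetical counts (not from the slides).
pm_a = proportional_mortality(deaths_from_cause=50, total_deaths=200)
pm_b = proportional_mortality(deaths_from_cause=20, total_deaths=160)
pmr = proportional_mortality_ratio(pm_a, pm_b)

print(f"PM in A = {pm_a:.3f}, PM in B = {pm_b:.3f}, PMR = {pmr:.2f}")
```

Note that no population denominator appears anywhere in the calculation, which is why the measure is usable when only case or death counts are available.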
Adjusted overall rates:
standardisation and the SMR








Age and sex specific rates can be compared
between times, places and sub-populations
Age and sex specific rates may be imprecise in
small studies
Age and sex specific tables are usually large
and difficult to assimilate
If so, you may calculate the summary, overall
(crude) rate
Overall actual (crude) rates may mislead
The age and sex structures of the compared
populations probably differ
If so, age and sex are confounding variables
Therefore, we need to adjust (or standardise)
the rates for age, sex or both
Adjusted overall rates:
standardisation and the SMR 2






If age and sex differences are potentially interesting
or important explanatory factors for population
disease patterns, rates should not be adjusted
The age-adjusted figure loses information,
particularly when differences are not consistent
across age group or sex
With major differences in age and sex structure
between populations, when adjustment is most
needed, the method is less effective
Rates adjusted by the indirect method are weighted
(or biased) in relation to the age and sex structure of
the population under study
Output from such adjustment is the SMR
Only SMR comparisons between the study
population and the chosen standard population are
valid
Class exercise: age-specific
and actual overall (crude) rates



Consider the age-specific and actual
overall rates in table 8.3.
Comment on the age structure, and the
effect this has on the overall rate, which
varies in populations A, B and C.
Why does this effect occur?
Class exercise: age-specific and
actual overall (crude) rates 2




Population B has high overall rates because it
has a comparatively older population.
The larger number of older people is weighting
(exerting influence upon) the summary figure.
In effect, the size of the population in each age
group provides a set of weights that are
applied to the age-specific rates to give the overall rate.
The overall rates are misleading us into
thinking there are differences because the
weights exerted by the population structure
differ
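The weighting effect can be illustrated with a minimal sketch. Table 8.3 is not reproduced in this transcript, so the rates and population sizes below are hypothetical: two populations share identical age-specific rates but have different age structures.

```python
def crude_rate(age_specific_rates, population_sizes):
    """Overall (crude) rate: age-specific rates weighted by group size."""
    cases = sum(rate * size for rate, size in zip(age_specific_rates, population_sizes))
    return cases / sum(population_sizes)


# Hypothetical data: the same age-specific rates in both populations
# (youngest group first, oldest group last).
age_specific_rates = [0.05, 0.10, 0.20]

young_population = [3000, 2000, 1000]  # mostly young people
old_population = [1000, 2000, 3000]    # mostly old people

crude_young = crude_rate(age_specific_rates, young_population)
crude_old = crude_rate(age_specific_rates, old_population)

print(f"Crude rate, younger population: {crude_young:.1%}")
print(f"Crude rate, older population:   {crude_old:.1%}")
```

The crude rates differ even though every age-specific rate is the same, because each population supplies its own set of weights.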
Exercise: effect of directly
standardising on overall rates





Consider the age structures of the standard
population, and the age-specific and overall
rates in table 8.4.
Calculate the number of cases expected if the
standard population had the same age specific
rates as population A
What is the relationship between the overall
rates in table 8.4 and those in table 8.3?
Why are the overall rates now the same in
populations A, B and C?
What is the influence of a relatively young and
relatively old standard population?
Direct standardising: example




Age specific rate in population A, age
group 21-30, is 5%
There are 3000 people in the standard
population in this age group
If the standard population had the same
rate as population A, then 5 percent of them
would be affected
5% of 3000 is 150
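A minimal sketch of the direct method follows. Only the 21-30 row (rate 5%, standard population 3000) comes from the example above; the other age groups are hypothetical, added so that an overall standardised rate can be computed.

```python
def directly_standardised_rate(age_specific_rates, standard_population):
    """Apply the study population's age-specific rates to the standard
    population's age structure.

    Returns the expected cases per age group and the overall standardised rate.
    """
    expected = {
        age: age_specific_rates[age] * standard_population[age]
        for age in standard_population
    }
    overall = sum(expected.values()) / sum(standard_population.values())
    return expected, overall


# Only the 21-30 row is from the slide; the other rows are hypothetical.
rates_population_a = {"21-30": 0.05, "31-40": 0.10, "41-50": 0.20}
standard_population = {"21-30": 3000, "31-40": 2000, "41-50": 1000}

expected, standardised_rate = directly_standardised_rate(
    rates_population_a, standard_population
)

print(f"Expected cases aged 21-30: {expected['21-30']:.0f}")
print(f"Directly standardised rate: {standardised_rate:.1%}")
```

Because every comparison population is applied to the same `standard_population`, the weights are held constant and only differences in the age-specific rates can change the result.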
Effect of directly standardising
on overall rates 2




The identical age-specific rates obtained from
table 8.3 lead to an identical overall
(standardised) rate
The standard population structure supplies the
weights and these are the same in all
comparison groups
The overall result of 7.5% in table 8.4 is not
real
The young standard leads to a low
standardised rate (7.5%), and an old standard
to a high rate (13.9%)
Indirect standardisation





The standard population supplies disease
rates, not population structure
The question : how many cases would have
occurred if the study population had the same
specific rates as the standard population?
The observed number of cases is compared to the
expected number
The resulting figure is the standardised morbidity
(or mortality) ratio (SMR), usually expressed
as a percentage
Exercise: indirect standardisation






Example of calculation: in the age group 21-30 the
rate in the standard population (table 8.5 (a)) is 10
percent
In population A there were 1000 people in this age
group.
If population A had the same age specific rate as the
standard population 10 percent would be affected
i.e. 100
The total number of cases gives the expected
number if population A had the same rates as the
standard population i.e. 450
This number can be compared to the number
actually seen i.e. 300
The overall rates and standardised rates in the three
populations A, B and C differ. Why?
Exercise: indirect
standardisation 2


Because the standard rates are weighted
differentially by the different population
structures of A, B, C.
Here the population structures of A, B and
C are weighting the national rates.
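The indirect calculation from the exercise can be sketched as follows. The 21-30 figures (standard rate 10%, 1000 people in population A) and the totals (450 expected, 300 observed) are taken from the exercise; the other rows of table 8.5 are not reproduced in this transcript, so the totals are entered directly rather than rebuilt.

```python
def expected_cases(standard_rates, study_population):
    """Cases expected if the study population had the standard's rates."""
    return sum(standard_rates[age] * study_population[age] for age in study_population)


def standardised_ratio(observed, expected):
    """Standardised morbidity (or mortality) ratio, as a percentage."""
    return 100 * observed / expected


# One age group from the exercise: 10% of 1000 people.
expected_21_30 = expected_cases({"21-30": 0.10}, {"21-30": 1000})

# Totals quoted in the exercise for population A.
smr_a = standardised_ratio(observed=300, expected=450)

print(f"Expected cases aged 21-30: {expected_21_30:.0f}")
print(f"SMR for population A: {smr_a:.1f}%")
```

An SMR below 100% means fewer cases were observed than would be expected at the standard population's rates.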
Relative risk





The relative risk is the ratio of two
incidence rates
Incidence rate in the population of interest
divided by the rate in a comparison (or control
or reference) population
We are relating the incidence of disease in
those with to those without the risk factor
This measures the size of the effect on disease
rates of the risk factor and, hence, the strength
of the association in epidemiology
RR can never be calculated from case-control
studies which do not give incidence data,
though in some circumstances the odds ratio
calculated from such a study provides an
acceptable estimate of the relative risk
Calculating and interpreting relative risk






Imagine that the incidence of lung cancer is
compared in two cities, one with polluted air (A), the
other not (B).
In the polluted city there were 20 cases in a
population of 100,000; in the other city 10 cases in a
population of 100,000. Assume accuracy in the
numerators and denominators.
What is the relative risk of lung cancer in the
polluted city (A)?
What is the relative risk of lung cancer in the less
polluted city (B)?
What explanations are there for the higher relative
risk in the polluted city?
What questions will you consider before concluding
that there is a real association between pollution
and lung cancer?
The two by two table
Risk factor | Outcome: disease | Outcome: no disease | total
present     | a                | b                   | a+b
absent      | c                | d                   | c+d
total       | a+c              | b+d                 | a+b+c+d
Simple formulae for relative
risk and odds ratios




(a) Incidence in those with the risk factor =
a/(a+b)
Incidence in those without the risk factor
= c/(c+d)
(b) relative risk = a/(a+b) divided by c/(c+d)
(c) OR = cross product ratio =
(a x d) divided by (b x c)
The two by two table: lung
cancer as a rare outcome
Risk factor      | Outcome: lung cancer | Outcome: no lung cancer | total
Living in city A | a = 20               | b = 99,980              | a+b = 100,000
Living in city B | c = 10               | d = 99,990              | c+d = 100,000
total            | a+c = 30             | b+d = 199,970           | a+b+c+d = 200,000
The two by two table: lung
cancer as a common outcome
Risk factor      | Outcome: lung cancer | Outcome: no lung cancer | total
Living in city A | a = 20               | b = 80                  | a+b = 100
Living in city B | c = 10               | d = 90                  | c+d = 100
total            | a+c = 30             | b+d = 170               | a+b+c+d = 200
Relative risk exercise: answers



Relative risk in city A =
incidence rate in city A / incidence rate in city B =
20 divided by 10 = 2 (both rates per 100,000)
Relative risk in city B =
incidence rate in city B / incidence rate in city A =
10 divided by 20 = 0.5
If investigators can consider the relative risk
as a fair measure of the strength of the
association, they can apply frameworks for causal thinking
to judge whether pollution is the probable
cause of the higher relative risk in city A
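These answers can be checked with a short sketch. The case counts and populations are those given in the exercise; the function names are illustrative.

```python
def incidence(cases, population):
    """Incidence as a proportion of the population at risk."""
    return cases / population


def relative_risk(incidence_of_interest, incidence_reference):
    """Ratio of the incidence in the group of interest to the reference group."""
    return incidence_of_interest / incidence_reference


incidence_a = incidence(cases=20, population=100_000)  # polluted city
incidence_b = incidence(cases=10, population=100_000)  # less polluted city

rr_city_a = relative_risk(incidence_a, incidence_b)
rr_city_b = relative_risk(incidence_b, incidence_a)

print(f"Relative risk in city A (reference B): {rr_city_a:.1f}")
print(f"Relative risk in city B (reference A): {rr_city_b:.1f}")
```

Swapping the reference population simply inverts the ratio, so it matters which group is named as the comparison.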
Odds ratios






Odds are the chances in favour of one side in
relation to the second side
Odds are the chances of being exposed (or
diseased) as opposed to not being exposed (or
diseased)
Odds ratio is simply one set of odds divided by
another
Odds of exposure, in the two by two table, for the
group with disease are a÷c and for the group
without disease b÷d
Odds ratio for exposure is simply the odds a÷c
divided by the odds b÷d.
Similarly, the odds of disease in those exposed to
the risk factor are a÷b, and for those not exposed
c÷d, and the odds ratio is a÷b divided by c÷d
Odds ratios 2



The epidemiological idea is a simple one
i.e. if a disease is causally associated with
an exposure, then the odds of exposure in
the diseased group will be higher than the
corresponding odds in the non-diseased
group.
If there is no association, the odds ratio
will be one.
If the exposure is protective against
disease, the odds ratio will be less than
one
Odds ratio 3



In what circumstances will the O.R. for
disease approximate the R.R.?
For both the odds ratio and the relative
risk the numerators (a and c) for the
fractions are identical.
The denominators are different, that is, b
and d in the odds ratio, and a + b and c +
d in the relative risk.
Odds ratio 4


When b is similar to a + b, and d is similar
to c + d, the odds ratio and relative risk
will be similar.
This happens when the disease is rare,
i.e., when a and c are small.
Odds ratio 5





Odds ratios approximate well to the relative
risk in some circumstances.
In case-control studies, where relative risk cannot
be calculated, the odds ratio provides an estimate.
Odds have desirable mathematical properties
permitting easy manipulation in mathematical
models and statistical computations, as, for
example, in multiple logistic regression.
Epidemiologists need to be aware that
misinterpretation of the odds ratio is common
Statistical packages may label the output of odds
ratio analysis as relative risk, creating a trap for
the unwary investigator
Exercise on odds ratios


Calculate the odds ratio on the lung
cancer exercise for the two instances
where the outcome is rare and the
outcome is common
How do these values compare with the
relative risk?
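One way to work through this exercise is the sketch below, which uses the cell values from the two lung cancer tables (rare and common outcome). The helper names are illustrative, not part of the slides.

```python
def odds_ratio(a, b, c, d):
    """Cross product ratio (a x d) / (b x c) from a two by two table."""
    return (a * d) / (b * c)


def relative_risk(a, b, c, d):
    """Incidence in the exposed, a/(a+b), over incidence in the unexposed, c/(c+d)."""
    return (a / (a + b)) / (c / (c + d))


# Cell values (a, b, c, d) from the two lung cancer tables.
rare = (20, 99_980, 10, 99_990)
common = (20, 80, 10, 90)

or_rare, rr_rare = odds_ratio(*rare), relative_risk(*rare)
or_common, rr_common = odds_ratio(*common), relative_risk(*common)

print(f"Rare outcome:   OR = {or_rare:.4f}, RR = {rr_rare:.4f}")
print(f"Common outcome: OR = {or_common:.4f}, RR = {rr_common:.4f}")
```

With the rare outcome the odds ratio is almost identical to the relative risk; with the common outcome it overstates it (2.25 against 2).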
Epidemiological information to
choose between priorities





In a few diseases there is a unique known
causal factor e.g. nutritional disorders such as
scurvy
All cases of such diseases are attributable, by
definition, to one cause
Often the causes are multiple and complex
Choosing between alternative actions
becomes necessary for there is limited time,
money, energy and expertise
Attributable risk provides a way of developing
the epidemiological base for such decisions
Epidemiological information to
choose between priorities 2
Imagine that there are insufficient resources to
tackle all six of these CHD risk factors. What
epidemiological information would help
choose between them to reduce coronary
heart disease in a population?






High levels of some lipids in the blood,
particularly low density lipoprotein (LDL)
cholesterol
High blood pressure
Smoking
Low levels of physical activity
Obesity
Diabetes
Epidemiological information to
choose between priorities 3





Solid evidence that each of these risk factors
is a component of the causal pathway
Knowledge of the frequency of each risk factor
in the population
Knowledge of the additional risk that each risk
factor imposes
Understanding of the actions that are (or might
be) effective in reducing the prevalence of the
risk factor, and their costs
The reduction in disease outcome (attributable
risk)
Epidemiological information to
choose between priorities




The question being answered by attributable
risk is: how many cases would not have
occurred if a particular risk factor had not been
present?
Or, what proportion of disease incidence in
those exposed to the risk factor is attributable
to that particular risk factor?
In short, what is the attributable risk
associated with a risk factor?
From the total number of cases, subtract the
number that would have occurred anyway,
even if the cases had not had the risk factor
Attributable risk for lung
cancer in city A
Risk factor      | Outcome: lung cancer | Outcome: no lung cancer | total
Living in city A | a = 20               | b = 80                  | a+b = 100
Living in city B | c = 10               | d = 90                  | c+d = 100
total            | a+c = 30             | b+d = 170               | a+b+c+d = 200
Attributable risk for lung
cancer in city A



Attributable risk =
incidence in city A minus incidence in city
B = 20 - 10 = 10 (per 100 people)
This can be expressed as a fraction of the
total risk in city A = (20 - 10)/20 = 0.5
This is best expressed as a percentage,
so we multiply by 100 = 50%
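The same arithmetic as a minimal sketch, using the incidences from the common-outcome table (20 per 100 in city A, 10 per 100 in city B). Exact fractions are used to avoid rounding noise; the function name is illustrative.

```python
from fractions import Fraction


def attributable_risk(incidence_exposed, incidence_unexposed):
    """Return the excess incidence in the exposed and that excess
    as a fraction of the incidence in the exposed."""
    excess = incidence_exposed - incidence_unexposed
    return excess, excess / incidence_exposed


incidence_city_a = Fraction(20, 100)  # exposed (polluted city)
incidence_city_b = Fraction(10, 100)  # unexposed (less polluted city)

excess, fraction_attributable = attributable_risk(incidence_city_a, incidence_city_b)

print(f"Attributable risk: {float(excess) * 100:.0f} per 100")
print(f"As a percentage of the risk in city A: {float(fraction_attributable):.0%}")
```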
Population attributable risk




From a public health perspective we are
interested in both the benefits of an
intervention to the exposed group and to the
whole community
In this case the question is: what proportion of
the disease in the whole population (not just the
exposed population) is attributable to a
particular exposure?
The answer depends on how common the
exposure is
If a community had no or very little exposure to
smoking, as in Sikh women living in the Punjab,
India, then cases of lung cancer in that
population must be caused by other factors
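The slides pose the question without giving a formula. A common textbook definition, assumed here rather than taken from the slides, is the incidence in the whole population minus the incidence in the unexposed, expressed as a fraction of the incidence in the whole population. The sketch applies it to the common-outcome lung cancer table, treating the two cities together as one population in which half the people are exposed.

```python
from fractions import Fraction


def population_attributable_risk(incidence_total, incidence_unexposed):
    """Assumed textbook definition (not stated in the slides): excess
    incidence in the whole population, and that excess as a fraction
    of the whole-population incidence."""
    excess = incidence_total - incidence_unexposed
    return excess, excess / incidence_total


# From the common-outcome table: 30 cases among 200 people overall,
# 10 cases among the 100 unexposed people (city B).
incidence_total = Fraction(30, 200)
incidence_unexposed = Fraction(10, 100)

par, par_fraction = population_attributable_risk(incidence_total, incidence_unexposed)

print(f"Population attributable risk: {float(par) * 100:.0f} per 100")
print(f"As a percentage of all disease in the population: {float(par_fraction):.1%}")
```

The result (about a third) is lower than the 50% found for the exposed group alone, because only half of this combined population is exposed. If the exposure were rarer, the population figure would fall further.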
Numbers needed to treat (NNT) or to
prevent (NNP)
• The NNT is a measure that combines directness
with simplicity
• The number of people who need to be treated for one
patient to benefit
• The same measure could be applied to preventative
measures
• The NNT is the reciprocal of the absolute (or actual)
risk reduction
• The reciprocal of 5 is 1/5
• So, if the incidence of outcome in the untreated group = 30/1000
and the incidence of outcome in the treated group = 25/1000
then the actual or absolute reduction in risk = (30 - 25)/1000 = 5/1000
and the NNT = 1000/5 = 200
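The worked example can be reproduced with a short sketch. The incidences are those given above; exact fractions are used so that the reciprocal comes out as a whole number, and the function name is illustrative.

```python
from fractions import Fraction


def number_needed_to_treat(incidence_untreated, incidence_treated):
    """Reciprocal of the absolute risk reduction."""
    absolute_risk_reduction = incidence_untreated - incidence_treated
    if absolute_risk_reduction <= 0:
        raise ValueError("treatment does not reduce risk; NNT is undefined")
    return 1 / absolute_risk_reduction


incidence_untreated = Fraction(30, 1000)
incidence_treated = Fraction(25, 1000)

nnt = number_needed_to_treat(incidence_untreated, incidence_treated)

print(f"Absolute risk reduction: {incidence_untreated - incidence_treated}")
print(f"NNT: {nnt}")
```

In relative terms the treatment cuts the risk by a sixth, yet 200 people must be treated for one to benefit, which illustrates how relative and actual measures give different impressions.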

Theory





Epidemiological purposes and theories
underpin measurement, presentation and
interpretation of data
The capacity to measure and analyse data
also alters our theories e.g. the case-control
study and the odds ratio are now inextricably
intertwined
Interpretation of data is influenced by
investigators' philosophy on the nature of
knowledge (epistemology)
Epidemiologists practise positivism, the
philosophic system that is based on facts,
acquired by empirical observations, and logic
Facts are extracted by analysis and
interpretation from data that are invariably
flawed







Summary
Epidemiological data can be manipulated and
presented in many ways
Epidemiological summary measures estimate
absolute risks (e.g. numbers, rates, life years lost,
numbers needed to treat) or relative ones (e.g.
adjusted rates, relative risk, odds ratios)
Relative and actual risks portray dramatically
different perspectives on the health needs of
populations
Relative measures of risk are more useful in
aetiologic inquiry
Actual measures are better in health planning and
policy
Epidemiological data on diseases can be combined
with other information on risk factors
Combining data sets generates causal
understanding of disease processes in populations
and rational interventions to improve public health