Clinical Research: How to Avoid Magical Thinking Reese H. Clark, MD Director of Clinical Research, Pediatrix And Consulting Associate Professor Duke University.


Clinical Research: How to Avoid
Magical Thinking
Reese H. Clark, MD
Director of Clinical Research, Pediatrix
And
Consulting Associate Professor
Duke University
Objectives
• Define the different levels of evidence used to
change clinical care
• Review how failure to do good clinical research is
associated with bad outcomes
• To discuss different study designs, their strengths
and weaknesses.
• To provide guidance on how to perform good
clinical studies that improve neonatal care
What Changes Care?
• Evidence
– Publications
– Meta-analysis
– Consensus opinion
• Personal opinion (in my experience)
• Institutional history (the way we do it)
• Personal history (the last case I had)
Experience is the ability
to make the same mistake
repeatedly with increasing confidence
The goal of research is to
discover, learn, understand and
teach principles that improve
the quality of life
Levels of Evidence
• Level 1 – Results from randomized control trials
with meaningful outcome measure
• Level 2 – Case Control type of studies with treated
and untreated patients and minimal evidence of
selection biases
• Level 3 – The patient acts as his or her own
control or Case control with some selection bias
i.e., historical controls
• Level 4 – Case series
• Level 5 – Expert opinion based on experience
Pediatrics 2004;114(3):874
Problem
• “The best available evidence, however, is not
always sound or valid evidence. Sometimes, when
faced with a collection of reports that do not
constitute good evidence, attempts to choose the
best evidence become pointless; in this case, a
statement of no good evidence is preferable.”
• Ambalavanan N et al. Clin Perinatol 2003;
30:305-31
Definition of Research
http://www.nsf.gov/bfa/dias/policy/docs/45cfr690.pdf
• Research means a systematic investigation,
including research development, testing and
evaluation, designed to develop or contribute to
generalizable knowledge.
• Activities which meet this definition constitute
research for purposes of this policy, whether or not
they are conducted or supported under a program
which is considered research for other purposes.
• For example, some demonstration and service
programs may include research activities.
What is Research?
http://www.hhs.gov/ohrp/humansubjects/guidance/decisioncharts.htm#c2
• Human Subject Regulations Decision Charts
• The Office for Human Research Protections (OHRP)
provides the following graphic aids as a guide for
institutional review boards (IRBs), investigators, and
others who decide if an activity is research involving
human subjects that must be reviewed by an IRB under the
requirements of the U.S. Department of Health and Human
Services (HHS) regulations at 45 CFR part 46. OHRP
welcomes comment on these decision charts. The charts
address decisions on the following:
– whether an activity is research that must be reviewed by an IRB
– whether the review may be performed by expedited procedures,
and
– whether informed consent or its documentation may be waived.
Most Important Research Issue
Protect the people who are volunteering
to participate in a clinical research study.
Drug Misadventures
The Willowbrook Study
Vulnerable Population Rules
• Willowbrook State School was the site of a highly
controversial medical study conducted between 1963
and 1966 by medical researcher Saul Krugman.
• Healthy children were intentionally inoculated,
orally and by injection, with the virus that causes
hepatitis, then monitored to gauge the effects of
gamma globulin in combating it. A public outcry
forced the study to be discontinued.
• Researchers defended the deliberate injection of
these children by pointing out that the vast
majority of them would acquire the infection
anyway.
Iatrogenesis:
“Brought forth by a healer”
Since Hippocrates's
time, the potential
damaging effect
of a healer's actions
has been recognized:
480 BC
First do no harm
“primum non nocere”
The road to hell is paved with
good intentions……
-St. Bernard of Clairvaux ~1150 AD
Bloodletting
Iatrogenesis in Neonatology
• Lowered thermal environment: increased mortality, 1900–1964
• Supplemental oxygen: RLF (ROP), 1941–1954
• Initial thirsting and starving: neurological deficits, 1945–1970
• Synthetic vitamin K: kernicterus, 1945–1961
• Sulfisoxazole: kernicterus, 1953–1956
• Chloramphenicol: “gray baby” syndrome, 1956–1960
• Novobiocin: jaundice, 1957–1962
• Hexachlorophene: brain lesions, 1952–1971
• Epsom salts enemas: magnesium intoxication, 1964–1965
• Feeding gastrostomy: increased mortality, 1963–1969
• Benzyl alcohol: “gasping” syndrome, –1982
From: Robertson AF, Reflections on Errors in Neonatology (Part I,II,III) J Perinatology 2003
With the best intent we can do great harm.
• Eferol -- Increases NEC, liver injury and death
• Verapamil -- Causes profound bradycardia
• Post-natal steroids -- May increase brain injury
• Overventilation -- Increases the risk of PVL
Martone WJ et al. Illness with fatalities in premature infants:
association with an intravenous vitamin E preparation, EFerol. Pediatrics. 1986;78:591-600.
• A role for vitamin E in the prevention of
retrolental fibroplasia was first reported in 1949.
• A number of clinical studies between 1978 and
1983 suggested that vitamin E supplementation (to
normal or supranormal serum levels) might
prevent or ameliorate the course of retrolental
fibroplasia and other complications of prematurity,
especially following oxygen exposure.
• Vitamin E’s therapeutic or preventive role had not
yet been clarified before it was used.
MMWR
April 13, 1984 / 33(14);198-9
http://www.cdc.gov/mmwr/preview/mmwrhtml/00000319.htm
• CDC has received reports from two hospitals of clusters of an unusual illness
occurring among low-birth weight (less than 1,500 grams), premature infants in
neonatal intensive-care units.
• Thirteen affected infants developed clinically significant ascites, in addition to
some or all of the following abnormalities: hepatomegaly, splenomegaly,
cholestatic jaundice, azotemia, and thrombocytopenia.
• All affected infants had received parenteral nutrition therapy, in addition to
other supportive measures.
• An intravenous vitamin E preparation, containing 25 mg/ml vitamin E, 9%
polysorbate 80 and 1% polysorbate 20 in 2-ml vials (E-Ferol Aqueous
Solution®, distributed by O'Neal, Jones & Feldman, St. Louis, Missouri), was
introduced in each hospital for addition to parenteral nutrition solutions
approximately 1 month before the onset of illness in the first infant in both
clusters.
• All affected infants received E-Ferol; some affected infants received up to 1 ml
or more daily. Both outbreaks ceased shortly after use of E-Ferol was
discontinued.
Do You Remember E-Ferol? The Penalty for Selling
Untested Drugs in Neonatology
Jerold F. Lucey. Pediatrics 1992;89:159
• In 1984, E-Ferol killed at least 38 newborns.
• Iatrogenic disasters are often caused primarily by well-intentioned
physicians using logical therapies which turned
out to have unexpected, lethal side effects.
• A poorly managed, avaricious company, O'Neal, Jones and
Feldman, Inc. decided to get the jump on the market and sell
an untested preparation of intravenous vitamin E.
• Physicians assumed E-Ferol had been tested and approved for
use by the FDA. It hadn’t been tested.
• An astute clinician spotted the problem.
• On January 19, 1989, three defendants pleaded guilty and were
sentenced to fines of $130 000 each and 6-month jail
sentences. These penalties were for conspiracy, mail fraud, and
Cosmetic Act Felony.
Problems
• Clinical practice is dynamic
• Study design and execution can take years
• Multicenter studies require coordination,
which can further delay the process
• Publication delay is terrible
• Evidence-based practice requires
continuous reevaluation of practice
Change in Event Rate With Time
Term Neonates With Meconium Aspiration Syndrome
[Figure: percent of patients (0% to 30%) treated with surfactant, vasopressors, iNO, and ECMO, by year of discharge, 2000 to 2006]
Study Types
Study Designs
• Case Series
– Retrospective
– Prospective
• Case-Control
• Randomized Control Trial
– Crossover design
– No Crossover
Case Series -- Reporting Our
Experiences
• Evaluates the occurrence of an outcome and the
factors associated with that outcome
• Examples
– The effect of iNO on oxygenation
– What factors are associated with, not that cause, IVH?
• Advantage: Easy to do. Easy to get consent.
• Disadvantage: No concurrent controls
• Never proves efficacy or safety
Ways to Strengthen a Case Series
• Define study plan, outcomes measures, and
statistical methods prospectively
• Avoid hunting for an effect
• Be careful to evaluate the study sample for
selection bias (e.g., ECMO centers admit
other institutions' treatment failures)
• Do not over-interpret the results
Case-Control Trials
• Similar to case series except the cases are
compared to a defined group of controls
• Type of controls:
– Historical
– Same period, different location
– Matched for factors that influence the outcome
• Gives a better sense of efficacy but selection bias
and confounding variables remain a problem
Randomized Controlled Trials
• Patients are randomly assigned to one of
several defined groups. The management of
each group is strictly defined.
• Crossover design allows patients to
“crossover” into other treatment groups.
While easier to get consent for, the results
are difficult to interpret.
Outcomes = Study Endpoint = What is
really important?
• Physiology -- Heart rate, blood pressure, PaO2
• Health consequence -- Survival, chronic lung
disease, seizures, stroke, learning disability
• Quality of life -- Joyful participation in life
• Health economics -- Did we produce the same
outcome more efficiently?
Outcomes
“Not everything that can be
counted counts,
and not everything that
counts can be counted.”
Albert Einstein
Outcomes
• Primary -- The outcome we are most
interested in studying. Sample size is
determined by estimating the number of
patients needed to evaluate this endpoint.
Every aspect of study design is directed at
getting a clean measure of this outcome.
• Secondary -- All other measures of
outcome.
Characteristics of a Good Study
Endpoint or Outcome Measure
• Easy to measure and to define
– Survival is easy to define but hard to study
– Chronic lung disease is easy to study but hard to define
• Valuable
– Healthy survival
– Not a transient rise in PaO2
• Occurs at a frequency that is feasible to study
• Outcome change must be attributable to the
intervention studied
Surrogate Outcome Measures
• Definition -- A measure that predicts or is
closely associated with another measure of
outcome
• Example -- Grade 3-4 IVH is often used as
a surrogate measure (or proxy) of
neurological outcome. If we decrease the
rate of severe IVH, we predict that we will
improve neurological outcome.
Failure of Surrogate Markers
• Ment LR et al. Pediatrics 1994;93:543-550
– Low-dose prophylactic indomethacin decreased IVH
from 18 to 12% in neonates 0.6-1.25 kg
– Also reduced the rate of grade 3-4 IVH from 5 to 1.4%
– Survival was not significantly different but was better
(92 vs. 87%) in the treated group
• Ment LR et al. Pediatrics 1996;98:714-718
– Follow-up showed no difference in IQ or the
occurrence of cerebral palsy
Prophylactic Indomethacin
Ment et al. Pediatrics 1996;98:714-718
[Figure: bar chart comparing the indomethacin and control groups for mortality, all IVH, IVH 3-4, and CP; y-axis 0% to 25%]
Postnatal Steroids
• Meta-analysis shows that steroids reduce the risk of
CLD in premature neonates (Bhuta et al. Arch Dis Child
1998;79:F26)
• CLD is associated with poor neurodevelopmental
outcome.
• It might be expected that steroids would improve
neurodevelopmental outcome
• Instead early steroids increase neurodevelopmental
problems (Yeh et al. Pediatrics 1998;101)
Meta-analysis
• Summarize the results of different research
studies of related problems
• Systematic approach to identifying and
abstracting the critical information held
in each study
• Present a comprehensive best estimate
meant to summarize what is known about
the clinical problem
Meta-analysis
LeLorier et al. NEJM 1997;337:536

                          Large trial positive   Large trial negative   Total
Meta-analysis positive             13                     6               19
Meta-analysis negative              7                    14               21
Total                              20                    20               40
Evaluation of Meta-analysis
• Sensitivity: 65%
• Specificity: 70%
• Positive Predictive Value: 68%
• Negative Predictive Value: 67%
• Kappa value: 0.35 (0.06-0.64)
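
The values on this slide can be recomputed from the 2x2 counts. A minimal sketch in Python, assuming the rows of the table are the meta-analysis result (the "test") and the columns are the reference result, which is the orientation that reproduces the quoted percentages. The function name agreement_stats and the dictionary keys are illustrative, not from the source.

```python
def agreement_stats(a, b, c, d):
    """Summarize a 2x2 table.

    a: test positive, reference positive
    b: test positive, reference negative
    c: test negative, reference positive
    d: test negative, reference negative
    """
    n = a + b + c + d
    observed = (a + d) / n  # proportion of pairs that agree
    # Agreement expected by chance, from the row and column totals
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "ppv": a / (a + b),
        "npv": d / (c + d),
        "kappa": (observed - expected) / (1 - expected),
    }


# Counts from the slide: 13, 6 (first row) and 7, 14 (second row)
stats = agreement_stats(13, 6, 7, 14)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

Kappa corrects the raw agreement (27 of 40 pairs) for the agreement expected by chance alone, which is why it is much lower than the sensitivity or specificity.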
Definitions
• Relative Risk -- The probability (risk) of being
treated with ECMO if you get iNO compared to if
you did not get iNO (%ECMO use in iNO treated/
%ECMO in control not treated with iNO)
• Odds Ratio -- The odds that ECMO patients were
treated with iNO compared to the odds in patients who
did not get ECMO. (odds of iNO exposure in ECMO
patients) / (odds of iNO exposure in non ECMO patients),
where odds = p/(1 - p). Better
applied to morbidity factors like ICH, or CLD
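
The two definitions can be compared on the same 2x2 counts. A minimal sketch in Python; the counts (40 of 100 iNO-treated infants and 60 of 100 controls needing ECMO) are hypothetical and chosen only for illustration, and the function names are not from the source.

```python
def relative_risk(treated_events, treated_total, control_events, control_total):
    """Risk of the event in the treated group divided by risk in controls."""
    risk_treated = treated_events / treated_total
    risk_control = control_events / control_total
    return risk_treated / risk_control


def odds_ratio(treated_events, treated_total, control_events, control_total):
    """Cross-product ratio (a*d)/(b*c) of the 2x2 table."""
    a = treated_events                   # treated, event
    b = treated_total - treated_events   # treated, no event
    c = control_events                   # control, event
    d = control_total - control_events   # control, no event
    return (a * d) / (b * c)


# Hypothetical counts, not from any trial
rr = relative_risk(40, 100, 60, 100)
oratio = odds_ratio(40, 100, 60, 100)
print(f"relative risk = {rr:.2f}, odds ratio = {oratio:.2f}")
```

The cross-product form gives the same odds ratio whether the table is read as odds of ECMO given iNO exposure or odds of iNO exposure given ECMO. When the event is common, as in this example, the odds ratio lies further from 1 than the relative risk.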
Definitions
• Confidence interval -- A range of values, computed
from the sample, that is expected to contain the true
value with a stated level of confidence.
Usually reported as the 95% CI
• Standard Deviation -- a measure of the average
deviation from the mean: $SD = \sqrt{\sum_i (x_i - \bar{x})^2 / n}$,
where $x_i$ are the individual values, $\bar{x}$ is the mean
value, and $n$ is the number of measurements
• Standard Error of the Mean -- $SEM = SD / \sqrt{n}$,
where $n$ is the sample size.
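
These three definitions chain together: the standard deviation gives the standard error, and the standard error gives the confidence interval. A minimal sketch in Python, assuming a normal approximation (multiplier 1.96) for the 95% CI, which is rough for small samples. The PaO2 values are hypothetical, and the divisor is the number of measurements as on the slide (many packages divide by n - 1 instead).

```python
import math


def describe(values, z=1.96):
    """Return mean, SD, SEM, and an approximate 95% CI for the mean."""
    n = len(values)
    mean = sum(values) / n
    # SD as defined on the slide: divide by the number of measurements
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    sem = sd / math.sqrt(n)
    ci = (mean - z * sem, mean + z * sem)  # normal approximation (assumption)
    return mean, sd, sem, ci


# Hypothetical PaO2 measurements (mm Hg), for illustration only
pao2 = [52, 61, 58, 47, 66, 55, 60, 49]
mean, sd, sem, ci = describe(pao2)
print(f"mean={mean:.1f} SD={sd:.2f} SEM={sem:.2f} 95% CI=({ci[0]:.1f}, {ci[1]:.1f})")
```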
Relative Risk of Outcome
[Figure: schematic confidence intervals on a relative-risk axis running from 0.2 to 5, labeled: decreased death, good confidence; effect but no confidence; no effect; increased death, good confidence]
Efficacy or Equivalency or Non-inferiority?
• Efficacy trials are directed at proving that one therapy is
better than another with 95% confidence
• Equivalence implies that the two therapies produce the
same outcome
– If one therapy reduces health care cost, then we may only want to
show that the two approaches produce similar outcomes
• Non-inferiority trials are done to show that patients treated with
X do no worse than those treated with Y. They are like efficacy
trials, but only one-tailed tests are used. Required sample size is
smaller.
Relative Risk of Outcome
[Figure: schematic confidence intervals on a relative-risk axis running from 0.2 to 5, labeled: the new therapy is better and no riskier; effect but no confidence; equivalent but no confidence; equivalent, good confidence]
Relative Risk of Death and/or ECMO
(% Death or ECMO iNO/ % in Control)
[Figure: relative risk of death/ECMO, iNO/control, on an axis from 0.00 to 1.50, for NINOS (n = 235), INOSG (n = 58), Boston (n = 90), Ohmeda (n = 155), and Total (n = 538)]
Relative Risk of Death
[Figure: relative risk of death on an axis from 0 to 8, for NINOS (n = 235), INOSG (n = 58), Boston (n = 90), Ohmeda (n = 155), and Total (n = 538)]
Rate of ECMO or Death
NINOS trial, NEJM 1996
[Figure: rate of ECMO or death (0% to 100%), iNO vs. control, by diagnosis: MAS (n=116), RDS (n=25), Sepsis (n=50), PPHN (n=41), CDH (n=53)]
Problems With the General Application of
Any Model
• Standardized Rate -- Observed outcome rate divided
by the predicted rate
• Inadequate sample size
• Selection biases
• Neonatal care changes and the model must be
recalibrated
• May be slow in identifying poor performers if we
have to wait for adequate sample size
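
The standardized rate and the sample-size problem can be shown together. A minimal sketch in Python, assuming counts of observed and model-predicted events for one center. The interval uses a common log-scale approximation (standard error of the log ratio taken as 1/sqrt(observed)), which is an assumption of this sketch and not a method from the source; all numbers are hypothetical.

```python
import math


def standardized_rate(observed, predicted):
    """Observed events divided by the number the model predicts."""
    return observed / predicted


def approx_interval(observed, predicted, z=1.96):
    """Approximate 95% interval for the standardized rate.

    Assumes SE(log ratio) ~ 1/sqrt(observed); rough for small counts.
    """
    ratio = standardized_rate(observed, predicted)
    factor = math.exp(z / math.sqrt(observed))
    return ratio / factor, ratio * factor


# Two hypothetical centers with the same ratio but very different volumes
small = approx_interval(3, 2.4)    # 3 deaths observed, 2.4 predicted
large = approx_interval(30, 24.0)  # 30 deaths observed, 24 predicted
print(f"small center: 1.25, interval {small[0]:.2f} to {small[1]:.2f}")
print(f"large center: 1.25, interval {large[0]:.2f} to {large[1]:.2f}")
```

Both centers have a standardized rate of 1.25, but the small center's interval is far wider, which is why a poor performer with few patients can take a long time to identify.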
Sample Size Calculations
• Dependent on:
– The absolute event rate in the population
being studied
– The absolute difference between the two
groups
– How certain you want to be in the measured
difference
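
How these three inputs interact can be sketched with the standard normal-approximation formula for comparing two proportions. This is a minimal Python sketch, not the method used in any trial cited here; the function name, the default alpha of 0.05 (two-sided), the default power of 0.80, and the event rates are all assumptions chosen for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist


def patients_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate patients needed per group to detect p1 vs. p2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


# Hypothetical event rates
n_large_diff = patients_per_group(0.40, 0.30)   # 10-point absolute difference
n_small_diff = patients_per_group(0.40, 0.35)   # 5-point absolute difference
n_high_power = patients_per_group(0.40, 0.30, power=0.90)
print(n_large_diff, n_small_diff, n_high_power)
```

Halving the absolute difference roughly quadruples the required sample, and asking for more certainty (higher power or smaller alpha) raises it further.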
Effect of Sample Size On Confidence Interval and
Probability that the Proportions are Different
[Figure: outcome rate (0% to 100%) with confidence intervals for Group 1 and Group 2 at sample sizes of n=10, n=100, and n=1000 per group; p=0.6, p=0.16, and p<0.001, respectively]
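
The effect shown in this figure can be reproduced with a two-proportion z-test. A minimal sketch in Python, assuming event rates of 40% and 50% in the two groups; the slide does not state the rates, so these are hypothetical values that happen to give p-values of roughly the same size as those on the slide. The normal approximation is crude at n=10.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(events1, n1, events2, n2):
    """Two-sided p-value from a pooled two-proportion z-test."""
    p1, p2 = events1 / n1, events2 / n2
    pooled = (events1 + events2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Same assumed rates (40% vs. 50%) at three sample sizes per group
p_values = {}
for n in (10, 100, 1000):
    p_values[n] = two_proportion_p_value(4 * n // 10, n, 5 * n // 10, n)
    print(f"n={n} per group: p={p_values[n]:.4f}")
```

The observed difference is identical in all three cases; only the sample size changes the confidence that the difference is real.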
Site Variation
Center Differences & Outcomes
of Extremely Low Birth Weight Infants
[Figure: percentage who died or had developmental delay (0% to 120%), by site]
Vohr B et al. Pediatrics April 2004, pp 781
Across Site Variability
• Most sites randomize within center
• Each site is a mini-trial
• Differences in care are variable within sites and
may affect the efficacy of the drug being studied
within that site
– steroids
– nosocomial sepsis
– nutrition
– saturation targets
Site Variability in Proportion of Neonates Alive & Off Oxygen
Treatment Study (“All Treated”)
[Figure: percent alive and off oxygen (0 to 100) for Surfactant 1 and Surfactant 2, by site]
All Patients At Site
Difference between Group 1 and Group 2
Alive and off oxygen PMA36 for Treatment Study
[Figure: primary outcome, Group 2 - Group 1 (-60% to 60%), by site; positive values mean Surfactant 2 better, negative values mean Surfactant 1 better]
Testing a Test

                 Disease Present   Disease Absent
Test Positive           A                 B
Test Negative           C                 D

Sensitivity = A/(A+C)
Specificity = D/(B+D)
PPV = A/(A+B)
NPV = D/(C+D)
Solutions
• Create a network of centers with a common
goal to answer important questions
• Define answerable questions
• Limit data collection to confounding variables
• Define strict time lines and review progress on
a quarterly basis
• Present findings at national meetings
• Interpret results carefully
“For I was assailed by so many
doubts and errors that the only
profit I appeared to have drawn
from trying to become educated,
was progressively to
have discovered my ignorance.”
Descartes, Discourse on Method, 1637.