Monitoring clinical performance
100 years of living science
Dr Paul Aylin
Dr Foster Unit Imperial College
[email protected]
8th November 2007
Contents
•Background
•Data sources
• Clinical information systems
• Routinely collected hospital data
•Methods
• Casemix adjustment
• Analysis and presentation
•Interpretation of performance data
Florence Nightingale
Uniform hospital statistics would:
“Enable us to ascertain the relative mortality of
different hospitals as well as of different diseases
and injuries at the same and at different ages, the
relative frequency of different diseases and
injuries among the classes which enter hospitals
in different countries, and in different districts of
the same country”
Nightingale 1863
Heart operations at
the BRI
“Inadequate care for
one third of children”
Harold Shipman
Murdered more than
200 patients
Final report of the Bristol Inquiry
“Bristol was awash with data. There was
enough information from the late 1980s
onwards to cause questions about mortality
rates to be raised both in Bristol and
elsewhere had the mindset to do so existed.”
Clinical databases
Need to collect extensive clinical information
to facilitate adequate adjustment for case-mix
has contributed to the creation and
maintenance of clinical databases
A survey of multicentre clinical databases
found the existence of 105 such clinical
databases in many areas of UK healthcare
Black et al. BMJ 2004
Bristol (Kennedy) Inquiry Report: data were available all
the time
“From the start of the 1990s a national
database existed at the Department of Health
(the Hospital Episode Statistics database)
which among other things held information
about deaths in hospital. It was not recognised
as a valuable tool for analysing the
performance of hospitals. It is now, belatedly.”
Hospital Episode Statistics UK administrative data
Electronic record of every inpatient or day case
episode of patient care in every NHS (public) hospital
14 million records a year
300 fields of information including
• Patient details such as age, sex, address
• Diagnosis using ICD10
• Procedures using OPCS4
• Admission method
• Discharge method
HES regarded as unreliable by many clinicians
Comparison of administrative data vs clinical databases
Isolated CABG
• HES has around 10% fewer cases than the National Cardiac Surgical Database
Fifth National Adult Cardiac Surgical Database Report 2003. The Society of
Cardiothoracic Surgeons of Great Britain and Ireland. Dendrite Clinical Systems Ltd.
Henley-Upon-Thames. 2004.
Vascular surgery
• HES = 32,242
• National Vascular Database = 8,462
Aylin P; Lees T; Baker S; Prytherch D; Ashley S. Descriptive study comparing
routine hospital administrative data with the Vascular Society of Great Britain and
Ireland's National Vascular Database. Eur J Vasc Endovasc Surg 2007;33:461-465
Bowel resection for colorectal cancer
• HES 2001/2 = 16,346
• ACPGBI 2001/2 = 7,635
• In the ACPGBI database, 39% of patients had missing data for the risk factors
Garout M, Tilney H, Aylin P. Comparison of administrative data with the Association
of Coloproctology of Great Britain and Ireland (ACPGBI) colorectal cancer database.
International Journal of Colorectal Disease 2007.
Cost
• Administrative data £1 per record
• Clinical databases range from £10 (UK
Cardiac Surgical Register) to £60 (Scottish
Hip Fracture Audit)
Raftery J, Roderick P, Stevens A. Potential use of
routine databases in health technology assessment.
Health Technol Assess 2005;9(20)
Whatever source of information
•Timely feedback
•Accessible to clinicians
•Case mix adjustment
Case mix adjustment
Limited within administrative data?
• Age
• Sex
• Emergency/Elective
Risk adjustment models using HES on 3 index procedures
•CABG
•AAA
•Bowel resection for colorectal cancer
Risk factors
Age
Recent MI admission
Sex
Charlson comorbidity score
(capped at 6)
Method of admission
Number of arteries replaced
Revision of CABG
Part of aorta repaired
Year
Part of colon/rectum removed
Deprivation quintile
Previous heart operation
Previous emergency admissions
Previous abdominal surgery
Previous IHD admissions
ROC
ROC curve areas comparing ‘simple’, ‘intermediate’ and ‘complex’ models derived
from HES with models derived from clinical databases for four index procedures
[Figure: y-axis: ROC curve area (0.5 to 1); x-axis: index procedure (CABG; AAA - unruptured; AAA - ruptured; Colorectal excision for cancer). Series: HES Simple model (Year, age, sex); HES Intermediate model (including method of admission); HES Full model; Best model derived from clinical dataset]
Aylin P; Bottle A; Majeed A. Use of administrative data or clinical databases as predictors of risk of death in hospital:
comparison of models. BMJ 2007;334: 1044
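The ROC curve area (c statistic) used to compare these models is the probability that a randomly chosen patient who died was given a higher predicted risk than a randomly chosen survivor. A minimal sketch of that calculation follows; the function name and the six example patients are invented for illustration and are not HES data.

```python
def roc_area(risks, died):
    """Area under the ROC curve (c statistic) by pairwise comparison.

    risks: predicted probabilities of death
    died:  1 if the patient died, 0 otherwise
    """
    deaths = [r for r, d in zip(risks, died) if d == 1]
    survivors = [r for r, d in zip(risks, died) if d == 0]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")
    concordant = 0.0
    for risk_death in deaths:
        for risk_survivor in survivors:
            if risk_death > risk_survivor:
                concordant += 1.0      # model ranked the pair correctly
            elif risk_death == risk_survivor:
                concordant += 0.5      # ties count as half
    return concordant / (len(deaths) * len(survivors))


# Illustrative, made-up predictions and outcomes
risks = [0.02, 0.05, 0.10, 0.20, 0.40, 0.60]
died = [0, 0, 1, 0, 1, 1]
print(f"ROC area = {roc_area(risks, died):.3f}")
```

A value of 0.5 means the model discriminates no better than chance, and 1 means every death was ranked above every survivor.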
Calibration plots for ‘complex’ HES-based risk prediction models for four index procedures
showing observed number of deaths against predicted based on validation set
[Figure: four panels (Surgery for isolated CABG; Surgery for unruptured AAA; Surgery for ruptured AAA; Surgery for colorectal cancer). y-axis: operative mortality; x-axis: deciles based on risk (1 to 10, then All). Series: Observed mortality; Model]
Aylin P; Bottle A; Majeed A. Use of administrative data or clinical databases as predictors of risk of death in hospital:
comparison of models. BMJ 2007;334: 1044
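A calibration check of this kind can be sketched as follows: sort patients by predicted risk, cut them into ten equal-sized groups, and compare observed with predicted mortality in each group. The function name and the simulated patients are assumptions made for illustration; the outcomes are generated from the predicted risks, so this toy model is well calibrated by construction.

```python
import random


def calibration_by_decile(risks, died, groups=10):
    """Return (observed mortality, mean predicted risk, n) per risk group."""
    pairs = sorted(zip(risks, died))          # lowest predicted risk first
    n = len(pairs)
    rows = []
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        observed = sum(d for _, d in chunk) / len(chunk)
        predicted = sum(r for r, _ in chunk) / len(chunk)
        rows.append((observed, predicted, len(chunk)))
    return rows


# Simulated validation set (hypothetical, not HES data)
rng = random.Random(1)
risks = [rng.uniform(0.01, 0.30) for _ in range(2000)]
died = [1 if rng.random() < r else 0 for r in risks]

rows = calibration_by_decile(risks, died)
for i, (obs, pred, n) in enumerate(rows, start=1):
    print(f"decile {i:2d}: observed {obs:.1%}  predicted {pred:.1%}  n={n}")
```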
Current casemix adjustment model for each procedure and
diagnosis
Adjusts for
• age
• sex
• emergency status
• socio-economic deprivation
• diagnosis subgroup (3 digit ICD10) or procedure
subgroup
• co-morbidity – Charlson index
• number of prior emergency admissions
• palliative care
• year
• Month of admission (for some respiratory diseases)
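As a rough sketch of how such a casemix model is used: a logistic regression gives each patient a predicted risk, the risks are summed to give a unit's expected deaths, and observed deaths divided by expected deaths gives a relative risk. Every coefficient below is hypothetical and invented for illustration; only a few of the listed factors are included, and the cap on the Charlson score at 6 follows the earlier risk-factor slide.

```python
import math

# HYPOTHETICAL log-odds coefficients, invented for illustration only.
# Real values would come from a logistic regression fitted to national data.
COEF = {"intercept": -5.0, "age_decade": 0.4, "male": 0.1,
        "emergency": 1.2, "charlson": 0.3}


def predicted_risk(age, male, emergency, charlson):
    """Predicted probability of in-hospital death for one patient."""
    log_odds = (COEF["intercept"]
                + COEF["age_decade"] * age / 10
                + COEF["male"] * male
                + COEF["emergency"] * emergency
                + COEF["charlson"] * min(charlson, 6))  # score capped at 6
    return 1 / (1 + math.exp(-log_odds))


def relative_risk(patients):
    """Observed deaths divided by casemix-expected deaths for one unit.

    patients: iterable of (age, male, emergency, charlson, died) tuples.
    """
    patients = list(patients)
    observed = sum(p[4] for p in patients)
    expected = sum(predicted_risk(*p[:4]) for p in patients)
    return observed / expected


# Four made-up patients from one unit
unit = [(72, 1, 1, 3, 1), (65, 0, 0, 1, 0), (80, 1, 1, 6, 0), (58, 0, 0, 0, 0)]
print(f"RR = {relative_risk(unit):.2f}")
```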
Current ROC curve areas (based on 1996/7-2006/7 HES data) for 30-day in-hospital mortality
•Repair of AAA = 0.792
•Infra-inguinal bypass = 0.800
•AP resection of rectum = 0.808
•Anterior resection of rectum = 0.813
•Hip replacement = 0.851
•Transplantation of heart and lung = 0.569
•Excision of head of pancreas = 0.681
•Graft of bone marrow = 0.666
Issues
Important to only adjust for parameters outside the
control of the unit in question
Comparison of percentage of AVSD operations including outcome (death, alive or
unknown) by age at admission (in months) between UBHT and elsewhere in England
during supra-regional funding period (HES 1 April 1991 to 31 March 1995) aged under
18 months
[Figure: y-axis: percentage (0% to 40%); x-axis: age in months (0 to 17). Series: Elsewhere - Died; Elsewhere - Alive or unknown; UBHT - Died; UBHT - Alive or unknown]
Comparison of percentage of open operations including outcome (death, alive or
unknown) by age at admission (in months) between UBHT and individual centres
during supra-regional funding period (HES 1 April 1991 to 31 March 1995) aged
under 18 months
[Figure: y-axis: percentage of operations (0% to 40%); x-axis: age at operation in months (0 to 17). Series: Other centres; UBHT]
Presentation of clinical outcomes
“Even if all surgeons are equally good, about
half will have below average results, one will
have the worst results, and the worst results
will be a long way below average”
• Poloniecki J. BMJ 1998;316:1734-1736
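Poloniecki's point can be illustrated with a small simulation in which every surgeon has exactly the same true mortality risk, so all differences between them are due to chance. The number of surgeons, operations per surgeon, true rate and random seed are all hypothetical choices made for this sketch.

```python
import random


def simulate_identical_surgeons(n_surgeons=30, ops_each=200,
                                true_rate=0.03, seed=2007):
    """Observed mortality rates for surgeons who all share one true risk."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_surgeons):
        deaths = sum(1 for _ in range(ops_each) if rng.random() < true_rate)
        rates.append(deaths / ops_each)
    return rates


rates = simulate_identical_surgeons()
average = sum(rates) / len(rates)
worse = sum(1 for r in rates if r > average)
print(f"average observed mortality: {average:.1%}")
print(f"surgeons worse than average: {worse} of {len(rates)}")
print(f"best {min(rates):.1%}, worst {max(rates):.1%}")
```

Sorting these rates into a league table would still produce a top and a bottom, even though no surgeon in the simulation is better than any other.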
HSMR
RR of death following CABG, HES data 1999/00 to 2001/02
[Figure: y-axis scale 0 to 4.5; x-axis: centre (NHS trust names)]
Criticisms of ‘league tables’
• Spurious ranking – ‘someone’s got to be
bottom’
• Encourages comparison when perhaps not
justified
• 95% intervals arbitrary and no
consideration of multiple comparisons
• Single-year cross-section – what about
change?
Account has to be taken of chance variation
Bayesian approach using Monte Carlo
simulations can provide confidence intervals
around ranks
Can also provide probability that a unit is in top
10%, 5% or even is at the top of the table
See Marshall et al. (1998). League tables of in vitro
fertilisation clinics: how confident can we be about
the rankings? British Medical Journal, 316, 1701-4.
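A minimal sketch of the Monte Carlo idea: draw a plausible mortality rate for each unit many times, rank the units within each draw, and summarise the ranks each unit receives. This simplified version assumes independent Beta posteriors under a uniform prior and no casemix adjustment, which is an assumption of the sketch and not necessarily the model used by Marshall et al.; the three example units are made up.

```python
import random


def rank_intervals(deaths, operations, n_sims=2000, seed=1):
    """Return (lower rank, upper rank, P(ranked top)) for each unit.

    Rank 1 is the lowest mortality. Lower and upper are the 2.5th and
    97.5th percentiles of the simulated ranks.
    """
    rng = random.Random(seed)
    n_units = len(deaths)
    rank_draws = [[] for _ in range(n_units)]
    top_counts = [0] * n_units
    for _ in range(n_sims):
        # One draw from each unit's Beta(1 + deaths, 1 + survivors) posterior
        draws = [rng.betavariate(1 + d, 1 + n - d)
                 for d, n in zip(deaths, operations)]
        order = sorted(range(n_units), key=lambda i: draws[i])
        for rank, unit in enumerate(order, start=1):
            rank_draws[unit].append(rank)
        top_counts[order[0]] += 1
    results = []
    for unit in range(n_units):
        ranks = sorted(rank_draws[unit])
        lower = ranks[int(0.025 * n_sims)]
        upper = ranks[int(0.975 * n_sims) - 1]
        results.append((lower, upper, top_counts[unit] / n_sims))
    return results


# Three made-up units: deaths out of 200 operations each
results = rank_intervals([5, 10, 30], [200, 200, 200])
for unit, (lo, hi, p_top) in enumerate(results, start=1):
    print(f"unit {unit}: rank interval {lo}-{hi}, P(top) = {p_top:.2f}")
```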
Ranking
Rankings for CABG mortality 1999/00 to 2001/02
[Figure: y-axis scale 0 to 35; x-axis: centre (NHS trust names)]
Statistical Process Control (SPC) charts
Shipman:
• Aylin et al, Lancet (2003)
• Mohammed et al, Lancet (2001)
• Spiegelhalter et al, J Qual Health Care (2003)
Surgical mortality:
• Poloniecki et al, BMJ (1998)
• Lovegrove et al, CHI report into St George’s
• Steiner et al, Biostatistics (2000)
Public health:
• Terje et al, Stats in Med (1993)
• Vanbrackle & Williamson, Stats in Med (1999)
• Rossi et al, Stats in Med (1999)
• Williamson & Weatherby-Hudson, Stats in Med (1999)
Common features of SPC charts
Need to define:
• in-control process (acceptable/benchmark performance)
• out-of-control process (that is cause for concern)
Test statistic
• difference between observed and benchmark performance
• calculated for each unit at each time point
Pre-defined alarm threshold
• minimise false alarms but remain sensitive to true signals
Types of SPC chart
Shewhart
• test statistic based on current observation only
• no formal adjustment for multiple testing
Funnel plots
• Can incorporate adjustment for between centre variation
• Easy to interpret
Mortality for paediatric cardiac surgery, 1991-Mar 95 for open operations for children
aged under 1 year using SCTS data with 95% and 99.8% control limits based on the
national average
[Figure: funnel plot. y-axis: mortality rate (0% to 40%); x-axis: number of operations (0 to 900). Labelled point: Bristol]
Funnel plots
• No ranking
• Visual relationship with volume
• Takes account of increased variability of
smaller centres
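The funnel shape comes from control limits that narrow as the number of operations grows. A minimal sketch, assuming a normal approximation to the binomial (exact binomial limits are an alternative) and a hypothetical benchmark mortality of 12%:

```python
import math
from statistics import NormalDist


def funnel_limits(benchmark_rate, n_operations, coverage):
    """Lower and upper control limits around a benchmark mortality rate."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2)   # two-sided limit
    se = math.sqrt(benchmark_rate * (1 - benchmark_rate) / n_operations)
    lower = max(0.0, benchmark_rate - z * se)
    upper = min(1.0, benchmark_rate + z * se)
    return lower, upper


benchmark = 0.12  # hypothetical national average, for illustration only
for n in (50, 100, 300, 900):
    lo95, hi95 = funnel_limits(benchmark, n, 0.95)
    lo998, hi998 = funnel_limits(benchmark, n, 0.998)
    print(f"n={n:3d}  95%: {lo95:.1%} to {hi95:.1%}   "
          f"99.8%: {lo998:.1%} to {hi998:.1%}")
```

A centre is plotted at its own volume and observed rate, and is flagged only if it falls outside the limits for that volume.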
Prospective surveillance and multiple testing
• No prior hypothesis
• Prospective surveillance involves monitoring at
multiple time points
• Sensitivity and specificity of surveillance methods
depend on number of tests (time points) carried
out
• Statistical process control charts (SPC) among
the most widely used methods for sequential
analysis
• Care required when applying SPC charts in health
care setting
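The effect of repeated testing can be shown with a short calculation: if each test has a fixed false alarm rate, the chance of at least one false alarm rises quickly with the number of tests. The sketch assumes the tests are independent, which is a simplification because successive monitoring points usually share data.

```python
def prob_any_false_alarm(alpha, n_tests):
    """Chance of at least one false alarm in n independent tests."""
    return 1 - (1 - alpha) ** n_tests


for n_tests in (1, 5, 12, 20, 50):
    p = prob_any_false_alarm(0.05, n_tests)
    print(f"{n_tests:2d} tests at 5% each: P(any false alarm) = {p:.0%}")
```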
Prospective SPC charts
Cumulative sums of outcomes
accumulate information on performance over time
formal assessment of sensitivity and specificity
different ways of deriving test statistic
• Log-likelihood CUSUM (our preferred
method)
• Sequential Probability Ratio Test (SPRT)
• Exponentially Weighted Moving Average
(EWMA)
Risk-adjusted Log-likelihood CUSUM charts
STEP 1: estimate pre-op risk for each patient, given
their age, sex etc. This may be national average or
other benchmark
STEP 2: Order patients chronologically by date of
operation
STEP 3: Choose chart threshold(s) of acceptable
“sensitivity” and “specificity” (via simulation)
STEP 4: Plot function of patient’s actual outcome v
pre-op risk for every patient, and see if – and why –
threshold(s) is crossed
More details
• Based on log-likelihood CUSUM to detect a
predetermined increase in risk of interest
• Taken from Steiner et al (2000); pre-op risks
derived from logistic regression of national data
• The CUSUM statistic is the log-likelihood test
statistic for binomial data based on the predicted
risk of outcome and the actual outcome
• Models can adjust for age, sex, emergency
status, socio-economic deprivation etc.
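A minimal sketch of a risk-adjusted log-likelihood CUSUM in the style of Steiner et al (2000). Each patient adds a weight that depends on their predicted risk and actual outcome, and the chart is held at zero from below. The odds ratio of 2 (a doubling of the odds of death) and the threshold of 4.5 are hypothetical values chosen for illustration; in practice the threshold is chosen by simulation, as the steps above describe.

```python
import math


def cusum_weight(risk, died, odds_ratio=2.0):
    """Log-likelihood ratio weight for one patient.

    Compares 'odds of death multiplied by odds_ratio' against
    'odds of death as predicted by the casemix model'.
    """
    denom = 1 - risk + odds_ratio * risk
    if died:
        return math.log(odds_ratio / denom)
    return math.log(1 / denom)


def risk_adjusted_cusum(risks, outcomes, odds_ratio=2.0, threshold=4.5):
    """Return the chart values and the first patient number to signal.

    risks and outcomes must already be in chronological order.
    """
    chart, value, first_signal = [], 0.0, None
    for i, (risk, died) in enumerate(zip(risks, outcomes), start=1):
        value = max(0.0, value + cusum_weight(risk, died, odds_ratio))
        chart.append(value)
        if first_signal is None and value >= threshold:
            first_signal = i
    return chart, first_signal


# Made-up sequence: pre-op risk of 10% for everyone, deaths marked 1
risks = [0.10] * 12
outcomes = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1]
chart, first_signal = risk_adjusted_cusum(risks, outcomes)
print([round(v, 2) for v in chart])
print("first signal at patient:", first_signal)
```

Deaths push the chart up and survivals pull it down, with a death in a low-risk patient counting for more than a death in a high-risk patient. This sketch does not reset the chart after a signal.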
AAA mortality monitoring
[Figure: CUSUM chart. y-axis: CUSUM value (0 to 5); x-axis: patient number (1 to 81). Labels: CUSUM statistic; Lower SMR limit]
My Practice Score Cards
Urology Score Card
Page 2 – Urology – Scrotal procedures
How do you interpret performance data?
Pyramid model of investigation to find credible cause
explanation
Lilford et al. Lancet 2004; 363: 1147-54
How do you interpret performance data?
•Check the data
•Difference in casemix
•Examine organisational or procedural
differences
•Only then consider quality of care
Challenges
•Data quality
•Consensus
•Primary care
•Does information change practice?
Food for thought
• an estimated one in ten patients admitted to hospital
suffers an adverse event
• an estimated 850,000 adverse events might occur each
year in NHS hospitals
• some adverse events will be inevitable complications of
treatment, but around half may be avoidable - that is,
over 400,000 potentially avoidable adverse events
every year
Vincent, C.A. Presentation at BMJ conference ‘Reducing Error in Medicine’, London,
March 2000
Adverse events in British hospitals: preliminary retrospective record review
Charles Vincent, Graham Neale, and Maria Woloshynowych BMJ 2001; 322:
517-519.
Food for thought
• eight per cent of adverse events result in death and six
per cent in permanent disability - that is, over 34,000
preventable deaths and over 25,000 preventable
permanent disabilities every year
• compensation for clinical negligence costs the NHS
more than £400 million a year and altogether
outstanding claims for clinical negligence add up to
over £2.4 billion.
Vincent, C.A. Presentation at BMJ conference ‘Reducing Error in Medicine’, London,
March 2000
Adverse events in British hospitals: preliminary retrospective record review
Charles Vincent, Graham Neale, and Maria Woloshynowych BMJ 2001; 322:
517-519.
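The headline numbers on the two ‘Food for thought’ slides can be reproduced with simple arithmetic. The sketch below is one reading of the slides: it takes ‘around half’ as exactly 50% and applies the same avoidable share to deaths and to permanent disabilities.

```python
adverse_events = 850_000                       # estimated per year in NHS hospitals

avoidable_events = adverse_events // 2         # "around half may be avoidable"
deaths = adverse_events * 8 // 100             # 8% result in death
disabilities = adverse_events * 6 // 100       # 6% result in permanent disability

preventable_deaths = deaths // 2
preventable_disabilities = disabilities // 2

print(f"avoidable adverse events: {avoidable_events:,}")
print(f"preventable deaths: {preventable_deaths:,}")
print(f"preventable permanent disabilities: {preventable_disabilities:,}")
```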