Nan Rothrock, Ph.D.
Northwestern University
May 22, 2012
• Problems in patient-reported outcome measures
• PROMIS approach to PRO instrument development
• Available PROMIS instruments
• Reliability, validity
• PROMIS and the FDA

Many measures of the same health concept
Widely varying quality
Difficult to compare and combine data
. . . across studies
. . . across conditions
Complex
Long
[Figure: on a scale running from -3 to 3, a questionnaire with a wide range but low precision is contrasted with a questionnaire with high precision but a small range.]
National Institutes of Health, 2003
• Attack the Patient-Reported Outcome (PRO) “Tower of Babel”
• Harness modern psychometric methods
• Improve quality and interpretability of PROs
[Image: Bruegel, 1563]



Nine-year commitment of NIH
$80+ million investment
15 funded research sites

Methodology

Measures (Instruments)

Software



Item = question or statement a patient answers
Instrument = collection of items
Legacy = existing instrument that is a “gold standard” or a commonly used and widely accepted instrument

Domain focused, not disease focused
  • Domain = feeling, function or perception you want to measure (e.g., anxiety, physical function, general health perceptions)

Item Banks
  • A large collection of items measuring one domain
  • Any and all items can be used to provide a score
  • Can be administered as Computerized Adaptive Tests (CATs) or fixed-length short forms
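
The idea that any subset of items from a bank can yield a score, and that a CAT picks items adaptively, can be sketched in a few lines of Python. This is a minimal illustration under loud assumptions, not the PROMIS algorithm: the item bank, its parameters, the `run_cat` function, and the stopping values are all hypothetical, and it uses a yes/no two-parameter logistic (2PL) model for brevity. The fatigue items shown in this deck have five response options, so a real implementation would need a polytomous model.

```python
import math

# Hypothetical 2PL item bank: item id -> (discrimination a, location b).
ITEM_BANK = {
    "item_a": (1.2, -1.5),
    "item_b": (1.8, -0.5),
    "item_c": (2.0, 0.0),
    "item_d": (1.5, 0.8),
    "item_e": (1.0, 1.6),
}
THETA_GRID = [x / 10.0 for x in range(-40, 41)]  # standardized scale, -4 to 4


def prob_endorse(theta, a, b):
    """Probability of endorsing a yes/no item under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information a 2PL item provides at a given theta."""
    p = prob_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


def eap_estimate(responses):
    """Posterior mean and SD of theta under a standard normal prior."""
    weights = []
    for theta in THETA_GRID:
        w = math.exp(-0.5 * theta * theta)  # unnormalized prior density
        for item_id, endorsed in responses.items():
            p = prob_endorse(theta, *ITEM_BANK[item_id])
            w *= p if endorsed else (1.0 - p)
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(THETA_GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(THETA_GRID, weights)) / total
    return mean, math.sqrt(var)


def run_cat(answer_fn, se_target=0.3, max_items=4):
    """Administer the most informative unused item until the posterior SD
    drops to se_target or max_items have been asked."""
    responses = {}
    theta, se = eap_estimate(responses)
    while len(responses) < max_items and se > se_target:
        remaining = [i for i in ITEM_BANK if i not in responses]
        if not remaining:
            break
        next_item = max(
            remaining, key=lambda i: item_information(theta, *ITEM_BANK[i])
        )
        responses[next_item] = answer_fn(next_item)
        theta, se = eap_estimate(responses)
    return theta, se, list(responses)


if __name__ == "__main__":
    # A simulated respondent who endorses every item.
    print(run_cat(lambda item_id: True))
```

A fixed-length short form corresponds to calling `eap_estimate` on a preselected set of items. Either way the score lands on the same scale, which is the point of an item bank.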
[Figure: PROMIS instrument development process. Steps shown: domain framework; literature review; archival data analysis; focus groups; binning and winnowing; expert item revision; cognitive interviews; translation review; literacy level analysis; large-scale testing; statistical analysis; intellectual property; calibration decisions; short form; CAT; validation studies; expert review/consensus.]
Self-Reported Health
  Physical Health: Symptoms; Function
  Mental Health: Affect; Behavior; Cognition
  Social Health: Relationships; Function
Physical Health
  Adult: Pain Behavior; Pain Interference; Pain Intensity; Fatigue; Sleep Disturbance; Sleep-related Impairment; Physical Function; Sexual Function
  Pediatric/Parent Proxy: Pain Interference; Upper Extremity; Fatigue; Mobility; Asthma Impact
Mental Health
  Adult: Anxiety; Depression; Anger; Psychosocial Illness Impact; Applied Cognition Concerns; Applied Cognition Abilities; Alcohol Use; Alcohol Consequences; Alcohol Expectancies
  Pediatric/Parent Proxy: Anxiety; Depression; Anger
Social Health
  Adult: Ability to Participate in Roles & Activities; Satisfaction with Roles & Activities; Companionship; Emotional Support; Informational Support; Instrumental Support; Social Isolation
  Pediatric/Parent Proxy: Peer Relationships
In the past 7 days …
Response options: Never, Rarely, Sometimes, Often, Always (numbered 1 to 5 on the form)

FATEXP20  How often did you feel tired?
FATEXP5   How often did you experience extreme exhaustion?
FATEXP18  How often did you run out of energy?
FATIMP33  How often did your fatigue limit you at work (include work at home)?
FATIMP30  How often were you too tired to think clearly?
FATIMP21  How often were you too tired to take a bath or shower?
FATIMP40  How often did you have enough energy to exercise strenuously?

Reprinted with permission of the PROMIS Health Organization and the PROMIS Cooperative Group © 2007.

Available
  • Universal Spanish

In Process
  • German
  • Portuguese
  • Mandarin Chinese
  • French
  • Italian
  • Norwegian
  • Others: see nihpromis.org/measures/translations

T Score
  • Mean = 50
  • Standard Deviation = 10

Referenced to the US General Population
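
The T-score metric described here is a linear rescaling of a standardized score. A minimal Python sketch, assuming the underlying score (called `theta` here, a name chosen for illustration) is already standardized to mean 0 and standard deviation 1 in the reference population:

```python
def theta_to_t(theta: float) -> float:
    """Rescale a standardized score (mean 0, SD 1) to a T score (mean 50, SD 10)."""
    return 50.0 + 10.0 * theta


def t_to_theta(t_score: float) -> float:
    """Inverse of theta_to_t: express a T score in SD units from the reference mean."""
    return (t_score - 50.0) / 10.0


if __name__ == "__main__":
    for theta in (-1.0, 0.0, 1.5):
        print(theta, theta_to_t(theta))
```

Read this way, a T score of 60 sits one standard deviation above the US general population mean on the measured domain.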

Adult
  • GI Symptoms
  • Self-efficacy for management of chronic disease

Pediatric
  • Pain Behavior, Quality, Intensity
  • Physical Activity
  • Experience of Stress
  • Subjective Well-being
  • Impact of Child Illness on Family
  • Family Belongingness
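[Figure: measurement error (0 to 0.5) plotted across the score range, shown on both a standardized axis (-4 to 2) and a T-score axis (10 to 70), for PROMIS Short Form (10 items), PROMIS Short Form (20 items), CAT (10 items), SF-36 (10 items), and HAQ (20 items). Reference lines: SE = 3.3 (rel = 0.90) and SE = 2.3 (rel = 0.95). Score distributions shown for rheumatoid arthritis patients and the US general population.]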
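
The pairing of standard errors with reliabilities in this figure (SE = 3.3 with rel = 0.90, SE = 2.3 with rel = 0.95) can be reproduced with a short Python sketch. The slide does not say which formula was used, so this is an assumption: it treats the true-score variance as 1 on the standardized metric, giving reliability = 1 / (1 + SE²), with SE first converted from T-score points to standard deviation units. The function name is hypothetical.

```python
def reliability_from_se(se_t: float, sd_t: float = 10.0) -> float:
    """Approximate reliability from a standard error given in T-score points.

    Assumes true-score variance of 1 on the standardized metric, so
    reliability = true variance / (true variance + error variance).
    """
    if sd_t <= 0:
        raise ValueError("sd_t must be positive")
    se_z = se_t / sd_t  # standard error in SD units
    return 1.0 / (1.0 + se_z ** 2)


if __name__ == "__main__":
    for se in (3.3, 2.3):
        print(se, round(reliability_from_se(se), 2))
```

Under this assumption the two reference lines round to the reliabilities labeled in the figure.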
[Figure: scatterplot of CESD scores against PROMIS Depression scores, with score distributions for each measure; r = 0.84.]
[Figure: effect size (0 to 1.6) by time point (1 to 3), comparing PROMIS Pain Interference (median = 4 (4-12) items) with BPI Interference (7 items).]
[Figure: effect size (0 to 1.6) by time point (1 to 3), comparing Pain Behavior (median = 4 (4-4) items) with Roland-Morris (24 items).]




Importance of including patient voices in PRO development
Importance of sound measurement
Confusion in selecting an instrument because of the huge array of choices
Ongoing discussions via the Interagency Clinical Outcomes Assessment Working Group to qualify PROMIS Fatigue measures; attendance and presentations at the PRO Consortium


FDA Approach → evaluate content validity in each clinical population in which the measure may be used
PROMIS Approach → there is commonality in patients’ experiences of symptoms/outcomes and their impact on QOL
  • Need to re-validate a well-developed & valid instrument in a target population is questionable
Magasi, S. et al. (2011). Content validity of patient-reported outcome measures: Perspectives from a PROMIS meeting. Quality of Life Research.
PROMIS FatigueSF v1.0 and PROMIS FatigueMS Scores by Disability Status, Fatigue Severity, and Vitality Scores

                                                 PROMIS FatigueSF v1.0    PROMIS FatigueMS
Group                                      N     Mean     SD              Mean     SD
Expanded Disability Status Scale (EDSS)
  Mild (0-4)                               83    52.2     8.2             52.5     9.2
  Moderate (4.5-6.5)                       104   60.5     6.4             60.7     5.6
  Severe (7.0-9.5)                         43    60.7     8.3             60.5     8.7
Fatigue Severity (0-10 NRS)
  None/Mild (0-1)                          18    43.0     4.5             42.5     5.4
  Moderate (2-4)                           58    51.0     6.0             51.3     6.6
  Severe (5-10)                            154   61.7     5.8             61.9     5.5
Vitality (item from the MOS)
  None/A little                            52    63.8     5.3             64.2     5.4
  Some                                     88    59.9     6.3             60.1     5.5
  Quite a lot                              44    55.7     6.6             56.0     6.8
  Very Much                                45    47.5     7.3             47.0     7.9

Cook et al, QOLR, 2011
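
One way to read the separation between groups in this table is as a standardized mean difference. The slide does not report such a statistic, so the following Python sketch is an illustration only: it applies the standard pooled-SD formula for Cohen's d to the N, mean, and SD values for the None/Mild and Severe fatigue severity groups on the PROMIS FatigueSF v1.0. The function name is hypothetical.

```python
import math


def cohens_d(n1, mean1, sd1, n2, mean2, sd2):
    """Standardized mean difference (group 2 minus group 1) using the pooled SD."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean2 - mean1) / math.sqrt(pooled_var)


if __name__ == "__main__":
    # Values from the Fatigue Severity rows, PROMIS FatigueSF v1.0 columns.
    d = cohens_d(n1=18, mean1=43.0, sd1=4.5, n2=154, mean2=61.7, sd2=5.8)
    print(round(d, 2))
```

The two groups differ by 18.7 T-score points, which works out to about 3.3 pooled standard deviations.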



Supplement with targeted measures
Item banking allows flexible item choice without loss of a standard scoring base
The alternative is a messy array of competing instruments whose scores cannot be related to one another for severity or result interpretation

Comparability
  • Provide the ability to compare or combine results from multiple studies.

Reliability and Validity
  • Reduce response burden.
  • Improve measurement precision.
  • Simplify use via computer-based administration, scoring, and reporting.

IRT-based minimally important differences (MIDs) on a T-Score scale
Multiple cross-sectional and longitudinal anchors (18)
Summarized with nonparametric statistics (median, interquartile range)
Recommended IRT-based T-Score MIDs and Raw Score MIDs for PROMIS-Cancer Short Forms in Advanced Cancer Patients

Instrument           T-Score MID   T-Score MID      Raw Score MID   Raw Score MID
                     Points        Effect Sizes*    Points          Effect Sizes‡
Fatigue              3-5           0.4 - 0.7        2-3             0.4 - 0.6
Pain Interference    4-6           0.5 - 0.7        4-7             0.3 - 0.6
Physical Function    4-6           0.5 - 0.7        3-6             0.3 - 0.6
Anxiety              3-5           0.3 - 0.6        3-4             0.5 - 0.6
Depression           3-5           0.3 - 0.5        3-4             0.4 - 0.6

*Calculated as the T-Score MID divided by the Assessment 1 T-Score standard deviation
‡Calculated as the Raw Score MID divided by the Assessment 1 Raw Score standard deviation
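
The footnote calculation (an MID divided by the Assessment 1 standard deviation) can be written as a short Python sketch. The Assessment 1 standard deviations are not given on the slide, so the value 7.5 below is a hypothetical number chosen only to show the arithmetic, and the function name is likewise illustrative.

```python
def mid_effect_size(mid_points: float, baseline_sd: float) -> float:
    """Express an MID as an effect size: MID divided by the baseline SD."""
    if baseline_sd <= 0:
        raise ValueError("baseline_sd must be positive")
    return mid_points / baseline_sd


if __name__ == "__main__":
    hypothetical_sd = 7.5  # assumed for illustration; not taken from the source
    for mid in (3.0, 5.0):  # a 3-5 point T-Score MID range
        print(mid, round(mid_effect_size(mid, hypothetical_sd), 2))
```

Because the divisor is the sample's own baseline standard deviation, the same MID in points can correspond to different effect sizes in different samples.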
Recommended IRT-based T-Score MIDs for PROMIS-Cancer CATs

CAT                  T-Score MID   T-Score MID
                     Points        Effect Sizes
Fatigue              4.5-5.0       0.57-0.63
Pain Interference    3.0-6.0       0.34-0.67
Physical Function    4.5-7.5       0.21-0.80
Anxiety              3.0-4.5       0.38-0.58
Depression           2.5-4.5       0.32-0.58