Transcript Strand21

Assessment Methodology: Lessons from OMERACT Meetings

Vibeke Strand, MD
Biopharmaceutical Consultant
Adjunct Clinical Professor, Division of Immunology, Stanford University

OMERACT: Outcome Measures in Rheumatology Clinical Trials

• I: 1992: Rheumatoid Arthritis Clinical Trials
• II: 1994: Adverse Events → Establishment of Registries; Health Related Quality of Life; Economic Evaluations
• III: 1996: Osteoarthritis; Osteoporosis; Psychosocial Measures
• IV: 1998: Longitudinal Observational Studies; RA Response Criteria / Imaging; Ankylosing Spondylitis → ASAS; Systemic Lupus Erythematosus
• 5: 2000: MCID; Economics: Cost Effectiveness; Imaging: Radiography and MRI
• 6: 2002: Economic Evaluations; Imaging

What is OMERACT?

• Data driven process to define outcome measures to be used in RCTs and LOS for each clinical indication
• Domains derived from the “Ds”: Discomfort, Disability, Dollar cost, Death
• Literature reviews, data available from LOS and RCTs:
  • Validity of currently defined instruments to assess outcome
  • “Data mining” to better understand clinical response
  • Correlation of patient reported responses with other outcome measures
  • Definition of “minimally clinically important improvement” = MCID

What is OMERACT?

• Presentation of evidence and development of consensus at each conference
  Representatives from: Academia, Clinical Investigators, Regulatory Agencies, Sponsors, Clinical Rheumatologists
• Goal: To Develop Recommendations for:
  • “Core Set” of minimum number of domains / outcome measures assessed in RCTs and LOS
  • Working agenda identifying ‘need’ to focus future work
• Previous OMERACT Recommendations have been ratified by WHO / ILAR in RA, OA, SLE, including HRQOL and Economic evaluations

The OMERACT ‘Umbrella’

RHEUMATOID ARTHRITIS: EULAR, ACR
JRA: PRCSG
SLE: SLICC, EULAR
OSTEOARTHRITIS: OARSI
ANKYLOSING SPONDYLITIS: ASAS
PAIN: IMMPACT

The OMERACT Filter

TRUTH: Is the measure truthful? Does it measure what is intended?
Face, content, construct and criterion validity

DISCRIMINATION: Does the measure discriminate between situations [states] of interest?
Reliability and sensitivity to change

FEASIBILITY: Can the outcome easily be measured given constraints of time, money and interpretability?

Boers et al: JRheum 1998: 25: 198-9

Rheumatoid Arthritis: OMERACT I, 1992

• RCTs available, but data limited
• Only a few included a measure of physical function
• General ‘belief’ that none had demonstrated convincing efficacy
• “Paper patients” derived from actual RCT data
  • → [healthy] arguments regarding changes reported
  • Clear disagreement about importance of MD Global assessments
• Participants ranked patient reported physical function and SJC highest when assessing efficacy
• Facilitated recognition that ‘perception’ of benefit variable

ACR Response Criteria

• Defined and Ratified after OMERACT I
• Data driven nominal group process
• Based on Paulus criteria and statistical analyses of CSSRD and MTX RCTs best differentiating active therapy from placebo
• Require ≥20% improvement in 5 of 7 measures:
  • Tender Joint Count and Swollen Joint Count
  • and 3 of the following 5: MD Global; Physical function: HAQ; Pain by VAS; Patient Global; ESR and/or CRP
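The "5 of 7" rule on this slide can be sketched in a few lines. This is a minimal illustration, not a validated scoring tool: the class name `AcrMeasures`, the field names, and the convention that lower values are better for every measure are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class AcrMeasures:
    """One visit's values; lower is assumed to be better for every field."""
    tender_joints: float
    swollen_joints: float
    md_global: float
    physical_function: float  # e.g. HAQ
    pain_vas: float
    patient_global: float
    acute_phase: float  # ESR or CRP


def pct_improvement(baseline: float, follow_up: float) -> float:
    """Percent improvement from baseline (positive means the value fell)."""
    if baseline <= 0:
        return 0.0  # nothing to improve from; treated as no improvement
    return 100.0 * (baseline - follow_up) / baseline


def acr_responder(baseline: AcrMeasures, follow_up: AcrMeasures,
                  threshold: float = 20.0) -> bool:
    """True if both joint counts and at least 3 of the other 5 measures
    improve by at least `threshold` percent."""
    def improved(name: str) -> bool:
        return pct_improvement(getattr(baseline, name),
                               getattr(follow_up, name)) >= threshold

    joints_ok = improved("tender_joints") and improved("swollen_joints")
    others = ["md_global", "physical_function", "pain_vas",
              "patient_global", "acute_phase"]
    return joints_ok and sum(improved(n) for n in others) >= 3
```

Both joint counts are mandatory, so a patient who improves on all five remaining measures but not on swollen joints is still a non-responder.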

EULAR Response Definition

                     Decrease in DAS28
DAS28 Score          >1.2        >0.6 to ≤1.2    ≤0.6
≤3.2                 Good        Moderate        None
>3.2 and ≤5.1        Moderate    Moderate        None
>5.1                 Moderate    None            None

• Discriminant function analysis of patients w/ active; inactive RA
• Disease activity state determined by treatment changes

Van Gestel et al. Arth Rheum 1996; 26:705-11
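The EULAR response definition combines the DAS28 score attained with the size of the decrease from baseline. A minimal sketch of that two-way lookup follows; the function name `eular_response` and the lowercase return labels are illustrative choices, and the cut-offs are the ones shown on the slide.

```python
def eular_response(das28_baseline: float, das28_current: float) -> str:
    """Classify response as 'good', 'moderate' or 'none'."""
    decrease = das28_baseline - das28_current
    if decrease <= 0.6:
        return "none"  # too small a change at any attained score
    if decrease > 1.2:
        # Large decrease: good only if low disease activity is reached
        return "good" if das28_current <= 3.2 else "moderate"
    # Decrease >0.6 to <=1.2: moderate unless activity is still high
    return "moderate" if das28_current <= 5.1 else "none"
```

For example, `eular_response(5.5, 3.0)` falls in the "good" cell, while a drop from 6.5 to 5.5 is classed as no response because the attained score is still above 5.1.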

As Demonstrated in RA, Responder Analyses Have Face and Content Validity

• Allow assessment of multiple domains
• Facilitate comparison of efficacy across:
  • Products
  • Heterogeneous populations, and
  • Disease indications
• May lead to tiered approach to label indications
• Precedent: ACR Responder Index in RA
  DAS28 both confirms active disease at baseline and ‘clinical responses’
  Additional data by x-ray and HRQOL

Rheumatoid Arthritis: Later Efforts

• Demonstrated that ‘generic’ measures of HRQOL sensitive to change in RA RCTs
• Identified ‘MCID’ for HAQ and SF-36, facilitating:
  • Comparisons across products, disease populations
  • Economic evaluations
• Helped to show impact of ‘Rheumatic Diseases’ to WHO
  • In this Bone and Joint Decade
  • Identified importance of Rheumatic Diseases relative to CV, DM, HTN, OP
• [Hopefully] → allocation of more resources to identify and treat Rheumatic Diseases

Minimum Clinically Important Differences [MCID]

• Degree of improvement
  • Perceptible to patients = clinically important / meaningful
  • Defined by patient query, Delphi technique
    OMERACT: 33-36% improvement; 18% > placebo
• Confirmed by statistical correlations with patient global assessments in RCTs in RA and OA
• Determination of proportion of patients with clinically important improvement provides a more interpretable result with direct clinical implications

Minimum Clinically Important Differences [MCID]

Instrument   Literature   Score Range     Direction of Scoring   MCID
HAQ DI       1-4          0 - 3           –                      0.22
SF-36        2, 5-7       0 - 100         +                      5 - 10 points
PCS/MCS                   mean 50 ± 10    +                      2.5 - 5 points

1 Guzman et al. Arth Rheum. 1996; 39:5208
2 Kosinski et al. Arth Rheum. 2000; 43:1478-87
3 Redelmeier et al. Arch Intern Med. 1993; 153:1337-42
4 Wells et al. J Rheumatol. 1993; 20:557-60
5 Kosinski et al. Arth Rheum. 2000; 43:S140
6 Samsa et al. Pharmacoeconomics. 1999; 15:141-155
7 Thumboo et al. J Rheumatol. 1999; 26:97-102.
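The direction of scoring matters when applying these thresholds: a HAQ DI improvement is a decrease, an SF-36 improvement is an increase. The sketch below shows one way to encode that. The dictionary keys and function name are hypothetical, and using the lower bound of each SF-36 range (5 and 2.5 points) is an assumption made for the example, not a recommendation from the slide.

```python
# (direction, threshold): direction -1 means lower scores are better.
MCID = {
    "HAQ_DI": (-1, 0.22),
    "SF36_domain": (+1, 5.0),    # assumed: lower bound of 5 - 10 points
    "SF36_summary": (+1, 2.5),   # assumed: lower bound of 2.5 - 5 points
}


def meets_mcid(instrument: str, baseline: float, follow_up: float) -> bool:
    """True if the change from baseline is an improvement of at least the MCID."""
    direction, threshold = MCID[instrument]
    improvement = direction * (follow_up - baseline)
    return improvement >= threshold
```

With these settings, a HAQ DI change from 1.5 to 1.0 counts as clinically important, while an SF-36 domain change from 40 to 43 does not.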

Health Assessment Questionnaire (HAQ)

• Widely accepted, validated, rheumatology-specific instrument to assess physical function in RA
• Gold Standard: OMERACT/FDA Guidance
• 20 questions covering 8 types of activities:
  Dressing + Grooming; Arising; Eating; Walking; Hygiene; Reaching; Gripping; Activities of Daily Living
• HAQ Disability Index (HAQ DI)
  • Scores the worst items within each of the eight scales
  • Based on use of aids and devices
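The two scoring bullets can be made concrete with a short sketch. It follows the conventional HAQ DI rules as commonly described: each item is scored 0 to 3, the category score is the worst item, use of an aid or device raises a category score of 0 or 1 to 2, and the index is the mean of the category scores. The requirement of at least 6 scored categories and all names in the code are assumptions not stated on the slide.

```python
from typing import Dict, Iterable, List, Optional


def haq_di(category_items: Dict[str, List[Optional[int]]],
           aids_or_devices: Iterable[str] = ()) -> float:
    """Mean of per-category scores, where each category takes its worst item.

    category_items maps a category name to its item scores (0-3, None if
    unanswered). aids_or_devices lists categories where an aid is used.
    """
    aided = set(aids_or_devices)
    scores = []
    for category, items in category_items.items():
        answered = [i for i in items if i is not None]
        if not answered:
            continue  # category cannot be scored
        score = max(answered)  # worst item in the category
        if category in aided and score < 2:
            score = 2  # aid or device use raises a 0 or 1 to 2
        scores.append(score)
    if len(scores) < 6:
        raise ValueError("fewer than 6 scorable categories (assumed minimum)")
    return sum(scores) / len(scores)
```

The result stays on the 0 - 3 range of the individual items, which is why a change of 0.22 can be read directly against the MCID.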

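Mean Improvement in HAQ Disability Index: Year-2 Cohorts at 24 Months
[Figure: bar chart. Treatment groups: LEF, MTX, SSZ. Studies: US301; MN301/303/305; MN302/304. Group sizes as shown: (97) (101) (51) (46) (248) (273). Y-axis: change in HAQ DI from 0 to -1, negative = improvement. Labeled values as extracted: -0.22, -0.37, -0.56, -0.6, -0.73, -0.48, -0.56. *LEF vs MTX; p=0.01. % Achieving MCID: 84%, 69%, 86%, 82%, 74%, 78%.]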

ATTRACT: HAQ Disability Index Mean Improvement through Week 102
[Figure: bar chart. Groups: MTX + Placebo; 3 mg/kg q8w; 3 mg/kg q4w; 10 mg/kg q8w; 10 mg/kg q4w; All infliximab. Y-axis: 0 to 0.5. Labeled values as extracted: 0.2, 0.4, 0.5, 0.5, 0.4, 0.45. p-value vs. MTX + Placebo: < 0.001 for each of the four p-values shown.]

ERA: Mean Change in HAQ DI at Month 12
[Figure: bar chart. Groups: MTX, ETN. Baseline HAQ DI: 1.6 in both groups. Y-axis: -0.1 to -0.8; MCID line marked. Labeled values as extracted: -0.80, -0.70.]

Kosinski et al. AJMC. 2002;8:231-240

Mean Changes in HAQ DI at Weeks 24 and 52: Anakinra+MTX
[Figure: bar chart at 24 weeks and 52 weeks. Groups: Placebo+MTX, Anakinra+MTX. Baseline: 1.38 Placebo, 1.43 Active. Y-axis: 0.0 to -0.8; MCID line marked. Labeled values as extracted: -0.18, -0.29, -0.15, -0.28.]

Fleishman et al. Arth Rheum. 2002;46:S574.

Mean Changes in HAQ DI at Weeks 24 and 52: DE019: Adalimumab+MTX
[Figure: bar chart at 24 weeks and 52 weeks. Groups: Placebo (BL 1.48); Adalimumab 20 mg weekly (BL 1.45); Adalimumab 40 mg eow (BL 1.44). Y-axis: 0.0 to -0.8; MCID line marked. Labeled values as extracted: -0.24, -0.25, -0.6, -0.56, -0.61, -0.59.]

Keystone E. Arthritis & Rheum 2002; 46(9) suppl.

Mean Changes in HAQ DI from Weeks 30 to 54: ASPIRE RCT
[Figure: bar chart. Groups: MTX; MTX+INF 3 mg/kg; MTX+INF 6 mg/kg. Baseline HAQ DI: 1.5, 1.5, 1.5. Y-axis: -0.1 to -0.8; MCID line marked. Labeled values as extracted: -0.75, -0.78, -0.79. % Achieving MCID: 65, 76, 76.]

Smolen et al. Ann Rheum Ds 2003;62:S64

Mean Changes in HAQ DI at Week 52: TEMPO RCT
[Figure: bar chart. Groups: MTX, ETN, MTX+ETN. Baseline HAQ DI: 1.7, 1.7, 1.8. Y-axis: -0.1 to -1.0; MCID line marked. Labeled values as extracted: -0.61, -0.66, -0.97.]

SF-36: Short Form 36 Health Survey

• Validated, widely used generic measure of HRQOL
• 8 Domains:
  • Scored 0 - 100; age, sex adjusted rates
• 2 Summary Scores
  • Physical Component: PCS
    – Measures how decrements in physical function affect day to day activities
    – Impact of physical impairment/disability on HRQOL
  • Mental Component: MCS
    – Impact of mental affect, symptoms of pain on HRQOL
  • Normative based scoring (Mean: 50, SD: 10)
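Normative based scoring rescales a score so that the reference population has mean 50 and SD 10. The sketch below shows only that rescaling step; the function name and the example norm values are hypothetical, and the published PCS/MCS algorithm additionally weights the eight domains, which is not reproduced here.

```python
def norm_based_score(raw: float, norm_mean: float, norm_sd: float) -> float:
    """Rescale a raw score so the reference population has mean 50, SD 10."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    z = (raw - norm_mean) / norm_sd  # distance from the norm in SD units
    return 50.0 + 10.0 * z


# Hypothetical norm (mean 80, SD 20), for illustration only:
# a raw score of 40 is 2 SDs below the norm.
two_sd_below = norm_based_score(40.0, norm_mean=80.0, norm_sd=20.0)
```

On this scale "2 SDs below US Norm" corresponds to a score of 30, which is the reference line used on the PCS slides.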

SF-36 Two-Component Model

Physical Component: Physical Function, Role Physical, Bodily Pain, General Health
Mental Component: Vitality, Social Function, Role Emotion, Mental Health

US 301: Baseline SF-36 Scores: US Norms vs US301 Population
[Figure: Series: US Norms (A/S Adjusted); Study US301 Population. Y-axis: 0 to 100. Domains: Physical Function, Role Physical, Bodily Pain, General Health Perception, Vitality, Social Function, Role Emotion, Mental Health.]

US301: Mean Improvement in SF-36: Year-2 Cohorts, Leflunomide and Methotrexate
[Figure: Series: LEF 24 Months (n = 93); MTX 24 Months (n = 89); US Norms (A/S Adjusted); Baseline Year-2 Cohort. Y-axis: 0 to 90, higher = better. Domains: Physical Function, Role Physical, Bodily Pain, General Health Perception, Vitality, Social Function, Role Emotion, Mental Health.]

Mean Changes in SF-36 Scores: DE019: Adalimumab+MTX
[Figure: bar chart. Groups: Placebo; Adalimumab (40 mg) QOW. Y-axis: 0 to 35; MCID marked. Labeled values as extracted: 28.1, 23.3, 5.2, 14.6, 13.5, 8.2, 3.5, 16.9, 8.7, 9.0, 7.5, 13.4, 5.2, 15.5, 2.3, 6.7.]

Keystone E. Arthritis & Rheum 2002; 46 suppl.

Leflunomide and Methotrexate: Mean Changes in SF-36 PCS, Year-2 Cohort (US301)
[Figure: Groups: LEF (93), MTX (97), each at BL, 12 M, 24 M. Y-axis: 0 to 60; reference lines at US Norm and at 2 SDs below US Norm. Labeled values as extracted: 42.7, 41.7, 38.6, 38.8, 30.9, 30.2.]

Etanercept and Methotrexate: Mean Changes SF-36 PCS at 12 Months (ERA)
[Figure: Groups: ETN 25mg (193), MTX (199), each at BL and 12M. Y-axis: 0 to 60; reference lines at US Norm and at 2 SDs below US Norm. Labeled values as extracted: 38.7, 38.8, 28.0, 29.2.]

Kosinski et al. AJMC. 2002;8:231-240.

Infliximab: Median Improvement in SF-36 PCS at Month 24 (ATTRACT)
[Figure: bar chart. Baseline: 23.9 – 30.8. Y-axis: 0 to 16. Labeled values as extracted: 6.8, 6.9, 6.7, 4.6, 2.8.]

Group                        p-value vs. placebo
MTX + Placebo (n=88)
3 mg/kg q 8 wks (n=86)       0.011
3 mg/kg q 4 wks (n=86)       <0.001
10 mg/kg q 8 wks (n=87)      <0.001
10 mg/kg q 4 wks (n=81)      <0.001

Kavanaugh et al. Arth Rheum. 2000;43:S147.

Anakinra+MTX: Mean Improvement in SF-36 PCS at Month 12
[Figure: Baseline: 29.9 PL, 28.8 Active.]

Fleishman et al. Arth Rheum. 2002;46:S574.

Correlation Between HAQ and SF-36

Reference        Study
Ruta 1           -
Talamo 2         -
Kavanaugh 3      Infliximab/ATTRACT
Kosinski 4       Etanercept/ERA
Lubeck 5         Etanercept/RAPOLO
Strand 6         Leflunomide/US301

Scales: SF-36 PCS and PF
Correlation with HAQ, as extracted: -0.77, -0.72, -0.51, -0.54, -0.60, -0.61, -0.79, -0.82, -0.60, -0.74

1 Ruta et al. Br J Rheum. 1998;37:425-436.
2 Talamo et al. Br J Rheum. 1997;36:463-469.
3 Kavanaugh et al. A&R. 2000;43:S147.
4 Kosinski et al. Medical Care. 1999;37:MS23-39.
5 Lubeck et al. Value in Health. 2001;4:MS2,163.
6 Strand et al. A&R. 2001;44:S187.
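The correlations on this slide are negative because the two instruments are scored in opposite directions: higher HAQ means more disability, higher SF-36 means better health. A minimal Pearson correlation sketch shows the sign convention. The function name is illustrative and the paired scores are made-up numbers, not data from any of the cited studies.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired scores for illustration only.
haq = [0.5, 1.0, 1.5, 2.0, 2.5]
sf36_pf = [80.0, 65.0, 55.0, 35.0, 30.0]
r = pearson(haq, sf36_pf)  # negative: worse HAQ goes with lower PF
```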

MCID Values Are Consistent in RCTs in RA

• Improvements in HAQ DI and SF-36 in RA with newly approved therapies are statistically significant; more importantly, CLINICALLY MEANINGFUL
• MCID values are consistent across agents and patient populations
  • Disease specific [‘relevant’] measure: HAQ
  • Generic measure: SF-36
• Improvements in disease specific measures highly correlated with improvements in generic measures

MCID Workshop: Identifying Candidate Measures to Define ‘Low Disease Activity State’

• Pain
• Function
• Inflammation
• Health Related Quality of Life
• Structure damage
• Toxicity
• Co-morbidity
• Fatigue

Osteoarthritis

• OMERACT III: 1996
• Candidate instruments to assess:
  • Pain
  • Stiffness
  • Physical Function
• Limited data from RCTs; treatments offering only symptomatic benefit
• Identification of a ‘Core Set’ of 4 Domains as a foundation for future work
• Research Agenda: Identification of ‘Disease Control’, ‘Biologic Markers’ of Response

Western Ontario and McMaster Universities (WOMAC) Osteoarthritis Index

• Self-administered questionnaire
  • Developed querying patients with hip or knee OA
  • Reflects physical activities most affected by symptoms, disease manifestations
• Composite score based on 24 questions; subscores:
  • Pain (5 questions)
  • Joint stiffness (2 questions)
  • Physical function (17 questions)
• Scored by 0 - 4 Likert or 0 - 10 cm VAS scales
• Improvement = negative change
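The subscale structure above can be sketched for the Likert version as follows. This is an illustrative sum-score only: the function name and the returned dictionary layout are assumptions, and simple summation of 0 - 4 item scores is assumed rather than any normalised scoring variant.

```python
from typing import Dict, Sequence

# Number of questions per subscale, as listed on the slide.
WOMAC_ITEMS = {"pain": 5, "stiffness": 2, "function": 17}


def womac_likert(pain: Sequence[int], stiffness: Sequence[int],
                 function: Sequence[int]) -> Dict[str, int]:
    """Sum 0-4 Likert items into subscale scores and a composite."""
    answers = {"pain": pain, "stiffness": stiffness, "function": function}
    scores: Dict[str, int] = {}
    for name, items in answers.items():
        if len(items) != WOMAC_ITEMS[name]:
            raise ValueError(f"{name}: expected {WOMAC_ITEMS[name]} items")
        if any(not 0 <= v <= 4 for v in items):
            raise ValueError(f"{name}: items must be scored 0 to 4")
        scores[name] = sum(items)
    scores["composite"] = sum(scores[name] for name in WOMAC_ITEMS)
    return scores


def womac_change(baseline: int, follow_up: int) -> int:
    """Change score; a negative value means improvement."""
    return follow_up - baseline
```

Because 17 of the 24 questions belong to physical function, that subscale contributes most of the composite.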

[Figure: OA core set diagram with concentric inner, middle and outer cores. Domains shown: PAIN, PHYSICAL FUNCTION, PATIENT GLOBAL, IMAGING (≥1YR), HRQOL / UTILITY, STIFFNESS, MD GLOBAL, BIOLOGIC MARKERS, INFLAMMATION, OTHER (e.g., performance based, flares, time to surgery, analgesic count). Labels as extracted: 90%, 36%, 8%.]

% Voting for inclusion    Placement      Consequence
≥ 90%                     INNER Core     CORE SET
≥30% - <90%               MIDDLE Core    HRQOL / Utility (Strongly Recommended)
0% - <30%                 OUTER Core     OPTIONAL
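The voting thresholds map directly onto core placement. A small sketch of that mapping follows, with hypothetical function and label names; only the cut-offs (90% and 30%) are taken from the slide.

```python
from typing import Tuple


def core_placement(pct_voting_for_inclusion: float) -> Tuple[str, str]:
    """Return (placement, consequence) for a domain's share of votes."""
    pct = pct_voting_for_inclusion
    if not 0 <= pct <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if pct >= 90:
        return ("inner core", "core set")
    if pct >= 30:
        return ("middle core", "strongly recommended")
    return ("outer core", "optional")
```

For example, a domain receiving 36% of votes lands in the middle core, while one receiving 8% is optional.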

Outcome Measures in OA: OARSI Guidelines OMERACT Core Set and ‘Strongly Recommended’

Pain: WOMAC pain / stiffness subscales
  Differentiating pain from stiffness
Physical function: WOMAC physical function subscale
Patient Global Assessment: Signal joint; Transition question
  How to phrase question? In all the ways arthritis affects you, how are you doing today?
HRQOL/Utilities: WOMAC Composite Score; SF-36; EQ5D / Utilities
MD Global Assessment

WOMAC Scores in OA RCTs: Identifying MCID

• MCID in WOMAC composite score, Likert scale:
  • Anchored to Patient Global Assessment
  • 12 wk pivotal OA RCTs with Celecoxib: 10.1 [0 – 89]
  • Pain, Stiffness, Physical Fxn: 2.1, 1.2, 6.5 [0 – 20] [0 – 8] [0 – 61]

Zhao et al. Pharmacother 1999;19:1269-78

• MCID in WOMAC VAS:
  • Anchored to Patient Response to Rx [0-4 Likert scale]
  • 6 wk RCTs OA hip, knee; Rofecoxib v Ibuprofen v PL:
  • Pain, Stiffness, Physical Fxn: 9.7, 10, 9.3 mm, VAS
  • 11 mm VAS for Patient Global Assessment

Ehrich et al: JRheum 2000;27: 2635-2641
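Once an MCID is fixed, a trial arm can be summarised as the proportion of patients whose improvement reaches it. The sketch below uses the Likert and VAS thresholds quoted on this slide; the constant and function names are hypothetical, and change scores are assumed to follow the WOMAC convention that negative change is improvement.

```python
from typing import Sequence

# Thresholds quoted on the slide (Likert points and mm VAS).
WOMAC_LIKERT_MCID = {"composite": 10.1, "pain": 2.1,
                     "stiffness": 1.2, "function": 6.5}
WOMAC_VAS_MCID_MM = {"pain": 9.7, "stiffness": 10.0, "function": 9.3}


def proportion_achieving_mcid(changes: Sequence[float], mcid: float) -> float:
    """Share of patients whose change from baseline improves by >= mcid.

    Negative change means improvement, so a patient responds when the
    size of the decrease is at least the MCID.
    """
    if not changes:
        raise ValueError("no change scores supplied")
    responders = sum(1 for change in changes if -change >= mcid)
    return responders / len(changes)
```

For instance, `proportion_achieving_mcid([-12, -5, -11, 0], WOMAC_LIKERT_MCID["composite"])` gives 0.5, since two of the four hypothetical patients improve by more than 10.1 points.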

Improvement in WOMAC Composite Scores at Week 12: Pivotal OA RCTs, Celecoxib
[Figure: bar chart. Studies: CT20: knee; CT21: knee; CT54: hip. Groups: Placebo, Cel 50, Cel 100, Cel 200, Nap 500. Y-axis: 0 to 14; MCID = 10.1 (SE=0.4) marked. * P <.05 v placebo.]

Zhao et al Pharmacother 1999;19:1269-78

WOMAC Physical Function Subscale, knee or hip OA at 12 months: Pivotal RCT, Rofecoxib
[Figure: line chart. Groups: Rofecoxib 12.5 mg, Rofecoxib 25 mg, Diclofenac 150 mg. X-axis: Week (R, 2, 4, 8, 12, 26, 39, 52); R = randomization. Y-axis: 0 to -35 mm; MCID = 9.3 marked. Mean baseline = 69.6 mm. P < 0.05 for all groups; treatment response compared with baseline.]

Cannon GW, et al. Arthritis Rheum. 2000;43:978–987.


SF-36 in Osteoarthritis RCTs

Truth or Validity
• Domains, especially Bodily Pain discriminated differences/changes in symptoms over time
• Closer correlation with patient assessed outcomes
  Ware et al: A+R 1996; 39:S90

Feasibility or Reliability
• Ceiling effects minimal; floor effects for RP and RE domains
  Ware et al: A+R 1996; 39:S90
• Able to detect effects of arthritis in community sample
  Hill et al: JRheum 1999; 26:2029-35

Discrimination or Responsiveness
• In longitudinal tests, BP domain and PCS summary score most responsive, even within 2-6 weeks
  Bellamy et al, A+R 2000; S221
• Valid and responsive measure of TKR, esp long term
  Brooks et al, A+R 1997; 40:S110
• Short term treatment → significant improvement in MCS
  Ehrich et al: JRheum 2000;27: 2635-2641

Mean Improvement in SF-36: All Rofecoxib v Normative Data US Population
[Figure: Series: US Norms (difference between ages 45-54 and 55-64, US population; Ware et al 1993); Rofecoxib. Y-axis: 0 to 25. Domains: PF, RP, PAIN, GHP, VITAL, SOC, RE, MH.]

Change in SF-36 Scores at Week 12: OA of knee, Pivotal Trial with Celecoxib
[Figure: bar chart. Groups: Placebo, Cel 50, Cel 100, Cel 200, Nap 500. Domains: PF, RP, BP, GH, VT, SF, RE, MH. Y-axis: -1 to 24. * p < .05 v placebo.]

Use of WOMAC and SF-36 in RCTs of OA Conclusions Based on the COX-2 Experience

• WOMAC Questionnaire reflects clinical improvement consistent with other patient assessed measures
  • Proved valid, reliable and sensitive to change
  • Pain and stiffness subscales reflect symptoms
  • Physical function subscale dominates composite score
• WOMAC Composite score is a disease specific measure of HRQOL
  • Correlates closely with improvements reported by generic SF-36
• Based on MCID calculations, Likert and VAS versions similarly sensitive to change

OMERACT 4 SLE Module 1998: Goal

• To develop consensus on required outcome domains to be assessed in clinical trials in SLE
• Paucity of data from Randomized Controlled Trials [RCTs]; most evidence derived from Longitudinal Observational Studies [LOS]

Strand et al: J Rheum 1999; 26: 490-497 Smolen et al: J Rheum 1999; 26: 504-507

Disease Activity Indices BILAG, ECLAM, LAI, SLAM, SLEDAI

• Good evidence for validity, discrimination, feasibility in published cohort [LOS] studies
  • Changes in one index correlated with others
  • Recommendation to use index of choice
    – Computer generation of all 5 indices facilitates:
      • Clinical research efforts: SLICC, ESCICIT, EURO-LUPUS
      • Exchange of information: interested parties biotech / pharma
• Some limitations when used as primary outcome measures in RCTs; ongoing efforts to improve

SF-36: Sensitive to Change in LOS in SLE

• Baseline domain scores low in SLE
  – v. age/gender matched norms for Canada, Norway, UK, US
  – v. serious medical problems (IDDM, CAD)

Gladman et al: J Rheum 1995; 23:1953-5

• In cohort studies reflects changes in disease activity measures
  – ↑ disease activity → ↓ in PF, BP, GHP
  – ↓ disease activity → ↑ SF-36 domain scores, esp. PF

Gordon et al: A+R 1997; 40:S112
Gladman et al: Clin Exp Rheum 1995; 14:305-8
Stoll et al: J Rheum 1997; 24:309-13 and 1608-14
Fortin et al: Lupus 1998; 7:101-7

• Decrements in multiple domains correlate with increased disease activity and damage

Abu-Shakra et al J Rheum 1999; 26:306-9 Thumboo et al J Rheum 1999; 26:97-102 Wang et al J Rheum 2001; 28:525-32

– Immunosuppressive use

Rood et al J Rheum 2000; 27:2057-9

– ESRD

Vu, Escalante J Rheum 1999; 26:2595-2601

Domains Recommended by OMERACT 4

Disease activity:
  Disease Activity Scores: SLEDAI, BILAG, ECLAM, SELENA SLEDAI, SLAM-R
  Definitions of Active Nephritis by U/A, 24 hour CCr, proteinuria, «Renal flare»
  «Major SLE Flare»
Damage:
  ACR/SLICC Damage Index
  End Stage Renal Disease [ESRD]
  Doubling of Serum Creatinine
  Chronicity Index on Biopsy
  Bone loss due to disease activity and/or corticosteroids
HRQOL: SF-36
[Should also include: Adverse events; Economic costs including health utilities]

As reviewed in Schiffenbauer et al: EBM Treatment of SLE; BJR: in press

Ankylosing Spondylitis: ASAS

• A successful and relevant example
• To be discussed by Robert Landewe, Juergen Braun

Systemic Sclerosis Workshop: OMERACT 6

Absence of data:
• Few ‘failed’ RCTs
• Limited information from LOS

Assessment by organ system involvement:
• Renal
• Cardio-pulmonary
• Muscle
• HRQOL
• Skin
• GI

OMERACT 7
May 12-16, 2004, Asilomar, California

• Module: RA: Definition of Low Disease Activity
• Module Updates:
  • Imaging in Ankylosing Spondylitis [ASAS]
  • Working Group on Safety
• Workshops:
  • Outcome Measures in Psoriatic Arthritis
  • Outcome Measures in Fibromyalgia
  • Outcome Measures in Gout
  • The Patient Perspective in Outcome Measures