CDISC Implementation at BI – Overview


Implementation of CDISC at BI
– Overview
CDISC German User Group Meeting
Sep 2009
Dr. Jens Wientges
IBM Global Business Services
Life Sciences / Pharma Consulting
© 2009 IBM Corporation
1. Motivation, Objectives and expected Benefits
2. System Landscape, Data Flow and Processes
3. Approach
4. Real life examples of issues and sponsor defined elements
Implementing CDISC at BI (ICBI) - Motivation
– Requests for analyses on substance/project databases (SDB/PDB) are increasing
  • need to use and exploit clinical data effectively beyond single trials
  • need to build efficient substance databases
– A harmonized data model based on CDISC allows for
  • a wider range of standard reporting tools
  • re-use of standard programs
  • facilitated familiarization with new trials/projects
  • higher flexibility in assignments to projects
  • quicker response to regulatory requests (same view on data)
– BI has taken the decision to implement the CDISC data standards to effectively manage, exploit and report clinical data
ICBI - Objectives
Corporate wide, Harmonized Clinical Data Structure
1. Effective for:
- single clinical trials
- pooled databases (PDB)
2. Operational data structure, allowing:
- data quality checks
- ADS/ADaM generation
- ad hoc statistical analysis
3. Based on the principles of the CDISC data standards
ICBI - Business Benefits
Shown in three categories:
1. Submission / Regulatory Compliance
2. Knowledge Generation
3. Effort & Time Saving
ICBI - Business Benefits
1. Submission / Regulatory Compliance
– Working with a data structure close to the one requested for submission
  • allows traceability from analysis data (ADaM) back to raw data (BI-CDISC and plain SDTM)
  • allows for semi-automated generation of plain SDTM and define.xml
  • is a one-time effort per submission
  • is less time consuming
  • creates no external costs
– Having the same view on data as authorities
  • increases transparency
  • leads to higher efficiency / shorter turn-around time in answering questions
→ Standardized Data Structure will
  - further enhance compliance with regulatory requirements
  - allow more efficient creation of the submission package
ICBI - Business Benefits
2. Knowledge Generation
Working with one data structure across trials:
• Allows easier creation of PDB and pooling of trial data
• Leads to effective meta-analyses on project and/or substance level
• Increases re-use of standard programs, program templates and views
• Supports exchange between OPUs and functions (e.g. PK/PD, PGx, partners, …)
• Allows (semi-)automated load, transformation and incorporation of external data from vendors, suppliers, pharmaceutical and collaboration partners
• Leads to higher flexibility in assignments to trial & project tasks
• Reduces time to answer internal requests (various customers, e.g. medical affairs)
• Reduces time to answer external (regulatory) questions
→ Standardized Data Structure will further enhance effective pooling of data and pooled analyses
ICBI - Business Benefits
3. Effort & Time Saving
Working with BI-CDISC facilitates downstream processes:
• Semi-automated generation of define.xml for SDTM and ADS/ADaM
  – no review cycles for define.xml generated externally
• Same view on data as authorities
  – increases transparency
  – results in higher efficiency in answering questions
• A higher degree of automation, making use of metadata (CDR)
  – enables more efficient programming
  – reduces validation efforts
• Reduces effort for creation of standard ADS/ADaM
→ Standardized Data Structure will
  - establish a higher level of standardization
  - further enhance analysis with reduced timelines
Chosen Approach for BI-CDISC
• In line with the recommendations of the SDTM and Analysis Datasets Implementation Expert Team for a CDISC data standards implementation, we defined the following cornerstones for our data model:
1. Define a sponsor-specific in-house data structure (BI-CDISC) and create SDTM and ADaM/ADS in parallel from there
2. Define transformation rules from BI-CDISC to SDTM and from BI-CDISC to ADaM/ADS (but do not create ADS from SDTM)
3. The data model contains both collected and derived data
4. The data model will omit RELREC and SUPPQUAL (these will only be created upon generation of plain SDTM for submission)
5. BI-CDISC will make use of the SDTM vocabulary
  • SDTM vocabulary is defined as variable metadata and controlled terminology, not the SDTM structure
6. BI-CDISC is defined by metadata and (long-term vision) metadata shall drive the transformations from BI-CDISC to SDTM and ADaM/ADS. Traceability between SDTM and ADaM is sufficiently ensured by including the SEQ variable in the CDR and passing it on to SDTM/ADaM, and/or by metadata defining the various transformation steps
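The cornerstones above can be illustrated with a minimal Python sketch, assuming a small in-memory BI-CDISC dataset. The records, the plus variable `AESTDT` and the `TRANSFORM_METADATA` table are hypothetical and not taken from the BI data model. The sketch only shows the idea that one metadata table drives two parallel transformations and that the SEQ value is carried into both targets.

```python
# Hypothetical BI-CDISC adverse event records. AESTDT is an illustrative
# numeric "plus" variable; AESTDTC is the ISO 8601 character date.
BICDISC_AE = [
    {"STUDYID": "T001", "USUBJID": "T001-0001", "AESEQ": 1,
     "AETERM": "HEADACHE", "AESTDT": 18000, "AESTDTC": "2009-04-13"},
    {"STUDYID": "T001", "USUBJID": "T001-0001", "AESEQ": 2,
     "AETERM": "NAUSEA", "AESTDT": 18010, "AESTDTC": "2009-04-23"},
]

# Hypothetical metadata: which BI-CDISC variables each target keeps.
# AESEQ appears in both lists so records stay traceable across targets.
TRANSFORM_METADATA = {
    "SDTM": ["STUDYID", "USUBJID", "AESEQ", "AETERM", "AESTDTC"],
    "ADAM": ["STUDYID", "USUBJID", "AESEQ", "AETERM", "AESTDT"],
}


def transform(records, target):
    """Create the target dataset directly from BI-CDISC, driven by metadata."""
    keep = TRANSFORM_METADATA[target]
    return [{var: rec[var] for var in keep} for rec in records]


# Both targets are derived from BI-CDISC in parallel, not from each other.
sdtm_ae = transform(BICDISC_AE, "SDTM")
adam_ae = transform(BICDISC_AE, "ADAM")
```

Because both outputs inherit `AESEQ` from the same source record, a row in `adam_ae` can be matched back to its counterpart in `sdtm_ae` within one snapshot.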
ICBI Data Flow through System Landscape
Load from O*C and Transform in CDR (LSH)
[Diagram: O*C trial database (study setup, data load, O*C export) → CDR (LSH) trial database / substance DB. In the CDR each trial (Trial 1, Trial 2) is transformed into SDTM+, from which SDTM, ADaM and define.xml are created, using the Master Mapping Table and meta information (trial specifics entered manually or partially manually). Trials are pooled "as is" into the Pooled DB, again with SDTM+, SDTM, ADaM and define.xml. Outputs: Final Report and Submission to FDA (SDTM, ADaM, Tables, Listings, Profiles + Metadata, define.xml).]
Cornerstones of ICBI
• There will be no impact on early processes like study set-up, data entry, and user friendliness of RDC. Data cleaning and discrepancy management remain in O*C
• ICBI requires a certain upfront effort (once for each trial) for the trial-specific transformation to SDTM+ and its QC/validation
• Once data are available in the O*C database, they are loaded into LSH. Loading is triggered by a completed Batch Validation session in O*C
• After loading the data into LSH, they can be automatically transformed into the SDTM+ structure (load and transformation steps can be combined in one LSH workflow)
• ADS/ADaM will be created from SDTM+ and form the basis for reporting
• The submission data sets in plain SDTM are created by sub-setting and restructuring out of SDTM+ (can be automated)
Cornerstones of ICBI
• The define.xml can be created semi-automatically from the metadata available in LSH, thus improving quality (fewer inconsistencies) and timely delivery of final submission data sets
• To gather all meta information needed for SDTM, ADS and define.xml, a process needs to be implemented to capture the meta information throughout the process (see Module “Meta Data Collection and Master Mapping Table”)
• To enable DQRM reporting to be based on SDTM+, the data need to be available in SDTM+ structure early, close to First Patient In
• Training would be required for all functions working with the data in LSH. The O*C part of the process would not be affected (overview training recommended only)
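As a rough illustration of semi-automated define.xml generation from metadata, the following Python sketch builds a simplified XML fragment from a variable metadata table. The metadata rows and OID scheme are hypothetical. The element names (`MetaDataVersion`, `ItemGroupDef`, `ItemRef`, `ItemDef`) follow the ODM model that define.xml builds on, but the fragment omits namespaces and many attributes, so it is not a valid define.xml.

```python
import xml.etree.ElementTree as ET

# Hypothetical variable metadata, as it might be held in a metadata repository.
VARIABLE_METADATA = [
    {"domain": "DM", "name": "USUBJID", "type": "text", "length": 20},
    {"domain": "DM", "name": "AGE", "type": "integer", "length": 3},
]


def build_metadata_fragment(domain, metadata):
    """Build a simplified ODM-style fragment for one domain from metadata rows."""
    root = ET.Element("MetaDataVersion", OID="MDV.1", Name="Sketch")
    group = ET.SubElement(root, "ItemGroupDef",
                          OID=f"IG.{domain}", Name=domain, Repeating="No")
    for var in metadata:
        if var["domain"] != domain:
            continue
        oid = f"IT.{domain}.{var['name']}"
        # The dataset definition references each variable ...
        ET.SubElement(group, "ItemRef", ItemOID=oid, Mandatory="No")
        # ... and the variable itself is described once at the top level.
        ET.SubElement(root, "ItemDef", OID=oid, Name=var["name"],
                      DataType=var["type"], Length=str(var["length"]))
    return root


fragment = build_metadata_fragment("DM", VARIABLE_METADATA)
xml_text = ET.tostring(fragment, encoding="unicode")
```

Since every element is derived from the same metadata rows that describe the data, the description and the datasets cannot drift apart the way a manually written document can.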
Overall Approach
Mapping Table
Sources: OC Views
• T/PSAP
• ADS Plan
• Protocol
• aCRF
Mappings:
• O*C → BI-DM
• BI-DM → Plain SDTM
• SDTM Implementation Guide
• CDISC Controlled Terminology
• BI-DM User Requirements
• BI PDB Requirements
• BI GLIB CT (formats)
• ADaM IG
• BI ADS Guideline
• Data Quality Requirements
Overall Approach – Trials
• Design Data Model based on two trials of indication A
• Expand Data Model with two trials of indication B
• Prove Data Model (PoC)
– Create Pooled Database (PDB) of all four trials
– Re-create trial ADS from PDB
– Create submission SDTM from PDB
Overall Approach – Teams
Topic teams: Safety, Treat/Exposure, Lab/Ext. Data, Efficacy
CT & Formats
• One Rep from each Team
Keys & Relations
• One Rep from each Team
Overall Approach – Scope for Teams
Study A, Study B
O*C Views available for the studies used for mapping
• are the starting point for the mapping
• are divided up among the groups according to topics
• topics are based on logical grouping of SDTM domains:
  – Treat. - Exposure - TD
  – Efficacy
  – Safety
  – Lab - External Data
Using --SEQ…

• --SEQ should not be used for any SAS/SQL evaluation
• --SEQ is dynamically assigned and might change until a database is locked
  – If BI-CDISC datasets are created multiple times prior to lock, --SEQ will be assigned differently whenever rows/observations have been added or removed
• In different snapshots of the same trial, the value of --SEQ will not be consistently applied to common observations
• The Keys and Relations team does not consider the above points to be issues (maintaining consistency in --SEQ would be very difficult or impossible to achieve, with little or no gain)
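The instability of --SEQ across snapshots can be reproduced with a small Python sketch. The numbering rule used here (sort each subject's records by start date and term, then count from 1) is an assumption for illustration only; the slides do not state how BI assigns --SEQ.

```python
def assign_seq(records, seq_var="AESEQ"):
    """Number each subject's records 1..n after sorting (assumed rule)."""
    ordered = sorted(records,
                     key=lambda r: (r["USUBJID"], r["AESTDTC"], r["AETERM"]))
    counters = {}
    numbered = []
    for rec in ordered:
        counters[rec["USUBJID"]] = counters.get(rec["USUBJID"], 0) + 1
        numbered.append({**rec, seq_var: counters[rec["USUBJID"]]})
    return numbered


def seq_of(dataset, term, seq_var="AESEQ"):
    """Look up the sequence number assigned to a given term."""
    return next(r[seq_var] for r in dataset if r["AETERM"] == term)


# Hypothetical snapshot taken before database lock.
snapshot_1 = [
    {"USUBJID": "T001-0001", "AETERM": "HEADACHE", "AESTDTC": "2009-03-10"},
    {"USUBJID": "T001-0001", "AETERM": "NAUSEA", "AESTDTC": "2009-03-20"},
]
# Later snapshot: an earlier event has been entered in the meantime.
snapshot_2 = snapshot_1 + [
    {"USUBJID": "T001-0001", "AETERM": "DIZZINESS", "AESTDTC": "2009-03-05"},
]

seq_in_first = seq_of(assign_seq(snapshot_1), "HEADACHE")   # 1
seq_in_second = seq_of(assign_seq(snapshot_2), "HEADACHE")  # 2
```

The same observation gets a different sequence number in the two snapshots, which is why a program that selects records by a fixed --SEQ value would silently pick up the wrong row after a reload.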
I. Pooling Identifiers / Keys
Proposed Variables are:
1. SUBSTANCE
2. PROJECT
3. STUDYID
4. USUBJID/PTNO
5. VISITNUM
6. TPTNUM
7. VISDT
8. --DT
9. --ONDT
10. --ENDT
11. --CAT
12. --SCAT
13. --TESTCD
14. --METHOD
15. --SPEC
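A minimal Python sketch of how such content-based keys might be used when pooling, assuming records are plain dictionaries. The helper names and sample records are hypothetical, `USUBJID` stands in for the `USUBJID/PTNO` entry, and the `--` prefix is resolved to the two-letter domain code.

```python
POOLING_KEYS = [
    "SUBSTANCE", "PROJECT", "STUDYID", "USUBJID", "VISITNUM", "TPTNUM",
    "VISDT", "--DT", "--ONDT", "--ENDT", "--CAT", "--SCAT", "--TESTCD",
    "--METHOD", "--SPEC",
]


def key_variables(domain):
    """Resolve the '--' placeholder to the domain prefix, e.g. LBTESTCD."""
    return [v.replace("--", domain, 1) if v.startswith("--") else v
            for v in POOLING_KEYS]


def record_key(record, domain):
    """Build a content-based key; variables absent from the record are None."""
    return tuple(record.get(var) for var in key_variables(domain))


# Hypothetical lab records from two trials with the same local patient number.
lab_a = {"SUBSTANCE": "S1", "PROJECT": "P1", "STUDYID": "T001",
         "USUBJID": "0001", "VISITNUM": 2, "LBTESTCD": "GLUC",
         "LBSPEC": "SERUM"}
lab_b = dict(lab_a, STUDYID="T002")

keys_differ = record_key(lab_a, "LB") != record_key(lab_b, "LB")
```

Unlike --SEQ, a key built this way depends only on record content, so it stays the same when the dataset is regenerated.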
ICBI – Interdomain Dependencies
• Mappings are often not trivial
– BI-CDISC variables should be derived only once and from one single source
– Domains have to be created/populated in a defined order
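One way to obtain a defined population order is a topological sort over the dependencies between domains. The Python sketch below uses the standard-library `graphlib` module (Python 3.9+). The dependency table is hypothetical and only illustrates the technique; the actual BI dependencies are not given in the slides.

```python
from graphlib import TopologicalSorter

# Hypothetical dependencies: each domain maps to the domains whose derived
# variables it needs, so those must be populated first.
DEPENDS_ON = {
    "DM": set(),
    "EX": {"DM"},
    "AE": {"DM", "EX"},
    "LB": {"DM"},
}

# static_order() yields every domain after all of its prerequisites and
# raises graphlib.CycleError if the dependencies are circular.
population_order = list(TopologicalSorter(DEPENDS_ON).static_order())
```

Deriving each variable once in its source domain and letting dependent domains read it afterwards matches the "only once and from one single source" rule stated above.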
CT Consolidation – LABNM Format
• For LABNM (>1000 code/decodes) it was decided to split the values out into three variables (LBTESTCD, LBSPEC and LBMETHOD)
• In special cases additional variables are required (position, fasting status, time, …)
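The split can be pictured as a lookup from each LABNM code to the three target variables. In the Python sketch below the LABNM codes and their mappings are invented for illustration; the real format has more than 1000 entries.

```python
# Hypothetical excerpt of a LABNM mapping table (codes are invented).
LABNM_MAP = {
    101: {"LBTESTCD": "GLUC", "LBSPEC": "SERUM", "LBMETHOD": ""},
    102: {"LBTESTCD": "GLUC", "LBSPEC": "URINE", "LBMETHOD": "DIPSTICK"},
}


def split_labnm(record):
    """Add LBTESTCD, LBSPEC and LBMETHOD to a record based on its LABNM code."""
    code = record["LABNM"]
    if code not in LABNM_MAP:
        # Unmapped codes are reported rather than silently passed through.
        raise KeyError(f"LABNM code {code} has no mapping")
    return {**record, **LABNM_MAP[code]}


mapped = split_labnm({"USUBJID": "0001", "LABNM": 102, "LBORRES": "NEG"})
```

Two codes that differ only in specimen or method end up with the same `LBTESTCD`, which is what makes the test comparable across trials.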
Identified SDTM+
Topic: Numeric dates/times
  SDTM: All dates are CHAR (ISO8601)
  SDTM(+): Keep O*C dates (NUM) and ISO8601 dates in parallel
  Workload plain: Medium, because all dates have to be transformed to ISO8601 and back to NUM for analysis
  Workload plus: Low, because NUM dates are kept and used for analysis. No back-transformation necessary
  R/B*: B
Topic: Missing SDTM definitions
  SDTM: No definition available for some variables in SDTM V3.1.2
  SDTM(+): Have to be kept as plus variables: variables required as input to the current XAE or XTRTGEN macro (N.B. – closely evaluate future need of variable as input to new X-Macros)
  Workload plain: Not possible to create ADS from plain SDTM, because variables required for the XAE and/or XGENTRT macro will not be available with plain SDTM
  Workload plus: Very low effort expected, because the variable needed in the macros can be extracted as is from the available PLUS variable without complex referencing, transformations, derivations or imputations
  R/B*: R
Topic: Key concept
  SDTM: STUDYID, USUBJID, DOMAIN, --SEQ, --GRPID, --REFID, --SPID
  SDTM(+): STUDYID, USUBJID, DOMAIN; meaningful keys to be defined (based on content)
  Workload plain: Very High. Values of ID variables are not unique across subjects. Only designed for merging parent domains to SUPPQUAL, CO, RELREC. Does not support merging by content across domains (e.g. XR to XD)
  Workload plus: Medium. Needs to be defined when creating SDTM+; beneficial for analysis & reporting (no additional work)
  R/B*: R
* R – required, B – beneficial
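Keeping numeric and ISO 8601 dates in parallel can be sketched in Python as follows, assuming the numeric dates are SAS date values (days since 1 January 1960). That assumption and the variable names `LBDT`/`LBDTC` are illustrative; the slides only say that O*C dates are numeric.

```python
from datetime import date, timedelta

SAS_EPOCH = date(1960, 1, 1)  # SAS date values count days from this date


def sas_date_to_iso(days):
    """Convert a SAS date value to an ISO 8601 calendar date string."""
    return (SAS_EPOCH + timedelta(days=days)).isoformat()


def add_iso_date(record, num_var, iso_var):
    """Return a copy of the record with the ISO date added next to the NUM date."""
    out = dict(record)
    out[iso_var] = sas_date_to_iso(record[num_var])
    return out


# The numeric date stays available for analysis; the ISO date serves SDTM.
lab = add_iso_date({"USUBJID": "0001", "LBDT": 18000}, "LBDT", "LBDTC")
```

With both variables present, analysis programs use `LBDT` directly and the conversion runs once, which is the "Low" workload case described in the table.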
Identified SDTM+
Topic: NUM - CHAR
  SDTM: Variables are of type CHAR in general. Example: USUBJID, --ORRES
  SDTM(+): Keep both CHAR and NUM-type variables. Example: USUBJID and "PTNO", --ORRES and "--ORRESN"
  Workload plain: Medium. Numeric O*C values are converted to CHAR, then need to be converted back to NUM for analysis & reporting
  Workload plus: Low. Convert once to CHAR for SDTM. Keep numeric values from O*C as a plus for analysis & reporting (no re-conversion)
  R/B*: B
Topic: Code Decode
  SDTM: Only Decode (CHAR). Example: XRCAT, EPOCH
  SDTM(+): Have Code (NUM), associated SAS format and Decode (CHAR)
  Workload plain: Medium. Without formats it is not possible to reproduce all the options offered in the CRF
  Workload plus: Very low
  R/B*: R
Topic: No SUPPQUAL
  SDTM: SUPPQUAL Domain
  SDTM(+): No SUPPQUAL Domain, variables included in parent domain. Additional meta data required to identify qualifier information destined to SUPPQUAL. Additional variable that contains the qualifier information that is destined to SUPPQUAL
  Workload plain: High. Merging needed because information that clinically belongs together is scattered (search and merge)
  Workload plus: Medium. Information that clinically belongs together is located in one Domain. One time effort to create plain SDTM (selecting and splitting)
  R/B*: B
* R – required, B – beneficial
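The "selecting and splitting" step for plain SDTM can be sketched in Python: variables flagged by metadata as qualifiers are removed from the parent domain and written as supplemental qualifier records. The output columns follow the standard SUPPQUAL layout (RDOMAIN, IDVAR, IDVARVAL, QNAM, QLABEL, QVAL); the plus variable `AEDCM` and the sample record are hypothetical.

```python
def split_suppqual(records, domain, seq_var, qualifier_vars):
    """Split SDTM+ records into plain parent records and SUPPQUAL records.

    qualifier_vars maps each plus variable destined for SUPPQUAL to its label.
    """
    parent, supp = [], []
    for rec in records:
        parent.append({k: v for k, v in rec.items() if k not in qualifier_vars})
        for qnam, qlabel in qualifier_vars.items():
            value = rec.get(qnam)
            if value in (None, ""):
                continue  # no supplemental record for missing values
            supp.append({
                "STUDYID": rec["STUDYID"], "RDOMAIN": domain,
                "USUBJID": rec["USUBJID"], "IDVAR": seq_var,
                "IDVARVAL": str(rec[seq_var]), "QNAM": qnam,
                "QLABEL": qlabel, "QVAL": str(value),
            })
    return parent, supp


# Hypothetical SDTM+ record with one plus variable kept in the parent domain.
sdtm_plus_ae = [{"STUDYID": "T001", "USUBJID": "T001-0001", "AESEQ": 1,
                 "AETERM": "HEADACHE", "AEDCM": "AE_PAGE"}]
plain_ae, supp_ae = split_suppqual(
    sdtm_plus_ae, "AE", "AESEQ", {"AEDCM": "DCM Name of Origin"})
```

The split runs only when plain SDTM is produced for submission; day-to-day work stays on the single parent domain.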
Identified SDTM+
Topic: Date/time imputation
  SDTM: Reported date/time (ISO8601)
  SDTM(+): Have reported date/time, imputed date/time and imputation rule in parallel
  Workload plain: High. In case of incomplete dates, imputation needs to be done by hand (error prone process)
  Workload plus: Low, if the imputation rule is implemented in O*C views. Otherwise it needs to be defined once for creation of SDTM+
  R/B*: B
Topic: Relationship to CRF/DCM
  SDTM: Not included
  SDTM(+): Keep the DCM name where the variable originated from
  Workload plain: Medium. Connection between SDTM data and CRF is not readily available
  Workload plus: Low. Primarily to ease programming and help with debugging
  R/B*: B
Topic: Tracking of same patient in multiple trials (e.g. extension trial information)
  SDTM: Previous Trial Number and Previous Patient Number could possibly be stored in the Subject Characteristic domain (SC). This needs to be investigated
  SDTM(+): Previous Trial Number, Previous Patient Number
  Workload plain: Low. Previous Trial Number and Previous Patient Number should be scattered into the Subject Characteristic domain (SC)
  Workload plus: Very low. The collected variables need to be copied from O*C into SDTM+ (DM domain?). These two variables are collected at the site and need to be available in SDTM+ for CTR reporting and to facilitate reporting from the P/SDB
  R/B*: R
* R – required, B – beneficial
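Holding the reported date, the imputed date and the imputation rule side by side might look like the Python sketch below. The rule shown (first day of the month or year for partial start dates) and the rule names are assumptions for illustration; the slides do not specify BI's imputation rules.

```python
def impute_start_date(reported):
    """Return (imputed ISO date, rule name) for a possibly partial ISO date."""
    parts = reported.split("-") if reported else []
    if len(parts) == 3:
        return reported, "NONE"  # complete date, nothing to impute
    if len(parts) == 2:
        return f"{reported}-01", "FIRST_DAY_OF_MONTH"
    if len(parts) == 1:
        return f"{reported}-01-01", "FIRST_DAY_OF_YEAR"
    return None, "NOT_IMPUTABLE"  # completely missing date


def with_imputation(record, date_var):
    """Keep the reported value and add imputed value and rule in parallel."""
    imputed, rule = impute_start_date(record.get(date_var, ""))
    return {**record, f"{date_var}_IMP": imputed, f"{date_var}_RULE": rule}


ae = with_imputation({"USUBJID": "0001", "AESTDTC": "2009-03"}, "AESTDTC")
```

Because the rule is stored with each value, a reviewer can see which dates were imputed and how, without re-running the derivation.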
Contacts
IBM Global Business Services
Dr. Jens Wientges
Mailto: [email protected]
Mobile: +49 160 5826897
Peter Leister
Mailto: [email protected]
Mobile: +49 160 3671761