Primary and Secondary use - EuroRec: European Institute

Primary and Secondary use
of EHR systems
Meeting The Technical Security Needs
Filip De Meyer
12-10-2007
Content
 Custodix: Company Introduction
 Concepts & Terminology
 From Concept to Technical Solutions
 Example: The Custodix Anonymisation Tool (“CAT”) (screen shots)
2
About Custodix
 In a few words…
– Established in 2000 as a spin-off company of the University of Ghent, Belgium
– Providing Privacy Protection services, mainly in HealthCare
  • Trusted Third Party Services
  • Customized Privacy Enhanced Data Collection Solutions
  • Secure storage
  • Privacy Consultancy
  • …
 “One stop shop” for privacy/data protection
 Involved in European Research since the start
 Operating in Europe, Australia and Asia
3
Commercial & Research Activities
Commercial
Research Programs
4
Scope of Activities
Countries involved (sources of data) in
Custodix protected data flows.
5
Background/History of Activities
Ethics
Codification
“What is right?”
Data Protection legislation examples:
 Europe:
– European Directive 95/46/EC (accepted as one of the world’s highest privacy standards)
– Member state implementation
 Other:
– Health Insurance Portability and Accountability Act (H.I.P.A.A.)
– Ontario Freedom of Information and Protection of Privacy Act in Canada
– …
Legislation
Regulation
“Requirement”
“Choice”
Business Risk Management
Business Ethics
Business Advantage
Technology
eSecurity & ePrivacy
6
Custodix Services
Data Protection Services
Privacy Enhanced Storage Framework
HealthCare ID Management
Data Protection Platform
Encrypted Storage
Pseudonymisation
Patient Information Location Services (PILS)
Information Flow Management
Patient Consent Management
Advanced Access Control
PKI (Public Key Infrastructure)
Spin-off Security Services
Master Patient Index
TSA (Time Stamping Authority)
IM (Identity Management)
Security Services (e.g. eProcurement)
Digital Signature webtools
Timestamping
AuthN & AuthZ (Authentication & Authorization)
7
EHR Sources → Research Use
Various EHR Sources (care/diagnostic purposes)
Personal Health Records (e.g. personal diaries)
+ Other Sources
Research Data Repositories
Trusted Third Party
• link
• protect privacy
Additionally Collected Data (for research purposes)
8
Reduction of Identifying Information
Risk Analysis
personal data → de-identified data
Reduce Identifying Information Content:
– delete identifier
– transform date
– produce nym
– delete data items
– encrypt data items
– …
9
Starting Point: Definition of Personal Data
“'personal data' shall mean any information relating to
an identified or identifiable natural person ('data
subject'); an identifiable person is one who can be
identified, directly or indirectly, in particular by
reference to an identification number or to one or
more factors specific to his physical, physiological,
mental, economic, cultural or social identity.”
(Directive 95/46/EC, the “DPD”)
10
Concept of Identification
[Diagram: a set of data subjects linked to a set of characteristics (a, b, c, d, e, f, g, h)]
 A data subject is identified (within a set of data subjects) if it
can be singled out among other data subjects.
 Some associations between characteristics and data subjects
are more persistent in time (e.g. a national security number,
date of birth) than others (e.g. an e-mail address).
11
The Concept of Anonymisation
[Diagram: a data subject linked to a set of characteristics (a, b, c, d, e, f, g, h)]
Anonymisation is the process that removes the association between the
identifying data set and the data subject. This can be done in two
different ways:
– by removing or transforming characteristics in the associated
characteristics data set so that the association is no longer unique
and relates to more than one data subject;
– by increasing the population in the set of data subjects so that the
association between the data set and the data subject is no longer
unique.
12
Terminology: Pseudonymisation
[Diagram: data subject (“?”), pseudonym, and set of characteristics (a, b, c, d, e, f, g, h)]
Pseudonymisation is a particular type of anonymisation that, after removal
of the association with a data subject, adds an association between a
particular set of characteristics relating to a data subject and one or more
pseudonyms. The pseudonym may be unique in a domain.
In irreversible pseudonymisation, the conceptual model does not contain a
method to derive the association between the data subject and the set of
characteristics from the pseudonym.
Note that “pseudonymisation” and “anonymisation” terminology is not
universal.
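As a minimal sketch of the irreversible case, a pseudonym can be derived with a keyed hash. This is an assumption chosen for illustration, not the method prescribed by the presentation; the function name, the key and the domain labels are hypothetical.

```python
import hashlib
import hmac

def make_pseudonym(identifier: str, secret_key: bytes, domain: str) -> str:
    """Derive a pseudonym from an identifier with HMAC-SHA256.

    Assumption: no lookup table is kept, so the model offers no method to
    go from the pseudonym back to the data subject.
    """
    message = f"{domain}:{identifier}".encode("utf-8")
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()

key = b"demo-key-not-for-production"  # hypothetical key
p1 = make_pseudonym("patient-123", key, "study-A")
p2 = make_pseudonym("patient-123", key, "study-A")
p3 = make_pseudonym("patient-123", key, "study-B")

print(p1 == p2)  # same identifier, same domain: same pseudonym
print(p1 == p3)  # different domain: different pseudonym
```

Including the domain in the hashed message makes the pseudonym specific to that domain. Note that anyone holding the key could still test candidate identifiers, so key custody remains part of the protection.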
13
The Conceptual vs. Real Life Model
“To determine whether a person is identifiable, account should be
taken of all the means likely reasonably to be used either by the
controller or by any other person to identify the said person;
whereas the principles of protection shall not apply to data
rendered anonymous in such a way that the data subject is no
longer identifiable; whereas codes of conduct within the meaning
of Article 27 may be a useful instrument for providing guidance
as to the ways in which data may be rendered anonymous and
retained in a form in which identification of the data subject is no
longer possible”. (Recital 26 of the DPD)
 Refine the concept of identifiability/anonymity.
 Take into account the “means likely reasonably to be used” and “any other
person” through re-identification risk analysis.
14
Privacy Risk Analysis
15
Levels of De-identification
(ISO/IEC DTS25237)
 Level 1: removal of clearly identifying data
(“rules of thumb”)
 Level 2: static, model based re-identification
risk analysis
 Level 3: continuous re-identification risk
analysis of live databases
 Targets for de-identification can be set and liabilities
better defined in risk analysis and policies.
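A static risk analysis with a target can be sketched as follows. The risk model used here (risk of a record = 1 / number of records sharing the same indirectly identifying values) is a deliberately simple assumption, not the model defined in the specification; the field names and the `TARGET` value are hypothetical.

```python
from collections import Counter

def reidentification_risks(records, quasi_identifiers):
    """Estimate per-record risk as 1 / size of the group with identical values."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    sizes = Counter(keys)
    return [1 / sizes[k] for k in keys]

records = [
    {"birth_year": 1970, "sex": "F"},
    {"birth_year": 1970, "sex": "F"},
    {"birth_year": 1970, "sex": "M"},
    {"birth_year": 1982, "sex": "M"},
]

risks = reidentification_risks(records, ["birth_year", "sex"])
TARGET = 0.5  # hypothetical policy target for the maximum acceptable risk
meets_target = max(risks) <= TARGET

print(risks, meets_target)
```

Running the same check repeatedly against a live database, rather than once against a model, would correspond to the idea behind Level 3.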
16
ISO TC215 / WG 4
ISO/IEC DTS25237 (Approved T.S.)
 Health Informatics: Pseudonymisation
 Result of work in ISO/ TC 215/ WG4
 Based on conceptual model as explained in this
presentation
 Lists a number of Healthcare scenarios
– clinical trials
– clinical research
– public health monitoring
– patient safety reporting (adverse drug events)
 Current status: Approved Technical Specification
17
Common Healthcare Requirements
 Disease Management, Clinical Trials, … requirements
– Dynamic data collection of individual line data…
  • Longitudinal studies
  • Processing data of individual patients
– Protection of data subjects towards data collector
  • Data must be stored in protected form
  • Different from disclosure control
 Requires
– De-identified individual line data
  • Pseudonymisation / anonymisation
  • no protection through aggregation, data swapping, …
– A priori estimation of privacy risks and required data protection measures
  • Privacy risk based on statistical models, cf. re-identification theory
– Protection of the “context” in which data is considered anonymous
18
Pseudonymisation
 Goal:
– Protection of identity and privacy of individuals or
organizations
– Allowing linkage of data associated with pseudo-IDs
irrespective of the collection time (cf. longitudinal studies) and
collection place (cf. multi-center studies)
 Simplified:
– Translating a given identifier into a pseudo-identifier by
using secure, dynamic and (preferably ir-)reversible
cryptographic techniques
 Tricky part:
– Making sure that data is truly de-identified
(within a predefined context)
– Removing “indirectly identifying” content
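The linkage goal can be sketched in Python: records about the same patient, submitted by different centres in different years, end up under the same pseudo-ID while the original identifier is dropped. The key, the field names and the sample rows are hypothetical, and a keyed hash is only one possible choice of cryptographic technique.

```python
import hashlib
import hmac
from collections import defaultdict

TTP_KEY = b"hypothetical-ttp-key"  # assumption: known only to the trusted third party

def pseudo_id(patient_id: str) -> str:
    """Translate an identifier into a pseudo-identifier (keyed hash, truncated)."""
    return hmac.new(TTP_KEY, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()[:16]

submissions = [
    {"centre": "hospital-1", "year": 2005, "patient_id": "BE-001", "finding": "baseline"},
    {"centre": "hospital-2", "year": 2007, "patient_id": "BE-001", "finding": "follow-up"},
    {"centre": "hospital-1", "year": 2006, "patient_id": "BE-002", "finding": "baseline"},
]

def collect(rows):
    """Drop the identifier and group the remaining data per pseudo-ID."""
    repository = defaultdict(list)
    for row in rows:
        protected = {k: v for k, v in row.items() if k != "patient_id"}
        repository[pseudo_id(row["patient_id"])].append(protected)
    return dict(repository)

repository = collect(submissions)
print(len(repository))  # number of distinct pseudo-IDs
```

The remaining fields (here `centre` and `year`) are exactly the kind of content that may still be indirectly identifying, which is the tricky part named above.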
19
Batch Data Collection
Sources
Trusted Third Party
Data Collection Site
• Build custom solutions using standard components
• Integrate security & privacy components into existing and new
projects
20
Interactive Pseudonymisation
 The “interactive pseudonymisation system”
 Reconciling the concept of a “central anonymous database”
with “nominative access”
Doctor (Dealing with nominative patient information)
On-the-fly Privacy Protection Gateway
– Pseudonymisation
– Encryption
Pseudonymisation server at TTP
Nominative Data Realm
Doctor (Dealing with nominative patient information)
Pseudonymous Data Realm
Register
Pseudonymous Database (at data warehouse)
Collective Records Access
Research Community
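How a gateway might reconcile nominative access with a pseudonymous database can be sketched as below. All class and method names are hypothetical, the encryption step is left out for brevity, and the keyed hash merely stands in for whatever the pseudonymisation server at the TTP actually does.

```python
import hashlib
import hmac

class PseudonymisationServer:
    """Stand-in for the pseudonymisation server at the TTP (hypothetical interface)."""

    def __init__(self, key: bytes):
        self._key = key

    def pseudonym_for(self, patient_id: str) -> str:
        return hmac.new(self._key, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()

class PrivacyProtectionGateway:
    """Translates nominative requests into pseudonymous ones on the fly."""

    def __init__(self, server: PseudonymisationServer, database: dict):
        self._server = server
        self._database = database  # pseudonym -> list of records

    def store(self, patient_id: str, record: dict) -> None:
        nym = self._server.pseudonym_for(patient_id)
        self._database.setdefault(nym, []).append(record)

    def fetch(self, patient_id: str) -> list:
        nym = self._server.pseudonym_for(patient_id)
        return list(self._database.get(nym, []))

database = {}  # the pseudonymous database never sees the patient identifier
gateway = PrivacyProtectionGateway(PseudonymisationServer(b"hypothetical-key"), database)
gateway.store("BE-001", {"diagnosis": "C50"})
print(gateway.fetch("BE-001"))
```

The doctor keeps working with the nominative identifier, while the research community would only ever query `database`, which is keyed by pseudonym.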
21
Web Enabled Implementation of
Privacy Enhanced Storage Framework
Sources
Browser API
Data Collection Site
PESF Service available as FLASH or Java/JavaScript toolkit
Data Protection Service (acting as reverse proxy)
 Non-intrusive to the application (transparent)
 Key Management Service
 Secured Search
 Service provides authentication and user management to the application
22
Case: Combined Trust Services
Custodix PKI
Providing
- Authentication
- Addressing
- Directory Services
- Account Management
Direct Messaging
SIX (Secure Information eXchange)
Custodix Policy Controlled Environment
Custodix Module
- encryption
- anonymisation
- communication
EHR
Export and access according to a strict policy
Custodix Data Repository
 Secure Communication
 Anonymous Data Collection
 Secured Repository
State-of-the-art Implementation based on innovative security technology
23
Clinical Trial
Hospital
Datasources
Patient
ACGT Infrastructure
Physician
Developing a Biomedical GRID infrastructure for sharing Clinical and
Genomic expertise
Core Activities
 Integration … of clinical history, medical imaging and genetic data.
 Knowledge Grid … distributed mining for knowledge extraction.
 Clinical Trials … breast cancer & pediatric nephroblastoma
24
Pseudonymisation Tool
25
Center for Data Protection
 Act as "data controller" or assist "data controllers" in the sense of the
European Directive 95/46/EC on the protection of individuals with regard to the
processing of personal data and on the free movement of such data;
 Be a think-tank for everyone professionally involved or interested in practical
data protection;
 Promote the application of novel technology in the context of data protection
(ePrivacy, eSecurity), and act as a dissemination point for practical solutions;
 Get involved with the development and promotion of standards and
certification related to privacy protection;
 Provide assistance in dealing with complex data protection issues on an
international level by offering access to a multidisciplinary pool of expertise.
26
 Generate privacy protection profiles that
can be run on heterogeneous data.
 Create (profile) once, run many times....
27
CAT: Overview
28
CAT: Variable Mappings Editor, XML
 Variable mappings (DICOM, XML, CSV, custom)
 Define a privacy type per variable
– Identifier
– Free text
– Undefined
– ...
29
CAT: Transformation Editor
 Operands
– named variable (e.g. patientID)
– privacy type
 Flexible and detailed configuration
– simple nym transformation
– secure vaults (single or multiple argument)
– random
– replace with value
– clear
– make date relative
– ...
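The idea of a reusable profile, with transformations bound either to a named variable or to a privacy type, can be sketched in Python. This is not CAT's actual configuration format or API: the dictionaries, function names, key and sample record are all hypothetical, and only three of the listed transformations are shown.

```python
import hashlib
import hmac
from datetime import date

# Hypothetical variable mappings: variable name -> privacy type.
VARIABLE_TYPES = {"patientID": "identifier", "notes": "free_text",
                  "visit": "date", "site": "undefined"}

def nym(value, ctx):
    """Simple nym transformation (assumption: truncated keyed hash)."""
    return hmac.new(ctx["key"], str(value).encode("utf-8"), hashlib.sha256).hexdigest()[:12]

def clear(value, ctx):
    """Clear the value."""
    return ""

def make_date_relative(value, ctx):
    """Replace a date by its offset in days from a reference date."""
    return (value - ctx["reference_date"]).days

# Hypothetical profile: operand (named variable or privacy type) -> transformation.
TRANSFORMATIONS = {
    "identifier": nym,            # operand is a privacy type
    "free_text": clear,           # operand is a privacy type
    "visit": make_date_relative,  # operand is a named variable
}

def apply_profile(record, ctx):
    """Apply the profile; a rule on a named variable wins over a rule on its type."""
    out = {}
    for name, value in record.items():
        transform = (TRANSFORMATIONS.get(name)
                     or TRANSFORMATIONS.get(VARIABLE_TYPES.get(name)))
        out[name] = transform(value, ctx) if transform else value
    return out

ctx = {"key": b"hypothetical-key", "reference_date": date(2007, 1, 1)}
record = {"patientID": "BE-001", "notes": "seen by Dr. X",
          "visit": date(2007, 1, 15), "site": "Ghent"}
result = apply_profile(record, ctx)
print(result)
```

Because the profile is plain data plus functions, the same profile can be applied to any number of records, in the spirit of "create once, run many times".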
30
CAT: Transformation Editor, XML
31
CAT XML Example: Result
before
after
 “firstname” replaced by calculated nym
 “last name” cleared
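The screenshots are not reproduced in this transcript, so the following Python sketch only imitates the described result on a made-up document. The element names (`firstname`, `lastname`, `age`), the key and the nym calculation are assumptions; the tool's own XML and algorithm may differ.

```python
import hashlib
import hmac
import xml.etree.ElementTree as ET

KEY = b"hypothetical-key"  # illustrative key only

# Made-up input document standing in for the "before" screenshot.
before = ("<patient><firstname>Jan</firstname>"
          "<lastname>Peeters</lastname><age>54</age></patient>")

def transform(xml_text: str) -> str:
    """Replace <firstname> by a calculated nym and clear <lastname>."""
    root = ET.fromstring(xml_text)
    for element in root.iter("firstname"):
        original = (element.text or "").encode("utf-8")
        element.text = hmac.new(KEY, original, hashlib.sha256).hexdigest()[:12]
    for element in root.iter("lastname"):
        element.text = None  # element is kept, its content is cleared
    return ET.tostring(root, encoding="unicode")

after = transform(before)
print(after)
```

Elements without a rule (here `age`) pass through unchanged, which is why a privacy type has to be assigned to every variable before such a transformation is run.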
32
CAT: Key Handling
 generate keys
 store keys
 import/export
 ...
33
CAT, DICOM Example
34
CAT: Variable Mappings Editor, DICOM
35
CAT: Transformation Editor, DICOM
36
CAT: DICOM Examples
examples
replaced by nym
original
cleared
37
Thank you for your attention!
Any Questions?
Custodix NV
Verlorenbroodstr. 120
B-9820 Merelbeke
Belgium
http://www.custodix.com/
or
[email protected]
38