Privacy Enhancing Technologies



ICT in Health Research

Challenges and Opportunities for Privacy Protection

From Obstruction to Construction


Speakers

Filip De Meyer, Department of Medical Informatics & Statistics, University Hospital Ghent, Belgium

Frank Robben, General manager, Crossroads Bank for Social Security & eHealth platform, Brussels, Belgium

“The Modern World is a Data Driven World”
risks & challenges
benefits & opportunities

Setting a knowledge claim means that researchers start a project with certain assumptions about how they will learn and what they will learn during their inquiry.

These claims might be called paradigms (Lincoln and Guba, 2000; Mertens, 1998).

Research hypothesis generation

basic research ...

observational epidemiological studies
A priori defined associations: a fraction of possible relations
paradigm shift: deductive → inductive
Data trawling in search of associations with statistical significance
(Lancet 1996; 348:1152-53)

Changing research models

• data trawling/fishing
• genome wide association studies (bio-identity !)
• data mining of association studies (basic, family history, genetics, epigenetics, transcriptomics,...)
• translational medicine (“bench to bedside”)
• personalised medicine (bidirectional)
• integration of EHR & clinical research
• world wide service provision (e.g. genetic testing)
• preservation of samples (regeneration of bio-identity)
• PHR & patient empowerment

Informational Privacy awareness

“People don’t react to reality; they react to their perceptions of reality”

Different perceptions

regulatory authorities

“Enforce Protection of personal privacy”

health research

“Perform research”

data privacy protection services

“Provide protective solutions that are effective”

Specificity of a privacy protection context

European Level (DPD)
other Regulations (e.g. CGP)
national legislation
local ethics committees

Specific privacy context
Importance of Privacy Policy !

data subject

Data categories

• anonymous data
  – data that cannot be related to an identified or identifiable person by anyone
  – are not personal data => privacy protection regulation does not apply

Data categories

• coded data
  – data that cannot be related to an identified or identifiable person by the controller of the data processing, but that can be related to an identified or identifiable person by someone else (e.g. an intermediary organization)
  – are personal data => privacy protection law applies

Data categories

• non-coded personal data
  – data that can be related to an identified or identifiable person by the controller of the data processing
  – are personal data => privacy protection law applies
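The three categories can be read as the answers to two questions: can the controller relate the data to a person, and can anyone else? The sketch below encodes that reading. The names `DataCategory`, `classify` and `privacy_law_applies` are illustrative assumptions and do not come from any regulation or from the talk.

```python
from enum import Enum


class DataCategory(Enum):
    ANONYMOUS = "anonymous data"
    CODED = "coded data"
    NON_CODED = "non-coded personal data"


def classify(controller_can_identify: bool, someone_else_can_identify: bool) -> DataCategory:
    """Map the two identifiability questions to one of the three categories."""
    if controller_can_identify:
        # The controller itself can relate the data to a person.
        return DataCategory.NON_CODED
    if someone_else_can_identify:
        # Only another party (e.g. an intermediary organization) can do so.
        return DataCategory.CODED
    # Nobody can relate the data to a person.
    return DataCategory.ANONYMOUS


def privacy_law_applies(category: DataCategory) -> bool:
    """Coded and non-coded data are personal data; anonymous data are not."""
    return category is not DataCategory.ANONYMOUS
```

In practice the two boolean inputs are the hard part: they are the outcome of an identifiability evaluation, not something a program can decide on its own.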

Evaluation of identifiability

• an identifiable person is one who can be identified, directly or indirectly, in particular by reference to an identification number or to one or more factors specific to his physical, physiological, mental, economic, cultural or social identity
• to determine whether a person is identifiable, account should be taken of all the means likely reasonably to be used either by the controller of the data processing or by any other person to identify the said person

Basic principles of privacy protection law

• fair and lawful processing
• purpose limitation
  – personal data have to be collected for specified, explicit and legitimate purposes
  – and must not be further processed in a way incompatible with those purposes
• proportionality
  – personal data have to be adequate, relevant and not excessive in relation to the purposes for which they are collected or further processed
  – avoid identification longer than necessary

Basic principles of privacy protection law

• transparency for the data subject, among others
  – about the purposes of data processing
  – about the identity of the controller of the data processing
• obligations of the controller of the data processing, among others
  – informing the data subject
  – keeping processed data accurate and up to date
  – guaranteeing sufficient information security
• rights of the data subject

Privacy principles are challenged

• large amounts of data (proportionality)
• undefined research hypotheses (purpose limitation)
• genetic data: neither identifiability nor information content fully defined
• bio-identities challenge de-identification schemes
• distributed data repositories (who is the controller ?)
• cloud computing & international service provision
• etc.

A common misunderstanding

Research usage of data collected from a patient for diagnostic or treatment purposes => secondary use of data

Research usage of data collected from a patient enrolled in a study (within the defined study) => primary use of data

Privacy protection = risk management

• balance between research benefits and privacy risks
• privacy legislation is a reference
• determine the privacy risks (research models)
• effective data privacy protection framework
  – organisational and physical protection
  – protect against unauthorised access to the data
  – apply privacy enhancing technologies
• complementary restrictions on the use of data (e.g. non-discrimination legislation on data use)
• define and come to terms over residual risk

Good practice

• if possible, secondary use of data for research purposes should be conducted on anonymous data
• if research is not possible based on anonymous data, secondary use of data for research purposes should be conducted on coded data, with appropriate guarantees
• only if research is not possible based on anonymous or coded data, secondary use of data for research purposes can be conducted on non-coded personal data, with appropriate guarantees
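The cascade above is a simple preference order: use the least identifying category that still makes the research possible. A minimal sketch follows, assuming the caller already knows which categories are feasible for the study. The function name and the string labels are illustrative only.

```python
from typing import Iterable

# Least identifying first, following the good-practice cascade.
PREFERENCE_ORDER = ["anonymous", "coded", "non-coded"]


def least_identifying_category(feasible: Iterable[str]) -> str:
    """Return the least identifying data category the research can work with."""
    feasible_set = set(feasible)
    for category in PREFERENCE_ORDER:
        if category in feasible_set:
            return category
    raise ValueError("no feasible data category given")
```

For a study that needs record linkage over time, for instance, the feasible set might be `{"coded", "non-coded"}`, and the function returns `"coded"`. The "appropriate guarantees" attached to coded and non-coded data are outside the scope of this sketch.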

Example: Belgian regulation

• secondary use of coded data for research purposes
  – notification to the Privacy Commission prior to further processing for research purposes
    • specific motivation of the need for coded data
    • complementary information in case of need for processing of coded sensitive or health data
  – coding prior to further processing for research purposes
    • by the controller of the original data or an intermediary organization when data originate from one controller
    • by an intermediary organization when the data originate from several controllers
    • the intermediary organization needs to be independent from the controller of the further processing for research purposes

Example: Belgian regulation

• secondary use of coded data for research purposes
  – coded data may only be disclosed to the controller of the further processing for research purposes after receipt of the proof of the notification to the Privacy Commission
  – information duty of the controller of the original data or the intermediary organization towards the data subjects, unless
    • it is impossible to inform the data subjects
    • information duty involves a disproportionate effort
    • data are coded by an intermediary organization being an administrative authority having the explicit legal task to act as an intermediary organization (e.g. the eHealth platform)

Example: Belgian regulation

• secondary use of non-coded personal data for research purposes
  – notification to the Privacy Commission prior to further processing for research purposes
    • specific motivation of the need for non-coded personal data
  – explicit informed consent of the data subjects prior to further processing for research purposes, unless
    • data are public
    • information duty involves a disproportionate effort (notification duty to the Privacy Commission in case of sensitive or health data)

Example: Belgian regulation

• secondary use of non-coded personal data for research purposes
  – non-coded personal data may only be disclosed to the controller of the further processing for research purposes after receipt of the proof of the notification to the Privacy Commission

Example: Belgian regulation

• authorization of exchange of health data
  – every exchange of non-coded personal data has to be authorized either by the data subject, by the law, or by a specialized sectoral committee of the Privacy Commission
  – every coding of data by the eHealth platform has to be authorized by the sectoral committee, indicating whether the encoding should be reversible or irreversible
  – every anonymizing of data by the eHealth platform has to be authorized by the sectoral committee

Example: Belgian Sectoral Committee

• established within the Privacy Commission
• consists of
  – 2 members of the Privacy Commission
  – 4 medical doctors appointed by Parliament
• tasks
  – to provide authorizations for (electronic) exchange of personal health data, in situations not regulated by law
  – to determine information security policies with regard to the processing of personal health data
  – to give advice and recommendations with regard to information security related to the processing of personal health data
  – to handle complaints with regard to the violation of information security policies during the processing of personal health data

Example: Belgian implementation

• creation of the eHealth platform, having as a mission
  – to optimize healthcare quality and continuity
  – to optimize safety
  – to simplify administrative formalities for all healthcare actors
  – to reliably support healthcare policy and research
  through
  – a well-organised, mutual electronic service and information exchange between all healthcare actors
  – with the necessary guarantees in the area of information security, privacy protection and professional secrecy

Belgian eHealth platform: board of directors

• 7 representatives of the health care providers and institutions
• 7 representatives of the sickness funds and patient organizations
• 7 representatives of the public services with competences in health care
• representatives of the Ministers of Health, Social Affairs, Computerization and Budget
• representatives of the Order of Physicians and the Order of Pharmacists with advisory vote

Belgian eHealth platform: basic architecture

[Figure: architecture diagram. Layer labels: Users; Network; Basic services eHealth platform; Suppliers. Users: patients, healthcare providers and institutions. Access points, each shown with VAS: Health Portal; RIZIV-INAMI site; eHealth platform Portal; MyCareNet; healthcare institution software; care provider software. Suppliers: ADS.]

Belgian eHealth platform: basic services

• coordination of electronic processes
• web portal (https://www.ehealth.fgov.be)
• integrated user and access management
• logging management
• system for end-to-end encryption
• personal electronic mailbox for each healthcare supplier
• electronic time stamping
• coding and anonymizing
• reference directory

Belgian eHealth platform: coding and anonymizing


Privacy by design

• start from privacy risk analysis (privacy impact analysis)
• attack models (observational data) / residual risk definition
• obtain and document authorisations
• involve research project key actors
• verify ethical/privacy constraints for secondary use
• record privacy related metadata for data assets
• use Privacy Enhancing Technologies (PET)
• protect research data from re-identification
• => aim: automated enforcing of privacy policy rules

“Information security is a journey, not a destination”

[Figure: data security diagram. Labels: vulnerabilities; threats; impacts; PET; policy enforcing; access control; physical protection.]

Breach and Incident Reporting ?

Studies conducted on behalf of the European Network and Information Security Agency (ENISA) recommend that the EU should introduce a comprehensive security-breach notification law.

Complementary building blocks

• “traditional” data security
  – encryption, authentication, authorisation, audit trails, signatures,...
  – physical protection of assets
• privacy/security policies and procedures
  – IRBs in research organisations
  – enforcing/training/awareness
• Privacy Enhancing Technology

“Traditional” data security

• control access to systems and data assets, based upon authorisation for roles attributed to individuals
• trustworthy sources to support security decisions (identities, roles, authorisations)
• awareness/enforcement of security policies
• integrate into protected research environments (circles of trust)
• increased interoperability (standards !)
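Role-based access control with an audit trail can be sketched in a few lines. The role names, permission names and users below are invented for illustration. In a real deployment the role and authorisation data would come from the trustworthy sources mentioned above, not from in-memory dictionaries.

```python
from datetime import datetime, timezone

# Hypothetical roles and permissions, for illustration only.
ROLE_PERMISSIONS = {
    "treating_physician": {"read_identified_record"},
    "researcher": {"read_coded_record"},
    "data_manager": {"read_coded_record", "export_coded_dataset"},
}

# Roles attributed to individuals (hypothetical users).
USER_ROLES = {
    "alice": {"researcher"},
    "bob": {"treating_physician"},
}

AUDIT_TRAIL = []


def is_authorised(user: str, permission: str) -> bool:
    """A user is authorised if any of their roles carries the permission."""
    roles = USER_ROLES.get(user, set())
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)


def request_access(user: str, permission: str) -> bool:
    """Take the access decision and record it, granted or not, in the audit trail."""
    granted = is_authorised(user, permission)
    AUDIT_TRAIL.append({
        "when": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "permission": permission,
        "granted": granted,
    })
    return granted
```

Here `request_access("alice", "read_coded_record")` is granted, while the same user asking for `"read_identified_record"` is refused. Both decisions end up in `AUDIT_TRAIL`.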

Privacy Enhancing Technology

• complementary to access control of data assets
• based on identity management and de-identification
• various identity domains/realms
• set of privacy enhancing functions and methods
• combination of third party service provision, software agents and tools
• privacy violation detection
• requires trusted service provision => “TTP”
• linkage functionality otherwise not allowed !
• use of cryptographic techniques

Pfitzmann-Hansen terminology

• anonymity
• unlinkability
• unobservability
• pseudonymity
• etc.

http://dud.inf.tu-dresden.de/Anon_Terminology.shtml

The role of PETs ?

PETs can help to design information and communication systems and services in a way that minimises the collection and use of personal data and facilitate compliance with data protection rules. The use of PETs should result in making breaches of certain data protection rules more difficult and/or helping to detect them.

Memo/07/159 of the EU-Commission

Examples of PET functions

• de-identification of personal data
• “coding” (pseudonymisation) of personal data
• linking and aggregating de-identified or personal data
• controlled re-identifications
• etc.
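Coding, linking and controlled re-identification can be illustrated with a keyed hash held by a trusted third party. This is a minimal sketch under stated assumptions: the class `PseudonymisationService`, its method names and the use of HMAC-SHA256 are illustrative choices, not the method used by any particular service. The in-memory reverse table stands in for whatever protected storage a real TTP would use.

```python
import hashlib
import hmac


class PseudonymisationService:
    """Hypothetical trusted-third-party (TTP) coding service."""

    def __init__(self, secret_key: bytes) -> None:
        self._key = secret_key   # known only to the TTP
        self._reverse = {}       # pseudonym -> identifier, held only by the TTP

    def code(self, identifier: str, domain: str) -> str:
        """Produce a pseudonym ("nym") that is stable within one identity domain."""
        message = f"{domain}:{identifier}".encode("utf-8")
        nym = hmac.new(self._key, message, hashlib.sha256).hexdigest()[:16]
        self._reverse[nym] = identifier
        return nym

    def re_identify(self, nym: str, authorised: bool) -> str:
        """Controlled re-identification: refused without an authorisation."""
        if not authorised:
            raise PermissionError("re-identification requires an authorisation")
        return self._reverse[nym]
```

Because the same identifier always yields the same pseudonym inside a domain, records from different sources can be linked without exposing the identifier. Using a different domain string gives a different pseudonym, which keeps data sets in separate identity domains unlinkable for anyone without the key.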

Example of a PET application (cervical cancer research)

[Figure: data flow for the cervical cancer research case. Labels: PAP smear/clinical data; questionnaire; one-shot extraction of personal data; Privacy Protection Services; pseudonymised data; case repository (de-id. data); follow-up; live updates with personal data.]

Reduction of identifying information

[Figure: a privacy policy drives the transformation of the original personal data into de-identified data. Operations shown: delete identifier; transform date; produce nym; delete data items; encrypt data items; …]
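The operations named on this slide can be driven by a small policy table that is written once and applied to every record. The sketch below is an assumption-laden illustration: the field names, the action labels and the choice to reduce a date to its year are all invented here. The "encrypt data items" operation is left out because it needs a cryptographic library outside the standard library.

```python
import hashlib
import hmac
from datetime import date

# Hypothetical policy: one action per data item.
POLICY = {
    "patient_id": "nym",        # produce nym
    "name": "delete",           # delete identifier
    "birth_date": "year_only",  # transform date (assumed transformation)
    "free_text": "delete",      # delete data items
    "diagnosis": "keep",
}


def apply_policy(record: dict, policy: dict, key: bytes) -> dict:
    """Return a de-identified copy of the record according to the policy."""
    result = {}
    for name, value in record.items():
        # Items not covered by the policy are dropped by default.
        action = policy.get(name, "delete")
        if action == "keep":
            result[name] = value
        elif action == "nym":
            digest = hmac.new(key, str(value).encode("utf-8"), hashlib.sha256)
            result[name] = digest.hexdigest()[:12]
        elif action == "year_only":
            result[name] = value.year
        elif action == "delete":
            continue
        else:
            raise ValueError(f"unknown policy action: {action}")
    return result
```

Dropping unlisted items by default is a deliberate choice in this sketch: a new field added at the source stays out of the research data until someone decides how the policy should treat it.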

Tools for PET application (DICOM example)

[Figure: DICOM header examples, with data items marked “replaced by nym” or “cleared”.]

Make a “data protection” configuration once… run it several times…

XML example

The concept of identification

[Figure: mapping between a set of characteristics (a to h) and a set of data subjects.]

A data subject is identified (within a set of data subjects) if it can be singled out among other data subjects.

Some associations between characteristics and data subjects are more persistent in time (e.g. a national security number, date of birth) than others (e.g. an e-mail address).
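Singling out can be checked mechanically: a data subject is identified within a set if no other subject shares the same combination of characteristics. The following sketch assumes records are plain dictionaries. The function name and the sample characteristics are illustrative.

```python
from collections import Counter


def singled_out(subjects: list, characteristics: list) -> list:
    """Return the subjects whose combination of characteristics is unique in the set."""
    def signature(subject: dict) -> tuple:
        return tuple(subject[c] for c in characteristics)

    counts = Counter(signature(s) for s in subjects)
    return [s for s in subjects if counts[signature(s)] == 1]


subjects = [
    {"id": 1, "birth_year": 1970, "postcode": "9000"},
    {"id": 2, "birth_year": 1970, "postcode": "9000"},
    {"id": 3, "birth_year": 1970, "postcode": "1000"},
]
```

With `["birth_year"]` alone nobody in the sample is singled out. Adding `"postcode"` singles out subject 3, which shows how identification depends on the set of characteristics considered and on the set of data subjects it is measured against.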

Determining identifiability

“To determine whether a person is identifiable, account should be taken of all the means likely reasonably to be used either by the controller or by any other person to identify the said person; whereas the principles of protection shall not apply to data rendered anonymous in such a way that the data subject is no longer identifiable; whereas codes of conduct within the meaning of Article 27 may be a useful instrument for providing guidance as to the ways in which data may be rendered anonymous and retained in a form in which identification of the data subject is no longer possible.” (Recital 26 of the DPD)

=> refine the concept of identifiability/anonymity

take into account “means likely” and “any other person” through re-identification risk analysis

Levels of de-identification ?

(ISO/IEC DTS 25237)

• Level 1: removal of clearly identifying data (“rules of thumb”)
• Level 2: static, model based re-identification risk analysis (include “attacker models”)
• Level 3: continuous re-identification risk analysis of live databases (e.g. outlier issues)

Targets for de-identification can be set and liabilities better defined in risk analysis and policies
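A static risk analysis against a target can be sketched as follows. The measure used here, one divided by the size of the smallest group of records sharing the same quasi-identifier values, is a common simplification that assumes an attacker who knows the target is in the data set. It is an assumption of this sketch and is not taken from ISO/IEC DTS 25237. The function names are illustrative.

```python
from collections import Counter


def reidentification_risk(records: list, quasi_identifiers: list) -> float:
    """Worst-case risk: 1 / size of the smallest group sharing the same values."""
    if not records:
        raise ValueError("cannot assess an empty data set")
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return 1.0 / min(groups.values())


def meets_target(records: list, quasi_identifiers: list, max_risk: float) -> bool:
    """Compare the measured risk with the target set in the privacy policy."""
    return reidentification_risk(records, quasi_identifiers) <= max_risk
```

Running the same check on every update of a live database would move this from a Level 2 towards a Level 3 analysis, since a newly added outlier record can push the risk back to 1.0.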

Requirements for PET-TTPs

• legal status of provider must be clear and transparent
• independent of the data sources and destinations
• using state-of-the-art ICT and cryptographic technologies
• transparent service level agreements
• internal security procedures documented and verifiable (technical and organisational/procedural)
• no “security through obscurity”
• standards for service provision/interfacing
• …

PET- issues to be addressed

• differences in perception on basic concepts of identifiability
• controlled re-identification part of legislation ?
• de-identification is not “processing” in DPD sense
• trustworthy operation of PET-TTPs
• incident reporting: when, how ?
• genomic data and bio-identity
• requirements for incidental findings reporting in research
• re-identification risk analysis
• attack models
• ID management in data governance
• ...

We thank you for your attention

Any questions ?