Responsible Data Management - Nell Hodgson Woodruff School


RESPONSIBLE DATA
MANAGEMENT
MELINDA HIGGINS, PH.D.;
BRYAN WILLIAMS, PH.D.;
JUANDALYN BURKE, MPH
http://www.hhs.gov/ocr/privacy/hipaa/understanding/summary/privacysummary.pdf
HIPAA AND THE PRIVACY RULE
The Standards for Privacy of Individually Identifiable Health Information (“Privacy
Rule”) establishes, for the first time, a set of national standards for the protection of
certain health information. The U.S. Department of Health and Human Services
(“HHS”) issued the Privacy Rule to implement the requirement of the Health
Insurance Portability and Accountability Act of 1996 (“HIPAA”). The Privacy Rule
standards address the use and disclosure of individuals’ health information—called
“protected health information” by organizations subject to the Privacy Rule — called
“covered entities,” as well as standards for individuals' privacy rights to understand
and control how their health information is used.
A major goal of the Privacy Rule is to assure that individuals’ health information is
properly protected while allowing the flow of health information needed to provide
and promote high quality health care and to protect the public's health and well
being. The Rule strikes a balance that permits important uses of information, while
protecting the privacy of people who seek care and healing. Given that the health
care marketplace is diverse, the Rule is designed to be flexible
WHO IS COVERED BY THE PRIVACY RULE
• Health Plans
• Health Care Providers
• Health Care Clearinghouses
• Business Associates of Above
http://www.hhs.gov/ocr/privacy/hipaa/understanding/summary/privacysummary.pdf
WHAT IS PROTECTED HEALTH
INFORMATION (PHI)?
• The Privacy Rule protects all "individually identifiable health information"
held or transmitted by a covered entity or its business associate, in any
form or media, whether electronic, paper, or oral. The Privacy Rule calls this
information "protected health information (PHI)."
• “Individually identifiable health information” is information, including
demographic data, that relates to:
• the individual’s past, present or future physical or mental health or condition,
• the provision of health care to the individual, or
• the past, present, or future payment for the provision of health care to the
individual,
and that identifies the individual or for which there is a reasonable basis to believe
can be used to identify the individual. Individually identifiable health information
includes many common identifiers (e.g., name, address, birth date, Social Security
Number).
http://www.oshpd.ca.gov/Boards/CPHS/HIPAAIdentifiers.pdf
18 PROTECTED HEALTH INFORMATION (PHI)
IDENTIFIERS
1. Names
2. Geographic subdivisions smaller than a state (except the first three digits of a zip
code if the geographic unit formed by combining all zip codes with the same three
initial digits contains more than 20,000 people and the initial three digits of a zip
code for all such geographic units containing 20,000 or fewer people is changed to
000).
3. All elements of dates (except year) for dates directly related to an individual,
including birth date, admission date, discharge date, and date of death and all ages
over 89 and all elements of dates (including year) indicative of such age (except that
such ages and elements may be aggregated into a single category of age 90 or older)
4. Telephone numbers
5. Fax numbers
6. Electronic mail addresses (e-mail)
7. Social security numbers (SSN)
8. Medical record numbers (MRN)
http://www.oshpd.ca.gov/Boards/CPHS/HIPAAIdentifiers.pdf
Or http://www.cdc.gov/mmwr/preview/mmwrhtml/m2e411a1.htm#box2
18 PROTECTED HEALTH INFORMATION
IDENTIFIERS
9. Health plan beneficiary numbers
10. Account numbers
11. Certificate/license numbers
12. Vehicle identifiers and serial numbers, including license plate numbers
13. Device identifiers and serial numbers
14. Web Universal Resource Locators (URLs)
15. Internet Protocol (IP) address numbers
16. Biometric identifiers, including finger and voice prints
17. Full face photographic images and any comparable images
18. Any other unique identifying number, characteristic, or code (excluding a
random identifier code for the subject that is not related to or derived from
any existing identifier).
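Identifiers 2 and 3 in the list above are the ones most often handled by transformation rather than deletion. The following is a minimal sketch of those rules, not a complete de-identification procedure. The function names and the set of small-population ZIP prefixes are hypothetical; a real study would build that set from current Census data. The sketch also does not suppress birth years that would reveal an age over 89.

```python
from datetime import date

# Hypothetical, illustrative set of 3-digit ZIP prefixes whose combined
# population is 20,000 or fewer. This is NOT an official list.
SMALL_POPULATION_ZIP3 = {"036", "059", "102"}


def deidentify_zip(zip_code: str) -> str:
    """Keep the first three ZIP digits, or 000 for small-population areas."""
    zip3 = zip_code.strip()[:3]
    return "000" if zip3 in SMALL_POPULATION_ZIP3 else zip3


def deidentify_date(d: date) -> int:
    """Keep only the year of a date tied to an individual."""
    return d.year


def deidentify_age(age: int) -> str:
    """Report all ages over 89 as a single '90+' category."""
    return "90+" if age > 89 else str(age)


# Made-up record, for illustration only.
record = {"zip": "30322", "admitted": date(2014, 3, 17), "age": 93}
safe = {
    "zip3": deidentify_zip(record["zip"]),
    "admit_year": deidentify_date(record["admitted"]),
    "age": deidentify_age(record["age"]),
}
print(safe)
```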
DATES – NEEDED FOR RESEARCH BUT CAN
BE PHI
• Start / Stop Dates
• PHI ONLY IF tied to hospital admissions or discharges
• If study-related only, these dates are not PHI
• Adverse Events – hospitalizations, visits to ED, MD visits,
medical events (heart attack, stroke, etc.), death
• Visits to clinic
• if for research purposes only (and not also tied to the electronic
medical record), these are NOT PHI
• if tied to the medical record, these should be treated as PHI
• Past Medical Events – not PHI if only the YEAR is recorded
MISSING DATA – NOT PHI BUT
IMPORTANT
• Reasons for Missing Data Should be Tracked
• Death
• Serious Medical Event Resulting in Removal from Study
• Loss to Follow-up
• Withdrawal
• Record Dates Carefully!! (use 4-digit years!)
• Capture the last date for which data was obtained
• Capture the best estimate of when the last attempt at contact was made
• Use 4-digit years only (not PHI) unless the exact date is necessary
• If the exact date is needed, use the full date mm/dd/yyyy, and check the sequence!
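The date rules above (4-digit years, full mm/dd/yyyy dates, and sequence checks) can be enforced at data entry. This is a minimal sketch using only the Python standard library; the function names and the example dates are hypothetical.

```python
import re
from datetime import date, datetime

# Two-digit month and day, four-digit year, separated by slashes.
DATE_PATTERN = re.compile(r"\d{2}/\d{2}/\d{4}")


def parse_full_date(text: str) -> date:
    """Parse mm/dd/yyyy, rejecting 2-digit years and impossible dates."""
    if not DATE_PATTERN.fullmatch(text):
        raise ValueError(f"expected mm/dd/yyyy with a 4-digit year, got {text!r}")
    # strptime raises ValueError for impossible dates such as 02/30/2014.
    return datetime.strptime(text, "%m/%d/%Y").date()


def in_sequence(*dates: date) -> bool:
    """True if each date falls on or after the one before it."""
    return all(earlier <= later for earlier, later in zip(dates, dates[1:]))


# Made-up dates, for illustration only.
enrolled = parse_full_date("01/15/2014")
last_data = parse_full_date("06/30/2014")
last_contact = parse_full_date("08/02/2014")
print(in_sequence(enrolled, last_data, last_contact))
```

A date entered as 01/15/14 is rejected rather than silently read as a year in the wrong century.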
MISSING DATA – NOT PHI BUT
IMPORTANT
• Missing Data is typically coded using:
• a period ( . )
• blank space
• single letter from the alphabet (or the value NA)
• An "impossible" numerical value like -9, 99, 999
• Do NOT use 999 or 99 unless absolutely necessary – just
leave blank – unless capturing non-response options (e.g.,
refused to answer, did not answer, etc.)
• Check for missing/skipped items or missing forms ASAP –
methods for substituting missing values (imputation) are imperfect at best and
always introduce some bias!! – NEED TO MINIMIZE missing data as
much as possible (<5-10% of items, <5-10% of subjects)
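A periodic check of missing-data rates against the 5-10% guideline above might look like the following sketch. It assumes missing values are stored as blanks (None in Python) rather than as 99 or 999. The records, item names, and the choice of 10% as the cutoff are hypothetical.

```python
# Made-up survey records; None marks a missing (blank) response.
records = [
    {"id": "S01", "q1": 3, "q2": 4, "q3": 2},
    {"id": "S02", "q1": None, "q2": 5, "q3": 1},
    {"id": "S03", "q1": 2, "q2": None, "q3": None},
    {"id": "S04", "q1": 4, "q2": 3, "q3": 5},
]
items = ["q1", "q2", "q3"]
THRESHOLD = 0.10  # assumed cutoff: upper end of the 5-10% guideline


def item_missing_rates(rows, item_names):
    """Fraction of subjects with a missing value, per item."""
    return {
        name: sum(row[name] is None for row in rows) / len(rows)
        for name in item_names
    }


def subject_missing_rates(rows, item_names):
    """Fraction of items that are missing, per subject."""
    return {
        row["id"]: sum(row[name] is None for name in item_names) / len(item_names)
        for row in rows
    }


item_rates = item_missing_rates(records, items)
subject_rates = subject_missing_rates(records, items)
flagged_items = [name for name, rate in item_rates.items() if rate > THRESHOLD]
flagged_subjects = [sid for sid, rate in subject_rates.items() if rate > THRESHOLD]
print("Items over threshold:", flagged_items)
print("Subjects over threshold:", flagged_subjects)
```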
ADHERENCE AND DOSE EFFECTS
• Lack of adherence will be correlated with missing data.
• Lack of adherence can encompass non-response of some or all items
at any time point during the study
• DOSE - Not attending one or more of the contact sessions – for
example if an intervention requires 4 meetings and the subject
attends only 2 (50% dose)
• Non-compliance with protocol – either control subjects getting some
of the intervention or intervention subjects not doing any of the
intervention (e.g. exercise; dietary compliance with
recommendations). Intent to Treat addresses some of these
conceptually, but “Treatment Received” is also useful to consider
• Both Adherence and Dose effects play a large role in
“Heterogeneity of Treatment Effects” (HTE)
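The dose example above (2 of 4 meetings attended) is simple arithmetic, but recording it the same way for every subject makes it usable later as a covariate. A minimal sketch, with a hypothetical function name:

```python
def dose_received(sessions_attended: int, sessions_planned: int) -> float:
    """Fraction of the planned intervention sessions actually attended."""
    if sessions_planned <= 0:
        raise ValueError("sessions_planned must be positive")
    if not 0 <= sessions_attended <= sessions_planned:
        raise ValueError("sessions_attended must be between 0 and sessions_planned")
    return sessions_attended / sessions_planned


# A subject who attends 2 of 4 required meetings.
print(f"{dose_received(2, 4):.0%}")  # prints 50%
```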
DISSEMINATION OF DATA
• Mark all PHI (easy to do in REDCap) – or store in a separate
file and location with limited access and password
protection.
• User Security – track who has access to what (specifically
which variables/files)
• Track Data Releases – ideally all data released should be
DE-IDENTIFIED – but if data is released with PHI it MUST
BE TRACKED
• Upon Study Closeout all PHI has to be removed
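One way to track data releases is to route every export through a single function that writes a log entry. The sketch below is illustrative only: the field names, the PHI_FIELDS set, and the log structure are hypothetical, and a real project would store the log in a protected location rather than in memory.

```python
from datetime import date

PHI_FIELDS = {"name", "mrn", "birth_date"}  # hypothetical fields marked as PHI
release_log = []


def release_data(rows, fields, recipient, released_on):
    """Return only the requested fields and record the release in the log."""
    phi_released = sorted(PHI_FIELDS & set(fields))
    release_log.append({
        "recipient": recipient,
        "date": released_on.isoformat(),
        "fields": list(fields),
        "contains_phi": bool(phi_released),
        "phi_fields": phi_released,
    })
    return [{field: row[field] for field in fields} for row in rows]


# Made-up record, for illustration only.
data = [{"study_id": "S01", "name": "Jane Doe", "mrn": "000123",
         "birth_date": "1950-02-01", "score": 17}]

# De-identified release: only the study ID and the score leave the project.
shared = release_data(data, ["study_id", "score"], "analyst_a", date(2015, 1, 9))
print(shared)
print(release_log[-1]["contains_phi"])
```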
CONSISTENT DATA PROCESSING AND
UPDATING
• PI & Project Team: Need to define the process and
schedule for
• Updating (integration of all data sources: labs, clinics, etc.)
• Processing (scoring, codebook, documentation)
• Reporting (periodic checks, missing data, outliers, typos,
accuracy, recruitment demographics)
• Have Central/Shared Repository Defined (REDCap, S drives)
• Be Ready for: DSMB, Advisory Boards, PI, and Project Team –
ongoing exploration and feedback (unless blinded)
• CONSORT Table and Data Reporting NEED TO MATCH
CONSORT TABLE
(useful to add subject IDs showing who dropped out, when, and
why – internal use only – not needed for the final report)
BEFORE YOU START YOUR STUDY:
CREATE A DATA MANAGEMENT PLAN
• Data Description – what information will be gathered/collected?
Usually included in your proposal. Variables and
surveys/questionnaires listed.
• Existing Data – will any existing data be integrated?
• Format – In what format will the data be generated and maintained?
This may include reasons why formats may change.
• Metadata – data about your data; also can be referred to as your data
dictionary.
• Storage and Backup – Where will the data be stored physically and
electronically?
• Security – How will the information be protected? What permissions
or restrictions will be involved?
BEFORE YOU START YOUR STUDY:
CREATE A DATA MANAGEMENT PLAN
• Roles/Responsibility – Who will be responsible and involved with the data
management?
• Intellectual property rights – Who holds the property rights to the data? Are
there any copyright restraints?
• Access and Sharing – How will the data be shared? What are the access
procedures (e.g., do users have open access to all data, or only specific user groups)?
• Audience – Who are your secondary users of the data (e.g., students or
another research team)?
• Archiving Data – What are the procedures needed to archive the data? What
will be archived for long term preservation?
• Ethics and Privacy – How does the informed consent address how the participant
will be protected and how the data will be used in future research?
EXAMPLE OF METADATA: “DATA ABOUT YOUR DATA”
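A data dictionary can be as simple as one entry per variable listing its label, type, units, allowed range, and whether it is PHI. The sketch below is a hypothetical illustration; the variable names, ranges, and layout are assumptions, not taken from any particular study.

```python
# Each entry describes one variable: its meaning, type, and allowed values.
# All variable names and ranges here are hypothetical.
data_dictionary = {
    "study_id": {"label": "Random study identifier", "type": "text", "phi": False},
    "visit_year": {"label": "Year of clinic visit", "type": "integer",
                   "units": "4-digit year", "phi": False},
    "sbp": {"label": "Systolic blood pressure", "type": "integer",
            "units": "mmHg", "min": 60, "max": 260, "phi": False},
    "mrn": {"label": "Medical record number", "type": "text", "phi": True},
}

# Print a simple codebook listing.
for name, meta in data_dictionary.items():
    flag = "PHI" if meta["phi"] else ""
    print(f"{name:<11}{meta['type']:<9}{flag:<4}{meta['label']}")

# Variables that must be stored separately or removed at study closeout.
phi_variables = [name for name, meta in data_dictionary.items() if meta["phi"]]
```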
BEFORE YOU START YOUR STUDY:
DATA SOURCES, COLLECTION SYSTEMS
• Information Flow chart – how is information moving throughout
your study and with whom (and how often)?
Helps to Identify how the data will be:
• Tracked
• Merged
• Stored
• Various sources of data – Participant response through surveys,
Electronic Medical Records (EMR) extraction, lab reports,
emailing, phone calls, web-based apps, etc.
• Data collection systems – custom built (Microsoft Access, etc.)
versus web-based (REDCap, etc.) [SON standard is REDCap]
DATA/INFORMATION FLOW CHART
Lee, L.M., Teutsch, S.M., Thacker, S.B., & St. Louis, M.E. (Eds.) (2010). Principles and practice of public health surveillance (3rd ed.). New York:
Oxford University Press.
OTHER ITEMS TO CONSIDER
DATA INTEGRITY
• Self-report versus direct measure
• Establish the standards and rules for the content that will be
transferred into the database to avoid jeopardizing the data.
• Other integrity issues – measurement tools, metrics,
indices, etc. used
• Establish a system involving error checking and validation
procedures.
• Example (1): numerical fields should not accept
alphabetical data.
• Example (2): ranges for scoring should be established, and
out-of-range values should be rejected.
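Both example checks can be combined in a single validation step at data entry. This is a minimal sketch; the function name and the 1-5 scoring range are hypothetical.

```python
def validate_score(raw: str, low: int, high: int) -> int:
    """Convert an entered value to an integer and enforce the allowed range."""
    try:
        value = int(raw.strip())
    except ValueError:
        # Example (1): alphabetical data in a numerical field is rejected.
        raise ValueError(f"{raw!r} is not a whole number") from None
    if not low <= value <= high:
        # Example (2): values outside the scoring range are rejected.
        raise ValueError(f"{value} is outside the allowed range {low}-{high}")
    return value


# Hypothetical item scored on a 1-5 scale.
for entry in ["3", "abc", "7"]:
    try:
        print(entry, "->", validate_score(entry, low=1, high=5))
    except ValueError as err:
        print(entry, "-> rejected:", err)
```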
ANY QUESTIONS?
Statistics Help (SAS, SPSS, etc.)
• www.statisticshell.com
• www.lynda.com
• www.khanacademy.com
REDCap
• REDCap course on Coursera