CSE5810
Computer Science Issues from a Patient’s Perspective
Exploring the application of data mining to Bioinformatics
Guanming Wu
Computer Science & Engineering Department
University of Connecticut
Background
Nature of being asynchronous
Management of patients’ medical records
Lack of a full (or fuller) picture
Speciality
Multiple medical providers
Transparency issue
Separate medical records
Motivation
Developed from a patient’s perspective
More outsider-friendly
Usable by participants at other levels and in other domains
Improvements in efficiency
Goal
Higher data transparency
Better comprehensibility
Higher efficiency
Basic Information
Progress in bioinformatics research:
Literature extraction based on keywords
Literature extraction from words combined with their context
Information extraction from natural language (with a high error rate)
Automated database curation and ontology development
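The difference between the first two extraction approaches listed above can be illustrated with a minimal sketch. It is an assumption-laden illustration, not the method of any particular system: the function names, the five-word window, and the sample abstracts are all hypothetical, and the keyword is assumed to be a single word.

```python
import re

def keyword_hits(abstracts, keyword):
    """Keyword-based extraction: keep every abstract that mentions the keyword."""
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
    return [a for a in abstracts if pattern.search(a)]

def keyword_in_context(abstracts, keyword, context_terms, window=5):
    """Keyword plus context: keep an abstract only if a context term
    occurs within `window` words of the keyword (hypothetical rule)."""
    kw = keyword.lower()
    ctx = {t.lower() for t in context_terms}
    hits = []
    for a in abstracts:
        words = re.findall(r"[A-Za-z0-9]+", a.lower())
        for i, w in enumerate(words):
            nearby = set(words[max(0, i - window): i + window + 1])
            if w == kw and ctx & nearby:
                hits.append(a)
                break
    return hits

# Made-up sample abstracts, for illustration only.
abstracts = [
    "BRCA1 mutation raises breast cancer risk.",
    "The BRCA1 antibody was purchased from a vendor.",
    "Unrelated text about yeast.",
]
by_keyword = keyword_hits(abstracts, "BRCA1")
by_context = keyword_in_context(abstracts, "BRCA1", {"mutation", "cancer"})
```

In this toy data the keyword alone matches two abstracts, while the context rule keeps only the one that discusses the gene in a disease setting.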
Data storage
Update (download) medical data on demand
Store the data locally
Do not upload locally generated data
One database per account
Architecture
Sequence
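The storage rules on this slide could be sketched roughly as follows. This is only a minimal illustration under stated assumptions: SQLite is swapped in as the local store, and the table layout, the function names, and the `fetch_records` callback are hypothetical and not taken from the slides.

```python
import sqlite3
from pathlib import Path

def open_account_db(base_dir, account_id):
    """One database per account: each account gets its own SQLite file."""
    path = Path(base_dir) / f"{account_id}.sqlite3"
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records ("
        " record_id TEXT PRIMARY KEY,"
        " payload TEXT NOT NULL,"
        " origin TEXT NOT NULL CHECK (origin IN ('provider', 'local')))"
    )
    return conn

def update_on_demand(conn, fetch_records):
    """Download only when the caller asks, then store the data locally.
    `fetch_records` is a hypothetical callable returning (id, payload) pairs."""
    rows = [(rid, payload, "provider") for rid, payload in fetch_records()]
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT OR REPLACE INTO records VALUES (?, ?, ?)", rows)
    return len(rows)

def add_local_record(conn, record_id, payload):
    """Locally generated data stays in the local database.
    There is deliberately no upload function in this sketch."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO records VALUES (?, ?, 'local')",
            (record_id, payload),
        )
```

Tagging each row with its origin keeps downloaded and locally generated data distinguishable, so a later sync step could refresh provider data without touching local entries.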
Standardization & quantization
Diverse formats of medical records
Medical test reports
Diagnosis
Prescriptions
Standardization
Easier to process
More comprehensible
Detail-oriented
Formats
Quantization
Tests and results
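As a rough illustration of the two steps on this slide, the sketch below maps records from two provider formats onto one schema (standardization) and turns a free-text test result into a number plus a unit (quantization). Both input formats, the field names, and the `LabResult` type are hypothetical assumptions made for the example.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class LabResult:
    """Hypothetical common schema for one test result."""
    test_name: str
    value: float
    unit: str

_VALUE_UNIT = re.compile(r"^\s*([-+]?\d+(?:\.\d+)?)\s*(\S.*)?$")

def quantize(raw):
    """Split a string such as '6.1 %' into a numeric value and a unit."""
    m = _VALUE_UNIT.match(raw)
    if m is None:
        raise ValueError(f"cannot quantize {raw!r}")
    return float(m.group(1)), (m.group(2) or "").strip()

def standardize(record):
    """Map either assumed provider format onto LabResult.
    Format A (assumed): {"test": "HbA1c", "result": "6.1 %"}
    Format B (assumed): {"name": "hba1c", "val": 6.1, "units": "%"}"""
    if "result" in record:
        value, unit = quantize(record["result"])
        name = record["test"]
    else:
        value, unit = float(record["val"]), record["units"]
        name = record["name"]
    return LabResult(name.strip().lower(), value, unit)

a = standardize({"test": "HbA1c", "result": "6.1 %"})
b = standardize({"name": "hba1c", "val": 6.1, "units": "%"})
```

Once both records share one schema, they compare equal and can be processed by the same downstream code, which is the practical payoff of standardizing first.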
Application of data mining (1)
Data Mining
Helps medical systems benefit more from data and analytics
Helps improve user-friendliness
Helps reduce inefficiency and long-term costs
Application of data mining (2)
Analysis of large datasets to discover their patterns
Use the patterns to build models
Predict the likelihood of a patient having a certain type of disease
Predict the (possible) stage of a patient’s disease
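One very simple way to turn patterns in historical data into a likelihood estimate is a frequency table. The sketch below is an illustrative stand-in, not the model used in this work: the binning rule, the add-one smoothing, and all the numbers are made up for the example.

```python
from collections import defaultdict

def fit_likelihoods(records, bin_of):
    """Estimate P(disease | bin) from (test_value, has_disease) pairs.
    Add-one (Laplace) smoothing keeps small bins away from 0 or 1."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for value, has_disease in records:
        b = bin_of(value)
        totals[b] += 1
        positives[b] += int(has_disease)
    return {b: (positives[b] + 1) / (totals[b] + 2) for b in totals}

def bin_of(value):
    """Hypothetical two-bin rule; the cut-off of 100 is arbitrary."""
    return "high" if value >= 100 else "low"

# Made-up historical records: (test value, diagnosed with the disease?)
history = [(90, False), (95, False), (110, True), (130, True), (105, False)]
model = fit_likelihoods(history, bin_of)

# Likelihood estimate for a new patient whose test value is 120.
estimate = model[bin_of(120)]
```

With this toy data the estimate for the "high" bin is 0.6 and for the "low" bin 0.25. Such a number is a statistical hint drawn from past records and would need far more data and validation before it could inform any real decision.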
Application of data mining (3)
1 For patients:
Chronic disease management
Reminders/warnings
etc.
2 For medical professionals/administration
Follow-up of patients with chronic disease(s)
Patient/professional/staffing management
etc.
[Diagram: model workflow. Box labels: Original database; Data; Local database; Train data; Generate models; Test models; Run models; Result. Annotations: from different providers; in different formats; in different orders.]
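The workflow named in the diagram (train data, generate models, test models, run models) might look roughly like the sketch below. It is a deliberately tiny stand-in: the single-threshold model, the 25% hold-out split, the function names, and the synthetic data are all assumptions made for illustration.

```python
import random

def split(records, test_fraction=0.25, seed=0):
    """Shuffle reproducibly and hold out a fraction of records for testing."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def generate_model(train):
    """Generate a model: the threshold with the fewest training errors."""
    candidates = sorted({value for value, _ in train})
    def errors(threshold):
        return sum((value >= threshold) != label for value, label in train)
    return min(candidates, key=errors)

def run_model(threshold, value):
    """Run the model on one new value."""
    return value >= threshold

def evaluate_model(threshold, test):
    """Test the model: fraction of held-out records predicted correctly."""
    correct = sum(run_model(threshold, v) == label for v, label in test)
    return correct / len(test)

# Synthetic records: (test value, label). Not real patient data.
records = [(v, v >= 50) for v in range(0, 100, 5)]
train, test = split(records)
threshold = generate_model(train)
accuracy = evaluate_model(threshold, test)
```

Keeping the test records out of model generation is what makes the measured accuracy a fair estimate of how the model behaves on data it has not seen.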
Error rate issue
Not solid evidence/proof
Recommendation
Decision Making
Thank you
for your attention (and patience)