The Virtual Microscope - Biomedical Informatics
The Virtual Microscope
Umit V. Catalyurek
Department of Biomedical Informatics
Division of Data Intensive and Grid Computing
The Virtual Microscope
Joel Saltz
Renato Ferreira
Michael Beynon
Chialin Chang
Alan Sussman
Tahsin Kurc
Robert Miller
Angelo Demarzo
Mark Silberman
Asmara Afework
Anthony Wiegering
Virtual Microscope (VM)
Interactive software emulation of a high-power light
microscope for processing image datasets
visualize and explore microscopy images
screen for cancer
categorize images for associative retrieval
electronic capture of slide examination process used in
resident training
collaborative diagnosis
Virtual Microscope (Hopkins/UMD), Distributed
Telemicroscopy System (Rutgers), [Gu] Virtual
Telemicroscope, Virtual Microscopy (UPMC), Baccus
Virtual Microscope
The Virtual Microscope
Data requirement
Full cases consisting of multiple digitized glass
slides with data acquired at 400X
Single spot: 1000x1000 pixels, 3-byte RGB = 3 MB
A slide of 2.5 cm x 3.5 cm requires a 50x70 grid of spots =
~10 GB uncompressed
Each slide can have multiple focal planes
Johns Hopkins alone generates 500,000 slides per
year
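The storage figures above follow from simple arithmetic. A minimal sketch, assuming decimal units (1 MB = 10^6 bytes) and the 50x70 grid of 1000x1000-pixel spots quoted on the slide; the function and constant names are illustrative only:

```python
# Back-of-the-envelope storage estimate for one digitized slide.
BYTES_PER_PIXEL = 3            # 3-byte RGB, as stated on the slide
SPOT_SIDE_PIXELS = 1000        # one spot is 1000x1000 pixels
GRID_COLS, GRID_ROWS = 50, 70  # spots covering a 2.5 cm x 3.5 cm slide


def spot_bytes() -> int:
    """Uncompressed size of a single spot in bytes."""
    return SPOT_SIDE_PIXELS * SPOT_SIDE_PIXELS * BYTES_PER_PIXEL


def slide_bytes(focal_planes: int = 1) -> int:
    """Uncompressed size of a whole slide; each focal plane is a full copy."""
    return GRID_COLS * GRID_ROWS * spot_bytes() * focal_planes


if __name__ == "__main__":
    print(f"one spot : {spot_bytes() / 1e6:.1f} MB")
    print(f"one slide: {slide_bytes() / 1e9:.1f} GB")
```

With these inputs a single focal plane comes to about 10.5 GB, which the slide rounds to 10 GB; every additional focal plane multiplies the total.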
The Virtual Microscope
Client-server architecture
Java 1.2 Client: portability
Data storage & image compression: more efficient storage, reduced transmission time
2 server implementations:
Customized instance of Active Data Repository
Component-based implementation using DataCutter
Heterogeneous systems, portability, user-defined processing
Caching in the VM Client
Improved scalability, portability, user-defined processing
Improved response time
Experimental Results
VM Client
Image Declustering
0 3 4 5 2 3 4 7
1 2 7 6 1 0 5 6
6 5 0 1 6 7 2 1
7 4 3 2 5 4 3 0
0 1 6 7 0 1 6 7
3 2 5 4 3 2 5 4
4 7 0 3 4 7 0 3
5 6 1 2 5 6 1 2
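One way to read the grid above: each cell is an image block labeled with the disk (0 to 7) that stores it, so a rectangular query touches several disks at once. A minimal sketch that copies the 8x8 map as it appears in this transcript and counts how many blocks of a query window fall on each disk; the function name and window convention are assumptions for illustration, not part of the VM code:

```python
from collections import Counter

# Block-to-disk assignment, copied row by row from the slide.
DISK_MAP = [
    [0, 3, 4, 5, 2, 3, 4, 7],
    [1, 2, 7, 6, 1, 0, 5, 6],
    [6, 5, 0, 1, 6, 7, 2, 1],
    [7, 4, 3, 2, 5, 4, 3, 0],
    [0, 1, 6, 7, 0, 1, 6, 7],
    [3, 2, 5, 4, 3, 2, 5, 4],
    [4, 7, 0, 3, 4, 7, 0, 3],
    [5, 6, 1, 2, 5, 6, 1, 2],
]


def blocks_per_disk(row0: int, col0: int, rows: int, cols: int) -> Counter:
    """Count blocks per disk for the window whose top-left block is (row0, col0)."""
    counts = Counter()
    for r in range(row0, row0 + rows):
        for c in range(col0, col0 + cols):
            counts[DISK_MAP[r][c]] += 1
    return counts


if __name__ == "__main__":
    window = blocks_per_disk(0, 0, 2, 4)
    print("blocks per disk:", sorted(window.items()))
    print("busiest disk serves", max(window.values()), "block(s)")
```

For the 2x4 window at the top-left corner, the eight blocks land on eight different disks, so they could all be read in parallel. Over the full grid, each disk holds 8 of the 64 blocks.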
Image Compression
JPEG compression: reduces storage and network
data by a factor of about 10
images may still take a long time to transmit
For example, a 640x480 image:
~920 KB uncompressed
~90 KB JPEG compressed
~13 seconds to transfer over a 56 Kbps modem
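The transfer-time estimate above can be reproduced directly. A minimal sketch, assuming 3 bytes per pixel, a 10:1 JPEG ratio, decimal kilobytes, and an ideal 56 kbit/s link with no protocol overhead; the function names are illustrative:

```python
def raw_bytes(width: int, height: int, bytes_per_pixel: int = 3) -> int:
    """Uncompressed image size in bytes."""
    return width * height * bytes_per_pixel


def transfer_seconds(num_bytes: float, link_bits_per_second: float) -> float:
    """Ideal transfer time: total bits divided by link rate."""
    return num_bytes * 8 / link_bits_per_second


if __name__ == "__main__":
    raw = raw_bytes(640, 480)   # 921,600 bytes
    compressed = raw / 10       # assumed 10:1 JPEG compression
    print(f"uncompressed: {raw / 1000:.0f} KB")
    print(f"compressed  : {compressed / 1000:.0f} KB")
    print(f"transfer    : {transfer_seconds(compressed, 56_000):.1f} s")
```

Even after compression, a single screen-sized view takes on the order of ten seconds over a modem, which is the motivation for caching data in the client.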
Active Data Repository (ADR)
A C++ class library and runtime system for
building parallel databases of multidimensional datasets
enables integration of storage, retrieval and processing
of multiple datasets on parallel machines and clusters.
provides support for common operations such as data
retrieval, memory management, scheduling of processing
across a parallel machine.
can be customized for various applications.
Front-end: the interface between clients and the back-end.
Back-end: data storage, retrieval, and processing.
Distributed memory parallel machine or cluster, with
multiple disks attached to each node
Customizable services for application-specific processing
Virtual Microscope with ADR
[Figure: ADR-based Virtual Microscope architecture. Multiple clients send queries (slide number, focal plane, magnification, region of interest) to the Virtual Microscope front-end and receive image blocks in return.]
Front-end services: Query Interface Service, Query Submission Service
Back-end services: Dataset Service, Indexing Service, Data Aggregation Service, Attribute Space Service, Query Execution Service, Query Planning Service
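The query fields listed above (slide number, focal plane, magnification, region of interest) map naturally onto a small record. The following is a hypothetical sketch, not the ADR interface: the class, its field names, and the 256-pixel block size are assumptions made for illustration. It also shows how a region of interest could be turned into the set of stored image blocks it overlaps.

```python
from dataclasses import dataclass

BLOCK_SIDE = 256  # assumed block size in pixels (hypothetical)


@dataclass(frozen=True)
class VMQuery:
    slide_number: int
    focal_plane: int
    magnification: int  # e.g. 400 for 400X
    x: int              # left edge of the region of interest, in pixels
    y: int              # top edge of the region of interest, in pixels
    width: int
    height: int


def overlapping_blocks(q: VMQuery, block_side: int = BLOCK_SIDE):
    """Return (row, col) indices of all blocks that intersect the region."""
    first_col = q.x // block_side
    last_col = (q.x + q.width - 1) // block_side
    first_row = q.y // block_side
    last_row = (q.y + q.height - 1) // block_side
    return [(r, c)
            for r in range(first_row, last_row + 1)
            for c in range(first_col, last_col + 1)]


if __name__ == "__main__":
    query = VMQuery(slide_number=1, focal_plane=0, magnification=400,
                    x=300, y=100, width=512, height=512)
    print(overlapping_blocks(query))
```

A region that is not aligned to block boundaries pulls in partial blocks on every side, so the server has to clip the retrieved data before returning it.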
DataCutter
A suite of middleware for subsetting and filtering multi-dimensional
datasets stored in a distributed environment
Indexing Service
Multilevel hierarchical indexes based on spatial indexing methods –
e.g., R-trees
Filtering Service
Distributed C++ component framework
Specialized components for processing data
filters – logical units of computation (high-level tasks)
streams – how filters communicate
init, process, finalize interface
unidirectional buffer pipes
uses fixed size buffers (min, good)
manually specify filter connectivity and filter-level characteristics
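The filter-stream model above (filters with an init, process, finalize interface, connected by unidirectional streams of fixed-size buffers) can be illustrated with a small single-process sketch. DataCutter itself is a distributed C++ framework; the Python classes below are hypothetical stand-ins for the concepts, not its API, and they run filters one after another rather than concurrently.

```python
from collections import deque


class Stream:
    """Unidirectional pipe that carries fixed-size buffers between filters."""

    def __init__(self, buffer_size: int):
        self.buffer_size = buffer_size
        self._queue = deque()

    def write(self, data: bytes) -> None:
        # Split the payload into buffers of at most buffer_size bytes.
        for i in range(0, len(data), self.buffer_size):
            self._queue.append(data[i:i + self.buffer_size])

    def read(self):
        # Return the next buffer, or None when the stream is drained.
        return self._queue.popleft() if self._queue else None


class Filter:
    """Base class mirroring the init / process / finalize life cycle."""

    def init(self): pass
    def process(self, inp, out): raise NotImplementedError
    def finalize(self): pass


class Source(Filter):
    def __init__(self, payload: bytes):
        self.payload = payload

    def process(self, inp, out):
        out.write(self.payload)


class Upper(Filter):
    def process(self, inp, out):
        for buf in iter(inp.read, None):
            out.write(buf.upper())


def run_pipeline(filters, buffer_size: int = 4):
    """Connect filters in a chain by hand and run each to completion."""
    inp = None
    for f in filters:
        out = Stream(buffer_size)
        f.init()
        f.process(inp, out)
        f.finalize()
        inp = out
    return list(iter(inp.read, None))


if __name__ == "__main__":
    print(run_pipeline([Source(b"virtual microscope"), Upper()]))
```

The chain is wired explicitly in `run_pipeline`, which loosely mirrors the point that filter connectivity is specified manually.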
Virtual Microscope with DataCutter
DC-5F
read_data
decompress
clip
zoom
view
DC-3F
read_data
decompress
clip-zoom-view
DC-2F
read_data
decompress-clip-zoom-view
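The three configurations above differ only in how the same five steps are grouped into filters. A toy sketch of that idea, assuming each step is a plain function on a 2D list of pixel values; the step bodies (a generated test image, an identity decompress, a fixed clip window, subsampling by 2) are simplified placeholders, not the VM implementation.

```python
from functools import reduce


def read_data(_):
    # Placeholder source: an 8x8 image whose pixel value is row * 8 + col.
    return [[r * 8 + c for c in range(8)] for r in range(8)]


def decompress(img):
    return img  # placeholder: the toy data is already uncompressed


def clip(img):
    return [row[2:6] for row in img[2:6]]  # keep a fixed 4x4 region


def zoom(img):
    return [row[::2] for row in img[::2]]  # subsample by a factor of 2


def view(img):
    return img  # placeholder for final delivery to the client


def fuse(*steps):
    """Combine several steps into a single filter (one callable)."""
    return lambda data: reduce(lambda acc, step: step(acc), steps, data)


def run(filters):
    """Pass data through the filters in order, starting from no input."""
    return reduce(lambda acc, f: f(acc), filters, None)


DC_5F = [read_data, decompress, clip, zoom, view]
DC_3F = [read_data, decompress, fuse(clip, zoom, view)]
DC_2F = [read_data, fuse(decompress, clip, zoom, view)]

if __name__ == "__main__":
    for name, layout in [("DC-5F", DC_5F), ("DC-3F", DC_3F), ("DC-2F", DC_2F)]:
        print(name, len(layout), "filters ->", run(layout))
```

All three groupings compute the same output. What changes in a distributed setting is how many streams the data crosses and how freely individual filters can be placed on different machines.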
Caching in the Client
Reduce data re-transmission
Cache part of the retrieved data in the client
Cache multiple resolutions/magnifications
Cache only what the user views
Two-level cache
client memory is the first level cache
local disk on the client machine is the second level
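A minimal sketch of the two-level idea above: a small in-memory cache backed by files on local disk. The class name, the key format (slide, magnification, block), the capacity, and the least-recently-used eviction policy are assumptions for illustration; the slides do not specify how the VM client implements its cache.

```python
import os
import tempfile
from collections import OrderedDict


class TwoLevelCache:
    """Level 1: in-memory LRU. Level 2: one file per block on local disk."""

    def __init__(self, memory_capacity: int, disk_dir: str):
        self.memory_capacity = memory_capacity
        self.disk_dir = disk_dir
        self.memory = OrderedDict()  # key -> bytes, most recently used last

    def _path(self, key) -> str:
        return os.path.join(self.disk_dir, "_".join(map(str, key)) + ".blk")

    def put(self, key, data: bytes) -> None:
        self.memory[key] = data
        self.memory.move_to_end(key)
        # Spill least recently used blocks to disk when memory is full.
        while len(self.memory) > self.memory_capacity:
            old_key, old_data = self.memory.popitem(last=False)
            with open(self._path(old_key), "wb") as f:
                f.write(old_data)

    def get(self, key):
        """Return (data, level), where level is 'memory', 'disk' or 'miss'."""
        if key in self.memory:
            self.memory.move_to_end(key)
            return self.memory[key], "memory"
        path = self._path(key)
        if os.path.exists(path):
            with open(path, "rb") as f:
                data = f.read()
            self.put(key, data)  # promote back into memory
            return data, "disk"
        return None, "miss"  # caller would fetch from the server here


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cache = TwoLevelCache(memory_capacity=2, disk_dir=tmp)
        cache.put((1, 400, 0), b"block-a")
        cache.put((1, 400, 1), b"block-b")
        cache.put((1, 100, 0), b"block-c")  # evicts block-a to disk
        print(cache.get((1, 400, 0)))
        print(cache.get((7, 400, 0)))
```

Including the magnification in the key lets blocks of the same region at different resolutions coexist in the cache, in line with caching multiple resolutions.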
Caching Multiresolution Images
VM Server Performance
ADR VM Server Performance
VM ADR Server under workload
[Figure: two panels, "Average Response Time for 512x512 Output" and "Average Response Time for 1024x1024 Output". x-axis: Number of Processors (1, 2, 4, 8); y-axis: Response Time (seconds). Series: ADR, ADR-1bg, ADR-4bg, ADR-16bg.]
VM Servers: ADR vs DC
[Figure: two panels, "Average Response Time for 512x512 Output" and "Average Response Time for 1024x1024 Output". x-axis: Number of Processors (1, 2, 4, 8); y-axis: Response Time (seconds). Series: ADR, DC-5F, DC-2F.]
VM Servers: ADR vs DC
[Figure: four panels, each titled "Average Response Time for 512x512 Output" in the transcript. x-axis: Number of Processors; y-axis: Response Time (seconds). Series by panel: ADR, DC-5F, DC-2F; ADR-1bg, DC-5F-1bg, DC-2F-1bg; ADR-4bg, DC-5F-4bg, DC-2F-4bg; ADR-16bg, DC-5F-16bg, DC-2F-16bg.]
VM: ADR vs DC on SMP
[Figure: two panels, "Average Response Time for 512x512 Output" and "Average Response Time for 1024x1024 Output". x-axis: Number of Clients (1, 2, 4, 8); y-axis: Response Time (seconds). Series: ADR, 8x(R-DCZV), 4x(2xR-2xDCZV), 2x(4xR-4xDCZV), 4x(2xR-4xD-2xCZV).]
Caching Client Performance
Summary
2 VM servers:
Java 1.2 Client
Homogeneous systems: tightly coupled parallel machines with attached local disks
Heterogeneous systems, grid
Multiresolution image caching
Try
http://vmscope.jhmi.edu
End of Talk