The Virtual Microscope - Biomedical Informatics


The Virtual Microscope
Umit V. Catalyurek
Department of Biomedical Informatics
Division of Data Intensive and Grid Computing
Joel Saltz
Renato Ferreira
Michael Beynon
Chialin Chang
Alan Sussman
Tahsin Kurc
Robert Miller
Angelo Demarzo
Mark Silberman
Asmara Afework
Anthony Wiegering
Virtual Microscope (VM)

Interactive software emulation of a high-power light microscope for processing image datasets
  visualize and explore microscopy images
  screen for cancer
  categorize images for associative retrieval
  electronic capture of the slide examination process used in resident training
  collaborative diagnosis
Virtual Microscope (Hopkins/UMD), Distributed Telemicroscopy System (Rutgers), [Gu] Virtual Telemicroscope, Virtual Microscopy (UPMC), Baccus Virtual Microscope
Data requirement

Full cases consisting of multiple digitized glass slides with data acquired at 400X
Single spot: 1000x1000 pixels, 3-byte RGB = 3 MB
A slide of 2.5 cm x 3.5 cm requires a 50x70 grid of spots, about 10 GB uncompressed
Each slide can have multiple focal planes
Johns Hopkins alone generates 500,000 slides per year
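
The storage figures above follow from simple arithmetic. The sketch below reproduces them from the quoted numbers only; the yearly total is an illustrative extrapolation that assumes every slide is digitized at full resolution with a single focal plane, which the slides do not claim.

```python
# Back-of-the-envelope storage estimate using the figures quoted above.
BYTES_PER_PIXEL = 3            # 3-byte RGB
SPOT_SIDE_PIXELS = 1000        # one spot is 1000x1000 pixels
GRID_COLS, GRID_ROWS = 50, 70  # spots covering a 2.5 cm x 3.5 cm slide
SLIDES_PER_YEAR = 500_000      # Johns Hopkins figure from the slide


def spot_bytes():
    """Uncompressed size of one spot in bytes."""
    return SPOT_SIDE_PIXELS * SPOT_SIDE_PIXELS * BYTES_PER_PIXEL


def slide_bytes(focal_planes=1):
    """Uncompressed size of one slide, optionally with several focal planes."""
    return GRID_COLS * GRID_ROWS * spot_bytes() * focal_planes


def yearly_bytes(focal_planes=1):
    """Hypothetical yearly volume if every slide were fully digitized."""
    return SLIDES_PER_YEAR * slide_bytes(focal_planes)


if __name__ == "__main__":
    print(f"spot:  {spot_bytes() / 1e6:.1f} MB")
    print(f"slide: {slide_bytes() / 1e9:.1f} GB")
    print(f"year:  {yearly_bytes() / 1e15:.2f} PB")
```

Each extra focal plane multiplies the per-slide figure, so `slide_bytes(focal_planes=2)` doubles it.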
Client-server architecture
Java 1.2 Client
  Portability
Data storage & Image compression
  More efficient storage, reduced transmission time
2 server implementations:
  Customized instance of Active Data Repository
  Component-based implementation using DataCutter
  Heterogeneous systems, portability, user-defined processing
  Improved scalability, portability, user-defined processing
Caching in the VM Client
  Improved response time
Experimental Results
VM Client
Image Declustering
0 3 4 5 2 3 4 7
1 2 7 6 1 0 5 6
6 5 0 1 6 7 2 1
7 4 3 2 5 4 3 0
0 1 6 7 0 1 6 7
3 2 5 4 3 2 5 4
4 7 0 3 4 7 0 3
5 6 1 2 5 6 1 2
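
One way to read the grid above is as a disk assignment: each entry is the disk (0-7) holding the image block at that position. That reading is an assumption based on the slide title. Under it, the sketch below checks how evenly blocks are spread and how many blocks the busiest disk must read for a rectangular query window. The function names are illustrative, not part of any VM code.

```python
from collections import Counter

# Grid copied from the slide; entry [r][c] is assumed to be the disk
# that stores the image block at row r, column c.
GRID = [
    [0, 3, 4, 5, 2, 3, 4, 7],
    [1, 2, 7, 6, 1, 0, 5, 6],
    [6, 5, 0, 1, 6, 7, 2, 1],
    [7, 4, 3, 2, 5, 4, 3, 0],
    [0, 1, 6, 7, 0, 1, 6, 7],
    [3, 2, 5, 4, 3, 2, 5, 4],
    [4, 7, 0, 3, 4, 7, 0, 3],
    [5, 6, 1, 2, 5, 6, 1, 2],
]


def blocks_per_disk(grid):
    """Count how many blocks each disk stores over the whole grid."""
    return Counter(disk for row in grid for disk in row)


def window_load(grid, top, left, height, width):
    """Count blocks per disk inside a rectangular query window."""
    return Counter(
        grid[r][c]
        for r in range(top, top + height)
        for c in range(left, left + width)
    )


def max_disk_load(grid, top, left, height, width):
    """Largest number of blocks any single disk must read for the window."""
    return max(window_load(grid, top, left, height, width).values())
```

With this grid every disk stores 8 of the 64 blocks, and a 2x4 window at the top-left corner touches each of the 8 disks exactly once, so all disks can be read in parallel.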
Image Compression

JPEG compression
  storage and network data reduction by a factor of 10
  may still take a long time to transmit images
For example, a 640x480 image
  920 KB uncompressed
  ~ 90 KB JPEG compressed
  ~ 13 seconds to transfer over a 56 kbps modem
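
The example numbers can be checked with a short calculation. This sketch assumes 3 bytes per pixel, a compression ratio of exactly 10, and a link that sustains its full nominal rate with no protocol overhead; real transfers would be somewhat slower.

```python
def image_bytes(width, height, bytes_per_pixel=3):
    """Uncompressed image size in bytes."""
    return width * height * bytes_per_pixel


def transfer_seconds(size_bytes, link_bits_per_second):
    """Idealized transfer time: no latency, no protocol overhead."""
    return size_bytes * 8 / link_bits_per_second


if __name__ == "__main__":
    raw = image_bytes(640, 480)   # 921,600 bytes, i.e. about 920 KB
    compressed = raw / 10         # assumed 10:1 JPEG ratio
    modem = 56_000                # bits per second
    print(f"raw:        {transfer_seconds(raw, modem):.0f} s")
    print(f"compressed: {transfer_seconds(compressed, modem):.0f} s")
```

Even after compression a single 640x480 view takes about 13 seconds on such a link, which is the motivation for caching in the client.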
Active Data Repository (ADR)

A C++ class library and runtime system for building parallel databases of multidimensional datasets
  enables integration of storage, retrieval and processing of multiple datasets on parallel machines and clusters
  provides support for common operations such as data retrieval, memory management, and scheduling of processing across a parallel machine
  can be customized for various applications
Front-end: the interface between clients and the back-end
Back-end: data storage, retrieval, and processing
  Distributed memory parallel machine or cluster, with multiple disks attached to each node
  Customizable services for application-specific processing
Virtual Microscope with ADR
[Diagram: multiple clients send queries to the Virtual Microscope front-end and receive image blocks in return]
Query:
* Slide number
* Focal plane
* Magnification
* Region of interest
Front-end: Query Submission Service, Query Interface Service
Back-end: Dataset Service, Indexing Service, Data Aggregation Service, Attribute Space Service, Query Execution Service, Query Planning Service
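
A client query of the kind listed above could be modeled as a small record, with the region of interest mapped to the image blocks it overlaps. This is a minimal sketch, not the ADR interface: the `VMQuery` class, the `blocks_for_region` helper, and the 1000-pixel block size are all assumptions made for illustration.

```python
from dataclasses import dataclass

BLOCK_SIDE = 1000  # assumed block side in pixels (hypothetical)


@dataclass(frozen=True)
class VMQuery:
    """Hypothetical query record holding the four fields named on the slide."""
    slide_number: int
    focal_plane: int
    magnification: int  # e.g. 400 for 400X
    region: tuple       # (x, y, width, height) in full-resolution pixels


def blocks_for_region(region, block_side=BLOCK_SIDE):
    """Return (row, col) indices of every block the region overlaps."""
    x, y, width, height = region
    first_col, last_col = x // block_side, (x + width - 1) // block_side
    first_row, last_row = y // block_side, (y + height - 1) // block_side
    return [
        (r, c)
        for r in range(first_row, last_row + 1)
        for c in range(first_col, last_col + 1)
    ]
```

For instance, a 1000x1000 region starting at pixel (500, 500) straddles four blocks, so a back-end would need to fetch all four before clipping.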
DataCutter
A suite of middleware for subsetting and filtering multi-dimensional datasets stored in a distributed environment

Indexing Service
  Multilevel hierarchical indexes based on spatial indexing methods – e.g., R-trees
Filtering Service
  Distributed C++ component framework
  Specialized components for processing data
  filters – logical unit of computation, high level tasks
    init, process, finalize interface
  streams – how filters communicate
    unidirectional buffer pipes
    uses fixed size buffers (min, good)
  manually specify filter connectivity and filter-level characteristics
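
The filter-stream idea can be illustrated with a toy model. The sketch below is not the DataCutter API, which is a C++ framework: the `Stream`, `Filter`, `Source`, `Double`, and `run_pipeline` names are hypothetical, and the filters run one after another here rather than concurrently on distributed hosts.

```python
from collections import deque


class Stream:
    """Unidirectional pipe carrying buffers from one filter to the next."""

    def __init__(self, buffer_size):
        self.buffer_size = buffer_size
        self._buffers = deque()

    def write(self, data):
        # Split outgoing data into buffers of at most buffer_size items.
        for start in range(0, len(data), self.buffer_size):
            self._buffers.append(data[start:start + self.buffer_size])

    def read(self):
        # Return the next buffer, or None when nothing is left.
        return self._buffers.popleft() if self._buffers else None


class Filter:
    """Base class with the init / process / finalize life cycle."""

    def init(self):
        pass

    def process(self, inp, out):
        raise NotImplementedError

    def finalize(self):
        pass


class Source(Filter):
    """Produces data; ignores its (absent) input stream."""

    def __init__(self, values):
        self.values = values

    def process(self, inp, out):
        out.write(self.values)


class Double(Filter):
    """Reads buffers one at a time and writes each value doubled."""

    def process(self, inp, out):
        while (buf := inp.read()) is not None:
            out.write([2 * v for v in buf])


def run_pipeline(filters, buffer_size=4):
    """Run filters in order, connecting neighbours with streams."""
    inp = None
    for f in filters:
        out = Stream(buffer_size)
        f.init()
        f.process(inp, out)
        f.finalize()
        inp = out
    result = []
    while (buf := inp.read()) is not None:
        result.extend(buf)
    return result
```

Calling `run_pipeline([Source(list(range(6))), Double()])` pushes six values through two filters in buffers of four. Connectivity is given explicitly by the order of the list, mirroring the manual specification mentioned above.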
Virtual Microscope with DataCutter
DC-5F: read_data -> decompress -> clip -> zoom -> view
DC-3F: read_data -> decompress -> clip-zoom-view
DC-2F: read_data -> decompress-clip-zoom-view
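
The three configurations differ only in how the same five steps are grouped into filters. A minimal sketch of that grouping follows, using toy stand-ins for each step; every function body here is hypothetical and only the stage names come from the slide. Fusing steps changes where the work runs, not the result.

```python
from functools import reduce


# Toy stand-ins for the stages named on the slide (bodies are hypothetical).
def read_data(_):
    return [[r * 8 + c for c in range(8)] for r in range(8)]  # 8x8 "image"


def decompress(image):
    return image  # placeholder: no real codec here


def clip(image):
    return [row[2:6] for row in image[2:6]]  # keep a fixed 4x4 region


def zoom(image):
    return [row[::2] for row in image[::2]]  # subsample by 2 in each dimension


def view(image):
    return image  # placeholder for final assembly


def fuse(*stages):
    """Combine several stages into one filter that runs them back to back."""
    return lambda data: reduce(lambda acc, stage: stage(acc), stages, data)


def run(filters):
    """Pass data through the filters in order, starting from no input."""
    return reduce(lambda acc, f: f(acc), filters, None)


DC_5F = [read_data, decompress, clip, zoom, view]
DC_3F = [read_data, decompress, fuse(clip, zoom, view)]
DC_2F = [read_data, fuse(decompress, clip, zoom, view)]
```

All three lists produce the same output from `run`; the trade-off explored in the experiments is that fewer, larger filters cut stream traffic while more, smaller filters allow finer placement across hosts.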
Caching in the Client

Reduce data re-transmission
  Cache part of the retrieved data in the client
Cache multiple resolutions/magnifications
  Cache only what the user views
Two-level cache
  client memory is the first level cache
  local disk on the client machine is the second level
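
A two-level cache along these lines can be sketched as an in-memory LRU map that spills evicted blocks to a local directory. This is an illustration under stated assumptions, not the VM client's actual code (the client is written in Java): the `TwoLevelCache` class, the key layout, and the LRU eviction policy are all choices made here.

```python
import os
import pickle
import tempfile
from collections import OrderedDict


class TwoLevelCache:
    """Memory LRU cache that spills evicted entries to a disk directory."""

    def __init__(self, memory_capacity, disk_dir=None):
        self.memory_capacity = memory_capacity
        self.memory = OrderedDict()  # first level: client memory
        # Second level: a directory on the client's local disk.
        self.disk_dir = disk_dir or tempfile.mkdtemp(prefix="vmcache_")

    def _path(self, key):
        # Keys are tuples such as (slide, magnification, row, col).
        return os.path.join(self.disk_dir, "_".join(map(str, key)) + ".blk")

    def put(self, key, block):
        self.memory[key] = block
        self.memory.move_to_end(key)
        # Evict least recently used blocks to disk when memory is full.
        while len(self.memory) > self.memory_capacity:
            old_key, old_block = self.memory.popitem(last=False)
            with open(self._path(old_key), "wb") as f:
                pickle.dump(old_block, f)

    def get(self, key):
        """Return (block, level); level is 'memory', 'disk' or 'miss'."""
        if key in self.memory:
            self.memory.move_to_end(key)
            return self.memory[key], "memory"
        path = self._path(key)
        if os.path.exists(path):
            with open(path, "rb") as f:
                block = pickle.load(f)
            self.put(key, block)  # promote back into memory
            return block, "disk"
        return None, "miss"
```

Including the magnification in the key lets blocks for several resolutions of the same slide coexist, and only blocks that were actually requested ever enter the cache. A "miss" is the only case that requires contacting the server.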
Caching Multiresolution Images
VM Server Performance
ADR VM Server Performance
VM ADR Server under workload
[Figure: two plots, "Average Response Time for 512x512 Output" and "Average Response Time for 1024x1024 Output"; x-axis: Number of Processors (1, 2, 4, 8); y-axis: Response Time (seconds); series: ADR, ADR-1bg, ADR-4bg, ADR-16bg]
VM Servers: ADR vs DC
[Figure: two plots, "Average Response Time for 512x512 Output" and "Average Response Time for 1024x1024 Output"; x-axis: Number of Processors (1, 2, 4, 8); y-axis: Response Time (seconds); series: ADR, DC-5F, DC-2F]
VM Servers: ADR vs DC
[Figure: four plots, each titled "Average Response Time for 512x512 Output"; x-axis: Number of Processors; y-axis: Response Time (seconds); series per plot: (ADR, DC-5F, DC-2F), (ADR-1bg, DC-5F-1bg, DC-2F-1bg), (ADR-4bg, DC-5F-4bg, DC-2F-4bg), (ADR-16bg, DC-5F-16bg, DC-2F-16bg)]
VM: ADR vs DC on SMP
[Figure: two plots, "Average Response Time for 512x512 Output" and "Average Response Time for 1024x1024 Output"; x-axis: Number of Clients (1, 2, 4, 8); y-axis: Response Time (seconds); series: ADR, 8x(R-DCZV), 4x(2xR-2xDCZV), 2x(4xR-4xDCZV), 4x(2xR-4xD-2xCZV)]
Caching Client Performance
Summary

2 VM servers:
  Homogeneous systems, tightly coupled parallel machines with attached local disks
  Heterogeneous systems, grid
Java 1.2 Client
Multiresolution image caching
Try http://vmscope.jhmi.edu
End of Talk