High Performance Computing
for Tissue MicroArray Analysis
Dr Yinhai Wang
David McCleary, Ching-Wei Wang, Jackie James,
Dean Fennell, Peter Hamilton
Introduction
Tissue Microarrays
A key high-throughput, single-assay platform for
tissue biomarker research and discovery.
*Dolled-Filhart and Rimm, Principles
and Practice of Oncology, 7th Edition,
Chapter 7, 2004.
The Bottleneck
Relies on visual scoring of tissue biomarkers by
pathologists, which is time-consuming, subjective and
prone to error.
232 Tissue Cores
Image Analysis of TMA Virtual Slides
A TMA virtual slide is an ultra-large digital image,
scanned at a high magnification (40X).
 Computer assisted analysis using TMA virtual slides.
 Objective and reproducible.
 Speed?

103,790×58,586 pixels
17GB
Objective
Automate TMA analysis
 Genuine high throughput platform
 Reduce pathologists' workload
 Speed up biomarker discovery

Materials and Methods
High Performance Computing (HPC) Platform
Hewlett-Packard Blade Server.
 Intel Xeon quad-core x86_64
processors.
 >9,000 processor-cores available.
 10-16GB memory per node (8
cores).
 Gigabit Ethernet connection.
 Fibre connection to hard disks
(SAN).

High Performance Computing
[Diagram: system overview. Image generation: glass slide to high resolution image. HPC Platform: Image File Access Module, Parallel Processing Module, Analytical Module and Digital Slide Serving Module. Also shown: TMA Database, instructions, results, and visualisation (viewing).]
HPC: Image File Access Module
[Flowchart, one path per vendor (Aperio, Hamamatsu and Carl Zeiss virtual slides): region extraction, then "Compressed?". If yes, the region is decoded (JPEG 2000 or JPEG decoder for Aperio, depending on format; JPEG decoder for Hamamatsu and Carl Zeiss). If no, the raw image is used directly. The resulting uncompressed data, whose per-pixel channel order differs between vendors (for example R, G, B, a or B, G, R), passes through colour conversion to give vendor format independent uncompressed data.]
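The colour conversion step can be illustrated with a small sketch. It assumes interleaved 8-bit pixel buffers and three hypothetical layout names (`RGB`, `BGR`, `RGBA`) based on the channel orders visible in the flowchart; the actual vendor layouts and the platform's own implementation are not given on the slide.

```python
# Hypothetical layout table: for each source layout, the byte offsets of
# the R, G and B channels within one pixel. Any alpha byte is dropped.
LAYOUTS = {
    "RGB": (0, 1, 2),
    "BGR": (2, 1, 0),
    "RGBA": (0, 1, 2),
}


def to_rgb(raw: bytes, layout: str) -> bytes:
    """Convert an interleaved pixel buffer to interleaved 8-bit RGB."""
    stride = len(layout)  # bytes per pixel in the source buffer
    if len(raw) % stride != 0:
        raise ValueError("buffer length is not a whole number of pixels")
    r, g, b = LAYOUTS[layout]
    out = bytearray()
    for i in range(0, len(raw), stride):
        out += bytes((raw[i + r], raw[i + g], raw[i + b]))
    return bytes(out)
```

Once every vendor path ends in the same layout, the analysis code downstream does not need to know which scanner produced the slide.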
HPC: Parallel Processing Module
Centralised Dynamic Load Balancing
[Diagram: a Master, Workers 1 to 4, the Database and Storage, with TMA cores A1 to D1 and A2 to D2.]
1. Master requests core coordinates from the Database.
2. Database returns the TMA core location (x, y) on the TMA virtual slide.
3. Master assigns cores to available workers.
4. Worker locates the image in Storage and the core sub-image.
5. Worker retrieves and loads the core sub-image.
6. Worker analyses the core and returns results.
7. Worker informs the Master it is now available.
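The centralised dynamic load balancing scheme can be sketched as a shared work queue: each worker takes the next core only when it is free, so slow cores do not hold up the rest. This is a minimal illustration that uses threads in place of cluster nodes. The slides do not say which messaging layer the platform uses, and `analyse_core` is a hypothetical placeholder for loading and analysing a core sub-image.

```python
import queue
import threading


def analyse_core(core_id, xy):
    """Placeholder analysis: stands in for loading and analysing one core."""
    x, y = xy
    return core_id, x + y


def run_master(cores, n_workers=4):
    """Hand out one core at a time to whichever worker is free."""
    tasks = queue.Queue()
    results = {}
    lock = threading.Lock()

    # Master side: queue every core's (x, y) location.
    for item in cores.items():
        tasks.put(item)

    def worker():
        while True:
            try:
                core_id, xy = tasks.get_nowait()  # worker is free: take next core
            except queue.Empty:
                return  # no cores left, worker stops
            key, value = analyse_core(core_id, xy)
            with lock:
                results[key] = value  # return the result to the master

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

For example, `run_master({"A1": (0, 0), "B1": (10, 5)})` returns one result per core regardless of which worker processed it.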
HPC: Analytic Module
Texture feature calculation
◦ Tumour pattern recognition
◦ Tumour region identification
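The slides do not name the texture features used. As one common example, the sketch below computes a grey-level co-occurrence matrix (GLCM) and its contrast for a small greyscale image given as a list of rows. The choice of GLCM contrast is an assumption made for illustration, not the authors' stated method.

```python
from collections import Counter


def glcm(image, dx=1, dy=0):
    """Normalised co-occurrence counts of grey-level pairs at offset (dx, dy)."""
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for y in range(rows):
        for x in range(cols):
            nx, ny = x + dx, y + dy
            if 0 <= nx < cols and 0 <= ny < rows:
                counts[(image[y][x], image[ny][nx])] += 1
    total = sum(counts.values())
    if total == 0:
        return {}  # image too small for this offset
    return {pair: n / total for pair, n in counts.items()}


def contrast(p):
    """GLCM contrast: large when neighbouring grey levels differ strongly."""
    return sum(prob * (i - j) ** 2 for (i, j), prob in p.items())
```

A uniform region gives a contrast of zero, while a region with alternating grey levels gives a higher value. Features of this kind are calculated independently for each core, which is what makes the work easy to distribute.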
HPC: Analytic Module
Automated quantification of biomarker IHC density on
TMA core images, using colour decomposition.
HPC: Digital Slide Serving Module
www.pathxl.com
Results
Texture Pattern Calculation for TMA Slides
•106,290×65,017 pixels
•19.3GB
•229 Tissue Cores
Speedup = (run time of fastest sequential code) / (run time of parallel code) = 42.58
[Chart: time for loading and saving vs. time for the actual texture feature calculation.]
Biomarker Quantification
Processing time: 30 minutes 77 seconds
Speedup=22.19
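The speedup figures follow the usual definition: sequential run time divided by parallel run time. The sketch below writes that out and also derives parallel efficiency. The timings in `example` are illustrative, not measurements from the slides. The efficiency values use the reported speedups together with the 100-core upper bound quoted for one slide, so they are lower bounds.

```python
def speedup(t_sequential: float, t_parallel: float) -> float:
    """Speedup = fastest sequential run time / parallel run time."""
    if t_parallel <= 0:
        raise ValueError("parallel time must be positive")
    return t_sequential / t_parallel


def efficiency(s: float, n_cores: int) -> float:
    """Fraction of ideal linear speedup achieved on n_cores."""
    return s / n_cores


# Illustrative timings only (not taken from the slides).
example = speedup(t_sequential=600.0, t_parallel=20.0)

# Reported speedups divided by the 100-core upper bound for one slide.
texture_eff = efficiency(42.58, 100)
biomarker_eff = efficiency(22.19, 100)
```

An efficiency well below 1 is consistent with the loading and saving overhead, which does not shrink as more cores are added.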
Multi TMA Slides
There are >9000 processor-cores
available
 The processing of 1 TMA virtual slide
uses <100 processor-cores.
 >90 TMA virtual slides can be processed
simultaneously (≈1 minute).
 Genuine high throughput platform for
multiplex multi-TMA studies.
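The capacity estimate above is simple integer arithmetic, written out here as a sketch. The figures 9000 and 100 are the bounds quoted on the slide; the function names and the 200-slide study size are hypothetical.

```python
import math


def slides_in_parallel(total_cores: int, cores_per_slide: int) -> int:
    """How many slides fit on the cluster at once."""
    return total_cores // cores_per_slide


def batches_needed(n_slides: int, total_cores: int, cores_per_slide: int) -> int:
    """Number of sequential rounds needed to process n_slides."""
    return math.ceil(n_slides / slides_in_parallel(total_cores, cores_per_slide))


capacity = slides_in_parallel(9000, 100)
rounds_for_200 = batches_needed(200, 9000, 100)
```

Because the cluster has more than 9000 cores and each slide uses fewer than 100, the computed capacity of 90 is a lower bound.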

Conclusion
• A novel high performance computing platform
for the rapid analysis of TMA virtual slides.
• The centralised load balancing approach is
proven to be robust.
• It significantly speeds up the analysis of TMAs,
removing the bottleneck.
• Valuable platform for TMA research &
biomarker discovery.
• High performance platform for algorithm
prototyping, development & evaluation.
Acknowledgements
Thanks
Dr Yinhai Wang
[email protected]
0044-(0)-28-9097 5816