Transcript Slide 1
High Performance Computing for Tissue MicroArray Analysis
Dr Yinhai Wang
David McCleary, Ching-Wei Wang, Jackie James, Dean Fennell, Peter Hamilton

Introduction: Tissue Microarrays
• Key technique: a high-throughput, single-assay platform for tissue biomarker research and discovery.
*Dolled-Filhart and Rimm, Principles and Practice of Oncology, 7th Edition, Chapter 7, 2004.

The Bottleneck
• Relies on visual scoring of tissue biomarkers by pathologists.
• It is time-consuming, subjective and prone to error.
• 232 Tissue Cores

Image Analysis of TMA Virtual Slides
• A TMA virtual slide is an ultra-large digital image, scanned at high magnification (40X).
• Computer-assisted analysis using TMA virtual slides.
• Objective and reproducible.
• Speed?
• 103,790×58,586 pixels, 17GB

Objective
• Automate TMA analysis
• Genuine high-throughput platform
• Reduce pathologists' workload
• Speed up biomarker discovery

Materials and Methods: High Performance Computing (HPC) Platform
• Hewlett-Packard Blade Server.
• Intel Xeon quad-core x86_64 processors.
• >9,000 processor-cores available.
• 10-16GB memory per node (8 cores).
• Gigabit Ethernet connection.
• Fibre connection to hard disks (SAN).

High Performance Computing
[Diagram: Image generation (Glass slide → High resolution image); HPC Platform (Image File Access Module, Parallel Processing Module, Analytical Module); TMA Database; Visualisation (Digital Slide Serving Module, Viewing). Flows labelled Instructions and Results.]

HPC: Image File Access Module
• Vendor format independent
[Flowchart, one branch per scanner vendor]
Aperio virtual slide: Region extraction → Raw image → Compressed?
  Yes: JPEG 2000 format? → JPEG 2000 decoder; JPEG → JPEG decoder
  No: uncompressed data
  Pixel layout R G B a → Colour conversion → R G B
Hamamatsu virtual slide: Region extraction → Raw image → Compressed?
  Yes: JPEG decoder
  No: uncompressed data
  Pixel layout B G R → Colour conversion → R G B
Carl Zeiss virtual slide: Region extraction → Raw image → Compressed?
  Yes: JPEG decoder
  No: uncompressed data
  Pixel layout B G R → Colour conversion → R G B

HPC: Parallel Processing Module
Centralised Dynamic Load Balancing
[Diagram: Database, Master, Workers 1-4 and Storage. Step labels (numbered 1-7 on the slide): Request for core coordinates; TMA core location (x, y) at TMA virtual slide; Assign to available workers; Locate Image in Storage and Core Sub-image; Retrieve and Load Core Sub-image; Analyse and return results; Informs Master it is now available. TMA cores labelled A1, B1, C1, D1, A2, B2, C2, D2.]

HPC: Analytic Module
Texture feature calculation
◦ Tumour pattern recognition
◦ Tumour region identification

HPC: Analytic Module
Automated quantification of biomarker IHC density on TMA core images, using colour decomposition.

HPC: Digital Slide Serving Module
www.pathxl.com

Results: Texture Pattern Calculation for TMA Slides
• 106,290×65,017 pixels
• 19.3GB
• 229 Tissue Cores
Speedup = (Fastest Sequential Code) / (Parallel Code) = 42.58
[Chart: Loading, Storing, Texture. Time for Loading vs. Saving; Time for actual Texture Feature Calculation.]

Results: Biomarker Quantification
• Processing time: 30 minutes 77 seconds
• Speedup = 22.19

Multi TMA Slides
• There are >9,000 processor-cores available.
• The processing of 1 TMA virtual slide uses <100 processor-cores.
• >90 TMA virtual slides can be processed simultaneously (≈1 minute).
• Genuine high-throughput platform for multiplex multi-TMA studies.

Conclusion
• A novel high performance computing platform for the rapid analysis of TMA virtual slides.
• The centralised load balancing approach is proven to be robust.
• It significantly speeds up the analysis of TMAs, removing the bottleneck.
• Valuable platform for TMA research & biomarker discovery.
• High performance platform for algorithm prototyping, development & evaluation.

Acknowledgements

Thanks
Dr Yinhai Wang
[email protected]
0044-(0)-28-9097 5816
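
Note on the Image File Access Module slide: the flowchart shows each vendor's raw pixel data (R G B a for Aperio, B G R for Hamamatsu and Carl Zeiss, as far as the diagram can be read) passing through a colour conversion step to a common R G B layout. A minimal sketch of such a reordering step follows. The function name to_rgb and the layout strings are hypothetical and are not taken from the slides or from any vendor SDK; decoding of JPEG or JPEG 2000 data is assumed to have happened already.

```python
def to_rgb(raw: bytes, layout: str) -> bytes:
    """Reorder interleaved pixel bytes (e.g. layout 'RGBA' or 'BGR') into RGB.

    Hypothetical helper for illustration only; any alpha channel is dropped.
    """
    step = len(layout)                      # bytes per pixel in the source layout
    if len(raw) % step:
        raise ValueError("buffer length is not a whole number of pixels")
    # Position of the R, G and B bytes inside one source pixel.
    order = [layout.index(channel) for channel in "RGB"]
    out = bytearray()
    for start in range(0, len(raw), step):
        out.extend(raw[start + k] for k in order)
    return bytes(out)


# One B G R pixel becomes one R G B pixel.
pixel = to_rgb(bytes([30, 20, 10]), "BGR")   # bytes([10, 20, 30])
```

Once every vendor format is reduced to the same layout, the analysis code downstream does not need to know which scanner produced the slide.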
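
Note on the Parallel Processing Module slide: in centralised dynamic load balancing, a master holds the list of tissue cores and each worker takes the next core as soon as it becomes free, so faster workers simply process more cores. The sketch below illustrates the idea on one machine, with Python threads and a shared queue standing in for the cluster's master and worker processes. The slides do not say which messaging mechanism was used, so this is an assumption; analyse_core and the core tuples are hypothetical placeholders for the real image analysis.

```python
import queue
import threading


def analyse_core(core):
    """Hypothetical stand-in for per-core analysis: returns (name, pixel area)."""
    name, x, y, width, height = core
    return name, width * height


def worker(tasks, results):
    # Each worker repeatedly asks the central queue for the next core.
    while True:
        core = tasks.get()
        if core is None:                   # sentinel: the master has no more work
            break
        results.put(analyse_core(core))    # return the result, then become available again


def run_master(cores, n_workers=4):
    cores = list(cores)
    tasks, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(n_workers)]
    for t in threads:
        t.start()
    for core in cores:                     # the master publishes core coordinates
        tasks.put(core)
    for _ in threads:                      # one stop signal per worker
        tasks.put(None)
    for t in threads:
        t.join()
    return dict(results.get() for _ in cores)


# Core names follow the slide's grid labels; the coordinates are made up.
areas = run_master([("A1", 0, 0, 10, 10), ("B1", 10, 0, 20, 10)], n_workers=2)
```

Because work is handed out one core at a time rather than split up in advance, no worker sits idle while another still has a backlog.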
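
Note on the Results and Multi TMA Slides figures: the speedup is defined on the slides as the time of the fastest sequential code divided by the time of the parallel code, and the multi-slide estimate follows from dividing the available cores by the cores used per slide. The sketch below only restates that arithmetic. The function names are hypothetical, and the timings in the example are illustrative values, not measurements from the slides.

```python
def speedup(sequential_seconds: float, parallel_seconds: float) -> float:
    """Speedup = fastest sequential time / parallel time."""
    return sequential_seconds / parallel_seconds


def concurrent_slides(total_cores: int, cores_per_slide: int) -> int:
    """How many slides fit on the cluster at once, using whole slides only."""
    return total_cores // cores_per_slide


# Boundary values from the slide: 9,000 cores at 100 cores per slide gives 90.
# With more than 9,000 cores and fewer than 100 per slide, the count exceeds 90.
slides_at_once = concurrent_slides(9000, 100)

# Illustrative timings only: 100 s sequential vs 4 s parallel is a 25x speedup.
example = speedup(100.0, 4.0)
```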