HPC Systems available to QUB Researchers
Ricky Rankin, [email protected]
Vaughan Purnell, [email protected]
17th June 2008
Agenda
 HPC at Queen’s University
 Application Areas
 Observations
 Future
HPC at Queen’s University
 1990 Parallel Computer Centre
– JISC Initiative
 1996 Centre for Supercomputing in Ireland
– Partnership with TCD
– IBM SP2
 Research Support Computing Group within Information Services
– SRIF funding
Research Support Computing Group
 System support and administration
 Development of applications
 Support of code and applications
 Training
 Providing documentation and examples
 Consultation
High Performance Computing Systems
 Computation
– 2 Unix Clusters
– SGI Altix
– Windows Compute Cluster
• Matlab and other windows applications
 Visualisation Systems
• HP & SGI – currently based in NITC
 Database Server
HPC Unix Clusters
Harvey
– 2-CPU Itanium2 server nodes
– Gigabit interconnect
– HP-UX
XC
– 10 nodes, each an rx8600
– Quadrics interconnect
Both clusters are connected to a 10 Terabyte Storage Area Network (SAN).
Altix 350
 16 x Itanium 2, 1.4GHz, 3MB L3 cache
 32GB main memory
 NUMAlink4 system interconnect
 160GB system disks
 Data Grid connectivity
 Infiniband connectivity
 FPGA board
Windows Compute Cluster
 Head node + 8 worker nodes (total of 32 cores)
 Mix of HP ProLiant
– DL140-G2 Xeon 3.6GHz, 4GB memory
– DL140-G3 Xeon 5160 3GHz ("Woodcrest"), 8GB memory
 Head node connected by fibre to 10TB SAN
Windows Compute Cluster
 HP BladeSystem c-Class c7000 (total of 128 cores)
– Four HP BL460c dual quad-core E5420 blades with 16 GB memory
– Twelve HP BL260c dual quad-core E5420 blades with 10 GB memory
Visualisation Systems
Silicon Graphics Prism
 16 x Intel Itanium 2, 1.4GHz 3MB L3 cache
 32GB main memory
 4 x Graphics Pipes
 Gigabit Ethernet connectivity
HP Visualization Centre
 5 x HP xw8200 Xeon 64 workstations (running Windows XP)
Database Server
 Mix of HP ProLiant
– DL140-G2 Xeon 3.6GHz, 4GB memory
Software
 Anything that has a valid license
– Compute Clusters
• Beast
• R
– Windows Compute Cluster
• Paup
• POWSIM
• Matlab
• ARCGIS ??
– Database Server
• T1D
High Performance Computing System
[Diagram: rx8600 and rx2600 nodes, Quadrics interconnect, switch, SAN, Visualisation System and Windows System]
HPC Training
Internal - start of first semester
 Introduction to UNIX
 ‘C’ Programming for HPC Systems
 Message Passing Interface Programming
 OpenMP Programming
 Matlab
Support
 Help with the system, e.g., account and job submission problems.
 Help with system software, e.g., installation, updates and usage.
 Help with developing applications and porting of codes.
 Advice on starting up new HPC projects.
Application Areas – Evolutionary Biology
Phylogeny is the study of evolutionary relatedness among various groups of organisms.
QUB work focused on the taxonomy and ecology of Antarctic and deep-sea incirrate octopuses.
Application Areas – Evolutionary Biology
PAUP is a tool for inferring and analysing phylogenetic trees. It uses a heuristic search approach involving subtree pruning and regrafting operations.
Running PAUP over many replicates is very processor intensive, tying up a desktop for days.
Application Areas – Evolutionary Biology
HPC solution
 Installed PAUP on the Windows Cluster.
 Used 1000 replicates packaged up in 15 lots of 63 reps and 1 lot of 55 reps, giving 16 input (.nex) files.
 Job creation and submission achieved via the Windows job manager.
 The 16 tasks took approx. 6 days to complete running on 16 cores.
BENEFIT: able to process a larger number of replicates in parallel, freeing up the desktop, with results returned quickly.
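The packaging arithmetic above can be checked with a short sketch. The Python below is illustrative only: `split_replicates` is a hypothetical helper, not part of PAUP or the Windows job manager, and the serial estimate assumes each lot would take roughly the same 6 days on a single desktop core.

```python
def split_replicates(total, lot_size):
    """Split `total` replicates into lots of at most `lot_size` (hypothetical helper)."""
    full_lots, remainder = divmod(total, lot_size)
    lots = [lot_size] * full_lots
    if remainder:
        lots.append(remainder)  # final, smaller lot
    return lots

lots = split_replicates(1000, 63)

days_per_lot = 6  # approximate figure quoted on the slide
# Assumption: on one desktop core the lots would run back to back.
serial_days = len(lots) * days_per_lot
saved_days = serial_days - days_per_lot

print(f"{len(lots)} lots, last lot has {lots[-1]} reps")
print(f"estimated serial time: {serial_days} days, saved: {saved_days} days")
```

Under those assumptions the saving works out to 15 x 6 days, since one 6-day period is still needed for the parallel run.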
Application Areas – Evolutionary Biology
Other work:
 Other researchers now looking to use PAUP.
 Similar application called POWSIM currently being trialled.
 POWSIM is a program for assessing statistical power when testing for genetic differentiation. This is being used to study plant population genetics and evolution.
Evolutionary Biology
user comments
I just couldn't have bootstrapped this dataset without use of the cluster and whilst I
might have been able to publish it (on the grounds that the computing power
just wasn't available) - it wouldn't get in such a good journal - and we also
wouldn't know how good the results were - which is a bit of a limitation in science
when you're trying to determine the truth.
It's a bit of extra work - but hey, the cluster has just saved me 15 x 6 days so I
think I can cope!
What I've actually achieved on this particular run is that I've sorted out a long
standing taxonomic problem in deep-sea octopuses in the genus Graneledone
and established that there really are two valid species in the North Pacific and
they're separated by depth - and also that they probably evolved from the Atlantic
species Graneledone verrucosa.
Not totally irrelevant because this dating of nodes is what we were doing using
BEAST on one of your other clusters.
I no longer use the drop down menus - I write scripts
... anyway Vaughan has made me feel very comfortable so part of the success of
this is definitely staff dependent!
So, all in all, it has been a very positive experience for me - and I have been
singing its praises around the department so I'm sure you will get future users -
well, you definitely will, because I have a PhD student starting in October who will
be using it.
Application Areas - Medicine
High Resolution Cellular Imaging for Cancer Diagnosis
This project faces several computational challenges, including the processing of thousands of high-resolution (30GB) microscopy images.
A key requirement is rapid, high-throughput analysis of tissue microarrays (TMAs).
Application Areas - Medicine
 Framework written in C++ and using MPI.
 Master/slave method of processing image tiles from the hi-res image.
 An analysis algorithm developed in ‘C’ is applied to each ‘tile’.
 Job submission was written in C#.
Tests have shown significant speedup:
Cores   Time
2       162.908
4       65.3422
32      7.12
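As a rough way to read the timings above, the sketch below computes speedup relative to the 2-core run. It is illustrative only: the source does not state the time units, and the worker count assumes that in the master/slave scheme one core acts as master and analyses no tiles.

```python
# Timings copied from the slide; units are not stated in the source.
timings = {2: 162.908, 4: 65.3422, 32: 7.12}

baseline_cores = 2
baseline_time = timings[baseline_cores]

def relative_speedup(cores):
    """Speedup of a run relative to the 2-core baseline."""
    return baseline_time / timings[cores]

for cores in sorted(timings):
    # Assumption: one core is the master, the rest are workers.
    workers = cores - 1
    print(f"{cores:>2} cores ({workers:>2} workers): "
          f"speedup {relative_speedup(cores):.2f}x")
```

If the one-master assumption holds, the 2-core run has a single worker, so these ratios can be compared against the 3 and 31 workers available in the larger runs.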
Application Areas - Geography
Rivers
Aim is to predict flooding and search for a link to self-organised criticality (study of complexity in nature).
Examination of power-law frequency-magnitude distributions that best fit the datasets.
Application Areas - Geography
Processing
 40 rivers being analysed.
 Each river is sampled in 3 places.
 Each sample is evaluated by (Monte Carlo) MATLAB statistical functions plvar and plpva.
 Each plvar function takes about 5.5 hrs to run.
 Each plpva function takes about 7 hrs to run.
Impractical to run on desktop!
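The scale of the problem can be estimated from the figures above. This is a back-of-envelope sketch in Python, assuming every sample needs exactly one plvar run and one plpva run, executed one after another on a single desktop; the slides do not state how many runs were actually required.

```python
rivers = 40
samples_per_river = 3
plvar_hours = 5.5   # approximate run time quoted on the slide
plpva_hours = 7.0   # approximate run time quoted on the slide

samples = rivers * samples_per_river

# Assumption: one plvar and one plpva run per sample, all run serially.
total_hours = samples * (plvar_hours + plpva_hours)
total_days = total_hours / 24

print(f"{samples} samples, about {total_hours:.0f} hours "
      f"({total_days:.1f} days) of desktop time")
```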
Application Areas - Geography
HPC Solution
MATLAB Distributed Computing Toolbox
 Parallelized the plvar and plpva functions so that the iterations are distributed to a number of processors on the Windows Cluster.
 Initial timings show that the plvar function, normally taking 5.5 hrs on a standard desktop, reduced to 21 minutes using a MATLAB pool of 16 processors.
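The quoted plvar timings imply the following speedup and parallel efficiency. The Python below is only a rough sketch: it treats the desktop run as the single-processor baseline, which ignores any difference in speed between the desktop and the cluster processors.

```python
desktop_minutes = 5.5 * 60   # 5.5 hrs on a standard desktop
cluster_minutes = 21         # with a MATLAB pool of 16 processors
processors = 16

speedup = desktop_minutes / cluster_minutes
# Assumption: the desktop processor is comparable to one cluster processor.
efficiency = speedup / processors

print(f"speedup: {speedup:.1f}x, parallel efficiency: {efficiency:.0%}")
```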
Geography
user comments
Before using the Windows Cluster I was faced with a major dilemma.
The runs I needed to complete would each take over 5 hours to do and I was
faced with the possibility of repeating this process countless times.
I discussed the possibility of reducing my investigation or carrying on with
methods that were faster but probably inaccurate, a decision I found very
hard to make.
Only by making use of the Windows cluster have I been able to successfully
complete this integral part of my thesis.
Observations
 Notice no Physicists or Applied Mathematicians => new users are not programmers.
 Many users familiar with the Windows XP desktop.
 Many users are tying up their desktops and restricting the amount of processing.
 The challenge is to identify these users and give them some handholding to get them started.
 Expecting to see more MATLAB Distributed Computing being done.
Future
 Anything that has a valid license
– Compute Clusters
• Beast
• R
– Windows Compute Cluster
• Paup
• POWSIM
• Matlab
• ARCGIS ??
– Database Server
• T1D
Hardware
Future
 Replace Harvey
 Expand XC
 Expand Windows Compute Cluster
Questions