Big Red, the Data Capacitor, and the future (clouds) Craig A. Stewart [email protected] 2 March 2008


Big Red, the Data Capacitor, and
the future (clouds)
Craig A. Stewart
[email protected]
2 March 2008
License terms
• Please cite as: Stewart, C.A. 2009. Big Red, the Data Capacitor, and the
future (clouds). Presentation. Presented 2 Mar 2009, University of
Houston, Houston, TX. Available from: http://hdl.handle.net/2022/13940
• Except where otherwise noted, by inclusion of a source URL or some other
note, the contents of this presentation are © by the Trustees of Indiana
University. This content is released under the Creative Commons
Attribution 3.0 Unported license
(http://creativecommons.org/licenses/by/3.0/). This license includes the
following terms: You are free to share – to copy, distribute and transmit the
work and to remix – to adapt the work under the following conditions:
attribution – you must attribute the work in the manner specified by the
author or licensor (but not in any way that suggests that they endorse you
or your use of the work). For any reuse or distribution, you must make
clear to others the license terms of this work.
November 6, 2015
Big Red - Basics and history
• IBM e1350 BladeCenter Cluster, SLES 9,
MPICH, LoadLeveler, Moab
• Spring 2006: 17 days assembly at IBM facility,
disassembled, reassembled in 10 days at IU.
• 20.48 TFLOPS peak theoretical, 15.04 achieved
on Linpack; 23rd on June 2006 Top500 List
(IU’s highest listing to date).
• In production for local users on 22 August 2006,
for TeraGrid users 1 October 2006
• Upgraded to 30.72 TFLOPS Spring 2008; ???
on June 2007 Top500 List
• Named after nickname for IU sports teams
Motivations and goals
• Initial goals for 20.48 TFLOPS system:
Local demand for cycles exceeded supply
TeraGrid Resource Partner commitments to meet
Support life science research
Support applications at 100s to 1000s of
processors
• 2nd phase upgrade to 30.72 TFLOPS
Support economic development in State of
Indiana
Why a PowerPC-based blade
cluster?
• Processing power per node
• Density, good power efficiency relative to
available processors
Processor                     TFLOPS/MWatt   MWatts/PetaFLOPS
Intel Xeon 7041               145            6.88
AMD                           219            4.57
PowerPC 970 MP (dual core)    200            5.00
• Possibility of performance gains through use of
Altivec unit & VMX instructions
• Blade architecture provides flexibility for future
• Results of Request for Proposals process
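The two columns of the power-efficiency table above are reciprocals of one another: a processor that delivers N TFLOPS per megawatt needs 1000/N megawatts per PetaFLOPS. A minimal Python sketch of that conversion follows; the function name `mw_per_pflops` is hypothetical and only the TFLOPS/MWatt inputs come from the slide. Recomputing from the rounded 145 for the Xeon gives a value slightly different from the slide's 6.88, which suggests the slide was computed from an unrounded figure.

```python
def mw_per_pflops(tflops_per_mw: float) -> float:
    """Convert TFLOPS per megawatt into megawatts needed per PetaFLOPS."""
    # 1 PetaFLOPS = 1,000 TFLOPS, so the two metrics are reciprocals
    # scaled by 1,000.
    return 1000.0 / tflops_per_mw


# TFLOPS/MWatt figures as listed on the slide.
slide_values = {
    "Intel Xeon 7041": 145,
    "AMD": 219,
    "PowerPC 970 MP (dual core)": 200,
}

for name, value in slide_values.items():
    print(f"{name}: {mw_per_pflops(value):.2f} MWatts/PetaFLOPS")
```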
Feature                             20.4 TFLOPS                    30.7 TFLOPS
Computational hardware, RAM
  JS21 components                   Two 2.5 GHz PowerPC 970MP      Same
                                    processors, 8 GB RAM,
                                    73 GB SAS drive, 40 GFLOPS
  No. of JS21 blades                512                            768
  No. of processors; cores          1,024 processors;              1,536 processors;
                                    2,048 processor cores          3,072 processor cores
  Total system memory               4 TB                           6 TB
Disk storage
  GPFS scratch space                266 TB                         Same
  Lustre                            535 TB                         Same
  Home directory space              25 TB                          Same
Networks
  Total outbound network bandwidth  40 Gbit/sec                    Same
  Bisection bandwidth               64 GB/sec - Myrinet 2000       96 GB/sec - Myrinet 2000
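The system totals in the configuration table follow directly from the per-blade figures. The Python sketch below recomputes peak TFLOPS and total memory from the blade counts; the function names are hypothetical. The per-blade 40 GFLOPS and 8 GB come from the slide, while the breakdown into 4 floating-point operations per cycle per core is an assumption about the PowerPC 970MP that the slide does not state.

```python
GFLOPS_PER_BLADE = 40   # from the slide: one JS21 blade
RAM_GB_PER_BLADE = 8    # from the slide


def peak_tflops(blades: int) -> float:
    """Theoretical peak in TFLOPS for a given number of JS21 blades."""
    return blades * GFLOPS_PER_BLADE / 1000.0


def total_ram_tb(blades: int) -> float:
    """Total system memory in TB (1 TB taken as 1,024 GB)."""
    return blades * RAM_GB_PER_BLADE / 1024.0


# Assumption (not on the slide): each core completes 4 floating-point
# operations per clock cycle.
FLOPS_PER_CYCLE = 4
CORES_PER_BLADE = 4     # two dual-core processors per blade
CLOCK_GHZ = 2.5
blade_gflops = CLOCK_GHZ * FLOPS_PER_CYCLE * CORES_PER_BLADE

for blades in (512, 768):
    print(blades, "blades:", peak_tflops(blades), "TFLOPS,",
          total_ram_tb(blades), "TB RAM")
```

Under that assumption the per-blade figure works out to the slide's 40 GFLOPS, and the 512- and 768-blade totals match the 20.48 and 30.72 TFLOPS quoted elsewhere in the talk.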
IBM e1350 vs Cray XT3 (data from http://icl.cs.utk.edu/hpcc/hpcc_results.cgi)
[Figure panels: Per processor; Per process (core)]
IBM e1350 vs HP XC4000 (data from http://icl.cs.utk.edu/hpcc/hpcc_results.cgi)
Linpack performance
Benchmark set   Nodes   Peak Theoretical TFLOPS   Achieved TFLOPS   % of peak
HPCC            510     20.40                     13.53             66.3
Top500          512     20.48                     15.04             73.4
Top500          768     30.72                     21.79             70.9
Difference: 4 KB vs 16 MB page size
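The percentage column in the Linpack table is the achieved TFLOPS divided by the theoretical peak. A short Python sketch that reproduces it from the slide's figures follows; the names `runs` and `efficiency_pct` are hypothetical.

```python
def efficiency_pct(peak_tflops: float, achieved_tflops: float) -> float:
    """Achieved Linpack performance as a percentage of theoretical peak."""
    return 100.0 * achieved_tflops / peak_tflops


# (benchmark set, nodes, peak TFLOPS, achieved TFLOPS) as listed on the slide.
runs = [
    ("HPCC", 510, 20.40, 13.53),
    ("Top500", 512, 20.48, 15.04),
    ("Top500", 768, 30.72, 21.79),
]

for name, nodes, peak, achieved in runs:
    print(f"{name:7s} {nodes:4d} nodes: {efficiency_pct(peak, achieved):.1f}% of peak")
```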
Elapsed time per simulation timestep among best in TeraGrid
Image courtesy of Emad Tajkhorshid
• Simulation of TonB-dependent
transporter (TBDT)
• Used systems at NCSA, IU,
PSC
• Modeled mechanisms for
allowing transport of molecules
through cell membrane
• Work by Emad Tajkhorshid and
James Gumbart, of University
of Illinois at Urbana-Champaign.
Mechanics of Force
Propagation in TonB-Dependent
Outer Membrane Transport.
Biophysical Journal
93:496-504 (2007)
• To view the results of the
simulation, please go to:
http://www.life.uiuc.edu/emad/
TonB-BtuB/btub-2.5Ans.mpg
WxChallenge
(www.wxchallenge.com)
• Over 1,000 undergraduate students, 64
teams, 56 institutions
• Usage on Big Red:
~16,000 CPU hours on Big Red
63% of processing done on Big Red
Most of the students who used Big Red
couldn’t tell you what it is
• Integration of computation and data flows
via Lustre (Data Capacitor)
Overall user reactions
• NAMD, WRF users very pleased
• Porting from Intel instruction set a perceived
challenge in a cycle-rich environment
• MILC optimization with VMX not successful
• Keys to biggest successes:
Performance characteristics of JS21 nodes
Linkage of computation and storage (Lustre Data Capacitor)
Support for grid computing via TeraGrid
Cloud computing
• Cloud computing does not exist
• Infrastructure as a Service
• Platform as a Service
• Pitfalls of ‘Gartner hype curve’ and
confusing IaaS and PaaS
[Figure: Annual Energy Use (MWHr) by Top 500 Report, Nov-04 through Nov-08, for the Top Machine, 250th Machine, and 500th Machine]
Question: What do you plan/want
to do with cloud computing?
• Support collaborative activities
• Application hosting (institutional apps)
• Content distribution
• Use cloud computing as a way to deliver
reliable 7 x 24 services when the
institution’s IT organization does not run a
7 x 24 operation
• Internal collaboration within the
college/university
Is it safe (results of BOF at
EDUCAUSE meeting)?
Consensus – big changes 1-3
years
• Not a clear choice now, might be compelling later
• Real worries about getting data and capability back once it
has been outsourced
• Uses will be broad
• Cloud computing will cause major realignments in funding
• Cloud computing will push more computing to the
individual
• Legal compliance issues may be solved as more
universities and colleges push on cloud vendors
• Utility model – is it here? Maybe
No one makes their own lab
glassware anymore (usually)
• Look at history of local telcos, phone
switches on campus
• Look at history of processors
• Cloud computing, long haul networks,
and data management issues are deeply
intertwined
• Our challenge *may be* to figure out how
to make best use of IaaS in the future
Acknowledgements - Funding Sources
• IU’s involvement as a TeraGrid Resource Partner is supported in part by the National Science
Foundation under Grants No. ACI-0338618, OCI-0451237, OCI-0535258, and OCI-0504075.
• The IU Data Capacitor is supported in part by the National Science Foundation under Grant
No. CNS-0521433.
• This research was supported in part by the Indiana METACyt Initiative. The Indiana METACyt
Initiative of Indiana University is supported in part by Lilly Endowment, Inc.
• This work was supported in part by Shared University Research grants from IBM, Inc. to
Indiana University.
• The LEAD portal is developed under the leadership of IU Professors Dr. Dennis Gannon and
Dr. Beth Plale, and supported by NSF grant 331480.
• The ChemBioGrid Portal is developed under the leadership of IU Professor Dr. Geoffrey C.
Fox and Dr. Marlon Pierce and funded via the Pervasive Technology Labs (supported by the
Lilly Endowment, Inc.) and the National Institutes of Health grant P20 HG003894-01.
• Many of the ideas presented in this talk were developed under a Fulbright Senior Scholar’s
award to Stewart, funded by the US Department of State and the Technische Universitaet
Dresden.
• Any opinions, findings and conclusions or recommendations expressed in this material are
those of the author(s) and do not necessarily reflect the views of the National Science
Foundation (NSF), National Institutes of Health (NIH), Lilly Endowment, Inc., or any other
funding agency.
Acknowledgements - People
• Maria Morris contributed to the graphics used in this talk
• Marcus Christie and Suresh Marru of the Extreme! Computing Lab contributed
the LEAD graphics
• John Morris (www.editide.us) and Cairril Mills (Cairril.com Design & Marketing)
contributed graphics
• This work would not have been possible without the dedicated and expert efforts
of the staff of the Research Technologies Division of University Information
Technology Services, the faculty and staff of the Pervasive Technology Labs, and
the staff of UITS generally.
• Thanks to the faculty and staff with whom we collaborate locally at IU and globally
(via the TeraGrid, and especially at Technische Universitaet Dresden)
Co-author affiliations
Craig A. Stewart; [email protected]; Office of the Vice President and CIO, Indiana University, 601 E. Kirkwood, Bloomington, IN
Matthew Link; [email protected]; University Information Technology Services (UITS), Indiana University, 2711 E. 10th St.,
Bloomington, IN 47408
D. Scott McCaulay; [email protected]; UITS, Indiana University, 2711 E. 10th St., Bloomington, IN 47408
Greg Rodgers; [email protected]; IBM Corporation, 2455 South Road, Poughkeepsie, New York 12601
George Turner; [email protected]; UITS, Indiana University, 2711 E. 10th St., Bloomington, IN 47408
David Hancock; dyhancoc@iupui.edu; UITS, Indiana University — Purdue University Indianapolis, 535 W. Michigan Street,
Indianapolis, IN 46202
Richard Repasky; [email protected]; UITS, Indiana University, 2711 E. 10th St., Bloomington, IN 47408
Peng Wang; [email protected]; UITS, Indiana University — Purdue University Indianapolis, 535 W. Michigan Street,
Indianapolis, IN 46202
Faisal Saied; [email protected]; Rosen Center for Advanced Computing, Purdue University, 302 W. Wood Street, West Lafayette,
Indiana 47907
Marlon Pierce; Community Grids Lab, Pervasive Technology Labs at Indiana University, 501 N. Morton Street, Bloomington, IN
47404
Ross Aiken; [email protected]; IBM Corporation, 9229 Delegates Row, Precedent Office Park Bldg 81, Indianapolis, IN 46240
Matthias Mueller; [email protected]; Center for Information Services and High Performance Computing (ZIH),
Dresden University of Technology, D-01062 Dresden, Germany
Matthias Jurenz; [email protected]; Center for Information Services and High Performance Computing (ZIH),
Dresden University of Technology, D-01062 Dresden, Germany
Matthias Lieber; [email protected]; Center for Information Services and High Performance Computing (ZIH),
Dresden University of Technology, D-01062 Dresden, Germany
Thank you
• Any questions?