Implementation and experience with Big Red - a 20.4 TFLOPS IBM BladeCenter cluster
Craig A. Stewart
[email protected]
26 June 2007

Outline
• Brief history of implementation
• System architecture
• Performance analysis
• User experience and science results
• Lessons learned to date

IU & TeraGrid
Image from www.teragrid.org
• IU: 2 core campuses, 6 regional campuses
• President-elect: Michael A. McRobbie
• Advanced computing: University Information Technology Services, Pervasive Technology Labs, School of Informatics
• Motivation for being part of TeraGrid:
  - Support national research agendas
  - Improve ability of IU researchers to use national cyberinfrastructure
  - Testbed for IU computer science research

Big Red - Basics and history
• IBM e1350 BladeCenter Cluster, SLES 9, MPICH, LoadLeveler, MOAB
• Spring 2006: 17 days assembly at IBM facility, disassembled, reassembled in 10 days at IU
• 20.48 TFLOPS peak theoretical, 15.04 achieved on Linpack; 23rd on June 2006 Top500 List (IU's highest listing to date)
• In production for local users on 22 August 2006, for TeraGrid users 1 October 2006
• Upgraded to 30.72 TFLOPS Spring 2007; ??? on June 2007 Top500 List
• Named after nickname for IU sports teams

Motivations and goals
• Initial goals for the 20.48 TFLOPS system:
  - Local demand for cycles exceeded supply
  - TeraGrid Resource Partner commitments to meet
  - Support life science research
  - Support applications at 100s to 1000s of processors
• 2nd phase upgrade to 30.72 TFLOPS:
  - Support economic development in the State of Indiana

Why a PowerPC-based blade cluster?
• Processing power per node
• Density, good power efficiency relative to available processors:

  Processor                    TFLOPS/MWatt   MWatts/PetaFLOPS
  Intel Xeon 7041              145            6.88
  AMD                          219            4.57
  PowerPC 970MP (dual core)    200            5.00

• Possibility of performance gains through use of the AltiVec unit and VMX instructions (see the sketch below)
• Blade architecture provides flexibility for the future
• Results of Request for Proposals process
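To make the VMX point above concrete, here is a minimal sketch of a SAXPY-style loop written with AltiVec/VMX intrinsics of the kind GCC exposes on the JS21's PowerPC 970MP. This example is not drawn from any application code discussed in this talk; the file name, array sizes, alignment handling, and compile flags are illustrative assumptions.

/* saxpy_vmx.c - illustrative sketch only; assumes GCC with AltiVec support
 * on a PowerPC 970MP node, e.g. (hypothetical): gcc -O2 -maltivec saxpy_vmx.c
 */
#include <altivec.h>
#include <stdio.h>

#define N 1024  /* assumed: N is a multiple of 4 */

/* y[i] = a * x[i] + y[i], processing four floats per iteration via VMX */
static void saxpy_vmx(float a, const float *x, float *y, int n)
{
    float a4[4] __attribute__((aligned(16))) = { a, a, a, a };
    vector float va = vec_ld(0, a4);              /* splat a into all 4 lanes */
    int i;
    for (i = 0; i < n; i += 4) {
        vector float vx = vec_ld(0, &x[i]);       /* load 4 floats from x */
        vector float vy = vec_ld(0, &y[i]);       /* load 4 floats from y */
        vy = vec_madd(va, vx, vy);                /* fused multiply-add   */
        vec_st(vy, 0, &y[i]);                     /* store 4 results      */
    }
}

int main(void)
{
    /* 16-byte alignment keeps vec_ld/vec_st on natural VMX boundaries */
    static float x[N] __attribute__((aligned(16)));
    static float y[N] __attribute__((aligned(16)));
    int i;
    for (i = 0; i < N; i++) { x[i] = 1.0f; y[i] = 2.0f; }
    saxpy_vmx(3.0f, x, y, N);
    printf("y[0] = %.1f (expect 5.0)\n", y[0]);
    return 0;
}

Each vec_madd retires four single-precision multiply-adds per instruction, which is where the hoped-for gains over scalar code would come from; as the user-reaction slide later notes, such gains are not automatic (the MILC/VMX effort, for example, has not yet succeeded in users' eyes).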
System configuration: 20.4 TFLOPS system vs. 30.7 TFLOPS upgrade
Computational hardware, RAM:
• JS21 components: two 2.5 GHz PowerPC 970MP processors, 8 GB RAM, 73 GB SAS drive, 40 GFLOPS per blade (same in both)
• No. of JS21 blades: 512 vs. 768
• No. of processors; cores: 1,024 processors (2,048 processor cores) vs. 1,536 processors (3,072 processor cores)
• Total system memory: 4 TB vs. 6 TB
Disk storage:
• GPFS scratch space: 266 TB (same in both)
• Lustre: 535 TB (same in both)
• Home directory space: 25 TB (same in both)
Networks:
• Total outbound network bandwidth: 40 Gbit/sec (same in both)
• Bisection bandwidth: 64 GB/sec (Myrinet 2000) vs. 96 GB/sec (Myrinet 2000)

HPCC and Linpack Results (510 nodes)
Data posted to http://icl.cs.utk.edu/hpcc/hpcc_results.cgi

                 G-HPL       G-PTRANS   G-RandomAccess   G-FFTE      EP-STREAM Sys
                 (TFlop/s)   (GB/s)     (Gup/s)          (GFlop/s)   (GB/s)
  Total          13.53       40.76      0.2497           67.33       2468
  Per processor  0.013264    0.0399     0.000244         0.066

                 EP-STREAM Triad   EP-DGEMM    Random Ring Bandwidth   Random Ring Latency
                 (GB/s)            (GFlop/s)   (GB/s)                  (usec)
                 2.42              8.27        0.0212                  17.73

IBM e1350 vs Cray XT3
[Charts comparing per-processor and per-process (core) HPCC results; data from http://icl.cs.utk.edu/hpcc/hpcc_results.cgi]

IBM e1350 vs HP XC4000
[Charts comparing HPCC results; data from http://icl.cs.utk.edu/hpcc/hpcc_results.cgi]

Linpack performance

  Benchmark set   Nodes   Peak theoretical TFLOPS   Achieved TFLOPS   %
  HPCC            510     20.40                     13.53             66.3
  Top500          512     20.48                     15.04             73.4
  Top500          768     30.72                     21.79             70.9

Difference between the HPCC and Top500 runs: 4 KB vs. 16 MB page size.
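As a quick cross-check on the tables above, the peak theoretical figures and the per-processor values follow from simple arithmetic on the processor counts. The short program below is not part of the original presentation; it is a worked-example sketch that assumes the 970MP's four double-precision floating-point operations per clock (two FPUs with fused multiply-add) and otherwise uses only numbers from the slides.

/* peak_and_efficiency.c - worked-arithmetic sketch for the tables above */
#include <stdio.h>

int main(void)
{
    /* Peak per JS21 blade: 2 sockets x 2 cores x 2.5 GHz x 4 flops/cycle
       (assumed flops/cycle for the 970MP) = 40 GFLOPS, as listed above */
    double gflops_per_blade = 2 * 2 * 2.5 * 4;
    printf("Peak, 512 blades: %.2f TFLOPS\n", 512 * gflops_per_blade / 1000.0); /* 20.48 */
    printf("Peak, 768 blades: %.2f TFLOPS\n", 768 * gflops_per_blade / 1000.0); /* 30.72 */

    /* Per-processor HPCC values: system totals divided by the 1,020
       processors in the 510-node run (2 sockets per blade) */
    printf("G-HPL per processor:     %.6f TFlop/s\n", 13.53 / 1020.0);  /* ~0.013264 */
    printf("EP-STREAM per processor: %.2f GB/s\n",    2468.0 / 1020.0); /* ~2.42     */

    /* Linpack efficiency = achieved / peak theoretical */
    printf("HPCC 510 nodes:   %.1f%%\n", 100.0 * 13.53 / 20.40);  /* 66.3 */
    printf("Top500 512 nodes: %.1f%%\n", 100.0 * 15.04 / 20.48);  /* 73.4 */
    printf("Top500 768 nodes: %.1f%%\n", 100.0 * 21.79 / 30.72);  /* 70.9 */
    return 0;
}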
Elapsed time per simulation timestep among best in TeraGrid
[Chart: elapsed time per simulation timestep, Big Red compared with other TeraGrid systems]

Image courtesy of Emad Tajkhorshid
• Simulation of TonB-dependent transporter (TBDT)
• Used systems at NCSA, IU, PSC
• Modeled mechanisms allowing transport of molecules through the cell membrane
• Work by Emad Tajkhorshid and James Gumbart of the University of Illinois at Urbana-Champaign. Mechanics of Force Propagation in TonB-Dependent Outer Membrane Transport. Biophysical Journal 93:496-504 (2007)
• To view the results of the simulation, please go to: http://www.life.uiuc.edu/emad/TonB-BtuB/btub-2.5Ans.mpg

ChemBioGrid
• Analyzed 555,007 abstracts in PubMed in ~8,000 CPU hours
• Used OSCAR3 to find SMILES strings -> SDF format -> 3D structure (GAMESS) -> into Varuna database and then other applications
• "Calculate and look up" model for ChemBioGrid

WxChallenge (www.wxchallenge.com)
• Over 1,000 undergraduate students, 64 teams, 56 institutions
• Usage on Big Red: ~16,000 CPU hours
  - 63% of processing done on Big Red
  - Most of the students who used Big Red couldn't tell you what it is
• Integration of computation and data flows via Lustre (Data Capacitor)

Overall user reactions
• NAMD, WRF users very pleased
• Porting from the Intel instruction set is a perceived challenge in a cycle-rich environment
• MILC optimization with VMX not successful so far in the eyes of the user community
• Keys to biggest successes:
  - Performance characteristics of JS21 nodes
  - Linkage of computation and storage (Lustre Data Capacitor)
  - Support for grid computing via TeraGrid

Evaluation of implementation
• The manageability of the system is excellent
• For a select group of applications, Big Red provides excellent performance and reasonable scalability
• We are likely to expand bandwidth from Big Red to the rest of the IU cyberinfrastructure
• We are installing a 7 TFLOPS Intel cluster; the model going forward is Intel-compatible processors as the "default entry point," with more specialized systems for highly scalable codes
• Focus on data management and scalable computation is critical to success
• Next steps: industrial partnerships and economic development in Indiana

Conclusions
• A 20.4 TFLOPS system with "not the usual" processors was successfully implemented, serving local Indiana University researchers and the national research audience via the TeraGrid
• Integration of computation and data management systems was critical to success
• In the future, Science Gateways will be increasingly important:
  - Most scientists can't constantly chase after the fastest available system; gateway developers might be able to
  - Programmability of increasingly unusual architectures is not likely to become easier
  - For applications with broad potential user bases, or extreme scalability on specialized systems, Science Gateways will be critical in enabling transformational capabilities and supporting scientific workflows
  - Broad use can be achieved only by relieving scientists of the need to understand the details of the underlying systems

Acknowledgements - Funding Sources
• IU's involvement as a TeraGrid Resource Partner is supported in part by the National Science Foundation under Grants No. ACI-0338618, OCI-0451237, OCI-0535258, and OCI-0504075
• The IU Data Capacitor is supported in part by the National Science Foundation under Grant No. CNS-0521433
• This research was supported in part by the Indiana METACyt Initiative. The Indiana METACyt Initiative of Indiana University is supported in part by Lilly Endowment, Inc.
• This work was supported in part by Shared University Research grants from IBM, Inc. to Indiana University
• The LEAD portal is developed under the leadership of IU Professors Dr. Dennis Gannon and Dr. Beth Plale, and supported by NSF grant 331480
• The ChemBioGrid Portal is developed under the leadership of IU Professor Dr. Geoffrey C. Fox and Dr. Marlon Pierce, and funded via the Pervasive Technology Labs (supported by the Lilly Endowment, Inc.) and National Institutes of Health grant P20 HG003894-01
• Many of the ideas presented in this talk were developed under a Fulbright Senior Scholar's award to Stewart, funded by the US Department of State and the Technische Universitaet Dresden
• Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF), National Institutes of Health (NIH), Lilly Endowment, Inc., or any other funding agency

Acknowledgements - People
• Maria Morris contributed to the graphics used in this talk
• Marcus Christie and Suresh Marru of the Extreme! Computing Lab contributed the LEAD graphics
• John Morris (www.editide.us) and Cairril Mills (Cairril.com Design & Marketing) contributed graphics
• This work would not have been possible without the dedicated and expert efforts of the staff of the Research Technologies Division of University Information Technology Services, the faculty and staff of the Pervasive Technology Labs, and the staff of UITS generally
• Thanks to the faculty and staff with whom we collaborate locally at IU and globally (via the TeraGrid, and especially at Technische Universitaet Dresden)
• Please cite as: Stewart, C.A. Implementation and experience with Big Red - a 20.4 TFLOPS IBM BladeCenter cluster. 2007. Presentation. Presented at: International Supercomputer Conference (Dresden, Germany, 26 Jun 2007). Available from: http://hdl.handle.net/2022/14607
Co-author affiliations
Craig A. Stewart; [email protected]; Office of the Vice President and CIO, Indiana University, 601 E. Kirkwood, Bloomington, IN
Matthew Link; [email protected]; University Information Technology Services (UITS), Indiana University, 2711 E. 10th St., Bloomington, IN 47408
D. Scott McCaulay; [email protected]; UITS, Indiana University, 2711 E. 10th St., Bloomington, IN 47408
Greg Rodgers; [email protected]; IBM Corporation, 2455 South Road, Poughkeepsie, New York 12601
George Turner; [email protected]; UITS, Indiana University, 2711 E. 10th St., Bloomington, IN 47408
David Hancock; [email protected]; UITS, Indiana University-Purdue University Indianapolis, 535 W. Michigan Street, Indianapolis, IN 46202
Richard Repasky; [email protected]; UITS, Indiana University, 2711 E. 10th St., Bloomington, IN 47408
Peng Wang; [email protected]; UITS, Indiana University-Purdue University Indianapolis, 535 W. Michigan Street, Indianapolis, IN 46202
Faisal Saied; [email protected]; Rosen Center for Advanced Computing, Purdue University, 302 W. Wood Street, West Lafayette, Indiana 47907
Marlon Pierce; Community Grids Lab, Pervasive Technology Labs at Indiana University, 501 N. Morton Street, Bloomington, IN 47404
Ross Aiken; [email protected]; IBM Corporation, 9229 Delegates Row, Precedent Office Park Bldg 81, Indianapolis, IN 46240
Matthias Mueller; [email protected]; Center for Information Services and High Performance Computing (ZIH), Dresden University of Technology, D-01062 Dresden, Germany
Matthias Jurenz; [email protected]; Center for Information Services and High Performance Computing (ZIH), Dresden University of Technology, D-01062 Dresden, Germany
Matthias Lieber; [email protected]; Center for Information Services and High Performance Computing (ZIH), Dresden University of Technology, D-01062 Dresden, Germany

Thank you
• Any questions?