Resources that matter to you: accessing IU’s Big Red supercomputer and massive data storage system via the TeraGrid

Craig A. Stewart
Associate Dean, Research Technologies; Chief Operating Officer, Pervasive Technology Labs; PI, IU TeraGrid Resource Partner
[email protected]
14 October 2007

License terms
• Please cite as: Stewart, C.A. 2007. Resources that matter to you: accessing IU’s Big Red supercomputer and massive data storage system via the TeraGrid. Tutorial presentation. Presented at IEEE 7th International Conference on BioInformatics & BioEngineering. 14-17 Oct., Harvard Medical School, Boston, MA. http://hdl.handle.net/2022/13992
• Some figures shown here were taken from the web, under an interpretation of fair use that seemed reasonable at the time and within reasonable readings of copyright interpretations. Such diagrams are indicated here with a source URL. In several cases these web sites are no longer available, so the diagrams are included here for historical value.
• Except where otherwise noted, by inclusion of a source URL or some other note, the contents of this presentation are © by the Trustees of Indiana University. This content is released under the Creative Commons Attribution 3.0 Unported license (http://creativecommons.org/licenses/by/3.0/). This license includes the following terms: You are free to share – to copy, distribute and transmit the work and to remix – to adapt the work under the following conditions: attribution – you must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work). For any reuse or distribution, you must make clear to others the license terms of this work.

Outline
• Why this talk matters to you: Ever felt that your research was limited by availability of computer power or consulting expertise?
• Information about the TeraGrid
• IU’s Big Red supercomputer and available software
• IU’s Massive Data Storage system
• If you deliver services via the web… we may be able to help you deliver them to the scientific community

Slides with the IU logo are by Craig Stewart and D. Scott McCaulay and © Trustees of Indiana University. May be reused with attribution.

Cyberinfrastructure, TeraGrid
• Cyberinfrastructure consists of computing systems, data storage systems, advanced instruments and data repositories, visualization environments, and people, all linked together by software and high performance networks to improve research productivity and enable breakthroughs not otherwise possible. (http://en.wikipedia.org/wiki/Cyberinfrastructure)
• The TeraGrid is an open scientific discovery infrastructure that combines large computing resources (including supercomputers, storage, and scientific visualization systems) at several Resource Provider partner sites to create an integrated, persistent computational resource.
• The TeraGrid is funded by the National Science Foundation and is an allocatable resource available for use to accelerate US research.

Real-Time Usage Mashup
349 jobs running across 10,742 processors at 22:48 09/29/2007
Alpha version
Mashup tool – © Maytal Dahan, Texas Advanced Computing Center ([email protected])

TeraGrid User Community

TeraGrid - 11 Resource Providers, One Facility
[Map. Grid Infrastructure Group (UChicago). Sites: UW, PSC, UC/ANL, NCAR, PU, NCSA, IU, Caltech, ORNL, Tennessee, USC/ISI, SDSC, LSU, TACC, UNC/RENCI. Legend: Resource Provider (RP), Software Integration Partner, Network Hub.]
www.teragrid.org

TeraGrid Resources For Scientific Discovery
• Computing - over 250 TFLOPS and growing
  – Common help desk and consulting requests
  – CTSS software environment
• Remote visualization servers and visualization software
• Data Management
  – Over 20 petabytes of storage
  – Over 100 Scientific Data Collections
• Broadening Participation in TeraGrid
  – Over 20 Science Gateways
  – Advanced Support for TeraGrid Applications
  – Education and training events and resources
• Access
  – Common allocations mechanism - DAC, MRAC and LRAC

TeraGrid Architectural Model
[Diagram. RP 1, RP 2, RP 3; POPS; User Portal; Help; TeraGrid Infrastructure (Accounting, Network, Authorization, …); Compute Service, Viz Service, Data Service.]

Big Red
• TFLOPS: 30.7
• Aggregate RAM: 6.1 TB
• Processors: 1,536 PowerPC 970MP, supports up to 3,072 MPI processes
• Disk size: 72 TB local scratch; 266 TB GPFS

Software available on Big Red
• Key molecular dynamics and quantum chemistry codes: NAMD, CHARMM, AMBER, GAMESS, Gaussian, Gromacs, Quantum Espresso
• All the usual bioinformatics codes: BLAST, MEME, Clustal-W, phylip, fastDNAml, etc.
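
As a rough cross-check of the Big Red figures listed above, the short Python sketch below recomputes the core count and an approximate peak rate. Only the processor count and aggregate RAM come from the slide. The clock speed (2.5 GHz) and the four floating-point operations per cycle per core are assumptions introduced here for illustration, as are the function names.

```python
# Back-of-envelope check of the Big Red slide figures.
# From the slide: 1,536 PowerPC 970MP processors, 6.1 TB aggregate RAM.
# Assumptions (not from the slide): 2.5 GHz clock, 4 flops per cycle per core.

PROCESSORS = 1536            # from the slide
CORES_PER_PROCESSOR = 2      # the PowerPC 970MP is a dual-core part
CLOCK_HZ = 2.5e9             # assumed clock speed
FLOPS_PER_CYCLE = 4          # assumed: two FPUs, each with fused multiply-add
AGGREGATE_RAM_TB = 6.1       # from the slide


def total_cores(processors=PROCESSORS, cores_per_processor=CORES_PER_PROCESSOR):
    """Number of cores, i.e. the maximum of one MPI process per core."""
    return processors * cores_per_processor


def peak_tflops(cores, clock_hz=CLOCK_HZ, flops_per_cycle=FLOPS_PER_CYCLE):
    """Theoretical peak in TFLOPS under the stated assumptions."""
    return cores * clock_hz * flops_per_cycle / 1e12


def ram_per_core_gb(cores, aggregate_ram_tb=AGGREGATE_RAM_TB):
    """Approximate RAM per core in GB, taking 1 TB as 1000 GB."""
    return aggregate_ram_tb * 1000 / cores


if __name__ == "__main__":
    cores = total_cores()
    print(f"cores: {cores}")
    print(f"peak: {peak_tflops(cores):.2f} TFLOPS")
    print(f"RAM per core: {ram_per_core_gb(cores):.2f} GB")
```

Under these assumptions the core count matches the 3,072 MPI processes quoted on the slide, and the computed peak is consistent with the quoted 30.7 TFLOPS.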
Image courtesy of Emad Tajkhorshid
• Simulation of TonB-dependent transporter (TBDT)
• Used systems at NCSA, IU, PSC
• Modeled mechanisms for allowing transport of molecules through cell membrane
• Work by Emad Tajkhorshid and James Gumbart, of University of Illinois Urbana-Champaign. Mechanics of Force Propagation in TonB-Dependent Outer Membrane Transport. Biophysical Journal 93:496-504 (2007)
• To view the results of the simulation, please go to: http://www.life.uiuc.edu/emad/TonB-BtuB/btub-2.5Ans.mpg

HPSS
• IU installed its HPSS system in 1998.
• HPSS is distributed between the Indianapolis and Bloomington campuses.
• Over 2.8 petabytes of storage are available, and the system can scale to 4.2 petabytes.
• In aggregate, HPSS can handle as much as 2 GB/s of data flowing in or out.
• Dual copies in Bloomington and Indianapolis by default

IU HPSS from TeraGrid
• Accessible by either:
  – gridftp.archive.iu.teragrid.org
  – gridftp.hpss.iu.teragrid.org
• Need to add TeraGrid certificate to IU Big Red System
• uberftp and globus-url-copy tested and supported

How you can use the TeraGrid
• The TeraGrid is funded by the National Science Foundation as a US science instrument (facility); resources of TeraGrid are allocated like other NSF instruments. NIH funded researchers are welcome!
• Application process (no double jeopardy for funded researchers)
• Allocation committees - Development, Medium, Large
• www.teragrid.org

… Drives This

TeraGrid Science Gateways Initiative: Community Interface to Grids
• Common Web Portal or application interfaces (database access, computation, workflow, etc).
• “Back-End” use of TeraGrid computation, information management, visualization, or other services.
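
As a sketch of the access route on the "IU HPSS from TeraGrid" slide above, the Python snippet below assembles a globus-url-copy command line for one of the two listed GridFTP hosts. It only builds and prints the command and does not run it. The local file, the remote HPSS path, the stream count, and the helper name build_copy_command are hypothetical, and a valid TeraGrid certificate is assumed to be in place already.

```python
import shlex

# Hostnames are taken from the slide; everything else is illustrative.
HPSS_ENDPOINTS = (
    "gridftp.archive.iu.teragrid.org",
    "gridftp.hpss.iu.teragrid.org",
)


def build_copy_command(local_path, remote_path, host=HPSS_ENDPOINTS[0], streams=4):
    """Return the argv list for copying a local file to HPSS over GridFTP.

    local_path and remote_path must be absolute; remote_path is a
    hypothetical location on the archive, not a documented one.
    """
    if not local_path.startswith("/") or not remote_path.startswith("/"):
        raise ValueError("both paths must be absolute")
    source = "file://" + local_path            # e.g. file:///tmp/results.tar
    dest = f"gsiftp://{host}{remote_path}"     # GridFTP URL scheme
    # -vb reports transfer performance; -p sets the number of parallel streams.
    return ["globus-url-copy", "-vb", "-p", str(streams), source, dest]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    command = build_copy_command("/tmp/results.tar", "/hpss/u/username/results.tar")
    print(shlex.join(command))
```

Building the argument list separately from running it makes it easy to inspect the source and destination URLs before any data moves.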
ChemBioGrid
www.chembiogrid.org
• Analyzed 555,007 abstracts in PubMed in ~8,000 CPU hours
• Used OSCAR3 to find SMILES strings -> SDF format -> 3D structure (GAMESS) -> into Varuna database and then other applications
• “Calculate and look up” model for ChemBioGrid

MutDB - http://mutdb.org

Do you have . . .
• A need for computing cycles or storage?
• A Web bioinformatics resource you would like to have hosted on a large supercomputer/massive data storage system?
Send email to [email protected] and we can advise you on appropriate TeraGrid resources and help you with the application process for use of any TeraGrid resource (at IU or elsewhere).

Acknowledgements - Funding
• IU’s involvement as a TeraGrid Resource Partner is supported in part by the National Science Foundation under Grants No. ACI-0338618, OCI-0451237, OCI-0535258, and OCI-0504075.
• The IU Data Capacitor is supported in part by the National Science Foundation under Grant No. CNS-0521433.
• The Grid Infrastructure Group management of the TeraGrid, and Dane Skow's leadership thereof, is funded by NSF grant 0503697.
• This research was supported in part by the Indiana METACyt Initiative. The Indiana METACyt Initiative of Indiana University is supported in part by Lilly Endowment, Inc.
• This work was supported in part by Shared University Research grants from IBM, Inc. to Indiana University.
• The LEAD portal is developed under the leadership of IU Professors Dr. Dennis Gannon and Dr. Beth Plale, and supported by NSF grant 331480.
• The ChemBioGrid Portal is developed under the leadership of IU Professor Dr. Geoffrey C. Fox and Dr. Marlon Pierce and funded via the Pervasive Technology Labs (supported by the Lilly Endowment, Inc.) and the National Institutes of Health grant P20 HG003894-01.
• Many of the ideas presented in this talk were developed under a Fulbright Senior Scholar’s award to Stewart, funded by the US Department of State and the Technische Universitaet Dresden.
• Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF), National Institutes of Health (NIH), Lilly Endowment, Inc., or any other funding agency.

Acknowledgements - People
• Malinda Lingwall for editing, graphic layout, and managing process
• Maria Morris contributed to the graphics used in this talk
• John Morris (www.editide.us) and Cairril Mills (Cairril.com Design & Marketing) contributed graphics
• This work would not have been possible without the dedicated and expert efforts of the staff of the Research Technologies Division of University Information Technology Services, the faculty and staff of the Pervasive Technology Labs, and the staff of UITS generally
• Thanks to the faculty and staff with whom we collaborate locally at IU and globally (via the TeraGrid, and especially at Technische Universitaet Dresden)
• This talk is dedicated to the memory of Chuck Stringer

Chuck Stringer 1973-2007

Thank you
• Any questions?