Resources that matter to you:
accessing IU’s Big Red supercomputer
and massive data storage system via
the TeraGrid
Craig A. Stewart
Associate Dean, Research Technologies;
Chief Operating Officer, Pervasive Technology Labs;
PI, IU TeraGrid Resource Partner
[email protected]
14 October 2007
License terms
• Please cite as: Stewart, C.A. 2007. Resources that matter to you: accessing IU’s
Big Red supercomputer and massive data storage system via the TeraGrid. Tutorial
presentation. Presented at IEEE 7th International Conference on BioInformatics &
BioEngineering. 14-17 Oct., Harvard Medical School, Boston, MA.
http://hdl.handle.net/2022/13992
• Some figures shown here are taken from the web, under an interpretation of fair use
that seemed reasonable at the time and within reasonable readings of copyright
interpretations. Such diagrams are indicated here with a source url. In several
cases these web sites are no longer available, so the diagrams are included here
for historical value. Except where otherwise noted, by inclusion of a source url or
some other note, the contents of this presentation are © by the Trustees of Indiana
University. This content is released under the Creative Commons Attribution 3.0
Unported license (http://creativecommons.org/licenses/by/3.0/). This license
includes the following terms: You are free to share (to copy, distribute and transmit
the work) and to remix (to adapt the work) under the following conditions: attribution
(you must attribute the work in the manner specified by the author or licensor, but
not in any way that suggests that they endorse you or your use of the work). For
any reuse or distribution, you must make clear to others the license terms of this
work.
Outline
• Why this talk matters to you: Ever felt that your
research was limited by availability of computer
power or consulting expertise?
• Information about the TeraGrid
• IU’s Big Red supercomputer and available
software
• IU’s Massive Data Storage system
• If you deliver services via the web… we may be
able to help you deliver them to the scientific
community
Slides with the IU logo are by Craig Stewart and D. Scott McCaulay and © Trustees of Indiana University.
May be reused with attribution.
Cyberinfrastructure, TeraGrid
• Cyberinfrastructure consists of computing systems, data storage
systems, advanced instruments and data repositories,
visualization environments, and people, all linked together by
software and high performance networks to improve research
productivity and enable breakthroughs not otherwise possible.
(http://en.wikipedia.org/wiki/Cyberinfrastructure)
• The TeraGrid is an open scientific discovery infrastructure that
combines large computing resources (including supercomputers,
storage, and scientific visualization systems) at several
Resource Provider partner sites to create an integrated,
persistent computational resource.
• The TeraGrid is funded by the National Science Foundation and
is an allocatable resource available for use to accelerate US
research.
Real-Time Usage Mashup
349 Jobs running across 10,742 processors at 22:48 09/29/2007
Alpha version Mashup tool – © Maytal Dahan, Texas Advanced Computing Center ([email protected])
TeraGrid User Community
TeraGrid - 11 Resource Providers, One Facility
[Map of TeraGrid sites. Site labels: Grid Infrastructure Group (UChicago), UW, PSC, UC/ANL, NCAR, PU, NCSA, IU, Caltech, ORNL, Tennessee, USC/ISI, SDSC, LSU, TACC, UNC/RENCI. Legend: Resource Provider (RP), Software Integration Partner, Network Hub.]
www.teragrid.org
TeraGrid Resources For Scientific Discovery
• Computing - over 250 TFLOPS and growing
– Common help desk and consulting requests
– CTSS software environment
• Remote visualization servers and visualization
software
• Data Management
– Over 20 petabytes of storage
– Over 100 Scientific Data Collections
• Broadening Participation in TeraGrid
– Over 20 Science Gateways
– Advanced Support for TeraGrid Applications
– Education and training events and resources
• Access
– Common allocations mechanism - DAC, MRAC
and LRAC
TeraGrid Architectural Model
[Diagram labels: RP 1, RP 2, RP 3; POPS; User Portal; Help; TeraGrid Infrastructure (Accounting, Network, Authorization, …); Compute Service; Viz Service; Data Service]
Big Red
• TFLOPS: 30.7
• Aggregate RAM: 6.1 TB
• Processors: 1,536 PowerPC 970MP;
supports up to 3,072 MPI processes
• Disk size: 72 TB local scratch; 266 TB GPFS
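The figures on this slide can be cross-checked with a little arithmetic. The sketch below is only a back-of-the-envelope check, not an official specification: the processor count and aggregate RAM come from the slide, while the dual-core layout, the 2.5 GHz clock, and the 4 floating-point operations per cycle per core are assumptions that the slide does not state.

```python
# Back-of-the-envelope check of the Big Red figures quoted on the slide.
PROCESSORS = 1536          # PowerPC 970MP chips (from the slide)
AGGREGATE_RAM_TB = 6.1     # from the slide
CORES_PER_PROCESSOR = 2    # assumption: the 970MP is a dual-core part
CLOCK_GHZ = 2.5            # assumption: clock speed is not given on the slide
FLOPS_PER_CYCLE = 4        # assumption: two FPUs per core, each a fused multiply-add


def core_count(processors=PROCESSORS, cores_per_processor=CORES_PER_PROCESSOR):
    """Number of cores, i.e. the largest one-process-per-core MPI job."""
    return processors * cores_per_processor


def peak_tflops(cores, clock_ghz=CLOCK_GHZ, flops_per_cycle=FLOPS_PER_CYCLE):
    """Theoretical peak: cores * GHz * flops/cycle gives GFLOPS; /1000 gives TFLOPS."""
    return cores * clock_ghz * flops_per_cycle / 1000.0


def ram_per_core_gb(cores, aggregate_ram_tb=AGGREGATE_RAM_TB):
    """Average memory per core in GB, using decimal units (1 TB = 1000 GB)."""
    return aggregate_ram_tb * 1000.0 / cores


if __name__ == "__main__":
    cores = core_count()
    print("cores:", cores)
    print("peak TFLOPS:", peak_tflops(cores))
    print("RAM per core (GB):", round(ram_per_core_gb(cores), 2))
```

Under these assumptions the core count matches the 3,072 MPI processes quoted above, and the peak comes out close to the quoted 30.7 TFLOPS.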
Software available on Big Red
• Key molecular dynamics and quantum chemistry
codes: NAMD, CHARMM, AMBER, GAMESS, Gaussian,
Gromacs, Quantum Espresso
• All the usual bioinformatics codes:
BLAST, MEME, Clustal-W, PHYLIP,
fastDNAml, etc.
Image courtesy of Emad Tajkhorshid
• Simulation of TonB-dependent
transporter (TBDT)
• Used systems at NCSA, IU,
PSC
• Modeled mechanisms for
allowing transport of molecules
through cell membrane
• Work by Emad Tajkhorshid and
James Gumbart, of the University
of Illinois at Urbana-Champaign:
Mechanics of Force Propagation in
TonB-Dependent Outer Membrane
Transport. Biophysical Journal
93:496-504 (2007)
• To view the results of the
simulation, please go to:
http://www.life.uiuc.edu/emad/TonB-BtuB/btub-2.5Ans.mpg
HPSS
• IU installed its HPSS system in 1998
• HPSS is distributed between the Indianapolis and
Bloomington campuses.
• Over 2.8 petabytes of storage are available, and the
system can scale to 4.2 petabytes
• In aggregate, HPSS can handle as much as
2 GB/s of data flowing in or out.
• Dual copies in Bloomington and Indianapolis by
default
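To put the aggregate bandwidth figure in perspective, the following sketch estimates how long a transfer would take at the quoted 2 GB/s. It assumes decimal units (1 GB = 10^9 bytes) and that a single transfer could use the full aggregate rate, which is an idealization; real transfers share that bandwidth and would be slower.

```python
# Idealized transfer-time estimate at the aggregate HPSS rate quoted on the slide.
AGGREGATE_RATE_BYTES_PER_S = 2e9   # 2 GB/s, decimal units (assumption)
SECONDS_PER_DAY = 86400


def transfer_seconds(size_bytes, rate_bytes_per_s=AGGREGATE_RATE_BYTES_PER_S):
    """Lower bound on transfer time: size divided by sustained rate."""
    if rate_bytes_per_s <= 0:
        raise ValueError("rate must be positive")
    return size_bytes / rate_bytes_per_s


if __name__ == "__main__":
    one_terabyte = 1e12
    full_archive = 2.8e15          # the 2.8 PB capacity from the slide
    print("1 TB:", transfer_seconds(one_terabyte), "seconds")
    print("2.8 PB:", transfer_seconds(full_archive) / SECONDS_PER_DAY, "days")
```

Even in this best case, moving a terabyte takes several minutes and streaming the whole archive would take on the order of weeks.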
IU HPSS from TeraGrid
• Accessible by either:
– gridftp.archive.iu.teragrid.org
– gridftp.hpss.iu.teragrid.org
• Need to add TeraGrid certificate to IU
Big Red system
• uberftp and globus-url-copy are
tested and supported
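As an illustration of what a transfer to one of these endpoints might look like, here is a minimal sketch that only builds (and does not run) a globus-url-copy command line. The host names come from the slide; the file paths, the helper name build_copy_command, and the choice of four parallel streams are hypothetical. It also assumes a valid grid credential is already in place on the machine where the command would run.

```python
import shlex

# GridFTP endpoints listed on the slide.
GRIDFTP_HOSTS = (
    "gridftp.archive.iu.teragrid.org",
    "gridftp.hpss.iu.teragrid.org",
)


def build_copy_command(local_path, remote_path,
                       host=GRIDFTP_HOSTS[0], parallel_streams=4):
    """Return an argument list for copying a local file to a GridFTP server.

    Both paths must be absolute. Nothing is executed here.
    """
    if not local_path.startswith("/") or not remote_path.startswith("/"):
        raise ValueError("local_path and remote_path must be absolute")
    source_url = "file://" + local_path            # e.g. file:///scratch/run1.tar
    dest_url = "gsiftp://" + host + remote_path    # gsiftp is the GridFTP URL scheme
    return [
        "globus-url-copy",
        "-vb",                        # report transfer performance
        "-p", str(parallel_streams),  # number of parallel data streams
        source_url,
        dest_url,
    ]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    cmd = build_copy_command("/scratch/run1.tar", "/hpss/u/username/run1.tar")
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Returning an argument list rather than a single string lets the result be passed directly to subprocess.run without shell quoting problems.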
How you can use the TeraGrid
• The TeraGrid is funded by the National Science
Foundation as a US science instrument (facility);
resources of TeraGrid are allocated like other
NSF instruments. NIH funded researchers are
welcome!
• Application process (no double jeopardy for
funded researchers)
• Allocation committees - Development, Medium,
Large
• www.teragrid.org
… Drives This
TeraGrid Science Gateways Initiative:
Community Interface to Grids
• Common Web Portal or application interfaces (database access,
computation, workflow, etc).
• “Back-End” use of TeraGrid computation, information management,
visualization, or other services.
ChemBioGrid www.chembiogrid.org
• Analyzed 555,007
abstracts in PubMed in
~ 8,000 CPU hours
• Used OSCAR3 to find
SMILES strings → SDF format → 3D
structure (GAMESS) → into Varuna
database and then
other applications
• “Calculate and look
up” model for
ChemBioGrid
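The ChemBioGrid workload figures above imply an average cost per abstract. The sketch below works that out and also gives an idealized wall-clock time if the work were spread evenly over many cores. The abstract count and CPU-hour total come from the slide (the latter is approximate); the perfectly parallel assumption and the example core count are hypothetical.

```python
# Average cost per abstract implied by the ChemBioGrid figures on the slide.
ABSTRACTS = 555_007        # from the slide
CPU_HOURS = 8_000          # from the slide, approximate
SECONDS_PER_HOUR = 3600


def cpu_seconds_per_abstract(abstracts=ABSTRACTS, cpu_hours=CPU_HOURS):
    """Average CPU seconds spent on each abstract."""
    return cpu_hours * SECONDS_PER_HOUR / abstracts


def ideal_wall_hours(cpu_hours, cores):
    """Wall-clock hours assuming perfect parallel scaling (an idealization)."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return cpu_hours / cores


if __name__ == "__main__":
    print("CPU seconds per abstract:", round(cpu_seconds_per_abstract(), 1))
    # Hypothetical: the whole job spread evenly over 3,072 cores.
    print("ideal wall hours on 3072 cores:", round(ideal_wall_hours(CPU_HOURS, 3072), 1))
```

At under a minute of CPU time per abstract, this is the kind of workload that splits naturally into many independent jobs.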
MutDB - http://mutdb.org
Do you have . . .
• A need for computing cycles or storage?
• A Web bioinformatics resource you would
like to have hosted on a large
supercomputer/massive data storage
system?
Send email to [email protected] and we can
advise you on appropriate TeraGrid resources and help you
with the application process for use of any TeraGrid resource
(at IU or elsewhere).
Acknowledgements - Funding
• IU’s involvement as a TeraGrid Resource Partner is supported in part by the National Science
Foundation under Grants No. ACI-0338618l, OCI-0451237, OCI-0535258, and OCI-0504075.
• The IU Data Capacitor is supported in part by the National Science Foundation under Grant
No. CNS-0521433.
• The Grid Infrastructure Group management of the TeraGrid, and Dane Skow's leadership
thereof, is funded by NSF grant 0503697.
• This research was supported in part by the Indiana METACyt Initiative. The Indiana METACyt
Initiative of Indiana University is supported in part by Lilly Endowment, Inc.
• This work was supported in part by Shared University Research grants from IBM, Inc. to
Indiana University.
• The LEAD portal is developed under the leadership of IU Professors Dr. Dennis Gannon and
Dr. Beth Plale, and supported by NSF grant 331480.
• The ChemBioGrid Portal is developed under the leadership of IU Professor Dr. Geoffrey C.
Fox and Dr. Marlon Pierce and funded via the Pervasive Technology Labs (supported by the
Lilly Endowment, Inc.) and the National Institutes of Health grant P20 HG003894-01.
• Many of the ideas presented in this talk were developed under a Fulbright Senior Scholar’s
award to Stewart, funded by the US Department of State and the Technische Universitaet
Dresden.
• Any opinions, findings and conclusions or recommendations expressed in this material are
those of the author(s) and do not necessarily reflect the views of the National Science
Foundation (NSF), National Institutes of Health (NIH), Lilly Endowment, Inc., or any other
funding agency.
Acknowledgements - People
• Malinda Lingwall for editing, graphic layout, and managing process
• Maria Morris contributed to the graphics used in this talk
• John Morris (www.editide.us) and Cairril Mills (Cairril.com Design & Marketing)
contributed graphics
• This work would not have been possible without the dedicated and expert efforts
of the staff of the Research Technologies Division of University Information
Technology Services, the faculty and staff of the Pervasive Technology Labs, and
the staff of UITS generally
• Thanks to the faculty and staff with whom we collaborate locally at IU and globally
(via the TeraGrid, and especially at Technische Universitaet Dresden)
• This talk is dedicated to the memory of Chuck Stringer
Chuck Stringer 1973-2007
Thank you
• Any questions?