Nancy’s Excellent TeraGrid Adventures or


Science Gateways on the
TeraGrid
Von Welch, NCSA
[email protected]
(with thanks to Nancy Wilkins-Diehr,
SDSC for many slides)
The TeraGrid Strategy
• Building a distributed system of unprecedented scale
– 40+ teraflops compute
– 1+ petabyte storage
– 10-40Gb/s networking
• Integrating new partners to introduce new capabilities
– Additional computing, visualization capabilities
– New types of resources: data collections, instruments
• Creating a unified user environment across heterogeneous resources
– User software environment, user support resources
– Created an initial community of over 500 users, 80 PIs
Make it extensible!
The TeraGrid Team
• Two major components:
– 9 Resource Providers (RPs) who provide
resources and expertise
• Seven universities
• Two government laboratories
• Expected to grow
– The Grid Integration Group (GIG), which provides
leadership in grid integration among the RPs
• Led by Director, who is assisted by Executive Steering
Committee, Area Directors, Project Manager
• Includes participation by staff at each RP
• Funding now provided for people, not just
networks and hardware!
TeraGrid Resource Partners
TeraGrid Resources
Sites: ANL/UC, Caltech, IU, NCSA, ORNL, PSC, Purdue, SDSC, TACC
[Table: per-site resources; values are listed by row in source order, without site alignment]
• Compute Resources: Itanium2 (0.5 TF), Itanium2 (0.2 TF), Itanium2 (10 TF), IA-32 (0.3 TF), Hetero (1.7 TF), Itanium2 (4.4 TF), IA-32 (6.3 TF), IA-32 (2.0 TF), SGI SMP (6.5 TF), XT3 (10 TF), TCS (6 TF), Marvel (0.3 TF), Power4+ (1.1 TF), Sun (Vis), Itanium2 (0.8 TF), IA-32 (0.5 TF)
• Online Storage: 32 TB, 600 TB, 150 TB, 540 TB, 50 TB, 20 TB, 155 TB, 1 TB
• Mass Storage: 1.2 PB, 3 PB, 2.4 PB, 6 PB, 2 PB
• Data Collections, Visualization, Instruments: marked "Yes" for a subset of sites
• Network (Gb/s, Hub): 30 CHI, 30 LA, 10 CHI, 30 CHI, 10 ATL, 30 CHI, 10 CHI, 30 LA, 10 CHI
Partners will add resources and TeraGrid will add partners!
Science Gateways
A new initiative for the TeraGrid
• Increasing investment by communities to build their
own cyberinfrastructure.
• Heterogeneity
– Resources: different architectures at local, national and international levels
– Users: from HPC expert to K-12 student... they should all benefit from CI.
– Software stacks, policies.
• How can “centers/institutions” provide, operate, and maintain in this heterogeneous world?
• Working with Gateways, TeraGrid will start to answer
that question by providing generic CI services to
communities.
• Integration and interoperability.
What are Gateways?
• Gateways will
– engage communities that are not traditional users of the supercomputing centers
– by providing community-tailored access to TeraGrid services and capabilities
• Three examples:
– Web-based portals that front-end Grid services, which provide
TeraGrid-deployed applications used by a community.
– Coordinated access points enabling users to move seamlessly
between TeraGrid and other grids.
– Application programs running on users' machines but accessing
services in TeraGrid (and elsewhere)
• All take advantage of existing community investment in
software, services, education, and other components of
Cyberinfrastructure.
Grid Portal Gateways
• The Portal is accessed through a browser or desktop tools
– Provides Grid authentication and access to services
– Provides direct access to TeraGrid-hosted applications as services
• The Required Support Services
– Use NMI Portal Framework, GridPort
– NMI Grid Tools: Condor, Globus, etc.
– OSG, HEP tools: Clarens, MonaLisa
• Build standard portals to meet the domain requirements of the biology communities
• Develop federated databases to be replicated and shared across TeraGrid
[Diagram labels: Workflow Composer; Grid Resources; OGCE Portlets with Container; Service API; Apache Jetspeed Internal Services; Grid Service Stubs; Local Portal Services; Remote Content Services; Java CoG Kit]
• Builds on NSF & DOE software
Technical Approach
OGCE Science Portal
– Searchable Metadata catalogs
– Information Space Management.
– Workflow managers
– Resource brokers
– Application deployment services
– Authorization services.
[Diagram labels: Grid Protocols; Grid Services; Open Source Tools; HTTP; Remote Content Servers]
Initial Focus on 10 Gateways
Each entry lists: Science Gateway Prototype, Discipline, Science Partner(s), TeraGrid Liaison
• Linked Environments for Atmospheric Discovery (LEAD)
– Discipline: Atmospheric
– Science Partner(s): Droegemeier (OU)
– TeraGrid Liaison: Gannon (IU), Pennington (NCSA)
• National Virtual Observatory (NVO)
– Discipline: Astronomy
– Science Partner(s): Szalay (Johns Hopkins)
– TeraGrid Liaison: Williams (Caltech)
• Network for Computational Nanotechnology (NCN) and “nanoHUB”
– Discipline: Nanotechnology
– Science Partner(s): Lundstrom (PU)
– TeraGrid Liaison: Goasguen (PU)
• National Microbial Pathogen Data Resource Center (NMPDR)
– Discipline: Biomedicine and Biology
– Science Partner(s): Schneewind (UC), Osterman (Burnham/UCSD), DeLong (MIT), Dusko (INRA)
– TeraGrid Liaison: Stevens (UC/Argonne)
• NSF National Evolutionary Biology Center (NESC), NIH Carolina Center for Exploratory Genetic Analysis, State of North Carolina Bioinformatics Portal project
– Discipline: Biomedicine and Biology
– Science Partner(s): Cunningham (Duke), Magnuson (UNC)
– TeraGrid Liaison: Reed (UNC), Blatecky (UNC)
• Neutron Science Instrument Gateway
– Discipline: Physics
– Science Partner(s): Dunning (ORNL)
– TeraGrid Liaison: Cobb (ORNL)
• Grid Analysis Environment
– Discipline: High-Energy Physics
– Science Partner(s): Newman (Caltech)
– TeraGrid Liaison: Bunn (Caltech)
• Transportation System Decision Support
– Discipline: Homeland Security
– Science Partner(s): Stephen Eubanks (LANL)
– TeraGrid Liaison: Beckman (Argonne)
• Groundwater/Flood Modeling
– Discipline: Environmental
– Science Partner(s): Wells (UT-Austin), Engel (ORNL)
– TeraGrid Liaison: Boisseau (TACC)
• Science Grid [GriPhyN/iVDGL/Grid3]
– Discipline: Multiple
– Science Partner(s): Pordes (FNAL), Huth (Harvard), Avery (UFlorida)
– TeraGrid Liaison: Foster (UC/Argonne), Kesselman (USC-ISI), Livny (UW)
Expanding User Base
• A new generation of “users” that access TeraGrid via Science Gateways, scaling well beyond the traditional “user” with a shell login account.
• Impact on society from gateways enabling decision support is much larger!
[Chart: Projected user community size by each science gateway project, 2005-2009, scale 0 to 6000 users; series: LEAD, NVO, NCN, OLSG, NESC/CCEGA, SNS, HEP, Flood, OSG]
So how will we meet all these needs?
• With RATS!
(Requirements
Analysis Teams)
• Organized RATS
• Collection, analysis
and consolidation of
requirements to
jumpstart the work
• And milestones
Rats de Paris
Traditional HPC Model
• All users have accounts at each site/resource
– NxN matrix of users and sites
• Users access resources through low-level interfaces
– E.g. Unix shells, FTP sessions
• Resource takes care of all the security
– AAAA: Authentication, Authorization,
Auditing, Accounting
Traditional HPC Usage
[Diagram: user commands (% ls, % foo) pass through Authn to the resource; labels: OS (Authz), Audit, Accounting]
Science Gateway Motivation
• Shell-level access to resources is great for power
users, but has a steep learning curve
– Many SG users just need a domain-specific interface, e.g. they
are not developing or deploying application codes
• Each resource/site has to maintain state about every
user
– Scalability problems for large/dynamic user communities
• No abstraction - users must adapt to all changes in
resources
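The scalability point can be made concrete with a little arithmetic. The sketch below counts the account records that resource sites must keep under each model. The 9 sites and 10 gateways come from earlier slides; the figure of 5000 users and both function names are illustrative assumptions, not numbers from the talk.

```python
def per_user_account_records(num_users: int, num_sites: int) -> int:
    """Traditional model: every user holds an account at every site."""
    return num_users * num_sites


def community_account_records(num_gateways: int, num_sites: int) -> int:
    """Gateway model: each site keeps one community account per gateway."""
    return num_gateways * num_sites


# Illustrative inputs: 5000 users is an assumed community size.
users, sites, gateways = 5000, 9, 10

print("traditional:", per_user_account_records(users, sites))
print("gateway:    ", community_account_records(gateways, sites))
```

The per-user state does not disappear in the gateway model; it moves to the gateway, which keeps one record per user instead of one per user per site.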
SG Security Model
• SG acts as an interface between the
community and its resources
• Much like a traditional ‘Grid Portal’, it provides
a domain-specific interface
• However, unlike portals, it exists as a trusted
entity in its own right, allowing the resource to
“outsource” AAAA functionality to the SG
• Resource runs all commands in a
community account, which constrains what the
community can do - account can be
constrained to a few community applications
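A minimal sketch of the restricted community account idea follows. The class name, the account name "sg_community" and the allowlist check are hypothetical; the slides do not say how the restriction is implemented, and this model only simulates the decision rather than executing anything.

```python
from typing import Sequence


class CommunityAccount:
    """Hypothetical model of a community account limited to a few applications."""

    def __init__(self, name: str, allowed_apps: Sequence[str]):
        self.name = name
        self.allowed_apps = frozenset(allowed_apps)

    def run(self, gateway_user: str, command: Sequence[str]) -> str:
        # The resource only sees the community account; the gateway user
        # name is carried along so it can appear in audit records.
        if not command or command[0] not in self.allowed_apps:
            attempted = command[0] if command else "<empty>"
            raise PermissionError(f"{attempted} is not a community application")
        return f"{self.name} runs {' '.join(command)} on behalf of {gateway_user}"


# "foo" stands in for a community application, as in the slides' "% foo".
account = CommunityAccount("sg_community", ["foo"])
print(account.run("alice", ["foo", "--input", "data.in"]))

try:
    account.run("alice", ["ls"])  # general shell commands are refused
except PermissionError as err:
    print("rejected:", err)
```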
Conceptual Model
[Diagram: three sets of user commands (% ls, % foo)]
SG AAAA Model
[Diagram labels: Authn; User-level Authz; User-level Audit; % ls, % foo; Community-level Authz; Job-level Audit; Accounting]
• Security functions held by the resource are now split between
resource and Science Gateway
• However there is a strong need to communicate between the two
• Resource will want full audit information and user information to
investigate suspicious activity
• SG needs accounting information to do allocations and reporting
(e.g. who is using the SG)
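The split and the two information flows can be sketched as follows. This assumes the gateway holds user-level authorization and audit while the resource holds community-level authorization, job-level audit and accounting; the class names, method names and job identifiers are all hypothetical, and authentication is taken as already done.

```python
from collections import defaultdict


class Resource:
    """Resource side: community-level authz, job-level audit, accounting."""

    def __init__(self, community_account):
        self.community_account = community_account
        self.job_audit = []  # job-level audit trail
        self.usage = {}      # accounting: job_id -> CPU hours

    def run(self, account, job_id, cpu_hours):
        if account != self.community_account:  # community-level authz
            raise PermissionError(f"unknown account {account}")
        self.job_audit.append(job_id)
        self.usage[job_id] = cpu_hours


class ScienceGateway:
    """Gateway side: user-level authz and user-level audit."""

    def __init__(self, account, authorized_users):
        self.account = account
        self.authorized_users = set(authorized_users)
        self.user_audit = {}  # user-level audit: job_id -> user

    def submit(self, user, job_id, cpu_hours, resource):
        if user not in self.authorized_users:  # user-level authz
            raise PermissionError(f"{user} is not a gateway user")
        self.user_audit[job_id] = user
        resource.run(self.account, job_id, cpu_hours)

    def who_ran(self, job_id):
        """What the resource asks when investigating suspicious activity."""
        return self.user_audit.get(job_id)

    def usage_by_user(self, resource):
        """Per-user totals the gateway derives from resource accounting."""
        totals = defaultdict(float)
        for job_id, hours in resource.usage.items():
            user = self.user_audit.get(job_id)
            if user is not None:
                totals[user] += hours
        return dict(totals)


resource = Resource("sg_community")
gateway = ScienceGateway("sg_community", ["alice", "bob"])
gateway.submit("alice", "job-1", 2.0, resource)
gateway.submit("bob", "job-2", 0.5, resource)
gateway.submit("alice", "job-3", 1.0, resource)
print(gateway.who_ran("job-2"))
print(gateway.usage_by_user(resource))
```

Neither side can answer both questions alone: the resource knows what ran and what it cost, the gateway knows who asked for it.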
Outstanding Challenges
• How to identify a job between SG and resource?
– “/bin/foo run at 15:38:13 (my time)” is not very
accurate
• Standard template for resource/SG agreement
– Akin to certificate policy
• Acceptance of group accounts
– Convince folks it's OK to outsource
• Restricted accounts
– Cookbook to restrict account to certain
applications
• Sandboxing of users from each other
• Community administrators
– Those who set up group account
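One possible way to approach the job identification problem, offered as an assumption rather than anything the talk proposes, is for the gateway to attach a random tag to each job and for both sides to log it. The function names and log record fields below are hypothetical.

```python
import uuid


def new_job_tag() -> str:
    """Return a random identifier the gateway attaches to one job."""
    return uuid.uuid4().hex


def match_records(gateway_log, resource_log):
    """Pair gateway and resource log entries that carry the same tag."""
    by_tag = {entry["tag"]: entry for entry in resource_log}
    return [(g, by_tag[g["tag"]]) for g in gateway_log if g["tag"] in by_tag]


tag = new_job_tag()
gateway_log = [{"tag": tag, "user": "alice", "app": "/bin/foo"}]
resource_log = [
    # Two jobs started in the same second: the timestamp cannot tell
    # them apart, but the tag can.
    {"tag": tag, "account": "sg_community", "start": "15:38:13"},
    {"tag": new_job_tag(), "account": "sg_community", "start": "15:38:13"},
]

pairs = match_records(gateway_log, resource_log)
print(len(pairs), pairs[0][0]["user"])
```

A shared tag avoids depending on synchronized clocks, but it still requires the resource to record whatever the gateway passes along, which is part of the agreement question raised above.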
Outstanding Challenges (cont)
• Each SG forms its own VO
– TeraGrid provides resources
– SG provides the users
• I’ve mostly talked about SG/TeraGrid relationship
• But how SGs will manage their users is open
– Authentication, Authorization, Contact
information… (the whole list Jill just gave)
– Users distributed over multiple domains
– Wanting to get into the 1000’s of users
– Different communities for each SG
• TeraGrid would like to help as much as possible
here as well
Questions?