SERVOJuly26-06.ppt


Grid-based Information Architecture for
iSERVO International Solid Earth Research
Virtual Organization
Western Pacific Geophysics Meeting (WPGM)
Beijing Convention Center
July 26 2006
Geoffrey Fox
Computer Science, Informatics, Physics
Pervasive Technology Laboratories
Indiana University Bloomington IN 47401
http://grids.ucs.indiana.edu/ptliupages/presentations/
[email protected]
http://www.infomall.org
APEC Cooperation for Earthquake Simulation

ACES is a seven-year-long collaboration among scientists
interested in earthquake and tsunami prediction
• iSERVO is the infrastructure to support the work of ACES
• SERVOGrid is a (completed) US Grid that is a prototype of iSERVO
• http://www.quakes.uq.edu.au/ACES/

ACES is chartered under APEC, the Asia Pacific Economic
Cooperation of 21 economies
Participating Institutions
• CSIRO, Australia
• Monash University, Australia
• University of Western Australia, Perth, Australia
• University of Queensland, Australia
• University of Western Ontario, Canada
• University of British Columbia, Canada
• China National Grid
• Chinese Academy of Sciences
• China Earthquake Administration
• China Earthquake Network Center
• Brown University
• Boston University
• Jet Propulsion Laboratory
• Cal State Fullerton
• San Diego State University
• UC Davis
• UC Irvine
• UC San Diego
• University of Southern California
• University of Minnesota
• Florida State University
• US Geological Survey
• Pacific Tsunami Warning Center (PTWC), Hawaii
• National Central University, Taiwan (Taiwan Chelungpu-fault Drilling Project)
• University of Tokyo
• Tokyo Institute of Technology (Titech)
• Sophia University
• National Research Institute for Earth Science and Disaster Prevention (NIED), Japan
• Geographical Survey Institute, Japan
Role of Information Technology
and Grids in ACES
• Numerical simulations of physical, biological and social systems
• Engineering design
• Economic analysis and planning
• Sensor networks and sensor webs
• High performance computing
• Data mining and pattern analysis
• Distance collaboration
• Distance learning
• Public outreach and education
• Emergency response communication and planning
• Geographic Information Systems
• Resource allocation and management
Grids and Cyberinfrastructure


Grids are the technology, based on Web services, that implements
Cyberinfrastructure, i.e. supports eScience, or science as a team
sport
• Internet-scale managed services that link computers, data
repositories, sensors, instruments and people
SERVOGrid provides a portal and services for
• Applications such as GeoFEST, RDAHMM, Pattern
Informatics, Virtual California (VC), Simplex, mesh
generating programs …
• Job management and monitoring Web services for running
the above codes (see the sketch below)
• File management Web services for moving files between
various machines
• Geographical Information System services
• The QuakeTables earthquake-specific database
• Sensors as well as databases
• Context (dynamic metadata) and UDDI (long-term
metadata) services
• Services supporting streaming real-time data
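
As a concrete illustration of how a portal or client might call one of
these services, here is a minimal sketch of a SOAP invocation of a
job-management Web service, using only the Python standard library.
The endpoint URL, namespace, and operation/parameter names are
hypothetical, not the actual SERVOGrid interface.

# Minimal sketch: invoking a job-management Web service with a
# hand-built SOAP envelope. Endpoint, namespace, and operation names
# are hypothetical illustrations, not the real SERVOGrid interface.
import urllib.request
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
JOB_NS = "http://servogrid.example.org/jobmanager"  # hypothetical

def submit_job(endpoint, application, input_url):
    env = ET.Element("{%s}Envelope" % SOAP_NS)
    body = ET.SubElement(env, "{%s}Body" % SOAP_NS)
    req = ET.SubElement(body, "{%s}submitJob" % JOB_NS)
    ET.SubElement(req, "{%s}application" % JOB_NS).text = application
    ET.SubElement(req, "{%s}inputUrl" % JOB_NS).text = input_url
    payload = ET.tostring(env, encoding="utf-8")  # SOAP request XML
    http_req = urllib.request.Request(
        endpoint, data=payload,
        headers={"Content-Type": "text/xml; charset=utf-8",
                 "SOAPAction": '""'})
    with urllib.request.urlopen(http_req) as resp:
        return resp.read().decode("utf-8")  # SOAP response XML

# e.g. submit a GeoFEST run (hypothetical endpoint and input URL):
# print(submit_job("http://host/services/JobManager", "GeoFEST",
#                  "http://host/data/fault_mesh.inp"))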
Grid of Grids: Research Grid and Education Grid
[Diagram: SERVOGrid as a Grid of Grids. A Sensor Grid (sensors,
streaming data, field trip data), a Database Grid (repositories,
federated databases), a Compute Grid (computer farm), a Discovery
Grid, and GIS, analysis and visualization services feed research
simulations through data filter services; customization services
carry results from research to education, linking the Research Grid
to an Education Grid, with a portal on top.]
SERVOGrid has a portal
The portal is built from portlets, which provide a user-interface
fragment for each service; these fragments are composed into the
full interface. It uses OGCE technology, as does the planetary
science VLAB portal built with the University of Minnesota.
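
The real OGCE portal is built on Java JSR168 portlets; purely to
illustrate the composition idea, the toy sketch below (in Python, with
invented portlet names and markup) shows a portal aggregating
independent user-interface fragments, one per service.

# Toy illustration of portlet composition: each service contributes an
# independent interface fragment; the portal only lays the fragments
# out, as a portlet container does. All names and markup are invented.

def job_status_portlet():
    return "<div><h3>Job Status</h3><p>GeoFEST run 42: finished</p></div>"

def map_portlet():
    return "<div><h3>Fault Map</h3><p>(WMS map image here)</p></div>"

def render_portal(portlets):
    # The portal knows nothing about a fragment's internals.
    fragments = "\n".join(p() for p in portlets)
    return ("<html><body><h1>SERVOGrid Portal</h1>\n"
            + fragments + "\n</body></html>")

print(render_portal([job_status_portlet, map_portlet]))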
Semantically Rich Services with a Semantically
Rich Distributed Operating Environment
[Diagram: services (S), filter services (FS), sensor services (SS)
and other services, each carrying metadata (MD), linked by SOAP
message streams; filter services successively refine raw data into
data, information, knowledge, wisdom and decisions.]

Grids of Grids Architecture
[Diagram: several Grids (e.g. a database Grid and other Grids), each
composed of services exchanging raw data over SOAP message streams;
a sensor service is the same as an outward-facing application
service.]




Linking Grids and Services
Linkage of services and Grids requires that messages sent by one
Grid/Service can be understood by another
Inside SERVOGrid all messages use
• Web service system standards we like (UDDI, WS-Context, WSDL,
SOAP) and
• GML as extended by WFS, so that data sources and simulations all
use the same syntax (see the sketch below)
All other Web-service-based Grids use their favorite Web service
system standards, but these differ from Grid to Grid
• Further, there is no agreement on application-specific standards;
not all Earth Science Grids use OGC standards
• OGC standards include some capabilities overlapping general Web
Services
• Use of WSDL and SOAP is agreed, although there are versioning
issues
So there is essentially no service-level interoperability between
Grids; rather, interoperation is at diverse levels with shared
technology
• SQL for databases, PBS for job scheduling, Condor for job
management, GT4 or Unicore for Grids
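
The sketch below illustrates the shared-syntax point: a client asks a
Web Feature Service (WFS) for features and gets GML back, whatever the
underlying data source is. It uses the standard WFS 1.0.0 GetFeature
key-value-pair request; the service URL and feature type name are
hypothetical.

# Minimal sketch: fetching GML features from a WFS (OGC Web Feature
# Service) over HTTP GET. Service URL and type name are hypothetical.
import urllib.parse
import urllib.request

def get_features(wfs_url, type_name, bbox):
    params = {
        "service": "WFS",
        "version": "1.0.0",
        "request": "GetFeature",
        "typename": type_name,
        "bbox": ",".join(str(c) for c in bbox),  # minx,miny,maxx,maxy
    }
    url = wfs_url + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8")  # a GML feature collection

# e.g. California fault traces (hypothetical service and type name):
# gml = get_features("http://host/wfs", "quaketables:fault",
#                    (-125.0, 32.0, -114.0, 42.0))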
Grids in Babylon

Presumptuous Tower of Babel (from the web)
• In the Bible, a city (now thought to be Babylon) in Shinar where God
confounded a presumptuous attempt to build a tower into heaven by
confusing the language of its builders into many mutually
incomprehensible languages.

For Grids, everybody likes to do their own thing, and Grids are
complex multi-level entities with no obvious points of
interoperation
• so one does not need divine intervention to create multiple Grid
specifications
• But data in China, tsunami sensors in the Indian Ocean, simulations in
the USA etc. will not be linked for better warning and forecasting unless
the national efforts can interoperate

Two interoperation strategies:
• Make all Grids use the same specifications (divine harmony)
• Build translation services (filters!) using, say, OGF standards as a
common target language (more practical; see the sketch below)

We don't need computers (jobs) to be interoperable (although this
would be good), as each country does its own computing
• Rather, we need data and some metadata on each Grid to be accessible
from all Grids
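
A minimal sketch of the translation-filter strategy: map one Grid's
native job-description vocabulary onto a common target vocabulary (for
jobs, an OGF standard such as JSDL would be a natural target), so
Grids never need to understand each other directly, only the shared
target. Both vocabularies below are invented for illustration.

# Minimal sketch: a filter that translates a hypothetical national
# Grid's job message into a common target schema. Field names on both
# sides are invented; a real filter would target e.g. OGF JSDL XML.
NATIVE_TO_COMMON = {
    "executable_path": "Executable",
    "argv": "Arguments",
    "n_cpus": "ProcessCount",
}

def to_common(job_msg):
    """Translate a native job message into the common vocabulary."""
    return {common: job_msg[native]
            for native, common in NATIVE_TO_COMMON.items()
            if native in job_msg}

native = {"executable_path": "/opt/geofest/bin/geofest",
          "argv": ["fault.inp"], "n_cpus": 16}
print(to_common(native))
# {'Executable': '/opt/geofest/bin/geofest',
#  'Arguments': ['fault.inp'], 'ProcessCount': 16}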
Interoperability Summary

Need to define common infrastructure and domain-specific
standards
• Build interoperable infrastructure gatewayed to existing legacy
applications and Grids
Generic middleware
• Grid software including workflow
• Portals/problem-solving environments including visualization
• We need to ensure that we can make security, job submission,
portal, and data access (sharing) mechanisms in different economies
interoperate
Geographic Information Systems (GIS)
• Use services as defined by the Open Geospatial Consortium (Web
Map and Feature Services) http://www.crisisgrid.net/
Earthquake/tsunami science specific
• Satellites, sensors (GPS, seismic)
• Fault, tsunami … characteristics stored in databases need
GML extensions; the schema for QuakeTables developed by
SERVOGrid can be used internationally (see the sketch below)
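
A minimal sketch of what such a GML extension might look like, in the
spirit of the QuakeTables schema: a fault becomes a GML feature whose
geometry is a gml:LineString plus domain properties. The qt: namespace,
element names, and fault values are hypothetical illustrations, not the
actual QuakeTables schema.

# Minimal sketch: building a GML application-schema fault feature.
# Everything outside the standard gml: namespace is hypothetical.
import xml.etree.ElementTree as ET

GML = "http://www.opengis.net/gml"               # standard GML namespace
QT = "http://servogrid.example.org/quaketables"  # hypothetical

ET.register_namespace("gml", GML)
ET.register_namespace("qt", QT)

fault = ET.Element("{%s}Fault" % QT)
ET.SubElement(fault, "{%s}faultName" % QT).text = "Example Fault"
ET.SubElement(fault, "{%s}slipRateMmPerYear" % QT).text = "1.5"
geom = ET.SubElement(fault, "{%s}lineStringProperty" % GML)
line = ET.SubElement(geom, "{%s}LineString" % GML)
# GML 2 coordinates: "lon,lat lon,lat ..." pairs along the fault trace
ET.SubElement(line, "{%s}coordinates" % GML).text = \
    "-118.55,34.35 -118.42,34.28"

print(ET.tostring(fault, encoding="unicode"))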
ACES Components

Country and/or Economy | Data (shared as part of a collaboration) | Earthquake Forecast/Model | Wave Motion | Infrastructure
Australia | Seismic data, fault database, GPS | Finley, LSM | PANDAS prototype | ACcESS
Canada | Polaris, Radarsat | Pattern Informatics | |
P.R. China | Seismic, GPS | LURR | | CAS, China National Grid
Japan | GPS, Seismic, Daichi (InSAR) | GeoFEM | JST-CREST | Earth Simulator, Naregi
Chinese Taipei | FORMOSAT3/COSMIC (F/C) | | |
U.S.A. | QuakeTables, Seismic, InSAR, PBO (GPS) | Pattern Informatics, ALLCAL, GeoFEST, PARK, Virtual California | TeraShake | SERVOGrid, GEON, SCECGrid, VLAB
International | IMS | | | Pacific Rim Universities (APRU), PRAGMA
National Earthquake Grids of Relevance

• APAC – GT2, GT4, gLite
• ACcESS – some link to SERVOGrid
• China National Grid – GOS, GT3, GT4
• ChinaGrid – CGSP built on GT4
• CNGI – China's Next Generation Internet has a significant
earthquake data component
• Naregi – uses GT4 and Unicore with many enhancements
• Japanese Earthquake Simulation Grid – unclear
• K*Grid Korea – enhanced SRB, GT2 to GT4
• TIGER (Taiwan Integrated Grid for Education and Research) –
unclear technology and unclear earthquake relevance
• SERVOGrid – uses WS-I+ simple Web Services
• TeraGrid – uses GT4 but no clear model except for core job
submittal
TeraGrid: Integrating NSF Cyberinfrastructure
[Map: TeraGrid sites – Buffalo, Wisc, UC/ANL, Utah, Cornell, Iowa,
PU, NCAR, IU, NCSA, Caltech, PSC, ORNL, USC-ISI, UNC-RENCI, SDSC,
TACC]
TeraGrid is a facility that integrates computational, information, and analysis resources at the
San Diego Supercomputer Center, the Texas Advanced Computing Center, the University of
Chicago / Argonne National Laboratory, the National Center for Supercomputing Applications,
Purdue University, Indiana University, Oak Ridge National Laboratory, the Pittsburgh
Supercomputing Center, and the National Center for Atmospheric Research.
Today 100 teraflops; tomorrow a petaflop; Indiana 20 teraflops today.
APAC National Grid
[Diagram: Core Grid Services –
• Portal tools: GridSphere (QPSF)
• Info services: APAC Registry (JCU), INCA2?
• Security: APAC CA, MyProxy, VOMRS
• Systems: QPSF, APAC National Facility, IVEC, ANU, SAPAC, ac3,
VPAC, TPAC, CSIRO, with gateways to partners' systems
• Network: GrangeNet / AARNet, APAC Private Network (AARNet)
ACcESS at UQ (ACES partner) sits outside APAC.]
National “Grid Projects” in China
[Diagram: under the China e-Nation Strategy (2006-2020), grid
projects span the lifecycle from plan, research, develop, and
production to procure, deploy, operate, and manage. Projects include
the Virtual Computing Environment, CAS eScience, the Net-based
Research Environment, the China National Grid, the Semantic Grid,
the Education & Research Grid, the Next-Generation Network
Initiative, and the Science and Technology R&D Assets Foundation
Platform; funding (from €M's to €10M's) comes from NSFC, CAS, the
State Council, MoE, MoST, and the National Planning Commission.]
Grid activities still growing
CNGrid (2006-2010)
• HPC Systems
  – 100 Tflop/s by 2008, Pflop/s by 2010?
• Grid Software Suite: CNGrid GOS
  – Merge with international efforts
  – Emphasize production
• CNGrid Environment
  – Nodes, Centers, Policies
• Applications
  – Science
  – Resource & Environment
  – Manufacturing
  – Services
  – Domain Grids
Cyber Science Infrastructure toward Petascale
Computing (planned 2006-2011)
[Diagram: NII's Cyber-Science Infrastructure (CSI). A Collaborative
Operation Center (IT infrastructure for academic research and
education) develops and delivers NAREGI middleware (β version, V1.0,
V2.0) and runs operation/maintenance of the middleware and of
UPKI/CA security services over the Super-SINET networking
infrastructure. R&D collaboration, proof-of-concept and evaluation
in the nano field with IMS (Institute for Molecular Science) and
AIST, and joint projects with Osaka-U and IMS (bio), feed back into
the middleware. Virtual organizations include project-oriented VOs,
domain-specific VOs (e.g. ITBL), university/national supercomputing
VOs (IMS, AIST, KEK, NAO, etc.), a peta-scale system VO, and
industrial projects (VO names are tentative). International
collaboration: EGEE, UNIGRIDS, TeraGrid, GGF etc.]
Japanese Earthquake Simulation Grid
[Diagram: an Integrated Observation-Simulation Data Grid over Super
SINET (10 Gbps) links data servers at NIED (48xG5, 15 TB) and GSI
(8xOpteron, 20 TB), the Earth Simulator (5,120xSX6), and PC clusters
at ERI and EPS (64xOpteron each, running paraAVS).]
JST-CREST Integrated Predictive Simulation System:
Strong Motion and Tsunami Generation
[Diagram: earthquake generation (plate motion, tectonic loading,
earthquake rupture) drives tsunami generation, wave propagation,
artificial structure oscillation, and crustal movement. Data
analysis of seismic activity, strong motion, and structure
oscillation from the GONET, Hi-net, and K-NET observation networks
feeds a database for model construction. A platform for integrated
simulation (data processing, visualization, linear solvers) runs on
PC clusters for small-to-intermediate problems and the Earth
Simulator for large-scale problems, combining GIS urban information
with simulation output.]
Current PTWC Network of Seismic Stations
(from GSN & USNSN & Other Contributing Networks)
The NCES/WS-*/GS-* Features/Service Areas I

Service or Feature | WS-* | GS-* | NCES (DoD) | Comments
A: Broad Principles | | | |
FS1: Use SOA (Service Oriented Architecture) | WS1 | | | Core service architecture; build Grids on Web Services; industry best practice
FS2: Grid of Grids | | | | Strategy for legacy subsystems: modular architecture
B: Core Services (mainly service infrastructure; W3C/OASIS focus) | | | |
FS3: Service Internet, Messaging | WS2 | | NCES3 | Core infrastructure including reliability, publish-subscribe messaging; cf. FS13C
FS4: Notification | WS3 | | NCES3 | JMS, MQSeries, WS-Eventing, Notification
FS5: Workflow | WS4 | | NCES5 | Grid programming
FS6: Security | WS5 | | NCES2 | Grid-Shib, Permis, Liberty Alliance ...
FS7: Discovery | WS6 | | NCES4 | UDDI and extensions
FS8: System Metadata & State | WS7 | GS7 | | Globus MDS; Semantic Grid, WS-Context
FS9: Management | WS8 | GS6 | NCES1 | CIM
FS10: Policy | WS9 | | ECS |
FS11: Portals and Users | WS10 | | NCES7 | Portlets JSR168, NCES Capability Interfaces
The NCES/WS-*/GS-* Features/Service Areas II

Service or Feature | WS-* | GS-* | NCES | Comments
B: Core Services (mainly higher level; OGF focus) | | | |
FS12: Computing | | GS3 | | Job management; major Grid focus
FS13A: Data as Repositories: Files and Databases | | GS4 | NCES8 | Distributed files, OGSA-DAI; managed data is FS14B
FS13B: Data as Sensors and Instruments | | | | OGC SensorML
FS13C: Data Transport | WS2,3 | GS4 | NCES3,8 | GridFTP or WS interface to non-SOAP transport
FS14A: Information as Monitoring | | GS4 | | Major Grid effort, for job status etc.
FS14B: Information, Knowledge, Wisdom part of D(ata)IKW | | GS4 | NCES8 | VOSpace for IVOA, JBI for DoD, WFS for OGC; federation at this layer is a major research area; NCOW Data Strategy
FS15: Applications and User Services | | GS2 | NCES9 | Standalone services; proxies for jobs
FS16: Resources and Infrastructure | | GS5 | | Ad-hoc networks; network monitoring
FS17: Collaboration and Virtual Organizations | | GS7 | NCES6 | XGSP, shared Web Service ports
FS18: Scheduling and matching of Services and Resources | | GS3 | | Current work only addresses scheduling "batch jobs"; need networks and services