
NAREGI Middleware Beta 1 and
Beyond
Satoshi Matsuoka
Professor, Global Scientific Information and
Computing Center,
Deputy Director, NAREGI Project
Tokyo Institute of Technology / NII
http://www.naregi.org
The Titech TSUBAME Production
Supercomputing Cluster, Spring 2006
• Sun Galaxy 4 (Opteron dual-core, 8-way): 10,480 cores / 655 nodes, 50.4 TeraFlops
• OS: Linux (SuSE 9, 10), NAREGI Grid MW
• 7th on the June 2006 Top500 at 38.18 TFlops
• Voltaire ISR9288 InfiniBand: 10 Gbps x2 (xDDR) x ~700 ports, unified IB network; 10 Gbps+ external network
• Storage: 1 Petabyte (Sun “Thumper”, 48-disk x 500 GB units) + 0.1 Petabyte (NEC iStore); Lustre FS, NFS (v4?)
• ClearSpeed CSX600 SIMD accelerator: 360 boards, 35 TeraFlops (current)
• NEC SX-8 small vector nodes (under plan)
• ~80+ racks, 350 m² floor area, 1.2 MW (peak)
Titech Supercomputing Grid 2006
• ~13,000 CPUs, 90 TeraFlops, ~26 TeraBytes memory, ~1.1 Petabytes disk
• CPU cores: x86: TSUBAME (~10,600), Campus Grid Cluster (~1,000), COE-LKR cluster (~260), WinCCS (~300), plus ClearSpeed CSX600 (720 chips)
• Ookayama campus: TSUBAME, WinCCS, COE-LKR (knowledge) cluster, Computational Engineering Center (planned), Mathematical/Computational Center (planned)
• Suzukakedai campus (35 km, 10 Gbps link; 1.2 km on-campus links): Campus Grid Cluster
University Computer Centers
(excl. National Labs) circa Spring 2006
~60 SC centers in Japan, interconnected by 10 Gbps SuperSINET:
• Hokkaido University, Information Initiative Center: HITACHI SR11000, 5.6 Teraflops
• Tohoku University, Information Synergy Center: NEC SX-7, NEC TX7/AzusA
• University of Tsukuba: FUJITSU VPP5000, CP-PACS 2048 (SR8000 proto)
• University of Tokyo, Information Technology Center: HITACHI SR8000, HITACHI SR11000, 6 Teraflops; others (in institutes)
• National Inst. of Informatics: SuperSINET/NAREGI Testbed, 17 Teraflops
• Tokyo Inst. Technology, Global Scientific Information and Computing Center: 2006 NEC/Sun TSUBAME, 85 Teraflops
• (A 10 Petaflop center is planned by 2011)
• Nagoya University, Information Technology Center: FUJITSU PrimePower2500, 11 Teraflops
• Kyoto University, Academic Center for Computing and Media Studies: FUJITSU PrimePower2500, 10 Teraflops
• Osaka University, CyberMedia Center: NEC SX-5/128M8, HP Exemplar V2500/N, 1.2 Teraflops
• Kyushu University, Computing and Communications Center: FUJITSU VPP5000/64, IBM Power5 p595, 5 Teraflops
Scaling Towards Petaflops…
(peak-performance timeline, 2002–2012, 1 TF to >10 PF)
• 2002: Earth Simulator 40 TF; Titech Campus Grid 1.3 TF
• 2005: BlueGene/L 360 TF
• 2006: Titech Supercomputing Campus Grid (incl. TSUBAME) ~90 TF; KEK 59 TF (BG/L + SR11000)
• 2006~7: Korean machine >100 TF
• 2007~8: Chinese national machine >100 TF; US next-gen petascale 1 PF
• 2008-2H: TSUBAME upgrade >200 TF — the interim 200 TeraFlops step for the 2010 Titech “PetaGrid”
• 2010: Titech “PetaGrid” 1 PF => “Petascale” @ 2010 — the NORM for a typical Japanese center?; US HPCS
• 2011: “Keisoku” >10 PF; US 10 PF (2011~12?)
→ HPC Software is the key!
Nano-Science: coupled simulations on the Grid
as the sole future for true scalability
… between Continuum & Quanta.
• Material physics (infinite system, ~10⁻⁶ m scale): fluid dynamics, statistical physics, condensed matter theory, …
• Molecular science (~10⁻⁹ m scale): quantum chemistry, molecular orbital methods, molecular dynamics, …
• Between the two lie the limit of idealization and the limit of computing capability — the domain of multi-physics.
• Old HPC environment: decoupled resources, limited users, special software, …
• New approach: meta-computing, high-throughput computing, and multi-physics simulation with components and data from different groups within a VO, composed in real time.
The only way to achieve true scalability!
LifeCycle of Grid Apps and Infrastructure
• VO application developers & managers author high-level workflows (NAREGI WFML) and coupled apps for many VO users.
• User apps run in GridVM containers via GridRPC/GridMPI (meta-computing), scheduled by the SuperScheduler together with the Distributed Grid Info Service.
• Data 1 … Data n are placed and registered on the Grid across distributed servers, with metadata assigned via the Application Contents Service.
• The Grid-wide Data Management Service provides GridFS, metadata, staging, etc.
NAREGI Software Stack (beta 1 2006)
- WS(RF)-based (OGSA) SW stack
• Grid-Enabled Nano-Applications (WP6)
• WP3: Grid PSE, Grid Workflow (WFML (Unicore+ WF)), Grid Visualization
• WP1: Super Scheduler, Distributed Information Service (CIM), Grid VM
• WP2: Grid Programming — GridRPC, GridMPI
• WP4: Data; packaging
• WP5: Grid Security and High-Performance Grid Networking
• Hosting: WSRF (GT4 + Fujitsu WP1) + GT4 and other services
• Infrastructure: SuperSINET; computing resources and virtual organizations at NII, IMS, research organizations, and major university computing centers
List of NAREGI “Standards”
(beta 1 and beyond)
• GGF standards and pseudo-standard activities set/employed by NAREGI:
GGF “OGSA CIM profile”
GGF AuthZ
GGF DAIS
GGF GFS (Grid Filesystems)
GGF Grid CP (GGF CAOPs)
GGF GridFTP
GGF GridRPC API (as Ninf-G2/G4)
GGF JSDL
GGF OGSA-BES
GGF OGSA-Byte-IO
GGF OGSA-DAI
GGF OGSA-EMS
GGF OGSA-RSS
GGF RUS
GGF SRM (planned for beta 2)
GGF UR
GGF WS-I RUS
GGF ACS
GGF CDDLM
• Other industry standards employed by NAREGI:
ANSI/ISO SQL
DMTF CIM
IETF OCSP/XKMS
MPI 2.0
OASIS SAML2.0
OASIS WS-Agreement
OASIS WS-BPEL
OASIS WSRF2.0
OASIS XACML
• De facto standards / commonly used software platforms employed by NAREGI:
Ganglia
GFarm 1.1
Globus 4 GRAM
Globus 4 GSI
Globus 4 WSRF (also Fujitsu WSRF for C binding)
IMPI (as GridMPI)
Linux (RH8/9 etc.), Solaris (8/9/10), AIX, …
MyProxy
OpenMPI
Tomcat (and associated WS/XML standards)
Unicore WF (as NAREGI WFML)
VOMS
Implement “specs” early, even if nascent, if seemingly viable — necessary for longevity and vendor buy-in, and a metric of WP evaluation.
Highlights of NAREGI Beta (May 2006,
GGF17/GridWorld)
• Professionally developed and tested
• “Full” OGSA-EMS incarnation
– Full C-based WSRF engine (Java -> Globus 4)
– OGSA-EMS/RSS WSRF components
– GGF JSDL1.0-extension job submission, authorization, etc.
– Support for more OSes (AIX, Solaris, etc.) and BQs
• Sophisticated VO support for identity/security/monitoring/accounting (extensions of VOMS/MyProxy, WS-* adoption)
• WS-based application deployment support via GGF-ACS
• Comprehensive data management w/ Grid-wide FS
• Complex workflow (NAREGI-WFML) for various coupled simulations
• Overall stability/speed/functional improvements
• To be interoperable with EGEE, TeraGrid, etc. (beta 2)
• Release next week at GGF17, press conferences, etc.
Ninf-G: A Reference Implementation
of the GGF GridRPC API
• What is GridRPC?
– A programming model using RPCs on a Grid
– Provides an easy and simple programming interface
– The GridRPC API is published as a proposed recommendation (GFD-R.P 52)
• What is Ninf-G?
– A reference implementation of the standard GridRPC API
– Built on the Globus Toolkit
– Now in NMI Release 8 (the first non-US software in NMI)
• Easy three steps to make your program Grid-aware:
– Write an IDL file that specifies the interface of your library
– Compile it with the IDL compiler, ng_gen
– Modify your client program to use the GridRPC API
(Figure: the user ① calls remote procedures over the Internet and ② is notified of results — utilization of remote supercomputers, calling remote libraries, and large-scale computing across supercomputers on the Grid.)
GridMPI
• MPI applications run on the Grid environment
• Metropolitan-area, high-bandwidth environment — 10 Gbps, ~500 miles (under 10 ms one-way latency):
– Parallel computation
• Larger than metropolitan area:
– MPI-IO
(Figure: a single (monolithic) MPI application running over the Grid, spanning computing resources at site A and site B across a wide-area network.)
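The 10 ms one-way bound above can be sanity-checked from propagation delay alone. A minimal sketch, using a typical fiber refractive index of ~1.5 (an illustrative figure, not a NAREGI measurement):

```python
# Rough check of the 10 ms one-way latency bound for a ~500 mile
# metropolitan-area link, counting only light propagation in fiber.

C_VACUUM_KM_S = 299_792.458   # speed of light in vacuum (km/s)
FIBER_INDEX = 1.5             # typical refractive index of optical fiber (assumed)

def one_way_fiber_latency_ms(distance_km: float) -> float:
    """Propagation-only one-way delay in milliseconds."""
    speed_km_s = C_VACUUM_KM_S / FIBER_INDEX
    return distance_km / speed_km_s * 1000.0

miles = 500
km = miles * 1.609344
print(f"{miles} miles ~= {km:.0f} km -> "
      f"{one_way_fiber_latency_ms(km):.1f} ms one-way (propagation only)")
```

This comes to roughly 4 ms, comfortably inside the 10 ms budget; queueing and router hops consume the rest.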
Grid Application Environment
(WP3)
• NAREGI Portal: portal GUIs for each VO (e.g., Bio VO, Nano VO)
• Grid PSE: deployment UI, register UI
• Grid Workflow: workflow GUI
• Grid Visualization: visualization GUI
Gateway services:
• Deployment Service (compile, deploy, un-deploy) with the Application Contents Service (GGF-ACS) and Application Repository
• Workflow Service: NAREGI-WFML with the JM I/F module, emitting BPEL+JSDL; file/execution manager
• Visualization Service: CFD and molecular visualizers, viewer, parallel visualizer
Underlying grid services: File Transfer (RFT), Grid File System, Workflow Engine & Super Scheduler, Distributed Information Service, VOMS, MyProxy, and other WSRF core grid services.
WP-3: User-Level Grid Tools & PSE
• Grid PSE
- Deployment of applications on the Grid
- Support for execution of deployed applications
• Grid Workflow
- Workflow language independent of specific Grid middleware
- GUI in task-flow representation
• Grid Visualization
- Remote visualization of massive data distributed over the Grid
- General Grid services for visualization
(All built on the Grid middleware.)
The NAREGI SSS Architecture
(2007/3)
• PETABUS (Peta Application Services Bus): application-specific services
• WESBUS (Workflow Execution Services Bus): the NAREGI SSS — EPS, JM, CSG — and the BPEL Interpreter Service
• CESBUS (Co-allocation Execution Services Bus; a.k.a. BES+ bus): FTS-SC, GRAM-SC, UniGridS-SC, AGG-SC with RS (aggregate SCs)
• BESBUS (Basic Execution Services Bus): Globus WS-GRAM I/F (with reservation) over GridVM; UniGridS atomic services (with reservation); grid resources
NAREGI beta 1 SSS Architecture
An extended OGSA-EMS Incarnation
• The NAREGI-WP3 WorkFlowTool, PSE, and GVS talk to the NAREGI JM(SS) Java I/F module (JM-Client) with Submit/Status/Cancel/Delete requests.
• WFML2BPEL converts the workflow to BPEL (including JSDL), and BPEL2WFST maps results back; the JM-Client drives the SS via CreateActivity(FromBPEL), GetActivityStatus, and RequestActivityStateChanges.
• The NAREGI JM (BPEL engine) invokes the EPS (SelectResourceFromJSDL), which consults the CSG (GenerateCandidateSet) and the IS; these are backed by OGSA-DAI, a CIM-based schema, and PostgreSQL (is-query; SQL queries generated from JSDL; GetGroupsOfNodes), plus the GNIS DB.
• Reservation goes through the AGG-SC/RS (MakeReservation, CancelReservation), which forwards JSDL over the CES I/F to the per-site SCs (GVM-SC).
• Each GVM-SC forks/execs GRAM4-specific globusrun-ws to submit through WS-GRAM to GridVM, which drives the local scheduler (PBS, LoadLeveler).
• The FTS-SC forks/execs uber-ftp / globus-url-copy for co-allocation and file transfer to/from the GFarm server.
Abbreviations:
SS: Super Scheduler
JSDL: Job Submission Description Language
JM: Job Manager
EPS: Execution Planning Service
CSG: Candidate Set Generator
RS: Reservation Service
IS: Information Service
SC: Service Container
AGG-SC: Aggregate SC
GVM-SC: GridVM SC
FTS-SC: File Transfer Service SC
BES: Basic Execution Service I/F
CES: Co-allocation Execution Service I/F (BES+)
CIM: Common Information Model
GNIS: Grid Network Information Service
3, 4: Co-allocation and Reservation
A meta-computing scheduler is required to allocate and execute jobs on multiple sites simultaneously. The super scheduler negotiates with local reservation services (RSs) on job execution time and reserves resources that can execute the jobs simultaneously.
Numbered flow in the figure (Super Scheduler with Execution Planning Services and a Reservation Service with meta-scheduling):
① An abstract JSDL (10) arrives at the SS.
② The Candidate Set Generator returns abstract candidates: Local RS 1 EPR (8), Local RS 2 EPR (6).
③④ The SS creates agreement instances with Local RS 1 and Local RS 2.
⑤ Concrete JSDLs are produced: (8) for cluster (site) 1 and (2) for cluster (site) 2.
⑥ Local RS 1 offers 14:00- (3:00); ⑦ the common window 15:00-18:00 is negotiated.
⑧⑨⑩ Concrete reservations for 15:00-18:00 are made at the service containers of both clusters through their local RSs and GridVM, and registered with the Distributed Information Service.
⑪ The abstract agreement instance EPR is returned.
(Local RS #: Local Reservation Service #)
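The negotiation above boils down to intersecting the free windows offered by each local RS and picking the earliest slot long enough for the job. A toy sketch (the window lists and intersection rule are illustrative, not the actual NAREGI WS-Agreement protocol):

```python
# Toy co-allocation: a "super scheduler" intersects the free time windows
# reported by each local reservation service and returns the earliest slot
# in which all sites can run the co-allocated job simultaneously.

def intersect(w1, w2):
    """Overlap of two [start, end) windows, or None if they do not overlap."""
    start, end = max(w1[0], w2[0]), min(w1[1], w2[1])
    return (start, end) if start < end else None

def co_allocate(site_windows, duration):
    """Earliest window of `duration` hours common to all sites, or None."""
    common = site_windows[0]
    for windows in site_windows[1:]:
        common = [iv for a in common for b in windows
                  if (iv := intersect(a, b))]
    for start, end in sorted(common):
        if end - start >= duration:
            return (start, start + duration)
    return None

# Local RS 1 is free 14:00-20:00; Local RS 2 is free 15:00-18:00 and 19:00-22:00.
slot = co_allocate([[(14, 20)], [(15, 18), (19, 22)]], duration=3)
print(slot)  # (15, 18) -- both clusters reserved 15:00-18:00, as in the figure
```

The real protocol additionally creates WS-Agreement instances per site and can cancel reservations (CancelReservation) if no common window is found.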
NAREGI Info Service (beta) Architecture
・ The CIMOM service classifies info according to a CIM-based schema.
・ The info is aggregated and accumulated in RDBs hierarchically.
・ The client library utilizes the OGSA-DAI client toolkit.
・ Accounting info is accessed through RUS.
Each Information Service Node runs an Aggregator Service, a Data Service with RDB and ACL, and a Resource Usage Service (chargeable service; RUS::insertURs), queried in parallel by clients (a viewer for users/admins, the Java API, a resource broker, etc.) through the client library.
Cell Domain Information Services on the nodes below (Node A, B, C, …) perform hierarchical filtered aggregation from a lightweight CIMOM whose CIM providers (OS, processor, file system, job queue, performance) are fed by publishers such as GridVM and Ganglia.
NAREGI IS: Standards Employed in the
Architecture
• Information Service Node: Aggregator Service and RDB/ACL data service on OGSA-DAI WSRF2.1, with a WS-I RUS (chargeable service; RUS::insertURs) recording GGF UR records; hosted on Tomcat 5.0.28 and GT4.0.1.
• Clients — a viewer for users/admins, applications via the Java API, and the Distributed Information Service (OGSA-RSS etc.) — use a client library built on the OGSA-DAI client toolkit, issuing distributed queries across nodes.
• Cell Domain Information Services (Node A, B, C, …) perform hierarchical filtered aggregation from a lightweight CIMOM (CIM spec. 2.10 with extensions, CIM/XML) whose CIM providers (OS, processor, file system, job queue, performance) are fed by GridVM (OGSA-BES etc.) and Ganglia.
GridVM Features
• Platform independence as an OGSA-EMS SC
– WSRF OGSA-EMS Service Container interface for heterogeneous platforms and local schedulers
– “Extends” Globus4 WS-GRAM
– Job submission using JSDL
– Job accounting using UR/RUS
– CIM provider for resource information
• Meta-computing and coupled applications
– Advance reservation for co-allocation
• Site autonomy
– WS-Agreement-based job execution (beta 2)
– XACML-based access control of resource usage
• Virtual Organization (VO) management
– Access control and job accounting based on VOs (VOMS & GGF-UR)
NAREGI GridVM (beta) Architecture
• A virtual execution environment on each site
– Virtualization of heterogeneous resources
– Resource and job management services with a unified I/F
Each site (e.g., AIX/LoadLeveler, Linux/PBSPro) runs a GRAM4 WSRF I/F in front of the GridVM scheduler, the local scheduler, and the GridVM engine, with site policy and accounting. The Super Scheduler performs advance reservation, monitoring, and control; the Information Service collects resource info; jobs execute in sandboxes, linked across sites by GridMPI.
NAREGI GridVM: Standards Employed
in the Architecture
• GT4 GRAM integration and WSRF-based extension services
• Job submission based on JSDL and NAREGI extensions
• CIM-based resource info provider (to the Information Service)
• UR/RUS-based job accounting
• XACML-like access control policy (site policy at each GridVM)
(As in the architecture figure: the Super Scheduler submits through the GRAM4 WSRF I/F to the GridVM scheduler, local scheduler, and GridVM engine at each site, with GridMPI spanning sites.)
GT4 GRAM-GridVM Integration
• Integrated as an extension module to GT4 GRAM
• Aims to make both sets of functionality available
The SS submits RSL+JSDL’ via globusrun to the site, where the GridVMJobFactory and GridVMJob extension services add basic job management plus authentication and authorization. They delegate to the GRAM services — Delegation, transfer requests to RFT File Transfer, the SUDO GRAM adapter, and the Scheduler Event Generator — which drive the GridVM scheduler and GridVM engine over the local scheduler (PBS-Pro, LoadLeveler, …).
Next Steps for WP1 – Beta2
• Stability, robustness, ease of install
• Standard-setting core OGSA-EMS: OGSA-RSS, OGSA-BES/ESI, etc.
• More supported platforms (VM)
– SX series, Solaris 8-10, etc.
– More batch queues: NQS, N1GE, Condor, Torque
• “Orthogonalization” of the SS, VM, IS, and WSRF components
– Better, more orthogonal WSRF APIs; minimized sharing of state
• E.g., reservation APIs, event-based notification
– Mix-and-match of multiple SS/VM/IS/external components, with many benefits:
• Robustness
• Better and more realistic center-VO support
• Better interoperability with external grid MW stacks, e.g., Condor-C
VO and Resources in Beta 2
Decoupling of WP1 components for pragmatic VO deployment:
• Resource organizations RO1, RO2, RO3 each run their own IS; application VOs (VO-APL1, VO-APL2) and resource-organization VOs (VO-RO1, VO-RO2) each get an SS and IS of their own.
• Each GridVM node (A.RO1, B.RO1, …, N.RO1; a.RO2, b.RO2, …, n.RO2) carries its own IS and a per-node policy listing the VOs it accepts (e.g., VO-RO1 only; VO-RO1 + VO-APL1; VO-RO1 + VO-APL1 + VO-APL2; VO-APL2 + VO-RO2; VO-RO2 only).
• Clients enter through the VO-level SSs.
NAREGI Data Grid beta1 Architecture
(WP4)
• Grid Workflow: jobs (Job 1 … Job n) import data into the workflow.
• Data Grid components: Data Access Management, Metadata Management, Data Resource Management, and the Grid-wide File System.
• Data 1 … Data n are placed and registered on the Grid, metadata is assigned to each, and the data are stored on distributed file nodes.
• Together these form the Grid-wide Data Sharing Service — currently GFarm v.1.x.
NAREGI WP4: Standards Employed in the
Architecture
• Workflow (NAREGI WFML => BPEL+JSDL) imports data into the workflow; the Super Scheduler (SS) uses OGSA-RSS.
• Data Access Management and Metadata Construction run on Tomcat 5.0.28 / Globus Toolkit 4.0.1 with OGSA-DAI WSRF2.0.
• Data Resource Management: Data Resource Information DB and Data-Specific Metadata DB on PostgreSQL 8.0.
• Data staging via the OGSA-RSS FTS SC with GridFTP; GGF-SRM planned for beta 2.
• Filesystem nodes: Gfarm 1.2 PL4 (Grid FS) serving Data 1 … Data n to the computational nodes (Job 1 … Job n).
NAREGI-beta1 Security
Architecture (WP5)
• Client environment: WFT, Portal, PSE, SS client, and GVS obtain a VOMS proxy certificate.
• The User Management Server (UMS) holds the user certificate and private key, combines MyProxy+ and VOMS, and issues VOMS proxy certificates; the NAREGI CA issues the underlying certificates.
• The Super Scheduler and GridVMs operate with the VOMS proxy certificate; the Data Grid’s Grid File System (AIST Gfarm) spans the disk nodes.
NAREGI-beta1 Security Architecture
WP5 — the standards
• CA service: NAREGI CA, with a CP/CPS following the GRID CP (GGF CAOPs) and a subset of the WebTrust Programs for CA audit criteria.
• Management service (MyProxy and VOMS on the certificate management server): the user logs in with ssh + voms-myproxy-init, requests/gets certificates (user certificate, private key), puts a proxy certificate, and gets a VOMS proxy certificate with VO attributes.
• Information Service: holds resource info including VO; the Super Scheduler queries for resources (requirements in the VO + VO info); GridVM publishes local info including VO, plus VO info, execution info, and resource info.
• Super Scheduler: receives the proxy certificate with VO and a signed job description, and submits via globusrun-ws to the GridVM services (incl. GSI).
• Client environment (WFT, Portal, PSE, SS client): operates with the proxy certificate with VO.
VO and User Management
Service
• Adoption of VOMS for VO management
– Uses proxy certificates with VO attributes for interoperability with EGEE
– GridVM is used instead of LCAS/LCMAPS
• Integration of the MyProxy and VOMS servers into NAREGI
– With the UMS (User Management Server) to realize a one-stop service at the NAREGI Grid Portal
– Using the gLite implementation at the UMS to connect to the VOMS server
• MyProxy+ for the Super Scheduler
– A special-purpose certificate repository to realize safe delegation between the NAREGI Grid Portal and the Super Scheduler
– The Super Scheduler receives jobs with the user’s signature, just like UNICORE, and submits them through the GSI interface.
Computational Resource
Allocation based on VO
• Resource configuration: nodes png2040 (8 CPU), png2041 (4 CPU), pbg2039 (4 CPU), and pbg1042 (2 CPU) are mapped differently to VO1 and VO2.
• Workflow: a workflow’s steps (e.g., 4 CPU, 2 CPU, 1 CPU, 8 CPU) are mapped onto the resources its VO is allowed to use.
Different resource mapping for different VOs.
Local-File Access Control
(GridVM)
• Provides VO-based access control functionality that does not use gridmap files.
• Controls file access based on a policy specified by a tuple of Subject, Resource, and Action.
• Subject is a grid user ID (DN) or a VO name.
Example: grid users X and Y share a local account; GridVM access control applies the policy
Permit: Subject=X, Resource=R, Action=read,write
Deny: Subject=Y, Resource=R, Action=read
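The (Subject, Resource, Action) model above can be sketched as a tiny policy evaluator. The rule tuples and the "Permit-overrides" combining algorithm mirror the GridVM policy elements shown on the following slides, but this evaluator is a toy, not the GridVM implementation:

```python
# Toy evaluator for (effect, subject, resource, action-set) rules, where a
# subject is a grid user DN or a VO name. Mirrors the policy example above.

POLICY = [
    ("Permit", "X", "R", {"read", "write"}),
    ("Deny",   "Y", "R", {"read"}),
]

def check(policy, subject, resource, action, default="Deny",
          combining="Permit-overrides"):
    """Return 'Permit' or 'Deny' for one access attempt."""
    decisions = [effect for effect, s, r, acts in policy
                 if s == subject and r == resource and action in acts]
    if not decisions:
        return default                       # no rule matched: use the default
    if combining == "Permit-overrides":
        return "Permit" if "Permit" in decisions else "Deny"
    return decisions[0]                      # first-applicable

print(check(POLICY, "X", "R", "write"))  # Permit
print(check(POLICY, "Y", "R", "read"))   # Deny
print(check(POLICY, "Y", "R", "write"))  # Deny (falls back to the default)
```

A real deployment would also match VO-level subjects and wildcard resources, which the AppliedTo/TargetUnit element handles in the actual policy schema.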
Structure of Local-File Access
Control Policy
<GridVMPolicyConfig>
└ <AccessControl>
   └ <AccessProtection> — attributes: Default, RuleCombiningAlgorithm
      └ <AccessRule> (1..*) — attribute: Effect — control: permit / deny
         ├ <AppliedTo> (0..1) — attribute: TargetUnit — who: user / VO
         │  └ <Subjects> └ <Subject> (0..*)
         ├ <Resources> └ <Resource> (1..*) — what resource: file / directory
         └ <Actions> (0..1) └ <Action> (0..*) — access type: read / write / execute
Policy Example (1)
<gvmcf:AccessProtection gvmac:Default="Permit"
    gvmac:RuleCombiningAlgorithm="Permit-overrides">  <!-- default & rule combining -->
  <!-- Access Rule 1: for all users -->
  <gvmcf:AccessRule gvmac:Effect="Deny">
    <gvmcf:AppliedTo> <gvmac:Subjects> … </gvmac:Subjects> </gvmcf:AppliedTo>
    <gvmac:Resources>
      <gvmac:Resource>/etc/passwd</gvmac:Resource>
    </gvmac:Resources>
    <gvmac:Actions> … </gvmac:Actions>
  </gvmcf:AccessRule>
  <!-- Access Rule 2: for a specific user -->
  <gvmcf:AccessRule gvmac:Effect="Permit">
    <gvmcf:AppliedTo gvmcf:TargetUnit="user">
      <gvmcf:Subjects> <gvmcf:Subject>User1</gvmcf:Subject> </gvmcf:Subjects>
    </gvmcf:AppliedTo>
    <gvmac:Resources>
      <gvmac:Resource>/etc/passwd</gvmac:Resource>
    </gvmac:Resources>
    <gvmac:Actions>
      <gvmac:Action>read</gvmac:Action>
    </gvmac:Actions>
  </gvmcf:AccessRule>
</gvmcf:AccessProtection>
Policy Example (2)
<gvmcf:AccessRule gvmac:Effect="Permit">
  <gvmcf:AppliedTo gvmcf:TargetUnit="vo">
    <gvmcf:Subjects>
      <gvmcf:Subject>bio</gvmcf:Subject>            <!-- VO name -->
    </gvmcf:Subjects>
  </gvmcf:AppliedTo>
  <gvmac:Resources>
    <gvmac:Resource>/opt/bio/bin</gvmac:Resource>   <!-- resource name -->
    <gvmac:Resource>./apps</gvmac:Resource>
  </gvmac:Resources>
  <gvmac:Actions>
    <gvmac:Action>read</gvmac:Action>
    <gvmac:Action>execute</gvmac:Action>
  </gvmac:Actions>
</gvmcf:AccessRule>
VO-based Resource Mapping in
Global File System (b2)
• The next release of Gfarm (version 2.0) will have access control functionality.
• We will extend the Gfarm metadata server for data-resource mapping based on VO.
(Figure: a client contacts the Gfarm Metadata Server, which maps VO1 and VO2 to different sets of file servers.)
Current Issues and the
Future Plan
• Current issues on VO management
– VOMS platform
• gLite runs on GT2, while the NAREGI middleware runs on GT4
– Authorization control on the resource side
• Need to implement new functions for resource control on GridVM, such as Web services, reservation, etc.
– Proxy certificate renewal
• Need to invent a new mechanism
• Future plans
– Cooperation with GGF security-area members to realize interoperability with other grid projects
– Proposal of a new VO management methodology and trial of a reference implementation
NAREGI Application Mediator (WP6)
for Coupled Applications
Mediator components support data exchange between coupled, co-allocated simulation jobs (Simulation A … Simulation B), launched from the NAREGI WFT workflow via the Super Scheduler and Information Service.
• Data transfer management: synchronized file transfer over multiple protocols (GridFTP/MPI), driven by SBC*-XML descriptors carrying the global job ID, allocated nodes, transfer protocol, etc. (stored via SQL).
• Data transformation management: semantic transformation libraries for different simulations, plus a coupled accelerator.
• Implementation: Mediators A/B run beside the simulations on each GridVM, exposing an API over OGSA-DAI WSRF2.0, JNI, and Globus Toolkit 4.0.1, moving Sim.A/Sim.B data (Data1–Data3) via GridFTP and MPI.
*SBC: storage-based communication
NAREGI beta on “VM Grid”
Create a “Grid-on-Demand” environment using Xen and Globus Workspace: a vanilla personal virtual grid/cluster using our Titech lab’s research results, with dynamic deployment of the NAREGI beta image.
“VM Grid” – Prototype “S”
• http://omiij-portal.soum.co.jp:33980/gridforeveryone.php
• Request # of virtual grid nodes
• Fill in the necessary info in the form
• A confirmation page appears; follow the instructions
• SSH login to the NMI stack (GT2+Condor) + selected NAREGI beta MW (Ninf-G, etc.)
• Entire beta installation in the works
• Other “Instant Grid” research in the works in the lab
From Interoperation to
Interoperability
GGF16 “Grid Interoperations Now”
Charlie Catlett, Director, NSF TeraGrid (on vacation)
Satoshi Matsuoka, Sub-Project Director, NAREGI Project, Tokyo Institute of Technology / NII
Interoperation Activities
• The GGF GIN (Grid Interoperations Now)
effort
– Real interoperation between major Grid
projects
– Four interoperation areas identified
• Security, Data Mgmt, Information Service, Job
Submission (not scheduling)
• EGEE/gLite – NAREGI interoperation
– Based on the four GIN areas
– Several discussions, including a 3-day meeting at
CERN in mid-March, and email exchanges
• Updates at GGF17 Tokyo next week
• Some details in my talk tomorrow
Grid Regional Infrastructural Efforts
The Ideal World: ubiquitous VO & user management for international e-Science — different software stacks, but interoperable (collaborative talks on PMA, etc.):
• Europe: EGEE, UK e-Science, …
• US: TeraGrid, OSG, …
• Japan: NII CyberScience (w/NAREGI), …
• Other Asian efforts (GFK, China Grid, etc.) …
Cross-cutting VOs: HEP Grid VO, NEES-ED Grid VO, Astro IVO. Standardization and commonality in software platforms will realize this.
The Reality: Convergence/Divergence
of Project Forces
(original slide by Stephen Pickles, edited by Satoshi Matsuoka)
• NAREGI (JP) and CSI (JP): beta on GT4/Fujitsu WSRF, WSRF & OGSA; interoperable-infrastructure talks with EGEE, APAC Grid (Australia), and EU-China Grid (China).
• EGEE (EU): gLite / GT2; GridPP (UK) and LCG (EU) share common staff & procedures.
• UniGrids (EU), DEISA (EU) with IBM and Unicore: own WSRF & OGSA; OMII (UK): WS-I+ & OGSA?
• Globus (US), OSG (US), TeraGrid (US), NMI (US), AIST-GTRC, Condor (US): GT4 WSRF (OGSA?); NGS (UK) shares common users with these efforts.
• GGF sits in the middle of these converging and diverging forces.
GGF Grid Interoperation Now
• Started Nov. 17, 2005 @SC05 by Catlett and
Matsuoka
– Now has participation by all major grid projects
• “Agreeing to agree on what needs to be
agreed first”
• Identified 4 essential key common services:
– Authentication, authorization, identity management
• Individuals, communities (VOs)
– Jobs: submission, auditing, tracking
• Job submission interface, job description language, etc.
– Data management
• Data movement, remote access, filesystems, metadata mgmt
– Resource discovery and information service
• Resource description schema, information services
“Interoperation” versus
“Interoperability”
• Interoperability: “the ability of software and hardware on multiple machines from multiple vendors to communicate”
– Based on commonly agreed, documented specifications and procedures
• Interoperation: “just make it work together”
– Whatever it takes — could be ad hoc, undocumented, fragile
– Low-hanging fruit; a step towards future interoperability
Interoperation Status
• GIN meetings at GGF16 and GGF17
• 3-day meeting at CERN at the end of March
• Security
– Common VOMS/GSI infrastructure
– NAREGI makes more complicated use of GSI/MyProxy and proxy delegation, but should be OK
• Data
– SRM commonality and data catalog integration
– GFarm and dCache consolidation
• Information Service
– CIM vs. GLUE schema differences
– Fairly large monitoring-system differences
– Schema translation (see next slides)
• Job Submission
– JDL vs. JSDL, Condor-C/CE vs. OGSA SS/SC-VM architectural differences, etc.
– Simple job submission only (see next slides)
Information Service Characteristics
• Basic syntax:
– Resource description schemas (e.g., GLUE, CIM)
– Data representations (e.g., XML, LDIF)
– Query languages (e.g., SQL, XPath)
– Client query interfaces (e.g., WS Resource Properties queries, LDAP, OGSA-DAI)
• Semantics:
– What pieces of data are needed by each Grid (various previous works & actual deployment experiences already)
• Implementation:
– Information service software systems (e.g., MDS, BDII)
– The ultimate sources of this information (e.g., PBS, Condor, Ganglia, WS-GRAM, GridVM, various grid monitoring systems, etc.)
NAREGI Information Service
Relational Grid Monitoring Architecture
• An implementation of the GGF Grid Monitoring Architecture (GMA)
• All data are modelled as tables: a single schema gives the impression of one virtual database for the VO
Components: producer applications publish tuples (SQL “INSERT”) to the Producer Service; consumer applications send queries (SQL “SELECT”) to the Consumer Service and receive tuples; the Registry Service and the Schema Service (with the Vocabulary Manager, SQL “CREATE TABLE”) mediate between them.
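The R-GMA pattern above — producers publish with SQL "INSERT", consumers query with SQL "SELECT", and the schema service owns "CREATE TABLE" — can be mimicked with any relational database. A minimal sketch with sqlite3 standing in for the VO-wide virtual database (the table and column names here are invented for illustration):

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Schema service: "CREATE TABLE" defines the vocabulary for the VO.
db.execute("CREATE TABLE service_status (site TEXT, service TEXT, up INTEGER)")

# Producer service: monitoring tuples are published with INSERT.
db.executemany("INSERT INTO service_status VALUES (?, ?, ?)",
               [("siteA", "gram", 1), ("siteA", "gridftp", 0),
                ("siteB", "gram", 1)])

# Consumer service: any VO member queries what looks like one database.
rows = db.execute(
    "SELECT site, service FROM service_status WHERE up = 1 ORDER BY site"
).fetchall()
print(rows)  # [('siteA', 'gram'), ('siteB', 'gram')]
```

The real R-GMA adds the registry/mediator layer so that the "one database" is actually assembled on demand from distributed producers.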
Syntax Interoperability Matrix

Grid      | Schema       | Data              | Query Lang | Client IF           | Software
TeraGrid  | GLUE         | XML               | XPath      | WSRF RP queries     | MDS4
OSG       | GLUE         | LDIF              | LDAP       | LDAP                | BDII
NAREGI    | CIM 2.10+ext | Relational        | SQL        | OGSA-DAI + WS-I RUS | CIMOM + OGSA-DAI
EGEE/LCG  | GLUE         | LDIF / Relational | LDAP / SQL | LDAP / R-GMA i/f    | BDII / R-GMA
NorduGrid | ARC          | LDIF              | LDAP       | LDAP                | GIIS
Low-Hanging Fruit
“Just make it work by GLUEing”
• Identify the minimum common set of information required for interoperation in the respective information services
• Employ GLUE and extended CIM as the base schemas for the respective grids
• Each grid’s info service acts as an information provider for the other
• Embed a schema translator to perform the schema conversion
• Present data in a common fashion on each grid: WebMDS, NAREGI CIM Viewer, SCMSWeb, …
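The schema-translator idea above amounts to renaming a minimal common set of attributes from one schema to the other so each grid's information service can act as a provider for its peer. A sketch, where the GLUE-to-CIM attribute pairs are chosen for illustration only (the real GLUE and CIM schemas are far larger, and the actual NAREGI mapping may differ):

```python
# Toy GLUE -> CIM attribute translator over a minimal common attribute set.
# The mapping below is an illustrative assumption, not the NAREGI mapping.

GLUE_TO_CIM = {
    "GlueCEUniqueID":      "CIM_ComputerSystem.Name",
    "GlueCEStateStatus":   "CIM_EnabledLogicalElement.EnabledState",
    "GlueCEInfoTotalCPUs": "CIM_Processor.NumberOfEnabledCores",
}

def translate(glue_record: dict) -> dict:
    """Keep only the minimal common attributes and rename them to CIM style."""
    return {GLUE_TO_CIM[k]: v for k, v in glue_record.items()
            if k in GLUE_TO_CIM}

record = {"GlueCEUniqueID": "ce01.example.org",
          "GlueCEStateStatus": "Production",
          "GlueCESchemaVersion": "1.2"}          # not in the common set: dropped
print(translate(record))
```

Attributes outside the agreed common set are silently dropped, which is exactly the "minimal common attributes" compromise the next slide describes.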
Minimal Common Attributes
• Define the minimal common set of attributes required
• Each system component in a grid (CSG, EPS, JM, ACS, provisioning, IS, reservation, SC, accounting, DC) will only access the translated information through the shared common attributes
GLUE→CIM translation
• Development of information providers that translate from the GLUE data model to CIM for selected common attributes, such as up/down status of grid services.
• On the NAREGI side (CDIS for NAREGI, a multi-grid Information Service Node): an Aggregator / OGSA-DAI service answers SQL “SELECT” queries; a lightweight CIMOM hosts CIM providers (NRG_Unicore, System, OS, Account) built on a CIM provider skeleton with the GLUE→CIM translator, populating the RDB via SQL “CREATE TABLE” / “INSERT”.
• On the gLite / R-GMA side (for EGEE resources): the translator consumes tuples through the Consumer Service, Registry Service, and Mediator, with the Producer and Schema Services publishing tuples; a CIM→GLUE producer covers the reverse direction.
• The GLUE-CIM mapping covers the selected minimal attributes.
Interoperability: NAREGI Short-
Term Policy
• gLite
– Simple/single job (up to SPMD)
– Bi-directional submission:
• NAREGI → gLite: GT2-GRAM
• gLite → NAREGI: Condor-C
– Exchange resource information
• GIN
– Simple/single job (up to SPMD)
– NAREGI → GIN submission: WS-GRAM
– Exchange resource information
• BES/ESI
– TBD
• Status somewhat confusing
• ESI middleware already developed?
– Globus 4.X and/or UnicoreGS?
Job Submission Standards
Comparison: Goals

Goal                                   | ESI | BES | NAREGI SS | NAREGI SC(VM)
Use JSDL                               | ✔*1 | ✔   | ✔*1       | ✔*1
WSRF OGSA Base Profile 1.0 platform    | ✔   | ✔   | ✔         | ✔
Job management service                 | ✔   | ✔   | ✔         | ✔
Extensible support for resource models |     | ✔   |           |
Reliability                            | ✔   | ✔   | ✔         | ✔
Use WS-RF modeling conventions         | ✔   | ✔   | ✔         | ✔
Advance reservation                    |     |     | ✔         | ✔
Bulk operations                        |     |     | ✔         | ✔
Use WS-Agreement                       |     |     |           |
Generic management frameworks (WSDM)   |     |     |           |
Define alternative renderings          |     |     |           |
Server-side workflow                   |     |     | ✔         |
*1: extended
Job Factory Operations

Category             | ESI (0.6)                     | BES (Draft v16)              | NAREGI (b)
Original             | CreateManagedJob              | CreateActivityFromJSDL       | MakeReservations
                     | (there is a subscribe option) | GetActivityStatus            | CommitReservations
                     |                               | RequestActivityStateChanges  |
                     |                               | StopAcceptingNewActivities   |
                     |                               | StartAcceptingNewActivities  |
                     |                               | IsAcceptingNewActivities     |
                     |                               | GetActivityJSDLDocuments     |
WSResourceProperties | GetResourceProperty           | GetResourceProperty          | GetResourceProperty
                     | GetMultipleResourceProperties | GetMultipleResourceProperties| GetMultipleResourceProperties
                     | QueryResourceProperties       | QueryResourceProperties      | SetResourceProperties
WSResourceLifeTime   | ImmediateResourceDestruction  | Destroy                      |
                     | ScheduledResourceDestruction  | SetTerminationTime           |
WSBaseNotification   | NotificationProducer          | Notify                       |
                     |                               | Subscribe                    |

Now working with Dave Snelling et al. to converge to the ESI API (which is similar to WS-GRAM), plus:
• Advance reservation
• Bulk submission
Interoperation: gLite ↔ NAREGI
• Information: an IS bridge translates GLUE → CIM between the gLite-IS [GLUE] and the NAREGI-IS [CIM].
• gLite → NAREGI: a gLite user submits to the gLite-WMS [JDL]; Condor-C hands the job to a gLite-to-NAREGI bridge (ClassAd → JSDL, NAREGI-WF generation, job submit/ctrl, status propagation, certification?) built on the NAREGI client library, which feeds the NAREGI-SS [JSDL] and NAREGI GridVM.
• NAREGI → gLite: a NAREGI user submits via the NAREGI Portal to the NAREGI-SS [JSDL]; an Interop-SC in the NAREGI-SC (JSDL → RSL, job submit/ctrl, status propagation) submits through GT2-GRAM to the gLite-CE.
Interoperation: GIN (Short
Term)
• Information: an IS bridge (GLUE → CIM) links anotherGrid-IS [CIM or GLUE] and the NAREGI-IS [CIM].
• NAREGI → anotherGrid: a NAREGI user submits via the NAREGI Portal to the NAREGI-SS [JSDL]; an Interop-SC in the NAREGI-SC (JSDL → RSL, job submit/ctrl, status propagation) submits through WS-GRAM (GT4) to the anotherGrid-CE.
• anotherGrid → NAREGI: another grid’s user would submit [JSDL] through WS-GRAM to NAREGI GridVM — sorry, one way for now!