Lecture 001, 6 April 2009 - Istituto Nazionale di Fisica


Università degli Studi di Bari – Master's Degree (Laurea Specialistica) in Computer Science
Tecnologia dei Servizi “Grid e cloud computing” (Service Technology: Grid and cloud computing)
Academic year 2009/2010
Giorgio Pietro Maggi [email protected], http://www.ba.infn.it/~maggi
Lecture 8 - 15 December 2009
The teaching material used in this course has been adapted from the material
used by Paolo Veronesi for the Computational Grids (Griglie Computazionali) course
of the Master's Degree in Computer Science, held in the academic year
2008/09 at the Università degli Studi di Ferrara.
Paolo Veronesi
[email protected], [email protected]
http://www.cnaf.infn.it/~pveronesi/unife/
Tecnologia dei Servizi “Grid e cloud computing” - Lecture 8
Today’s focus: Information Services
OGSA
• Execution Management
  – Job description & submission
  – Scheduling
  – Resource provisioning
• Data Services
  – Common access facilities
  – Efficient & reliable transport
  – Replication services
• Resource Management
  – Discovery
  – Monitoring
  – Control
• Self-Management
  – Self-configuration
  – Self-optimization
  – Self-healing
• Information Services
  – Registry
  – Notification
  – Logging/auditing
• Security (DONE)
  – Cross-organizational users
  – Trust nobody
  – Authorized access only
• OGSA “profiles”
• Web services foundation (DONE)
Outline
• What is the Information System
• Data Model: the GLUE Schema
  – Overview
  – Core entities
• OpenLDAP server introduction
• LCG Information Service Architecture
• Top BDII and Site BDII
• Information update process
Information System
• What is it?
  – A system that collects information on the state of resources
• Why?
  – To discover the resources of the grid and their nature
  – To provide useful data that helps whoever is in charge of managing the
    workload to do it more efficiently
  – To check the health status of resources
• How?
  – By monitoring the state of resources locally and publishing the right
    information on the information system
  – By adopting a data model that MUST be well known to all components that
    want to access the monitored information
  – By using different approaches, which are investigated in the next slides
Design of Information Systems
• About Measures
  – Measures SHOULD be relevant to the goal the users want to achieve.
  – Measures SHOULD be accurate enough to be considered valid.
  – The rate at which measures are taken MUST be adequate for their intended use.
• About the gathering of Information
  – How and when should collected info be published?
  – Where should collected info be stored?
  – How long should this info be kept in the storage?
• Querying the Information System
  – Where should queries be sent to get a response?
  – What syntax and protocols have to be adopted to make queries?
  – What is the adopted data model to describe resources?
• Security
  – Who is allowed to execute queries against the IS, and what types of
    queries are they allowed to make?
  – Management of user rights and credentials.
Adopted Information Systems
• The BDII (Berkeley DB Information Index)
  – It has been adopted in the LCG middleware as the Information System provider.
  – It is an evolution of the Globus Monitoring and Discovery Service (MDS).
  – gLite currently adopts BDII as its Information System.
  – It is based on Lightweight Directory Access Protocol (LDAP) servers.
• The Relational Grid Monitoring Architecture (R-GMA)
  – It is an implementation of the Grid Monitoring Architecture (GMA) standardized by
    the Global Grid Forum (GGF).
  – It is a relational implementation of the GMA.
  – It is strongly Web Services oriented.
  – To be adopted by the next releases of the gLite middleware?
The LDAP Protocol: Generalities
LDAP (Lightweight Directory Access Protocol)
√ It establishes the transport and format of the messages used by a client to
access a directory
√ LDAP can be used as access protocol for a large number of databases
√ It provides a standard data model: the DIT (Directory Information Tree)
√ It is the internal protocol used by the EGEE/LCG services to share information
The LDAP Protocol: DIT
► LDAP structures data as a tree
► Following a path from the node back to the root of the DIT,
a unique name is built (the DN):
“id=pml,ou=IT,or=CERN,st=Geneva,c=Switzerland,o=grid”
[Figure: example DIT. Root: o=grid (root of the DIT). Country nodes: c=US, c=Switzerland, c=Spain. Below c=Switzerland: st=Geneva, then or=CERN, with the organisational units ou=IT and ou=EP. Leaf entries: id=pml, id=gv, id=fd. The entry id=pml carries objectClass: person, cn: Patricia M. L., phone: 5555666, office: 28-r019.]
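The rule stated above (walk from the entry back to the root and join the node names) can be sketched in a few lines of Python. This is only an illustration: the helper names `build_dn` and `parent_dn` are hypothetical, and the sketch assumes that no value contains characters that would need escaping in a real DN.

```python
def build_dn(path_from_root):
    """Build a DN from (attribute, value) pairs listed from the root to the entry.

    The DN is written entry-first, so the pairs are reversed before joining.
    Assumption: values contain no special characters (',', '+', '\\' ...),
    so no escaping is performed in this sketch.
    """
    return ",".join(f"{attr}={value}" for attr, value in reversed(path_from_root))


def parent_dn(dn):
    """Return the DN one level closer to the root (empty string for the root)."""
    _, _, parent = dn.partition(",")
    return parent


# Path taken from the example tree of the slide, root first.
path = [("o", "grid"), ("c", "Switzerland"), ("st", "Geneva"),
        ("or", "CERN"), ("ou", "IT"), ("id", "pml")]

dn = build_dn(path)
print(dn)             # the DN quoted on the slide
print(parent_dn(dn))  # the DN of the ou=IT node
```

Dropping the leftmost component of a DN always gives the DN of the parent node, which is why each entry is globally unique inside the tree.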
The LDAP Protocol: The Data Model
► The LDAP information model is based on entries
► Entries are collections of attributes identified by a globally unique DN
(Distinguished Name)
► Information is organized in a tree-like structure. A special attribute,
objectclass, can be defined for each entry. It defines the class hierarchy
the entry belongs to, and it can be used to filter the entries
containing a given object class
► Information is imported into and exported from the LDAP server as
LDIF (LDAP Data Interchange Format) files:
    dn: <distinguished name>
    objectclass: <objectclassname>
    <attributetype>: <attributevalue>
    <attributetype>: <attributevalue>

    dn: <distinguished name>
    objectclass: <objectclassname>
    <attributetype>: <attributevalue>
    <attributetype>: <attributevalue>
► The fields delimited by <> can be defined by the application following a
certain schema
► The schema describes the attributes and the types associated with the data
objects
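To make the LDIF layout concrete, here is a minimal Python sketch that reads LDIF-like text into a list of entries. It is a simplification under stated assumptions: it ignores line folding, base64 values (`::`) and change records, the function name `parse_ldif` is hypothetical, and the second sample entry (including its objectclass value) is made up for illustration.

```python
def parse_ldif(text):
    """Parse simplified LDIF text into a list of {"dn", "attributes"} dicts.

    Entries are separated by blank lines; every attribute is stored as a
    list because LDAP attributes may be multi-valued.
    """
    entries = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:                      # a blank line closes the current entry
            current = None
            continue
        if line.startswith("#"):          # LDIF comment line
            continue
        name, _, value = line.partition(":")
        name, value = name.strip(), value.strip()
        if name.lower() == "dn":
            current = {"dn": value, "attributes": {}}
            entries.append(current)
        elif current is not None:
            current["attributes"].setdefault(name, []).append(value)
    return entries


sample = """\
dn: id=pml,ou=IT,or=CERN,st=Geneva,c=Switzerland,o=grid
objectclass: person
cn: Patricia M. L.
phone: 5555666
office: 28-r019

dn: ou=IT,or=CERN,st=Geneva,c=Switzerland,o=grid
objectclass: organizationalUnit
ou: IT
"""

entries = parse_ldif(sample)
# Use the objectclass attribute to filter entries, as described in the slide.
persons = [e["dn"] for e in entries
           if "person" in e["attributes"].get("objectclass", [])]
print(persons)
```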
Information Service Systems
• The gLite Data Model is based on the Grid Laboratory
Uniform Environment (GLUE) Schema
• The IS architecture used in gLite is the Berkeley DB
Information Index (BDII)
  – It has been adopted in the LCG middleware as the Information
    System provider
  – It is an evolution of the Globus Monitoring and Discovery Service
    (MDS)
  – It is based on Lightweight Directory Access Protocol
    (LDAP) servers
The Data Model:
GLUE Schema
GLUE: overview
• GLUE: Grid Laboratory Uniform Environment
• It is an information model that describes all
the resources that participate in the Grid
system and that must be
discoverable and monitored
• The same information can be retrieved from
different information systems relying on different technologies
(e.g. R-GMA)
GLUE Schema
• Describes the Grid resource information stored in
the IS
• Independent from the underlying technology
• The current release is mapped onto
  – LDAP
  – XML
  – ClassAd (Condor matchmaking language)
• The entities of the GLUE Schema are organised
hierarchically
  – They include the concepts of Site, Cluster, Computing Element,
    Storage Element, and an abstraction of service
GLUE Schema Structure
[Figure: entity diagram of the GLUE Schema; the multiplicities (1, *) on the links are not recoverable from the extraction.]
• Site: collection of resources owned by a single organisation. Contains info on the location, the administrator, the web page and so on
• Service: description of a deployed service
• Cluster: set of heterogeneous resources. Contains info on the shared directory
• Sub-Cluster: set of homogeneous resources. Contains the size of the set
• Host: contains details of hardware (features and performance) and software
• ComputingElement, with its Info, State, Policy, Job and VOview
• StorageElement
Site Element
• Site: a collection of resources owned by the same organization and
managed by the same administrator. Contains info on the
location, the administrator, the web homepage and so on.
• Service: the description of a deployed Web Service. Contains the URI endpoint
of the WS, the WSDL document, the list of owners and so on.
• Related entities shown in the diagram: StorageElement, Cluster
GLUE: site
GlueSiteUniqueID: TRIGRID-INFN-CATANIA
GlueSiteName: TRIGRID-INFN-CATANIA
GlueSiteDescription: LCG Site
GlueSiteUserSupportContact: mailto: [email protected]
GlueSiteSysAdminContact: mailto: [email protected]
GlueSiteSecurityContact: mailto: [email protected]
GlueSiteLocation: Catania, Italy
GlueSiteLatitude: 37.54866
GlueSiteLongitude: 15.036076
GlueSiteWeb: http://www.trigrid.it
GlueSiteOtherInfo: TIER 1
GlueSiteOtherInfo: Trigrid Team
GLUE: service
GlueServiceUniqueID: infn-rb-01.ct.trigrid.it:7772
GlueServiceName: INFN-CATANIA-rb
GlueServiceType: ResourceBroker
GlueServiceVersion: 1.2.0
GlueServiceEndpoint: infn-rb-01.ct.trigrid.it:7772
GlueServiceURI: unset
GlueServiceAccessPointURL: not_used
GlueServiceStatus: OK
GlueServiceStatusInfo: No Problems
GlueServiceWSDL: unset
GlueServiceSemantics: unset
GlueServiceStartTime: 1970-01-01T00:00:00Z
GlueServiceOwner: trigrid
GlueServiceOwner: cometa
GlueServiceOwner: inaf
GlueServiceOwner: alice
GlueServiceAccessControlRule: trigrid
GlueServiceAccessControlRule: cometa
GlueServiceAccessControlRule: inaf
GlueServiceAccessControlRule: alice
GlueForeignKey: GlueSiteUniqueID=INFN-CATANIA
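Entries like the service record above are normally selected with an LDAP search filter, using objectclass and attribute values as conditions. The following Python sketch builds such a filter string in the standard LDAP filter syntax. The helper names are hypothetical, and the object class name `GlueService` is an assumption about how this schema is rendered in LDAP; only the attribute names and values come from the listing.

```python
def escape_filter_value(value):
    """Escape the characters that are special inside an LDAP filter value."""
    # The backslash must be handled first, otherwise the escapes added
    # for the other characters would be escaped again.
    for char, escaped in (("\\", r"\5c"), ("*", r"\2a"),
                          ("(", r"\28"), (")", r"\29"), ("\0", r"\00")):
        value = value.replace(char, escaped)
    return value


def and_filter(**conditions):
    """Combine attribute=value conditions into one AND filter."""
    parts = "".join(f"({attr}={escape_filter_value(val)})"
                    for attr, val in conditions.items())
    return f"(&{parts})"


# Select the Resource Broker services that the VO "cometa" may use.
# "GlueService" as object class name is an assumption of this sketch.
flt = and_filter(objectClass="GlueService",
                 GlueServiceType="ResourceBroker",
                 GlueServiceAccessControlRule="cometa")
print(flt)
```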
Cluster Element
• Cluster: a set of heterogeneous resources.
Contains information on shared
temporary directories.
• SubCluster: a set of similar resources. Contains
the number of logical and physical
CPUs.
• Host: contains detailed static information
on the type of hosts and the related
installed software. The data cover
the type of CPU architecture,
memory sizes, the installed operating system
and the type of
network adapter. It also
contains performance measures obtained by
running well-known benchmarks.
• Location: information on installed software,
its path and version.
• Related entity shown in the diagram: ComputingElement
GLUE: cluster and subcluster
GlueClusterName: infn-ce-01.ct.trigrid.it
GlueClusterService: infn-ce-01.ct.trigrid.it:2119/jobmanager-lcglsf-short
GlueClusterService: infn-ce-01.ct.trigrid.it:2119/jobmanager-lcglsf-long
GlueClusterService: infn-ce-01.ct.trigrid.it:2119/jobmanager-lcglsf-infinite
GlueClusterService: infn-ce-01.ct.trigrid.it:2119/jobmanager-lcglsf-cert
GlueClusterService: infn-ce-01.ct.trigrid.it:2119/jobmanager-lcglsf-cometa
GlueClusterService: infn-ce-01.ct.trigrid.it:2119/jobmanager-lcglsf-inaf
GlueClusterService: infn-ce-01.ct.trigrid.it:2119/jobmanager-lcglsf-alice
GlueClusterService: infn-ce-01.ct.trigrid.it:2119/jobmanager-lcglsf-cometa
[..]
GlueSubClusterPhysicalCPUs: 4
GlueSubClusterLogicalCPUs: 4
GlueSubClusterTmpDir: /tmp
GlueSubClusterWNTmpDir: /tmp
GLUE: Host
GlueHostApplicationSoftwareRunTimeEnvironment: GLITE-3_0_0
GlueHostApplicationSoftwareRunTimeEnvironment: INFN-CATANIA
GlueHostApplicationSoftwareRunTimeEnvironment: MPICH
[..]
GlueHostArchitectureSMPSize: 4
GlueHostBenchmarkSF00: 1937
GlueHostBenchmarkSI00: 1483
GlueHostMainMemoryRAMSize: 4096
GlueHostMainMemoryVirtualSize: 8192
GlueHostNetworkAdapterInboundIP: TRUE
GlueHostNetworkAdapterOutboundIP: TRUE
GlueHostOperatingSystemName: Scientific Linux CERN
GlueHostOperatingSystemRelease: 3.0.6
GlueHostOperatingSystemVersion: SLC
GlueHostProcessorClockSpeed: 2392
GlueHostProcessorModel: Dual Core Opteron 280
GlueHostProcessorVendor: AMD
Computing Element
• ComputingElement: abstraction of a queue of jobs
• Info: static information on the resource,
such as the type of local
scheduler adopted, the default
Storage Element and so on.
• Policy: contains info on configuration
policies: MaxWallClockTime,
MaxRunningJobs, MaxCPUTime ...
• VOview: view for a given Virtual
Organization. Contains authorization
details for VO members and the
amount of available resources.
• AccessControlPolicyBase: set of rules defining the access control
policy
• Job: information on the jobs in this queue:
owner, local and global ID and
status
• State: dynamic information on the status
of this queue, such as the number of
free CPUs and the Estimated
Traversal Time (ETT)
GLUE: Computing Element
GlueCEName: cometa
GlueCEUniqueID: infn-ce-01.ct.trigrid.it:2119/jobmanager-lcglsf-cometa
GlueCEInfoGatekeeperPort: 2119
GlueCEInfoHostName: infn-ce-01.ct.trigrid.it
GlueCEInfoLRMSType: lsf
GlueCEInfoLRMSVersion: 6.1
GlueCEInfoTotalCPUs: 98
GlueCEInfoJobManager: lcglsf
GlueCEInfoContactString: infn-ce-01.ct.trigrid.it:2119/jobmanager-lcglsf-cometa
GlueCEInfoApplicationDir: /opt/exp_soft
GlueCEInfoDataDir: unset
GlueCEInfoDefaultSE: infn-se-01.ct.trigrid.it
GlueCEStateEstimatedResponseTime: 61713
GlueCEStateFreeCPUs: 26
GlueCEStateRunningJobs: 70
GlueCEStateStatus: Production
GlueCEStateTotalJobs: 70
GlueCEStateWaitingJobs: 0
GlueCEStateWorstResponseTime: 123427
GlueCEStateFreeJobSlots: 26
GlueCEPolicyMaxCPUTime: 2880
GlueCEPolicyMaxRunningJobs: 98
GlueCEPolicyMaxTotalJobs: 0
GlueCEPolicyMaxWallClockTime: 2880
GlueCEPolicyPriority: -10
GlueCEPolicyAssignedJobSlots: 98
GlueCEAccessControlBaseRule: VO:cometa
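The State, Policy and AccessControlBaseRule attributes listed above are the kind of data a broker can use to choose a queue. The sketch below is a toy selection rule written in Python, not the algorithm actually used by the Resource Broker: it keeps the queues that accept the VO, are in production, have free CPUs and allow enough CPU time, then sorts them by estimated response time. The first record copies values from the listing; the other two are hypothetical, and treating MaxCPUTime as minutes is an assumption.

```python
def eligible(ce, vo, cpu_minutes):
    """True if the queue accepts the VO and can run a job of the given length."""
    return (f"VO:{vo}" in ce["GlueCEAccessControlBaseRule"]
            and ce["GlueCEStateStatus"] == "Production"
            and ce["GlueCEStateFreeCPUs"] > 0
            and ce["GlueCEPolicyMaxCPUTime"] >= cpu_minutes)  # unit assumed: minutes


def rank(ces, vo, cpu_minutes):
    """Return the eligible queues, shortest estimated response time first."""
    candidates = [ce for ce in ces if eligible(ce, vo, cpu_minutes)]
    return sorted(candidates, key=lambda ce: ce["GlueCEStateEstimatedResponseTime"])


ces = [
    {   # values copied from the listing above
        "GlueCEUniqueID": "infn-ce-01.ct.trigrid.it:2119/jobmanager-lcglsf-cometa",
        "GlueCEStateStatus": "Production",
        "GlueCEStateFreeCPUs": 26,
        "GlueCEStateEstimatedResponseTime": 61713,
        "GlueCEPolicyMaxCPUTime": 2880,
        "GlueCEAccessControlBaseRule": ["VO:cometa"],
    },
    {   # hypothetical short queue, invented for the example
        "GlueCEUniqueID": "ce.example.org:2119/jobmanager-pbs-short",
        "GlueCEStateStatus": "Production",
        "GlueCEStateFreeCPUs": 4,
        "GlueCEStateEstimatedResponseTime": 300,
        "GlueCEPolicyMaxCPUTime": 60,
        "GlueCEAccessControlBaseRule": ["VO:cometa", "VO:alice"],
    },
    {   # hypothetical queue with no free CPUs, invented for the example
        "GlueCEUniqueID": "ce2.example.org:2119/jobmanager-pbs-long",
        "GlueCEStateStatus": "Production",
        "GlueCEStateFreeCPUs": 0,
        "GlueCEStateEstimatedResponseTime": 10,
        "GlueCEPolicyMaxCPUTime": 4000,
        "GlueCEAccessControlBaseRule": ["VO:cometa"],
    },
]

short_job = [ce["GlueCEUniqueID"] for ce in rank(ces, "cometa", 30)]
long_job = [ce["GlueCEUniqueID"] for ce in rank(ces, "cometa", 120)]
print(short_job)
print(long_job)
```

A 30-minute job can go to either of the first two queues, while a 120-minute job is only accepted by the queue whose MaxCPUTime is large enough.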
Storage Element
• Storage Element: information about the
service
(like Name, Port, URL)
• Storage Area: contains info on
available and used disk
space, file policies,
access rules, etc.
• Access protocols: contains info about
the protocols used
to transfer files
GLUE: Storage Element
GlueSEUniqueID: infn-se-01.ct.trigrid.it
GlueSEName: TRIGRID-INFN-CATANIA:srm_v1
GlueSEPort: 2811
GlueSESizeTotal: 16350
GlueSESizeFree: 16350
GlueSEArchitecture: multidisk
GlueInformationServiceURL: ldap://infn-se-01.ct.trigrid.it:2135/mds-vo-name=local,o=grid
GLUE: Storage Area
GlueSARoot: cometa:/dpm/ct.trigrid.it/home/cometa
GlueSAPath: /dpm/ct.trigrid.it/home/cometa
GlueSAType: permanent
GlueSALocalID: cometa
GlueSAPolicyMaxFileSize: 10000
GlueSAPolicyMinFileSize: 1
GlueSAPolicyMaxData: 100
GlueSAPolicyMaxNumFiles: 10
GlueSAPolicyMaxPinDuration: 10
GlueSAPolicyQuota: 0
GlueSAPolicyFileLifeTime: permanent
GlueSAStateAvailableSpace: 16350000000
GlueSAStateUsedSpace: 0
GlueSAAccessControlBaseRule: cometa
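The space figures of the Storage Element and of the Storage Area above differ by six orders of magnitude, which is consistent with the two attributes using different units. The short Python sketch below checks this under an explicit assumption that is not stated on the slides: GlueSESizeTotal is taken to be in GB and GlueSAStateAvailableSpace in kB, with decimal prefixes.

```python
KB_PER_GB = 1_000_000  # assumption: decimal units (1 GB = 10**6 kB)


def kb_to_gb(kilobytes):
    """Convert a size in kB to GB under the decimal-unit assumption."""
    return kilobytes / KB_PER_GB


# Values copied from the Storage Element and Storage Area listings.
storage_element = {"GlueSESizeTotal": 16350, "GlueSESizeFree": 16350}
storage_area = {"GlueSAStateAvailableSpace": 16350000000,
                "GlueSAStateUsedSpace": 0}

available_gb = kb_to_gb(storage_area["GlueSAStateAvailableSpace"])
used_gb = kb_to_gb(storage_area["GlueSAStateUsedSpace"])
used_fraction = used_gb / (used_gb + available_gb)

print(available_gb, storage_element["GlueSESizeFree"])
print(used_fraction)
```

Under these assumed units the available space of the Storage Area matches the free size published for the whole Storage Element.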
GLUE: Access Protocols
GlueSEAccessProtocolLocalID: gsiftp
GlueSEAccessProtocolType: gsiftp
GlueSEAccessProtocolEndpoint: gsiftp://infn-se-01.ct.trigrid.it
GlueSEAccessProtocolCapability: file transfer
GlueSEAccessProtocolVersion: 1.0.0
GlueSEAccessProtocolPort: 2811
GlueSEAccessProtocolSupportedSecurity: GSI
GlueSEAccessProtocolLocalID: rfio
GlueSEAccessProtocolType: rfio
GlueSEAccessProtocolEndpoint: httpg://infn-se-01.ct.trigrid.it
GlueSEAccessProtocolCapability: byte access
GlueSEAccessProtocolVersion: 1.0.0
GlueSEAccessProtocolPort: 5001
GlueSEAccessProtocolSupportedSecurity: RFIO
LCG Information System
LCG Information System
• LCG adopted a combination of solutions (now only BDII).
• Globus MDS
  – At the lowest level of the information system
  – Used to discover and monitor resources and publish information
  – Grid Security Infrastructure (GSI) credentials
  – Caching
• BDII
  – At the highest level of the system
  – Introduced because MDS had scalability problems
  – Used by the Resource Broker for the matchmaking process
  – Can be configured by each VO
  – Queries the underlying systems periodically (every 2 minutes)
• Hierarchical system
  – Information is collected on the leaves of a hierarchical tree and travels
    towards the root
  – Clients can query the hierarchical tree at every level
  – The higher the level against which queries are made, the older the
    information obtained
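The last point can be made quantitative with a rough estimate. The Python sketch below assumes that every level refreshes from the level below it every 2 minutes, as stated for the BDII, and that the refreshes of different levels are not synchronised; transfer and query times are ignored. The function name and the level names are only illustrative.

```python
REFRESH_SECONDS = 120  # refresh period quoted in the slide (2 minutes)


def worst_case_age(hops, refresh_seconds=REFRESH_SECONDS):
    """Upper bound on the age of a record after `hops` pull steps.

    Each pull step can add up to one full refresh period, because a value
    may change just after the level above has copied it.
    """
    return hops * refresh_seconds


# Number of pull steps between the resource and the level being queried.
levels = {"resource": 0, "site": 1, "top": 2}
ages = {name: worst_case_age(hops) for name, hops in levels.items()}
print(ages)
```

With these assumptions a client of the top level may see data up to four minutes old, while a client querying the resource directly sees the current value.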
Collecting Information
Gathering of information at different levels:
• Lower level: Grid Resource Information Server (GRIS)
  – Collects information on the state of a given resource
  – One GRIS on top of each resource
  – A set of scripts and sensors that try to extract useful info on the resource
• Medium level: Grid Index Information Server (GIIS)
  – Collects information on the resources of a given site
  – One GIIS for each site
• Higher level: BDII
  – Collects information on the resources of a given VO
  – One BDII for each VO (suggested solution)
• NOW all levels are based on BDII
• Way of collecting info
  – Pull model (higher level servers periodically query lower level servers)
  – LDAP query model
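The pull model can be illustrated with a small in-memory simulation in Python. The class `InfoServer` and its methods are hypothetical and stand in for the LDAP servers at each level; the point of the sketch is only that a higher-level server answers from its own cache and sees new values after its next refresh.

```python
class InfoServer:
    """Toy information server: answers from a cache, refreshes by pulling."""

    def __init__(self, name, sources=()):
        self.name = name
        self.sources = list(sources)  # lower-level servers to pull from
        self.local = {}               # entries published on this node
        self.cache = {}               # what clients see when they query

    def publish(self, dn, attributes):
        """Publish a local measurement (what a provider script would do)."""
        self.local[dn] = dict(attributes)
        self.cache[dn] = self.local[dn]

    def search(self):
        """Answer a query from the cache only, without contacting sources."""
        return dict(self.cache)

    def refresh(self):
        """Pull step: query every lower-level server and rebuild the cache."""
        merged = dict(self.local)
        for source in self.sources:
            merged.update(source.search())
        self.cache = merged


resource = InfoServer("resource level")
site = InfoServer("site level", sources=[resource])
top = InfoServer("top level", sources=[site])

dn = "GlueCEUniqueID=example-queue"   # hypothetical entry name
resource.publish(dn, {"GlueCEStateFreeCPUs": 26})
site.refresh()
top.refresh()
first = top.search()[dn]["GlueCEStateFreeCPUs"]

resource.publish(dn, {"GlueCEStateFreeCPUs": 3})
stale = top.search()[dn]["GlueCEStateFreeCPUs"]   # top has not pulled yet

site.refresh()
top.refresh()
fresh = top.search()[dn]["GlueCEStateFreeCPUs"]
print(first, stale, fresh)
```

In a real deployment the refresh calls would be driven by a timer at each level rather than invoked by hand.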
BDII overview
• The Berkeley Database Information Index (BDII)
  – Developed within the context of the LCG project
  – Solves the instability problems of MDS that occur when the number of sites grows
    too large
  – Sits on top of the site BDIIs
  – One for each VO
  – Centralized system
  – Three levels of hierarchy
  – Accessed by the Workload Management System
• Way of working
  – One BDII for each resource
  – One BDII for each site, collecting info from the BDIIs below it
  – One BDII for a given VO, collecting information from the BDIIs below it
  – Two LDAP servers, one for write access and one for read access
  – Every two minutes a cron job runs a script that collects info from a list of site BDIIs
  – The list of site BDIIs is placed in the configuration file of the top BDII
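The slide states only that there are two LDAP servers, one for write access and one for read access. One plausible way to use such a pair is to load fresh data into the write side while clients keep reading the other side, and then swap the roles. The Python sketch below illustrates that idea with two dictionaries; the swapping strategy and all the names are assumptions of this sketch, not a description of the BDII code.

```python
class DoubleBufferedIndex:
    """Two databases: clients read one while the other is being rewritten."""

    def __init__(self):
        self._databases = [{}, {}]
        self._read = 0  # index of the database currently serving queries

    def search(self, dn):
        """Serve a query from the read database."""
        return self._databases[self._read].get(dn)

    def refresh(self, fetch):
        """Load fresh data into the write database, then swap the roles."""
        write = 1 - self._read
        self._databases[write] = dict(fetch())  # readers are not affected here
        self._read = write                      # the new data becomes visible


index = DoubleBufferedIndex()
index.refresh(lambda: {"site-a": {"FreeCPUs": 26}})

seen_during_refresh = []


def slow_fetch():
    # While the new data is being collected, queries still see the old copy.
    seen_during_refresh.append(index.search("site-a"))
    return {"site-a": {"FreeCPUs": 3}}


index.refresh(slow_fetch)
print(seen_during_refresh, index.search("site-a"))
```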
LCG Information System Hierarchy today
Information & Monitoring Services
Berkeley Database Information Index (BDII)
[Figure: BDII hierarchy. Labels in the diagram: top-level BDII; Queries; WMS, WN, UI, FTS; 2 minutes; site-level BDII (Site); resource BDII; MDS GRIS; provider, provider.]
- Based on LDAP
- Standardized information provider (GIP)
- GLUE-1.3 schema
- Top level used with 230+ sites
- Roughly 60 instances in EGEE
BDII overview
• Every node (except the UI and the WNs) runs a bdii service
in order to publish its information
• One node in every site collects the BDIIs of all the nodes of the site and
publishes them through a site BDII
• The top BDII collects all site BDIIs
• Users can run a set of commands to query the top
BDII.
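Since the BDII is an LDAP server, one generic way to query it is the OpenLDAP `ldapsearch` client. The Python sketch below only assembles the command line and prints it, without running it. The host name is a placeholder, the port 2170 and the object class name `GlueCE` are assumptions of this sketch, and the search base is the one that appears in the GlueInformationServiceURL of the Storage Element listing.

```python
import shlex


def ldapsearch_command(host, port, base, ldap_filter, attributes=()):
    """Return the argument list for an anonymous ldapsearch query."""
    return ["ldapsearch",
            "-x",                              # simple (anonymous) bind
            "-LLL",                            # plain LDIF output
            "-H", f"ldap://{host}:{port}",     # server to contact
            "-b", base,                        # base of the search in the DIT
            ldap_filter,
            *attributes]                       # attributes to return


cmd = ldapsearch_command(
    host="top-bdii.example.org",               # placeholder host name
    port=2170,                                 # assumed BDII port
    base="mds-vo-name=local,o=grid",
    ldap_filter="(objectClass=GlueCE)",        # assumed object class name
    attributes=["GlueCEUniqueID", "GlueCEStateFreeCPUs"],
)
print(shlex.join(cmd))  # command line that could be pasted into a shell
```

Building the arguments as a list keeps the filter, which contains parentheses, from being interpreted by the shell if the command is later passed to `subprocess.run`.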
Top BDII vs Site BDII
• Site BDII
  – It collects the BDIIs of all the grid services of the site (for example SE, RB, LFC, etc.)
  – The name of the service is bdii
• Top BDII
  – It collects all site BDIIs*
  – The name of the service is bdii
  – It gives the RB/WMS all the information needed to match and
    dispatch users' jobs
  – It can run on the same machine where the RB/WMS is running
    (it then answers faster)
*BDII = Berkeley Database Information Index