Voms MyProxy hands-on

Enabling Grids for E-sciencE
Data Management
Ron Trompert
SARA
Grid Tutorial, 18-19 September 2006
www.eu-egee.org
INFSO-RI-508833
Outline
• Storage Infrastructures
• SRM
• Storage Elements in gLite
• Low Level Data Management
• LCG File Catalog (LFC)
• Data management CLIs and APIs
• Examples
• FTS
Grid Tutorial, RC RUG, 18-19 September 2006

Storage Infrastructures
• Disk-only
• Hierarchical storage management (HSM)
– Policy-based management of file backup and archiving that uses storage devices economically, without the user needing to be aware of when files are retrieved from or stored on backup storage media.
– The hierarchy represents different types of storage media, such as disk systems, optical storage, or tape, each type representing a different level of cost and speed of retrieval. For example, as a file ages in an archive, it can be automatically moved to a slower but less expensive form of storage.
– HSM software: TSM, DMF, CASTOR, Enstore, HPSS, …

Storage Infrastructures
• HSM example at SARA

SRM
• SRM standard
– SRM implementations provide uniform access to heterogeneous storage resources on the Grid
• Storage Resource Managers
– SRM is a control protocol for:
▪ Space reservation
▪ File management
   • Pinning
   • Lifetime management
▪ Replication
▪ Protocol negotiation

SRM
• SRM implementation
– SRM I/F is implemented as a web service
– Implementations:
▪ dCache (disk/HSM)
▪ DPM (disk)
▪ CASTOR (HSM)
▪ SRB (disk/HSM)
▪ ….
• SRM Examples
– srmRm
– srmLs
– srmPrepareToPut
– srmBringOnline
– srmCopy
– srmGetTransferProtocols
– ….

Storage Elements in gLite
• Classic SE
– No SRM
– Will be deprecated in the autumn of this year (2006)
– Transfer protocols: gridftp
– Storage type: disk
• DPM
– SRM
– Transfer protocols: gridftp, secure rfio
– Storage type: disk
• dCache
– SRM
– Transfer protocols: gridftp, gsidcap
– Storage type: disk, HSM

Low Level Data Management
• GridFTP (all SEs)
– globus-url-copy file:///home/ron/file \
gsiftp://srm.grid.sara.nl/pnfs/grid.sara.nl/data/dteam/file
– Third party transfer
▪ globus-url-copy gsiftp://hostA/pathA gsiftp://hostB/pathB
– Also edg-gridftp-ls, edg-gridftp-rm, edg-gridftp-mkdir, etc.
– UberFTP
▪ Interactive GridFTP client
▪ ftp commands
▪ GSI authentication
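The two globus-url-copy forms above can be sketched in Python as follows. This is only an illustration: the helper names are hypothetical, the code merely assembles the argument list, and actually running it assumes globus-url-copy is installed and a valid Grid proxy exists.

```python
from urllib.parse import urlparse

# Schemes that appear in the globus-url-copy examples on this slide.
ALLOWED_SCHEMES = {"file", "gsiftp"}


def globus_url_copy_argv(source, destination):
    """Build the argv for globus-url-copy (hypothetical helper, not a Globus API)."""
    for url in (source, destination):
        scheme = urlparse(url).scheme
        if scheme not in ALLOWED_SCHEMES:
            raise ValueError(f"unsupported scheme in {url!r}")
    return ["globus-url-copy", source, destination]


def is_third_party(source, destination):
    """True when both ends are gsiftp servers, so no data passes through the client."""
    return all(urlparse(u).scheme == "gsiftp" for u in (source, destination))
```

The returned list can be handed to subprocess.run(argv, check=True) on a machine that has the Globus client tools.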

Low Level Data Management
• gsidcap (dCache SEs)
– dccp -p 20000:25000 /tmp/file \
gsidcap://srm.grid.sara.nl:22128/pnfs/grid.sara.nl/data/dteam/file
– The port range 20000:25000 is derived from the GLOBUS_TCP_PORT_RANGE environment variable
• Secure rfio
– rfcp /path/myfile \
t2se01.physics.ox.ac.uk:/dpm/physics.ox.ac.uk/home/dteam/file
• srmcp (not for Classic SEs)
– srmcp file:////tmp/file \
srm://srm.grid.sara.nl:8443//pnfs/grid.sara.nl/data/dteam/file
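Deriving the dccp -p argument from GLOBUS_TCP_PORT_RANGE might look like the sketch below. It assumes the variable holds two port numbers separated by a comma or whitespace (for example 20000,25000); the function name is hypothetical.

```python
import re


def dccp_port_range(globus_range):
    """Convert a GLOBUS_TCP_PORT_RANGE value into the min:max form used by dccp -p.

    Assumption: the value is two integers separated by a comma or whitespace.
    """
    parts = [p for p in re.split(r"[,\s]+", globus_range.strip()) if p]
    if len(parts) != 2:
        raise ValueError(f"expected two ports, got {globus_range!r}")
    low, high = int(parts[0]), int(parts[1])
    if not 0 < low <= high <= 65535:
        raise ValueError(f"invalid port range {low}-{high}")
    return f"{low}:{high}"
```

In a script, the input would typically come from os.environ.get("GLOBUS_TCP_PORT_RANGE").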

Information system
• LDAP-based
– LDAP servers running on service nodes (GRIS/BDII)
– LDAP servers collecting the information for a site (site BDII)
– LDAP servers collecting the information for all sites (BDII)
• The environment variable LCG_GFAL_INFOSYS must be set
– It needs to point to a BDII
• lcg-infosites
– Example: finding an SE:
> lcg-infosites --vo tutor se
Avail Space(Kb)  Used Space(Kb)  Type  SEs
----------------------------------------------------------
214632           1901097784      n.a   tbn15.nikhef.nl
626880000        1163120000      n.a   tbn18.nikhef.nl
488106596        368854044       n.a   mu2.matrix.sara.nl
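A script that picks an SE from this listing could parse it roughly as follows. This is a sketch under the assumption that every data row has exactly four whitespace-separated fields with numeric space columns, as in the output above; rows of any other shape are skipped, and the function names are hypothetical.

```python
SAMPLE_OUTPUT = """\
Avail Space(Kb)  Used Space(Kb)  Type  SEs
----------------------------------------------------------
214632           1901097784      n.a   tbn15.nikhef.nl
626880000        1163120000      n.a   tbn18.nikhef.nl
488106596        368854044       n.a   mu2.matrix.sara.nl
"""


def parse_infosites_se(text):
    """Return one dict per SE row; header, separator and odd rows are ignored."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        avail_kb, used_kb, se_type, host = fields
        rows.append({
            "host": host,
            "avail_kb": int(avail_kb),
            "used_kb": int(used_kb),
            "type": se_type,
        })
    return rows


def most_free_space(rows):
    """Host name of the SE advertising the most available space."""
    return max(rows, key=lambda row: row["avail_kb"])["host"]
```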

Information system
• lcg-info
– For more advanced searches, for example finding out where to put your files:
> lcg-info --list-se --query 'SE=mu2.matrix.sara.nl' --attrs Path
- SE: mu2.matrix.sara.nl
  - Path    /flatfiles/SE00/tutor
• ldapsearch
– For the real troopers among us

LFC
• LFC stands for LCG File Catalog
– LCG stands for LHC Computing Grid
– LHC stands for Large Hadron Collider
• Users and programs produce and require data
– The Resource Broker can send (small amounts of) data to/from jobs: Input and Output Sandbox. Not recommended for large amounts of data
• Data is stored on the grid
– Located in Storage Elements
– Several replicas of one file in different sites
– Accessible by Grid users and applications from “anywhere”
– Locatable by the WMS/RB (data requirements in JDL)
• Also…
– Data may be copied between local filesystems (WNs, UIs) and the Grid, or opened remotely on the SE (GFAL, gsidcap, rfio).

LFC
• LFC
– Keeps track of the location of copies (replicas) of files on the Grid

Name conventions
• Logical File Name (LFN)
– An alias created by a user to refer to some item of data, e.g.
“lfn:/grid/tutor/mydir/myfile”
• Globally Unique Identifier (GUID)
– A non-human-readable unique identifier for an item of data, e.g.
“guid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6”
• Site URL (SURL) (or Physical File Name (PFN) or Site FN)
– The location of an actual piece of data on a storage system, e.g.
“srm://pcrd24.cern.ch/flatfiles/cms/output10_1” (SRM)
“sfn://lxshare0209.cern.ch/data/alice/ntuples.dat” (Classic SE)
• Transport URL (TURL)
– Locator of a replica plus an access protocol, understood by an SE, e.g.
“rfio://lxshare0209.cern.ch//data/alice/ntuples.dat”
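The four kinds of name can be told apart by their scheme prefix. The sketch below is an illustration only: the function name is hypothetical, and the TURL scheme set is limited to the transfer protocols mentioned in these slides.

```python
# Scheme sets taken from the examples and protocols named in the slides.
SURL_SCHEMES = {"srm", "sfn"}
TURL_SCHEMES = {"rfio", "gsiftp", "gsidcap"}


def classify_grid_name(name):
    """Return 'LFN', 'GUID', 'SURL' or 'TURL' based on the scheme prefix."""
    scheme = name.split(":", 1)[0].lower() if ":" in name else ""
    if scheme == "lfn":
        return "LFN"
    if scheme == "guid":
        return "GUID"
    if scheme in SURL_SCHEMES:
        return "SURL"
    if scheme in TURL_SCHEMES:
        return "TURL"
    raise ValueError(f"unrecognised grid file name: {name!r}")
```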

Naming conventions
• How do they fit together?
– LFC holds the mapping LFN-GUID-SURL
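[Diagram: inside the LFC, LFN 1 … LFN i all map to one GUID; the GUID maps to SURL 1 … SURL j; each SURL maps to its own TURLs (TURL 11 … TURL 1k for SURL 1, TURL j1 … TURL jl for SURL j).]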
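The LFN-GUID-SURL mapping can be modelled with a small in-memory structure. This toy model is not the LFC implementation or its API; all class and method names are hypothetical, and TURLs are left out because they are resolved per SURL by the storage element.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One logical file: a GUID with its aliases (LFNs) and replicas (SURLs)."""
    guid: str
    lfns: set = field(default_factory=set)
    surls: set = field(default_factory=set)


class ToyCatalog:
    def __init__(self):
        self._entries = {}      # guid -> CatalogEntry
        self._guid_of_lfn = {}  # lfn -> guid

    def register(self, guid, lfn, surl):
        entry = self._entries.setdefault(guid, CatalogEntry(guid))
        entry.lfns.add(lfn)
        entry.surls.add(surl)
        self._guid_of_lfn[lfn] = guid

    def add_alias(self, existing_lfn, new_lfn):
        guid = self._guid_of_lfn[existing_lfn]
        self._entries[guid].lfns.add(new_lfn)
        self._guid_of_lfn[new_lfn] = guid

    def add_replica(self, lfn, surl):
        self._entries[self._guid_of_lfn[lfn]].surls.add(surl)

    def guid(self, lfn):
        return self._guid_of_lfn[lfn]

    def replicas(self, lfn):
        return sorted(self._entries[self._guid_of_lfn[lfn]].surls)
```

Any LFN of a file leads to the same GUID and therefore to the same set of replicas, which is the many-to-one-to-many shape drawn in the diagram.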

LFC
• LFN acts as main key in the database. It has:
– Symbolic links to it (additional LFNs)
– Unique Identifier (GUID)
– System metadata
– Information on replicas
– One field of user metadata

LFC
• Two kinds of LFC
– Central LFC
   For each VO, one site on the grid publishes a global catalog. It records entries (file replicas or dataset entities) across the whole of the grid.
– Local LFC
   Local catalogs record only the file replicas stored at that site's SEs.

LFC
• Provides:
– User-exposed transactional C/C++ API (with automatic rollback on failure)
▪ Python wrapper provided (Python module lfc)
– Command line tools with administrative functionality
– Hierarchical Unix-like namespace and namespace operations for LFNs
▪ lfn:/grid/<vo name>/mydir/myfile
▪ lfc-mkdir, lfc-chmod
– Integrated GSI authentication and authorization
– Access Control Lists (Unix permissions and POSIX ACLs)
– Checksums
– Sessions (multiple operations inside a single transaction)
– Bulk operations (inside transactions)

LFC
Summary of the LFC Catalog commands
lfc-chmod        Change access mode of the LFC file/directory
lfc-chown        Change owner and group of the LFC file/directory
lfc-delcomment   Delete the comment associated with the file/directory
lfc-getacl       Get file/directory access control lists
lfc-ln           Make a symbolic link to a file/directory
lfc-ls           List file/directory entries in a directory
lfc-mkdir        Create a directory
lfc-rename       Rename a file/directory
lfc-rm           Remove a file/directory
lfc-setacl       Set file/directory access control lists
lfc-setcomment   Add/replace a comment

LFC
C/C++ API: Low level methods (many POSIX-like):
lfc_access       lfc_deleteclass  lfc_listreplica  lfc_setacl
lfc_aborttrans   lfc_delreplica   lfc_lstat        lfc_setatime
lfc_addreplica   lfc_endtrans     lfc_mkdir        lfc_setcomment
lfc_apiinit      lfc_enterclass   lfc_modifyclass  lfc_seterrbuf
lfc_chclass      lfc_errmsg       lfc_opendir      lfc_setfsize
lfc_chdir        lfc_getacl       lfc_queryclass   lfc_starttrans
lfc_chmod        lfc_getcomment   lfc_readdir      lfc_stat
lfc_chown        lfc_getcwd       lfc_readlink     lfc_symlink
lfc_closedir     lfc_getpath      lfc_rename       lfc_umask
lfc_creat        lfc_lchown       lfc_rewind       lfc_undelete
lfc_delcomment   lfc_listclass    lfc_rmdir        lfc_unlink
lfc_delete       lfc_listlinks    lfc_selectsrvr   lfc_utime
                                                   send2lfc

LFC Interfaces
• Integration with GFAL and lcg_utils APIs
▪ lcg-utils/GFAL access the catalog in a transparent way
• Integration with the WMS
– The RB can locate Grid files: allows for data-based matchmaking
– JDL file:
▪ InputData = "lfn:/grid/tutor/MyFile";

Data Management CLIs & APIs
• lcg_utils: lcg-* commands + lcg_* API calls
– Provide (all) the functionality needed by the LCG user
– Transparent interaction with file catalogs and storage interfaces when needed
– Abstraction from technology of specific implementations
• Grid File Access Library (GFAL): API
– Adds file I/O and explicit catalog interaction functionality
– Still provides the abstraction and transparency of lcg_utils

Data Management CLIs & APIs
lcg-utils commands: Replica Management
lcg-cp    Copies a grid file to a local destination
lcg-cr    Copies a file to an SE and registers the file in the catalog
lcg-del   Deletes one file
lcg-rep   Replicates a file between SEs and registers the replica
lcg-gt    Gets the TURL for a given SURL and transfer protocol
lcg-sd    Sets file status to “Done” for a given SURL in an SRM request
lcg-utils commands: File Catalog Interaction
lcg-aa    Adds an alias in the LFC for a given GUID
lcg-ra    Removes an alias in the LFC for a given GUID
lcg-rf    Registers in the LFC a file placed in an SE
lcg-uf    Unregisters in the LFC a file placed in an SE
lcg-la    Lists the aliases for a given SURL, GUID or LFN
lcg-lg    Gets the GUID for a given LFN or SURL
lcg-lr    Lists the replicas for a given GUID, SURL or LFN

Data Management CLIs & APIs
lcg-utils C/C++ API:
lcg_cp    lcg_lr
lcg_cr    lcg_ra
lcg_del   lcg_rf
lcg_rep   lcg_uf
lcg_sd    lcg_la
lcg_aa    lcg_lg
lcg_gt

Data Management CLIs & APIs
• GFAL
– Grid storage interactions today require using some existing software components:
▪ The file catalog services, to locate valid replicas of files in order to:
   • Download them to the user's local machine
   • Move them from one SE to another
   • Let jobs running on a worker node access and manage files stored on a remote storage element
▪ The SRM software, to ensure:
   • File existence on disk or in a disk pool (files are recalled from mass storage if necessary)
   • Space allocation on disk for new files (which are possibly migrated to mass storage later)

Data Management CLIs & APIs
• The GFAL Features
– Hides interactions with the SRM from the end user
– Provides a POSIX-like interface for file I/O operations
▪ POSIX calls prefixed with gfal_
– Based on shared libraries (both threaded and unthreaded versions)
– Needs only one header file (gfal_api.h) to write C applications
– Supports the following protocols:
▪ file for local access, also lfn/guid
▪ dcap, gsidcap and kdcap for the dCache access protocol
▪ rfio for the CASTOR access protocol
▪ SRM
– Access to SRMs in secure mode, i.e. using a valid Grid proxy obtained with the voms-proxy-init command.

Examples
• Using lcg-utils and lfc commands:
– Define the server hostname
▪ The LFC server must be published in the BDII ($LCG_GFAL_INFOSYS)
▪ Use the environment variable LFC_HOST: export LFC_HOST=<LFC_server_hostname>
▪ $LFC_HOST must be set
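A script can check these two variables before calling any lcg-* or lfc-* command. The sketch below is illustrative; the function name is hypothetical, and the host names in the usage note are placeholders rather than real servers.

```python
import os

# Variables named on the slides as prerequisites for the catalog commands.
REQUIRED_VARS = ("LCG_GFAL_INFOSYS", "LFC_HOST")


def missing_grid_env(environ=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

For example, missing_grid_env({"LFC_HOST": "lfc.example.org"}) reports that LCG_GFAL_INFOSYS still needs to be set.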

Examples
Listing the entries of a LFC directory
lfc-ls [-cdiLlRTu] [--class] [--comment] [--deleted] [--display_side] [--ds] path…
where path specifies the LFN pathname (mandatory)
– Remember that LFC has a directory tree structure
– /grid/<VO_name>/<you create it>
   (/grid/<VO_name> is the LFC namespace; the rest is defined by the user)
– All members of a VO have read-write permissions under their directory
– You can set LFC_HOME to use relative paths
> lfc-ls /grid/tutor/me
> export LFC_HOME=/grid/tutor
> lfc-ls -l me
-l : long listing
-R : list the contents of directories recursively: Don’t use it!
> lfc-ls -l -R /grid
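The effect of LFC_HOME on relative paths can be pictured with the following sketch. It mimics the behaviour described above rather than calling the LFC client; the function name is hypothetical.

```python
import posixpath


def resolve_lfn_path(path, lfc_home=None):
    """Turn an LFN path into an absolute catalog path, using LFC_HOME if relative."""
    if path.startswith("lfn:"):
        path = path[len("lfn:"):]
    if path.startswith("/"):
        return posixpath.normpath(path)
    if not lfc_home:
        raise ValueError("relative path given but LFC_HOME is not set")
    return posixpath.normpath(posixpath.join(lfc_home, path))
```

With LFC_HOME=/grid/tutor, the relative path me resolves to /grid/tutor/me, matching the two lfc-ls calls shown above.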

Examples
Creating directories in the LFC
lfc-mkdir [-m mode] [-p] path...
• Where path specifies the LFC pathname
• Remember that when registering a new file (using lcg-cr, for example), the corresponding destination directory must be created in the catalog beforehand.
• Examples:
> lfc-mkdir /grid/tutor/me
You can check the directory with:
> lfc-ls -l /grid/tutor/me
drwxr-xrwx 0 19122 1077 0 Jun 14 11:36 demo

Examples
Let us copy and register a file using lcg-utils
> lcg-cr --vo tutor -l me/test -d mu2.matrix.sara.nl file:`pwd`/test
guid:7b4efaef-bb0f-42a3-bb6f-bbe35080d105
> lcg-lr --vo tutor lfn:me/test
sfn://mu2.matrix.sara.nl/flatfiles/SE00/tutor/generated/2006-0918/file378fc829-351f-4558-8679-9d2ce530cbb4
> lfc-ls -l me
-rw-rw-r-- 1 30010 2024 114 Sep 18 10:33 test
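The copy-and-register step can be wrapped in a script along these lines. The sketch only assembles the lcg-cr arguments in the same order as the example and extracts the GUID from the command's output; both helper names are hypothetical, and running the command assumes a UI machine with lcg-utils and a valid proxy.

```python
import os


def lcg_cr_argv(vo, local_path, lfn, dest_se):
    """Build the lcg-cr argv; the local file is passed as a file: URL with an absolute path."""
    source_url = "file:" + os.path.abspath(local_path)
    return ["lcg-cr", "--vo", vo, "-l", lfn, "-d", dest_se, source_url]


def extract_guid(output):
    """Return the first line of the output that starts with 'guid:', or None."""
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("guid:"):
            return line
    return None
```

os.path.abspath plays the role of the `pwd` substitution in the shell example.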

Examples
Creating a symbolic link
lfc-ln -s file linkname
lfc-ln -s directory linkname
Create a link to the specified file or directory with linkname
– Examples:
> lfc-ln -s /grid/tutor/me/test /grid/tutor/aLink
   (first argument: original file; second argument: symbolic link)
Let’s check the link using lfc-ls with long listing (-l):
> lfc-ls -l
lrwxrwxrwx 1 30010 2024 0 Sep 18 10:38 aLink -> /grid/tutor/me/test

Examples
Adding/deleting metadata information
lfc-setcomment path comment
lfc-delcomment path
lfc-setcomment adds/replaces a comment associated with a file/directory in the LFC Catalog
lfc-delcomment deletes a comment previously added
• This is the only metadata (one field) supported by the catalog
• Examples:
> lfc-setcomment me/test "nice file"
• Let’s see what happened:
> lfc-ls --comment /grid/tutor/me/test
/grid/tutor/me/test nice file

Examples
Deleting the file
lfc-rm
lfc-rm removes a file/link/directory from the catalog only
lcg-del
lcg-del removes the file from the SEs and the LFNs/links from the catalog
• Example, delete all replicas:
> lcg-del -a --vo tutor guid:8e413879-7cb3-4260-af9f-6964392da7e8
• Example, delete only one replica:
> lcg-del --vo tutor -s mu2.matrix.sara.nl guid:8e413879-7cb3-4260-af9f-6964392da7e8
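The choice between deleting every replica and deleting one can be made explicit in a helper. This sketch assumes that -a (all replicas) and -s (the replica on one SE) are meant as alternatives; the function name is hypothetical and the code only builds the argument list.

```python
def lcg_del_argv(vo, guid, se=None):
    """Build the lcg-del argv: all replicas when se is None, else only the one on that SE."""
    if se is None:
        # Assumption: -a removes every replica and the catalog entries.
        return ["lcg-del", "-a", "--vo", vo, guid]
    # Assumption: -s restricts the deletion to the replica stored on the given SE.
    return ["lcg-del", "--vo", vo, "-s", se, guid]
```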

File Transfer Service
• A batch system for submitting data transfer jobs
• For data-intensive sciences
– Currently in use in the LCG project

FTS
• Allows for
– Managed transfers by means of channels to sites
▪ Channels are between sites, for example CERN-SARA.
▪ Site admins can adapt the configuration of incoming channels to their site, switch their channel off, etc.
▪ Priorities can be set for different VOs.
– Optimisation of network tuning parameters per channel

FTS
• Command line interface
– glite-transfer-cancel
▪ Cancels a file transfer job
– glite-transfer-list
▪ Lists ongoing data transfer jobs
– glite-transfer-status
▪ Displays the status of an ongoing data transfer job
– glite-transfer-submit
▪ Submits a new data transfer job