Tutorial Use Cases


Using The EDG Testbed
The European DataGrid Project Team
http://www.eu-datagrid.org
Summary

Use Cases:
High Energy Physics
Earth Observation
Biomedical Applications
EDG Use Cases Tutorial - n° 2
EDG Application Areas
Earth Observation Science Applications
Biomedical Applications
High Energy Physics
High Energy Physics
4 Experiments on LHC: ALICE, ATLAS, CMS, LHCb
~6-8 PetaBytes / year
~10^8 events/year
~10^3 batch and interactive users
CERN’s Network in the World
Europe: 267 institutes, 4603 users
Elsewhere: 208 institutes, 1632 users
Data Flow in LHC
[Diagram: two parallel processing chains.
Real data: DAQ; Trigger; RAW Data and RAW Tags; Conditions / Calibration Data; Reconstruction; Event Summary Data (ESD) and Reconstruction Tags.
Monte Carlo: Physics Generator; Generator Data; Detector Simulation; Conditions / Calibration Data; RAWmc Data and RAWmc Tags; Reconstruction; Event Summary Data (ESD) and Reconstruction Tags.]
LHCb EDG Integration

LHCb
LHCb distributed computing environment
Integration of DataGrid middleware:
1. Authentication
2. Job submission to DataGrid
3. Monitoring and control
4. Data replication
5. Resource scheduling – use of CERN MSS
LHCb

LHC collider experiment
10^9 events * 1 MB = 1 PB
Need a distributed model
Create, distribute and keep track of data automatically
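
The storage figure on this slide can be checked with a few lines of arithmetic. The sketch below reads the slide as 10^9 events of 1 MB each and assumes decimal units (1 PB = 10^9 MB); the function name is only illustrative.

```python
# Back-of-the-envelope check of the LHCb data volume quoted on the slide.
EVENTS = 10**9        # number of events (slide figure)
EVENT_SIZE_MB = 1     # size of one event in megabytes (slide figure)
MB_PER_PB = 10**9     # assumption: decimal units, 1 PB = 10^9 MB


def total_petabytes(events, event_size_mb):
    """Return the total data volume in petabytes."""
    return events * event_size_mb / MB_PER_PB


print(total_petabytes(EVENTS, EVENT_SIZE_MB))
```

A volume of this order is what motivates the distributed model and the automatic bookkeeping mentioned above.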
LHCb distributed computing environment
[Diagram of the production workflow, with these steps: Submit jobs remotely via Web; Execute on farm; Data Quality Check; Transfer data to Mass store; Update bookkeeping database.]
LHCb Environment using EDG Middleware
[Diagram: the production workflow annotated with EDG components.
Workflow steps: Submit jobs remotely via Web; Execute on farm; Monitor performance of farm via Web; Transfer data to CASTOR (and HPSS, RAL Datastore); Update bookkeeping database; Data Quality Check ‘Online’; Online histogram production using GRID pipes.
EDG components: User Interface; WMS; Information Services; Replica Management; MetaData Catalog.]
1. Authentication

Issue grid-proxy-init to obtain a valid proxy certificate for the user.
2. Job Submission

dg-job-submit /home/evh/sicb/sicb/bbincl1600061.jdl -o /home/evh/logsub

bbincl1600061.jdl:
#
Executable = "script_prod";
Arguments = "1600061,v235r4dst,v233r2";
StdOutput = "file1600061.output";
StdError = "file1600061.err";
InputSandbox = {"/home/evhtbed/scripts/x509up_u149",
"/home/evhtbed/sicb/mcsend",
"/home/evhtbed/sicb/fsize",
"/home/evhtbed/sicb/cdispose.class",
"/home/evhtbed/v235r4dst.tar.gz",
"/home/evhtbed/sicb/sicb/bbincl1600061.sh",
"/home/evhtbed/script_prod",
"/home/evhtbed/sicb/sicb1600061.dat",
"/home/evhtbed/sicb/sicb1600062.dat",
"/home/evhtbed/sicb/sicb1600063.dat",
"/home/evhtbed/v233r2.tar.gz"};
OutputSandbox = {"job1600061.txt","D1600063","file1600061.output",
"file1600061.err","job1600062.txt","job1600063.txt"};
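
Since production JDL files like the one above differ mainly in the run number, they lend themselves to generation from a template. The following is a minimal sketch, not the LHCb production tooling: the helper names and the parameter names `tag_a` and `tag_b` are hypothetical, and only attributes that appear in the slide's JDL are used.

```python
def jdl_value(value):
    """Render a Python value as a JDL literal: list in braces, quoted string, or bare number."""
    if isinstance(value, (list, tuple)):
        return "{" + ",".join(jdl_value(item) for item in value) + "}"
    if isinstance(value, str):
        return '"' + value + '"'
    return str(value)


def build_jdl(attributes):
    """Render a dict of attributes as 'Name = value;' lines."""
    lines = [f"{name} = {jdl_value(value)};" for name, value in attributes.items()]
    return "\n".join(lines) + "\n"


def production_job(run, tag_a, tag_b):
    """Hypothetical helper: attributes for one run, modelled on the slide's JDL."""
    return {
        "Executable": "script_prod",
        "Arguments": f"{run},{tag_a},{tag_b}",
        "StdOutput": f"file{run}.output",
        "StdError": f"file{run}.err",
        "OutputSandbox": [f"job{run}.txt", f"file{run}.output", f"file{run}.err"],
    }


print(build_jdl(production_job(1600061, "v235r4dst", "v233r2")))
```

The generated text could be written to a `.jdl` file and passed to dg-job-submit as in the command shown on the slide.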
3. Monitoring and Control

dg-job-status

dg-job-cancel

dg-job-get-output
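
In practice these commands are used in a loop: check the status until the job reaches a final state, then fetch the output or cancel. The sketch below shows only that control flow. It assumes a caller-supplied `get_status` function (for example, one that wraps dg-job-status and parses its output) and a caller-supplied set of final state names; neither is taken from the EDG documentation.

```python
import time


def wait_for_job(get_status, final_states, poll_seconds=30.0, max_polls=100,
                 sleep=time.sleep):
    """Poll get_status() until it returns a final state or max_polls is reached.

    Returns (final_state_or_None, list_of_observed_states).
    """
    history = []
    for _ in range(max_polls):
        state = get_status()
        history.append(state)
        if state in final_states:
            return state, history
        sleep(poll_seconds)
    return None, history


# Usage with a fake status source; the state names here are illustrative only.
fake_states = iter(["Waiting", "Running", "Done"])
final, seen = wait_for_job(lambda: next(fake_states), {"Done", "Aborted"},
                           sleep=lambda seconds: None)
print(final, seen)
```

Injecting `get_status` and `sleep` keeps the loop testable without access to a testbed.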
[Diagram, built up over five slides: data replication between two sites.
Components: Compute Element (Job, Local disk); Storage Element; Mass store; replica catalog (Nikhef); second Storage Element with its own Job.
Operations added step by step: globus-url-copy (data to the Storage Element); rfcp (data to the Mass store); register-local-file and publish (to the replica catalog); replica-get, globus-url-copy and register-local-file (copy to the second Storage Element).]
4. Publish data on storage element

Copy data file to storage element:
globus-url-copy file:///${chemin}/L69999 \
gsiftp://lxshare0219.cern.ch/flatfiles/SE1/lhcb/L69999

Register stored data in the catalog:
/opt/globus/bin/globus-job-run lxshare0219.cern.ch \
/bin/bash -c "export GDMP_CONFIG_FILE=/opt/edg/lhcb/etc/gdmp.conf; \
/opt/edg/bin/gdmp_register_local_file -d /flatfiles/SE1/lhcb"

Publish catalog:
/opt/globus/bin/globus-job-run lxshare0219.cern.ch \
/bin/bash -c "export GDMP_CONFIG_FILE=/opt/edg/lhcb/etc/gdmp.conf; \
/opt/edg/bin/gdmp_publish_catalogue -n"
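
The three steps above (copy, register, publish) follow a fixed pattern, so a job script can assemble them from a few parameters. This is a minimal sketch that only builds the argument lists, using the commands and paths shown on the slide; the function name is hypothetical, and the local path `/tmp/L69999` in the usage line is a made-up stand-in for `${chemin}/L69999`.

```python
def publish_commands(local_path, se_host, se_dir, gdmp_config):
    """Build the copy, register and publish commands as argument lists."""
    file_name = local_path.rsplit("/", 1)[-1]

    def on_storage_element(gdmp_command):
        # Run a GDMP command remotely on the storage element via globus-job-run.
        remote = f"export GDMP_CONFIG_FILE={gdmp_config}; {gdmp_command}"
        return ["/opt/globus/bin/globus-job-run", se_host, "/bin/bash", "-c", remote]

    return [
        ["globus-url-copy", f"file://{local_path}",
         f"gsiftp://{se_host}{se_dir}/{file_name}"],
        on_storage_element(f"/opt/edg/bin/gdmp_register_local_file -d {se_dir}"),
        on_storage_element("/opt/edg/bin/gdmp_publish_catalogue -n"),
    ]


for command in publish_commands("/tmp/L69999", "lxshare0219.cern.ch",
                                "/flatfiles/SE1/lhcb",
                                "/opt/edg/lhcb/etc/gdmp.conf"):
    print(command)
```

Each list could be handed to `subprocess.run` on a machine where the Globus and GDMP tools are installed; keeping construction separate from execution makes the commands easy to inspect first.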
The ALICE Event
The ALICE Event Cont’d

JDL file:
## ----- Job Description for Aliroot -----
## author: [email protected]
Executable = "/bin/sh";
StdOutput = "aliroot.out";
StdError = "aliroot.err";
InputSandbox = {"start_aliroot.sh","rootrc","grun.C","Config.C"};
OutputSandbox = {"aliroot.err","aliroot.out","galice.root"};
RetryCount = 7;
Arguments = "start_aliroot.sh 3.02.04 3.07.01";
Requirements = Member(other.RunTimeEnvironment,"ALICE3.07.01");

Script (start_aliroot.sh):
#!/bin/sh
mv rootrc $HOME/.rootrc
echo "ALICE_ROOT_DIR is set to: $ALICE_ROOT_DIR"
export ROOTSYS=$ALICE_ROOT_DIR/root/$1
export PATH=$PATH:$ROOTSYS/bin
export LD_LIBRARY_PATH=$ROOTSYS/lib:$LD_LIBRARY_PATH
export ALICE=$ALICE_ROOT_DIR/aliroot
export ALICE_LEVEL=$2
export ALICE_ROOT=$ALICE/$ALICE_LEVEL
export ALICE_TARGET=`uname`
export LD_LIBRARY_PATH=$ALICE_ROOT/lib/tgt_$ALICE_TARGET:$LD_LIBRARY_PATH
export PATH=$PATH:$ALICE_ROOT/bin/tgt_$ALICE_TARGET:$ALICE_ROOT/share
export MANPATH=$MANPATH:$ALICE_ROOT/man
$ALICE_ROOT/bin/tgt_$ALICE_TARGET/aliroot -q -b grun.C
Earth Observation Application
2 different jobs are executed on the TESTBED, using data provided via the sandbox model:
Processing of raw GOME data to ozone profiles with OPERA (KNMI). Input: raw satellite data from the GOME instrument (ESA).
Validate GOME ozone profiles with Ground Based measurements (IPSL). Input: LIDAR data.
Visualization
OPERA application (KNMI)
Ozone profiles can be calculated from the wave spectra measured by the GOME instrument on the ERS satellite. ESA provides these spectra as level 1 data. This level 1 data is then processed using OPERA to produce ozone profiles, a level 2 product. The algorithm and software (OPERA) are developed by KNMI.
GOME takes ~30,000 usable measurements for ozone profile retrieval per day.
The calculation of 1 profile takes ~2 min on an 800 MHz PIII.
One day of profiles will take about 40 days on 1 computer.
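
The "40 days" estimate follows directly from the two figures on the slide. The sketch below reproduces it and, assuming the profiles are independent and scale perfectly across machines (an idealisation, not a claim from the slide), estimates how many CPUs would be needed to keep up with the instrument.

```python
import math

MEASUREMENTS_PER_DAY = 30_000   # usable GOME measurements per day (slide figure)
MINUTES_PER_PROFILE = 2         # on an 800 MHz PIII (slide figure)
MINUTES_PER_DAY = 24 * 60


def processing_days(n_cpus=1):
    """Days needed to process one day of measurements on n_cpus identical CPUs."""
    total_minutes = MEASUREMENTS_PER_DAY * MINUTES_PER_PROFILE
    return total_minutes / MINUTES_PER_DAY / n_cpus


def cpus_to_keep_up():
    """Smallest number of CPUs that processes one day of data within one day."""
    return math.ceil(processing_days(1))


print(round(processing_days(1), 1))   # days on a single computer
print(cpus_to_keep_up())              # CPUs needed for real-time processing
```

The single-machine figure comes out slightly above 40 days, which is consistent with the rounded value on the slide.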
Validation application (IPSL)
The profiles produced by OPERA are validated by IPSL using ground-based LIDAR measurements.
Since the LIDAR data are in-situ, the global GOME data has to be preselected to create a dataset that coincides with the LIDAR measurements geographically and temporally.
The main function of the program is to compute statistics such as the bias between GOME and LIDAR data at different altitudes, and its standard deviation.
The output of the validation program is 2 plots, generated by xmgr.
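
To make the statistic concrete, here is a minimal sketch of a per-altitude bias computation. It assumes the bias is the mean of (GOME minus LIDAR) over coincident measurements and uses the sample standard deviation; the IPSL program may define these differently (for example as relative differences), and the numbers are invented for illustration.

```python
from statistics import mean, stdev


def bias_by_altitude(pairs):
    """pairs: iterable of (altitude_km, gome_value, lidar_value).

    Returns {altitude: (bias, standard_deviation)} with bias = mean(GOME - LIDAR).
    """
    differences = {}
    for altitude, gome, lidar in pairs:
        differences.setdefault(altitude, []).append(gome - lidar)
    return {
        altitude: (mean(values), stdev(values) if len(values) > 1 else 0.0)
        for altitude, values in sorted(differences.items())
    }


# Invented coincident measurements: (altitude in km, GOME value, LIDAR value).
sample = [(20, 5.0, 4.0), (20, 6.0, 4.0), (30, 3.0, 3.5), (30, 2.0, 2.5)]
for altitude, (bias, spread) in bias_by_altitude(sample).items():
    print(altitude, bias, spread)
```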
JDL file used
Executable = "o3gome-lidar_xmgr.final";
StdOutput = "appli.out";
StdError = "appli.err";
InputSandbox = {"/home/leroy/DEMO_190202/o3gome-lidar_xmgr.final",
"/home/leroy/DEMO_190202/obs20001019.dat",
"/home/leroy/DEMO_190202/obs20001002.dat",
"/home/leroy/DEMO_190202/obs20001003.dat",
"/home/leroy/DEMO_190202/obs20001004.dat",
"/home/leroy/DEMO_190202/obs20001005.dat",
"/home/leroy/DEMO_190202/obs20001006.dat",
"/home/leroy/DEMO_190202/select_coinc.exe",
"/home/leroy/DEMO_190202/data_process_demoxmgr",
"/home/leroy/DEMO_190202/oho30010.gol"};
OutputSandbox = {"out_proc.dat","profil_gome.dat","profil_lidar.dat",
"appli.out","appli.err"};
Requirements = other.OpSys == "RH 6.2";
RetryCount = 10;
Rank = other.MaxCpuTime;
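
The Requirements and Rank attributes drive resource selection: Requirements filters the candidate resources, and Rank orders those that remain. The toy sketch below mimics that idea in Python, assuming that a higher Rank value is preferred; the resource names and values are invented, and this is not the WMS matchmaking code.

```python
def match(resources, requirement, rank):
    """Keep resources that satisfy the requirement, best rank first."""
    candidates = [resource for resource in resources if requirement(resource)]
    return sorted(candidates, key=rank, reverse=True)


# Invented resource descriptions using the attribute names from the JDL above.
resources = [
    {"name": "ce-a", "OpSys": "RH 6.2", "MaxCpuTime": 720},
    {"name": "ce-b", "OpSys": "RH 7.1", "MaxCpuTime": 2880},
    {"name": "ce-c", "OpSys": "RH 6.2", "MaxCpuTime": 1440},
]

ordered = match(
    resources,
    requirement=lambda other: other["OpSys"] == "RH 6.2",   # Requirements
    rank=lambda other: other["MaxCpuTime"],                  # Rank
)
print([resource["name"] for resource in ordered])
```

Here `ce-b` is excluded by the operating system requirement even though it offers the most CPU time.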
Validation Output
Figure 1: Estimation of the bias between GOME and LIDAR using one month of data.
Figure 2: Example of 2 profiles: comparison between the GOME profile and the LIDAR profile for 2 October 2000.
World-Wide Ozone Distribution Mapping
Need for systematic and global mapping of ozone distribution
Instruments: GOME, SCIAMACHY
Large amount of information about atmospheric gases stored in Terabytes of data
Scientific community: need for a collaborative environment to study problems such as ozone depletion
GRID
Example of Application Description
Compute global ozone mapping from 1997-98 GOME instrument data
1 yr = 5110 data files
1 data file = 15 MB (raw)
= 67 GB of data to process
= 5110 jobs to run
[Diagram with steps numbered 1 to 5. Labels: Generate 1..n LFNs; List of LFNs; Build JDL script; JDL Script; IDL Program; Submit Job; WMS; GRID; View Results; 5110 x 700 KB.]
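
The workflow on this slide (generate a list of LFNs, build one JDL script per file, submit) can be sketched as follows. The LFN naming scheme, the executable name `ozone_map.sh`, and the choice to pass the LFN as the job argument are all assumptions made for illustration; only the count of 5110 files comes from the slide.

```python
def make_lfns(prefix, count):
    """Generate `count` logical file names; the naming scheme is hypothetical."""
    return [f"{prefix}{index:04d}" for index in range(1, count + 1)]


def make_jdl(lfn, executable):
    """Build a JDL script for one data file, using attributes seen earlier."""
    return "\n".join([
        f'Executable = "{executable}";',
        f'Arguments = "{lfn}";',
        f'StdOutput = "{lfn}.out";',
        f'StdError = "{lfn}.err";',
        f'OutputSandbox = {{"{lfn}.out","{lfn}.err"}};',
    ]) + "\n"


lfns = make_lfns("gome_", 5110)                       # one LFN per data file
jdls = {lfn: make_jdl(lfn, "ozone_map.sh") for lfn in lfns}

print(len(jdls))            # number of jobs to submit
print(jdls["gome_0001"])    # JDL text for the first file
```

Each generated script would then be written to disk and submitted to the WMS as a separate job.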
Further Information

High Energy Physics
http://datagrid-wp8.web.cern.ch/DataGrid-WP8/

Bio-Informatics
http://marianne.in2p3.fr/datagrid/wp10/index.html

Earth Observation
http://styx.esrin.esa.it/grid/