Enabling Grids for E-sciencE
LCG ARDA project
Status and plans
Dietrich Liko / CERN
http://arda.cern.ch
INFSO-RI-508833
The ARDA project
• ARDA is an LCG project
– main activity is to enable LHC analysis on the grid
– ARDA is contributing to EGEE
 Includes entire CERN NA4-HEP resource (NA4 = Applications)
• Interface with the new EGEE middleware (gLite)
– By construction, ARDA uses the new middleware
 Follow the grid software as it matures
– Verify the components in an analysis environment
 Contributions in the experiments' frameworks (discussion, direct contribution, benchmarking, …)
 User feedback is fundamental – in particular from physicists needing distributed computing to perform their analyses
– Provide early and continuous feedback
2
ARDA prototype overview

LHC Experiment | Main focus | Basic prototype component/framework
LHCb | GUI to Grid | GANGA/DaVinci
ALICE | Interactive analysis | PROOF/AliROOT
ATLAS | High-level services | DIAL/Athena
CMS | Explore/exploit native gLite functionality (middleware) | ORCA
3
Ganga4
(material from Andrew Maier's slides, 4th ARDA Workshop, March 2005)
Ganga 3 - the current release
• The current release of Ganga (version 3) is mainly a GUI application
• New in Ganga 3 is the availability of a command line interface
Ganga 4 - introduction
• The current status presented is the result of an intensive 8-week discussion:
 A lot of the information shown is work in progress
 Certain aspects are not yet fully defined and agreed upon
 Changes are likely to happen
 Up-to-date information can be found on the Ganga web page: http://cern.ch/ganga
Ganga 4 - design (you have seen already everything in the presentation of Ulrik)
• Major version
• Important contribution from the ARDA team
• Interesting concepts
• Note that GANGA is a joint ATLAS-LHCb project
• Contacts with CMS (exchange of ideas, code snippets, …)
• Ganga 4 is decomposed into 4 functional components; these components also describe the components in a distributed model
• Strategy: design each component so that it could be a separate service, but allow two or more components to be combined into a single service (a minimal sketch of this decomposition follows below)
• Internal architecture: Client, Remote Registry, Application Manager, Job Manager
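To make the component decomposition concrete, here is a minimal Python sketch, assuming a toy Registry, Application Manager, Job Manager and Client; all class and method names are illustrative assumptions and are not the real Ganga 4 API.

# Minimal sketch (not Ganga code): the Ganga 4 idea of four functional
# components -- Registry, Application Manager, Job Manager and Client --
# which can be used in one process or each wrapped as a separate service.
# All names below are illustrative assumptions, not the real Ganga 4 API.

class Registry:
    """Keeps track of submitted jobs and their descriptions."""
    def __init__(self):
        self._jobs = {}
    def register(self, job_id, description):
        self._jobs[job_id] = description
    def get(self, job_id):
        return self._jobs[job_id]

class ApplicationManager:
    """Prepares the experiment application configuration for a job."""
    def configure(self, application, version):
        return {"application": application, "version": version}

class JobManager:
    """Submits configured jobs to a backend (e.g. gLite, LSF, local)."""
    def submit(self, description, backend="gLite"):
        print(f"submitting {description} to {backend}")
        return "job-0001"  # backend job identifier

class Client:
    """Front end (GUI or command line) combining the other components."""
    def __init__(self, registry, app_mgr, job_mgr):
        self.registry, self.app_mgr, self.job_mgr = registry, app_mgr, job_mgr
    def run(self, application, version):
        description = self.app_mgr.configure(application, version)
        job_id = self.job_mgr.submit(description)
        self.registry.register(job_id, description)
        return job_id

# The same classes could live in one process (as here) or each behind its
# own remote service, matching the "distributed model" on the slide.
client = Client(Registry(), ApplicationManager(), JobManager())
client.run("DaVinci", "v12r15")  # example application/version strings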
4
ALICE prototype
ROOT and PROOF
• ALICE provides
– the UI
– the analysis application (AliROOT)
• GRID middleware gLite provides all the rest
(diagram: end-to-end chain – UI shell, gLite middleware, application)
• ARDA/ALICE is evolving the ALICE analysis system
5
Demo at Supercomputing 04 and Den Haag
(diagram: a user session connected to the PROOF master server at Site A, with PROOF slaves at Sites A, B and C)
Demo based on a hybrid system using the 2004 prototype
6
ARDA shell + C/C++ API
• A C++ access library for gLite has been developed by ARDA
 Essential for the ALICE prototype
 Generic enough for general use
• Using this API, grid commands have been added seamlessly to the standard shell
(diagram: the client application talks through a C-API (POSIX-style) and a gSOAP security wrapper (GSI, SSL, UUEnc) to the server application/service)
• Notes on the protocol: high performance; quite proprietary...
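As an illustration of how such a C access library can be exposed in an interactive environment, here is a minimal Python sketch using ctypes; the shared-library name, function names and signatures are pure assumptions for illustration and are not the actual ARDA/gLite API.

# Minimal sketch: surfacing a C access library in an interactive session via
# Python's ctypes. The library name and the exported functions ("grid_ls",
# "grid_submit") are hypothetical stand-ins, not the real ARDA/gLite C API.
import ctypes

lib = ctypes.CDLL("libgrid_api.so")           # hypothetical library name
lib.grid_ls.argtypes = [ctypes.c_char_p]      # hypothetical: list a catalogue directory
lib.grid_ls.restype = ctypes.c_char_p
lib.grid_submit.argtypes = [ctypes.c_char_p]  # hypothetical: submit a JDL file
lib.grid_submit.restype = ctypes.c_int

def ls(path):
    """List a grid catalogue directory from the interactive session."""
    return lib.grid_ls(path.encode()).decode()

def submit(jdl_path):
    """Submit a job described by a JDL file; returns a status code."""
    return lib.grid_submit(jdl_path.encode())

# Interactive usage: print(ls("/alice/sim/2004")); submit("analysis.jdl")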
7
Current Status
• Developed gLite C++ API and API Service
– providing a generic interface to any GRID service
• C++ API is integrated into ROOT
– In the ROOT CVS
– job submission and job status query for batch analysis can be done from inside ROOT
• Bash interface for gLite commands with catalogue expansion is developed
– More powerful than the original shell
– In use in ALICE
– Considered a “generic” mw contribution (essential for ALICE, interesting in general)
• First version of the interactive analysis prototype ready
• Batch analysis model is improved
– submission and status query are integrated into ROOT
– job splitting based on XML query files (see the sketch below)
– application (Aliroot) reads files using xrootd without prestaging
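A rough illustration of the job-splitting step: the query result (an XML list of input files) is cut into chunks, one per sub-job. The XML tag names and the chunking policy below are assumptions for illustration, not the actual gLite/AliEn collection format.

# Minimal sketch: split a batch analysis into sub-jobs from an XML query file.
# The tag/attribute names ("event", "lfn") and the chunking policy are
# illustrative assumptions, not the real gLite/ALICE collection format.
import xml.etree.ElementTree as ET

def split_jobs(xml_path, files_per_job=10):
    """Return a list of sub-job input lists, each with at most files_per_job LFNs."""
    root = ET.parse(xml_path).getroot()
    lfns = [event.get("lfn") for event in root.iter("event")]
    return [lfns[i:i + files_per_job] for i in range(0, len(lfns), files_per_job)]

# Each chunk becomes one sub-job, which reads its files via xrootd directly
# (no prestaging), as described above.
for i, chunk in enumerate(split_jobs("query_result.xml")):
    print(f"sub-job {i}: {len(chunk)} input files")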
8
ATLAS/ARDA
• Main component:
– Contribute to the DIAL evolution
 gLite analysis server
• “Embedded in the experiment”
– AMI tests and interaction
– Production and CTB tools
– Job submission (ATHENA jobs)
– Integration of the gLite Data Management within Don Quijote
– Active participation in several ATLAS reviews
• Benefit from the other experiments prototypes
– First look at interactivity/resiliency issues
 e.g. use of DIANE
– GANGA (Principal component of the LHCb prototype, key component
of the overall ATLAS strategy)
9
Data Management (Tao-Sheng Chen, ASCC)
Don Quijote
• Locate and move data over grid boundaries
• ARDA has connected gLite as an additional grid flavour
(diagram: the DQ client talks to DQ servers for GRID3, Nordugrid, LCG and gLite, each with its own RLS catalogue and storage element)
10
Combined Test Beam
• Real data processed on gLite
• Standard Athena for testbeam
• Data from CASTOR, processed on a gLite worker node
• Example: ATLAS TRT data analysis done by PNPI St Petersburg
(plot: number of straw hits per layer)
11
DIANE
12
DIANE on gLite running Athena
13
DIANE on LCG (Taiwan)
A worker died – no problem, its tasks get reallocated
Jobs need some time to start up – no problem
14
ARDA/CMS
• Prototype (ASAP)
• Contributions to CMS-specific components
– RefDB/PubDB
• Usage of components used by CMS
– Notably Monalisa
• Contribution to CMS-specific developments
– Physh
15
ARDA/CMS
• RefDB Re-Design and PubDB
– Taking part in the RefDB redesign
– Developing schema for PubDB and supervising development of the
first PubDB version
• Analysis Prototype Connected to MonAlisa
– Tracking the progress of an analysis task is troublesome when the task is split into several (hundreds of) sub-jobs
– The analysis prototype gives each sub-job a built-in ‘identity’ and the capability to report its progress to the MonAlisa system (see the sketch below)
– The MonAlisa service receives and combines the progress reports of the individual sub-jobs and publishes the overall progress of the whole task
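A minimal sketch of the reporting side, assuming a sub-job that periodically sends key/value progress reports to a monitoring endpoint; the plain-UDP transport and the host/port below are hypothetical stand-ins for MonAlisa's real ApMon client, used only to show the idea.

# Minimal sketch: one sub-job reporting its progress so that a monitoring
# service can aggregate per-task totals. The transport (plain UDP with a
# "task/subjob key=value" payload) and the endpoint are hypothetical
# stand-ins for MonAlisa's ApMon client.
import socket
import time

MONITOR_HOST, MONITOR_PORT = "monitor.example.org", 8884  # hypothetical endpoint

def report(task_id, subjob_id, events_processed):
    """Send one progress datapoint: task/sub-job identity plus a counter."""
    payload = f"{task_id}/{subjob_id} events_processed={events_processed}"
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(payload.encode(), (MONITOR_HOST, MONITOR_PORT))
    sock.close()

# Inside the event loop of one sub-job (example identifiers):
for n_done in range(0, 1001, 250):
    report("htautau-analysis", "subjob-042", n_done)
    time.sleep(1)  # in reality, report every N processed events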
16
CMS - Using MonAlisa
for user job monitoring
Demo at
Supercomputing
04
A single job is submitted to gLite
The JDL contains job-splitting instructions
The master job is split by gLite into sub-jobs
Dynamic monitoring of the total number of events processed by all sub-jobs belonging to the same master job
17
ARDA/CMS
• PhySh
– Physicist Shell
– ASAP is Python-based and uses XML-RPC calls for client-server interaction, like Clarens and PhySh
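Since the point here is the XML-RPC client-server style, a minimal, self-contained Python illustration using only the standard library follows; the service method ("job_status") and the port are invented for the example and are not part of ASAP, Clarens or PhySh.

# Minimal sketch of the XML-RPC client/server style mentioned above, using
# only the Python standard library. The method name and port are
# illustrative, not the actual ASAP/PhySh/Clarens interfaces.
from xmlrpc.server import SimpleXMLRPCServer
import threading
import xmlrpc.client

def job_status(job_id):
    """Toy server-side method: return the status of a job identifier."""
    return {"job_id": job_id, "status": "Running", "events_processed": 1200}

server = SimpleXMLRPCServer(("localhost", 8000), logRequests=False)
server.register_function(job_status)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: remote calls look like ordinary Python method calls.
proxy = xmlrpc.client.ServerProxy("http://localhost:8000")
print(proxy.job_status("subjob-042"))
server.shutdown()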
18
ARDA/CMS
• CMS prototype (ASAP = Arda Support for cms Analysis Processing)
– A first version of the CMS analysis prototype, capable of creating, submitting and monitoring CMS analysis jobs on the gLite middleware, had been developed by the end of 2004
 Demonstrated at the CMS week in December 2004
– The prototype was evolved to support both RB versions deployed at the CERN testbed (prototype task queue and gLite 1.0 WMS)
– Currently submission to both RBs is available and completely transparent for the users (same configuration file, same functionality)
– Plan to implement a gLite job submission handler for Crab
19
ASAP: Starting point for users
• The user is familiar with the experiment application needed to perform the analysis (the ORCA application for CMS)
– The user knows how to create an executable able to run the analysis task (reading selected data samples, using the data to compute derived quantities, taking decisions, filling histograms, selecting events, etc.). The executable is based on the experiment framework
• The user has debugged the executable on small data samples, on a local computer or computing services (e.g. lxplus at CERN)
• How to go to larger samples, which can be located at any regional center CMS-wide?
• The user should not be forced:
– to change anything in the compiled code
– to change anything in the configuration file for ORCA
– to know where the data samples are located
20
ASAP work and information flow
• The user gives the ASAP UI a job description: application, application version, executable, ORCA data cards, data sample, working directory, Castor directory to save the output, number of events to be processed, number of events per job (see the sketch below)
• The ASAP UI submits the job to gLite (JDL) and delegates the user credentials using MyProxy
• The job runs on the worker node, writing into a job monitoring directory
• The ASAP Job Monitoring service checks the job status, publishes the job status on the web, resubmits in case of failure, fetches the results, stores the results to Castor and records the output file location
• Monitoring information is also sent to MonAlisa; dataset information comes from RefDB/PubDB
(diagram: flow between the ASAP UI, gLite, the worker node, the ASAP Job Monitoring service, MonAlisa, RefDB/PubDB and Castor)
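To make the job description concrete, here is a hypothetical Python sketch of the parameters listed above collected into a single task description; the field names, example values and the submit_task() helper are assumptions for illustration, not the actual ASAP configuration format.

# Minimal sketch: the job parameters listed above, gathered into one task
# description and handed to a (hypothetical) submission step. Field names,
# example values and submit_task() are illustrative, not the real ASAP UI.
task = {
    "application": "ORCA",
    "application_version": "ORCA_8_7_1",        # example version string
    "executable": "runMyAnalysis",
    "orca_data_cards": "myAnalysis.orcarc",
    "data_sample": "bt03_ttbar_inclusive",      # example dataset name
    "working_directory": "/afs/cern.ch/user/x/xyz/analysis",
    "castor_output_directory": "/castor/cern.ch/user/x/xyz/results",
    "events_total": 100000,
    "events_per_job": 5000,                     # => 20 sub-jobs
}

def submit_task(description):
    """Hypothetical stand-in for the ASAP UI step (build JDL, delegate via MyProxy)."""
    n_jobs = description["events_total"] // description["events_per_job"]
    print(f"would create {n_jobs} sub-jobs and submit them to gLite")

submit_task(task)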
21
Job Monitoring
• ASAP Monitor
22
Merging the results
23
H→2τ→2j analysis: bkg. data available (all signal events processed with Arda)

Bkg. sample | Events | Processed with | σ x Br, mb | Kine presel.
qcd, pT = 50-80 GeV/c | 100K | Arda | 2.08 x 10^-2 | 2.44 x 10^-4
qcd, pT = 80-120 GeV/c | 200K | crab | 2.94 x 10^-3 | 5.77 x 10^-3
qcd, pT = 120-170 GeV/c | 200K | Arda | 5.03 x 10^-4 | 4.19 x 10^-2
qcd, pT > 170 GeV/c | 1M | | 1.33 x 10^-4 | 2.12 x 10^-1
tt, W→τν | 80K | crab | 5.76 x 10^-9 | 4.88 x 10^-2
Wt, W→τν | 30K | Arda | 7.10 x 10^-10 | 1.38 x 10^-2
W+j, W→τν | 400K | crab | 5.74 x 10^-7 | 2.16 x 10^-2
Z/γ*→ττ, 130 < mττ < 300 GeV/c2 | 70K | Arda | 1.24 x 10^-8 | 9.53 x 10^-2
Z/γ*→ττ, mττ > 300 GeV/c2 | 60K | gross | 6.22 x 10^-10 | 3.23 x 10^-1

A. Nikitenko (CMS)
24
Higgs boson mass (Mττ) reconstruction
The Higgs boson mass was reconstructed after basic off-line cuts: reco ET(τ jet) > 60 GeV, ETmiss > 40 GeV. The Mττ evaluation is shown for the consecutive cuts: pτ > 0 GeV/c, pν > 0 GeV/c, Δφ(j1,j2) < 175°.
σ(MH) ~ σ(ETmiss) / sin(Δφ(j1,j2))
Mττ and σ(Mττ) are in very good agreement with the old results of CMS Note 2001/040, Table 3: Mττ = 455 GeV/c2, σ(Mττ) = 77 GeV/c2 (ORCA4, Spring 2000 production).
25
A. Nikitenko (CMS)
ARDA ASAP
• First users were able to process their data on gLite
– Work of these pilot users can be regarded as a first round of validation of
the gLite middleware and analysis prototypes
• The number of users should increase as soon as the pre-production system becomes available
– Interest in having CPUs at the centres where the data sits (LHC Tier-1s)
• To enable user analysis on the Grid:
– we will continue to work in close collaboration with the physics community and the gLite developers
 ensuring a good level of communication between them
 providing constant feedback to the gLite development team
• Key factors to progress:
– Increasing number of users
– Larger distributed systems
– More middleware components
26
ARDA Feedback (gLite middleware)
• 2004:
– Prototype available (CERN + Madison Wisconsin)
– A lot of activity (4 experiment prototypes)
– Main limitation: size
 Experiments' data available!
 Just a handful of worker nodes
• 2005:
– Coherent move to prepare a gLite package to be deployed on the pre-production service
 ARDA contribution:
 Mentoring and tutorials
 Actual tests!
– A lot of testing during 05Q1
– The PreProduction Service is about to start!
27
WMS monitor
– “Hello World!” jobs
– 1 per minute since last February
– Logging & Bookkeeping info on the web to help the developers
28
Data Management
• Central component together with the WMS
• Early tests started in 2004
• Two main components:
– gLiteIO (protocol + server to access the data)
– FiReMan (file catalogue)
– The two components are not isolated: for example, gLiteIO uses the ACLs recorded in FiReMan, and FiReMan exposes the physical location of files so that the WMS can optimise job submissions…
• Both LFC and FiReMan offer large improvements over RLS
– LFC is the most recent LCG2 catalogue
• Still some issues remaining:
– Scalability of FiReMan
– Bulk entry for LFC missing
– More work needed to understand performance and bottlenecks
– Need to test some real use cases
– In general, the validation of DM tools takes time!
29
FiReMan Performance - Queries
• Query rate for an LFN
(plot: entries returned per second, up to ~1200, versus number of threads from 5 to 50, for Fireman single-entry queries and bulk queries of size 1, 10, 100, 500, 1000 and 5000)
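For context on how such a query-rate measurement can be set up, here is a minimal, hypothetical multi-threaded benchmark sketch in Python; the query() function is a placeholder for a real catalogue client call and is not the FiReMan or LFC API.

# Minimal sketch: measuring catalogue query rate (entries returned per second)
# as a function of the number of concurrent client threads. query() is a
# placeholder; a real test would call the FiReMan/LFC client instead.
import threading
import time

def query(lfn):
    """Placeholder for one catalogue lookup; returns the number of entries."""
    time.sleep(0.001)  # simulate server latency
    return 1

def worker(duration, counter, lock):
    end = time.time() + duration
    n = 0
    while time.time() < end:
        n += query("/grid/cms/example.root")
    with lock:
        counter[0] += n

def measure(num_threads, duration=5.0):
    counter, lock = [0], threading.Lock()
    threads = [threading.Thread(target=worker, args=(duration, counter, lock))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0] / duration  # entries returned per second

for n in (5, 10, 20, 50):
    print(n, "threads:", round(measure(n)), "entries/s")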
30
FiReMan Performance - Queries
• Comparison with LFC:
(plot: entries returned per second, up to ~1200, versus number of threads from 1 to 100, for Fireman single-entry queries, Fireman bulk-100 queries and LFC)
31
More data coming…
C. Munro (ARDA & Brunel Univ.) at ACAT 05
32
Summary of gLite usage and testing
• Info available also under http://lcg.web.cern.ch/lcg/PEB/arda/LCG_ARDA_Glite.htm
• gLite version 1
 WMS
 • Continuous monitor available on the web (active since 17th of February)
 • Concurrency tests
 • Usage with ATLAS and CMS jobs (using the Storage Index)
 • Good improvements observed
 DMS (FiReMan + gLiteIO)
 • Early usage and feedback (since Nov 04) on functionality, performance and usability
 • Considerable improvement in performance/stability observed since then
 • Some of the tests given to the development team for tuning and to JRA1 to be used in the testing suite
 • Most of the tests given to JRA1 to be used in the testing suite
 • Performance/stability measurements: heavy-duty testing needed for real validation
 Contribution to the common testing effort to finalise gLite 1 (with SA1, JRA1 and NA4-testing)
 • Migration of certification tests within the certification test suite (LCG → gLite)
 • Comparison between LFC (LCG) and FiReMan
 • Mini tutorial to facilitate the usage of gLite within the NA4 testing
33
Metadata services on the Grid
• gLite has provided a prototype for the EGEE Biomed community (in 2004)
• Requirements in ARDA (HEP) were not all satisfied by that early version
• ARDA preparatory work
– Stress testing of the existing experiment metadata catalogues
– The existing implementations were shown to share similar problems
• ARDA technology investigation
– On the other hand, the usage of extended file attributes in modern file systems (NTFS, NFS, EXT2/3, SCL3, ReiserFS, JFS, XFS) was analysed: a sound POSIX standard exists! (see the sketch below)
• Prototype activity in ARDA
• Discussion in LCG and EGEE and the UK GridPP Metadata group
• Synthesis:
– A new interface which will be maintained by EGEE, benefiting from the activity in ARDA (tests and benchmarking of different databases and direct collaboration with LHCb/GridPP)
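As a concrete illustration of the extended-attribute mechanism mentioned above, a small Python example using the POSIX-style xattr calls exposed by the standard library (Linux only); the attribute names and the file are just examples.

# Small illustration of POSIX-style extended file attributes, as exposed by
# Python's os module on Linux (os.setxattr/getxattr/listxattr). The "user."
# namespace is the one writable by ordinary users on ext2/3, XFS, etc.
import os

path = "event_collection.root"   # example file
open(path, "a").close()          # make sure it exists

# Attach metadata as key/value pairs directly on the file.
os.setxattr(path, "user.run_number", b"12345")
os.setxattr(path, "user.selection", b"H->2tau->2jets")

# Read it back, and list all attributes.
print(os.getxattr(path, "user.run_number"))  # b'12345'
print(os.listxattr(path))                    # ['user.run_number', 'user.selection']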
34
ARDA Implementation
• Prototype
– Validate our ideas and expose a concrete example to interested parties
• Multiple back ends
– Currently: Oracle, PostgreSQL, SQLite
• Dual front ends
– TCP streaming
 Chosen for performance
– SOAP
 Formal requirement of EGEE
 Compare SOAP with TCP streaming
• Also implemented as a standalone Python library
– Data stored on the file system
(diagram: clients reach the metadata (MD) server via SOAP or via TCP streaming; the server uses an Oracle, PostgreSQL or SQLite back end; alternatively a Python interpreter uses the Metadata Python API directly on the file system)
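To illustrate the "standalone Python library with data stored on the file system" idea, here is a minimal hypothetical sketch of a directory-backed metadata store; the class and method names are invented for illustration and do not reflect the actual ARDA library.

# Minimal sketch of a file-system-backed metadata store, in the spirit of the
# "standalone Python library, data stored on the file system" option above.
# Class and method names are invented; this is not the ARDA library API.
import json
from pathlib import Path

class FileMetadataStore:
    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def insert(self, entry_name, attributes):
        """Store the attribute dictionary of one entry as a small JSON file."""
        (self.root / f"{entry_name}.json").write_text(json.dumps(attributes))

    def get(self, entry_name):
        return json.loads((self.root / f"{entry_name}.json").read_text())

    def query(self, **conditions):
        """Yield entry names whose attributes match all given key=value pairs."""
        for path in self.root.glob("*.json"):
            attrs = json.loads(path.read_text())
            if all(attrs.get(k) == v for k, v in conditions.items()):
                yield path.stem

store = FileMetadataStore("/tmp/arda_md_demo")
store.insert("run12345_file001", {"run": 12345, "quality": "good"})
print(list(store.query(quality="good")))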
35
Dual Front End
• Streaming
– Text-based protocol: the client sends an <operation>, the server streams the [data] back
– Data streamed to the client in a single connection
– Implementations: Server – C++, multiprocess; Clients – C++, Java, Python, Perl, Ruby
• SOAP with iterators
– Most operations are SOAP calls
– Based on iterators:
 A query creates a DB cursor on the server and a session is created
 The initial chunk of data and a session token are returned
 Subsequent requests: the client calls nextQuery() using the session token
 The session is closed when: end of data, the client calls endQuery(), or the client times out
– Implementations: Server – gSOAP (C++); Clients – tested the WSDL with gSOAP, ZSI (Python), AXIS (Java)
(diagrams: message sequences between client, server and database – for streaming, a query followed by “Create DB cursor” and a continuous stream of [data]; for SOAP with iterators, chunks of [data] returned per nextQuery() call)
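A minimal sketch of what a client of the text-based streaming front end could look like, assuming a line-oriented wire format (one operation line sent, one result row per line received, blank line as terminator); the framing and the example host, port and command are assumptions, since the slide does not specify the protocol details.

# Minimal sketch of a client for a text-based streaming front end: send one
# <operation> line, then lazily yield result rows from the same connection.
# The line-oriented framing (newline-separated rows, blank line as end
# marker) is an assumption, not the documented protocol.
import socket

def stream_query(host, port, operation):
    """Send an operation and yield each returned data row as it arrives."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall((operation + "\n").encode())
        with sock.makefile("r") as stream:
            for line in stream:
                row = line.rstrip("\n")
                if not row:      # assumed end-of-results marker
                    break
                yield row

# Usage (hypothetical endpoint and command):
# for row in stream_query("metadata.example.org", 8822, "selectattr /files run quality"):
#     print(row)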
36
More data coming…
N. Santos (ARDA & Coimbra Univ.) at ACAT 05
• Test protocol performance
– No work done on the backend
– Switched 100 Mbit/s LAN
– 1000 pings
• Language comparison
– TCP-S with similar performance in all languages
– SOAP performance varies strongly with the toolkit
– On Java and Python, SOAP is several times slower than TCP-S
• Protocols comparison
– Keepalive improves performance significantly
– TCP-S 3x faster than gSOAP (with keepalive)
– Poor performance without keepalive
• Measure scalability of protocols
– Around 1,000 ops/sec (both gSOAP and TCP-S)
– The client ran out of sockets at high client counts
(plots: execution time [s] for 1000 pings in C++ (gSOAP), Java (Axis) and Python (ZSI), with and without keepalive; average throughput [calls/sec] versus number of clients from 1 to 100 for TCP-S and gSOAP, with and without keepalive)
37
Current Uses of the ARDA
Metadata prototype
• Evaluated by LHCb bookkeeping
– Migrated bookkeeping metadata to ARDA prototype
 20M entries, 15 GB
– Feedback valuable in improving interface and fixing bugs
– Interface found to be complete
– ARDA prototype showing good scalability
• Ganga (LHCb, ATLAS)
– User analysis job management system
– Stores job status on ARDA prototype
– Highly dynamic metadata
• Discussed within the community
– EGEE
– UK GridPP Metadata group
38
ARDA workshops and related activities
• ARDA workshop (January 2004 at CERN; open)
• ARDA workshop (June 21-23 at CERN; by invitation)
– “The first 30 days of EGEE middleware”
• NA4 meeting (15 July 2004 in Catania; EGEE open event)
• ARDA workshop (October 20-22 at CERN; open)
– “LCG ARDA Prototypes”
– Joint session with OSG
• NA4 meeting 24 November (EGEE conference in Den Haag)
• ARDA workshop (March 7-8 2005 at CERN; open)
• ARDA workshop (October 2005; together with LCG Service
Challenges)
• Wednesday afternoon meetings started in 2005:
– Presentations from experts and discussion (not necessarily from ARDA people)
Available from http://arda.cern.ch
39
Conclusions (1/3)
• ARDA has been set up to
– Enable distributed HEP analysis on gLite
 Contacts have been established
 • With the experiments
 • With the middleware developers
• Experiment activities are progressing rapidly
– Prototypes for ALICE, ATLAS, CMS & LHCb
 Complementary aspects are studied
 Good interaction with the experiments' environments
– Always seeking users!!!
 People more interested in physics than in middleware… we support them!
– 2005 will be the key year (gLite version 1 is becoming available on the pre-production service)
40
Conclusions (2/3)
• ARDA provides special feedback to the development team
– First use of components (e.g. gLite prototype activity)
– Try to run real-life HEP applications
– Dedicated studies offer complementary information
• Experiment-related ARDA activities produce elements of general use
– Very important “by-product”
– Examples:
 Shell access (originally developed in ALICE/ARDA)
 Metadata catalog (proposed and under test in LHCb/ARDA)
 (Pseudo)-interactivity experience (something in/from all experiments)
41
Conclusions (3/3)
• ARDA is a privileged observatory from which to follow, contribute to and influence the evolution of HEP analysis
– Analysis prototypes are a good idea!
 Technically, they complement the data challenges' experience
 Key point: these systems are exposed to users
– The approach of 4 parallel lines is not too inefficient
 Contributions in the experiments from day zero
 • Difficult environment
 Commonality cannot be imposed…
– We could do better in keeping a good connection with OSG
 How?
42
Outlook
• Commonality is a very tempting concept, indeed…
– Sometimes a bit fuzzy, maybe…
• Maybe it is becoming more important …
– Lot of experience in the whole community!
– Baseline services ideas
– LHC schedule: physics is coming!
• Maybe it is emerging… (examples are not exhaustive)
– Interactivity is a genuine requirement: e.g. PROOF and DIANE
– Toolkits for the users to build applications on top of the computing
infrastructure: e.g. GANGA
– Metadata/workflow systems open to the users
– Monitor and discovery services open to users: e.g. Monalisa in ASAP
• Strong preference for an “a posteriori” approach
– All experiments still need their system…
– Keep on being pragmatic …
43
People
• Massimo Lamanna
• Frank Harris (EGEE NA4)
• Birger Koblitz
• Andrey Demichev
• Viktor Pose
• Victor Galaktionov
• ALICE: Derek Feichtinger, Andreas Peters
• ATLAS: Hurng-Chun Lee, Dietrich Liko, Frederik Orellana, Tao-Sheng Chen
• CMS: Julia Andreeva, Juha Herrala, Alex Berejnoi
• LHCb: Andrew Maier, Kuba Moscicki, Wei-Long Ueng
• 2 PhD students:
– Craig Munro (Brunel Univ.): distributed analysis within CMS, working mainly with Julia
– Nuno Santos (Coimbra Univ.): metadata and resilient computing, working mainly with Birger
• Catalin Cirstoiu and Slawomir Biegluk (short-term LCG visitors)
• Good collaboration with EGEE/LCG Russian institutes and with ASCC Taipei
44