SAM for CDF and Core SAM Stefan Stonjek University of Oxford GridPP Meeting 16th September 2002 Imperial College London.


SAM for CDF and Core SAM
Stefan Stonjek
University of Oxford
GridPP Meeting
16th September 2002
Imperial College London
Stupid people do not learn from their mistakes
Clever people do learn from their mistakes
Wise people learn from other people's mistakes
CDF inherits SAM from DØ

16/09/2002
Stefan Stonjek
2
Outline
Introduction
What is SAM
Deployment
SAM on farms
Problems
Summary
This is not SAM
This is just the logo 
The Experiments
CDF and D0
Multi purpose 4π detectors
At FNAL (close to Chicago)
Tevatron started Run II in 2001
Increased luminosity compared to Run I
e.g. 0.3 fb⁻¹ in 2002, 0.9 fb⁻¹ in 2003
Run II: 2001-2007
Data Volumes
[Figure: Level 1 rate (Hz, 10^2 to 10^6) versus event size (bytes, 10^4 to 10^7) for LHCB, ATLAS, CMS, ALICE, KTeV, HERA-B, KLOE, CDF IIa, D0 IIa, CDF, H1, ZEUS, NA49, UA1 and LEP]
Run II Data Volumes (-2004)
# of events           900 M/year     600 M/year
raw data              250 TB/year    150 TB/year
reconstructed data    135 TB/year    75 TB/year
Thumbnails            -              8 TB/year
What is SAM
Sequential data Access via Metadata
Data management system
Many separate SAM stations worldwide
Central database translates metadata (dataset name) to filenames (many)
Central database knows about every file and its location(s)
Job is started on a user-specified station (so far)
Automatic data transfer to the station which needs the data
Each job sees each file once, even if it runs multiple processes
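The catalogue lookup described on this slide can be sketched in a few lines of Python. This is only an illustration, not SAM's actual interface: the catalogue contents, the dataset, file and station names, and the functions `resolve_dataset`, `files_to_transfer` and `register_copy` are all hypothetical.

```python
# Hypothetical central catalogue: dataset name -> files, file -> known locations.
DATASETS = {
    "example-dataset": ["file_a.root", "file_b.root", "file_c.root"],
}
LOCATIONS = {
    "file_a.root": {"fnal", "oxford"},
    "file_b.root": {"fnal"},
    "file_c.root": {"fnal", "glasgow"},
}


def resolve_dataset(name):
    """Translate a dataset name (metadata) into its list of file names."""
    return list(DATASETS[name])


def files_to_transfer(name, station):
    """Return the files of a dataset that are not yet present at `station`."""
    return [f for f in resolve_dataset(name) if station not in LOCATIONS[f]]


def register_copy(filename, station):
    """Record a new location once a file has been transferred to `station`."""
    LOCATIONS[filename].add(station)


# A job on the (hypothetical) station "oxford" asks for the dataset:
# only the files that are not already there need to be transferred.
missing = files_to_transfer("example-dataset", "oxford")
for f in missing:
    register_copy(f, "oxford")
```

After the transfers are registered, a second call to `files_to_transfer` for the same station returns an empty list, because the catalogue now knows about the new copies.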
[Diagram: SAM processes on a node and on a remote station. Labels: node, Batch (LSF) with jobs 52668 <user1> RUN, 52675 <user2> RUN and 52756 <user3> PSUSP, Station, Stager, Project, remote-station, samscript.sh, userscript, smaster, pmaster, consumer, stagerng, eworker, bbftp, gridftp]
Meta Information
CDF stores different meta information than D0
=> development
=> effort to make SAM universal
Storage of meta information and bookkeeping of all output files allow decentralized physics analyses
Station to Station
SAM communication is CORBA based
SAM station queries (via CORBA) the central SAM-DB-server
SAM-DB-server queries the ORACLE database via SQL
=> just one ORACLE client
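The layering on this slide, where many stations talk to one server and only that server talks to the database, can be sketched as follows. This is a toy model under stated assumptions: an in-memory `sqlite3` database stands in for ORACLE, a plain Python method call stands in for the CORBA request, and the class names, table layout and dataset name are hypothetical.

```python
import sqlite3


class DbServer:
    """Stand-in for the central SAM-DB-server: the only database client."""

    def __init__(self):
        # sqlite3 in memory replaces ORACLE purely for illustration.
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute("CREATE TABLE files (dataset TEXT, filename TEXT)")
        self._conn.executemany(
            "INSERT INTO files VALUES (?, ?)",
            [("example-dataset", "file_a"), ("example-dataset", "file_b")],
        )

    def files_in_dataset(self, dataset):
        """Answer a station's request by running the SQL query on its behalf."""
        rows = self._conn.execute(
            "SELECT filename FROM files WHERE dataset = ? ORDER BY filename",
            (dataset,),
        )
        return [row[0] for row in rows]


class Station:
    """Stand-in for a SAM station: it talks to the server, never to the database."""

    def __init__(self, server):
        self._server = server  # in SAM this link would be a CORBA call

    def lookup(self, dataset):
        return self._server.files_in_dataset(dataset)


server = DbServer()
stations = [Station(server) for _ in range(3)]
answers = [s.lookup("example-dataset") for s in stations]
```

However many stations are created, only the `DbServer` object holds a database connection, which is the point of the "just one ORACLE client" remark.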
SAM experience
D0 has used SAM for more than a year
Inbound and outbound traffic has been tested
Some CDF people already use SAM (mostly in the UK)
Deployment
Each collaborating institute will have a SAM station
Installation is no problem
ups/upd tools from FNAL
Past: a day
Now: one hour
Maybe automatic (accompanying the executable)
Station Types
Linux PC
256 SMP machine
Farm (700 Linux PCs)
Farm with private network
SAM at CAF
CAF: Central Analysis Farm
Easy to start the same process n times
SAM delivers each input file just once
SAM at CAF is a combination of two existing tools
=> development phase should be short, manpower from University of Chicago
D0 is very interested
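The idea of starting the same process n times while each input file is delivered just once can be illustrated with a shared work queue. This is a generic sketch using Python threads, not the CAF or SAM implementation; `run_project`, the file names and the worker count are hypothetical.

```python
import queue
import threading


def run_project(files, n_workers):
    """Hand each file to exactly one of n_workers identical workers."""
    pending = queue.Queue()
    for name in files:
        pending.put(name)

    deliveries = []  # (file name, worker id) pairs, one per delivered file
    lock = threading.Lock()

    def worker(worker_id):
        while True:
            try:
                name = pending.get_nowait()
            except queue.Empty:
                return  # no files left for this worker
            with lock:
                deliveries.append((name, worker_id))

    threads = [
        threading.Thread(target=worker, args=(i,)) for i in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return deliveries


files = ["f%d.dat" % i for i in range(10)]
deliveries = run_project(files, n_workers=4)
```

Which worker receives which file varies from run to run, but every file appears exactly once in `deliveries`.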
Backends
D0 uses SamManager within their
physics analysis framework
CDF wants to incorporate SAM into its analysis framework
Under development
Requires some work since analysis frameworks are really different
Unfortunately no proper API
Problems
Friday afternoon problem
Many people submit jobs on Friday afternoon
Project is started immediately (prestaging)
Total number of projects per station is limited
=> Problem
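One possible reading of the Friday afternoon problem is sketched below: every submission starts a project at once, so a burst of submissions runs into the per-station limit. The `Station` class, the limit of 3 and the job names are hypothetical and chosen only to make the effect visible.

```python
class Station:
    """Toy station that starts a project as soon as a job is submitted."""

    def __init__(self, max_projects):
        self.max_projects = max_projects  # hypothetical per-station limit
        self.running = []

    def submit(self, job):
        """Start a project immediately; refuse once the limit is reached."""
        if len(self.running) >= self.max_projects:
            return False
        self.running.append(job)
        return True


# Five users submit within a short time; the station allows three projects.
station = Station(max_projects=3)
accepted = [station.submit("job%d" % i) for i in range(5)]
```

In this toy model the last two submissions are refused even though none of the accepted jobs has necessarily started running yet.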
Prestaging and Batch adaptors
Prestaging
Staging starts as soon as the job is submitted
Job does not run before the data are available
Initiated by submitting host (so far)
Requires strong correlation between SAM and the batch system
=> Batch adaptor for every flavour
Solution: batch system starts prestaging immediately after submission
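A batch adaptor of the kind mentioned here could look roughly like the following interface, with one implementation per batch system flavour. This is a sketch under stated assumptions, not SAM code: `BatchAdaptor`, `ToyBatchAdaptor`, `submit_with_prestaging`, the `HELD` and `RUN` state names and the `stage` callback are all hypothetical.

```python
from abc import ABC, abstractmethod


class BatchAdaptor(ABC):
    """Hypothetical interface implemented once per batch system flavour."""

    @abstractmethod
    def submit(self, job_name, dataset):
        """Queue the job in a held state and return a job id."""

    @abstractmethod
    def release(self, job_id):
        """Allow a held job to run."""


class ToyBatchAdaptor(BatchAdaptor):
    """In-memory stand-in for a real batch system."""

    def __init__(self):
        self.state = {}

    def submit(self, job_name, dataset):
        job_id = len(self.state) + 1
        self.state[job_id] = "HELD"  # the job waits until its data are staged
        return job_id

    def release(self, job_id):
        self.state[job_id] = "RUN"


def submit_with_prestaging(adaptor, job_name, dataset, stage):
    """Submit a job, prestage its data, then release the job."""
    job_id = adaptor.submit(job_name, dataset)
    stage(dataset)           # staging starts right after submission
    adaptor.release(job_id)  # the job runs only once the data are available
    return job_id


staged = []  # records which datasets were staged
adaptor = ToyBatchAdaptor()
job_id = submit_with_prestaging(adaptor, "analysis", "example-dataset", staged.append)
```

Keeping the batch-specific calls behind one small interface means that only the adaptor, not the staging logic, has to be rewritten for each batch system.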
Monitoring
SAM Monitoring via Web
Online monitoring page still under development
Result depends on round trip time and machine load
“offline” monitoring works better
Summary
SAM is a data management tool, currently in use at D0; CDF will follow soon
SAM is easy to install and use
SAM is a product which suits the needs of communities which handle large datasets (O(100s TB))
SAM is a first step on the way to a grid-enabled physics analysis