Distributed Computing and
Data Analysis for CMS in view
of the LHC startup
Peter Kreuzer
RWTH-Aachen IIIa
International Symposium on Grid Computing (ISGC)
Taipei, April 9, 2008
Outline
• Brief overview of Worldwide LHC Grid: WLCG
• Distributed Computing Challenges at CMS
– Simulation
– Reconstruction
– Analysis
• The physicist view
• The road to the LHC startup
From local to distributed Analysis
• Before : centrally organised Analysis
• Over the last 20 years, both the amount of data and the number of physicists per experiment grew drastically (each by roughly a factor of 10)
• Example CMS : 4-6 PBytes of data per year, 2900 scientists, 40 countries, 184 institutes !
Worldwide LHC Computing GRID
• Solution : ´´Tiered´´ Computing Model

Tier-0 at CERN
- Prompt Reconstruction
- Calibration and Low latency work
- Archiving

Tier-1s at large national labs or universities
- Re-Reconstruction
- Physics ´skimming´
- Data Serving
- Archiving
Aggregate Rate from CERN to Tier-1s : > 1.0 GByte/s
Transfer Rate to Tier-2s : 50-500 MBytes/s

Tier-2s primarily at Universities
- Simulation
- User Analysis

Tier-3s at Institutes with modest Infrastructure
- Local User Analysis
- Opportunistic Simulation

• Level of distribution motivated by the desire to leverage and empower resources + share load, infrastructure and funding
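For readers who think in code, the division of labour above can be captured in a small data structure. This is only an illustrative sketch (a plain Python dictionary, not part of any WLCG software); the roles and rates are taken from the slide, everything else is invented for the example.

    # Illustrative summary of the tiered model described above.
    TIERS = {
        "Tier-0 (CERN)": {
            "tasks": ["Prompt Reconstruction", "Calibration / low-latency work", "Archiving"],
            "outbound": "aggregate > 1.0 GByte/s to Tier-1s",
        },
        "Tier-1 (national labs)": {
            "tasks": ["Re-Reconstruction", "Physics skimming", "Data serving", "Archiving"],
            "outbound": "50-500 MBytes/s to Tier-2s",
        },
        "Tier-2 (universities)": {
            "tasks": ["Simulation", "User analysis"],
            "outbound": None,
        },
        "Tier-3 (institutes)": {
            "tasks": ["Local user analysis", "Opportunistic simulation"],
            "outbound": None,
        },
    }

    for tier, info in TIERS.items():
        print(f"{tier}: {', '.join(info['tasks'])}")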
WLCG Infrastructure
• EGEE : Enabling Grids for E-sciencE
• OSG Open Science Grid
WLCG : 1 Tier-0 + 11 Tier-1 + 67 Tier-2
CMS : 1 Tier-0 + 7 Tier-1 + 35 Tier-2
Tier-0 -- Tier-1 : dedicated 10 Gb/s Optical Network
Examples of Sites

T2 RWTH (Aachen)
- CPU : 540 KSI2k = 360 cores
- Disc : 100 TB
- Network (WAN) : 2 Gbit/s
- (2009 : 450 cores & 150 TB)

T1 ASGC
- CPU : 2.4 MSI2k ~ 1800 cores
- Disc : 930 TB -> 1.5 PB
- Tape : 586 TB -> 800 TB
- Network : 10 Gbit/s

T2 Taiwan
- CPU : 150 KSI2k
- Disc : 19 TB -> 62 TB
- Network : up to 10 Gbit/s
Pledged WLCG Resources
[Plots : pledged CPU (MSI2k) and Disc Storage (PetaBytes) per year, 2007-2012, broken down into CERN / Tier-1 / Tier-2 contributions]
• CPU : 66,000 cores pledged in 2008, growing to ~250,000 cores in later years (1 MSI2k = 670 cores)
• Disc Storage : 40 PetaBytes in 2008 (Tape Storage = 33 PBytes in 2008)
(Reference : LCG Project Planning – 1.3.08)
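The CPU pledges are quoted in MSI2k while the slides also give core counts; the stated conversion factor of 1 MSI2k = 670 cores translates between the two. A minimal arithmetic sketch (the pledge and site figures come from the slides, the helper functions are just illustrative):

    CORES_PER_MSI2K = 670  # conversion factor quoted on the slide

    def msi2k_to_cores(msi2k: float) -> float:
        return msi2k * CORES_PER_MSI2K

    def cores_to_msi2k(cores: float) -> float:
        return cores / CORES_PER_MSI2K

    # 2008 pledge quoted as 66,000 cores -> roughly 99 MSI2k
    print(f"66,000 cores ~ {cores_to_msi2k(66_000):.0f} MSI2k")
    # Example site from the previous slide: T2 RWTH at 540 KSI2k
    print(f"0.54 MSI2k   ~ {msi2k_to_cores(0.54):.0f} cores")  # ~362, slide quotes 360 cores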
Challenges for Experiments :
Example CMS
• Scale-up and test the distributed Computing Infrastructure
– Mass Storage Systems and Computing Elements
– Data Transfer
– Calibration and Reconstruction
– Event ´skimming´
– Simulation
– Distributed Data Analysis
• Test the CMS Software Analysis Framework
• Operate in quasi-real data-taking conditions and simultaneously at various Tier levels
 Computing & Software Analysis (CSA) Challenge
CMS Computing and Software
Analysis Challenges
• CMS Scaling-up in the last 4 years
Test (year)       Goal : Jobs/day                              Scale
– DC04            15,000                                       5%
– 2005 - 2006     New Data Model and New Software Framework
– CSA06           50,000                                       25%
– CSA07           100,000                                      50%
– CSA08           150,000                                      100% ?
• Requires 100s M simulated events as input
The CSA07 data Challenge
[Data-flow diagram]
• HLT -> Tier-0 : 100M simulated events injected, Reconstruction at 100 Hz, archived in CASTOR
• Tier-0 -> CAF : Calibration & Express Analysis
• Tier-0 -> Tier-1s : 300 MB/s aggregate transfer
• Tier-1s : Re-Reconstruction and Skims at 25k jobs/day ; 20-200 MB/s transfers to Tier-2s
• Tier-2s : Simulation at 50M evt/month (~10 MB/s back to Tier-1s) and Analysis at 75k jobs/day
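As a rough cross-check of the rates in the diagram, dividing the Tier-0 output rate by the reconstruction rate gives the implied average event size. A back-of-the-envelope sketch (the input figures are from the slide; the derived numbers are not stated in the talk):

    # Figures taken from the CSA07 challenge diagram above.
    reco_rate_hz     = 100    # Tier-0 reconstruction rate [events/s]
    t0_to_t1_MBps    = 300    # aggregate Tier-0 -> Tier-1 rate [MB/s]
    t2_sim_per_month = 50e6   # Tier-2 simulation throughput [events/month]

    # Implied average event size shipped from Tier-0 to the Tier-1s (~3 MB).
    event_size_MB = t0_to_t1_MBps / reco_rate_hz
    print(f"~{event_size_MB:.1f} MB per event leaving the Tier-0")

    # Simulation rate per day, for comparison with the 25k/75k jobs/day figures.
    print(f"~{t2_sim_per_month / 30:.0f} simulated events per day at the Tier-2s")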
In this presentation
• Mainly covering the CMS Simulation, Reconstruction and Analysis challenges
• Data transfer challenges are covered in the talk by Daniele Bonacorsi during this session
CMS Simulation System
[Diagram]
• CMS Physicist : << Please simulate new physics >> ... << Where are my data ? >>
• The Production Manager turns such requests into production workflows via ProdRequest
• ProdAgent instances dispatch the production jobs over the GRID to Tier-1 and Tier-2 sites
• Produced data are registered in the Global Data Bookkeeping service (DBS)
ProdAgent workflows
1) Processing : ProdAgent submits processing jobs via the Grid WMS to Tier-1/Tier-2 sites ; the small output file from each processing job is written to the site Storage Element (SE) and registered in a local DBS
2) Merging : ProdAgent submits merge jobs via the Grid WMS ; the large output file from each merge job is written to the SE and handed to PhEDEx
• Data processing / bookkeeping / tracking / monitoring in local-scope DBS
• Output promoted to the global-scope DBS & the data transfer system PhEDEx
• Scaling achieved by running multiple ProdAgent instances in parallel
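The processing/merging split can be summarised in a few lines of Python. This is only a toy model of the flow described above, not the real ProdAgent code; all names and file patterns are invented for the example.

    # Minimal, self-contained sketch of the ProdAgent flow described above.
    local_dbs, global_dbs, phedex_queue = [], [], []

    def process(n_jobs):
        """Processing step: each Grid job produces a small output file,
        registered only in the local-scope DBS."""
        small_files = [f"proc_{i}.root" for i in range(n_jobs)]
        local_dbs.extend(small_files)
        return small_files

    def merge(small_files, files_per_merge=10):
        """Merging step: merge jobs combine small files into large ones; only
        the merged output is promoted to the global DBS and given to PhEDEx."""
        for i in range(0, len(small_files), files_per_merge):
            merged = f"merged_{i // files_per_merge}.root"
            local_dbs.append(merged)      # still tracked locally
            global_dbs.append(merged)     # promoted to global-scope DBS
            phedex_queue.append(merged)   # handed to the transfer system

    small = process(n_jobs=25)
    merge(small)
    print(len(local_dbs), "files in local DBS,", len(global_dbs), "promoted to global DBS")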
CMS Simulation Performance
• ~250M Events produced in 5 months (June – November 2007) ; overall 2007-08 : 450M
• Tier-2 sites alone ~ 72% ; OSG alone ~ 50%
• Production Rate increased x 1.8
• 20k jobs/day reached
• < Job efficiency > ~ 75%
[Plot : produced events (M Evts / Month) vs. month]
Utilization of CMS Resources
• Average utilization ~ 50%
• In best production periods ~ 75%
[Plot : occupied job slots (out of ~5000) vs. time, June – November 2007 ; gaps labelled ´Missing Requests´]
CSA07 Simulation lessons
• Major boost in scale and reliability of production machinery
• Still too many manual operations. From 2008 on:
– Deploy the ProdManager component (in CSA07 it was ´human´ !)
– Deploy Resource Monitor
– Deploy CleanUpSchedule component
• Further improvements in scale and reliability
– gLite WMS bulk submission : 20k jobs/day with 1 WMS server
– Condor-G JobRouter + bulk submission : 100k jobs/day, and can saturate all OSG resources in ~1 hour
– Threaded JobTracking and Central Job Log Archival
• Introduced task-force for CMS Site Commissioning
– help detect site issues via stress-test tool (enforce metrics)
– couple site-state to production and analysis machinery
• Regular CMS Site Availability Monitoring (SAM) checks
CMS Site Availability Monitoring
[Plot : Availability Ranking per site from the ARDA ´Dashboard´, 03/22/08 – 04/03/08, scale 0% – 100%]
• Important tool to protect CMS use cases at sites
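The availability ranking shown in the Dashboard is essentially the fraction of SAM tests a site passes during the monitored window. A toy sketch of such a ranking follows; site names and test outcomes are invented, and the real SAM machinery is of course far richer.

    # Toy availability ranking in the spirit of the SAM-based Dashboard plot above.
    sam_results = {
        "T2_Example_A": [True, True, True, False, True],
        "T2_Example_B": [True, False, False, True, True],
        "T1_Example_C": [True] * 5,
    }

    def availability(results):
        """Fraction of passed SAM tests in the window."""
        return sum(results) / len(results)

    for site, res in sorted(sam_results.items(),
                            key=lambda kv: availability(kv[1]), reverse=True):
        print(f"{site}: {availability(res):.0%}")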
CSA07 Reconstruction & Skimming
0) Preparation of ´´Primary Datasets´´ which mimic real CMS Detector + Trigger data
1) Archive and Reconstruction at the CERN T0
2) Archive and Re-Reconstruction at T1s
3) Skimming at T1s
4) Express analysis & Calibration at the CERN Analysis Facility
 3 different calibrations : 10 pb-1, 100 pb-1, 0 pb-1
Produced CSA07 Data Volumes
Total CSA07 event counts :
80M   GEN-SIM
80M   DIGI-RAW
80M   HLT
330M  RECO (3 diff. calibrations)
250M  AOD
100M  skims
----------------------------------
920M  events
[Plot : cumulative DIGI-RAW-HLT-RECO events (x 1e+8), 10/'07 – 02/'08]
• Total data volume : ~2 PB  corresponds to the expected 2008 volume !
• CMS data in CASTOR@CERN : 3.7 PB
CSA07 Reconstruction lessons
[Plot : T0 and T1 processing, up to ~2k running jobs]
• T0 Reconstruction at 100 Hz only in bursts, mainly due to stream-splitting activity
• Heavy load on CASTOR
• Useful feedback to the ProdAgent developers to prepare 2008 data taking (repacker, …)
• T1 Processing : the submission rate was the main limitation. Now based on gLite bulk submission and reaching 12-14k jobs/day with 1 ProdAgent instance
• Further rate improvement to be expected with T1 resource up-scaling
CMS Analysis System
• CRAB = CMS Remote Analysis Builder : an interface to the GRID for CMS physicists
• Challenge : match processing resources with large quantities of data = ´´chaotic´´ Processing
[Diagram]
• CMS Physicist : << Please analyse datasets X/Y >> ... << Where are my jobs ? >>
• CRAB (or the CRAB Server) locates the data via the Global Data Bookkeeping service (DBS) and submits the analysis jobs over the GRID to the Tier-1/Tier-2 sites hosting them
CRAB Architecture
• Easy and transparent means for CMS users to submit analysis jobs via the GRID (LCG RB, gLite WMS, Condor-G)
• CSA07 analysis : direct submission by the user to the GRID. Simple, but lacking automation and scalability  2008 : CRAB server
• Other new feature : local DBS for “private” users
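Conceptually, CRAB takes a dataset plus the user's CMSSW configuration and splits the work into many Grid jobs. The sketch below only illustrates that splitting idea; it is not the actual CRAB code or configuration syntax, and the event counts are arbitrary examples.

    # Simplified illustration of how a CRAB-like tool splits a dataset
    # analysis into Grid jobs. Not the actual CRAB implementation.

    def split_into_jobs(total_events, events_per_job):
        """Return (first_event, n_events) ranges, one per Grid job."""
        jobs = []
        first = 0
        while first < total_events:
            n = min(events_per_job, total_events - first)
            jobs.append((first, n))
            first += n
        return jobs

    # Example: an analysis over 120,000 events, 5,000 events per job.
    jobs = split_into_jobs(total_events=120_000, events_per_job=5_000)
    print(len(jobs), "jobs, e.g. first job covers events", jobs[0])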
CSA07 Analysis
• 100k jobs/day not achieved
- mainly due to lacking data during the challenge
- still limited by data distribution : 55% of jobs at the 3 largest Tier-1s
- and failure rate too high
• Job outcome : 53% successful jobs, 20% failed jobs, 27% unknown
Main failure causes : data access, remote stage-out, manual user settings
• 20k jobs/day achieved, plus regularly ~30k/day JobRobot submissions
[Plot : number of analysis jobs per day]
CMS Grid Users over the past year
• Plot showing distinct users per month
• 300 users during February 2008
• The 20 most active users carry 1/3 of the jobs
[Plot : distinct Users per Month ; annotation : CRAB Server]
The Physicist View
• SUSY Search in di-lepton + jets + MET
• Goal : Simulate excess over Standard Model (´LM1´ at 1 fb-1)
• Infrastructure
– 1 desktop PC
– CMS Software Environment (´CMSSW´, ´CRAB´, ´Discovery´ GUI, …)
– GRID Certificate + member of a Virtual Organisation (CMS)
• Input data (CSA07 simulation/production)
– Signal (RECO) : 120k events = 360 GB
– Skimmed Background (AOD) : 3.3 M events = 721 GB
  • WW / WZ / ZZ / single top
  • ttbar / Z / W + jets
  (Signal + skimmed Background : ~1.1 TB)
– Unskimmed Background : 27 M events = 4 TB (for detailed studies only)
• Location of input data
– T0/T1 : CERN (CH), FNAL (US), FZK (Germany)
– T2 : Legnaro (Italy), UCSD (US), IFCA (Spain)
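The input-data figures above also fix the per-event sizes the analysis has to deal with. A quick arithmetic sketch (the GB and event counts are from the slide; the MB-per-event values are derived, not stated in the talk):

    # Per-event sizes implied by the input-data figures on the slide.
    signal_reco_GB, signal_events = 360, 120_000      # Signal (RECO)
    bkg_aod_GB, bkg_events        = 721, 3_300_000    # Skimmed background (AOD)

    print(f"RECO: ~{signal_reco_GB * 1024 / signal_events:.1f} MB/event")  # ~3 MB/event
    print(f"AOD : ~{bkg_aod_GB * 1024 / bkg_events:.2f} MB/event")         # ~0.22 MB/event
    print(f"Total input: ~{(signal_reco_GB + bkg_aod_GB) / 1024:.1f} TB")  # ~1.1 TB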
GRID Analysis Result
[Plot : di-lepton invariant mass [GeV], showing the Signal End-Point and the Z peak from SUSY cascades]
Analysis Latency
• Signal + Bgd = 322 jobs  22 h to produce this result !
• Detailed studies = 1300 jobs  ~3.5 days
(Analysis by Georgia Karapostoli – Athens Univ.)
CSA07 Analysis lessons
• Improve Analysis scalability, automation and reliability
– CRAB-Server
– Automate job re-submission
– Optimize job distribution
– Decrease failure rate
• Move Analysis to Tier-2s
– To protect Tier-0/1 LSF and storage systems
– To make use of all available GRID resources
• Encourage Tier-2_to_Physics_group association
– In close collaboration with sites
– With a solid overall Data Management strategy
– Assess local-scope DM for Physics groups & storage of user data
• Aim for 500 users by June and exceed the capacity of several gLite WMS
Goals for CSA08 (May ’08)
• “Play through” first 3 months of data taking
• Simulation
– 150M events at 1 pb-1 (“S43”)
– 150M events at 10 pb-1 (“S156”)
• Tier-0 : Prompt reconstruction
– S43 with startup-calibration
– S156 with improved calibration
• CERN Analysis Facility (CAF)
– Demonstrate low turn-around Alignment&Calibration workflows
– Coordinated and time-critical physics analyses
– Proof-of-principle of CAF Data and Workflow Management Systems
• Tier-1 : Re-Reconstruction with new calibration constants
– S43 : with improved constants based on 1 pb-1
– S156 : with improved constants based on 10 pb-1
• Tier-2 :
– iCSA08 simulation (GEN-SIM-DIGI-RAW-HLT)
– repeat CAF-based Physics analyses with Re-Reco data ?
2008 Schedule (Jan – Oct)
[Timeline : Detector installation, commissioning and operation (one track) vs. Preparation of Software, Computing and Physics analysis (other track)]
• Detector side : cooldown of the magnet, low-current test, beam-pipe baked out, pixels installed, private global runs (2 days/week) & private mini-daq, GRUMM, commissioning runs at 0 T (´CROT´ / CR 0T) and 4 T (pre CR 4T, CR 4T, ´CRAFT´), CMS closed, initial CMS ready for run
• Software / Computing side : CCRC'08-1 in February, CMSSW 1.8.0 sample production, CMSSW 2.0 release + 2 weeks of 2.0 testing, 2007 Physics Analyses results, production of the start-up MC sample, iCSA08 sample generation, iCSA08 / CCRC'08-2, CMSSW 2.1 release (all basic sw components ready for LHC, new T0 prod tools), fCSA08 or beam!
• Must keep exercises mostly non-overlapped
• CCRC = Common-VO Computing Readiness Challenge ; CR = Commissioning Run
Where do we stand ?
• WLCG : major up-scaling over the last 2 years !
• CMS : impressive results and valuable lessons from CSA07
– Major boost in Simulation
– Produced ~2 PBytes of data in T0/T1 Reconstruction and Skimming
– Analysis : number of CMS Grid users ramping up fast !
– Software : addressed memory footprint and data size issues
• Further challenges for CMS : scale from 50% to 100%
– Simultaneous and continuous operations at all Tier levels
– Analysis distribution and automation
– Transfer rates (see talk by D. Bonacorsi)
– Upscale and commission the CERN Analysis Facility (CAF)
 CSA08, CCRC08, Commissioning Runs
• Challenging and motivating goals in view of Day-1 LHC !