CMS Tier1 at JINR.


CMS Tier 1 at JINR
V.V. Korenkov
for JINR CMS Tier-1 Team
JINR
XXIV International Symposium on
Nuclear Electronics & Computing,
NEC2013
2013, September 13
1
Outline
• CMS Grid structure
– role of Tier-1s
– CMS Tier-1s
• CMS Tier-1 in Dubna
– History and Motivations (Why Dubna?)
– Network infrastructure
– Infrastructure and Resources
– Services and Readiness
– Staffing
– Milestones
• Conclusions
2
CMS Grid Structure
Tier Structure of GRID Distributed
Computing:
Tier-0/Tier-1/Tier-2
Tier-0 (CERN):
• accepts data from the CMS
Online Data Acquisition
and Trigger System
• archives RAW data
• performs the first pass of reconstruction and Prompt Calibration
• data distribution to Tier-1
Tier-1 (11 centers):
• receives data from the Tier-0
• data processing (re-reconstruction, skimming, calibration, etc.)
• distributes data and MC to the other Tier-1s and Tier-2s
• secure storage and redistribution of data and MC
Tier-2 (>200 centers):
• simulation
• user physics analysis
3
4
CMS Tier-1 in Dubna
Tier1 center
March 2011 - Proposal to create the LCG Tier1 center in Russia (an official letter from the Minister of Science and Education of Russia A. Fursenko was sent to CERN DG R. Heuer):
NRC KI for ALICE, ATLAS, and LHCb
LIT JINR (Dubna) for the CMS experiment
The Federal Target Programme Project: «Creation of the
automated system of data processing for experiments at
the LHC of Tier-1 level and maintenance of Grid services
for a distributed analysis of these data»
Duration: 2011 – 2013
September 2012 – the proposal was reviewed by the WLCG OB, and the JINR and NRC KI Tier1 sites were accepted as new “Associate Tier1” sites.
Full resources - in 2014, to meet the start of the next LHC run.
5
6
Why in Russia?
Why Dubna?
• Within the framework of the RDIG project (a participant in the WLCG/EGEE projects), a grid infrastructure accepted by the LHC experiments has been successfully launched as the distributed cluster RuTier2 (Russian Tier2); the JINR cluster JINR-LCG2 is the leading RDIG site in terms of performance.
JINR-LCG2:
~40% of CPU time in RDIG for 2011-2013
7
8
JINR Central Information and
Computing Complex (CICC)
Local JINR users (no grid)
Grid users (WLCG)
Jobs run by JINR Laboratories and experiments executed at CICC, January - September 2013:
MPD 39.12%, PANDA 27.39%, COMPASS 13.23%, BLTP 8.99%, DLNP 3.84%, VBLHEP 3.57%, LIT 1.61%, LRB 1.00%, LBES 0.86%, FLNP 0.40%
[Chart: JINR-LCG2 normalised CPU time by LHC VOs, January - September 2013]
More than 3 million jobs run
Total normalised CPU time – 20 346 183 kSI2K-hours
http://accounting.egi.eu/
http://lit.jinr.ru/view.php?var1=comp&var2=ccic&lang=rus&menu=ccic/menu&file=ccic/statistic/stat-2013
9
CMS Computing at JINR
 the first RDMS CMS web-server (in 1996);
 full-scale CMS software infrastructure support
since 1997
 JINR CMS Tier2 center is one of the most reliable and productive CMS Tier2 centers worldwide (in the top ten) and the most powerful RDMS CMS Tier2 center
 The CMS Regional Operation Center has been functioning at JINR since 2009
The core services needed for a WLCG Tier-1 are a computing service, a storage service, and an information service.
The primary Tier-1 tasks can be divided into:
 recording raw data from CERN and storing them on tape;
 recording processed data from CERN and storing them on disk;
 providing data to other Tier-1 / Tier-2;
 reprocessing raw data;
 event simulation calculations.
Russia: Normalised CPU time per site and VO (2012-2013)
All VOs: Russia – 409,249,900; JINR – 183,008,044
CMS: Russia – 112,025,416; JINR – 67,938,700 (61%)
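The quoted JINR share follows directly from these accounting totals; a minimal cross-check in Python (figures copied from the numbers above, variable names are illustrative):

```python
# Normalised CPU time, 2012-2013, as reported above (units as on the accounting portal)
russia_all_vos, jinr_all_vos = 409_249_900, 183_008_044
russia_cms, jinr_cms = 112_025_416, 67_938_700

print(f"JINR share of Russia, all VOs: {jinr_all_vos / russia_all_vos:.0%}")  # ~45%
print(f"JINR share of Russia, CMS:     {jinr_cms / russia_cms:.0%}")          # ~61%
```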
10
11
Network infrastructure
12
The Core of LHC Networking:
LHCOPN and Partners
14
Infrastructure and Facilities
15
16
JINR CMS Tier-1 progress
Year                    2013    2014    2015    2016
CPU (HEPSpec06)        28800   57600   69120   82944
Number of cores         2400    4800    5760    6912
Disk (Terabytes)        3500    4500    5400    6480
Tape (Terabytes)        5700    8000    9600   10520
Link CERN-JINR (Gbps)     10      10      40      40

● Disk & server installation and tests: done
● Tape system installation: done
● Organization of network infrastructure and connectivity to CERN via GEANT: done
● Registration in GOC DB and APEL: done
● Tests of WLCG services via Nagios: done
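As a quick consistency check of the table above, the planned CPU capacity scales linearly with the core count at a fixed per-core rating; a minimal sketch in Python (values copied from the table; the 12 HEPSpec06 per core figure is derived here, not stated on the slide):

```python
# Planned JINR CMS Tier-1 capacity by year, copied from the table above
plan = {
    2013: {"hs06": 28800, "cores": 2400, "disk_tb": 3500, "tape_tb": 5700},
    2014: {"hs06": 57600, "cores": 4800, "disk_tb": 4500, "tape_tb": 8000},
    2015: {"hs06": 69120, "cores": 5760, "disk_tb": 5400, "tape_tb": 9600},
    2016: {"hs06": 82944, "cores": 6912, "disk_tb": 6480, "tape_tb": 10520},
}

for year, r in sorted(plan.items()):
    # Every planned year works out to the same 12 HEPSpec06 per core
    print(year, r["hs06"] / r["cores"], "HEPSpec06 per core")
```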
17
JINR monitoring
Network monitoring information system: more than 423 network nodes are in round-the-clock monitoring
18
Services and Readiness
19
CMS-specific activity
● Currently commissioning Tier-1 resources for CMS:
– Local tests of CMS VO-services and CMS SW
– The PhEDEx LoadTest (tests of data transfer links)
– Job Robot tests (or tests via HammerCloud)
– Long-running CPU-intensive jobs
– Long-running I/O-intensive jobs
● PhEDEx transferred RAW input data to our storage element with a transfer efficiency of around 90%
● Prepared services and data storage for the reprocessing of the 2012 8 TeV data
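The ~90% figure is a link-commissioning transfer efficiency, i.e. the fraction of transfer attempts that complete successfully; a minimal illustrative sketch (the sample counts below are hypothetical, only the 90% figure comes from the slide):

```python
def transfer_efficiency(successful: int, failed: int) -> float:
    """Fraction of transfer attempts that completed successfully."""
    attempts = successful + failed
    return successful / attempts if attempts else 0.0

# Hypothetical commissioning sample: 4500 successful transfers, 500 failures
print(f"{transfer_efficiency(4500, 500):.0%}")  # -> 90%
```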
20
CMS Tier-1 Readiness
21
CMS Tier-1 in Dashboard
Data transfer
link to CERN
Frameworks for Grid cooperation of JINR
 Worldwide LHC Computing Grid (WLCG)
 Enabling Grids for E-sciencE (EGEE) - now EGI-InSPIRE
 RDIG Development
 CERN-RFBR project “Grid Monitoring from VO perspective”
 BMBF grant “Development of the grid-infrastructure and tools to provide joint investigations performed with participation of JINR and German research centers”
 “Development of grid segment for the LHC experiments”, supported in the frame of the JINR-South Africa cooperation agreement
 Development of a grid segment at Cairo University and its integration into the JINR GridEdu infrastructure
 JINR - FZU AS Czech Republic project “The grid for the physics experiments”
 NASU-RFBR project “Development and support of LIT JINR and NSC KIPT grid-infrastructures for distributed CMS data processing of the LHC operation”
 JINR-Romania cooperation: Hulubei-Meshcheryakov programme
 JINR-Moldova cooperation (MD-GRID, RENAM)
 JINR-Mongolia cooperation (Mongol-Grid)
22
23
Staffing
ROLE                                 FTE
Administrative                       1.5
Network support                      2
Engineering Infrastructure           2.5
Hardware support                     3
Core software and WLCG middleware    4.5
CMS Services                         3.5
Total                                17

Responsible: Korenkov V., Mitsyn V., Dolbilov A., Trofimov V., Shmatov S.
24
Milestones
25
Milestones of the JINR CMS Tier-1 Deployment and Commissioning

Objective | Target date
Presentation of the Execution Plan to the WLCG OB | Sep 2012
Prototype:
Disk & servers installation and tests | Oct 2012
Tape system installation | Nov 2012
Organization of network infrastructure and connectivity to CERN via GEANT (2 Gb) | Nov 2012
WLCG OPN integration (2 Gb) and JINR-T1 registration in GOCDB, including integration with the APEL accounting system | Dec 2012
M1 | Dec 2012
LHC OPN functional tests (2 Gb) | May 2013
Test of WLCG and CMS services (2 Gb LHCOPN) | May 2013
Test of tape system at JINR: data transfers from CERN to JINR (using 2 Gb LHC OPN) | May 2013
Test of publishing accounting data | May 2013
Definition of Tier-2 sites support | May 2013
Connectivity to CERN at 10 Gb | Jul 2013
M2 | Jul 2013
LHC OPN functional tests (10 Gb) | Aug 2013
Test of tape system at JINR: data transfers from CERN to JINR (using 10 Gb LHC OPN) | Aug 2013
Upgrade of tape, disk and CPU capacity at JINR | Nov 2013
M3 | Nov 2013
85% of the job capacity running for at least 2 months
Storage availability > 98% (functional tests) for at least 2 months
Running with > 98% Availabilities & Reliabilities for at least 30 days
WLCG MoU as an associate Tier-1 center | Feb 2014
Disk & Tape & Servers upgrade | Oct 2014
M4 | Dec 2014
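The availability and reliability figures in the criteria above are typically computed from the monthly functional-test results: availability over the whole period, reliability with scheduled downtime excluded from the denominator. A minimal sketch with hypothetical numbers (only the 98% threshold comes from the milestone table):

```python
def availability(hours_up: float, hours_total: float) -> float:
    # Fraction of the whole period during which the site passed the functional tests
    return hours_up / hours_total

def reliability(hours_up: float, hours_total: float, hours_scheduled_down: float) -> float:
    # Same, but scheduled downtime is excluded from the denominator
    return hours_up / (hours_total - hours_scheduled_down)

# Hypothetical 30-day month: 710 h up, 6 h scheduled downtime, 4 h unscheduled outages
print(f"availability: {availability(710, 720):.1%}")   # 98.6% -> above the 98% target
print(f"reliability:  {reliability(710, 720, 6):.1%}") # 99.4%
```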
26
Main tasks for next years
 Engineering infrastructure (uninterruptible power supply and climate control)
 High-speed, reliable network infrastructure with a dedicated reserved channel to CERN (LHCOPN)
 Computing system and storage system based on high-capacity disk arrays and tape libraries
 100% reliability and availability
[Map of WLCG Tier-1 centers (26 June 2009): CERN, US-BNL, US-FNAL, Ca-TRIUMF, Amsterdam/NIKHEF-SARA, Bologna/CNAF, Barcelona/PIC, Lyon/CCIN2P3, De-FZK, UK-RAL, NDGF, Taipei/ASGC; Russia: NRC KI, JINR]
The 6th International Conference "Distributed
Computing and Grid-technologies in Science
and Education" (GRID’2014)
Dubna, 30 June-5 July 2014
GRID’2012 Conference: 22 countries, 256 participants, 40 Universities and Institutes from Russia, 31 plenary and 89 section talks
28
29
Conclusions
 In 2012-2013 a CMS Tier1 prototype was created in Dubna
 Disk & server installation and tests
 Prototype tape system installation and tests
 Organization of network infrastructure and connectivity to CERN via GEANT
 Registration in GOC DB and APEL
 Tests of WLCG services via Nagios
 CMS-specific tests
 Commissioning of data transfer links (T0-T1, T1-T1, T1-T2) in progress
 We expect to meet the start of the next LHC run with the full required resources (by the end of 2014)