CMS Status and Plans

Progress towards GridPP milestones
• Workload management (ICST&M)
• Monitoring (Brunel)
• Data management (Bristol)
• See also separate talks tomorrow by Dave Colling, Owen Maroney

All-hands demo
• See talk/demonstration today by Sarah Marr & Dave Colling

Future plans
• Data challenges in 03/04
• Network performance issues
CMS Report – GridPP Collaboration Meeting V
Peter Hobson, Brunel University
16/9/2002
Production computing

Phil Lewis (ICST&M, London): Workload Management

Contributed to the multistage MC production

Key contribution to current production MC
Regional Centre   Simulation   Hits   No Pile Up   2x10^33   2x10^34   NassID
Bristol/RAL          0.55      0.33      0.04        0.06      0.02       20
Caltech              0.17      0.15      0.00        0.15      0.00        6
CERN                 0.89      2.20      1.40        2.66      2.25      300
Fermilab             0.35      0.41      0.00        0.25      0.33       70
ICST&M               0.88      0.59      0.50        0.15      0.12       84
IN2P3                0.20      0.00      0.00        0.00      0.00        1
INFN                 1.55      1.18      0.40        0.72      0.71       99
Moscow               0.43      0.14      0.14        0.00      0.00       41
UCSD                 0.34      0.30      0.00        0.29      0.30       80
UFL                  0.54      0.04      0.00        0.04      0.04       11
USMOP                0.00      0.00      0.00        0.00      0.00        1
Wisconsin            0.07      0.08      0.00        0.06      0.00       12
TOTAL                5.94      5.40      2.47        4.36      3.77

Total number of MC events produced/processed (millions) for Q1 and Q2 of 2002
Production computing

Dave Colling (ICST&M, London): Workload Management

Contributed to the multistage MC production
• This includes the Objectivity part and the ability to run on sites that have no CMS software installed, in which case it does a DAR installation first (sketched below)
• This work is only prevented from being very effective by the GASS cache bug

Sheffield demo of the CMS portal
• Two stages of the MC production ran during the demo

Participated in the CMS production tools grid review, which is working towards a unified grid approach for CMS grids in Europe and the States
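The DAR fallback amounts to a simple decision in the job wrapper: use the site's CMS installation if one exists, otherwise fetch and unpack a self-contained DAR archive before running. A minimal sketch of that logic, assuming hypothetical paths, environment variable, archive URL and executable name (none of these are the actual production scripts):

```python
import os
import subprocess
import tarfile
import urllib.request

# All names here are placeholders for illustration, not the production setup.
SITE_INSTALL = os.environ.get("VO_CMS_SW_DIR", "/opt/cms")   # assumed location of a site install
DAR_URL = "http://example.org/cms/cmsim_dar.tar.gz"          # placeholder DAR archive URL
SCRATCH = os.environ.get("TMPDIR", "/tmp")

def ensure_cms_software():
    """Return a directory holding the CMS executables: prefer a site
    installation, otherwise unpack a self-contained DAR archive first."""
    if os.path.isdir(SITE_INSTALL):
        return SITE_INSTALL
    archive = os.path.join(SCRATCH, "cms_dar.tar.gz")
    urllib.request.urlretrieve(DAR_URL, archive)              # fetch the DAR bundle
    install_dir = os.path.join(SCRATCH, "cms_dar")
    os.makedirs(install_dir, exist_ok=True)
    with tarfile.open(archive) as tar:
        tar.extractall(install_dir)                           # the "DAR installation"
    return install_dir

def run_stage(executable, steering_file):
    """Run one production stage using whichever installation was found."""
    sw_dir = ensure_cms_software()
    subprocess.run([os.path.join(sw_dir, executable), steering_file], check=True)

run_stage("cmsim", "pythia.cards")    # hypothetical stage name and steering file
```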
Web portal

Sarah Marr (ICST&M, London): Workload management
Manpower report

Barry MacEvoy (Imperial College London): CMS WP8, 0.5 FTE

Activities
• Installation of LCFG server to build and configure testbed nodes on CMS farm
• Some work on multi-stage Monte Carlo production
• R-GMA interface to BOSS (in collaboration with Nebrensky et al.)
• Preparation of CMS demo and associated literature
• Data analysis architecture design… just started
Adding R-GMA to BOSS

Henry Nebrensky (Brunel): Monitoring, 0.5 + 0.3 FTE

BOSS is the job submission and tracking system used by CMS.

BOSS is not "Grid enabled"

Using EDG (WP3) R-GMA release 2
Data is currently sent back directly from the WN to the BOSS database on the UI
This is being replaced by an R-GMA producer and consumer (see the sketch below)
Status today:
• Operating an R-GMA schema and registry server
• A mock BOSS job in C++ exists for test purposes
• Currently integrating this code into BOSS
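A schematic sketch of the producer/consumer change is below. This is not the R-GMA API: the in-process queue stands in for the R-GMA servlets and registry, and the table schema is invented. It only shows the shape of the replacement, i.e. the worker node publishes tuples instead of opening a connection straight back to the BOSS database on the UI, and a consumer near the UI archives them.

```python
import queue
import sqlite3
import time

# Stand-in for the R-GMA registry/stream: in reality tuples travel through
# R-GMA servlets, not an in-process queue. The table schema is invented.
job_monitoring_stream = queue.Queue()

def producer_on_worker_node(job_id, step, status):
    """Runs alongside the job on the WN: publish a tuple instead of opening
    a direct connection back to the BOSS database on the UI."""
    job_monitoring_stream.put((job_id, step, status, time.time()))

def consumer_near_boss_db(db_path="boss.db"):
    """Runs near the UI: drain the stream and archive the tuples into BOSS."""
    db = sqlite3.connect(db_path)
    db.execute("CREATE TABLE IF NOT EXISTS job_status "
               "(job_id TEXT, step TEXT, status TEXT, ts REAL)")
    while not job_monitoring_stream.empty():
        db.execute("INSERT INTO job_status VALUES (?, ?, ?, ?)",
                   job_monitoring_stream.get())
    db.commit()

# Example: a mock job reports two steps, then the consumer drains the stream.
producer_on_worker_node("job42", "cmsim", "RUNNING")
producer_on_worker_node("job42", "cmsim", "DONE")
consumer_near_boss_db()
```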
Testbed, Data Management

Owen Maroney (University of Bristol): Data Management

Testbed Site
Operating with the EDG 1.2 testbed in the GridPP VO
Hosting the GridPP VO Replica Catalogue service

Data Management
Testing of GDMP, Replica Catalogue and Replica Manager services between Bristol, RAL and CERN
Regional Data-Centre milestone requirements (rough numbers below):
• Replicate >10TB of CMS data between CERN and RAL in >17k files using Grid tools
• Store on the Datastore MSS at RAL
• To be accessible anywhere on the Grid through the Replica Manager services
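Some back-of-envelope numbers for the replication milestone, assuming exactly 10 TB and 17,000 files and an illustrative 100 Mb/s sustained transfer rate (the milestone itself quotes no rate):

```python
# Rough numbers for the Regional Data-Centre replication milestone.
data_volume_tb = 10          # >10 TB of CMS data
n_files = 17_000             # >17k files

mean_file_gb = data_volume_tb * 1e3 / n_files
print(f"mean file size ~ {mean_file_gb:.2f} GB")                         # ~0.59 GB per file

sustained_mbit_s = 100       # assumed sustained rate, for illustration only
days = data_volume_tb * 1e12 * 8 / (sustained_mbit_s * 1e6) / 86400
print(f"{data_volume_tb} TB at {sustained_mbit_s} Mb/s ~ {days:.1f} days")  # ~9.3 days
```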
Grid Object Access

Tim Barrass (University of Bristol)

Entered post 1 September
• Implementation and testing of the new persistency layer for CMS
• Main interests: testing and rollout at the Tier-1, and the interface to Grid services
• Using the POOL framework, which is under development
• Also associated with BaBar (50%); will help with immediate developments of their data storage model
Data Challenge 2003-4

DC04 is a crucial milestone for CMS computing
• An end-to-end test of our offline computing system at 25% scale
• Simulates 25 Hz data recording at 2x10^33 luminosity, for one month (rough event count below)
• Tests software, hardware, networks, organisation
• The first step in the real scale-up to the exploitation phase
• Data will be directly used in preparation of the Physics TDR

The steps:
• Generate simulated data through worldwide production ('DC03')
• Copy raw digitised data to the CERN Tier-0
• 'Play back' through the entire computing system: T0, 2 or 3 proto-T1s operational (US, UK, …), many T2s
• Analyses, calibrations, DQM checks performed at T2 and T3 centres
• Grid middleware is an important part of the computing system
Data Challenge 2003-4

DC03 in the UK (starts July ’03, five months)
• Plan to produce ~50TB of GEANT4 data at T1 and T2 sites, starting July ’03
• All data stored at RAL: this means 60Mb/s continuously into the RAL datastore for 4-5 months
• Data digitised at RAL with full background; 30TB of digis shipped to CERN at 1TB/day (>100Mb/s continuously over WAN)
• New persistency layer (POOL?) used throughout

DC04 in the UK (Feb ’04)
• ~30 TB transferred to the Tier-1 in one month (100Mb/s continuous)
• Data replicated to Tier-2 sites upon demand
• Full analysis framework in place at Tier-1, Tier-2, Tier-3 sites

Some very serious technical challenges here (the sustained rates are checked below)
• The work starts now; CMS milestones oriented accordingly
• If Grid tools are to be fully used, external projects must deliver
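The sustained rates quoted above follow from simple unit conversion; a quick check (using 10^12 bytes per TB gives slightly lower numbers than the round figures on the slide):

```python
def tb_over_days_to_mbit_s(terabytes, days):
    """Convert a data volume moved over a period into a sustained rate in Mb/s."""
    return terabytes * 1e12 * 8 / (days * 86400) / 1e6

# DC03: 30 TB of digis shipped to CERN at 1 TB/day
print(f"1 TB/day    ~ {tb_over_days_to_mbit_s(1, 1):.0f} Mb/s sustained")    # ~93 Mb/s
# DC04: ~30 TB transferred to the Tier-1 in one month
print(f"30 TB/month ~ {tb_over_days_to_mbit_s(30, 30):.0f} Mb/s sustained")  # ~93 Mb/s
```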
Network Performance

Networks are a big issue
All Grid computing is reliant upon high-performance networks
But: data transfer was a bottleneck in previous data challenges
Not through lack of infrastructure; we just don't know how to use the network to its full capability (it is highly non-trivial)
• CMS peak utilisation of the 1Gbit/s+ bandwidth RAL -> CERN is <100Mbit/s (see the window-size estimate below)
Fast data replication underlies the success of DC04

Some initial progress in this area in 2002
BaBar, CMS (+?) using smart(er) transfer tools with good results
Contacts made with PPNCG / WP7
• Discussion at last EB/TB session
• CMS, BaBar, CDF/D0 talks at last week's PPNCG meeting
Starting to get a feel for where the bottlenecks are
• Most often in the local infrastructure
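One well-known contributor to this kind of gap is the TCP window needed to keep a long-distance path full (the bandwidth-delay product). The round-trip time below is an assumed, illustrative value for RAL-CERN, not a measurement:

```python
# Bandwidth-delay product: bytes that must be in flight to fill the path.
link_bit_s = 1e9        # 1 Gbit/s path
rtt_s = 0.020           # assumed round-trip time RAL-CERN (illustrative)

window_needed = link_bit_s * rtt_s / 8
print(f"window needed ~ {window_needed/1e6:.1f} MB")                           # ~2.5 MB

default_window = 64 * 1024                     # a common default TCP window (bytes)
per_stream_bit_s = default_window * 8 / rtt_s
print(f"default window gives ~ {per_stream_bit_s/1e6:.0f} Mbit/s per stream")  # ~26 Mbit/s
```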
Networks - future requirements

Where now?
CMS needs a substantial improvement in data handling capability and throughput to CERN and the US by mid-2003
All experiments will eventually face these problems
Strong expertise in storage and networks exists within the UK and elsewhere; we should use it

First steps:
Practical real-world tests on the production network from the UK Tier-1/A to UK, CERN and US sites, with experts in attendance
• Compare with best-case results from a 'tuned' setup (a minimal probe is sketched below)
Provide dedicated test servers at UK T1 and T2 sites so that we can find the bottlenecks
• These will need to be highly-specified machines, and will need system management support at RAL
Work to see how this relates to the SE architecture, and test
• Must balance flexibility / robustness with throughput
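A minimal memory-to-memory throughput probe along these lines is sketched below, with an option to enlarge the socket buffers to mimic a 'tuned' setup. The host, port, payload and buffer sizes are illustrative choices only, and a loopback run like this says nothing about the WAN; it only shows the shape of the tuned-versus-default comparison.

```python
import socket
import threading
import time

HOST, PORT = "127.0.0.1", 5001      # illustrative endpoints
PAYLOAD_MB = 64
SOCK_BUF = 4 * 1024 * 1024          # "tuned": 4 MB socket buffers; None for OS defaults

def sink():
    """Receive and discard everything one client sends."""
    srv = socket.socket()
    if SOCK_BUF:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, SOCK_BUF)
    srv.bind((HOST, PORT))
    srv.listen(1)
    conn, _ = srv.accept()
    while conn.recv(1 << 16):
        pass
    conn.close()
    srv.close()

def source():
    """Send PAYLOAD_MB of zeros and report the approximate rate in Mbit/s."""
    cli = socket.socket()
    if SOCK_BUF:
        cli.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, SOCK_BUF)
    cli.connect((HOST, PORT))
    chunk = b"\0" * (1 << 20)                 # 1 MB chunks
    start = time.time()
    for _ in range(PAYLOAD_MB):
        cli.sendall(chunk)
    cli.close()
    print(f"~{PAYLOAD_MB * 8 / (time.time() - start):.0f} Mbit/s")

t = threading.Thread(target=sink)
t.start()
time.sleep(0.2)      # give the server a moment to start listening
source()
t.join()
```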
Summary

All three UK sites working in a coherent fashion

Significant progress in all areas

UK has made a major contribution to production MC for CMS

Bristol now hosts the VO replica catalogue

Contributed to the “Hands On” meeting

All tasks are currently on target to meet their milestones.

BUT


Major Data Challenges coming up in 2003 and 2004
Technical challenges in efficient use of WAN capacity