CMS Software & Computing
C. Charlot / LLR-École Polytechnique, CNRS & IN2P3 for the CMS collaboration
The Context
• LHC challenges
• Data Handling & Analysis
• Analysis environments
• Requirements & constraints
Challenges: Complexity
Detector:
• ~2 orders of magnitude more channels than today
• Triggers must choose correctly only 1 event in every 400,000
• Level 2&3 triggers are software-based (reliability)
• Computer resources will not be available in a single location

C. Charlot, LLR/Ecole Polytechnique, ACAT02, 24-28 june 2002, Moscow
Challenges: Geographical Spread
• 1700 physicists, 150 institutes, 32 countries
• CERN member states 55%, non-member states 45%
• ~500 physicists analysing data in 20 physics groups
Major challenges associated with:
• Communication and collaboration at a distance
• Distribution of existing and future computing resources
• Remote software development and physics analysis
HEP Experiment-Data Analysis
[Data-flow diagram: the detector feeds the Event Filter and Object Formatter, which store reconstructed objects in a Persistent Object Store managed by a Database Management System; environmental data, detector control, online monitoring and quasi-online reconstruction are stored alongside; calibration and data-quality groups store calibrations; group and user analysis request parts of events on demand, leading to the physics paper.]
Data handling baseline
CMS data model for computing in year 2007
• Typical objects 1 KB-1 MB
• 3 PB of storage space
• 10,000 CPUs
• Hierarchy of sites: 1 Tier-0 + 5 Tier-1 + 25 Tier-2 all over the world
• Network bandwidth between sites: 0.6-2.5 Gbit/s
Analysis environments
Real-Time Event Filtering and Monitoring
– Data-driven pipeline
– Emphasis on efficiency (keep up with the rate!) and reliability

Simulation, Reconstruction and Event Classification
– Massive parallel batch-sequential process
– Emphasis on automation, bookkeeping, error recovery and rollback mechanisms

Interactive Statistical Analysis
– Rapid Application Development environment
– Efficient visualization and browsing tools
– Ease of use for every physicist

Boundaries between environments are fuzzy
– e.g. physics analysis algorithms will migrate to the online to make the trigger more selective
Architecture Overview
[Architecture diagram: a consistent user interface built on a coherent set of basic tools and mechanisms: data browser, generic analysis tools, GRID, analysis job wizards, CMS tools (ORCA, OSCAR, FAMOS, COBRA), ODBMS tools, detector/event display, federation wizards, software development and installation; all sitting on the distributed data store & computing infrastructure.]
TODAY
• Data production and analysis challenges
• Transition to ROOT/IO
• Ongoing work on baseline software
CMS Production stream

  #  Task                    Application  Input                Output               Req. on resources
  1  Generation              Pythia       None                 Ntuple               (static link)
  2  Simulation              CMSIM        Ntuple               FZ file              Geometry files
  3  Hit Formatting          ORCA H.F.    FZ file              DB                   Storage, shared libs, full CMS env.
  4  Digitization            ORCA Digi.   DB                   DB                   Storage, DB, shared libs, full CMS env.
  5  Physics group analysis  ORCA User    DB                   Ntuple or root file  Shared libs, full CMS env.
  6  User Analysis           PAW/Root     Distributed ntuple   Plots                Interactive environment
                                          or root file
Production 2002: the scales
Number of Regional Centers: 11
Number of Computing Centers: 21
Number of CPUs: ~1000
Number of production passes per dataset (including analysis-group processing done by production): 6-8
Number of files: ~11,000
Data size (not including FZ files from simulation): 17 TB
File transfer over the WAN: 7 TB toward Tier-1, 4 TB toward Tier-2

Participating sites: Bristol/RAL, Caltech, CERN, FNAL, IC, IN2P3, INFN, Moscow, UCSD, UFL, WISC
Production center setup
Most critical task is digitization:
– 300 KB per pile-up event
– 200 pile-up events per signal event, i.e. 60 MB read per signal event
– 10 s to digitize 1 full event on a 1 GHz CPU
– 6 MB/s per CPU (12 MB/s per dual-processor client)
– Up to ~5 clients per pile-up server (~60 MB/s on its Gigabit network card)
– Fast disk access required

[Diagram: pile-up DB feeding a pile-up server at ~60 MB/s, serving ~5 clients at 12 MB/s each]
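The throughput figures above are consistent with one another; a quick back-of-envelope check (decimal units assumed, all constants taken from the bullet list):

```python
# Back-of-envelope check of the digitization throughput figures quoted above.
KB, MB = 1_000, 1_000_000

pileup_event_size = 300 * KB    # bytes per pile-up event
pileup_per_signal = 200         # pile-up events mixed into each signal event
digitize_time_s = 10            # seconds per full event on a 1 GHz CPU

bytes_per_event = pileup_event_size * pileup_per_signal   # data read per signal event
rate_per_cpu = bytes_per_event / digitize_time_s          # sustained read rate per CPU
rate_per_client = 2 * rate_per_cpu                        # dual-processor client
server_rate = 5 * rate_per_client                         # ~5 clients per pile-up server

print(bytes_per_event // MB, "MB;", rate_per_cpu / MB, "MB/s;",
      rate_per_client / MB, "MB/s;", server_rate / MB, "MB/s")
# 60 MB; 6.0 MB/s; 12.0 MB/s; 60.0 MB/s
```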
Spring02: production summary
• CMSIM: 1.2 seconds/event for 4 months; 6M events requested, produced February 8 to May 31
• High-luminosity (10^34) digitization: 1.4 seconds/event for 2 months; 3.5M events requested, produced April 19 to June 7
Production Interface
Production processing
[Diagram: a request such as "Produce 100000 events dataset mu_MB2mu_pt4" enters the production "RefDB" (request summary file); the production manager coordinates task distribution to the Regional Centers; at each Regional Center IMPALA performs the decomposition into job scripts and their monitoring; jobs run on the RC farm, with data location tracked through the production DB, farm storage and the RC BOSS DB.]
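As an illustration of the decomposition step, a toy sketch of how a request like the one above might be split into location-independent job specifications (the function, the field names and the 250-events-per-job granularity are all hypothetical choices, not actual IMPALA code):

```python
# Hypothetical sketch of RefDB-request decomposition in the spirit of IMPALA.
def decompose_request(dataset, total_events, events_per_job=250):
    """Split a production request into location-independent job specs."""
    jobs, first, job_id = [], 0, 0
    while first < total_events:
        n = min(events_per_job, total_events - first)
        jobs.append({
            "dataset": dataset,      # which dataset this job contributes to
            "job_id": job_id,        # position in the production stream
            "first_event": first,    # where this job starts generating
            "n_events": n,           # how many events this job produces
        })
        first += n
        job_id += 1
    return jobs

jobs = decompose_request("mu_MB2mu_pt4", 100_000)
print(len(jobs), jobs[0]["n_events"], jobs[-1]["first_event"])
# 400 250 99750
```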
RefDB Assignement Interface
• Selection of a set of requests and their assignment to an RC
• The RC contact persons get an automatic email with the assignment ID to be used as argument to the IMPALA scripts ("DeclareCMKINJobs.sh -a <id>")
• Re-assignment of a request to another RC or production site
• List and status of assignments
IMPALA
• Data product is a DataSet (typically a few hundred jobs)
• IMPALA performs production task decomposition and script generation
– Each step in the production chain is split into 3 sub-steps
– Each sub-step is factorized into customizable functions

1. JobDeclaration: search for something to do
2. JobCreation: generate jobs from templates
3. JobSubmission: submit jobs to the scheduler
Job declaration, creation, submission
• Jobs to do are automatically discovered:
– by looking for output of the previous step in a predefined directory, for the Fortran steps
– by querying the Objectivity/DB federation, for Digitization, Event Selection and Analysis
• Once the to-do list is ready, the site manager can generate instances of jobs starting from a template
• Job execution includes validation of the produced data
• Thanks to the decomposition into customizable functions, site managers can:
– define the local actions to be taken to submit the job (local job scheduler specificities, queues, ...)
– define the local actions to be taken before and after the start of the job (staging input from MSS, staging output to MSS)
• Auto-recovery of crashed jobs
– input parameters are automatically changed to restart the job at the crash point
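A minimal sketch of the three sub-steps, with illustrative names throughout (this is not actual IMPALA code): declaration discovers work by checking for the previous step's output, creation instantiates a template, and submission defers to a site-overridable hook.

```python
# Illustrative sketch of the IMPALA sub-step decomposition (names are invented).
import os

def declare_jobs(prev_output_dir, all_jobs):
    """JobDeclaration: find work to do by looking for the previous step's
    output files in a predefined directory (Fortran-steps style discovery)."""
    done = set(os.listdir(prev_output_dir)) if os.path.isdir(prev_output_dir) else set()
    return [j for j in all_jobs if f"{j}.out" not in done]

def create_job(job, first_event=0):
    """JobCreation: instantiate a job script from a template; auto-recovery
    would re-create the job with first_event moved past the crash point."""
    return f"#!/bin/sh\n./run_step --job {job} --first-event {first_event}\n"

def submit_job(script, local_submit=None):
    """JobSubmission: hand the script to a site-specific scheduler hook;
    the default stands in for e.g. a local batch queue."""
    if local_submit is None:
        local_submit = lambda s: f"queued:{abs(hash(s)) % 10000}"
    return local_submit(script)

todo = declare_jobs("/no/such/dir", ["job0", "job1"])   # nothing produced yet
ticket = submit_job(create_job(todo[0]))
print(len(todo), ticket.startswith("queued:"))
# 2 True
```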
BOSS job monitoring
[Diagram: boss submit / boss query / boss kill commands go to BOSS, which talks to the local scheduler and the BOSS DB; the wrapped job runs on a farm node.]

• Accepts job submissions from users
• Stores info about the job in a DB
• Builds a wrapper around the job (BossExecuter)
• Sends the wrapper to the local scheduler
• The wrapper sends info about the job to the DB
Getting info from the job
• A registered job has scripts associated with it which are able to understand the job output

BossExecuter: get job info from DB; create and go to workdir; run preprocess; update DB; fork the user executable and the monitor; wait for the user executable; kill the monitor; run postprocess; update DB; exit.

BossMonitor: get job info from DB; while the user executable is running, run runtimeprocess, update DB and wait some time; exit.
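The executer/monitor pattern above can be sketched as follows; this is an illustrative Python reimplementation, not BOSS source, with a dict standing in for the BOSS DB and a hypothetical `events=` line standing in for the job output that the registered scripts would parse:

```python
# Sketch of the BossExecuter/BossMonitor pattern (illustrative, not BOSS code).
import subprocess, sys, threading, time

db = {}  # stands in for the BOSS database

def monitor(proc, logfile, period=0.05):
    """BossMonitor-like loop: while the job runs, parse its output and update
    the DB; one final pass catches output written just before exit."""
    def scan():
        with open(logfile) as f:
            for line in f:
                if line.startswith("events="):
                    db["events"] = int(line.split("=")[1])
    while proc.poll() is None:
        scan()
        time.sleep(period)
    scan()

def executer(cmd, logfile):
    """BossExecuter-like wrapper: preprocess, fork job and monitor, wait for
    the user executable, then postprocess."""
    db["status"] = "running"                          # preprocess: update DB
    with open(logfile, "w") as out:
        proc = subprocess.Popen(cmd, stdout=out)
        t = threading.Thread(target=monitor, args=(proc, logfile))
        t.start()
        proc.wait()                                   # wait for the user executable
        t.join()                                      # monitor stops once job is done
    db["status"] = "done"                             # postprocess: update DB
    return proc.returncode

rc = executer([sys.executable, "-c", "print('events=42')"], "boss_job.log")
print(rc, db["status"], db["events"])
# 0 done 42
```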
CMS transition to ROOT/IO
• CMS has worked up to now with Objectivity
– We managed to make it work, at least for production
– Painful to operate; a lot of human intervention needed
– Now being phased out, to be replaced by LCG software
• Hence we are in a major transition phase
– Prototypes using ROOT + an RDBMS layer are being worked on
– This is done within the LCG context (persistency RTAG)
– Aim to start testing the new system as it becomes available
• Target early 2003 for first realistic tests
OSCAR: Geant4 simulation
• CMS plans to replace CMSIM (Geant3) by OSCAR (Geant4)
• A lot of work since last year
– Many problems on the G4 side have been corrected
– Now integrated in the analysis chain Generator → OSCAR → ORCA using COBRA persistency
– Under geometry & physics validation; overall agreement is rather good
• Still more to do before using it in production

[Plots: SimTrack and HitsAssoc comparisons, CMSIM 122 vs OSCAR 1.3.2_pre03]
OSCAR: Track Finding
[Plot: number of RecHits and SimHits per track vs eta]
Detector Description Database
• Several applications (simulation, reconstruction, visualization) need geometry services: use a common interface to all services
• On the other hand, several detector description sources are currently in use: use a unique internal representation derived from the sources
• A prototype now exists
– co-works with OSCAR
– co-works with ORCA (Tracker, Muons)
ORCA Visualization
• IGUANA framework for visualization
• 3D visualization: multiple views, slices, 2D projections, zoom
• Co-works with ORCA
– Interactive 3D detector geometry for sensitive volumes
– Interactive 3D representations of reconstructed and simulated events, including display of physics quantities
– Access event by event or by automatically fetching events
– Event and run numbers
TOMORROW
• Deployment of a distributed data system
• Evolve the software framework to match LCG components
• Ramp up computing systems
Toward ONE Grid
• Build a unique CMS-GRID framework (EU + US)
• EU and US grids are not interoperable today; help is needed from the various Grid projects and middleware experts
– Work in parallel in EU and US
• Main US activities:
– PPDG/GriPhyN grid projects
– MOP
– Virtual Data System
– Interactive analysis: the Clarens system
• Main EU activities:
– EDG project
– Integration of IMPALA with EDG middleware
– Batch analysis: user job submission & analysis farm
PPDG MOP system

• PPDG developed the MOP production system
• Allows submission of CMS production jobs from a central location, to run at remote locations and return the results
• Relies on:
– GDMP for replication
– Globus GRAM, Condor-G and local queuing systems for job scheduling
– IMPALA for job specification
– DAGMan for management of dependencies between jobs
• Being deployed in the USCMS testbed
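To illustrate what DAGMan contributes, a toy dependency scheduler over the production chain: a job is released only when all of its parents have finished. This is a plain topological order (the real DAGMan also handles retries, throttling and so on):

```python
# Toy illustration of DAGMan-style dependency management between jobs.
deps = {                              # child -> parents, mirroring the chain
    "Generation": [],
    "Simulation": ["Generation"],
    "HitFormatting": ["Simulation"],
    "Digitization": ["HitFormatting"],
    "Analysis": ["Digitization"],
}

def schedule(deps):
    """Return an execution order where every job follows all of its parents."""
    done, order = set(), []
    while len(done) < len(deps):
        ready = sorted(j for j in deps
                       if j not in done and all(p in done for p in deps[j]))
        if not ready:
            raise ValueError("dependency cycle")
        for j in ready:
            order.append(j)
            done.add(j)
    return order

print(schedule(deps))
# ['Generation', 'Simulation', 'HitFormatting', 'Digitization', 'Analysis']
```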
CMS EU Grid Integration

• CMS EU developed the integration of the production tools with EDG middleware
• Allows submission of CMS production jobs using the WP1 JSS from any site that has the client part (UI) installed
• Relies on:
– GDMP for replication
– WP1 for job scheduling
– IMPALA for job specification
• Being deployed in the CMS DataTAG testbed (UK, France, INFN, Russia)
CMS EDG Production prototype

[Diagram of the prototype:
– User Interface: IMPALA gets the request for a production and creates location-independent jobs; the Reference DB has all the information needed by IMPALA to generate a dataset
– Job Submission Service / Resource Broker (Condor-G): finds a suitable location for execution, querying and reading from the tracking DB
– BOSS tracking DB: holds job-specific information; job submission builds the tracking wrapper, which writes updates to the tracking DB
– Information Services: LDAP server holding resource information
– Computing Element: GRAM and the local scheduler dispatch to worker nodes running CMSIM and ORCA against a local Objectivity FDDB
– Storage Element: local storage]
GriPhyN/PPDG VDT Prototype
= no code = existing = implemented using MOP Planner
Abstract Planner (IMPALA)
Concrete Planner/ WP1 Executor MOP/ DAGMan WP1 Script Etc.
Compute Resource BOSS CMKIN Wrap per Scripts CMSIM
ORCA/ COBRA
Storage Resource
Local Tracking DB
Local Grid Storage RefDB
Virtual Data Catalog
Materia lized Data Catalog Catalog Services C. Charlot, LLR/Ecole Polytechnique Replica Catalog GDMP
Objecti vity Federation Catalog
Replica Mgmt ACAT02, 24-28 june 2002, Moscow
CLARENS: a Portal to the Grid
• Grid-enabling environment for remote data analysis
• Clarens is a simple way to implement web services on the server
• No Globus needed on the client side, only a certificate
• The server will provide a remote API to Grid tools:
– security services provided by the Grid (GSI)
– the Virtual Data Toolkit: object collection access
– data movement between Tier centres using GSI-FTP
– access to CMS analysis software (ORCA/COBRA)
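A minimal sketch of the Clarens idea, using Python's standard XML-RPC machinery as a stand-in (the method name and transport details are illustrative assumptions; the real Clarens defines its own API and adds GSI certificate handling):

```python
# Sketch of a web-service-style remote API: the server exposes an analysis
# operation as a method; the client needs only HTTP, no Globus installation.
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
port = server.server_address[1]          # ephemeral port chosen by the OS

def list_collections():
    """Stand-in for object-collection access through the server."""
    return ["mu_MB2mu_pt4", "jets_highlumi"]

server.register_function(list_collections)
threading.Thread(target=server.serve_forever, daemon=True).start()

client = ServerProxy(f"http://127.0.0.1:{port}")
collections = client.list_collections()  # plain remote procedure call
print(collections)
# ['mu_MB2mu_pt4', 'jets_highlumi']
server.shutdown()
```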
Conclusions
• CMS has performed large-scale distributed production of Monte Carlo events
• Baseline software is progressing, now within the new LCG context
• Grid is the enabling technology for the deployment of distributed data analysis
• CMS is engaged in testing and integrating grid tools in its computing environment
• Much work remains to be ready for distributed data analysis at LHC startup