Modeling LHC Regional Centers with the MONARC Simulation Tools


MONARC: results and open issues
Laura Perini
Milano
Layout of the talk
(Most material is from Irwin Gaines' talk at CHEP2000.)
- The basic goals and structure of the project
- The Regional Centers: motivation, characteristics, functions
- Some results from the simulations
- The need for more realistic, "implementation oriented" models: Phase-3
- Relations with GRID
- Status of the project: Phase-3 LOI presented in January; Phase-2 Final Report to be published next week; milestones and basic goals met
MONARC
A joint project (LHC experiments and CERN/IT) to understand issues associated with distributed data access and analysis for the LHC:
- Examine distributed data plans of current and near-future experiments
- Determine characteristics and requirements for LHC regional centers
- Understand details of the analysis process and data access needs for LHC data
- Measure critical parameters characterizing distributed architectures, especially database and network issues
- Create modeling and simulation tools
- Simulate a variety of models to understand constraints on architectures
MONARC
Models Of Networked Analysis At Regional Centers
Caltech, CERN, FNAL, Heidelberg, INFN, Helsinki, KEK, Lyon, Marseilles, Munich, Orsay, Oxford, RAL, Tufts, ...
[Diagram: desktops connected to university centres (n x 10**6 MIPS, m TByte, tape robot), FNAL (4 x 10**7 MIPS, 110 TByte, tape robot) and CERN (n x 10**7 MIPS, m PByte, tape robot).]
GOALS
- Specify the main parameters characterizing the Model's performance: throughputs, latencies
- Determine classes of Computing Models feasible for LHC (matched to network capacity and data handling resources)
- Develop "Baseline Models" in the "feasible" category
- Verify resource requirement baselines (computing, data handling, networks)
COROLLARIES:
- Define the Analysis Process
- Define Regional Center Architectures
- Provide Guidelines for the final Models
Working Groups
Architecture WG
- Baseline architecture for regional centres, technology tracking, survey of the computing models of current HENP experiments
Analysis Model WG
- Evaluation of the LHC data analysis model and use cases
Simulation WG
- Develop a simulation tool set for performance evaluation of the computing models
Testbed WG
- Evaluate the performance of ODBMS and networks in the distributed environment
General need for distributed data access and analysis
Potential problems of a single centralized computing center include:
- scale of LHC experiments: difficulty of accumulating and managing all resources at one location
- geographic spread of LHC experiments: providing equivalent, location-independent access to data for physicists
- help desk, support and consulting in the same time zone
- cost of LHC experiments: optimizing use of resources located worldwide
Motivations for Regional Centers
A distributed computing architecture based on regional centers offers:
- a way of utilizing the expertise and resources residing in computing centers all over the world
- local consulting and support
- a way to maximize the intellectual contribution of physicists all over the world without requiring their physical presence at CERN
- acknowledgement of possible limitations of network bandwidth
- freedom for people to make choices on how they analyze data, based on availability or proximity of various resources such as CPU, data, or network bandwidth
Future Experiment Survey: Analysis/Results
- From the previous survey, we saw many sites contributed to Monte Carlo generation
- This is now the norm
- New experiments are trying to use the Regional Center concept
  - BaBar has Regional Centers at IN2P3 and RAL, and a smaller one in Rome
  - STAR has a Regional Center at LBL/NERSC
  - CDF and D0 offsite institutions are paying more attention as the run gets closer
Future Experiment Survey: Other observations/requirements
- In the last survey, we pointed out the following requirements for RCs:
  - 24x7 support
  - a software development team
  - a diverse body of users
  - good, clear documentation of all s/w and s/w tools
- The following are requirements for the central site (i.e. CERN):
  - a central code repository that is easy to use and easily accessible for remote sites
  - be "sensitive" to remote sites in database handling, raw data handling and machine flavors
  - provide good, clear documentation of all s/w and s/w tools
- The experiments in this survey achieving the most in distributed computing are following these guidelines
Tier 0: CERN
Tier 1: National "Regional" Center
Tier 2: Regional Center
Tier 3: Institute Workgroup Server
Tier 4: Individual Desktop
Total: 5 levels
CMS Offline Farm at CERN circa 2006 (lmr for MONARC study, April 1999)
[Diagram: the CERN processor farm (1400 boxes, 160 clusters, 40 sub-farms) on a farm network (480 Gbps*), connected via LAN-SAN routers to a storage network serving disks (5400 disks, 340 arrays) and tape drives (100 drives), and via LAN-WAN routers to the outside; indicated link speeds range from 0.8 to 250 Gbps, including 0.8 Gbps from the DAQ.]
* assumes all disk & tape traffic on the storage network; double these numbers if all disk & tape traffic goes through the LAN-SAN routers.
Farm capacity and evolution

year                    2004     2005     2006     2007
total cpu (SI95)      70'000  350'000  520'000  700'000
disks (TB)                40      340      540      740
LAN thr-put (GB/sec)       6       31       46       61
tapes (PB)               0.2        1        3        5
tape I/O (GB/sec)        0.2      0.3      0.5      0.5
Processor cluster (lmr for MONARC study, April 1999)
basic box
- four 100 SI95 processors
- standard network connection (~2 Gbps)
- 15% of systems configured as I/O servers (disk server, disk-tape mover, Objy AMS, ...) with an additional connection to the storage network
cluster
- 9 basic boxes with a network switch (<10 Gbps)
sub-farm
- 4 clusters with a second-level network switch (<50 Gbps)
- 36 boxes, 144 cpus, 5 m2; one sub-farm fits in one rack
- cluster and sub-farm sizing adjusted to fit conveniently the capabilities of network switch, racking and power distribution components
[Diagram: sub-farm connected to the farm network (1.5 Gbps) and, through its I/O servers, to the storage network (3 Gbps*).]
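A quick back-of-envelope check of how these box/cluster/sub-farm figures scale up to the 2006 farm described earlier (a sketch in Python; the 40-sub-farm count and the comparison numbers are taken from the farm diagram and the capacity table above):

```python
# Back-of-envelope check of the CERN farm sizing quoted above.
# Figures come from the slides; nothing here is an official CMS/MONARC number.

BOX_CPUS = 4             # four processors per basic box
CPU_SI95 = 100           # 100 SI95 per processor
BOXES_PER_CLUSTER = 9    # 9 basic boxes per cluster
CLUSTERS_PER_SUBFARM = 4
SUBFARMS = 40            # 40 sub-farms in the 2006 farm diagram

boxes = SUBFARMS * CLUSTERS_PER_SUBFARM * BOXES_PER_CLUSTER
clusters = SUBFARMS * CLUSTERS_PER_SUBFARM
total_si95 = boxes * BOX_CPUS * CPU_SI95
io_servers = round(0.15 * boxes)   # 15% of systems configured as I/O servers

print(f"boxes       = {boxes}")       # 1440   (~1400 in the farm diagram)
print(f"clusters    = {clusters}")    # 160    (matches the farm diagram)
print(f"total SI95  = {total_si95}")  # 576000 (between the 2006 and 2007 table entries)
print(f"I/O servers = {io_servers}")  # 216
```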
Regional Centers
Regional Centers will:
- Provide all technical services and data services required to do the analysis
- Maintain all (or a large fraction of) the processed analysis data; possibly may only have large subsets based on physics channels; maintain a fixed fraction of fully reconstructed and raw data
- Cache or mirror the calibration constants
- Maintain excellent network connectivity to CERN and excellent connectivity to users in the region; data transfer over the network is preferred for all transactions, but transfer of very large datasets on removable data volumes is not ruled out
- Share/develop common maintenance, validation, and production software with CERN and the collaboration
- Provide services to physicists in the region, contribute a fair share to post-reconstruction processing and data analysis, collaborate with other RCs and CERN on common projects, and provide services to members of other regions on a best-effort basis to further the science of the experiment
- Provide support services, training, documentation and troubleshooting to RC and remote users in the region
Mass Storage & Disk
Servers
Database Servers
Network
from CERN
Tier 2
Data
Import
Network
from Tier 2 and
simulation
centers
Data
Export
Production
Reconstruction
Raw/Sim-->ESD
Tapes
Scheduled,
predictable
experiment/
physics groups
Production
Analysis
Individual
Analysis
ESD-->AOD
AOD-->DPD
AOD-->DPD
and plots
Scheduled
Chaotic
Physics groups
Physicists
Local
institutes
CERN
Tapes
Desktops
Support Services
Physics
Software
Development
L. Perini
R&D Systems
and Testbeds
MONARC: results and open issuesL
Info servers
Code servers
Web Servers
Telepresence
Servers
Training
Consulting
Help Desk
LUND
16 Mar 2000
Regional Center capacities and data rates (example, annotated on the architecture diagram above)

Total Storage: robotic mass storage - 300 TB
- Raw Data: 50 TB - 5*10**7 events (5% of 1 year)
- Raw (Simulated) Data: 100 TB - 10**8 events
- ESD (Reconstructed Data): 100 TB - 10**9 events (50% of 2 years)
- AOD (Physics Object) Data: 20 TB - 2*10**9 events (100% of 2 years)
- Tag Data: 2 TB (all)
- Calibration/Conditions database: 10 TB
(only the latest version of most data types is kept here)
Central disk cache - 100 TB (per user demand)
CPU required for AMS database servers: ??*10**3 SI95 power

Data input rate from CERN:
- Raw Data - 5% - 50 TB/yr
- ESD Data - 50% - 50 TB/yr
- AOD Data - all - 10 TB/yr
- Revised ESD - 20 TB/yr
Data input from Tier 2:
- Revised ESD and AOD - 10 TB/yr
Data input from Simulation Centers:
- Raw Data - 100 TB/yr

Data output rate to CERN:
- AOD Data - 8 TB/yr
- Recalculated ESD - 10 TB/yr
- Simulation ESD data - 10 TB/yr
Data output to Tier 2:
- Revised ESD and AOD - 15 TB/yr
Data output to local institutes:
- ESD, AOD, DPD data - 20 TB/yr
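The storage and transfer figures above can be summed as a simple consistency check (a small sketch; all values are copied from the slide):

```python
# Aggregate the Regional Center figures listed above.

storage_tb = {
    "raw (5% of 1 year)": 50,
    "raw simulated": 100,
    "ESD (50% of 2 years)": 100,
    "AOD (100% of 2 years)": 20,
    "tag": 2,
    "calibration/conditions": 10,
}
import_tb_per_yr = {
    "raw from CERN": 50, "ESD from CERN": 50, "AOD from CERN": 10,
    "revised ESD from CERN": 20, "revised ESD+AOD from Tier 2": 10,
    "simulated raw from simulation centers": 100,
}
export_tb_per_yr = {
    "AOD to CERN": 8, "recalculated ESD to CERN": 10, "simulation ESD to CERN": 10,
    "revised ESD+AOD to Tier 2": 15, "ESD/AOD/DPD to local institutes": 20,
}

print(sum(storage_tb.values()), "TB on robotic mass storage (slide quotes ~300 TB)")  # 282 TB
print(sum(import_tb_per_yr.values()), "TB/yr imported")                               # 240 TB/yr
print(sum(export_tb_per_yr.values()), "TB/yr exported")                               # 63 TB/yr
```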
Mass Storage & Disk
Servers
Database Servers
Network
from CERN
Network from
Tier 2 and
simulation
centers
Tier 2
Data
Import
Data
Export
Production
Reconstruction
Production
Analysis
Individual
Analysis
Raw/Sim-->ESD
ESD-->AOD
AOD-->DPD
AOD-->DPD
and plots
Scheduled
Chaotic
Physics groups
Physicists
Tapes
Scheduled
experiment/
physics groups
Farms of low cost commodity
computers,
limited I/O rate, modest local
disk cache
---------------------------------------------------Reconstruction Jobs:
Reprocessing of raw data: 10**8
events/year (10%)
Initial processing of simulated
data: 10**8/year
Event Selection Jobs:
10 physics groups * 10**8 events (10%samples)
* 3 times/yr
based on ESD and latest AOD data
50 SI95/evt ==> 5000 SI95 power
Physics Object creation Jobs:
10 physics groups * 10**7 events (1% samples)
* 8 times/yr
based on selected event sample ESD data
200 SI95/event ==> 5000 SI95 power
Derived Physics data creation Jobs:
10 physics groups * 10**7 events * 20 times/yr
based on selected AOD samples, generates
“canonical” derived physics data
50 SI95/evt ==> 3000 SI95 power
1000 SI95-sec/event ==> 10**4
SI95 capacity:
100
processing
nodes of
100and open issuesL
L. Perini
MONARC:
results
SI95 power
Total 110 nodes of 100 SI95 power
Local
institutes
CERN
Tapes
Desktops
Derived Physics data
creation Jobs:
200 physicists * 10**7
events * 20 times/yr
based on selected AOD
and DPD samples
20 SI95/evt ==> 30,000
SI95 power
Total 300 nodes of 100
LUND 16 Mar 2000
SI95 power
MONARC Analysis Process Example
L. Perini
MONARC: results and open issuesL
LUND
16 Mar 2000
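The SI95 power figures above follow from a simple rate calculation: events per pass, times passes per year, times SI95-sec per event, divided by the usable seconds in a year. A sketch, assuming roughly 3*10**7 usable seconds per year (our assumption; the slide does not state the value it used):

```python
# Rough reproduction of the sizing arithmetic quoted above.
# SECONDS_PER_YEAR is an assumption; the job parameters are taken from the slide.

SECONDS_PER_YEAR = 3.0e7

def si95_power(groups, events_per_pass, passes_per_year, si95_sec_per_event):
    """Sustained CPU power (in SI95 units) needed to keep up with the yearly workload."""
    total_work = groups * events_per_pass * passes_per_year * si95_sec_per_event
    return total_work / SECONDS_PER_YEAR

print("event selection :", si95_power(10, 1e8, 3, 50))    # ~5000 SI95  (slide: 5000)
print("object creation :", si95_power(10, 1e7, 8, 200))   # ~5300 SI95  (slide: 5000)
print("DPD creation    :", si95_power(10, 1e7, 20, 50))   # ~3300 SI95  (slide: 3000)
print("individual DPD  :", si95_power(200, 1e7, 20, 20))  # ~27000 SI95 (slide: 30,000)
```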
Model and Simulation parameters
We have a new set of parameters common to all simulating groups.
More realistic values, but still to be discussed/agreed on the basis of the experiments' information.

Proc_time_RAW             1000   SI95sec/event
Proc_Time_ESD               25   "
Proc_Time_AOD                5   "
Analyze_Time_TAG             3   "
Analyze_Time_AOD             3   "
Analyze_Time_ESD            15   "
Analyze_Time_RAW           600   "
Memory of Jobs             100   MB
Proc_Time_Create_RAW      5000   SI95sec/event
Proc_Time_Create_ESD      1000   "
Proc_Time_Create_AOD        25   "

(Previous values shown in parentheses on the slide: 350, 2.5, 0.5, 3, 350, 35, 1, 1.)
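For convenience, the same parameter set written as a small Python mapping (a sketch for back-of-envelope use only, not the input format of the MONARC simulation tool itself):

```python
# Common simulation parameters from the table above, per-event CPU cost in SI95-sec.

SI95_SEC_PER_EVENT = {
    "Proc_time_RAW": 1000,
    "Proc_Time_ESD": 25,
    "Proc_Time_AOD": 5,
    "Analyze_Time_TAG": 3,
    "Analyze_Time_AOD": 3,
    "Analyze_Time_ESD": 15,
    "Analyze_Time_RAW": 600,
    "Proc_Time_Create_RAW": 5000,
    "Proc_Time_Create_ESD": 1000,
    "Proc_Time_Create_AOD": 25,
}
JOB_MEMORY_MB = 100

def cpu_seconds(activity: str, n_events: int, node_si95: float) -> float:
    """CPU-bound time estimate for one activity on a node of the given SI95 rating."""
    return SI95_SEC_PER_EVENT[activity] * n_events / node_si95
```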
Base Model used
Basic jobs:
- Reconstruction of 10**7 events: RAW --> ESD --> AOD --> TAG at CERN. This is the production while the data are coming from the DAQ (100 days of running, collecting a billion events per year).
- Analysis by 5 working groups, each of 25 analyzers, on TAG only (no requests to higher-level data samples). Every analyzer submits 4 sequential jobs on 10**6 events. Each analyzer's work start time is a flat random choice in a range of 3000 seconds. Each analyzer's data sample of 10**6 events is a random choice from the complete TAG database sample of 10**7 events.
- Transfer (FTP) of the 10**7-event ESD, AOD and TAG from CERN to the RC.
CERN activities: reconstruction, 5 WG analysis, FTP transfer.
RC activities: 5 (uncorrelated) WG analysis, receiving the FTP transfer.

Job "paper estimates":
- Single analysis job: 1.67 CPU hours = 6000 sec at CERN (same at the RC)
- Reconstruction at CERN for 1/500 of RAW to ESD: 3.89 CPU hours = 14000 sec
- Reconstruction at CERN for 1/500 of ESD to AOD: 0.03 CPU hours = 100 sec
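These paper estimates are consistent with 500 SI95 processing nodes (see the next slide) and the parenthesised per-event times from the parameter table (350 SI95sec for RAW-->ESD, 2.5 for ESD-->AOD, 3 for TAG analysis). A sketch reproducing them:

```python
# Reproduction of the "paper estimates" above. NODE_SI95 = 500 matches the node
# rating on the following slide; per-event times are the parenthesised table values.

NODE_SI95 = 500           # SI95 rating of one processing node
EVENTS = 10**7            # events in the basic reconstruction job
CHUNK = EVENTS // 500     # the estimates are quoted for 1/500 of the sample

raw_to_esd   = CHUNK * 350 / NODE_SI95    # 14000 s (3.89 CPU hours)
esd_to_aod   = CHUNK * 2.5 / NODE_SI95    # 100 s   (0.03 CPU hours)
tag_analysis = 10**6 * 3 / NODE_SI95      # 6000 s  (1.67 CPU hours)

print(raw_to_esd, esd_to_aod, tag_analysis)
```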
Resources: LAN speeds?!
In our models the DB servers are uncorrelated, and thus one activity uses a single server.
The bottlenecks are the "read" and "write" speeds to and from the server. In order to use the CPU power at a reasonable percentage we need a read speed of at least 300 MB/s and a write speed of 100 MB/s (a milestone already met today). See the following slides.
We use 100 MB/s in the current simulations (10 Gbit/s switched LANs in 2005 may be possible).
Processing node link speed is negligible in our simulations.
Of course the "real" implementation of the farms can be different, but the results of the simulation do not depend on the "real" implementation: they are based on usable resources.
More realistic values for CERN and RC
Data link speeds at 100 MB/sec (all values), except:
- Node_Link_Speed at 10 MB/sec
- WAN link speeds at 40 MB/sec
CERN
- 1000 processing nodes, each of 500 SI95 (1000 x 500 SI95 = 500 kSI95, about the CPU power of the CERN Tier0)
- disk space as for the number of DBs
RC
- 200 processing nodes, each of 500 SI95 (100 kSI95 processing power = 20% of CERN)
- disk space as for the number of DBs
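The same CERN/RC configuration written out as data, with the CPU-power ratio computed (a convenience sketch; the field names are ours, the values are from the slide):

```python
# Site configurations used in the "more realistic" simulation runs described above.

sites = {
    "CERN": {"nodes": 1000, "node_si95": 500, "node_link_MBps": 10,
             "lan_link_MBps": 100, "wan_link_MBps": 40},
    "RC":   {"nodes": 200,  "node_si95": 500, "node_link_MBps": 10,
             "lan_link_MBps": 100, "wan_link_MBps": 40},
}

for name, s in sites.items():
    print(name, s["nodes"] * s["node_si95"] / 1000, "kSI95")
# CERN 500.0 kSI95 (about the CPU power of the CERN Tier0)
# RC   100.0 kSI95 (20% of CERN)
```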
Overall Conclusions
The MONARC simulation tools are:
- sophisticated enough to allow modeling of complex distributed analysis scenarios
- simple enough to be used by non-experts
Initial modeling runs are already showing interesting results.
Future work will help identify bottlenecks and understand constraints on architectures.
MONARC Phase 3
More realistic Computing Model development:
- Confrontation of Models with realistic prototypes
- At every stage: assess use cases based on actual simulation, reconstruction and physics analyses
- Participate in the setup of the prototypes
- Further validate and develop the MONARC simulation system using the results of these use cases (positive feedback)
- Continue to review key inputs to the Model:
  - CPU times at various phases
  - data rate to storage
  - tape storage: speed and I/O
- Employ MONARC simulation and testbeds to study Computing Model variations, and suggest strategy improvements
MONARC Phase 3
Technology studies:
- Data Model
  - data structures
  - reclustering, restructuring; transport operations
  - replication
  - caching, migration (HMSM), etc.
- Network
  - QoS mechanisms: identify which are important
- Distributed system resource management and query estimators
  - (queue management and load balancing)
Development of MONARC simulation visualization tools for interactive Computing Model analysis.
Relation to GRID
The GRID project is great!
- Development of the s/w tools needed for implementing realistic LHC Computing Models: farm management, WAN resource and data management, etc.
- Help in getting funds for real-life testbed systems (RC prototypes)
Complementarity between GRID and the MONARC hierarchical RC Model:
- A hierarchy of RCs is a safe option. If GRID brings big advancements, less hierarchical models should also become possible.
Timings are well matched:
- MONARC Phase-3 is to last ~1 year: a bridge to the GRID project starting early in 2001.
- Afterwards, common work by the LHC experiments on developing the computing models will surely still be needed; in which project framework, and for how long, we will see then...