A Summary of CHEP 2007
Victoria, BC, Canada, 2-7 Sept. 2007
Dmitry Emeliyanov, RAL PPD
CHEP’07: The conference
• Expected audience:
– attract 500 people
– 90% from outside of Canada
– 25% from US
• Total: 474
CHEP’07: Some statistics
• 429 abstracts submitted with 1208 authors
• 29 plenary talks and 7 parallel tracks:
[Chart: CHEP presentations and posters per track: Online computing; Software components, tools and databases; Computer facilities, production grids and networking; Collaborative tools; Distributed data analysis and information management; Event processing; Grid middleware and tools]
Selected topics
• Status of the LHC and experiments
• Multi-core CPUs and HEP software: news from Intel and a view from CERN
• Online computing: Trigger and DAQ activities in LHC experiments and beyond
• All presentations are available in Indico:
http://indico.cern.ch/conferenceTimeTable.py?confId=3580
• Papers will be published in Journal of Physics: Conference Series
General LHC schedule
T. Virdee (CERN/Imperial)
• Engineering run originally foreseen at end 2007 now precluded by
delays in installation and equipment commissioning
• 450 GeV operation now part of normal setting up procedure for beam
commissioning to high-energy
• General schedule has been revised, accounting for inner triplet repairs
and their impact on sector commissioning
• All technical systems commissioned to 7 TeV operation, and machine
closed April 2008
• Beam commissioning starts May 2008
• First collisions at 14 TeV c.m. July 2008
• Luminosity evolution will be dominated by our confidence in the
machine protection system and by the ability of the detectors to
absorb the rates.
• No provision in success-oriented schedule for major mishaps, e.g.
additional warm-up/cooldown of sector
LHC experiments status
T. Virdee (CERN/Imperial)
• Construction essentially completed
• Installation is very advanced - beam pipes closed end
March 2008
• Test beam and commissioning work already carried out
gives confidence that detectors will behave as expected
• Commissioning using cosmics with more and more
complete setups (complexity and functionality)
– using final readout, trigger and DAQ, software and computing
systems
• Computing, Software & Analysis 24/7 Challenges, Dress
Rehearsals @50% of 2008 expectation by end of 2007.
• Preparations for the rapid extraction of physics being made
• By spring 2008 experiments will be in 2008 configurations,
fields ON, taking cosmics
Addressing Future HPC Demand
with Multi-core Processors
September 5, 2007
Stephen S. Pawlowski
Intel Senior Fellow
GM, Architecture and Planning
CTO, Digital Enterprise Group
Accelerating Multi- and Manycore
• Power delivery and management
• High bandwidth memory
• Reconfigurable cache
• Scalable fabric
• Fixed-function units
[Diagram: a manycore die mixing a few big cores with many small cores and fixed-function units]
Performance Through Parallelism
Addressing Memory Bandwidth
• 3D Memory Stacking
• Memory on Package
[Diagram: DRAM stacked on the CPU die, and fast DRAM placed on the package next to the last-level cache, under a common heat-sink]
*Future Vision, does not represent real Intel product
Bringing Memory Closer to the Cores
How good is the match
between LHC software and
current/future processors?
Sverre Jarp
CERN openlab CTO
CHEP 2007
5 September 2007
Implications of Moore’s law
• Initially the processor was simple
– Modest frequency; Single instruction issue; In order;
Tiny caches; No hardware multithreading or multicore; No major problems with cooling
• Since then:
– Frequency scaling (from 150 MHz to 3 GHz)
– Multiple execution ports, wide execution (SSE)
– Out-of-order execution, larger caches
– Multithreading, multi-core
– Heat
HEP Software Profile
• Our memory usage:
– Today, we need 2 – 4 GB per single-threaded process.
– In other words, a dual-socket server needs at least:
• Single core: 4 - 8 GB, Quad core: 16 - 32 GB
• Future 16-way CPU: 64 – 128 GB, 64-way CPU: 256 – 512 GB
• “We have floating point work wrapped in
‘if/else’ logic”
– Overall estimate: 50% is floating point
• Our LHC programs typically issue (on average)
only 1 instruction per cycle – This is very low!
• The Core 2 architecture can handle 4 instructions per cycle
• Each SSE instruction can operate on 128 bits (2 doubles); see the sketch below
• “our LHC programs typically utilize only 1
instruction per CPU clock cycle (= 1/8 of
maximum)”
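The factor of 8 quoted above comes from combining the two numbers on this slide: up to 4 instructions issued per cycle, and 2 doubles per SSE instruction. As a purely illustrative aside (not taken from the talk), the following minimal C++ sketch shows what "2 doubles per instruction" looks like with SSE2 intrinsics; the helper function and its name are hypothetical:

#include <emmintrin.h>   // SSE2 intrinsics
#include <cstddef>

// Hypothetical helper: element-wise sum of two double arrays,
// processing two doubles per SSE instruction instead of one.
void add_arrays_sse(const double* a, const double* b, double* out, std::size_t n)
{
    std::size_t i = 0;
    for (; i + 2 <= n; i += 2) {
        __m128d va = _mm_loadu_pd(a + i);            // load 2 doubles (128 bits)
        __m128d vb = _mm_loadu_pd(b + i);
        _mm_storeu_pd(out + i, _mm_add_pd(va, vb));  // one add produces 2 results
    }
    for (; i < n; ++i)                               // scalar remainder
        out[i] = a[i] + b[i];
}

Compilers can often generate such packed instructions automatically, but only when the surrounding code exposes enough parallelism, which is exactly what the branchy "floating point wrapped in if/else logic" described above makes difficult.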
Recommendations
• Industry will bombard us with new designs
based on multi-billion transistor budgets
– Hundreds of cores
– Multiple threads per core
– Unbelievable floating-point performance
• Clearly, the emphasis now is to get LHC started
and there is plenty of compute power
across the Grid.
• If we want to extract (much) more compute-power out of new chip generations:
– Try to increase the Instruction Level Parallelism
– Investigate “intelligent” multithreading
[Diagram: four cores, each with its own event-specific data, all sharing global data (physics processes, magnetic field) through reentrant code; a minimal sketch of this pattern follows below]
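To make the core/data diagram above concrete, here is a minimal, hedged C++ sketch (all type and function names are hypothetical, not from any experiment's framework) of the pattern it shows: each worker thread owns its event-specific data, while read-only global data such as a magnetic field map exists once and is accessed through reentrant code.

#include <functional>
#include <thread>
#include <vector>

struct FieldMap {            // hypothetical global data: created once, read-only afterwards
    double bz;
    double fieldAt(double, double, double) const { return bz; }
};

struct Event {               // event-specific data: one instance per worker
    int    id;
    double result;
};

// Reentrant: uses only its arguments and local variables, no mutable globals.
void processEvent(Event& ev, const FieldMap& field)
{
    ev.result = field.fieldAt(0.0, 0.0, 0.0) * ev.id;
}

int main()
{
    const FieldMap field{3.8};                                   // shared by all cores
    std::vector<Event> events = {{1, 0.0}, {2, 0.0}, {3, 0.0}, {4, 0.0}};
    std::vector<std::thread> workers;
    for (auto& ev : events)                                      // one thread per event
        workers.emplace_back(processEvent, std::ref(ev), std::cref(field));
    for (auto& t : workers) t.join();
}

The point of the diagram is that only the event-specific data grows with the number of cores; the large global structures (physics processes, field map) are shared rather than duplicated in one full 2-4 GB process image per core.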
Online Computing:
CPU farms for high-level triggering;
Farm configuration and run control;
Describing and managing configuration data and conditions
databases;
Online software frameworks and tools; online calibration procedures
• 48 abstracts total: 27 oral presentations / 21 posters
• By experiments:
– 38 LHC / 10 non-LHC experiment or generic
– ALICE: 4
– ATLAS: 15
– CMS: 14
– LHCb: 5
Data Acquisition at the LHC experiments
Plenary talk by Sylvain Chapeland (CERN)
LHC Experiments: Trigger and DAQ
Status
• “Alea iacta est”
– All fundamental choices are made
– All use commercial components wherever possible
– All based on powerful LAN technology and PC server
farms
– Installation is progressing rapidly
• Status reports:
– “Integration of the Trigger and Data Acquisition
Systems in ATLAS”
– “Commissioning of the ALICE Data Acquisition System”
• Commissioning and cosmics running
– Commissioning of larger and larger slices has started in all 4 experiments
– Large-scale and cosmic (ATLAS) tests already look very promising
– Extremely valuable feedback
Combined Cosmic run in June 2007
In June we had a 14-day combined cosmic run with no magnetic field. It included the following systems:
– Muons: RPC (~1/32), MDT (~1/16), TGC (~1/36)
– Calorimeters: EM (LAr) (~50%) & Hadronic (Tile) (~75%)
– Tracking: Transition Radiation Tracker (TRT) (~6/32 of the barrel of the final system)
The only systems missing are the silicon strips and pixels and the muon system CSCs.
From “The ATLAS Trigger Commissioning with Cosmic rays”
Trigger steering
• Sophisticated frameworks for high level
trigger steering have been developed
– Lightweight (caching of calculations (ATLAS)); see the sketch below
– Work both offline and online
– Use a database for configurations (CMS)
– Ready to be given to non-expert physicists!
– “The ATLAS High Level Trigger Steering”
– “High Level Trigger Configuration and Handling of
Trigger Tables in the CMS Filter Farm”
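As a rough illustration of the "caching of calculations" idea mentioned in the list above (a generic sketch, not the ATLAS steering implementation; all names are hypothetical): the first trigger chain that needs a quantity computes it, and later chains in the same event simply reuse the cached value.

#include <map>
#include <string>

// Per-event cache: the first chain to request a feature computes it,
// later chains get the stored value back for free.
struct FeatureCache {
    std::map<std::string, double> values;            // feature name -> cached result

    template <typename Compute>
    double get(const std::string& name, Compute compute)
    {
        auto it = values.find(name);
        if (it != values.end()) return it->second;   // cache hit: reuse
        double v = compute();                        // cache miss: expensive calculation
        values.emplace(name, v);
        return v;
    }

    void clear() { values.clear(); }                 // reset at the start of each event
};

// Usage (hypothetical computeMissingEt): two chains asking for the same
// quantity trigger only one calculation per event.
//   FeatureCache cache;
//   double met1 = cache.get("missingEt", [] { return computeMissingEt(); });
//   double met2 = cache.get("missingEt", [] { return computeMissingEt(); });  // cached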
Data Quality Monitoring
• Essential for commissioning and running
• Works also with “offline” data
• Standalone viewers vs plug-ins (e.g. web CMS)
• Databases are used to store histograms or to describe them (LHCb)
• Reports from all four experiments:
– “The ALICE-LHC Online Data Quality Monitoring
Framework”
– “A software framework for Data Quality Monitoring in
ATLAS”
– “CMS Online Web Based Monitoring”
– “Online Data Monitoring in the LHCb experiment”
Slow and Run Controls
• Slow and run control face huge numbers of elements, ~O(10^7)
• Final run control is beginning to be used on a wide scale; scalability
has been tested. Configuration is stored in an RDBMS (ALICE, CMS,
LHCb) or as objects (ATLAS)
• All run controls support partitioning and use finite state machines
(see the FSM sketch below)
– “The ATLAS DAQ System Online Configurations
Database Service Challenge”
– “The Run Control and Monitoring System of the CMS
Experiment”
• Detector Control is maybe “slow” but certainly big: “The CMS Tracker
Control System”, O(50000) HV channels + O(100000) environment
sensors controlled by 5 PCs
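To illustrate the finite-state-machine approach mentioned above, here is a generic, hypothetical sketch (not any experiment's actual run control): a tiny FSM with typical run-control states and the commands accepted in each state; commands that are not valid in the current state are rejected.

#include <iostream>
#include <map>
#include <string>
#include <utility>

enum class State { Idle, Configured, Running };

class RunControlFSM {
    State state_ = State::Idle;
    // (current state, command) -> next state; anything else is rejected
    std::map<std::pair<State, std::string>, State> transitions_ = {
        {{State::Idle,       "configure"}, State::Configured},
        {{State::Configured, "start"},     State::Running},
        {{State::Running,    "stop"},      State::Configured},
        {{State::Configured, "reset"},     State::Idle},
    };

public:
    bool handle(const std::string& cmd)
    {
        auto it = transitions_.find({state_, cmd});
        if (it == transitions_.end()) {
            std::cout << "command '" << cmd << "' rejected in current state\n";
            return false;                            // illegal transition
        }
        state_ = it->second;
        return true;
    }
};

int main()
{
    RunControlFSM rc;
    rc.handle("start");       // rejected: the system must be configured first
    rc.handle("configure");   // Idle -> Configured
    rc.handle("start");       // Configured -> Running
    rc.handle("stop");        // Running -> Configured
}

Partitioning, also mentioned above, would correspond to running one such state machine per sub-detector partition under a common controller.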
TDAQ Activities Outside the LHC
• Reports from mature systems
– “The DZERO Run 2 L3/DAQ System Performance”
– “The PHENIX Experiment in the RHIC Run 7”
– “The BaBar Online Detector Control System
Upgrade”
• And new frameworks
– “Multi-Agent Framework for Experiment Control
Systems (AFECS)”
• Successful upgrades (to overcome legacy
hardware), hardware extensions, high
availability, running with very small crews
The D0 Run II L3/DAQ System Performance
• Mainly run by 3 (part-time) people
• Heterogeneous trigger farm scaled up from 90 to ~330 nodes
• Has lived reliably through numerous detector and hardware upgrades
To summarize ...
• The LHC experiments are looking forward to
seeing the first data
– All core DAQ components have been tested
– Good fraction of equipment is installed (except for the
filter farms and part of the DAQ network)
– Integration and Commissioning are well underway
– A lot of activity in trigger control and steering
• Handing over to the physicists
– Monitoring frameworks evolving quickly
CHEP 2009
• Will be held in Prague, Czech Republic on 21-27 March
2009