DESY - University of Oregon



Chris Bee

ATLAS High Level Trigger

Outline:
• Introduction
• System Scalability
• Trigger Core Software Development
• Trigger Selection Algorithms
• Commissioning & Preparation for Cosmics & First Beam

Introduction

[Figure: event rate (Hz) vs. decision time for the staged trigger/DAQ system. Process rates span QED at ~10^8 Hz, W/Z at ~10^4 Hz, top at ~10^2 Hz, Z* at ~10^0 Hz and Higgs at ~10^-2 Hz; decision times run from 25 ns (10^-9) through µs, ms and seconds up to hours and years.]

• ON-line Level-1 Trigger, 40 MHz: hardware (ASIC, FPGA), massively parallel architecture, pipelines; latency ~2 µs
• Level-2 Trigger, 100 kHz: s/w PC farm, local reconstruction; ~10 ms
• Level-3 Trigger, 1 kHz: s/w PC farm, full reconstruction; ~1 sec
• OFF-line reconstruction & analyses at the TIER0/1/2 centres; mass storage

Introduction

The ATLAS trigger comprises 3 levels:

• LVL1
  • Custom electronics & ASICs, FPGAs
  • Max. latency 2.5 µs
  • Uses Calorimeter and Muon detector data
  • Reduces the 40 MHz interaction rate to 75 kHz
• LVL2
  • Software trigger based on a Linux PC farm (~500 dual CPUs)
  • Mean processing time ~10 ms
  • Uses selected data from all detectors (Regions of Interest indicated by LVL1)
  • Reduces the LVL1 rate to ~1 kHz
• Event Filter
  • Software trigger based on a Linux PC farm (~1600 dual CPUs)
  • Mean processing time ~1 s
  • Full event & calibration data available
  • Reduces the LVL2 rate to ~100 Hz
• Note: a large fraction of the HLT processor cost is deferred → initial running with reduced computing capacity
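To make the staged rate reduction concrete, here is a minimal back-of-the-envelope sketch in Python, using the rates quoted on these slides; the farm-sizing estimate at the end is an illustrative approximation, not an ATLAS figure:

```python
# Back-of-the-envelope rejection factors for the three trigger levels,
# using the rates quoted on this slide and the DAQ-architecture slide.
lvl1_in, lvl1_out = 40e6, 75e3      # 40 MHz -> 75 kHz
lvl2_out, ef_out = 2e3, 200.0       # ~2 kHz -> ~200 Hz

print(f"LVL1 rejection: {lvl1_in / lvl1_out:>8.0f} : 1")   # ~533 : 1
print(f"LVL2 rejection: {lvl1_out / lvl2_out:>8.0f} : 1")  # ~38 : 1
print(f"EF   rejection: {lvl2_out / ef_out:>8.0f} : 1")    # ~10 : 1

# With a mean processing time t per event at input rate r, the farm needs
# roughly r * t concurrent CPUs (ignoring all overheads):
print(f"LVL2 CPUs needed ~ {75e3 * 10e-3:.0f}")   # ~750 events in flight
print(f"EF   CPUs needed ~ {2e3 * 1.0:.0f}")      # ~2000 events in flight
```

These crude numbers are consistent in order of magnitude with the ~500 dual-CPU LVL2 farm and ~1600 dual-CPU EF farm quoted above.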

ATLAS Trigger & DAQ Architecture

[Diagram: ATLAS Trigger & DAQ architecture. Detector front-end (FE) pipelines feed the Read-Out Drivers (RODs) at 40 MHz (~1 PB/s); LVL1 accept = 75 kHz, with RoI data = 1-2% (~120 GB/s into the Read-Out Links and Read-Out Buffers). The RoI Builder (RoIB), L2 Supervisor (L2SV), L2 network (L2N) and L2 Processing Units (L2PU) issue RoI requests to the Read-Out Buffers (ROBs) in the Read-Out Sub-systems (ROS); LVL2 accept = ~2 kHz at ~10 ms per event. The Dataflow Manager (DFM), Sub-Farm Inputs (SFI) and Event Building network (EBN) assemble full events at ~2+4 GB/s; the Event Filter processors (EFP) take ~1 s per event, with EF accept = ~0.2 kHz (~200 Hz) to the Sub-Farm Outputs (SFO) at ~300 MB/s to mass storage.]

ATLAS Three Level Trigger Architecture

• LVL1: decision made with coarse-granularity data from the calorimeters and the muon trigger chambers; latency 2.5 µs; buffering on detector
• LVL2: uses Region of Interest data (ca. 2%) with full granularity and combines information from all detectors; performs fast rejection in ~10 ms; buffering in the ROBs
• Event Filter: refines the selection and can perform event reconstruction at full granularity using the latest alignment and calibration data, in ~1 s; buffering in EB & EF


LVL1 - Muons & Calorimetry

• Muon trigger: looks for coincidences in the muon trigger chambers (toroid region)
• Calorimetry trigger: looks for e/γ, τ and jets
  • Based on E_T values in 0.2 × 0.2 (EM & HAD) windows
  • Various combinations of cluster sums and isolation criteria

ATLAS LVL1 Trigger

Calorimeter trigger:
• ~7000 calorimeter trigger towers; E_T values in 0.1 × 0.1 (EM & HAD)
• Pre-Processor (analogue E_T)
• Cluster Processor (e/γ, τ/h)
• Jet / Energy-sum Processor
• Output: multiplicities of e/γ, τ/h and jets for 8 p_T thresholds each; flags for ΣE_T, ΣE_T^j and E_T^miss over thresholds; multiplicity of forward jets

Muon trigger:
• O(1M) RPC/TGC channels
• Muon Barrel and Endcap Triggers; Muon-CTP Interface (MUCTPI)
• p_T, η, φ information on up to 2 μ candidates per sector (208 sectors in total)
• Output: multiplicities of μ for 6 p_T thresholds

Central Trigger Processor (CTP) and Timing, Trigger, Control (TTC):
• LVL1 Accept, clock and trigger type distributed to the Front-End systems, RODs, etc.
• RoI pointers passed on to LVL2
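To illustrate the cluster-processor idea in software, here is a hedged sketch of a sliding-window e/γ trigger over a grid of tower E_T values. This is only a model of what the ASICs/FPGAs do in parallel; the window size, thresholds and isolation ring are illustrative assumptions:

```python
# Toy sliding-window e/gamma trigger over a grid of trigger-tower E_T values.
# Real LVL1 hardware evaluates all windows in parallel; this is only a model.

def find_em_candidates(towers, cluster_thr=20.0, iso_thr=5.0):
    """towers: 2D list of E_T (GeV) in 0.1 x 0.1 eta-phi towers.
    Returns (eta_idx, phi_idx) of 2x2 windows above threshold whose
    surrounding ring is quiet (a crude isolation criterion)."""
    n_eta, n_phi = len(towers), len(towers[0])
    candidates = []
    for i in range(1, n_eta - 2):
        for j in range(1, n_phi - 2):
            core = sum(towers[i + di][j + dj] for di in (0, 1) for dj in (0, 1))
            if core < cluster_thr:
                continue
            # 12-tower ring around the 2x2 core
            ring = sum(towers[i + di][j + dj]
                       for di in range(-1, 3) for dj in range(-1, 3)
                       if not (0 <= di <= 1 and 0 <= dj <= 1))
            if ring < iso_thr:
                candidates.append((i, j))
    return candidates

grid = [[0.2] * 16 for _ in range(16)]
grid[7][7] = grid[7][8] = 15.0   # a localized EM deposit
# -> [(6, 7), (7, 7)]: overlapping windows; the real hardware adds a
# local-maximum requirement to de-overlap candidates.
print(find_em_candidates(grid))
```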

RoI Mechanism

• LVL1 triggers on high-p_T objects: calorimeter cells and muon chambers are used to find e/γ, τ-jet and μ candidates above thresholds
• LVL2 uses the Regions of Interest identified by Level-1: local data reconstruction, analysis, and sub-detector matching of the RoI data
• The total amount of RoI data is minimal, ~2% of the Level-1 throughput, but it has to be accessed at 75 kHz

[Event display: H → 2e + 2μ, with the 2e and 2μ RoIs indicated.]
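A rough feel for why the RoI mechanism pays off, as a sketch; the event size is an illustrative assumption chosen to be consistent with the 120 GB/s and ~2% figures on the slides:

```python
# Why RoI-based readout is cheap in bandwidth but demanding in rate.
lvl1_rate = 75e3          # Hz, LVL1 accept rate
event_size = 1.6e6        # bytes, illustrative full event size (~1-2 MB)
roi_fraction = 0.02       # ~2% of the event data, per the slide

full_bw = lvl1_rate * event_size     # if LVL2 read out full events
roi_bw = full_bw * roi_fraction      # reading only the RoI data
print(f"full readout : {full_bw / 1e9:6.1f} GB/s")   # ~120 GB/s
print(f"RoI readout  : {roi_bw / 1e9:6.1f} GB/s")    # ~2.4 GB/s
# The price: RoI requests still arrive at the full 75 kHz LVL1 rate,
# so the ROS must sustain ~75k random-access lookups per second.
```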

Physics Selection Strategy

ATLAS has an inclusive trigger strategy:

• LVL1 triggers on individual signatures
  • EM cluster
  • Muon track
  • Jets
  • Total energy
  • Missing energy
• LVL2 confirms & refines the LVL1 signature
  • Requires seeding of LVL2 with the LVL1 result, i.e. the RoI
• The Event Filter confirms & refines the LVL2 signature, with more complete event reconstruction
  • Possibility of seeding the Event Filter with the LVL2 result
  • Tags accepted events according to the physics selection
• Reject events early → save resources
  • Minimize data transfer
  • Minimize required CPU power


System Scalability


ATLAS TDAQ Physical Layout

[Diagram: ATLAS TDAQ physical layout; events are built in the central switches.]

System Scalability

Extended testing programme for system scalability:

• Dedicated testbed for dataflow performance & networking issues (Data Acquisition group)
• Large clusters worldwide for "node" scalability testing (Data Acquisition & Trigger groups; trigger focus on the Event Filter)
  • Machine & run control
  • Start/end-of-run cycling
  • Software distribution
  • Large-scale configuration
• Recent work: use of the LXSHARE cluster at CERN (~500 nodes) and the WESTGRID cluster in Canada (~840 nodes)
• Plans: use of 50-700+ nodes on LXSHARE this summer

http://atlas-tdaq-large-scale-tests.web.cern.ch

Summary of Recent Tests

Conclusions:
• The primary goal was system porting and debugging
• An important bug in the CORBA library was found and fixed
• Many other benefits were obtained:
  • Experience in porting a large-scale DAQ system
  • Many specific indications of weak points and possible improvements
  • A general impression of run-control transition times

LST @ CERN, June 6 - July 19:
• Many things being tested / investigated / measured
• We are ready, following the experience from WestGrid

System Scalability

Many hardware issues need attention: how to organize O(2000) PCs

• Racks, space, weight, heat & cooling, cabling
• Data I/O & networking
• Operation: booting, s/w installation, operational monitoring
• Dependency on ever-evolving PC & CPU architectures and compilers; applicability of Moore's Law
• Remote farms

Possible involvement:
• Longer-term possibilities of LSTs at SLAC?
• Software development & testing work in the Event Filter, to include requirements from overall ATLAS monitoring and calibration
• Work on the specification, development, installation, maintenance & running of the EF


Trigger Core Software Development


Trigger Core Software Development

• Provides a coherent software framework for LVL2 and the EF
  • Coherent data access methods
  • Re-use of some offline components where appropriate
• Development platform ~common across trigger & offline
  • Facilitates online/offline comparisons & ease of development
• Detailed collaboration with the core offline development group as well as with detector software development
  • Benefit from the detailed expertise in each detector group
  • E.g. in last year's test beam, detector monitoring software developed for use offline was also used online in the EF
• Considerable exchange of ideas & development
  • Performance & efficiency improvements done for the trigger now benefit the offline software
  • Some new offline functionality benefits the trigger
• More specific dedicated development for LVL2

HLT Data Flow Software

[Package diagram: the HLT Event Selection Software (HLTSSW) is built on the ATHENA/GAUDI framework, reuses some offline components, and is common to Level-2 and the EF. Core packages: Steering, Monitoring Service, HLT Algorithms, Data Manager, ROB Data Collector, Event Data Model, MetaData Service. It depends on the offline core software (Athena/Gaudi, StoreGate), the offline Event Data Model and the offline reconstruction algorithms; ~offline algorithms are used in the EF. Both the L2PU application and the Event Filter Processing Task host the same selection software.]
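A hedged sketch of the design idea behind that diagram (class and function names here are illustrative, not the real ATLAS/Gaudi classes): one algorithm interface, hosted either by the L2PU or by an EF processing task, so selection code is written once and runs at both levels:

```python
# Sketch: one selection-algorithm interface, two hosting environments.
# Names are illustrative, not the real ATLAS/Gaudi API.
from abc import ABC, abstractmethod

class HLTAlgorithm(ABC):
    """Written once; runs unchanged at Level-2 and in the Event Filter."""
    @abstractmethod
    def execute(self, event) -> bool: ...   # True = accept

class EtCutAlgorithm(HLTAlgorithm):
    def __init__(self, threshold):
        self.threshold = threshold
    def execute(self, event):
        return event["cluster_et"] > self.threshold

def run_in_l2pu(algorithms, event):
    """Level-2 host: strict time budget, RoI data only."""
    return all(alg.execute(event) for alg in algorithms)

def run_in_ef_task(algorithms, event):
    """Event Filter host: full event, looser time budget."""
    return all(alg.execute(event) for alg in algorithms)

chain = [EtCutAlgorithm(threshold=50.0)]
print(run_in_l2pu(chain, {"cluster_et": 62.0}))    # True
print(run_in_ef_task(chain, {"cluster_et": 31.0})) # False
```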

 

LVL2 Development Environment

• HLT software development and testing are done in the offline environment; the final "certification" procedure runs in the Data Flow test-beds
• Development and Data Flow setups for Level-2, with support for multiple threads:
  • Online Data Flow: the L2PU hosts the Steering Controller and the algorithms, linked against the algorithm libraries
  • Offline ATHENA environment: athenaMT hosts the same Steering Controller and algorithms
• Offline support for Level-2 developers: AthenaMT, a multithreaded offline application
  • Emulates the complete L2PU environment
  • No need to set up complex Data Flow systems
  • As simple to run as a normal offline application: athenaMT
• Coding guidelines for Lvl2 developers
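The point of athenaMT is that LVL2 algorithms must be thread-safe, since one L2PU processes several events concurrently. A minimal sketch of that hosting pattern, assuming a toy steering function; this is pure illustration, not the athenaMT code:

```python
# Sketch of the L2PU hosting pattern that athenaMT emulates offline:
# several worker threads pull events from a queue and run the same
# (thread-safe) steering + algorithm chain concurrently.
import queue
import threading

def steering(event):
    # Placeholder for the Steering Controller + algorithm chain.
    return event % 3 == 0          # toy accept decision

def worker(events, accepted, lock):
    while True:
        ev = events.get()
        if ev is None:             # sentinel: shut this worker down
            break
        if steering(ev):
            with lock:
                accepted.append(ev)
        events.task_done()

events = queue.Queue()
accepted, lock = [], threading.Lock()
threads = [threading.Thread(target=worker, args=(events, accepted, lock))
           for _ in range(4)]      # 4 concurrent "events in flight"
for t in threads:
    t.start()
for ev in range(20):
    events.put(ev)
for _ in threads:
    events.put(None)
for t in threads:
    t.join()
print(sorted(accepted))            # [0, 3, 6, 9, 12, 15, 18]
```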

Trigger Core Software Development

Possible involvement:
• Work & responsibility in specific s/w packages in the core s/w
• Trigger configuration and algorithm control system
• Trigger monitoring framework and strategy
• Offline/online software integration


Trigger Selection Algorithms


Trigger Selection Algorithms

• On-line event selection in the HLT is based on algorithmic software tools running in the LVL2 and EF farms, sequenced by the HLT steering
  • LVL2: specialized algorithms; EF: algorithms adapted from offline
  • Important deployment in HLT test-beds to assess compliance with a realistic on-line environment
• Building on expertise and development inside the detector communities
  • Calorimeters, Inner Detector, Muon Spectrometer
• Studies of efficiency, rates, rejection factors and physics coverage are organized around five main lines ("vertical slices"), coherently mapped to the Physics Combined Performance groups (see physics session):
  • Electrons and photons: fundamental signatures for both precision measurements and discovery signals
  • Muons: low- and high-p_T objects, strategic also for the B-physics programme
  • Jets / Taus / ETmiss: model testing, new physics
  • b-tagging: optimize physics coverage, add flexibility and redundancy to the HLT selection starting from LVL2
  • B-physics: rich programme of work, with new strategies dependent on luminosity

Most recent talks on performance studies:
http://agenda.cern.ch/fullAgenda.php?ida=a052747

Trigger Menus and Strategy

• Extracting tiny signals out of huge backgrounds requires the HLT selection strategy to be robust, redundant and flexible
• Selections are mostly inclusive, with as-low-as-possible p_T thresholds for the fundamental objects
• The use of software tools at both HLT levels allows detailed studies of the boundary between LVL2 and the EF
  • Different paths lead to approximately the same efficiency (electrons in the figure)
  • An example of flexibility and of different selection sequences
  • The choice will depend on background conditions, detector knowledge, luminosity, …
• The building of complete trigger menus evolves from and complements the work done in the slices (see the prescale sketch below)
  • Moving from single objects to complex topological signatures
  • Include issues of pre-scaled triggers, monitor triggers, etc.
  • Optimize to environmental conditions
• Commissioning the HLT selection will be an important step towards physics data taking
  • Needs to be ready for the cosmic-ray period
  • Implies modifications to algorithms, new sequences
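A sketch of what "pre-scaled and monitor triggers in a menu" means operationally; the items, thresholds and prescale values below are invented for illustration:

```python
# Toy trigger menu: each item has a threshold and a prescale N, meaning
# only 1 out of N otherwise-accepted events is kept for that item.
# All items and numbers here are invented for illustration.
menu = {
    "e25i": {"threshold": 25.0, "prescale": 1},     # primary physics trigger
    "e10":  {"threshold": 10.0, "prescale": 1000},  # pre-scaled monitor trigger
}
counters = {name: 0 for name in menu}

def menu_decision(electron_et):
    fired = []
    for name, item in menu.items():
        if electron_et > item["threshold"]:
            counters[name] += 1
            if counters[name] % item["prescale"] == 0:
                fired.append(name)
    return fired          # the event is kept if any item fired

print(menu_decision(30.0))   # ['e25i'] (e10 passed too, but is prescaled away)
```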

Trigger Selection

Possible involvement:
• Work in trigger algorithm development and selection performance evaluation
  • The jet / tau / Etmiss area is in particular need of increased effort
  • Other areas would also benefit from new manpower and from groups willing to take on new responsibility
• Preparation/adaptation of sets of algorithms & selection procedures for use in cosmic running and in the initial beam periods (single beams, very first collisions, etc.)

Commissioning & Preparation for Cosmics & First Beam


Commissioning

• Detailed planning for stepwise commissioning of the trigger system (LVL1 & HLT) is being prepared
• The planning takes account of the detector plans and of the triggering requirements for detector commissioning
• Planning proceeds in various phases with increasing levels of integration; it is broken into 4 broad phases:
  • Subsystem standalone commissioning
  • Integrate the subsystems into the full detector
  • Cosmic rays: record data, analyze/understand them, distribute them to remote sites
  • Single beam, first collisions, increasing rates
• The phases will overlap
• TDAQ "pre-series" system

TDAQ Pre-series System

• A fully functional, small-scale version of the complete HLT/DAQ system
• Equivalent to a detector's 'module 0'
• Installed at Point 1 (USA15 and SDX1)
• Purpose and scope of the pre-series system:
  • Pre-commissioning phase:
    • Validate the complete, integrated HLT/DAQ functionality
    • Validate the infrastructure needed by HLT/DAQ at Point 1
  • Commissioning phase:
    • Validate a component (e.g. a ROS) or a deliverable (e.g. a Level-2 rack) prior to its installation and commissioning
  • TDAQ post-commissioning development system:
    • Validate new components (e.g. their functionality when integrated into a fully functional system)
    • Validate new software elements or software releases before moving them to the experiment

Pre-Series Racks (USA15 / SDX1)

• One ROS rack: TC rack + horizontal cooling; 12 ROS, 48 ROBINs
• RoIB rack: TC rack + horizontal cooling; 50% of the RoIB
• One full L2 rack: TDAQ rack; 30 HLT PCs
• Partial Supervisor rack: TDAQ rack; 3 HE PCs
• One switch rack: TDAQ rack; 128-port GEth for L2 + EB
• Partial EFIO rack: TDAQ rack; 10 HE PCs (6 SFI, 2 SFO, 2 DFM)
• Partial EF rack: TDAQ rack; 12 HLT PCs
• Partial ONLINE rack: TDAQ rack; 4 HLT PCs (monitoring), 2 LE PCs (control), 2 central file servers

The ROS, L2, EFIO and EF racks each have one local file server and one or more local switches.

Commissioning

• Phase 1 commissioning will be completely defined after the experience with the pre-series
• Parallelize the commissioning work as much as possible:
  • Use data taken during detector commissioning to test the data-unpacking tools
  • Develop special algorithms to test component units
  • Extend the offline s/w testing procedures
  • Provide infrastructure to collect systematic information from the trigger selection studies:
    • List of selection variables
    • Graphs of rate and efficiency variation (see the scan sketch below)
• There is a strong coupling with the offline commissioning activities
• Trigger commissioning extends well into data-taking
  • Need good coordination with the physics groups
  • Treat the trigger as a single object to be commissioned (inc. LVL1)
  • Will need a clear strategy for the daily run meetings (data requests)
  • It is clear that the "extra" triggers (monitoring, calibration, etc.) will be much larger than the foreseen 10% during the first months of data-taking
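The "graphs of rate and efficiency variation" can be produced by a simple scan around the chosen working point. A hedged sketch with toy samples and invented numbers:

```python
# Sketch: scan a selection cut and record signal efficiency vs. background
# rate, as needed for the commissioning "variation graphs". Toy data only.
import random

random.seed(42)
signal = [random.gauss(40.0, 8.0) for _ in range(10000)]        # e.g. cluster E_T
background = [random.expovariate(1 / 10.0) for _ in range(10000)]
input_rate_hz = 75e3   # illustrative rate at which the cut is applied

for cut in range(20, 41, 5):
    eff = sum(s > cut for s in signal) / len(signal)
    bkg_pass = sum(b > cut for b in background) / len(background)
    print(f"cut {cut:2d} GeV: efficiency {eff:5.3f}, "
          f"background rate {bkg_pass * input_rate_hz:8.0f} Hz")
```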

Commissioning

Possible involvement:
• We would like to benefit from your experience in commissioning and running the BaBar experiment & elsewhere
• Work in installing, developing and exploiting the pre-series system
• Development of algorithms and procedures that allow the trigger performance to be checked rapidly with real data, and the overall progress of HLT commissioning to be monitored
• Responsibility in the more general trigger commissioning activities and in preparing the ATLAS trigger for cosmic tests and for the first beams in the LHC
• There is a considerable lack of effort in this area, and there is room for major involvement and responsibility

Summary

• Outlined several areas within the ATLAS HLT system where members of the SLAC team could contribute and take responsibility
• The areas range from more technical software design and implementation to much more physics-oriented work
• Many interesting challenges lie ahead in leading ATLAS into data-taking and first physics

TDAQ Workshop in Mainz, Germany 10-14 October 2005

WELCOME !!!


Backup


ATLAS LVL1 Trigger

[Diagram: LVL1 dataflow paths, each operating at the LVL1 Accept rate of 75 (100) kHz.]

μ-RoI Reconstruction at LVL2 Using μFast

[Figure: the muon road through the barrel stations, showing the trigger-chamber coordinates Z_RPC1 and Z_RPC2 and the MDT tube coordinate Z_MDT. The residual used is

    ΔZ = (Z_RPC2 + Z_RPC1)/2 − Z_MDT ]
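In terms of the quantities in the figure, a minimal sketch of the ΔZ residual used to associate MDT hits with the RPC-defined road. The station coordinates and road width below are invented; this is not the μFast code:

```python
# Sketch of the Delta-Z residual from the slide:
#   dZ = (Z_RPC2 + Z_RPC1) / 2 - Z_MDT
# A small |dZ| means the MDT hit is consistent with the muon road
# defined by the two RPC trigger planes. All numbers are invented.

def delta_z(z_rpc1, z_rpc2, z_mdt):
    return (z_rpc2 + z_rpc1) / 2.0 - z_mdt

def in_road(z_rpc1, z_rpc2, z_mdt, road_half_width=0.05):
    return abs(delta_z(z_rpc1, z_rpc2, z_mdt)) < road_half_width

print(delta_z(7.42, 7.58, 7.49))   # ~0.01 (illustrative units)
print(in_road(7.42, 7.58, 7.49))   # True: hit kept for the fit
print(in_road(7.42, 7.58, 8.10))   # False: hit outside the road
```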

muFast Timing Measurements

• Optimized code run on a Pentium III @ 2.3 GHz
• Physics: single muons, p_T = 100 GeV; cavern background: high luminosity × 2
• The μFast latency is the CPU time taken by the algorithm, not counting the data access/conversion time: the presence of cavern background does not increase the μFast processing time
• The total latency shows timings made on the same event sample before and after optimizing the MDT data access
• Optimized version:
  • Total data access time ~800 µs
  • Data access takes about the same CPU time as μFast itself
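Separating data-access time from algorithm CPU time, as in the measurement above, amounts to a two-phase timer. A hedged sketch in which both workloads are stand-ins, not μFast:

```python
# Sketch: measure data access/conversion time separately from the
# algorithm's own processing time, as in the muFast latency breakdown.
# Both workloads below are placeholders, not the real code.
import time

def access_and_convert_data():
    return [x * 0.001 for x in range(200_000)]   # stand-in for MDT data prep

def mufast_like_algorithm(hits):
    return sum(hits) / len(hits)                 # stand-in for the fit

t0 = time.perf_counter()
hits = access_and_convert_data()
t1 = time.perf_counter()
result = mufast_like_algorithm(hits)
t2 = time.perf_counter()

print(f"data access  : {(t1 - t0) * 1e6:8.0f} us")
print(f"algorithm    : {(t2 - t1) * 1e6:8.0f} us")
print(f"total latency: {(t2 - t0) * 1e6:8.0f} us")
```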

Stepwise HLT Selection

• Selection takes place in steps; rejection can happen at every step
• The trigger decision and the data navigation are based on Trigger Elements
• Algorithms use the results from previous steps (seeding), via the data navigation and the Trigger Elements
• The initial seeds for the LVL2 steps are the LVL1 RoIs

[Diagram: example chain for a two-electron trigger. Each LVL1 EM50 RoI seeds a Trigger Element; the elecId step produces e50, the isolation step produces e50i, and the event is accepted on the decision e50i + e50i.]
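A sketch of the stepwise, seeded pattern described above. The Trigger Element names follow the figure; the step logic and cut values are illustrative, not the real steering:

```python
# Sketch of stepwise, seeded HLT selection: each step refines the Trigger
# Elements produced by the previous one, and the chain stops (rejecting
# the event) as soon as a step leaves too few active elements.

def elec_id(te):          # EM50 -> e50: confirm an electron candidate
    return {**te, "label": "e50"} if te["et"] > 50.0 else None

def isolation(te):        # e50 -> e50i: apply an isolation requirement
    return {**te, "label": "e50i"} if te["iso"] < 3.0 else None

def run_chain(lvl1_rois, steps, required=2):
    """Seed from LVL1 RoIs; reject early if fewer than `required` survive."""
    tes = lvl1_rois
    for step in steps:
        tes = [out for te in tes if (out := step(te)) is not None]
        if len(tes) < required:
            return False, tes          # early rejection at this step
    return True, tes                   # e.g. the e50i + e50i decision

rois = [{"label": "EM50", "et": 62.0, "iso": 1.2},
        {"label": "EM50", "et": 55.0, "iso": 0.8}]
print(run_chain(rois, [elec_id, isolation]))   # (True, two 'e50i' elements)
```

Early rejection is what saves resources: an event with only one EM50 RoI never reaches the isolation step at all.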

The Different Commissioning Phases (1)

HLT standalone commissioning:
• Units of racks (a rack is considered to be a unit to be commissioned)
• A rack delivered from installation has:
  • Power, cooling and network checked, within and outside the rack
  • The operating system installed
• Commissioning starts with the installation of the DAQ and offline software:
  • Check the internal dataflow (preloaded data)
  • Monitoring tools
  • Offline software
  • Offline software distribution procedures
  • Automatic testing procedures
  • Testing algorithms

The Different Commissioning Phases (2)

Integrate the subsystems into the full detector:
• These operations have a very strong coupling with the offline commissioning activities
• First start with data unpacking:
  • Use any commissioning data taken by the detectors to debug this part of the system; even if the data is corrupted, it can be very useful for testing the robustness of the code
  • Monitoring infrastructure to check this step
• Current activities (or areas where we need to concentrate effort):
  • Software distribution
  • Extend the pool of data-preparation algorithms; algorithms must be scrutinized and broken up into simpler testing units
  • Testing procedures for both the offline selection software and the interface to the DAQ software are being strengthened and run automatically in the nightlies
    • The goal is to arrive at a set of tests that almost guarantees that further test-bed (or pre-series, etc.) integration will succeed
  • Specify constraints and tests in the offline software before distribution

The Different Commissioning Phases (3)

The remaining phases correspond to commissioning while data is being taken, and assume:
• The complete HLT dataflow is working
• The algorithms start selecting/rejecting events
• The trigger work will focus more on demonstrating that an algorithm gives an Xx.Yy% selection efficiency with some rejection rate
• These activities are very important:
  • They help to develop and tune the algorithms
  • They give us the building blocks to test the complete HLT chain
• However, for commissioning we also need to focus on some other aspects:
  • Have a centralized place where the complete set of parameters the algorithms use is listed (it will be inside the configuration in the future):
    • Size of the data request around the RoI
    • Set of selection cuts
  • For every "selection variable" we need the graph of the variation in selection efficiency and rejection rate around the chosen optimal point (we are sure we will have to tune it with data)
  • Need to prepare a set of algorithms and methods that allow us to check the trigger performance with data (see the sketch below):
    • Particles of known mass, selected by triggering on only one of the decay products
    • How many hours of data-taking do we need to know the selection efficiency to within 5% precision?