The ZEUS Event Store



The ZEUS Event Store (ZES)

How to make the most efficient data selection in ZEUS

Ulrich Fricke, DESY, Hamburg

Zeus Monday Meeting, DESY, 04.11.2002


Outline

Introduction

Technical information

Performance

How to use ZES

ZES for MC

Remarks

Summary


Introduction: ZEUS data taking

• Detector waits for HERA luminosity
• Events are selected according to FLT, SLT & TLT
• Raw data file is written out
• Raw file is processed by ZEPHYR (reconstruction)
• MDST data file is written out
• User code accesses information in MDST file
• Events are selected depending on the analysis


Efficient data selection

Requirements for data selection

– efficiency (as much data as fast as possible)
– transparency (need to know which data is used)
– reliability (should work 24 hours a day, 7 days a week)
– reproducibility (same selection today and later)


ZEUS data selection (1)

• Usually, we run EAZE jobs over the MDST data files and produce HBOOK ntuples or ROOT trees
• Simple selection: process all events
  – loop over all events in MDST files
  – select events in user code (ZUANAL)
• Clever selection: pre-select the data
  – define pre-selection criteria (at ZEUS: 128 DST bits)
  – calculate values of DST bits (metadata) during data processing
  – information is stored in text files (ZEUS Event Directories, ZEDs)
  – metadata is read first and only selected events are processed
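As an illustration of the pre-selection idea, here is a minimal Python sketch. It assumes that the 128 DST bits of an event are packed into one integer and numbered 0 to 127. The record layout, the names EventRecord, dst_mask and preselect, and the bit numbering are assumptions made for illustration; this is not the ZED file format.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class EventRecord:
    """Hypothetical per-event metadata record (not the real ZED layout)."""
    run: int
    event: int
    dst_bits: int   # 128 DST bits packed into one int (assumed encoding)
    offset: int     # assumed position of the event in its MDST file


def dst_mask(*bit_numbers: int) -> int:
    """Build a mask with the given DST bit numbers set (0..127 assumed)."""
    mask = 0
    for n in bit_numbers:
        if not 0 <= n < 128:
            raise ValueError(f"DST bit out of range: {n}")
        mask |= 1 << n
    return mask


def preselect(records: Iterable[EventRecord],
              require_all: int = 0,
              require_any: int = 0) -> Iterator[EventRecord]:
    """Yield only records whose DST bits satisfy both masks.

    The metadata is inspected first; the (expensive) event data would be
    read later, and only for the records yielded here.
    """
    for rec in records:
        if rec.dst_bits & require_all != require_all:
            continue
        if require_any and not rec.dst_bits & require_any:
            continue
        yield rec


records = [
    EventRecord(27305, 1, dst_mask(11), 0),
    EventRecord(27305, 2, dst_mask(11, 25), 512),
    EventRecord(27305, 3, 0, 1024),
]
selected = [r.event for r in preselect(records, require_all=dst_mask(11))]
```

With the three sample records, only the events that have bit 11 set reach the analysis step; the third event is skipped without its data ever being opened.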


ZEUS data selection (2)

• More efficient pre-selection: use a database
  – use database to store metadata information
  – make use of database features:
      • data compression
      • performance tuning
      • indices
      • automatic updating
      • data available to other applications
• This is what we do with ZES


ZEUS Event Store (ZES)

• Database fully integrated into ZEUS offline system
  – Based on Objectivity/DB version 7.0.
  – Object-oriented database.
  – Written in C++ (interfaced to ZEUS FORTRAN software).
  – Input generated during data processing for each event.
  – Contains a lot of information, such as
      • All 128 DST bits
      • All FLT, SLT & TLT trigger information
      • Detector information (CTD, CAL, BPC, FPC, BAC, …)
      • Electron finders (Sinistra, EM)
      • Muon finders
      • ...


ZEUS Event Store (ZES)

• ZES is (like ZEDs) intended primarily for data pre-selection
  – user defines selection criteria
  – ZES database is searched for events that match the criteria
  – only selected events are analyzed by EAZE
• ZES is NOT in competition with ORANGE!
  – Select events with ZES and analyze them with ORANGE.
• Selection is done with an Objectivity/DB predicate string
  – you can select on the values of all variables and bits
    (((Eeu_si>5)and(Zvtx>-50)and(Zvtx<50)and(Eminpz>35)and(Eminpz<65))and((DST13)and(BP112)))
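To make the meaning of such a predicate concrete, the sketch below evaluates one against a single event held in a Python dictionary. This is not how Objectivity/DB processes predicates. The function name matches, the sample values, and the treatment of bare names such as DST13 as true/false flags are assumptions for illustration only.

```python
import ast
import operator

# Comparison operators supported by this sketch.
_COMPARE = {
    ast.Lt: operator.lt, ast.Gt: operator.gt,
    ast.LtE: operator.le, ast.GtE: operator.ge,
    ast.Eq: operator.eq, ast.NotEq: operator.ne,
}


def _eval(node, event):
    """Recursively evaluate a small subset of expression syntax."""
    if isinstance(node, ast.BoolOp):
        values = [_eval(v, event) for v in node.values]
        return all(values) if isinstance(node.op, ast.And) else any(values)
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
        return not _eval(node.operand, event)
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval(node.operand, event)
    if isinstance(node, ast.Compare) and len(node.ops) == 1:
        left = _eval(node.left, event)
        right = _eval(node.comparators[0], event)
        return _COMPARE[type(node.ops[0])](left, right)
    if isinstance(node, ast.Name):
        return event[node.id]          # variable or bit looked up by name
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    raise ValueError(f"unsupported syntax: {ast.dump(node)}")


def matches(predicate: str, event: dict) -> bool:
    """Return True if the event's values satisfy the predicate string."""
    tree = ast.parse(predicate, mode="eval")
    return bool(_eval(tree.body, event))


predicate = ("(((Eeu_si>5)and(Zvtx>-50)and(Zvtx<50)"
             "and(Eminpz>35)and(Eminpz<65))and((DST13)and(BP112)))")
# Hypothetical event values, chosen only to exercise the predicate.
event = {"Eeu_si": 12.3, "Zvtx": -4.0, "Eminpz": 48.0,
         "DST13": True, "BP112": True}
accepted = matches(predicate, event)
```

The sample event passes every cut; changing Zvtx to 60 or clearing DST13 makes the same predicate reject it.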


ZES database schema

• ZES based on Objectivity/DB version 7.0
• Available for Linux, Irix & Solaris
• Database has hierarchical structure
  – 1 Federated Database (the whole ZES)
  – Several Databases (each corresponds to a 200-450 MB file)
  – A lot of Containers (blocks in databases)
  – Even more Objects (information chunks in containers)
• Currently 1228 databases (325 GB)
  – Including different sets of information for some runs (96, 96GR)
  – On RAID5 disks at SGI Origin 2000 fileserver (doener)
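A quick consistency check of these numbers, as a small Python calculation. It reads the total as 325 gigabytes and assumes decimal units (1 GB = 1000 MB); both readings are assumptions.

```python
n_databases = 1228
total_gb = 325                                # total size, read as gigabytes
average_mb = total_gb * 1000 / n_databases    # decimal units assumed

print(f"average database file: {average_mb:.0f} MB")
```

The average comes out at about 265 MB per database file, inside the 200-450 MB range quoted for a single database.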


ZES database design (0)

• ZES has no central server which does the selection
• The only central process (lockserver) manages locks during database updates etc.
• For a selection on events in a certain run, all event information is transmitted over the network to the client
• Significant performance increase if only the information needed for the selection is transmitted
• → Influence on database design


Current Design (1)

ZES Federated Database
  Database 1
    Run 1
    Run 2
    Run 3
  Database 2
    Run 4
    Run 5


Current Design (2)

Run 1 (Events)
  MDST file 1: Event 1, Event 2
  MDST file N: Event 100
  MDST file Z: Event 111, Event 120, Event 999


Design of MDST and Event object

• MDST Object
  – Number of dataflows in MDST file
  – Name and Offset of dataflows
  – Same information as in ZEDs
• Event Object
  – RunNr, EventNr
  – DST bits and trigger information
  – 200-300 items of physics information (integers and floats)
  – Reference to MDST file
  – Event offset in MDST file


Recent addition: MicroEvents

• Most users select on DST bits and trigger information
• Create a new entry in the database for each event, which contains only DST and trigger information:
  – MicroEvent:
      • RunNr, EventNr
      • DST bits
      • Trigger information
      • Reference to MDST file
      • Offset in MDST file
      • Reference to full Event object
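The relation between the full Event object and the MicroEvent can be sketched with two Python dataclasses. The real objects are C++ classes in Objectivity/DB; the field names, types and the helper make_micro_event below are assumptions chosen to mirror the lists on the slides.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Event:
    run_nr: int
    event_nr: int
    dst_bits: int        # 128 DST bits packed into one int (assumed encoding)
    trigger_bits: int    # FLT/SLT/TLT information (assumed encoding)
    mdst_file: str       # stands in for the reference to the MDST file
    mdst_offset: int     # offset of the event in that file
    # Stands in for the 200-300 physics values (integers and floats).
    physics: Dict[str, float] = field(default_factory=dict)


@dataclass
class MicroEvent:
    run_nr: int
    event_nr: int
    dst_bits: int
    trigger_bits: int
    mdst_file: str
    mdst_offset: int
    full_event: Event    # reference back to the full Event object


def make_micro_event(event: Event) -> MicroEvent:
    """Copy only the fields most selections need; keep a link to the rest."""
    return MicroEvent(event.run_nr, event.event_nr, event.dst_bits,
                      event.trigger_bits, event.mdst_file,
                      event.mdst_offset, event)


# Hypothetical values, for illustration only.
full = Event(27305, 42, 1 << 11, 0b101, "mdst_file_1", 4096, {"Eeu_si": 12.5})
micro = make_micro_event(full)
```

A selection on DST bits and trigger information only has to read the small object; the physics values stay reachable through full_event when they are needed.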


Current Design (3)

Run 1 (Events)
  MDST file 1: Event 1, Event 2
  MDST file N: Event 100
  MDST file Z: Event 111, Event 120, Event 999
Run 1 (MicroEvents)
  MicroEvent 1, MicroEvent 2, MicroEvent 100, MicroEvent 111, MicroEvent 120, MicroEvent 999


How do we update the database?

• The ZES information is calculated by code in, and called from, the ZESPHYS library
• During the processing of raw data files the code is called AFTER all reconstruction and calibration have been done
• For each raw data file, one HBOOK ntuple with the ZES information is produced
• Ntuples are used to load the information into a separate federated database which is not accessible to users
• After all checks are OK, the database is moved to the main ZES federated database and is available to all users


Performance (1)

• With ZES one can select on DST bits, trigger information and physics variables
• With ZEDs one can only select on DST bits
• Relative comparison:
  – ZED selection on DST bits: 100%
  – ZES selection on DST bits: < 95% (Events)
  – ZES selection on DST bits: < 12% (MicroEvents)
• One can make a tighter selection than with ZEDs


Performance (2)

• ZES selection time is NOT increasing with the number of variables used in the selection
• One can make much tighter selections with ZES
• Fewer selected events means less work for ORANGE/EAZE
• You save even more CPU time!


How to use ZES (1)

• With a command line tool
  – select events with /zeus/bin/zesprint
  – same selection as in EAZE
  – easy to check selection efficiency
  – can produce an ntuple with ZES information of all selected events
  – zesprint -a 27305 -z 27305 -n 10 -v (Eeu_si>10) -b (DST11) -l EventNr,Eeu_si
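One practical point when scripting such a call: parentheses and ">" are special characters in POSIX-style shells, so selection strings typed at a prompt would normally need quoting. The Python sketch below only builds and quotes the argument list using the options exactly as they appear in the example; it does not run zesprint, and whether the tool's own environment needs this quoting is an assumption.

```python
import shlex


def build_zesprint_command() -> tuple:
    """Return the example call as an argument list and as a quoted string."""
    argv = ["zesprint", "-a", "27305", "-z", "27305", "-n", "10",
            "-v", "(Eeu_si>10)", "-b", "(DST11)", "-l", "EventNr,Eeu_si"]
    # shlex.join quotes every argument that contains shell metacharacters.
    return argv, shlex.join(argv)


argv, command = build_zesprint_command()
print(command)
```

Passing the list form to a process launcher avoids the shell altogether; the quoted string is the form to paste into a shell script.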


How to use ZES (2)

• In non-EAZE executables
  – FORTRAN: call zesvar2(RunNr, EventNr)
  – This will fill the ZES common block with the ZES information of the given run and event
  – one needs to include zescommon.inc in the code
• In EAZE jobs
  – the most important way to use ZES
  – all ZES access is controlled by cards in the normal cardsfile (fort.7)


ZES in an EAZE job

[Diagram of the data flow between Batch Job, Interface, Setup File, ZES database and MDST files. Labelled steps: Read setup, Predicate query, Event information read, Request data, Open event in MDST file.]


How to use ZES (3)

• Driver cards to turn on ZES for the EAZE job
  – ZeusIO-INFI ZeusEventStore
  – ZeusIO-IOPT DRIVER=OBJY
• Run selection (include or exclude run ranges, default=all)
  – ZeusIO-FirstRun 27305
  – ZeusIO-LastRun 27311
  – ZeusIO-IncludeRun 27305-27311,27350-27399
  – ZeusIO-ExcludeRun 27306,27310-27311
• Select a special list of runs (e.g. 96GR instead of 96)
  – ZeusIO-Runlist /zeus/ZES/run-list.96GR.sorted
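The run-range syntax of the include and exclude cards can be illustrated with a short Python sketch. It assumes that ranges are inclusive and that excluded runs are removed after the included ones are collected; the function names parse_runs and selected_runs are hypothetical, and the actual card handling in EAZE may differ.

```python
def parse_runs(spec: str) -> set:
    """Expand a string such as '27306,27310-27311' into a set of run numbers."""
    runs = set()
    for part in spec.split(","):
        first, _, last = part.partition("-")
        # A single run has no '-', so it is its own (inclusive) range.
        runs.update(range(int(first), int(last or first) + 1))
    return runs


def selected_runs(include: str, exclude: str = "") -> list:
    """Included runs minus excluded runs, sorted (assumed semantics)."""
    excluded = parse_runs(exclude) if exclude else set()
    return sorted(parse_runs(include) - excluded)


runs = selected_runs("27305-27311", "27306,27310-27311")
print(runs)
```

For the range 27305-27311 with 27306 and 27310-27311 excluded, four runs remain: 27305, 27307, 27308 and 27309.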


How to use ZES (4)

• DST and Trigger Selection (default none)
  – ZeusIO-Bit ((DST11)or(DST25))and(T070)
• Selection on variables (default none)
  – ZeusIO-Variable ((Eeu_si>10.0)and(Ntrks>4))
• Selection on EventNr (default (EventNr>0))
  – ZeusIO-Event ((EventNr>10)or(EventNr<100))
• Selection of Object to search (default Event)
  – ZeusIO-EventType MicroEvent
• Selection on Predefined Event Sample (default none)
  – ZeusIO-Sample SampleName


How to use ZES (5)

• Variable names are given on the ZES WWW page.
• Blanks are not allowed in the definition of the selection criteria
  – OK: ZeusIo-Event (EventNr>10)and(EventNr<100)
  – Wrong: ZeusIo-Event (EventNr>10) and (EventNr<100)
• You can use several lines in the cardsfile for a selection
  – ZeusIO-Event ((EventNr>10)or
  – ZeusIO-Event (EventNr<100))
• Allowed arithmetic operators: +, -, /, %
• Allowed relational operators: <, >, <=, =>, =, !=
• Allowed logical operators: and, or, not
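The two formatting rules (no blanks, and a long selection spread over several card lines) can be sketched in Python as follows. The helper selection_cards and its max_len parameter are hypothetical, and the sketch assumes that repeated cards with the same name are simply concatenated, as the two-line example suggests.

```python
import re


def selection_cards(card: str, selection: str, max_len: int = 40) -> list:
    """Split a blank-free selection string into one or more card lines."""
    if any(ch.isspace() for ch in selection):
        raise ValueError("blanks are not allowed in a selection string")
    # Break only directly after a logical operator that follows ')'.
    pieces = re.split(r"(?<=\)and)|(?<=\)or)", selection)
    lines, current = [], ""
    for piece in pieces:
        if current and len(current) + len(piece) > max_len:
            lines.append(current)      # start a new card line
            current = piece
        else:
            current += piece
    if current:
        lines.append(current)
    return [f"{card} {line}" for line in lines]


cards = selection_cards("ZeusIO-Event", "((EventNr>10)or(EventNr<100))",
                        max_len=20)
for line in cards:
    print(line)
```

With max_len=20 the selection is split into the same two lines as in the example; a string containing blanks is rejected before anything is written.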


ZES for MC events (1)

• ZES was designed primarily for data
• Users requested it also for MC events in order to
  – do efficiency studies of the selection
  – use the same cards for data and MC
  – speed up the event selection in MC
• It is not fully implemented yet, but the first 2 requirements are working
• ZES information is calculated for each event before ORANGE and user code are executed
• Takes about 0.3 to 3 sec/event
• Only selected events are analyzed by ORANGE/user code


ZES for MC events (2)

• Turn on the calculation of ZES information with the new card ZESPHY-RUNIT
• If you need to analyze all events, including those not selected, add the card ZESPHY-USEALL
• Include zescommon.inc in your code to access the calculated ZES information
• The integer variable ZES_eventselection is set to 1 if the event is accepted, otherwise to 0
• Turn off calculation of blocks of ZES information to save CPU time with ZESBLK-nameofblock FALSE (default is TRUE)
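The effect of the selection flag and of the "use all" option on an MC job can be sketched in Python. This is only a model of the control flow described on the slide; the function mc_event_loop and its arguments are hypothetical, and the real logic lives in the FORTRAN/C++ code driven by the cards.

```python
def mc_event_loop(events, is_selected, use_all=False):
    """Yield (event, flag) for each event that would reach the analysis code.

    flag plays the role of ZES_eventselection: 1 if accepted, otherwise 0.
    With use_all=True every event is passed on, but the flag is still set.
    """
    for event in events:
        flag = 1 if is_selected(event) else 0
        if flag or use_all:
            yield event, flag


def is_even(event_nr):
    """Toy selection standing in for the ZES cards."""
    return event_nr % 2 == 0


only_selected = list(mc_event_loop([1, 2, 3, 4], is_even))
everything = list(mc_event_loop([1, 2, 3, 4], is_even, use_all=True))
```

In the second case the analysis code sees all four events and can compare accepted and rejected ones, which is what an efficiency study of the selection needs.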


ZES for MC events (3)

• The ZES MC code is in development
• A lot of changes are now in the new development release (compile with gmake all ZEUSRELEASE=new)
• Check the ZES WWW page for updates and specific versions of needed libraries
• Do NOT use the ZES driver cards for MC:
  – Wrong: ZeusIO-INFI ZeusEventStore
  – Wrong: ZeusIO-IOPT DRIVER=OBJY


Upcoming changes

• Now: 2002 data now in ZES, up to run 42801 (DAQ runs excluded)
• Soon: More efficient selection or rejection of single events and event ranges, mainly to quickly select 2002 events
• Later: More functionality for MC events
• Later: Design changes to further increase performance
• Later: Extend ZES to RAW and ENV events


Getting Information And Help

• The ZES WWW page contains a lot of information about
  – getting started (including an EAZE example with ZES cards)
  – selection strings
  – description of the ZES variables
  – a newslog with recent announcements
  http://www-zeus.desy.de/ZEUS_ONLY/analysis/zes/
• Pay attention to the information displayed from the jobclients (jobq, jobinfo etc.)
• Send suggestions and questions to [email protected]. It is forwarded to the responsible person. Or phone, pass by ...


Some last remarks

• ZES is here to help YOU to make a more efficient data selection
• If you find some information missing in ZES, discuss it with others (supervisor, physics coordinator etc.) and us!
• You need to tell us what is needed AND help develop and/or test the code.
• Especially important for new detectors like MVD, STT, Lumi spectrometer, polarimeter etc.
• And for new analysis methods (Muon finders, dead material correction …)


Summary

• ZES is our most efficient selection mechanism for data.
• It is much faster than the traditional methods.
• With limitations it can also be used for MC.
• More improvements are on the way.
• Remember:
  – ZES is for you!
  – Tell us what you need!
  – Help us to make it real!
