Transcript Document
1 ND Online Monitoring
D. A. Petyt, July 13 2004

• Contents:
  – What is Online Monitoring?
  – How does it work?
  – What does it provide?
  – What are its limitations?
  – What is the status of ND Online Monitoring?
  – What remains to be implemented?

2 Online Monitoring overview

• The purpose of online monitoring (OM) is to provide real-time monitoring of the detector status and data quality using the raw data written by the DAQ.
• A system has been developed which is based on the CDF Run II online monitoring framework. This system has been operational at Soudan for the past 3 years. It ran at CERN during 2002-03 and has been running at the ND since April '04.
• The system consists of a Producer process, which creates the histograms and analyses the data, and a GUI which is used to view the histograms. The GUI is a stand-alone process, so the histograms can be viewed remotely.
• Much of the development of OM has been focussed on the far detector. However, a number of QIE-specific histograms relevant to Near detector monitoring were developed specifically for the CalDet Near/Far runs. These have been adapted and expanded for use at the ND.

3 Brief Overview of the OM Framework

• Consists of three processes:
  – Producer: the main monitoring process. Subscribes to the streams of interest and receives and analyses data from the Data Dispatcher. Uses the MINOS C++ analysis framework.
  – Histogram Server: receives ROOT histograms from the Producer via a socket connection (a minimal sketch of this hand-off appears after slide 5).
  – ROOT-based GUI: connects to the Histogram Server. Handles histogram plotting and updates.
• GUI is decoupled from Producer/Server:
  – several GUIs running at external institutions can connect to a single Producer (e.g. at Soudan) and monitor the status of the detector
  – Unlike Run Control, the OM GUI does not talk to the Producer process.

[Framework diagram: raw data (.root format) in the DaqSnarl, Light Injection and DaqMonitor streams flows from the Data Dispatcher to the Producer (example status: run number 8094, number of snarls 175, mean singles rate 53 kHz), which sends histograms over a socket connection to the Histogram Server; multiple monitoring GUI processes connect to the server.]

4 The OM GUI

[Screenshot of the Online Monitoring GUI, showing the histogram server address & port number, the list of available monitoring histograms, and sample monitoring canvases.]

5 Accessing the OM plots

• The OM plots are created by 2 separate Producer processes, one for the ND and the other for the FD. These run on the following machines:
  – daqdds.minos-soudan.org (Soudan)
  – daqdds-nd.fnal.gov (Fermilab)
• The plots can be accessed via the GUI for the run that is currently in progress.
• Once the run has ended, a ROOT file containing all non-zero histograms is written to disk, and this can be used to produce plots to insert into CRL.
  – There is a set procedure for using the OM rootfiles. A "Checklist plots" folder is available which contains the principal histograms used for detector checkout. The shift user goes through these plots one by one and fills in a checklist form in CRL (once per day), listing any anomalies as they appear.
  – The rootfiles are also available on the DAQ Logs webpage at FNAL for a short period, and are eventually archived to the FNAL datastore.
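The Producer-to-Histogram-Server hand-off described on slide 3 follows the generic ROOT client/server pattern: a histogram is serialised into a TMessage and pushed over a TSocket. The fragment below is a minimal sketch of that pattern, not the OM implementation itself; the host name, port and histogram are placeholders.

// Sketch of the Producer-side histogram hand-off (slide 3): serialise a ROOT
// histogram into a TMessage and push it to the histogram server over a
// TSocket.  This is the generic ROOT client/server pattern; the host, port
// and histogram below are placeholders, not OM internals.
#include "TH1F.h"
#include "TSocket.h"
#include "TMessage.h"

void send_histogram()
{
  TSocket sock("localhost", 9090);            // hypothetical server address/port
  if (!sock.IsValid()) return;                // server not reachable

  TH1F h("hRates", "Singles rate per crate", 16, 0, 16);
  h.Fill(3.0);                                // stand-in for real monitoring fills

  TMessage msg(kMESS_OBJECT);                 // message carrying a ROOT object
  msg.WriteObject(&h);                        // serialise the histogram
  sock.Send(msg);                             // ship it to the histogram server
  sock.Close();
}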
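For illustration, the end-of-run rootfile mentioned on slide 5 can be browsed with an ordinary ROOT macro. The sketch below is not part of OM; the file name and histogram path are invented placeholders. It simply shows the open-draw-save step a shifter (or a small script) would perform to get a checklist plot into CRL.

// Minimal ROOT macro sketch: open an end-of-run OM rootfile and save one
// checklist histogram as an image for pasting into CRL.  The file name and
// histogram path are placeholders, not the actual OM naming scheme.
#include "TFile.h"
#include "TH1.h"
#include "TCanvas.h"

void checklist_plot()
{
  TFile *f = TFile::Open("om_run26003.root");   // hypothetical file name
  if (!f || f->IsZombie()) return;              // bail out if the file is missing

  // Retrieve one of the checklist histograms (name is illustrative only).
  TH1 *h = dynamic_cast<TH1*>(f->Get("Checklist/SinglesRatePerCrate"));
  if (!h) return;

  TCanvas c("c", "OM checklist plot", 800, 600);
  h->Draw();
  c.SaveAs("singles_rate_per_crate.png");       // image to attach to the CRL entry
  f->Close();
}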
6 OM Checklist CRL entry

[Screenshot of a filled-in Om1CheckList CRL form, dated 07/12/2004 8:26:17, shifters feldman and barrett, OM Viewer at Soudan, run 26003. The checklist fields include OM Viewer, OM Status, Run, Rogue crate monitor, ASD Supply, Ground Plane, Positive VA Rail, ROP temperatures, VFB temperatures, CI Gain, Flat CI, Current Singles, Hot Chips, Max/Mean/Min Singles, Singles Rate per crate, and Problems entries for the Crate Monitor, CI and Singles sections; most items are simply checked off. Problems noted: "6-0-1-1-1 and 14-1-5-1-0 flat in run 26001. 6-0-1-1-1 was flat last week. 14-1-5-1-0 was not." and "9-1-0-1-0 show hot in maximum rates, but okay in mean rate. 9-2-0-0-1 is low in mean signal, but not zero."]

7 OM Philosophy & Limitations

• Top-level histograms – the number of histograms that OM can support is not infinite. A fixed set of basic histograms (hit maps, ADC distributions etc.) is therefore provided, which should be sufficient to flag and diagnose most problems. More detailed analyses should be performed offline.
• Simple analysis – due to processing constraints, the amount of data processing in OM is fairly minimal (no event reconstruction, for example). It may be desirable to develop some crude reconstruction to make sense of near detector spills (bucket summing/event splitting?).
• Sampling – it is critical that OM keeps up with data taking. If processing of snarls is too slow, OM will throw out snarls until it is able to keep up (a toy sketch of this policy appears after slide 12).
• Run-based histograms – OM histograms are typically cleared after each run (with the exception of LI histograms, where sequences can span run boundaries). Long-term variations are therefore not tracked, although OM does retain a memory of certain quantities - for example pedestal means - from which run-to-run variations can be observed.

8 Sample OM ND plots – null trigger run

[Example canvases: plane vs strip map; digit timestamps (note the sampling frequency); singles rates; timestamps for a 1 sec snarl.]

9 QIE hit map

• QIE analogue of the VA logical hit map
[Hit map laid out in readout units: 1 M64, 1 MINDER, 1 MASTER.]

10 Detecting QIE errors

[Example plot: 2 channels exhibit QIE errors in this run.]

11 QIE pedestal plots

• Based on code developed by Peter Shanahan
• Plots simple quantities (mean, rms, number of entries) in master/minder space
• Plots positions of channels that have anomalous rms and/or mean values (a small sketch of this bookkeeping appears after slide 12)

12 Near Check Cal plots

[Example plots showing the location of rogue channels (range 5), with the rogue channels marked. NB: CalDet data.]
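The pedestal quantities plotted on slide 11 (mean, RMS and number of entries per channel, plus a map of anomalous channels) amount to simple running sums kept per channel. The sketch below illustrates that bookkeeping only; the channel indexing and cut values are assumptions, and this is not Peter Shanahan's actual code.

// Per-channel pedestal bookkeeping sketch (slide 11): accumulate ADC sums per
// channel, then flag channels whose pedestal mean or RMS falls outside a
// chosen window.  Channel indexing and cuts are illustrative only.
#include <algorithm>
#include <cmath>
#include <map>
#include <vector>

struct ChannelId { int master, minder, channel; };
inline bool operator<(const ChannelId& a, const ChannelId& b) {
  if (a.master != b.master) return a.master < b.master;
  if (a.minder != b.minder) return a.minder < b.minder;
  return a.channel < b.channel;
}

struct PedStats { double sum = 0, sum2 = 0; long n = 0; };

class PedestalMonitor {
public:
  void Fill(const ChannelId& id, double adc) {
    PedStats& s = fStats[id];
    s.sum += adc; s.sum2 += adc * adc; ++s.n;
  }

  // Return channels whose mean or RMS is outside the allowed window.
  std::vector<ChannelId> Anomalous(double meanLo, double meanHi, double rmsMax) const {
    std::vector<ChannelId> bad;
    for (const auto& kv : fStats) {
      const PedStats& s = kv.second;
      if (s.n == 0) continue;
      double mean = s.sum / s.n;
      double rms  = std::sqrt(std::max(0.0, s.sum2 / s.n - mean * mean));
      if (mean < meanLo || mean > meanHi || rms > rmsMax) bad.push_back(kv.first);
    }
    return bad;
  }

private:
  std::map<ChannelId, PedStats> fStats;
};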
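The sampling behaviour described on slide 7 (drop snarls whenever processing falls behind, so that OM stays live) can be pictured as a bounded backlog. The sketch below is a toy model only; the Snarl type, queue depth and processing hooks are hypothetical.

// Toy sketch of the slide-7 sampling policy: if histogramming cannot keep up
// with the incoming snarl rate, skip snarls rather than fall behind.
#include <cstddef>
#include <deque>

struct Snarl { /* raw digits for one trigger window (placeholder) */ };

class SnarlSampler {
public:
  explicit SnarlSampler(std::size_t maxBacklog) : fMaxBacklog(maxBacklog) {}

  // Called for every snarl delivered by the dispatcher.
  void OnSnarl(const Snarl& s) {
    if (fQueue.size() >= fMaxBacklog) { ++fDropped; return; }  // throw it out: stay live
    fQueue.push_back(s);
  }

  // Called whenever the producer is free to do more work.
  void ProcessOne() {
    if (fQueue.empty()) return;
    Analyse(fQueue.front());                                   // fill monitoring histograms
    fQueue.pop_front();
  }

  std::size_t Dropped() const { return fDropped; }

private:
  void Analyse(const Snarl&) { /* histogram filling would go here */ }

  std::deque<Snarl> fQueue;
  std::size_t       fMaxBacklog;
  std::size_t       fDropped = 0;
};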
13 Status of OM ND histograms

• Hit Maps
  – Plane vs strip
  – Master/minder
  – Last event
• Timestamps
  – T-T0 (1 sec and 10 us)
  – Per crate timestamps
  – Per crate for digits with error bits set
• Singles
  – Per crate
  – Per crate as a function of time
  – Per chip (per minder?)
• QIE digits
  – Plane, strip, strip profiles, #hits, summed ADC
• ADC distributions
  – All channels
  – Individual channels
• QIE pedestal
  – Mean, RMS, entries by crate/master
  – Map of anomalous channels
  – Summary canvas of bad channels
  – Comparison of last 2 pedestal runs
• Near Check Cal
  – Mean vs DAC, RMS vs DAC, Mean/RMS
  – Bad channels by range
  – Niki's definition of bad channels
• Light Injection
  – Hit maps
  – Rogue channel hit map
  – Gain maps
  – Timing summaries
  – Trigger PMT ADC distributions
• Digit Errors
  – Summary of errors by crate
  – Individual error types by crate
• Crate Monitor
  – ROP temperatures
• TRC monitoring
  – Swaps/sec, timing errors, PPS time

Key (colour code): working, not tested, in development

14 Future OM development for QIE/Near

• Firstly, I will work on implementing the plots marked in red or yellow on the previous page.
• An obvious next step is to look at the offline scripts currently used to analyse QIE data and see to what extent they can be used within OM. This process has already happened for the Near Check Cal and QIE pedestal analyses.
• Are there any new QIE data blocks/run type variations that I should be aware of which require monitoring?
• What level of reconstruction is required? Is bucket summing needed? If so, efficient algorithms will be required if OM is not to become bogged down (a rough sketch appears after slide 21).
• Suggestions (+ code snippets) are always welcome.

15 Monitoring of spill data

• The OM histograms previously described are designed to catch low-level problems in the data – dead channels, calibration problems etc.
• How useful is the current set of plots for spill data? Is it possible to use them to determine how well the beam is operating?
• My understanding is that the ND will run in spill mode when beam is delivered (10 ms+ trigger window) and in cosmics mode (null trigger?) outside the spill.
• Used Mock Data MC to simulate spill-triggered ND data – see plots on the next few pages.
  – From these, it appears that some crude beam monitoring might be possible with the current set of OM histograms.

16 1 spill – Mock MC

[Event display: individual neutrino interactions & rock muons clearly visible.]

17 Plane vs strip map – Mock MC

[Hit map accumulated over 550 spills.]

18 Zoomed hit map – Mock MC

19 QIE hit map – Mock MC

20 T-T0 – Mock MC

[Timing distribution for 1 spill. A count of the number of spikes (above threshold) could be used as a rough intensity monitor – see the sketch after slide 21.]

21 ND/Beam Monitoring

• Using ND data for beam monitoring – I'm interested to know what is required here
  – How much can be accommodated by the use of 'simple' histograms of the type shown above?
  – More detailed plots (those that involve significant reconstruction) should probably be spun off into separate offline jobs, or separate Producers if automated/online running is desirable.
  – My view is that the OM plots will tell you:
    • whether there is beam
    • a rough idea of the beam intensity
• Incorporating beam monitoring data into OM
  – I know very little about this, other than the statement that beam monitoring data will be available in the data stream.
  – What plans are there for displaying this data? The OM framework could conceivably be used for this task.
    • Probably needs a separate Producer process.
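If bucket summing (slide 14) is adopted, the core operation is to merge QIE digits from consecutive time buckets on the same channel into one summed hit. The sketch below shows one simple way to do this on time-ordered digits; the digit layout, field names and bucket numbering are assumptions for illustration, not the actual ND data format.

// Hedged "bucket summing" sketch: digits from consecutive time buckets on the
// same channel are merged into a single summed hit.  Digits are assumed
// sorted by (channel, bucket).
#include <vector>

struct QieDigit  { int channel; long bucket; double adc; };   // one time-bucket readout
struct SummedHit { int channel; long firstBucket; double adcSum; };

std::vector<SummedHit> SumBuckets(const std::vector<QieDigit>& digits)
{
  std::vector<SummedHit> hits;
  int  lastChannel = -1;
  long lastBucket  = -2;   // bucket of the previous digit in the current hit

  for (const QieDigit& d : digits) {
    bool contiguous = (d.channel == lastChannel) && (d.bucket == lastBucket + 1);
    if (contiguous) {
      hits.back().adcSum += d.adc;                     // extend the current hit
    } else {
      hits.push_back({d.channel, d.bucket, d.adc});    // gap or new channel: start a new hit
    }
    lastChannel = d.channel;
    lastBucket  = d.bucket;
  }
  return hits;
}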
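The rough intensity monitor suggested for the T-T0 plot (slide 20) - counting spikes above threshold - needs only a single pass over the histogram bins. A minimal sketch, assuming an already-filled ROOT timing histogram and an arbitrary threshold:

// Count distinct spikes (contiguous groups of bins above threshold) in a
// T-T0 timing histogram.  The threshold and histogram are illustrative.
#include "TH1.h"

int CountSpikes(const TH1& hTiming, double threshold)
{
  int  spikes  = 0;
  bool inSpike = false;
  for (int bin = 1; bin <= hTiming.GetNbinsX(); ++bin) {
    bool above = hTiming.GetBinContent(bin) > threshold;
    if (above && !inSpike) ++spikes;   // rising edge: a new spike starts
    inSpike = above;
  }
  return spikes;                       // roughly the number of interactions in the spill
}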