Transcript Document

1
ND Online Monitoring
D. A. Petyt, July 13, 2004
• Contents:
– What is Online Monitoring?
– How does it work?
– What does it provide?
– What are its limitations?
– What is the status of ND Online Monitoring?
– What remains to be implemented?
2
Online Monitoring overview
• The purpose of online monitoring (OM) is to provide real-time
monitoring of the detector status and data quality using the raw data
written by the DAQ.
• A system has been developed which is based on the CDF Run II online
monitoring framework. This system has been operational at Soudan for
the past three years; it ran at CERN during 2002-3 and has been running at
the ND since April '04.
• The system consists of a Producer process, which creates the
histograms and analyses the data, and a GUI which is used to view the
histograms. This GUI is a stand-alone process so the histograms can be
viewed remotely.
• Much of the development of OM has been focused on the far detector.
However, a number of QIE-specific histograms relevant to Near
detector monitoring were developed specifically for the CalDet
Near/Far runs. These have been adapted and expanded for use at the
ND.
3
Brief Overview of the OM Framework
• Consists of three processes:
– Producer: receives and analyses data from the Data Dispatcher. Uses the MINOS C++ analysis framework.
– Histogram Server: receives ROOT histograms from the Producer via a socket connection.
– ROOT-based GUI: connects to the Histogram Server. Handles histogram plotting and updates.
• GUI is decoupled from Producer/Server:
– several GUIs running at external institutions can connect to a single Producer (e.g. at Soudan) and monitor the status of the detector
– Unlike Run Control, the OM GUI does not talk to the Producer process.
[Diagram: raw data (.root format) flows through the Data Dispatcher streams (DaqSnarl, Light Injection, DaqMonitor) to the Producer – the main monitoring process, which subscribes to the streams of interest – and the resulting histograms pass over a socket connection to the Histogram Server, from which multiple monitoring GUI processes read.]
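The Producer-to-Server hand-off above is the standard ROOT socket pattern. Below is a minimal sketch of that pattern using ROOT's TSocket/TMessage classes; the host, port and histogram names are illustrative only, not the actual OM code.

```cpp
// Minimal sketch of the Producer -> Histogram Server hand-off, using
// standard ROOT networking classes. Host, port and histogram names are
// illustrative; the real OM Producer is considerably more involved.
#include "TSocket.h"
#include "TMessage.h"
#include "TH1F.h"
#include "MessageTypes.h"   // defines kMESS_OBJECT

void SendHistogram()
{
   // Producer side: fill a monitoring histogram from dispatcher data ...
   TH1F hSingles("hSinglesRate", "Singles rate per crate;crate;rate (kHz)",
                 16, 0, 16);
   hSingles.Fill(3.0, 53.0);   // dummy entry standing in for real data

   // ... then serialise it and ship it to the Histogram Server.
   TSocket sock("localhost", 9090);   // server host/port (illustrative)
   TMessage mess(kMESS_OBJECT);
   mess.WriteObject(&hSingles);       // ROOT streams the object
   sock.Send(mess);                   // GUIs later pull it from the server
   sock.Close();
}
```

On the server side, a TServerSocket::Accept() loop receives the same TMessage and registers the histogram for the GUI processes to fetch.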
4
The OM GUI
[Screenshot of the Online Monitoring GUI, showing the histogram server address & port number, the list of available monitoring histograms, and sample monitoring canvases.]
5
Accessing the OM plots
• The OM plots are created by two separate Producer processes - one for
the ND, the other for the FD. These run on the following machines:
– daqdds.minos-soudan.org (Soudan)
– daqdds-nd.fnal.gov (Fermilab)
• The plots can be accessed via the GUI for the run that is currently in
progress.
• Once the run has ended, a ROOT file containing all non-zero
histograms is written to disk, and this can be used to produce plots to
insert into CRL.
– There is a set procedure for using the OM rootfiles. A "Checklist plots"
folder is available which contains the principal histograms used for
detector checkout. The shift user goes through these plots one by one and
fills in a checklist form in CRL (once per day), listing any anomalies as
they appear.
– The rootfiles are also available on the DAQ Logs webpage at FNAL for a
short period, and are eventually archived to the FNAL datastore.
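Pulling a single checklist plot out of one of these end-of-run rootfiles is a short ROOT macro, sketched below. The file name and the histogram path inside the file are hypothetical; the real names are whatever the Producer writes for the run in question.

```cpp
// Sketch: open an end-of-run OM rootfile and print one checklist plot.
// File name and histogram path are hypothetical.
#include <cstdio>
#include "TFile.h"
#include "TH1.h"
#include "TCanvas.h"

void DrawChecklistPlot()
{
   TFile f("om_run26003.root");                          // hypothetical file name
   TH1 *h = (TH1*)f.Get("ChecklistPlots/hMeanSingles");  // hypothetical path
   if (!h) { printf("checklist histogram not found\n"); return; }

   TCanvas c("c", "OM checklist plot");
   h->Draw();
   c.Print("mean_singles_run26003.gif");  // image to attach to the CRL entry
}
```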
6
OM Checklist CRL entry
Om1CheckList
Date: 07/12/2004
Time: 8:26:17
Shifter: feldman barrett
at: Soudan
Start OM Viewer: x
OM Status: x
Run: 26003
Rogue crate monitor: x
ASD Supply: x
Ground Plane: x
Positive VA Rail: x
ROP temperatures: x
VFB temperatures: x
Problems, Crate Monitor:
CI Gain: x
Flat CI: x
Problems, CI: 6-0-1-1-1 and 14-1-5-1-0 flat in run 26001. 6-0-1-1-1 was flat last week. 14-1-5-1-0 was not.
Current Singles: x
Hot Chips: x
Max Singles: x
Mean Singles: x
Min Singles: x
Singles Rate per crate: x
Problems, Singles: 9-1-0-1-0 shows hot in maximum rates, but okay in mean rate. 9-2-0-0-1 is low in mean signal, but not zero.
7
OM Philosophy & Limitations
• Top-level histograms – the number of histograms that OM can support
is not infinite. A fixed set of basic histograms (hit maps,
ADC distributions etc.) is therefore provided, which should be sufficient to flag
and diagnose most problems. More detailed analyses should be
performed offline.
• Simple analysis – due to processing constraints, the amount of data
processing in OM is fairly minimal (no event reconstruction, for
example). It may be desirable to develop some crude reconstruction to
make sense of near detector spills (bucket summing/event splitting?).
• Sampling – it is critical that OM keeps up with data taking. If
processing of snarls is too slow, OM will throw out snarls until it is
able to keep up (a minimal sketch of this idea follows after this list).
• Run-based histograms – OM histograms are typically cleared after
each run (with the exception of LI histograms where sequences can
span run boundaries). Long-term variations are therefore not tracked,
although OM does retain a memory of certain quantities - for example
pedestal means - from which run-to-run variations can be observed.
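As promised above, a minimal sketch of the throw-out-snarls idea: a bounded queue that drops new snarls, rather than letting them pile up, whenever the analysis falls behind. All names here are illustrative; this is not the actual OM scheduling code.

```cpp
// Sketch of keep-up sampling: drop incoming snarls when the analysis
// is behind, rather than queueing them indefinitely. Illustrative only.
#include <cstddef>
#include <queue>

struct Snarl { /* raw snarl record (placeholder) */ };

class SnarlSampler {
public:
   explicit SnarlSampler(std::size_t maxPending) : fMaxPending(maxPending) {}

   // Called as each snarl arrives from the Data Dispatcher.
   // Returns false when the snarl is thrown away to keep up.
   bool Offer(const Snarl& s) {
      if (fPending.size() >= fMaxPending) { ++fDropped; return false; }
      fPending.push(s);
      return true;
   }

   // Called by the analysis loop whenever it is ready for more work.
   bool Next(Snarl& s) {
      if (fPending.empty()) return false;
      s = fPending.front();
      fPending.pop();
      return true;
   }

   std::size_t Dropped() const { return fDropped; }  // "snarls skipped" counter

private:
   std::queue<Snarl> fPending;
   std::size_t fMaxPending;
   std::size_t fDropped = 0;
};
```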
8
Sample OM ND plots – null trigger run
[Plots: plane vs strip map; digit timestamps (note the sampling frequency); singles rates; timestamps for a 1 sec snarl.]
9
QIE hit map
• QIE analogue of the VA logical hit map
[Hit map annotated with the extent of 1 M64, 1 MINDER and 1 MASTER.]
10
Detecting QIE errors
Two channels exhibit QIE errors in this run.
11
QIE pedestal plots
• Based on code developed by Peter Shanahan
• Plots simple quantities (mean, RMS, number of entries) in master/minder space
• Plots the positions of channels that have anomalous RMS and/or mean values (a sketch of one possible test follows below)
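One possible form of that anomaly test is sketched below, assuming the producer keeps a per-channel list of pedestal means and RMSs. The structure names and cut values are hypothetical, not the actual code.

```cpp
// Sketch: flag channels whose pedestal mean or RMS sits far from the
// detector-wide average. Structure names and cuts are hypothetical.
#include <cmath>
#include <vector>

struct ChannelPed { int master, minder, channel; double mean, rms; };

std::vector<ChannelPed> FlagAnomalous(const std::vector<ChannelPed>& peds,
                                      double meanCut = 50.0,  // ADC counts (illustrative)
                                      double rmsCut  = 5.0)   // ADC counts (illustrative)
{
   std::vector<ChannelPed> bad;
   if (peds.empty()) return bad;

   // Detector-wide averages of the per-channel mean and RMS.
   double sumMean = 0.0, sumRms = 0.0;
   for (const auto& p : peds) { sumMean += p.mean; sumRms += p.rms; }
   const double avgMean = sumMean / peds.size();
   const double avgRms  = sumRms  / peds.size();

   // Anything far from the averages goes on the anomalous-channel map.
   for (const auto& p : peds) {
      if (std::fabs(p.mean - avgMean) > meanCut ||
          std::fabs(p.rms  - avgRms)  > rmsCut)
         bad.push_back(p);
   }
   return bad;
}
```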
12
Near Check Cal plots
[Plots showing the location of rogue channels (range 5), with the rogue channels marked on each canvas. NB: CalDet data.]
13
Status of OM ND histograms
• Hit Maps
– Plane vs strip
– Master/minder
– Last event
• Timestamps
– T-T0 (1sec and 10us)
– Per crate timestamps
– Per crate for digits with error bits set
• Singles
– Per crate
– Per crate as a function of time
– Per chip (per minder?)
• QIE digits
– Plane, strip, strip profiles, #hits, summed ADC
• ADC distributions
– All channels
– Individual channels
• Digit Errors
– Summary of errors by crate
– Individual error types by crate
• QIE pedestal
– Mean, RMS, entries by crate/master
– Map of anomalous channels
– Summary canvas of bad channels
– Comparison of last 2 pedestal runs
• Light Injection
– Hit maps
– Rogue channel hit map
– Gain maps
– Timing summaries
– Trigger PMT ADC distributions
• Near Check Cal
– Mean vs DAC, RMS vs DAC, Mean/RMS
– Bad Channels by range
– Niki's definition of bad channels
• Crate Monitor
– ROP temperatures
• TRC monitoring
– Swaps/sec, timing errors, PPS time
Key: working, not tested, in development
14
Future OM development for QIE/Near
• Firstly, I will work on implementing the plots marked in
red or yellow on the previous page.
• An obvious next step is to look at the existing offline scripts
being used to analyse QIE data and see to what extent they can be
used within OM. This has already happened for the Near Check Cal and
QIE pedestal analyses.
• Are there any new QIE data blocks/run type variations that
I should be aware of which require monitoring?
• What level of reconstruction is required? Is bucket summing needed?
If so, efficient algorithms will be required if OM is not to become
bogged down (one possible sketch follows below).
• Suggestions (+ code snippets) are always welcome.
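In that spirit, here is one possible bucket-summing sketch: a single pass that merges contiguous QIE buckets on the same channel into one hit, so the cost stays linear in the number of digits. All types and names are hypothetical; this is an illustration of the idea, not a proposed OM algorithm.

```cpp
// Sketch of bucket summing: merge contiguous QIE buckets on the same
// channel into one summed hit. Types and names are hypothetical.
#include <vector>

struct Bucket { int channel; int bucketIndex; double charge; };
struct Hit    { int channel; int firstBucket; int lastBucket; double sumCharge; };

// Assumes the input is already ordered by channel, then by bucket index
// (one sort, or the natural readout order). Single O(n) pass thereafter.
std::vector<Hit> SumBuckets(const std::vector<Bucket>& buckets)
{
   std::vector<Hit> hits;
   for (const auto& b : buckets) {
      if (!hits.empty() &&
          hits.back().channel    == b.channel &&
          hits.back().lastBucket == b.bucketIndex - 1) {
         // Contiguous in time on the same channel: extend the current hit.
         hits.back().lastBucket = b.bucketIndex;
         hits.back().sumCharge += b.charge;
      } else {
         // Gap or new channel: start a new hit.
         hits.push_back({b.channel, b.bucketIndex, b.bucketIndex, b.charge});
      }
   }
   return hits;
}
```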
15
Monitoring of spill data
• The OM histograms previously described are designed to
catch low-level problems in the data – dead channels,
calibration problems etc.
• How useful is the current set of plots for spill data? Is it
possible to use them to determine how well the beam is
operating?
• My understanding is that the ND will run in spill mode
when beam is delivered (10ms+ trigger window) and in
cosmics mode (null trigger?) outside the spill.
• Used Mock Data MC to simulate spill-triggered ND data –
see the plots on the next few pages
– From these, it appears that some crude beam monitoring might be
possible with the current set of OM histograms
16
1 spill – Mock MC
Individual neutrino interactions & rock muons clearly visible
17
Plane vs strip map – Mock MC
550 spills used
18
Zoomed hit map – Mock MC
19
QIE hit map – Mock MC
20
T-T0 – Mock MC
1 spill
A count of the number of spikes (above threshold) could be used as a
rough intensity monitor – see the sketch below.
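A sketch of such a spike counter: it counts rising edges above a threshold in the T-T0 timestamp histogram. The threshold is illustrative and would need tuning against real spill data.

```cpp
// Sketch: count spikes (rising edges above threshold) in the T-T0
// timestamp histogram, as a rough spill intensity monitor.
#include "TH1.h"

int CountSpikes(const TH1* hTimestamps, double threshold)
{
   int spikes = 0;
   bool inSpike = false;
   for (int i = 1; i <= hTimestamps->GetNbinsX(); ++i) {
      const bool above = hTimestamps->GetBinContent(i) > threshold;
      if (above && !inSpike) ++spikes;   // count each rising edge once
      inSpike = above;
   }
   return spikes;
}
```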
21
ND/Beam Monitoring
• Using ND data for beam monitoring
– I’m interested to know what is required here
• How much can be accommodated by the use of 'simple' histograms of the type
shown above?
• More detailed plots (those that involve significant reconstruction) should
probably be spun off into separate offline jobs, or separate Producers if
automated/online running is desirable
– My view is that the OM plots will:
• Tell you if there is beam
• Give a rough idea of the beam intensity
• Incorporating beam monitoring data into OM
– I know very little about this, other than the statement that beam
monitoring data will be available in the data stream
– What plans are there for displaying this data? The OM framework could
conceivably be used for this task
• Probably needs a separate Producer process