No slide title (Kein Folientitel) - Universitetet i Bergen


Data rate reduction in ALICE

Data volume and event rate determine the bandwidth:

TPC detector (data volume = 300 Mbyte/event, event rate = 200 Hz)
  → 60 Gbyte/sec → front-end electronics
  → 15 Gbyte/sec → Level-3 system
  → < 2 Gbyte/sec → DAQ - event building
  → < 1.2 Gbyte/sec → permanent storage system
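The bandwidth figures above can be checked with a few lines of arithmetic; a minimal sketch (all numbers taken from the slide):

```python
# Data-rate chain from the slide above.
event_size_mb = 300.0   # Mbyte/event from the TPC detector
event_rate_hz = 200.0   # events/sec

detector_rate = event_size_mb * event_rate_hz / 1000.0  # Gbyte/sec
print(detector_rate)    # 60.0 Gbyte/sec off the detector

frontend_rate = 15.0    # Gbyte/sec after front-end electronics
level3_rate = 2.0       # Gbyte/sec upper limit into DAQ event building
storage_rate = 1.2      # Gbyte/sec upper limit to permanent storage

# Overall reduction factor the Level-3 chain must deliver:
print(round(detector_rate / storage_rate))  # 50
```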
Data rate reduction
• Volume reduction
  – regions-of-interest and partial readout
  – data compression
    • entropy coder
    • vector quantization
    • TPC-data modeling
• Rate reduction
  – (sub-)event reconstruction and event rejection before event building
Regions-of-interest and partial readout (1)
• Selection of TPC sector and η-slice based on TRD track candidate
• Momentum filter for D0 decay tracks based on TPC tracking

Regions-of-interest and partial readout (2)
• Momentum filter for D0 decay tracks based on TPC tracking:
  pT > 0.8 GeV/c vs. all pT
Data compression: Entropy coder
• Probability distribution of 8-bit TPC data
• Variable Length Coding: short codes for frequent values, long codes for infrequent values
• Results:
  – NA49: compressed event size = 72%
  – ALICE: compressed event size = 65%
  (Arne Wiebalck, diploma thesis, Heidelberg)
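The variable-length-coding idea can be illustrated with a minimal Huffman coder in Python (a software sketch only; the unit described on a later slide is a pipelined FPGA implementation, and the toy ADC distribution below is invented):

```python
import heapq
from collections import Counter

def huffman_code(freqs):
    """Build a Huffman code book {symbol: bitstring} from symbol frequencies."""
    # Each heap entry: (frequency, tiebreaker, partial code book).
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)      # two least frequent subtrees
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (f1 + f2, count, merged))
        count += 1
    return heap[0][2]

# Toy 8-bit ADC-value distribution: small values dominate, as on a TPC pad.
data = [0] * 80 + [1] * 10 + [2] * 6 + [7] * 3 + [200] * 1
book = huffman_code(Counter(data))
bits = sum(len(book[v]) for v in data)
print(bits / (8 * len(data)))  # compressed size as a fraction of 8-bit raw
```

The most frequent value gets a one-bit code, rare values get long codes, so the skewed distribution compresses well below its raw 8-bit size.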
Data compression: TPC - RCU
• TPC front-end electronics system architecture and readout controller unit
• Pipelined Huffman Encoding Unit, implemented in a Xilinx Virtex 50 chip*
* T. Jahnke, S. Schoessel and K. Sulimma, EDA group, Department of Computer Science, University of Frankfurt
Data compression: Vector quantization
• Sequence of ADC-values on a pad = vector; the code book holds representative vectors
• Vector quantization = transformation of vectors into code book entries
• Quantization error:
• Results:
  – NA49: compressed event size = 29%
  – ALICE: compressed event size = 48%-64%
  (Arne Wiebalck, diploma thesis, Heidelberg)
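A sketch of the idea in Python/NumPy, with an invented two-entry code book (the real code books and their training are not given on the slide):

```python
import numpy as np

def quantize(vectors, codebook):
    """Map each vector (ADC sequence on a pad) to its nearest code book entry."""
    # Squared Euclidean distances, shape (n_vectors, n_codes).
    d = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    idx = d.argmin(axis=1)          # only these indices need to be stored
    return idx, codebook[idx]

# Hypothetical 4-sample pad sequences and a 2-entry code book.
codebook = np.array([[0., 1., 1., 0.],    # small pulse
                     [2., 8., 8., 2.]])   # large pulse
pads = np.array([[0., 1., 2., 0.],
                 [3., 7., 9., 2.]])
idx, approx = quantize(pads, codebook)
err = np.sqrt(((pads - approx) ** 2).sum())  # quantization error
print(idx)  # [0 1]
```

Only the code book indices are transmitted, which is where the compression comes from; the quantization error is the price paid for the lossy mapping.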
Data compression: TPC-data modeling
• Fast local pattern recognition:
  simple local track model (e.g. helix) → track parameters
• Track and cluster modeling:
  local track parameters + analytical cluster model → comparison to raw data → quantization of deviations from track and cluster model
• Result:
  – NA49: compressed event size = 7%
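The compression principle, fitting a simple track model and encoding only quantized deviations from it, can be sketched as follows (a straight line stands in for the helix of the slide; all parameters are invented):

```python
import numpy as np

# Illustrative sketch: instead of raw cluster positions, store coarsely
# quantized deviations from a local track model.
rng = np.random.default_rng(0)
x = np.arange(10.0)
clusters = 0.5 * x + 2.0 + rng.normal(0.0, 0.05, x.size)  # measured positions

slope, offset = np.polyfit(x, clusters, 1)    # local track parameters
residuals = clusters - (slope * x + offset)   # deviations from the model

step = 0.01                                   # quantization step (assumed)
quantized = np.round(residuals / step).astype(int)  # small ints, cheap to encode
restored = slope * x + offset + quantized * step
print(np.abs(restored - clusters).max() < step)  # True: bounded loss
```

Because the residuals are small, the quantized deviations occupy far fewer bits than the raw positions, while the reconstruction error stays bounded by the quantization step.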
Event rejection

[Data-flow diagram; recoverable content:]
• Triggering: TRD trigger (~2 kHz) and other triggers feed the global trigger; L0 pretrigger, L0/L1 readout signals, and an L2accept.
• TPC detectors deliver zero-suppressed TPC data, sector-parallel (216 links, 83 MB/evt).
• HLT: select regions of interest; tracking of e+e- candidates inside the TPC, seeded by TRD e+e- tracks; track segments and space points; verify the e+e- hypothesis → reject event, or keep e+e- tracks plus ROIs.
• On-line data reduction (tracking, reconstruction, partial readout, compression), plus binary lossless data compression (RLE, Huffman, LZW, etc.).
• Event sizes (TPC only): 45 MB/evt, 4-40 MB/evt, 0.5-2 MB/evt into DAQ.
• Diagram axes: time/causality; event sizes and number of links.
Fast pattern recognition
Essential part of the HLT system:
– crude complete event reconstruction → monitoring, event rejection
– redundant local tracklet finder for cluster evaluation and data modeling → efficient data compression
– selection of (η, φ, pT)-slices → ROI
– momentum filter → ROI
– high-precision tracking for selected track candidates → event rejection
Requirements on the TPC-RORC design concerning HLT tasks
• Transparent mode
  – transferring raw data to DAQ
• Processing mode
  – Huffman decoding
  – unpacking
  – 10-to-8 bit conversion
  – pattern recognition
    • cluster finder
    • Hough transformation tracker
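Of the processing-mode steps, the 10-to-8 bit conversion is easy to sketch. The slide does not specify the mapping, so the square-root-style lookup table below is an assumption for illustration (such non-linear compression keeps fine resolution at small amplitudes):

```python
# Hedged sketch of a 10-to-8 bit ADC conversion via lookup table.
# The actual ALICE mapping is not given on the slide; a roughly
# square-root compression curve is assumed here.
table = [min(255, round((v ** 0.5) * (255 / (1023 ** 0.5)))) for v in range(1024)]

def convert_10_to_8(samples):
    """Map 10-bit samples (0..1023) to 8 bits (0..255) through the table."""
    return [table[s] for s in samples]

print(convert_10_to_8([0, 1023]))  # [0, 255]
```

A table lookup like this maps naturally onto the FPGA's internal RAM, one sample per clock cycle.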
TPC PCI-RORC
• Simple PCI-RORC: PCI bus - PCI bridge (S5935) - glue logic (ALTERA) - DIU interface - DIU card
• HLT TPC PCI-RORC
  – backwards compatibility
  – fully programmable → FPGA coprocessor
  – PCI bus - PCI bridge - glue logic - FPGA coprocessor with SRAM - DIU interface - DIU card
Preprocessing per sector
• Detector front-end electronics (RCU): raw data, 10-bit dynamic range, zero-suppressed; Huffman encoding (and vector quantization)
• RORC: Huffman decoding, unpacking, 10-to-8 bit conversion; fast cluster finder (simple unfolding, flagging of overlapping clusters) → cluster list; fast track finder initialization (e.g. Hough transform) → Hough histograms
• Receiver node: peak finder; fast vertex finder
• Global node: vertex position
FPGA coprocessor: cluster finder
• Fast cluster finder
  – up to 32 padrows per RORC
  – up to 141 pads/row and up to 512 timebins/pad
  – internal RAM: 2x512x8 bit
  – timing (in clock cycles, e.g. 5 nsec)¹:
    #(cluster-timebins per pad) / 2 + #clusters
    → outer padrow: 150 nsec/pad, 21 µsec/row
  – centroid calculation: pipelined array multiplier

1. Timing estimates by K. Sulimma, EDA group, Department of Computer Science, University of Frankfurt
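A toy software version of such a per-pad cluster finder (sequential here, whereas the FPGA version above is pipelined; the threshold value is an assumption):

```python
def find_clusters(adc, threshold=2):
    """Toy cluster finder on one pad: group consecutive time bins above
    threshold and return (charge-weighted centroid, total charge) pairs."""
    clusters, start = [], None
    for t, q in enumerate(list(adc) + [0]):       # sentinel closes last cluster
        if q > threshold and start is None:
            start = t                             # cluster opens
        elif q <= threshold and start is not None:
            bins = range(start, t)                # cluster closes
            charge = sum(adc[i] for i in bins)
            centroid = sum(i * adc[i] for i in bins) / charge
            clusters.append((centroid, charge))
            start = None
    return clusters

print(find_clusters([0, 0, 3, 8, 3, 0, 0, 4, 4, 0]))  # [(3.0, 14), (7.5, 8)]
```

The charge-weighted centroid is the part the slide assigns to the pipelined array multiplier; the scan itself touches each time bin once, consistent with the per-pad timing estimate above.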
FPGA coprocessor: Hough transformation
• Fast track finder: Hough transformations²
  – (row, pad, time)-to-(2/R, ψ, η) transformation
  – (n-pixel)-to-(circle-parameter) transformation
  – feature extraction: local peak finding in parameter space

2. E.g. see Pattern Recognition Algorithms on FPGAs and CPUs for the ATLAS LVL2 Trigger, C. Hinkelbein et al., IEEE Trans. Nucl. Sci. 47 (2000) 362.
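A minimal numerical sketch of such a transformation for tracks from the vertex: a hit at polar coordinates (r, φ) on a circle of radius R through the origin satisfies r = 2R·sin(φ − ψ), so each hit maps to a curve 1/R = 2·sin(φ − ψ)/r in (ψ, curvature) parameter space, and the track appears as a peak. Track parameters, binning, and the use of NumPy are all assumptions for illustration:

```python
import numpy as np

R_true, psi_true = 9.5, 0.3            # invented track: radius, emission angle
t = np.linspace(0.1, 1.5, 20)          # hits along the circle
r = 2 * R_true * np.sin(t / 2)         # polar radius of each hit
phi = psi_true + t / 2                 # polar angle of each hit

psi_edges = np.linspace(0.0, 1.0, 101)  # 100 bins in emission angle
kap_edges = np.linspace(0.0, 0.3, 121)  # 120 bins in curvature
hist = np.zeros((100, 120))

for ri, pi in zip(r, phi):
    kappa = 2 * np.sin(pi - psi_edges[:-1]) / ri  # one curve per hit
    ik = np.digitize(kappa, kap_edges) - 1
    ok = (ik >= 0) & (ik < 120)
    hist[np.arange(100)[ok], ik[ok]] += 1         # fill parameter space

# Feature extraction: the histogram maximum gives the track parameters.
ipsi, ikap = np.unravel_index(hist.argmax(), hist.shape)
print(psi_edges[ipsi], kap_edges[ikap])  # close to (0.3, 1/9.5)
```

All 20 curves intersect in one parameter-space cell, which the peak finder picks out; on the FPGA the same filling and maxima search run over binned histograms in on-chip RAM.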
Processing per sector
• Input: raw data, 8-bit dynamic range, decoded and unpacked; vertex position; cluster list
• Slicing of padrow-pad-time space into sheets of pseudo-rapidity, subdividing each sheet into overlapping patches → sub-volumes in (r, φ, η)
• RORC, fast track finder A: track follower
• RORC, fast track finder B: 1. Hough transformation
• Receiver node, fast track finder B: 2. Hough maxima finder; 3. tracklet verification → track segments
• Cluster deconvolution and fitting
• Output: updated vertex position; updated cluster list; track segment list
Hough transform (1)
• Data flow
Hough transform (2)
• η-slices
Hough transform (3)
• Transformation and maxima search
FPGA coprocessor: Implementation of Hough transform

FPGA coprocessor prototype
• PCI bus - PCI bridge - glue logic - FPGA coprocessor with SRAM and FEP RAM
• DIU interface / DIU card and SIU interface / SIU card, connecting to the RCU
• FPGA candidates
  – Altera Excalibur (256 kbyte SRAM)
  – Xilinx Virtex II (3.9 Mbit dual-port SRAM + 1.9 Mbit distributed SRAM, 420 MHz)
  – external high-speed SRAM