Presentation at the RadioNet FP7 Uniboard kick-off meeting, Dwingeloo, NL, Feb 26-27 2009
[Title slide graphic. Labels: IN; What’s there?; XILINX?; ALTERA?; Some prefer it: XILINX V-6; OUT]
Brief:
The Uniboard concept was introduced at the end of the last century as an FPGA-based hardware platform common to VLBI, WSRT, SKA pathfinders, and pulsar research projects. Advances in FPGA technology over the last decade appear to have made this idea implementable.
The VLBI system architecture approach is based on the “limited bandwidth – all the baselines in a single node” concept, nicknamed the Super-FX. It is therefore linearly scalable with bandwidth, which allows it to reach the EVN2015 requirements and beyond.
Related applications such as Space VLBI and VLBI for deep space and planetary missions are also in scope.
Bandwidth considerations:
512 MHz analog bandwidth per station is currently available with the Mark4 DAS, with 2 Gbps recording at some stations. 1 GHz BW recording was demonstrated by Haystack Observatory.
2 x 1 GHz analog bandwidth with 2 bit sampling (2 x 4 Gbps data rate) is coming soon with DBBC and Mk5C; feasibility of 8 Gbps data capture was demonstrated at Metsahovi.
The IVS 2010 feed/receiver system can deliver up to 32 GHz x 2 pols analog bandwidth.
Goals for the EVN 2015 upgrade are set to 8-16 GHz analog bandwidth, matching that of EVLA.
The Uniboard-based VLBI correlator concept has no problem matching the ever-widening bandwidth, because it scales linearly with bandwidth.
An increase in the number of stations can be handled by sacrificing bandwidth.
Practical limitations will be based on the benchmark of a single board, as achieved during the current project.
Next-gen FPGAs would allow the board performance to increase by a factor of 2-8 with respect to the current FPGAs one can afford to order.
It is all based on the ability of the station DAS to send the separated frequency bands (channels) to different IP addresses (Mk5C concept). Not a problem if they do not: the channelization can also be done locally, using yet another Uniboard as the DBBC, while receiving the continuous BW data from stations.
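The bandwidth and data rate pairs quoted above are consistent with Nyquist sampling (two real samples per second per Hz of analog bandwidth). A minimal sketch of that arithmetic follows; the function name and parameters are illustrative and not taken from the slides.

```python
def data_rate_gbps(bandwidth_mhz, bits_per_sample, channels=1):
    """Data rate in Gbps, assuming Nyquist sampling of real-valued data."""
    samples_per_second = 2 * bandwidth_mhz * 1e6  # Nyquist: 2 samples per Hz
    return channels * samples_per_second * bits_per_sample / 1e9


mark4_rate = data_rate_gbps(512, 2)            # 512 MHz at 2 bits/sample
dbbc_rate = data_rate_gbps(1000, 2, channels=2)  # 2 x 1 GHz at 2 bits/sample
print(f"{mark4_rate:.3f} Gbps, {dbbc_rate:.3f} Gbps")
```

With these inputs the sketch reproduces the roughly 2 Gbps and 2 x 4 Gbps figures quoted on the slide.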
Station based processing: 146 MAC/sample/station
[Block diagram labels: FIFO, up to 1 s long; Memory controller; Phase generator, 6-8 MAC/sample]
32 stations, 2 pols of 32 MHz BW will require 0.6 TMAC.
PhaseCal extractor will also add 4 MAC/sample.
There are no clear estimates yet of how much logic it will take to re-aggregate the packets and control the FIFOs.
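The 146 MAC/sample and 0.6 TMAC figures can be cross-checked against the per-element costs quoted on the block diagrams in these slides. The sketch below is one combination that reproduces them; taking the lower bound (6) of each 6-8 MAC/sample range, counting the 4 MAC/sample fractional bit correction, and reading "LogN" as a base-2 logarithm are assumptions, not statements from the slides.

```python
import math


def wola_macs_per_sample(n_taps):
    """WOLA cost from the slide formula 8*(LogN+1), with log taken base 2 (assumption)."""
    return 8 * (round(math.log2(n_taps)) + 1)


# Per-element costs in MAC/sample; lower bounds of the quoted ranges (assumption).
element_macs = {
    "delay generator": 6,
    "phase generator": 6,
    "complex multiplier": 2,
    "fractional bit correction": 4,
    "WOLA filter bank, 32K taps": wola_macs_per_sample(32 * 1024),
}

macs_per_sample = sum(element_macs.values())

stations, pols, bandwidth_hz = 32, 2, 32e6
sample_rate = 2 * bandwidth_hz  # Nyquist sampling
total_tmac = macs_per_sample * stations * pols * sample_rate / 1e12

print(macs_per_sample, f"{total_tmac:.2f} TMAC")
```

The PhaseCal extractor (4 MAC/sample) is left out of the sum, in line with the slide describing it as an addition.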
[Block diagram labels: Memory controller; Control flow; Cmplx Mult, 2 MAC/sample; FIFO, up to 1 s long; Delay generator, 6-8 MAC/sample; Phase generator, 6-8 MAC/sample]
Delay/Phase generators should be at least parabolic and at least 48 bits long, maybe even 64 bits for the VSOP-2 case.
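One way to read "at least parabolic" is a second-order polynomial model in time, which in hardware maps naturally onto two chained accumulators: a rate register that is incremented by an acceleration term, and a phase register that is incremented by the rate. The sketch below illustrates that idea with 48 bit wrap-around arithmetic; the register layout and names are hypothetical and not taken from the Uniboard design.

```python
PHASE_BITS = 48
MASK = (1 << PHASE_BITS) - 1


def parabolic_phase(phase0, rate0, accel, n_samples):
    """Return n_samples phases in turns [0, 1) from a second-order accumulator.

    All inputs are unsigned PHASE_BITS-wide integers; the full register
    range corresponds to one turn of phase.
    """
    phase, rate = phase0 & MASK, rate0 & MASK
    phases = []
    for _ in range(n_samples):
        phases.append(phase / (1 << PHASE_BITS))
        phase = (phase + rate) & MASK   # first-order term: phase rate
        rate = (rate + accel) & MASK    # second-order term: phase acceleration
    return phases


# Constant rate of a quarter turn per sample, no acceleration.
linear = parabolic_phase(0, 1 << 46, 0, 5)
# Zero initial rate, acceleration of an eighth of a turn per sample squared.
curved = parabolic_phase(0, 0, 1 << 45, 5)
```

After n samples the phase equals phase0 + n*rate0 + accel*n*(n-1)/2 modulo one turn, so wider registers directly buy finer rate and acceleration resolution over long integration times.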
BASELINE PROCESSING
[Block diagram labels: Delay generator, 6-8 MAC/sample; Fractional bit correction (two blocks, labelled 2 MAC/sample and 4 MAC/sample; validity bit is augmented, data rate increases); Cmplx Mult, 2 MAC/sample; Control flow; WOLA, 8*(log2 N + 1) MAC/sample, 128 MACs for 32K taps]
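For readers unfamiliar with the WOLA (weighted overlap-add) filter bank named in the diagram, the sketch below shows its three steps on a toy scale: weight a long segment with a window, fold it into one block of N samples, and transform that block. The Hann window, the 4 taps per channel, and the direct DFT (a real design would use an FFT, which is where the log2 N term in the cost formula comes from) are all assumptions chosen for brevity.

```python
import cmath
import math


def wola_spectrum(samples, n_chan, taps_per_chan=4):
    """One WOLA output spectrum from n_chan * taps_per_chan input samples."""
    seg_len = n_chan * taps_per_chan
    if len(samples) < seg_len:
        raise ValueError("need at least n_chan * taps_per_chan samples")
    # 1. Weight: apply a window spanning the whole segment (Hann, an assumption).
    window = [0.5 - 0.5 * math.cos(2 * math.pi * (i + 0.5) / seg_len)
              for i in range(seg_len)]
    weighted = [samples[i] * window[i] for i in range(seg_len)]
    # 2. Overlap-add: fold the weighted segment into a single block of n_chan.
    folded = [sum(weighted[k + m * n_chan] for m in range(taps_per_chan))
              for k in range(n_chan)]
    # 3. Transform: direct DFT of the folded block.
    return [sum(folded[k] * cmath.exp(-2j * math.pi * k * c / n_chan)
                for k in range(n_chan))
            for c in range(n_chan)]


# A complex tone centred on channel 3 of an 8 channel filter bank.
tone = [cmath.exp(2j * math.pi * 3 * i / 8) for i in range(32)]
spectrum = wola_spectrum(tone, n_chan=8)
peak_channel = max(range(8), key=lambda c: abs(spectrum[c]))
```

Because the window is four times longer than the transform, each channel gets a sharper frequency response than a plain windowed FFT of the same size.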
2 x 10G network connection
32 stations at 64 MHz BW deliver 4 Gsps.
32 stations at 64 MHz BW at 4 bit sampling will require 16 Gbps, which fits into 2 x 10G links.
Logic to stream the packets from multiple stations to FIFOs.
Estimation of the control flow resources is needed.
32 stations with 2 pols will generate 1536 baselines. That will require 192 MAC/sample/station at 64 Msps, totaling 0.8 TMAC.
With requirements of 0.1 s integration and 2 kHz resolution, the output data rate for 1536 BLs with 16K complex spectral points (32 bit per number) will reach the same 16 Gbps rate as the input.
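The output rate estimate can be reproduced with a few lines of arithmetic. This is a sketch under one assumption: that "32 bit per number" means the real and imaginary parts of each complex spectral point are 32 bits each. The function name is illustrative.

```python
def output_rate_gbps(baselines, spectral_points, bits_per_number, integration_s):
    """Correlator output rate in Gbps for complex spectra dumped every integration."""
    numbers_per_point = 2  # real and imaginary parts (assumption about the format)
    bits_per_dump = baselines * spectral_points * numbers_per_point * bits_per_number
    return bits_per_dump / integration_s / 1e9


rate = output_rate_gbps(baselines=1536, spectral_points=16 * 1024,
                        bits_per_number=32, integration_s=0.1)
print(f"{rate:.1f} Gbps")
```

Under that assumption the result is about 16 Gbps, matching the input rate quoted for 32 stations at 4 bit sampling.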
There are still some resources left for multiple phase centers with reduced data rate, or for pulsar binning.
Memory bandwidth requirements are huge.
Basic processing elements and computing power estimates
Optimistically, 32 stations with 64 MHz analog bandwidth require the computing power of 2 V-6 FPGAs.
You never get 100% of the FPGA hardware resources and 100% of the clock speed at the same time. A sure bet is 70% and 70%, or 5 TMAC out of the 10 TMAC potential of 8 V-6s.
So, realistically, 32 stations at 128 MHz bandwidth (0.5 Gbps at 2 bits/sample) fit in a single board of 8 V-6s (a good call for XILINX V-6).
So a single 8 x V-6 Uniboard could do the job of the current EVN MarkIV correlator: 16 stations @ 1 Gbps.
To get 4 GHz of analog bandwidth it will take 32 boards.
32 stations, 2 pols with 4 GHz bandwidth each, can fit into the EVN MarkIV correlator room.
Keeping the power bill at the same level – GO GREEN! Make the boards with green epoxy.
Funny enough, there will be about the same number of MACs in the system as there currently are in the EVN MarkIV XF correlator.
Another advantage of the Uniboard with respect to the EVN MarkIV correlator is that the output data set is much better structured, so less effort will be spent on output data handling.
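The board-level estimate above combines two simple steps: derating the peak MAC rate by the achievable resource and clock fractions, and scaling the board count linearly with bandwidth. A minimal sketch of both follows; the function names are illustrative, and the 128 MHz per board default is the realistic figure quoted on the slide.

```python
import math


def usable_tmac(peak_tmac, resource_fraction=0.7, clock_fraction=0.7):
    """Derate peak throughput by achievable resource use and clock speed."""
    return peak_tmac * resource_fraction * clock_fraction


def boards_needed(total_bandwidth_mhz, bandwidth_per_board_mhz=128):
    """Boards required when capacity scales linearly with bandwidth."""
    return math.ceil(total_bandwidth_mhz / bandwidth_per_board_mhz)


board_tmac = usable_tmac(10)    # 8 V-6s with a 10 TMAC potential
boards = boards_needed(4000)    # 4 GHz of analog bandwidth
print(f"{board_tmac:.1f} TMAC per board, {boards} boards")
```

The 70% by 70% derating gives just under half of the peak, which is where the "5 TMAC from 10 TMAC" figure comes from.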
Space VLBI (VSOP-2, RadioAstron) imposes requirements for high acceleration terms in the delay and phase corrections, and for higher output rates.
Space VLBI will not be real time; it will require buffering for a day or two, as will other projects such as those requiring multi-pass and/or iterative correlation.
Spacecraft VLBI case
The spectrum of the spacecraft signal is, basically, fractal. What is most interesting to VLBI is concentrated in a few tens of kHz spread over 10-100 MHz or even more, as with simultaneous S and X or X and Ka bands. Exporting the proper narrow band outputs from the filter banks for further analysis on software platforms will do the job.
[Figure: a SW WOLA spectrum of the VEX S/C, phase stopped, 1,600,000 points over 8 MHz BW]
Basic processing elements
[Block diagram labels: Data in, 32 MHz; Delay generator, 6-8 MAC/sample; Mult, 2 MAC/sample; FIFO, up to 1 s long; Memory controller; Phase generator, 6-8 MAC/sample; Fractional bit correction, 4 MAC/sample; WOLA, 8*(log2 N + 1) MAC/sample, 128 MACs for 32K taps; A small fraction of BW is sent outside; Data out, 100 kHz]
(picture credit: MRO-HUT/TKK, JIVE & ESA)
Thanks !
Questions?