
Tomography for
Multi-guidestar Adaptive Optics
An Architecture for Real-Time Hardware Implementation
Donald Gavel, Marc Reinig, and Carlos Cabrera
UCO/Lick Observatory Laboratory for Adaptive Optics
University of California, Santa Cruz
Presentation at the SPIE Optics and Photonics Conference
5903-15
San Diego, CA
June 3, 2005
Outline of talk
• Introduction: The problem of real-time AO tomography
for extremely large telescopes (ELTs):
Real-time calculations grow with D^4
• An alternative approach using a massively parallelized
processor (MPP) architecture
• Performance study results
– Experiment
– Simulation
• Conclusions
Gavel, Tomography for Multi-guidestar AO
SPIE Optics and Photonics, San Diego, Aug. 2005
AO systems are growing in complexity, size, ambition
–MCAO
•2-3 conjugate DMs
•5-7 LGS
•3 TTS
–MOAO
•Up to 20 IFUs each with a DM
•8-9 LGS
•3-5 TTS
Extrapolating the conventional vector-matrix-multiply
AO reconstructor method to ELTs is not feasible
H = actuator to sensor influence function matrix
Least-squares solution: $\hat{a} = (H^T H)^{-1} H^T s$
Minimum variance solution: $\hat{a} = \Sigma H^T (H \Sigma H^T + \Sigma_n)^{-1} s$
General form: $\hat{a} = K s$
• Online calculation requires P x M matrix multiply
– M = 10,000 subaps x 9 LGS
– P = 20,000 acts (MCAO) or 100,000 acts (MOAO)
– fs = 1 kHz frame rate
→ ~10^11 calcs x 1 kHz = ~10^5 Gflops = ~10^5 Keck AO processors!
• Offline calculation requires O(M^3) flops to (pre)compute the inverse: ~10^15 calcs, or ~10^6 sec (12 days) with a 1 Gflop machine
• “Moore’s Law” of computation technology growth: processor capability doubles every 18 months. To get a 10^5 improvement takes 25 years of growth. Let’s say we use 100 x more processors; a 10^3 improvement takes 15 years.
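The round numbers on this slide can be checked with a few lines of arithmetic. The sketch below is only a sanity check that starts from the slide's own order-of-magnitude figures (10^11 calcs per frame, 10^15 calcs offline); the function name `years_for_speedup` is hypothetical.

```python
import math

FRAME_RATE_HZ = 1_000      # fs = 1 kHz
CALCS_PER_FRAME = 1e11     # slide's round figure for the P x M multiply
OFFLINE_CALCS = 1e15       # slide's round figure for the O(M^3) inverse
MACHINE_FLOPS = 1e9        # a 1 Gflop machine

# Sustained rate needed for the online matrix multiply, in Gflops.
online_gflops = CALCS_PER_FRAME * FRAME_RATE_HZ / 1e9

# Wall-clock time for the offline precomputation, in days.
offline_days = OFFLINE_CALCS / MACHINE_FLOPS / 86_400


def years_for_speedup(factor, doubling_months=18.0):
    """Years of Moore's-law growth needed to gain a given speedup factor."""
    return math.log2(factor) * doubling_months / 12.0


print(f"online rate:  {online_gflops:.0e} Gflops")
print(f"offline time: {offline_days:.1f} days")
print(f"10^5 speedup: {years_for_speedup(1e5):.1f} years")
print(f"10^3 speedup: {years_for_speedup(1e3):.1f} years")
```

The doubling count is log2 of the speedup factor, so each factor of 1000 costs about ten doublings, or roughly 15 years at 18 months per doubling.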
Alternative: massively parallel processing
[Figure: block diagram of the processing chain. Blocks: Wavefront Sensors, Image Processors, Tomography Unit, DM Projection, DM Fit, Deformable Mirrors. Input parameters shown: centroid algorithm; r0, guidestar brightness; guidestar position; Cn2 profile; DM conjugate altitude; actuator influence function.]
• Advantages
– Many small processors each do a small part of the task – not taxing to any one processor
– Modularity: each processor has a stand-alone task – possibly specialized to one piece of hardware (WFS or DM)
– Modularity makes the system easier to diagnose – each part has a “recognizable” task
– Modularity makes system design easier – each subsection depends only on parameters associated with it, as opposed to global optimization of a monolithic design
• Requires
– Lots of small processors, with high speed data paths
– Iteration to solution – but what if 1 iteration took only 1 µs? – then we would have time for 1000 iterations per 1 ms data frame cycle!
1. Wavefront sensor processing
• Hartmann sensor: s = Gy
– s = vector of slopes
– y = vector of phases
– G = gradient operator
[Figure: processing-chain block diagram (Wavefront Sensors, Image Processors, Tomography Unit, DM Projection)]
• Problem is overdetermined (more measurements than
unknowns), assuming no branch points
• High speed algorithms are well known
e.g. FFT-based algorithm by Poyneer et al., JOSA-A 2002, is O(n0 log(n0))
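Wiener solution of the wavefront sensor slope-to-phase problem in the Fourier domain
$\tilde{y}_{est}(\kappa) = \frac{-i\kappa_x \tilde{s}_x(\kappa) - i\kappa_y \tilde{s}_y(\kappa)}{|\kappa|^2 + \tilde{C}_{nn}(\kappa)/\tilde{C}_{\phi\phi}(\kappa)} = \frac{-i\kappa \cdot \tilde{s}(\kappa)}{|\kappa|^2 \left[ 1 + \gamma (r_0 |\kappa|)^{5/3} \right]}$
γ = [expression garbled in transcript; surviving terms: 0.0271, exponent 11/3, n, da]
κ = spatial frequency
~ indicates Fourier transform
r0 = Fried’s parameter
n = meas. noise
da = subap diameter
C_φφ = Kolmogorov spectrum
C_nn = noise spectrum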
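A minimal sketch of a Fourier-domain Wiener slope-to-phase filter of this general form follows. It is an illustration under stated assumptions, not the authors' implementation: it uses a naive DFT in place of an FFT, assumes a periodic square grid, takes the sign convention in which a gradient corresponds to multiplication by iκ, and accepts the noise-to-signal ratio C_nn/C_φφ as a caller-supplied function (`noise_to_signal`, a hypothetical name) rather than the Kolmogorov closed form.

```python
import cmath
import math


def dft(vec, inverse=False):
    """Naive O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(vec)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = sum(v * cmath.exp(sign * 2j * math.pi * k * j / n)
                  for j, v in enumerate(vec))
        out.append(acc / n if inverse else acc)
    return out


def dft2(grid, inverse=False):
    """2-D transform: transform every row, then every column."""
    rows = [dft(r, inverse) for r in grid]
    cols = [dft(list(c), inverse) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def wiener_slopes_to_phase(sx, sy, dx, noise_to_signal):
    """Estimate phase from x and y slopes on an n x n periodic grid.

    noise_to_signal(k2) returns the assumed ratio C_nn / C_phiphi at
    squared spatial frequency k2.
    """
    n = len(sx)
    # Angular spatial frequencies in DFT ordering.
    freq = [2 * math.pi * (k if k < n / 2 else k - n) / (n * dx)
            for k in range(n)]
    SX, SY = dft2(sx), dft2(sy)
    Y = [[0j] * n for _ in range(n)]
    for r in range(n):            # row index corresponds to ky
        for c in range(n):        # column index corresponds to kx
            kx, ky = freq[c], freq[r]
            k2 = kx * kx + ky * ky
            if k2 == 0.0:
                continue          # piston cannot be sensed from slopes
            num = -1j * kx * SX[r][c] - 1j * ky * SY[r][c]
            Y[r][c] = num / (k2 + noise_to_signal(k2))
    return [[v.real for v in row] for row in dft2(Y, inverse=True)]
```

With the noise term set to zero the filter reduces to a plain Fourier-domain inverse gradient; a positive noise-to-signal ratio attenuates the spatial frequencies where noise dominates.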
2. Tomographic reconstruction
• Guidestars probe the atmosphere: y = Ax
where
y = vector of all WFS phase measurements
x = value of dOPD at each voxel in turbulent volume
A is a forward propagation operator (entries = 0 or 1)
[Figure: processing-chain block diagram (Wavefront Sensors, Image Processors, Tomography Unit, DM Projection)]
• The problem is underdetermined – there are more unknowns than measurements
x is an N-vector
y is an M-vector
A is M x N
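To make the 0/1 structure of A concrete, here is a toy sketch for a one-dimensional aperture. The geometry is an assumption made purely for illustration: each guidestar's beam is displaced by a whole number of voxels per layer (`gs_offsets`, a hypothetical parametrization), so every ray crosses exactly one voxel in each layer.

```python
def build_forward_operator(n_sub, n_layers, gs_offsets):
    """Return A as a list of rows for a 1-D toy geometry.

    gs_offsets[g] is the lateral shift of guidestar g's beam, in voxels
    per layer index (non-negative integers, an illustrative assumption).
    """
    width = n_sub + max(gs_offsets) * (n_layers - 1)   # voxels per layer
    n_vox = n_layers * width
    rows = []
    for off in gs_offsets:
        for i in range(n_sub):
            row = [0] * n_vox
            for layer in range(n_layers):
                # The ray from subaperture i crosses this voxel in this layer.
                row[layer * width + i + off * layer] = 1
            rows.append(row)
    return rows


def forward_propagate(A, x):
    """y = A x: sum the dOPD of the voxels along each ray."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


A = build_forward_operator(n_sub=4, n_layers=4, gs_offsets=[0, 1, 2])
M, N = len(A), len(A[0])
print(f"A is {M} x {N}")
```

In this toy case there are more voxels than measurements, which is the underdetermined situation described above.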
Inverse tomography algorithms
Preconditioned conjugate gradient:
$v_{k+1} = v_k + \Delta v_k$
$\Delta v_k = f(C e_k)$
$e_k = y - (A P A^T + N) v_k$
$x = P A^T v_\infty$
-or-
Linear feedback:
$v_{k+1} = v_k + \Delta v_k$
$\Delta v_k = g C e_k$
$e_k = y - (A P A^T + N) v_k$
$x = P A^T v_\infty$
A^T is the back propagation operator
C is the “preconditioner”: affects convergence rate only
P, N is the “postconditioner”: determines the type of solution:
P = I, N = 0 → least squares
P = <xx^T>, N = <nn^T> → min variance
g = constant feedback gain
f(.) = 1st order regression (and other hidden details of the CG algorithm)
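The linear-feedback variant can be sketched in a few lines. This is a minimal illustration, assuming C = I, a diagonal P (`p_diag`) and N equal to a scalar noise variance times the identity; those simplifications and all function names are assumptions, not part of the slides. A fixed-gain iteration of this kind converges when the gain is below 2 divided by the largest eigenvalue of A P A^T + N.

```python
def matvec(A, x):
    """Forward propagation: y = A x, with A stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def rmatvec(A, v):
    """Back propagation: x = A^T v."""
    return [sum(A[i][j] * v[i] for i in range(len(A)))
            for j in range(len(A[0]))]


def linear_feedback(A, y, p_diag, noise_var, gain, n_iter):
    """Iterate v <- v + g * e with e = y - (A P A^T + N) v; return x = P A^T v."""
    v = [0.0] * len(A)
    for _ in range(n_iter):
        x = [p * b for p, b in zip(p_diag, rmatvec(A, v))]   # P A^T v
        model = matvec(A, x)                                  # A P A^T v
        e = [yi - mi - noise_var * vi
             for yi, mi, vi in zip(y, model, v)]
        v = [vi + gain * ei for vi, ei in zip(v, e)]          # C = I
    return [p * b for p, b in zip(p_diag, rmatvec(A, v))]


# Two measurements, three unknowns: an underdetermined toy problem.
A = [[1, 1, 0],
     [0, 1, 1]]
y = [2.0, 3.0]
x_ls = linear_feedback(A, y, p_diag=[1.0, 1.0, 1.0],
                       noise_var=0.0, gain=0.5, n_iter=60)
print(x_ls)
print(matvec(A, x_ls))
```

With P = I and N = 0 the result is the minimum-norm least-squares solution, which reproduces the measurements exactly in this toy problem.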
Compute count for inverse tomography
• A and A^T are massively parallelizable over transverse dimension, guidestars
• A^T is massively parallelizable over layers

Operation             CPU     MPPU
Forward Propagation   L x M   L
Subtract from data    M       1
Weight by Cn2         N       1
Back Propagation      L x M   1
(per iteration)

• Optional Fourier domain preconditioning and postconditioning:
[Figure: iteration loop. Block labels: Aperture WFS data, difference node (+/-), Precondition, FT, X, FT-1, Backpropagate, Postcondition, Volumetric dOPD estimates, FT, X, FT-1, Forwardpropagate, Aperture.]

Operation           CPU        MPPU
Fourier Transform   M log(M)   log(M)
(per iteration)
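The two tables can be turned into a rough per-iteration tally. The sketch below simply adds up the table entries; reading the CPU column as serial operation counts and the MPPU column as parallel time steps is an interpretation, and the base-2 logarithm, the single transform counted, and the example voxel count are all assumptions.

```python
import math


def steps_per_iteration(n_layers, n_meas, n_voxels, fourier=False):
    """Return (serial CPU operations, parallel MPP steps) for one iteration."""
    cpu = (n_layers * n_meas      # forward propagation
           + n_meas               # subtract from data
           + n_voxels             # weight by Cn2
           + n_layers * n_meas)   # back propagation
    mpp = n_layers + 1 + 1 + 1    # same four rows on the parallel machine
    if fourier:
        # One Fourier transform counted; base-2 log is an assumption.
        cpu += n_meas * math.log2(n_meas)
        mpp += math.log2(n_meas)
    return cpu, mpp


# Example: 7 layers, 5 guidestars x 7800 subapertures; the voxel count
# (7 x 7800) is a hypothetical value chosen only for illustration.
cpu, mpp = steps_per_iteration(n_layers=7, n_meas=5 * 7800, n_voxels=7 * 7800)
print(f"CPU operations: {cpu}, MPP steps: {mpp}, ratio: {cpu / mpp:.0f}")
```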
Prototype implementation on an FPGA
•An Array of Voxel Processors
[Figure: array of voxel processors linked by Forward Propagation Paths and Back Propagation Paths (GS1 ... GSn), with Global System State Information.]
•A Single Voxel Processor
[Figure: single voxel processor. Labels: Voxel Local Registers (GS1 Error, GS2 Error, GS3 Error, ..., GSN Error, Current Estimated Value); Cumulative Value GS1 ... GSn; Cn2; Control Logic; ALU, (Word Size + NGS) wide; Forward Propagation Path; Back Propagation Path; Global System State Information.]
Note: Because the forward propagation and back propagation paths are parallel, but are used at different times, they will actually be a single bus in the physical implementation.
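The voxel-processor idea can be mimicked in software. The sketch below is a sequential emulation, not the FPGA design: register names follow the figure labels, but the update rule (estimate += gain x Cn2 weight x sum of guidestar errors), the integer-shift ray geometry, and the names `VoxelProcessor`, `make_rays` and `iterate` are assumptions. It corresponds to the fixed-gain iteration with N = 0 written per voxel.

```python
class VoxelProcessor:
    """One voxel: local registers plus a small update rule (illustrative)."""

    def __init__(self, cn2_weight, n_gs):
        self.cn2 = cn2_weight            # relative Cn2 weight for this voxel
        self.gs_error = [0.0] * n_gs     # GS1 ... GSn error registers
        self.estimate = 0.0              # current estimated value

    def calculate_new_estimate(self, gain):
        self.estimate += gain * self.cn2 * sum(self.gs_error)


def make_rays(n_sub, n_layers, gs_offsets):
    """Each ray is (guidestar index, list of (layer, column) voxels crossed)."""
    return [(g, [(layer, i + off * layer) for layer in range(n_layers)])
            for g, off in enumerate(gs_offsets) for i in range(n_sub)]


def iterate(voxels, rays, y, gain):
    """One forward-propagate / compare / back-propagate / update cycle."""
    for vp in voxels.values():
        vp.gs_error = [0.0] * len(vp.gs_error)
    sq_sum = 0.0
    for (g, path), measured in zip(rays, y):
        cumulative = sum(voxels[key].estimate for key in path)  # forward
        err = measured - cumulative                             # compare
        sq_sum += err * err
        for key in path:                                        # back
            voxels[key].gs_error[g] = err
    for vp in voxels.values():
        vp.calculate_new_estimate(gain)
    return sq_sum ** 0.5     # residual norm before this update


n_sub, n_layers, offsets = 2, 2, [0, 1]
rays = make_rays(n_sub, n_layers, offsets)
keys = sorted({key for _, path in rays for key in path})
voxels = {key: VoxelProcessor(1.0, len(offsets)) for key in keys}
truth = {key: float(idx + 1) for idx, key in enumerate(keys)}
y = [sum(truth[key] for key in path) for _, path in rays]

gain = 1.0 / (n_layers * len(offsets))   # conservative stable gain
for _ in range(300):
    residual = iterate(voxels, rays, y, gain)
print(f"residual after 300 iterations: {residual:.2e}")
```

In hardware every voxel would perform its step at the same time; the Python loops only emulate that ordering. Because the problem is underdetermined, the estimates reproduce the measurements without necessarily matching `truth`.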
Preliminary Results for MPP Timing and Resource
Allocation on an FPGA
Timing
• Basic clock speed supported: 50 MHz (Xilinx Virtex-4)
• Total number of states per iteration: 36

Element                  Current Value   Derived Formula   Comment
Load Measured Value      12              3n0               Done once per msec
Forward Propagate        27              NGS(2L + 1)
Compare                  1               1
Back Propagate           1               1
Calculate New Estimate   7               3NGS + 4

Parameters (current value):
L = Layers (4)
NGS = Guide Stars (3)
n0 = Sub Apertures (4)
A single iteration takes T = 4NGS + 2LNGS + 6 clock cycles.
Currently this is 36 50 MHz clocks = 720 nsec per iteration.
Note: the algorithm parallelizes over guidestars. For reasons of simplicity and debugging of this first implementation we have not done this yet.
Chip count
• This implementation: Virtex-4 chip is 20% utilized (2996 of 15360 available logic cells employed)
• Scaling to a system with 10,000 subapertures (such as for the 30 meter telescope) would require 500 of these chips
• Standard packing density is ~50 chips/board; this equates to 10 circuit boards
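The timing and chip-count figures can be retraced as follows. The sketch uses the tabulated per-element state counts; note that evaluating the closed-form expression T = 4NGS + 2LNGS + 6 at NGS = 3, L = 4 gives 42 rather than the quoted 36, so the tabulated values are used for the timing. The linear scaling of logic usage with subaperture count is an assumption made here to retrace the 500-chip estimate.

```python
CLOCK_HZ = 50_000_000

# Per-iteration states from the "Current Value" column (load excluded,
# since it is done once per frame rather than once per iteration).
states = {
    "forward_propagate": 27,
    "compare": 1,
    "back_propagate": 1,
    "calculate_new_estimate": 7,
}
states_per_iteration = sum(states.values())
iteration_time_s = states_per_iteration / CLOCK_HZ
iterations_per_frame = int(1e-3 / iteration_time_s)   # 1 ms frame


def closed_form_cycles(n_gs, n_layers):
    """The slide's formula T = 4*NGS + 2*L*NGS + 6."""
    return 4 * n_gs + 2 * n_layers * n_gs + 6


# Chip count, assuming logic usage scales linearly with subapertures.
utilization_percent = 20          # 2996 of 15360 logic cells, rounded
subaps_now = 4
subaps_per_chip = subaps_now * 100 // utilization_percent
chips = 10_000 // subaps_per_chip
boards = chips // 50

print(f"{states_per_iteration} states = {iteration_time_s * 1e9:.0f} ns per iteration")
print(f"{iterations_per_frame} iterations fit in a 1 ms frame")
print(f"closed form at NGS=3, L=4: {closed_form_cycles(3, 4)} cycles")
print(f"{chips} chips on {boards} boards")
```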
Simulation: extrapolation to the full ELT spatial scale to
estimate convergence rates
• 7800 subapertures per guidestar
• 5 guidestars
• 7 layer atmosphere
• Fixed feedback gain iteration
• A and A^T implemented in the spatial domain
• Initial atmospheric realizations were random with a Kolmogorov spatial
power spectrum.
Convergence to 3 digits accuracy in 1 ms
3. Projection and fitting to DMs
[Figure: processing-chain block diagram (Wavefront Sensors, Image Processors, Tomography Unit, DM Projection, DM Fit, Deformable Mirrors)]
• MCAO
– Requires filtering and weighted integral over layers for each DM
– Filters and weights chosen to minimize “Generalized
Anisoplanatism” (Tokovinin et al., JOSA-A 2002)
– Massively parallelizable over the Fourier domain and over DMs – L steps to integrate
• MOAO
– Requires integral over layers for each science direction (DM)
– Massively parallelizable over Spatial or Fourier domain and over
DMs – L steps to integrate
• DM fitting
– Deconvolution – massively parallelizable given either spatially
invariant or spatially localized actuator influence function
– PCG suppresses aperture effects in 2-3 iterations
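For the MOAO case, the integral over layers along one science direction can be sketched as a sum of shifted layer estimates. This is a toy version only: it assumes a one-dimensional aperture and an integer voxel shift per layer (`offset`), and the function name `project_to_direction` is hypothetical. The MCAO filtering and weighting step is not shown.

```python
def project_to_direction(volume, n_sub, offset):
    """Sum dOPD estimates over layers along one science direction.

    volume[layer][column] holds the voxel estimates; offset is the lateral
    shift of the line of sight in voxels per layer index (an assumption).
    """
    wavefront = [0.0] * n_sub
    for layer, row in enumerate(volume):     # L sequential accumulation steps
        shift = offset * layer
        for i in range(n_sub):               # parallel across the aperture
            wavefront[i] += row[i + shift]
    return wavefront


volume = [[1.0, 2.0, 3.0],
          [10.0, 20.0, 30.0]]
print(project_to_direction(volume, n_sub=2, offset=0))
print(project_to_direction(volume, n_sub=2, offset=1))
```

The outer loop over layers is the part that takes L steps; the inner loop is independent for each aperture position and each DM, which is where the parallelism comes from.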
Conclusions
• The architecture: massive parallel computation
[Figure: processing-chain block diagram (Wavefront Sensors, Image Processors, Tomography Unit, DM Projection, DM Fit, Deformable Mirrors), with the Tomography Unit expanded into the iteration loop: Precondition, Backpropagate, Postcondition, Volumetric dOPD estimates, Forwardpropagate, with FT / X / FT-1 stages.]
• Conceptually simple
• Tested with a commercial FPGA; evaluated with simulations – it’s feasible with today’s technology
• Under study: FD-PCG – extra computation per iteration traded off against faster convergence rate