Super-Resolution Reconstruction of Images - Static and Dynamic Paradigms
February 2002
Michael Elad*
The Scientific Computing and Computational Mathematics Program - Stanford
* Joint work with Prof. Arie Feuer - The Technion, Haifa, Israel; Prof. Yacov Hel-Or - IDC, Herzelia, Israel; and Tamir Sagi - Zapex, Israel.
- 1 -

Static Versus Dynamic Super-Resolution
Definitions and Activity Map
- 2 -

Basic Super-Resolution Idea
Given: a set of degraded (warped, blurred, decimated, noisy) images.
Required: fusion of the measurements into a higher-resolution image (or images).
- 3 -

Static Super-Resolution (SSR)
A set of low-resolution measurements Y_1, Y_2, ..., Y_N is fed to a static super-resolution algorithm, which produces a single high-resolution reconstructed image:
\hat{X} = f(Y_1, Y_2, Y_3, \ldots, Y_N)
- 4 -

Dynamic Super-Resolution (DSR)
At every time t, the low-resolution measurements gathered so far are fused into a high-resolution image, producing a high-resolution sequence:
\hat{X}_t = f(Y_t, Y_{t-1}, \ldots)
- 5 -

Other Work in This Field

People                                      Place          Years
Peleg, Irani, Werman, Keren, Schweitzer     HUJI           1987-1994
Kim, Bose, Valenzuela                       Penn. State    1990-1993
Patti, Tekalp, Sezan, Ozkan, Altunbasak     Rochester      1992-1998
Morris, Cheeseman, Smelyanskiy, Maluf       NASA-AMES      1992-2002
Ur & Gross                                  TAU            1992-1993
Elad, Feuer, Sagi, Hel-Or                   Technion       1995-2001
Schultz, Stevenson, Borman                  Notre-Dame     1995-1999
Shekarforush, Berthod, Zerubia, Werman      INRIA-France   1995-1999
Katsaggelos, Tom, Galatsanos                Northwestern   1995-1999
Shah, Zakhor                                Berkeley       1996-1999
Nguyen, Milanfar, Golub                     Stanford       1998-2001
Baker, Kanade                               CMU            1999-2001

Highlighted methods relate also to the DSR paradigm; all others deal with SSR.
* This table probably does injustice to someone - no harm meant.
- 6 -

Our Work in This Field
M. Elad and A. Feuer, "Restoration of Single Super-Resolution Image From Several Blurred, Noisy and Down-Sampled Measured Images", IEEE Trans. on Image Processing, Vol. 6, No. 12, pp. 1646-1658, December 1997.
M. Elad and A. Feuer, "Super-Resolution Restoration of Continuous Image Sequence - Adaptive Filtering Approach", IEEE Trans. on Image Processing, Vol. 8, No. 3, pp. 387-395, March 1999.
M. Elad and A. Feuer, "Super-Resolution Reconstruction of Continuous Image Sequence", IEEE Trans. on Pattern Analysis and Machine Intelligence (PAMI), Vol. 21, No. 9, pp. 817-834, September 1999.
M. Elad and Y. Hel-Or, "A Fast Super-Resolution Reconstruction Algorithm for Pure Translational Motion and Common Space-Invariant Blur", accepted to IEEE Trans. on Image Processing, March 2001.
T. Sagi, A. Feuer and M. Elad, "The Periodic Step Gradient Descent Algorithm - General Analysis and Application to the Super-Resolution Reconstruction Problem", EUSIPCO 1998.
All can be found at http://sccm.stanford.edu/~elad
- 7 -

Super-Resolution Basics
Intuition and Relation to Sampling Theorems
- 8 -

Simple Example
For a given bandlimited image, the Nyquist sampling theorem states that if the uniform sampling grid is fine enough (spacing D), perfect reconstruction is possible.
- 9 -

Simple Example
Due to our limited camera resolution, we sample using an insufficient grid of spacing 2D.
- 10 -

Simple Example
However, we are allowed to take a second picture: shifting the camera 'slightly to the right', we obtain a second set of samples on the same 2D grid.
- 11 -

Simple Example
Similarly, by shifting down we get a third image.
- 12 -

Simple Example
And finally, by shifting down and to the right we get the fourth image.
- 13 -

Simple Example - Conclusion
It is easy to see that by interlacing the four images we obtain the desired sampling density, and thus perfect reconstruction is guaranteed. This is super-resolution in its simplest form.
- 14 -
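In code, this noiseless special case is just a pixel interleave. A minimal numpy sketch (the 2:1 ratio, the function name and the offset convention are illustrative choices, not from the slides):

```python
import numpy as np

def interlace4(y00, y10, y01, y11):
    """Fuse four 2:1-decimated images of the same scene, each sampled on the
    coarse 2D grid but offset by one fine-grid pixel, into the fine image.
    yIJ holds the samples taken at fine-grid offset (I, J)."""
    m, n = y00.shape
    x = np.zeros((2 * m, 2 * n), dtype=y00.dtype)
    x[0::2, 0::2] = y00   # original camera position
    x[1::2, 0::2] = y10   # shifted down by D
    x[0::2, 1::2] = y01   # shifted right by D
    x[1::2, 1::2] = y11   # shifted down and to the right
    return x

# Round-trip check: decimating a fine image at the four offsets and
# interlacing the results recovers it exactly.
x = np.random.rand(64, 64)
assert np.array_equal(
    interlace4(x[0::2, 0::2], x[1::2, 0::2], x[0::2, 1::2], x[1::2, 1::2]), x)
```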
Uncontrolled Displacements
In the previous example we counted on exact movement of the camera by D in each direction. What if the camera displacement is uncontrolled?
- 15 -

Uncontrolled Displacements
It turns out that there is a sampling theorem due to Yen (1956) and Papoulis (1977) covering this case, guaranteeing perfect reconstruction for periodic non-uniform sampling if the sampling density is high enough (one sample per each D-by-D square).
- 16 -

Uncontrolled Rotation/Scale/Disp.
In the previous examples we restricted the camera to move horizontally/vertically, parallel to the photographed object. What if the camera rotates? Gets closer to the object (zoom)?
- 17 -

Uncontrolled Rotation/Scale/Disp.
There is no sampling theorem covering this case.
- 18 -

Further Complications
1. Sampling is not a point operation - there is a blur.
2. Motion may include perspective warp, local motion, etc.
3. Samples may be noisy - any reconstruction process must take this into account.
- 19 -

Static Super-Resolution
The creation of a single improved image from the finite measured sequence of images.
- 20 -

SSR - The Model
The high-resolution image X undergoes a geometric warp F_k (with F_1 = I), a blur H_k and a decimation D_k, and additive noise V_k is added, yielding the low-resolution images Y_k:
Y_k = D_k H_k F_k X + V_k, \qquad V_k \sim \mathcal{N}(0, W_k^{-1}), \qquad k = 1, \ldots, N
- 21 -

The Warp as a Linear Operation
Writing Z = FX: for every point in X we find the matching point in Z, and set F[j,i] = 1 to map x_i onto z_j. Each row of F thus contains a single non-zero entry (a '1').
- 22 -

Model Assumptions
We assume that the images Y_k and the operators H_k, D_k, F_k and W_k are known to us, and we use them for the recovery of X:
Y_k - the measured images (noisy, blurry, down-sampled, ...),
H_k - the blur, which can be extracted from the camera characteristics,
D_k - the decimation, dictated by the required resolution ratio,
F_k - the warp, which can be estimated using motion estimation,
W_k - the (inverse) noise covariance, which can be extracted from the camera characteristics.
- 23 -

The Model as One Equation
\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_N \end{bmatrix} =
\begin{bmatrix} D_1 H_1 F_1 \\ D_2 H_2 F_2 \\ \vdots \\ D_N H_N F_N \end{bmatrix} X +
\begin{bmatrix} V_1 \\ V_2 \\ \vdots \\ V_N \end{bmatrix},
\qquad
\begin{bmatrix} V_1 \\ \vdots \\ V_N \end{bmatrix} \sim
\mathcal{N}\big(0, \mathrm{diag}\{W_1^{-1}, \ldots, W_N^{-1}\}\big)
- 24 -

A Thumb Rule on Desired Resolution
In the noiseless case we have the stacked system
\begin{bmatrix} Y_1 \\ \vdots \\ Y_N \end{bmatrix} = \begin{bmatrix} D_1 H_1 F_1 \\ \vdots \\ D_N H_N F_N \end{bmatrix} X.
Clearly, this linear system of equations should have at least as many equations as unknowns in order to make a unique least-squares solution possible. Example: assume we have N images of M-by-M pixels each, and we would like to produce an image X of size L-by-L. The N M^2 equations must cover the L^2 unknowns, hence
L \leq \sqrt{N} \cdot M
- 25 -

The Maximum-Likelihood Approach
Given the same chain of geometric warp (F_1 = I, ..., F_N), blur, decimation and additive noise: which X would be such that, when fed to the above system, it yields a set \{Y_k\} closest to the measured images?
- 26 -

SSR - ML Reconstruction (LS)
Minimize:
\epsilon^2_{ML}(X) = \sum_{k=1}^{N} \| Y_k - D_k H_k F_k X \|^2_{W_k}
Thus, require \partial \epsilon^2_{ML} / \partial X = 0, which yields the normal equations R\hat{X} = P, with
R = \sum_{k=1}^{N} F_k^T H_k^T D_k^T W_k D_k H_k F_k, \qquad P = \sum_{k=1}^{N} F_k^T H_k^T D_k^T W_k Y_k
- 27 -

SSR - MAP Reconstruction
Add a term which penalizes the solution image quality:
\epsilon^2_{MAP}(X) = \sum_{k=1}^{N} \| Y_k - D_k H_k F_k X \|^2_{W_k} + \lambda A(X)
Possible prior functions - examples:
1. A(X) = X^T S^T W S X - simple, and spatially adaptive through the choice of the weights W (S is a derivative operator such as the Laplacian);
2. A(X) = \rho\{S X\} - an M-estimator (robust functions).
Note: a convex prior guarantees a convex programming problem.
- 28 -

Iterative Reconstruction
Assuming the prior A(X) = X^T S^T W S X is used:
R = \sum_{k=1}^{N} F_k^T H_k^T D_k^T W_k D_k H_k F_k + \lambda S^T W S, \qquad P = \sum_{k=1}^{N} F_k^T H_k^T D_k^T W_k Y_k, \qquad R\hat{X} = P
For \hat{X} of size 1000-by-1000, R is a 10^6-by-10^6 matrix - but a sparse one.
OPTION: use the SD (steepest descent) algorithm; 10-15 iterations are enough:
\hat{X}_{j+1} = \hat{X}_j + \mu \Big[ \sum_{k=1}^{N} F_k^T H_k^T D_k^T W_k (Y_k - D_k H_k F_k \hat{X}_j) - \lambda S^T W S \hat{X}_j \Big]
- 29 -
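This SD iteration can be run entirely with image-domain operators. Below is a minimal matrix-free numpy sketch under simplifying assumptions that are mine, not the slides': W_k = I, a symmetric PSF (so H^T = H), a symmetric Laplacian stencil for S (so S^T = S), and pure cyclic integer translations (so F_k^T is the inverse shift). All names and parameter values are illustrative.

```python
import numpy as np

LAP = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], float)  # S: Laplacian stencil

def conv2_circ(x, k):
    """Circular 2D convolution via the FFT (a block-circulant operator)."""
    kp = np.zeros_like(x)
    kp[:k.shape[0], :k.shape[1]] = k
    kp = np.roll(kp, (-(k.shape[0] // 2), -(k.shape[1] // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(x) * np.fft.fft2(kp)))

def sd_ssr(Ys, warps, iwarps, psf, r, mu=0.2, lam=0.05, iters=15):
    """X_{j+1} = X_j + mu*[sum_k F_k^T H^T D^T (Y_k - D H F_k X_j) - lam*S^T S X_j]."""
    X = np.zeros((Ys[0].shape[0] * r, Ys[0].shape[1] * r))
    for _ in range(iters):
        g = np.zeros_like(X)
        for Y, F, Ft in zip(Ys, warps, iwarps):
            sim = conv2_circ(F(X), psf)[::r, ::r]      # D H F_k X_j
            up = np.zeros_like(X)
            up[::r, ::r] = Y - sim                     # D^T (simulated error)
            g += Ft(conv2_circ(up, psf))               # F_k^T H^T (...)
        X += mu * (g - lam * conv2_circ(conv2_circ(X, LAP), LAP))
    return X

# Synthetic check: four 2:1-decimated, blurred, cyclically shifted frames.
psf = np.outer([1, 2, 1], [1, 2, 1]) / 16.0
shifts = [(0, 0), (0, 1), (1, 0), (1, 1)]
warps = [lambda X, d=d: np.roll(X, d, axis=(0, 1)) for d in shifts]
iwarps = [lambda X, d=d: np.roll(X, (-d[0], -d[1]), axis=(0, 1)) for d in shifts]
truth = np.random.rand(64, 64)
Ys = [conv2_circ(F(truth), psf)[::2, ::2] for F in warps]
X_hat = sd_ssr(Ys, warps, iwarps, psf, r=2)
```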
Image-Based Processing
The SD* iteration
\hat{X}_{j+1} = \hat{X}_j + \mu \Big[ \sum_{k=1}^{N} F_k^T H_k^T D_k^T W_k (Y_k - D_k H_k F_k \hat{X}_j) - \lambda S^T W S \hat{X}_j \Big]
is built from a simulated error (Y_k - D_k H_k F_k \hat{X}_j), its back-projection through F_k^T H_k^T D_k^T W_k, and the weighted-edges prior term S^T W S \hat{X}_j. All of the above operations can be interpreted as operations performed on images, AND THUS there is no actual need to use the matrix-vector notation shown here. This notation is, however, important for the development of the algorithm.
* The same is true for the Conjugate Gradient algorithm.
- 30 -

SSR - Simpler Problems
For the model Y_k = D_k H_k F_k X + V_k (with F_1 = I), the closed-form estimate is
\hat{X} = \Big[ \sum_{k=1}^{N} F_k^T H_k^T D_k^T W_k D_k H_k F_k + \lambda S^T W S \Big]^{-1} \sum_{k=1}^{N} F_k^T H_k^T D_k^T W_k Y_k
- 31 -

SSR - Simpler Problems
Using A(X) = X^T S^T W S X, special cases of the model collapse to familiar problems:
Single-image de-noising: Y = X + V gives \hat{X} = [I + \lambda S^T W S]^{-1} Y.
Single-image restoration: Y = H X + V gives \hat{X} = [H^T H + \lambda S^T W S]^{-1} H^T Y.
Single-image scaling: Y = D X + V gives \hat{X} = [D^T D + \lambda S^T W S]^{-1} D^T Y.
Motion-compensated average: Y_k = F_k X + V_k, k = 1, ..., N, gives \hat{X} = [\sum_k F_k^T F_k + \lambda S^T W S]^{-1} \sum_k F_k^T Y_k.
- 32 -

Example 1
Synthetic case: from a single image, create 9 low-resolution images at a 3:1 resolution ratio. [Figure: the generation scheme.]
- 33 -

Example 1
Synthetic case: 9 images, no blur, 1:3 ratio. [Figure: one of the low-resolution images, the higher-resolution original, and the reconstructed result.]
- 34 -

Example 2
16 images, ratio 1:2, PSF assumed to be Gaussian with sigma = 2.5. [Figure: a region taken from one of the given images versus the same region taken from the reconstructed result.]
- 35 -

Dynamic Super-Resolution
Low-quality movie in - high-quality movie out.
- 36 -

Dynamic Super-Resolution (DSR)
At every time t, the low-resolution measurements are fused into a high-resolution reconstructed image:
\hat{X}_t = f(Y_t, Y_{t-1}, \ldots)
- 37 -

Modeling the Problem
Y(t-k) = M(t,k) X(t) + N(t,k)
How should the operator M(t,k) and the noise N(t,k) be modeled?
- 38 -

DSR - Proposed Model
Y(t-k) = D H \tilde{F}(t,k) X(t) + \tilde{N}(t,k), \qquad \tilde{N}(t,k) \sim \mathcal{N}\big(0, [\gamma^k W]^{-1}\big), \quad 0 \leq \gamma \leq 1,
where
\tilde{F}(t,k) = F(t-k+1) \cdots F(t-1) F(t), \qquad \tilde{F}(t,0) = I.
- 39 -

DSR - From Model to ML
The DSR problem is regarded as a long sequence of SSR problems. Using the ML approach on the above model, the function
\epsilon^2(X(t), t) = \sum_{k=0}^{t-1} \gamma^k \| Y(t-k) - D H \tilde{F}(t,k) X(t) \|^2_W
should be minimized per each t.
- 40 -

Solving the ML
Minimizing \epsilon^2(X(t), t) amounts to solving the linear set of equations L(t)\hat{X}(t) = Z(t), where
L(t) = \sum_{k=0}^{t-1} \gamma^k [D H \tilde{F}(t,k)]^T W [D H \tilde{F}(t,k)],
Z(t) = \sum_{k=0}^{t-1} \gamma^k [D H \tilde{F}(t,k)]^T W Y(t-k).
Note that (apart from the need to solve the linear set) one has to compute L and Z per each t all over again, and the summation lengths grow linearly in t.
- 41 -

Recursive Representation
Using \tilde{F}(t,k) = F(t-k+1) \cdots F(t-1) F(t) = \tilde{F}(t-1, k-1) F(t), the above sums simplify to
L(t) = \gamma F^T(t) L(t-1) F(t) + H^T D^T W D H,
Z(t) = \gamma F^T(t) Z(t-1) + H^T D^T W Y(t).
- 42 -
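A toy-dimension sketch of this recursion, with 1D "frames" and explicit dense matrices so that the update can be read off directly; in practice these operators would be applied as image operations, never formed as matrices. The tiny ridge initialization of L and all toy parameters are my illustrative choices:

```python
import numpy as np

def dsr_recursive_ml(Ys, Fs, D, H, W, gamma=0.9):
    """L(t) = gamma*F(t)^T L(t-1) F(t) + (DH)^T W (DH),
       Z(t) = gamma*F(t)^T Z(t-1) + (DH)^T W Y(t);
    outputs X_hat(t) = L(t)^{-1} Z(t) per frame."""
    DH = D @ H
    n = H.shape[1]
    L = 1e-6 * np.eye(n)       # tiny ridge so the early inversions exist
    Z = np.zeros(n)
    outs = []
    for Y, F in zip(Ys, Fs):
        L = gamma * F.T @ L @ F + DH.T @ W @ DH
        Z = gamma * F.T @ Z + DH.T @ W @ Y
        outs.append(np.linalg.solve(L, Z))
    return outs

# Toy data: a scene drifting by one pixel per frame (F = cyclic shift),
# 2-tap blur, 2:1 decimation; X(t-1) = F X(t), so X(t) = F^T X(t-1).
n = 8
P = np.roll(np.eye(n), 1, axis=1)
H = 0.5 * (np.eye(n) + P)
D = np.eye(n)[::2]
W = np.eye(n // 2)
x, Ys = np.sin(np.arange(n)), []
for t in range(40):
    Ys.append(D @ H @ x + 0.01 * np.random.randn(n // 2))
    x = P.T @ x
x_hat = dsr_recursive_ml(Ys, [P] * 40, D, H, W)[-1]  # estimate of the last frame
```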
Alternative Approach
Instead of continuing with the previous model and its recursive representation, we adopt a different point of view, based on state-space modeling of our problem. This new model leads to a better understanding of the algorithmic steps required for an efficient solution. The eventual expressions obtained with the alternative method are exactly the same as the ones shown previously.
- 43 -

DSR - The Model (1)
The System's Equation:
X(t) = G(t) X(t-1) + V(t)
X(t) - the high-resolution image, G(t) - the warp operation, V(t) - the sequence innovation, assumed \sim \mathcal{N}(0, Q^{-1}(t)).
- 44 -

DSR - The Model (2)
The Measurements Equation:
\begin{bmatrix} Y(t) \\ 0 \end{bmatrix} = \begin{bmatrix} D H(t) \\ S \end{bmatrix} X(t) + \begin{bmatrix} N(t) \\ U(t) \end{bmatrix}
Y(t) - the measured image, H(t) - the blur, D - the decimation, N(t) - additive noise \sim \mathcal{N}(0, W^{-1}(t)), S - a Laplacian, U(t) - the non-smoothness term \sim \mathcal{N}(0, R^{-1}(t)).
- 45 -

DSR - The Model (3)
X(t) = G(t) X(t-1) + V(t) \quad \& \quad \begin{bmatrix} Y(t) \\ 0 \end{bmatrix} = \begin{bmatrix} D H(t) \\ S \end{bmatrix} X(t) + \begin{bmatrix} N(t) \\ U(t) \end{bmatrix}
These two equations form a spatio-temporal prior, forcing spatial smoothness and temporal, motion-compensated smoothness.
- 46 -

DSR - Reconstruction by KF
The model is given in state-space form:
X(t) = G(t) X(t-1) + V(t), \qquad Y_A(t) = H_A(t) X(t) + N_A(t),
where V(t) \sim \mathcal{N}(0, Q^{-1}(t)), N_A(t) \sim \mathcal{N}(0, W_A^{-1}(t)), and the subscript A denotes the augmented measurement quantities above (so that W_A(t) = \mathrm{diag}\{W(t), R(t)\}).
In order to estimate X(t) over time, we need to apply the Kalman Filter (KF). The basic idea:
1. Since all the inputs are Gaussian, so is X(t).
2. We know all about X(t) if its first two moments are known: X(t) \sim \mathcal{N}(\hat{X}(t), \hat{P}(t)).
- 47 -

KF: Mean-Covariance Pair
1. We start by knowing the pair (\hat{X}(t-1), \hat{P}(t-1)).
2. Based on X(t) = G(t) X(t-1) + V(t) we get the Prediction Equations:
\tilde{P}(t) = G(t) \hat{P}(t-1) G^T(t) + Q^{-1}(t), \qquad \tilde{X}(t) = G(t) \hat{X}(t-1).
3. Based on Y_A(t) = H_A(t) X(t) + N_A(t) we get the Update Equations:
\hat{P}(t) = \big[ \tilde{P}^{-1}(t) + H_A^T(t) W_A(t) H_A(t) \big]^{-1},
\hat{X}(t) = \hat{P}(t) \big[ \tilde{P}^{-1}(t) \tilde{X}(t) + H_A^T(t) W_A(t) Y_A(t) \big].
- 48 -

KF: Information Pair
The information pair is defined by (\hat{Z}(t), \hat{L}(t)) = (\hat{P}^{-1}(t)\hat{X}(t), \hat{P}^{-1}(t)). The recursive equations become:
Interpolation:
\tilde{L}(t) = \big[ G(t) \hat{L}^{-1}(t-1) G^T(t) + Q^{-1}(t) \big]^{-1}, \qquad \tilde{Z}(t) = \tilde{L}(t) G(t) \hat{L}^{-1}(t-1) \hat{Z}(t-1).
Update:
\hat{L}(t) = \tilde{L}(t) + H_A^T(t) W_A(t) H_A(t), \qquad \hat{Z}(t) = \tilde{Z}(t) + H_A^T(t) W_A(t) Y_A(t).
Presumably, there is nothing to gain in using the information pair over the mean-covariance pair.
- 49 -

The Information Pair Is Better!! (for our application)
1. Experimental results indicate that the information matrix \hat{P}^{-1} is far sparser than the covariance matrix \hat{P}.
2. We intend to avoid the use of Q(t). It is therefore natural to simplify the equation
\tilde{L}(t) = \big[ G(t) \hat{L}^{-1}(t-1) G^T(t) + Q^{-1}(t) \big]^{-1}
while approximating Q(t).
- 50 -

Avoiding Q(t)
Instead of using the exact interpolation step, approximate
Q^{-1}(t) \approx \alpha(t) G(t) \hat{L}^{-1}(t-1) G^T(t)
and obtain (with F(t) = G^{-1}(t))
\tilde{L}(t) = \frac{1}{1+\alpha(t)} F^T(t) \hat{L}(t-1) F(t).
Defining \lambda(t) = 1/(1+\alpha(t)), the recursion becomes
\hat{L}(t) = \lambda(t) F^T(t) \hat{L}(t-1) F(t) + H_A^T(t) W_A(t) H_A(t),
\hat{Z}(t) = \lambda(t) F^T(t) \hat{Z}(t-1) + H_A^T(t) W_A(t) Y_A(t).
- 51 -

The Pseudo-RLS Algorithm
1. Initialize: \hat{L}(0) = \epsilon^2 I, \hat{Z}(0) = 0, \hat{X}(0) = 0.
2. For t > 0, update the information pair
\hat{L}(t) = \lambda(t) F^T(t) \hat{L}(t-1) F(t) + H_A^T(t) W_A(t) H_A(t),
\hat{Z}(t) = \lambda(t) F^T(t) \hat{Z}(t-1) + H_A^T(t) W_A(t) Y_A(t),
and compute the output by \hat{X}(t) = \hat{L}^{-1}(t) \hat{Z}(t).
Problem: we need to invert the information matrix.
- 52 -

The R-SD Algorithm
1. Initialize: \hat{L}(0) = \epsilon^2 I, \hat{Z}(0) = 0, \hat{X}(0) = 0.
2. For t > 0, update the information pair as before, and compute the output by R SD iterations:
\hat{X}_0(t) = G(t) \hat{X}_R(t-1) (adopted from the assumed model), and for k = 1, 2, ..., R:
\hat{X}_k(t) = \hat{X}_{k-1}(t) - \mu \big[ \hat{L}(t) \hat{X}_{k-1}(t) - \hat{Z}(t) \big].
Note: \hat{X}_R(t) \approx \hat{L}^{-1}(t)\hat{Z}(t), but the approximation error does not propagate.
- 53 -

Dynamic Super-Resolution
The algorithm is now strictly recursive: at every time t, \hat{X}_t = f(Y_t, \hat{X}_{t-1}).
- 54 -

The R-LMS Algorithm
1. Initialize: \hat{X}(0) = 0.
2. For t > 0, compute the output by R SD iterations using the intermediate information pair:
\hat{X}_0(t) = G(t) \hat{X}_R(t-1), and for k = 1, 2, ..., R:
\hat{X}_k(t) = \hat{X}_{k-1}(t) - \mu H_A^T(t) W_A(t) \big[ H_A(t) \hat{X}_{k-1}(t) - Y_A(t) \big].
This is also obtained from R-SD if \hat{X}_R(t-1) = \hat{L}^{-1}(t-1)\hat{Z}(t-1), or if \lambda(t) is set to zero.
- 55 -

The Information Matrix
\hat{L}(t) = \lambda(t) F^T(t) \hat{L}(t-1) F(t) + M(t), \qquad M(t) = H_A^T(t) W_A(t) H_A(t)
Under some very reasonable assumptions, it is PROVEN that the information matrix remains SPARSE. [Figure: matrix density versus iterations (t = 1 to 100) - an example.]
- 56 -
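For concreteness, here is a small dense-matrix sketch of one R-SD time step; the step size is chosen from the spectrum of L just for this toy (in the real, high-dimensional setting everything stays matrix-free and the step size is a fixed constant):

```python
import numpy as np

def r_sd_step(L, Z, X_prev, F, G, HA, WA, YA, lam=0.9, R=3):
    """One time step of R-SD: information-pair update, then R SD iterations.
    F plays the role of G^{-1}; HA/WA/YA are the augmented measurement
    quantities; L is kept symmetric throughout."""
    L = lam * F.T @ L @ F + HA.T @ WA @ HA      # L_hat(t)
    Z = lam * F.T @ Z + HA.T @ WA @ YA          # Z_hat(t)
    mu = 1.0 / np.linalg.eigvalsh(L).max()      # safe step size for this toy
    X = G @ X_prev                              # X_0(t) = G(t) X_R(t-1)
    for _ in range(R):
        X = X - mu * (L @ X - Z)                # X_k = X_{k-1} - mu (L X - Z)
    return L, Z, X
```

The R-LMS variant would replace the residual L @ X - Z by HA.T @ WA @ (HA @ X - YA), skipping the information-pair bookkeeping altogether.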
Convergence Properties
1. Bounds on the dynamic estimation error are obtained for the proposed Kalman-Filter approximations (the P-RLS, the R-SD and the R-LMS).
2. An important role in these convergence theorems is played by the term
\| \hat{X}_{PRLS}(t) - G(t) \hat{X}_{PRLS}(t-1) \|,
which stands for the amount of variation (innovative data) that exists in the sequence. The higher this term, the higher the expected error.
- 57 -

Results - Part 1
Dynamic estimation comparison on a low-dimension (N = 100) synthetic case. [Figure: MSE versus iterations for the 1-LMS, the 3-SD, the Pseudo-RLS and the Kalman Filter.]
- 58 -

Results - Part 2
Higher-dimension (N = 2500) synthetic image sequences. Note: the motion and blur operations are assumed to be known a priori.
A. The original sequence, image size 50 by 50.
B. The measured sequence: 3-by-3 uniform blurring, 2:1 decimation, noise with sigma = 5.
C. Bilinear interpolation of the measured sequence.
D. The 5-LMS algorithm's output, no regularization.
E. The 5-LMS algorithm's output, with regularization.
F. The 5-SD algorithm's output, with regularization.
[Frames 1, 25, 50, 75 and 100 are shown for each.]
- 59 -

Sequence 1 [Displacement + zoom]
[Figure: frames 1, 25, 50, 75 and 100 of the measurements, the bilinear interpolation, 5-LMS without regularization, 5-LMS with regularization, and 5-SD with regularization.]
- 60 -

Sequence 2 [Pure translation]
[Figure: the same comparison for a purely translational sequence.]
- 61 -

Sequence 3 [Pure rotation]
[Figure: the same comparison for a purely rotational sequence.]
- 62 -

Conclusions
Both the static and the dynamic super-resolution paradigms are presented, along with their solutions.
Very simple yet general models are proposed for both problems.
The SSR problem is presented as a classic inverse problem, and treated as such.
The DSR problem is shown to require the KF for its solution; due to the dimensions involved, approximations are developed and analyzed.
Simulations show promising results, both for the SSR and the DSR.
Motion estimation remains a bottleneck in the recovery processes.
- 63 -

Fast SSR (1)
A Special Case: what if the same camera is used and the motion is purely translational?
- 64 -

SSR - The Model (recalled)
Y_k = D_k H_k F_k X + V_k, \qquad V_k \sim \mathcal{N}(0, W_k^{-1}), \qquad k = 1, \ldots, N
- 65 -

The Model as One Equation (recalled)
\begin{bmatrix} Y_1 \\ \vdots \\ Y_N \end{bmatrix} = \begin{bmatrix} D_1 H_1 F_1 \\ \vdots \\ D_N H_N F_N \end{bmatrix} X + \begin{bmatrix} V_1 \\ \vdots \\ V_N \end{bmatrix}, \qquad \begin{bmatrix} V_1 \\ \vdots \\ V_N \end{bmatrix} \sim \mathcal{N}\big(0, \mathrm{diag}\{W_1^{-1}, \ldots, W_N^{-1}\}\big)
- 66 -

Iterative Reconstruction (recalled)
R = \sum_{k=1}^{N} F_k^T H_k^T D_k^T W_k D_k H_k F_k, \qquad P = \sum_{k=1}^{N} F_k^T H_k^T D_k^T W_k Y_k, \qquad R\hat{X} = P
For \hat{X} of size 1000-by-1000, R is a 10^6-by-10^6 matrix, but sparse.
OPTION: use the SD algorithm (10-15 iterations are enough):
\hat{X}_{j+1} = \hat{X}_j + \mu \sum_{k=1}^{N} F_k^T H_k^T D_k^T W_k (Y_k - D_k H_k F_k \hat{X}_j)
- 67 -

Basic Assumptions
H_k = H - the blur operation is the same for all the images and is linear space-invariant, i.e., it has a block-circulant form.
D_k = D - the decimation operation is the same for all the images and is a uniform sub-sampling operator.
F_k - the warps are all pure translations, and thus all have a block-circulant form. Moreover, we assume a nearest-neighbor representation (one non-zero entry in each row, and it is '1').
W_k = cI - the noise is Gaussian and white, and thus the covariance matrix is the identity matrix up to some constant.
- 68 -
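The fact exploited next - that block-circulant matrices commute - can be sanity-checked numerically for the case at hand, where H is a circular-convolution blur and F_k an integer cyclic shift (the kernel and shift below are arbitrary choices):

```python
import numpy as np

def blur(x, k):
    """Circular convolution with kernel k: a block-circulant operator H."""
    kp = np.zeros_like(x)
    kp[:k.shape[0], :k.shape[1]] = k
    return np.real(np.fft.ifft2(np.fft.fft2(x) * np.fft.fft2(kp)))

def shift(x, d):
    """Integer cyclic translation: a block-circulant warp F_k."""
    return np.roll(x, d, axis=(0, 1))

x = np.random.rand(32, 32)
k = np.outer([1, 2, 1], [1, 2, 1]) / 16.0
# F_k (H x) == H (F_k x): the two block-circulant operators commute.
assert np.allclose(shift(blur(x, k), (3, 5)), blur(shift(x, (3, 5)), k))
```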
Using the Iterative SD
With the above assumptions (and taking W_k = I, absorbing the constant c into \mu), the SD iteration
\hat{X}_{j+1} = \hat{X}_j + \mu \sum_{k=1}^{N} F_k^T H_k^T D_k^T W_k (Y_k - D_k H_k F_k \hat{X}_j)
becomes
\hat{X}_{j+1} = \hat{X}_j + \mu \sum_{k=1}^{N} F_k^T H^T D^T (Y_k - D H F_k \hat{X}_j) = \hat{X}_j + \mu H^T \sum_{k=1}^{N} F_k^T D^T (Y_k - D F_k H \hat{X}_j),
where we use the fact that block-circulant matrices commute (F_k H = H F_k).
- 69 -

Important Shortcut
Define \hat{Z}_j = H \hat{X}_j and get
\hat{Z}_{j+1} = \hat{Z}_j + \mu H H^T \sum_{k=1}^{N} F_k^T D^T (Y_k - D F_k \hat{Z}_j) = \hat{Z}_j + \mu H H^T \big[ \tilde{P} - \tilde{R} \hat{Z}_j \big],
where
\tilde{P} = \sum_{k=1}^{N} F_k^T D^T Y_k, \qquad \tilde{R} = \sum_{k=1}^{N} F_k^T D^T D F_k.
- 70 -

Descent Direction - Theory
Given the quadratic function* f(x) = \frac{1}{2} x^T \tilde{R} x - \tilde{P}^T x + c, its optimal solution satisfies \tilde{R} \hat{x}_{opt} = \tilde{P}. Any algorithm of the form
\hat{x}_{j+1} = \hat{x}_j - \mu M (\tilde{R} \hat{x}_j - \tilde{P})
converges to \hat{x}_{opt} for sufficiently small \mu and M > 0. In our case M = H H^T is only positive semi-definite, which means that the component of the error \hat{x}_j - \hat{x}_{opt} in the null space of M cannot converge. Is this a problem?
* \tilde{R} is assumed to be positive definite.
- 71 -

Positive Semi-Definite M
\hat{x}_{j+1} = \hat{x}_j - \mu M (\tilde{R} \hat{x}_j - \tilde{P}) \quad \Longrightarrow \quad \hat{x}_{j+1} - \hat{x}_{opt} = (I - \mu M \tilde{R})^{j+1} (\hat{x}_0 - \hat{x}_{opt}).
If v is in the null space of M, then the vector u = \tilde{R}^{-1} v is in the null space of M\tilde{R}; for such a vector, (I - \mu M \tilde{R})^{j+1} u = u.
- 72 -

Positive Semi-Definite M
Decompose \hat{x}_0 - \hat{x}_{opt} = \hat{e}_0 + \hat{f}_0, with \hat{e}_0 in the null space of M\tilde{R} and \hat{f}_0 orthogonal to it. Then
\hat{e}_{j+1} + \hat{f}_{j+1} = (I - \mu M \tilde{R})^{j+1} (\hat{e}_0 + \hat{f}_0) = \hat{e}_0 + (I - \mu M \tilde{R})^{j+1} \hat{f}_0.
The null space of M\tilde{R} is characterized by very high frequencies (since M = H H^T and H is a low-pass filter). Thus, non-convergence there is of no consequence, and this is especially true if proper initialization is used.
- 73 -

What is \tilde{P}?
\tilde{P} = \sum_{k=1}^{N} F_k^T D^T Y_k
It turns out that this is a motion-compensated average of the input images: each Y_k undergoes zero interpolation (D^T) followed by the inverse displacement (F_k^T), and the results are summed.
- 74 -

What is \tilde{R}?
\tilde{R} = \sum_{k=1}^{N} F_k^T D^T D F_k
A huge matrix - but due to our assumptions:
A. this matrix is diagonal,
B. its main-diagonal entries are all integers,
C. the [j,j] entry represents the count of pixels contributed by the Y-sequence to the j-th pixel in X, and
D. we hereby assume that sufficient measurements are given, and thus \tilde{R}[j,j] \geq 1 for all j.
- 75 -

To Conclude
\hat{Z}_{j+1} = \hat{Z}_j + \mu H H^T \big[ \tilde{P} - \tilde{R} \hat{Z}_j \big] \quad \Longrightarrow \quad \hat{Z}_{opt} = \tilde{R}^{-1} \tilde{P},
and it is easy to compute this solution - one division by an integer per pixel!!!!
Having found \hat{Z}_{opt}, and since it is defined by \hat{Z}_j = H \hat{X}_j, we then apply a classic image-restoration procedure to recover \hat{X}_{opt} (this can be done without iterations).
- 76 -

Should We Be Surprised?
Every low-quality image fills some pixels in the higher-resolution grid; some pixels are filled more than once - good for noise removal.
- 77 -

Adaptive Non-Iterative Restoration
Using \hat{X} = [H^T H + \lambda S^T W S]^{-1} H^T Y is edge-preserving, but not space-invariant. Instead, use
\hat{X}_1 = [H^T H + \lambda_1 S^T S]^{-1} H^T Y \quad \text{and} \quad \hat{X}_2 = [H^T H + \lambda_2 S^T S]^{-1} H^T Y,
where \lambda_1 \leq \lambda_{opt} \leq \lambda_2. Being space-invariant, \hat{X}_1 and \hat{X}_2 can be computed using the 2D FFT. The final result is then obtained using a diagonal weight matrix W with values in the range [0,1] (1 - edge, 0 - smooth):
\hat{X}_{Final} = W \hat{X}_1 + (I - W) \hat{X}_2.
- 78 -
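A compact numpy sketch of the complete two-stage scheme just described, under the stated assumptions (integer translations on the high-resolution grid, a common blur); the Wiener-style FFT division used for the final deblurring stage, and its eps constant, are my stand-ins for whichever classic restoration procedure is preferred:

```python
import numpy as np

def fast_ssr(Ys, shifts, r, psf, eps=1e-2):
    """Ys: low-res frames; shifts[k]: integer (dy, dx) of frame k on the
    high-res grid; r: resolution ratio; psf: the common blur kernel."""
    m, n = Ys[0].shape
    P = np.zeros((m * r, n * r))      # P~: motion-compensated zero-fill sum
    cnt = np.zeros_like(P)            # diagonal of R~: per-pixel fill counts
    for Y, (dy, dx) in zip(Ys, shifts):
        up = np.zeros_like(P)
        up[::r, ::r] = Y                                  # D^T Y_k
        P += np.roll(up, (-dy, -dx), axis=(0, 1))         # F_k^T (inverse shift)
        ones = np.zeros_like(P)
        ones[::r, ::r] = 1.0
        cnt += np.roll(ones, (-dy, -dx), axis=(0, 1))
    Z = P / np.maximum(cnt, 1.0)      # Z_opt = R~^{-1} P~: one division per pixel
    # Stage 2 - deblur Z ~ H X with a regularized inverse filter:
    kp = np.zeros_like(Z)
    kp[:psf.shape[0], :psf.shape[1]] = psf
    Kf = np.fft.fft2(np.roll(kp, (-(psf.shape[0] // 2), -(psf.shape[1] // 2)),
                             axis=(0, 1)))
    Xf = np.fft.fft2(Z) * np.conj(Kf) / (np.abs(Kf) ** 2 + eps)
    return np.real(np.fft.ifft2(Xf))
```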
Fast SSR (2)
Periodic-Step SD: a numerical method to speed up convergence.
- 79 -

Relation to Super-Resolution
Y_k = D_k H_k F_k X + V_k, \qquad V_k \sim \mathcal{N}(0, W_k^{-1}), \qquad k = 1, \ldots, N
\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_N \end{bmatrix} = \begin{bmatrix} D_1 H_1 F_1 \\ D_2 H_2 F_2 \\ \vdots \\ D_N H_N F_N \end{bmatrix} X + \begin{bmatrix} V_1 \\ V_2 \\ \vdots \\ V_N \end{bmatrix}
- 80 -

Basic Assumptions
A sequence of measurements y(k) is obtained sequentially. These measurements correspond linearly to an unknown vector x through y(k) = C^T(k) x + n(k), i.e.,
\begin{bmatrix} y(1) \\ y(2) \\ y(3) \\ \vdots \\ y(L) \end{bmatrix} = \begin{bmatrix} C^T(1) \\ C^T(2) \\ C^T(3) \\ \vdots \\ C^T(L) \end{bmatrix} x + \begin{bmatrix} n(1) \\ n(2) \\ n(3) \\ \vdots \\ n(L) \end{bmatrix}
- 81 -

Basic Assumptions
Assumption 1 - we have enough measurements: writing y = Cx + n with C \in M^{L \times N}, we have L \geq N and C is full-rank. If LS (ML) is applied, we get
f(x) = \| y - Cx \|_2^2 \to \min, \qquad \hat{x} = (C^T C)^{-1} C^T y.
Assumption 2 - x is high-dimensional (N elements), and thus the above closed-form solution is practically impossible. We therefore turn to iterative methods.
- 82 -

Simple Iterative Method - SD
For f(x) = \| y - Cx \|_2^2, the descent direction is C^T(y - Cx), and the steepest-descent idea gives
\hat{x}_{k+1} = \hat{x}_k + \mu C^T (y - C\hat{x}_k) = \hat{x}_k + \mu \sum_{j=1}^{L} C(j) \big[ y(j) - C^T(j) \hat{x}_k \big].
So we see that the gradient is built from L separate contributions, each obtained from a different measurement.
- 83 -

Decomposition of the Gradient
The update direction thus decomposes into the L per-measurement terms C(j) [y(j) - C^T(j)\hat{x}_k], j = 1, ..., L.
- 84 -

Periodic-Step SD
Instead of using the full-gradient update above, update the estimate of x after each SCALAR measurement:
\hat{x}_{k+\frac{j}{L}} = \hat{x}_{k+\frac{j-1}{L}} + \mu C(j) \big[ y(j) - C^T(j) \hat{x}_{k+\frac{j-1}{L}} \big]
for k = 0, 1, 2, 3, \ldots and, for each k, a sweep over j = 1, 2, 3, \ldots, L.
- 85 -

Related Work
This idea of breaking the gradient into several parts and updating the estimate after each of them is well known, especially in cases where sequential measurements are obtained. Two classic examples: neural-network training (see Bertsekas's book) and signal processing (see LMS by Widrow et al.). In image restoration and super-resolution problems, we may consider updating the output image after every pixel in the measurements. The benefit is a convergence speed-up.
- 86 -

Analysis Results
Convergence is guaranteed if
0 < \mu < \min_{1 \leq j \leq L} \frac{2}{C^T(j) C(j)}.
The convergence is to the LS-optimal solution only for an infinitesimal step size (\mu \to 0), a diminishing step size (\mu_k \to 0), or a square C; in all other cases, the convergence is to a deviated solution. In the SSR case, we are not interested in an exact solution!!!!
The rate of convergence is dramatically improved (compared to SD, NSD, CG, Jacobi, GS and SOR).
- 87 -

SSR - Simulation Results
SYNTHETIC CASE: 25 images were created from one 100-by-100-pixel image using affine motion, a 3-by-3 uniform blur, and Gaussian white noise with sigma = 3. These 25 images were fused to create a 200-by-200-pixel output. [Figure: the original, the bilinear interpolation, the LS result and the PSSD result.] This algorithm effectively converges after one iteration.
- 88 -
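As a closing sketch, the periodic-step sweep of slide 85 in numpy, on an arbitrary toy system; the step size is taken safely inside the guaranteed range of slide 87:

```python
import numpy as np

def pssd(C, y, mu, sweeps=5):
    """Periodic-step SD: update x after each scalar measurement y(j)."""
    x = np.zeros(C.shape[1])
    for _ in range(sweeps):
        for j in range(C.shape[0]):            # one sweep over j = 1..L
            x += mu * C[j] * (y[j] - C[j] @ x)
    return x

# Toy over-determined system (L >= N, full rank), mildly noisy.
rng = np.random.default_rng(0)
C = rng.standard_normal((40, 10))
y = C @ rng.standard_normal(10) + 0.01 * rng.standard_normal(40)
mu = 0.9 * (2.0 / (C * C).sum(axis=1)).min()   # 0 < mu < min_j 2 / ||C(j)||^2
x_hat = pssd(C, y, mu)
```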