Transcript Slide 1

Geology 5670/6670
Inverse Theory
13 Feb 2015
Last time: Nonlinear Inversion
• Given a nonlinear problem, i.e. of the form F(m) = d, we
can solve by one of several approaches:
(1) Apply a linearizing transformation. Note this same
transformation will also be applied to errors!
(2) Grid search (or Monte Carlo; simulated
annealing). Brute force; this can be computationally
expensive for M large.
(3) A gradient search method. Iteratively solve
$\Delta\mathbf{m}_{k+1} = \mathbf{G}_k^{-g}\,\Delta\mathbf{d}_k$
in which $\mathbf{G}_k$ consists of sensitivity coefficients:
$G^{k}_{ij} = \partial F_i(\mathbf{m}_k) / \partial m_j$
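A minimal sketch of approach (1), assuming a hypothetical exponential-decay model d = a exp(-b x) that is not part of the lecture. Taking logs gives ln d = ln a - b x, which is linear in (ln a, b). The same transformation maps a data error sigma_d to roughly sigma_d / d in log space, so the sketch weights the transformed problem accordingly. The function name and the synthetic data are illustrative assumptions.

```python
import numpy as np

def fit_exponential_by_log_transform(x, d, sigma_d):
    """Fit d = a * exp(-b * x) by linearizing with a log transform."""
    y = np.log(d)                                # transformed data
    sigma_y = sigma_d / d                        # first-order error propagation to log space
    G = np.column_stack([np.ones_like(x), -x])   # y = ln(a) + b * (-x)
    W = 1.0 / sigma_y                            # row weights from the transformed errors
    m, *_ = np.linalg.lstsq(G * W[:, None], y * W, rcond=None)
    return np.exp(m[0]), m[1]                    # back-transform ln(a) to a

# Noise-free synthetic data (assumed values a = 3.0, b = 0.7)
x = np.linspace(0.0, 4.0, 20)
d = 3.0 * np.exp(-0.7 * x)
a_est, b_est = fit_exponential_by_log_transform(x, d, sigma_d=0.05 * np.ones_like(x))
```

Because the errors are transformed too, small data values carry large log-space errors and are down-weighted in the fit.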
Read for Tue 17 Feb: Menke Ch 9 (163-188)
© A.R. Lowry 2015
The gradient search approach to solution of a
nonlinear inverse problem, F(m) = d, can be summarized
by the following algorithm:
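0. Choose a starting model, $\mathbf{m}_0$
1. Calculate model misfit $\Delta\mathbf{d}_k = \mathbf{d} - F(\mathbf{m}_k)$
2. Evaluate $\Delta\mathbf{m}_{k+1} = \mathbf{G}_k^{-g}\,\Delta\mathbf{d}_k$,
where $\mathbf{G}_k$ is the kth sensitivity matrix, $G^{k}_{ij} = \partial F_i(\mathbf{m}_k) / \partial m_j$
3. Update the model: $\mathbf{m}_{k+1} = \mathbf{m}_k + \Delta\mathbf{m}_{k+1}$
4. Set k = k + 1 and iterate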
This approach uses the first-order term of the Taylor
series to approximately linearize the problem…
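Written out, the linearization is the first-order Taylor expansion of F about the current model. This is a sketch in the notation of the algorithm, with $\Delta\mathbf{m}$ the model perturbation; requiring the perturbed model to reproduce the data gives the linear system solved at each iteration:

```latex
F_i(\mathbf{m}_k + \Delta\mathbf{m}) \approx F_i(\mathbf{m}_k)
  + \sum_j \frac{\partial F_i(\mathbf{m}_k)}{\partial m_j}\,\Delta m_j
\quad\Longrightarrow\quad
\mathbf{d} - F(\mathbf{m}_k) \approx \mathbf{G}_k\,\Delta\mathbf{m}
```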
Note that most of the same tools we used for the linear
problem (e.g., estimates of parameter error from
the parameter covariance matrix; model resolution
and covariance matrices;  parameter etc.) still apply
to the iterative solution for a nonlinear model. The
main difference is that we apply these metrics to
the final (iterated) model estimate, using the sensitivity
matrix G of the best-fitting model.
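A minimal sketch of those metrics, assuming uncorrelated data errors with a common standard deviation `sigma_d` (an assumption, not stated in the slides) and using the pseudo-inverse as the generalized inverse. `G_final` stands for the sensitivity matrix evaluated at the best-fitting model; the small matrix below is made up for illustration.

```python
import numpy as np

def final_model_metrics(G_final, sigma_d):
    """Parameter covariance and model resolution from the final sensitivity matrix."""
    G_inv = np.linalg.pinv(G_final)           # generalized inverse G^{-g}
    cov_m = sigma_d**2 * G_inv @ G_inv.T      # C_m = sigma_d^2 G^{-g} (G^{-g})^T
    resolution = G_inv @ G_final              # R = G^{-g} G
    param_sigma = np.sqrt(np.diag(cov_m))     # one-sigma parameter errors
    return cov_m, resolution, param_sigma

# Illustrative 4-data, 2-parameter sensitivity matrix (hypothetical values)
G_final = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
cov_m, resolution, param_sigma = final_model_metrics(G_final, sigma_d=0.1)
```

For an overdetermined, full-rank G the resolution matrix is the identity; the parameter errors are only as reliable as the linearization about the final model.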
Complications of the general nonlinear problem include
multiple minima (resulting in starting-model
dependence of the solution), and nonconvergence:
Example: Solving for fault slip from displacement time series
at just one GPS site:
Could see evidence for transient
fault slip; wanted to know where
and how it moved as a function
of time…
So modeled as a slip patch with
length L, total slip U, moving
along-strike with velocity V,
centered at a distance y from the
GPS instrument… Using grid search.
It’s not clear that a slip pulse initial model with an eastward propagation
and/or seaward centroid would ever converge to the global minimum.
These kinds of problems are common (particularly when measurement
sampling is less-than-ideal as in this case). Here the global minimum is
found because a fine-mesh grid-search was used, but an iterative (gradient)
method would have encountered problems for most starting models.
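A minimal grid-search sketch on a toy problem with multiple minima. The one-parameter forward model F(m) = sin(m t) is a hypothetical stand-in, not the fault-slip model of the example; the grid spacing and the true value are likewise assumptions.

```python
import numpy as np

def forward(m, t):
    """Hypothetical one-parameter nonlinear forward model F(m) = sin(m t)."""
    return np.sin(m * t)

def grid_search(d, t, m_grid):
    """Return the grid value with the smallest L2 misfit, plus all misfits."""
    misfits = np.array([np.sum((d - forward(m, t))**2) for m in m_grid])
    return m_grid[np.argmin(misfits)], misfits

t = np.linspace(0.0, 10.0, 50)
m_true = 2.0
d = forward(m_true, t)                    # noise-free synthetic data
m_grid = np.linspace(0.1, 5.0, 491)       # fine mesh, 0.01 spacing
m_best, misfits = grid_search(d, t, m_grid)

# Count interior local minima of the misfit curve
is_local_min = (misfits[1:-1] < misfits[:-2]) & (misfits[1:-1] < misfits[2:])
n_local_minima = int(np.sum(is_local_min))
```

The misfit curve has several local minima, so a gradient method started in the wrong basin would settle on the wrong m, while the exhaustive search recovers the global minimum at the cost of one forward calculation per grid point.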
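The algorithm we derived last time:
Choose a starting model $\mathbf{m}_0$
$\Delta\mathbf{d}_k = \mathbf{d} - F(\mathbf{m}_k)$
$G^{k}_{ij} = \partial F_i(\mathbf{m}_k) / \partial m_j$, i.e. $\mathbf{G}_k^{T} = \nabla_{\mathbf{m}_k} F^{T}(\mathbf{m}_k)$
$\Delta\mathbf{m}_k = (\mathbf{G}_k^{T}\mathbf{G}_k)^{-1}\,\mathbf{G}_k^{T}\,\Delta\mathbf{d}_k$
$\mathbf{m}_{k+1} = \mathbf{m}_k + \Delta\mathbf{m}_k$
is often referred to as the Gauss-Newton algorithm.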
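A minimal Gauss-Newton sketch following these steps. The finite-difference sensitivity matrix, the stopping tolerance, and the exponential test model are all assumptions made for illustration; an analytic Jacobian would normally be preferred where available.

```python
import numpy as np

def jacobian_fd(F, m, h=1e-6):
    """Finite-difference sensitivity matrix G_ij = dF_i/dm_j at model m."""
    F0 = F(m)
    G = np.empty((F0.size, m.size))
    for j in range(m.size):
        dm = np.zeros_like(m)
        dm[j] = h
        G[:, j] = (F(m + dm) - F0) / h
    return G

def gauss_newton(F, d, m0, n_iter=20, tol=1e-10):
    """Iterate m_{k+1} = m_k + (G^T G)^{-1} G^T (d - F(m_k))."""
    m = np.asarray(m0, dtype=float)
    for _ in range(n_iter):
        delta_d = d - F(m)                                  # data misfit
        G = jacobian_fd(F, m)                               # sensitivity matrix
        delta_m = np.linalg.solve(G.T @ G, G.T @ delta_d)   # least-squares model step
        m = m + delta_m
        if np.linalg.norm(delta_m) < tol:
            break
    return m

# Hypothetical test problem: d = a * exp(-b * x) with a = 3.0, b = 0.7
x = np.linspace(0.0, 4.0, 20)
F = lambda m: m[0] * np.exp(-m[1] * x)
d = F(np.array([3.0, 0.7]))
m_est = gauss_newton(F, d, m0=[2.0, 0.5])
```

The starting model here is close enough for the iteration to converge; nothing in the loop guards against the misfit increasing from one step to the next.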
Note this approach may diverge (i.e., the method will move the model in
directions that produce larger model misfits) if the sensitivity matrix
is unstable (i.e., its smallest eigenvalues are extremely small).
One way to “stabilize” the solution is to control (reduce)
the parameter step so that the iterative method does
not “overstep” into another minimum domain…
E.g., a truncated Gauss step:
$\mathbf{m}_{k+1} = \mathbf{m}_k + \alpha_k\,(\mathbf{G}_k^{T}\mathbf{G}_k)^{-1}\,\mathbf{G}_k^{T}\,\Delta\mathbf{d}_k$
in which $\alpha_k < 0.5$ is a scalar step parameter.
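A minimal sketch of the effect of the step parameter, using a hypothetical one-datum problem F(m) = arctan(m) with d = 0 (true model m = 0). This toy problem is an assumption chosen because the full Gauss-Newton step is known to overstep from a distant start; the function name and the value 0.4 for the step parameter are also illustrative.

```python
import numpy as np

def scaled_gauss_newton(F, jac, d, m0, alpha, n_iter):
    """Iterate m_{k+1} = m_k + alpha (G^T G)^{-1} G^T (d - F(m_k))."""
    m = np.atleast_1d(np.asarray(m0, dtype=float))
    for _ in range(n_iter):
        G = jac(m)
        delta_d = d - F(m)
        m = m + alpha * np.linalg.solve(G.T @ G, G.T @ delta_d)
    return m

F = lambda m: np.arctan(m)                              # hypothetical forward model
jac = lambda m: np.array([[1.0 / (1.0 + m[0]**2)]])     # analytic 1x1 sensitivity matrix
d = np.array([0.0])                                     # datum generated by m = 0

m_full = scaled_gauss_newton(F, jac, d, m0=2.0, alpha=1.0, n_iter=5)    # full step
m_trunc = scaled_gauss_newton(F, jac, d, m0=2.0, alpha=0.4, n_iter=30)  # truncated step
```

From m0 = 2 the full step jumps past the minimum and each iterate lands farther away, while the truncated step converges toward m = 0, at the price of more iterations.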
Problem: We would like to travel in the direction of “true steepest
descent”. However that may be very different from the direction of a
derivative at a point!
Recall the Generalized Inverse: Can instead use
$\mathbf{m}_{k+1} = \mathbf{m}_k + \mathbf{V}_p\,\boldsymbol{\Lambda}_p^{-1}\,\mathbf{U}_p^{T}\,\Delta\mathbf{d}_k$
where p is the number of “non-zero” eigenvalues.
This is more robust but also more time-consuming…
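A minimal sketch of one such step using the singular value decomposition of the sensitivity matrix. The relative cutoff `rel_tol` that decides which singular values count as "non-zero" is an assumption, as is the rank-deficient example matrix.

```python
import numpy as np

def svd_step(G, delta_d, rel_tol=1e-8):
    """Model step V_p Lambda_p^{-1} U_p^T delta_d, keeping p significant singular values."""
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    p = int(np.sum(s > rel_tol * s[0]))          # number of "non-zero" singular values
    step = Vt[:p].T @ ((U[:, :p].T @ delta_d) / s[:p])
    return step, p

# Rank-deficient example: the second column duplicates the first
G = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
delta_d = np.array([2.0, 4.0, 6.0])
step, p = svd_step(G, delta_d)
```

Only the sum of the two parameters is constrained here, so p = 1 and the step is the minimum-length update that fits the misfit. Forming (G^T G)^{-1} would fail for this matrix; the extra cost is one SVD per iteration.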

Of course can also stabilize using something similar to
the “other approach” used previously to stabilize linear
inverse problems, damping:
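We call this Levenberg Damping:
$\mathbf{m}_{k+1} = \mathbf{m}_k + (\mathbf{G}_k^{T}\mathbf{G}_k + \lambda_k \mathbf{I})^{-1}\,\mathbf{G}_k^{T}\,\Delta\mathbf{d}_k$
Or Levenberg-Marquardt damping:
$\mathbf{m}_{k+1} = \mathbf{m}_k + [\mathbf{G}_k^{T}\mathbf{G}_k + \lambda_k\,\mathrm{diag}(\mathbf{G}_k^{T}\mathbf{G}_k)]^{-1}\,\mathbf{G}_k^{T}\,\Delta\mathbf{d}_k$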
Here we choose the damping parameter $\lambda_k$ heuristically based on trial & error (i.e.,
pick a $\lambda$ that maximally decreases the residual norm).

Note $\lambda$ large results in a smaller step size (because the
determinant of the matrix being inverted will be larger);
$\lambda$ small is more similar to Gauss-Newton…
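A minimal sketch of the trial-and-error choice of damping. At each iteration it tries a fixed list of candidate damping values and keeps the trial model with the smallest residual norm. The candidate list, the stopping rule, and the one-datum test problem F(m) = arctan(m) with d = 0 are assumptions; for that problem an undamped Gauss-Newton step from m0 = 2 oversteps and increases the misfit.

```python
import numpy as np

def lm_step(G, delta_d, lam):
    """Levenberg-Marquardt step [G^T G + lam diag(G^T G)]^{-1} G^T delta_d."""
    GtG = G.T @ G
    return np.linalg.solve(GtG + lam * np.diag(np.diag(GtG)), G.T @ delta_d)

def levenberg_marquardt(F, jac, d, m0, lams=(1e-3, 1e-2, 1e-1, 1.0, 10.0), n_iter=50):
    """At each iteration, try several damping values and keep the best-fitting trial."""
    m = np.asarray(m0, dtype=float)
    for _ in range(n_iter):
        G, delta_d = jac(m), d - F(m)
        trials = [m + lm_step(G, delta_d, lam) for lam in lams]
        norms = [np.linalg.norm(d - F(t)) for t in trials]
        best = int(np.argmin(norms))
        if norms[best] >= np.linalg.norm(delta_d):
            break                      # no candidate reduces the residual norm
        m = trials[best]
    return m

F = lambda m: np.arctan(m)                              # hypothetical forward model
jac = lambda m: np.array([[1.0 / (1.0 + m[0]**2)]])     # analytic 1x1 sensitivity matrix
d = np.array([0.0])                                     # datum generated by m = 0
m_lm = levenberg_marquardt(F, jac, d, m0=[2.0])
```

Far from the minimum the search selects a large damping value (a short step); near the minimum it selects small values and behaves like Gauss-Newton. Each iteration costs one extra forward calculation per candidate.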