ELE 774 Adaptive Signal Processing


METHOD OF STEEPEST DESCENT
Week 5
Mean Square Error (Revisited)

For a transversal filter of length M, with tap weights w_0, ..., w_{M−1} and tap inputs u(n), ..., u(n−M+1), the output is written as

  y(n) = Σ_{k=0}^{M−1} w_k* u(n−k) = w^H u(n)

and the error with respect to a certain desired response d(n) is

  e(n) = d(n) − y(n) = d(n) − w^H u(n)
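As a concrete toy example, here is a minimal sketch of these two formulas, assuming numpy and made-up data (real-valued signals, so the Hermitian transpose reduces to an ordinary transpose):

```python
import numpy as np

# Transversal (FIR) filter output and error: a sketch with made-up data.
M = 3
rng = np.random.default_rng(0)
u = rng.standard_normal(10)          # input samples u(0), ..., u(9)
d = rng.standard_normal(10)          # desired response d(n)
w = np.array([0.5, -0.2, 0.1])       # tap weights w_0, ..., w_{M-1}

n = 5
u_vec = u[n : n - M : -1]            # tap-input vector [u(n), u(n-1), u(n-2)]
y = w @ u_vec                        # y(n) = w^H u(n)  (real case)
e = d[n] - y                         # e(n) = d(n) - y(n)
print(y, e)
```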

Following these terms, the MSE criterion is defined as

  J(w) = E[|e(n)|²]

Quadratic in w!

Substituting e(n) and manipulating the expression, we get

  J(w) = σ_d² − Σ_{k=0}^{M−1} [w_k* p(k) + w_k p*(k)] + Σ_{k=0}^{M−1} Σ_{i=0}^{M−1} w_k* w_i r(i−k)

where σ_d² is the variance of d(n), p(k) = E[u(n−k) d*(n)] is the cross-correlation between the tap inputs and the desired response, and r(i−k) = E[u(n−k) u*(n−i)] is the autocorrelation of the tap inputs.

For notational simplicity, express the MSE in vector/matrix form:

  J(w) = σ_d² − w^H p − p^H w + w^H R w

where

  w = [w_0, w_1, ..., w_{M−1}]^T
  u(n) = [u(n), u(n−1), ..., u(n−M+1)]^T
  p = E[u(n) d*(n)]   (M×1 cross-correlation vector)
  R = E[u(n) u^H(n)]  (M×M autocorrelation matrix)
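A minimal numerical sketch of this quadratic cost, assuming numpy; the values of R, p and σ_d² are made up for illustration, not taken from the lecture:

```python
import numpy as np

# Illustrative values only (not lecture data).
R = np.array([[1.0, 0.5],
              [0.5, 1.0]])       # R = E[u(n) u(n)^T], autocorrelation matrix
p = np.array([0.5, 0.25])        # p = E[u(n) d(n)], cross-correlation vector
sigma_d2 = 1.0                   # variance of the desired response d(n)

def J(w):
    """MSE cost J(w) = sigma_d^2 - 2 w^T p + w^T R w (real-valued case)."""
    return sigma_d2 - 2.0 * (w @ p) + w @ R @ w

# J is quadratic in w: evaluating along any line through w-space gives a parabola.
for t in (0.0, 0.5, 1.0):
    print(t, J(t * np.ones(2)))
```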

We found that the solution (the optimum filter coefficients w_o) is given by the Wiener-Hopf equations

  R w_o = p  ⇔  w_o = R^{−1} p


Inversion of R can be very costly.
J(w) is quadratic in w → convex in w → the surface has a single minimum and it is global; at w_o,

  J(w_o) = J_min = σ_d² − p^H w_o
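For reference, a minimal sketch of the direct solve, assuming numpy and the same illustrative R, p as in the earlier sketch:

```python
import numpy as np

R = np.array([[1.0, 0.5], [0.5, 1.0]])   # illustrative values, as before
p = np.array([0.5, 0.25])
sigma_d2 = 1.0

# Solve R w_o = p directly (O(M^3) in general; costly for large M).
w_o = np.linalg.solve(R, p)
J_min = sigma_d2 - p @ w_o               # J(w_o) = sigma_d^2 - p^T w_o (real case)
print(w_o, J_min)
```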

Can we reach w_o, i.e.

  lim_{n→∞} w(n) = w_o,

with a less demanding algorithm?
Basic Idea of the Method of Steepest Descent

Can we find w_o in an iterative manner?

Starting from an initial guess w(0), generate a sequence {w(n)} with the property

  J(w(n+1)) < J(w(n)),  n = 0, 1, 2, ...

Many sequences can be found following different rules.

The method of steepest descent generates the sequence using the gradient:
 - The gradient of J at the point w, ∇J(w) = ∂J(w)/∂w, gives the direction in which the function increases most.
 - Then −∇J(w) gives the direction in which the function decreases most.
 - Release a tiny ball on the surface of J → it follows the negative gradient of the surface.

For notational simplicity, let g = ∇J(w). Going in the direction given by the negative gradient,

  w(n+1) = w(n) − (1/2) μ g(n)

(the factor 1/2 is introduced for convenience; it cancels later).

How far should we go along −g? → determined by the step-size parameter μ.
 - The optimum step size can be obtained by a line search, which is difficult in general (for this quadratic cost a closed form exists; see the sketch below).
 - Generally a constant step size is taken for simplicity.

Then, at each step, the improvement in J is (from a first-order Taylor series expansion)

  J(w(n+1)) ≈ J(w(n)) − (1/2) μ ‖g(n)‖²  <  J(w(n))   (for small μ)
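As flagged above, for the quadratic MSE cost the line search has a closed form. A minimal sketch, assuming numpy and the illustrative R, p used earlier (not lecture data):

```python
import numpy as np

R = np.array([[1.0, 0.5], [0.5, 1.0]])   # illustrative values, as before
p = np.array([0.5, 0.25])

w = np.zeros(2)                  # current iterate w(n)
g = 2.0 * (R @ w - p)            # gradient of the quadratic MSE cost at w

# Minimising J(w - (mu/2) g) over mu gives, for this quadratic cost,
#     mu_opt = (g^T g) / (g^T R g)
mu_opt = (g @ g) / (g @ R @ g)
w_next = w - 0.5 * mu_opt * g    # one exactly line-searched SD step
print(mu_opt, w_next)
```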
Application of SD to Wiener Filter

For w(n), the update is

  w(n+1) = w(n) − (1/2) μ ∇J(n)

From the theory of the Wiener filter we know that

  ∇J(n) = −2p + 2R w(n)

Then the update equation becomes

  w(n+1) = w(n) + μ [p − R w(n)],  n = 0, 1, 2, ...

which defines a feedback connection.
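A minimal sketch of this recursion, assuming numpy and the illustrative R, p from before; the step size is chosen inside the stable range derived in the convergence analysis below:

```python
import numpy as np

R = np.array([[1.0, 0.5], [0.5, 1.0]])   # illustrative values, as before
p = np.array([0.5, 0.25])

mu = 1.0 / np.linalg.eigvalsh(R).max()   # any 0 < mu < 2/lambda_max is stable

w = np.zeros(2)                          # w(0) = 0
for n in range(200):
    w = w + mu * (p - R @ w)             # w(n+1) = w(n) + mu [p - R w(n)]

print(w, np.linalg.solve(R, p))          # SD iterate vs. direct Wiener solution
```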
Convergence Analysis

Feedback → may cause stability problems under certain conditions, depending on
 - the step size, μ
 - the autocorrelation matrix, R

Does SD converge?
 - Under which conditions?
 - What is the rate of convergence?

We may use the canonical representation.

Let the weight-error vector be

  c(n) = w(n) − w_o

Then, subtracting w_o from both sides of the update and using p = R w_o, the update equation becomes

  c(n+1) = (I − μR) c(n)

Let

  R = Q Λ Q^H

be the eigendecomposition of R: Λ = diag(λ_1, ..., λ_M) holds the (real, non-negative) eigenvalues, and the columns of Q are the corresponding orthonormal eigenvectors.

Then

  c(n+1) = (I − μ Q Λ Q^H) c(n)

Using Q Q^H = I,

  Q^H c(n+1) = (I − μΛ) Q^H c(n)

Apply the change of coordinates

  v(n) = Q^H c(n) = Q^H (w(n) − w_o)

Then the update equation becomes

  v(n+1) = (I − μΛ) v(n)
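A quick numerical check of this decoupling, assuming numpy (illustrative R; np.linalg.eigh returns the eigendecomposition of a symmetric/Hermitian matrix):

```python
import numpy as np

R = np.array([[1.0, 0.5], [0.5, 1.0]])   # illustrative values, as before
mu = 0.5

lam, Q = np.linalg.eigh(R)               # R = Q diag(lam) Q^H
A_w = np.eye(2) - mu * R                 # update operator on c(n)
A_v = Q.conj().T @ A_w @ Q               # same operator on v(n) = Q^H c(n)

print(np.round(A_v, 12))                 # diagonal = 1 - mu*lam: decoupled modes
```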

We know that Λ is diagonal, so the k-th natural mode obeys

  v_k(n+1) = (1 − μ λ_k) v_k(n),  k = 1, ..., M

or, with the initial value v_k(0),

  v_k(n) = (1 − μ λ_k)^n v_k(0)

Note the geometric series: each mode is a geometric sequence with ratio (1 − μ λ_k).

Obviously, for stability each mode must decay, i.e.

  |1 − μ λ_k| < 1  for all k

or

  −1 < 1 − μ λ_k < 1

or, simply,

  0 < μ < 2 / λ_max

Why λ_max? The eigenvalues of R are real and non-negative, so if the bound holds for the largest eigenvalue it holds for all of them.

The geometric series results in an exponentially decaying curve with time constant τ_k: letting

  (1 − μ λ_k)^n = e^{−n/τ_k}

gives

  τ_k = −1 / ln(1 − μ λ_k)
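A small numerical sketch of these time constants, assuming numpy; the eigenvalues and step size are illustrative:

```python
import numpy as np

lam = np.array([1.5, 0.5])         # illustrative eigenvalues of R
mu = 0.5                           # satisfies 0 < mu < 2/lam.max() = 4/3

# (1 - mu*lam_k)^n = exp(-n/tau_k)  =>  tau_k = -1/ln(1 - mu*lam_k),
# valid for 0 < mu*lam_k < 1 (monotone decay).
tau = -1.0 / np.log(1.0 - mu * lam)
print(tau)                         # smallest eigenvalue -> slowest mode
```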

We have

  v(n) = (I − μΛ)^n v(0)

then, transforming back,

  w(n) = w_o + Q v(n)

but we know that Q is composed of the eigenvectors q_k of R, so

  w(n) = w_o + Σ_{k=1}^{M} q_k v_k(0) (1 − μ λ_k)^n

or, for the i-th tap weight,

  w_i(n) = w_{oi} + Σ_{k=1}^{M} q_{ki} v_k(0) (1 − μ λ_k)^n


- Each filter coefficient converges to its optimum value as a sum of decaying exponentials.
- The overall rate of convergence is bounded by the fastest and slowest modes: for a stable μ, the mode with the smallest μλ_k decays slowest and dominates the tail of the transient.

For a small step size (μ λ_k ≪ 1),

  τ_k ≈ 1 / (μ λ_k)

What is v(0)? The initial value v(0) is

  v(0) = Q^H (w(0) − w_o)

For simplicity assume that w(0) = 0; then

  v(0) = −Q^H w_o

Transient behaviour of the MSE: from the canonical form we know that

  J(n) = J_min + Σ_{k=1}^{M} λ_k |v_k(n)|² = J_min + Σ_{k=1}^{M} λ_k |v_k(0)|² (1 − μ λ_k)^{2n}

then

  lim_{n→∞} J(n) = J_min

as long as the upper limit on the step-size parameter μ is satisfied, regardless of the initial point.

The progress of J(n) for n=0,1,... is called the learning curve.

The learning curve of the steepest-descent algorithm consists of a
sum of exponentials, each of which corresponds to a natural mode
of the problem.
The number of natural modes equals the number of filter taps, M.
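A sketch of this sum-of-exponentials learning curve, assuming numpy and the illustrative R, p, σ_d² used earlier, with w(0) = 0:

```python
import numpy as np

R = np.array([[1.0, 0.5], [0.5, 1.0]])   # illustrative values, as before
p = np.array([0.5, 0.25])
sigma_d2 = 1.0
mu = 0.5

lam, Q = np.linalg.eigh(R)
w_o = np.linalg.solve(R, p)
J_min = sigma_d2 - p @ w_o
v0 = Q.T @ (np.zeros(2) - w_o)           # v(0) = Q^H (w(0) - w_o), w(0) = 0

n = np.arange(50)
# J(n) = J_min + sum_k lam_k v_k(0)^2 (1 - mu*lam_k)^(2n): one exponential per mode
J = J_min + ((lam * v0**2)[:, None] * (1.0 - mu * lam)[:, None] ** (2 * n)).sum(axis=0)
print(J[0], J[-1])                       # decays from J(0) toward J_min
```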

Example

A predictor with two taps (w_1(n) and w_2(n)) is used to find the parameters of the AR process

  u(n) + a_1 u(n−1) + a_2 u(n−2) = v(n)

where v(n) is white noise with variance σ_v².

Examine the transient behaviour for
 - fixed step size, varying eigenvalue spread,
 - fixed eigenvalue spread, varying step size.
σ_v² is adjusted so that σ_u² = 1.


The AR process: with σ_u² = 1, the correlation matrix of the two tap inputs is

  R = [ 1     r(1)
        r(1)  1    ],   r(1) = −a_1 / (1 + a_2)

Two eigenmodes:

  λ_1 = 1 + r(1),  λ_2 = 1 − r(1)

Condition number (eigenvalue spread):

  χ(R) = λ_max / λ_min
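A sketch computing the spread, assuming numpy; the AR coefficients a_1, a_2 below are hypothetical stand-ins, not the lecture's values:

```python
import numpy as np

a1, a2 = -0.195, 0.95            # hypothetical AR(2) coefficients
r1 = -a1 / (1.0 + a2)            # r(1) from the Yule-Walker equations, r(0) = 1

R = np.array([[1.0, r1],
              [r1, 1.0]])        # correlation matrix of the two tap inputs

lam = np.array([1.0 + r1, 1.0 - r1])     # eigenvalues of the 2x2 Toeplitz R
chi = lam.max() / lam.min()              # condition number (eigenvalue spread)
print(lam, chi)                          # here: lam = [1.1, 0.9], chi ~ 1.22
```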
Example (Experiment 1)

Experiment 1: keep the step size μ fixed and change the eigenvalue spread χ(R).
[Figures: transient behaviour for Experiment 1.]
Example (Experiment 2)

Keep the eigenvalue spread fixed and change the step size (μ_max = 2/λ_max = 1.1).
[Figure: transient behaviour for Experiment 2.]

Depending on the value of μ, the learning curve can be
 - overdamped: moves smoothly to the minimum ((very) small μ),
 - underdamped: oscillates towards the minimum (large μ < μ_max),
 - critically damped.
Generally the rate of convergence is slow for the first two.
Observations


SD is a 'deterministic' algorithm, i.e. we assume that
 - R and p are known exactly.
 - In practice they can only be estimated, e.g. by sample averages.
 - This can have high computational complexity.

SD is a local search algorithm, but for Wiener filtering
 - the cost surface is convex (quadratic), so
 - convergence is guaranteed as long as μ < μ_max is satisfied.

The origin of SD is the Taylor series expansion (as for many other local-search optimization algorithms):

  J(w + Δw) ≈ J(w) + g^H Δw + (1/2) Δw^H H Δw

where H = ∇²J(w) is the Hessian matrix.

Convergence can be very slow. To speed up the process, the second-order term can also be included, as in Newton's method:

  w(n+1) = w(n) − H^{−1}(n) g(n)

For the quadratic MSE cost, H = 2R.

Drawbacks: high computational complexity (matrix inversion) and numerical stability problems.
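A minimal sketch, assuming numpy and the illustrative R, p from earlier. Because the MSE cost is quadratic with constant Hessian H = 2R, a single Newton step lands exactly on w_o:

```python
import numpy as np

R = np.array([[1.0, 0.5], [0.5, 1.0]])   # illustrative values, as before
p = np.array([0.5, 0.25])

w = np.zeros(2)                  # w(0)
g = 2.0 * (R @ w - p)            # gradient of the quadratic MSE cost
H = 2.0 * R                      # Hessian (constant for a quadratic cost)

w = w - np.linalg.solve(H, g)    # Newton step: w(1) = w(0) - H^{-1} g
print(w, np.linalg.solve(R, p))  # equals w_o after one step
```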