Least Mean-Square Adaptive Filtering


LEAST MEAN-SQUARE (LMS) ADAPTIVE FILTERING
Steepest Descent

The update rule for SD is
w(n+1) = w(n) + μ [p - R w(n)],   n = 0, 1, 2, ...
where
p = E[u(n) d*(n)]   and   R = E[u(n) u^H(n)],
or, equivalently,
w(n+1) = w(n) - (μ/2) ∇J(n),   with   ∇J(n) = -2p + 2R w(n).

SD is a deterministic algorithm, in the sense that p and R are
assumed to be exactly known.

In practice we can only estimate these quantities.
Basic Idea

The simplest estimate of the expectations is to remove the expectation
operators and replace them with instantaneous values, i.e.
R̂(n) = u(n) u^H(n),   p̂(n) = u(n) d*(n).

Then, the (instantaneous) gradient estimate becomes
-2 u(n) d*(n) + 2 u(n) u^H(n) ŵ(n).

Eventually, replacing the true gradient by this estimate in the SD recursion,
the new update rule is
ŵ(n+1) = ŵ(n) + μ u(n) [d*(n) - u^H(n) ŵ(n)].
No expectations, only instantaneous samples!
Basic Idea

However, the term in the brackets is the (conjugated) error, i.e.
e(n) = d(n) - ŵ^H(n) u(n),
so the update rule becomes
ŵ(n+1) = ŵ(n) + μ u(n) e*(n).
Note that -2 u(n) e*(n) is the gradient of the instantaneous squared error
|e(n)|², instead of the mean-square error E[|e(n)|²] as in SD.
Basic Idea

Filter weights are updated using instantaneous values.
Update equation for the Method of Steepest Descent:
w(n+1) = w(n) + μ [p - R w(n)]
Update equation for Least Mean-Square:
ŵ(n+1) = ŵ(n) + μ u(n) e*(n)
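
To make the update rule concrete, here is a minimal LMS sketch in Python/NumPy for real-valued signals (so the conjugation drops out); the system-identification setup, filter length M = 3, step size μ = 0.01 and noise level are arbitrary choices for this illustration, not part of the slides.

import numpy as np

def lms(u, d, M, mu):
    # Basic LMS for real signals: w_hat(n+1) = w_hat(n) + mu * u(n) * e(n)
    w_hat = np.zeros(M)                   # tap weights, initialised to zero
    e = np.zeros(len(u))
    for n in range(M - 1, len(u)):
        u_n = u[n - M + 1:n + 1][::-1]    # tap-input vector [u(n), ..., u(n-M+1)]
        e[n] = d[n] - w_hat @ u_n         # instantaneous estimation error
        w_hat = w_hat + mu * u_n * e[n]   # stochastic-gradient update
    return w_hat, e

rng = np.random.default_rng(0)
w_o = np.array([0.8, -0.4, 0.2])          # unknown system (the "Wiener solution" here)
u = rng.standard_normal(5000)             # white input
d = np.convolve(u, w_o)[:len(u)] + 0.05 * rng.standard_normal(len(u))
w_hat, e = lms(u, d, M=3, mu=0.01)
print(w_hat)                              # close to w_o, with small random fluctuation
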
LMS Algorithm


Since the expectations are omitted, the estimates will have a high variance.
Therefore, the recursive computation of each tap weight in the LMS
algorithm suffers from gradient noise.

In contrast to SD, which is a deterministic algorithm, LMS is a member of the
family of stochastic gradient descent algorithms.

LMS has higher MSE, J(∞), compared to SD, which attains Jmin (the Wiener solution), as n→∞
- i.e., J(n) → J(∞) as n→∞
- The difference is called the excess mean-square error, Jex(∞) = J(∞) - Jmin.
- The ratio Jex(∞)/Jmin is called the misadjustment.
- If J(∞) is a finite value, LMS is said to be stable in the mean-square sense.
- LMS performs a random motion around the Wiener solution.
LMS Algorithm





Involves a feedback connection.
Although LMS might seem difficult to work with due to the randomness, the
feedback acts as a low-pass filter or performs averaging so that the
randomness can be filtered out.
The time constant of this averaging is inversely proportional to μ.
In fact, if μ is chosen small enough, the adaptive process progresses
slowly and the effects of the gradient noise on the tap
weights are largely filtered out.
Computational complexity of LMS is very low → very attractive
- Only 2M+1 complex multiplications and 2M complex additions
per iteration.
LMS Algorithm
Canonical Model


The LMS algorithm for complex signals/with complex coefficients can be
represented in terms of four separate LMS algorithms for real
signals with cross-coupling between them.
Write the input/desired signal/tap weights/output/error in complex notation:
u(n) = u_I(n) + j u_Q(n),   d(n) = d_I(n) + j d_Q(n),
ŵ(n) = ŵ_I(n) + j ŵ_Q(n),   y(n) = y_I(n) + j y_Q(n),   e(n) = e_I(n) + j e_Q(n).
Canonical Model

Then the relations between these expressions are
y_I(n) = ŵ_I^T(n) u_I(n) + ŵ_Q^T(n) u_Q(n),
y_Q(n) = ŵ_I^T(n) u_Q(n) - ŵ_Q^T(n) u_I(n),
e_I(n) = d_I(n) - y_I(n),   e_Q(n) = d_Q(n) - y_Q(n),
ŵ_I(n+1) = ŵ_I(n) + μ [u_I(n) e_I(n) + u_Q(n) e_Q(n)],
ŵ_Q(n+1) = ŵ_Q(n) + μ [u_Q(n) e_I(n) - u_I(n) e_Q(n)].
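
The equivalence can be checked numerically. The sketch below (an illustrative single update step with made-up data, assuming the convention y(n) = ŵ^H(n) u(n) used above) runs the complex LMS update and the four cross-coupled real updates side by side and confirms they agree.

import numpy as np

rng = np.random.default_rng(1)
M, mu = 4, 0.05
u = rng.standard_normal(M) + 1j * rng.standard_normal(M)   # tap-input vector u(n)
d = rng.standard_normal() + 1j * rng.standard_normal()     # desired response d(n)
w = rng.standard_normal(M) + 1j * rng.standard_normal(M)   # current tap weights

# Complex LMS step: e = d - w^H u,  w <- w + mu * u * conj(e)
e = d - np.vdot(w, u)                    # np.vdot conjugates its first argument
w_complex = w + mu * u * np.conj(e)

# Canonical model: four real LMS updates with cross-coupling
uI, uQ = u.real, u.imag
dI, dQ = d.real, d.imag
wI, wQ = w.real.copy(), w.imag.copy()
yI = wI @ uI + wQ @ uQ                   # in-phase output
yQ = wI @ uQ - wQ @ uI                   # quadrature output
eI, eQ = dI - yI, dQ - yQ
wI = wI + mu * (uI * eI + uQ * eQ)       # cross-coupled weight updates
wQ = wQ + mu * (uQ * eI - uI * eQ)

print(np.allclose(w_complex, wI + 1j * wQ))   # True: the two forms coincide
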
Analysis of the LMS Algorithm

Although the filter is a linear combiner, the algorithm is highly nonlinear and violates superposition and homogeneity.

Assume the initial condition ŵ(0) = 0; the filter then processes the
input u(n) to produce the output y(n) = ŵ^H(n) u(n).

Analysis will continue using the weight-error vector
ε(n) = ŵ(n) - wo.
Analysis of the LMS Algorithm

We have
ŵ(n+1) = ŵ(n) + μ u(n) e*(n),   with   e(n) = d(n) - ŵ^H(n) u(n).

Let
eo(n) = d(n) - wo^H u(n)   (the estimation error of the Wiener filter), so that
e(n) = eo(n) - ε^H(n) u(n).

Then the update eqn. can be written as
ε(n+1) = [I - μ u(n) u^H(n)] ε(n) + μ u(n) eo*(n).

Analyse convergence in an average sense:
- Algorithm is run many times → study the ensemble-average behavior (see the sketch below).
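
As a rough illustration of ensemble averaging (the system, step size and number of runs below are my own choices), the following sketch averages the weight-error vector ε(n) = ŵ(n) - wo over many independent LMS runs; the average decays smoothly even though each individual run is noisy.

import numpy as np

rng = np.random.default_rng(2)
M, mu, N, runs = 3, 0.01, 2000, 200
w_o = np.array([0.8, -0.4, 0.2])

eps_avg = np.zeros((N, M))                 # ensemble average of eps(n) = w_hat(n) - w_o
for _ in range(runs):
    u = rng.standard_normal(N + M)
    d = np.convolve(u, w_o)[:len(u)] + 0.05 * rng.standard_normal(len(u))
    w_hat = np.zeros(M)
    for i, n in enumerate(range(M - 1, N + M - 1)):
        u_n = u[n - M + 1:n + 1][::-1]     # tap-input vector [u(n), ..., u(n-M+1)]
        e = d[n] - w_hat @ u_n
        w_hat = w_hat + mu * u_n * e
        eps_avg[i] += (w_hat - w_o) / runs # accumulate the ensemble average

# The ensemble-averaged weight error decays smoothly towards zero,
# even though every single realization is noisy.
print(np.linalg.norm(eps_avg[0]), np.linalg.norm(eps_avg[-1]))
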
Analysis of the LMS Algorithm

Using
ε(n+1) = [I - μ u(n) u^H(n)] ε(n) + μ u(n) eo*(n)
and the small step size assumption, it can be shown that
E[ε(n+1)] ≈ (I - μR) E[ε(n)],
since E[u(n) eo*(n)] = 0 by the principle of orthogonality.
(Here we use expectation; however, it is actually the ensemble average!)
Small Step Size Analysis

Assumption I: The step size μ is small → the LMS filter acts like a low-pass
filter with a very low cut-off frequency.

Assumption II: The desired response is described by a linear multiple
regression model that is matched exactly by the optimum Wiener
filter,
d(n) = wo^H u(n) + eo(n),
where eo(n) is the irreducible estimation error and Jmin = E[|eo(n)|²].

Assumption III: The input and the desired response are jointly
Gaussian.
Small Step Size Analysis

Applying the similarity transformation resulting from the eigendecomposition
R = Q Λ Q^H
to the (small step size) weight-error recursion
ε(n+1) ≈ (I - μR) ε(n) + μ u(n) eo*(n),
i.e. defining
v(n) = Q^H ε(n),

then we have
v(n+1) = (I - μΛ) v(n) + φ(n),
where
φ(n) = μ Q^H u(n) eo*(n)
(we do not have this stochastic-force term in Wiener filtering!).
Components of v(n) are uncorrelated!
Small Step Size Analysis

Components of v(n) are uncorrelated, so each one obeys a first-order
difference equation driven by a stochastic force:
v_k(n+1) = (1 - μλ_k) v_k(n) + φ_k(n),   k = 1, ..., M.

Solution: iterating from n = 0,
v_k(n) = (1 - μλ_k)^n v_k(0)                           ← natural component of v(n)
       + Σ_{i=0}^{n-1} (1 - μλ_k)^(n-1-i) φ_k(i)       ← forced component of v(n)
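
A quick numerical sketch of this recursion (with an arbitrary mode λ_k = 1, μ = 0.05 and a made-up white stochastic force) shows that the full solution is exactly the sum of the natural and forced components.

import numpy as np

rng = np.random.default_rng(3)
mu, lam, N = 0.05, 1.0, 400
a = 1 - mu * lam                         # geometric factor of the k-th mode
phi = 0.01 * rng.standard_normal(N)      # stochastic force phi_k(n)

v = np.zeros(N); v_nat = np.zeros(N); v_for = np.zeros(N)
v[0] = v_nat[0] = 1.0                    # initial condition v_k(0)
for n in range(N - 1):
    v[n + 1] = a * v[n] + phi[n]             # full first-order recursion
    v_nat[n + 1] = a * v_nat[n]              # natural component: a^n * v_k(0)
    v_for[n + 1] = a * v_for[n] + phi[n]     # forced component: driven from zero state

print(np.allclose(v, v_nat + v_for))         # True: superposition of the two components
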
Learning Curves

Two kinds of learning curves:
- Mean-square error (MSE) learning curve:  J(n) = E[|e(n)|²]
- Mean-square deviation (MSD) learning curve:  D(n) = E[||ε(n)||²]

Ensemble averaging → results of many (→∞) realizations are averaged.

What is the relation bw. MSE and MSD?
For μ small, both can be written in terms of the transformed weight-error
components v_k(n), as shown on the next slides.
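
Both curves can be estimated by ensemble averaging, as in the sketch below (an illustrative system-identification setup of my own; 100 realizations stand in for the many → ∞ averaging).

import numpy as np

rng = np.random.default_rng(4)
M, mu, N, runs = 3, 0.02, 1500, 100
w_o = np.array([0.8, -0.4, 0.2])
sigma_v = 0.1                                    # std of the irreducible error eo(n)

J = np.zeros(N)                                  # MSE learning curve  J(n) = E[|e(n)|^2]
D = np.zeros(N)                                  # MSD learning curve  D(n) = E[||eps(n)||^2]
for _ in range(runs):
    u = rng.standard_normal(N + M)
    d = np.convolve(u, w_o)[:len(u)] + sigma_v * rng.standard_normal(len(u))
    w_hat = np.zeros(M)
    for i, n in enumerate(range(M - 1, N + M - 1)):
        u_n = u[n - M + 1:n + 1][::-1]
        e = d[n] - w_hat @ u_n
        w_hat = w_hat + mu * u_n * e
        J[i] += e**2 / runs
        D[i] += np.sum((w_hat - w_o)**2) / runs

print(J[:3], J[-3:])    # decays towards J(inf), slightly above Jmin = sigma_v**2
print(D[:3], D[-3:])    # decays with a similar shape to the excess MSE
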
Learning Curves
For μ small,
J(n) ≈ Jmin + Σ_k λ_k E[|v_k(n)|²]
under the assumptions of slide 17.

Excess MSE:
- LMS performs worse than SD; there is always an excess MSE,
Jex(n) = J(n) - Jmin = Σ_k λ_k E[|v_k(n)|²]   ← use the solution for v_k(n) obtained earlier.
Learning Curves
or
Jex(n)/λ_max ≤ D(n) ≤ Jex(n)/λ_min,   since   D(n) = Σ_k E[|v_k(n)|²].

The mean-square deviation D is lower- and upper-bounded by (scaled versions of) the excess MSE.

They have a similar response: decaying as n grows.
Convergence

For μ small, each term E[|v_k(n)|²] evolves with the geometric factor (1 - μλ_k)².

Hence, for convergence,
|1 - μλ_k| < 1 for all k,
or
0 < μ < 2/λ_max.

The ensemble-average learning curve of an LMS filter does not
exhibit oscillations; rather, it decays exponentially to the constant value
J(∞) = Jmin + Jex(∞).
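
For a given tap-input correlation matrix the bound is easy to evaluate, as in this snippet (the Toeplitz correlation values r(k) = 0.9^|k| are made up for the example).

import numpy as np

# Example correlation matrix R of the tap inputs (made-up Toeplitz values r(k) = 0.9**|k|)
M = 5
R = np.array([[0.9 ** abs(i - j) for j in range(M)] for i in range(M)])
lam = np.linalg.eigvalsh(R)
print("lambda_max =", lam.max(), "-> step size must satisfy 0 < mu <", 2 / lam.max())
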
Misadjustment

Misadjustment: define
ℳ ≜ Jex(∞)/Jmin.

For small μ, from the previous slides,
Jex(∞) ≈ (μ/2) Jmin Σ_k λ_k,   or equivalently   ℳ ≈ (μ/2) Σ_k λ_k,

but
Σ_k λ_k = tr(R) = M r(0) = M E[|u(n)|²],

then
ℳ ≈ (μ/2) M E[|u(n)|²].
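
A quick numerical check of the small-μ formula (setup and numbers are my own): the theoretical value (μ/2)·tr(R) is compared with a misadjustment estimate obtained from steady-state simulation.

import numpy as np

rng = np.random.default_rng(5)
M, mu, N, runs, tail = 3, 0.02, 4000, 100, 1000
w_o = np.array([0.8, -0.4, 0.2])
sigma_v = 0.1
J_min = sigma_v ** 2                           # irreducible error power

# White unit-power input: R = I, so tr(R) = M and the small-mu theory gives
misadjustment_theory = 0.5 * mu * M            # (mu/2) * tr(R) = (mu/2) * M * E[|u|^2]

# Estimate J(inf) by averaging the squared error over the last `tail` samples of many runs
J_inf = 0.0
for _ in range(runs):
    u = rng.standard_normal(N + M)
    d = np.convolve(u, w_o)[:len(u)] + sigma_v * rng.standard_normal(len(u))
    w_hat = np.zeros(M)
    for n in range(M - 1, N + M - 1):
        u_n = u[n - M + 1:n + 1][::-1]
        e = d[n] - w_hat @ u_n
        w_hat = w_hat + mu * u_n * e
        if n >= N + M - 1 - tail:              # steady-state portion only
            J_inf += e**2 / (runs * tail)

print("theory   :", misadjustment_theory)
print("simulated:", (J_inf - J_min) / J_min)   # roughly the same order as the theory
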
Average Time Constant

From SD we know that
τ_mse,k ≈ 1/(2μλ_k),  so on average  τ_mse,av = 1/(2μλ_av)  with  λ_av = (1/M) Σ_k λ_k,

but
ℳ ≈ (μ/2) Σ_k λ_k = (μ/2) M λ_av,

then
ℳ ≈ M/(4 τ_mse,av).
Observations

Misadjustment is
- directly proportional to the filter length M, for a fixed τ_mse,av,
- inversely proportional to the time constant τ_mse,av
  → slower convergence results in lower misadjustment,
- directly proportional to the step size μ
  → smaller step size results in lower misadjustment.

Time constant is
- inversely proportional to the step size μ
  → smaller step size results in slower convergence.

Large μ requires the inclusion of the higher-order terms ε_k(n) (k ≥ 1) into the analysis:
- difficult to analyse, small step analysis is no longer valid,
- learning curve becomes more noisy.
LMS vs. SD





The main goal is to minimise the Mean Square Error (MSE).
The optimum solution is found by the Wiener-Hopf equations:
- requires auto/cross-correlations,
- achieves the minimum value of MSE, Jmin.
LMS and SD are iterative algorithms designed to find wo.
- SD has direct access to the auto/cross-correlations (exact measurements)
  → can approach the Wiener solution wo, can go down to Jmin.
- LMS uses instantaneous estimates instead (noisy measurements)
  → fluctuates around wo in a Brownian-motion manner, can go down at most to J(∞).
LMS vs. SD

Learning curves:
- SD has a well-defined curve composed of decaying exponentials.
- For LMS, the curve is composed of noisy decaying exponentials (compare the sketch below).
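
The contrast can be reproduced with a few lines (same kind of made-up setup as before): SD uses the exact R and p and settles essentially at wo, while LMS uses instantaneous estimates and keeps fluctuating around wo.

import numpy as np

rng = np.random.default_rng(6)
M, mu, N = 3, 0.02, 3000
w_o = np.array([0.8, -0.4, 0.2])
sigma_v = 0.1

# White unit-power input: exact statistics are R = I and p = R w_o = w_o
R, p = np.eye(M), w_o.copy()

u = rng.standard_normal(N + M)
d = np.convolve(u, w_o)[:len(u)] + sigma_v * rng.standard_normal(len(u))

w_sd = np.zeros(M)                         # steepest descent: deterministic recursion
w_lms = np.zeros(M)                        # LMS: stochastic recursion
for n in range(M - 1, N + M - 1):
    u_n = u[n - M + 1:n + 1][::-1]
    w_sd = w_sd + mu * (p - R @ w_sd)      # exact gradient step
    e = d[n] - w_lms @ u_n
    w_lms = w_lms + mu * u_n * e           # noisy (instantaneous) gradient step

print("SD :", w_sd)     # essentially equal to w_o
print("LMS:", w_lms)    # hovers around w_o with small random deviations
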
Statistical Wave Theory



As filter length increases, M→∞:
- Propagation of electromagnetic disturbances along a transmission
line towards infinity is similar to signals on an infinitely long LMS filter.
Finite-length LMS filter (transmission line):
- Corrections have to be made at the edges to tackle reflections.
- As the length increases, the reflection region decreases compared to the
total filter.
This imposes a limit on the step size to avoid instability as M→∞:
μ < 2/(M·Smax),
where Smax is the maximum value of the PSD S(ω) of the tap
inputs u(n).

If this upper bound is exceeded, instability is observed.
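
The bound can be evaluated from an estimate of the tap-input PSD, as in this sketch (the AR(1) input model and the averaged length-256 periodogram are my own choices).

import numpy as np

rng = np.random.default_rng(7)
# AR(1) tap-input process u(n) = 0.9 u(n-1) + white noise (made-up example)
n_samples, nfft = 200_000, 256
u = np.zeros(n_samples)
for n in range(1, n_samples):
    u[n] = 0.9 * u[n - 1] + rng.standard_normal()

# Averaged periodogram as a crude estimate of the PSD S(omega)
segments = u[: (n_samples // nfft) * nfft].reshape(-1, nfft)
S = np.mean(np.abs(np.fft.fft(segments, axis=1)) ** 2, axis=0) / nfft
S_max = S.max()

M = 64                                         # filter length (arbitrary for the example)
print("S_max =", S_max, "-> mu must satisfy mu <", 2 / (M * S_max))
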