PHYS2020
NUMERICAL ALGORITHM NOTES
ROOTS OF EQUATIONS
Finding the Roots of Equations
What does finding the roots of an equation mean?
It means finding the x values for which y = 0, assuming our relationship is y = f(x).
Let's take a small (familiar?) example:
Find the roots of the equation y = x^2 + 3x - 4.
What are the roots?
Start with y = 0, so
0 = x^2 + 3x - 4
Factorise to get:
y = 0 = (x + 4)(x - 1)
To make y = 0 true, either x = -4 or x = 1.
Graphically
[Figure: the parabola y = x^2 + 3x - 4 crossing the x-axis at x = -4 and x = 1]
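The factorised roots are easy to check numerically. A minimal Python sketch (the function name f is just for illustration):

```python
# Check the roots of y = x^2 + 3x - 4 found by factorising into (x + 4)(x - 1).
def f(x):
    return x**2 + 3*x - 4

print(f(-4), f(1))  # both are 0, confirming x = -4 and x = 1 are roots
```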
Why is finding the roots of an equation important?
Finding when an equation is equal to 0 is very important. If you want
to minimise (or maximise) something you need to find the point where
the derivative is equal to zero.
e.g. If

y = x^2

then the minimum or maximum of this function is found when

dy/dx = 0

so, in this case

dy/dx = 2x = 0
A max/min occurs when dy/dx=0 (i.e. the gradient equals 0)
So, we have 2x = 0, so x=0. The minimum value for this
equation occurs when x = 0.
How can we tell if it is a minimum or a maximum? Try
taking the second derivative!
d^2y/dx^2 = 2. The value is positive, so x = 0 is a
minimum.
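The sign test above can be checked with a few lines of Python; the helper name dydx is illustrative:

```python
# For y = x^2, the derivative dy/dx = 2x is zero at x = 0 and changes sign
# from negative to positive there, so x = 0 is a minimum.
def dydx(x):
    return 2 * x

assert dydx(0) == 0
assert dydx(-0.1) < 0 < dydx(0.1)  # slope goes from downhill to uphill
```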
Optimisation
There are many examples where we may wish to find the
minimum or maximum value of a function. Wishing to
maximise or minimise a function is often known as
optimisation.
This is equivalent to finding the roots of the derivative of
that function.
Finding the Roots
Numerically
Optimisation is a common strategy used in experimental
physics.
A good example is trying to measure the black-body
function of a protostar.
We measure the intensity of the
electromagnetic radiation from the
star at a number of wavelengths
and then try and find the best fit
between the measured values and
those predicted by the exact
blackbody function.
Experimental uncertainties usually guarantee that no one
blackbody function will fit the measurements exactly, so we
try to find the blackbody function (or often a range of
functions) that minimises the difference between measured
and predicted values.
[Figure: hypothetical measured values marked with an x, overlaid with several candidate blackbody curves. Which curve is the best fit? Finding out takes the use of numerical optimisation.]
Finding Equation Roots with the Bisection Method
The bisection method is what is known as a bracketing
method because we bracket the root by finding a place where
the function is positive and a place where it is negative. We
successively halve this interval to eventually find the root.
[Figure: a curve y = f(x) crossing the x-axis, with y > 0 on one side and y < 0 on the other. The root is between these values.]
For y = f(x), if we find an x value where y is positive and another
where y is negative, then we are guaranteed that a root exists on the
interval bracketed by these two values of x, provided of course that y
is continuous over this interval.
Once we find x1 and x2 such that y1 and y2 have opposite signs, we
then evaluate y for xm, the midpoint between x1 and x2.
Then take a new interval bracketed by either x1 and xm, or xm and x2,
depending on which interval is bracketed by a positive and a negative
value of y. In the diagram below, after calculating xm we know that the
root lies between x1 and xm.
[Figure: y1 < 0 at x1 and y2 > 0 at x2, with the midpoint xm between them. Is ym positive or negative? The root is bracketed between x1 and x2.]
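The procedure just described can be sketched in Python. This is a minimal illustration under the stated conditions (f continuous, bracket with opposite signs); the name bisect and its parameters are assumptions for this sketch:

```python
# Bisection: given f continuous on [x1, x2] with f(x1) and f(x2) of opposite
# signs, repeatedly halve the bracketing interval until it is smaller than tol.
def bisect(f, x1, x2, tol=1e-10, max_iter=100):
    y1, y2 = f(x1), f(x2)
    if y1 * y2 > 0:
        raise ValueError("f(x1) and f(x2) must have opposite signs")
    for _ in range(max_iter):
        xm = 0.5 * (x1 + x2)
        ym = f(xm)
        if ym == 0 or abs(x2 - x1) < tol:
            return xm
        if y1 * ym < 0:      # root lies in [x1, xm]
            x2 = xm
        else:                # root lies in [xm, x2]
            x1, y1 = xm, ym
    return 0.5 * (x1 + x2)

root = bisect(lambda x: x**2 + 3*x - 4, 0.0, 3.0)  # brackets the root at x = 1
```

Note that only the sign of f is ever used, which is why bisection also works on tabulated laboratory data with no analytic form.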
Some comments on bracketing methods
ROBUSTNESS:
Bracketing is a robust method of root finding. It will always give us
a root provided the conditions mentioned are met.
SPEED OF CONVERGENCE:
However, bracketing can converge slowly compared to fixed-point
methods (such as Newton's method). The trade-off is that although
fixed-point methods may converge faster, there is no guarantee that
they will converge at all.
FORM OF FUNCTION f(x):
One big advantage of bracketing methods is that an analytic form of
the function f(x) need not be known.
This is particularly useful when you have a series of laboratory
measurements that are positive and negative – for example the
position of a chaotic oscillator at particular times. Sometimes it may
be above the equilibrium position and other times below.
Convergence of the bisection method
It is easy to see how quickly the bisection method converges, and what
the uncertainty in the value of the root we obtain will be after n iterations.
The uncertainty in x_n (the value of the root after n iterations) will be
related to the size of the interval after n iterations (bisections). The time
taken to converge to this level of uncertainty will be approximately the
time it takes for n iterations of your code. After n iterations, the size of
the interval I_n will be:

I_0 = x_b - x_a
I_1 = (x_b - x_a) / 2
I_2 = (x_b - x_a) / 2^2
...
I_n = (x_b - x_a) / 2^n

So, after n iterations, the uncertainty in x_n (compared to the true
unknown value of the root, x) is known to be

|x_n - x| <= (x_b - x_a) / 2^n
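Rearranging I_n = (x_b - x_a) / 2^n for n gives the number of bisections needed to reach a target uncertainty, which is easy to compute (a small sketch; iterations_needed is an assumed name):

```python
import math

# n bisections shrink the interval by a factor of 2**n, so reaching an
# uncertainty of tol requires n = ceil(log2((x_b - x_a) / tol)) iterations.
def iterations_needed(xa, xb, tol):
    return math.ceil(math.log2((xb - xa) / tol))

n = iterations_needed(0.0, 1.0, 1e-6)  # a unit interval down to ~1e-6: 20 bisections
```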
The Newton-Raphson Method
The Newton-Raphson method (often just called Newton’s
method) is a fixed-point method of finding the roots to an
equation.
This is an iterative method based on truncating the Taylor
series expansion of f(x) at the first order term.
It can also be arrived at by geometric considerations alone,
which is what we will use here.
To use Newton's method we need to know the form of f(x), and we need an
initial approximation to the root, x0.
[Figure: the curve y = f(x), with the initial guess x0 and the value f(x0) marked.]
We construct the tangent line to f(x) at x0, and then find the point, x1, where
the tangent line intercepts the x-axis. We take x1 as the new approximation
to the root and then do the same again.
[Figure: the tangent line to y = f(x) at x0 intercepts the x-axis at x1, the new approximation to the root.]
We now construct the tangent line to f(x) at x1, and then find the point, x2,
where the new tangent line intercepts the x-axis. We then take x2 as the
next approximation to the root and iterate to find further approximations.
In the example illustrated here we are converging quite quickly.
[Figure: repeating the tangent construction at x1 gives x2, which is already very close to the root.]
To obtain a mathematical expression for Newton’s method we need to find
an expression for x1 in terms of x0 and f(x).
We start by finding the equation for the tangent line at f(x0), and then finding
where this intercepts the x-axis to find x1.
[Figure: the tangent line at (x0, f(x0)) and its x-axis intercept, x1.]
What do we know about the tangent line to f(x) at x0?
Remember that we can only use this method if we know f(x) analytically.
So, if we have a first guess at the root, x0, we also know f(x0).
This means that we know one point on the tangent line, (x0, f(x0)).
We also know the slope of the tangent line, which is given by

m = f'(x0)
Remembering that the equation of a straight line where we know one point
and the slope is given by the expression

(y - y0) / (x - x0) = slope
We can substitute in (x0, f(x0)) and f'(x0) to get the equation for the tangent
line:

(y - f(x0)) / (x - x0) = f'(x0)
Now that we know the general equation for the tangent line, we want to
know the x-value, x1, where this line crosses the x-axis, so we set y = 0.
This gives

(0 - f(x0)) / (x1 - x0) = f'(x0)

so

-f(x0) / f'(x0) = x1 - x0

and therefore

x1 = x0 - f(x0) / f'(x0)
So, more generally, Newton's method tells us that

x_{n+1} = x_n - f(x_n) / f'(x_n)
Convergence of Newton’s Method
Newton's method converges quadratically. This means that the error at the
(n + 1)th step is proportional to the square of the error at the nth step. (This
is in contrast to the bisection method, which exhibits linear convergence.)
Quadratic convergence is good if your first approximation is close to the
root you are seeking. The square of a small number is a much smaller
number.
However, in practical terms, if your first approximation to the root you are
seeking is not "close enough" to that root, the convergence may be very
slow, or the iteration may not converge at all.
“Close enough” is a physicist’s term rather than a mathematician’s term. I
(as a physicist) would merely caution you that you will find Newton’s
method particularly useful for solving problems such as finding the correct
blackbody function for an observed protostar, but you will need to find a
good first approximation to the root you seek.
Advantages of Newton’s Method
•Newton’s method is not as “robust” as the bisection method, but it does
have a very significant advantage:
•It generalises fairly easily to multi-dimensional problems, where you
need to find a root in multi-dimensional space, given n simultaneous
equations each involving n scalar variables, whereas the bisection
method does not.
•In multi-dimensional space all the problems of convergence are
magnified. Newton's method is prone to either finding a "local" minimum (i.e.
not the root you want) or not converging at all.
•Of course, this can be overcome – just remember that you must use
“physical” considerations to find a good first approximation for your desired
minimum or root!
•In the case of trying to fit observations of the intensity of a protostar at
various wavelengths to a single blackbody function, each
measurement gives an equation predicting temperature on the basis of
that measurement. If they all suggest the same temperature, life is
easy. Generally, however, each one gives a different temperature, so it
is a matter of trying to minimise the difference between observed and
predicted values.
•In my experience of experimental physics this general class of
problem is one of the most important we deal with:
Postulate underlying natural law (e.g. a blackbody function to explain emission from a protostar)
→ Make predictions that experiment can test (e.g. what intensity do we expect to observe at a given wavelength?)
→ Get the best match possible between measurements and the underlying law (e.g. minimise the difference between observed values and postulated laws).