High Performance Computing 811

Transcript High Performance Computing 811

Computational Methods
in Physics
PHYS 3437
Dr Rob Thacker
Dept of Astronomy & Physics (MM-301C)
[email protected]
Today’s Lecture

Functions and roots: Part I
Polynomials
 General rooting finding

Bracketing & the bisection method
 Newton-Raphson method
 Secant method

Functions & roots

This is probably the first numerical task we encounter



Many graphical calculators have built-in routines for finding
roots
Solving f(x)=0 is a problem frequently encountered in
physics
First examine solutions for polynomials:
n
f n ( x)   ai x i  0 with an  1
i 0
n  1 : f1 ( x)  a0  x ; x  a0 is the only root
2

a

a
1
1  4 a0
2
n  2 : f 2 ( x)  a0  a1 x  x ; x 
2
Higher orders: cubic

Cubic formula was discovered by Tartaglia (1501-1545)
but published by Cardan (~1530) without permission!
n  2 : f 3 ( x)  a0  a1 x  a2 x 2  x 3  ( x  x1 )( x  x2 )( x  x3 )
1
x1  ( a2       )
3
1
x2  (a2   2     )
3
1
x1  ( a2      2  )
3
where
1/ 3


9
27
     a23  a2 a1  a0  i 3 
2
2


  a22 a12  4a12  4a23 a0  27 a02  18a2 a1a0
1
2
  (1  i 3 )
; i  -1
Higher orders: quartic & higher



The general solution to a quartic was discovered by Ferrari (1545) but is
extremely lengthy
Evariste Galois (1811-1832) showed that there is no general solution to a
quintic & higher formulas
Remember certain higher order equations may be rewritten as quadratics,
cubics in x2 (say) e.g.
x  3x  1  ( x )  3( x )  1
4



2 2
2
What is the best approach for polynomials?
Always best to reduce to simplest form by dividing out factors
If coefficients sum to zero then x=1 is a solution and (x-1) can be factored to
give a (x-1)*(polynomial order n-1)


2
Can you think why?
You may be able to repeatedly factor larger polynomials by inspection

Only useful for a single set of coefficients though
Example



Consider f(x)=x5+4x4-10x2x 4  5x3  5x 2  5x  6
x+6=0
( x  1) x 5  4 x 4  0 x 3  10 x 2  x  6
Firstly sum of coefficients is 0, so
4
5
x

x
x=1 is a solution – factor (x-1) by
long division of the polynomial
5x 4  0x3
Hence we find that
5x 4  5x3
f(x)=(x-1)(x4+5x3+5x2-5x-6)
5 x 3  10 x 2


Notice that the coefficients in the
second bracket sum to 0…
In fact we can show that further
division gives
f(x)=(x-1)2(x+1)(x+2)(x+3)
5x3  5x 2
 5x 2  x
 5x 2  5x
So regardless of whether an analytic general solution exists, we may still be able to
find roots without using numerical root finders…
 6x  6
Advice on using the general
quadratic formula



For the quadratic f(x)=x2+a1x+a0 the product of the
roots must be a0 since f(x)=(x-x1)(x-x2) where x1 & x2
are the roots
Hence x2=a0/x1 so given one root, provided a0≠0, we
can then quickly find the other root
Using the manipulation discussed in lecture 3
If a =0 then
0
x1 
 a1  a  4a0  a1  a  4a0
2 a0


2
 a1  a12  4a0  a1  a12  4a0
2
1
Then define

1
d   a1  sgn( a1 ) a12  4a0
2
 x1  a0 / d
 x2  d
2
1

we have divide
by zero here.
sgn(a1) is a function that is 1
if a1 is positive, and -1 if a1 is
negative. This ensures that the
discriminant is added with the
correct sign.
More general root finding

For a more general function, say f(x)=cos x – x
the roots are non-trivial
In this case the solution
is given where g(x)=h(x)
h(x)=x
1.0
g(x)=cos x
0.8
p/2
Graphing actually helps a
great deal since you can see
there must only be one
root. In this case around 0.8.
It is extremely beneficial to know
what a given function will look
like (approximately) – helps you
to figure out whether your solutions
are correct or not.
Iterative root finding methods

It is clearly useful to construct numerical algorithms that will take
any input f(x) and search for solutions

However, there is no perfect algorithm for a given function f(x)



Above all you should not use algorithms as black boxes


Some converge fast for certain f(x) but slowly for others
Certain f(x) may not yield numerical solutions
You need to know what the limitations are!
We shall look at the following algorithms




Bisection
Newton-Raphson
Secant Method
Muller’s Method
Bracketing & Bisection


Bracketing is a beautifully simple
idea
Anywhere region bounded by xl
and xu, such that the sign of a
function f(x) changes between
f(xl) and f(xu), there must be at
least one root


This is provided that the function
does not have any infinities within
the bracket – f(x) must be
continuous within the region
The root is said to be bracketed by
the interval (xl,xu)
f(x)
xl
xu
x
Other possibilities

Even if a region is
bracketed by two
positive (or two negative)
values a root may still
exist

There could even be 2n
roots in this region!

Where n is a natural
number
x2 x3
xl
xu
xl
xu
Pathological behaviour occurs
This is a graph of sin(1/x). As x→0 the function develops an infinite
number of roots.
But don’t worry! This is not a course in analysis…
Bracketing singularities

f(x)=1/(x-1) is not continuous within the bracketed region [0.5,1.5].Although f(0.5)<0 and
f(1.5)>0 there is no root in this bracket (regardless of the fact that the plotting program connects
the segments) f(x) is not continuous within the bracket.
Root finding by bisection

Step 1: Given we know
that there exists an
f(x0)=0 solution, choose
xl and xu as the brackets
for the root


Since the signs of f(xl) &
f(xu) must be different, we
must have f(xl)×f(xu) < 0
This step may be nontrivial – we’ll come back to
it
xl
xu
For example, for f(x)=cos x –x we know that 0<x0<p/2
f(x)
Bisection: Find the mid-point

Step 2: Let xm=0.5(xl+xu)



the mid point between the two
brackets
This point must be closer
to the root than one of xl
or xu
The next step is to select
the new bracket from
xl,xm,xu
xl
xm
xu
Bisection: Find the new bracket

Step 3:

If f(xl)×f(xm)<0



If f(xl)×f(xm)>0



root lies between xl and xm
Set xl=xl ; xu=xm
root lies between xm and xu
Set xl=xm; xu=xu
If f(xl)×f(xm)=0

root is xm you can stop
xl
xm
xu
Final Step in Bisection

Step 4: Test accuracy (we want to stop at some
point)
xm is the best guess at this stage
 Absolute error is |xm-x0| but we don’t know what x0
is!
 Error estimate: e|xmn-xmn-1| where n corresponds
to nth iteration
 Check whether e< where  is the tolerance that you
have to decide for yourself at the start of the root
finding

If e< then stop
 If e> then return to step 1

Example


Let’s consider f(x)=cos x - x
Pick the initial bracket as xl=0 and xu=p/2


f(0)=1, f(p/2)=-p/2 – check this satisfies
f(0)×f(p/2)<0 (yes - so good to start from here)
First bisection: xm=0.5(xl+xu)=p/4=0.785398
f(xl(1))=f(0)=1
f(xl(1))×f(xm(1))<0
 f(xm(1))=f(p/4)=-0.078
 f(xu(1))=f(p/2)=-p/2
 So select xl(1) & xm(1) as the new bracket

Repeat…




After eleventh bisection:
xm(11)=963p/4096=0.738612
After twelfth bisection:
xm(12)=1927p/8192=0.738995
So after first 12 bisections we have only got the
first three decimal places correct
Typically to achieve 9 decimal places of accuracy
will require 30-40 steps
Pros & cons of bisection

Pro: Bisection is robust




You will always find a root
For this fact alone it should be taken seriously
Pro: can be programmed very straightforwardly
Con: bisection converges slowly…

Width of the bracket: xun-xln=en=en-1/2




Width halves at each stage
We know in advance if the initial bracket is width e0 after n
steps the bracket will have width e0/2n
If we have a specified tolerance  then this will be achieved
when n=log2(e0/)
Bisection is said to converge linearly since en+1en
Newton-Raphson Method


For many numerical methods, the underlying algorithms begins with a Taylor
expansion
Consider the Taylor expansion of f(x) around a point x0
f ( x)  f ( x0 )  ( x  x0 ) f ' ( x0 )  ...

If x is a root then
0  f ( x0 )  ( x  x0 ) f ' ( x0 )
f ( x0 )
 x  x0 
f ' ( x0 )

This formula is applicable provided f’(x0) is not an extremum (f’(x0)≠0))
Applying the formula

If we interpret x0 to be the current guess we use the
formula to give a better estimate for the root, x

We can repeat this process iteratively to get better and better
approximations to the root



We may even get the exact answer if we are lucky
Take f(x)=cos x – x, then f’(x)=-sin x -1
It helps to make a good guess for the first root, so we’ll
take x0(1)=p/4=0.785398



The (1) signifies the first in a series of guesses
f(x0(1))=-0.0782914
f’(x0(1))=-1.7071068
Steps in the iteration

Plugging in the values for f(x0(1)), f’(x0(1)) we get
x0( 2)

We repeat the whole step again, with the values for x0(2)
x0(3)


(1)
f
(
x
0.0782914
(1)
0 )
 x0 
 0.785398 
 0.739536
(1)
f ' ( x0 )
1.7071068
( 2)
f
(
x
0.0007546
( 2)
0 )
 x0 
 0.739536 
 0.739085
( 2)
f ' ( x0 )
1.673945
At this stage f(x0(3))=0.0000002, so we are already very
close to the correct answer – after only two iterations
It helps to see what is happening geometrically…
Geometric Interpretation of N-R

The general step in the
iteration process is
x
( n 1)
x
(n)
(n)
f (x )

(n)
f '(x )
Draw out a line from f(x(n)) with slope
f ’(x(n)) – equivalent to a line y=m(x-x(n))+b
where m=f ’(x(n)) and b=f(x(n)) – check it!
The next step in the iteration is when y=0.
x(n+2)
x(n)
So N-R “spirals in” on
the solution
x(n+3) xi=x(n+1)
f(x)
Problems with N-R

The primary problem with N-R is that if you are
too close to an extremum the derivative f’(x(n))
can become very shallow:
In this case the root was originally
bracketed by (xl,xu), but the value
of x(n+1) will lie outside that range
xl
x(n)
to x(n+1)
xu
f(x)
NR-Bisection Hybrid

Let’s combine the two methods




Good convergence speed from NR
Guaranteed convergence from bisection
Start with xl<x(n)<xu as being the current best guess within the
bracket
(n)
f
(
x
)
( n 1)
(n)
Let
xNR
x 
( n)
f '(x )


If xl<xNR(n+1)<xu then take an NR step
Else set xNR(n+1)=xm(n+1)=0.5(xl+xu), check and update xl and xu values
for the new bracket

i.e. we throw away any bad NR steps and do a bisection step instead
Example: Full NR-Bisection
algorithm


Let’s look at how we might design the algorithm in
detail
Initialization:

Given xl and xu from user input evaluate f(xl) & f(xu)


Ensure root is properly bracketed:


Test f(xl)×f(xu)<0 – if not report to user
Initialize x(n) by testing values of f(xl) & f(xu)




Test that xl≠xu – if they are equal report to user
If |f(xl)| < |f(xu)| then x(n)=xl and set f(x(n))= f(xl)
else let x(n)=xu and set f(x(n))=f(xu)
Evaluate the derivative f’(x(n))
Set the counter for the number of steps to zero: icount=0
Full NR-Bisection algorithm cont.

(1) Perform NR-bisection



Increment icount
Evaluate xNR(n+1)=x(n)-f(x(n))/f’(x(n))
If (xNR(n+1)<xl) or (xNR(n+1)>xu) then require bisection




Calculate error estimate: e=|x(n+1)-x(n)|
Set x(n)=x(n+1)
Test error


If icount>max # iterations then go to (3)
Prepare for next iteration




If |e/x(n)|< tolerance go to (2)
Test number of iterations


Set xNR(n+1)=0.5(xl+xu)
Set value of f(x(n))
Set value of f’(x(n))
Update the brackets using Step 3 of the bisection algorithm
Go to 1
Final parts of the algorithm



(2) Report the root and stop
(3) Report current best guess for root. Inform user max
number of iterations has been exceeded
While it is tempting to think that these reporting stages
are pointless verbiage – they absolutely aren’t!


Clear statements of the program state will help you to debug
tremendously
Get in the habit now!
Secant Method - Preliminaries

One of the key issues in N-R is that you need the derivative



May not have it if the function is defined numerically
Derivative could be extremely hard to calculate
You can get an estimate of the local slope using values of x(n),x(n-1), f(x(n)) and f(x(n-1))
y f ( x ( n ) )  f ( x ( n 1) )
f '(x ) 

x
x ( n )  x ( n 1)
(n)
x=x(n)-x(n-1)
x(n)
y=f(x(n))-f(x(n-1))
x(n-1)
Estimate of function
slope
f(x)
Secant Method

So once you initially select a pair of points bracketing
the root (x(n-1), x(n)), x(n+1) is given by
x


( n 1)
x
( n)
( n 1)
x x
 f (x )
(n)
( n 1)
f (x )  f (x )
(n)
(n)
You then iterate to a specific tolerance value exactly as
in NR
This method can converge almost as quickly as NR
Issues with the Secant Method

Given the initial values xl(=x(n-1)),xu(=x(n)) that
bracket the root the calculation of x(n+1) is local


Easy to program, all values are easily stored
Difficulty is with finding suitable xl,xu since this
is a global problem
Determined by f(x) over its entire range
 Consequently difficult to program (no perfect
method)

Case where Secant will be poor
2
Estimated slope is pretty
much always 45 degrees
until you get close to the
Root.
13
This is actually a difficult function to
deal with for many root finders…
Bracket finding Strategies

Graph the function


May require 1000’s of points and thus function calls
to determine visually where the function crosses the
x-axis
“Exhaustive searching” – a global bracket finder
Looks for changes in the sign of f(x)
 Need small steps so that no roots are missed
 Need still to take large enough steps that this doesn’t
take forever

Summary




When dealing with high order polynomials don’t rule out the
possibility of being able to factorize by hand
For general functions, bisection is simple but slow
 Nonetheless the fact that it is guaranteed to converge from
good starting points is extremely useful
Newton-Raphson can have excellent convergence properties
 Watch out for poor behaviour around extremums
Secant method can be used in place of NR when you cannot
calculate the derivative
Next Lecture



More on Global bracket finding
Müller-Brent method
Examples

High Performance Computing 811

Transcript High Performance Computing 811

Directory