CM0368 Scientific Computing


Spring 2009 Professor David Walker [email protected]

Schedule

• Weeks 1 & 2 (6): Numerical linear algebra (DWW)
  – Solving Ax = b by Gaussian elimination and Gauss-Seidel.
  – Algebraic eigenvalue problem. Power method and QR method.

• Weeks 3 & 4 (6): Numerical solution of differential equations (BMB)
  – Ordinary differential equations (finite difference and Runge-Kutta methods).
  – Partial differential equations (finite difference method).

• Weeks 5 & 6 (5): Applications in Physics (BMB)
  – Laplace and Helmholtz equations, Schrödinger problem for the hydrogen atom, etc.

• Weeks 6 & 7 (4): Numerical optimisation (DWW)

• Weeks 2 – 11: Tutorials on Wednesdays at 1pm in T2.07. These will be given by Kieran Robert.

Text Book

• “Numerical Computing with MATLAB” by Cleve B. Moler, SIAM Press, 2004. ISBN 0898715601.

• Reading list: http://www.readinglists.co.uk/rsl/student/sviewlist.dfp?id=20248

• Web edition at http://www.mathworks.com/moler/

Web Site

• Lecture notes and other module material can be found at: http://users.cs.cf.ac.uk/David.W.Walker/CM0368

Numerical Linear Algebra

• A system of n simultaneous linear equations can be represented in matrix notation as Ax = b, where A is an n × n matrix, and x and b are vectors of length n.

• We can write the solution as x = A^-1 b, where A^-1 is the inverse of A.

• A square matrix is said to be non-singular if its inverse exists. If the inverse does not exist then the matrix is singular.

Geometrical Interpretation

• If A is a 2 × 2 matrix, for example

  [2 -1; 3 4] [x1; x2] = [3; -1]

then 2x1 – x2 = 3 and 3x1 + 4x2 = -1. Each represents a straight line and the solution of the above is given by their intersection.

• If A is a 3  3 matrix each of the three equations represents a plane, and the solution is the point lying at the intersection of the three planes.

MATLAB Solution

• In MATLAB we can find the solution to Ax = b by writing: x = A\b

• Write: A = [10 -7 0;-3 2 6;5 -1 5]

• Then: b = [7;4;6]

• Then: x = A\b

Gaussian Elimination

• Eliminate x1 from all the equations after the first.

• Then eliminate x2 from all the equations after the second.

• Then eliminate x3 from all the equations after the third.

• And so on, until after n-1 steps we have eliminated xj from all the equations after the jth, for j = 1, 2, …, n-1.

• These steps are referred to as the forward elimination stage of Gaussian elimination.

Example

[10 -7 0; -3 2 6; 5 -1 5] [x1; x2; x3] = [7; 4; 6]

Subtract -3/10 times equation 1 from equation 2, and 5/10 times equation 1 from equation 3:

[10 -7 0; 0 -0.1 6; 0 2.5 5] [x1; x2; x3] = [7; 6.1; 2.5]

Next we swap equations 2 and 3. This is called partial pivoting. It is done to get the largest absolute value on or below the diagonal in column 2 onto the diagonal. This makes the algorithm more stable with respect to round-off errors (see later).

Example (continued)

[10 -7 0; 0 2.5 5; 0 -0.1 6] [x1; x2; x3] = [7; 2.5; 6.1]

Now subtract -0.1/2.5 times equation 2 from equation 3:

[10 -7 0; 0 2.5 5; 0 0 6.2] [x1; x2; x3] = [7; 2.5; 6.2]

This completes the forward elimination stage of the Gaussian elimination algorithm.

Pseudocode for Forward Elimination

make b column n+1 of A
for k=1 to n-1
  find pivot A(m,k) in column k and row m ≥ k
  swap rows k and m
  for i=k+1 to n
    mult = A(i,k)/A(k,k)
    for j=k+1 to n+1
      A(i,j) = A(i,j) - mult*A(k,j)
    end
  end
end
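A minimal MATLAB translation of this pseudocode, assuming the right-hand side has already been appended as column n+1 of A (the function name fwdelim is only illustrative):

function A = fwdelim(A)
% Forward elimination with partial pivoting on an n x (n+1) augmented matrix.
n = size(A,1);
for k = 1:n-1
    [~, m] = max(abs(A(k:n,k)));      % pivot: largest |A(m,k)| with m >= k
    m = m + k - 1;
    A([k m],:) = A([m k],:);          % swap rows k and m
    for i = k+1:n
        mult = A(i,k)/A(k,k);
        A(i,k:n+1) = A(i,k:n+1) - mult*A(k,k:n+1);   % also zeros A(i,k)
    end
end
end

Running it on the example, fwdelim([10 -7 0 7; -3 2 6 4; 5 -1 5 6]), reproduces the upper triangular system above.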

Back Substitution

• After the forward elimination phase, the matrix has been transformed into upper triangular form.

• Equation n just involves xn and so can now be solved immediately.

• Equation n-1 just involves xn-1 and xn, and since we already know xn we can find xn-1.

• Working our way backwards through the equations we can find xn, xn-1, …, x1.

• This is called the back substitution phase of the Gaussian elimination algorithm.

The Example Again

[10 -7 0; 0 2.5 5; 0 0 6.2] [x1; x2; x3] = [7; 2.5; 6.2]

Equation 3 is 6.2x3 = 6.2, so x3 = 1. This value is substituted into equation 2:

2.5x2 + (5)(1) = 2.5

so x2 = -1. Substituting for x2 and x3 in equation 1:

10x1 + (-7)(-1) = 7

so x1 = 0.

Pseudocode for Back Substitution

This solves Ux = b:

x(n) = b(n)/U(n,n)
for k=n-1 down to 1
  sum = 0
  for j=k+1 to n
    sum = sum + U(k,j)*x(j)
  end
  x(k) = (b(k) - sum)/U(k,k)
end
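A matching MATLAB sketch for back substitution (taking U and b separately rather than as an augmented matrix):

function x = backsub(U, b)
% Solve Ux = b where U is upper triangular.
n = length(b);
x = zeros(n,1);
x(n) = b(n)/U(n,n);
for k = n-1:-1:1
    s = U(k,k+1:n)*x(k+1:n);          % sum of the already-known terms
    x(k) = (b(k) - s)/U(k,k);
end
end

For the example, backsub([10 -7 0; 0 2.5 5; 0 0 6.2], [7; 2.5; 6.2]) returns [0; -1; 1].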

LU Factorisation

• The Gaussian elimination process can be expressed in terms of three matrices.

• The first matrix has 1’s on the main diagonal and the multipliers used in the forward elimination below the diagonal. This is a lower triangular matrix with unit diagonal, and is usually denoted by L.

• The second matrix, denoted by U, is the upper triangular matrix obtained at the end of the forward elimination.

• The third matrix, denoted by P, is a permutation matrix representing the row interchanges performed in pivoting.

L, U, and P

L = [1 0 0; 0.5 1 0; -0.3 -0.04 1]

U = [10 -7 0; 0 2.5 5; 0 0 6.2]

P = [1 0 0; 0 0 1; 0 1 0]

The original matrix can be expressed as: LU = PA. The permutation matrix is the identity matrix with its rows reordered. If Pij = 1 then row i of the permuted matrix is row j of the original matrix. The same information can be represented in a vector p, the ith entry of which gives the number of the column containing the 1 in row i:

p = [1; 3; 2]

Some MATLAB Code

• L, U, and P can be found in MATLAB as follows: [L,U,P] = lu(A)

• Solution of the system Ax = b is equivalent to solving the two triangular systems Ly = Pb and Ux = y

• Once you have L, U, and P it is simple to solve the original system of equations: y = L\(P*b) and x = U\y, or just x = U\(L\(P*b))

• This should give the same answer as: x = A\b
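As a quick check (not on the slides), the module’s 3 × 3 example can be run through these commands:

A = [10 -7 0; -3 2 6; 5 -1 5];
b = [7; 4; 6];
[L, U, P] = lu(A);
norm(P*A - L*U)      % round-off level, confirming LU = PA
y = L\(P*b);
x = U\y              % [0; -1; 1], the same as A\b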

LDU Factorisation

• It is sometimes useful to explicitly separate out the diagonal of U, which contains the pivots.

• We write U=DU’ where U’ is the same as U except that it has 1’s on the diagonal, and D is a diagonal matrix containing the diagonal elements of U.

• With this form of the factorisation we have LDU = PA.

L = [1 0 0; 0.5 1 0; -0.3 -0.04 1]

D = [10 0 0; 0 2.5 0; 0 0 6.2]

U' = [1 -0.7 0; 0 1 2; 0 0 1]
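In MATLAB the split U = DU' can be computed directly from the lu output; a small sketch (the variable names are only illustrative):

[L, U, P] = lu([10 -7 0; -3 2 6; 5 -1 5]);
D      = diag(diag(U));   % diagonal matrix holding the pivots
Uprime = D\U;             % each row of U scaled so the diagonal becomes 1
% L*D*Uprime equals P*A up to round-off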

Pseudocode for LU Factorisation

make b column n+1 of A
for k=1 to n-1
  find pivot A(m,k) in column k and row m ≥ k
  swap rows k and m
  for i=k+1 to n
    A(i,k) = A(i,k)/A(k,k)
    for j=k+1 to n+1
      A(i,j) = A(i,j) - A(i,k)*A(k,j)
    end
  end
end

Explanation of LU

• At stage i of forward elimination we do pivoting to find the largest absolute value in column i on or below the diagonal, and then exchange rows to bring it onto the diagonal.

• Then we subtract multiples of row i from rows i+1,…,n.

• Each of these operations can be represented by a matrix multiplication.

Elementary Matrices

• An elementary matrix, M, is one that has 1’s along the main diagonal and 0’s everywhere else, except for one non-zero value (-m, say) in row i and column j.

• Multiplying A by M has the effect of subtracting m times row j of matrix A from row i.

• Ignoring pivoting, the GE algorithm applies a series of elementary matrices to A to get U: U = Mn-1 … M2 M1 A

• If Li^-1 = Mi then L1 L2 … Ln-1 U = A, so taking L = L1 L2 … Ln-1 we have LU = A.

• Li is the same as Mi except that the sign of the non-zero value is changed.

• With pivoting U = Mn-1 Pn-1 … M2 P2 M1 P1 A, and it can be shown in a similar way that LU = PA, where P = Pn-1 … P2 P1.
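To make the first bullet concrete, here is a small sketch (ignoring pivoting) of the first elimination step of the earlier example written as a matrix multiplication. Strictly, M1 below combines the two elementary matrices for rows 2 and 3 into one matrix:

A  = [10 -7 0; -3 2 6; 5 -1 5];
M1 = [1 0 0; 0.3 1 0; -0.5 0 1];   % subtracts -0.3*row1 from row 2 and 0.5*row1 from row 3
M1*A                                % column 1 is now zero below the diagonal
L1 = [1 0 0; -0.3 1 0; 0.5 0 1];   % same as M1 with the signs of the multipliers flipped
L1*M1                               % the identity, so L1 is the inverse of M1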

The Need for Pivoting

• Suppose we change our problem slightly to:

[10 -7 0; -3 2.099 6; 5 -1 5] [x1; x2; x3] = [7; 3.901; 6]

where the solution is still (0, -1, 1)^T.

• Now suppose we solve the problem on a computer that does arithmetic to five decimal places.

Pivoting (continued)

• The first step of the elimination gives:

[10 -7 0; 0 -0.001 6; 0 2.5 5] [x1; x2; x3] = [7; 6.001; 2.5]

• The (2,2) element is quite small. Now we continue without pivoting.

• The next step is to subtract -2.5 × 10^3 times equation 2 from equation 3:

(5 - (-2.5 × 10^3)(6)) x3 = 2.5 - (-2.5 × 10^3)(6.001)

• The right-hand side is 2.5 + 1.50025 × 10^4. The second term is rounded to 1.5002 × 10^4. When we add the 2.5 the result is rounded to 1.5004 × 10^4.

Pivoting (continued)

• So the equation for x3 becomes: 1.5005 × 10^4 x3 = 1.5004 × 10^4, which gives x3 = 0.99993 (instead of 1).

• Then x2 is found from: -0.001 x2 + (6)(0.99993) = 6.001, which gives -0.001 x2 = 1.5 × 10^-3, so x2 = -1.5 (instead of -1).

• Finally, x1 is found from: 10 x1 + (-7)(-1.5) = 7, which gives x1 = -0.35 (instead of 0).

• The problem arises from using a small pivot, which leads to a large multiplier.

• Partial pivoting ensures that the multipliers are always less than or equal to 1 in magnitude, and results in a satisfactory solution.

Measuring Errors

• The discrepancy due to rounding error in a solution can be measured by the error:

e = x - x*

and by the residual:

r = b - Ax*

where x is the exact solution and x* is the computed solution.

• e = 0 if and only if r = 0; however, e and r are not necessarily both small.
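For example, in MATLAB the two quantities can be computed as follows (xstar here is just a hypothetical computed solution, chosen for illustration):

A = [10 -7 0; -3 2 6; 5 -1 5];
b = [7; 4; 6];
x     = A\b;                        % exact solution [0; -1; 1]
xstar = [0.001; -0.999; 1.001];     % a hypothetical computed solution
e = x - xstar                       % error
r = b - A*xstar                     % residual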

Error Example

• Consider an example in which the matrix is almost singular.

• If GE is used to compute the solution with low precision we get a large error but small residual.

• Geometrically, the lines represented by the equations are almost parallel.

• GE with partial pivoting will always produce small residuals, but not necessarily small errors.

  0 .

780 0 .

913 0 .

563 0 .

659    

x x

2 1      0 .

217 0 .

254   computed solution exact solution

Sensitivity

• We want to know how sensitive the solution of Ax = b is to perturbations in A and b.

• To do this we first have to come up with some measure of how close to singular a matrix is.

• If A is singular, a solution will not exist for some b’s (inconsistency), while for other b’s it is not unique.

• So if A is nearly singular we would expect small changes in A and b to cause large changes in x.

• If A is the identity matrix then x = b, so if A is close to the identity matrix we expect small changes in A and b to result in small changes in x.

Singularity and Pivots

• In GE all the pivots are non-zero if and only if A is non-singular, provided exact arithmetic is used.

• So if the pivots are small we expect the matrix to be close to singular.

• However, with finite precision arithmetic, a matrix might still be close to singular even if none of the pivots are small.

Norms

• Size of pivots is not adequate to decide how close a matrix is to being singular.

• Define the l_p family of norms (1 ≤ p ≤ ∞):

||x||_p = ( Σ_{i=1}^{n} |x_i|^p )^{1/p}

• The most common norms use p = 1, 2, and ∞:

||x||_1 = Σ_{i=1}^{n} |x_i|

||x||_2 = ( Σ_{i=1}^{n} |x_i|^2 )^{1/2}

||x||_∞ = max_i |x_i|
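A quick check of these formulas on a small vector (any vector will do):

x = [3; -4; 1];
n1   = sum(abs(x))            % l1 norm: 8
n2   = sqrt(sum(abs(x).^2))   % l2 norm: sqrt(26)
ninf = max(abs(x))            % l-infinity norm: 4
% these agree with norm(x,1), norm(x) and norm(x,inf)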

Properties of Norms

• All these norms have the following properties:

||x|| > 0 if x ≠ 0, and ||0|| = 0

||cx|| = |c| ||x|| for all scalars c

||x + y|| ≤ ||x|| + ||y|| (the triangle inequality)

• In MATLAB use norm(x,p) to find a norm:

norm1 = norm(x,1)
norm2 = norm(x)
norminf = norm(x,inf)

Condition Number

• Ax may have a very different norm from x, and this change in norm is related to the sensitivity of the solution to changes in A. Denote:

M = max ||Ax|| / ||x||   and   m = min ||Ax|| / ||x||

where the max and min are taken over all non-zero vectors x. Note that if A is singular, m = 0.

• The condition number, κ(A), of A is defined as the ratio M/m. Usually we are interested in the order of magnitude of κ(A), so it doesn’t matter which norm is used.

Relative Errors

• Suppose an error δb in b results in an error δx in x. So Ax = b becomes:

A(x + δx) = b + δb

so A δx = δb. From the definitions of M and m we have:

||b|| ≤ M ||x||   and   ||δb|| ≥ m ||δx||

Then if m ≠ 0 we have the following relationship between the relative error in b and in x:

||δx|| / ||x|| ≤ κ(A) ||δb|| / ||b||

κ(A) measures the amplification of the relative error!

Uses of Condition Number

1. As a measure of the amplification of relative error due to changes in rhs.

2. As a measure of the amplification of relative error due to changes in matrix A.

3. As a measure of how close a matrix is to being singular (hard maths omitted). If  (A) is large then A is close to singular.

Some Properties of κ(A)

• κ(A) ≥ 1, since M ≥ m.

• κ(P) = 1, if P is a permutation matrix.

• κ(cA) = κ(A) for scalar c.

• For D a diagonal matrix, κ(D) is the ratio of the largest diagonal value to the smallest.

Relative Error Example

• In this example we use the l_1-norm:

A = [4.1 2.8; 9.7 6.6]   b = [4.1; 9.7]

• The solution is x = (1, 0)^T, with ||b|| = 13.8 and ||x|| = 1.

• Change b slightly to:

b~ = [4.11; 9.70]

then x becomes (0.34, 0.97)^T. A small change in b gives a big change in x.

• The errors are ||δb|| = 0.01 and ||δx|| = 1.63, so the relative errors are:

||δb|| / ||b|| = 0.0007246   and   ||δx|| / ||x|| = 1.63

so κ(A) ≥ 1.63/0.0007246 = 2249.4: κ(A) is large.
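The whole example can be reproduced in MATLAB, using the l1-norm throughout:

A  = [4.1 2.8; 9.7 6.6];
b  = [4.1; 9.7];
bt = [4.11; 9.70];
x  = A\b;                            % (1, 0)'
xt = A\bt;                           % roughly (0.34, 0.97)'
relb = norm(bt - b, 1)/norm(b, 1)    % about 7.2e-4
relx = norm(xt - x, 1)/norm(x, 1)    % about 1.63
cond(A, 1)                           % 2249.4, consistent with relx <= cond(A,1)*relb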

Relative Error and Residual

• The condition number is also important for the round-off error in GE. It can be shown that:

||b - Ax*|| / ( ||A|| ||x*|| ) ≤ ρ ε

||x - x*|| / ||x*|| ≤ ρ κ(A) ε

where x* is the numerical solution of Ax = b, ρ is a constant smaller than about 10, and ε is the machine precision.

• The first inequality says that the relative residual will be about the same size as the round-off error no matter how ill-conditioned A is.

• The second inequality says that the relative error is small if  (A) is small but might be large if  (A) is large.

Matrix Norms

• The norm of matrix A is ||A|| = M, i.e.,

||A|| = max ||Ax|| / ||x||

• Since ||A^-1|| = 1/m it follows that the condition number can also be defined as:

κ(A) = ||A|| ||A^-1||

• In MATLAB, cond(A,p) computes the condition number of A relative to the l_p-norm.
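A small check that the two definitions agree, using the matrix from the relative error example:

A = [4.1 2.8; 9.7 6.6];
norm(A, 1)*norm(inv(A), 1)   % 13.8 * 163 = 2249.4
cond(A, 1)                   % the same value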

Iterative methods for Ax = b

• Gaussian Elimination is a direct method that computes the solution of Ax = b in O(n^3/3) operations.

• If n is large we might want a faster, less accurate, method.

• With an iterative method we can stop the iteration whenever we think we have a sufficiently accurate solution.

Iterative methods

• Suppose we split the matrix as A = S - T; then Sx = Tx + b.

• We can turn this into an iteration:

S x_{k+1} = T x_k + b   or   x_{k+1} = S^-1 T x_k + S^-1 b

• So if this sequence converges, we can start the iteration with a guess at the solution, x_0, and get an approximate solution.

• We need S to be easily invertible.

Common Iterative Methods

1. S = diagonal part of A (Jacobi’s method)
2. S = lower triangular part of A (Gauss-Seidel method)
3. S = combination of 1 and 2 (successive over-relaxation, or SOR)

S is called a pre-conditioner. The choice of S affects the convergence properties of the iteration. A generic sketch of the resulting iteration is given below.
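A generic MATLAB sketch of the splitting iteration, where S is whichever pre-conditioner has been chosen; the stopping test, tolerance and iteration limit are illustrative choices, not part of the slides:

function x = splitsolve(A, b, S, tol, maxit)
% Iterate S*x_{k+1} = (S - A)*x_k + b from x_0 = 0 until the residual is small.
T = S - A;
x = zeros(size(b));
for k = 1:maxit
    x = S\(T*x + b);
    if norm(b - A*x) < tol*norm(b)   % residual-based convergence test
        return
    end
end
end

For Jacobi, call it with S = diag(diag(A)); for Gauss-Seidel, with S = tril(A).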

Jacobi Method

• S = diag(A), so the formula for iteration k+1 becomes:

a_ii (x_i)_{k+1} = - Σ_{j=1}^{i-1} a_ij (x_j)_k - Σ_{j=i+1}^{n} a_ij (x_j)_k + b_i

• Expressed in matrices: D x_{k+1} = -(L+U) x_k + b, where D, L, and U are the diagonal, strictly lower triangular, and strictly upper triangular parts of A, respectively.

Note: these are not the D, L, and U of the LDU factorisation.

Jacobi Example

• Example:

A = [2 -1; -1 2],   S = [2 0; 0 2],   T = [0 1; 1 0],   S^-1 T = [0 1/2; 1/2 0]

Then:

[x1; x2]_{k+1} = [0 1/2; 1/2 0] [x1; x2]_k + [b1/2; b2/2]

Pseudocode for Jacobi Method

choose an initial guess x_0
for k=0,1,2,…
  for i=1 to n
    sum = 0.0
    for j=1 to i-1 and i+1 to n
      sum = sum + A(i,j)*x_k(j)
    end
    x_{k+1}(i) = (b(i) - sum)/A(i,i)
  end
  check convergence and continue if needed
end
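A direct MATLAB translation of this pseudocode; the residual-based stopping test is one reasonable choice for the “check convergence” step:

function x = jacobi(A, b, tol, maxit)
% Jacobi iteration: all of x_k is kept while x_{k+1} is built in xnew.
n = length(b);
x = zeros(n,1);                       % initial guess x_0 = 0
for k = 1:maxit
    xnew = zeros(n,1);
    for i = 1:n
        s = A(i,[1:i-1 i+1:n])*x([1:i-1 i+1:n]);
        xnew(i) = (b(i) - s)/A(i,i);
    end
    x = xnew;
    if norm(b - A*x) < tol*norm(b)
        return
    end
end
end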

Gauss-Seidel Method

• A problem with the Jacobi method is that we have to store all of x_k until we have finished computing x_{k+1}.

• In the Gauss-Seidel method the entries of x_{k+1} are used as soon as they are computed, and replace the corresponding entries of x_k on the right-hand side:

a_ii (x_i)_{k+1} = - Σ_{j=1}^{i-1} a_ij (x_j)_{k+1} - Σ_{j=i+1}^{n} a_ij (x_j)_k + b_i

• This uses half the storage of the Jacobi method.

Gauss-Seidel in Matrix Notation

• Expressed in matrices:

(D+L) x_{k+1} = -U x_k + b

where D+L is the lower triangular part of A, and U is the strictly upper triangular part of A.

Gauss-Seidel Example

• Example:

A = [2 -1; -1 2],   S = [2 0; -1 2],   T = [0 1; 0 0],   S^-1 T = [0 1/2; 0 1/4]

Then:

[2 0; -1 2] [x1; x2]_{k+1} = [0 1; 0 0] [x1; x2]_k + [b1; b2]

• Gauss-Seidel is better than Jacobi because it uses half the storage and often converges faster.

Pseudocode for Gauss-Seidel

choose an initial guess x_0
for k=0,1,2,…
  for i=1 to n
    sum = 0.0
    for j=1 to i-1
      sum = sum + A(i,j)*x_{k+1}(j)
    end
    for j=i+1 to n
      sum = sum + A(i,j)*x_k(j)
    end
    x_{k+1}(i) = (b(i) - sum)/A(i,i)
  end
  check convergence and continue if needed
end
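And the corresponding MATLAB sketch for Gauss-Seidel; only one copy of x is kept, because new entries overwrite old ones as soon as they are computed:

function x = gaussseidel(A, b, tol, maxit)
% Gauss-Seidel iteration: updated entries of x are used immediately.
n = length(b);
x = zeros(n,1);                       % initial guess x_0 = 0
for k = 1:maxit
    for i = 1:n
        s = A(i,1:i-1)*x(1:i-1) + A(i,i+1:n)*x(i+1:n);
        x(i) = (b(i) - s)/A(i,i);
    end
    if norm(b - A*x) < tol*norm(b)    % residual-based convergence test
        return
    end
end
end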