#### Transcript Curve Fitting - Civil and Environmental Engineering | SIU

```CURVE FITTING
ENGR 351
Numerical Methods for Engineers
Southern Illinois University Carbondale
College of Engineering
Dr. L.R. Chevalier
Permission is granted to students at Southern Illinois University at Carbondale
to make one copy of this material for use in the class ENGR 351, Numerical
Methods for Engineers. No other permission is granted.
All other rights are reserved. No part of this publication may be reproduced,
stored in a retrieval system, or transmitted, in any form or by any means,
electronic, mechanical, photocopying, recording, or otherwise, without
the prior written permission of the copyright owner.
Specific Growth Rate, m
Applications
Need to determine parameters for
saturation-growth rate model to
characterize microbial kinetics
Food Available, S
m  m max
S
Ks  S
Applications
T ( o C)
0
10
20
30
0
Epilimnion
5
z (m )
10
Thermocline
15
20
25
30
Hypolimnion
Applications
Interpolation of data
What is kinematic
viscosity at 7.5º C?
v, 10-2 cm 2/s
2
1.5
1
0.5
0
0
10
20
T(oC)
30
T (oC)
0
4
8
12
16
20
24
v, 10-2 (cm2/s)
1.7923
1.5615
1.3874
1.2396
1.1168
1.0105
0.9186
f(x)
x
We want to find the best “fit” of a curve through the data.
Mathematical Background
• The prerequisite mathematical background
for interpolation is found in the material on
the Taylor series expansion and finite
divided differences
• Simple statistics
average
standard deviation
normal distribution
Normal Distribution
A histogram used
to depict the distributions
x
x  2
95%
x 
68%
Material to be Covered in Curve
Fitting
• Linear Regression
Polynomial Regression
Multiple Regression
General linear least squares
Nonlinear regression
• Interpolation
Newton’s Polynomial
Lagrange polynomial
Coefficients of polynomials
Specific Study Objectives
• Understand the fundamental difference between
regression and interpolation and realize why
confusing the two could lead to serious problems
• Understand the derivation of linear least squares
regression and be able to assess the reliability of
the fit using graphical and quantitative
assessments.
Specific Study Objectives
• Know how to linearize data by transformation
• Understand situations where polynomial, multiple
and nonlinear regression are appropriate
• Understand the general matrix formulation of
linear least squares
• Understand that there is one and only one
polynomial of degree n or less that passes exactly
through n+1 points
Specific Study Objectives
• Realize that more accurate results are
obtained if data used for interpolation is
centered around and close to the unknown
point
• Recognize the liabilities and risks
associated with extrapolation
• Understand why spline functions have
utility for data with local areas of abrupt
change
Least Squares Regression
• Simplest is fitting a straight line to a set of
paired observations
(x1,y1), (x2, y2).....(xn, yn)
• The resulting mathematical expression is
y = ao + a1x + e
• We will consider the error introduced at
each data point to develop a strategy for
determining the “best fit” equations
n
n
i 1
i 1
Sr   e2i    yi  a o  a1xi 
2
f(x)
yi  a o  a1xi
x
n
n
i 1
i 1
Sr   e2i    yi  a o  a1xi 
2
To determine the values for ao and a1, differentiate
with respect to each coefficient
Sr
 2  yi  ao  a1xi 
ao
Sr
 2  yi  ao  a1xi  xi 
a1
Note: we have simplified the summation symbols.
What mathematics technique will minimize Sr?
Sr
 2  yi  ao  a1xi 
ao
Sr
 2  yi  ao  a1xi  xi 
a1
Setting the derivative equal to zero will minimizing Sr.
If this is done, the equations can be expressed as:
0   yi   ao   a1xi
0   yi xi   ao xi   a1xi2
0   yi   ao   a1xi
0   yi xi   ao xi   a1xi2
Note:
a
o
 nao
We have two simultaneous equations, with two
unknowns, ao and a1.
What are these equations? (hint: only place terms with
ao and a1 on the LHS of the equations)
What are the final equations for ao and a1?
nao   xi a1   yi
xa  x a  x y
i
a1 
2
i 1
o
i i
n xi yi   xi  y i
n x    xi 
ao  y  a1x
2
i
2
These first two
equations are
called
the normal
equations
Error
Recall:
n
n
Sr   e i    yi  a o  a1xi 
2
i 1
2
f(x)
i 1
x
The most common measure of the “spread” of a sample is
the standard deviation about the mean: St   yi  y  2
St
sy 
n1
Introduce a term to measure the standard error of the
estimate:
sy
x
Sr

n2
Coefficient of determination r2:
St  Sr
r 
St
2
r is the correlation coefficient
St  Sr
r 
St
2
The following signifies that the line explains 100
percent of the variability of the data:
Sr = 0
r = r2 = 1
If r = r2 = 0, then Sr = St and the fit is invalid.
Consider the following four sets of data
Data 1
10
8.04
8
6.95
13
7.58
9
8.81
11
8.33
14
9.96
6
7.24
4
4.26
12
10.84
7
4.82
5
5.68
Data 2
10
9.14
8
8.14
13
8.74
9
8.77
11
9.26
14
8.10
6
6.13
4
3.10
12
9.13
7
7.26
5
4.74
Data 3
10
7.46
8
6.77
13
12.74
9
7.11
11
7.81
14
8.84
6
6.08
4
5.39
12
8.15
7
6.42
5
5.73
Data 4
8
6.58
8
5.76
8
7.71
8
8.84
8
8.47
8
7.04
8
5.25
19
12.50
8
5.56
8
7.91
8
6.89
12
12
10
10
8
8
y
14
y
14
6
6
4
4
y = 0.5001x + 3.0001
R2 = 0.6665
2
y = 0.5x + 3.0009
2
0
R2 = 0.6662
0
0
5
10
15
0
5
x
10
15
x
14
14
12
12
10
10
8
y
y
8
6
6
4
4
y = 0.4997x + 3.0025
2
2
y = 0.4999x + 3.0017
R2 = 0.6667
2
R = 0.6663
0
0
0
5
10
x
15
0
5
10
x
15
20
Linearization of non-linear
relationships
Some data is simply illsuited for linear least
squares regression....
or so it appears.
f(x)
x
P
EXPONENTIAL
EQUATIONS
Linearize
P  Po e
rt
t
ln P
intercept = ln P0
slope = r
why?
t
P  P0e rt
ln P  ln P0e rt 
 ln P0   lne rt 
 ln P0   rt
lnP
intercept = ln Po
Can you see the similarity
with the equation for a line:
y = b + mx
where b is the y-intercept
and m is the slope?
slope = r
t
P  P0e rt
ln P  ln P0e rt 
 ln P0   lne rt 
 ln P0   rt
ln P0
intercept = ln P0
After taking the natural log
of the y-data, perform linear
regression.
From this regression:
The value of b will give us
ln (P0). Hence, P0 = eb
The value of m will give us r
directly.
slope = r
t
Q
POWER EQUATIONS
Q  cH
a
(Flow over a weir)
log Q
H
log H
Here we linearize
the equation by
taking the log of
H and Q data.
What is the resulting
intercept and slope?
Q  cH a
log Q  logcH a 
 log c  log H a
 log c  a log H
log Q
slope = a
log H
intercept = log c
Q  cH a
log Q  logcH
a

So how do we get
c and a from
performing regression
on the log H vs log Q
data?
From : y = mx + b
 log c  log H a
 log c  a log H
log Q
b = log c
c = 10b
slope = a
log H
intercept = log c
m=a
SATURATION-GROWTH
RATE EQUATION
m
m  m max
S
1/m
slope = Ks/mmax
intercept = 1/mmax
1/ S
S
Ks  S
Here, m is the growth
rate of a microbial
population,
mmax is the maximum
growth rate, S is the
substrate or food
concentration, Ks is the
substrate concentration
at a value of m = mmax/2
Regression
• You should be cognizant of the fact that
there are theoretical aspects of regression
that are of practical importance but are
beyond the scope of this book
• Statistical assumptions are inherent in the
linear least squares procedure
Regression
• x has a fixed value; it is not random and is
measured without error
• The y values are independent random
variable and all have the same variance
• The y values for a given x must be normally
distributed
Regression
y-direction
• The regression of y versus x is not the same
as x versus y
• The error of y versus x is not the same as x
versus y
f(x)
x-direction
x
Polynomial Regression
• One of the reasons you were presented with
the theory behind linear regression was to
allow you the insight behind similar
procedures for higher order polynomials
• y = a0 + a 1x
• mth - degree polynomial
y = a0 + a1x + a2x2 +....amxm + e
Based on the sum of the squares
of the residuals
Sr    yi  ao  a1xi  a x ...... a x
2
2 i

m 2
m i
1. Take the derivative of the above equation with respect
to each of the unknown coefficients: i.e. the partial with
respect to a2
Sr
2
 2 xi  yi  ao  a1xi  a2 xi2 ..... amxim 
a2
2. These equations are set to zero to minimize Sr., i.e.
minimize the error.
3. Set all unknowns values on the LHS of the equation.
Again, using the partial of Sr. wrt a2
ao  xi  a1  xi3  a2  xi4 ..... am  xi
2
m 2
  xi yi
2
4. This set of normal equations result in m+1
simultaneous equations which can be solved using
matrix methods to determine a0, a1, a2......am
Multiple Linear Regression
• A useful extension of linear regression is the
case where y is a linear function of two or
more variables
y = ao + a1x1 + a2x2
• We follow the same procedure
y = ao + a1x1 + a2x2 + e
Multiple Linear Regression
For two variables, we would solve a 3 x 3 matrix
in the following form:
 n

  x1i
 x2 i
x
x
x x
1i
2
1i
1i 2 i
x
x x
x
  a 0    yi 

  
1i 2 i   a1     x1i yi 
2
   x y
2 i  a 2 

 2 i i 
2i
[A] and {c}are clearly based on data given for x1, x2
and y to solve for the unknowns in {x}.
Interpolation
• General formula for an n-th order
polynomial
y = a0 + a1x + a2x2 +....amxm
• For m+1 data points, there is one, and only
one polynomial of order m or less that
passes through all points
• Example: y = a0 + a1x
– fits between 2 points
– 1st order
Interpolation
• We will explore two mathematical methods
well suited for computer implementation
• Newton’s Divided Difference Interpolating
Polynomials
• Lagrange Interpolating Polynomial
Newton’s Divided Difference
Interpolating Polynomials
•
•
•
•
Linear Interpolation
General Form
Errors
Linear Interpolation
3
Temperature, C
0
Density, kg/m
5
1000.0
10
999.7
15
999.1
20
998.2
999.9
How would you approach estimating the density at 17 C?
Temperature, C
Density, kg/m
0
999.9
5
1000.0
10
999.7
15
999.1
20
998.2
3
???
999.1 > r > 998.2
998.2
999.1
r
T
15
20
998.2
999.1
r
T
15
998 .2  999 .1 r

20  15
T
20
Assume a straight line
between the known data.
Then calculate the slope.
998.2
999.1
r
15
r  999.1 r

17 15
T
T
17
20
Assuming this linear
relationship is constant,
the slope is the same
between the unknown
point and a known point.
998.2
999.1
r
15
T
17
20
998.2  999.1 r  999.1 r


20  15
17 15 T
Therefore, the slope of
one interval will equal the
slope of the other interval.
Solve for r
998.2  999 .1 998.2  r

20  15
20  17
f  x1   f  x0 
f1  x   f  xo  
 x  x0 
x1  x0
Note: The notation f1(x) designates that this is a first order
interpolating polynomial
true solution
f(x)
1
2
smaller intervals
provide a better estimate
x
true solution
f(x)
1
2
Alternative approach would be to include a
third point and estimate f(x) from a 2nd
order polynomial.
x
true solution
f(x)
Alternative approach would be to include a
third point and estimate f(x) from a 2nd
order polynomial.
x
f 2  x  b0  b1  x  x0   b2  x  x0  x  x1 
Prove that this a 2nd order polynomial of
the form:
f  x  a0  a1x  a2 x 2
First, multiply the terms
f 2  x  b0  b1  x  x0   b2  x  x0  x  x1 
f 2  x  b0  b1x  b1x0  b2 x 2  b2 x0 x1  b2 xx0  b2 xx1
Collect terms and recognize that:
a0  b0  b1 x0  b2 x0 x1
a1  b1  b2 x0  b2 x1
a2  b2
f  x  a0  a1x  a2 x 2
x2, f(x2)
x, f(x)
Procedure for
Interpolation
f(x)
x1, f(x1)
x0, f(x0)
b0  f  x0 
x
f  x1   f  x0 
b1 
x1  x0
f  x2   f  x1  f  x1   f  x0 

x2  x1
x1  x0
b2 
x2  x0
Interpolation
b0  f  x0 
f  x1   f  x0 
b1 
x1  x0
f  x2   f  x1  f  x1   f  x0 

x2  x1
x1  x0
b2 
x2  x0
Interpolation
b0  f  x0 
f  x1   f  x0 
b1 
x1  x0
f  x2   f  x1  f  x1   f  x0 

x2  x1
x1  x0
b2 
x2  x0
f 2  x   b0  b1  x  x0   b2  x  x0  x  x1 
Example
Include 10 degrees in
density at 17 degrees.
1000
Density
999.5
999
998.5
998
0
5
10
Tem p
15
20
Temperature, C
Density, kg/m
0
999.9
5
1000.0
10
999.7
15
999.1
20
998.2
3
General Form of Newton’s
Interpolating Polynomials
for the nth-order polynomial
f n  x  b0  b1  x  x0 ....bn  x  x0  x  x1  x  xn 1 
To establish a methodical approach to a solution define
the first finite divided difference as:


f xi , x j 
f  xi   f  x j 
xi  x j


f xi , x j 
f  xi   f  x j 
xi  x j
if we let i=1 and j=0 then this is b1
f  x1   f  x0 
b1 
x1  x0
Similarly, we can define the second finite divided
difference, which expresses both b2 and the difference
of the first two divided difference
Similarly, we can define the second finite divided
difference, which expresses both b2 and the difference of
the first two divided difference
f  x2   f  x1  f  x1   f  x0 

x2  x1
x1  x0
b2 
x2  x0

f xi , x j , xk



 
f xi , x j  f x j , xk

xi  xk
Following the same scheme, the third divided
difference is the difference of two second finite divided
difference.
i xi
f(xi)
first
second
third
0 x0
f(x0)
f[x1,x0]
f[x2,x1,x0]
f[x3,x2,x1,x0]
1 x1
f(x1)
f[x2,x1]
f[x3,x2,x0]
2 x2
f(x2)
f[x2,x3]
3 x3
f(x3)
f n  x  b0  b1  x  x0 ....bn  x  x0  x  x1  x  xn 1 
These difference can be used to evaluate the b-coefficients.
The result is the following interpolation polynomial called
the Newton’s Divided Difference Interpolating Polynomial
f n  x   f  x0   f  x1 , x0  x  x0 ....
 f  xn , xn  1 , , x0  x  x0  x  x1  x  xn 1 
To determine the error we need an extra point.
The error would follow a relationship analogous to the error
in the Taylor Series.
Lagrange Interpolating
Polynomial
n
f n  x   Li  x f  xi 
i0
n
Li  x  
j0
ji
x  xj
xi  x j
where P designates the “product of”
The linear version of this expression is at n=1
Linear version: n=1
n
f n  x   Li  x f  xi 
i 0
n
Li  x  
j 0
j i
x  xj
xi  x j
x  x1
x  x0
f1 
f  x0  
f  x1 
x0  x1
x1  x0
Your text shows you how to do n=2 (second order).
What would third order be?
n
f n  x    Li  x  f  xi 
i 0
n
Li  x   
j 0
j i
x  xj
xi  x j
 x  x1  x  x2  x  x3 
f3 
f  x0 
 x0  x1  x0  x2  x0  x3 
.......
n
f n  x    Li  x  f  xi 
i 0
n
Li  x   
j 0
j i
x  xj
xi  x j
 x  x1  x  x2  x  x3 
f3 
f  x0 
 x0  x1  x0  x2  x0  x3 
 x  x0  x  x2  x  x3 

f  x1 
 x1  x0  x1  x2  x1  x3 
.......
Note:
x1 is
not being
subtracted
from the
constant
term x
or xi = x1 in
the numerator
or the
denominator
j= 1
n
f n  x    Li  x  f  xi 
i 0
n
Li  x   
j 0
j i
x  xj
xi  x j
 x  x1  x  x2  x  x3 
f3 
f  x0 
 x0  x1  x0  x2  x0  x3 
 x  x0  x  x2  x  x3 

f  x1 
 x1  x0  x1  x2  x1  x3 
 x  x0  x  x1  x  x3 

f  x2 
 x2  x0  x2  x1  x2  x3 
......
Note:
x2 is
not being
subtracted
from the
constant
term x or
xi = x2 in
the numerator
or the
denominator
j= 2
n
f n  x   Li  x f  xi 
i 0
n
Li  x  
j 0
j i
x  xj
xi  x j
x  x1  x  x2  x  x3 

f3 
f  x0 
 x0  x1  x0  x2  x0  x3 
x  x0  x  x2  x  x3 


f  x1 
 x1  x0  x1  x2  x1  x3 
x  x0  x  x1  x  x3 


f  x2 
 x2  x0  x2  x1  x2  x3 
x  x0  x  x1  x  x2 


f  x3 
 x3  x0  x3  x1  x3  x2 
Note:
x3 is
not being
subtracted
from the
constant
term x or
xi = x3 in
the numerator
or the
denominator
j= 3
Example
Determine the density
at 17 degrees.
1000
Density
999.5
999
998.5
998
0
5
10
Tem p
15
20
Temperature, C
Density, kg/m
0
999.9
5
1000.0
10
999.7
15
999.1
20
998.2
3
f 2 17  119.964  839.244  279.496  998.776
f 2 17  998.776
f 1 17  998.74
Using Newton’s
Interpolating Polynomial
In fact, you can derive Lagrange directly from
Newton’s Interpolating Polynomial
Coefficients of an Interpolating
Polynomial
f n  x  b0  b1  x  x0 ....bn  x  x0  x  x1  x  xn 1 
y = a0 + a1x + a2x2 +....amxm
HOW CAN WE BE MORE STRAIGHT
FORWARD IN GETTING VALUES?
f  x0   a 0  a1x0  a 2 x02
f  x1   a 0  a1x1  a 2 x12
f  x2   a 0  a1x2  a 2 x32
This is a 2nd order polynomial.
We need three data points.
Plug the value of xi and f(xi)
directly into equations.
This gives three simultaneous equations
to solve for a0 , a1 , and a2
Example
Determine the density
at 17 degrees.
1000
Density
999.5
999
998.5
998
0
5
10
Tem p
15
20
Temperature, C
Density, kg/m
0
999.9
5
1000.0
10
999.7
15
999.1
20
998.2
3
Spline Interpolation
• Our previous approach was to derive an nth
order polynomial for n+1 data points.
• An alternative approach is to apply lowerorder polynomials to subset of data points
• Such connecting polynomials are called
spline functions
Spline interpolation is an adaptation of the
drafting technique of using a spline to draw smooth curves
through a series of points
Linear Splines
f  x  f  x0   m0  x  x0 
f  x  f  x1   m1  x  x1 
x0  x  x1
x1  x  x2

f  x  f  xn  1   mn  1  x  xn  1 
where
f  xi  1   f  xi 
mi 
xi  1  xi
xn  1  x  xn
a1 x 2  b1 x  c1
a2 x 2  b2 x  c2
a4 x 2  b4 x  c4
a3 x 2  b3 x  c3
Example
A well pumping at 250 gallons per minute has observation
wells located at 15, 42, 128, 317 and 433 ft away
along a straight line from the well.
After three hours of pumping, the following drawdowns
in the five wells were observed: 14.6, 10.7, 4.8
1.7 and 0.3 ft respectively.
Derive equations of
16
14
Drawdown
12
10
8
6
4
2
0
0
100
200
300
Distance from we ll
400
500
Splines
• To ensure that the mth derivatives are continuous
at the “knots”, a spline of at least m+1 order must
be used
• 3rd order polynomials or cubic splines that ensure
continuous first and second derivatives are most
frequently used in practice
• Although third and higher derivatives may be
discontinuous when using cubic splines, they
usually cannot be detected visually and
consequently are ignored.
Splines
• The derivation of cubic splines is somewhat
involved
• First illustrate the concepts of spline interpolation
using second order polynomials.
• These “quadratic splines” have continuous first
derivatives at the “knots”
• Note: This does not ensure equal second
derivatives at the “knots”
1.The function must be equal at the interior
knots. This condition can be represented as:
ai 1xi21  bi 1xi 1  ci 1  f  xi 1 
note: we are referencing the same x and f(x)
ai xi21  bi xi 1  ci  f  xi 1 
ai  1xi2 1  bi  1xi  1  ci  1  f  xi  1 
ai xi2 1  bi xi  1  ci  f  xi  1 
This occurs between i = 2, n
Using the interior knots (n-1) this will provide
2n -2 equations.
2. The first and last functions must pass
through the end points.
This will add two more equations.
a1x02  b1x0  c1  f  x0 
a n xn2  bn xn  cn  f  xn 
We now have 2n - 2 +2 = 2n equations.
How many do we need?
3. The first derivative at the interior knots must
be equal.
This provides another n-1 equations for
2n + n-1 =3n -1.
We need 3n
2ai 1xi 1  bi  1  2ai xi 1  b
4. Unless we have some additional information
regarding the functions or their derivatives,
we must make an arbitrary choice in order
to successfully compute the constants.
5. Assume the second derivative is zero
at the first point. The visual interpretation
of this condition is that the first
two points will be connected by a
straight line.
a1 = 0
Cubic Splines
• Third order polynomial
• Need n+1 = 3+1 = 4 intervals
• Consequently there are 4n unknown
constants to evaluate
• What are these equations?
f i  x  ai x 3  bi x 2  ci x  di
Cubic Splines
• The function values must be equal at the interior
knots (2n -2)
• The first and last functions must pass through the
end points (2)
• The first derivatives at the interior knots must be
equal (n-1)
• The second derivatives at the interior knots must
be equal (n-1)
• The second derivative at the end knots are zero (2)
Cubic Splines
• The function values must be equal at the interior
knots (2n -2)
• The first and last functions must pass through the
end points (2)
• The first derivatives at the interior knots must be
equal (n-1)
• The second derivatives at the interior knots must
be equal (n-1)
• The second derivative at the end knots are zero (2)
SPECIAL NOTE
On the surface it may appear that a third order approximation
using splines would be inferior to higher order polynomials.
Consider a situation where a spline may perform better:
A generally smooth function undergoes an abrupt change in a
region of interest.
The abrupt change
induces oscillations
in interpolating
polynomials.
In contrast,
the cubic spline
provides a
much more
acceptable
approximation
Previous Exam Question
Given the following data, develop the
simultaneous equations for a quadratic spline.
7.00
f(x)
0.50
4.60
1.50
3.00
6.00
5.00
4.00
f(x)
x
1
4
6
7
3.00
2.00
1.00
0.00
0
1
2
3
4
x
5
6
7
8
(4, 4.6)
7.00
Interior knots:
16a1 + 4b1 + c1 = 4.6
16a2 + 4b2 + c2 = 4.6
6.00
5.00
f(x)
4.00
3.00
2.00
1.00
0.00
0
1
2
3
4
5
6
7
8
36a2 + 6b2 + c2 = 1.5
36a3 + 6b3 + c3 = 1.5
x
x
1
4
6
7
f(x)
0.50
4.60
1.50
3.00
End conditions
a1 + b1 + c1 = 0.5
49a3 + 7b3 + c3 = 3.0
First derivative cot. at interior knots
8a1 + b1 = 8a2 + b2
12a2 + b2 = 12a3 + b3
Extra equation
a1 =0
Interior knots:
16a1 + 4b1 + c1 = 4.6
36a2 + 6b2 + c2 = 1.5
4
0

0
0

1

0
1

0
1
End conditions
a1 + b1 + c1 = 0.5 49a3 + 7b3 + c3 = 3.0
First derivative cont. at interior knots
8a1 + b1 = 8a2 + b2 12a2 + b2 = 12a3 + b3
Extra equation
a1 =0
16a2 + 4b2 + c2 = 4.6
36a3 + 6b3 + c3 = 1.5
0
0
0
0 16
4
1
0 36
0 0
6
0
1
0
1
0
0
0
0
0
0
0
0 8  1 0
0 12
1
0
0  b1  4.6
0
0 0  c1  4.6
   
0
0 0 a2   15
. 
36 6 1 b2   15
. 
    
0
0 0  c2   0.5

49 7 1  a3   3.0
   
0
0 0  b3   0 

12 1 0  c3   0 
0
0
```