
Chapter 15
Above: GPS time series from southern California
after removing several curve fits to the data
Fitting curves to data is very
common in Earth sciences
• Has applications in virtually all
subdisciplines
Two things to keep in mind:
• Data is noisy
• Data is discrete (non-continuous)
• Curve fitting can help overcome
these issues (to some degree)
Empirical Modeling:
• A type of modeling that involves
fitting a curve to data and then using
the equation of the curve to predict
values
Extrapolations of Future Global
Warming, IPCC (2007)
From Wells & Coppersmith (1994)
MATLAB provides several built-in functions to fit curves
• Many require the “Curve Fitting Toolbox”, or other
toolboxes.
• We will only use the basic curve fitting functions that are part
of standard MATLAB
• We will focus on
• polyfit, polyval, corrcoef, roots
polyfit
• Fits data with a polynomial curve of a user-specified
degree
• Polynomials come in different orders or degrees
• 0th order: a single constant value
• Examples: y = 4
y = 2.75
y = -12.1
• 1st order: a linear equation (independent var is to 1st power)
• Examples: y = 4x
y = 3.2x + 7
y = -8.2x – 21.3
• 2nd order: a quadratic equation
• Examples: y = 5x^2
y = 2.9x^2 + 7
y = -1.8x^2 - 7.4x + 1.4
• 3rd order: a cubic equation
• Examples: y = 7x^3
y = 4.6x^3 + 2
y = 2.4x^3 + 3.5x^2 + 3.2x + 7.3
• nth order: a polynomial where “n” is the largest exponent
• Can be represented as a row vector in MATLAB
• Interpreted by “polyval” as the coefficients of a polynomial, highest power first
• E.g. [3 2.7 1 -5.7] represents 3x^3 + 2.7x^2 + x - 5.7
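A minimal sketch of how the row vector above could be evaluated with polyval. The x values are arbitrary and chosen only for illustration.

```matlab
% Coefficients of 3x^3 + 2.7x^2 + x - 5.7, highest power first
p = [3 2.7 1 -5.7];

% Arbitrary x values, chosen only for illustration
x = [-1 0 1 2];

% polyval evaluates the polynomial at every element of x
y = polyval(p, x);

% The same polynomial written out term by term, for comparison
y_manual = 3*x.^3 + 2.7*x.^2 + x - 5.7;

disp([y; y_manual])   % the two rows should match
```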
• Let's make data for: y = 2.7x^4 + 4x^3 - x^2 + 1.8x - 12
• Using polyval
requires a lot
less typing
• Saves time!
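One way this comparison might look in practice. The range of x is an assumption made for this sketch.

```matlab
% Assumed range of x values for this sketch
x = -3:0.5:3;

% Option 1: type the whole equation, with element-wise operators
y_typed = 2.7*x.^4 + 4*x.^3 - x.^2 + 1.8*x - 12;

% Option 2: store the coefficients once and let polyval do the work
p = [2.7 4 -1 1.8 -12];
y_poly = polyval(p, x);

% Both approaches give the same values
max_diff = max(abs(y_typed - y_poly));
fprintf('Largest difference between the two: %g\n', max_diff);
```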
• A first order polynomial is a linear equation
y = 3.75x + 0.25
• Roots of a function: Where function = 0
• Useful in sciences because we often want to know where parameters
return to zero
• Also useful for finding min/max of data and equations
• 1st order polynomial: 1 root
• 2nd order polynomial: 0, 1, or 2 real roots
• 3rd order polynomial: up to 3 real roots
• nth order polynomial: up to n real roots
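A short sketch of both uses of roots: finding where a polynomial equals zero, and finding a min/max from the roots of its derivative. Note that polyder (polynomial derivative) is a standard MATLAB function that is not among the functions listed for this chapter; it is used here only for illustration, and the quadratic is just an example.

```matlab
% Root of the first-order polynomial y = 3.75x + 0.25
p1 = [3.75 0.25];
r1 = roots(p1);               % x where y = 0

% Min/max of y = x^2 - 6x + 8 (example quadratic)
p2 = [1 -6 8];
dp2 = polyder(p2);            % derivative: 2x - 6
x_ext = roots(dp2);           % x where the slope is zero
y_ext = polyval(p2, x_ext);   % y value at the extremum

fprintf('Root of p1: %g\n', r1);
fprintf('Extremum of p2 at x = %g, y = %g\n', x_ext, y_ext);
```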
Warning!
• Some polynomials have no
real roots, but do have complex roots
(roots with an imaginary part)
• Recall the discriminant of ax^2 + bx + c:
• b^2 - 4ac
y = x^2 + 6x + 8
Polynomial
x^2 - 6x + 8
• This means that…
x^2 - 6x + 8 = (x - 2)(x - 4)
• Discriminant > 0
Polynomial
4x^2 - 2x + 6
• This means that…
4x^2 - 2x + 6
• Has no real roots!
• Discriminant < 0
Polynomial
x^4 - 23x^2 - 18x + 40
• This means that…
x^4 - 23x^2 - 18x + 40
• Has 4 real roots!
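The three examples above could be checked with roots along these lines. This is a sketch: the variable names are arbitrary, and the order in which roots returns its values is not assumed.

```matlab
pA = [1 -6 8];            % x^2 - 6x + 8
pB = [4 -2 6];            % 4x^2 - 2x + 6
pC = [1 0 -23 -18 40];    % x^4 - 23x^2 - 18x + 40 (0 for the missing x^3 term)

% Discriminant b^2 - 4ac for the two quadratics
discA = pA(2)^2 - 4*pA(1)*pA(3);   % positive: two real roots
discB = pB(2)^2 - 4*pB(1)*pB(3);   % negative: no real roots

rA = roots(pA);   % two real roots
rB = roots(pB);   % complex conjugate pair
rC = roots(pC);   % four real roots

disp(rA.'); disp(rB.'); disp(rC.');
```

Remember to include a 0 for any missing power, as in pC; otherwise MATLAB interprets the vector as a lower order polynomial.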
• Polynomials are easy to deal with in MATLAB
• As the order of the polynomial increases…
• So does the complexity of the curve
• Remember the Taylor Series?
• Many smooth functions can be represented by an infinite series of polynomial terms
• More terms = better approximation
• polyfit is similar in spirit, except it fits a single polynomial of finite degree to data
A Simple Test…
• Fit 5 collinear
points with a linear
equation
• Use polyfit to perform a least
squares fit of a 1st order
polynomial
• i.e. a linear fit
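A minimal sketch of this test. The line y = 2x + 1 and the five x values are made up for illustration.

```matlab
% Five collinear points on the made-up line y = 2x + 1
x = [1 2 3 4 5];
y = 2*x + 1;

% Least squares fit of a 1st order polynomial
p = polyfit(x, y, 1);      % p(1) = slope, p(2) = intercept

% Evaluate the fitted line at the original x values
y_fit = polyval(p, x);

fprintf('slope = %g, intercept = %g\n', p(1), p(2));
```

Because the points are exactly collinear, the fit should recover the slope and intercept to within rounding error.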
While there are better ways to
evaluate goodness of fit…
• It is beyond our scope to
cover all methods for
goodness of fit
• Take a Stats course!
• Correlation coefficient, R, is
one commonly used measure
• R^2 tells the fraction of your data’s
variance that is explained by a
linear fit
• R^2 = 0.95 means 95% of your
data variance is explained by
the linear fit.
• What is good enough? Depends
on the situation
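A sketch of how R and R^2 might be computed with corrcoef. The data values are made up and nearly linear. The second calculation (1 minus residual variance over total variance) is included only to illustrate the "variance explained" interpretation for a linear fit.

```matlab
% Made-up, nearly linear data
x = [1 2 3 4 5 6];
y = [2.1 3.9 6.2 7.8 10.1 11.9];

% corrcoef returns a 2x2 matrix; the off-diagonal entry is R
Rmat = corrcoef(x, y);
R  = Rmat(1, 2);
R2 = R^2;

% Same quantity from the residuals of a linear fit
p = polyfit(x, y, 1);
SSres  = sum((y - polyval(p, x)).^2);   % unexplained variation
SStot  = sum((y - mean(y)).^2);         % total variation
R2_alt = 1 - SSres/SStot;

fprintf('R = %.4f, R^2 = %.4f, 1 - SSres/SStot = %.4f\n', R, R2, R2_alt);
```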
Make synthetic data with noise
• See if fit is reasonable
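One possible version of this experiment. The "true" line y = 2x + 4, the noise level, and the number of points are all assumptions of this sketch. No random seed is set, so the fitted values change slightly from run to run.

```matlab
% Assumed "true" relationship and sampling
x = linspace(0, 10, 101);
y_true = 2*x + 4;

% Add Gaussian noise with standard deviation 1 (assumed noise level)
y_noisy = y_true + randn(size(x));

% Fit a line to the noisy data
p = polyfit(x, y_noisy, 1);

% Compare the fit with the known answer
fprintf('Fitted slope     = %.3f (true value 2)\n', p(1));
fprintf('Fitted intercept = %.3f (true value 4)\n', p(2));
```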
• Increasing the
order of the
polynomial allows
for more complex
curves to be fit
• Be careful to not
over-fit your
data!
• MATLAB will give
a warning if the
result is poorly
conditioned
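A rough illustration of over-fitting, assuming noisy data drawn from a straight line (the line, noise level, and polynomial orders are made up). A higher order polynomial always matches the sample points at least as closely as a lower order one, but that does not mean it describes the underlying trend better.

```matlab
% Noisy samples of the assumed line y = 2x + 4
x = linspace(0, 10, 15);
y = 2*x + 4 + randn(size(x));

% Fit a 1st order and a 6th order polynomial to the same data
p_lo = polyfit(x, y, 1);
p_hi = polyfit(x, y, 6);

% Misfit at the sample points
res_lo = norm(y - polyval(p_lo, x));
res_hi = norm(y - polyval(p_hi, x));

% Error relative to the true line just outside the data range
x_out = 12;
err_lo = abs(polyval(p_lo, x_out) - (2*x_out + 4));
err_hi = abs(polyval(p_hi, x_out) - (2*x_out + 4));

fprintf('Misfit at data: order 1 = %.3f, order 6 = %.3f\n', res_lo, res_hi);
fprintf('Error at x = 12: order 1 = %.3f, order 6 = %.3f\n', err_lo, err_hi);
```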
Occasionally, we may want to fit high order polynomials to data
• Typically, this is done to model the shape of a feature, not a data trend
To avoid poorly conditioned polynomial warnings, we can:
• Ask polyfit to scale the data before fitting
x̂ = (x − x̄) / σ_x = (x − mean(x)) / std(x) = (x − μ1) / μ2
E.g.
y = ax^2
• Forces the mean of the scaled x to be zero
• Forces the standard deviation of the scaled x to be 1
• Improves the numerical conditioning of the fit
• Tell polyval about the scaling to reconstruct the correct y-values
WARNING!: The polynomial coefficients are not the best fit of the
original data. They are best fit of the scaled data.
These are the best fit
coefficients of the
RESCALED data [(x − mu(1))/mu(2), y], not [x, y]
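A sketch of the scaling workflow using the three-output form of polyfit and the four-input form of polyval. The data (a parabola sampled at year-like x values) are made up for illustration.

```matlab
% Made-up data: large x values far from zero, a common cause of poor conditioning
x = linspace(2000, 2010, 50);
y = 0.5*(x - 2005).^2 + 3;

% Asking for three outputs makes polyfit center and scale x before fitting
[p, S, mu] = polyfit(x, y, 2);    % mu(1) = mean(x), mu(2) = std(x)

% Pass S and mu to polyval so it applies the same scaling
y_fit = polyval(p, x, S, mu);

% Equivalent by hand: scale x yourself, then evaluate
x_hat = (x - mu(1)) / mu(2);
y_direct = polyval(p, x_hat);

% p holds coefficients for x_hat, not x: the leading term is 0.5*mu(2)^2, not 0.5
fprintf('Leading coefficient = %.4f (0.5*std(x)^2 = %.4f)\n', p(1), 0.5*mu(2)^2);
```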
What if data is unevenly spaced?
• How could you estimate an evenly spaced data set?
• Interpolate it
• Interpolation: The process of estimating values in between data
points
What if data is limited in range?
• How could you estimate data beyond your data range?
• Extrapolate values
• Very prone to errors. Should always be done with extreme caution
• Extrapolation: The process of estimating values beyond the bounds
of your data
• MATLAB provides several ways to interpolate data
• I will only cover using “interp1” and “polyfit”
You can use a best fit curve to
interpolate
• Make sure the curve fits data well
• Best fit curves tend to smooth data
• Will not honor your collected data points!
• This is why formal interpolation is typically
preferred
• Be careful about extrapolating!
interp1: interpolates 1D data
• See also interp2 and interp3 for 2D/3D
• interp1 has several options
• Read the documentation
• We will only use linear or spline methods
Linear Interpolation
• Resultant data is boxy
• Min/Max will not exceed the original data
Interpolation using splines
• Resultant data has smooth curves
• Min/Max may exceed the original data
• Both methods honor y-vals at original data
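A minimal sketch comparing the two methods. The unevenly spaced data and the query spacing are made up for illustration.

```matlab
% Made-up, unevenly spaced data
x = [0 1 2.5 3 4.5 6];
y = [0 2 1 3 2.5 4];

% Evenly spaced points at which to estimate y
xi = 0:0.25:6;

y_lin = interp1(x, y, xi, 'linear');   % straight segments between points
y_spl = interp1(x, y, xi, 'spline');   % smooth curve through the points

% Both methods return the original y values at the original x values
y_lin_at_x = interp1(x, y, x, 'linear');
y_spl_at_x = interp1(x, y, x, 'spline');

fprintf('Data range:   [%.2f, %.2f]\n', min(y), max(y));
fprintf('Linear range: [%.2f, %.2f]\n', min(y_lin), max(y_lin));
fprintf('Spline range: [%.2f, %.2f]\n', min(y_spl), max(y_spl));
```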
• Unevenly sampled an equation
to make synthetic data
y = 8√x − 2.5x
• In this case, splines work best,
but the polynomial fit is not bad
• Unevenly sampled an equation
with some random noise (± 2)
added
y = 2x + 4
• In this case, linear interp is not
bad, but the linear fit is best
• “interp1” can be used
to extrapolate beyond
input data limits
• Use ‘extrap’ option
interp1(x, y, xq, 'linear', 'extrap'), where xq holds the query points
• Use with great
caution!
• Extrapolation is highly
prone to errors
• Extrapolation should
only be a last resort
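A sketch of what extrapolating with interp1 might look like. The sine-wave data and the query points are made up. The true function is known here only so the extrapolation error can be printed.

```matlab
% Made-up data sampled from a known function
x = 0:6;
y = sin(x);

% Query points beyond the data range
xq = [7 8 9];

% Without 'extrap', linear interpolation returns NaN outside the data
y_none = interp1(x, y, xq, 'linear');

% With 'extrap', each method continues its curve past the last point
y_lin = interp1(x, y, xq, 'linear', 'extrap');
y_spl = interp1(x, y, xq, 'spline', 'extrap');

% Compare with the known true values
fprintf('Linear extrapolation error: %s\n', mat2str(abs(y_lin - sin(xq)), 3));
fprintf('Spline extrapolation error: %s\n', mat2str(abs(y_spl - sin(xq)), 3));
```

Linear extrapolation simply extends the last straight segment, so it has no knowledge of how the data behave beyond the final two points.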
• Which method
worked best?
• None! Extrapolation
is a bad idea
• If you have to do it,
only go very slightly
beyond your data
limits
• MATLAB offers several built-in functions for curve fitting and
resampling data
• We covered only polyfit and interp1
• There is no way to know a priori which method is most appropriate
for your data
• Take statistics classes
• Always test your curve fits
• Know what relationship to expect between data (if possible)
• Polynomial fits are not appropriate for all data sets
• May want to explore other methods
• E.g. Fourier Analysis / Spectral Analysis
• MATLAB has TONS of other curve fitting and resampling options in
various toolboxes
• Don’t use toolbox commands in this class, but feel free to explore them in your
research