Transcript Slide 1
Geology 5670/6670
Inverse Theory
23 Jan 2015
Last time: Ordinary Least Squares Parameter Error
• Parameter error relates to measurement error and the geometry/physics of sampling as: $\tilde{m} = m_t + G^{-g}\varepsilon$, where $G^{-g} = (G^T G)^{-1} G^T$
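• “Review” of Gaussian distribution (univariate):
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right]$
where Mean: $\mu = E\{x\} = \int_{-\infty}^{\infty} x\, f(x)\, dx$
Variance: $\sigma^2 = E\{(x-\mu)^2\} = \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\, dx$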
• From multivariate, if measurement errors are zero-mean, random, uncorrelated, the model covariance matrix is:
$C_m = G^{-g} C_\varepsilon \left(G^{-g}\right)^T$, & if data have constant variance: $C_m = \sigma^2 (G^T G)^{-1}$
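The step from the general covariance to the constant-variance form can be written out in two lines. This is a sketch that assumes the ordinary least squares inverse $G^{-g} = (G^T G)^{-1} G^T$ and $C_\varepsilon = \sigma^2 I_N$:

```latex
\begin{aligned}
C_m &= G^{-g}\, C_\varepsilon \left(G^{-g}\right)^T
     = \left(G^T G\right)^{-1} G^T \left(\sigma^2 I_N\right) G \left(G^T G\right)^{-1} \\
    &= \sigma^2 \left(G^T G\right)^{-1} \left(G^T G\right) \left(G^T G\right)^{-1}
     = \sigma^2 \left(G^T G\right)^{-1}
\end{aligned}
```

The second equality uses the symmetry of $G^T G$, so that the transpose of its inverse is the inverse itself.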
Read for Fri 23 Jan: Menke Ch 3 (39-68)
© A.R. Lowry 2015
So we can estimate a parameter variance for each model parameter:
$\sigma_{m_i}^2 = V(\tilde{m}_i) = \sigma^2 \left[(G^T G)^{-1}\right]_{ii}$
And we write
$m_i = \tilde{m}_i \pm \sigma_{m_i}$
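As an illustration of the parameter variance formula, here is a minimal Python sketch for a straight-line model d = m0 + m1*x, where the 2x2 matrix G^T G can be inverted in closed form. The function name `fit_line`, the data values, and sigma = 0.2 are made up for the example and are not from the lecture.

```python
import math

def fit_line(x, d, sigma):
    """OLS fit of d = m0 + m1*x; returns estimates and their standard errors."""
    n = len(x)
    # Elements of the 2x2 matrix G^T G, where the rows of G are [1, x_i]
    s1, sx, sxx = float(n), sum(x), sum(xi * xi for xi in x)
    # Elements of the vector G^T d
    sd, sxd = sum(d), sum(xi * di for xi, di in zip(x, d))
    det = s1 * sxx - sx * sx
    # (G^T G)^{-1} from the closed-form 2x2 inverse
    inv = [[sxx / det, -sx / det], [-sx / det, s1 / det]]
    # m_est = (G^T G)^{-1} G^T d
    m = [inv[0][0] * sd + inv[0][1] * sxd,
         inv[1][0] * sd + inv[1][1] * sxd]
    # sigma_mi = sigma * sqrt([(G^T G)^{-1}]_ii), assuming constant data variance
    sig_m = [sigma * math.sqrt(inv[i][i]) for i in range(2)]
    return m, sig_m

x = [0.0, 1.0, 2.0, 3.0, 4.0]
d = [1.1, 2.9, 5.2, 6.8, 9.0]   # hypothetical data, roughly 1 + 2x
m, sig_m = fit_line(x, d, sigma=0.2)
for i in range(2):
    print(f"m{i} = {m[i]:.3f} +/- {sig_m[i]:.3f}")
```

For larger M one would normally form G explicitly and use a linear algebra library; the closed form is used here only to keep the sketch dependency-free.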
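What should we expect $E_{\min} = e^T e$ to be?
$E_{\min} = e^T e = (d - G\tilde{m})^T (d - G\tilde{m})$
Can substitute $d = G m_t + \varepsilon$ & $\tilde{m} = m_t + G^{-g}\varepsilon$; after lots of algebra and using the identity:
$x^T A x = \mathrm{Tr}\left(A\, x x^T\right)$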
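We get:
$E_{\min} = \mathrm{Tr}\left\{\left[I_N - G (G^T G)^{-1} G^T\right] \varepsilon \varepsilon^T\right\}$
And, taking the expected value:
$E\{E_{\min}\} = \mathrm{Tr}\left\{\left[I_N - G (G^T G)^{-1} G^T\right] E\{\varepsilon \varepsilon^T\}\right\}$
If we assume measurements with uncorrelated, constant variance:
$C_\varepsilon = E\{\varepsilon \varepsilon^T\} = \sigma^2 I_N$
Then:
$E\{E_{\min}\} = \sigma^2 \left\{\mathrm{Tr}(I_N) - \mathrm{Tr}\left[G (G^T G)^{-1} G^T\right]\right\} = (N - M)\,\sigma^2$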
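The expected misfit (N - M) times the data variance can be checked numerically. The following Python sketch is a Monte Carlo test for a straight-line model (M = 2); the true model (1, 2), N = 10, sigma = 0.5, and the trial count are arbitrary choices for illustration, not values from the lecture.

```python
import random

def emin_line_fit(x, d):
    """Minimum misfit E_min = e^T e for an OLS straight-line fit (M = 2)."""
    n = len(x)
    xbar, dbar = sum(x) / n, sum(d) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (di - dbar) for xi, di in zip(x, d)) / sxx
    intercept = dbar - slope * xbar
    return sum((di - intercept - slope * xi) ** 2 for xi, di in zip(x, d))

random.seed(0)                      # fixed seed so the run is repeatable
N, M, sigma = 10, 2, 0.5            # illustrative values
x = [float(i) for i in range(N)]
trials = 20000
total = 0.0
for _ in range(trials):
    # Synthetic data: true model (1, 2) plus zero-mean, uncorrelated Gaussian noise
    d = [1.0 + 2.0 * xi + random.gauss(0.0, sigma) for xi in x]
    total += emin_line_fit(x, d)

mean_emin = total / trials          # sample average of E_min over all trials
expected = (N - M) * sigma ** 2     # theoretical value, here 2.0
print(mean_emin, expected)
```

The sample average should approach the theoretical value as the number of trials grows, and dividing it by N - M gives back an estimate of the data variance.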
Useful take-home points:
One can always fit the data exactly if N = M.
If your measurement errors are unknown and
N – M is “large”,
$\tilde{\sigma}^2 \approx \frac{E_{\min}}{N - M}$
The latter is useful because oftentimes when $\sigma^2$ is estimated independently, we find that $\tilde{\sigma}^2 > \sigma^2$!
This generally indicates either (1) unanticipated “noise” in the measurements, (2) correlated errors or (3) (& very likely) the model is under-parameterized.
Hence we define a chi-squared parameter
$\chi^2 = \sum_{i=1}^{N} \frac{e_i^2}{\sigma^2 (N - M)}$, for which we expect $\chi^2_{\min} \approx 1$
Why chi-squared ($\chi^2$)?
A probability density function
describes the relative likelihood
that a random variable will occur
at a given point (e.g., the “bell
curve” for a Gaussian RV).
[Figure: PDF curves labeled by DOF (degrees of freedom)]
The sum of the squares of k zero-mean, unit-variance, uncorrelated, Gaussian-distributed random variables will follow a chi-squared distribution with k degrees-of-freedom:
$Q = \sum_{i=1}^{k} X_i^2$ has PDF: $f(x,k) = \frac{x^{k/2-1}\, e^{-x/2}}{2^{k/2}\, \Gamma(k/2)}$
As k gets very large, this function will peak at ~k (the mode is k - 2 and the mean is k)… But we also have a measure of the probability of getting some other result.
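To make the chi-squared PDF concrete, here is a small Python sketch that evaluates it and checks numerically that it integrates to about 1 and has mean about k. The helper names `chi2_pdf` and `integrate`, the choice k = 5, and the integration limits are assumptions made for the example.

```python
import math

def chi2_pdf(x, k):
    """Chi-squared PDF with k degrees of freedom (returns 0 for x <= 0 for simplicity)."""
    if x <= 0:
        return 0.0
    return x ** (k / 2 - 1) * math.exp(-x / 2) / (2 ** (k / 2) * math.gamma(k / 2))

def integrate(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

k = 5
# The upper limit 100 stands in for infinity; the tail beyond it is negligible for k = 5
area = integrate(lambda x: chi2_pdf(x, k), 0.0, 100.0)
mean = integrate(lambda x: x * chi2_pdf(x, k), 0.0, 100.0)
print(area, mean)   # area should be close to 1, mean close to k
```

Integrating the same function from 0 up to an observed value gives the probability of a result at least that small, which is how the distribution is used to judge a fit.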
The $\chi^2$ parameter is commonly used to evaluate data fit & optimize the choice of number of parameters:
1) If $\chi^2_{\min} > 1$, can safely add more model parameters
2) If $\chi^2_{\min} < 1$, too many parameters (model is fitting noise).
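The two rules above can be illustrated with a toy comparison. This Python sketch fits synthetic straight-line data with a constant (M = 1) and with a line (M = 2), then computes the chi-squared parameter normalized by N - M for each. The data, noise level, seed, and the function name `chi2_reduced` are all assumptions for the example.

```python
import random

def chi2_reduced(residuals, sigma, n_params):
    """Sum of squared residuals scaled by sigma^2 and by N - M."""
    return sum((e / sigma) ** 2 for e in residuals) / (len(residuals) - n_params)

random.seed(1)                      # fixed seed so the run is repeatable
sigma = 0.5                         # assumed known data standard deviation
x = [float(i) for i in range(20)]
d = [1.0 + 2.0 * xi + random.gauss(0.0, sigma) for xi in x]   # synthetic linear data

# Model A: constant only (M = 1); the OLS estimate is the data mean
dbar = sum(d) / len(d)
res_const = [di - dbar for di in d]

# Model B: straight line (M = 2), closed-form OLS slope and intercept
xbar = sum(x) / len(x)
sxx = sum((xi - xbar) ** 2 for xi in x)
slope = sum((xi - xbar) * (di - dbar) for xi, di in zip(x, d)) / sxx
intercept = dbar - slope * xbar
res_line = [di - intercept - slope * xi for xi, di in zip(x, d)]

chi2_const = chi2_reduced(res_const, sigma, 1)
chi2_line = chi2_reduced(res_line, sigma, 2)
print(chi2_const, chi2_line)
```

The constant model should give a value far above 1 (under-parameterized), while the line should give a value of order 1, so adding the slope parameter is justified but adding more would start fitting noise.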
Solution appraisal:
Assume: zero-mean, Gaussian, uncorrelated errors
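Estimate: Confidence intervals expressed as %: $100(1-\alpha)\%$
Case 1: Data error variance is known ($= \sigma^2$)
Desired confidence interval is $\pm z$ of the normal (z) distribution function:
$m_i = \tilde{m}_i \pm z\, \sigma_{m_i}$
[Figure: normal PDF with central area $1-\alpha$ between $-z$ and $+z$ and area $\alpha/2$ in each tail]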
Can get this from standard
statistical tables or codes