Transcript Slide 1

Geology 5670/6670
Inverse Theory
23 Jan 2015
Last time: Ordinary Least Squares Parameter Error
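• Parameter error relates to measurement error and the geometry/physics of sampling as: $\tilde{\mathbf{m}} = \mathbf{m}_t + \mathbf{G}^{-g}\boldsymbol{\varepsilon}$, where $\mathbf{G}^{-g} = (\mathbf{G}^T\mathbf{G})^{-1}\mathbf{G}^T$ for ordinary least squares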
• “Review” of Gaussian distribution (univariate):

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right]$$

where

Mean: $\mu = E\{x\} = \int_{-\infty}^{\infty} x\,f(x)\,dx$

Variance: $\sigma^2 = E\{(x-\mu)^2\} = \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\,dx$

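• From multivariate, if measurement errors are zero-mean, random, uncorrelated, the model covariance matrix is: $\mathbf{C}_m = \mathbf{G}^{-g}\,\mathbf{C}_\varepsilon\,\mathbf{G}^{-gT}$, & if the data have constant variance $\sigma^2$: $\mathbf{C}_m = \sigma^2\left(\mathbf{G}^T\mathbf{G}\right)^{-1}$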
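The two covariance expressions above can be checked against each other numerically. The following is a minimal sketch, assuming NumPy and a hypothetical straight-line design matrix (the values of `x` and `sigma` are made up for illustration); it is not taken from the lecture.

```python
import numpy as np

# Hypothetical straight-line problem: d_i = m1 + m2 * x_i  (N = 6, M = 2)
x = np.arange(6.0)
G = np.column_stack([np.ones_like(x), x])

sigma = 0.5                              # assumed data standard deviation
C_eps = sigma**2 * np.eye(len(x))        # uncorrelated, constant variance

# Least-squares generalized inverse: (G^T G)^{-1} G^T
G_inv = np.linalg.solve(G.T @ G, G.T)

# General propagation of data covariance into model covariance
C_m_general = G_inv @ C_eps @ G_inv.T

# Constant-variance shortcut: sigma^2 (G^T G)^{-1}
C_m_simple = sigma**2 * np.linalg.inv(G.T @ G)

print(np.allclose(C_m_general, C_m_simple))
```

The general form is the one to keep if the data variances differ or the errors are correlated; the shortcut only applies when the data covariance is a multiple of the identity.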
Read for Fri 23 Jan: Menke Ch 3 (39-68)
© A.R. Lowry 2015
So we can estimate a parameter variance for each model parameter:

$$\sigma_{m_i}^2 = V\{\tilde{m}_i\} = \sigma^2\left[\left(\mathbf{G}^T\mathbf{G}\right)^{-1}\right]_{ii}$$
And we write

$$m_i = \tilde{m}_i \pm \sigma_{m_i}$$
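As an illustration of reporting each parameter with its one-sigma error, here is a small sketch that fits a straight line to synthetic data. The "true" model, noise level, and sampling points are all hypothetical choices made for the example, and NumPy is assumed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data from a hypothetical "true" model (intercept 1, slope 2)
x = np.linspace(0.0, 10.0, 50)
G = np.column_stack([np.ones_like(x), x])
m_true = np.array([1.0, 2.0])
sigma = 0.3                              # assumed known data standard deviation
d = G @ m_true + rng.normal(0.0, sigma, size=x.size)

# Ordinary least squares estimate
m_est, *_ = np.linalg.lstsq(G, d, rcond=None)

# Parameter standard deviations from the diagonal of sigma^2 (G^T G)^{-1}
C_m = sigma**2 * np.linalg.inv(G.T @ G)
sigma_m = np.sqrt(np.diag(C_m))

for i, (mi, si) in enumerate(zip(m_est, sigma_m), start=1):
    print(f"m_{i} = {mi:.3f} +/- {si:.3f}")
```

Note that `sigma_m` depends only on `G` and `sigma`, not on the data values themselves, which is the "geometry/physics of sampling" part of the parameter error.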
What should we expect $E_{min} = \mathbf{e}^T\mathbf{e}$ to be?
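$$E_{min} = \mathbf{e}^T\mathbf{e} = \left(\mathbf{d} - \mathbf{G}\tilde{\mathbf{m}}\right)^T\left(\mathbf{d} - \mathbf{G}\tilde{\mathbf{m}}\right)$$

Can substitute $\mathbf{d} = \mathbf{G}\mathbf{m}_t + \boldsymbol{\varepsilon}$ & $\tilde{\mathbf{m}} = \mathbf{m}_t + \mathbf{G}^{-g}\boldsymbol{\varepsilon}$; after lots of algebra and using the identity $\mathbf{x}^T\mathbf{A}\mathbf{x} = \mathrm{Tr}\{\mathbf{A}\mathbf{x}\mathbf{x}^T\}$, we get:

$$E_{min} = \mathrm{Tr}\left\{\left[\mathbf{I}_N - \mathbf{G}\left(\mathbf{G}^T\mathbf{G}\right)^{-1}\mathbf{G}^T\right]\boldsymbol{\varepsilon}\boldsymbol{\varepsilon}^T\right\}$$

And:

$$E\{E_{min}\} = \mathrm{Tr}\left\{\left[\mathbf{I}_N - \mathbf{G}\left(\mathbf{G}^T\mathbf{G}\right)^{-1}\mathbf{G}^T\right]E\{\boldsymbol{\varepsilon}\boldsymbol{\varepsilon}^T\}\right\}$$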
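If we assume measurements with uncorrelated, constant-variance errors:

$$E\{\boldsymbol{\varepsilon}\boldsymbol{\varepsilon}^T\} = \mathbf{C}_\varepsilon = \sigma^2\mathbf{I}_N$$

Then:

$$E\{E_{min}\} = \sigma^2\left[\mathrm{Tr}\{\mathbf{I}_N\} - \mathrm{Tr}\left\{\mathbf{G}\left(\mathbf{G}^T\mathbf{G}\right)^{-1}\mathbf{G}^T\right\}\right] = (N - M)\,\sigma^2$$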
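The trace steps in this result can be checked numerically. The sketch below assumes NumPy and uses an arbitrary random design matrix (a hypothetical stand-in for any full-rank G) to confirm the quadratic-form identity and that the two traces come out as M and N - M.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 20, 3
G = rng.normal(size=(N, M))              # arbitrary full-rank design matrix (assumed)

# Projection onto the column space of G: G (G^T G)^{-1} G^T
H = G @ np.linalg.solve(G.T @ G, G.T)
R = np.eye(N) - H                        # maps data errors to residuals: e = R @ eps

# Identity x^T A x = Tr(A x x^T), checked for one random error vector
eps = rng.normal(size=N)
lhs = eps @ R @ eps
rhs = np.trace(R @ np.outer(eps, eps))

print(np.isclose(lhs, rhs))
print(np.trace(H), np.trace(R))          # approximately M and N - M
```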
Useful take-home points:
• One can always fit the data exactly if N = M.
• If your measurement errors are unknown and N – M is “large”,

$$\tilde{\sigma}^2 \approx \frac{E_{min}}{N - M}$$

The latter is useful because often-times when $\sigma^2$ is estimated independently, we find that $\tilde{\sigma}^2 > \sigma^2$! This generally indicates either (1) unanticipated “noise” in the measurements, (2) correlated errors or (3) (& very likely) the model is under-parameterized.
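A minimal sketch of estimating the data variance from the misfit follows, assuming NumPy and synthetic straight-line data with a made-up noise level. With a correctly parameterized model and many more data than parameters, the estimate should land close to the variance used to generate the noise.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic straight-line data; model and noise level are hypothetical
N = 2000
x = np.linspace(-1.0, 1.0, N)
G = np.column_stack([np.ones(N), x])     # M = 2 parameters
sigma_true = 0.7
d = G @ np.array([0.5, -1.5]) + rng.normal(0.0, sigma_true, size=N)

# Least-squares fit and residuals
m_est, *_ = np.linalg.lstsq(G, d, rcond=None)
e = d - G @ m_est
E_min = e @ e

# Variance estimate from the misfit, E_min / (N - M)
M = G.shape[1]
sigma2_est = E_min / (N - M)

print(sigma2_est, sigma_true**2)
```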

Hence we define a chi-squared parameter

$$\chi^2 = \frac{\sum_{i=1}^{N} e_i^2}{\sigma^2\,(N - M)}$$

with $\chi^2_{min} \approx 1$ expected when the error model is correct.
Why chi-squared ($\chi^2$)?
A probability density function describes the relative likelihood that a random variable will take a given value (e.g., the “bell curve” for a Gaussian RV).
[Figure: probability density curves, with legend labeled by DOF]
The sum of the squares of k zero-mean, unit-variance, uncorrelated, Gaussian-distributed random variables will follow a chi-squared distribution with k degrees-of-freedom:

$$Q = \sum_{i=1}^{k} X_i^2 \quad \text{has PDF:} \quad f(x,k) = \frac{x^{k/2-1}\,e^{-x/2}}{2^{k/2}\,\Gamma(k/2)}$$
As k gets very large, this function will peak near $x \approx k$ (the mode is $k - 2$ for $k \ge 2$)… But we also have a measure of the probability of getting some other result.
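To see where the chi-squared PDF peaks, one can evaluate it on a grid. The sketch below uses only the Python standard library; the helper names `chi2_pdf` and `numerical_mode` and the grid spacing are illustrative assumptions.

```python
import math

def chi2_pdf(x, k):
    """Chi-squared PDF with k degrees of freedom (zero for x <= 0)."""
    if x <= 0:
        return 0.0
    # Work in logs so large k does not overflow the gamma function
    log_f = ((k / 2 - 1) * math.log(x) - x / 2
             - (k / 2) * math.log(2) - math.lgamma(k / 2))
    return math.exp(log_f)

def numerical_mode(k, step=0.01):
    """Grid search for the location of the PDF maximum on (0, 3k)."""
    xs = [i * step for i in range(1, int(3 * k / step))]
    return max(xs, key=lambda x: chi2_pdf(x, k))

for k in (4, 10, 50):
    print(k, round(numerical_mode(k), 2))
```

Dividing Q by k, as in the reduced chi-squared parameter, moves this peak to about 1 for large k.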

The $\chi^2$ parameter is commonly used to evaluate data fit & optimize the choice of number of parameters:
1) If $\chi^2_{min} > 1$, can safely add more model parameters
2) If $\chi^2_{min} < 1$, too many parameters (model is fitting noise).
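One way to apply this in practice is to fit models of increasing size and watch the chi-squared parameter. The sketch below assumes NumPy, a known noise level, and synthetic data generated from a hypothetical quadratic; polynomial fitting is just a convenient stand-in for "adding model parameters".

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic data from a hypothetical quadratic with known noise level
N = 200
x = np.linspace(-1.0, 1.0, N)
sigma = 0.1
d = 1.0 - 2.0 * x + 3.0 * x**2 + rng.normal(0.0, sigma, size=N)

chi2_by_M = {}
for M in range(1, 6):
    # Columns 1, x, x^2, ..., x^(M-1): a polynomial with M parameters
    G = np.vander(x, M, increasing=True)
    m_est, *_ = np.linalg.lstsq(G, d, rcond=None)
    e = d - G @ m_est
    chi2_by_M[M] = float(e @ e) / (sigma**2 * (N - M))
    print(M, round(chi2_by_M[M], 2))
```

With one or two parameters the value is far above 1 (under-parameterized); it should drop to roughly 1 once the quadratic term is included, after which further terms buy little.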
Solution appraisal:
Assume: zero-mean, Gaussian, uncorrelated errors
Estimate: Confidence intervals expressed as %: $100(1-\alpha)\%$
Case 1: Data error variance is known ($= \sigma^2$)
Desired confidence interval is $\pm z$ of the normal (z) distribution function:

$$m_i = \tilde{m}_i \pm z\,\sigma_{m_i}$$

[Figure: normal (z) distribution with area $1-\alpha$ between $-z$ and $+z$ and area $\alpha/2$ in each tail]
Can get this from standard statistical tables or codes