Introduction to smoothing splines

Transcript Introduction to smoothing splines

Introduction to
Smoothing Splines
Tongtong Wu
Feb 29, 2004
Outline

Introduction



Linear and polynomial regression, and
interpolation
Roughness penalties
Interpolating and Smoothing splines






Cubic splines
Interpolating splines
Smoothing splines
Natural cubic splines
Choosing the smoothing parameter
Available software
Key Words
roughness penalty
 penalized sum of squares
 natural cubic splines

2
4
(y18)
6
8
10
Motivation
5
10
Index
15
2
4
y18
6
8
10
Motivation
5
10
Index
15
2
4
y18
6
8
10
Motivation
5
10
Index
15
2
4
(y18)
6
8
10
Motivation
Spline(y18)
5
10
Index
15
Introduction

Linear and polynomial regression :



Global influence
Increasing of polynomial degrees happens in
discrete steps and can not be controlled
continuously
Interpolation

Unsatisfactory as explanations of the given
data
Roughness penalty approach

A method for relaxing the model
assumptions in classical linear regression
along lines a little different from
polynomial regression.
Roughness penalty approach

Aims of curving fitting



A good fit to the data
To obtain a curve estimate that does not
display too much rapid fluctuation
Basic idea: making a necessary
compromise between the two rather
different aims in curve estimation
Roughness penalty approach

Quantifying the roughness of a curve

An intuitive way:
 g ' ' (t ) dt
b
2
a

(g: a twice-differentiable curve)
Motivation from a formalization of a
mechanical device: if a thin piece of
flexible wood, called a spline, is bent to
the shape of the graph g, then the leading
term in the strain energy is proportional to
2
g
'
'

Roughness penalty approach

Penalized sum of squares
n
S ( g )   Yi  g (ti )    g ' ' (t ) dt
2
a
i 1



b
g: any twice-differentiable function on [a,b]
 : smoothing parameter (‘rate of exchange’
between residual error and local variation)
Penalized least squares estimator
gˆ  arg min S ( g )
2
Roughness penalty approach
2
4
y18
6
8
10
Curve for a large value of 
5
10
Index
15
Roughness penalty approach
2
4
y18
6
8
10
Curve for a small value of 
5
10
Index
15
Interpolating and Smoothing Splines

Cubic splines

Interpolating splines

Smoothing splines

Choosing the smoothing parameter
Cubic Splines

Given a<t1<t2<…<tn<b, a function g is a
cubic spline if
1.
On each interval (a,t1), (t1,t2), …, (tn,b), g is a
cubic polynomial
2.
The polynomial pieces fit together at points ti
(called knots) s.t. g itself and its first and
second derivatives are continuous at each ti,
and hence on the whole [a,b]
Cubic Splines

How to specify a cubic spline
g (t )  di (t  ti )3  ci (t  ti )2  bi (t  ti )  ai for ti  t  ti 1

Natural cubic spline (NCS) if its second
and third derivatives are zero at a and b,
which implies d0=c0=dn=cn=0, so that g is
linear on the two extreme intervals [a,t1]
and [tn,b].
Natural Cubic Splines
Value-second derivative representation

We can specify a NCS by giving its value
and second derivative at each knot ti.

Define
g  ( g1,, gn )', where gi  g (ti )
  ( 2 ,,  n1 )', where i  g ' ' (ti )
which specify the curve g completely.

However, not all possible vectors
represent a natural spline!
Natural Cubic Splines
Value-second derivative representation

Theorem 2.1
The vector g and  specify a natural
spline g if and only if
Q' g  R
Then the roughness penalty will satisfy

b
a
g ' ' (t ) 2 dt   ' R  g ' Kg
Natural Cubic Splines
Value-second derivative representation
 h11
0
 1 1
1

h

h
h
1
2
2

 h21
 h21  h31
Q
0
h31





0
0

1
1
(
h

h
)
h2
3 1 3
6
 1
1

h
(h2  h3 )
2
R
6
3





0
0

 0  hi  ti 1  ti for i  1,, n

 0 
 0 

 0 
  

1
 hn 1  n( n  2)





0




1

(hn  2  hn 1 )
 ( n 2)( n  2)
3

0
Natural Cubic Splines
Value-second derivative representation

R is strictly diagonal dominant, i.e.
| rii |  j i | rij |, i
 R is positive definite, so we can define
1
K  QR Q'
Interpolating Splines

To find a smooth curve that interpolate (ti,zi),
i.e. g(ti)=zi for all i.

Theorem 2.2
Suppose n  2 and t1<…<tn. Given any
values z1,…,zn, there is a unique natural cubic
spline g with knots ti satisfying
g (ti )  zi for i  1,, n
Interpolating Splines

The natural cubic spline interpolant is the
unique minimizer of  g ' '2 over S2[a,b] that
interpolate the data.

Theorem 2.3
Suppose g is the interpolant natural cubic
~  S [a, b] with g
~(t )  z for i  1,, n
spline, g
2
i
i
then
~ ' '2  g ' '2
g


Smoothing Splines

Penalized sum of squares
n
S ( g )   Yi  g (ti )    g ' ' (t ) dt
i 1



2
b
a
g: any twice-differentiable function on [a,b]
 : smoothing parameter (‘rate of exchange’
between residual error and local variation)
Penalized least squares estimator
gˆ  arg min S ( g )
2
Smoothing Splines
1. The curve estimator gˆ is necessarily
a natural cubic spline with knots at ti,
for i=1,…,n.
Proof: suppose g is the NCS
n
n
2
2
~
 Yi  g (ti )   Yi  g (ti )
i 1
i 1
 g ' ' (t ) dt  
b
a
2
b
a
2
~
g ' ' (t ) dt
 S ( g )  S ( g~)
Smoothing Splines
2. Existence and uniqueness
Let Y  (Y1,, Yn )' then
n
2


Y

g
(
t
)
 (Y  g )' (Y  g )
 i
i
i 1
since g be precisely the vector of g (ti ) .
2
g
'
'
 g ' Kg ,
Express 
S(g)  (Y  g)'(Y  g)  g'Kg
 g'(I  K)g  2Y ' g  Y'Y
Minimumis achievedby settingg  ( I  K )1Y

Smoothing Splines
2. Theorem 2.4
Let gˆ be the natural cubic spline with
1
knots at ti for which g  ( I  K ) Y . Then
for any g in S2[a,b]
S ( gˆ )  S ( g )
Smoothing Splines
3. The Reinsch algorithm
Y  ( I  K ) g  ( I  QR1Q) g
 g  Y  QR1Q) g  Y  Q
(Q' g  R )
 Q' Y  ( R  Q' Q)
The matrix ( R  Q' Q) has bandwidth 5 and is
symmetric and strictly positive-definite,
therefore it has a Cholesky decomposition
R  Q' Q  LDL'
Smoothing Splines
3. The Reinsch algorithm for spline smoothing
Step 1: Evaluate the vector Q' Y .
Step 2: Find the non-zero diagonals of
R  Q' Q
and hence the Cholesky decomposition
factors L and D.
Step 3: Solve
LDL'   Q' Y
for  by forward and back substitution.
Step 4: Find g by g  Y  Q .
Smoothing Splines
4. Some concluding remarks
 Minimizing curve gˆ essentially does not depend
on a and b, as long as all the data points lie
between a and b.
 If n=2, for any  , setting gˆ to be the straight
line through the two points (t1,Y1) and (t2,Y2) will
reduce S(g) to zero.
 If n=1, the minimizer is no longer unique, since
any straight line through (t1,Y1) will yield a zero
value S(g).
Choosing the Smoothing Parameter

Two different philosophical
approaches

Subjective choice

Automatic method – chosen by data

Cross-validation

Generalized cross-validation
Choosing the Smoothing Parameter

Cross-validation
min CV ( )  n
1

 Y  gˆ
n
i
i 1
( i )

(ti ; )
2
2
 Yi  gˆ (ti ) 
 if gˆ is thesplinesmoother with 
 n  
i 1  1  Aii ( ) 
1

n
Generalized cross-validation
n
min GCV ( )  n 1

2
ˆ


Y

g
(
t
)
 i i
i 1
1  n

trA( )
1
2

n  residual sum of squares
(equivalent df)2
Available Software
smooth.spline in R

Description:
Fits a cubic smoothing spline to the supplied data.

Usage:
plot(speed, dist)
cars.spl <- smooth.spline(speed, dist)
cars.spl2 <- smooth.spline(speed, dist, df=10)
lines(cars.spl, col = "blue")
lines(cars.spl2, lty=2, col = "red")
Available Software
Example 1
library(modreg)
y18 <- c(1:3,5,4,7:3,2*(2:5),rep(10,4))
xx <- seq(1,length(y18), len=201)
(s2 <- smooth.spline(y18)) # GCV
(s02 <- smooth.spline(y18, spar = 0.2))
plot(y18, main=deparse(s2$call), col.main=2)
lines(s2, col = "blue");
lines(s02, col = "orange");
lines(predict(s2, xx), col = 2)
lines(predict(s02, xx), col = 3);
mtext(deparse(s02$call), col = 3)
Available Software
Example 1
Available Software
Example 2
data(cars) ## N=50, n (# of distinct x) =19
attach(cars)
plot(speed, dist, main = "data(cars) & smoothing splines")
cars.spl <- smooth.spline(speed, dist)
cars.spl2 <- smooth.spline(speed, dist, df=10)
lines(cars.spl, col = "blue")
lines(cars.spl2, lty=2, col = "red")
lines(smooth.spline(cars, spar=0.1))
## spar: smoothing parameter (alpha) in (0,1]
legend(5,120,c(paste("default [C.V.] => df
=",round(cars.spl$df,1)), "s( * , df = 10)"), col =
c("blue","red"), lty = 1:2, bg='bisque')
detach()
Available Software
Example 2
Extensions of
Roughness penalty approach

Semiparametric modeling: a simple application
to multiple regression
Y  g (t )  x'   

Generalized linear models (GLM)

To allow all the explanatory variables to be
nonlinear
Y  g (t )  

Additive model approach
d
Y   g j (t j )  
j 1
Reference

P.J. Green and B.W. Silverman (1994)
Nonparametric Regression and Generalized
Linear Models. London: Chapman & Hall

Introduction to smoothing splines

Transcript Introduction to smoothing splines

Directory