Human Growth - University of British Columbia

An Introduction to Functional
Data Analysis
Jim Ramsay
McGill University
What is functional data analysis?
A data analysis is functional if either:
• The data to be analyzed are assumed to
come from a smooth function of some
continuum (e.g., time, space, space/time,
ability, depression, frequency, molecular
weight).
• The data are to be modeled by a smooth
function (e.g., probability density, dose
response function, intensity function).
So what’s new about that?
Two ideas are central:
• The functions are smooth, usually
meaning that one or more derivatives can
be estimated and are useful.
• No assumptions, such as stationarity, low
dimensionality, or equally spaced sampling
points, are made about the functions
or the data.
Heights of 10 girls
[Figure: height (cm, axis from 60 to 200) plotted against age (axis from 2 to 18) for each of the 10 girls.]
Important features of growth data
• Times of measurements are not equally
spaced.
• Growth is smooth; we want to study the
second derivative of these curves.
• No assumptions are made initially about
the shapes of these curves or their
derivatives.
• A girl’s entire height record is viewed as a
single unitary observation.
What new challenges are there?
Have a look at the estimated acceleration
curves for these ten girls.
• Growth curves should be monotonic; how
can we achieve this?
• How can we get a good estimate of
acceleration?
• What’s wrong with the mean acceleration
curve?
Acceleration curves for 10 girls
[Figure: height acceleration (cm/year/year, axis from -4 to 2) plotted against age (axis from 2 to 18) for each of the 10 girls.]
Phase and amplitude variation
We see that acceleration curves vary in
terms of:
• The intensity of the pubertal growth spurt
(amplitude variation).
• The timing of the spurt (phase variation).
• Unless we can remove the phase variation,
the cross-sectional mean is worthless:
averaging over spurts that occur at
different ages blurs the spurt itself.
Do standard data analyses have
functional counterparts?
They surely do. There are functional
versions of:
• Analysis of variance
• Multiple regression analysis
• Principal components analysis
• Canonical correlation analysis
• Cluster and classification analysis
Is there anything new in FDA?
• Because the functions we estimate are
assumed smooth, we can model the
dynamic behavior of the data.
• This means using differential equations to
model how the output of an input/output
system changes in response to changes in
the input.
Data from an oil refinery
• The data are from a tray in a distillation
column.
• The output is the top plot; the input is the
bottom plot.
• The solid line is a model using a simple
first-order differential equation:
Dx(t) = -βx(t) + αu(t)
where x(t) is the output function, u(t) is
the input function, and D denotes the
derivative with respect to t.
• How can we estimate such models from
noisy observed data?
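One way to see what the equation above implies is to simulate it and then try to recover its coefficients. The sketch below is only an illustration under stated assumptions: the values of β and α, the step-shaped input, and the noise-free output are all hypothetical, and the finite-difference regression stands in for whatever estimation method is actually used on the refinery data.

```python
import numpy as np

# Hypothetical parameter values, chosen for illustration only.
beta_true, alpha_true = 0.5, 2.0

t = np.linspace(0.0, 20.0, 2001)
u = np.where(t >= 5.0, 1.0, 0.0)   # assumed input: a step that switches on at t = 5

# Exact solution of Dx = -beta*x + alpha*u for this step input with x(0) = 0:
# x stays at 0 until the step, then rises toward the level alpha/beta.
elapsed = np.clip(t - 5.0, 0.0, None)
x = (alpha_true / beta_true) * (1.0 - np.exp(-beta_true * elapsed))

# Approximate Dx by finite differences, then regress Dx on (-x, u)
# so that the two regression coefficients estimate beta and alpha.
dx = np.gradient(x, t)
design = np.column_stack([-x, u])
(beta_hat, alpha_hat), *_ = np.linalg.lstsq(design, dx, rcond=None)

print(f"beta_hat = {beta_hat:.3f}, alpha_hat = {alpha_hat:.3f}")
```

With noisy observations the finite-difference derivative would be far too rough for this regression, which is why estimating a smooth function and its derivative comes first.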
Where do we start?
• The first task is to learn methods for
estimating smooth functions from discrete
noisy data.
• We use basis function expansions to
model functions.
• We impose smoothness using roughness
penalties.
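The two ingredients named above, a basis function expansion and a roughness penalty, can be sketched in a few lines. This is a minimal illustration, not the method of any particular package: the Fourier basis, the penalty on the integrated squared second derivative, the simulated data, and the two values of the smoothing parameter `lam` are all assumptions made for the example.

```python
import numpy as np

def fourier_basis(t, n_pairs, period):
    """Evaluate 1, sin(k*w*t), cos(k*w*t) for k = 1..n_pairs at the points t."""
    w = 2.0 * np.pi / period
    cols = [np.ones_like(t)]
    for k in range(1, n_pairs + 1):
        cols += [np.sin(k * w * t), np.cos(k * w * t)]
    return np.column_stack(cols)

def roughness_matrix(n_pairs, period):
    """Diagonal R such that c @ R @ c is the integral of (D^2 x)^2 over one period."""
    w = 2.0 * np.pi / period
    diag = [0.0]                      # the constant term has zero second derivative
    for k in range(1, n_pairs + 1):
        diag += [(k * w) ** 4 * period / 2.0] * 2
    return np.diag(diag)

def smooth(t, y, n_pairs, period, lam):
    """Minimise sum (y - Phi c)^2 + lam * c' R c and return the coefficients c."""
    Phi = fourier_basis(t, n_pairs, period)
    R = roughness_matrix(n_pairs, period)
    return np.linalg.solve(Phi.T @ Phi + lam * R, Phi.T @ y)

rng = np.random.default_rng(0)
period = 1.0
t = np.sort(rng.uniform(0.0, period, 60))   # unequally spaced sampling points
y = np.sin(2 * np.pi * t) + rng.normal(0.0, 0.2, t.size)

n_pairs = 10
R = roughness_matrix(n_pairs, period)
c_light = smooth(t, y, n_pairs, period, lam=1e-8)   # almost no penalty
c_heavy = smooth(t, y, n_pairs, period, lam=1e-4)   # stronger penalty
rough_light = c_light @ R @ c_light
rough_heavy = c_heavy @ R @ c_heavy
print(rough_light, rough_heavy)
```

Raising `lam` trades fidelity to the data for a smaller roughness value, and nothing in the fit requires the sampling points to be equally spaced.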
And what’s next?
• Because most functional data show
variation in both phase and amplitude, the
next step is to learn how to separate
phase from amplitude variation.
• This process is called curve registration.
• After that, we can use functional versions
of standard multivariate data analyses.
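A deliberately simplified sketch may help show what registration buys. It assumes hypothetical bell-shaped curves that differ in peak time (phase) and peak height (amplitude), and it aligns them with a pure time shift to a single landmark, the peak. Registration in general allows more flexible transformations of time; the shift used here is only a stand-in.

```python
import numpy as np

t = np.linspace(0.0, 20.0, 401)                      # common time grid
centers = np.array([8.0, 9.0, 10.0, 11.0, 12.0])     # phase variation: timing of the peak
heights = np.array([3.0, 4.0, 5.0, 4.0, 3.0])        # amplitude variation: size of the peak

curves = np.array([h * np.exp(-0.5 * (t - c) ** 2)
                   for h, c in zip(heights, centers)])

# Landmark for each curve: the time at which it peaks.
peak_times = t[np.argmax(curves, axis=1)]
target = peak_times.mean()

# Shift each curve so its peak sits at the target time. np.interp evaluates
# curve i at t + (peak_i - target) and holds the end values outside the grid.
registered = np.array([np.interp(t + (p - target), t, x)
                       for x, p in zip(curves, peak_times)])

raw_mean = curves.mean(axis=0)
registered_mean = registered.mean(axis=0)
print(raw_mean.max(), registered_mean.max(), heights.mean())
```

The peak of the unregistered mean is lower than the average peak height of the individual curves, whereas the registered mean reproduces it; what remains after registration is amplitude variation only.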
What about functional exploratory
data analysis?
• As always, graphical display methods are
indispensable.
• We will focus on the phase/plane plot as a
way of exploring the interplay between
derivatives.
• Principal components and cluster analyses
are also useful.
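The phase/plane plot mentioned above pairs one derivative with another at each time point. As a minimal sketch, assume a hypothetical pure sinusoid as the curve and finite differences as the derivative estimates; plotting `velocity` against `acceleration` would then trace an ellipse, which the code checks numerically instead of drawing.

```python
import numpy as np

omega = 2.0 * np.pi                  # hypothetical frequency: one cycle per unit time
t = np.linspace(0.0, 1.0, 1001)
x = np.sin(omega * t)

velocity = np.gradient(x, t)               # estimate of Dx
acceleration = np.gradient(velocity, t)    # estimate of D^2 x

# For a pure sinusoid, (Dx / omega)^2 + (D^2 x / omega^2)^2 = 1,
# so the points (velocity, acceleration) lie on an ellipse.
radius = (velocity / omega) ** 2 + (acceleration / omega ** 2) ** 2

# Skip a few points at each end, where one-sided differences are less accurate.
interior = radius[5:-5]
print(interior.min(), interior.max())
```

For real curves the plot is not a clean ellipse, and departures from that shape are what make it informative about how the derivatives interact.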
How do we use covariate
information?
• Covariates or independent variables can
be (a) multivariate and (b) functional.
• Regression analysis with a functional
response and multivariate covariates is
fairly straightforward.
• Regressing on functional covariates leads
to new challenges, however.
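To see why a functional response with multivariate covariates is the easier case, suppose every response curve has been evaluated on a common grid. The model y_i(t) = z_i0 β_0(t) + z_i1 β_1(t) + error can then be fitted by ordinary least squares at all grid points at once. The sketch below assumes simulated data; the covariates, coefficient functions, and noise level are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 50)        # common grid on which every curve is evaluated
n_curves = 40

# Hypothetical scalar covariates: an intercept and one continuous variable.
Z = np.column_stack([np.ones(n_curves), rng.normal(size=n_curves)])

# Hypothetical coefficient functions beta_0(t) and beta_1(t), one per row.
B_true = np.vstack([np.sin(2 * np.pi * t), t ** 2])

# Functional responses: one row per curve, one column per grid point.
Y = Z @ B_true + rng.normal(0.0, 0.05, (n_curves, t.size))

# Least squares solved for every grid point simultaneously; row j of B_hat
# is the estimated coefficient function beta_j(t) on the grid.
B_hat, *_ = np.linalg.lstsq(Z, Y, rcond=None)
print(np.max(np.abs(B_hat - B_true)))
```

The regression coefficients become functions of t, but the design matrix is the familiar one. With a functional covariate there is no such finite design matrix to start from.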
Can derivatives be used, too?
• Every function, whether directly fit to
data, or estimated from non-functional
data, is assumed to have one or more
derivatives available for an analysis.
• A differential equation is a model that
contains one or more derivatives as a part
of the model.
Unique Aspects of Functional Data
Analysis
• The data are from a smooth process, so
we can use derivatives in various ways.
• Time itself may be an elastic medium, and
vary over functional observations.
• Differential equations can play a big role in
a functional data analysis.
Finding out More
• Ramsay, J. O. and Silverman, B. W. (1997, 2004).
Functional Data Analysis. Springer.
• Ramsay, J. O. and Silverman, B. W. (2002).
Applied Functional Data Analysis. Springer.
• Visit the FDA website:
www.psych.mcgill.ca/misc/fda/
• Software in Matlab, R and S-PLUS is available at
ego.psych.mcgill.ca/pub/ramsay