Transcript SAS/IML

Statistical computing with SAS/IML
Presented by
Jian Chen
PhD (Applied Statistics)
MS (Computer Science)
Sr. Statistician, Credigy
SAS/IML
SAS Interactive Matrix Language:
Beyond!
Outline
• Overview of SAS/IML.
• Language nuts and bolts.
• An example in Bayesian analysis.
• Applications.
• References.
Features of SAS/IML
• The simple SAS/IML program:
Proc iml;
Print 'Hello World!';
Quit;
• Is a programming language operating on
matrices.
• Has a complete set of control statements.
• Has a powerful vocabulary of operators.
• Can use operators that apply to entire matrices.
• Can be interactive.
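A minimal sketch of operators that act on whole matrices. The matrices A and B are hypothetical values chosen only for illustration:

```sas
proc iml;
   /* Hypothetical 2x2 matrices, for illustration only */
   A = {1 2, 3 4};
   B = {5 6, 7 8};

   C    = A * B;     /* matrix multiplication */
   D    = A # B;     /* elementwise multiplication */
   E    = A || B;    /* horizontal concatenation */
   At   = t(A);      /* transpose */
   Ainv = inv(A);    /* matrix inverse */

   print C D E At Ainv;
quit;
```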
Features of SAS/IML
(2-2)
• Many Base SAS functions are accessible
from SAS/IML, which also has many built-in
functions of its own.
• Can define functions or subroutines (modules)
to implement the core algorithm.
• Can call a C program (or Fortran, COBOL, or
PL/I programs) from within SAS/IML via the
module() functions (Windows only).
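A small sketch of user-defined modules, one function and one subroutine. The module names SumSq and MeanStd and the data vector are hypothetical, not taken from the talk:

```sas
proc iml;
   /* Function module: returns a value */
   start SumSq(x);
      return( ssq(x) );
   finish;

   /* Subroutine module: results come back through the first two arguments */
   start MeanStd(m, s, x);
      n = nrow(x) * ncol(x);
      m = x[:];                           /* mean of all elements */
      s = sqrt( ssq(x - m) / (n - 1) );   /* sample standard deviation */
   finish;

   v = {2, 4, 4, 6};                      /* hypothetical data */
   total = SumSq(v);
   call MeanStd(avg, sd, v);
   print total avg sd;
quit;
```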
With SAS/IML
• Edit existing SAS data sets or create new ones.
• Access external files with an extensive set of
data processing commands for data input and
output.
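A sketch of reading a SAS data set into a matrix and writing a new data set. It assumes the SASHELP.CLASS sample data set is available; the output name WORK.CLASS2 and the derived ratio column are hypothetical:

```sas
proc iml;
   /* Read two numeric variables into a matrix */
   use sashelp.class;
   read all var {height weight} into X;
   close sashelp.class;

   /* Hypothetical derived column: weight divided by squared height */
   ratio = X[,2] / X[,1]##2;
   out = X || ratio;

   /* Create a new data set from the matrix */
   create work.class2 from out[colname={"height" "weight" "ratio"}];
   append from out;
   close work.class2;
quit;
```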
Numerical Functions and Algorithms
• Subroutines:
– Outlier detection and robust regression.
– Numerical integration of scalar functions in one dimension over infinite, connected semi-infinite, and connected finite intervals.
– Optimization: minimizing or maximizing a continuous nonlinear function f = f(x) of n parameters.
• Produce graphics with a powerful set of graphics commands (requires SAS/GRAPH).
• Kalman filters.
• Time series analysis.
• Wavelet analysis.
• Genetic algorithms (experimental).
• Sparse matrices (experimental).
An example
– Problem: Assume we know Y(1), …, Y(n). What are the future values Y(n+1), Y(n+2), …?
– The p-th order autoregressive model, AR(p):
$$Y(t) = \sum_{i=1}^{p} \theta_i\, Y(t-i) + \epsilon(t), \qquad t = 1, \ldots, n, \ldots \tag{1}$$
where
$$\epsilon = (\epsilon(1), \ldots, \epsilon(n))' \sim N(0, \tau^{-1} I), \qquad \tau = 1/\sigma^2 > 0.$$
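A sketch of generating an AR(p) series from model (1) in SAS/IML. The coefficient values, the precision, the seed, and the zero start-up values are hypothetical choices, and the names theta and tau are assumed labels for the coefficients and the error precision:

```sas
proc iml;
   call randseed(12345);
   n     = 100;
   theta = {0.5, -0.3};        /* hypothetical AR(2) coefficients */
   tau   = 4;                  /* hypothetical error precision, tau = 1/sigma^2 */
   sigma = 1 / sqrt(tau);
   p     = nrow(theta);

   eps = j(n + p, 1, .);
   call randgen(eps, "Normal", 0, sigma);

   y = j(n + p, 1, 0);         /* first p entries are zero start-up values */
   do i = p + 1 to n + p;
      lags = y[(i-1):(i-p), ]; /* Y(i-1), ..., Y(i-p) as a column */
      y[i] = t(theta) * lags + eps[i];
   end;
   y = y[(p+1):(n+p), ];       /* drop the start-up values */

   print (y[1:5, ])[label="First five simulated values"];
quit;
```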
Priors
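• Bayes approach:
• Under the Normal-Gamma prior
$$\pi(\theta, \tau) = \pi_1(\theta \mid \tau)\, \pi_2(\tau)$$
where
$$\pi_2(\tau) \propto \tau^{\alpha - 1} e^{-\beta \tau}$$
$$\pi_1(\theta \mid \tau) \propto \tau^{p/2}\, e^{-\frac{\tau}{2} (\theta - \mu)' Q\, (\theta - \mu)}$$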
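A sketch of drawing one parameter set from a Normal-Gamma prior: the precision from a Gamma distribution with shape alpha and rate beta, then the coefficients from a normal distribution with mean mu and covariance inverse(tau*Q). All hyperparameter values are hypothetical, and the symbol names are assumptions:

```sas
proc iml;
   call randseed(1);
   alpha = 3;  beta = 2;          /* hypothetical Gamma hyperparameters */
   mu = {0.5 -0.3};               /* hypothetical prior mean (row vector) */
   Q  = {2 0, 0 2};               /* hypothetical prior matrix */

   /* tau ~ Gamma(shape = alpha, scale = 1/beta), that is, rate beta */
   tau = j(1, 1, .);
   call randgen(tau, "Gamma", alpha, 1/beta);

   /* theta | tau ~ N(mu, inv(tau*Q)), via the Cholesky root */
   U = root( inv(tau * Q) );      /* upper triangular, t(U)*U = inv(tau*Q) */
   z = j(1, ncol(mu), .);
   call randgen(z, "Normal");
   theta = mu + z * U;

   print tau theta;
quit;
```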
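Loss Function
• Modified Higgins-Tsokos (MHT) loss function
$$L(\hat\theta, \theta) = \begin{cases} \dfrac{a_1 e^{a_2(\hat\theta - \theta)} + a_2 e^{-a_1(\hat\theta - \theta)}}{a_1 + a_2} - 1 & \text{if } |\hat\theta - \theta| \le a \\ c_1 & \text{if } \hat\theta - \theta > a \\ c_2 & \text{if } \hat\theta - \theta < -a \end{cases}$$
where $a_1, a_2, a > 0$ and $c_1, c_2$ make the loss function continuous, that is: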
Loss Function
$$c_1 = \frac{a_1 e^{a_2 a} + a_2 e^{-a_1 a}}{a_1 + a_2} - 1$$
$$c_2 = \frac{a_1 e^{-a_2 a} + a_2 e^{a_1 a}}{a_1 + a_2} - 1$$
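A sketch of the MHT loss as a SAS/IML function module. The module name MHTLoss and the constants are hypothetical; the code assumes c1 applies when the estimate exceeds the true value by more than a, and c2 when it falls short by more than a, which is what continuity at the two boundaries requires:

```sas
proc iml;
   /* d = estimate minus true value */
   start MHTLoss(d) global(a1, a2, a);
      c1 = (a1*exp( a2*a) + a2*exp(-a1*a)) / (a1 + a2) - 1;  /* value at d = a  */
      c2 = (a1*exp(-a2*a) + a2*exp( a1*a)) / (a1 + a2) - 1;  /* value at d = -a */
      if d > a  then return( c1 );
      if d < -a then return( c2 );
      return( (a1*exp(a2*d) + a2*exp(-a1*d)) / (a1 + a2) - 1 );
   finish;

   a1 = 1;  a2 = 2;  a = 1.5;     /* hypothetical loss constants */
   lossAtZero = MHTLoss(0);
   lossInside = MHTLoss(0.5);
   lossBeyond = MHTLoss(3);
   print lossAtZero lossInside lossBeyond;
quit;
```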
Loss
Function
The k-step Bayes prediction
• The Bayesian predictive density of $W_k$ (k-step-ahead Bayes forecasting) is
$$f(W_k \mid S_n) \propto \frac{\left[\, 2\beta + \mu' Q \mu - \tilde\theta' (X_f' X_f + Q)\, \tilde\theta + Y_f' Y_f \,\right]^{-\frac{n + k + 2\alpha}{2}}}{|X_f' X_f + Q|^{1/2}}$$
where $W_k = (Y(n+1), Y(n+2), \ldots, Y(n+k))$ and $S_n = (Y(1), \ldots, Y(n))$;
The k-step Bayes prediction
– where
$$\tilde\theta = (Q + X'X)^{-1} (Q\mu + X'Y)$$
– The other quantities are prior parameters or matrices built from the n observations.
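A sketch of computing this posterior quantity in SAS/IML with SOLVE rather than an explicit inverse. The data, the prior values Q and mu, and the layout of the lagged design matrix X are hypothetical assumptions made for illustration:

```sas
proc iml;
   y = {1.2, 0.8, 1.1, 0.4, 0.9, 1.3};    /* hypothetical series */
   n = nrow(y);
   p = 2;

   /* Row i of X holds the p values that precede the response Yresp[i] */
   X     = y[2:(n-1), ] || y[1:(n-2), ];  /* lag 1, lag 2 */
   Yresp = y[3:n, ];

   Q  = I(p);                             /* hypothetical prior matrix */
   mu = {0, 0};                           /* hypothetical prior mean */

   /* thetaTilde = inv(Q + X'X) * (Q*mu + X'Y) */
   thetaTilde = solve(Q + t(X)*X, Q*mu + t(X)*Yresp);
   print thetaTilde;
quit;
```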
Example
• For the Wölfer sunspot data, the shape of the
joint pdf of the two-step-ahead forecast
is graphed using (14.1).
Practical k-step ahead forecasting
• Get the one-step-ahead forecast $\tilde Y(n+1)$.
• Apply the one-step-ahead forecasting method again to $(Y(1), Y(2), \ldots, Y(n), \tilde Y(n+1))$ to get $\tilde Y(n+2)$.
• ……
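The recursion above can be sketched as a loop that appends each new forecast to the series. The module OneStep here is only a stand-in: it returns a plug-in AR(p) point forecast with hypothetical coefficients, not the Bayes estimate under MHT loss described in the talk:

```sas
proc iml;
   /* Stand-in for the one-step-ahead method: plug-in AR(p) point forecast */
   start OneStep(y, theta);
      n = nrow(y);
      p = nrow(theta);
      lags = y[n:(n-p+1), ];     /* Y(n), Y(n-1), ..., Y(n-p+1) */
      return( t(theta) * lags );
   finish;

   y     = {1.2, 0.8, 1.1, 0.4, 0.9, 1.3};  /* hypothetical observed series */
   theta = {0.5, -0.3};                     /* hypothetical coefficients */
   k     = 3;

   do h = 1 to k;
      y = y // OneStep(y, theta);           /* treat the forecast as observed */
   end;

   forecasts = y[(nrow(y)-k+1):nrow(y), ];
   print forecasts;
quit;
```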
K-th step ahead forecasting
• The pdf of the one-step-ahead forecast is the t density
$$t\!\left( \frac{b}{a},\ \frac{ac - b^2}{a^2 (n + 2\alpha)},\ n + 2\alpha \right)$$
where
$$a = 1 - X (X_f' X_f + Q)^{-1} X'$$
$$b = X (X_f' X_f + Q)^{-1} (X_0' Y_n + Q' \mu)$$
$$c = 2\beta + \mu' Q \mu - (Y_n' X_0 + \mu' Q)(X_f' X_f + Q)^{-1} (X_0' Y_n + Q' \mu) + Y_n' Y_n$$
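K-th step ahead forecasting
• where the t-distribution is defined as
$$X \sim f(x) = \frac{\Gamma\!\left(\frac{\nu + 1}{2}\right)}{\sqrt{a \pi \nu}\ \Gamma\!\left(\frac{\nu}{2}\right)} \left( 1 + \frac{(x - \mu)^2}{a \nu} \right)^{-\frac{\nu + 1}{2}} \equiv t_{(\mu, a, \nu)}(x)$$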
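A sketch of this location-scale t density in SAS/IML, built on the Base SAS PDF function for the standard t distribution. The module name LocScaleT and the argument values are hypothetical; a is treated as the squared scale, matching the formula above:

```sas
proc iml;
   /* Density of a t distribution with location mu, squared scale a, df nu */
   start LocScaleT(x, mu, a, nu);
      s = sqrt(a);
      return( pdf("T", (x - mu)/s, nu) / s );
   finish;

   /* Hypothetical parameter values, for illustration only */
   dens = LocScaleT(1.0, 0.8, 0.25, 12);
   print dens;
quit;
```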
Bayes estimate under MHT loss
• Bayes expected loss:
$$\begin{aligned} \rho(x) &= \int L(Y_{n+1}, x)\, p(\theta \mid S_n)\, d\theta \\ &= \int_{x-a}^{x+a} \left( \frac{a_1 e^{a_2 (x - \theta)} + a_2 e^{-a_1 (x - \theta)}}{a_1 + a_2} - 1 \right) t(\theta)\, d\theta \\ &\quad + c_1 \int_{-\infty}^{x-a} t(\theta)\, d\theta + c_2 \int_{x+a}^{\infty} t(\theta)\, d\theta \end{aligned}$$
where $t(\theta)$ denotes the density $t_{\left(\frac{b}{a},\ \frac{ac - b^2}{a^2 (n + 2\alpha)},\ n + 2\alpha\right)}(\theta)$.
Bayes estimate under
MHT loss
– Bayes estimate (Bayes action) under the MHT
loss function:
$$\hat Y_{MHT}(n+1) = \arg\min_x \rho(x)$$
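A sketch of how the Bayes estimate could be computed numerically: QUAD integrates the middle term of the expected loss, and NLPNMS minimizes the result over x. This is an illustration under stated assumptions, not the author's program. The loss constants and the t parameters m, s2, and nu are hypothetical, the two tail integrals are evaluated with the Base SAS CDF function instead of numerical integration, and c1 is assumed to apply when the estimate exceeds the true value by more than a:

```sas
proc iml;
   /* Hypothetical values, for illustration only */
   a1 = 1;  a2 = 2;  a = 1.5;        /* MHT loss constants */
   m = 0.8;  s2 = 0.25;  nu = 12;    /* location, squared scale, df of the predictive t */
   xcur = 0;                         /* candidate estimate, set inside rho */

   /* Integrand of the middle term: loss times predictive density */
   start midpart(th) global(a1, a2, xcur, m, s2, nu);
      d = xcur - th;
      loss = (a1*exp(a2*d) + a2*exp(-a1*d)) / (a1 + a2) - 1;
      s = sqrt(s2);
      return( loss * pdf("T", (th - m)/s, nu) / s );
   finish;

   /* Bayes expected loss rho(x) */
   start rho(x) global(a1, a2, a, xcur, m, s2, nu);
      xcur = x[1];
      c1 = (a1*exp( a2*a) + a2*exp(-a1*a)) / (a1 + a2) - 1;
      c2 = (a1*exp(-a2*a) + a2*exp( a1*a)) / (a1 + a2) - 1;
      lim = (xcur - a) || (xcur + a);
      call quad(mid, "midpart", lim);
      s = sqrt(s2);
      lowTail = cdf("T", (xcur - a - m)/s, nu);       /* P(theta < x - a) */
      upTail  = 1 - cdf("T", (xcur + a - m)/s, nu);   /* P(theta > x + a) */
      return( mid + c1*lowTail + c2*upTail );
   finish;

   x0  = m;          /* start the search at the predictive location */
   opt = {0 0};      /* opt[1]=0: minimize; opt[2]=0: no printed output */
   call nlpnms(rc, xhat, "rho", x0, opt);
   print rc xhat;
quit;
```

Nelder-Mead is used here only because it needs no derivatives of the expected loss; any of the other NLP subroutines could be substituted.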
Simulation and Calculation
with SAS
– Based on the assumption on priors, simulate
the parameters in model (7.1).
– Generate AR(p) series.
– Calculate the one-step ahead Bayes
estimate under MHT loss function.
– Calculate the two-step ahead Bayes
estimate under MHT loss function.
Simulation and Calculation
with SAS
SAS techniques used:
– Simulation
– Time Series (model identification and
calculation).
– SAS/IML:
• Import from/export to SAS dataset. Interface with
other SAS PROCs.
• Matrix calculation.
• Integration.
• Optimization.
Integration
• CALL QUAD ( result, "fun", points
<, EPS=eps> <, PEAK=peak> <, SCALE=scale>
<, MSG=msg> <, CYCLES=cycles> ) ;
• The QUAD subroutine is a numerical
integrator based on adaptive Romberg-type
integration techniques. Refer to Rice (1973),
Sikorsky (1982), Sikorsky and Stenger (1984),
and Stenger (1973a, 1973b, 1978).
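A minimal usage sketch of CALL QUAD. The integrand module fun is a hypothetical example; the missing values .M and .P stand for minus and plus infinity in the points argument:

```sas
proc iml;
   /* Hypothetical integrand: unnormalized standard normal density */
   start fun(x);
      return( exp(-x##2 / 2) );
   finish;

   call quad(z1, "fun", {0 1});      /* finite interval [0, 1] */
   call quad(z2, "fun", {.M .P});    /* whole real line */
   print z1 z2;
quit;
```

The exact value of the second integral is the square root of 2*pi, which gives a quick check on the result.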
Optimization
• Optimization: The IML procedure offers a set of
optimization subroutines for minimizing or
maximizing a continuous nonlinear function f = f(x)
of n parameters, where x = (x1, ... ,xn)’:
– NLPCG   Conjugate Gradient Method
– NLPDD   Double Dogleg Method
– NLPNMS  Nelder-Mead Simplex Method
– NLPNRA  Newton-Raphson Method
– NLPNRR  Newton-Raphson Ridge Method
– NLPQN   (Dual) Quasi-Newton Method
– NLPQUA  Quadratic Optimization Method
– NLPTR   Trust-Region Method
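A minimal usage sketch of one of these subroutines, NLPNMS. The objective module f and the starting point are hypothetical; the function has its minimum at (1, -3) by construction:

```sas
proc iml;
   /* Hypothetical objective: a simple quadratic in two parameters */
   start f(x);
      return( (x[1] - 1)##2 + 2*(x[2] + 3)##2 );
   finish;

   x0  = {0 0};      /* starting point */
   opt = {0 0};      /* opt[1]=0: minimize; opt[2]=0: no printed output */
   call nlpnms(rc, xres, "f", x0, opt);
   print rc xres;
quit;
```

A positive return code rc indicates that a termination criterion was met; xres holds the final parameter estimate.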
Applications
• “Computing Group Sequential Boundaries Using the
Lan-DeMets Method with SAS”.
• Sample size and power analysis.
• "SAS for Monte Carlo Studies: A Guide for Quantitative
Researchers" by Xitao Fan, Akos Felsovalyi, Stephen A.
Sivo, and Sean C. Keenan:
http://support.sas.com/publishing/bbu/companion_site/57323.html
• A collection of SAS macro programs using SAS/IML
software to generate, randomize and inspect orthogonal
arrays for computer experiments and integration.
http://sunsite.univie.ac.at/statlib/designs/oa.SAS
References
• Jian Chen, Bayes Inferences and forecasting of
Time Series, PhD thesis, UNC Charlotte.
• SAS Online Documentation for SAS/IML:
http://support.sas.com/onlinedoc/913/docMainpage.jsp
• Sample programs installed with your installation,
located in the directory:
C:\Program Files\SAS\SAS 9.1\iml\sample