SAS Tips I learnt whilst at Oxford

Download Report

Transcript SAS Tips I learnt whilst at Oxford

SAS Tips I learnt whilst at
Oxford
By Phil Mason
Debugging complex macros
• Write code generated by macros to an external file
– File can’t be accessed until the SAS session has ended
• 6.12 - options RESERVEDB1 MPRINT ;
• 8 – options MFILE MPRINT ;
• 6.12 & 8 – filename MPRINT 'c:\macro.sas' ;
– Very useful for complex macros with many loops and
multi-ampersands
– You can then run the code generated through data step
debugger
Connect to yourself
• In 8 can use MP Connect, but 6.12 …
• Run spawner, e.g.
"C:\Program Files\SAS Institute\SAS\V8\spawner.exe" -c tcp –z
• Run SAS code, e.g.
%let tcpsec=_prompt_;
options comamid=tcp ;
signon notebook noscript ;
• Very useful to test out client server code on
a single machine
Adding operators or functions to
macro language
• “=:” exists in data step, but not macro language
– Create a macro which carries out required function and
returns a result
%macro eq(op1,op2) ;
%if %substr(&op1,1,%length(&op2))=&op2 %then
1;
%else
0;
%mend eq ;
…
%if %eq(abcde,ac) %then
%put yes 1 ;
%else
%put no 1 ;
Searching a catalog for text
• Such a tool does not directly exist, but there is a
way
• Using PROC BUILD you can write all the source
text from catalog entries to an external file
• Then use FSLIST to view the file, or notepad, MS
Word, to search it, e.g.
%macro catscan(cat,file) ;
proc build catalog=&cat batch ;
print source prtfile="&file" ;
run ;
dm 'fslist "&file"' fslist ;
%mend catscan ;
Making code re-startable
• Why?
– Robustness, saves reprocessing data
– Save datasets & macro variables
• %put _user_ to a dataset or file
• Copy sashelp.vmacro to a dataset, e.g.
proc sql ;
create table sasuser.macros as
select name,value from
sashelp.vmacro
where scope='GLOBAL' ;
Using Progress Bars in Base SAS
(1 of 2)
• You can produce progress bars for use in Data
Steps or Macros.
– Useful to let users know how a long macro is
progressing
• This example shows how to do so for a data step
– A similar technique can be used from macro language
using the %WINDOW and %DISPLAY statements.
• Key points
– Define a window shaped like a bar
– Redisplay it each time we have some progress
– Just change value and link to routine to update bar
Using Progress Bars in Base
SAS (2 of 2)
• Important Statements in Code
– “Window bar ” defines progress bar display
window
• Specifies position on screen
• Variable & colour used to display progressing bar
– “display bar noinput” refreshes the bar display
and requires no input from user to continue
Windows API calls
• Use MODULE function
• SASCBTBL definition, e.g.
routine MessageBoxA module=USER32 minarg=4 maxarg=4
stackpop=called returns=short;
arg 1 input format=pib4. byvalue;
arg 2 input format=$cstr200.;
arg 3 input format=$cstr200.;
arg 4 input format=pib4. byvalue;
• Documentation available in Linux WINE project
– Covers “all” APIs and is freely downloadable
Module Definitions (partial list)
sampsrc.pcsamp.sascbtbl.source
ExitWindows
GetDiskFreeSpaceA
GetDriveTypeA
GetModuleFileNameA
GetModuleHandleA
GetPrivateProfileIntA
GetPrivateProfileStringA
GetProfileIntA
GetProfileStringA
GetSystemDirectoryA
GetSystemMetrics
GetTempPathA
GetTempFileNameA
GetVersion
GetVersionExA
GetVolumeInformationA
GetWindowsDirectoryA
GetSystemInfo
MessageBeep
MessageBoxA
SwapMouseButton
WritePrivateProfileStringA
WriteProfileStringA
GetWin32sInfo
Mixing data step & SQL code
• You can have data step code in an SQL statement,
e.g.
data out / view=out ;
set sasuser.houses ;
if style='CONDO' then
put 'obs=' _n_ price= ;
run ;
proc sql ;
create table more as select * from out where price >100000 ;
;quit;run;
– This creates a dataset and writes some variable
information to the log
• Can similarly have SQL code within a data step,
by using views
Sort Techniques
based on my investigations
• 3x data size for sort space (rule-of-thumb)
• Compress=yes & sort can save time (-8%)
• Tagsort, good on large datasets where key is
small (e.g. -49%)
• Almost always use Noequals (e.g. 5%)
• Combine datastep code with sort using
VIEW (e.g. -27%)
– More on next slide …
Views can move pre-processing
into procedures for efficiency
• Inefficient
Data read
Data test;
here
if … flag=1 ; run;
Proc sort data=test;table flag;run;
… and again here
• Efficient
Data test/view=test;
if … flag=1 ; run;
Proc sort data=test;table flag;run;
Data read here, in PROC
• Changing data step to a view causes less I/O to be
done
– Since data is read once, IF condition applied and record
fed into proc sort directly
Finding secret SAS options
• Proc options internal ; run ;
Some options …
– BELL
Enables/disables the warning bell
– CDE=H
Display SAS System information
– CTRYDECIMALSEPARATOR=. Country specific decimal number
separator.
– CTRYTHOUSANDSEPARATOR=, Country specific thousands
number separator.
– DEBUGLEVEL=
Controls display of debug information. There
are five levels (TESTING, NORMAL, DEBUG, FULLDEBUG, and
DEMO).
– ENHANCEDEDITOR Invoke the enhanced editor at SAS startup
Making log available during noninteractive SAS sessions
• One of the undocumented options in SAS 6.12
which PROC OPTIONS INTERNAL reveals is:
– $logflush … closes LOG after each line is written
• Very useful for looking at log during noninteractive runs
• Usually can’t see log until the SAS session
finishes
Put a zip in your pipe
• Pipes read live output from programs
filename testpipe pipe 'pkunzip.exe c:\temp\test -c’ ;
– Use PKZIP to decompress archive
– -c option sends data to console, flowing into
pipe, able to be read by SAS
– PIPE parameter is required
• Allows processing files too large for disk
Exporting using ODBC - setting
up driver
• Control Panel, 32-bit
OBDC
• Define a [User DSN]
– [Add], [SAS], [Finish]
• [Servers]
– Server Name: mySAS
• [Configure]
– SAS Path: “c:\sas\sas.exe”
– Working Directory: “c:\sas”
– SAS Parameters: “-initstmt
%sasodbc(mySAS) comamid dde -icon –nolog
-noautoexec”
– [OK], [<<Add<<]
• [Libraries],
– Library Name: “mySAS”
– Host File Name: “c:\temp”
– [<<Add<<]
• [General]
– Data Source Name:
“mySAS”
– Description: “SAS ODBC”
– [ok]
• Can also define a [File
DSN]...
Exporting using ODBC - using it
• Start MS Access
– Make blank database
• Get External Data,
Input
– Link Tables
– File of Type: OBDC
Database()
• Look in: Machine
Data Source
• DSN Name: sas … OK
• SAS should now start
• SAS datasets under
“c:\temp\” should appear
• Select a SAS dataset …
OK … OK … Open your
SAS dataset
Can’t get this to work on this
system (v8, win98)
Speed up your SAS programs
• Use o/s commands for copying & deleting
– They are much quicker than SAS alternatives
– Execute o/s commands asynchronously
• Clean up work space during job
– Large work datasets should be deleted by you when
they are no longer needed, otherwise they may take
valuable space
• Clean up memory every so often
– Use CDE P to purge unused modules from memory
More ways to speed things up
• Split data into smaller bits for sorting
– Especially if they can fit in memory
– Saved 36% in one example
• Sort or index on a compressed single key
• Reduce I/O contention
– Put work, swap and data on different physical
disks where possible
Mixed Numeric informats
• Unquoted numerics treated as numbers
• Quoted text treated as character
• Useful if reading data which has mixed values,
which need to be interpreted in different ways
Proc format ;
invalue mixed
‘LOW’ = -99
1-10 = 1
11-20 = 2
‘BIG’ = 99
other = 0 ;
Run ;
Building long selection lists in
SQL
• Even in v6 macro variables may be up to
32k long
• Useful to store long text strings, such as
variable lists
– e.g. Can make a list of employees in one dataset
to select from another
Code
Where on Output dataset
•
•
•
•
Where clauses can be used for filtering data
Usually they are used with input data
They can be used with output data too
Keeping selected _type_ values
SAS OLE Automation Server
• How did I run demos from PowerPoint?
– PowerPoint has no scripting but can run programs
• SASOACT.EXE
– Controls a SAS OLE Automation Server
• SASOACT.EXE action=Open datatype=SASFile
filename=”Test.sas"
• SASOACT.EXE action=Submit datatype=SASFile
filename=”Test.sas"
• SASOACT.EXE action=Open datatype=Data
filename="Houses.sd2"
Nice stuff on the web
• SAS Online documentation
– V8doc.sas.com/sashtml (SAS OnlineDoc)
• SUGI proceedings
– www2.sas.com/proceedings/sugi26/procced.pdf
• Resources
– www.sashelp.com (David Ward)
– www.sconsig.com/sastip.htm (Charles Partridge)
• Newsletters with tips & techniques
– The missing semicolon – www.sys-seminar.com
– VIEWS news – www.views-uk.org
Any questions?
• If so, please see me afterwards since I am
probably out of time.
When is 3 * 1/3 not 1
• Numbers within SAS are handled with floating
point arithmetic
data _null_ ;
a=1/3 ;
b=3*a ;
c=1-b ;
put b= c= ;
Run ;
• Produces
B=1 C=5.551115E-17
• Fails due to floating point arithmetic being inexact
• Get around it by using round function, or FUZZ in
PROC FORMAT
Open OLAP Server in v8
•
•
•
•
•
EIS
Add MDDB, e.g. sashelp.prdmddb
Add olapmeta attribute
Run listener.scl
Install OOLAP server client from
– “C:\Program Files\SAS Institute\SAS\V8\mddbserv\sasmisc\ooscl30.exe”
• Enter EXCEL (& various other tools) & use
it
& versus && versus &&&
• In Base SAS
– &name refers to a macro variable called name
– &&name is scanned twice by macro processor
– &&&name is treated as &(&name)
Code
• If used in SCL – SAS/AF
– &name tries to substitute SCL variable name, if it exists
• If it does not exist, then it looks for a macro variable called
name
– &&name is useful since it doesn’t try to substitute an
SCL variable even if one exists of that name, but uses
the macro variable
– &&&name works as in Base SAS
Proc Printto to redirect and
process log
• Can redirect procedure output or LOG to a catalog
member or external file
• Useful for saving log and then analysing it
proc printto log=work.test.test.log ;
run ;
proc printto print='c:\print.lst' ;
run ;
proc print data=sasuser.houses;
run ;
proc printto print=print ;
run ;
Use attrib for common
definitions
• To produce a PROC PRINT without any labels
you can set them to null values
• You may try the following, but it does not work
since you can’t use _all_ in LABEL statements
proc print data=datatran.emptypev label noobs;
label _all_='00'x;
run;
• You can use the ATTRIB statement which does
support _ALL_ - this code works
proc print data=datatran.emptypev label noobs;
attrib _all_ label='00'x;
run;
Searching program editor or log
for macro variables
• If you simply enter &macname in the FIND
box, it will resolve &macname before
looking
• To search for the unresolved reference
search for:
%nrstr(&macname)
– %nrstr(&macname) resolves to &macname.
Data step views for parsing text
files to extract info
• Useful if you have regular flat file data you want
to process with SAS
• Just define a view
Data monthly / view=monthly ;
Infile ‘c:\monthly.txt’ ;
Input name $30. Address $30. ;
Run ;
• Can use view as input to other data steps and
procedures and data will always be fresh, e.g.
Proc report data=monthly ; run ;
Implement parallel processing
•
•
•
•
Using MP Connect in v8
Use Asynchronous submits in v8
SPDS
3rd party products (best kept secrets!)
– “Orchestrate” from Torrent Systems
• Incredible scalability
– “SAS Analyzer” from Ab Initio
• Use experimental sort module in v8
– Saspsort – works well on multi-proc machines
Nice technique to split code
• Use views with firstobs=, obs= and where=
(in v8)
• Feed views into parallel processes such as
MP Connect
Some nice new v8 features
• Lastword=scan(long_text,-1) ;
• Constants
– Pi=constant(‘pi’) ;
– Min=constant(‘small’) ;
• Factorials, Combinations & Permutations
– F=fact(4) ; c=comb(8,2) ; p=perm(8,2) ;
• If missing(var) then …
V functions
•
•
•
•
•
•
•
•
Varray – need a simple example of each one
Vformatn
Vinformat
Vname
Vtype
Vformatw
Vinformatdx
Vlength
Date functions
•
•
•
•
Yrdiff – need some examples
Datediff
Yymmnw
Yymmddxw
Sas/graph procedure
enhancements
•
•
•
•
Hbar3d, vbar3d & pie3d
Html=
Imagemap=
Drivers:
– Gif, html, webframe, gifanim, java, activex