SAS Basics
Windows
Program Editor
Write and edit all your statements here.
Windows (continued)
Log
Watch this for any errors in program as it runs
Windows (continued)
Output
Will automatically pop in front when there is output.
Does not need to occupy screen space during program editing.
File Organization
Create subfolders in your Project folder for:
Data
Contains SAS datasets, with .sd2 extension.
Formats
Compiled version of formats, a file with .sc2 extension.
Used for building classes of variables for looking at
frequencies.
Output
Save output files here. These are text files with a .sas
extension.
Programs
All programs are text files with .sas ending.
Creating a dataset

Internal Data
DATA datasetname;
INPUT name $ sex $ age;
CARDS;
John M 23
Betty F 33
Joe M 50
;
RUN;
Creating a dataset

External Data
DATA datasetname;
INFILE 'c:\folder\subfolder\file.txt';
INPUT name $ sex $ age;
RUN;
Creating from an existing one
DATA save.data2 (keep = age income);
SET save.data1;
RUN;
DATA save.data2;
SET save.data1;
DROP age;
TAX = income*0.28;
RUN;
Permanent Data Sets
LIBNAME save 'c:\project\data';
DATA save.data1;
X=25;
Y=X*2;
RUN;
Note that save is merely a name you make up
to point to a location where you wish to save
the dataset called data1. (It will be saved as
data1.sd2)
What’s in my SAS dataset?
PROC CONTENTS data=save.data1;
RUN;
PROC CONTENTS data=save.data1
POSITION;
RUN;
The POSITION option prints the variable list sorted alphabetically,
followed by a duplicate list sorted by position (the sequence
in which the variables actually exist in the file).
Viewing file contents
PROC PRINT data=save.data1; run;
PROC PRINT data=save.data1 (obs=5);
VAR name age;
RUN;
PROC PRINT data=save.data1 (obs=12);
VAR age -- income;
RUN;
Frequencies/Crosstabs
PROC FREQ data=save.data1;
TABLES age income trades;
RUN;
PROC FREQ data=save.data1;
TABLES age*sex;
RUN;
Scatter Plot
PROC PLOT data=save.data1;
PLOT Y*X;
RUN;
Creating a Format Library
PROC FORMAT LIBRARY=LIBRARY;
VALUE BG
0 = 'BAD'
1 = 'GOOD'
-1 = 'MISSING'
;
VALUE TWO
-1 = 'MISSING'
-2 = 'NO RECORD'
-3 = 'INQS. ONLY'
-4 = 'PR ONLY'
0='0'
1='1'
1<-HIGH='2+'
;
RUN;
Applying a format to a variable
PROC DATASETS library=save;
MODIFY data1;
FORMAT trades ten.;
RUN;
QUIT;
This applies the format called ten to the
variable trades. A subsequent PROC FREQ
step for trades will show the format
applied. Note that ten must already exist in
the format library for this to work.
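As an illustration of the whole sequence (define a format, attach it, then run a frequency), here is a minimal sketch. The format name trdgrp, its ranges, the dataset work.demo, and the data values are all made up for this example; they are not the ten format referred to above. The format is stored in WORK rather than a permanent library, so it lasts only for the session.

```sas
/* Hypothetical format: name and ranges are assumptions for illustration. */
/* No LIBRARY= option, so the format goes to WORK for this session only.  */
PROC FORMAT;
  VALUE trdgrp
    LOW-<0  = 'NEGATIVE CODE'
    0-5     = '0-5'
    5<-HIGH = '6+'
  ;
RUN;

/* Hypothetical test data. */
DATA work.demo;
  INPUT trades;
  CARDS;
-1
0
3
5
12
;
RUN;

/* Attach the format to the variable without rewriting the data. */
PROC DATASETS library=work NOLIST;
  MODIFY demo;
  FORMAT trades trdgrp.;
RUN;
QUIT;

/* The frequency table now groups trades by the formatted labels. */
PROC FREQ data=work.demo;
  TABLES trades;
RUN;
```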
Applying a format: Method 2
Data save.data2;
SET save.data1;
FORMAT
trades bktrds ten.
totbal mileage. ;
RUN;

This is another way to apply formats when
creating a new dataset (data2) from a previous
one (data1) that has unformatted variables.
Random Selection of Obs.
DATA save.new;
SET save.old;
Random1 = RANUNI(254987)*100;
IF Random1 > 50 THEN OUTPUT;
RUN;
QUIT;
The function RANUNI requires a seed number, and then produces
random values between 0 and 1. Multiplying by 100 rescales them
to between 0 and 100, stored under the variable name Random1
(you can choose any name). The above program will create new.sd2,
with about half the observations of old.sd2, randomly chosen.
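If the random value itself is not needed in the output, the same idea can be sketched with a subsetting IF that compares RANUNI directly against the sampling fraction. The dataset names work.old and work.half, the id variable, and the 1000-row test data are assumptions made up for this example.

```sas
/* Hypothetical input: 1000 observations numbered 1 to 1000. */
DATA work.old;
  DO id = 1 TO 1000;
    OUTPUT;
  END;
RUN;

/* Keep each observation with probability 0.5.                      */
/* A subsetting IF outputs the observation only when the condition  */
/* is true, so no explicit OUTPUT statement or extra variable is    */
/* needed.                                                          */
DATA work.half;
  SET work.old;
  IF RANUNI(254987) < 0.5;
RUN;

/* The observation count should be roughly, not exactly, 500. */
PROC CONTENTS data=work.half;
RUN;
```

Changing 0.5 to another fraction changes the approximate share of observations kept.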
Sorting and Merging Datasets
PROC SORT data = save.junk;
BY Age Income;
Run;
PROC SORT data=save.junk OUT=save.neat;
BY acctnum;
RUN;
PROC SORT data=save.junk NODUPKEY;
BY something;
RUN;
Sorting and Merging Datasets
PROC SORT data=save.one;
BY Acctnum; RUN;
PROC SORT data=save.two;
BY Acctnum; RUN;
DATA save.three;
MERGE save.one save.two;
BY Acctnum;
RUN;
Sorting and Merging Datasets
DATA save.three;
MERGE save.one (IN = a) save.two;
BY Acctnum;
IF a;
RUN;
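The IN= option creates a temporary variable that is 1 when the current observation has a contribution from that dataset and 0 otherwise, so IF a; keeps only the accounts found in the first dataset. A small sketch of this, using made-up datasets work.one and work.two and hypothetical variables balance, trades, and in_two:

```sas
/* Hypothetical input data, already in Acctnum order. */
DATA work.one;
  INPUT Acctnum balance;
  CARDS;
101 500
102 750
103 20
;
RUN;

DATA work.two;
  INPUT Acctnum trades;
  CARDS;
101 4
103 1
104 9
;
RUN;

DATA work.three;
  MERGE work.one (IN = a) work.two (IN = b);
  BY Acctnum;
  IF a;         /* keeps 101, 102, 103; account 104 is dropped          */
  in_two = b;   /* 1 if the account was also in work.two, otherwise 0   */
RUN;

PROC PRINT data=work.three;
RUN;
```

Account 102 is kept with a missing value for trades, because it has no match in work.two. The IN= variables themselves are not written to the output, which is why b is copied into in_two here.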
Using Arrays
DATA save.new;
SET save.old;
ARRAY vitamin(6) a b c d e k;
DO i = 1 to 6;
IF vitamin(i) = -5 THEN vitamin(i) = .;
END;
RUN;
This assumes you have 6 variables called a, b, c, d, e, and k in
save.old. This program will modify all 6 such that any instance
of a -5 value is converted to a missing value.
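To try the array logic without an existing dataset, a self-contained sketch might look like the following. The datasets work.old and work.new and the data values are invented for illustration; the DROP statement is an addition, not part of the program above.

```sas
/* Hypothetical test data: two observations with some -5 codes. */
DATA work.old;
  INPUT a b c d e k;
  CARDS;
1 -5 3 4 -5 6
-5 2 3 -5 5 6
;
RUN;

DATA work.new;
  SET work.old;
  ARRAY vitamin(6) a b c d e k;
  DO i = 1 TO 6;
    IF vitamin(i) = -5 THEN vitamin(i) = .;
  END;
  DROP i;   /* otherwise the loop counter is saved as a variable */
RUN;

/* Each -5 in the input should now print as a missing value (.). */
PROC PRINT data=work.new;
RUN;
```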
Simple Correlations
PROC CORR data=save.relative;
VAR tvhours study;
RUN;
PROC CORR data=save.relative;
VAR tvhours study;
WITH Score;
RUN;
Run Regression Analysis

Runs the regression and stores the estimates in a file
called estfile
Proc reg data=save.treg2 corr outest=estfile;
bgscore: model good=
trades01
trades02
ageavg01
ageavg02 / selection=none;
run;
Quit;
Score the data

Score the data in treg1 and save the output in
save.scrdata
Proc score data=save.treg1 score=estfile out=save.scrdata
type=parms;
Var trades01
trades02
ageavg01
ageavg02;
Run;
Quit;
Format bgscore

Format the bgscore variable in the new
save.scrdata file. Find or create a format from
the format.sas file to apply to the bgscore
variable.
Proc datasets library=save;
Modify scrdata;
Format bgscore insert_format_here.;
Run;
Quit;
Creating Dummy Variables
%MACRO DUMMY(VAR, FIRST, LAST, TOT);
IF(&FIRST <= &VAR <= &LAST) THEN &VAR.&TOT =1;
ELSE &VAR.&TOT =0;
LABEL &VAR.&TOT="&VAR: &FIRST - &LAST ";
%MEND DUMMY;
data save.testreg2;
set save.testreg;
%Dummy(AGEOTD, 0, 78, 1);
%Dummy(AGEOTD, 96, 119, 2);
%Dummy(AGEOTD, 120, 143, 3);
%Dummy(AGEOTD, 144, 179, 4);
%Dummy(AGEOTD, 180, 99999999, 5);
Run;
Quit;
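To see what the macro generates, it can help to write out one call by hand. The sketch below shows the statements that a call with VAR=AGEOTD, FIRST=0, LAST=78, TOT=1 should expand to, placed in a small DATA step. The dataset work.demo and its three AGEOTD values are made up for illustration.

```sas
/* Hypothetical test data for the hand-expanded macro call. */
DATA work.demo;
  INPUT AGEOTD;
  /* Equivalent of %DUMMY(AGEOTD, 0, 78, 1) after macro expansion: */
  /* &VAR.&TOT resolves to AGEOTD1.                                */
  IF (0 <= AGEOTD <= 78) THEN AGEOTD1 = 1;
  ELSE AGEOTD1 = 0;
  LABEL AGEOTD1 = "AGEOTD: 0 - 78 ";
  CARDS;
12
78
100
;
RUN;

/* AGEOTD1 is 1 for the first two observations and 0 for the third. */
PROC PRINT data=work.demo LABEL;
RUN;
```

Running the program with OPTIONS MPRINT; turned on is another way to check this, since the log then shows the statements each macro call generates.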