FORMAT FESTIVAL - Home | UAB School of Public Health

Download Report

Transcript FORMAT FESTIVAL - Home | UAB School of Public Health

FORMAT FESTIVAL
AN INTRODUCTION
TO
SAS® FORMATS AND
INFORMATS
By David Maddox
FORMAT FESTIVAL
• Formats and Informats powerful SAS® language
components.
• Format - an instruction or
template that SAS uses to
output data values.
• Informat - an instruction or
template for reading data
values.
FORMAT FESTIVAL
• There are two basic types of
Informats – SAS® Supplied and
User Created.
• Character Informats have the
form - $<informat name>w.
• Examples - $char7. , $hex4.
FORMAT FESTIVAL
• Numeric Informats omit the “$”
sign and may include a decimal
width.
• Examples of numeric Informats
are:
comma7.2 – removes embedded
commas and $ signs.
pd4. – Reads packed decimal
FORMAT FESTIVAL
• Date Informats are used to convert
date and time information into a
SAS® date format.
• A SAS date is a number which
represents the number of days since
January 1, 1960.
• Examples of date Informats are:
mmddyy10. – 10/09/2006
date11. – 09/OCT/2006
FORMAT FESTIVAL
Specifying Informats for use in a
program.
- INPUT Statement
- INPUT, INPUTC, INPUTN
Functions
- INFORMAT Statement
- ATTRIB Statement
FORMAT FESTIVAL
Example 1 – Using an Informat to read in a
variable.
03/01/2002
03/04/2002
03/05/2002
03/06/2002
03/07/2002
03/08/2002
03/11/2002
03/12/2002
03/13/2002
03/14/2002
03/15/2002
03/18/2002
03/19/2002
$1,035,053.
$1,075,310.
$1,236,607.
$1,422,098.
$1,635,413.
$1,880,725.
$2,162,834.
$2,487,259.
$2,860,348.
$3,289,400.
$3,782,810.
$4,350,232.
$5,002,767.
FORMAT FESTIVAL
data ex1;
Infile 'c:\Workshop\HW07\Data\ex1.txt;
input datev mmddyy10. revenue comma12.;
run;
FORMAT FESTIVAL
Example 2 – Creating and using a
user defined Informat.
- PROC FORMAT is used to create
user-defined Formats and
Informats
- Use an INFORMAT or ATTRIB
statement to permanently
associate a variable and an
Informat
FORMAT FESTIVAL
proc format;
invalue $ganpa '229'='Albany'
'706','762'='Athens'
'404','470','678','770'='Atlanta'
'478'='Macon'
'912'='Savannah';
run;
data ex2;
length npa $8;
input npa $ganpa3. nxx $ 4-6 line $ 7-10 revenue 12.2;
cards;
404423AAAA50.0
229543XXXX75.0
912333BBBB25.0
912767CCCC31.75
678221DDDD46.0
478332EEEE34.5
770681FFFF21.7
706728GGGG45.8
229476HHHH31.2
;
run;
FORMAT FESTIVAL
Obs
npa
nxx
line
revenue
1
Atlanta
423
AAAA
50
2
Albany
543
XXXX
75
3
Savannah
333
BBBB
25
4
Savannah
767
CCCC
31.75
5
Atlanta
221
DDDD
46
6
Macon
332
EEEE
34.5
7
Atlanta
681
FFFF
21.7
8
Athens
728
GGGG
45.8
9
Albany
476
HHHH
31.2
FORMAT FESTIVAL
Example 3 – Creating a permanent
user defined Informat
- Reference the catalog using
PROC FORMAT Library= option.
- Use LIBNAME to declare a
FORMATS catalog
FORMAT FESTIVAL
proc format library=library;
invalue $ganpa '229'='Albany'
'706','762'='Athens'
'404','470','678','770'='Atlanta'
'478'='Macon
'912'='Savannah';
run;
proc format library=library fmtlib;
select @$ganpa;
run;
The FMTLIB produces the following
information about the $ganpa. INFORMAT.
This option can be used to check the user
created INFORMAT for accuracy and
completeness.
FORMAT FESTIVAL
| INFORMAT NAME: @$GANPA LENGTH: 8 NUMBER OF
VALUES: 9 MIN LENGTH: 1 MAX LENGTH: 40 DEFAULT
LENGTH 8 FUZZ: 0
|-------------------------------------------------------------------------|START |END
|INVALUE(VER. V7|V8 04JUL2006:18:25:37)|
|----------------+----------------+---------------------------------------|229
|229
|Albany
|404
|404
|Atlanta
|470
|470
|Atlanta
|478
|478
|Macon
|678
|678
|Atlanta
|706
|706
|Athens
|762
|762
|Athens
|770
|770
|Atlanta
|912
|912
|Savannah
----------------------------------------------------------------------------
FORMAT FESTIVAL
Example 4 – Using a Format to print
a variable
- PUT Statement
- PUT, PUTC, PUTN Functions
- %SYSFUNC macro function
- FORMAT Statement
- ATTRIB Statement
FORMAT FESTIVAL
libname ds 'c:\Workshop\HW07\Datasets';
proc print data=ds.ex1;
run;
proc print data=ds.ex1;
format datev mmddyy10. revenue comma12.;
run;
proc print data=ds.ex1;
format datev mmddyy7. revenue dollar14.2;
run;
proc print data=ds.ex1;
format datev date7. revenue euro12.;
run;
proc print data=ds.ex1;
format datev worddate12. revenue e12.;
run;
data _null_;
set ds.ex1;
file print;
put datev date9. +1 revenue comma14.2;
run;
FORMAT FESTIVAL
Example 5 – Creating and using a
user defined Formats
- PROC FORMAT is used to create
user-defined Formats and
Informats
- Use an FORMAT or ATTRIB
statement to permanently
associate a variable and a
Format
FORMAT FESTIVAL
proc format;
value $ganpaf
'229'='Albany'
'706','762'='Athens'
'404','470','678','770'='Atlanta’
'478'='Macon'
'912'='Savannah';
run;
proc print data=ex2;
format npa $ganpaf.;
run;
FORMAT FESTIVAL
Obs npa
nxx
1 Atlanta
423
2 Albany
543
3 Savannah 333
4 Savannah 767
5 Atlanta
221
6 Macon
332
7 Atlanta 681
8 Athens 728
9 Albany 476
line revenue
AAAA
50.00
XXXX
75.00
BBBB
25.00
CCCC
31.75
DDDD
46.00
EEEE
34.50
FFFF
21.70
GGGG
45.80
HHHH
31.20
FORMAT FESTIVAL
Example 6 – Using a FORMAT to
classify data.
- Use PROC FORMAT to declare a
range instead of just one
equivalent value.
- Apply the FORMAT range to the
original variable in a PROC or
DATA step.
FORMAT FESTIVAL
proc format;
value revtype 0 -< 35 = 'LOW'
36 -< 55 = 'AVERAGE'
56 -< 80 = 'HIGH';
run;
data ex3;
set ex2;
revenue1=revenue;
run;
proc print data=ex3;
format revenue revtype.;
var npa nxx line revenue1 revenue;
Run;
FORMAT FESTIVAL
Obs
1
2
3
4
5
6
7
8
9
npa
404
229
912
912
678
478
770
706
229
nxx
423
543
333
767
221
332
681
728
476
line
revenue1 revenue
AAAA 50.00
AVERAGE
XXXX 75.00
HIGH
BBBB 25.00
LOW
CCCC 31.75
LOW
DDDD 46.00
AVERAGE
EEEE 34.50 LOW
FFFF
21.70 LOW
GGGG 45.80 AVERAGE
HHHH 31.20 LOW
FORMAT FESTIVAL
Exercise 7 - Using a Format as a
look-up facility
- Define a FORMAT using PROC
FORMAT
- Use a PUT function to test each
observation AND select by the
formatted value
FORMAT FESTIVAL
Proc format;
value $metro
'404','470','678','770'='me';
run;
data metro;
set ex2;
if put(npa,$metro.)='me';
proc print data=metro;
title 'Metro Atlanta';
run;
FORMAT FESTIVAL
Metro Atlanta
Obs
1
2
3
npa
404
678
770
nxx
423
221
681
line
AAAA
DDDD
FFFF
revenue
50.0
46.0
21.7
FORMAT FESTIVAL
Example 8 – Creating a Format with
the CNTLN option
- Create a data set that can be used
to construct a Format
-Define variables – START, LABEL,
FMTNAME as a minimum.
-Read the data set into PROC FORMAT
using the CNTLN option
FORMAT FESTIVAL
FORMAT FESTIVAL
libname ds 'J:\Flash_Drive_4_Gig\SESUG 2006\HW07';
libname library 'J:\Flash_Drive_4_Gig\SESUG 2006\HW07';
data ocnlist;
length label $ 50;
set ds.nm_ocn_list;
start=ocn;
label=carrier;
fmtname='$New_Mexico_OCN';
run;
proc format library=library cntlin=ocnlist;
run;
proc format library=library fmtlib;
select $New_Mexico_OCN;
run;
FORMAT FESTIVAL
FORMAT FESTIVAL
• Traffic Lighting
Not a recode, but an improvement in the
visual effect of making certain values stand
out. Example:
Obs
1
2
3
4
5
6
7
8
9
npa
nxx
404 423
229 543
912 333
912 767
678 221
478 332
770 681
706 728
229 476
line
revenue
AAAA 50.00
XXXX 75.00
BBBB 25.00
CCCC 31.75
DDDD 46.00
EEEE 34.50
FFFF 21.70
GGGG 45.80
HHHH 31.20
FORMAT FESTIVAL
• proc format;
• value larg low - 24.00 ='Green'
•
23.00 - 49.00='Blue'
•
50.00 - high='Red';
• run;
•
•
•
•
proc print data=subscribers;
var npa nxx line;
var revenue / style={foreground=larg.};
run;
FORMAT FESTIVAL
- The key element is
the use of the “style”
option on the revenue
variable.
-This method is
supported for PROC
PRINT, PROC
TABULATE, and PROC
REPORT.
-“FORGROUND”
controls font color.
Obs
npa
nxx
line
revenue
1
404
423
AAAA
50.00
2
229
543
XXXX
75.00
3
912
333
BBBB
25.00
4
912
767
CCCC
31.75
5
678
221
DDDD
46.00
6
478
332
EEEE
34.50
7
770
681
FFFF
21.70
8
706
728
GGGG
45.80
9
229
476
HHHH
31.20
FORMAT FESTIVAL
-The “background”
attribute controls the cell
color.
-In Version
8.2, the
presence of a permanent
FORMAT for the traffic
lighting variable
prevented the traffic
lighting effect when
using PROC PRINT. This
problem was fixed in
Version 9.1
Obs
npa
nxx
line
revenue
1
404
423
AAAA
50.00
2
229
543
XXXX
75.00
3
912
333
BBBB
25.00
4
912
767
CCCC
31.75
5
678
221
DDDD
46.00
6
478
332
EEEE
34.50
7
770
681
FFFF
21.70
8
706
728
GGGG
45.80
9
229
476
HHHH
31.20
FORMAT FESTIVAL
• Nested FORMATS
PROC FORMAT;
value $metro '404'='City'
'470','678','770'='Metro'
'229','912'='Outstate'
other=[$inter.];
value $inter '803'='SC'
'423'='TN'
'205'='AL'
other='Rest of Country';
run;
FORMAT FESTIVAL
Using the $metro
FORMAT to print the
subscribers dataset
produces the
following output:
Obs
npa
1
City
2
nxx
line
404
423
AAAA
50.00
Outstate
229
543
XXXX
75.00
3
Outstate
912
333
BBBB
25.00
4
Outstate
912
767
CCCC
31.75
5
Metro
678
221
DDDD
46.00
6
Rest of Country
478
332
EEEE
34.50
7
Metro
770
681
FFFF
21.70
8
Rest of Country
706
728
GGGG
45.80
9
Outstate
229
476
HHHH
31.20
AL
205
768
xxxx
35.00
10
NPA
revenue
FORMAT FESTIVAL
Version 9.2 New Formats
$BASE64X – Converts character data to
ASCII text using Base 64 encoding.
PERCENTN – Produces percentages,
using a minus sign for negative values.
DATEW. – Enhanced to write dates in the
form dd-mmm-yyy.
FORMAT FESTIVAL
• This is a very brief look at
Formats and Informats.
• They can help you write some
very elegant and concise code.
• Make them a permanent part of
your SAS® repertoire.
SAS and all other SAS Institute Inc. product or service names are registered trademarks or
trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA
registration.