Transcript slides

Make Your Code File Driven
Methods to let SAS collect file names in your system
Lu Zhang
Beijing, China
A leading global CRO
Introduction
Background
SAS Data-Driven Features
• Many of SAS's most powerful features are based on a data-driven approach.
– Greater general usability, because the code is independent of the raw data.
– Dynamic output that adapts to different data contents.
– Fewer parameters.
Introduction
Background
Idea Needs Illumination
The data-driven approach has proven itself to be an excellent one.
To extend it: let the code be driven not only by the data structure but also by external files.
• This is the file-driven approach.
• Let SAS communicate with the operating system.
Introduction
Background
How This Idea Serves the Pharmaceutical Industry
• To import external data.
• To combine output files.
• To make our macros more flexible.
Example (SDTM implementation with raw data being external *.csv files)
%mimport(ae.csv);
%mimport(cm.csv);
%mimport(mh.csv);
...
VS
%mimport("folder path");
No need to type the file names one by one; just tell SAS where the files are.
Introduction
Aim of this paper
Two methods to implement the file-driven approach
• SAS 'D-' functions
– A series of SAS functions including DOPEN, DNUM and DREAD.
• Unnamed SAS pipes
– SAS pipes let us run DOS commands, which execute outside SAS, on a Windows system.
Method 1: SAS 'D-' Functions
Descriptions of Functions
How These 'D-' Functions Work
• DOPEN
– Opens a directory and returns a directory identifier value.
• DNUM
– Returns the number of members in the opened directory.
• DREAD
– Returns the name of a given member of the directory.
• DCLOSE
– Closes the opened directory and releases its resources.
Method 1: SAS 'D-' Functions
Utilizations of Functions
Utilization of These Functions and Implementation SAS Code
1. All calls are made within a DATA step.
2. With DOPEN, open the path under which the target files are located.
3. With DNUM, get the total number of files that exist under the path.
4. In a DO loop, get each file's name with DREAD and assign it to a variable.
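A minimal sketch of the four steps above. The fileref `indir`, the data set name `flst`, the variable names, and the folder `c:\external data` are assumptions chosen for illustration. On Windows, DREAD may also return subfolder names, so the sketch keeps only members whose extension is `csv`.

```sas
/* Hypothetical folder; replace with your own path */
%let path = c:\external data;

data flst;
    length fname $256;
    /* DOPEN needs a fileref, so assign one to the folder first */
    rc  = filename('indir', "&path");
    did = dopen('indir');                  /* returns 0 if the directory cannot be opened */
    if did > 0 then do;
        do i = 1 to dnum(did);             /* number of members in the directory */
            fname = dread(did, i);         /* name of the i-th member */
            /* keep only .csv members (assumed filter, not part of the steps above) */
            if upcase(scan(fname, -1, '.')) = 'CSV' then output;
        end;
        rc = dclose(did);                  /* close the directory */
    end;
    rc = filename('indir', ' ');           /* a blank file name deassigns the fileref */
    keep fname;
run;
```

Checking `did > 0` before the loop avoids calling DNUM and DREAD on a directory that failed to open, for example when the path is misspelled.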
Method 1: SAS 'D-' Functions
Results of Method 1
[Figure: the source folder and the resulting data set of file names]
Method 1: SAS 'D-' Functions
Utilization
Example (SDTM implementation with raw data being external *.csv files)
%macro mimport(path=);
**Code for collecting file names;
…;
%do i=1 %to &k;
**Original code for importing files;
%end;
%mend;
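One way the skeleton above might be filled in, shown only as a sketch. The macro variables `file1`, `file2`, ... and the use of PROC IMPORT with `dbms=csv` are assumptions, not the original importing code. Each output data set is named after its file, so the sketch also assumes the file names are valid SAS data set names.

```sas
%macro mimport(path=);
    %local i k;

    /* Collect the .csv file names under &path into macro variables file1..file&k */
    data _null_;
        length fname $256;
        rc  = filename('indir', "&path");
        did = dopen('indir');
        k   = 0;
        if did > 0 then do;
            do i = 1 to dnum(did);
                fname = dread(did, i);
                if upcase(scan(fname, -1, '.')) = 'CSV' then do;
                    k = k + 1;
                    /* 'L' stores the value in the macro's local symbol table */
                    call symputx(cats('file', k), fname, 'L');
                end;
            end;
            rc = dclose(did);
        end;
        call symputx('k', k, 'L');
        rc = filename('indir', ' ');
    run;

    /* Import every collected file; ae.csv becomes data set AE, and so on */
    %do i = 1 %to &k;
        proc import datafile="&path\&&file&i"
                    out=%scan(&&file&i, 1, .)
                    dbms=csv replace;
        run;
    %end;
%mend mimport;

/* Hypothetical call */
%mimport(path=c:\external data);
```

If the folder holds no .csv files, `k` is 0 and the `%do` loop simply does not run.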
Method 2: Unnamed SAS Pipes
Basic Introduction
Basic Introduction of Pipes and DOS Commands
• Pipes run DOS commands from within SAS, like the X statement.
• Unlike the X statement, SAS pipes are dynamic connections: SAS can read the command's output directly.
• With the following SAS code we can look up the available DOS commands.
Example (SAS code to look up DOS commands)
filename indata pipe 'help';
data help;
infile indata truncover;
input help $300.;
if _n_ ne 1;
run;
Method 2: Unnamed SAS Pipes
Utilization of SAS Pipes to Read File Names
Implementation SAS Code
Example
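filename indata pipe 'dir "c:\external data\*.csv" /b';
data flst;
    length fname $256;              /* room for long file names */
    infile indata truncover;
    input fname $256.;              /* formatted input keeps names that contain spaces */
    call symputx("n_file", _n_);    /* SYMPUTX stores the count without leading blanks */
run;
filename indata clear;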
Method 2: Unnamed SAS Pipes
Results for the Same Case
Result Using SAS Pipes
Exactly the same result, but with lighter code and simpler logic than the 'D-' functions.
Method 2: Unnamed SAS Pipes
Further Extended
UNC Path
• UNC: Universal Naming Convention.
• Many companies have a global SAS working environment in which network shares are often used.
• A UNC path is needed to connect to a network share.
• The 'dir' command may fail to find a UNC path.
Method 2: Unnamed SAS Pipes
Pipes Under UNC Path
How to deal with the situation where dir fails?
• The DOS command "subst" associates a path with a drive letter.
• Steps to establish a virtual drive that dir can find:
1. Set the SAS option 'noxwait'.
2. Use the X statement and the DOS command "subst" to assign the target path to a virtual drive.
3. With the virtual drive set up, pipes work again.
Method 2: Unnamed SAS Pipes
Pipes Under UNC Path
Example (assign a UNC path "\\global\project\data" to a virtual drive)
option noxwait;
X 'subst n: \\global\project\data';
[Figure: results after assigning the virtual drive]
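A sketch of the full round trip, combining the three steps with the earlier pipe example. It assumes that drive letter N: is free, that the share holds *.csv files, and that the XSYNC option is wanted so that subst finishes before the pipe runs. The final `subst n: /d` removes the virtual drive again.

```sas
options noxwait xsync;                    /* no manual EXIT; wait for each command to finish */
x 'subst n: "\\global\project\data"';     /* map the share to N: (assumed to be unused) */

filename indata pipe 'dir "n:\*.csv" /b'; /* dir now sees an ordinary drive letter */
data flst;
    length fname $256;
    infile indata truncover;
    input fname $256.;
run;
filename indata clear;

x 'subst n: /d';                          /* remove the virtual drive when done */
```

Removing the mapping at the end keeps repeated runs from failing because N: is already assigned.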
Discussion & Conclusion
Applying in More Fields
Example: To compare the latest 2 batches of extracted data.
filename indata pipe 'dir "C:\extracted" /b';
data fnam;
infile indata truncover;
input fnam $200.;
fnam2=lag(fnam);
call symput('last', strip(fnam2));
call symput('current', strip(fnam));
run;
File driven approaches can be more flexible.
Discussion & Conclusion
• Now we can:
– Import external data (CSV, XLS, ...).
– Combine outputs (RTF, LST, ...).
– Coordinate with other techniques to do more.
• With two methods:
– SAS 'D-' functions.
– SAS pipes.
• In addition, with SAS pipes we gain:
– More efficiency.
– Lighter code.
– More flexibility.
Q&A
Thanks