STATA_User_GroupOCT2007v5.ppt

STATA User Group
September 2007
Shuk-Li Man and Hannah Evans
Content

MACROS
-Description
-Why use them?
-How to store and use/recall them
-Applied examples

LOOPS
-Description
-Why use them?
-Applied examples (using foreach, forval and while)

MACROS WITH LOOPS
-Why use them together?
-Applied examples
Aims of the Talk
-An understanding of what macros and loops are and the principles behind using them.
-Understand how loops and macros can solve problems in a number of different contexts.
-Be able to apply macros and loops to your work
MACROS
What is a macro?
A macro is a saved sequence of keyboard
strokes that can be stored and then recalled
with a single keyboard stroke.
(ref: http://whatis.techtarget.com)
Macros. Why use them?
Macros increase the scope of your do file
How?
They allow you to run the same do file on multiple datasets, because the commands adapt to the contents of each dataset (for example its categories, maximum values or file names).
Macros in STATA
A sequence of keystrokes, the macrocontent, is stored under a macroname (you can call it whatever you like).
Macros are either local or global.
-A local macro can only be used within the do file or program in which it is created.
-A global macro can be used or modified in other do files and programs for the rest of the Stata session.
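The difference between local and global macros can be sketched as follows. The macro names and contents (greeting, project, hivstudy) are made up for illustration and are not from the talk.

```stata
* A local macro is visible only in the do file or program that defines it
local greeting hello
display "`greeting'"

* A global macro stays available for the rest of the Stata session
* and is recalled with a dollar sign instead of single quotes
global project hivstudy
display "$project"

* Remove the global when finished, to avoid clashes with other do files
macro drop project
```

Because a global macro can be changed by any other do file, a local macro is usually the safer choice unless the value really has to be shared.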

Storing a macro (macro assignment)
local animals dog cat mouse
Here the macrocontent is dog cat mouse, stored under macroname animals
local sum =2+2
Here the expression after the equals sign is evaluated, so the macrocontent is 4, stored under macroname sum
local add "=2+2"
Here the macrocontent is the text =2+2, stored under macroname add
Recalling a macro
(after storing it)
`macroname'
Single quotes are compulsory
On the keyboard
• left quote: the grave accent (backtick) key `
• right quote: the apostrophe key '
Using a macro
"`macroname'"
Double quotes are good practice when the macrocontent is to be treated as text
Examples of recalling a macro using display
local animals dog cat mouse
display "`animals'"
dog cat mouse
local sum =2+2
display "`sum'"
4
local add "=2+2"
display "`add'"
=2+2
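Stata replaces a macro reference with the macrocontent before it runs the command, which is why the double quotes matter. A minimal sketch of that substitution, reusing the macronames sum and add from the examples above (the content 2+2 without the leading equals sign is an assumption made so that it can be evaluated):

```stata
local sum = 2+2
* Stata substitutes the text 4, so this line runs as: display 4 + 1
display `sum' + 1

local add "2+2"
* Without double quotes the substituted text is evaluated: display 2+2
display `add'
* With double quotes the substituted text is shown as it is: display "2+2"
display "`add'"
```

The three display commands show 5, 4 and 2+2 in turn.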
Using Macros in different contexts
In a macro you can store:
-The categories/levels of a variable, e.g. gender: male or female
-A result calculated from a variable, e.g. mean age, maximum height
-The names of the files in a directory, e.g. all files in directory C:/datafiles
Macros-Applied Example 1
-Store the categories of a variable
levelsof gender, local(genderlevels) clean
Stores the categories male and female of the variable gender under the macroname genderlevels (the clean option stores the text without surrounding quotes)
display "`genderlevels'"
female male
Macros-Applied Example 2
-Storing a result calculated from a variable (1)
summarize age
local k=r(max)
di "`k'"
96
* oldest person is 96
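summarize leaves more than the maximum behind in r(). The sketch below uses the auto example dataset that ships with Stata, an assumption made only so that the code can be run without the talk's data, and the macroname n is illustrative.

```stata
* auto is an example dataset shipped with Stata, not the talk's data
sysuse auto, clear
summarize mpg
* return list shows every result that summarize stored in r()
return list
* Copy the results into local macros before another command replaces r()
local k = r(max)
local n = r(N)
display "The largest value of mpg is `k' among `n' cars"
```

Storing r(max) in a macro straight away matters because the next r-class command overwrites the stored results.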
Macros-Applied Example 3
-Storing a result calculated from a variable (2)
gen numword=wordcount(description)
summarize numword
    Variable |  Obs   Mean   Std. Dev.   Min   Max
     numword |    5                        1     4
local k=r(max)
di "`k'"
4
* maximum number of words in the description is 4 and this is stored in macroname k
Macros-Applied Example 4
-Store names of files in a directory
GPRD example:
Text files clinical1.txt, clinical2.txt & clinical3.txt stored in directory
C:\GPRD files
local mylist: dir "C:\GPRD files\" files "clinical*.txt"
mylist is the macroname; you choose the name
This stores the names of all text files in directory "C:\GPRD files\" beginning with the word "clinical" and is in effect the
Same as:
cd "C:\GPRD files\"
local mylist clinical1.txt clinical2.txt clinical3.txt
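Once the file names are in a macro they can be counted and displayed. A minimal sketch, assuming the current directory (".") and the pattern "*.txt" instead of the GPRD folder; nfiles is an illustrative macroname.

```stata
* Look in the current directory for text files (directory and pattern are assumptions)
local mylist: dir "." files "*.txt"
* Count the file names held in the macro
local nfiles: word count `mylist'
display "`nfiles' text files found"
* Each name is stored in double quotes, so display needs compound double quotes
display `"`mylist'"'
```

If no file matches the pattern, the macro is empty and the count is 0.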
Loops. What is one?
A set of commands executed repeatedly for:
1. a list of elements (foreach)
2. a range of values (forval)
3. a specified condition (while; not covered in this talk)
Loops. Why use them?
-Reduces the length of do files
-Reduces the chance of human error
-Reduces checking time
-Can reduce the amount of memory STATA needs to run a command
-Increases the efficiency of a do file

Loops. Example
Predictors of a prolonged HIV inpatient stay.
Factors include:
• Ethnicity (categorical)
• Clinical stage (3 indicator variables)
• Age (continuous)
• Immigration status (binary)
Loops. Example 1: Forval
forval num = 1/3 {
tab clin_stage`num' pro_stay
}
num is the lname, an arbitrary name for the loop macro
1/3 is the range
Same as:
tab clin_stage1 pro_stay
tab clin_stage2 pro_stay
tab clin_stage3 pro_stay
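The range does not have to go up in steps of one. A short sketch of two range forms that needs no dataset; the macronames i and total are illustrative.

```stata
* 1/3 counts 1, 2, 3
forvalues num = 1/3 {
    display "clin_stage`num'"
}

* 0(5)20 counts from 0 to 20 in steps of 5
local total = 0
forvalues i = 0(5)20 {
    local total = `total' + `i'
}
display "`total'"
```

The first loop shows the three variable names that the talk's example tabulates; the second adds 0, 5, 10, 15 and 20 and displays 50.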
Loops. Example 1: Foreach
foreach factor of varlist ethnicity immigration {
recode `factor' 99=.
tab `factor' pro_stay, row chi2 exact
}
factor is the lname, an arbitrary name for the current element of the list
List types: in (any list), of varlist, of numlist, of newlist (newvarlist), of local (lmacname), of global (gmacname)
Same as:
recode ethnicity 99=.
tab ethnicity pro_stay, row chi2 exact
recode immigration 99=.
tab immigration pro_stay, row chi2 exact
Loops. Example 2: Foreach
GPRD example:
1. save all text files to a .dta format
2. append these files together.
foreach x in clinical1 clinical2 clinical3 {
insheet using `x'.txt, clear
save `x', replace
}
Product: clinical1.dta, clinical2.dta and clinical3.dta
Loops. Foreach-other examples
foreach x in anylist {
….
}
General list e.g. filenames
foreach x of numlist numlist {
….
}
The same as using forval
foreach x of newlist newvarlist {
….
}
For names of new variables that do not yet exist in the dataset
foreach x of local macroname {
….
}
This allows the loop to run through the list stored in a local macro
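The list types can be tried side by side. The sketch below assumes the auto example dataset that ships with Stata; the words, numbers and new variable names (alpha, beta, flag1, flag2, fruits) are made up for illustration.

```stata
sysuse auto, clear

* in: any list of words, taken as typed
foreach w in alpha beta {
    display "`w'"
}
* of varlist: the names are checked against the variables in memory
foreach v of varlist price mpg {
    summarize `v'
}
* of numlist: number-list shorthand such as 1(2)5 is expanded to 1 3 5
foreach n of numlist 1(2)5 {
    display `n'
}
* of newlist: the names are checked to be valid new variable names
foreach v of newlist flag1 flag2 {
    generate byte `v' = 0
}
* of local: loops over the words stored in the local macro fruits
local fruits apple pear
foreach f of local fruits {
    display "`f'"
}
```

Note that of local takes the macroname itself, without the single quotes.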
MACROS
WITH
LOOPS
Macros and loops. Why use together?
-Macros and loops advantage: can use the same do file on multiple datafiles.
-Loops advantage: shorter, less error-prone do file.
-Additional loops advantage: can get around commands that do not allow by() as an option.
Macros and Loops
-Example 1: using Macros Example 1
levelsof gender, local(levels) clean
display "`levels'"
female male
foreach x of local levels {
tab1 smoking bmi if gender=="`x'"
}
Same as:
tab1 smoking bmi if gender=="female"
tab1 smoking bmi if gender=="male"
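A runnable version of this pattern, as a sketch on a small made-up dataset (the four observations and the variable smoker are hypothetical and only stand in for the talk's data):

```stata
clear
input str6 gender byte smoker
"male"   1
"female" 0
"female" 1
"male"   0
end

* clean stores the levels as plain words: female male
levelsof gender, local(levels) clean
foreach x of local levels {
    display "gender = `x'"
    tabulate smoker if gender == "`x'"
}
```

If a new category appears in gender, the loop picks it up without any change to the do file.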
Macros and Loops
Example 2
HAVE: the variable HPVtypes. WANT: the indicator variables type9, type23 and type36.
HPVtypes     type9   type23   type36
9 23 & 36      1       1        1
36             0       0        1
23 & 36        0       1        1
23             0       1        0
9 & 23         1       1        0
Macros and Loops
Example 2 -using Macros Example 3
1. Find the greatest number of words in HPVtypes
gen numbword=wordcount(HPVtypes)
summarize numbword
    Variable |  Obs   Mean   Std. Dev.   Min   Max
    numbword |    5                        1     4
local k=r(max)
di "`k'"
4
Greatest number of words in HPVtypes is 4
2. Generate variables for each word
forval i=1/`k' {
gen word`i'=word(HPVtypes, `i')
}
Macros and Loops
Example 2
HPVtypes     word1   word2   word3   word4
9 23 & 36    9       23      &       36
36           36
23 & 36      23      &       36
23           23
9 & 23       9       &       23
Macros and Loops using Macros
Example 2 cont.
3. Generate variable for each type
forval i=1/36 {
    * creates the variables type1-type36, all containing zero
    gen type`i'=0
    forval j=1/`k' {
        * gives 1 to type`i' where word`j' is the number `i'; words that are "&" are ignored
        * e.g. if type 36 is present in HPVtypes, 0 changes to 1 in type36
        replace type`i'=1 if real(word`j')==`i'
    }
    summarize type`i', meanonly
    * stores the maximum of type`i' under a new macroname, so that `k' is not overwritten
    local typemax=r(max)
    * drops the variable if its maximum is zero, i.e. the type is not in HPVtypes
    if `typemax'==0 {
        drop type`i'
    }
}
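The three steps can be run end to end on the five HPVtypes values from the HAVE table. This is a sketch rather than the presenters' exact do file: it types the data in with input, and it tests r(max) directly instead of storing it in a macro first.

```stata
clear
input str20 HPVtypes
"9 23 & 36"
"36"
"23 & 36"
"23"
"9 & 23"
end

* 1. Greatest number of words in HPVtypes
generate numword = wordcount(HPVtypes)
summarize numword, meanonly
local k = r(max)

* 2. One variable per word
forvalues i = 1/`k' {
    generate word`i' = word(HPVtypes, `i')
}

* 3. One indicator per type, dropped again if the type never occurs
forvalues i = 1/36 {
    generate byte type`i' = 0
    forvalues j = 1/`k' {
        quietly replace type`i' = 1 if real(word`j') == `i'
    }
    summarize type`i', meanonly
    if r(max) == 0 {
        drop type`i'
    }
}
list HPVtypes type*
```

Only type9, type23 and type36 remain, because real() returns missing for "&" and for empty words, so those never match a type number.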
Macros and Loops
Example 2-Output
HPVtypes     type9   type23   type36
9 23 & 36      1       1        1
36             0       0        1
23 & 36        0       1        1
23             0       1        0
9 & 23         1       1        0
Macros and Loops
Example 2 -updated dataset
HPVtypes          type9   type23   type36
9 23 & 36 & 21      1       1        1
36 & 16             0       0        1
23 & 36             0       1        1
23                  0       1        0
9 & 23              1       1        0
Macros and Loops
Example 2
Problem:
Want to append all files in a directory that begin
with the word clinical
REFRESH
Macros- Example 4-Store names of files in a directory
Text files received from GPRD, clinical1.txt, clinical2.txt and clinical3.txt, stored in directory C:\GPRD files
local mylist: dir "C:\GPRD files\" files "clinical*.txt"
di `"`mylist'"'
"clinical1.txt" "clinical2.txt" "clinical3.txt"
Each file name is stored in double quotes, so display needs compound double quotes
REFRESH
Loops-Foreach example 2
GPRD example:
1. save all text files to a .dta format
2. append these files together
foreach x in clinical1 clinical2 clinical3 {
insheet using `x'.txt, clear
save `x', replace
}
Product: clinical1.dta, clinical2.dta and clinical3.dta
Macros and Loops example 2
cd "C:\GPRD files\"
local mylist: dir "C:\GPRD files\" files "clinical*.txt"
foreach x of local mylist {
    * stub is the file name without the .txt extension, e.g. clinical1
    local stub=subinstr("`x'", ".txt", "", .)
    insheet using "`x'", clear
    save "`stub'", replace
    * saves every text file with the name clinical at the beginning as a .dta file
}
gen row_no=_n
sum row_no
local k=r(max)
drop in 1/`k'
drop row_no
* drops all observations from clinical3.dta, which is still in memory
foreach x of local mylist {
    local stub=subinstr("`x'", ".txt", "", .)
    append using "`stub'"
}
save clinical_all, replace
* appends clinical1.dta, clinical2.dta, clinical3.dta together and saves as clinical_all.dta
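The append step can also start from an empty dataset, which avoids having to empty the last file first. A minimal sketch of that alternative, in which three small temporary files stand in for clinical1.dta to clinical3.dta (the variable patid and its values are hypothetical):

```stata
* Three temporary files stand in for the converted clinical files
tempfile part1 part2 part3
forvalues f = 1/3 {
    clear
    set obs 2
    generate patid = `f' * 10 + _n
    save "`part`f''"
}

* Start from an empty dataset, then append each file in turn
clear
forvalues f = 1/3 {
    append using "`part`f''"
}
count
list
```

The nested reference is expanded from the inside out: the loop number is substituted first, giving the macroname part1, part2 or part3, which is then replaced by the temporary file name. Temporary files are deleted when the do file finishes, so a real do file would save the combined data under a permanent name.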
Macros and Loops
Example 2
Result
A file called clinical_all.dta which contains the data from all text files with a file name starting with “clinical” in directory “C:\GPRD files\”.
Go away today with…
-An understanding of what macros and loops are and the principles behind using them.
-Understand how loops and macros can solve problems in a number of different contexts.
-Be able to apply macros and loops to your work
And Hopefully….
BE PERSUADED THAT USING LOOPS AND
MACROS IN DO FILES IS REQUIRED FOR
YOU TO WORK MORE EFFICIENTLY IN THE
FUTURE AND BE MORE CONFIDENT THAT
YOUR RESULTS ARE ALL CORRECT.
______________
Thank you