Getting Started With STATA
Getting Our Feet Wet with STATA
On https://ctools.umich.edu, you will find a dataset containing, among other things, some of the
items from the SF-12 Assessments Of Physical And Mental Health from the National
Longitudinal Survey of Youth (NLSY).
1) Download the dataset and brief codebook file to the desktop of the computer that you are
working on.
How do I do this?
2) Double-click on the file to open it. It may have opened automatically when you
downloaded it; if not, save it to the desktop and double-click it there.
3) Open a log file on the desktop, or in your IFS space to save your work.
How do I do this?
4) Looking at your STATA program, try to answer the following questions:
a) Where are the different individuals in the dataset?
b) Where are the variables? Try generating a list of variables in the data set by using the
“codebook” command.
c) Where are the SF-12 questions (variables)? Use lookfor SF-12 or scroll through the
variables window to find them.
d) How many individuals are represented in this data set? (hint: the describe command
will help you)
How do I do these things?
5) The names of the variables are less than intuitive. Using the rename command, can you
rename one of the SF-12 variables so that it has a more intuitive name?
This one is easy! Pick one of the questions and type:
rename [oldname] [newname]
(STATA commands are case-sensitive, so type rename in lowercase.)
6) According to the codebook, some values of the data actually represent missing data. Using
the SF-12 variable that you renamed, can you check that missing values are counted
as missing by the program?
How do I figure this out?
7) Try calculating the average score and running a frequency distribution of one of the SF-12
items that you have been working with. What does this tell you?
Pick a variable (question) and use the summarize or tabulate
command to try to get some information about the average
answers to that question.
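Steps 5 and 7 can be sketched as the short sequence below. This is only an illustration: it uses the auto dataset that ships with STATA as a stand-in for the NLSY file, and the new name repair_record is made up for the example. With the NLSY data open, you would substitute the SF-12 variable you picked.

```stata
* Illustration only: auto.dta ships with STATA and stands in for NLSY.dta.
* Note that "clear" drops whatever data is currently in memory.
sysuse auto, clear

* Step 5: give a cryptic variable a more intuitive name
* (repair_record is a hypothetical name chosen for this example)
rename rep78 repair_record

* Step 7: average score, then a frequency distribution
summarize repair_record
tabulate repair_record

* Adding the missing option also shows how many values are missing
tabulate repair_record, missing
```

The summarize output reports the number of non-missing observations alongside the mean, so comparing it with the total number of rows is a quick way to see how many answers were left out of the average.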
Download the data set
These .txt files are the codebooks
Select NLSY.dta
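If you prefer typing to double-clicking, the file can also be opened from the command window. A minimal sketch, assuming STATA's working directory is already the folder where you saved NLSY.dta (that location is an assumption; adjust it to wherever the file actually is):

```stata
* Show the current working directory
pwd

* Load the dataset; "clear" drops any data already in memory
use "NLSY.dta", clear
```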
Open a log file
Log files are opened and closed with the little button that looks like a “scroll” next to a “stoplight”.
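A log can also be opened and closed by typing commands instead of using the button. A minimal sketch; the file name getting_started.log is hypothetical, and the log is written to the current working directory unless you give a full path:

```stata
* Start recording commands and output in a plain-text log
* (replace overwrites an existing file with the same name)
log using "getting_started.log", text replace

* ... your work goes here, for example:
describe, short

* Stop recording and close the file
log close
```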
Try to answer the following questions
The spreadsheet containing rows of individuals, and columns of the questions they were asked, can be seen by clicking on the browse or edit data buttons.
The questions that were asked in the survey are just above.
You can “lookfor” or “describe” certain variables (questions).
“describe, short” will give you information about the characteristics of the data, including the number of respondents.
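The commands mentioned on this slide can be typed one after another in the command window. A rough sketch, assuming the NLSY dataset is already open in STATA:

```stata
* Number of observations (individuals) and variables, without the full list
describe, short

* Full list of variables with their labels
describe

* Search variable names and labels for the text "SF-12"
lookfor SF-12

* One-line summary of every variable
codebook, compact

* Open the spreadsheet view of the data (read-only)
browse

* Another way to get the number of individuals
count
```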
Missing Values
This may strike you as a little bit complicated initially,
but really, it’s a matter of common sense. Here’s
an illustration of the problem:
• The responses to many survey questions are
coded in the following way:
– Agree = 1
– Neutral = 2
– Disagree = 3
• Often, survey responses such as “don’t know”,
“refused to answer”, “was not interviewed” are
assigned a special numeric code indicating the
non-response such as 99, -8, -9.
• We will run into problems if we try to get an
average value for this variable, because the
codes for the missing responses will be averaged
in with the codes for the actual responses, so we
might get a meaningless average (for example, one
that falls outside the 1-to-3 range of real answers).
• We have to tell the software (using the recode
command) that these answers are in fact missing,
and should be excluded from our calculations.
• For many data sets, including the NLSY, this has
already been done.
• The symbol for missing values in STATA (and in
some other statistical packages) is a period (“.”).
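The problem and its fix can be seen on a tiny made-up dataset. The sketch below is not NLSY data: the variable response and its five values are invented for illustration, using the 1-to-3 coding and the non-response codes 99 and -9 from the bullets above.

```stata
* Build a five-row toy dataset (invented values, not from the NLSY)
clear
input response
1
2
3
99
-9
end

* The non-response codes are averaged in: mean = (1+2+3+99-9)/5 = 19.2
summarize response

* Tell STATA that 99, -8 and -9 are missing, not real answers
recode response (99 -8 -9 = .)

* Only the three real answers remain: mean = (1+2+3)/3 = 2
summarize response

* How many values are now treated as missing?
count if missing(response)
```

After the recode, the two non-response rows show up as a period in the data browser and are left out of the average.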