SOME BASIC PROBABILITY CONCEPTS


INTRODUCTION TO STATA
Võ Tuấn Khoa
Trần Thế Trung
Stata basics
• command-driven or menu-driven software
• models complex data from longitudinal studies or surveys; ideal for analyzing
results from clinical trials or epidemiological studies
• provides a powerful programming
language
Stata interface in Windows
Stata command
The basic syntax of a Stata command is
[by varlist:] command [varlist] [=exp] [if exp] [in range] [weight] [using filename] [, options]
where the elements in brackets are optional.
Stata command
• [by varlist:] instructs Stata to repeat the command for each
combination of values in the list of variables varlist.
• command is the name of the command and can be
abbreviated; for example, the command display can be
abbreviated as dis.
• [varlist] is the list of variables to which the command applies.
• [=exp] is an expression.
• [if exp] restricts the command to that subset of the observations
that satisfies the logical expression exp.
• [in range] restricts the command to those observations whose
indices lie in a particular range.
• [weight] allows weights to be associated with observations.
• [using filename] specifies the filename to be used.
• [options] are specific to the command and may be abbreviated.
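The syntax elements above can be tried out on any dataset. The lines below are a minimal sketch that assumes the example dataset auto, which ships with Stata; the dataset and the new variable name price_k are not part of the original slides and are used only for illustration.

```stata
* Load an example dataset shipped with Stata (an assumption of this sketch)
sysuse auto, clear

* command + varlist
summarize price mpg

* [if exp]: only cars with mpg of at least 20
summarize price if mpg >= 20

* [in range]: only the first 10 observations
list make price in 1/10

* [by varlist:] and [, options]: repeat for each value of foreign, with detail
bysort foreign: summarize price if mpg >= 20, detail

* [=exp]: create a new variable from an expression (price_k is hypothetical)
generate price_k = price / 1000
```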
Stata command
• Example 1
– Stata command:
. bysort black: summarize age if year >= 80, detail
– Result:
• Summarizes age separately for each value of black, including
only observations for which year >= 80, and reports extra detail.
Stata command
• Example 2
– Stata Commands:
.generate agelt30 = age
.replace agelt30 = 1 if age < 30
.replace agelt30 = 0 if age >= 30 & age <.
– Result: the variable agelt30 is set to 1, 0, or missing
– Generally, [=exp] is used with the commands
generate and replace
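The same kind of 1/0/missing indicator can often be built in a single generate statement. This is a sketch under the assumption that the example dataset auto is available; the variable lowrep and the cutoff 3 are hypothetical and chosen only because rep78 contains missing values.

```stata
sysuse auto, clear

* A logical expression evaluates to 1 (true) or 0 (false).
* The if qualifier leaves lowrep missing wherever rep78 is missing.
generate byte lowrep = rep78 < 3 if !missing(rep78)

* Show the counts for 0, 1, and missing
tabulate lowrep, missing
```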
Stata command
• Click Help / Stata command
• Type a keyword (e.g., summarize)
• See details
Do Files and Log Files
• A do file is a text file of Stata code that
Stata runs line by line, as if the
commands were typed in the Stata
Command window.
• A log file is a text file with all the results
that appear in the Stata Results window.
– the user selects when to start and when to
stop logging to the log file
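A do file that starts and stops its own log might look like the sketch below. The file names myanalysis.do and mysession.log are hypothetical, and the example dataset auto is assumed only so that the commands have something to run on.

```stata
* Contents of a hypothetical do file, myanalysis.do

* Close any log left open by an earlier run (ignore the error if none is open)
capture log close

* Start logging to a plain-text file, overwriting an older copy
log using mysession.log, text replace

sysuse auto, clear
summarize price mpg

* Stop logging
log close
```

The file would then be run from the Command window with do myanalysis.do.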
Variable name
• Can have up to 32 characters, but shorter
names are easier to type
• Stata names are case sensitive (age≠Age)
• Should be:
– short and lowercase
– a single word
– separated by underscores if more than one word
• Examples: effort, fpe, family_planning_effort, familyplanningeffort
Variable type
• Numeric variable
• String variable
• Missing value
– numeric: dot (.)
– string: ""
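Missing values matter in comparisons, because Stata treats a numeric missing value as larger than any nonmissing number. The sketch below assumes the example dataset auto, in which rep78 has some missing values; the cutoff 3 is arbitrary.

```stata
sysuse auto, clear

* Count observations with a missing numeric value
count if missing(rep78)

* Caution: this count also includes observations where rep78 is missing
count if rep78 > 3

* Exclude missing values explicitly
count if rep78 > 3 & rep78 < .

* A string variable is missing when it equals the empty string
count if make == ""
```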
Some Basic Commands
• computing basic statistics
– summarize ypc
– summarize ypcf [w=popwt]
– summarize ylab [w=popwt] if age >= 25 & age <= 55
• generate new variables
– generate ypc2 = ypc^2
• tabulate data
– table skill [w=popwt], c(mean ylab)
Some Basic Commands
• renaming variables
– rename ypc2 ypc22
• eliminating variables
– drop ypc22
• replacing values
– replace male=0 if male==1
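The variables in the commands above (ypc, ylab, popwt, skill, male) come from the presenters' own data. As a sketch that can be run without that data, the same commands are shown here on the example dataset auto; the variable choices and the names price2 and price_sq are illustrative assumptions. The weight used is vehicle weight, which only illustrates the syntax and is not a meaningful survey weight. Because the syntax of table changed in newer Stata releases, the grouped mean is shown with tabulate and its summarize() option instead.

```stata
sysuse auto, clear

* Basic statistics, unweighted and weighted
summarize price
summarize price [w=weight]
summarize price [w=weight] if mpg >= 20 & mpg <= 30

* Generate a new variable
generate price2 = price^2

* Mean of price for each value of foreign
tabulate foreign, summarize(price)

* Rename, then drop, the new variable
rename price2 price_sq
drop price_sq

* Replace values that meet a condition (top-code mpg at 40)
replace mpg = 40 if mpg > 40 & mpg < .
```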
Open data from Excel format
• Import data from an Excel file
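Besides the menus, an Excel file can be read with the import excel command. In this sketch the file name mydata.xlsx and the sheet name Sheet1 are hypothetical placeholders.

```stata
* Read a worksheet, treating the first row as variable names,
* and replace any data currently in memory
import excel using "mydata.xlsx", sheet("Sheet1") firstrow clear

* Check what was imported
describe
```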
Review data
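A few commands are commonly used to review a dataset once it is loaded. The sketch assumes the example dataset auto; the variables listed are chosen only for illustration.

```stata
sysuse auto, clear

* Variable names, storage types, and labels
describe

* Print the first five observations for selected variables
list make price mpg in 1/5

* Detailed description of one variable, including missing values
codebook rep78
```

The browse command opens the same data in a spreadsheet-style viewer.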
Starting descriptive analysis
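A first descriptive pass usually combines summarize for continuous variables with tabulate for categorical ones. The sketch again assumes the example dataset auto.

```stata
sysuse auto, clear

* Mean, standard deviation, minimum, and maximum
summarize price mpg weight

* Percentiles, skewness, and kurtosis for one variable
summarize price, detail

* One-way frequency table
tabulate foreign

* Two-way table that also shows missing values
tabulate rep78 foreign, missing
```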
Output Window