STATA
Third group training course in application of information
and communication technology to production and
dissemination of official statistics
10 May – 11 July 2007
Gereltuya Altankhuyag, Lecturer/Statistician, UNSIAP
7/21/2015
1
Getting Started


There are three ways of executing commands:
- Using the menu bar
- Using a dialog box (db)
- Using syntax (typed commands)
Using syntax is preferable.
Getting Started – dialog box
- db is the command-line way to launch the dialog box for a Stata command.
- Syntax:
  db commandname
For instance:
  db summarize
Basic commands to inspect datasets

The following commands are used to inspect datasets
codebook
count
describe
list
summarize
table
tabstat
Basic commands to inspect datasets
codebook
- It examines the variable names, labels, and data to produce a codebook describing the dataset.
- It reports the standard missing value (.) separately from the extended missing values (.a, .b, ..., .z).
- Syntax:
  codebook [varlist] [if] [in] [, options]
Examples:
  codebook
  codebook region
Basic commands to inspect datasets

Options:
- all – equivalent to specifying header and notes; gives a complete report that omits only the mv analysis
- header – adds a header to the top of the output listing the dataset name, the date it was last saved, etc.
- notes – lists any notes attached to the variables
- mv – searches the data to determine the pattern of missing values
Examples:
  codebook region hhlandd famsize, all
  codebook region hhlandd famsize, header
  codebook region hhlandd famsize, notes
  codebook region hhlandd famsize, mv
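The course dataset is not reproduced in these slides, so the sketch below assumes Stata's bundled auto dataset (loaded with sysuse) as a stand-in; the variables rep78 and foreign belong to that dataset, not to the course data. It shows the same codebook options on data that anyone can load.

```stata
* Assumption: the auto dataset shipped with Stata replaces the course data.
sysuse auto, clear

* Basic codebook for two variables; rep78 contains missing values.
codebook rep78 foreign

* header + notes in one option.
codebook rep78 foreign, all

* Look for patterns in the missing values.
codebook rep78 foreign, mv
```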

Basic commands to inspect datasets
count
- It counts the number of observations that satisfy the specified conditions. If no conditions are specified, count displays the number of observations in the data.
- Syntax:
  count [if] [in]
For instance:
  count
  count if famsize>=5
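A minimal sketch of count with and without conditions, again assuming the bundled auto dataset rather than the course data (mpg and foreign are variables of that dataset). It also shows that count leaves its result in r(N), which can be reused afterwards.

```stata
* Assumption: the auto dataset shipped with Stata replaces the course data.
sysuse auto, clear

* Number of observations in the whole dataset.
count

* Number of observations satisfying a condition.
count if mpg >= 25

* Condition and observation range combined.
count if foreign == 1 in 1/60

* count stores the number it found in r(N).
display "observations counted: " r(N)
```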
Basic commands to inspect datasets
describe
- It produces a summary of the dataset:
  - in memory
  - or of the data stored in a Stata-format dataset
- Syntax:
  - Data in memory:
    describe [varlist] [, describem_options]
  - Data in a file:
    describe [varlist] using filename [, describef_options]
Examples:
  des
  des region famsize toilet
Basic commands to inspect datasets

Options:
- simple – display only variable names
- short – display only general information
- detail – display additional details
- fullnames – do not abbreviate variable names
- numbers – display variable number along with name
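The following sketch applies some of these options, assuming the bundled auto dataset in place of the course data. The temporary file name autocopy is hypothetical and is only there to demonstrate describe using; run the block from a do-file so the temporary file is cleaned up automatically.

```stata
* Assumption: the auto dataset shipped with Stata replaces the course data.
sysuse auto, clear

* General information only (observations, variables, sort order).
describe, short

* Variable names only.
describe, simple

* Unabbreviated names together with variable numbers.
describe make price mpg, fullnames numbers

* Describe a dataset on disk without loading it.
* autocopy is a hypothetical temporary file created for this illustration.
tempfile autocopy
save "`autocopy'"
describe using "`autocopy'", short
```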
Basic commands to inspect datasets
list
- It displays the values of variables.
- Syntax:
  list
  list [varlist] [if] [in] [, options]
Examples:
  list
  list region famsize toilet
  list region famsize toilet in 1/15
  list region if famsize>5 in 1/15
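As a runnable counterpart to the examples above, here is a short sketch on the bundled auto dataset (an assumption, since the course data is not included here). The options noobs and nolabel are standard list options that the slides do not mention.

```stata
* Assumption: the auto dataset shipped with Stata replaces the course data.
sysuse auto, clear

* First five observations of three variables.
list make price mpg in 1/5

* Only observations meeting a condition, without observation numbers.
list make mpg if mpg > 30, noobs

* Show numeric codes instead of value labels.
list make foreign in 1/5, nolabel
```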
Basic commands to inspect datasets
summarize
- It calculates and displays a variety of summary statistics. If no varlist is specified, summary statistics are calculated for all the variables in the dataset.
- Syntax:
  summarize
  summarize [varlist] [if] [in] [weight] [, options]
Examples:
  sum
  sum in 1/15
  sum region famsize toilet
  sum region famsize toilet [aw=weight]
Basic commands to inspect datasets

Options:
- detail - produces additional statistics, including skewness, kurtosis, the four smallest and four largest values, and various percentiles.
- meanonly - allowed only when detail is not specified; suppresses the display of results and the calculation of the variance.
- format - requests that the summary statistics be displayed using the display formats associated with the variables rather than the default g display format.
- separator(#) - specifies how often to insert separation lines into the output. The default is separator(5), meaning that a line is drawn after every 5 variables. separator(10) would draw a line after every 10 variables. separator(0) suppresses the separation line.
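The options can be tried out with the sketch below, which assumes the bundled auto dataset instead of the course data. It also illustrates the usual reason for meanonly: the mean is computed quietly and then read back from the stored result r(mean).

```stata
* Assumption: the auto dataset shipped with Stata replaces the course data.
sysuse auto, clear

* Percentiles, skewness, kurtosis, smallest and largest values.
summarize price mpg, detail

* Use each variable's display format and draw no separator lines.
summarize price mpg weight, format separator(0)

* Compute the mean without printing a table, then reuse it.
summarize price, meanonly
display "mean price = " r(mean)

* Analytic weights, as in the slide example.
summarize mpg [aweight = weight]
```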

Basic commands to inspect datasets
NOTE:
- Commands and output are shown in the Results window.
- When the MORE message is shown, press GO to continue the display or the X button to stop it.
Basic commands to inspect datasets
NOTE:
- We may specify a variable list as a range of variables, written with a hyphen between the first and last variable:
  des region-toilet
  sum region-hhlandd
  list thana-famsize
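A range such as first-last covers every variable stored between the two names in the dataset's current variable order. The sketch below demonstrates this on the bundled auto dataset (an assumption), whose variables begin make, price, mpg, rep78, headroom, trunk.

```stata
* Assumption: the auto dataset shipped with Stata replaces the course data.
sysuse auto, clear

* Covers price, mpg and rep78, which are stored next to each other.
describe price-rep78

* Covers mpg, rep78, headroom and trunk.
summarize mpg-trunk

* Covers make, price and mpg.
list make-mpg in 1/5
```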
Basic commands to inspect datasets
NOTE:
- We may use the menus.
  For DESCRIBE:
    Data ► Describe Data ► Describe Variables in Memory
  For SUMMARIZE:
    Statistics ► Summaries, Tables & Tests ► Summary Statistics ► Summary Statistics
    Data ► Describe Data ► Summary Statistics
Basic commands to inspect datasets
There are 5 types of "table" command:
- table
- tabstat
- tabulate one-way
- tabulate two-way
- tabulate summarize
Basic commands to inspect datasets
table
- It calculates and displays tables of statistics.
- Syntax:
  table rowvar [colvar [supercolvar]] [if] [in] [weight] [, options]
- Main options:
  - contents - specifies the contents of the table's cells; select up to 5 statistics
  - by(superrowvarlist) - superrow variables; up to 4 variables
Basic commands to inspect datasets
Examples:
  table region, c(mean famsize median hhlandd)
  table region, by(sexhead) c(mean famsize median hhlandd)
Basic commands to inspect datasets
tabstat
- It displays a table of summary statistics.
- Syntax:
  tabstat varlist [if] [in] [weight] [, options]
- Main options:
  - by(varname) - group statistics by variable
  - statistics(statname [...]) - report specified statistics
Basic commands to inspect datasets

Examples:
  tabstat region, stats(mean range)
  tabstat region, by(sexhead) stat(min mean max) col(stat)
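A comparable sketch on the bundled auto dataset (an assumption in place of the course data) shows how by() and columns(statistics) change the layout when several variables are summarized at once.

```stata
* Assumption: the auto dataset shipped with Stata replaces the course data.
sysuse auto, clear

* Several statistics for two variables; by default variables form the columns.
tabstat price mpg, statistics(mean sd min max)

* Group by a categorical variable and put the statistics in the columns.
tabstat price mpg, by(foreign) statistics(n mean median) columns(statistics)
```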
Basic commands to inspect datasets
tabulate one-way (tab1)
- Syntax:
  - tabulate varname [if] [in] [weight] [, options]
    It produces a one-way table of frequency counts for a single variable.
  - tab1 varlist [if] [in] [weight] [, tab1_options]
    It produces a one-way tabulation for each variable specified in varlist.
Basic commands to inspect datasets

Examples:
  tabulate toilet
  tabulate region
  tabulate hhelec
  tabulate sexhead
  tab1 region toilet hhelec sexhead
Note: compare the outputs and see the differences.
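The difference is easiest to see side by side. In this sketch, which assumes the bundled auto dataset rather than the course data, tabulate with one variable gives one table, tab1 gives one table per listed variable, and tabulate with two variables gives a cross-tabulation instead.

```stata
* Assumption: the auto dataset shipped with Stata replaces the course data.
sysuse auto, clear

* One-way table for a single variable.
tabulate rep78

* Same table, with missing values shown as their own category.
tabulate rep78, missing

* Two separate one-way tables, one per variable.
tab1 rep78 foreign

* By contrast, two variables after tabulate give one two-way table.
tabulate rep78 foreign
```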
Basic commands to inspect datasets
tabulate two-way (tab2)
- It produces two-way tables of frequencies.
- Syntax:
  - tabulate varname1 varname2 [if] [in] [weight] [, options]
    It produces two-way tables of frequency counts, along with various measures of association, including the common Pearson's chi-squared, the likelihood-ratio chi-squared, Cramér's V, Fisher's exact test, Goodman and Kruskal's gamma, and Kendall's tau-b.
Basic commands to inspect datasets
  - tab2 varlist [if] [in] [weight] [, options]
    It produces all possible two-way tabulations of the variables specified in varlist.
Examples:
  tabulate region toilet, row
  tabulate region sexhead, row col chi2
  tabulate region toilet, all exact
  tab2 region sexhead toilet
  tab2 region sexhead toilet, all exact
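For a version that can be run without the course data, the sketch below uses the bundled auto dataset (an assumption). With three variables, tab2 produces the three possible pairwise tables.

```stata
* Assumption: the auto dataset shipped with Stata replaces the course data.
sysuse auto, clear

* Frequencies with row percentages.
tabulate rep78 foreign, row

* Pearson's chi-squared and Fisher's exact test.
tabulate rep78 foreign, chi2 exact

* All pairwise two-way tables among three variables.
tab2 rep78 foreign headroom
```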
Basic commands to inspect datasets
tabulate summarize
- It produces one- and two-way tables (breakdowns) of means and standard deviations.
- Syntax:
  tabulate varname1 [varname2] [if] [in] [weight] [, summarize(varname3) options]
Basic commands to inspect datasets
Examples:
One-way tables:
  tabulate region, summarize(hhlandd)
  tabulate region [aweight=weight], summarize(toilet)
Two-way tables:
  tabulate region sexhead, summarize(hhlandd)
  tabulate region sexhead [aweight=weight], summarize(hhlandd)
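The same pattern on the bundled auto dataset (an assumption in place of the course data) looks like this. The means option, which is not covered on the slides, restricts each cell to the mean.

```stata
* Assumption: the auto dataset shipped with Stata replaces the course data.
sysuse auto, clear

* One-way breakdown: mean, standard deviation and frequency of mpg by origin.
tabulate foreign, summarize(mpg)

* Two-way breakdown showing only the cell means.
tabulate rep78 foreign, summarize(mpg) means

* Weighted one-way breakdown.
tabulate foreign [aweight = weight], summarize(mpg)
```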
Basic commands to create and
change variables, labels etc.
generate
- It creates a new variable. The values of the variable are specified by =exp.
- Syntax:
  generate [type] newvar[:lblname] =exp [if] [in]
- Examples:
  gen agehead2=agehead*agehead
  gen agehead3=agehead*agehead if sexhead==1
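One consequence of the if qualifier is worth seeing in practice: observations that do not satisfy the condition receive a missing value in the new variable. The sketch below shows this on the bundled auto dataset (an assumption); all new variable names are hypothetical.

```stata
* Assumption: the auto dataset shipped with Stata replaces the course data.
sysuse auto, clear

* New variable from an expression (mpg2 is a hypothetical name).
generate mpg2 = mpg^2

* Optional storage type before the new variable name.
generate double price_k = price/1000

* With if, observations failing the condition are set to missing.
generate mpg2_foreign = mpg^2 if foreign == 1
count if missing(mpg2_foreign)
```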
Basic commands to create and
change variables, labels etc.
replace
- It changes the contents of an existing variable. Because replace alters data, the command cannot be abbreviated.
- Syntax:
  replace oldvar =exp [if] [in] [, nopromote]
- Examples:
  replace agehead3=0 if region==2
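A common use of replace is to fill in the observations that a conditional generate left missing. This sketch assumes the bundled auto dataset in place of the course data, and mpg2_foreign is a hypothetical variable name.

```stata
* Assumption: the auto dataset shipped with Stata replaces the course data.
sysuse auto, clear

* Defined only for foreign cars; domestic cars are missing.
generate mpg2_foreign = mpg^2 if foreign == 1

* Fill in the remaining observations.
replace mpg2_foreign = 0 if foreign == 0

* No missing values should remain.
count if missing(mpg2_foreign)
```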
Basic commands to create and
change variables, labels etc.
egen
- It creates newvar of the optionally specified storage type equal to fcn(arguments). Here fcn() is a function specifically written for egen.
- Syntax:
  egen [type] newvar = fcn(arguments) [if] [in] [, options]
Basic commands to create and
change variables, labels etc.

Examples:
  egen age4 = mean(agehead)
  egen test = median(weight-d_bank)
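What the argument means depends on the egen function: mean() and median() take an expression, so a hyphen inside them is a subtraction, whereas row functions such as rowmean() take a variable list, where a hyphen denotes a range of variables. The sketch below illustrates both on the bundled auto dataset (an assumption); the new variable names are hypothetical.

```stata
* Assumption: the auto dataset shipped with Stata replaces the course data.
sysuse auto, clear

* Overall mean, repeated in every observation.
egen mean_mpg = mean(mpg)

* Median within each group.
bysort foreign: egen med_price = median(price)

* median() takes an expression: this is the median of weight minus length.
egen med_diff = median(weight - length)

* rowmean() takes a varlist: this averages weight and length within each row.
egen row_avg = rowmean(weight-length)
```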
To be continued. …
END
Introduction to STATA
Please perform EXERCISE 2