PANEL DATA - Uniwersytet Warszawski


PANEL DATA

Development Workshop

What are we going to do today?

1. Panels – introduction and data properties

2. How to measure distance

3. What comes first: trade or GDP?

4. What else affects trade?

5. Role of currency?

Why panel data?

What is the point of panel data?

• pooled data in econometrics
• panels in econometrics
• long or wide?
• fixed or random effects?

Gravity model

• All that theory is cool, but transport costs matter and market size matters: => push and pull
– Isard (1954)
– logs by Tinbergen (1962) [what if there were no barriers? "missing trade"]
– Linnemann (1966) [standard macro approach]
– Anderson (1979) [first theoretical model – expenditure-based]
– Helpman-Krugman (1985) [intra-industry trade]
– Bergstrand (1985) [general equilibrium, one country/one factor]
– Bergstrand (1989) [H-O model with the Linder hypothesis]
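For orientation, a common textbook form of the log-linear gravity equation is sketched below. The symbols are assumptions introduced here, not taken from the slides: T is bilateral trade between countries i and j in period t, Y is GDP, and D is distance. The regressions in this workshop appear to use levels and a GDP sum instead, so this is a reference point rather than the estimated specification.

```latex
\ln T_{ijt} = \beta_0 + \beta_1 \ln Y_{it} + \beta_2 \ln Y_{jt}
            + \beta_3 \ln D_{ij} + \varepsilon_{ijt}
```

Under the gravity intuition one would expect the market-size coefficients \beta_1 and \beta_2 to be positive and the distance coefficient \beta_3 to be negative.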

Simplest model

• Variables:
– Explained: bilateral trade
– Explanatory: GDP, populations, distance

reg trade gdp pop dist

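      Source |       SS       df       MS              Number of obs =    1074
-------------+------------------------------           F(  3,  1070) =  543.02
       Model |  196764.006     3  65588.0021           Prob > F      =  0.0000
    Residual |  129238.275  1070  120.783434           R-squared     =  0.6036
-------------+------------------------------           Adj R-squared =  0.6025
       Total |  326002.281  1073  303.823188           Root MSE      =   10.99

------------------------------------------------------------------------------
 tradevolume |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      gdpsum |   .0141613   .0011921    11.88   0.000     .0118221    .0165004
population~m |   .0528096   .0228549     2.31   0.021     .0079642     .097655
    distance |  -.0073704   .0005152   -14.31   0.000    -.0083813   -.0063594
       _cons |   5.762674   1.067794     5.40   0.000     3.667467    7.857882
------------------------------------------------------------------------------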

Panel data

• Same data, same question, but "sth" consists of groups over time
• Stata learns that in one of three ways:

1. Set of commands: iis grouping_var, then tis time_var

2. xtset grouping_var time_var

3. tsset grouping_var time_var

(they are all equivalent)

• Once the data are set as a panel: xtsum vs sum
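A minimal sketch of the declaration step, assuming the workshop dataset is already in memory. The group variable `id` and the variable names `tradevolume`, `gdpsum` and `distance` appear in the regression output in these slides; `year` is a hypothetical name for the time variable.

```stata
* Declare the panel: id = group variable, year = time variable (assumed name)
xtset id year

* Describe the pattern of the panel (groups, periods, gaps)
xtdescribe

* Ordinary summary statistics: one mean and standard deviation per variable
summarize tradevolume gdpsum distance

* Panel summary: splits the variation into overall, between and within parts
xtsum tradevolume gdpsum distance
```

Comparing the two summaries shows how much of each variable's variation comes from differences across groups and how much from changes over time within a group.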

Panel regression

• Do not forget the menus in Stata
• To find out how to do panel regressions in Stata: Statistics => Longitudinal/panel data
– Many options already covered: xtset, sum, des, tab (check 'em out)
– Also: linear models

Simplest code

xtreg trade pop gdp dist

Panel results

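Random-effects GLS regression                   Number of obs      =      1074
Group variable: id                              Number of groups   =        91

R-sq:  within  = 0.4879                         Obs per group: min =         6
       between = 0.6091                                        avg =      11.8
       overall = 0.5995                                        max =        12

Random effects u_i ~ Gaussian                   Wald chi2(3)       =   1070.28
corr(u_i, X)       = 0 (assumed)                Prob > chi2        =    0.0000

------------------------------------------------------------------------------
 tradevolume |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      gdpsum |   .0187795   .0006722    27.94   0.000      .017462    .0200969
population~m |  -.0098166   .0375135    -0.26   0.794    -.0833418    .0637085
    distance |  -.0068902   .0017132    -4.02   0.000     -.010248   -.0035324
       _cons |   4.429218    3.53079     1.25   0.210    -2.491003    11.34944
-------------+----------------------------------------------------------------
     sigma_u |  10.536556
     sigma_e |  3.3908988
         rho |  .90615037   (fraction of variance due to u_i)
------------------------------------------------------------------------------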
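The reported rho can be checked by hand from sigma_u and sigma_e in the output above. The worked line below only uses those two reported numbers:

```latex
\rho = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}
     = \frac{10.536556^2}{10.536556^2 + 3.3908988^2}
     \approx \frac{111.019}{111.019 + 11.498}
     \approx 0.906
```

So roughly 90% of the unexplained variance is attributed to the group-specific term u_i rather than to the idiosyncratic error.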

How do we know if it makes sense?

• Different from pooled estimator?
– What if we add country effects to the pooled estimation? Let's try

areg trade pop gdp dist, absorb(grouping_var)

• Some we know from the literature and some from experience
– Linear or in logs?
– Maybe also non-linear terms and interactions, trade or export share, etc.

• Should we do fixed or random effects?
– Are we interested in differences across time or across countries?
– Between and within R2 tell a different story, no?
– What do our models say?

xttest0

        tradevolume[id,t] = Xb + u[id] + e[id,t]

        Estimated results:
                         |       Var     sd = sqrt(Var)
                ---------+-----------------------------
               tradevo~e |   303.8232       17.43052
                       e |   11.49819       3.390899
                       u |    111.019       10.53656

        Test:   Var(u) = 0
                              chi2(1) =  4793.89
                          Prob > chi2 =   0.0000
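One way the fixed-versus-random question is often examined is a Hausman test alongside the Breusch-Pagan test shown above. The slides do not name the Hausman test, so treat the following as a hedged sketch rather than the workshop's own procedure. It assumes the workshop dataset is in memory and already declared with xtset, and it leaves out the population variable because its full name is abbreviated in the output.

```stata
* Fixed effects: uses only within-group (over-time) variation.
* If distance is constant within each id, Stata omits it from this model.
xtreg tradevolume gdpsum distance, fe
estimates store fe

* Random effects: assumes u_i is uncorrelated with the regressors
xtreg tradevolume gdpsum distance, re
estimates store re

* Breusch-Pagan LM test of Var(u) = 0 (random effects vs pooled OLS);
* it must follow the random-effects estimation
xttest0

* Hausman test: a small p-value speaks against the random-effects assumption
hausman fe re
```

Rejecting Var(u) = 0 only says that pooled OLS ignores group effects; it does not by itself decide between the fixed- and random-effects estimators.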

Huge problem - endogeneity

• What comes first:
– do the rich trade more, or are they rich because they trade more?
– how to get around this problem?

• What is it that we want?
– Cross-country differences?
– Time evolution within one country?
– Test theory?

What do you find in the do-file?

1. Declare panel, run simplest models, do graphs, etc.

2. Run diagnostics

3. Learn more