Transcript Slide 1

Belarusian Economic Research and Outreach Center Econometrics I

February 2014

# 2.

Advanced topics in OLS regression

Instructor: Maksym Obrizan Lecture notes III

# 3. Working with natural logs

Suppose that we regress log(salary) of CEOs on log(sales) of their firms.

# 4.

Quadratic functions are also used quite often in applied economics to capture decreasing or increasing marginal effects. For example, consider how wage y depends on experience x. When interpreting the effect we need 2 terms.
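The two terms needed to interpret a quadratic can be sketched in a few lines. For y = b0 + b1·x + b2·x², the approximate effect of one more unit of x is b1 + 2·b2·x, so it changes with the level of x. The coefficients `B1` and `B2` below are assumed, illustrative values chosen to give a first-year effect of roughly 30¢, in line with the figure quoted on a later slide; the estimated equation itself is not in the transcript.

```python
def marginal_effect(b1: float, b2: float, x: float) -> float:
    """Approximate effect on y of one more unit of x when y = b0 + b1*x + b2*x**2."""
    return b1 + 2.0 * b2 * x


def exact_change(b1: float, b2: float, x_from: float, x_to: float) -> float:
    """Exact predicted change in y when x moves from x_from to x_to."""
    return b1 * (x_to - x_from) + b2 * (x_to**2 - x_from**2)


# Assumed, illustrative coefficients (not taken from the slides).
B1, B2 = 0.298, -0.0061

first_year = exact_change(B1, B2, 0, 1)      # predicted effect of the first year
slope_at_10 = marginal_effect(B1, B2, 10)    # approximate effect of the 11th year
turning_point = -B1 / (2.0 * B2)             # level of x at which the effect reaches zero

print(f"first year: {first_year:.3f}")
print(f"slope at 10 years: {slope_at_10:.3f}")
print(f"turning point: {turning_point:.1f} years")
```

With a negative `B2` the slope shrinks as x grows, which is what diminishing marginal effects mean; past the turning point the predicted effect would become negative.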

# 5.

Consider the effects of experience on wage

# 6.

The first year of experience brings about 30¢. In going from 10 to 11 years of experience, wage is predicted to increase by a smaller amount. Thus, exper has diminishing returns on wage.

# 7.

Sometimes the partial effect of one explanatory variable on the dependent variable depends on the magnitude of another explanatory variable

# 8.

R-squared can never fall when a new independent variable is added to a regression. Thus, adjusted R-squared is often used because it imposes a penalty for adding additional independent variables.
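The penalty in adjusted R-squared can be made concrete with the usual formula, 1 − (1 − R²)(n − 1)/(n − k − 1), where n is the number of observations and k the number of slope coefficients. The sketch below uses hypothetical numbers, not results from the lecture.

```python
def adjusted_r_squared(r2: float, n: int, k: int) -> float:
    """Adjusted R-squared for n observations and k slope coefficients
    (the intercept is not counted in k)."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than estimated parameters")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Hypothetical numbers: adding a regressor raises R-squared slightly,
# yet adjusted R-squared falls because of the degrees-of-freedom penalty.
small_model = adjusted_r_squared(r2=0.300, n=50, k=3)
large_model = adjusted_r_squared(r2=0.301, n=50, k=4)
print(small_model, large_model)
```

Here the extra variable adds too little explanatory power to offset the lost degree of freedom, so the adjusted measure prefers the smaller model.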

# 9.

Recall the F test for joint significance of a group of variables. But what if the models are nonnested, that is, neither equation is a special case of the other?

# 10.

Some variables take only 2 values (male or female); they are called binary variables. How do we incorporate binary variables into regression models? For example,

# 11. Example

If we compare a woman and a man with the same levels of education, experience, and tenure, the woman earns, on average, $1.81 less per hour than the man.

# 12.

Suppose we estimate a model that allows for wage differences among four groups: married men, married women, single men, and single women. How many dummies can we include (if we also have an intercept)?
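The answer to the four-group question is three: with an intercept, one group must serve as the base category. A minimal sketch of why, using a hypothetical list of group labels (the names `GROUPS`, `make_dummies` and the sample are illustrative, not from the lecture):

```python
GROUPS = ["married_men", "married_women", "single_men", "single_women"]


def make_dummies(labels, base):
    """Return one 0/1 column per group except the base group."""
    return {
        g: [1 if label == g else 0 for label in labels]
        for g in GROUPS
        if g != base
    }


# Hypothetical sample of group labels.
sample = ["single_men", "married_women", "married_men", "single_women", "married_men"]

# Three dummies; single men are the base group absorbed by the intercept.
dummies = make_dummies(sample, base="single_men")

# With all four dummies, every row sums to 1, which duplicates the intercept column.
all_four = {g: [1 if label == g else 0 for label in sample] for g in GROUPS}
row_sums = [sum(col[i] for col in all_four.values()) for i in range(len(sample))]

print(sorted(dummies))
print(row_sums)
```

Because the four dummies always add up to the constant, including all of them alongside an intercept gives perfect collinearity. Each included dummy's coefficient is then read as a difference relative to the base group.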

# 13. The linear probability model (LPM)

Sometimes the dependent variable is also binary (employed or unemployed)

# 14. Probability of “being in labor force”

The linear probability model (LPM) estimates the response probability as linear in the parameters. For example, 10 more years of education increases the probability of being in the labor force by 0.038(10) = 0.38.

# 15. Heteroskedasticity

In the presence of heteroskedasticity many OLS test statistics are no longer valid. Heteroskedasticity-robust procedures are used in this case.

# 16. Example (robust standard errors are in parentheses)
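To show what a robust standard error changes, here is a sketch for the simplest case, a regression of y on one x. It computes the conventional standard error of the slope and the heteroskedasticity-robust (White, HC0) version. The data are hypothetical and the function name `ols_with_se` is illustrative; statistical packages often apply additional small-sample corrections that are not shown here.

```python
import math


def ols_with_se(x, y):
    """Simple regression of y on x.

    Returns (slope, conventional_se, robust_se), where robust_se is the
    heteroskedasticity-robust (White, HC0) standard error of the slope.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]
    sst_x = sum(d * d for d in dx)

    slope = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sst_x
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

    # Conventional formula: assumes the error variance is the same for all x.
    sigma2 = sum(u * u for u in resid) / (n - 2)
    conventional_se = math.sqrt(sigma2 / sst_x)

    # Robust formula: each squared residual is weighted by its own (x - x_bar)^2.
    robust_se = math.sqrt(sum(d * d * u * u for d, u in zip(dx, resid))) / sst_x
    return slope, conventional_se, robust_se


# Hypothetical data in which the spread of y grows with x.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.1, 1.9, 3.3, 3.6, 5.8, 5.1, 8.4, 6.6]
slope, se_conv, se_robust = ols_with_se(x, y)
print(slope, se_conv, se_robust)
```

The slope estimate is the same under both formulas; only the standard error, and therefore the t statistic and confidence interval, differs.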

# 17. How to test for heteroskedasticity

The Breusch-Pagan test for heteroskedasticity (BP test)

# 18.

Economists are often interested in policy analysis. Does job training improve chances of becoming employed?

Will the construction of an incinerator affect house prices?

# 19.

Quote from Wooldridge: “Kiel and McClain (1995) studied the effect that a new garbage incinerator had on housing values in North Andover, Massachusetts.

The rumor that a new incinerator would be built in North Andover began after 1978, and construction began in 1981. We will use data on prices of houses that sold in 1978 and another sample on those that sold in 1981.”

# 20.

A naïve analysis would be to use data for 1981, where nearinc is a dummy (= 1 if a house is near the incinerator, 0 otherwise)

# 21.

However, the data for 1978 (prior to rumors about construction) shows

# 22.

Did building of a new incinerator depress housing values?

The key is how the coefficient on nearinc changed between 1978 and 1981.

Use the difference-in-differences estimator using the data pooled over both years

# 23. Interpreting the results

# 24.

The parameter we are interested in is the coefficient on the interaction term y81·nearinc
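The difference-in-differences idea can be sketched directly from four group means: the change over time for houses near the site minus the change over time for houses far from it. In a pooled regression of price on an intercept, y81, nearinc and y81·nearinc with no other controls, the OLS coefficient on the interaction equals this same quantity. The prices below are hypothetical, not the Kiel and McClain data; `after` plays the role of y81 and `treated` the role of nearinc.

```python
def mean(values):
    return sum(values) / len(values)


def diff_in_diff(rows):
    """rows: iterable of (price, after, treated) with after/treated in {0, 1}.

    Returns (change for treated) - (change for control).
    """
    cells = {(a, t): [] for a in (0, 1) for t in (0, 1)}
    for price, after, treated in rows:
        cells[(after, treated)].append(price)

    change_treated = mean(cells[(1, 1)]) - mean(cells[(0, 1)])
    change_control = mean(cells[(1, 0)]) - mean(cells[(0, 0)])
    return change_treated - change_control


# Hypothetical prices (in $1000s), NOT the Kiel and McClain data.
rows = [
    (80, 0, 0), (84, 0, 0),    # 1978, far from the site
    (62, 0, 1), (66, 0, 1),    # 1978, near the site
    (100, 1, 0), (104, 1, 0),  # 1981, far from the site
    (70, 1, 1), (74, 1, 1),    # 1981, near the site
]
print(diff_in_diff(rows))
```

In this made-up sample houses near the site were already cheaper in 1978, which is why comparing the two groups in 1981 alone would overstate the effect. Differencing out the 1978 gap removes that pre-existing difference.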

# 25. Omitted variable bias

Suppose that the true relationship includes ability, but ability is not observed, so we estimate the model without it

# 26.

Thus, the estimator of w will be biased (its expected value is not equal to the true one) and inconsistent (not converging to the true one as the sample size increases)

# 27. Instrumental variables (IV)

Suppose that education is correlated with the error term u (because u contains ability). In addition, let z be a variable that is uncorrelated with u but correlated with x
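A small simulation can illustrate both the omitted variable problem and the IV idea. With one regressor x and one instrument z, the simple IV estimator is cov(z, y) / cov(z, x). Everything below is simulated under assumed parameters (a true slope of 1.0, standard normal draws); the variable names are illustrative and none of it comes from the lecture's data.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    a_bar = sum(a) / len(a)
    b_bar = sum(b) / len(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / (len(a) - 1)


def ols_slope(x, y):
    """OLS slope from a simple regression of y on x."""
    return cov(x, y) / cov(x, x)


def iv_slope(z, x, y):
    """Simple IV estimator with one instrument z for one regressor x."""
    return cov(z, y) / cov(z, x)


# Simulated data under assumed parameters; the true slope on x is 1.0.
random.seed(0)
n = 5000
TRUE_SLOPE = 1.0
z = [random.gauss(0, 1) for _ in range(n)]          # instrument
ability = [random.gauss(0, 1) for _ in range(n)]    # unobserved
x = [zi + ai + random.gauss(0, 1) for zi, ai in zip(z, ability)]
u = [ai + random.gauss(0, 1) for ai in ability]     # error term contains ability
y = [TRUE_SLOPE * xi + ui for xi, ui in zip(x, u)]

print("OLS:", ols_slope(x, y))
print("IV: ", iv_slope(z, x, y))
```

Because ability enters both x and the error term, OLS overstates the slope in this design, while the IV estimate stays close to the true value. The price is a noisier estimate: IV uses only the part of the variation in x that is driven by z.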

# 28. Stata example

Use the data on married working women in MROZ.RAW to estimate the return to education. OLS results first

# 29. IV estimation

Suppose that father's education is a good instrument for educ

# 30.

# 31. Criticisms of IV estimation

Observe that the OLS estimate is included in the 95% interval for the IV estimate. Thus, the difference is not statistically significant.

# 32. Multicollinearity

Example of perfect collinearity: constant + female + male

Example of multicollinearity
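One common way to quantify how close two regressors are to collinear is the variance inflation factor (VIF). With exactly two regressors it reduces to 1 / (1 − r²), where r is their sample correlation. This is a sketch of a standard diagnostic, not necessarily the one used in the lecture, and the data are hypothetical.

```python
import math


def correlation(a, b):
    """Pearson correlation of two equal-length lists."""
    a_bar = sum(a) / len(a)
    b_bar = sum(b) / len(b)
    sab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    saa = sum((ai - a_bar) ** 2 for ai in a)
    sbb = sum((bi - b_bar) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)


def vif_two_regressors(x1, x2):
    """Variance inflation factor when the model has exactly two regressors."""
    r2 = correlation(x1, x2) ** 2
    # Perfect collinearity: the variance is unbounded.
    return math.inf if r2 >= 1.0 else 1.0 / (1.0 - r2)


# Hypothetical regressors that move almost together.
x1 = [1, 2, 3, 4, 5]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1]
print(vif_two_regressors(x1, x2))
```

A VIF of 1 means the regressors are uncorrelated. Large values mean the variance of each slope estimator is inflated by that factor relative to the uncorrelated case.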

# 33. Consequences of multicollinearity

OLS estimators are still BLUE but would have large variances and covariances

# 34. What to do in the case of multicollinearity?

Sometimes no choice (data deficiency), so do nothing

Detecting multicollinearity

# 35. Micronumerosity: the problem of small sample size

This is a related problem to multicollinearity

# 36. Including irrelevant variables in the OLS regression

Including an irrelevant variable will not lead to bias in the intercept and other slope estimators