# 1.
Belarusian Economic Research and Outreach Center Econometrics I
February 2014
# 2.
Advanced topics in OLS regression
Instructor: Maksym Obrizan Lecture notes III
# 3. Working with natural logs
Suppose that we regress log(salary) of CEOs on log(sales) of their firms.
# 4. Quadratic functions are also used quite often in applied economics to capture decreasing or increasing marginal effects. For example, consider how wage y depends on experience x. When interpreting the effect, we need two terms: the coefficient on x and the coefficient on x².
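The two-term interpretation can be sketched numerically. For a model of the form y = b0 + b1·x + b2·x² + u, the approximate effect of one more unit of x is b1 + 2·b2·x, so it changes with the level of x. The names `marginal_effect` and `turning_point` and the coefficient values below are illustrative assumptions, not estimates taken from the slides.

```python
def marginal_effect(b1: float, b2: float, x: float) -> float:
    """Approximate change in y for a one-unit increase in x, evaluated at x,
    in the model y = b0 + b1*x + b2*x**2 + u (derivative approximation)."""
    return b1 + 2.0 * b2 * x


def turning_point(b1: float, b2: float) -> float:
    """Level of x at which the marginal effect equals zero."""
    return -b1 / (2.0 * b2)


# Assumed, illustrative coefficients (NOT the estimates from the lecture).
B1, B2 = 0.30, -0.006

effect_at_0 = marginal_effect(B1, B2, 0)    # effect near zero experience
effect_at_10 = marginal_effect(B1, B2, 10)  # effect at ten years of experience
peak = turning_point(B1, B2)                # experience level where the effect is zero

print(f"effect at 0 years:  {effect_at_0:.3f}")
print(f"effect at 10 years: {effect_at_10:.3f}")
print(f"turning point:      {peak:.1f} years")
```

With a negative coefficient on the squared term the effect shrinks as x grows, which is the diminishing-returns pattern.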
# 5. Consider the effects of experience on wage
# 6. The first year of experience brings about 30¢. In going from 10 to 11 years of experience, wage is predicted to increase by less than that. Thus, exper has diminishing returns on wage.
# 7. Sometimes the partial effect of an explanatory variable on the dependent variable depends on the magnitude of another explanatory variable.
# 8. R-squared can never fall when a new independent variable is added to a regression. Thus, adjusted R-squared is often used, because it imposes a penalty for adding additional independent variables.
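The penalty can be made concrete with the usual textbook formula, adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1), where n is the sample size and k the number of slope regressors. The function name and the numbers below are hypothetical; they only show that a tiny gain in R-squared can still lower the adjusted measure.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R-squared for a model with an intercept and k slope regressors."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than estimated parameters")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Hypothetical case: a fourth regressor raises R-squared only from 0.300 to 0.301.
before = adjusted_r2(0.300, n=50, k=3)
after = adjusted_r2(0.301, n=50, k=4)

print(f"adjusted R2 with 3 regressors: {before:.4f}")
print(f"adjusted R2 with 4 regressors: {after:.4f}")
```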
# 9. Recall the F test for joint significance of a group of variables. But what if the models are nonnested, that is, neither equation is a special case of the other?
# 10. Some variables take only two values (male or female); they are called binary. How do we incorporate binary variables into regression models? For example,
# 11. Example
If we compare a woman and a man with the same levels of education, experience, and tenure, the woman earns, on average, $1.81 less per hour than the man.
# 12. Suppose we estimate a model that allows for wage differences among four groups: married men, married women, single men, and single women. How many dummies can we include (if we also have an intercept)?
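A small sketch of the four-group case, under made-up data: with an intercept, only three of the four group dummies can be included, because the four dummies always sum to one (the intercept column). In a regression on just an intercept and three dummies, OLS reproduces the group means, so the coefficients can be computed directly as differences from the base group. The wage numbers and the choice of single men as the base group are assumptions for illustration.

```python
from statistics import mean

# Hypothetical hourly wages by group (made-up numbers).
wages = {
    "single_men": [5.0, 6.0, 5.5],
    "married_men": [7.0, 8.0, 7.5],
    "married_women": [4.5, 5.0, 5.5],
    "single_women": [4.0, 5.0, 4.5],
}
BASE = "single_men"  # assumed base (omitted) group
groups = list(wages)

# One row of four dummies per observation.
rows = [[int(g == h) for h in groups] for g in groups for _ in wages[g]]

# Dummy variable trap: the four dummies sum to 1 in every row,
# which duplicates the intercept column exactly.
trap = all(sum(row) == 1 for row in rows)

# With an intercept and three dummies, the intercept is the base-group mean
# and each dummy coefficient is that group's mean minus the base-group mean.
intercept = mean(wages[BASE])
coefs = {g: mean(w) - intercept for g, w in wages.items() if g != BASE}

print("four dummies duplicate the intercept:", trap)
print("intercept (base group mean):", intercept)
for group, coef in coefs.items():
    print(f"{group}: {coef:+.2f} relative to {BASE}")
```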
# 13. The linear probability model (LPM)
Sometimes the dependent variable is also binary (employed or unemployed).
# 14. Probability of “being in labor force”
The linear probability model (LPM) estimates the response probability as linear in the parameters. For example, 10 more years of education increases the probability of being in the labor force by 0.038(10) = 0.38.
# 15. Heteroskedasticity
In the presence of heteroskedasticity, many OLS test statistics are no longer valid. Heteroskedasticity-robust procedures are used in this case.
# 16. Example (robust standard errors are in parentheses)
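To show what a heteroskedasticity-robust standard error computes, here is a minimal sketch for a regression with a single regressor, using the White (HC0) formula without small-sample corrections. The function name and the simulated data are assumptions; statistical packages typically apply degrees-of-freedom adjustments, so their output may differ slightly.

```python
import math
import random


def simple_ols_with_robust_se(x, y):
    """Slope of y on x (with intercept), its usual SE, and its HC0 robust SE."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Usual formula: assumes the same error variance for every observation.
    usual_se = math.sqrt(sum(e * e for e in resid) / (n - 2) / sxx)
    # White (HC0) formula: weights each squared residual by (x_i - x_bar)^2.
    robust_se = math.sqrt(sum((d * e) ** 2 for d, e in zip(dx, resid))) / sxx
    return slope, usual_se, robust_se


# Simulated data (hypothetical): the error spread grows with x.
random.seed(0)
x = [random.uniform(1, 10) for _ in range(2000)]
y = [1.0 + 0.5 * xi + random.gauss(0, 0.3 * xi) for xi in x]

slope, usual_se, robust_se = simple_ols_with_robust_se(x, y)
print(f"slope = {slope:.3f}, usual SE = {usual_se:.4f}, robust SE = {robust_se:.4f}")
```

The slope estimate is the same under both formulas; only the standard error, and therefore the t statistic, changes.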
# 17. How to test for heteroskedasticity
The Breusch-Pagan test for heteroskedasticity (BP test)
# 18. Economists are often interested in policy analysis
Does job training improve the chances of becoming employed?
Will the construction of an incinerator affect house prices?
# 19. Quote from Wooldridge: “Kiel and McClain (1995) studied the effect that a new garbage incinerator had on housing values in North Andover, Massachusetts. The rumor that a new incinerator would be built in North Andover began after 1978, and construction began in 1981. We will use data on prices of houses that sold in 1978 and another sample on those that sold in 1981.”
# 20. A naïve analysis would be to use the data for 1981, where nearinc is a dummy (= 1 if a house is near the incinerator, 0 otherwise)
# 21. However, the data for 1978 (prior to rumors about construction) shows
# 22. Did building of a new incinerator depress housing values?
The key is to compare how the coefficient on nearinc changed between 1978 and 1981.
Use the difference-in-differences estimator with the data pooled over both years.
# 23. Interpreting the results
# 24. The parameter we are interested in is the coefficient on the interaction term y81·nearinc
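The logic of the interaction term can be sketched with group means. In a pooled regression of price on y81, nearinc, and y81·nearinc with no other controls, the four coefficients are exact functions of the four cell means, and the interaction coefficient equals the difference-in-differences. The prices below are made-up numbers, and the coefficient names `b0`, `d0`, `b1`, `d1` are assumed notation, not the lecture's.

```python
from statistics import mean

# Hypothetical house prices in $1000s, keyed by (y81, nearinc). Made-up data.
prices = {
    (0, 0): [80.0, 84.0, 82.0],     # sold in 1978, far from the site
    (0, 1): [62.0, 64.0, 63.0],     # sold in 1978, near the site
    (1, 0): [100.0, 102.0, 101.0],  # sold in 1981, far from the site
    (1, 1): [70.0, 72.0, 71.0],     # sold in 1981, near the site
}
m = {cell: mean(values) for cell, values in prices.items()}

# price = b0 + d0*y81 + b1*nearinc + d1*y81*nearinc + u
b0 = m[(0, 0)]                                         # 1978, far: baseline
d0 = m[(1, 0)] - m[(0, 0)]                             # change over time for far houses
b1 = m[(0, 1)] - m[(0, 0)]                             # near/far gap already present in 1978
d1 = (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)]) # difference-in-differences

naive_1981_gap = m[(1, 1)] - m[(1, 0)]  # what a 1981-only regression would report

print(f"naive 1981 gap:            {naive_1981_gap:+.1f}")
print(f"pre-existing 1978 gap:     {b1:+.1f}")
print(f"difference-in-differences: {d1:+.1f}")
```

In this made-up example most of the 1981 gap was already there in 1978, so the naive estimate overstates the effect.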
# 25. Omitted variable bias
Suppose that the true relationship is but ability is not observed so we estimate
# 26. Thus, the estimator of w will be biased (not equal to the correct one) and inconsistent (not converging to the true one as the sample size increases)
# 27. Instrumental variables (IV)
Suppose that education (x) is correlated with the error term u (because u contains ability). In addition, let z be a variable that is uncorrelated with u but correlated with x.
# 28. Stata example
Use the data on married working women in MROZ.RAW to estimate the return to education. OLS results first
# 29. IV estimation
Suppose that father's education is a good instrument for educ
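The idea can be illustrated with a simulation rather than the MROZ data. With one regressor and one instrument, the IV estimator is cov(z, y) / cov(z, x), while OLS is cov(x, y) / var(x). Everything below is an assumption made for the sketch: the data-generating process, the true return of 0.10, and the variable names.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / (n - 1)


def ols_slope(x, y):
    return cov(x, y) / cov(x, x)


def iv_slope(z, x, y):
    """Simple IV estimator with a single instrument z for regressor x."""
    return cov(z, y) / cov(z, x)


# Simulated (hypothetical) data: ability is unobserved and raises both
# education and wages; father's education affects wages only through educ.
random.seed(1)
N = 20000
TRUE_RETURN = 0.10
ability = [random.gauss(0, 1) for _ in range(N)]
father_educ = [random.gauss(0, 1) for _ in range(N)]
educ = [f + a + random.gauss(0, 1) for f, a in zip(father_educ, ability)]
log_wage = [TRUE_RETURN * e + 0.5 * a + random.gauss(0, 1)
            for e, a in zip(educ, ability)]

ols = ols_slope(educ, log_wage)
iv = iv_slope(father_educ, educ, log_wage)
print(f"true return = {TRUE_RETURN}, OLS = {ols:.3f}, IV = {iv:.3f}")
```

In this setup OLS is pulled upward by the omitted ability term, while the IV estimate should be close to the true value, though with a larger sampling variance.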
# 30.
# 31. Criticisms of IV estimation
Observe that the OLS estimate lies within the 95% confidence interval for the IV estimate. Thus, the difference is not statistically significant.
# 32. Multicollinearity
Example of perfect collinearity: constant + female + male
Example of multicollinearity
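Both cases can be checked numerically. The sketch below first confirms that female + male reproduces the constant, then computes a variance inflation factor (VIF) for the two-regressor case, where VIF = 1/(1 − r²) and r is the correlation between the regressors. The VIF is a standard diagnostic brought in here as an assumption, not necessarily the one used in the lecture, and all data are made up.

```python
import math


def correlation(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    sab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    saa = sum((ai - a_bar) ** 2 for ai in a)
    sbb = sum((bi - b_bar) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)


def vif_two_regressors(x1, x2):
    """VIF when the model has exactly two regressors plus an intercept."""
    r = correlation(x1, x2)
    return 1.0 / (1.0 - r * r)


# Perfect collinearity: the two dummies add up to the constant in every row.
female = [1, 0, 1, 0, 0]
male = [0, 1, 0, 1, 1]
perfectly_collinear = all(f + m == 1 for f, m in zip(female, male))

# Near collinearity (hypothetical data): x2 is almost a copy of x1.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1]
x3 = [2.0, 1.0, 3.0, 1.0, 2.0]  # uncorrelated with x1 in this sample

vif_high = vif_two_regressors(x1, x2)
vif_low = vif_two_regressors(x1, x3)
print("female + male equals the constant:", perfectly_collinear)
print(f"VIF for (x1, x2): {vif_high:.1f}")
print(f"VIF for (x1, x3): {vif_low:.1f}")
```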
# 33. Consequences of multicollinearity
OLS estimators are still BLUE but would have large variances and covariances.
# 34. What to do in the case of multicollinearity?
Sometimes there is no choice (data deficiency), so do nothing.
Detecting multicollinearity
# 35. Micronumerosity: the problem of small sample size
This is a related problem to multicollinearity.
# 36. Including irrelevant variables in the OLS regression
Including an irrelevant variable will not lead to bias in the intercept and other slope estimators.