Some Useful Econometric Techniques
Selcuk Caner, 7/21/2015

Outline
Descriptive Statistics
Ordinary Least Squares Regression
Tests and Statistics
Violation of Assumptions in OLS Estimation
– Multicollinearity
– Heteroscedasticity
– Autocorrelation
Specification Errors
Forecasting
Unit Roots, Spurious Regressions, and Cointegration

Descriptive Statistics
Useful estimators summarizing the probability distribution of a variable:
– Mean: $\bar{X} = \frac{1}{T}\sum_{i=1}^{T} X_i$
– Standard deviation: $\sigma = \sqrt{\frac{1}{T}\sum_{i=1}^{T} (X_i - \bar{X})^2}$

Descriptive Statistics (Cont.)
– Skewness (symmetry): $S = \frac{1}{T}\sum_{i=1}^{T} \frac{(X_i - \bar{X})^3}{\sigma^3}$
– Kurtosis (tail thickness): $K = \frac{1}{T}\sum_{i=1}^{T} \frac{(X_i - \bar{X})^4}{\sigma^4}$

Ordinary Least Squares (OLS)
Estimation
– Model: $Y_t = \beta_0 + \beta_1 X_{1t} + e_t$
– The OLS requires:
• Linear relationship between Y and X,
• X is nonstochastic,
• $E(e_t) = 0$, $\mathrm{Var}(e_t) = \sigma^2$, and $\mathrm{Cov}(e_t, e_s) = 0$ for $t \neq s$.

Ordinary Least Squares (OLS) (Cont.)
The OLS estimators for $\beta_0$ and $\beta_1$ are found by minimizing the sum of squared errors (SSE):
$\sum_{t=1}^{T} e_t^2 = \sum_{t=1}^{T} (Y_t - \hat{Y}_t)^2 = \sum_{t=1}^{T} (Y_t - \beta_0 - \beta_1 X_t)^2$

[Figure: scatter plot of observations $(X_t, Y_t)$ with the fitted regression line]

Ordinary Least Squares (OLS) (Cont.)
Minimizing the SSE is equivalent to solving the first-order conditions:
$\frac{\partial \sum_{t=1}^{T} e_t^2}{\partial \beta_0} = 0, \qquad \frac{\partial \sum_{t=1}^{T} e_t^2}{\partial \beta_1} = 0$
The estimators are:
$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$
$\hat{\beta}_1 = \frac{\mathrm{Cov}(X, Y)}{\mathrm{Var}(X)} = \frac{\sum_{t=1}^{T} (X_t - \bar{X})(Y_t - \bar{Y})}{\sum_{t=1}^{T} (X_t - \bar{X})^2}$

Ordinary Least Squares (OLS) (Cont.)
Properties of the OLS estimators:
– $\hat{\beta}_0$ and $\hat{\beta}_1$ are unbiased: $E(\hat{\beta}_0) = \beta_0$, $E(\hat{\beta}_1) = \beta_1$
– They are normally distributed: $\hat{\beta}_0 \sim N(\beta_0, \sigma^2_{\beta_0})$, $\hat{\beta}_1 \sim N(\beta_1, \sigma^2_{\beta_1})$
– They are minimum-variance unbiased estimators.

Ordinary Least Squares (OLS) (Cont.)
Multiple regression, in matrix form: $Y = X\beta + e$
– Y = T×1 vector of dependent variables
– X = T×k matrix of independent variables (first column is all ones)
– β = k×1 vector of unknown parameters
– e = T×1 vector of error terms

Ordinary Least Squares (OLS) (Cont.)
Estimator of the multiple regression model:
$\hat{\beta} = (X'X)^{-1} X'Y$
X'X is (up to scaling) the variance-covariance matrix of the components of X.
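The closed-form formulas above (the 1/T sample moments and the OLS slope/intercept) can be sketched numerically. A minimal sketch with made-up data; none of these numbers come from the slides:

```python
import numpy as np

# Illustrative (made-up) sample
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
T = len(X)

# Descriptive statistics (1/T forms, as on the slides)
mean = X.sum() / T
sigma = np.sqrt(((X - mean) ** 2).sum() / T)
skew = ((X - mean) ** 3).sum() / (T * sigma ** 3)
kurt = ((X - mean) ** 4).sum() / (T * sigma ** 4)

# OLS estimators: slope = Cov(X, Y) / Var(X), intercept = Ybar - slope * Xbar
beta1 = ((X - X.mean()) * (Y - Y.mean())).sum() / ((X - X.mean()) ** 2).sum()
beta0 = Y.mean() - beta1 * X.mean()
```

For this evenly spaced X the skewness is exactly zero, and the slope works out to Cov/Var = 19.9/10 = 1.99.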
X'Y is the vector of covariances between X and Y. The estimator is unbiased and normally distributed:
$\hat{\beta} \sim N(\beta, \sigma^2 (X'X)^{-1})$

Example: Private Investment
$FIR_t = b_0 + b_1 RINT_{t-1} + b_2 INFL_{t-1} + b_3 RGDP_{t-1} + b_4 NKFLOW_{t-1} + e_t$
One can run this regression to estimate private fixed investment as:
– A negative function of real interest rates (RINT)
– A negative function of inflation (INFL)
– A positive function of real GDP (RGDP)
– A positive function of net capital flows (NKFLOW)

Regression Statistics and Tests
R² is the measure of goodness of fit:
$R^2 = \frac{SSR}{TSS} = \frac{\sum_{i=1}^{T} (\hat{Y}_i - \bar{Y})^2}{\sum_{i=1}^{T} (Y_i - \bar{Y})^2} = 1 - \frac{SSE}{TSS} = 1 - \frac{\sum_{i=1}^{T} (Y_i - \hat{Y}_i)^2}{\sum_{i=1}^{T} (Y_i - \bar{Y})^2}$
Limitations:
– Depends on the assumption that the model is correctly specified.
– R² is sensitive to the number of independent variables.
– If the intercept is constrained to be equal to zero, then R² may be negative.

Meaning of R²
[Figure: scatter plot of observations around the fitted line $\hat{Y}_t = \hat{\beta}_0 + \hat{\beta}_1 X_{1t}$, decomposing the deviations $Y_t - \bar{Y}$ into explained and unexplained parts]

Regression Statistics and Tests
Adjusted R², to overcome the limitations of R²:
$\bar{R}^2 = 1 - \frac{SSE/(T-k)}{TSS/(T-1)}$
Is $\beta_i$ statistically different from zero? When $e_t$ is normally distributed, use the t-statistic to test the null hypothesis $\beta_i = 0$:
$t(T-k) = \frac{\hat{\beta}_i - \beta_i}{S_{\hat{\beta}_i}}$
– A simple rule: if $|t(T-k)| > 2$, then $\beta_i$ is significant.

Regression Statistics and Tests
Testing the model:
– F-test: the F-statistic with k−1 and T−k degrees of freedom is used to test the null hypothesis $\beta_1 = \beta_2 = \dots = \beta_k = 0$:
$F(k-1, T-k) = \frac{R^2 (T-k)}{(k-1)(1-R^2)}$
– The F-test may allow the null hypothesis $\beta_1 = \beta_2 = \dots = \beta_k = 0$ to be rejected even when none of the coefficients is statistically significant by individual t-tests.

Violations of OLS Assumptions
Multicollinearity
– When two or more explanatory variables are correlated with each other (in the multi-variable case), e.g. $X_{1t}$ and $X_{2t}$ in $Y_t = \beta_0 + \beta_1 X_{1t} + \beta_2 X_{2t} + e_t$
– Result: high standard errors for the parameters and statistically insignificant coefficients.
– Indications:
• Relatively high correlations between one or more explanatory variables.
• High R² with few significant t-statistics. Why?

Violations of OLS Assumptions (Cont.)
$\mathrm{Var}(\hat{\beta}) = \sigma^2 (X'X)^{-1}$: when the explanatory variables are highly correlated, $X'X$ is nearly singular, and the estimated standard errors $\hat{\sigma}_{\hat{\beta}_i}$ become very large.

Violations of OLS Assumptions (Cont.)
Heteroscedasticity: when the error terms do not have a constant variance $\sigma^2$.
– Consequences for the OLS estimators:
• They are unbiased [$E(\hat{\beta}) = \beta$] but not efficient: their variances are not the minimum variance.
– Test: White's heteroscedasticity test. If there are ARCH effects, use GARCH models to account for volatility-clustering effects.

Violations of OLS Assumptions (Cont.)
Autocorrelation: when the error terms from different time periods are correlated [$e_t = f(e_{t-1}, e_{t-2}, \dots)$]:
– Consequences for the OLS estimators:
• They are unbiased [$E(\hat{\beta}) = \beta$] but not efficient.
– Test for serial correlation: Durbin-Watson for first-order serial correlation:
$DW = \frac{\sum_{t=2}^{T} (\hat{e}_t - \hat{e}_{t-1})^2}{\sum_{t=1}^{T} \hat{e}_t^2}$

Violations of OLS Assumptions (Cont.)
Autocorrelation (cont.): the DW statistic is approximately equal to
$DW \approx 2(1 - r_1)$, where $r_1 = \frac{\mathrm{Cov}(\hat{e}_t, \hat{e}_{t-1})}{\mathrm{Var}(\hat{e}_t)}$ and $e_t = r_1 e_{t-1} + u_t$.
Note: if $r_1 = 1$ then DW = 0; if $r_1 = -1$ then DW = 4; for $r_1 = 0$, DW = 2.
Use the Ljung-Box Q test statistic for higher-order correlation.

Specification Errors
Omitted variables:
– True model: $Y_t = \beta_0 + \beta_1 X_{1t} + \beta_2 X_{2t} + e_t$
– Regression model: $Y_t = \beta_0 + \beta_1 X_{1t} + e_t$
– Then the estimator for $\beta_1$ is biased:
$E(\hat{\beta}_1^*) = \beta_1 + \beta_2 \frac{\mathrm{Cov}(X_1, X_2)}{\mathrm{Var}(X_1)}$

Specification Errors (Cont.)
Irrelevant variables:
– True model: $Y_t = \beta_0 + \beta_1 X_{1t} + e_t$
– Regression model: $Y_t = \beta_0^* + \beta_1^* X_{1t} + \beta_2^* X_{2t} + e_t$
– Then the estimator for $\beta_1$ is still unbiased; only efficiency declines, since the variance of $\hat{\beta}_1^*$ will be larger than the variance of $\hat{\beta}_1$.
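The omitted-variable bias formula above can be checked by simulation. A minimal sketch in which all numbers (true coefficients, the X1–X2 relationship, sample size, seed) are made up:

```python
import numpy as np

# True model: Y = b0 + b1*X1 + b2*X2 + e, but we regress Y on X1 alone.
rng = np.random.default_rng(3)
T = 20000
b0, b1, b2 = 1.0, 2.0, 3.0

X1 = rng.normal(size=T)
X2 = 0.5 * X1 + rng.normal(size=T)      # X2 is correlated with X1
Y = b0 + b1 * X1 + b2 * X2 + rng.normal(size=T)

# "Short" regression of Y on X1 only
b1_hat = ((X1 - X1.mean()) * (Y - Y.mean())).sum() / ((X1 - X1.mean()) ** 2).sum()

# Predicted value from the bias formula: b1 + b2 * Cov(X1, X2) / Var(X1)
cov12 = ((X1 - X1.mean()) * (X2 - X2.mean())).sum() / T
var1 = ((X1 - X1.mean()) ** 2).sum() / T
predicted = b1 + b2 * cov12 / var1      # roughly 2 + 3 * 0.5 = 3.5
```

The short-regression slope lands near 3.5 rather than the true 2.0, matching the formula.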
A Naïve Estimation
Estimate the aggregate demand elasticity e using historical consumption data:
– Estimate the regression equation $\ln(QD_t) = a + b \ln(P_t)$; b is an estimate of e.
– Forecast the change in the consumption price, $\Delta P$.
– Estimate the change in demand as $(\Delta QD / QD)^F = e \cdot (\Delta P / P)^F$.

A Regression Result
Dependent variable: log QD
Variable    Coefficient    Std. Error    t-Statistic
C           -8.35          0.431         -19.4
Log CPI     1.295          0.031         42.3
R-squared 0.9895; AIC -1.49; log likelihood 17.68; Schwarz criterion -1.4; DW 0.726; F-statistic 1790.0

Stationarity (ADF Test)
Log CPI:
Variable        Coefficient    Std. Error    t-Statistic    Prob.
ln CPI(-1)      -0.006         0.010         -0.579         0.570
D(ln CPI(-1))   0.670          0.183         3.659          0.002
C               0.128          0.151         0.847          0.410
Log QD:
Variable        Coefficient    Std. Error    t-Statistic    Prob.
ln QD(-1)       -0.024         0.025         -0.924         0.369
D(ln QD(-1))    0.162          0.241         0.674          0.510
C               0.380          0.260         1.460          0.164
Critical values: 1% -3.830; 5% -3.029; 10% -2.655

Error Correction Model (ECM) for Non-Stationarity
One can try a regression in first differences. However, first differences do not use information on levels, and such a regression mixes the long-term relationship with short-term changes. An error correction model (ECM) can separate the long-term and short-term relationships.

Results of ECM
Dependent variable: D(ln QD)
Variable     Coefficient    Std. Error    t-Statistic
C            -5.327         1.426         -3.735
D(ln CPI)    -0.348         0.490         -0.709
ln QD(-1)    -0.697         0.175         -3.985
ln CPI(-1)   0.883          0.225         3.923
R-squared 0.551; AIC -2.307; log likelihood 27.063; Schwarz criterion -2.107; DW 1.946; F-statistic 6.538

Interpretation of the Estimated Regression
$\ln QD_t - \ln QD_{t-1} = -5.327 - 0.348(\ln CPI_t - \ln CPI_{t-1}) - 0.697(\ln QD_{t-1} - 1.267\,\ln CPI_{t-1})$
– Short-run effect: the coefficient on $(\ln CPI_t - \ln CPI_{t-1})$, -0.348.
– Error-correction coefficient: -0.697.
– Long-run effect: the levels relationship $\ln QD = 1.267\,\ln CPI$ (1.267 = 0.883/0.697).

Forecasting
A forecast is:
– A quantitative estimate about the likelihood of future events, developed on the basis of current and past information.
– Some useful definitions:
• Point forecast: predicts a single number for Y in each forecast period.
• Interval forecast: indicates an interval in which the realized value of Y will lie.

Unconditional Forecasting
First estimate the econometric model:
$Y_t = \beta_0 + \beta_1 X_{1t} + e_t, \qquad e_t \sim N(0, \sigma^2)$
Then compute:
$\hat{Y}_{T+1} = \hat{\beta}_0 + \hat{\beta}_1 X_{1,T+1}$
assuming $X_{T+1}$ is known. This is the point forecast.

Unconditional Forecasting (Cont.)
The forecast error is:
$\hat{e}_{T+1} = \hat{Y}_{T+1} - Y_{T+1} = (\hat{\beta}_0 - \beta_0) + (\hat{\beta}_1 - \beta_1) X_{T+1} - e_{T+1}$
The 95% confidence interval for $Y_{T+1}$ is:
$\hat{Y}_{T+1} - t_{0.025}\, s_f \leq Y_{T+1} \leq \hat{Y}_{T+1} + t_{0.025}\, s_f$
where
$s_f^2 = \hat{\sigma}^2 \left[ 1 + \frac{1}{T} + \frac{(X_{T+1} - \bar{X})^2}{\sum_{t=1}^{T} (X_t - \bar{X})^2} \right]$
which provides a good measure of the precision of the forecast.

Conditional Forecasting
If $X_{T+1}$ is not known and needs to be forecasted:
– The stochastic nature of the predicted values for the Xs leads to forecasts that are less reliable.
– The forecasted value of Y at time T+1 is $\hat{Y}_{T+1} = \hat{\beta}_0 + \hat{\beta}_1 \hat{X}_{1,T+1}$

Unit Roots, Spurious Regressions, and Cointegration
Simulate the processes:
$Y_t = 10 + Y_{t-1} + e_t$ where $e_t \sim N(0, 4)$, and
$X_t = 20 + X_{t-1} + u_t$ where $u_t \sim N(0, 9)$.

Unit Roots, Spurious Regressions, and Cointegration (Cont.)
Spurious regressions:
– Granger and Newbold (1974) demonstrated that macroeconomic variable data are trended upwards and that in regressions involving the levels of such data, the standard significance tests are misleading: the conventional t and F tests reject the hypothesis of no relationship even when in fact there is none.
– Symptom: R² > DW is a good rule of thumb for suspecting that the estimated regression is spurious.

Unit Roots, Spurious Regressions, and Cointegration (Cont.)
Unit roots:
– If a variable behaves like $Y_t = Y_{t-1} + e_t$, then its variance grows without bound, since
$Y_t = Y_0 + \sum_{s=1}^{t} e_s$, so $\mathrm{Var}(Y_t) = t\sigma^2$.
– This is a non-stationary variable. E.g., $Y_t = 10 + Y_{t-1} + e_t$ with $e_t \sim N(0, 4)$ would result in a forever-increasing series.

Unit Roots, Spurious Regressions, and Cointegration (Cont.)
The series can be made stationary by taking the first difference of $Y_t$: $\Delta Y_t = Y_t - Y_{t-1} = e_t$. The differenced series has finite variance and is stationary. The original series $Y_t$ is said to be integrated of order one [I(1)].

Unit Roots, Spurious Regressions, and Cointegration (Cont.)
• A trend-stationary variable $Y_t = \alpha + \beta t + e_t$ also has a finite variance. The process $Y_t = \alpha + Y_{t-1} + e_t$ is non-stationary and does not have a finite variance.

Unit Roots, Spurious Regressions, and Cointegration (Cont.)
But the variable $Y_t = \rho Y_{t-1} + e_t$ is stationary and has a finite variance if $|\rho| < 1$. E.g., $Y_t = 0.8 Y_{t-1} + e_t$ where $e_t \sim N(0, 4)$.

Unit Roots, Spurious Regressions, and Cointegration (Cont.)
Tests for unit roots: the Dickey-Fuller test
– Case of I(1):
• Null hypothesis: $Y_t = \alpha + Y_{t-1} + e_t$
• Alternative hypothesis: $Y_t = \alpha + \beta t + e_t$
• Run the regression $\Delta Y_t = \alpha + \beta t + (\rho - 1) Y_{t-1} + e_t$ and test $(\rho - 1) = 0$ by comparing the t-statistic with the MacKinnon critical values for rejection of the hypothesis of a unit root.

Unit Roots, Spurious Regressions, and Cointegration (Cont.)
– Case of a random walk (RW):
• Null hypothesis: $Y_t = Y_{t-1} + e_t$
• Alternative hypothesis: $Y_t = \rho Y_{t-1} + e_t$, $|\rho| < 1$
• Run the regression $\Delta Y_t = (\rho - 1) Y_{t-1} + e_t$ and test $(\rho - 1) = 0$ by comparing the t-statistic with the MacKinnon critical values for rejection of the hypothesis of a unit root.

Unit Roots, Spurious Regressions, and Cointegration (Cont.)
DF tests on macroeconomic variables: most macroeconomic flows and stocks related to the population size, such as output, consumption, or employment, are I(1), while price levels are I(2). E.g., GDP is I(1) while the price level (CPI) is I(2).

Unit Roots, Spurious Regressions, and Cointegration (Cont.)
Cointegration
– If two series $Y_t$ and $X_t$ are both I(1), there may be a $\beta_1$ such that $e_t = Y_t - \beta_1 X_t$ is I(0). The implication is that the two series are drifting upward together at roughly the same rate.
– Two series satisfying this requirement are said to be cointegrated, and the vector $[1, -\beta_1]'$ is a cointegrating vector.
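The Dickey-Fuller regressions above can be run by hand. A minimal sketch of the no-constant (random walk) case; the AR coefficients, sample size, and seed are illustrative, and proper inference would compare against MacKinnon critical values rather than the normal t table:

```python
import numpy as np

rng = np.random.default_rng(2)

def df_tstat(rho, T=500):
    """t-statistic on (rho - 1) in the regression DY_t = (rho - 1) * Y_{t-1} + e_t."""
    e = rng.normal(scale=2.0, size=T)      # e_t ~ N(0, 4)
    Y = np.zeros(T)
    for t in range(1, T):
        Y[t] = rho * Y[t - 1] + e[t]
    dY, lag = Y[1:] - Y[:-1], Y[:-1]
    g = (lag * dY).sum() / (lag ** 2).sum()      # estimate of (rho - 1)
    resid = dY - g * lag
    s2 = (resid ** 2).sum() / (len(dY) - 1)
    return g / np.sqrt(s2 / (lag ** 2).sum())

t_stationary = df_tstat(0.8)   # stationary AR(1): strongly negative t-statistic
t_unit_root = df_tstat(1.0)    # random walk: t-statistic near zero
```

For the stationary series the t-statistic is far below the usual critical values, so the unit-root null is rejected; for the random walk it is not.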
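The spurious-regression symptom discussed earlier (R² > DW) can be reproduced by simulating the two independent drifting random walks defined on the slides; the sample size and seed below are made up:

```python
import numpy as np

# Y_t = 10 + Y_{t-1} + e_t, e_t ~ N(0, 4); X_t = 20 + X_{t-1} + u_t, u_t ~ N(0, 9)
rng = np.random.default_rng(1)
T = 200
Y = np.zeros(T)
X = np.zeros(T)
for t in range(1, T):
    Y[t] = 10 + Y[t - 1] + rng.normal(scale=2.0)  # std 2 -> variance 4
    X[t] = 20 + X[t - 1] + rng.normal(scale=3.0)  # std 3 -> variance 9

# Regress Y on X even though the two series are unrelated
b1 = ((X - X.mean()) * (Y - Y.mean())).sum() / ((X - X.mean()) ** 2).sum()
b0 = Y.mean() - b1 * X.mean()
resid = Y - (b0 + b1 * X)

R2 = 1 - (resid ** 2).sum() / ((Y - Y.mean()) ** 2).sum()
DW = ((resid[1:] - resid[:-1]) ** 2).sum() / (resid ** 2).sum()
```

Both series trend upward, so R² comes out close to one while DW is far below two, exhibiting the R² > DW symptom of a spurious regression.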