
Christopher Dougherty

EC220 - Introduction to econometrics (chapter 4)

Slideshow: nonlinear regression Original citation:

Dougherty, C. (2012) EC220 - Introduction to econometrics (chapter 4). [Teaching Resource] © 2012 The Author

This version available at: http://learningresources.lse.ac.uk/130/
Available in LSE Learning Resources Online: May 2012

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License. This license allows the user to remix, tweak, and build upon the work even for commercial purposes, as long as the user credits the author and licenses their new creations under the identical terms. http://creativecommons.org/licenses/by-sa/3.0/

http://learningresources.lse.ac.uk/

NONLINEAR REGRESSION

Y = β1 + β2 X^β3 + u

Suppose you believe that a variable Y depends on a variable X according to the relationship shown and you wish to obtain estimates of β1, β2, and β3 given data on Y and X.

NONLINEAR REGRESSION

Y = β1 + β2 X^β3 + u

There is no way of transforming the relationship to obtain a linear relationship, and so it is not possible to apply the usual regression procedure.

NONLINEAR REGRESSION

Y = β1 + β2 X^β3 + u

Nevertheless, one can still use the principle of minimizing the sum of the squares of the residuals to obtain estimates of the parameters. We will describe a simple nonlinear regression algorithm that uses the principle. It consists of a series of repeated steps.


NONLINEAR REGRESSION

Y = β1 + β2 X^β3 + u

Nonlinear regression algorithm

1. You start by guessing plausible values for the parameters.

2. You calculate the predicted values of Y from the data on X, using these values of the parameters.

3. You calculate the residual for each observation in the sample, and hence RSS, the sum of the squares of the residuals.

4. You then make small changes in one or more of your estimates of the parameters.

5. You calculate the new predicted values of Y, residuals, and RSS.

6. If RSS is smaller than before, your new estimates of the parameters are better than the old ones and you take them as your new starting point.

7. You repeat steps 4, 5 and 6 again and again until you are unable to make any changes in the estimates of the parameters that would reduce RSS.

8. You conclude that you have minimized RSS, and you can describe the final estimates of the parameters as the least squares estimates.
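As a concrete illustration of steps 1 to 3, the fragment below is a minimal Stata sketch, not part of the original slideshow. It assumes a dataset with variables Y and X in memory, and the trial parameter values are purely illustrative. Steps 4 to 7 would repeat the same calculation for slightly perturbed values, keeping whichever set gives the smaller RSS.

* A minimal sketch of steps 1-3 (illustrative trial values; assumes Y and X are in memory).
scalar b1 = 1                              // step 1: initial guesses for the parameters
scalar b2 = 1
scalar b3 = 0.5
generate double Yhat = b1 + b2*X^b3        // step 2: predicted values of Y
generate double usq  = (Y - Yhat)^2        // step 3: squared residuals
quietly summarize usq
display "RSS = " r(sum)                    // RSS for this set of trial values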


NONLINEAR REGRESSION

Y = β1 + β2 X^β3 + u

It should be emphasized that, long ago, mathematicians devised sophisticated techniques to minimize the number of steps required by algorithms of this type.

NONLINEAR REGRESSION

e = β1 + β2/g + u

[Figure: employment growth rate plotted against GDP growth rate]

We will return to the relationship between the employment growth rate, e, and the GDP growth rate, g, examined in the first slideshow for this chapter. e and g are hypothesized to be related as shown.

NONLINEAR REGRESSION

e = β1 + β2/g + u

According to this specification, as g becomes large, e will tend to a limit of β1. Looking at the figure, we see that the maximum value of e is about 2. So we will take this as our initial value for β1. We then hunt for the optimal value of β2, conditional on this guess for β1.

NONLINEAR REGRESSION

e = β1 + β2/g + u

[Figure: RSS as a function of β2, conditional on β1 = 2]

The figure shows RSS plotted as a function of β2, conditional on β1 = 2. From this we see that the optimal value of β2, conditional on β1 = 2, is –2.87.
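The conditional minimization shown in the figure can be mimicked with a crude one-dimensional grid search. The fragment below is a hedged sketch, not the code used to produce the figure: it assumes variables e and g are in memory and searches an arbitrary grid of values for β2 between –7 and 0, holding β1 fixed at 2.

* A minimal sketch of the search for beta2, conditional on beta1 = 2 (arbitrary grid).
scalar best_rss = .
scalar best_b2  = .
forvalues i = 0/700 {
    scalar b2 = -7 + 0.01*`i'                           // candidate value of beta2
    tempvar sq
    quietly generate double `sq' = (e - (2 + b2/g))^2   // squared residuals with beta1 = 2
    quietly summarize `sq'
    if r(sum) < best_rss {                              // missing sorts high, so the first
        scalar best_rss = r(sum)                        // candidate always replaces it
        scalar best_b2  = b2
    }
    drop `sq'
}
display "optimal beta2 given beta1 = 2: " best_b2 "   RSS = " best_rss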

NONLINEAR REGRESSION

e = β1 + β2/g + u

[Figure: RSS as a function of β1, conditional on β2 = –2.86]

Next, holding β2 at –2.87, we look to improve on our guess for β1. The figure shows RSS as a function of β1, conditional on β2 = –2.87. We see that the optimal value of β1 is 2.09.

NONLINEAR REGRESSION

e = β1 + β2/g + u

[Figure: estimates of β1 and β2 plotted against iteration number]

We continue to do this until both parameter estimates cease to change. The figure shows the first 11 iterations.

NONLINEAR REGRESSION

e = β1 + β2/g + u

Eventually, the estimates will reach limits. We will then have reached the values that yield minimum RSS.

NONLINEAR REGRESSION

e = β1 + β2/g + u

The limits are shown in the diagram as the horizontal lines. Convergence to them is painfully slow because this is a very inefficient algorithm. More sophisticated ones usually converge remarkably quickly.

NONLINEAR REGRESSION

e = β1 + β2/g + u

The limits must be the values from the transformed linear regression shown in the first slideshow for this chapter: β1 = 2.60 and β2 = –4.05. They have been determined by the same criterion, the minimization of RSS. All that we have done is to use a different method.
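Because this particular specification is linear in β1 and β2 once the reciprocal of g is treated as the regressor, the equivalence can be verified directly with OLS. The two lines below are a sketch of that check (assuming variables e and g are in memory); the reported intercept and slope should reproduce the limits quoted above.

* A minimal sketch of the check: define z = 1/g and run an ordinary regression.
generate z = 1/g
regress e z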

NONLINEAR REGRESSION

e = β1 + β2/g + u

. nl (e = {beta1} + {beta2}/g)
(obs = 25)

Iteration 0:  residual SS = 11.58161
Iteration 1:  residual SS = 11.58161

      Source |       SS       df       MS
-------------+------------------------------    Number of obs =        25
       Model |  13.1203672     1  13.1203672    R-squared     =    0.5311
    Residual |  11.5816083    23  .503548186    Adj R-squared =    0.5108
-------------+------------------------------    Root MSE      =  .7096113
       Total |  24.7019754    24  1.02924898    Res. dev.     =  51.71049

------------------------------------------------------------------------------
           e |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      /beta1 |   2.604753   .3748821     6.95   0.000      1.82925    3.380256
      /beta2 |  -4.050817    .793579    -5.10   0.000     -5.69246   -2.409174
------------------------------------------------------------------------------

The table shows the output for the present hyperbolic regression of e on g. It is, as usual, Stata output, but output from other regression applications will look similar.

The Stata command for a nonlinear regression is 'nl'. This is followed by the hypothesized mathematical relationship within parentheses. The parameters must be given names placed within braces. Here β1 is {beta1} and β2 is {beta2}. The output is effectively the same as the linear regression output in the first slideshow for this chapter.
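Starting values can also be supplied explicitly. For example, to begin the search from the guesses used earlier in this slideshow (β1 = 2 and β2 = –2.87), initial values may be written inside the braces. The line below is an illustrative variant of the command, not part of the original output.

* An illustrative variant of the command, supplying the earlier guesses as starting values.
nl (e = {beta1=2} + {beta2=-2.87}/g)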

NONLINEAR REGRESSION

e = β1 + β2/g + u

[Figure: fitted hyperbolic function plotted against the GDP growth rate]

The hyperbolic function imposes the constraint that the function plunges to minus infinity for positive g as g approaches zero.

NONLINEAR REGRESSION

e = β1 + β2/(β3 + g) + u

This feature can be relaxed by using the variation shown. Unlike the previous function, this cannot be linearized by any kind of transformation. Here, nonlinear regression must be used.

NONLINEAR REGRESSION

e = β1 + β2/(β3 + g) + u

. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)

Iteration 0:   residual SS = 11.58161
Iteration 1:   residual SS = 11.19238
.....................................
Iteration 15:  residual SS = 9.01051

      Source |       SS       df       MS
-------------+------------------------------    Number of obs =        25
       Model |  15.6914659     2  7.84573293    R-squared     =    0.6352
    Residual |  9.01050957    22  .409568617    Adj R-squared =    0.6021
-------------+------------------------------    Root MSE      =  .6399755
       Total |  24.7019754    24  1.02924898    Res. dev.     =  45.43482

------------------------------------------------------------------------------
           e |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      /beta1 |   5.467548   2.826401     1.93   0.066    -.3940491    11.32914
      /beta2 |   -31.0764   41.78914    -0.74   0.465    -117.7418    55.58897
      /beta3 |   4.148589   4.870437     0.85   0.404     -5.95208    14.24926
------------------------------------------------------------------------------
Parameter beta1 taken as constant term in model & ANOVA table

The output for this specification is shown, with most of the iteration messages deleted.

NONLINEAR REGRESSION

e = β1 + β2/(β3 + g) + u

[Figure: original (black) and new (red) hyperbolic functions plotted against the GDP growth rate]

The figure compares the original (black) and new (red) hyperbolic functions. The fit is a considerable improvement, reflected in a higher R², 0.64 instead of 0.53.
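The two R² figures quoted here can be recovered directly from the residual and total sums of squares reported in the two Stata tables above, since R² = 1 – RSS/TSS. The two display lines below simply carry out that arithmetic.

* R-squared recomputed from the sums of squares in the two output tables.
display "hyperbolic in g:         " 1 - 11.5816083/24.7019754
display "hyperbolic in beta3 + g: " 1 - 9.01050957/24.7019754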

Copyright Christopher Dougherty 2011.

These slideshows may be downloaded by anyone, anywhere for personal use.

Subject to respect for copyright and, where appropriate, attribution, they may be used as a resource for teaching an econometrics course. There is no need to refer to the author.

The content of this slideshow comes from Section 4.4 of C. Dougherty, Introduction to Econometrics, fourth edition 2011, Oxford University Press.

Additional (free) resources for both students and instructors may be downloaded from the OUP Online Resource Centre http://www.oup.com/uk/orc/bin/9780199567089/ .

Individuals studying econometrics on their own and who feel that they might benefit from participation in a formal course should consider the London School of Economics summer school course EC212 Introduction to Econometrics http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx

or the University of London International Programmes distance learning course 20 Elements of Econometrics www.londoninternational.ac.uk/lse .

11.07.25