
NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Suppose you believe that a variable Y depends on a variable X according to the relationship
shown and you wish to obtain estimates of β1, β2, and β3 given data on Y and X.
1

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

There is no way of transforming the relationship to obtain a linear relationship, and so it is
not possible to apply the usual regression procedure.
2

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nevertheless, one can still use the principle of minimizing the sum of the squares of the
residuals to obtain estimates of the parameters. We will describe a simple nonlinear
regression algorithm that uses the principle. It consists of a series of repeated steps.
3

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3, the estimates of β1, β2, and β3.

You start by guessing plausible values for the parameters.

4

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3, the estimates of β1, β2, and β3.
2. Calculate Ŷi = b1 + b2Xi^b3 for each observation.

You calculate the corresponding fitted values of Y from the data on X, conditional on these
values of the parameters.
5

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3, the estimates of β1, β2, and β3.
2. Calculate Ŷi = b1 + b2Xi^b3 for each observation.
3. Calculate ei = Yi – Ŷi for each observation.
4. Calculate RSS = ∑ei².

You calculate the residual for each observation in the sample, and hence RSS, the sum of
the squares of the residuals.
6

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3, the estimates of β1, β2, and β3.
2. Calculate Ŷi = b1 + b2Xi^b3 for each observation.
3. Calculate ei = Yi – Ŷi for each observation.
4. Calculate RSS = ∑ei².
5. Adjust b1, b2, and b3.

You then make small changes in one or more of your estimates of the parameters.

7

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3, the estimates of β1, β2, and β3.
2. Calculate Ŷi = b1 + b2Xi^b3 for each observation.
3. Calculate ei = Yi – Ŷi for each observation.
4. Calculate RSS = ∑ei².
5. Adjust b1, b2, and b3.
6. Re-calculate Ŷi, ei, RSS.

Using the new estimates of b1, b2, and b3, you re-calculate the fitted values of Y. Then you
re-calculate the residuals and RSS.
8

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3, the estimates of β1, β2, and β3.
2. Calculate Ŷi = b1 + b2Xi^b3 for each observation.
3. Calculate ei = Yi – Ŷi for each observation.
4. Calculate RSS = ∑ei².
5. Adjust b1, b2, and b3.
6. Re-calculate Ŷi, ei, RSS.
7. If new RSS < old RSS, continue adjustment.
   Otherwise try a different adjustment.

If RSS is smaller than before, your new estimates of the parameters are better than the old
ones and you continue adjusting your estimates in the same direction. Otherwise, you
would try different adjustments.
9

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3, the estimates of β1, β2, and β3.
2. Calculate Ŷi = b1 + b2Xi^b3 for each observation.
3. Calculate ei = Yi – Ŷi for each observation.
4. Calculate RSS = ∑ei².
5. Adjust b1, b2, and b3.
6. Re-calculate Ŷi, ei, RSS.
7. If new RSS < old RSS, continue adjustment.
   Otherwise try a different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You repeat steps 5, 6, and 7 again and again until you are unable to make any changes in
the estimates of the parameters that would reduce RSS.
10

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3, the estimates of β1, β2, and β3.
2. Calculate Ŷi = b1 + b2Xi^b3 for each observation.
3. Calculate ei = Yi – Ŷi for each observation.
4. Calculate RSS = ∑ei².
5. Adjust b1, b2, and b3.
6. Re-calculate Ŷi, ei, RSS.
7. If new RSS < old RSS, continue adjustment.
   Otherwise try a different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You conclude that you have minimized RSS, and you can describe the final estimates of the
parameters as the least squares estimates.
11

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3, the estimates of β1, β2, and β3.
2. Calculate Ŷi = b1 + b2Xi^b3 for each observation.
3. Calculate ei = Yi – Ŷi for each observation.
4. Calculate RSS = ∑ei².
5. Adjust b1, b2, and b3.
6. Re-calculate Ŷi, ei, RSS.
7. If new RSS < old RSS, continue adjustment.
   Otherwise try a different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

It should be emphasized that, long ago, mathematicians devised sophisticated techniques
to minimize the number of steps required by algorithms of this type.
12
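
To make the algorithm concrete, here is a minimal sketch of it in Python. The data are synthetic and the starting values, step size, and tolerance are illustrative choices, not taken from the slideshow.

import numpy as np

rng = np.random.default_rng(0)

# Synthetic data for Y = beta1 + beta2*X^beta3 + u (illustrative true values)
X = rng.uniform(0.5, 5.0, size=50)
Y = 1.0 + 2.0 * X**0.5 + rng.normal(0.0, 0.2, size=50)

def rss(b):
    b1, b2, b3 = b
    fitted = b1 + b2 * X**b3            # step 2: fitted values
    resid = Y - fitted                  # step 3: residuals
    return np.sum(resid**2)             # step 4: RSS

b = np.array([0.5, 1.0, 1.0])           # step 1: initial guesses
step = 0.1
best = rss(b)
while step > 1e-6:                      # step 8: repeat to convergence
    improved = False
    for j in range(3):                  # step 5: adjust each estimate in turn
        for sign in (1.0, -1.0):
            trial = b.copy()
            trial[j] += sign * step
            new = rss(trial)            # step 6: re-calculate RSS
            if new < best:              # step 7: keep the adjustment if RSS falls
                b, best = trial, new
                improved = True
    if not improved:
        step /= 2                       # otherwise try smaller adjustments

print('estimates:', b, 'RSS:', best)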

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

We return to the relationship between the employment growth rate, e, and the GDP growth rate,
g, examined in the first slideshow for this chapter. e and g are hypothesized to be related as shown.
13

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

According to this specification, as g becomes large, e will tend to a limit of β1. Looking at
the figure, we see that the maximum value of e is about 3, so we will take this as our initial
value for b1. We then hunt for the optimal value of b2, conditional on this guess for b1.
14

NONLINEAR REGRESSION
[Figure: RSS plotted as a function of b2, conditional on b1 = 3; the minimum is at b2 = –4.82]

The figure shows RSS plotted as a function of b2, conditional on b1 = 3. From this we see
that the optimal value of b2, conditional on b1 = 3, is –4.82.
15
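
Although the slideshow treats this conditional search as part of the general algorithm, for this particular specification the conditional optimum has a closed form, because with b1 held fixed the model is linear in b2. Writing zi = 1/gi (and letting ei here denote the observed employment growth rate for observation i, not a residual):

\[
\mathrm{RSS}(b_2) = \sum_i \left(e_i - b_1 - b_2 z_i\right)^2,
\qquad
\frac{d\,\mathrm{RSS}}{db_2} = -2\sum_i z_i\left(e_i - b_1 - b_2 z_i\right) = 0
\;\Rightarrow\;
b_2 = \frac{\sum_i z_i\,(e_i - b_1)}{\sum_i z_i^2}.
\]

With b1 = 3 and the sample data, this is the value –4.82 marked in the figure.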

NONLINEAR REGRESSION
[Figure: RSS plotted as a function of b1, conditional on b2 = –4.82; the minimum is at b1 = 2.94]

Next, holding b2 at –4.82, we look to improve on our guess for b1. The figure shows RSS as
a function of b1, conditional on b2 = –4.82. We see that the optimal value of b1 is 2.94.
16
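
The zigzag search in these two slides is coordinate descent: optimize b2 given b1, then b1 given b2, and repeat. Here is a minimal sketch in Python, using the closed-form conditional optima noted above; the 25 observations are synthetic stand-ins, since the actual sample is not reproduced in this transcript.

import numpy as np

rng = np.random.default_rng(1)

# Synthetic (g, e) data roughly consistent with the fitted values in the slides
g = rng.uniform(0.5, 8.0, size=25)
z = 1.0 / g
e = 2.6 - 4.05 * z + rng.normal(0.0, 0.7, size=25)

b1 = 3.0                                        # initial guess, as in the slides
for iteration in range(100):
    b2 = np.sum(z * (e - b1)) / np.sum(z**2)    # optimal b2 conditional on b1
    new_b1 = np.mean(e - b2 * z)                # optimal b1 conditional on b2
    if abs(new_b1 - b1) < 1e-10:                # stop when b1 ceases to change
        b1 = new_b1
        break
    b1 = new_b1

print('b1 =', b1, 'b2 =', b2)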

NONLINEAR REGRESSION
[Figure: estimates of b1 and b2 plotted against iteration number, first 10 iterations]

We continue to do this until both parameter estimates cease to change. The figure shows the
first 10 iterations.
17

NONLINEAR REGRESSION
[Figure: estimates of b1 and b2 plotted against iteration number]

Eventually, the estimates will reach limits. We will then have reached the values that yield
minimum RSS.
18

NONLINEAR REGRESSION
[Figure: estimates of b1 and b2 plotted against iteration number, with their limits shown as horizontal lines]

The limits are shown in the diagram as the horizontal lines. Convergence to them is
painfully slow because this is a very inefficient algorithm. More sophisticated ones usually
converge remarkably quickly.
19

NONLINEAR REGRESSION
[Figure: estimates of b1 and b2 converging to their limits, b1 = 2.60 and b2 = –4.05]

The limits must be the values from the transformed linear regression shown in the first
slideshow for this chapter: b1 = 2.60 and b2 = –4.05. They have been determined by the
same criterion, the minimization of RSS. All that we have done is to use a different method.
20

NONLINEAR REGRESSION
e = β1 + β2/g + u

. nl (e = {beta1} + {beta2}/g)
(obs = 25)

Iteration 0:  residual SS = 11.58161
Iteration 1:  residual SS = 11.58161

      Source |       SS       df       MS         Number of obs =        25
-------------+------------------------------      R-squared     =    0.5311
       Model |  13.1203672     1  13.1203672      Adj R-squared =    0.5108
    Residual |  11.5816083    23  .503548186      Root MSE      =  .7096113
-------------+------------------------------      Res. dev.     =  51.71049
       Total |  24.7019754    24  1.02924898

------------------------------------------------------------------------------
           e |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      /beta1 |   2.604753   .3748821     6.95   0.000      1.82925    3.380256
      /beta2 |  -4.050817    .793579    -5.10   0.000     -5.69246   -2.409174
------------------------------------------------------------------------------

Here is the output for the present hyperbolic regression of e on g using nonlinear
regression. It is, as usual, Stata output, but output from other regression applications will
look similar.
21

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)

[Stata output as on the previous slide]

The Stata command for a nonlinear regression is ‘nl’. It is followed by the hypothesized
mathematical relationship, within parentheses. The parameters must be given names placed
within braces. Here β1 is {beta1} and β2 is {beta2}.
22
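
For readers not using Stata, the same nonlinear least squares fit can be sketched with SciPy's curve_fit, which carries out an iterative search of the kind described above. Only the functional form comes from the slideshow; the data below are synthetic stand-ins.

import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(2)
g = rng.uniform(0.5, 8.0, size=25)                  # synthetic GDP growth rates
e = 2.6 - 4.05 / g + rng.normal(0.0, 0.7, size=25)  # synthetic employment growth

def model(g, beta1, beta2):
    # hypothesized relationship: e = beta1 + beta2/g
    return beta1 + beta2 / g

params, cov = curve_fit(model, g, e)                # nonlinear least squares
se = np.sqrt(np.diag(cov))                          # standard errors of the estimates
print('estimates:', params, 'standard errors:', se)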

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)

[Stata output as on the previous slide]

eˆ  2 . 60  4 . 05 z  2 . 60 

4 . 05
g

The output is effectively the same as the linear regression output in the first slideshow for
this chapter.
23

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3
2

1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The hyperbolic specification imposes the constraint that the fitted function plunges to minus
infinity as g approaches zero from above.
24

NONLINEAR REGRESSION
e = β1 + β2/(β3 + g) + u

. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)

Iteration 0:  residual SS = 11.58161
Iteration 1:  residual SS = 11.19238
.....................................
Iteration 15: residual SS =  9.01051

      Source |       SS       df       MS         Number of obs =        25
-------------+------------------------------      R-squared     =    0.6352
       Model |  15.6914659     2  7.84573293      Adj R-squared =    0.6021
    Residual |  9.01050957    22  .409568617      Root MSE      =  .6399755
-------------+------------------------------      Res. dev.     =  45.43482
       Total |  24.7019754    24  1.02924898

------------------------------------------------------------------------------
           e |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      /beta1 |   5.467548   2.826401     1.93   0.066    -.3940491    11.32914
      /beta2 |   -31.0764   41.78914    -0.74   0.465    -117.7418    55.58897
      /beta3 |   4.148589   4.870437     0.85   0.404     -5.95208    14.24926
------------------------------------------------------------------------------
Parameter beta1 taken as constant term in model & ANOVA table

This feature can be relaxed by using the variation shown. Unlike the previous function, this
cannot be linearized by any kind of transformation. Here, nonlinear regression must be
used.
25
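
A corresponding sketch for the three-parameter specification, again with synthetic data: with the extra parameter the RSS surface is flatter (note the 15 iterations and the large standard errors in the output above), so it is prudent to supply explicit starting values.

import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(3)
g = rng.uniform(0.5, 8.0, size=25)                        # synthetic data again
e = 5.5 - 31.0 / (4.1 + g) + rng.normal(0.0, 0.6, size=25)

def model(g, beta1, beta2, beta3):
    # e = beta1 + beta2/(beta3 + g): cannot be linearized, so nonlinear
    # estimation is required
    return beta1 + beta2 / (beta3 + g)

# p0 supplies the initial guesses (step 1 of the algorithm); illustrative values
params, cov = curve_fit(model, g, e, p0=[3.0, -4.0, 1.0])
print('estimates:', params)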

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))

[Stata output as on the previous slide]

The output for this specification is shown, with most of the iteration messages deleted.

26

NONLINEAR REGRESSION

e  b1 

b2
b3  g

u

employment growth rate

3
2
1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The figure compares the original (black) and new (red) hyperbolic functions. The fit is a
considerable improvement, reflected in a higher R², 0.64 instead of 0.53.
27
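
Both R² figures can be read off the Stata output, since R² = 1 – RSS/TSS:

\[
1 - \frac{11.5816}{24.7020} = 0.531,
\qquad
1 - \frac{9.0105}{24.7020} = 0.635.
\]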

Copyright Christopher Dougherty 2012.
These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 4.4 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2012.11.04


Slide 2

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Suppose you believe that a variable Y depends on a variable X according to the relationship
shown and you wish to obtain estimates of b1, b2, and b3 given data on Y and X.
1

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

There is no way of transforming the relationship to obtain a linear relationship, and so it is
not possible to apply the usual regression procedure.
2

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nevertheless, one can still use the principle of minimizing the sum of the squares of the
residuals to obtain estimates of the parameters. We will describe a simple nonlinear
regression algorithm that uses the principle. It consists of a series of repeated steps.
3

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.

You start by guessing plausible values for the parameters.

4

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

You calculate the corresponding fitted values of Y from the data on X, conditional on these
values of the parameters.
5

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.

You calculate the residual for each observation in the sample, and hence RSS, the sum of
the squares of the residuals.
6

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.

You then make small changes in one or more of your estimates of the parameters.

7

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

Using the new estimates of b1, b2, and b3, you re-calculate the fitted values of Y. Then you
re-calculate the residuals and RSS.
8

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.

If RSS is smaller than before, your new estimates of the parameters are better than the old
ones and you continue adjusting your estimates in the same direction. Otherwise, you
would try different adjustments.
9

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You repeat steps 5, 6, and 7 again and again until you are unable to make any changes in
the estimates of the parameters that would reduce RSS.
10

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You conclude that you have minimized RSS, and you can describe the final estimates of the
parameters as the least squares estimates.
11

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

It should be emphasized that, long ago, mathematicians devised sophisticated techniques
to minimize the number of steps required by algorithms of this type.
12

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

We will return to the relationship between employment growth rate, e, and GDP growth rate,
g, in the first slideshow for this chapter. e and g are hypothesized to be related as shown.
13

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

According to this specification, as g becomes large, e will tend to a limit of b1. Looking at
the figure, we see that the maximum value of e is about 3. So we will take this as our initial
value for b1. We then hunt for the optimal value of b2, conditional on this guess for b1.
14

NONLINEAR REGRESSION
RSS
Conditional on b1 = 3

40

20

0

-7

-6

-5 –4.82

-4

b2

-3

The figure shows RSS plotted as a function of b2, conditional on b1 = 3. From this we see
that the optimal value of b2, conditional on b1 = 3, is –4.82.
15

NONLINEAR REGRESSION
RSS
Conditional on b2 = –4.82

40

20

0

2

2.94 3

b1

4

Next, holding b2 at –4.82, we look to improve on our guess for b1. The figure shows RSS as
a function of b1, conditional on b2 = –4.82. We see that the optimal value of b1 is 2.94.
16

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

We continue to do this both parameter estimates cease to change. The figure shows the
first 10 iterations.
17

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

Eventually, the estimates will reach limits. We will then have reached the values that yield
minimum RSS.
18

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

The limits are shown in the diagram as the horizontal lines. Convergence to them is
painfully slow because this is a very inefficient algorithm. More sophisticated ones usually
converge remarkably quickly.
19

NONLINEAR REGRESSION
4

3

2.60

2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3
-4

b2
–4.05

-5
-6

The limits must be the values from the transformed linear regression shown in the first
slideshow for this chapter: b1 = 2.60 and b2 = –4.05. They have been determined by the
same criterion, the minimization of RSS. All that we have done is to use a different method.
20

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

Here is the output for the present hyperbolic regression of e on g using nonlinear
regression. It is, as usual, Stata output, but output from other regression applications will
look similar.
21

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

The Stata command for a nonlinear regression is ‘nl’. This is followed by the hypothesized
mathematical relationship within parentheses. The parameters must be given names placed
within braces. Here b1 is {beta1} and b2 is {beta2}.
22

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

eˆ  2 . 60  4 . 05 z  2 . 60 

4 . 05
g

The output is effectively the same as the linear regression output in the first slideshow for
this chapter.
23

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3
2

1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The hyperbolic function imposes the constraint that the function plunges to minus infinity
for positive g as g approaches zero.
24

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

This feature can be relaxed by using the variation shown. Unlike the previous function, this
cannot be linearized by any kind of transformation. Here, nonlinear regression must be
used.
25

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

The output for this specification is shown, with most of the iteration messages deleted.

26

NONLINEAR REGRESSION

e  b1 

b2
b3  g

u

employment growth rate

3
2
1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The figure compares the original (black) and new (red) hyperbolic functions. The fit is a
considerable improvement, reflected in a higher R 2, 0.64 instead of 0.53.
27

Copyright Christopher Dougherty 2012.
These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 4.4 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2012.11.04


Slide 3

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Suppose you believe that a variable Y depends on a variable X according to the relationship
shown and you wish to obtain estimates of b1, b2, and b3 given data on Y and X.
1

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

There is no way of transforming the relationship to obtain a linear relationship, and so it is
not possible to apply the usual regression procedure.
2

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nevertheless, one can still use the principle of minimizing the sum of the squares of the
residuals to obtain estimates of the parameters. We will describe a simple nonlinear
regression algorithm that uses the principle. It consists of a series of repeated steps.
3

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.

You start by guessing plausible values for the parameters.

4

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

You calculate the corresponding fitted values of Y from the data on X, conditional on these
values of the parameters.
5

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.

You calculate the residual for each observation in the sample, and hence RSS, the sum of
the squares of the residuals.
6

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.

You then make small changes in one or more of your estimates of the parameters.

7

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

Using the new estimates of b1, b2, and b3, you re-calculate the fitted values of Y. Then you
re-calculate the residuals and RSS.
8

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.

If RSS is smaller than before, your new estimates of the parameters are better than the old
ones and you continue adjusting your estimates in the same direction. Otherwise, you
would try different adjustments.
9

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You repeat steps 5, 6, and 7 again and again until you are unable to make any changes in
the estimates of the parameters that would reduce RSS.
10

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You conclude that you have minimized RSS, and you can describe the final estimates of the
parameters as the least squares estimates.
11

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

It should be emphasized that, long ago, mathematicians devised sophisticated techniques
to minimize the number of steps required by algorithms of this type.
12

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

We will return to the relationship between employment growth rate, e, and GDP growth rate,
g, in the first slideshow for this chapter. e and g are hypothesized to be related as shown.
13

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

According to this specification, as g becomes large, e will tend to a limit of b1. Looking at
the figure, we see that the maximum value of e is about 3. So we will take this as our initial
value for b1. We then hunt for the optimal value of b2, conditional on this guess for b1.
14

NONLINEAR REGRESSION
RSS
Conditional on b1 = 3

40

20

0

-7

-6

-5 –4.82

-4

b2

-3

The figure shows RSS plotted as a function of b2, conditional on b1 = 3. From this we see
that the optimal value of b2, conditional on b1 = 3, is –4.82.
15

NONLINEAR REGRESSION
RSS
Conditional on b2 = –4.82

40

20

0

2

2.94 3

b1

4

Next, holding b2 at –4.82, we look to improve on our guess for b1. The figure shows RSS as
a function of b1, conditional on b2 = –4.82. We see that the optimal value of b1 is 2.94.
16

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

We continue to do this both parameter estimates cease to change. The figure shows the
first 10 iterations.
17

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

Eventually, the estimates will reach limits. We will then have reached the values that yield
minimum RSS.
18

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

The limits are shown in the diagram as the horizontal lines. Convergence to them is
painfully slow because this is a very inefficient algorithm. More sophisticated ones usually
converge remarkably quickly.
19

NONLINEAR REGRESSION
4

3

2.60

2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3
-4

b2
–4.05

-5
-6

The limits must be the values from the transformed linear regression shown in the first
slideshow for this chapter: b1 = 2.60 and b2 = –4.05. They have been determined by the
same criterion, the minimization of RSS. All that we have done is to use a different method.
20

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

Here is the output for the present hyperbolic regression of e on g using nonlinear
regression. It is, as usual, Stata output, but output from other regression applications will
look similar.
21

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

The Stata command for a nonlinear regression is ‘nl’. This is followed by the hypothesized
mathematical relationship within parentheses. The parameters must be given names placed
within braces. Here b1 is {beta1} and b2 is {beta2}.
22

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

eˆ  2 . 60  4 . 05 z  2 . 60 

4 . 05
g

The output is effectively the same as the linear regression output in the first slideshow for
this chapter.
23

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3
2

1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The hyperbolic function imposes the constraint that the function plunges to minus infinity
for positive g as g approaches zero.
24

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

This feature can be relaxed by using the variation shown. Unlike the previous function, this
cannot be linearized by any kind of transformation. Here, nonlinear regression must be
used.
25

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

The output for this specification is shown, with most of the iteration messages deleted.

26

NONLINEAR REGRESSION

e  b1 

b2
b3  g

u

employment growth rate

3
2
1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The figure compares the original (black) and new (red) hyperbolic functions. The fit is a
considerable improvement, reflected in a higher R 2, 0.64 instead of 0.53.
27

Copyright Christopher Dougherty 2012.
These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 4.4 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2012.11.04


Slide 4

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Suppose you believe that a variable Y depends on a variable X according to the relationship
shown and you wish to obtain estimates of b1, b2, and b3 given data on Y and X.
1

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

There is no way of transforming the relationship to obtain a linear relationship, and so it is
not possible to apply the usual regression procedure.
2

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nevertheless, one can still use the principle of minimizing the sum of the squares of the
residuals to obtain estimates of the parameters. We will describe a simple nonlinear
regression algorithm that uses the principle. It consists of a series of repeated steps.
3

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.

You start by guessing plausible values for the parameters.

4

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

You calculate the corresponding fitted values of Y from the data on X, conditional on these
values of the parameters.
5

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.

You calculate the residual for each observation in the sample, and hence RSS, the sum of
the squares of the residuals.
6

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.

You then make small changes in one or more of your estimates of the parameters.

7

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

Using the new estimates of b1, b2, and b3, you re-calculate the fitted values of Y. Then you
re-calculate the residuals and RSS.
8

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.

If RSS is smaller than before, your new estimates of the parameters are better than the old
ones and you continue adjusting your estimates in the same direction. Otherwise, you
would try different adjustments.
9

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You repeat steps 5, 6, and 7 again and again until you are unable to make any changes in
the estimates of the parameters that would reduce RSS.
10

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You conclude that you have minimized RSS, and you can describe the final estimates of the
parameters as the least squares estimates.
11

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

It should be emphasized that, long ago, mathematicians devised sophisticated techniques
to minimize the number of steps required by algorithms of this type.
12

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

We will return to the relationship between employment growth rate, e, and GDP growth rate,
g, in the first slideshow for this chapter. e and g are hypothesized to be related as shown.
13

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

According to this specification, as g becomes large, e will tend to a limit of b1. Looking at
the figure, we see that the maximum value of e is about 3. So we will take this as our initial
value for b1. We then hunt for the optimal value of b2, conditional on this guess for b1.
14

NONLINEAR REGRESSION
RSS
Conditional on b1 = 3

40

20

0

-7

-6

-5 –4.82

-4

b2

-3

The figure shows RSS plotted as a function of b2, conditional on b1 = 3. From this we see
that the optimal value of b2, conditional on b1 = 3, is –4.82.
15

NONLINEAR REGRESSION
RSS
Conditional on b2 = –4.82

40

20

0

2

2.94 3

b1

4

Next, holding b2 at –4.82, we look to improve on our guess for b1. The figure shows RSS as
a function of b1, conditional on b2 = –4.82. We see that the optimal value of b1 is 2.94.
16

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

We continue to do this both parameter estimates cease to change. The figure shows the
first 10 iterations.
17

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

Eventually, the estimates will reach limits. We will then have reached the values that yield
minimum RSS.
18

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

The limits are shown in the diagram as the horizontal lines. Convergence to them is
painfully slow because this is a very inefficient algorithm. More sophisticated ones usually
converge remarkably quickly.
19

NONLINEAR REGRESSION
4

3

2.60

2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3
-4

b2
–4.05

-5
-6

The limits must be the values from the transformed linear regression shown in the first
slideshow for this chapter: b1 = 2.60 and b2 = –4.05. They have been determined by the
same criterion, the minimization of RSS. All that we have done is to use a different method.
20

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

Here is the output for the present hyperbolic regression of e on g using nonlinear
regression. It is, as usual, Stata output, but output from other regression applications will
look similar.
21

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

The Stata command for a nonlinear regression is ‘nl’. This is followed by the hypothesized
mathematical relationship within parentheses. The parameters must be given names placed
within braces. Here b1 is {beta1} and b2 is {beta2}.
22

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

eˆ  2 . 60  4 . 05 z  2 . 60 

4 . 05
g

The output is effectively the same as the linear regression output in the first slideshow for
this chapter.
23

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3
2

1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The hyperbolic function imposes the constraint that the function plunges to minus infinity
for positive g as g approaches zero.
24

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

This feature can be relaxed by using the variation shown. Unlike the previous function, this
cannot be linearized by any kind of transformation. Here, nonlinear regression must be
used.
25

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

The output for this specification is shown, with most of the iteration messages deleted.

26

NONLINEAR REGRESSION

e  b1 

b2
b3  g

u

employment growth rate

3
2
1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The figure compares the original (black) and new (red) hyperbolic functions. The fit is a considerable improvement, reflected in a higher R2: 0.64 instead of 0.53.
27
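
A figure of this kind can be reproduced by saving the fitted values after each nl command and overlaying them on the scatter diagram. The sketch below is not from the original slides; the variable names ehat1 and ehat2 are illustrative.

. * Fit each specification and save its fitted values.
. nl (e = {beta1} + {beta2}/g)
. predict ehat1
. nl (e = {beta1} + {beta2}/({beta3} + g))
. predict ehat2
. * Overlay the two fitted curves on the scatter of e against g.
. twoway (scatter e g) (line ehat1 g, sort) (line ehat2 g, sort)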

Copyright Christopher Dougherty 2012.
These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 4.4 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2012.11.04


Slide 5

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Suppose you believe that a variable Y depends on a variable X according to the relationship
shown and you wish to obtain estimates of b1, b2, and b3 given data on Y and X.
1

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

There is no way of transforming the relationship to obtain a linear relationship, and so it is
not possible to apply the usual regression procedure.
2

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nevertheless, one can still use the principle of minimizing the sum of the squares of the
residuals to obtain estimates of the parameters. We will describe a simple nonlinear
regression algorithm that uses the principle. It consists of a series of repeated steps.
3

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.

You start by guessing plausible values for the parameters.

4

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

You calculate the corresponding fitted values of Y from the data on X, conditional on these
values of the parameters.
5

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.

You calculate the residual for each observation in the sample, and hence RSS, the sum of
the squares of the residuals.
6

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.

You then make small changes in one or more of your estimates of the parameters.

7

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

Using the new estimates of b1, b2, and b3, you re-calculate the fitted values of Y. Then you
re-calculate the residuals and RSS.
8

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.

If RSS is smaller than before, your new estimates of the parameters are better than the old
ones and you continue adjusting your estimates in the same direction. Otherwise, you
would try different adjustments.
9

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You repeat steps 5, 6, and 7 again and again until you are unable to make any changes in
the estimates of the parameters that would reduce RSS.
10

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You conclude that you have minimized RSS, and you can describe the final estimates of the
parameters as the least squares estimates.
11

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

It should be emphasized that, long ago, mathematicians devised sophisticated techniques
to minimize the number of steps required by algorithms of this type.
12

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

We will return to the relationship between employment growth rate, e, and GDP growth rate,
g, in the first slideshow for this chapter. e and g are hypothesized to be related as shown.
13

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

According to this specification, as g becomes large, e will tend to a limit of b1. Looking at
the figure, we see that the maximum value of e is about 3. So we will take this as our initial
value for b1. We then hunt for the optimal value of b2, conditional on this guess for b1.
14

NONLINEAR REGRESSION
RSS
Conditional on b1 = 3

40

20

0

-7

-6

-5 –4.82

-4

b2

-3

The figure shows RSS plotted as a function of b2, conditional on b1 = 3. From this we see
that the optimal value of b2, conditional on b1 = 3, is –4.82.
15

NONLINEAR REGRESSION
RSS
Conditional on b2 = –4.82

40

20

0

2

2.94 3

b1

4

Next, holding b2 at –4.82, we look to improve on our guess for b1. The figure shows RSS as
a function of b1, conditional on b2 = –4.82. We see that the optimal value of b1 is 2.94.
16

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

We continue to do this both parameter estimates cease to change. The figure shows the
first 10 iterations.
17

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

Eventually, the estimates will reach limits. We will then have reached the values that yield
minimum RSS.
18

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

The limits are shown in the diagram as the horizontal lines. Convergence to them is
painfully slow because this is a very inefficient algorithm. More sophisticated ones usually
converge remarkably quickly.
19

NONLINEAR REGRESSION
4

3

2.60

2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3
-4

b2
–4.05

-5
-6

The limits must be the values from the transformed linear regression shown in the first
slideshow for this chapter: b1 = 2.60 and b2 = –4.05. They have been determined by the
same criterion, the minimization of RSS. All that we have done is to use a different method.
20

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

Here is the output for the present hyperbolic regression of e on g using nonlinear
regression. It is, as usual, Stata output, but output from other regression applications will
look similar.
21

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

The Stata command for a nonlinear regression is ‘nl’. This is followed by the hypothesized
mathematical relationship within parentheses. The parameters must be given names placed
within braces. Here b1 is {beta1} and b2 is {beta2}.
22

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

eˆ  2 . 60  4 . 05 z  2 . 60 

4 . 05
g

The output is effectively the same as the linear regression output in the first slideshow for
this chapter.
23

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3
2

1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The hyperbolic function imposes the constraint that the function plunges to minus infinity
for positive g as g approaches zero.
24

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

This feature can be relaxed by using the variation shown. Unlike the previous function, this
cannot be linearized by any kind of transformation. Here, nonlinear regression must be
used.
25

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

The output for this specification is shown, with most of the iteration messages deleted.

26

NONLINEAR REGRESSION

e  b1 

b2
b3  g

u

employment growth rate

3
2
1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The figure compares the original (black) and new (red) hyperbolic functions. The fit is a
considerable improvement, reflected in a higher R 2, 0.64 instead of 0.53.
27

Copyright Christopher Dougherty 2012.
These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 4.4 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2012.11.04


Slide 6

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Suppose you believe that a variable Y depends on a variable X according to the relationship
shown and you wish to obtain estimates of b1, b2, and b3 given data on Y and X.
1

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

There is no way of transforming the relationship to obtain a linear relationship, and so it is
not possible to apply the usual regression procedure.
2

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nevertheless, one can still use the principle of minimizing the sum of the squares of the
residuals to obtain estimates of the parameters. We will describe a simple nonlinear
regression algorithm that uses the principle. It consists of a series of repeated steps.
3

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.

You start by guessing plausible values for the parameters.

4

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

You calculate the corresponding fitted values of Y from the data on X, conditional on these
values of the parameters.
5

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.

You calculate the residual for each observation in the sample, and hence RSS, the sum of
the squares of the residuals.
6

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.

You then make small changes in one or more of your estimates of the parameters.

7

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

Using the new estimates of b1, b2, and b3, you re-calculate the fitted values of Y. Then you
re-calculate the residuals and RSS.
8

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.

If RSS is smaller than before, your new estimates of the parameters are better than the old
ones and you continue adjusting your estimates in the same direction. Otherwise, you
would try different adjustments.
9

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You repeat steps 5, 6, and 7 again and again until you are unable to make any changes in
the estimates of the parameters that would reduce RSS.
10

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You conclude that you have minimized RSS, and you can describe the final estimates of the
parameters as the least squares estimates.
11

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

It should be emphasized that, long ago, mathematicians devised sophisticated techniques
to minimize the number of steps required by algorithms of this type.
12

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

We will return to the relationship between employment growth rate, e, and GDP growth rate,
g, in the first slideshow for this chapter. e and g are hypothesized to be related as shown.
13

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

According to this specification, as g becomes large, e will tend to a limit of b1. Looking at
the figure, we see that the maximum value of e is about 3. So we will take this as our initial
value for b1. We then hunt for the optimal value of b2, conditional on this guess for b1.
14

NONLINEAR REGRESSION
RSS
Conditional on b1 = 3

40

20

0

-7

-6

-5 –4.82

-4

b2

-3

The figure shows RSS plotted as a function of b2, conditional on b1 = 3. From this we see
that the optimal value of b2, conditional on b1 = 3, is –4.82.
15

NONLINEAR REGRESSION
RSS
Conditional on b2 = –4.82

40

20

0

2

2.94 3

b1

4

Next, holding b2 at –4.82, we look to improve on our guess for b1. The figure shows RSS as
a function of b1, conditional on b2 = –4.82. We see that the optimal value of b1 is 2.94.
16

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

We continue to do this both parameter estimates cease to change. The figure shows the
first 10 iterations.
17

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

Eventually, the estimates will reach limits. We will then have reached the values that yield
minimum RSS.
18

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

The limits are shown in the diagram as the horizontal lines. Convergence to them is
painfully slow because this is a very inefficient algorithm. More sophisticated ones usually
converge remarkably quickly.
19

NONLINEAR REGRESSION
4

3

2.60

2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3
-4

b2
–4.05

-5
-6

The limits must be the values from the transformed linear regression shown in the first
slideshow for this chapter: b1 = 2.60 and b2 = –4.05. They have been determined by the
same criterion, the minimization of RSS. All that we have done is to use a different method.
20

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

Here is the output for the present hyperbolic regression of e on g using nonlinear
regression. It is, as usual, Stata output, but output from other regression applications will
look similar.
21

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

The Stata command for a nonlinear regression is ‘nl’. This is followed by the hypothesized
mathematical relationship within parentheses. The parameters must be given names placed
within braces. Here b1 is {beta1} and b2 is {beta2}.
22

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

eˆ  2 . 60  4 . 05 z  2 . 60 

4 . 05
g

The output is effectively the same as the linear regression output in the first slideshow for
this chapter.
23

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3
2

1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The hyperbolic function imposes the constraint that the function plunges to minus infinity
for positive g as g approaches zero.
24

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

This feature can be relaxed by using the variation shown. Unlike the previous function, this
cannot be linearized by any kind of transformation. Here, nonlinear regression must be
used.
25

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

The output for this specification is shown, with most of the iteration messages deleted.

26

NONLINEAR REGRESSION

e  b1 

b2
b3  g

u

employment growth rate

3
2
1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The figure compares the original (black) and new (red) hyperbolic functions. The fit is a
considerable improvement, reflected in a higher R 2, 0.64 instead of 0.53.
27

Copyright Christopher Dougherty 2012.
These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 4.4 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2012.11.04


Slide 7

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Suppose you believe that a variable Y depends on a variable X according to the relationship
shown and you wish to obtain estimates of b1, b2, and b3 given data on Y and X.
1

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

There is no way of transforming the relationship to obtain a linear relationship, and so it is
not possible to apply the usual regression procedure.
2

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nevertheless, one can still use the principle of minimizing the sum of the squares of the
residuals to obtain estimates of the parameters. We will describe a simple nonlinear
regression algorithm that uses the principle. It consists of a series of repeated steps.
3

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.

You start by guessing plausible values for the parameters.

4

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

You calculate the corresponding fitted values of Y from the data on X, conditional on these
values of the parameters.
5

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.

You calculate the residual for each observation in the sample, and hence RSS, the sum of
the squares of the residuals.
6

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.

You then make small changes in one or more of your estimates of the parameters.

7

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

Using the new estimates of b1, b2, and b3, you re-calculate the fitted values of Y. Then you
re-calculate the residuals and RSS.
8

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.

If RSS is smaller than before, your new estimates of the parameters are better than the old
ones and you continue adjusting your estimates in the same direction. Otherwise, you
would try different adjustments.
9

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You repeat steps 5, 6, and 7 again and again until you are unable to make any changes in
the estimates of the parameters that would reduce RSS.
10

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You conclude that you have minimized RSS, and you can describe the final estimates of the
parameters as the least squares estimates.
11

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

It should be emphasized that, long ago, mathematicians devised sophisticated techniques
to minimize the number of steps required by algorithms of this type.
12

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

We will return to the relationship between employment growth rate, e, and GDP growth rate,
g, in the first slideshow for this chapter. e and g are hypothesized to be related as shown.
13

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

According to this specification, as g becomes large, e will tend to a limit of b1. Looking at
the figure, we see that the maximum value of e is about 3. So we will take this as our initial
value for b1. We then hunt for the optimal value of b2, conditional on this guess for b1.
14

NONLINEAR REGRESSION
RSS
Conditional on b1 = 3

40

20

0

-7

-6

-5 –4.82

-4

b2

-3

The figure shows RSS plotted as a function of b2, conditional on b1 = 3. From this we see
that the optimal value of b2, conditional on b1 = 3, is –4.82.
15

NONLINEAR REGRESSION
RSS
Conditional on b2 = –4.82

40

20

0

2

2.94 3

b1

4

Next, holding b2 at –4.82, we look to improve on our guess for b1. The figure shows RSS as
a function of b1, conditional on b2 = –4.82. We see that the optimal value of b1 is 2.94.
16

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

We continue to do this both parameter estimates cease to change. The figure shows the
first 10 iterations.
17

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

Eventually, the estimates will reach limits. We will then have reached the values that yield
minimum RSS.
18

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

The limits are shown in the diagram as the horizontal lines. Convergence to them is
painfully slow because this is a very inefficient algorithm. More sophisticated ones usually
converge remarkably quickly.
19

NONLINEAR REGRESSION
4

3

2.60

2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3
-4

b2
–4.05

-5
-6

The limits must be the values from the transformed linear regression shown in the first
slideshow for this chapter: b1 = 2.60 and b2 = –4.05. They have been determined by the
same criterion, the minimization of RSS. All that we have done is to use a different method.
20

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

Here is the output for the present hyperbolic regression of e on g using nonlinear
regression. It is, as usual, Stata output, but output from other regression applications will
look similar.
21

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

The Stata command for a nonlinear regression is ‘nl’. This is followed by the hypothesized
mathematical relationship within parentheses. The parameters must be given names placed
within braces. Here b1 is {beta1} and b2 is {beta2}.
22

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

eˆ  2 . 60  4 . 05 z  2 . 60 

4 . 05
g

The output is effectively the same as the linear regression output in the first slideshow for
this chapter.
23

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3
2

1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The hyperbolic function imposes the constraint that the function plunges to minus infinity
for positive g as g approaches zero.
24

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

This feature can be relaxed by using the variation shown. Unlike the previous function, this
cannot be linearized by any kind of transformation. Here, nonlinear regression must be
used.
25

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

The output for this specification is shown, with most of the iteration messages deleted.

26

NONLINEAR REGRESSION

e  b1 

b2
b3  g

u

employment growth rate

3
2
1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The figure compares the original (black) and new (red) hyperbolic functions. The fit is a
considerable improvement, reflected in a higher R 2, 0.64 instead of 0.53.
27

Copyright Christopher Dougherty 2012.
These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 4.4 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2012.11.04


Slide 8

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Suppose you believe that a variable Y depends on a variable X according to the relationship
shown and you wish to obtain estimates of b1, b2, and b3 given data on Y and X.
1

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

There is no way of transforming the relationship to obtain a linear relationship, and so it is
not possible to apply the usual regression procedure.
2

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nevertheless, one can still use the principle of minimizing the sum of the squares of the
residuals to obtain estimates of the parameters. We will describe a simple nonlinear
regression algorithm that uses the principle. It consists of a series of repeated steps.
3

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.

You start by guessing plausible values for the parameters.

4

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

You calculate the corresponding fitted values of Y from the data on X, conditional on these
values of the parameters.
5

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.

You calculate the residual for each observation in the sample, and hence RSS, the sum of
the squares of the residuals.
6

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.

You then make small changes in one or more of your estimates of the parameters.

7

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

Using the new estimates of b1, b2, and b3, you re-calculate the fitted values of Y. Then you
re-calculate the residuals and RSS.
8

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.

If RSS is smaller than before, your new estimates of the parameters are better than the old
ones and you continue adjusting your estimates in the same direction. Otherwise, you
would try different adjustments.
9

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You repeat steps 5, 6, and 7 again and again until you are unable to make any changes in
the estimates of the parameters that would reduce RSS.
10

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You conclude that you have minimized RSS, and you can describe the final estimates of the
parameters as the least squares estimates.
11

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

It should be emphasized that, long ago, mathematicians devised sophisticated techniques
to minimize the number of steps required by algorithms of this type.
12

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

We will return to the relationship between employment growth rate, e, and GDP growth rate,
g, in the first slideshow for this chapter. e and g are hypothesized to be related as shown.
13

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

According to this specification, as g becomes large, e will tend to a limit of b1. Looking at
the figure, we see that the maximum value of e is about 3. So we will take this as our initial
value for b1. We then hunt for the optimal value of b2, conditional on this guess for b1.
14

NONLINEAR REGRESSION
RSS
Conditional on b1 = 3

40

20

0

-7

-6

-5 –4.82

-4

b2

-3

The figure shows RSS plotted as a function of b2, conditional on b1 = 3. From this we see
that the optimal value of b2, conditional on b1 = 3, is –4.82.
15

NONLINEAR REGRESSION
RSS
Conditional on b2 = –4.82

40

20

0

2

2.94 3

b1

4

Next, holding b2 at –4.82, we look to improve on our guess for b1. The figure shows RSS as
a function of b1, conditional on b2 = –4.82. We see that the optimal value of b1 is 2.94.
16

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

We continue to do this both parameter estimates cease to change. The figure shows the
first 10 iterations.
17

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

Eventually, the estimates will reach limits. We will then have reached the values that yield
minimum RSS.
18

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

The limits are shown in the diagram as the horizontal lines. Convergence to them is
painfully slow because this is a very inefficient algorithm. More sophisticated ones usually
converge remarkably quickly.
19

NONLINEAR REGRESSION
4

3

2.60

2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3
-4

b2
–4.05

-5
-6

The limits must be the values from the transformed linear regression shown in the first
slideshow for this chapter: b1 = 2.60 and b2 = –4.05. They have been determined by the
same criterion, the minimization of RSS. All that we have done is to use a different method.
20

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

Here is the output for the present hyperbolic regression of e on g using nonlinear
regression. It is, as usual, Stata output, but output from other regression applications will
look similar.
21

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

The Stata command for a nonlinear regression is ‘nl’. This is followed by the hypothesized
mathematical relationship within parentheses. The parameters must be given names placed
within braces. Here b1 is {beta1} and b2 is {beta2}.
22

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

eˆ  2 . 60  4 . 05 z  2 . 60 

4 . 05
g

The output is effectively the same as the linear regression output in the first slideshow for
this chapter.
23

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3
2

1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The hyperbolic function imposes the constraint that the function plunges to minus infinity
for positive g as g approaches zero.
24

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

This feature can be relaxed by using the variation shown. Unlike the previous function, this
cannot be linearized by any kind of transformation. Here, nonlinear regression must be
used.
25

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0:  residual SS = 11.58161
Iteration 1:  residual SS = 11.19238
.....................................
Iteration 15: residual SS = 9.01051

e = b1 + b2/(b3 + g) + u

      Source |       SS       df       MS         Number of obs =        25
-------------+------------------------------     R-squared     =    0.6352
       Model |  15.6914659     2  7.84573293     Adj R-squared =    0.6021
    Residual |  9.01050957    22  .409568617     Root MSE      =  .6399755
-------------+------------------------------     Res. dev.    =  45.43482
       Total |  24.7019754    24  1.02924898

------------------------------------------------------------------------------
           e |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      /beta1 |   5.467548   2.826401     1.93   0.066    -.3940491    11.32914
      /beta2 |   -31.0764   41.78914    -0.74   0.465    -117.7418    55.58897
      /beta3 |   4.148589   4.870437     0.85   0.404     -5.95208    14.24926
------------------------------------------------------------------------------
Parameter beta1 taken as constant term in model & ANOVA table

The output for this specification is shown, with most of the iteration messages deleted.

26

NONLINEAR REGRESSION

e = b1 + b2/(b3 + g) + u

[Figure: employment growth rate plotted against GDP growth rate, with the original (black) and new (red) fitted hyperbolic functions]

The figure compares the original (black) and new (red) hyperbolic functions. The fit is a considerable improvement, reflected in a higher R², 0.64 instead of 0.53.
27
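Both R² figures can be recovered directly from the sums of squares in the two outputs, since R² = 1 – RSS/TSS and the total sum of squares is 24.7019754 in both regressions:

tss = 24.7019754
for label, rss in [("b1 + b2/g", 11.5816083),
                   ("b1 + b2/(b3 + g)", 9.01050957)]:
    print(label, round(1 - rss / tss, 4))
# prints 0.5311 and 0.6352, as reported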

Copyright Christopher Dougherty 2012.
These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 4.4 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2012.11.04


Slide 9

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Suppose you believe that a variable Y depends on a variable X according to the relationship
shown and you wish to obtain estimates of b1, b2, and b3 given data on Y and X.
1

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

There is no way of transforming the relationship to obtain a linear relationship, and so it is
not possible to apply the usual regression procedure.
2

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nevertheless, one can still use the principle of minimizing the sum of the squares of the
residuals to obtain estimates of the parameters. We will describe a simple nonlinear
regression algorithm that uses the principle. It consists of a series of repeated steps.
3

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.

You start by guessing plausible values for the parameters.

4

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

You calculate the corresponding fitted values of Y from the data on X, conditional on these
values of the parameters.
5

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.

You calculate the residual for each observation in the sample, and hence RSS, the sum of
the squares of the residuals.
6

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.

You then make small changes in one or more of your estimates of the parameters.

7

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

Using the new estimates of b1, b2, and b3, you re-calculate the fitted values of Y. Then you
re-calculate the residuals and RSS.
8

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.

If RSS is smaller than before, your new estimates of the parameters are better than the old
ones and you continue adjusting your estimates in the same direction. Otherwise, you
would try different adjustments.
9

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You repeat steps 5, 6, and 7 again and again until you are unable to make any changes in
the estimates of the parameters that would reduce RSS.
10

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You conclude that you have minimized RSS, and you can describe the final estimates of the
parameters as the least squares estimates.
11

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

It should be emphasized that, long ago, mathematicians devised sophisticated techniques
to minimize the number of steps required by algorithms of this type.
12

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

We will return to the relationship between employment growth rate, e, and GDP growth rate,
g, in the first slideshow for this chapter. e and g are hypothesized to be related as shown.
13

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

According to this specification, as g becomes large, e will tend to a limit of b1. Looking at
the figure, we see that the maximum value of e is about 3. So we will take this as our initial
value for b1. We then hunt for the optimal value of b2, conditional on this guess for b1.
14

NONLINEAR REGRESSION
RSS
Conditional on b1 = 3

40

20

0

-7

-6

-5 –4.82

-4

b2

-3

The figure shows RSS plotted as a function of b2, conditional on b1 = 3. From this we see
that the optimal value of b2, conditional on b1 = 3, is –4.82.
15

NONLINEAR REGRESSION
RSS
Conditional on b2 = –4.82

40

20

0

2

2.94 3

b1

4

Next, holding b2 at –4.82, we look to improve on our guess for b1. The figure shows RSS as
a function of b1, conditional on b2 = –4.82. We see that the optimal value of b1 is 2.94.
16

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

We continue to do this both parameter estimates cease to change. The figure shows the
first 10 iterations.
17

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

Eventually, the estimates will reach limits. We will then have reached the values that yield
minimum RSS.
18

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

The limits are shown in the diagram as the horizontal lines. Convergence to them is
painfully slow because this is a very inefficient algorithm. More sophisticated ones usually
converge remarkably quickly.
19

NONLINEAR REGRESSION
4

3

2.60

2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3
-4

b2
–4.05

-5
-6

The limits must be the values from the transformed linear regression shown in the first
slideshow for this chapter: b1 = 2.60 and b2 = –4.05. They have been determined by the
same criterion, the minimization of RSS. All that we have done is to use a different method.
20

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

Here is the output for the present hyperbolic regression of e on g using nonlinear
regression. It is, as usual, Stata output, but output from other regression applications will
look similar.
21

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

The Stata command for a nonlinear regression is ‘nl’. This is followed by the hypothesized
mathematical relationship within parentheses. The parameters must be given names placed
within braces. Here b1 is {beta1} and b2 is {beta2}.
22

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

eˆ  2 . 60  4 . 05 z  2 . 60 

4 . 05
g

The output is effectively the same as the linear regression output in the first slideshow for
this chapter.
23

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3
2

1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The hyperbolic function imposes the constraint that the function plunges to minus infinity
for positive g as g approaches zero.
24

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

This feature can be relaxed by using the variation shown. Unlike the previous function, this
cannot be linearized by any kind of transformation. Here, nonlinear regression must be
used.
25

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

The output for this specification is shown, with most of the iteration messages deleted.

26

NONLINEAR REGRESSION

e  b1 

b2
b3  g

u

employment growth rate

3
2
1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The figure compares the original (black) and new (red) hyperbolic functions. The fit is a
considerable improvement, reflected in a higher R 2, 0.64 instead of 0.53.
27

Copyright Christopher Dougherty 2012.
These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 4.4 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2012.11.04


Slide 10

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Suppose you believe that a variable Y depends on a variable X according to the relationship
shown and you wish to obtain estimates of b1, b2, and b3 given data on Y and X.
1

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

There is no way of transforming the relationship to obtain a linear relationship, and so it is
not possible to apply the usual regression procedure.
2

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nevertheless, one can still use the principle of minimizing the sum of the squares of the
residuals to obtain estimates of the parameters. We will describe a simple nonlinear
regression algorithm that uses the principle. It consists of a series of repeated steps.
3

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.

You start by guessing plausible values for the parameters.

4

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

You calculate the corresponding fitted values of Y from the data on X, conditional on these
values of the parameters.
5

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.

You calculate the residual for each observation in the sample, and hence RSS, the sum of
the squares of the residuals.
6

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.

You then make small changes in one or more of your estimates of the parameters.

7

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

Using the new estimates of b1, b2, and b3, you re-calculate the fitted values of Y. Then you
re-calculate the residuals and RSS.
8

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.

If RSS is smaller than before, your new estimates of the parameters are better than the old
ones and you continue adjusting your estimates in the same direction. Otherwise, you
would try different adjustments.
9

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You repeat steps 5, 6, and 7 again and again until you are unable to make any changes in
the estimates of the parameters that would reduce RSS.
10

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You conclude that you have minimized RSS, and you can describe the final estimates of the
parameters as the least squares estimates.
11

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

It should be emphasized that, long ago, mathematicians devised sophisticated techniques
to minimize the number of steps required by algorithms of this type.
12

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

We will return to the relationship between employment growth rate, e, and GDP growth rate,
g, in the first slideshow for this chapter. e and g are hypothesized to be related as shown.
13

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

According to this specification, as g becomes large, e will tend to a limit of b1. Looking at
the figure, we see that the maximum value of e is about 3. So we will take this as our initial
value for b1. We then hunt for the optimal value of b2, conditional on this guess for b1.
14

NONLINEAR REGRESSION
RSS
Conditional on b1 = 3

40

20

0

-7

-6

-5 –4.82

-4

b2

-3

The figure shows RSS plotted as a function of b2, conditional on b1 = 3. From this we see
that the optimal value of b2, conditional on b1 = 3, is –4.82.
15

NONLINEAR REGRESSION
RSS
Conditional on b2 = –4.82

40

20

0

2

2.94 3

b1

4

Next, holding b2 at –4.82, we look to improve on our guess for b1. The figure shows RSS as
a function of b1, conditional on b2 = –4.82. We see that the optimal value of b1 is 2.94.
16

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

We continue to do this both parameter estimates cease to change. The figure shows the
first 10 iterations.
17

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

Eventually, the estimates will reach limits. We will then have reached the values that yield
minimum RSS.
18

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

The limits are shown in the diagram as the horizontal lines. Convergence to them is
painfully slow because this is a very inefficient algorithm. More sophisticated ones usually
converge remarkably quickly.
19

NONLINEAR REGRESSION
4

3

2.60

2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3
-4

b2
–4.05

-5
-6

The limits must be the values from the transformed linear regression shown in the first
slideshow for this chapter: b1 = 2.60 and b2 = –4.05. They have been determined by the
same criterion, the minimization of RSS. All that we have done is to use a different method.
20

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

Here is the output for the present hyperbolic regression of e on g using nonlinear
regression. It is, as usual, Stata output, but output from other regression applications will
look similar.
21

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

The Stata command for a nonlinear regression is ‘nl’. This is followed by the hypothesized
mathematical relationship within parentheses. The parameters must be given names placed
within braces. Here b1 is {beta1} and b2 is {beta2}.
22

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

eˆ  2 . 60  4 . 05 z  2 . 60 

4 . 05
g

The output is effectively the same as the linear regression output in the first slideshow for
this chapter.
23

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3
2

1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The hyperbolic function imposes the constraint that the function plunges to minus infinity
for positive g as g approaches zero.
24

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

This feature can be relaxed by using the variation shown. Unlike the previous function, this
cannot be linearized by any kind of transformation. Here, nonlinear regression must be
used.
25

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

The output for this specification is shown, with most of the iteration messages deleted.

26

NONLINEAR REGRESSION

e  b1 

b2
b3  g

u

employment growth rate

3
2
1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The figure compares the original (black) and new (red) hyperbolic functions. The fit is a
considerable improvement, reflected in a higher R 2, 0.64 instead of 0.53.
27

Copyright Christopher Dougherty 2012.
These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 4.4 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2012.11.04


Slide 11

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Suppose you believe that a variable Y depends on a variable X according to the relationship
shown and you wish to obtain estimates of b1, b2, and b3 given data on Y and X.
1

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

There is no way of transforming the relationship to obtain a linear relationship, and so it is
not possible to apply the usual regression procedure.
2

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nevertheless, one can still use the principle of minimizing the sum of the squares of the
residuals to obtain estimates of the parameters. We will describe a simple nonlinear
regression algorithm that uses the principle. It consists of a series of repeated steps.
3

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.

You start by guessing plausible values for the parameters.

4

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

You calculate the corresponding fitted values of Y from the data on X, conditional on these
values of the parameters.
5

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.

You calculate the residual for each observation in the sample, and hence RSS, the sum of
the squares of the residuals.
6

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.

You then make small changes in one or more of your estimates of the parameters.

7

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

Using the new estimates of b1, b2, and b3, you re-calculate the fitted values of Y. Then you
re-calculate the residuals and RSS.
8

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.

If RSS is smaller than before, your new estimates of the parameters are better than the old
ones and you continue adjusting your estimates in the same direction. Otherwise, you
would try different adjustments.
9

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You repeat steps 5, 6, and 7 again and again until you are unable to make any changes in
the estimates of the parameters that would reduce RSS.
10

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You conclude that you have minimized RSS, and you can describe the final estimates of the
parameters as the least squares estimates.
11

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

It should be emphasized that, long ago, mathematicians devised sophisticated techniques
to minimize the number of steps required by algorithms of this type.
12

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

We will return to the relationship between employment growth rate, e, and GDP growth rate,
g, in the first slideshow for this chapter. e and g are hypothesized to be related as shown.
13

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

According to this specification, as g becomes large, e will tend to a limit of b1. Looking at
the figure, we see that the maximum value of e is about 3. So we will take this as our initial
value for b1. We then hunt for the optimal value of b2, conditional on this guess for b1.
14

NONLINEAR REGRESSION
RSS
Conditional on b1 = 3

40

20

0

-7

-6

-5 –4.82

-4

b2

-3

The figure shows RSS plotted as a function of b2, conditional on b1 = 3. From this we see
that the optimal value of b2, conditional on b1 = 3, is –4.82.
15

NONLINEAR REGRESSION
RSS
Conditional on b2 = –4.82

40

20

0

2

2.94 3

b1

4

Next, holding b2 at –4.82, we look to improve on our guess for b1. The figure shows RSS as
a function of b1, conditional on b2 = –4.82. We see that the optimal value of b1 is 2.94.
16

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

We continue to do this both parameter estimates cease to change. The figure shows the
first 10 iterations.
17

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

Eventually, the estimates will reach limits. We will then have reached the values that yield
minimum RSS.
18

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

The limits are shown in the diagram as the horizontal lines. Convergence to them is
painfully slow because this is a very inefficient algorithm. More sophisticated ones usually
converge remarkably quickly.
19

NONLINEAR REGRESSION
4

3

2.60

2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3
-4

b2
–4.05

-5
-6

The limits must be the values from the transformed linear regression shown in the first
slideshow for this chapter: b1 = 2.60 and b2 = –4.05. They have been determined by the
same criterion, the minimization of RSS. All that we have done is to use a different method.
20

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

Here is the output for the present hyperbolic regression of e on g using nonlinear
regression. It is, as usual, Stata output, but output from other regression applications will
look similar.
21

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

The Stata command for a nonlinear regression is ‘nl’. This is followed by the hypothesized
mathematical relationship within parentheses. The parameters must be given names placed
within braces. Here b1 is {beta1} and b2 is {beta2}.
22

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

eˆ  2 . 60  4 . 05 z  2 . 60 

4 . 05
g

The output is effectively the same as the linear regression output in the first slideshow for
this chapter.
23

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3
2

1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The hyperbolic function imposes the constraint that the function plunges to minus infinity
for positive g as g approaches zero.
24

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

This feature can be relaxed by using the variation shown. Unlike the previous function, this
cannot be linearized by any kind of transformation. Here, nonlinear regression must be
used.
25

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

The output for this specification is shown, with most of the iteration messages deleted.

26

NONLINEAR REGRESSION

e  b1 

b2
b3  g

u

employment growth rate

3
2
1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The figure compares the original (black) and new (red) hyperbolic functions. The fit is a
considerable improvement, reflected in a higher R 2, 0.64 instead of 0.53.
27

Copyright Christopher Dougherty 2012.
These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 4.4 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2012.11.04


Slide 12

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Suppose you believe that a variable Y depends on a variable X according to the relationship
shown and you wish to obtain estimates of b1, b2, and b3 given data on Y and X.
1

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

There is no way of transforming the relationship to obtain a linear relationship, and so it is
not possible to apply the usual regression procedure.
2

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nevertheless, one can still use the principle of minimizing the sum of the squares of the
residuals to obtain estimates of the parameters. We will describe a simple nonlinear
regression algorithm that uses the principle. It consists of a series of repeated steps.
3

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.

You start by guessing plausible values for the parameters.

4

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

You calculate the corresponding fitted values of Y from the data on X, conditional on these
values of the parameters.
5

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.

You calculate the residual for each observation in the sample, and hence RSS, the sum of
the squares of the residuals.
6

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.

You then make small changes in one or more of your estimates of the parameters.

7

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

Using the new estimates of b1, b2, and b3, you re-calculate the fitted values of Y. Then you
re-calculate the residuals and RSS.
8

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.

If RSS is smaller than before, your new estimates of the parameters are better than the old
ones and you continue adjusting your estimates in the same direction. Otherwise, you
would try different adjustments.
9

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You repeat steps 5, 6, and 7 again and again until you are unable to make any changes in
the estimates of the parameters that would reduce RSS.
10

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You conclude that you have minimized RSS, and you can describe the final estimates of the
parameters as the least squares estimates.
11

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

It should be emphasized that, long ago, mathematicians devised sophisticated techniques
to minimize the number of steps required by algorithms of this type.
12

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

We will return to the relationship between employment growth rate, e, and GDP growth rate,
g, in the first slideshow for this chapter. e and g are hypothesized to be related as shown.
13

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

According to this specification, as g becomes large, e will tend to a limit of b1. Looking at
the figure, we see that the maximum value of e is about 3. So we will take this as our initial
value for b1. We then hunt for the optimal value of b2, conditional on this guess for b1.
14

NONLINEAR REGRESSION
RSS
Conditional on b1 = 3

40

20

0

-7

-6

-5 –4.82

-4

b2

-3

The figure shows RSS plotted as a function of b2, conditional on b1 = 3. From this we see
that the optimal value of b2, conditional on b1 = 3, is –4.82.
15

NONLINEAR REGRESSION
RSS
Conditional on b2 = –4.82

40

20

0

2

2.94 3

b1

4

Next, holding b2 at –4.82, we look to improve on our guess for b1. The figure shows RSS as
a function of b1, conditional on b2 = –4.82. We see that the optimal value of b1 is 2.94.
16

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

We continue to do this both parameter estimates cease to change. The figure shows the
first 10 iterations.
17

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

Eventually, the estimates will reach limits. We will then have reached the values that yield
minimum RSS.
18

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

The limits are shown in the diagram as the horizontal lines. Convergence to them is
painfully slow because this is a very inefficient algorithm. More sophisticated ones usually
converge remarkably quickly.
19

NONLINEAR REGRESSION
4

3

2.60

2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3
-4

b2
–4.05

-5
-6

The limits must be the values from the transformed linear regression shown in the first
slideshow for this chapter: b1 = 2.60 and b2 = –4.05. They have been determined by the
same criterion, the minimization of RSS. All that we have done is to use a different method.
20

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

Here is the output for the present hyperbolic regression of e on g using nonlinear
regression. It is, as usual, Stata output, but output from other regression applications will
look similar.
21

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

The Stata command for a nonlinear regression is ‘nl’. This is followed by the hypothesized
mathematical relationship within parentheses. The parameters must be given names placed
within braces. Here b1 is {beta1} and b2 is {beta2}.
22

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

eˆ  2 . 60  4 . 05 z  2 . 60 

4 . 05
g

The output is effectively the same as the linear regression output in the first slideshow for
this chapter.
23

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3
2

1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The hyperbolic function imposes the constraint that the function plunges to minus infinity
for positive g as g approaches zero.
24

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS = 9.01051

e = b1 + b2/(b3 + g) + u

      Source |       SS       df       MS         Number of obs =        25
-------------+------------------------------      R-squared     =    0.6352
       Model |  15.6914659     2  7.84573293      Adj R-squared =    0.6021
    Residual |  9.01050957    22  .409568617      Root MSE      =  .6399755
-------------+------------------------------      Res. dev.     =  45.43482
       Total |  24.7019754    24  1.02924898
------------------------------------------------------------------------------
           e |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      /beta1 |   5.467548   2.826401     1.93   0.066    -.3940491    11.32914
      /beta2 |   -31.0764   41.78914    -0.74   0.465    -117.7418    55.58897
      /beta3 |   4.148589   4.870437     0.85   0.404     -5.95208    14.24926
------------------------------------------------------------------------------
Parameter beta1 taken as constant term in model & ANOVA table

This constraint can be relaxed by using the variation shown. Unlike the previous function,
this one cannot be linearized by any kind of transformation, so here nonlinear regression
must be used.
25
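With three parameters the default starting values may be far from the optimum; the search
above needed 15 iterations. One might instead start near the two-parameter solution. A
hedged sketch, again assuming the {name=value} syntax; the value 1 for beta3 is an
arbitrary illustrative guess:

. nl (e = {beta1=2.6} + {beta2=-4.05}/({beta3=1} + g))    // starting values are illustrative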

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
[same output as on the previous slide, iterations 0–15 abbreviated]

The output for this specification is the one shown above, with most of the iteration messages deleted.

26
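A standard check, not made in the original slides, is an F test of the restriction b3 = 0,
comparing the residual sums of squares of the two specifications:

F(1, 22) = (11.5816 – 9.0105) / (9.0105 / 22) ≈ 6.28

This exceeds the critical value at the 5 percent level (about 4.3), so the improvement in
fit is significant even though the individual t statistics for beta2 and beta3 are not.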

NONLINEAR REGRESSION

e  b1 

b2
b3  g

u

employment growth rate

3
2
1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The figure compares the original (black) and new (red) hyperbolic functions. The fit is a
considerable improvement, reflected in a higher R², 0.64 instead of 0.53.
27
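A figure like this can be reproduced by saving the fitted values from each specification and
overlaying them on the scatter diagram. A minimal sketch, assuming predict's yhat option is
available after nl; fit1 and fit2 are variable names chosen here:

. nl (e = {beta1} + {beta2}/g)
. predict fit1, yhat                  // fitted values, two-parameter model
. nl (e = {beta1} + {beta2}/({beta3} + g))
. predict fit2, yhat                  // fitted values, three-parameter model
. twoway scatter e g || line fit1 g, sort || line fit2 g, sort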

Copyright Christopher Dougherty 2012.
These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 4.4 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2012.11.04


Slide 13

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Suppose you believe that a variable Y depends on a variable X according to the relationship
shown and you wish to obtain estimates of b1, b2, and b3 given data on Y and X.
1

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

There is no way of transforming the relationship to obtain a linear relationship, and so it is
not possible to apply the usual regression procedure.
2

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nevertheless, one can still use the principle of minimizing the sum of the squares of the
residuals to obtain estimates of the parameters. We will describe a simple nonlinear
regression algorithm that uses the principle. It consists of a series of repeated steps.
3

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.

You start by guessing plausible values for the parameters.

4

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

You calculate the corresponding fitted values of Y from the data on X, conditional on these
values of the parameters.
5

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.

You calculate the residual for each observation in the sample, and hence RSS, the sum of
the squares of the residuals.
6

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.

You then make small changes in one or more of your estimates of the parameters.

7

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

Using the new estimates of b1, b2, and b3, you re-calculate the fitted values of Y. Then you
re-calculate the residuals and RSS.
8

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.

If RSS is smaller than before, your new estimates of the parameters are better than the old
ones and you continue adjusting your estimates in the same direction. Otherwise, you
would try different adjustments.
9

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You repeat steps 5, 6, and 7 again and again until you are unable to make any changes in
the estimates of the parameters that would reduce RSS.
10

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You conclude that you have minimized RSS, and you can describe the final estimates of the
parameters as the least squares estimates.
11

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

It should be emphasized that, long ago, mathematicians devised sophisticated techniques
to minimize the number of steps required by algorithms of this type.
12

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

We will return to the relationship between employment growth rate, e, and GDP growth rate,
g, in the first slideshow for this chapter. e and g are hypothesized to be related as shown.
13

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

According to this specification, as g becomes large, e will tend to a limit of b1. Looking at
the figure, we see that the maximum value of e is about 3. So we will take this as our initial
value for b1. We then hunt for the optimal value of b2, conditional on this guess for b1.
14

NONLINEAR REGRESSION
RSS
Conditional on b1 = 3

40

20

0

-7

-6

-5 –4.82

-4

b2

-3

The figure shows RSS plotted as a function of b2, conditional on b1 = 3. From this we see
that the optimal value of b2, conditional on b1 = 3, is –4.82.
15

NONLINEAR REGRESSION
RSS
Conditional on b2 = –4.82

40

20

0

2

2.94 3

b1

4

Next, holding b2 at –4.82, we look to improve on our guess for b1. The figure shows RSS as
a function of b1, conditional on b2 = –4.82. We see that the optimal value of b1 is 2.94.
16

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

We continue to do this both parameter estimates cease to change. The figure shows the
first 10 iterations.
17

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

Eventually, the estimates will reach limits. We will then have reached the values that yield
minimum RSS.
18

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

The limits are shown in the diagram as the horizontal lines. Convergence to them is
painfully slow because this is a very inefficient algorithm. More sophisticated ones usually
converge remarkably quickly.
19

NONLINEAR REGRESSION
4

3

2.60

2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3
-4

b2
–4.05

-5
-6

The limits must be the values from the transformed linear regression shown in the first
slideshow for this chapter: b1 = 2.60 and b2 = –4.05. They have been determined by the
same criterion, the minimization of RSS. All that we have done is to use a different method.
20

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

Here is the output for the present hyperbolic regression of e on g using nonlinear
regression. It is, as usual, Stata output, but output from other regression applications will
look similar.
21

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

The Stata command for a nonlinear regression is ‘nl’. This is followed by the hypothesized
mathematical relationship within parentheses. The parameters must be given names placed
within braces. Here b1 is {beta1} and b2 is {beta2}.
22

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

eˆ  2 . 60  4 . 05 z  2 . 60 

4 . 05
g

The output is effectively the same as the linear regression output in the first slideshow for
this chapter.
23

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3
2

1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The hyperbolic function imposes the constraint that the function plunges to minus infinity
for positive g as g approaches zero.
24

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

This feature can be relaxed by using the variation shown. Unlike the previous function, this
cannot be linearized by any kind of transformation. Here, nonlinear regression must be
used.
25

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

The output for this specification is shown, with most of the iteration messages deleted.

26

NONLINEAR REGRESSION

e  b1 

b2
b3  g

u

employment growth rate

3
2
1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The figure compares the original (black) and new (red) hyperbolic functions. The fit is a
considerable improvement, reflected in a higher R 2, 0.64 instead of 0.53.
27

Copyright Christopher Dougherty 2012.
These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 4.4 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2012.11.04


Slide 14

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Suppose you believe that a variable Y depends on a variable X according to the relationship
shown and you wish to obtain estimates of b1, b2, and b3 given data on Y and X.
1

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

There is no way of transforming the relationship to obtain a linear relationship, and so it is
not possible to apply the usual regression procedure.
2

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nevertheless, one can still use the principle of minimizing the sum of the squares of the
residuals to obtain estimates of the parameters. We will describe a simple nonlinear
regression algorithm that uses the principle. It consists of a series of repeated steps.
3

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.

You start by guessing plausible values for the parameters.

4

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

You calculate the corresponding fitted values of Y from the data on X, conditional on these
values of the parameters.
5

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.

You calculate the residual for each observation in the sample, and hence RSS, the sum of
the squares of the residuals.
6

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.

You then make small changes in one or more of your estimates of the parameters.

7

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

Using the new estimates of b1, b2, and b3, you re-calculate the fitted values of Y. Then you
re-calculate the residuals and RSS.
8

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.

If RSS is smaller than before, your new estimates of the parameters are better than the old
ones and you continue adjusting your estimates in the same direction. Otherwise, you
would try different adjustments.
9

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You repeat steps 5, 6, and 7 again and again until you are unable to make any changes in
the estimates of the parameters that would reduce RSS.
10

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You conclude that you have minimized RSS, and you can describe the final estimates of the
parameters as the least squares estimates.
11

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

It should be emphasized that, long ago, mathematicians devised sophisticated techniques
to minimize the number of steps required by algorithms of this type.
12

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

We will return to the relationship between employment growth rate, e, and GDP growth rate,
g, in the first slideshow for this chapter. e and g are hypothesized to be related as shown.
13

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

According to this specification, as g becomes large, e will tend to a limit of b1. Looking at
the figure, we see that the maximum value of e is about 3. So we will take this as our initial
value for b1. We then hunt for the optimal value of b2, conditional on this guess for b1.
14

NONLINEAR REGRESSION
RSS
Conditional on b1 = 3

40

20

0

-7

-6

-5 –4.82

-4

b2

-3

The figure shows RSS plotted as a function of b2, conditional on b1 = 3. From this we see
that the optimal value of b2, conditional on b1 = 3, is –4.82.
15

NONLINEAR REGRESSION
RSS
Conditional on b2 = –4.82

40

20

0

2

2.94 3

b1

4

Next, holding b2 at –4.82, we look to improve on our guess for b1. The figure shows RSS as
a function of b1, conditional on b2 = –4.82. We see that the optimal value of b1 is 2.94.
16

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

We continue to do this both parameter estimates cease to change. The figure shows the
first 10 iterations.
17

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

Eventually, the estimates will reach limits. We will then have reached the values that yield
minimum RSS.
18

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

The limits are shown in the diagram as the horizontal lines. Convergence to them is
painfully slow because this is a very inefficient algorithm. More sophisticated ones usually
converge remarkably quickly.
19

NONLINEAR REGRESSION
4

3

2.60

2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3
-4

b2
–4.05

-5
-6

The limits must be the values from the transformed linear regression shown in the first
slideshow for this chapter: b1 = 2.60 and b2 = –4.05. They have been determined by the
same criterion, the minimization of RSS. All that we have done is to use a different method.
20

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

Here is the output for the present hyperbolic regression of e on g using nonlinear
regression. It is, as usual, Stata output, but output from other regression applications will
look similar.
21

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

The Stata command for a nonlinear regression is ‘nl’. This is followed by the hypothesized
mathematical relationship within parentheses. The parameters must be given names placed
within braces. Here b1 is {beta1} and b2 is {beta2}.
22

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

eˆ  2 . 60  4 . 05 z  2 . 60 

4 . 05
g

The output is effectively the same as the linear regression output in the first slideshow for
this chapter.
23

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3
2

1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The hyperbolic function imposes the constraint that the function plunges to minus infinity
for positive g as g approaches zero.
24

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

This feature can be relaxed by using the variation shown. Unlike the previous function, this
cannot be linearized by any kind of transformation. Here, nonlinear regression must be
used.
25

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

The output for this specification is shown, with most of the iteration messages deleted.

26

NONLINEAR REGRESSION

e  b1 

b2
b3  g

u

employment growth rate

3
2
1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The figure compares the original (black) and new (red) hyperbolic functions. The fit is a
considerable improvement, reflected in a higher R 2, 0.64 instead of 0.53.
27

Copyright Christopher Dougherty 2012.
These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 4.4 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2012.11.04


Slide 15

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Suppose you believe that a variable Y depends on a variable X according to the relationship
shown and you wish to obtain estimates of b1, b2, and b3 given data on Y and X.
1

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

There is no way of transforming the relationship to obtain a linear relationship, and so it is
not possible to apply the usual regression procedure.
2

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nevertheless, one can still use the principle of minimizing the sum of the squares of the
residuals to obtain estimates of the parameters. We will describe a simple nonlinear
regression algorithm that uses the principle. It consists of a series of repeated steps.
3

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.

You start by guessing plausible values for the parameters.

4

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

You calculate the corresponding fitted values of Y from the data on X, conditional on these
values of the parameters.
5

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.

You calculate the residual for each observation in the sample, and hence RSS, the sum of
the squares of the residuals.
6

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.

You then make small changes in one or more of your estimates of the parameters.

7

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

Using the new estimates of b1, b2, and b3, you re-calculate the fitted values of Y. Then you
re-calculate the residuals and RSS.
8

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.

If RSS is smaller than before, your new estimates of the parameters are better than the old
ones and you continue adjusting your estimates in the same direction. Otherwise, you
would try different adjustments.
9

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You repeat steps 5, 6, and 7 again and again until you are unable to make any changes in
the estimates of the parameters that would reduce RSS.
10

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You conclude that you have minimized RSS, and you can describe the final estimates of the
parameters as the least squares estimates.
11

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

It should be emphasized that, long ago, mathematicians devised sophisticated techniques
to minimize the number of steps required by algorithms of this type.
12

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

We will return to the relationship between employment growth rate, e, and GDP growth rate,
g, in the first slideshow for this chapter. e and g are hypothesized to be related as shown.
13

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

According to this specification, as g becomes large, e will tend to a limit of b1. Looking at
the figure, we see that the maximum value of e is about 3. So we will take this as our initial
value for b1. We then hunt for the optimal value of b2, conditional on this guess for b1.
14

NONLINEAR REGRESSION
RSS
Conditional on b1 = 3

40

20

0

-7

-6

-5 –4.82

-4

b2

-3

The figure shows RSS plotted as a function of b2, conditional on b1 = 3. From this we see
that the optimal value of b2, conditional on b1 = 3, is –4.82.
15

NONLINEAR REGRESSION
RSS
Conditional on b2 = –4.82

40

20

0

2

2.94 3

b1

4

Next, holding b2 at –4.82, we look to improve on our guess for b1. The figure shows RSS as
a function of b1, conditional on b2 = –4.82. We see that the optimal value of b1 is 2.94.
16

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

We continue to do this both parameter estimates cease to change. The figure shows the
first 10 iterations.
17

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

Eventually, the estimates will reach limits. We will then have reached the values that yield
minimum RSS.
18

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

The limits are shown in the diagram as the horizontal lines. Convergence to them is
painfully slow because this is a very inefficient algorithm. More sophisticated ones usually
converge remarkably quickly.
19

NONLINEAR REGRESSION
4

3

2.60

2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3
-4

b2
–4.05

-5
-6

The limits must be the values from the transformed linear regression shown in the first
slideshow for this chapter: b1 = 2.60 and b2 = –4.05. They have been determined by the
same criterion, the minimization of RSS. All that we have done is to use a different method.
20

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

Here is the output for the present hyperbolic regression of e on g using nonlinear
regression. It is, as usual, Stata output, but output from other regression applications will
look similar.
21

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

The Stata command for a nonlinear regression is ‘nl’. This is followed by the hypothesized
mathematical relationship within parentheses. The parameters must be given names placed
within braces. Here b1 is {beta1} and b2 is {beta2}.
22

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

eˆ  2 . 60  4 . 05 z  2 . 60 

4 . 05
g

The output is effectively the same as the linear regression output in the first slideshow for
this chapter.
23

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3
2

1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The hyperbolic function imposes the constraint that the function plunges to minus infinity
for positive g as g approaches zero.
24

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

This feature can be relaxed by using the variation shown. Unlike the previous function, this
cannot be linearized by any kind of transformation. Here, nonlinear regression must be
used.
25

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

The output for this specification is shown, with most of the iteration messages deleted.

26

NONLINEAR REGRESSION

e  b1 

b2
b3  g

u

employment growth rate

3
2
1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The figure compares the original (black) and new (red) hyperbolic functions. The fit is a
considerable improvement, reflected in a higher R 2, 0.64 instead of 0.53.
27

Copyright Christopher Dougherty 2012.
These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 4.4 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2012.11.04


Slide 16

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Suppose you believe that a variable Y depends on a variable X according to the relationship
shown and you wish to obtain estimates of b1, b2, and b3 given data on Y and X.
1

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

There is no way of transforming the relationship to obtain a linear relationship, and so it is
not possible to apply the usual regression procedure.
2

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nevertheless, one can still use the principle of minimizing the sum of the squares of the
residuals to obtain estimates of the parameters. We will describe a simple nonlinear
regression algorithm that uses the principle. It consists of a series of repeated steps.
3

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.

You start by guessing plausible values for the parameters.

4

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

You calculate the corresponding fitted values of Y from the data on X, conditional on these
values of the parameters.
5

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.

You calculate the residual for each observation in the sample, and hence RSS, the sum of
the squares of the residuals.
6

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.

You then make small changes in one or more of your estimates of the parameters.

7

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

Using the new estimates of b1, b2, and b3, you re-calculate the fitted values of Y. Then you
re-calculate the residuals and RSS.
8

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.

If RSS is smaller than before, your new estimates of the parameters are better than the old
ones and you continue adjusting your estimates in the same direction. Otherwise, you
would try different adjustments.
9

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You repeat steps 5, 6, and 7 again and again until you are unable to make any changes in
the estimates of the parameters that would reduce RSS.
10

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You conclude that you have minimized RSS, and you can describe the final estimates of the
parameters as the least squares estimates.
11

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

It should be emphasized that, long ago, mathematicians devised sophisticated techniques
to minimize the number of steps required by algorithms of this type.
12

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

We will return to the relationship between employment growth rate, e, and GDP growth rate,
g, in the first slideshow for this chapter. e and g are hypothesized to be related as shown.
13

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

According to this specification, as g becomes large, e will tend to a limit of b1. Looking at
the figure, we see that the maximum value of e is about 3. So we will take this as our initial
value for b1. We then hunt for the optimal value of b2, conditional on this guess for b1.
14

NONLINEAR REGRESSION
RSS
Conditional on b1 = 3

40

20

0

-7

-6

-5 –4.82

-4

b2

-3

The figure shows RSS plotted as a function of b2, conditional on b1 = 3. From this we see
that the optimal value of b2, conditional on b1 = 3, is –4.82.
15

NONLINEAR REGRESSION
RSS
Conditional on b2 = –4.82

40

20

0

2

2.94 3

b1

4

Next, holding b2 at –4.82, we look to improve on our guess for b1. The figure shows RSS as
a function of b1, conditional on b2 = –4.82. We see that the optimal value of b1 is 2.94.
16

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

We continue to do this both parameter estimates cease to change. The figure shows the
first 10 iterations.
17

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

Eventually, the estimates will reach limits. We will then have reached the values that yield
minimum RSS.
18

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

The limits are shown in the diagram as the horizontal lines. Convergence to them is
painfully slow because this is a very inefficient algorithm. More sophisticated ones usually
converge remarkably quickly.
19

NONLINEAR REGRESSION
4

3

2.60

2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3
-4

b2
–4.05

-5
-6

The limits must be the values from the transformed linear regression shown in the first
slideshow for this chapter: b1 = 2.60 and b2 = –4.05. They have been determined by the
same criterion, the minimization of RSS. All that we have done is to use a different method.
20

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

Here is the output for the present hyperbolic regression of e on g using nonlinear
regression. It is, as usual, Stata output, but output from other regression applications will
look similar.
21

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

The Stata command for a nonlinear regression is ‘nl’. This is followed by the hypothesized
mathematical relationship within parentheses. The parameters must be given names placed
within braces. Here b1 is {beta1} and b2 is {beta2}.
22

NONLINEAR REGRESSION
e = b1 + b2/g + u

. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:  residual SS = 11.58161
Iteration 1:  residual SS = 11.58161

      Source |       SS       df       MS
-------------+------------------------------      Number of obs =         25
       Model |  13.1203672     1  13.1203672      R-squared     =     0.5311
    Residual |  11.5816083    23  .503548186      Adj R-squared =     0.5108
-------------+------------------------------      Root MSE      =   .7096113
       Total |  24.7019754    24  1.02924898      Res. dev.     =   51.71049

------------------------------------------------------------------------------
           e |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      /beta1 |   2.604753   .3748821     6.95   0.000      1.82925    3.380256
      /beta2 |  -4.050817    .793579    -5.10   0.000     -5.69246   -2.409174
------------------------------------------------------------------------------

ê = 2.60 – 4.05z = 2.60 – 4.05/g

The output is effectively the same as the linear regression output in the first slideshow for
this chapter.
23

NONLINEAR REGRESSION

e = b1 + b2/g + u

[Figure: employment growth rate (vertical axis, –4 to 3) plotted against GDP growth rate (horizontal axis, 0 to 8), with the fitted hyperbolic function]

The hyperbolic function imposes the constraint that e plunges to minus infinity as g approaches zero from above.
24

NONLINEAR REGRESSION
e = b1 + b2/(b3 + g) + u

. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0:  residual SS = 11.58161
Iteration 1:  residual SS = 11.19238
.....................................
Iteration 15: residual SS =  9.01051

      Source |       SS       df       MS
-------------+------------------------------      Number of obs =         25
       Model |  15.6914659     2  7.84573293      R-squared     =     0.6352
    Residual |  9.01050957    22  .409568617      Adj R-squared =     0.6021
-------------+------------------------------      Root MSE      =   .6399755
       Total |  24.7019754    24  1.02924898      Res. dev.     =   45.43482

------------------------------------------------------------------------------
           e |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      /beta1 |   5.467548   2.826401     1.93   0.066    -.3940491    11.32914
      /beta2 |   -31.0764   41.78914    -0.74   0.465    -117.7418    55.58897
      /beta3 |   4.148589   4.870437     0.85   0.404     -5.95208    14.24926
------------------------------------------------------------------------------
Parameter beta1 taken as constant term in model & ANOVA table

This constraint can be relaxed by using the variation shown. Unlike the previous function, this one cannot be linearized by any kind of transformation, so here nonlinear regression must be used.
25
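A hedged sketch of the same three-parameter fit in Python, with stand-in data again and the two-parameter estimates (plus a small positive b3) as starting values. The flatter RSS surface in three dimensions is consistent with the 15 iterations Stata needed and with the large standard errors in the table.

import numpy as np
from scipy.optimize import curve_fit

def model3(g, b1, b2, b3):
    # Shifted hyperbola: e = b1 + b2/(b3 + g). No transformation makes
    # this linear in the parameters, so nonlinear least squares is needed.
    return b1 + b2 / (b3 + g)

# Stand-in data; replace with the 25 observations from the chapter.
rng = np.random.default_rng(0)
g = rng.uniform(0.5, 8.0, size=25)
e = 2.6 - 4.05 / g + rng.normal(0.0, 0.7, size=25)

popt, pcov = curve_fit(model3, g, e, p0=[2.60, -4.05, 0.1], maxfev=10000)
print(popt, np.sqrt(np.diag(pcov)))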

NONLINEAR REGRESSION
e = b1 + b2/(b3 + g) + u

. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0:  residual SS = 11.58161
Iteration 1:  residual SS = 11.19238
.....................................
Iteration 15: residual SS =  9.01051

      Source |       SS       df       MS
-------------+------------------------------      Number of obs =         25
       Model |  15.6914659     2  7.84573293      R-squared     =     0.6352
    Residual |  9.01050957    22  .409568617      Adj R-squared =     0.6021
-------------+------------------------------      Root MSE      =   .6399755
       Total |  24.7019754    24  1.02924898      Res. dev.     =   45.43482

------------------------------------------------------------------------------
           e |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      /beta1 |   5.467548   2.826401     1.93   0.066    -.3940491    11.32914
      /beta2 |   -31.0764   41.78914    -0.74   0.465    -117.7418    55.58897
      /beta3 |   4.148589   4.870437     0.85   0.404     -5.95208    14.24926
------------------------------------------------------------------------------
Parameter beta1 taken as constant term in model & ANOVA table

The output for this specification is shown, with most of the iteration messages deleted.

26

NONLINEAR REGRESSION

e = b1 + b2/(b3 + g) + u

[Figure: employment growth rate plotted against GDP growth rate, comparing the original (black) and new (red) fitted hyperbolic functions]

The figure compares the original (black) and new (red) hyperbolic functions. The fit is a
considerable improvement, reflected in a higher R², 0.64 instead of 0.53.
27
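The R² figures follow directly from the two ANOVA tables, since R² = 1 - RSS/TSS and the total sum of squares, 24.7019754, is the same in both regressions:

tss = 24.7019754
print(1 - 11.5816083 / tss)  # 0.5311: original hyperbola
print(1 - 9.01050957 / tss)  # 0.6352: shifted hyperbola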

Copyright Christopher Dougherty 2012.
These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 4.4 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2012.11.04


Slide 17

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Suppose you believe that a variable Y depends on a variable X according to the relationship
shown and you wish to obtain estimates of b1, b2, and b3 given data on Y and X.
1

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

There is no way of transforming the relationship to obtain a linear relationship, and so it is
not possible to apply the usual regression procedure.
2

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nevertheless, one can still use the principle of minimizing the sum of the squares of the
residuals to obtain estimates of the parameters. We will describe a simple nonlinear
regression algorithm that uses the principle. It consists of a series of repeated steps.
3

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.

You start by guessing plausible values for the parameters.

4

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

You calculate the corresponding fitted values of Y from the data on X, conditional on these
values of the parameters.
5

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.

You calculate the residual for each observation in the sample, and hence RSS, the sum of
the squares of the residuals.
6

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.

You then make small changes in one or more of your estimates of the parameters.

7

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

Using the new estimates of b1, b2, and b3, you re-calculate the fitted values of Y. Then you
re-calculate the residuals and RSS.
8

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.

If RSS is smaller than before, your new estimates of the parameters are better than the old
ones and you continue adjusting your estimates in the same direction. Otherwise, you
would try different adjustments.
9

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You repeat steps 5, 6, and 7 again and again until you are unable to make any changes in
the estimates of the parameters that would reduce RSS.
10

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You conclude that you have minimized RSS, and you can describe the final estimates of the
parameters as the least squares estimates.
11

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

It should be emphasized that, long ago, mathematicians devised sophisticated techniques
to minimize the number of steps required by algorithms of this type.
12

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

We will return to the relationship between employment growth rate, e, and GDP growth rate,
g, in the first slideshow for this chapter. e and g are hypothesized to be related as shown.
13

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

According to this specification, as g becomes large, e will tend to a limit of b1. Looking at
the figure, we see that the maximum value of e is about 3. So we will take this as our initial
value for b1. We then hunt for the optimal value of b2, conditional on this guess for b1.
14

NONLINEAR REGRESSION
RSS
Conditional on b1 = 3

40

20

0

-7

-6

-5 –4.82

-4

b2

-3

The figure shows RSS plotted as a function of b2, conditional on b1 = 3. From this we see
that the optimal value of b2, conditional on b1 = 3, is –4.82.
15

NONLINEAR REGRESSION
RSS
Conditional on b2 = –4.82

40

20

0

2

2.94 3

b1

4

Next, holding b2 at –4.82, we look to improve on our guess for b1. The figure shows RSS as
a function of b1, conditional on b2 = –4.82. We see that the optimal value of b1 is 2.94.
16

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

We continue to do this both parameter estimates cease to change. The figure shows the
first 10 iterations.
17

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

Eventually, the estimates will reach limits. We will then have reached the values that yield
minimum RSS.
18

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

The limits are shown in the diagram as the horizontal lines. Convergence to them is
painfully slow because this is a very inefficient algorithm. More sophisticated ones usually
converge remarkably quickly.
19

NONLINEAR REGRESSION
4

3

2.60

2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3
-4

b2
–4.05

-5
-6

The limits must be the values from the transformed linear regression shown in the first
slideshow for this chapter: b1 = 2.60 and b2 = –4.05. They have been determined by the
same criterion, the minimization of RSS. All that we have done is to use a different method.
20

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

Here is the output for the present hyperbolic regression of e on g using nonlinear
regression. It is, as usual, Stata output, but output from other regression applications will
look similar.
21

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

The Stata command for a nonlinear regression is ‘nl’. This is followed by the hypothesized
mathematical relationship within parentheses. The parameters must be given names placed
within braces. Here b1 is {beta1} and b2 is {beta2}.
22

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

eˆ  2 . 60  4 . 05 z  2 . 60 

4 . 05
g

The output is effectively the same as the linear regression output in the first slideshow for
this chapter.
23

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3
2

1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The hyperbolic function imposes the constraint that the function plunges to minus infinity
for positive g as g approaches zero.
24

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

This feature can be relaxed by using the variation shown. Unlike the previous function, this
cannot be linearized by any kind of transformation. Here, nonlinear regression must be
used.
25

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

The output for this specification is shown, with most of the iteration messages deleted.

26

NONLINEAR REGRESSION

e  b1 

b2
b3  g

u

employment growth rate

3
2
1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The figure compares the original (black) and new (red) hyperbolic functions. The fit is a
considerable improvement, reflected in a higher R 2, 0.64 instead of 0.53.
27

Copyright Christopher Dougherty 2012.
These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 4.4 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2012.11.04


Slide 18

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Suppose you believe that a variable Y depends on a variable X according to the relationship
shown and you wish to obtain estimates of b1, b2, and b3 given data on Y and X.
1

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

There is no way of transforming the relationship to obtain a linear relationship, and so it is
not possible to apply the usual regression procedure.
2

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nevertheless, one can still use the principle of minimizing the sum of the squares of the
residuals to obtain estimates of the parameters. We will describe a simple nonlinear
regression algorithm that uses the principle. It consists of a series of repeated steps.
3

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.

You start by guessing plausible values for the parameters.

4

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

You calculate the corresponding fitted values of Y from the data on X, conditional on these
values of the parameters.
5

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.

You calculate the residual for each observation in the sample, and hence RSS, the sum of
the squares of the residuals.
6

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.

You then make small changes in one or more of your estimates of the parameters.

7

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

Using the new estimates of b1, b2, and b3, you re-calculate the fitted values of Y. Then you
re-calculate the residuals and RSS.
8

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.

If RSS is smaller than before, your new estimates of the parameters are better than the old
ones and you continue adjusting your estimates in the same direction. Otherwise, you
would try different adjustments.
9

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You repeat steps 5, 6, and 7 again and again until you are unable to make any changes in
the estimates of the parameters that would reduce RSS.
10

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You conclude that you have minimized RSS, and you can describe the final estimates of the
parameters as the least squares estimates.
11

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

It should be emphasized that, long ago, mathematicians devised sophisticated techniques
to minimize the number of steps required by algorithms of this type.
12

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

We will return to the relationship between employment growth rate, e, and GDP growth rate,
g, in the first slideshow for this chapter. e and g are hypothesized to be related as shown.
13

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

According to this specification, as g becomes large, e will tend to a limit of b1. Looking at
the figure, we see that the maximum value of e is about 3. So we will take this as our initial
value for b1. We then hunt for the optimal value of b2, conditional on this guess for b1.
14

NONLINEAR REGRESSION
RSS
Conditional on b1 = 3

40

20

0

-7

-6

-5 –4.82

-4

b2

-3

The figure shows RSS plotted as a function of b2, conditional on b1 = 3. From this we see
that the optimal value of b2, conditional on b1 = 3, is –4.82.
15

NONLINEAR REGRESSION
RSS
Conditional on b2 = –4.82

40

20

0

2

2.94 3

b1

4

Next, holding b2 at –4.82, we look to improve on our guess for b1. The figure shows RSS as
a function of b1, conditional on b2 = –4.82. We see that the optimal value of b1 is 2.94.
16

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

We continue to do this both parameter estimates cease to change. The figure shows the
first 10 iterations.
17

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

Eventually, the estimates will reach limits. We will then have reached the values that yield
minimum RSS.
18

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

The limits are shown in the diagram as the horizontal lines. Convergence to them is
painfully slow because this is a very inefficient algorithm. More sophisticated ones usually
converge remarkably quickly.
19

NONLINEAR REGRESSION
4

3

2.60

2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3
-4

b2
–4.05

-5
-6

The limits must be the values from the transformed linear regression shown in the first
slideshow for this chapter: b1 = 2.60 and b2 = –4.05. They have been determined by the
same criterion, the minimization of RSS. All that we have done is to use a different method.
20

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

Here is the output for the present hyperbolic regression of e on g using nonlinear
regression. It is, as usual, Stata output, but output from other regression applications will
look similar.
21

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

The Stata command for a nonlinear regression is ‘nl’. This is followed by the hypothesized
mathematical relationship within parentheses. The parameters must be given names placed
within braces. Here b1 is {beta1} and b2 is {beta2}.
22

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

eˆ  2 . 60  4 . 05 z  2 . 60 

4 . 05
g

The output is effectively the same as the linear regression output in the first slideshow for
this chapter.
23

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3
2

1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The hyperbolic function imposes the constraint that the function plunges to minus infinity
for positive g as g approaches zero.
24

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

This feature can be relaxed by using the variation shown. Unlike the previous function, this
cannot be linearized by any kind of transformation. Here, nonlinear regression must be
used.
25

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

The output for this specification is shown, with most of the iteration messages deleted.

26

NONLINEAR REGRESSION

e  b1 

b2
b3  g

u

employment growth rate

3
2
1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The figure compares the original (black) and new (red) hyperbolic functions. The fit is a
considerable improvement, reflected in a higher R 2, 0.64 instead of 0.53.
27

Copyright Christopher Dougherty 2012.
These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 4.4 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2012.11.04


Slide 19

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Suppose you believe that a variable Y depends on a variable X according to the relationship
shown and you wish to obtain estimates of b1, b2, and b3 given data on Y and X.
1

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

There is no way of transforming the relationship to obtain a linear relationship, and so it is
not possible to apply the usual regression procedure.
2

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nevertheless, one can still use the principle of minimizing the sum of the squares of the
residuals to obtain estimates of the parameters. We will describe a simple nonlinear
regression algorithm that uses the principle. It consists of a series of repeated steps.
3

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.

You start by guessing plausible values for the parameters.

4

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

You calculate the corresponding fitted values of Y from the data on X, conditional on these
values of the parameters.
5

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.

You calculate the residual for each observation in the sample, and hence RSS, the sum of
the squares of the residuals.
6

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.

You then make small changes in one or more of your estimates of the parameters.

7

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

Using the new estimates of b1, b2, and b3, you re-calculate the fitted values of Y. Then you
re-calculate the residuals and RSS.
8

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.

If RSS is smaller than before, your new estimates of the parameters are better than the old
ones and you continue adjusting your estimates in the same direction. Otherwise, you
would try different adjustments.
9

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You repeat steps 5, 6, and 7 again and again until you are unable to make any changes in
the estimates of the parameters that would reduce RSS.
10

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You conclude that you have minimized RSS, and you can describe the final estimates of the
parameters as the least squares estimates.
11

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

It should be emphasized that, long ago, mathematicians devised sophisticated techniques
to minimize the number of steps required by algorithms of this type.
12

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

We will return to the relationship between employment growth rate, e, and GDP growth rate,
g, in the first slideshow for this chapter. e and g are hypothesized to be related as shown.
13

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

According to this specification, as g becomes large, e will tend to a limit of b1. Looking at
the figure, we see that the maximum value of e is about 3. So we will take this as our initial
value for b1. We then hunt for the optimal value of b2, conditional on this guess for b1.
14

NONLINEAR REGRESSION
RSS
Conditional on b1 = 3

40

20

0

-7

-6

-5 –4.82

-4

b2

-3

The figure shows RSS plotted as a function of b2, conditional on b1 = 3. From this we see
that the optimal value of b2, conditional on b1 = 3, is –4.82.
15

NONLINEAR REGRESSION
RSS
Conditional on b2 = –4.82

40

20

0

2

2.94 3

b1

4

Next, holding b2 at –4.82, we look to improve on our guess for b1. The figure shows RSS as
a function of b1, conditional on b2 = –4.82. We see that the optimal value of b1 is 2.94.
16

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

We continue to do this both parameter estimates cease to change. The figure shows the
first 10 iterations.
17

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

Eventually, the estimates will reach limits. We will then have reached the values that yield
minimum RSS.
18

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

The limits are shown in the diagram as the horizontal lines. Convergence to them is
painfully slow because this is a very inefficient algorithm. More sophisticated ones usually
converge remarkably quickly.
19

NONLINEAR REGRESSION
4

3

2.60

2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3
-4

b2
–4.05

-5
-6

The limits must be the values from the transformed linear regression shown in the first
slideshow for this chapter: b1 = 2.60 and b2 = –4.05. They have been determined by the
same criterion, the minimization of RSS. All that we have done is to use a different method.
20

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

Here is the output for the present hyperbolic regression of e on g using nonlinear
regression. It is, as usual, Stata output, but output from other regression applications will
look similar.
21

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

The Stata command for a nonlinear regression is ‘nl’. This is followed by the hypothesized
mathematical relationship within parentheses. The parameters must be given names placed
within braces. Here b1 is {beta1} and b2 is {beta2}.
22

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

eˆ  2 . 60  4 . 05 z  2 . 60 

4 . 05
g

The output is effectively the same as the linear regression output in the first slideshow for
this chapter.
23

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3
2

1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The hyperbolic function imposes the constraint that the function plunges to minus infinity
for positive g as g approaches zero.
24

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

This feature can be relaxed by using the variation shown. Unlike the previous function, this
cannot be linearized by any kind of transformation. Here, nonlinear regression must be
used.
25

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

The output for this specification is shown, with most of the iteration messages deleted.

26

NONLINEAR REGRESSION

e  b1 

b2
b3  g

u

employment growth rate

3
2
1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The figure compares the original (black) and new (red) hyperbolic functions. The fit is a
considerable improvement, reflected in a higher R 2, 0.64 instead of 0.53.
27

Copyright Christopher Dougherty 2012.
These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 4.4 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2012.11.04


Slide 20

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Suppose you believe that a variable Y depends on a variable X according to the relationship
shown and you wish to obtain estimates of b1, b2, and b3 given data on Y and X.
1

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

There is no way of transforming the relationship to obtain a linear relationship, and so it is
not possible to apply the usual regression procedure.
2

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nevertheless, one can still use the principle of minimizing the sum of the squares of the
residuals to obtain estimates of the parameters. We will describe a simple nonlinear
regression algorithm that uses the principle. It consists of a series of repeated steps.
3

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.

You start by guessing plausible values for the parameters.

4

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

You calculate the corresponding fitted values of Y from the data on X, conditional on these
values of the parameters.
5

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.

You calculate the residual for each observation in the sample, and hence RSS, the sum of
the squares of the residuals.
6

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.

You then make small changes in one or more of your estimates of the parameters.

7

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

Using the new estimates of b1, b2, and b3, you re-calculate the fitted values of Y. Then you
re-calculate the residuals and RSS.
8

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.

If RSS is smaller than before, your new estimates of the parameters are better than the old
ones and you continue adjusting your estimates in the same direction. Otherwise, you
would try different adjustments.
9

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You repeat steps 5, 6, and 7 again and again until you are unable to make any changes in
the estimates of the parameters that would reduce RSS.
10

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You conclude that you have minimized RSS, and you can describe the final estimates of the
parameters as the least squares estimates.
11

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

It should be emphasized that, long ago, mathematicians devised sophisticated techniques
to minimize the number of steps required by algorithms of this type.
12

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

We will return to the relationship between employment growth rate, e, and GDP growth rate,
g, in the first slideshow for this chapter. e and g are hypothesized to be related as shown.
13

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

According to this specification, as g becomes large, e will tend to a limit of b1. Looking at
the figure, we see that the maximum value of e is about 3. So we will take this as our initial
value for b1. We then hunt for the optimal value of b2, conditional on this guess for b1.
14

NONLINEAR REGRESSION
RSS
Conditional on b1 = 3

40

20

0

-7

-6

-5 –4.82

-4

b2

-3

The figure shows RSS plotted as a function of b2, conditional on b1 = 3. From this we see
that the optimal value of b2, conditional on b1 = 3, is –4.82.
15

NONLINEAR REGRESSION
RSS
Conditional on b2 = –4.82

40

20

0

2

2.94 3

b1

4

Next, holding b2 at –4.82, we look to improve on our guess for b1. The figure shows RSS as
a function of b1, conditional on b2 = –4.82. We see that the optimal value of b1 is 2.94.
16

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

We continue to do this both parameter estimates cease to change. The figure shows the
first 10 iterations.
17

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

Eventually, the estimates will reach limits. We will then have reached the values that yield
minimum RSS.
18

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

The limits are shown in the diagram as the horizontal lines. Convergence to them is
painfully slow because this is a very inefficient algorithm. More sophisticated ones usually
converge remarkably quickly.
19

NONLINEAR REGRESSION
4

3

2.60

2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3
-4

b2
–4.05

-5
-6

The limits must be the values from the transformed linear regression shown in the first
slideshow for this chapter: b1 = 2.60 and b2 = –4.05. They have been determined by the
same criterion, the minimization of RSS. All that we have done is to use a different method.
20

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

Here is the output for the present hyperbolic regression of e on g using nonlinear
regression. It is, as usual, Stata output, but output from other regression applications will
look similar.
21

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
[Output as on the previous slide.]

The Stata command for a nonlinear regression is ‘nl’. This is followed by the hypothesized
mathematical relationship within parentheses. The parameters must be given names placed
within braces. Here b1 is {beta1} and b2 is {beta2}.
22
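For readers working outside Stata, a roughly equivalent call (a sketch, not a line-for-line translation of nl) is SciPy's curve_fit, which likewise takes the hypothesized function and starting values for the named parameters. The data arrays are hypothetical, as before.

import numpy as np
from scipy.optimize import curve_fit

g = np.array([1.5, 2.0, 2.5, 3.5, 4.0, 5.5, 7.0])   # hypothetical data
e = np.array([0.2, 0.8, 1.1, 1.6, 1.7, 2.1, 2.3])

def model(g, beta1, beta2):
    return beta1 + beta2 / g          # e = beta1 + beta2/g

popt, pcov = curve_fit(model, g, e, p0=[3.0, -4.8])
print(popt)                           # point estimates of beta1 and beta2
print(np.sqrt(np.diag(pcov)))         # standard errors, as in the Stata table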

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
[Output as on the previous slide.]

ê = 2.60 – 4.05z = 2.60 – 4.05/g

The output is effectively the same as the linear regression output in the first slideshow for
this chapter.
23

NONLINEAR REGRESSION

e = b1 + b2/g + u

[Figure: scatter diagram of employment growth rate against GDP growth rate, with the fitted
hyperbolic function.]

The hyperbolic function imposes the constraint that, for positive g, the fitted function
plunges to minus infinity as g approaches zero.
24

NONLINEAR REGRESSION
e = b1 + b2/(b3 + g) + u

. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0:  residual SS = 11.58161
Iteration 1:  residual SS = 11.19238
.....................................
Iteration 15: residual SS = 9.01051

      Source |       SS       df       MS         Number of obs =        25
-------------+------------------------------      R-squared     =    0.6352
       Model |  15.6914659     2  7.84573293      Adj R-squared =    0.6021
    Residual |  9.01050957    22  .409568617      Root MSE      =  .6399755
-------------+------------------------------      Res. dev.     =  45.43482
       Total |  24.7019754    24  1.02924898
------------------------------------------------------------------------------
           e |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      /beta1 |   5.467548   2.826401     1.93   0.066    -.3940491    11.32914
      /beta2 |   -31.0764   41.78914    -0.74   0.465    -117.7418    55.58897
      /beta3 |   4.148589   4.870437     0.85   0.404     -5.95208    14.24926
------------------------------------------------------------------------------
Parameter beta1 taken as constant term in model & ANOVA table

This feature can be relaxed by using the variation shown. Unlike the previous function, this
cannot be linearized by any kind of transformation. Here, nonlinear regression must be
used.
25
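A sketch of the same fit in Python, for comparison; because this variant cannot be linearized, the starting values passed to the solver matter more. The data are hypothetical as before, and the maxfev setting is only a precaution for the solver, not part of the slideshow's method.

import numpy as np
from scipy.optimize import curve_fit

g = np.array([1.5, 2.0, 2.5, 3.5, 4.0, 5.5, 7.0])   # hypothetical data
e = np.array([0.2, 0.8, 1.1, 1.6, 1.7, 2.1, 2.3])

def model(g, beta1, beta2, beta3):
    return beta1 + beta2 / (beta3 + g)   # e = beta1 + beta2/(beta3 + g)

popt, pcov = curve_fit(model, g, e, p0=[3.0, -5.0, 1.0], maxfev=10000)
print(popt)                # point estimates of beta1, beta2, beta3
print(model(0.0, *popt))   # finite at g = 0, unlike beta1 + beta2/g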

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
[Output as on the previous slide.]

The output for this specification is shown, with most of the iteration messages deleted.

26

NONLINEAR REGRESSION

e = b1 + b2/(b3 + g) + u

[Figure: scatter diagram of employment growth rate against GDP growth rate, with the
original two-parameter hyperbolic function (black) and the three-parameter function (red).]

The figure compares the original (black) and new (red) hyperbolic functions. The fit is a
considerable improvement, reflected in a higher R², 0.64 instead of 0.53.
27
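As a quick check on the two figures quoted above, both R-squared values follow directly from the sums of squares printed in the Stata tables:

tss = 24.7019754                  # total SS, the same in both regressions
print(1 - 11.5816083 / tss)       # two-parameter model:   0.5311
print(1 - 9.01050957 / tss)       # three-parameter model: 0.6352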

Copyright Christopher Dougherty 2012.
These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 4.4 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2012.11.04


Slide 21

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Suppose you believe that a variable Y depends on a variable X according to the relationship
shown and you wish to obtain estimates of b1, b2, and b3 given data on Y and X.
1

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

There is no way of transforming the relationship to obtain a linear relationship, and so it is
not possible to apply the usual regression procedure.
2

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nevertheless, one can still use the principle of minimizing the sum of the squares of the
residuals to obtain estimates of the parameters. We will describe a simple nonlinear
regression algorithm that uses the principle. It consists of a series of repeated steps.
3

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.

You start by guessing plausible values for the parameters.

4

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

You calculate the corresponding fitted values of Y from the data on X, conditional on these
values of the parameters.
5

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.

You calculate the residual for each observation in the sample, and hence RSS, the sum of
the squares of the residuals.
6

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.

You then make small changes in one or more of your estimates of the parameters.

7

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

Using the new estimates of b1, b2, and b3, you re-calculate the fitted values of Y. Then you
re-calculate the residuals and RSS.
8

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.

If RSS is smaller than before, your new estimates of the parameters are better than the old
ones and you continue adjusting your estimates in the same direction. Otherwise, you
would try different adjustments.
9

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You repeat steps 5, 6, and 7 again and again until you are unable to make any changes in
the estimates of the parameters that would reduce RSS.
10

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You conclude that you have minimized RSS, and you can describe the final estimates of the
parameters as the least squares estimates.
11

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

It should be emphasized that, long ago, mathematicians devised sophisticated techniques
to minimize the number of steps required by algorithms of this type.
12

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

We will return to the relationship between employment growth rate, e, and GDP growth rate,
g, in the first slideshow for this chapter. e and g are hypothesized to be related as shown.
13

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

According to this specification, as g becomes large, e will tend to a limit of b1. Looking at
the figure, we see that the maximum value of e is about 3. So we will take this as our initial
value for b1. We then hunt for the optimal value of b2, conditional on this guess for b1.
14

NONLINEAR REGRESSION
RSS
Conditional on b1 = 3

40

20

0

-7

-6

-5 –4.82

-4

b2

-3

The figure shows RSS plotted as a function of b2, conditional on b1 = 3. From this we see
that the optimal value of b2, conditional on b1 = 3, is –4.82.
15

NONLINEAR REGRESSION
RSS
Conditional on b2 = –4.82

40

20

0

2

2.94 3

b1

4

Next, holding b2 at –4.82, we look to improve on our guess for b1. The figure shows RSS as
a function of b1, conditional on b2 = –4.82. We see that the optimal value of b1 is 2.94.
16

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

We continue to do this both parameter estimates cease to change. The figure shows the
first 10 iterations.
17

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

Eventually, the estimates will reach limits. We will then have reached the values that yield
minimum RSS.
18

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

The limits are shown in the diagram as the horizontal lines. Convergence to them is
painfully slow because this is a very inefficient algorithm. More sophisticated ones usually
converge remarkably quickly.
19

NONLINEAR REGRESSION
4

3

2.60

2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3
-4

b2
–4.05

-5
-6

The limits must be the values from the transformed linear regression shown in the first
slideshow for this chapter: b1 = 2.60 and b2 = –4.05. They have been determined by the
same criterion, the minimization of RSS. All that we have done is to use a different method.
20

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

Here is the output for the present hyperbolic regression of e on g using nonlinear
regression. It is, as usual, Stata output, but output from other regression applications will
look similar.
21

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

The Stata command for a nonlinear regression is ‘nl’. This is followed by the hypothesized
mathematical relationship within parentheses. The parameters must be given names placed
within braces. Here b1 is {beta1} and b2 is {beta2}.
22

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

eˆ  2 . 60  4 . 05 z  2 . 60 

4 . 05
g

The output is effectively the same as the linear regression output in the first slideshow for
this chapter.
23

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3
2

1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The hyperbolic function imposes the constraint that the function plunges to minus infinity
for positive g as g approaches zero.
24

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

This feature can be relaxed by using the variation shown. Unlike the previous function, this
cannot be linearized by any kind of transformation. Here, nonlinear regression must be
used.
25

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

The output for this specification is shown, with most of the iteration messages deleted.

26

NONLINEAR REGRESSION

e  b1 

b2
b3  g

u

employment growth rate

3
2
1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The figure compares the original (black) and new (red) hyperbolic functions. The fit is a
considerable improvement, reflected in a higher R 2, 0.64 instead of 0.53.
27

Copyright Christopher Dougherty 2012.
These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 4.4 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2012.11.04


Slide 22

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Suppose you believe that a variable Y depends on a variable X according to the relationship
shown and you wish to obtain estimates of b1, b2, and b3 given data on Y and X.
1

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

There is no way of transforming the relationship to obtain a linear relationship, and so it is
not possible to apply the usual regression procedure.
2

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nevertheless, one can still use the principle of minimizing the sum of the squares of the
residuals to obtain estimates of the parameters. We will describe a simple nonlinear
regression algorithm that uses the principle. It consists of a series of repeated steps.
3

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.

You start by guessing plausible values for the parameters.

4

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

You calculate the corresponding fitted values of Y from the data on X, conditional on these
values of the parameters.
5

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.

You calculate the residual for each observation in the sample, and hence RSS, the sum of
the squares of the residuals.
6

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.

You then make small changes in one or more of your estimates of the parameters.

7

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

Using the new estimates of b1, b2, and b3, you re-calculate the fitted values of Y. Then you
re-calculate the residuals and RSS.
8

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.

If RSS is smaller than before, your new estimates of the parameters are better than the old
ones and you continue adjusting your estimates in the same direction. Otherwise, you
would try different adjustments.
9

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You repeat steps 5, 6, and 7 again and again until you are unable to make any changes in
the estimates of the parameters that would reduce RSS.
10

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You conclude that you have minimized RSS, and you can describe the final estimates of the
parameters as the least squares estimates.
11

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

It should be emphasized that, long ago, mathematicians devised sophisticated techniques
to minimize the number of steps required by algorithms of this type.
12

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

We will return to the relationship between employment growth rate, e, and GDP growth rate,
g, in the first slideshow for this chapter. e and g are hypothesized to be related as shown.
13

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

According to this specification, as g becomes large, e will tend to a limit of b1. Looking at
the figure, we see that the maximum value of e is about 3. So we will take this as our initial
value for b1. We then hunt for the optimal value of b2, conditional on this guess for b1.
14

NONLINEAR REGRESSION
RSS
Conditional on b1 = 3

40

20

0

-7

-6

-5 –4.82

-4

b2

-3

The figure shows RSS plotted as a function of b2, conditional on b1 = 3. From this we see
that the optimal value of b2, conditional on b1 = 3, is –4.82.
15

NONLINEAR REGRESSION
RSS
Conditional on b2 = –4.82

40

20

0

2

2.94 3

b1

4

Next, holding b2 at –4.82, we look to improve on our guess for b1. The figure shows RSS as
a function of b1, conditional on b2 = –4.82. We see that the optimal value of b1 is 2.94.
16

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

We continue to do this both parameter estimates cease to change. The figure shows the
first 10 iterations.
17

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

Eventually, the estimates will reach limits. We will then have reached the values that yield
minimum RSS.
18

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

The limits are shown in the diagram as the horizontal lines. Convergence to them is
painfully slow because this is a very inefficient algorithm. More sophisticated ones usually
converge remarkably quickly.
19

NONLINEAR REGRESSION
4

3

2.60

2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3
-4

b2
–4.05

-5
-6

The limits must be the values from the transformed linear regression shown in the first
slideshow for this chapter: b1 = 2.60 and b2 = –4.05. They have been determined by the
same criterion, the minimization of RSS. All that we have done is to use a different method.
20

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

Here is the output for the present hyperbolic regression of e on g using nonlinear
regression. It is, as usual, Stata output, but output from other regression applications will
look similar.
21

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

The Stata command for a nonlinear regression is ‘nl’. This is followed by the hypothesized
mathematical relationship within parentheses. The parameters must be given names placed
within braces. Here b1 is {beta1} and b2 is {beta2}.
22

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

eˆ  2 . 60  4 . 05 z  2 . 60 

4 . 05
g

The output is effectively the same as the linear regression output in the first slideshow for
this chapter.
23

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3
2

1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The hyperbolic function imposes the constraint that the function plunges to minus infinity
for positive g as g approaches zero.
24

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

This feature can be relaxed by using the variation shown. Unlike the previous function, this
cannot be linearized by any kind of transformation. Here, nonlinear regression must be
used.
25

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

The output for this specification is shown, with most of the iteration messages deleted.

26

NONLINEAR REGRESSION

e  b1 

b2
b3  g

u

employment growth rate

3
2
1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The figure compares the original (black) and new (red) hyperbolic functions. The fit is a
considerable improvement, reflected in a higher R 2, 0.64 instead of 0.53.
27

Copyright Christopher Dougherty 2012.
These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 4.4 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2012.11.04


Slide 23

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Suppose you believe that a variable Y depends on a variable X according to the relationship
shown and you wish to obtain estimates of b1, b2, and b3 given data on Y and X.
1

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

There is no way of transforming the relationship to obtain a linear relationship, and so it is
not possible to apply the usual regression procedure.
2

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nevertheless, one can still use the principle of minimizing the sum of the squares of the
residuals to obtain estimates of the parameters. We will describe a simple nonlinear
regression algorithm that uses the principle. It consists of a series of repeated steps.
3

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.

You start by guessing plausible values for the parameters.

4

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

You calculate the corresponding fitted values of Y from the data on X, conditional on these
values of the parameters.
5

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.

You calculate the residual for each observation in the sample, and hence RSS, the sum of
the squares of the residuals.
6

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.

You then make small changes in one or more of your estimates of the parameters.

7

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

Using the new estimates of b1, b2, and b3, you re-calculate the fitted values of Y. Then you
re-calculate the residuals and RSS.
8

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.

If RSS is smaller than before, your new estimates of the parameters are better than the old
ones and you continue adjusting your estimates in the same direction. Otherwise, you
would try different adjustments.
9

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You repeat steps 5, 6, and 7 again and again until you are unable to make any changes in
the estimates of the parameters that would reduce RSS.
10

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You conclude that you have minimized RSS, and you can describe the final estimates of the
parameters as the least squares estimates.
11

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

It should be emphasized that, long ago, mathematicians devised sophisticated techniques
to minimize the number of steps required by algorithms of this type.
12

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

We will return to the relationship between employment growth rate, e, and GDP growth rate,
g, in the first slideshow for this chapter. e and g are hypothesized to be related as shown.
13

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

According to this specification, as g becomes large, e will tend to a limit of b1. Looking at
the figure, we see that the maximum value of e is about 3. So we will take this as our initial
value for b1. We then hunt for the optimal value of b2, conditional on this guess for b1.
14

NONLINEAR REGRESSION
RSS
Conditional on b1 = 3

40

20

0

-7

-6

-5 –4.82

-4

b2

-3

The figure shows RSS plotted as a function of b2, conditional on b1 = 3. From this we see
that the optimal value of b2, conditional on b1 = 3, is –4.82.
15

NONLINEAR REGRESSION
RSS
Conditional on b2 = –4.82

40

20

0

2

2.94 3

b1

4

Next, holding b2 at –4.82, we look to improve on our guess for b1. The figure shows RSS as
a function of b1, conditional on b2 = –4.82. We see that the optimal value of b1 is 2.94.
16

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

We continue to do this both parameter estimates cease to change. The figure shows the
first 10 iterations.
17

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

Eventually, the estimates will reach limits. We will then have reached the values that yield
minimum RSS.
18

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

The limits are shown in the diagram as the horizontal lines. Convergence to them is
painfully slow because this is a very inefficient algorithm. More sophisticated ones usually
converge remarkably quickly.
19

NONLINEAR REGRESSION
4

3

2.60

2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3
-4

b2
–4.05

-5
-6

The limits must be the values from the transformed linear regression shown in the first
slideshow for this chapter: b1 = 2.60 and b2 = –4.05. They have been determined by the
same criterion, the minimization of RSS. All that we have done is to use a different method.
20

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

Here is the output for the present hyperbolic regression of e on g using nonlinear
regression. It is, as usual, Stata output, but output from other regression applications will
look similar.
21

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

The Stata command for a nonlinear regression is ‘nl’. This is followed by the hypothesized
mathematical relationship within parentheses. The parameters must be given names placed
within braces. Here b1 is {beta1} and b2 is {beta2}.
22

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

eˆ  2 . 60  4 . 05 z  2 . 60 

4 . 05
g

The output is effectively the same as the linear regression output in the first slideshow for
this chapter.
23

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3
2

1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The hyperbolic function imposes the constraint that the function plunges to minus infinity
for positive g as g approaches zero.
24

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

This feature can be relaxed by using the variation shown. Unlike the previous function, this
cannot be linearized by any kind of transformation. Here, nonlinear regression must be
used.
25

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

The output for this specification is shown, with most of the iteration messages deleted.

26

NONLINEAR REGRESSION

e  b1 

b2
b3  g

u

employment growth rate

3
2
1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The figure compares the original (black) and new (red) hyperbolic functions. The fit is a
considerable improvement, reflected in a higher R 2, 0.64 instead of 0.53.
27

Copyright Christopher Dougherty 2012.
These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 4.4 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2012.11.04


Slide 24

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Suppose you believe that a variable Y depends on a variable X according to the relationship
shown and you wish to obtain estimates of b1, b2, and b3 given data on Y and X.
1

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

There is no way of transforming the relationship to obtain a linear relationship, and so it is
not possible to apply the usual regression procedure.
2

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nevertheless, one can still use the principle of minimizing the sum of the squares of the
residuals to obtain estimates of the parameters. We will describe a simple nonlinear
regression algorithm that uses the principle. It consists of a series of repeated steps.
3

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.

You start by guessing plausible values for the parameters.

4

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

You calculate the corresponding fitted values of Y from the data on X, conditional on these
values of the parameters.
5

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.

You calculate the residual for each observation in the sample, and hence RSS, the sum of
the squares of the residuals.
6

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.

You then make small changes in one or more of your estimates of the parameters.

7

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

Using the new estimates of b1, b2, and b3, you re-calculate the fitted values of Y. Then you
re-calculate the residuals and RSS.
8

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.

If RSS is smaller than before, your new estimates of the parameters are better than the old
ones and you continue adjusting your estimates in the same direction. Otherwise, you
would try different adjustments.
9

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You repeat steps 5, 6, and 7 again and again until you are unable to make any changes in
the estimates of the parameters that would reduce RSS.
10

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You conclude that you have minimized RSS, and you can describe the final estimates of the
parameters as the least squares estimates.
11

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

It should be emphasized that, long ago, mathematicians devised sophisticated techniques
to minimize the number of steps required by algorithms of this type.
12

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

We will return to the relationship between employment growth rate, e, and GDP growth rate,
g, in the first slideshow for this chapter. e and g are hypothesized to be related as shown.
13

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

According to this specification, as g becomes large, e will tend to a limit of b1. Looking at
the figure, we see that the maximum value of e is about 3. So we will take this as our initial
value for b1. We then hunt for the optimal value of b2, conditional on this guess for b1.
14

NONLINEAR REGRESSION
RSS
Conditional on b1 = 3

40

20

0

-7

-6

-5 –4.82

-4

b2

-3

The figure shows RSS plotted as a function of b2, conditional on b1 = 3. From this we see
that the optimal value of b2, conditional on b1 = 3, is –4.82.
15

NONLINEAR REGRESSION
RSS
Conditional on b2 = –4.82

40

20

0

2

2.94 3

b1

4

Next, holding b2 at –4.82, we look to improve on our guess for b1. The figure shows RSS as
a function of b1, conditional on b2 = –4.82. We see that the optimal value of b1 is 2.94.
16

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

We continue to do this both parameter estimates cease to change. The figure shows the
first 10 iterations.
17

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

Eventually, the estimates will reach limits. We will then have reached the values that yield
minimum RSS.
18

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

The limits are shown in the diagram as the horizontal lines. Convergence to them is
painfully slow because this is a very inefficient algorithm. More sophisticated ones usually
converge remarkably quickly.
19

NONLINEAR REGRESSION
4

3

2.60

2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3
-4

b2
–4.05

-5
-6

The limits must be the values from the transformed linear regression shown in the first
slideshow for this chapter: b1 = 2.60 and b2 = –4.05. They have been determined by the
same criterion, the minimization of RSS. All that we have done is to use a different method.
20

NONLINEAR REGRESSION
e = b1 + b2/g + u

. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:  residual SS = 11.58161
Iteration 1:  residual SS = 11.58161

      Source |       SS       df       MS
-------------+------------------------------         Number of obs =        25
       Model |  13.1203672     1  13.1203672         R-squared     =    0.5311
    Residual |  11.5816083    23  .503548186         Adj R-squared =    0.5108
-------------+------------------------------         Root MSE      =  .7096113
       Total |  24.7019754    24  1.02924898         Res. dev.     =  51.71049

------------------------------------------------------------------------------
           e |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      /beta1 |   2.604753   .3748821     6.95   0.000      1.82925    3.380256
      /beta2 |  -4.050817    .793579    -5.10   0.000     -5.69246   -2.409174
------------------------------------------------------------------------------

Here is the output for the present hyperbolic regression of e on g using nonlinear
regression. It is, as usual, Stata output, but output from other regression applications will
look similar.
21

NONLINEAR REGRESSION
e = b1 + b2/g + u

. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:  residual SS = 11.58161
Iteration 1:  residual SS = 11.58161

      Source |       SS       df       MS
-------------+------------------------------         Number of obs =        25
       Model |  13.1203672     1  13.1203672         R-squared     =    0.5311
    Residual |  11.5816083    23  .503548186         Adj R-squared =    0.5108
-------------+------------------------------         Root MSE      =  .7096113
       Total |  24.7019754    24  1.02924898         Res. dev.     =  51.71049

------------------------------------------------------------------------------
           e |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      /beta1 |   2.604753   .3748821     6.95   0.000      1.82925    3.380256
      /beta2 |  -4.050817    .793579    -5.10   0.000     -5.69246   -2.409174
------------------------------------------------------------------------------

The Stata command for a nonlinear regression is ‘nl’. This is followed by the hypothesized
mathematical relationship within parentheses. The parameters must be given names placed
within braces. Here b1 is {beta1} and b2 is {beta2}.
22
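
For readers working outside Stata, a comparable call (an assumption on my part; the slides only show nl) is scipy.optimize.curve_fit in Python, which performs the same nonlinear least squares minimization. The data arrays below are hypothetical stand-ins for the 25 observations.

import numpy as np
from scipy.optimize import curve_fit

g = np.array([0.5, 1.0, 1.5, 2.5, 3.0, 4.0, 5.5, 7.0])   # hypothetical data
e = np.array([-1.8, -0.9, 0.2, 1.1, 1.4, 1.9, 2.1, 2.4])

def hyperbola(g, beta1, beta2):
    # same specification as the Stata command: e = beta1 + beta2/g
    return beta1 + beta2 / g

beta, cov = curve_fit(hyperbola, g, e, p0=[3.0, -4.82])
se = np.sqrt(np.diag(cov))   # standard errors, as in the Std. Err. column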

NONLINEAR REGRESSION
e = b1 + b2/g + u

. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:  residual SS = 11.58161
Iteration 1:  residual SS = 11.58161

      Source |       SS       df       MS
-------------+------------------------------         Number of obs =        25
       Model |  13.1203672     1  13.1203672         R-squared     =    0.5311
    Residual |  11.5816083    23  .503548186         Adj R-squared =    0.5108
-------------+------------------------------         Root MSE      =  .7096113
       Total |  24.7019754    24  1.02924898         Res. dev.     =  51.71049

------------------------------------------------------------------------------
           e |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      /beta1 |   2.604753   .3748821     6.95   0.000      1.82925    3.380256
      /beta2 |  -4.050817    .793579    -5.10   0.000     -5.69246   -2.409174
------------------------------------------------------------------------------

ê = 2.60 – 4.05z = 2.60 – 4.05/g

The output is effectively the same as the linear regression output in the first slideshow for
this chapter.
23
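
The equivalence is easy to verify directly: defining z = 1/g makes the model linear in the parameters, so ordinary least squares on z must reproduce the nl estimates. A minimal sketch with hypothetical data:

import numpy as np

g = np.array([0.5, 1.0, 1.5, 2.5, 3.0, 4.0, 5.5, 7.0])   # hypothetical data
e = np.array([-1.8, -0.9, 0.2, 1.1, 1.4, 1.9, 2.1, 2.4])

z = 1.0 / g
X = np.column_stack([np.ones_like(z), z])    # intercept b1, slope b2
b1_hat, b2_hat = np.linalg.lstsq(X, e, rcond=None)[0]
# with the actual data these would equal the /beta1 and /beta2 estimates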

NONLINEAR REGRESSION

e = b1 + b2/g + u

[Figure: fitted hyperbolic function and the scatter of employment growth rate against GDP growth rate]

The hyperbolic function imposes the constraint that, for positive g, the fitted relationship plunges to minus infinity as g approaches zero.
24

NONLINEAR REGRESSION
e = b1 + b2/(b3 + g) + u

. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0:  residual SS = 11.58161
Iteration 1:  residual SS = 11.19238
.....................................
Iteration 15: residual SS =  9.01051

      Source |       SS       df       MS
-------------+------------------------------         Number of obs =        25
       Model |  15.6914659     2  7.84573293         R-squared     =    0.6352
    Residual |  9.01050957    22  .409568617         Adj R-squared =    0.6021
-------------+------------------------------         Root MSE      =  .6399755
       Total |  24.7019754    24  1.02924898         Res. dev.     =  45.43482

------------------------------------------------------------------------------
           e |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      /beta1 |   5.467548   2.826401     1.93   0.066    -.3940491    11.32914
      /beta2 |   -31.0764   41.78914    -0.74   0.465    -117.7418    55.58897
      /beta3 |   4.148589   4.870437     0.85   0.404     -5.95208    14.24926
------------------------------------------------------------------------------
Parameter beta1 taken as constant term in model & ANOVA table

This constraint can be relaxed by using the variation shown. Unlike the previous function, this one cannot be linearized by any kind of transformation. Here, nonlinear regression must be used.
25
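
A general nonlinear least squares routine handles this specification just as easily. A sketch in Python (hypothetical data; the slides use Stata's nl), seeding the search at the two-parameter estimates with beta3 = 0, which is one natural choice of starting values:

import numpy as np
from scipy.optimize import curve_fit

g = np.array([0.5, 1.0, 1.5, 2.5, 3.0, 4.0, 5.5, 7.0])   # hypothetical data
e = np.array([-1.8, -0.9, 0.2, 1.1, 1.4, 1.9, 2.1, 2.4])

def shifted_hyperbola(g, beta1, beta2, beta3):
    # e = beta1 + beta2/(beta3 + g): the shift beta3 removes the forced
    # vertical asymptote at g = 0, and no transformation linearizes it
    return beta1 + beta2 / (beta3 + g)

beta, cov = curve_fit(shifted_hyperbola, g, e, p0=[2.60, -4.05, 0.0])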

NONLINEAR REGRESSION
e = b1 + b2/(b3 + g) + u

. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0:  residual SS = 11.58161
Iteration 1:  residual SS = 11.19238
.....................................
Iteration 15: residual SS =  9.01051

      Source |       SS       df       MS
-------------+------------------------------         Number of obs =        25
       Model |  15.6914659     2  7.84573293         R-squared     =    0.6352
    Residual |  9.01050957    22  .409568617         Adj R-squared =    0.6021
-------------+------------------------------         Root MSE      =  .6399755
       Total |  24.7019754    24  1.02924898         Res. dev.     =  45.43482

------------------------------------------------------------------------------
           e |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      /beta1 |   5.467548   2.826401     1.93   0.066    -.3940491    11.32914
      /beta2 |   -31.0764   41.78914    -0.74   0.465    -117.7418    55.58897
      /beta3 |   4.148589   4.870437     0.85   0.404     -5.95208    14.24926
------------------------------------------------------------------------------
Parameter beta1 taken as constant term in model & ANOVA table

The output for this specification is shown, with most of the iteration messages deleted.

26

NONLINEAR REGRESSION

e = b1 + b2/(b3 + g) + u

[Figure: original (black) and new (red) hyperbolic functions plotted over the scatter of employment growth rate against GDP growth rate]

The figure compares the original (black) and new (red) hyperbolic functions. The fit is a considerable improvement, reflected in a higher R², 0.64 instead of 0.53.
27

Copyright Christopher Dougherty 2012.
These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 4.4 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2012.11.04


Slide 25

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Suppose you believe that a variable Y depends on a variable X according to the relationship
shown and you wish to obtain estimates of b1, b2, and b3 given data on Y and X.
1

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

There is no way of transforming the relationship to obtain a linear relationship, and so it is
not possible to apply the usual regression procedure.
2

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nevertheless, one can still use the principle of minimizing the sum of the squares of the
residuals to obtain estimates of the parameters. We will describe a simple nonlinear
regression algorithm that uses the principle. It consists of a series of repeated steps.
3

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.

You start by guessing plausible values for the parameters.

4

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

You calculate the corresponding fitted values of Y from the data on X, conditional on these
values of the parameters.
5

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.

You calculate the residual for each observation in the sample, and hence RSS, the sum of
the squares of the residuals.
6

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.

You then make small changes in one or more of your estimates of the parameters.

7

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

Using the new estimates of b1, b2, and b3, you re-calculate the fitted values of Y. Then you
re-calculate the residuals and RSS.
8

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.

If RSS is smaller than before, your new estimates of the parameters are better than the old
ones and you continue adjusting your estimates in the same direction. Otherwise, you
would try different adjustments.
9

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You repeat steps 5, 6, and 7 again and again until you are unable to make any changes in
the estimates of the parameters that would reduce RSS.
10

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You conclude that you have minimized RSS, and you can describe the final estimates of the
parameters as the least squares estimates.
11

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

It should be emphasized that, long ago, mathematicians devised sophisticated techniques
to minimize the number of steps required by algorithms of this type.
12

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

We will return to the relationship between employment growth rate, e, and GDP growth rate,
g, in the first slideshow for this chapter. e and g are hypothesized to be related as shown.
13

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

According to this specification, as g becomes large, e will tend to a limit of b1. Looking at
the figure, we see that the maximum value of e is about 3. So we will take this as our initial
value for b1. We then hunt for the optimal value of b2, conditional on this guess for b1.
14

NONLINEAR REGRESSION
RSS
Conditional on b1 = 3

40

20

0

-7

-6

-5 –4.82

-4

b2

-3

The figure shows RSS plotted as a function of b2, conditional on b1 = 3. From this we see
that the optimal value of b2, conditional on b1 = 3, is –4.82.
15

NONLINEAR REGRESSION
RSS
Conditional on b2 = –4.82

40

20

0

2

2.94 3

b1

4

Next, holding b2 at –4.82, we look to improve on our guess for b1. The figure shows RSS as
a function of b1, conditional on b2 = –4.82. We see that the optimal value of b1 is 2.94.
16

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

We continue to do this both parameter estimates cease to change. The figure shows the
first 10 iterations.
17

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

Eventually, the estimates will reach limits. We will then have reached the values that yield
minimum RSS.
18

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

The limits are shown in the diagram as the horizontal lines. Convergence to them is
painfully slow because this is a very inefficient algorithm. More sophisticated ones usually
converge remarkably quickly.
19

NONLINEAR REGRESSION
4

3

2.60

2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3
-4

b2
–4.05

-5
-6

The limits must be the values from the transformed linear regression shown in the first
slideshow for this chapter: b1 = 2.60 and b2 = –4.05. They have been determined by the
same criterion, the minimization of RSS. All that we have done is to use a different method.
20

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

Here is the output for the present hyperbolic regression of e on g using nonlinear
regression. It is, as usual, Stata output, but output from other regression applications will
look similar.
21

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

The Stata command for a nonlinear regression is ‘nl’. This is followed by the hypothesized
mathematical relationship within parentheses. The parameters must be given names placed
within braces. Here b1 is {beta1} and b2 is {beta2}.
22

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

eˆ  2 . 60  4 . 05 z  2 . 60 

4 . 05
g

The output is effectively the same as the linear regression output in the first slideshow for
this chapter.
23

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3
2

1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The hyperbolic function imposes the constraint that the function plunges to minus infinity
for positive g as g approaches zero.
24

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

This feature can be relaxed by using the variation shown. Unlike the previous function, this
cannot be linearized by any kind of transformation. Here, nonlinear regression must be
used.
25

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

The output for this specification is shown, with most of the iteration messages deleted.

26

NONLINEAR REGRESSION

e  b1 

b2
b3  g

u

employment growth rate

3
2
1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The figure compares the original (black) and new (red) hyperbolic functions. The fit is a
considerable improvement, reflected in a higher R 2, 0.64 instead of 0.53.
27

Copyright Christopher Dougherty 2012.
These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 4.4 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2012.11.04


Slide 26

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Suppose you believe that a variable Y depends on a variable X according to the relationship
shown and you wish to obtain estimates of b1, b2, and b3 given data on Y and X.
1

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

There is no way of transforming the relationship to obtain a linear relationship, and so it is
not possible to apply the usual regression procedure.
2

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nevertheless, one can still use the principle of minimizing the sum of the squares of the
residuals to obtain estimates of the parameters. We will describe a simple nonlinear
regression algorithm that uses the principle. It consists of a series of repeated steps.
3

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.

You start by guessing plausible values for the parameters.

4

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

You calculate the corresponding fitted values of Y from the data on X, conditional on these
values of the parameters.
5

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.

You calculate the residual for each observation in the sample, and hence RSS, the sum of
the squares of the residuals.
6

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.

You then make small changes in one or more of your estimates of the parameters.

7

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

Using the new estimates of b1, b2, and b3, you re-calculate the fitted values of Y. Then you
re-calculate the residuals and RSS.
8

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.

If RSS is smaller than before, your new estimates of the parameters are better than the old
ones and you continue adjusting your estimates in the same direction. Otherwise, you
would try different adjustments.
9

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You repeat steps 5, 6, and 7 again and again until you are unable to make any changes in
the estimates of the parameters that would reduce RSS.
10

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You conclude that you have minimized RSS, and you can describe the final estimates of the
parameters as the least squares estimates.
11

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

It should be emphasized that, long ago, mathematicians devised sophisticated techniques
to minimize the number of steps required by algorithms of this type.
12

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

We will return to the relationship between employment growth rate, e, and GDP growth rate,
g, in the first slideshow for this chapter. e and g are hypothesized to be related as shown.
13

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

According to this specification, as g becomes large, e will tend to a limit of b1. Looking at
the figure, we see that the maximum value of e is about 3. So we will take this as our initial
value for b1. We then hunt for the optimal value of b2, conditional on this guess for b1.
14

NONLINEAR REGRESSION
RSS
Conditional on b1 = 3

40

20

0

-7

-6

-5 –4.82

-4

b2

-3

The figure shows RSS plotted as a function of b2, conditional on b1 = 3. From this we see
that the optimal value of b2, conditional on b1 = 3, is –4.82.
15

NONLINEAR REGRESSION
RSS
Conditional on b2 = –4.82

40

20

0

2

2.94 3

b1

4

Next, holding b2 at –4.82, we look to improve on our guess for b1. The figure shows RSS as
a function of b1, conditional on b2 = –4.82. We see that the optimal value of b1 is 2.94.
16

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

We continue to do this both parameter estimates cease to change. The figure shows the
first 10 iterations.
17

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

Eventually, the estimates will reach limits. We will then have reached the values that yield
minimum RSS.
18

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

The limits are shown in the diagram as the horizontal lines. Convergence to them is
painfully slow because this is a very inefficient algorithm. More sophisticated ones usually
converge remarkably quickly.
19

NONLINEAR REGRESSION
4

3

2.60

2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3
-4

b2
–4.05

-5
-6

The limits must be the values from the transformed linear regression shown in the first
slideshow for this chapter: b1 = 2.60 and b2 = –4.05. They have been determined by the
same criterion, the minimization of RSS. All that we have done is to use a different method.
20

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

Here is the output for the present hyperbolic regression of e on g using nonlinear
regression. It is, as usual, Stata output, but output from other regression applications will
look similar.
21

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

The Stata command for a nonlinear regression is ‘nl’. This is followed by the hypothesized
mathematical relationship within parentheses. The parameters must be given names placed
within braces. Here b1 is {beta1} and b2 is {beta2}.
22

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

eˆ  2 . 60  4 . 05 z  2 . 60 

4 . 05
g

The output is effectively the same as the linear regression output in the first slideshow for
this chapter.
23

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3
2

1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The hyperbolic function imposes the constraint that the function plunges to minus infinity
for positive g as g approaches zero.
24

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

This feature can be relaxed by using the variation shown. Unlike the previous function, this
cannot be linearized by any kind of transformation. Here, nonlinear regression must be
used.
25

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

The output for this specification is shown, with most of the iteration messages deleted.

26

NONLINEAR REGRESSION

e  b1 

b2
b3  g

u

employment growth rate

3
2
1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The figure compares the original (black) and new (red) hyperbolic functions. The fit is a
considerable improvement, reflected in a higher R 2, 0.64 instead of 0.53.
27

Copyright Christopher Dougherty 2012.
These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 4.4 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2012.11.04


Slide 27

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Suppose you believe that a variable Y depends on a variable X according to the relationship
shown and you wish to obtain estimates of b1, b2, and b3 given data on Y and X.
1

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

There is no way of transforming the relationship to obtain a linear relationship, and so it is
not possible to apply the usual regression procedure.
2

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nevertheless, one can still use the principle of minimizing the sum of the squares of the
residuals to obtain estimates of the parameters. We will describe a simple nonlinear
regression algorithm that uses the principle. It consists of a series of repeated steps.
3

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.

You start by guessing plausible values for the parameters.

4

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

You calculate the corresponding fitted values of Y from the data on X, conditional on these
values of the parameters.
5

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.

You calculate the residual for each observation in the sample, and hence RSS, the sum of
the squares of the residuals.
6

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.

You then make small changes in one or more of your estimates of the parameters.

7

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

Using the new estimates of b1, b2, and b3, you re-calculate the fitted values of Y. Then you
re-calculate the residuals and RSS.
8

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.

If RSS is smaller than before, your new estimates of the parameters are better than the old
ones and you continue adjusting your estimates in the same direction. Otherwise, you
would try different adjustments.
9

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You repeat steps 5, 6, and 7 again and again until you are unable to make any changes in
the estimates of the parameters that would reduce RSS.
10

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You conclude that you have minimized RSS, and you can describe the final estimates of the
parameters as the least squares estimates.
11

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

It should be emphasized that, long ago, mathematicians devised sophisticated techniques
to minimize the number of steps required by algorithms of this type.
12

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

We will return to the relationship between employment growth rate, e, and GDP growth rate,
g, in the first slideshow for this chapter. e and g are hypothesized to be related as shown.
13

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

According to this specification, as g becomes large, e will tend to a limit of b1. Looking at
the figure, we see that the maximum value of e is about 3. So we will take this as our initial
value for b1. We then hunt for the optimal value of b2, conditional on this guess for b1.
14

NONLINEAR REGRESSION
RSS
Conditional on b1 = 3

40

20

0

-7

-6

-5 –4.82

-4

b2

-3

The figure shows RSS plotted as a function of b2, conditional on b1 = 3. From this we see
that the optimal value of b2, conditional on b1 = 3, is –4.82.
15

NONLINEAR REGRESSION
RSS
Conditional on b2 = –4.82

40

20

0

2

2.94 3

b1

4

Next, holding b2 at –4.82, we look to improve on our guess for b1. The figure shows RSS as
a function of b1, conditional on b2 = –4.82. We see that the optimal value of b1 is 2.94.
16

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

We continue to do this both parameter estimates cease to change. The figure shows the
first 10 iterations.
17

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

Eventually, the estimates will reach limits. We will then have reached the values that yield
minimum RSS.
18

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

The limits are shown in the diagram as the horizontal lines. Convergence to them is
painfully slow because this is a very inefficient algorithm. More sophisticated ones usually
converge remarkably quickly.
19

NONLINEAR REGRESSION
4

3

2.60

2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3
-4

b2
–4.05

-5
-6

The limits must be the values from the transformed linear regression shown in the first
slideshow for this chapter: b1 = 2.60 and b2 = –4.05. They have been determined by the
same criterion, the minimization of RSS. All that we have done is to use a different method.
20

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

Here is the output for the present hyperbolic regression of e on g using nonlinear
regression. It is, as usual, Stata output, but output from other regression applications will
look similar.
21

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

The Stata command for a nonlinear regression is ‘nl’. This is followed by the hypothesized
mathematical relationship within parentheses. The parameters must be given names placed
within braces. Here b1 is {beta1} and b2 is {beta2}.
22

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/g)
(obs = 25)
Iteration 0:
Iteration 1:

residual SS =
residual SS =

e  b1 

b2
g

u

11.58161
11.58161

Source |
SS
df
MS
-------------+-----------------------------Model | 13.1203672
1 13.1203672
Residual | 11.5816083
23 .503548186
-------------+-----------------------------Total | 24.7019754
24 1.02924898

Number of obs
R-squared
Adj R-squared
Root MSE
Res. dev.

=
=
=
=
=

25
0.5311
0.5108
.7096113
51.71049

-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
2.604753
.3748821
6.95
0.000
1.82925
3.380256
/beta2 | -4.050817
.793579
-5.10
0.000
-5.69246
-2.409174
------------------------------------------------------------------------------

eˆ  2 . 60  4 . 05 z  2 . 60 

4 . 05
g

The output is effectively the same as the linear regression output in the first slideshow for
this chapter.
23

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3
2

1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The hyperbolic function imposes the constraint that the function plunges to minus infinity
for positive g as g approaches zero.
24

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

This feature can be relaxed by using the variation shown. Unlike the previous function, this
cannot be linearized by any kind of transformation. Here, nonlinear regression must be
used.
25

NONLINEAR REGRESSION
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)
Iteration 0: residual SS = 11.58161
Iteration 1: residual SS = 11.19238
.....................................
Iteration 15: residual SS =
9.01051

e  b1 

b2
b3  g

u

Source |
SS
df
MS
-------------+-----------------------------Number of obs =
25
Model | 15.6914659
2 7.84573293
R-squared
=
0.6352
Residual | 9.01050957
22 .409568617
Adj R-squared =
0.6021
-------------+-----------------------------Root MSE
= .6399755
Total | 24.7019754
24 1.02924898
Res. dev.
= 45.43482
-----------------------------------------------------------------------------e |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------/beta1 |
5.467548
2.826401
1.93
0.066
-.3940491
11.32914
/beta2 |
-31.0764
41.78914
-0.74
0.465
-117.7418
55.58897
/beta3 |
4.148589
4.870437
0.85
0.404
-5.95208
14.24926
-----------------------------------------------------------------------------Parameter beta1 taken as constant term in model & ANOVA table

The output for this specification is shown, with most of the iteration messages deleted.

26

NONLINEAR REGRESSION

e  b1 

b2
b3  g

u

employment growth rate

3
2
1
0
0

1

2

3

4

5

6

7

8

-1
-2
-3
-4

GDP growth rate

The figure compares the original (black) and new (red) hyperbolic functions. The fit is a
considerable improvement, reflected in a higher R 2, 0.64 instead of 0.53.
27

Copyright Christopher Dougherty 2012.
These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 4.4 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2012.11.04


Slide 28

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Suppose you believe that a variable Y depends on a variable X according to the relationship
shown and you wish to obtain estimates of b1, b2, and b3 given data on Y and X.
1

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

There is no way of transforming the relationship to obtain a linear relationship, and so it is
not possible to apply the usual regression procedure.
2

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nevertheless, one can still use the principle of minimizing the sum of the squares of the
residuals to obtain estimates of the parameters. We will describe a simple nonlinear
regression algorithm that uses the principle. It consists of a series of repeated steps.
3

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.

You start by guessing plausible values for the parameters.

4

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

You calculate the corresponding fitted values of Y from the data on X, conditional on these
values of the parameters.
5

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.

You calculate the residual for each observation in the sample, and hence RSS, the sum of
the squares of the residuals.
6

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.

You then make small changes in one or more of your estimates of the parameters.

7

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

Using the new estimates of b1, b2, and b3, you re-calculate the fitted values of Y. Then you
re-calculate the residuals and RSS.
8

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.

If RSS is smaller than before, your new estimates of the parameters are better than the old
ones and you continue adjusting your estimates in the same direction. Otherwise, you
would try different adjustments.
9

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You repeat steps 5, 6, and 7 again and again until you are unable to make any changes in
the estimates of the parameters that would reduce RSS.
10

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

You conclude that you have minimized RSS, and you can describe the final estimates of the
parameters as the least squares estimates.
11

NONLINEAR REGRESSION

Y  b1  b2X

b3

u

Nonlinear regression algorithm
1. Guess b1, b2, and b3. b1, b2, and b3 are the guesses.
b
2. Calculate Y^ = b + b X 3 for each observation.
i

1

2 i

3. Calculate ei = Yi – Y^i for each observation.
4. Calculate RSS = ∑ei2.
5. Adjust b1, b2, and b3.
6. Re-calculate Y^ , e , RSS.
i

i

7. If new RSS < old RSS, continue adjustment.
Otherwise try different adjustment.
8. Repeat steps 5, 6, and 7 to convergence.

It should be emphasized that, long ago, mathematicians devised sophisticated techniques
to minimize the number of steps required by algorithms of this type.
12

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

We will return to the relationship between employment growth rate, e, and GDP growth rate,
g, in the first slideshow for this chapter. e and g are hypothesized to be related as shown.
13

NONLINEAR REGRESSION

e  b1 

b2
g

u

employment growth rate

3

2

1

0
0

1

2

3

4

5

6

7

8

-1

-2

GDP growth rate

According to this specification, as g becomes large, e will tend to a limit of b1. Looking at
the figure, we see that the maximum value of e is about 3. So we will take this as our initial
value for b1. We then hunt for the optimal value of b2, conditional on this guess for b1.
14

NONLINEAR REGRESSION
RSS
Conditional on b1 = 3

40

20

0

-7

-6

-5 –4.82

-4

b2

-3

The figure shows RSS plotted as a function of b2, conditional on b1 = 3. From this we see
that the optimal value of b2, conditional on b1 = 3, is –4.82.
15

NONLINEAR REGRESSION
RSS
Conditional on b2 = –4.82

40

20

0

2

2.94 3

b1

4

Next, holding b2 at –4.82, we look to improve on our guess for b1. The figure shows RSS as
a function of b1, conditional on b2 = –4.82. We see that the optimal value of b1 is 2.94.
16

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

We continue to do this both parameter estimates cease to change. The figure shows the
first 10 iterations.
17

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

Eventually, the estimates will reach limits. We will then have reached the values that yield
minimum RSS.
18

NONLINEAR REGRESSION
4

3
2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3

b2

-4
-5
-6

The limits are shown in the diagram as the horizontal lines. Convergence to them is
painfully slow because this is a very inefficient algorithm. More sophisticated ones usually
converge remarkably quickly.
19

NONLINEAR REGRESSION
4

3

2.60

2

b1

1

0
0

1

2

3

4

5

6

7

8

9

10

-1
-2
-3
-4

b2
–4.05

-5
-6

The limits must be the values from the transformed linear regression shown in the first
slideshow for this chapter: b1 = 2.60 and b2 = –4.05. They have been determined by the
same criterion, the minimization of RSS. All that we have done is to use a different method.
20
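
This equivalence is easy to verify: defining z = 1/g makes the model linear in the parameters, so ordinary least squares on z should land on the same values the search converges to. A sketch, again assuming numpy arrays e and g:

import numpy as np

z = 1 / g                                     # linearizing transformation
X = np.column_stack([np.ones_like(z), z])     # regress e on a constant and z
b1_hat, b2_hat = np.linalg.lstsq(X, e, rcond=None)[0]
# should give b1_hat ≈ 2.60 and b2_hat ≈ -4.05, the limits in the figure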

NONLINEAR REGRESSION
e = β1 + β2/g + u

. nl (e = {beta1} + {beta2}/g)
(obs = 25)

Iteration 0:  residual SS = 11.58161
Iteration 1:  residual SS = 11.58161

      Source |       SS       df       MS            Number of obs =        25
-------------+------------------------------         R-squared     =    0.5311
       Model |  13.1203672     1  13.1203672         Adj R-squared =    0.5108
    Residual |  11.5816083    23  .503548186         Root MSE      =  .7096113
-------------+------------------------------         Res. dev.     =  51.71049
       Total |  24.7019754    24  1.02924898

------------------------------------------------------------------------------
           e |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      /beta1 |   2.604753   .3748821     6.95   0.000      1.82925    3.380256
      /beta2 |  -4.050817    .793579    -5.10   0.000     -5.69246   -2.409174
------------------------------------------------------------------------------

Here is the output for the present hyperbolic regression of e on g using nonlinear
regression. It is, as usual, Stata output, but output from other regression applications will
look similar.
21

NONLINEAR REGRESSION
[The slide repeats the regression output shown above.]

The Stata command for a nonlinear regression is ‘nl’. This is followed by the hypothesized mathematical relationship within parentheses. The parameters must be given names placed within braces. Here β1 is {beta1} and β2 is {beta2}.
22
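
For readers working outside Stata, the same fit can be reproduced elsewhere. Here is a hedged Python sketch using scipy's generic curve_fit routine (not Stata's own algorithm), with e and g assumed to be numpy arrays as before and an illustrative starting guess:

import numpy as np
from scipy.optimize import curve_fit

def hyperbola(g, beta1, beta2):
    return beta1 + beta2 / g     # the hypothesized relationship, disturbance aside

(b1_hat, b2_hat), cov = curve_fit(hyperbola, g, e, p0=(3.0, -1.0))
std_errs = np.sqrt(np.diag(cov))   # comparable to Stata's Std. Err. column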

NONLINEAR REGRESSION
[The slide again repeats the regression output shown above.]

eˆ  2 . 60  4 . 05 z  2 . 60 

4 . 05
g

The output is effectively the same as the linear regression output in the first slideshow for
this chapter.
23

NONLINEAR REGRESSION

e  b1 

b2
g

u

[Figure: the fitted hyperbolic function plotted over the scatter of employment growth rate against GDP growth rate; the curve falls toward minus infinity as g approaches zero.]

The hyperbolic specification imposes the constraint that, for positive g, the function plunges to minus infinity as g approaches zero.
24

NONLINEAR REGRESSION
e = β1 + β2/(β3 + g) + u

. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 25)

Iteration 0:   residual SS = 11.58161
Iteration 1:   residual SS = 11.19238
.....................................
Iteration 15:  residual SS =  9.01051

      Source |       SS       df       MS            Number of obs =        25
-------------+------------------------------         R-squared     =    0.6352
       Model |  15.6914659     2  7.84573293         Adj R-squared =    0.6021
    Residual |  9.01050957    22  .409568617         Root MSE      =  .6399755
-------------+------------------------------         Res. dev.     =  45.43482
       Total |  24.7019754    24  1.02924898

------------------------------------------------------------------------------
           e |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      /beta1 |   5.467548   2.826401     1.93   0.066    -.3940491    11.32914
      /beta2 |   -31.0764   41.78914    -0.74   0.465    -117.7418    55.58897
      /beta3 |   4.148589   4.870437     0.85   0.404     -5.95208    14.24926
------------------------------------------------------------------------------
Parameter beta1 taken as constant term in model & ANOVA table

This constraint can be relaxed by using the variation shown. Unlike the previous function, this one cannot be linearized by any kind of transformation, so nonlinear regression must be used.
25
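
A corresponding sketch for the three-parameter specification, again a scipy illustration with e and g assumed and purely illustrative starting values:

from scipy.optimize import curve_fit

def shifted_hyperbola(g, beta1, beta2, beta3):
    return beta1 + beta2 / (beta3 + g)

popt, pcov = curve_fit(shifted_hyperbola, g, e, p0=(3.0, -4.0, 1.0))
# popt should be close to the estimates above: about 5.47, -31.08, 4.15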

NONLINEAR REGRESSION
[The slide repeats the three-parameter regression output shown above.]

The output for this specification is shown, with most of the iteration messages deleted.

26

NONLINEAR REGRESSION

e  b1 

b2
b3  g

u

[Figure: the original two-parameter hyperbolic fit (black) and the new three-parameter fit (red), plotted over the scatter of employment growth rate against GDP growth rate.]

The figure compares the original (black) and new (red) hyperbolic functions. The fit is a considerable improvement, reflected in a higher R², 0.64 instead of 0.53.
27
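
As a quick check, both R² figures follow directly from the residual and total sums of squares reported in the two outputs:

# R-squared = 1 - RSS/TSS, using the sums of squares from the Stata tables
rss_old, rss_new, tss = 11.5816083, 9.01050957, 24.7019754
r2_old = 1 - rss_old / tss    # ≈ 0.531
r2_new = 1 - rss_new / tss    # ≈ 0.635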

Copyright Christopher Dougherty 2012.
These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 4.4 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2012.11.04