The Maximum Principle: Continuous Time

Main purpose: to introduce the maximum principle as a necessary condition that must be satisfied by any optimal control.


-- Necessary conditions for optimization of dynamic systems
-- General derivation by Pontryagin in 1956-60
-- A simple (but not completely rigorous) proof using dynamic programming
-- Examples
-- Statement of sufficiency conditions
-- Computational method

2.1 Statement of the problem

Optimal control theory deals with the problem of optimization of dynamic systems.

2.1.1 The Mathematical Model

State equation:

  ẋ(t) = f(x(t), u(t), t),  x(0) = x_0,   (2.1)

where x(t) ∈ E^n are the state variables, u(t) ∈ E^m are the control variables, and f: E^n × E^m × E^1 → E^n. f is assumed to be continuously differentiable. The path x(t), t ∈ [0,T], is called a state trajectory, and u(t), t ∈ [0,T], is called a control trajectory.
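To make the model concrete, here is a minimal numerical sketch: it integrates the state equation (2.1) by forward Euler steps to produce a state trajectory. The specific dynamics f(x, u, t) = u and the control path u(t) = -1 are illustrative assumptions, not part of the general model.

# Minimal sketch: forward-Euler simulation of x_dot = f(x, u, t), x(0) = x0.
# The dynamics f and the control path u(t) below are illustrative assumptions.
def simulate_state(f, u, x0, T, n_steps=1000):
    """Return the time grid and the state trajectory for x_dot = f(x, u(t), t)."""
    dt = T / n_steps
    ts = [k * dt for k in range(n_steps + 1)]
    xs = [x0]
    for k in range(n_steps):
        t, x = ts[k], xs[-1]
        xs.append(x + dt * f(x, u(t), t))   # Euler step: x(t+dt) ~ x(t) + f(x, u, t) dt
    return ts, xs

if __name__ == "__main__":
    f = lambda x, u, t: u          # assumed dynamics: x_dot = u
    u = lambda t: -1.0             # assumed control path: u(t) = -1
    ts, xs = simulate_state(f, u, x0=1.0, T=1.0)
    print(xs[-1])                  # ~ 0.0, i.e. x(1) = 1 - 1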

2.1.2 Constraints

An admissible control u(t), t ∈ [0,T], is piecewise continuous and satisfies, in addition, the control constraint

  u(t) ∈ Ω(t) ⊂ E^m,  t ∈ [0,T].   (2.2)

2.1.3 The Objective Function

The objective function is defined as follows:

  J = ∫_0^T F(x(t), u(t), t) dt + S(x(T), T),   (2.3)

where F: E^n × E^m × E^1 → E^1 and S: E^n × E^1 → E^1. F and S are assumed to be continuously differentiable.
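Given a discretized trajectory, the objective (2.3) can be approximated by a Riemann sum over the running term plus the salvage term, as in the short sketch below; the particular F, S, control, and trajectory used are illustrative assumptions only.

# Minimal sketch: numerical evaluation of J = integral_0^T F(x, u, t) dt + S(x(T), T).
# F, S, and the trajectory below are illustrative assumptions.
def objective(F, S, ts, xs, us):
    """Left-endpoint Riemann sum of F plus the salvage value at the terminal time."""
    J = 0.0
    for k in range(len(ts) - 1):
        J += F(xs[k], us[k], ts[k]) * (ts[k + 1] - ts[k])
    return J + S(xs[-1], ts[-1])

if __name__ == "__main__":
    n = 1000
    ts = [k / n for k in range(n + 1)]
    us = [-1.0] * (n + 1)               # assumed control u(t) = -1
    xs = [1.0 - t for t in ts]          # corresponding state x(t) = 1 - t
    F = lambda x, u, t: -x              # assumed instantaneous profit rate
    S = lambda x, T: 0.0                # assumed salvage value
    print(objective(F, S, ts, xs, us))  # ~ -0.5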

2.1.4 The Optimal Control Problem

The problem is to find an admissible control u*, which maximizes the objective function (2.3) subject to (2.1) and (2.2). We restate the optimal control problem as:

  max { J = ∫_0^T F(x, u, t) dt + S(x(T), T) }
  subject to  ẋ = f(x, u, t),  x(0) = x_0,  u(t) ∈ Ω(t).   (2.4)

The control u* is called an optimal control, the trajectory x* obtained under u* is called the optimal trajectory, and J(u*) or J* will denote the optimal value of J.

Case 1. The optimal control problem (2.4) is said to be in Bolza form.

Case 2. When S ≡ 0, it is said to be in Lagrange form.

Case 3. When F ≡ 0, it is said to be in Mayer form; if, in addition, S is linear, i.e., S(x, T) = cx(T), it is in linear Mayer form, where c = (c_1, c_2, …, c_n) ∈ E^n.

The Bolza form can be reduced to the linear Mayer form. To do so, we define a new state vector y = (y_1, y_2, …, y_{n+1}) having n+1 components, defined as follows:

  y_i = x_i,  i = 1, 2, …, n,

and y_{n+1} given by

  ẏ_{n+1} = F(x, u, t) + S_x(x, t) f(x, u, t) + S_t(x, t),  y_{n+1}(0) = S(x_0, 0),   (2.6)

so that, with c = (0, 0, …, 0, 1), the objective function is

  J = c y(T) = y_{n+1}(T).

If we now integrate (2.6) from 0 to T, we have

  y_{n+1}(T) = S(x_0, 0) + ∫_0^T [F(x, u, t) + dS(x, t)/dt] dt = ∫_0^T F(x, u, t) dt + S(x(T), T),

which is the same as the objective function J in (2.4).
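As a numerical check of this reduction (a sketch only, with assumed data: ẋ = u, F = -x, S(x, t) = x, u(t) = -1), the code below integrates the augmented component y_{n+1} according to (2.6) and confirms that y_{n+1}(T) agrees with the Bolza objective ∫ F dt + S(x(T), T).

# Minimal sketch of the Bolza-to-linear-Mayer reduction: integrate
#   y_{n+1}_dot = F + S_x * f + S_t,  y_{n+1}(0) = S(x0, 0),
# and compare y_{n+1}(T) with the Bolza objective. All problem data are assumptions.
def bolza_vs_mayer(f, F, S, S_x, S_t, u, x0, T, n=2000):
    dt = T / n
    x, y = x0, S(x0, 0.0)              # y plays the role of y_{n+1}
    J_bolza = 0.0
    for k in range(n):
        t, uk = k * dt, u(k * dt)
        J_bolza += F(x, uk, t) * dt
        y += (F(x, uk, t) + S_x(x, t) * f(x, uk, t) + S_t(x, t)) * dt   # equation (2.6)
        x += f(x, uk, t) * dt
    return J_bolza + S(x, T), y        # the two numbers should (approximately) agree

if __name__ == "__main__":
    print(bolza_vs_mayer(f=lambda x, u, t: u, F=lambda x, u, t: -x,
                         S=lambda x, t: x, S_x=lambda x, t: 1.0,
                         S_t=lambda x, t: 0.0, u=lambda t: -1.0,
                         x0=1.0, T=1.0))   # both entries ~ -0.5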

Statement of Bellman’s Optimality Principle

“An optimal policy has the property that whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision.”

Principle of Optimality (illustration): an optimal path from a through b to e, with segment values J_ab and J_be, together with an alternative segment from b through c to e with value J_bce.

Assertion: If abe is the optimal path from a to e, then be is the optimal path from b to e.

Proof: Suppose it is not. Then there is another path (note that existence is assumed here) bce which is optimal from b to e, i.e., J_bce > J_be. But then

  J_abe = J_ab + J_be < J_ab + J_bce = J_abce,

which contradicts the hypothesis that abe is the optimal path from a to e.

A Dynamic Programming Example: The Stagecoach Problem

Costs:

Solution: Let 1 - u_1 - u_2 - u_3 - 10 be the optimal path.

Let f_n(s, u_n) be the minimal cost of the remaining path, given that the current state is s and the decision taken is u_n. Then

  f_n*(s) = min_{u_n} f_n(s, u_n) = f_n(s, u_n*).

This is the recursion equation of dynamic programming. It can be solved by a backward procedure, which starts at the terminal stage and stops at the initial stage.

Note: 1-2-6-9-10 with cost = 13 is a greedy path that minimizes cost at each stage. This need not be the minimal-cost solution, however; e.g., 1-4-6 is cheaper overall than 1-2-6.
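A compact implementation of this backward recursion is sketched below. The arc costs in the dictionary are hypothetical placeholders (the cost table from the slides is not reproduced here); the point is the recursion f_n*(s) = min_{u_n} [c(s, u_n) + f_{n+1}*(u_n)] applied from the terminal stage back to the start.

# Minimal sketch of backward DP for a stagecoach-type problem. Node 1 is the start,
# node 10 the destination; the arc costs below are hypothetical placeholders.
costs = {
    1: {2: 2, 3: 4, 4: 3},
    2: {5: 7, 6: 4, 7: 6},
    3: {5: 3, 6: 2, 7: 4},
    4: {5: 4, 6: 1, 7: 5},
    5: {8: 1, 9: 4},
    6: {8: 6, 9: 3},
    7: {8: 3, 9: 3},
    8: {10: 3},
    9: {10: 4},
}

def solve(costs, terminal=10):
    best = {terminal: (0, None)}            # best[s] = (f*(s), optimal next node u*)
    for s in sorted(costs, reverse=True):   # successors of s are always solved first
        best[s] = min((c + best[u][0], u) for u, c in costs[s].items())
    return best

if __name__ == "__main__":
    best = solve(costs)
    path, node = [1], 1
    while best[node][1] is not None:        # follow the stored decisions forward
        node = best[node][1]
        path.append(node)
    print(best[1][0], path)                 # minimal cost and one optimal path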

2.2 Dynamic Programming and the Maximum Principle

2.2.1 The Hamilton-Jacobi-Bellman Equation

Define the value function

  V(x, t) = max_{u(s) ∈ Ω(s)} { ∫_t^T F(x(s), u(s), s) ds + S(x(T), T) },

where for s ≥ t,

  dx(s)/ds = f(x(s), u(s), s),  x(t) = x.

Principle of Optimality

An optimal policy has the property that, whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the outcome resulting from the first decision.

Figure 2.1 An Optimal Path in the State-Time Space

The change in the objective function consists of two parts:

1. The incremental change in J from t to t + δt, which is given by the integral of F(x, u, t) from t to t + δt;

2. The value function V(x + δx, t + δt) at time t + δt.

In equation form we have

  V(x, t) = max_{u(τ), τ ∈ [t, t+δt]} { ∫_t^{t+δt} F(x(τ), u(τ), τ) dτ + V(x + δx, t + δt) }.   (2.9)

Since F is continuous, the integral in (2.9) is approximately F(x, u, t) δt, so that we can rewrite (2.9) as

  V(x, t) = max_u { F(x, u, t) δt + V(x + δx, t + δt) } + o(δt),

where o(δt) collects terms that go to zero faster than δt.

Assume that the value function V is continuously differentiable. A Taylor series expansion of V(x + δx, t + δt) with respect to δx and δt gives

  V(x + δx, t + δt) = V(x, t) + V_x(x, t) δx + V_t(x, t) δt + higher-order terms.

Substituting for δx from (2.1), i.e., δx ≈ f(x, u, t) δt, canceling V(x, t) on both sides, and then dividing by δt, we get

  0 = max_u { F(x, u, t) + V_x(x, t) f(x, u, t) + V_t(x, t) } + o(δt)/δt.

Letting δt → 0, we obtain

  0 = max_u { F(x, u, t) + V_x(x, t) f(x, u, t) + V_t(x, t) },   (2.14)

with the boundary condition

  V(x, T) = S(x, T).

V_x(x, t) can be interpreted as the marginal contribution of the state variable x to the maximized objective function. Denote it by λ(t) ∈ E^n, called the adjoint (row) vector, i.e.,

  λ(t) = V_x(x*(t), t).   (2.16)

We introduce the so-called Hamiltonian

  H(x, u, V_x, t) = F(x, u, t) + V_x f(x, u, t).   (2.18)

Then (2.14) can be rewritten as

  max_u { H(x, u, V_x, t) + V_t(x, t) } = 0,   (2.19)

which will be called the Hamilton-Jacobi-Bellman equation, or HJB equation.

From (2.19) we can get the Hamiltonian maximizing condition of the maximum principle: evaluating (2.19) along the optimal path and canceling the term V_t on both sides, we obtain

  H(x*(t), u*(t), λ(t), t) ≥ H(x*(t), u, λ(t), t)  for all u ∈ Ω(t).

Remark: H decouples the problem over time by means of λ(t), which is analogous to dual variables or shadow prices in linear programming.
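Numerically, the HJB equation (2.19) can be approximated by applying the discrete dynamic-programming relation V(x, t) ≈ max_u [F(x, u, t) δt + V(x + f(x, u, t) δt, t + δt)] backward in time on a state grid. The sketch below does this for an assumed one-dimensional problem (F = -x²/2, f = u, u ∈ [-1, 1], S ≡ 0, T = 1); it illustrates the idea only and is not a production HJB solver. For these data the printed value at x = 1 should be close to the analytic optimum -1/6.

# Minimal sketch: approximate V(x, t) by the discrete DP form of the HJB equation,
#   V(x, t) ~ max_u [ F(x, u, t) dt + V(x + f(x, u, t) dt, t + dt) ],  V(x, T) = S(x, T),
# on a state grid with linear interpolation. All problem data below are assumptions.
import numpy as np

F = lambda x, u, t: -0.5 * x * x       # assumed running reward
f = lambda x, u, t: u                  # assumed dynamics
S = lambda x, T: 0.0                   # assumed salvage value
controls = np.linspace(-1.0, 1.0, 21)  # discretized control set Omega
T, nt, nx = 1.0, 200, 201
dt = T / nt
xs = np.linspace(-2.0, 2.0, nx)        # state grid

V = np.array([S(x, T) for x in xs])    # terminal condition V(., T) = S(., T)
for k in range(nt - 1, -1, -1):        # march backward in time
    t = k * dt
    best = np.full(nx, -np.inf)
    for u in controls:
        x_next = np.clip(xs + f(xs, u, t) * dt, xs[0], xs[-1])
        best = np.maximum(best, F(xs, u, t) * dt + np.interp(x_next, xs, V))
    V = best

print(np.interp(1.0, xs, V))           # value at x(0) = 1; ~ -1/6 for these data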

2.2.2 Derivation of the Adjoint Equation

Consider a state x(t) = x*(t) + δx(t) near the optimal path, where ||δx(t)|| ≤ ε for a small positive ε.

Fix t and use the H-J-B equation (2.19). For any such x(t),

  0 = H(x*(t), u*(t), V_x(x*(t), t), t) + V_t(x*(t), t)  [LHS]
    ≥ H(x(t), u*(t), V_x(x(t), t), t) + V_t(x(t), t)  [RHS].

The LHS equals 0 from (2.19), since u* maximizes [H + V_t] when x*(t) is the state. The RHS would also be zero if u*(t) also maximized H + V_t with x(t) as the state; in general x(t) ≠ x*(t), so RHS ≤ 0, with RHS → 0 as δx(t) → 0. But then RHS|_{x(t)=x*(t)} = 0, so the RHS is maximized over x(t) at x(t) = x*(t).

Since x(t) is unconstrained, we have

  ∂(RHS)/∂x |_{x(t)=x*(t)} = 0,

or, by the definition of H,

  [F_x + V_x f_x + V_xx f + V_tx] |_{x=x*(t), u=u*(t)} = 0.   (2.25)

Note: (2.25) assumes V to be twice continuously differentiable.

By definition (2.16) of λ(t),

  λ̇(t) = (d/dt) V_x(x*(t), t) = V_xx(x*(t), t) ẋ*(t) + V_tx(x*(t), t) = [V_xx f + V_tx] |_{x=x*(t), u=u*(t)}.   (2.26)

Using (2.25) and (2.26), we have

  λ̇(t) = -[F_x + V_x f_x] |_{x=x*(t), u=u*(t)},

and using (2.16), we have

  λ̇(t) = -[F_x + λ f_x] = -H_x.   (2.28)

From the boundary condition V(x, T) = S(x, T) and (2.16), we have the terminal boundary condition, or transversality condition,

  λ(T) = V_x(x*(T), T) = S_x(x*(T), T).   (2.29)

Together, (2.28) and (2.29) determine the adjoint variables.

From the definition (2.18) of H, we can also rewrite the state equation (2.1) as

  ẋ = ∂H/∂λ,  x(0) = x_0.   (2.30)

Collecting (2.28), (2.29), (2.30), and (2.1), we get

  ẋ = ∂H/∂λ,  x(0) = x_0,
  λ̇ = -∂H/∂x,  λ(T) = S_x(x(T), T);   (2.31)

(2.31) is called a canonical system of equations (canonical adjoints).
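Because the canonical system (2.31) is obtained by differentiating the Hamiltonian, it can be generated mechanically once F, f, and S are specified. The sketch below uses sympy to form H = F + λf for assumed data and to print ẋ = ∂H/∂λ, λ̇ = -∂H/∂x, and the transversality value S_x; the particular F, f, and S are illustrative assumptions.

# Minimal sketch: derive the canonical system (2.31) symbolically for assumed F, f, S.
import sympy as sp

x, u, lam = sp.symbols("x u lam")

F = -sp.Rational(1, 2) * x**2          # assumed instantaneous reward
f = u                                  # assumed dynamics x_dot = f(x, u, t)
S = sp.Integer(0)                      # assumed salvage value S(x, T)

H = F + lam * f                        # Hamiltonian H = F + lambda * f

print("x_dot   =", sp.diff(H, lam))    # state equation  x_dot = dH/dlambda  -> u
print("lam_dot =", -sp.diff(H, x))     # adjoint equation lam_dot = -dH/dx   -> x
print("lam(T)  =", sp.diff(S, x))      # transversality  lam(T) = S_x        -> 0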

Free and Fixed End Point Problems

-- S ≡ 0 (no salvage value): λ(T) = 0.
-- x(T) fixed: λ(T) = β, a constant whose value is determined as part of the solution.

2.2.3 The Maximum Principle

The necessary conditions for u* to be an optimal control are:

  ẋ* = f(x*, u*, t),  x*(0) = x_0,
  λ̇ = -H_x(x*, u*, λ, t),  λ(T) = S_x(x*(T), T),
  H(x*(t), u*(t), λ(t), t) ≥ H(x*(t), u, λ(t), t)  for all u ∈ Ω(t), t ∈ [0, T].   (2.32)

2.2.4 Economic Interpretation of the Maximum Principle

Recall the objective function

  J = ∫_0^T F(x, u, t) dt + S(x(T), T),

where F is considered to be the instantaneous profit rate measured in dollars per unit of time, and S(x, T) is the salvage value, in dollars.

Multiplying (2.18) formally by dt and using (2.1), we have

  H dt = F(x, u, t) dt + λ f(x, u, t) dt = F(x, u, t) dt + λ dx,

with the interpretation:

F(x, u, t) dt : direct contribution to J, in dollars, from t to t + dt.
λ dx : indirect contribution to J, in dollars, from t to t + dt (the value of the incremental change in the state).
H dt : total contribution to J from t to t + dt when x(t) = x and u(t) = u in the interval [t, t + dt].

By (2.28) and (2.29) we have

  dλ/dt = -H_x,  λ(T) = S_x(x(T), T).

Rewriting the first equation as

  -dλ = H_x dt = F_x dt + λ f_x dt,

we can interpret each term:

-dλ : marginal cost of holding capital from t to t + dt.
H_x dt : marginal revenue of investing the capital.
F_x dt : direct marginal contribution.
λ f_x dt : indirect marginal contribution.

Thus, the adjoint equation implies MC = MR along the optimal path.

Example 2.1

Consider the problem:

  max { J = ∫_0^1 -x dt }

subject to the state equation

  ẋ = u,  x(0) = 1,   (2.33)

and the control constraint

  -1 ≤ u ≤ 1.   (2.34)

Note that T = 1, F = -x, S = 0, and f = u. Because F = -x, we can interpret the problem as one of minimizing the (signed) area under the curve x(t) for 0 ≤ t ≤ 1.

Solution. First we form the Hamiltonian

  H = -x + λu,   (2.36)

and note that, because the Hamiltonian is linear in u, the form of the optimal control, i.e., the one that maximizes the Hamiltonian, is

  u*(t) = 1 if λ(t) > 0,  u*(t) = -1 if λ(t) < 0,

or, referring to the notation in Section 1.4,

  u*(t) = bang[-1, 1; λ(t)].   (2.38)

To find λ, we write the adjoint equation

  λ̇ = -H_x = 1,  λ(1) = 0.   (2.39)

Because this equation does not involve x and u, we can easily solve it as

  λ(t) = t - 1.

It follows that λ(t) = t - 1 ≤ 0 for all t ∈ [0,1], and since we can set u*(1) = -1, which defines u* at the single point t = 1, we have the optimal control

  u*(t) = -1 for all t ∈ [0,1].

Substituting this into the state equation (2.33), we have

  ẋ = -1,  x(0) = 1,   (2.41)

whose solution is

  x*(t) = 1 - t.   (2.42)

The graphs of the optimal state and adjoint trajectories appear in Figure 2.2. Note that the optimal value of the objective function is J* = -1/2.

Figure 2.2 Optimal State and Adjoint Trajectories for Example 2.1
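The closed-form answer to Example 2.1 is easy to confirm numerically: integrate the state equation under u* = -1 and accumulate the objective. The short verification sketch below is not part of the original example; it should reproduce J* ≈ -1/2 and x*(1) ≈ 0.

# Minimal verification sketch for Example 2.1: u* = -1, x*(t) = 1 - t, J* = -1/2.
n = 10000
dt = 1.0 / n
x, J = 1.0, 0.0
for k in range(n):
    J += (-x) * dt                 # accumulate the running reward F = -x
    x += (-1.0) * dt               # state equation x_dot = u with u* = -1
print(round(J, 3), round(x, 3))    # ~ -0.5 and ~ 0.0, matching J* and x*(1)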

Example 2.2

Let us solve the same problem as in Example 2.1 over the interval [0, 2], so that the objective is

  max { J = ∫_0^2 -x dt }.

The dynamics and constraints are (2.33) and (2.34), respectively, as before. Here we want to minimize the signed area between the horizontal axis and the trajectory of x(t) for 0 ≤ t ≤ 2.

Solution. As before, the Hamiltonian is defined by (2.36) and the optimal control is as in (2.38). The adjoint equation is the same as (2.39) except that now T = 2 instead of T = 1, i.e.,

  λ̇ = 1,  λ(2) = 0.   (2.44)

The solution of (2.44) is easily found to be

  λ(t) = t - 2,

which is nonpositive on [0, 2], so that u*(t) = -1 as before. Hence the state equation (2.41) and its solution (2.42) are exactly the same. The graphs of the optimal state and adjoint trajectories appear in Figure 2.3. Note that the optimal value of the objective function here is J* = 0.

Figure 2.3 Optimal State and Adjoint Trajectories for Example 2.2

Example 2.3

The next example is:

  max { J = ∫_0^1 -(1/2) x² dt }   (2.46)

subject to the same constraints as in Example 2.1, namely,

  ẋ = u,  x(0) = 1,   (2.47)
  -1 ≤ u ≤ 1.

Here F = -(1/2)x², so that the interpretation of the objective function (2.46) is that we are trying to find the trajectory x(t) for which the area under the curve (1/2)x² is minimized.

Solution. The Hamiltonian is

  H = -(1/2)x² + λu,

which is linear in u, so that the optimal policy is

  u*(t) = bang[-1, 1; λ(t)].   (2.49)

The adjoint equation is

  λ̇ = -H_x = x,  λ(1) = 0.   (2.50)

Here the adjoint equation involves x, so we cannot solve it directly. Because the state equation (2.47) involves u, which depends on λ, we also cannot integrate it independently without knowing λ.

The way out of this dilemma is to use some intuition. Since we want to minimize the area under (1/2)x² and since x(0) = 1, it is clear that we want x to decrease as quickly as possible. Let us therefore temporarily assume that λ is nonpositive in the interval [0,1], so that from (2.49) we have u = -1 throughout the interval. (In Exercise 2.5, you will be asked to show that this assumption is correct.) With this assumption, we can solve (2.47) as

  x*(t) = 1 - t.   (2.51)

Substituting this into (2.50) gives

  λ̇ = 1 - t.

Integrating both sides of this equation from t to 1 gives

  λ(1) - λ(t) = ∫_t^1 (1 - τ) dτ = (1 - t)²/2,

which, using λ(1) = 0, yields

  λ(t) = -(1 - t)²/2.   (2.52)

The reader may now verify that λ(t) is nonpositive in the interval [0,1], verifying our original assumption. Hence (2.51) and (2.52) satisfy the necessary conditions. In Exercise 2.6, you will be asked to show that they satisfy the sufficient conditions derived in Section 2.4 as well, so that they are indeed optimal. Figure 2.4 shows the graphs of the optimal trajectories.

Figure 2.4 Optimal Trajectories for Example 2.3 and Example 2.4
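The sign assumption on λ in Example 2.3 (the subject of Exercise 2.5) can be spot-checked numerically by integrating the adjoint equation λ̇ = x backward from λ(1) = 0 along x*(t) = 1 - t and comparing with the closed form (2.52). The small verification sketch below is an illustration, not part of the original text.

# Minimal sketch: check lambda(t) = -(1 - t)^2 / 2 for Example 2.3 by integrating
# lambda_dot = x backward from lambda(1) = 0 along x*(t) = 1 - t.
n = 1000
dt = 1.0 / n
lam, max_err, max_lam = 0.0, 0.0, -1.0     # transversality: lambda(1) = 0
for k in range(n, 0, -1):                  # step backward from t = 1 to t = 0
    t = k * dt
    lam -= (1.0 - t) * dt                  # lambda(t - dt) ~ lambda(t) - x*(t) dt
    max_err = max(max_err, abs(lam + 0.5 * (1.0 - (t - dt)) ** 2))
    max_lam = max(max_lam, lam)
print(max_err)                             # small discretization error
print(max_lam)                             # <= 0, confirming the nonpositivity assumption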

Example 2.4

Let us rework Example 2.3 with T = 2, i.e.,

  max { J = ∫_0^2 -(1/2) x² dt },

subject to the same constraints as before. It would clearly be optimal if we could keep x*(t) = 0 for t ≥ 1. This is possible by setting

  u*(t) = -1 for t ≤ 1,  u*(t) = 0 for t ≥ 1,

so that x*(t) = 1 - t for t ≤ 1 and x*(t) = 0 for t ≥ 1. Note that u*(t) = 0, t ≥ 1, is a singular control: on that interval λ(t) = 0, so the Hamiltonian maximizing condition does not by itself determine u*.

Example 2.5

The problem is:

Solution:

Figure 2.5 Optimal control for Example 2.5

2.4 Sufficient Conditions


Theorem 2.1 (Sufficiency Conditions). Let u*(t), and the corresponding x*(t) and λ(t), satisfy the maximum principle necessary conditions (2.32) for all t ∈ [0,T]. Then u* is an optimal control if H^0(x, λ(t), t) is concave in x for each t and S(x, T) is concave in x, where H^0(x, λ, t) = max_{u ∈ Ω(t)} H(x, u, λ, t) denotes the derived (maximized) Hamiltonian.

Example 2.6: Examples 2.1 and 2.2 satisfy the sufficient conditions, since in both cases the derived Hamiltonian H^0 = -x + |λ(t)| is linear, hence concave, in x, and S ≡ 0 is concave.

Fixed-end-point problem: the transversality condition λ(T) = S_x(x(T), T) is replaced by the terminal state condition x(T) = k; λ(T) then becomes a constant to be determined as part of the solution.

2.5 Solving a TPBVP by Using Spreadsheet Software

Example 2.7

Consider the problem:

Let Δt = 0.01, with the given initial value x(0) = 5 and an initial guess λ(0) = -0.2.
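Since Example 2.7's dynamics are not reproduced above, the sketch below illustrates the same spreadsheet procedure on an assumed problem: F = -(x² + u²)/2, ẋ = u, S ≡ 0, T = 1, so that maximizing H gives u = λ and the canonical system (2.31) becomes ẋ = λ, λ̇ = x with x(0) = 5 and the target λ(1) = 0. Starting from a guessed λ(0) (the value -0.2 above would be one such guess), x and λ are marched forward with Euler steps of Δt = 0.01, and the guess is adjusted, here by bisection, until λ(T) ≈ 0; this adjustment is the role a spreadsheet Goal Seek plays.

# Minimal sketch of the spreadsheet-style shooting approach to the TPBVP, on an
# ASSUMED problem (not Example 2.7's actual data): F = -(x^2 + u^2)/2, x_dot = u,
# S = 0, T = 1, so the canonical system is x_dot = lam, lam_dot = x, lam(1) = 0.
def shoot(lam0, x0=5.0, T=1.0, dt=0.01):
    """March x and lam forward with Euler steps and return lam(T)."""
    x, lam = x0, lam0
    for _ in range(int(round(T / dt))):
        x, lam = x + dt * lam, lam + dt * x   # u = lam maximizes H for this problem
    return lam

# Adjust the guessed lam(0) by bisection until lam(T) ~ 0 (the role of Goal Seek).
lo, hi = -10.0, 0.0                    # assumed bracket: shoot(lo) < 0 < shoot(hi)
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if shoot(mid) > 0.0:
        hi = mid
    else:
        lo = mid
print(round(0.5 * (lo + hi), 3))       # converged lam(0), ~ -5*tanh(1) ~ -3.81 here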