Dynamic Multi-Terabit Core Optical Network

Engineering Optimization
Chapter 3: Functions of Several Variables (Part 1)
Presented by:
Rajesh Roy
Networks Research Lab,
University of California, Davis
June 18, 2010
x is a vector of design variables of dimension N
No constraints on x
ƒ is a scalar objective function
ƒ and its derivatives exist and are continuous everywhere
We will be satisfied to identify local optima x*
Static Question
Test candidate points to see whether they are (or are not) minima,
maxima, saddle points, or none of the above.
Dynamic Question
Given x(0), a point that does not satisfy the optimality criteria,
what is a better estimate x(1) of the solution x*?
The nonlinear objective ƒ will typically not be convex and may therefore
be multimodal.
Optimality Criteria
We examine optimality criteria for basically two reasons:
(1) because they are necessary to recognize solutions
(2) because they provide motivation for most of the useful methods
Consider the Taylor expansion of a function of several variables:
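In the usual notation the expansion can be written as below. The symbols are assumed, since the transcript does not define them: x̄ is the expansion point, Δx = x − x̄, and O_3(Δx) collects all terms of order three and higher.

```latex
f(x) = f(\bar{x}) + \nabla f(\bar{x})^{T} \Delta x
     + \tfrac{1}{2}\, \Delta x^{T} \nabla^{2} f(\bar{x})\, \Delta x
     + O_{3}(\Delta x)
```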
Necessary and Sufficient Conditions
Necessary Conditions:
Sufficient Conditions:
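For the unconstrained problem assumed here, the standard second-order conditions for x* to be a local minimum can be stated as follows. This is the textbook form, given as a reference sketch rather than a reproduction of the slide.

```latex
\text{Necessary:}\quad \nabla f(x^{*}) = 0
  \quad\text{and}\quad \nabla^{2} f(x^{*}) \text{ positive semidefinite}

\text{Sufficient:}\quad \nabla f(x^{*}) = 0
  \quad\text{and}\quad \nabla^{2} f(x^{*}) \text{ positive definite}
```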
Static Question: Example
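A minimal sketch of the static test, assuming NumPy and an illustrative two-variable objective ƒ(x) = 2x1² + 4x1x2³ − 10x1x2 + x2² (not necessarily the function used on the slide). The helper names grad, hessian and classify are hypothetical. A candidate is classified by checking that the gradient vanishes and then inspecting the signs of the Hessian eigenvalues.

```python
import numpy as np

def grad(x):
    """Gradient of f(x) = 2*x1**2 + 4*x1*x2**3 - 10*x1*x2 + x2**2 (illustrative)."""
    x1, x2 = x
    return np.array([4*x1 + 4*x2**3 - 10*x2,
                     12*x1*x2**2 - 10*x1 + 2*x2])

def hessian(x):
    """Matrix of second derivatives of the same function."""
    x1, x2 = x
    off = 12*x2**2 - 10
    return np.array([[4.0, off],
                     [off, 24*x1*x2 + 2.0]])

def classify(x, tol=1e-8):
    """Apply the first- and second-order optimality tests at the candidate x."""
    if np.linalg.norm(grad(x)) > tol:
        return "not stationary"
    eig = np.linalg.eigvalsh(hessian(x))   # eigenvalues of a symmetric matrix
    if np.all(eig > tol):
        return "local minimum"
    if np.all(eig < -tol):
        return "local maximum"
    if eig.min() < -tol and eig.max() > tol:
        return "saddle point"
    return "inconclusive"                  # semidefinite: second-order test fails

result = classify(np.array([0.0, 0.0]))
```

At x = (0, 0) the gradient vanishes and the Hessian [[4, −10], [−10, 2]] has eigenvalues of both signs, so the sketch reports a saddle point.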
Dynamic Question: Searching for x*
The methods can be classified into three broad categories:
1. Direct-search methods, which use only function values
• The S² (Simplex Search) Method
• Hooke–Jeeves Pattern Search Method
• Powell’s Conjugate Direction Method
2. Gradient methods, which require estimates of the first derivative of ƒ(x)
• Cauchy’s Method
3. Second-order methods, which require estimates of the first and second
derivatives of ƒ(x)
• Newton’s Method
Motivation behind different methods:
• Available computer storage is limited
• Function evaluations are very time consuming
• Great accuracy in the final solution is desired
• Sometimes it is either impossible or very time consuming to obtain
analytical expressions for derivatives
The S² (Simplex Search) Method
1. Set up a regular simplex* in the space of the independent variables and
evaluate the function at each vertex.
2. The vertex with the highest function value is located.
3. This ‘‘worst’’ vertex is then reflected through the centroid to generate a
new point, which is used to complete the next simplex.
4. Jump to Step 2 if the performance index decreases smoothly.
Suppose x(j) is the point to be reflected. Then the centroid of the
remaining N points is
All points on the line from x(j) through xc are given by
New Vertex Point:
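In their standard form, the centroid, the line through it, and the reflected vertex can be written as below. The vertex indexing i = 0, ..., N and the step parameter λ are assumed notation: λ = 0 gives x(j), λ = 1 gives the centroid, and λ = 2 gives the symmetric reflection.

```latex
x_{c} = \frac{1}{N} \sum_{\substack{i=0 \\ i \neq j}}^{N} x^{(i)}

x = x^{(j)} + \lambda \left( x_{c} - x^{(j)} \right)

x_{\text{new}}^{(j)} = 2\, x_{c} - x_{\text{old}}^{(j)} \qquad (\lambda = 2)
```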
*In N dimensions, a regular simplex is a polyhedron composed of N+1 equidistant points, which form its vertices.
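The reflection steps can be sketched as follows, assuming NumPy. The function names, the default parameters, and the fallback of halving the simplex toward its best vertex when reflection fails are assumptions: the fallback is a simplified stand-in for the method's cycling and size-reduction rules, which the slide does not spell out.

```python
import numpy as np

def regular_simplex(x0, scale=1.0):
    """N+1 equidistant vertices (edge length = scale) with x0 as base point."""
    n = len(x0)
    d1 = scale * (np.sqrt(n + 1) + n - 1) / (n * np.sqrt(2))
    d2 = scale * (np.sqrt(n + 1) - 1) / (n * np.sqrt(2))
    verts = [np.asarray(x0, dtype=float)]
    for i in range(n):
        v = verts[0] + d2          # offset d2 in every coordinate ...
        v[i] = verts[0][i] + d1    # ... except coordinate i, which gets d1
        verts.append(v)
    return np.array(verts)

def simplex_search(f, x0, scale=1.0, tol=1e-6, max_iter=1000):
    verts = regular_simplex(x0, scale)
    vals = np.array([f(v) for v in verts])
    for _ in range(max_iter):
        worst = np.argmax(vals)
        # Centroid of the remaining N vertices
        centroid = (verts.sum(axis=0) - verts[worst]) / (len(verts) - 1)
        reflected = 2.0 * centroid - verts[worst]
        f_ref = f(reflected)
        if f_ref < vals[worst]:
            verts[worst], vals[worst] = reflected, f_ref
        else:
            # Assumed fallback: halve the simplex toward the best vertex
            best = np.argmin(vals)
            verts = verts[best] + 0.5 * (verts - verts[best])
            vals = np.array([f(v) for v in verts])
        best = np.argmin(vals)
        if np.max(np.linalg.norm(verts - verts[best], axis=1)) < tol:
            break
    best = np.argmin(vals)
    return verts[best], vals[best]

# Hypothetical test objective with its minimum at (1, -2)
x_best, f_best = simplex_search(lambda x: (x[0] - 1)**2 + (x[1] + 2)**2, [0.0, 0.0])
```

Because every accepted reflection strictly lowers the worst function value, the sketch cannot cycle between the same two simplices; it either moves or shrinks.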
Hooke-Jeeves Pattern Search Method
Exploratory Moves:
1. Given a specified step size, the exploration proceeds from an initial point by
that step size in each coordinate direction.
2. If the function value does not increase, the step is considered successful.
3. Otherwise, the step is retracted and replaced by a step in the opposite direction,
which in turn is retained depending upon whether it succeeds or fails.
Pattern Moves:
1. A pattern move is a single step from the present base point along the line
from the previous to the current base point.
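The exploratory and pattern moves can be combined as in the sketch below, assuming NumPy. The function names, the step-reduction factor, and the stopping tolerance are assumptions. A step counts as successful only on a strict decrease, a slightly stricter test than "does not increase", chosen here so the search cannot stall on ties.

```python
import numpy as np

def explore(f, start, step):
    """Exploratory move: probe +step, then -step, along each coordinate in turn."""
    x = start.copy()
    fx = f(x)
    for i in range(len(x)):
        for delta in (step, -step):
            trial = x.copy()
            trial[i] += delta
            ft = f(trial)
            if ft < fx:            # success: keep the step, go to next coordinate
                x, fx = trial, ft
                break
    return x, fx

def hooke_jeeves(f, x0, step=0.5, shrink=0.5, tol=1e-6, max_iter=10000):
    base = np.asarray(x0, dtype=float)
    f_base = f(base)
    for _ in range(max_iter):
        if step < tol:
            break
        x, fx = explore(f, base, step)
        if fx >= f_base:
            step *= shrink         # exploration failed: reduce the step size
            continue
        while True:
            # Pattern move: extrapolate from the previous base through the new one
            pattern = x + (x - base)
            base, f_base = x, fx
            x, fx = explore(f, pattern, step)
            if fx >= f_base:       # pattern move did not pay off: keep the base
                break
    return base, f_base

# Hypothetical test objective with its minimum at (1, -2)
x_best, f_best = hooke_jeeves(lambda x: (x[0] - 1)**2 + (x[1] + 2)**2, [0.0, 0.0])
```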
Powell’s Conjugate Direction Method
Given a quadratic function q(x) with Hessian matrix C, two arbitrary but distinct
points x(1) and x(2), and a direction d, if y(1) is the solution to min over λ of
q(x(1) + λd) and y(2) is the solution to min over λ of q(x(2) + λd), then the
direction y(2) − y(1) is C-conjugate to d.
Theorem:
If a quadratic function in N variables can be transformed so that it is just the sum of perfect
squares, then the optimum can be found after exactly N single-variable searches, one with respect
to each of the transformed variables.
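The conjugacy property stated above can be checked numerically. The sketch below assumes NumPy and a hypothetical quadratic q(x) = bᵀx + ½ xᵀCx with a symmetric positive definite C; the values of C, b, d and the two starting points are illustrative only, and the starting points must not lie on a common line parallel to d.

```python
import numpy as np

C = np.array([[4.0, 1.0],
              [1.0, 3.0]])          # symmetric positive definite (assumed)
b = np.array([-1.0, 2.0])

def q(x):
    return b @ x + 0.5 * x @ C @ x

def line_min(x, d):
    """Exact minimizer of q along x + lam*d (closed form for a quadratic)."""
    lam = -(d @ (C @ x + b)) / (d @ C @ d)
    return x + lam * d

d = np.array([1.0, 0.0])
x1 = np.array([0.0, 0.0])
x2 = np.array([1.0, 2.0])
y1, y2 = line_min(x1, d), line_min(x2, d)

# (y2 - y1) is C-conjugate to d when this product vanishes
conj = (y2 - y1) @ C @ d

# In two variables, one more line search along y2 - y1 reaches the optimum
x_star = line_min(y1, y2 - y1)
```

Here x_star should coincide with the solution of Cx = −b, which is the sense in which N single-variable searches along conjugate directions suffice for a quadratic in N variables.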
Gradient-Based Methods
All of the methods considered here employ a similar iteration procedure:
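In the usual notation the iteration can be written as below. The symbols are assumed, since the transcript does not define them: x(k) is the current estimate of x*, α(k) is a step-length parameter, and s(x(k)) is the search direction in the space of the design variables.

```latex
x^{(k+1)} = x^{(k)} + \alpha^{(k)}\, s\!\left(x^{(k)}\right)
```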
Cauchy’s Method
Taylor expansion of the objective about x:
The greatest negative scalar product results from the choice
This is the motivation for the simple gradient method:
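The choice referred to above is the negative gradient direction, s = −∇ƒ(x̄), which leads to the update x(k+1) = x(k) − α ∇ƒ(x(k)). A minimal sketch follows, assuming NumPy. The function name, the backtracking (Armijo) rule used to pick α, and the test objective are assumptions; the slide does not say how the step length is chosen.

```python
import numpy as np

def cauchy(f, grad, x0, alpha0=1.0, tol=1e-6, max_iter=10000):
    """Steepest descent with a backtracking step-length rule (assumed choice)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        alpha = alpha0
        # Halve alpha until the step gives a sufficient decrease in f
        while f(x - alpha * g) > f(x) - 1e-4 * alpha * (g @ g):
            alpha *= 0.5
        x = x - alpha * g
    return x

# Hypothetical quadratic objective with its minimum at the origin
f = lambda x: 8*x[0]**2 + 4*x[0]*x[1] + 5*x[1]**2
grad_f = lambda x: np.array([16*x[0] + 4*x[1], 4*x[0] + 10*x[1]])
x_min = cauchy(f, grad_f, [10.0, 10.0])
```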
Newton’s Method
Consider again the Taylor expansion of the objective:
We form a quadratic approximation to ƒ(x) by dropping terms of order 3 and higher,
then force x(k+1), the next point in the sequence, to be a point where the
gradient of the approximation is zero. Therefore,
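With Δx = x − x(k), the quadratic approximation, its stationarity condition, and the resulting step can be written in standard form as below. The tilde notation for the approximation is assumed.

```latex
\tilde{f}\left(x; x^{(k)}\right) = f\left(x^{(k)}\right)
  + \nabla f\left(x^{(k)}\right)^{T} \Delta x
  + \tfrac{1}{2}\, \Delta x^{T} \nabla^{2} f\left(x^{(k)}\right) \Delta x

\nabla \tilde{f}\left(x; x^{(k)}\right)
  = \nabla f\left(x^{(k)}\right) + \nabla^{2} f\left(x^{(k)}\right) \Delta x = 0

\Delta x = -\,\nabla^{2} f\left(x^{(k)}\right)^{-1} \nabla f\left(x^{(k)}\right)
```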
So according to Newton’s optimization method:
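The resulting update is x(k+1) = x(k) − ∇²ƒ(x(k))⁻¹ ∇ƒ(x(k)). A minimal sketch follows, assuming NumPy and analytical derivatives supplied by the caller. The function name and the test objective are hypothetical, and no safeguard is included for a Hessian that is singular or not positive definite.

```python
import numpy as np

def newton(grad, hess, x0, tol=1e-8, max_iter=50):
    """Pure Newton iteration; returns the final point and the iteration count."""
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            return x, k
        # Solve H dx = -g rather than forming the inverse explicitly
        dx = np.linalg.solve(hess(x), -g)
        x = x + dx
    return x, max_iter

# Hypothetical quadratic f(x) = 8*x1**2 + 4*x1*x2 + 5*x2**2, minimum at the origin
grad_f = lambda x: np.array([16*x[0] + 4*x[1], 4*x[0] + 10*x[1]])
hess_f = lambda x: np.array([[16.0, 4.0], [4.0, 10.0]])
x_min, iters = newton(grad_f, hess_f, [10.0, 10.0])
```

For a quadratic objective the quadratic approximation is exact, so a single Newton step lands on the minimizer regardless of the starting point.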