Approximate Dynamic Programming for
High-Dimensional Problems in Energy Modeling
Ohio St. University
October 7, 2009
Warren Powell
CASTLE Laboratory
Princeton University
http://www.castlelab.princeton.edu
© 2009 Warren B. Powell, Princeton University
Slide 1
Goals for an energy policy model
Potential questions
» Policy questions
• How do we design policies to achieve energy goals (e.g. 20%
renewables by 2015) with a given probability?
• How does the imposition of a carbon tax change the likelihood
of meeting this goal?
• What might happen if ethanol subsidies are reduced or
eliminated?
• What is the impact of a breakthrough in batteries?
» Energy economics
• What is the best mix of energy generation technologies?
• How is the economic value of wind affected by the presence of
storage?
• What is the best mix of storage technologies?
• How would climate change impact our ability to use
hydroelectric reservoirs as a regulating source of power?
Slide 2
Goals for an energy policy model
Designing energy supply and storage portfolios to
work with wind:
» The marginal value of wind and solar farms depends on
the ability to work with intermittent supply.
» The impact of intermittent supply will be mitigated by
the use of storage.
» Different storage technologies (batteries, flywheels,
compressed air, pumped hydro) are each designed to
serve different types of variations in supply and
demand.
» The need for storage (and the value of wind and solar)
depends on the entire portfolio of energy producing
technologies.
Slide 3
Intermittent energy sources
[Plots: wind speed and solar energy over time.]
Slide 4
Wind
[Plots: wind power over 30 days and over 1 year.]
Slide 5
Storage
Hydroelectric
Batteries
Flywheels
Ultracapacitors
Slide 6
Long-term uncertainties
[Timeline, 2010-2030: tax policy, solar panels, batteries, price of oil, carbon capture and sequestration, climate change.]
Slide 7
Goals for an energy policy model
Model capabilities we are looking for
» Multi-scale
• Multiple time scales (hourly, daily, seasonal, annual, decade)
• Multiple spatial scales
• Multiple technologies (different coal-burning technologies,
new wind turbines, …)
• Multiple markets
– Transportation (commercial, commuter, home activities)
– Electricity use (heavy industrial, light industrial, business,
residential)
– ….
» Stochastic (handles uncertainty)
• Hourly fluctuations in wind, solar and demands
• Daily variations in prices and rainfall
• Seasonal changes in weather
• Yearly changes in supplies, technologies and policies
Slide 8
Outline
Modeling stochastic resource allocation problems
An introduction to ADP
ADP and the post-decision state variable
A blood management example
The SMART energy policy model
Slide 9
A resource allocation model
Attribute vectors $a$ (examples, from simple to complex):
$a$ = (Asset class, Time invested)
$a$ = (Type, Location)
$a$ = (Age, Location)
$a$ = (Location, ETA, A/C type, Fuel level, Home shop, Crew, Eqpt1, ..., Eqpt100)
$a$ = (Location, ETA, Driving hours, Home, Experience, ...)
Slide 10
A resource allocation model
Modeling resources:
» The attributes of a single resource:
$a$ = the attributes of a single resource
$a \in \mathcal{A}$ = the attribute space
» The resource state vector:
$R_{ta}$ = the number of resources with attribute $a$
$R_t = (R_{ta})_{a \in \mathcal{A}}$ = the resource state vector
» The information process:
$\hat{R}_{ta}$ = the change in the number of resources with attribute $a$.
Slide 11
A resource allocation model
Modeling demands:
» The attributes of a single demand:
$b$ = the attributes of a demand to be served
$b \in \mathcal{B}$ = the attribute space
» The demand state vector:
$D_{tb}$ = the number of demands with attribute $b$
$D_t = (D_{tb})_{b \in \mathcal{B}}$ = the demand state vector
» The information process:
$\hat{D}_{tb}$ = the change in the number of demands with attribute $b$.
A code sketch of these state vectors follows below.
Slide 12
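To make the notation concrete, here is a minimal sketch (not from the slides) of resource and demand state vectors in Python, with attributes as hashable tuples; the attribute values are purely illustrative:

```python
from collections import Counter

# Resource state vector R_t: counts indexed by attribute a (a hashable tuple).
# Attributes here are illustrative, e.g. (blood type, age in weeks).
R_t = Counter({("AB+", 0): 5, ("AB+", 1): 3, ("O-", 2): 7})

# Demand state vector D_t: counts indexed by demand attribute b.
D_t = Counter({"AB+": 4, "O-": 6})

# Information process: exogenous changes R_hat arrive and add component-wise.
R_hat = Counter({("AB+", 0): 2, ("O-", 0): 1})
R_t.update(R_hat)           # R_{t,a} <- R_{t,a} + Rhat_{t,a}
print(R_t[("AB+", 0)])      # 7
```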
Energy resource modeling
The system state:
$S_t = (R_t, D_t, \rho_t)$ = system state, where:
$R_t$ = resource state (how much capacity, reserves)
$D_t$ = market demands
$\rho_t$ = "system parameters":
• State of the technology (costs, performance)
• Climate, weather (temperature, rainfall, wind)
• Government policies (tax rebates on solar panels)
• Market prices (oil, coal)
Slide 13
Energy resource modeling
The decision variables:
$x_t^{cap}$ = (new capacity, retired capacity, storage capacity), for each (type, location, technology)
$x_t^{disp}$ = flow from:
• resource to conversion
• conversion to storage
• storage to grid
• conversion to grid
• grid to intermediate uses
• grid to final demand
Slide 14
Energy resource modeling
Exogenous information:
$W_t$ = new information = $(\hat{R}_t, \hat{D}_t, \hat{\rho}_t)$
$\hat{R}_t$ = exogenous changes in capacity, reserves
$\hat{D}_t$ = new demands for energy by type
$\hat{\rho}_t$ = exogenous changes in parameters.
Slide 15
Energy resource modeling
The transition function:
$S_{t+1} = S^M(S_t, x_t, W_{t+1})$
Known as the:
"Transition function"
"Transfer function"
"System model"
"Plant model"
"Model"
(A code sketch follows below.)
Slide 16
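A minimal sketch of a transition function $S_{t+1} = S^M(S_t, x_t, W_{t+1})$ in Python; the state fields and update rules are illustrative stand-ins, not the dynamics of the energy model itself:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    R: float    # resource state (e.g., energy in storage)
    D: float    # market demand
    rho: float  # "system parameters" (e.g., a price level)

def transition(S: State, x: float, W: tuple) -> State:
    """System model S_{t+1} = S^M(S_t, x_t, W_{t+1})."""
    R_hat, D_hat, rho_hat = W      # exogenous changes arriving after x
    return State(R=max(0.0, S.R + x + R_hat),
                 D=max(0.0, S.D + D_hat),
                 rho=S.rho + rho_hat)

S1 = transition(State(R=10.0, D=5.0, rho=1.0), x=2.0, W=(-1.0, 0.5, 0.0))
print(S1)  # State(R=11.0, D=5.5, rho=1.0)
```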
Energy resource modeling
[Network diagram: resources matched to demands.]
Slide 17
Energy resource modeling
[Diagram: the resource-demand network repeated over times t, t+1, t+2.]
Slide 18
Energy resource modeling
[Diagram: optimizing over time (t, t+1, t+2) versus optimizing at a point in time.]
Slide 19
Energy resource modeling
The objective function:
$\max_\pi \; \mathbb{E}\Big\{ \sum_t C\big(S_t, X^\pi(S_t)\big) \Big\}$
where the expectation is over all random outcomes, $S_t$ is the state variable, $C$ is the contribution function, and $X^\pi$ is the decision function (policy). Finding the best policy is the challenge.
» How do we find the best policy?
• Myopic policies
• Rolling horizon policies
• Simulation-optimization
• Dynamic programming
Slide 20
Outline
Modeling stochastic resource allocation problems
An introduction to ADP
ADP and the post-decision state variable
A blood management example
The SMART energy policy model
Slide 21
Introduction to dynamic programming
Bellman's optimality equation:
$V_t(S_t) = \max_{x \in \mathcal{X}} \big( C_t(S_t, x_t) + \mathbb{E}[V_{t+1}(S_{t+1}) \mid S_t] \big)$
Compute this for each state $S_t$, assuming $V_{t+1}$ is known.
Slide 22
Introduction to dynamic programming
Bellman's optimality equation:
$V_t(S_t) = \max_{x \in \mathcal{X}} \big( C_t(S_t, x_t) + \mathbb{E}[V_{t+1}(S_{t+1}) \mid S_t] \big)$
Problem: the three curses of dimensionality:
• State space
• Outcome space
• Action space (feasible region)
Slide 23
Introduction to dynamic programming
The computational challenges:
$V_t(S_t) = \max_{x \in \mathcal{X}} \big( C_t(S_t, x_t) + \mathbb{E}[V_{t+1}(S_{t+1}) \mid S_t] \big)$
How do we find $V_{t+1}(S_{t+1})$?
How do we compute the expectation?
How do we find the optimal solution?
Slide 24
Introduction to ADP
Classical ADP
» Most applications of ADP focus on the challenge of
handling multidimensional state variables.
» Start with
$V_t(S_t) = \max_{x \in \mathcal{X}} \big( C_t(S_t, x_t) + \mathbb{E}[V_{t+1}(S_{t+1}) \mid S_t] \big)$
» Now replace the value function with some sort of
approximation:
$V_{t+1}(S_{t+1}) \approx \bar{V}_{t+1}(S_{t+1}) = \sum_{f \in F} \theta_f \phi_f(S_{t+1})$
» May draw from the entire field of statistics/machine
learning. (A fitting sketch follows below.)
Slide 25
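As a sketch of this idea, the snippet below fits $\bar{V}_{t+1}(S) = \sum_f \theta_f \phi_f(S)$ by least squares to sampled state/value pairs; the basis functions and data are invented for illustration:

```python
import numpy as np

def phi(S: float) -> np.ndarray:
    """Illustrative basis functions of a scalar state."""
    return np.array([1.0, S, S * S])

# Sampled states and noisy observations of the downstream value.
states = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
v_hats = np.array([0.0, 0.9, 1.7, 2.2, 2.5])

Phi = np.vstack([phi(S) for S in states])            # regression matrix
theta, *_ = np.linalg.lstsq(Phi, v_hats, rcond=None)  # fit theta_f

def V_bar(S: float) -> float:
    """Approximate value of state S: sum_f theta_f * phi_f(S)."""
    return float(phi(S) @ theta)

print(round(V_bar(2.5), 3))
```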
Introduction to ADP
Other statistical methods
» Regression trees
• Combines regression with techniques for discrete variables.
» Data mining
• Good for categorical data
» Neural networks
• Engineers like this for low-dimensional continuous problems
» Kernel/locally polynomial regression
• Approximates portions of the value function locally using
simple functions
» Dirichlet mixture models
• Aggregate portions of the function and fit approximations
around these aggregations.
Slide 26
Introduction to ADP
But this does not solve our problem
» Assume we have an approximate value function.
» We still have to solve a problem that looks like
$V_t(S_t) = \max_{x \in \mathcal{X}} \Big( C_t(S_t, x_t) + \mathbb{E}\Big[\sum_{f \in F} \theta_f \phi_f(S_{t+1})\Big] \Big)$
» This means we still have to deal with a maximization
problem (which might be a linear, nonlinear or integer
program) with an expectation inside it.
Slide 27
Outline
Modeling stochastic resource allocation problems
An introduction to ADP
ADP and the post-decision state variable
A blood management example
The SMART energy policy model
Slide 28
[Decision tree: information, state, action alternate. Scheduling the game pays -$2000 under rain, $1000 under clouds, $5000 under sun; canceling pays -$200 under every outcome. Each branch carries its own outcome probabilities, e.g. (rain .2, clouds .3, sun .5), (.8, .2, .0), (.1, .5, .4), (.1, .2, .7). Squares are decision nodes; circles are outcome nodes.]
The post-decision state
New concept:
» The "pre-decision" state variable:
• $S_t$ = the information required to make a decision $x_t$
• Same as a "decision node" in a decision tree.
» The "post-decision" state variable:
• $S_t^x$ = the state of what we know immediately after we
make a decision.
• Same as an "outcome node" in a decision tree.
Slide 30
The post-decision state
An inventory problem:
» Our basic inventory equation:
$R_{t+1} = \max\big(0,\; R_t + x_t - \hat{D}_{t+1}\big)$
where
$R_t$ = resources at time $t$
$x_t$ = order quantity at time $t$
$\hat{D}_{t+1}$ = random demand
» Using pre- and post-decision states:
$R_t^x = R_t + x_t$  (from pre- to post-decision)
$R_{t+1} = \max\big(0,\; R_t^x - \hat{D}_{t+1}\big)$  (from post- to pre-decision)
A code sketch of these two steps follows below.
Slide 31
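The two transition steps translate directly into code; a minimal sketch:

```python
def pre_to_post(R_t: int, x_t: int) -> int:
    """From pre- to post-decision state: R_t^x = R_t + x_t."""
    return R_t + x_t

def post_to_pre(R_t_x: int, D_hat: int) -> int:
    """From post- to pre-decision state: R_{t+1} = max(0, R_t^x - Dhat_{t+1})."""
    return max(0, R_t_x - D_hat)

R_x = pre_to_post(R_t=5, x_t=3)     # order 3 units: post-decision state is 8
R_next = post_to_pre(R_x, D_hat=6)  # demand of 6 arrives: R_{t+1} = 2
```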
The post-decision state
Pre-decision, state-action, and post-decision (tic-tac-toe illustration):
» Pre-decision state: $3^9$ states
» State-action pairs: $3^9 \times 9$
» Post-decision state: $3^9$ states
Slide 32
The post-decision state
Pre-decision state: resources and demands:
$S_t = (R_t, D_t)$
Slide 33
The post-decision state
Post-decision state: $S_t^x = S^{M,x}(S_t, x_t)$
Slide 34
The post-decision state
New information $W_{t+1} = (\hat{R}_{t+1}, \hat{D}_{t+1})$ arrives after the decision, taking the post-decision state $S_t^x$ to
$S_{t+1} = S^{M,W}(S_t^x, W_{t+1})$
Slide 35
The post-decision state
St 1
Slide 36
The post-decision state
Classical form of Bellman's equation:
$V_t(S_t) = \max_{x \in \mathcal{X}} \big( C_t(S_t, x_t) + \mathbb{E}[V_{t+1}(S_{t+1}) \mid S_t] \big)$
Bellman's equations around pre- and post-decision
states:
» Optimization problem (making the decision):
$V_t(S_t) = \max_x \big( C_t(S_t, x_t) + V_t^x\big( S^{M,x}(S_t, x_t) \big) \big)$
• Note: this problem is deterministic!
» Expectation problem (incorporating uncertainty):
$V_t^x(S_t^x) = \mathbb{E}\big[ V_{t+1}\big( S^{M,W}(S_t^x, W_{t+1}) \big) \mid S_t^x \big]$
Slide 37
Introduction to ADP
We first write the value function around the post-decision
state variable, removing the expectation:
$V_t(S_t) = \max_{x \in \mathcal{X}} \big( C_t(S_t, x_t) + V_t^x\big( S_t^x(S_t, x_t) \big) \big)$
We then replace the value function with an approximation
that we estimate using machine learning techniques:
$V_t(S_t) \approx \max_{x \in \mathcal{X}} \big( C_t(S_t, x_t) + \bar{V}_t\big( S_t^x(S_t, x_t) \big) \big)$
Slide 38
The post-decision state
Value function approximations:
» Linear (in the resource state):
$\bar{V}_t(R_t^x) = \sum_{a \in \mathcal{A}} \bar{v}_{ta} R_{ta}^x$
» Piecewise linear, separable:
$\bar{V}_t(R_t^x) = \sum_{a \in \mathcal{A}} \bar{V}_{ta}(R_{ta}^x)$
» Indexed piecewise linear, separable:
$\bar{V}_t(R_t^x) = \sum_{a \in \mathcal{A}} \bar{V}_{ta}\big(R_{ta}^x \mid (\text{features}_t)\big)$
A sketch of the piecewise linear, separable form follows below.
Slide 39
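A sketch of the piecewise linear, separable form, storing each $\bar{V}_{ta}$ as a vector of marginal values (slopes), one per unit of resource; the attribute names and numbers are illustrative:

```python
import numpy as np

# slopes[a][r] = estimated marginal value of the (r+1)-st unit of resource a.
# For a concave (max) problem these vectors should be nonincreasing.
slopes = {
    "hydro":   np.array([10.0, 8.0, 5.0, 2.0]),
    "battery": np.array([6.0, 4.0, 1.0, 0.5]),
}

def V_bar(R_x: dict) -> float:
    """Piecewise linear, separable VFA: V(R^x) = sum_a V_a(R^x_a),
    where each V_a is the cumulative sum of its marginal values."""
    return float(sum(slopes[a][: R_x[a]].sum() for a in R_x))

print(V_bar({"hydro": 2, "battery": 3}))  # (10+8) + (6+4+1) = 29.0
```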
The post-decision state
Value function approximations:
» Ridge regression (Klabjan and Adelman):
$\bar{V}_t(R_t^x) = \sum_{f \in F} \bar{V}_{tf}(R_{tf})$, where $R_{tf} = \sum_{a \in \mathcal{A}_f} \theta_{fa} R_{ta}$
» Benders cuts:
[Plot: $\bar{V}_t(R_t)$ approximated by an envelope of cuts generated at points $x^0, x^1$.]
Slide 40
Making decisions
Following an ADP policy [animation repeated across Slides 41-44].
Slide 41
Approximate dynamic programming
With luck, the objective function will improve steadily.
[Plot: objective function vs. iterations (0-1,000); the objective climbs from roughly 1,600,000 toward 1,900,000.]
Slide 45
The post-decision state
Comparison to other methods:
» Classical MDP (value iteration):
$V^n(S) = \max_x \big( C(S, x) + \gamma\, \mathbb{E}\, V^{n-1}\big( S^M(S, x, W) \big) \big)$
» Classical ADP (pre-decision state):
$\hat{v}_t^n = \max_x \Big( C_t(S_t^n, x_t) + \gamma \sum_{s'} p(s' \mid S_t^n, x_t)\, \bar{V}_{t+1}^{n-1}(s') \Big)$  (requires an expectation)
$\hat{v}_t$ updates $\bar{V}_t(S_t)$:
$\bar{V}_t^n(S_t^n) = (1 - \alpha_{n-1}) \bar{V}_t^{n-1}(S_t^n) + \alpha_{n-1} \hat{v}_t^n$
» Updating $\bar{V}_{t-1}^{x,n-1}$ around the post-decision state:
$\hat{v}_t^n = \max_x \big( C_t(S_t^n, x_t) + \bar{V}_t^{x,n-1}\big( S^{M,x}(S_t^n, x_t) \big) \big)$  (no expectation)
$\hat{v}_t$ updates $\bar{V}_{t-1}(S_{t-1}^x)$:
$\bar{V}_{t-1}^n(S_{t-1}^{x,n}) = (1 - \alpha_{n-1}) \bar{V}_{t-1}^{n-1}(S_{t-1}^{x,n}) + \alpha_{n-1} \hat{v}_t^n$
Slide 46
Approximate dynamic programming
Step 1: Start with a pre-decision state $S_t^n$.
Step 2 (deterministic optimization): Solve the deterministic optimization using
an approximate value function,
$\hat{v}_t^n = \max_x \big( C_t(S_t^n, x_t) + \bar{V}_t^{n-1}\big( S^{M,x}(S_t^n, x_t) \big) \big)$,
to obtain $x_t^n$.
Step 3 (recursive statistics): Update the value function approximation:
$\bar{V}_{t-1}^n(S_{t-1}^{x,n}) = (1 - \alpha_{n-1}) \bar{V}_{t-1}^{n-1}(S_{t-1}^{x,n}) + \alpha_{n-1} \hat{v}_t^n$
Step 4 (simulation): Obtain a Monte Carlo sample of $W_t(\omega^n)$ and
compute the next pre-decision state:
$S_{t+1}^n = S^M(S_t^n, x_t^n, W_{t+1}(\omega^n))$
Step 5: Return to step 1. (A runnable sketch of this loop follows below.)
Slide 47
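A minimal end-to-end sketch of Steps 1-5 for a toy storage problem, using a lookup-table value function indexed by the post-decision state; the prices, stepsizes, and contribution function are stand-ins, not the energy model itself:

```python
import random

T, N, cap = 24, 500, 5
V = [[0.0] * (cap + 1) for _ in range(T + 1)]   # V[t][R^x], lookup table

for n in range(N):
    alpha = 1.0 / (n + 1)          # stepsize for recursive statistics
    R = 0                          # Step 1: initial pre-decision state
    price = 3.0 + 2.0 * random.random()
    prev_post = None
    for t in range(T):
        # Step 2: deterministic optimization (x > 0 buys/charges, x < 0 sells)
        v_hat, x_star = max(
            (-price * x + V[t + 1][R + x], x)
            for x in range(-R, cap - R + 1)
        )
        # Step 3: v_hat updates the value of the PREVIOUS post-decision state
        if prev_post is not None:
            V[t - 1][prev_post] = (1 - alpha) * V[t - 1][prev_post] + alpha * v_hat
        prev_post = R + x_star     # post-decision state S_t^x
        # Step 4: sample new exogenous information (a new price) -> S_{t+1}
        R = prev_post
        price = 3.0 + 2.0 * random.random()
    # Step 5: return to Step 1 with a fresh sample path.
```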
Outline
Modeling stochastic resource allocation problems
An introduction to ADP
ADP and the post-decision state variable
A blood management example
The SMART energy policy model
Slide 49
Blood management
Managing blood inventories
Slide 50
Blood management
Managing blood inventories over time
[Diagram: weeks 0-3, alternating decisions and information:
$S_0 \xrightarrow{x_0} S_0^x \xrightarrow{\hat{R}_1, \hat{D}_1} S_1 \xrightarrow{x_1} S_1^x \xrightarrow{\hat{R}_2, \hat{D}_2} S_2 \xrightarrow{x_2} S_2^x \xrightarrow{\hat{R}_3, \hat{D}_3} S_3 \xrightarrow{x_3} S_3^x$, for $t = 0, 1, 2, 3$.]
Slide 51
The state variable:
$S_t = (R_t, \hat{D}_t)$, where $R_{t,(b,a)}$ = the number of units of blood type $b$ (AB+, AB-, A+, A-, B+, B-, O+, O-) with age $a$ (0-3 weeks), and $\hat{D}_{t,b}$ = the new demands for blood of type $b$.
[Network diagrams (Slides 52-56): each supply node $R_{t,(b,a)}$ either satisfies a demand (subject to substitution rules) or holds; held blood ages one week, giving the post-decision state $R_t^x$, and new donations $\hat{R}_{t+1}$ arrive to form $R_{t+1}$.]
Solve this as a linear program with optimal value $F(R_t)$.
The dual variables $\hat{\nu}_{t,(b,a)}$ give the value of an additional unit of blood with attribute $(b, a)$.
Updating the value function approximation
Estimate the gradient of $F(R_t)$ at $R_t^n$: the dual $\hat{\nu}_{t,(AB+,2)}^n$ gives the slope with respect to $R_{t,(AB+,2)}^n$. [Plot.]
Slide 57
Updating the value function approximation
Update the value function at $R_{t-1}^{x,n}$: [Plot: the approximation $\bar{V}_{t-1}^{n-1}(R_{t-1}^x)$ together with the sampled slope $\hat{\nu}_{t,(AB+,2)}^n$ of $F(R_t)$ observed at $R_{t-1}^{x,n}$.]
Slide 58
Updating the value function approximation
Update the value function at $R_{t-1}^{x,n}$: [Plot: the sampled slope $\hat{\nu}_{t,(AB+,2)}^n$ is smoothed into the approximation at $R_{t-1}^{x,n}$.]
Slide 59
Updating the value function approximation
Update the value function at $R_{t-1}^{x,n}$: [Plot: the updated approximation $\bar{V}_{t-1}^{n}(R_{t-1}^x)$ replaces $\bar{V}_{t-1}^{n-1}(R_{t-1}^x)$. A code sketch of this update follows below.]
Slide 60
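A sketch of this update for one attribute: smooth the sampled dual $\hat{\nu}$ into the slope of a piecewise linear value function at the post-decision resource level, then restore concavity with a simple leveling projection (one of several ways to maintain concavity; the numbers are illustrative):

```python
import numpy as np

def update_pwl(slopes: np.ndarray, r: int, nu_hat: float, alpha: float):
    """slopes[i] = estimated marginal value of unit i+1 of the resource.
    Smooth nu_hat into the slope at resource level r, then restore a
    nonincreasing (concave) slope sequence by leveling neighbors."""
    slopes = slopes.copy()
    slopes[r] = (1 - alpha) * slopes[r] + alpha * nu_hat
    for i in range(r - 1, -1, -1):        # enforce nonincreasing to the left
        slopes[i] = max(slopes[i], slopes[i + 1])
    for i in range(r + 1, len(slopes)):   # ...and to the right
        slopes[i] = min(slopes[i], slopes[i - 1])
    return slopes

v = np.array([10.0, 8.0, 5.0, 2.0])       # current slopes (concave)
v = update_pwl(v, r=2, nu_hat=9.0, alpha=0.5)
print(v)                                   # [10.  8.  7.  2.]
```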
Outline
Modeling stochastic resource allocation problems
An introduction to ADP
ADP and the post-decision state variable
A blood management example
The SMART energy policy model
Slide 61
SMART-Stochastic, multiscale model
SMART: A Stochastic, Multiscale Allocation model for
energy Resources, Technology and policy
» Stochastic – able to handle different types of uncertainty:
• Fine-grained – Daily fluctuations in wind, solar, demand, prices, …
• Coarse-grained – Major climate variations, new government policies,
technology breakthroughs
» Multiscale – able to handle different levels of detail:
• Time scales – Hourly to yearly
• Spatial scales – Aggregate to fine-grained disaggregate
• Activities – Different types of demand patterns
» Decisions
• Hourly dispatch decisions
• Yearly investment decisions
• Takes as input parameters characterizing government policies,
performance of technologies, assumptions about climate
Slide 62
The annual investment problem
[Diagram (2008, new information, 2009, new information, ...): for each fuel type (oil, wind, natural gas, coal), the sequence of resource states, decisions, and exogenous information repeats each year:
$R_t, x_t, \hat{R}_t, \hat{D}_t, \hat{\rho}_t \;\to\; R_{t+1}, x_{t+1}, \hat{R}_{t+1}, \hat{D}_{t+1}, \hat{\rho}_{t+1}$, one such sequence per fuel type.]
Slide 63
The hourly dispatch problem
Hourly electricity “dispatch” problem
Slide 64
The hourly dispatch problem
Hourly model
» Decisions at time t impact t+1 through the amount of water held in
the reservoir.
Hour t
Hour t+1
Slide 65
The hourly dispatch problem
Hourly model
» Decisions at time t impact t+1 through the amount of water held in
the reservoir.
Hour t
Value of holding water in the reservoir
for future time periods.
Slide 66
The hourly dispatch problem
Slide 67
The hourly dispatch problem
[Diagram: hours 1, 2, 3, 4, ..., 8760 of 2008 roll into hours 1, 2, ... of 2009.]
Slide 68
SMART-Stochastic, multiscale model
[Diagram: the annual problems for 2008 and 2009, linked in sequence.]
Slide 70
SMART-Stochastic, multiscale model
[Diagram: the 2008 and 2009 annual problems are linked through fuel-specific terms for oil, wind, natural gas, and coal in 2009.]
Slide 71
SMART-Stochastic, multiscale model
[Diagram: annual problems chained from 2008 through 2038.]
Slide 72
SMART-Stochastic, multiscale model
[Diagram: each annual problem from 2008 through 2038 solves in about 5 seconds.]
Slide 74
SMART-Stochastic, multiscale model
Use statistical methods to learn the value of resources in the future, $\bar{V}_t(R_t)$. [Plot: value vs. amount of resource.] Resources may be:
» Stored energy
» Storage capacity
• Batteries
• Flywheels
• Compressed air
• Hydro
» Energy transmission capacity
• Transmission lines
• Gas lines
• Shipping capacity
» Energy production sources
• Windmills
• Solar panels
• Nuclear power plants
Slide 75
SMART-Stochastic, multiscale model
Approximating continuous functions
The algorithm performs a very fine discretization over the small range of
the function that is visited most often.
Slide 76
SMART-Stochastic, multiscale model
Benchmarking
» Compare ADP to optimal LP for a deterministic
problem
• Annual model
– 8,760 hours over a single year
– Focus on ability to match hydro storage decisions
• 20 year model
– 24 hour time increments over 20 years
– Focus on investment decisions
» Comparisons on stochastic model
• Stochastic rainfall analysis
– How does ADP solution compare to LP?
• Carbon tax policy analysis
– Demonstrate nonanticipativity
Slide 77
Benchmarking on hourly dispatch
ADP objective function relative to the optimal LP:
[Plot: percentage error from optimal vs. iterations (0-500); the ADP solution converges to 0.06% over optimal.]
Slide 78
Benchmarking on hourly dispatch
Optimal from linear program: [plots of reservoir level, rainfall, and demand over the year].
Slide 79
Benchmarking on hourly dispatch
Approximate dynamic programming: [ADP solution plots of reservoir level, rainfall, and demand, for comparison with the LP solution].
Slide 80
Benchmarking on hourly dispatch
Optimal from linear program: [plots of reservoir level, rainfall, and demand].
Slide 81
Benchmarking on hourly dispatch
Approximate dynamic programming: [ADP solution plots of reservoir level, rainfall, and demand].
Slide 82
Multidecade energy model
Optimal vs. ADP – daily model over 20 years
[Plot: percent over optimal vs. iterations (0-600); ADP converges to 0.24% over optimal.]
Slide 83
Energy policy modeling
Traditional optimization models tend to produce
all-or-nothing solutions.
[Plot: investment in IGCC vs. the cost differential (IGCC - pulverized coal). Traditional optimization switches abruptly between "IGCC is cheaper" (all IGCC) and "pulverized coal is cheaper" (no IGCC); approximate dynamic programming produces a smooth transition between the two.]
Slide 84
Stochastic rainfall
[Plot: precipitation (0-700) sample paths vs. time period (0-800).]
Slide 85
Stochastic rainfall
[Plot: reservoir level (0-9,000) vs. time period (0-800); the single ADP policy tracks the optimal solutions computed separately for individual rainfall scenarios.]
Slide 86
Energy policy modeling
Following sample paths
» Demands, prices, weather, technology, policies, ...
$W_t = (\hat{R}_t, \hat{D}_t, \hat{\rho}_t)$
[Plot: a metric (e.g. % renewable) simulated along sample paths out to 2030; the goal is achieved with probability 0.70.]
Need to consider:
» Fine-grained noise (wind, rain, demand, prices, ...)
» Coarse-grained noise (technology, policy, climate, ...)
Slide 87
Energy policy modeling
Policy study: carbon tax
» What is the effect of a potential (but uncertain) carbon
tax in year 8?
[Timeline: years 0-9, with the carbon tax decision arriving in year 8.]
Slide 88
Energy policy modeling
[Plot: installed capacity (0-80,000) vs. year (2-20) with no carbon tax; carbon-based technologies grow while renewable technologies remain a smaller share.]
Slide 89
Energy policy modeling
[Plot: installed capacity vs. year with a carbon tax. While the carbon tax policy is unknown (before year 8) the capacity mix hedges; once the policy is determined, investment shifts from carbon-based toward renewable technologies.]
Slide 90
Energy policy modeling
[Plot: installed capacity vs. year with the carbon tax; renewable technologies gain share relative to carbon-based technologies over the 20 years.]
Slide 91
Conclusions
Capabilities
» SMART can handle problems with over 300,000 time
periods, so it can model hourly variations in a long-term energy investment model.
» It can simulate virtually any form of uncertainty, either
provided through an exogenous scenario file or sampled
from a probability distribution.
» Accurate modeling of climate, technology and markets
requires access to exogenously provided scenarios.
» It properly models storage processes over time.
» Current tests are on an aggregate model, but the
modeling framework (and library) is set up for spatially
disaggregate problems.
Slide 92
Conclusions
Limitations
» More research is needed to test the ability of the model
to use multiple storage technologies.
» Extension to spatially disaggregate model will require
significant engineering and data.
» Run times will start to become an issue for a spatially
disaggregate model.
» Value function approximations capture the resource
state vector, but are limited to very simple exogenous
state variations.
Slide 93
Outline
Modeling stochastic resource allocation problems
An introduction to ADP
ADP and the post-decision state variable
A blood management example
The SMART energy policy model
Merging machine learning and optimization
Slide 94
Merging machine learning and optimization
The challenge of coarse-grained uncertainty
» Fine-grained uncertainty can generally be modeled as
memoryless (even if it is not).
» Coarse-grained uncertainty affects what might be called
“state of the world.”
» The value of a resource depends on the “state of the
world.”
•
•
•
•
•
Is there a carbon tax?
What is the state of battery research?
Have there been major new oil discoveries?
What is the price of oil?
Did the international community adopt strict limits on carbon
emissions?
• Have there been advances in our understanding of climate
change?
Slide 95
Merging machine learning and optimization
Modeling the “state of the world”
» Instead of $\bar{V}_t(R_t)$, we have $\bar{V}_t(R_t \mid S_t^W)$, where $S_t^W$ captures major
exogenous variables.
• Instead of one piecewise linear
value function for each resource
and time period…
• We need one for each state of the
world. There can be thousands of
these.
» We can use powerful machine learning algorithms to
overcome these new curses of dimensionality.
Slide 96
Merging machine learning and optimization
Strategy 1: Locally polynomial regression
» Widely used in statistics
» Approximate complex functions locally using simple functions.
» The estimate of the function is a weighted sum of these local
approximations.
» But cannot handle categorical variables.
Slide 97
Merging machine learning and optimization
Strategy 2: Dirichlet process mixtures of generalized linear
models
Slide 98
Merging machine learning and optimization
Strategy 3: Hierarchical learning models
» Estimate piecewise constant functions at different levels of
aggregation:
Slide 99
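A sketch of a hierarchical estimate: sample means are computed at several levels of aggregation and blended from coarse to fine. The n/(n+K) shrinkage weight is a simplified stand-in for weights based on estimated bias and variance, and the data are invented:

```python
# Observations of the value of a resource with attribute (type, location).
obs = {("wind", "TX"): [9.0, 11.0], ("wind", "OH"): [6.0],
       ("solar", "TX"): [14.0]}

# Aggregation levels, finest to coarsest: full attribute, type only, everything.
levels = [lambda a: a, lambda a: a[0], lambda a: "ALL"]

def level_stats(agg):
    """Piecewise constant estimates (mean, count) at one aggregation level."""
    sums, counts = {}, {}
    for a, ys in obs.items():
        g = agg(a)
        sums[g] = sums.get(g, 0.0) + sum(ys)
        counts[g] = counts.get(g, 0) + len(ys)
    return {g: (sums[g] / counts[g], counts[g]) for g in sums}

def estimate(a, K=2.0):
    """Blend levels coarse -> fine, putting weight n/(n+K) on the finer mean."""
    est = None
    for agg in reversed(levels):
        stats = level_stats(agg)
        g = agg(a)
        if g in stats:
            mean, n = stats[g]
            est = mean if est is None else (n * mean + K * est) / (n + K)
    return est

print(round(estimate(("wind", "TX")), 2))  # blends ALL, "wind", ("wind","TX")
```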
Merging machine learning and optimization
Next steps:
» We need to transition these machine learning
techniques into an ADP setting:
• Can they be adapted to work within a linear or nonlinear
optimization algorithm?
• All three methods are asymptotically unbiased, but this
depends on unbiased observations. In an ADP algorithm,
observations are biased.
• We need to design an effective exploration strategy so that the
solution does not become stuck.
» Other issues
• Will the methods provide fast, robust solutions for effective
policy analysis?
Slide 100
© 2009 Warren B. Powell
Slide 101
St 1
e
gam
e
l
u
d
Sche
Canc
el gam
e
St
rain
t
s
ca
o re
F
For
rep
.3
ame
g
e
l
du
Sche
Cancel
game
su
nn
he r
dy
ast
6
y.
eat
lo u
rec
Use
w
st c
Fo
or t
ec a
.1
se rt
t u po
n o e r re
Do eath
w
e
e g am
l
u
d
Sche
Cancel
game
ame
g
e
l
du
Sche
Cancel
gamte
xX
Rain .2 -$2000
Clouds .3 $1000
Sun .5 $5000
Rain .2 -$200
t
t 1
t 1
Clouds .3 -$200
Sun .5 -$200
Rain .8 -$2000
Clouds .2 $1000
Sun .0 $5000
Rain .8 -$200
Clouds .2 -$200
Sun .0 -$200
Rain .1 -$2000
Clouds .5 $1000
Sun .4 $5000
Rain .1 -$200
Clouds .5 -$200
Sun .4 -$200
Rain .1 -$2000
Clouds .2 $1000
Sun .7 $5000
Rain .1 -$200
Clouds .2 -$200
Sun .7 -$200
Vt ( St )  max  C ( St , x )  E V ( S ) | St -Decision nodes
- Outcome nodes
Slide 102
Demand modeling
Commercial electric demand [plot: demand over 7 days].
Slide 104