Context-Aware Cognitive Architectures


Simplifying solar harvesting model development in situated agents using
pre-deployment learning and information sharing
Huzaifa Zafar & Dan Corkill
Computer Science Department
University of Massachusetts, Amherst
April 10, 2008
Introduction

Problem Definition:

How much energy is this agent
going to be able to harvest?



How can an agent use its
neighboring agents in
developing its local models?

{30%,40%}

{30%,20%}
{30%,0%}
Clouds
Shading and tilting
Two agents see the same (or nearly the
same) cloud attenuation
Two agents have different shade
attenuation at any given time
(unless in the deserts of Dubai)
Related Work

Multi-Agent Reinforcement Learning




Extends traditional reinforcement learning to multiple
agents
Each agent learns local policies given policies of
neighboring agents
Requires a large observation set and time to converge to
optimal policies.
Multi-Agent Inductive Learning



Learning models by interacting with other agents in the
network
Each agent shares information with other agents in the
network in order to better learn local models.
Again, requires a large observation set and time to
converge to usable models
Observations



Agent performance is reduced while models are
learned
Is it possible to reduce the time taken in
developing local models once an agent has been
deployed?
How can an agent better take advantage of the
observations of its neighboring agents in
developing its local models?
PLASMA (Pre-deployment Learning And
Situated Model-development in Agents)


Two-phase strategy
Phase 1:




A pre-deployment learning phase
Define and develop a parameterized model of the
environment
The parameters of the model are the environmental effects
Phase 2:


Post-deployment model-completion phase
Complete the local parameterized model by sharing
information among agents
Solar Harvesting Model


Input: Current time, location (GPS)
Energy Harvested depends on:





The maximum energy available with no attenuation
Cloud attenuation
Shade attenuation
Tilt of the solar panel
Assume geographical location and angle of solar
panel to be constant for the lifetime of the agent

Combine them into site attenuation
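A minimal sketch of the model above, assuming the two attenuations are fractions of the maximum energy and combine additively, which is the form the post-deployment equations in this talk take. The function name is hypothetical, and the learned relations f and k are treated as identity for illustration only.

```python
def harvested_energy(max_energy, cloud_attenuation, site_attenuation):
    """Energy left after cloud and site attenuation.

    Both attenuations are fractions in [0, 1]. Site attenuation stands for
    shade and panel tilt combined. Additive combination is an assumption.
    """
    total = cloud_attenuation + site_attenuation
    # Clamp so that heavy attenuation never yields negative energy.
    total = min(max(total, 0.0), 1.0)
    return max_energy * (1.0 - total)


# An agent under 30% cloud attenuation and 40% site attenuation.
print(harvested_energy(1000.0, 0.30, 0.40))
```

With a maximum of 1000 units, this agent would keep roughly 30% of the available energy.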
Observations




Two agents have the same (or very close) cloud
attenuation at any time of the day
Very small chance of two agents having exactly
the same shade attenuation at any given time
(unless you are in the deserts of Dubai)
Maximum energy does not depend on the exact
location of the agent (approximate location is
enough)
The relationship between cloud attenuation and
energy harvested does not depend on the
environment of the agent

Same with site attenuation
PLASMA:
Pre-deployment learning phase



Learn the maximum observable energy and the
relation between attenuations and observed
energy
Model for the maximum observable energy: a * sin(time)
Model for the relation between attenuations and
observed energy: a * log^-1(C(t))
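One way the a * sin(time) model for maximum observable energy could be fitted before deployment is sketched below. The mapping of daylight hours onto half a sine period, the sunrise and sunset defaults, and all function names are assumptions made for illustration; the slides do not specify them.

```python
import math


def max_energy(hour, a, sunrise=6.0, sunset=18.0):
    """a * sin(time), with daylight mapped onto [0, pi] (assumed mapping)."""
    if hour <= sunrise or hour >= sunset:
        return 0.0
    phase = math.pi * (hour - sunrise) / (sunset - sunrise)
    return a * math.sin(phase)


def fit_amplitude(samples, sunrise=6.0, sunset=18.0):
    """Least-squares estimate of a from clear-sky (hour, energy) samples."""
    num = 0.0
    den = 0.0
    for hour, energy in samples:
        basis = max_energy(hour, 1.0, sunrise, sunset)
        num += energy * basis
        den += basis * basis
    if den == 0.0:
        raise ValueError("no daylight samples to fit against")
    return num / den


# Synthetic clear-sky readings generated with a = 1000 (illustrative only).
readings = [(h, max_energy(h, 1000.0)) for h in (8, 10, 12, 14, 16)]
print(fit_amplitude(readings))
```

Because the model is linear in a, the fit has a closed form and needs only a handful of clear-sky samples.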
PLASMA:
Post-deployment model completion
Agent 1, Day 2 we have
the following equations:

Equations from Day 1

Equations from Day 2
{??,??}
{30%,40%}
{??,??}
{30%,20%}
{??,??}
{30%,0%}
400 = 1000 * (1 - (f(C(t1)) + k(S(t1,e1))))
600 = 1000 * (1 - (f(C(t1)) + k(S(t1,e2))))
450 = 1000 * (1 - (f(C(t2)) + k(S(t2,e1))))
670 = 1000 * (1 - (f(C(t2)) + k(S(t2,e2))))
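Each of the four equations can be rearranged to give the combined term f(C) + k(S) = 1 - observed / 1000. The short sketch below does only that arithmetic; the dictionary keys simply reuse the slide's t and e labels, and the helper name is hypothetical.

```python
MAX_ENERGY = 1000.0

# Observed energy from the four equations, keyed by the slide's (t, e) labels.
observations = {
    ("t1", "e1"): 400.0,
    ("t1", "e2"): 600.0,
    ("t2", "e1"): 450.0,
    ("t2", "e2"): 670.0,
}


def combined_attenuation(observed, max_energy=MAX_ENERGY):
    """Return f(C) + k(S) implied by one observation."""
    return 1.0 - observed / max_energy


for label, energy in observations.items():
    print(label, round(combined_attenuation(energy), 2))
```

A single observation fixes only the sum of the cloud and site terms, which is why separating them relies on diverse observations shared among agents.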
PLASMA:
Diversity - The deserts of Dubai phenomenon



Cloud attenuation remains exactly the same on consecutive days
(low likelihood in general)
Site attenuation remains exactly the same across agents
(low likelihood in most areas)
Take away - diversity is important. The probability of there
being no diversity is very low
PLASMA:
The know-it-all agent



{30%,40%}
{30%,20%}
{30%,0%}
Converged agent shares
values with all neighboring
agents
Neighboring agents can
use meaningful values to
converge themselves
Take away - If one agent
converges, all agents will
converge
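A minimal sketch of how a neighbor might use a value shared by a converged agent, assuming the shared value is the current cloud attenuation and that the cloud and site terms add as in the equations earlier in the talk. The function name and the identity treatment of f and k are assumptions, not the authors' stated procedure.

```python
def site_attenuation_from_shared(observed, max_energy, shared_cloud):
    """Infer local site attenuation from a neighbor's shared cloud value.

    Assumes observed = max_energy * (1 - (cloud + site)) with both terms
    expressed as fractions; this additive form is an assumption.
    """
    total = 1.0 - observed / max_energy
    return total - shared_cloud


# A converged neighbor reports 30% cloud attenuation; this agent observes
# 300 units out of a possible 1000.
site = site_attenuation_from_shared(300.0, 1000.0, 0.30)
print(round(site, 2))
```

Since clouds affect nearby agents almost equally, one trusted cloud value is enough for each neighbor to solve for its own site term.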
Experiment - I




Evaluate PLASMA in a simulated environment
2 Agents, both learning their respective local
models.
One of the agents is shaded for 4 hours
Result: PLASMA is able to accurately predict the
solar radiation collected for day 3
Experiment II - Load Balancing




Benefits of PLASMA in energy-dependent load
balancing (Kansal et al.)
Each agent can undertake certain task load
depending on available energy
Agents make load balancing decisions depending on
predicted energy levels for the near future
10 Agents; 20 Days; Mean Cloud Attenuation is
20%
Experiment II - Load Balancing




Overall utility given no storage capacity and
infinite energy storage capacity
Min utility = 2; Max utility = 5
-1 utility for unaccomplished task
Result: Can maximize utility with and without
residual energy storage (compared with Kansal
et al.)
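The utility accounting described above can be sketched as follows, assuming each task carries a utility between 2 and 5 that is earned when the task is accomplished, and that each unaccomplished task costs 1. The task representation and function name are hypothetical; the slides do not say how tasks are encoded.

```python
def overall_utility(tasks):
    """Sum utility over (utility, accomplished) pairs.

    An accomplished task earns its utility (2 to 5); an unaccomplished
    task contributes -1.
    """
    total = 0
    for utility, accomplished in tasks:
        if not 2 <= utility <= 5:
            raise ValueError("task utility must be between 2 and 5")
        total += utility if accomplished else -1
    return total


# Two accomplished tasks and one the agent lacked the energy to finish.
print(overall_utility([(5, True), (2, True), (4, False)]))  # prints 6
```

Under this accounting, overestimating future energy is penalized, because accepted tasks that cannot be finished reduce the total.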
Conclusions


Developed a two-phase model-development
strategy called PLASMA
Minimizes the time and number of observations
required to develop models post-deployment by
transferring all the learning to the pre-deployment phase

It's all about the diversity (in agent observations)

Agents converge


On the first day if there exists a converged agent that
shares meaningful observations
On the second day if there exists an agent that shares
two meaningful observations
Questions?