Graphical Multiagent Models
Quang Duong
Computer Science and Engineering
Chair: Michael P. Wellman
1
Example: Election In The City Of AA
May, political analyst
Political discussion
Vote
• Phone surveys
• Demographic information
• Party registration
• …
2
Modeling Objectives
Vote
Republican or
Democrat?
Construct a model that takes into account interactions (graph edges)
among people (agents) in:
– Representing joint probability of all vote outcomes*
– Computing marginal and conditional probabilities
3
Modeling Objectives (cont.)
Generate predictions:
– Individual actions, dynamic behavior induced by individual
decisions
– Detailed or aggregate
4
More Applications Of Modeling
Multiagent Behavior
Computer Network/
Internet
Financial Institutions
Social Network
5
Challenges: Uncertainty
from the system modeler’s perspective
1a. Agent choice
Vote for personal favorite or conform with others?
1b. Correlation
Will the historic district of AA unanimously pick one
candidate to support?
1c. Interdependence
May does not know all friendship relations in AA
6
Challenges: Complexity
2a. Representation and inference
Number of all action configurations (all vote outcomes) is
exponential in the number of agents (people).
2b. Historical information
People may change their minds about whom to vote for
after discussions.
7
Existing Approaches That This Work
Builds On
Game-theory Approach:
• Assume game structure/perfect rationality
Statistical Modeling Approach:
• Aggregate statistical measures/ make simplifying assumptions
8
Approach Outline
Graphical Multiagent Models (GMMs) are
probabilistic graphical models designed to
• Facilitate expressions of different knowledge
sources about agent reasoning
• Capture correlated behaviors
uncertainty
while
• Exploiting dependence structure
complexity
9
Roadmap
(Ch. 2) Background
(Ch. 3) GMM (static)
(Ch. 4) History-Dependent GMM
(Ch. 5) Learning Dependence Graph Structure
(Ch. 6) Application: Information Diffusion
10
Multiagent Systems
• n agents {1,…,i,…,n}
• Agent i chooses action a_i; the joint action (action configuration) of
the system is a = (a_1,…,a_n)
• In dynamic settings:
– time period t, time horizon T
– history H^t of history horizon h: H^t = (a^{t-h},…,a^{t-1})
11
Game Theory
Each player (agent i) chooses a strategy (action ai).
Strategy profile (joint action a) of all players.
Payoff function: u_i(a_i, a_{-i})
Player i's regret ε_i(a): the maximum gain player i could obtain by
choosing some strategy a_i' instead of a_i, given that everyone else
keeps their strategies fixed.
a* is a Nash equilibrium (NE) if for every player i, regret ε_i(a*) = 0.
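The regret and Nash equilibrium definitions above can be checked by brute force on a tiny game. The sketch below is only an illustration: the two-player game, its payoff numbers, and the helper names `regret` and `is_nash` are assumptions made for this example and do not come from the slides.

```python
from itertools import product

# Hypothetical 2-player voting game: each player picks "R" or "D".
# payoffs[(a_0, a_1)] = (u_0, u_1); the numbers are illustrative only.
ACTIONS = ("R", "D")
payoffs = {
    ("R", "R"): (2.0, 1.0),
    ("R", "D"): (0.0, 0.0),
    ("D", "R"): (0.0, 0.0),
    ("D", "D"): (1.0, 2.0),
}

def regret(i, a):
    """Maximum gain player i could obtain by deviating from a[i], others fixed."""
    current = payoffs[a][i]
    best = max(payoffs[a[:i] + (alt,) + a[i + 1:]][i] for alt in ACTIONS)
    return best - current

def is_nash(a):
    """A joint action is a pure-strategy NE if every player's regret is zero."""
    return all(regret(i, a) == 0 for i in range(len(a)))

nash_profiles = [a for a in product(ACTIONS, repeat=2) if is_nash(a)]
print(nash_profiles)
```

In this toy game both unanimous outcomes have zero regret for every player, while the split outcomes leave at least one player with positive regret.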
12
Graphical Representations of
Multiagent Systems
1. Graphical Game Models [Kearns et al. ’01]
An agent's payoff depends on the strategies chosen by itself and its
neighbors J_i
Payoff/utility: u_i(a_i, a_{J_i})
Similar approaches:
Multiagent influence diagrams (MAIDs) [Koller & Milch ’03]
Networks of Influence Diagrams [Gal & Pfeffer ’08]
Action-graph games [Jiang et al. ’11]
13
Graphical Representations (cont.)
2. Probabilistic graphical models
Markov random fields (static) [Kindermann & Snell ’80, Koller
& Friedman ’09]
Dynamic Bayesian networks [Kanazawa & Dean ’89, Ghahramani
’98]
14
This Work
Building on probabilistic graphical models and incorporating game
models, this work demonstrates and examines
the benefits of applying probabilistic graphical models to the
problem of modeling multiagent behavior
in scenarios with different sets of assumptions and information
available to the system modeler.
15
Roadmap
(Ch. 2) Background
(Ch. 3) GMM (static)
  1. Overview
  2. Examples
  3. Knowledge Combination
  4. Empirical Study
(Ch. 4) History-Dependent GMM
(Ch. 5) Learning Dependence Graph Structure
(Ch. 6) Application: Information Diffusion
16
Graphical Multiagent Models (GMMs)
[Duong, Wellman & Singh ‘08]
• Nodes: agents. Edges: dependencies among agent actions
• Dependence neighborhood N_i
[Figure: example dependence graph over numbered agent nodes]
17
GMMs
Joint probability distribution of the system's actions:
Pr(a) ∝ ∏_i π_i(a_{N_i})
where π_i(a_{N_i}) is the potential of neighborhood N_i's joint actions.
Factor the joint probability distribution into neighborhood potentials.
(Markov random field for graphical games [Daskalakis & Papadimitriou
’06])
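As a rough illustration of this factorization, the sketch below multiplies neighborhood potentials and normalizes by brute force. The 3-agent line graph, the toy `potential` function, and all names are assumptions made for this example. Enumerating every joint action is exponential in the number of agents, so this only works for very small systems.

```python
from itertools import product

ACTIONS = (0, 1)  # e.g. 0 = Democrat, 1 = Republican (labels are illustrative)

# Hypothetical 3-agent line graph 0 - 1 - 2; each neighborhood N_i includes i.
neighborhoods = {0: (0, 1), 1: (0, 1, 2), 2: (1, 2)}

def potential(i, local_actions):
    """Toy potential pi_i: higher weight when everyone in N_i agrees."""
    return 2.0 if len(set(local_actions)) == 1 else 1.0

def joint_distribution():
    """Compute Pr(a) proportional to the product of neighborhood potentials."""
    weights = {}
    for a in product(ACTIONS, repeat=len(neighborhoods)):
        w = 1.0
        for i, nbrs in neighborhoods.items():
            w *= potential(i, tuple(a[j] for j in nbrs))
        weights[a] = w
    z = sum(weights.values())  # normalizing constant
    return {a: w / z for a, w in weights.items()}

dist = joint_distribution()
for a, p in sorted(dist.items()):
    print(a, round(p, 3))
```

With these toy potentials the unanimous configurations receive the most probability mass, which is the kind of correlated behavior the potentials are meant to express.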
18
Example GMMs
• Markov Random Field for computing pure strategy Nash
equilibrium
• Markov Random Field for computing correlated equilibrium
• Information diffusion GMMs [Ch. 6]
• Regret GMMs [Ch. 3]
19
Examples: Regret potential
Assume a graphical game
Regret ε_i(a_{N_i})
π_i(a_{N_i}) = exp(-λ ε_i(a_{N_i}))
Illustration:
Assume the agent prefers Republican to
Democrat (fixing others' choices)
Near-zero λ: picks randomly
Larger λ: more likely to pick Republican
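The effect of λ in the illustration can be reproduced numerically. The sketch below looks at a single agent with the other agents' choices fixed. The regret values (0 for Republican, 1 for Democrat) and the function name `choice_probabilities` are assumptions for this example.

```python
import math

def choice_probabilities(regrets, lam):
    """Normalize the regret potentials exp(-lam * regret) over one agent's actions."""
    weights = {a: math.exp(-lam * eps) for a, eps in regrets.items()}
    z = sum(weights.values())
    return {a: w / z for a, w in weights.items()}

# Assumed regrets: the agent prefers Republican, so Democrat carries regret 1.
regrets = {"Republican": 0.0, "Democrat": 1.0}

for lam in (0.0, 1.0, 5.0):
    probs = choice_probabilities(regrets, lam)
    print(lam, round(probs["Republican"], 3))
```

At λ = 0 both actions are equally likely, and the probability of the zero-regret action grows toward 1 as λ increases.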
20
Flexibility: Knowledge Combination
• Assume known graph structures, given GMMs G1 and G2 that
represent 2 different knowledge sources
Inputs:
• GMM 1: regret GMM (reG)
• GMM 2: heuristic rule-based GMM (hG)
Knowledge combination methods:
1. Direct update
2. Opinion pool
3. Mixing data
Output: final GMM (finalG)
21
Empirical Study
[Figure: performance score ratio (y-axis, 0 to 1.6) of the mixing data GMM vs. the regret GMM and vs. the heuristic GMM, on examples 1 and 2. A ratio > 1 means the combined model performs better than the input model.]
• Combining knowledge sources in one GMM improves predictions
• Combined models fail to improve on input models when input
does not capture any underlying behavior
23
Summary Of Contributions (Ch. 3)
(I.A) GMMs accommodate expressions of different knowledge
sources
(I.B) This flexibility allows the combination of models for
improved predictions
26
Roadmap
(Ch. 2) Background
(Ch. 3) GMM (static)
(Ch. 4) History-Dependent GMM
  1. Consensus Dynamics
  2. Description
  3. Joint vs. individual behavior
  4. Empirical study
(Ch. 5) Learning Dependence Graph Structure
(Ch. 6) Application: Information Diffusion
27
Example: Consensus Dynamics
[Kearns et al. ’09] abstracted version of the AA mayor election
example
Examine the ability to make collective decisions with limited
communication and observation
[Figure: observation graph over agents 1–6 and agent 1's perspective]
Agent   Blue consensus   Red consensus   Neither
1       1.0              0.5             0
2       0.5              1.0             0
28
Network structure here plays a large role in determining the
outcomes
29
Modeling Multiagent Behavior In
Consensus Dynamics Scenario
Time series action data + observation graph
1. Predict detailed actions, or
2. Predict aggregate measures
History-Dependent Graphical
Multiagent Models (hGMMs)
[Duong, Wellman, Singh & Vorobeychik ’10]
We condition actions on abstracted history H^t.
Note: dependence graphs can be different from observation graphs.
[Figure: agent 1's node at times t-1, t, and t+1]
31
hGMMs
(Undirected) within-time edges: capture dependencies between agent
actions in the same time period and define the dependence
neighborhood N_i for each agent i.
This yields a GMM at every time t.
32
hGMMs
(Directed) across-time edges: dependencies of agent i’s action on
some abstraction of prior actions by agents in i’s conditioning
set Γi
Example: frequency function.
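A frequency function of this kind could look like the sketch below. The slides do not give its exact definition, so this is one plausible reading: the fraction of past choices, made by agents in the conditioning set, that equal a given action. The data layout (a list of dicts mapping agent to action) and the name `frequency` are assumptions.

```python
from collections import Counter

def frequency(action, history, conditioning_set):
    """Fraction of past choices by agents in the conditioning set equal to `action`."""
    counts = Counter(
        joint_action[j] for joint_action in history for j in conditioning_set
    )
    total = sum(counts.values())
    return counts[action] / total if total else 0.0

# History H^t = (a^{t-2}, a^{t-1}); each joint action maps agent -> chosen color.
history = [
    {1: "red", 2: "blue", 3: "blue"},
    {1: "red", 2: "red", 3: "blue"},
]
print(frequency("red", history, conditioning_set=(1, 2)))
```

The abstraction discards who chose what and when, keeping only a summary statistic of the conditioning set's past actions.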
33
hGMMs
Joint probability distribution of the system's actions at time t:
Pr(a^t | H^t) ∝ ∏_i π_i(a^t_{N_i} | H^t_{Γ_i})
where π_i(a^t_{N_i} | H^t_{Γ_i}) is the potential of neighborhood N_i's
joint actions at t, and H^t_{Γ_i} is the history of the conditioning set Γ_i.
34
Challenge: Dependence
[Figure: actions of agents 1 and 2 at times t-2, t-1, and t]
• Conditional independence
• Dependence induced by history abstraction/summarization (*)
35
Individual vs. Joint Behavior Models
Given complete history, autonomous agents' behaviors are
conditionally independent
Individual behavior models:
π_i(a^t_i | H^t_{Γ_i,complete})
Joint behavior models allow specifying any action dependence
within one's within-time neighborhood, given some
(abstracted) history:
π_i(a^t_{N_i} | H^t_{Γ_i,abstracted})
36
Empirical Study: Summary
Evaluation: compares joint behavior and individual behavior models
by likelihood of testing data (time-series votes)
* Observation graph defines both dependence neighborhoods N and
conditioning sets Γ
1. Joint behavior models outperform individual behavior models for
shorter history lengths, which induce more action dependence.
2. Approximation does not deteriorate performance.
Summary Of Contributions (Ch. 4)
(II.A) hGMMs support inference about system dynamics
(II.B) hGMMs allow the specification of action dependence
emerging from history abstraction
38
Roadmap
(Ch. 2) Background
(Ch. 3) GMM (static)
(Ch. 4) History-Dependent GMM
(Ch. 5) Learning Dependence Graph Structure
  1. Learning Graphical Game Models
  2. Learning hGMMs
(Ch. 6) Application: Information Diffusion
39
Learning History-Dependent Graphical
Multiagent Models
Objective
Given action data + observation graph, build a model that predicts:
– Detailed actions in next period
– Aggregate measures of actions in the more distant future
Challenge: Learn dependence graph
– (Within-time) Dependence graph ≠ observation graph
– Complexity of the dependence graph
42
Consensus Dynamics Joint Behavior
Model
Extended Joint Behavior hGMM (eJCM)
π_i(a_{N_i} | H^t_{Γ_i}) = r_i(a_{N_i}) · f(a_i, H^t_{Γ_i})^γ · I(a_i, H^t_i)^β
1. r_i(a_{N_i}): reward for action a_i, discounted by the number of
dissenting neighbors in N_i
2. f(a_i, H^t_{Γ_i}): frequency of a_i chosen previously by agents in the conditioning set Γ_i
3. I(a_i, H^t_i): inertia proportional to how long i has maintained its most recent action
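The three-term potential can be sketched in code, with the caveat that the slides do not give the functional form of each term. Everything concrete below is an assumption made for illustration: the geometric `discount` per dissenting neighbor, the small `smoothing` constant, the inertia defined as 1 plus the run length, and all function names.

```python
def reward_term(i, local_actions, rewards, discount=0.5):
    """r_i: reward for agent i's action, discounted per dissenting neighbor (assumed form)."""
    a_i = local_actions[i]
    dissenters = sum(1 for j, a_j in local_actions.items() if j != i and a_j != a_i)
    return rewards[i][a_i] * (discount ** dissenters)

def frequency_term(a_i, history, conditioning_set, smoothing=1e-3):
    """f: share of past actions in the conditioning set equal to a_i, kept away from 0."""
    past = [joint[j] for joint in history for j in conditioning_set]
    if not past:
        return 1.0
    return (sum(1 for a in past if a == a_i) + smoothing) / (len(past) + smoothing)

def inertia_term(i, a_i, history):
    """I: grows with how long agent i has held its latest action (assumed: 1 + run length)."""
    if not history or history[-1][i] != a_i:
        return 1.0
    run = 0
    for joint in reversed(history):
        if joint[i] != a_i:
            break
        run += 1
    return 1.0 + run

def ejcm_potential(i, local_actions, history, conditioning_set, rewards, gamma, beta):
    """Product of the reward, frequency^gamma, and inertia^beta terms."""
    a_i = local_actions[i]
    return (
        reward_term(i, local_actions, rewards)
        * frequency_term(a_i, history, conditioning_set) ** gamma
        * inertia_term(i, a_i, history) ** beta
    )

rewards = {1: {"red": 1.0, "blue": 0.5}}
history = [{1: "red", 2: "blue"}, {1: "red", 2: "red"}]
agree = ejcm_potential(1, {1: "red", 2: "red"}, history, (1, 2), rewards, 1.0, 1.0)
dissent = ejcm_potential(1, {1: "red", 2: "blue"}, history, (1, 2), rewards, 1.0, 1.0)
print(agree, dissent)
```

In this toy setup a dissenting neighbor only changes the reward term, so the potential for the agreeing configuration is exactly twice that of the dissenting one.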
43
Consensus Dynamics Individual
Behavior Models
1. Extended Individual Behavior hGMM (eICM): similar to eJCM but
assumes that N_i contains i only
π_i(a_i | H^t_{Γ_i}) = Pr(a_i | H^t_{Γ_i}) ∝ r_i(a_i) · f(a_i, H^t_{Γ_i})^γ · I(a_i, H^t_i)^β
2. Proportional Response Model (PRM): only incorporates the
most recent time period [Kearns et al. ’09]:
Pr(a_i | H^t_{Γ_i}) ∝ r_i(a_i) · f(a_i, H^t_{Γ_i})
3. Sticky Proportional Response Model (sPRM)
44
Learning hGMMs
Input:
• action observations (time series)
• observation graph
Search space:
1. Model parameters γ, β
2. Within-time edges
Output: hGMM
Objective: likelihood of data
Constraint: max node degree
45
Greedy Learning
Initialize the graph with no edges
Repeat:
Add edges that generate the biggest increase (>0) in the
training data’s likelihood
Until no edge can be added without violating the maximum node
degree constraint
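The greedy loop above can be sketched as follows. This is a simplified stand-in, not the procedure used in the thesis: it adds one edge per iteration, it stops when no admissible edge gives a positive gain, and it leaves out fitting the model parameters γ and β. The `score` callable stands in for the training-data likelihood, and `toy_score` with its `true_edges` is purely hypothetical.

```python
from itertools import combinations

def greedy_edges(nodes, score, max_degree):
    """Greedily add the edge with the largest positive score gain, respecting max_degree."""
    edges = set()
    degree = {v: 0 for v in nodes}
    current = score(edges)
    while True:
        best_edge, best_gain = None, 0.0
        for u, v in combinations(nodes, 2):
            # Skip existing edges and edges that would break the degree constraint.
            if (u, v) in edges or degree[u] >= max_degree or degree[v] >= max_degree:
                continue
            gain = score(edges | {(u, v)}) - current
            if gain > best_gain:
                best_edge, best_gain = (u, v), gain
        if best_edge is None:  # no admissible edge improves the score
            return edges
        edges.add(best_edge)
        degree[best_edge[0]] += 1
        degree[best_edge[1]] += 1
        current += best_gain

# Toy stand-in for the likelihood: rewards a hypothetical set of "true" edges.
true_edges = {(1, 2), (2, 3), (3, 4)}

def toy_score(edges):
    return sum(1.0 if e in true_edges else -0.5 for e in edges)

learned = greedy_edges([1, 2, 3, 4], toy_score, max_degree=2)
print(sorted(learned))
```

Each iteration rescans every candidate edge, so the cost grows with the square of the number of nodes per added edge; the maximum degree bounds how many edges can ever be added.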
46
Empirical Study:
Learning from human-subject data
Use asynchronous human-subject data
Vary the following environment parameters:
• Discretization intervals, delta (0.5 and 1.5 seconds)
• History lengths, h
• Graph structures/payoff functions: coER_2, coPA_2, & power22
(strongly connected minority)
Goal: evaluate eJCM, eICM, PRM, and sPRM using 2 metrics
• Negative likelihood of agents’ actions
• Convergence rates/outcomes
47
Predicting Dynamic Behavior
eJCMs and eICMs outperform the existing PRMs/sPRMs
eJCMs predict actions in the next time period noticeably more
accurately than PRMs and sPRMs, and (statistically significantly) more
accurately than eICMs
48
Predicting Consensus Outcomes
[Figure: consensus probability (y-axis, 0.0 to 1.0) predicted by eJCM, eICM, and PRM versus the experiment, in panels for power22 with delta=0.5 and delta=1.5]
eJCMs perform comparably to the other models in 2 settings:
coER_2 and coPA_2.
In power22, eJCMs predict consensus probability and colors
much more accurately.
49
Graph Analysis
[Figure: proportion of edges (y-axis, 0.0 to 1.0) that are intra red, intra blue, and inter, for the within-time and observation graphs in coER_2, coPA_2, and power22]
In learned graphs, intra edges >> inter edges.
In power22, a large majority of edges are intra red, which identifies
the presence of a strongly connected red minority.
50
Summary Of Contributions (Ch. 5.2)
(II.B) [revisit] This study highlights the importance of joint
behavior modeling
(III.C) It is feasible to learn both dependence graph structure and
model parameters
(III.D) Learned dependence graphs can be substantially different
from observation graphs
51
Modeling Multiagent Systems:
Step By Step
Given as
input
Dependence
graph structure
Observation
graph structure
Learn from
data
GMM
hGMM
Potential
function
Intuition, background
information
Approximation
52
Roadmap
(Ch. 2) Background
(Ch. 3) GMM (static)
(Ch. 4) History-Dependent GMM
(Ch. 5) Learning Dependence Graph Structure
(Ch. 6) Application: Information Diffusion
  1. Definition
  2. Joint behavior modeling
  3. Learning missing edges
  4. Experiments
53
Networks with Unobserved Links
True network G*
• Links facilitate how information diffuses from one node to
another
• Real-world nodes have links unobserved by third parties
Observed network G
54
Problem
[Duong, Wellman & Singh ‘11]
Given: a network (with missing links) and snapshots of the
network states over time.
1. Network G
2. Diffusion traces (on G*)
Objective: model information diffusions on this network
55
Approach 1: Structure Learning
Recover missing edges
• Learn network G’
• Learn parameters of an individual behavior model built on G’
• Learning algorithms: NetInf [Gomez-Rodriguez et al. ’10] and
MaxInf
56
Approach 2: Potential Learning
Construct an hGMM on G without
recovering missing links
• hGMMs allow capturing state correlations between neighbors
who appear disconnected in the input network
• Theoretical evidence [6.3.2]
• Empirical illustrations: hGMMs outperform individual behavior
models on learned graph
– random graph with sufficient training data
– preferential attachment graph (varying amounts of data)
57
Summary of Contributions (Ch. 6)
(II.C) Joint behavior hGMMs can capture state dependence
caused by missing edges
58
Conclusions
1. The machinery of probabilistic graphical models helps to
improve modeling in multiagent systems by:
• allowing the representation and combination of different
knowledge sources of agent reasoning
• relaxing assumptions about action dependence (which may be a
result of history abstraction or missing edges)
2. One can learn from action data both: (i) model parameters, and
(ii) dependence graph structure, which can be different from
interaction/observation graph structure
59
Conclusions (cont.)
3. The GMM framework contributes to the integration of:
• strategic behavior modeling techniques from AI and economics
• probabilistic models from statistics that can efficiently extract
behavior patterns from massive amounts of data
for the goal of understanding fast-changing and complex multiagent
systems.
60
Summary
• Graphical multiagent models: flexibility to represent different
knowledge sources and combine them [UAI ’08]
• History-dependent GMM: capture dependence in dynamic
settings [AAMAS ’10, AAMAS ’12]
• Learning graphical game models [AAAI ’09]
• Learning hGMM dependence graph, distinguishing
observation/interactions graphs and probabilistic dependence
graphs [AAMAS ‘12]
• Modeling information diffusion in networks with unobserved
links [SocialCom ‘11]
61
Acknowledgments
• Advisor: Professor Michael P. Wellman
• Committee members: Prof. Satinder Singh Baveja, Prof. Edmund H.
Durfee, and Asst. Prof. Long Nguyen
• Research collaborators: Yevgeniy Vorobeychik (Sandia Labs), Michael
Kearns (U Penn), Gregory Frazier (Apogee Research), David Pennock
and others (Yahoo/Microsoft Research)
• Undergraduate advisor: David Parkes.
• Family
• Friends
• CSE staff
62
THANK YOU!
63