COLLABORATIVE SYSTEMS


Automated Negotiation Agents
Sarit Kraus
Dept. of Computer Science
Bar-Ilan University
Negotiation

“A discussion in which interested
parties exchange information and
come to an agreement.” — Davis
and Smith, 1977
What is an Agent?
PROPERTY        MEANING
Situated        Senses and acts in dynamic/uncertain environments
Flexible        Reactive (responds to changes in the environment); pro-active (acts ahead of time)
Autonomous      Exercises control over its own actions
Goal-oriented   Purposeful
Persistent      Continuously running process
Social          Interacts with other agents/people
Learning        Adaptive
No Agent is an Island: automated agents negotiate with other automated agents
• Monitoring electricity networks (Jennings)
• Distributed design and engineering (Petrie et al.)
• Distributed meeting scheduling (Sen & Durfee, Tambe)
• Teams of robotic systems acting in hostile environments (Balch & Arkin, Tambe, Kaminka)
• Electronic commerce (Kraus et al.)
• Collaborative Internet-agents (Etzioni & Weld, Weiss)
• Collaborative interfaces (Grosz & Ortiz, Andre)
• Information agent on the Internet (Klusch, Kraus et al.)
• Cooperative transportation scheduling (Fischer)
• Supporting hospital patient scheduling (Decker & Jin)
Agents negotiate with humans
• Training people in negotiations
• Trade agents for the Web
• Elves agents: representing people
Plan of talk: agents negotiate with humans
• Automated agent for bilateral negotiations with complete information: the fishing dispute (collaborators: Penina Hoz-Weiss, Jon Wilkenfeld)
• Automated agent for multi-party negotiations: the Diplomacy game (collaborators: Daniel Lehmann and Eitan Ephrati)
• Ongoing work: learning, incomplete information, mediation (collaborators: Dudi Sarne, Barbara Grosz, Lin Raz, Michal Halamish)
Fishing Dispute
• Negotiators: Canada and Spain.
• Canada’s stock of flatfish has been decreasing over the years.
• Spain has fished this same stock of flatfish for many years, but outside the Canadian exclusive economic zone (EEZ).
• Canada would like Spain to restrict its fishing near the Canadian EEZ. Spain depends on fishing in the area outside the EEZ for employment and trade.
Possible Outcomes
• An agreement on Total Allowable Catch (TAC).
• An agreement on limiting the length of the fishing season.
• Canada enforces conservation measures with military forces against Spain.
• Spain enforces its right to fish throughout the fishery with military force against Canada.
• If the negotiation has not ended prior to the deadline, then it terminates with a status quo outcome.
World State Parameters
• World state parameters are also negotiable and affect the utility of players:
• Canada subsidizes removal of Spain's ships (0, 5, 10, 15, 20 ships).
• Spain reduces the amount of pollution caused by the fishing fleet (0%, 15%, 25%, 50%).
• Canada imposes trade sanctions on Spain.
• Spain imposes trade sanctions on Canada.
Fishing Dispute
[Figure: summary diagram.
Outcomes: TAC; Limit Season; Opt Out; Status Quo.
World State Parameters: Canada subsidizes ships; Spain reduces pollution; Canada imposes trade sanctions; Spain imposes trade sanctions.]
Negotiation Process
• Each of the parties can make requests, threats, offers, conditional offers and counteroffers, and can also comment on the negotiation.
• The utility of each ending is affected by the period in which the negotiation ended.
• Canada loses over time since Spain continues to fish while negotiating.
• Spain gains over time for the same reason.
Spain  Thule Canada Ultima
Negotiations in the Fishing Dispute
[Figure: example messages; the figure labels the participants E, C, S, S.]
• Spain asks that Canada compensate Spain for Spain’s restricted fishing practices by replacing the income of twenty ships.
• Spain offers to set TAC at 44 thousand tons.
• Canada offers to set TAC at 18 thousand tons.
Other Negotiation Games
• Team games (SPIRE): negotiations on coordination; exchange of information; finding solutions is complex.
• Competitive games: when agents can benefit from reaching an agreement (also in bilateral games).
  - Trade games: Monopoly, Traders of Genoa, Kohle, Kies & Knete, Treasure game
  - War games: Diplomacy, Risk
  - Crisis games: Hostage Crisis
  - Semi-cooperative games: Color Trail, Majority Game
Chess
• Programs play chess as well as people.
• Programs play chess very differently from people: they mainly search the game tree.
Search tree for Tic-Tac-Toe
[Figure: game tree whose levels alternate between players A and B; final states are evaluated as -1, 0 or +1.]
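The game-tree search described for chess and Tic-Tac-Toe can be sketched as a plain minimax over the full Tic-Tac-Toe tree. This is an illustrative sketch, not code from the talk: the board encoding (a 9-character string), the names `winner` and `minimax`, and the mapping of player A to "X" (the maximizer) are all assumptions.

```python
from functools import lru_cache

# The eight winning lines on a 3x3 board indexed 0..8, row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

@lru_cache(maxsize=None)
def minimax(board, player):
    """Value of `board` for X (+1 win, 0 draw, -1 loss) with `player` to move."""
    w = winner(board)
    if w == "X":
        return 1
    if w == "O":
        return -1
    if " " not in board:
        return 0  # final state with no winner: a draw
    nxt = "O" if player == "X" else "X"
    values = [minimax(board[:i] + player + board[i + 1:], nxt)
              for i, cell in enumerate(board) if cell == " "]
    # X maximizes the evaluation, O minimizes it.
    return max(values) if player == "X" else min(values)
```

Calling `minimax(" " * 9, "X")` evaluates the empty board; with perfect play on both sides the value is 0, a draw.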
Fishing dispute vs. Chess
• Type of game: crisis game vs. war game; coordination game vs. zero-sum game.
• Number of players: 2.
• Moves: simultaneous plus negotiations (the parties need to reach an agreement) vs. sequential.
• Number of pieces to move: no pieces vs. one piece at a time.
• Information: complete information.
• Needed capabilities: negotiation skills vs. strategic skills.
Playing Techniques
NEGOTIATIONS
• Game theory techniques: formalize the game; find an equilibrium; follow the equilibrium strategy.
• Market techniques. Appropriate for games of many players that can exchange similar items.
• Heuristics: domain specific; “advice” books; human-like strategies.
• Markov Decision Processes.
• Modeling the opponent.
• Learning from DB.
• Learning from experience.
CHESS
• Heuristic search.
The Automated Negotiator Agent (fishing dispute)
• The agent plays the role of one of the countries.
• During the negotiation the agent receives messages, analyzes them and responds. It also initiates a discussion on one or more parameters of the agreement.
• It takes actions when needed.
Nash Equilibrium
• An action profile is an ordered set a = (a1, …, aN) of one action for each of the N players in the game.
• An action profile a is a Nash equilibrium (Nash 53) of a strategic game if no agent j has a different action yielding an outcome that it prefers to the one generated when it chooses aj, given that every other player i chooses ai.
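The definition can be checked mechanically for a small strategic game: a profile is an equilibrium if no single player gains by deviating alone. The sketch below is illustrative only. The function names, the payoff-table layout, and the example payoffs (a standard prisoner's dilemma with made-up numbers, not the fishing dispute) are assumptions.

```python
def is_nash(payoffs, profile):
    """payoffs maps each action profile (a tuple) to a tuple of utilities,
    one per player. Return True if no player j prefers a unilateral deviation."""
    n = len(profile)
    # Actions available to player j: every action j uses in some listed profile.
    actions = [sorted({p[j] for p in payoffs}) for j in range(n)]
    for j in range(n):
        for alt in actions[j]:
            deviation = profile[:j] + (alt,) + profile[j + 1:]
            if payoffs[deviation][j] > payoffs[profile][j]:
                return False  # player j strictly prefers to deviate
    return True

def nash_equilibria(payoffs):
    """All pure-strategy Nash equilibria of the game."""
    return [p for p in payoffs if is_nash(payoffs, p)]

# Illustrative payoffs (hypothetical numbers): C = cooperate, D = defect.
pd = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
      ("D", "C"): (5, 0), ("D", "D"): (1, 1)}
```

In this example `nash_equilibria(pd)` contains only `("D", "D")`: from mutual cooperation either player gains by defecting, while from mutual defection neither gains by switching alone.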
Strategy of Negotiation
Formal strategic negotiation theory:
The agent is based on a bargaining model. By backward induction, the agent builds the strategy to follow at each time period according to the sequential equilibrium (Kraus, Strategic Negotiation in Multiagent Environments, MIT Press 2001).
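Backward induction can be illustrated on a much simpler model than the one in the cited book: a finite-horizon alternating-offers game over a pie of size 1, in which both players discount the future and disagreement after the last period gives both nothing. All of this is an assumption made for illustration. In particular, it does not reproduce the fishing dispute, where Spain gains over time. The function name and parameters are hypothetical.

```python
def backward_induction(periods, delta_a, delta_b):
    """Equilibrium offer for each period, computed from the last period back.

    Player "a" proposes in even periods, "b" in odd ones. Each entry is
    (proposer, share_a, share_b), measured in that period's value.
    """
    plan = [None] * periods
    next_shares = (0.0, 0.0)  # both get nothing if the final offer is rejected
    for t in reversed(range(periods)):
        proposer = "a" if t % 2 == 0 else "b"
        if proposer == "a":
            # Offer b exactly what b would get by waiting, discounted once.
            share_b = delta_b * next_shares[1]
            share_a = 1.0 - share_b
        else:
            share_a = delta_a * next_shares[0]
            share_b = 1.0 - share_a
        plan[t] = (proposer, share_a, share_b)
        next_shares = (share_a, share_b)
    return plan
```

With three periods and both discount factors at 0.5, the first-period proposer keeps 0.75: the responder would get 0.5 as proposer in the next period, which is worth 0.25 today.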
When the agent plays against humans: the equilibrium strategy is not enough; heuristics are needed.
Automated agent: Using equilibrium strategy when playing against humans
• Human negotiators do not use equilibrium strategies, even though the game is not complex and the automated agent finds the equilibrium fast.
• Not surprising: Kahneman & Tversky showed that humans do not use decision theory.
• The agent using the equilibrium did not reach beneficial agreements.
Heuristics
• Negotiation tactics
• Attributes
• Risk Attitude
• Opting out
• Fine tuning
Attributes
• The number of points below the equilibrium utility value that the agent will agree to.
• The number of tons of fish (TAC) by which the agent will increase/decrease its offer.
• Sending the first message or waiting to receive a message.
• Sending a full offer message or not.
Modeling the risk attitude of the opponent
• The agent is always neutral toward risk, but is sensitive to the risk level of its human opponent and will change its view of the human’s utility function accordingly. Risk attitude influences the agreement an opponent is willing to accept.
• The agent begins with the assumption that its opponent is risk neutral. It uses a heuristic method to decide whether to change the estimation of the risk attitude of the opponent.
• When the agent decides that its opponent is risk prone, it changes the opponent’s utility function.
• This leads the agent to recalculate its strategy.
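The slide does not say which heuristic the agent uses or how the utility function is changed, so the following is only a sketch under stated assumptions: payoffs are normalized to [0, 1], risk attitude is modeled by an exponent `gamma` (1.0 is risk neutral, above 1.0 is convex and therefore risk prone), and the opponent is relabeled after a hypothetical number of rejections that a player with the current model would not have made.

```python
from dataclasses import dataclass

@dataclass
class OpponentModel:
    gamma: float = 1.0             # starting assumption: risk neutral
    surprising_rejections: int = 0
    threshold: int = 2             # hypothetical: rejections needed to revise

    def utility(self, payoff):
        """Opponent's utility for a payoff in [0, 1]; convex when gamma > 1."""
        return payoff ** self.gamma

    def observe_rejection(self, offered, gamble_win, p_win):
        """Record that the opponent rejected a sure payoff `offered` in favour of
        a gamble paying `gamble_win` with probability `p_win` (else 0).
        Returns True when the model changed and the strategy should be recomputed."""
        expected = p_win * self.utility(gamble_win)
        if self.utility(offered) >= expected:
            # Under the current model the opponent should have accepted.
            self.surprising_rejections += 1
        if self.surprising_rejections >= self.threshold and self.gamma == 1.0:
            self.gamma = 2.0       # hypothetical risk-prone exponent
            return True
        return False
```

After the switch, a sure payoff of 0.5 is valued at 0.25, below a 40% chance at the full payoff, so the revised model is consistent with the rejections that triggered it.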
Experimental Results
[Figure: bar chart of the experiment results. Labels recoverable from the chart: Human Spain, Human Canada, Agent Spain, Agent Canada, Total Agreements, P/P, C-A, S-A. Numeric values are not recoverable.]
Fishing Dispute: Conclusions
• We developed an agent that can play well against a human player.
• The agent was tested against students in their third year of computer science studies.
• The results of the experiments implied that the agent plays well and fairly.
• It raised the sum of the utilities in the simulations it was involved in.
• The agent playing as Spain did significantly better than a human did, and as Canada it played just as well as a human.
Diplomacy’s Rules

Each player represents one of seven European
powers: England, Germany, Russia, Turkey,
Austria-Hungary, Italy and France.
Diplomacy’s Map
Diplomacy’s Rules (Cont.)
• Winner: the power that gains control over the majority of the board.
• Beginning: 1901; two seasons a year.
• A season: consists of a negotiations stage and a move stage.
• Moves: all players secretly write the orders for all of their units simultaneously.
• Negotiations: coalitions and agreements among the players reached in the negotiations stage significantly affect the course of the game. The rules of the game do not bind a player to anything she says. Deciding who to trust as situations arise is part of the game.
Negotiations in Diplomacy
[Figure: example messages exchanged between players labeled E, G, F and R.]
• “If you support my attack on Vienna I will support your attack on Rumania.”
• “I know that Italy is going to attack Trieste.”
• “Don’t trust Germany.”
• “If you will not help me I will attack you.”
Moves in Diplomacy
• Only one unit may be in any space at one time.
• A unit can be ordered to move, support, hold, or convoy.
• An army or a fleet may support the move of another unit of its own country or of any other country.
• Support can also be given on a defensive basis.
• Opposing units with equal support do not move. An advantage of only one support is sufficient to win.
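The support rule can be sketched as a comparison of strengths for one contested space. This is a deliberately reduced sketch: it ignores convoys, cut supports and every other adjudication detail, and the function name and input format are assumptions.

```python
def resolve_contest(supports):
    """Decide who ends up in one contested space.

    `supports` maps each contesting unit (an attacker, or a unit holding the
    space) to the number of supports it receives. Returns the winning unit,
    or None when the strongest units are tied and therefore nobody moves.
    """
    # Every unit counts for one, plus one per support it receives.
    strength = {unit: 1 + n for unit, n in supports.items()}
    best = max(strength.values())
    leaders = [unit for unit, value in strength.items() if value == best]
    # An advantage of a single support is enough; a tie is a standoff.
    return leaders[0] if len(leaders) == 1 else None
```

For example, `resolve_contest({"A Venice": 1, "A Trieste": 0})` returns `"A Venice"`, while equal support on both sides returns `None`.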
The Need for Negotiations in Diplomacy
• Moves require close cooperation between various allied powers.
• Incomplete information: communications between players are done secretly.
• The game is complex: 834 possible moves in each step of the game (without negotiation moves).
• Negotiation is used to obtain information about the goals of the other players.
• Others negotiate.
Diplomacy vs. Chess
• Type of game: war games.
• Number of players: 7 vs. 2.
• Moves: simultaneous vs. sequential.
• Number of pieces to move: all pieces vs. one piece.
• Information: uncertainty about messages exchanged between other players vs. full information.
• Needed capabilities: negotiation skills vs. strategic skills.
Playing Techniques
NEGOTIATIONS
• Game theory techniques: formalize the game; find an equilibrium; follow the equilibrium strategy. Impossible in Diplomacy because of complexity.
• Market techniques. Appropriate for games of many players that can exchange similar items.
• Heuristics: domain specific; “advice” books; human-like strategies.
• Markov Decision Processes.
• Learning from DB.
• Learning from experience.
CHESS
• Heuristic search.
Diplomat: an Automated Diplomacy player
[Figure: architecture diagram. Labels: Previous Agreements; Board Status; Beliefs on other players; Analysis & Strategies Finder; Detailed plans and their estimated value for possible coalitions; Negotiations; Agreements; Others' Moves; Moves.]
Diplomacy Structure
[Figure: organization chart. Components shown: Secretary; Prime Minister; Fronts 1, 2 and 3; Foreign Office; Desks 10, 11 and 12; Ministry of Defense; Intelligence; Analyzers 13 and 14; Strategies Finder; Military Headquarters; Write orders 15.]
Strategies Finder (SF)
• Front: possible enemies and possible allies, e.g., Russia and Italy against Austria and Germany.
• Diplomat’s strategy for a given front includes:
  - A list of orders associated with their purpose.
  - The expected average profit from carrying out the strategy for each power that participates in the strategy, and the common expected profit for all of the powers.
Example:
A Venice (I) moves to Trieste in order to attack Trieste
A Vienna (R) supports A Venice to Trieste in order to attack Trieste
………
Expected outcome: Aver: 10617 Min: 5002 Max: 20862
Russia: 3358 Italy: 18117
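One way to hold such a strategy in memory is a pair of small record types, filled here with the two orders and the numbers from the example. The class and field names are assumptions, as is reading "(I)" as Italy and "(R)" as Russia. The elided orders are left out, and `least_favoured` is a hypothetical helper, not something the slides describe.

```python
from dataclasses import dataclass

@dataclass
class Order:
    unit: str      # e.g. "A Venice" (A = army)
    power: str     # power that owns the unit
    action: str    # what the unit is ordered to do
    purpose: str   # the purpose recorded alongside the order

@dataclass
class Strategy:
    orders: list            # list of Order
    profit_by_power: dict   # expected average profit per participating power
    common_profit: dict     # common expected profit: average, min, max

    def participants(self):
        """Powers taking part in the strategy, in alphabetical order."""
        return sorted(self.profit_by_power)

    def least_favoured(self):
        """Participant with the smallest expected profit (hypothetical helper)."""
        return min(self.profit_by_power, key=self.profit_by_power.get)

example = Strategy(
    orders=[
        Order("A Venice", "Italy", "moves to Trieste", "attack Trieste"),
        Order("A Vienna", "Russia", "supports A Venice to Trieste", "attack Trieste"),
    ],
    profit_by_power={"Russia": 3358, "Italy": 18117},
    common_profit={"average": 10617, "min": 5002, "max": 20862},
)
```

A helper such as `least_favoured` might matter during negotiation because the partner who expects the least from a joint plan is plausibly the hardest to convince; that use is speculation, not a claim about Diplomat.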



Strategies Finder (SF) (Cont.)
• Diplomat identifies possible fronts based on ongoing agreements, beliefs about other agents and their relations.
• SF finds some strategies for each front using domain-specific heuristics. The value of each strategy is computed by finding strategies for the enemies of the front.
• The negotiation is done based on the identified strategies.
• Question: What is the best strategy?
Diplomat’s negotiation
[Figure: flowchart of negotiation stages, each building on the previous stage.]
• Exchange information; decide what kind of agreement to try to achieve; find common enemies.
• Negotiate about the general purposes of an agreement: spaces on the board to attack, to defend, to leave or to enter.
• Decide on the specific movements in order to achieve the purposes from the previous stage.
• Sign the final agreement; decide whether to keep it.
Diplomat’s behavior is not deterministic
• Diplomat has special “personality” traits that affect its behavior and may be varied easily from game to game.
• Diplomat “flips coins” in the following cases:
  - To decide whether to pretend to keep an agreement or to tell the other partner that it will break the agreement (the probability depends on the personality traits).
  - To decide whether to give more details about a suggestion.
  - To decide which opening to use.
  - When SF searches for possible strategies: for example, to decide which units will participate in the attack or defense of a given location, and to guess which of the enemy's units will participate in the battle for that location.
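Personality-dependent coin flips of this kind can be sketched with a seeded random generator and a few probabilities. The trait names (`honesty`, `talkativeness`) and the way they map to decisions are hypothetical; the slides only say that the probabilities depend on the personality traits.

```python
import random
from dataclasses import dataclass

@dataclass
class Personality:
    honesty: float        # hypothetical: chance of announcing a planned breach
    talkativeness: float  # hypothetical: chance of elaborating on a suggestion

class CoinFlipper:
    def __init__(self, personality, seed=None):
        self.personality = personality
        self.rng = random.Random(seed)  # seedable, so a game can be replayed

    def announces_breach(self):
        """True: tell the partner the agreement will be broken.
        False: pretend to keep it."""
        return self.rng.random() < self.personality.honesty

    def gives_more_details(self):
        """Decide whether to give more details about a suggestion."""
        return self.rng.random() < self.personality.talkativeness

    def pick_opening(self, openings):
        """Choose uniformly among the available openings."""
        return self.rng.choice(openings)
```

Varying the `Personality` values from game to game changes the behavior without touching the decision code, which matches the idea that the traits "may be varied easily".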
Diplomat’s Evaluation
• Diplomat was evaluated and consistently played better than human players.
• It did not play enough games to yield statistically significant results.
• It was hard to evaluate what contributed to its success.
Conclusions
• It is possible to develop automated negotiators!
• Ongoing work: incomplete information; modeling the opponents’ preferences; learning to negotiate.
• Is it possible to develop standard methods for playing negotiation games (as in Chess)?
Learning to negotiate: 3-player majority game
• You are one of 3 players: You, Player 1, Player 2.
• You need to divide the rights for a goldmine.
Simple Game Protocol (cont.)
• In each game round, one player (You, Player 1 or Player 2) is selected randomly.
• He/she gets to make a division proposal, for example: You 15%, Player 2 20%, Player 1 65%.
Simple Game Protocol (cont.)
• Based on the proposal, the players vote (example proposal: You 15%, Player 2 20%, Player 1 65%).
• It takes a majority to make a decision: the proposer and one other player.
Simple Game Protocol (cont.)
• Once a majority is reached, the game ends and each player gets his/her share.
• Otherwise (no agreement), a new proposer is selected and an additional round is played.
Simple Game Protocol (cont.)
• However, it is not certain that a new round will take place!
• There is a continuation probability: if no agreement was reached, there is a possibility that the game will suddenly end and all players will get zero.
• After no agreement: P(New Turn) = 0.9, P(End Game) = 0.1.
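The protocol can be written as a short simulation loop. This is a sketch, not the experimental software: the function name, the callback signatures, the player indexing (0 = You, 1 = Player 1, 2 = Player 2) and the `max_rounds` safety cap are assumptions. Only the rules themselves (random proposer, majority vote, continuation probability 0.9) come from the slides.

```python
import random

def play_majority_game(propose, accepts, p_continue=0.9, rng=None, max_rounds=1000):
    """Simulate one game; returns (final shares, number of rounds played).

    propose(proposer) -> tuple of three shares (in percent) for players 0, 1, 2.
    accepts(player, shares) -> True if that player votes for the proposal.
    """
    if rng is None:
        rng = random.Random()
    players = [0, 1, 2]
    for round_no in range(1, max_rounds + 1):
        proposer = rng.choice(players)        # one player is selected randomly
        shares = propose(proposer)
        # The proposer backs the proposal; one more vote makes a majority.
        votes = 1 + sum(bool(accepts(p, shares)) for p in players if p != proposer)
        if votes >= 2:
            return shares, round_no           # agreement: everyone gets a share
        if rng.random() >= p_continue:
            return (0, 0, 0), round_no        # sudden end: all players get zero
    return (0, 0, 0), max_rounds              # safety cap, not part of the rules
```

As a usage example, `play_majority_game(lambda proposer: (15, 65, 20), lambda player, shares: shares[player] >= 20)` plays with the division shown on the slides and a hypothetical responder rule that accepts any share of at least 20%.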
Agent Design
• Collect and manage a DB of previous games.
• Given a new game, find similar situations in the DB.
• Maximize utility given previous behaviour.
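One possible reading of this design is sketched below: the DB stores past offers with their outcomes, "similar situations" are past offers close to the one being considered, and the agent picks the offer with the highest expected share for itself. The record format, the similarity tolerance, and the simplification that the proposer offers a share to a single partner and ignores the value of later rounds are all assumptions.

```python
def acceptance_rate(db, share, tolerance=5):
    """Fraction of past offers within `tolerance` points of `share` that were
    accepted. `db` is a list of (offered_share, accepted) records."""
    similar = [accepted for offered, accepted in db
               if abs(offered - share) <= tolerance]
    # With no similar record, assume the offer would be rejected.
    return sum(similar) / len(similar) if similar else 0.0

def best_offer(db, candidates, total=100):
    """Share to offer one partner that maximizes the proposer's expected share."""
    return max(candidates, key=lambda s: (total - s) * acceptance_rate(db, s))

# Hypothetical records: offers of 30% or more were accepted, lower ones rejected.
history = [(10, False), (20, False), (30, True), (40, True), (50, True)]
```

With these records, `best_offer(history, [10, 20, 30, 40, 50])` returns 30: it is the smallest offer that similar past responders accepted, so it leaves the proposer the largest expected share.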
Color Trail Game
Co-developer:
Barbara Grosz
Harvard University